Science.gov

Sample records for structural genomics consortium

  1. Genome Structure Gallery from the Mycobacterium Tuberculosis Structural Genomics Consortium

    DOE Data Explorer

    The TB Structural Genomics Consortium works with the structures of proteins from M. tuberculosis, analyzing these structures in the context of functional information that currently exists and that the Consortium generates. The database of linked structural and functional information constructed from this project will form a lasting basis for understanding M. tuberculosis pathogenesis and for structure-based drug design. The Consortium's structural and functional information is publicly available. The Structures Gallery makes more than 650 total structures available by PDB identifier. Some of these are not consortium targets, but all are viewable in 3D color and can be manipulated in various ways by Jmol, an open-source Java viewer for chemical structures in 3D from http://www.jmol.org/

  2. Genomic standards consortium projects.

    PubMed

    Field, Dawn; Sterk, Peter; Kottmann, Renzo; De Smet, J Wim; Amaral-Zettler, Linda; Cochrane, Guy; Cole, James R; Davies, Neil; Dawyndt, Peter; Garrity, George M; Gilbert, Jack A; Glöckner, Frank Oliver; Hirschman, Lynette; Klenk, Hans-Peter; Knight, Rob; Kyrpides, Nikos; Meyer, Folker; Karsch-Mizrachi, Ilene; Morrison, Norman; Robbins, Robert; San Gil, Inigo; Sansone, Susanna; Schriml, Lynn; Tatusova, Tatiana; Ussery, Dave; Yilmaz, Pelin; White, Owen; Wooley, John; Caporaso, Gregory

    2014-06-15

    The Genomic Standards Consortium (GSC) is an open-membership community that was founded in 2005 to work towards the development, implementation and harmonization of standards in the field of genomics. Starting with the defined task of establishing a minimal set of descriptions, the GSC has evolved into an active standards-setting body that currently has 18 ongoing projects, with additional projects regularly proposed from within and outside the GSC. Here we describe our recently enacted policy for proposing new activities that are intended to be taken on by the GSC, along with the template for proposing such new activities. PMID:25197446

  3. The Genomic Standards Consortium

    PubMed Central

    Field, Dawn; Amaral-Zettler, Linda; Cochrane, Guy; Cole, James R.; Dawyndt, Peter; Garrity, George M.; Gilbert, Jack; Glöckner, Frank Oliver; Hirschman, Lynette; Karsch-Mizrachi, Ilene; Klenk, Hans-Peter; Knight, Rob; Kottmann, Renzo; Kyrpides, Nikos; Meyer, Folker; San Gil, Inigo; Sansone, Susanna-Assunta; Schriml, Lynn M.; Sterk, Peter; Tatusova, Tatiana; Ussery, David W.; White, Owen; Wooley, John

    2011-01-01

    A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic Standards Consortium (GSC), an open-membership organization that drives community-based standardization activities. Here we provide a short history of the GSC, provide an overview of its range of current activities, and make a call for the scientific community to join forces to improve the quality and quantity of contextual information about our public collections of genomes, metagenomes, and marker gene sequences. PMID:21713030

  4. The High-Throughput Protein Sample Production Platform of the Northeast Structural Genomics Consortium

    PubMed Central

    Xiao, Rong; Anderson, Stephen; Aramini, James; Belote, Rachel; Buchwald, William A.; Ciccosanti, Colleen; Conover, Ken; Everett, John K.; Hamilton, Keith; Huang, Yuanpeng Janet; Janjua, Haleema; Jiang, Mei; Kornhaber, Gregory J.; Lee, Dong Yup; Locke, Jessica Y.; Ma, Li-Chung; Maglaqui, Melissa; Mao, Lei; Mitra, Saheli; Patel, Dayaban; Rossi, Paolo; Sahdev, Seema; Sharma, Seema; Shastry, Ritu; Swapna, G.V.T.; Tong, Saichu N.; Wang, Dongyan; Wang, Huang; Zhao, Li; Montelione, Gaetano T.; Acton, Thomas B.

    2014-01-01

    We describe the core Protein Production Platform of the Northeast Structural Genomics Consortium (NESG) and outline the strategies used for producing high-quality protein samples. The platform is centered on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems. The 6X-His tag allows for similar purification procedures for most targets and implementation of high-throughput (HTP) parallel methods. In most cases, the 6X-His-tagged proteins are sufficiently purified (> 97% homogeneity) using a HTP two-step purification protocol for most structural studies. Using this platform, the open reading frames of over 16,000 different targeted proteins (or domains) have been cloned as > 26,000 constructs. Over the past nine years, more than 16,000 of these expressed protein, and more than 4,400 proteins (or domains) have been purified to homogeneity in tens of milligram quantities (see Summary Statistics, http://nesg.org/statistics.html). Using these samples, the NESG has deposited more than 900 new protein structures to the Protein Data Bank (PDB). The methods described here are effective in producing eukaryotic and prokaryotic protein samples in E. coli. This paper summarizes some of the updates made to the protein production pipeline in the last five years, corresponding to phase 2 of the NIGMS Protein Structure Initiative (PSI-2) project. The NESG Protein Production Platform is suitable for implementation in a large individual laboratory or by a small group of collaborating investigators. These advanced automated and/or parallel cloning, expression, purification, and biophysical screening technologies are of broad value to the structural biology, functional proteomics, and structural genomics communities. PMID:20688167

  5. The New York Consortium on Membrane Protein Structure (NYCOMPS): a high-throughput platform for structural genomics of integral membrane proteins

    PubMed Central

    Love, James; Mancia, Filippo; Shapiro, Lawrence; Punta, Marco; Rost, Burkhard; Girvin, Mark; Wang, Da-Neng; Zhou, Ming; Hunt, John F.; Szyperski, Thomas; Gouaux, Eric; MacKinnon, Roderick; McDermott, Ann; Honig, Barry; Inouye, Masayori; Montelione, Gaetano

    2011-01-01

    The New York Consortium on Membrane Protein Structure (NYCOMPS) was formed to accelerate the acquisition of structural information on membrane proteins by applying a structural genomics approach. NYCOMPS comprises a bioinformatics group, a centralized facility operating a high-throughput cloning and screening pipeline, a set of associated wet labs that perform high-level protein production and structure determination by X-ray crystallography and NMR, and a set of investigators focused on methods development. In the first three years of operation, the NYCOMPS pipeline has so far produced and screened 7,250 expression constructs for 8,045 target proteins. Approximately 600 of these verified targets were scaled up to levels required for structural studies, so far yielding 24 membrane protein crystals. Here we describe the overall structure of NYCOMPS and provide details on the high-throughput pipeline. PMID:20690043

  6. Meeting Report from the Genomic Standards Consortium (GSC) Workshop 8

    PubMed Central

    Kyrpides, Nikos; Field, Dawn; Sterk, Peter; Kottmann, Renzo; Glöckner, Frank Oliver; Hirschman, Lynette; Garrity, George M.; Cochrane, Guy; Wooley, John

    2010-01-01

    This report summarizes the proceedings of the 8th meeting of the Genomic Standards Consortium held at the Department of Energy Joint Genome Institute in Walnut Creek, CA, USA on September 9-11, 2009. This three-day workshop marked the maturing of the Genomic Standards Consortium from an informal gathering of researchers interested in developing standards in the field of genomics and metagenomics to an established community with a defined governance mechanism, its own open access journal, and a family of established standards for describing genomes, metagenomes and marker studies (i.e. ribosomal RNA gene surveys). There will be increased efforts within the GSC to reach out to the wider scientific community via a range of new projects. Further information about the GSC and its activities can be found at http://gensc.org/. PMID:21304696

  7. Retirement Plan Consortium Structures for K-12

    ERIC Educational Resources Information Center

    Kevin, John

    2012-01-01

    As school districts continue to seek administrative efficiencies and cost reductions in the wake of severe budget pressures, the resources they devote to creating or expanding retirement plan consortia are increasing. Understanding how to structure a retirement plan consortium is paramount to successfully achieving the many objectives of…

  8. 77 FR 43237 - Genome in a Bottle Consortium-Work Plan Review Workshop

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-24

    ... National Institute of Standards and Technology Genome in a Bottle Consortium--Work Plan Review Workshop... stakeholders about the draft consortium work plan, broadly solicit consortium membership from interested stakeholders, and invite members to participate in work plan implementation. DATES: The Genome in a...

  9. Meeting Report from the Genomic Standards Consortium (GSC) Workshop 9

    PubMed Central

    Davidsen, Tanja; Madupu, Ramana; Sterk, Peter; Field, Dawn; Garrity, George; Gilbert, Jack; Glöckner, Frank Oliver; Hirschman, Lynette; Kolker, Eugene; Kottmann, Renzo; Kyrpides, Nikos; Meyer, Folker; Morrison, Norman; Schriml, Lynn; Tatusova, Tatiana; Wooley, John

    2010-01-01

    This report summarizes the proceedings of the 9th workshop of the Genomic Standards Consortium (GSC), held at the J. Craig Venter Institute, Rockville, MD, USA. It was the first GSC workshop to have open registration and attracted over 90 participants. This workshop featured sessions that provided overviews of the full range of ongoing GSC projects. It included sessions on Standards in Genomic Sciences, the open access journal of the GSC, building standards for genome annotation, the M5 platform for next-generation collaborative computational infrastructures, building ties with the biodiversity research community and two discussion panels with government and industry participants. Progress was made on all fronts, and major outcomes included the completion of the MIENS specification for publication and the formation of the Biodiversity working group. PMID:21304722

  10. International Cancer Genome Consortium Data Portal--a one-stop shop for cancer genomics data.

    PubMed

    Zhang, Junjun; Baran, Joachim; Cros, A; Guberman, Jonathan M; Haider, Syed; Hsu, Jack; Liang, Yong; Rivkin, Elena; Wang, Jianxin; Whitty, Brett; Wong-Erasmus, Marie; Yao, Long; Kasprzyk, Arek

    2011-01-01

    The International Cancer Genome Consortium (ICGC) is a collaborative effort to characterize genomic abnormalities in 50 different cancer types. To make this data available, the ICGC has created the ICGC Data Portal. Powered by the BioMart software, the Data Portal allows each ICGC member institution to manage and maintain its own databases locally, while seamlessly presenting all the data in a single access point for users. The Data Portal currently contains data from 24 cancer projects, including ICGC, The Cancer Genome Atlas (TCGA), Johns Hopkins University, and the Tumor Sequencing Project. It consists of 3478 genomes and 13 cancer types and subtypes. Available open access data types include simple somatic mutations, copy number alterations, structural rearrangements, gene expression, microRNAs, DNA methylation and exon junctions. Additionally, simple germline variations are available as controlled access data. The Data Portal uses a web-based graphical user interface (GUI) to offer researchers multiple ways to quickly and easily search and analyze the available data. The web interface can assist in constructing complicated queries across multiple data sets. Several application programming interfaces are also available for programmatic access. Here we describe the organization, functionality, and capabilities of the ICGC Data Portal. PMID:21930502

  11. Towards a Universal Clinical Genomics Database: The 2012 International Standards for Cytogenomic Arrays (ISCA) Consortium Meeting

    PubMed Central

    Riggs, Erin Rooney; Wain, Karen E.; Riethmaier, Darlene; Savage, Melissa; Smith-Packard, Bethanny; Kaminsky, Erin B.; Rehm, Heidi L.; Martin, Christa Lese; Ledbetter, David H.; Faucett, W. Andrew

    2013-01-01

    The 2012 International Standards for Cytogenomic Arrays (ISCA) Consortium Meeting, “Towards a Universal Clinical Genomic Database,” was held in Bethesda, MD, 21–22 May 2012 and was attended by over 200 individuals from around the world representing clinical genetic testing laboratories, clinicians, academia, industry, research, and regulatory agencies. The scientific program centered on expanding the current focus of the ISCA Consortium to include the collection and curation of both structural and sequence-level variation into a unified clinical genomics database, available to the public through resources such as the National Center for Biotechnology Information (NCBI)’s ClinVar database. Here, we provide an overview of the conference, with summaries of the topics presented for discussion by over 25 different speakers. Presentations are available online at www.iscaconsortium.org. PMID:23463607

  12. Genome Analyses and Supplement Data from the International Populus Genome Consortium (IPGC)

    DOE Data Explorer

    International Populus Genome Consortium (IPGC)

    The sequencing of the first tree genome, that of Populus, was a project initiated by the Office of Biological and Environmental Research in DOE’s Office of Science. The International Populus Genome Consortium (IPGC) was formed to help develop and guide post-sequence activities. The IPGC website, hosted at the Oak Ridge National Laboratory, provides draft sequence data as it is made available from DOE Joint Genome Institute, genome analyses for Populus, lists of related publications and resources, and the science plan. The data are available at http://www.ornl.gov/sci/ipgc/ssr_resource.htm.

  13. The PRRS Host Genomic Consortium (PHGC) Database: Management of large data sets.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In any consortium project where large amounts of phenotypic and genotypic data are collected across several research labs, issues arise with maintenance and analysis of datasets. The PRRS Host Genomic Consortium (PHGC) Database was developed to meet this need for the PRRS research community. The sch...

  14. The Tennessee Mouse Genome Consortium: Identification of ocular mutants

    SciTech Connect

    Jablonski, Monica M.; Wang, Xiaofei; Lu, Lu; Miller, Darla R; Rinchik, Eugene M; Williams, Robert; Goldowitz, Daniel

    2005-06-01

    The Tennessee Mouse Genome Consortium (TMGC) is in its fifth year of an ethylnitrosourea (ENU)-based mutagenesis screen to detect recessive mutations that affect the eye and brain. Each pedigree is tested by various phenotyping domains including the eye, neurohistology, behavior, aging, ethanol, drug, social behavior, auditory, and epilepsy domains. The utilization of a highly efficient breeding protocol and coordination of various universities across Tennessee makes it possible for mice with ENU-induced mutations to be evaluated by nine distinct phenotyping domains within this large-scale project known as the TMGC. Our goal is to create mutant lines that model human diseases and disease syndromes and to make the mutant mice available to the scientific research community. Within the eye domain, mice are screened for anterior and posterior segment abnormalities using slit-lamp biomicroscopy, indirect ophthalmoscopy, fundus photography, eye weight, histology, and immunohistochemistry. As of January 2005, we have screened 958 pedigrees and 4800 mice, excluding those used in mapping studies. We have thus far identified seven pedigrees with primary ocular abnormalities. Six of the mutant pedigrees have retinal or subretinal aberrations, while the remaining pedigree presents with an abnormal eye size. Continued characterization of these mutant mice should in most cases lead to the identification of the mutated gene, as well as provide insight into the function of each gene. Mice from each of these pedigrees of mutant mice are available for distribution to researchers for independent study.

  15. The Teleprasenz Consortium: Structure and intentions

    NASA Technical Reports Server (NTRS)

    Blauert, Jens

    1991-01-01

    The Teleprasenz-Consortium is an open group of currently 37 scientists of different disciplines who devote a major part of their research activities to the foundations of telepresence technology. Telepresence technology is basically understood as a means to bridge spatial and temporal gaps as well as certain kinds of concealment, inaccessibility and danger of exposure. The activities of the consortium are organized into three main branches: virtual environment, surveillance and control systems, and speech and language technology. A brief summary of the main activities in these areas is given.

  16. Report of the 13th Genomic Standards Consortium Meeting, Shenzhen, China, March 4–7, 2012.

    PubMed Central

    Bao, Yiming; Wang, Hui; Sansone, Susanna-Assunta; Edmunds, Scott C.; Morrison, Norman; Meyer, Folker; Schriml, Lynn M.; Davies, Neil; Sterk, Peter; Wilkening, Jared; Garrity, George M.; Field, Dawn; Robbins, Robert; Smith, Daniel P.; Mizrachi, Ilene; Moreau, Corrie

    2012-01-01

    This report details the outcome of the 13th Meeting of the Genomic Standards Consortium. The three-day conference was held at the Kingkey Palace Hotel, Shenzhen, China, on March 5–7, 2012, and was hosted by the Beijing Genomics Institute. The meeting, titled From Genomes to Interactions to Communities to Models, highlighted the role of data standards associated with genomic, metagenomic, and amplicon sequence data and the contextual information associated with the sample. To this end the meeting focused on genomic projects for animals, plants, fungi, and viruses; metagenomic studies in host-microbe interactions; and the dynamics of microbial communities. In addition, the meeting hosted a Genomic Observatories Network session, a Genomic Standards Consortium biodiversity working group session, and a Microbiology of the Built Environment session sponsored by the Alfred P. Sloan Foundation. PMID:22768370

  17. Genome Consortium for Active Teaching: Meeting the Goals of BIO2010

    ERIC Educational Resources Information Center

    Campbell, A. Malcolm; Ledbetter, Mary Lee S.; Hoopes, Laura L. M.; Eckdahl, Todd T.; Heyer, Laurie J.; Rosenwald, Anne; Fowlks, Edison; Tonidandel, Scott; Bucholtz, Brooke; Gottfried, Gail

    2007-01-01

    The Genome Consortium for Active Teaching (GCAT) facilitates the use of modern genomics methods in undergraduate education. Initially focused on microarray technology, but with an eye toward diversification, GCAT is a community working to improve the education of tomorrow's life science professionals. GCAT participants have access to affordable…

  18. The global cancer genomics consortium's third annual symposium: from oncogenomics to cancer care

    PubMed Central

    Costa, Luis; Casimiro, Sandra; Gupta, Sudeep; Knapp, Stefan; Pillai, M.Radhakrishna; Toi, Masakazu; Badwe, Rajendra; Carmo-Fonseca, Maria; Kumar, Rakesh

    2014-01-01

    The Global Cancer Genomics Consortium (GCGC) is a cohesive network of oncologists, cancer biologists and structural and genomic experts residing in six institutions from Portugal, the United Kingdom, Japan, India, and the United States. The team is using its combined resources and infrastructures to address carefully selected, shared, burning questions in cancer medicine. The Third Annual Symposium was organized by the Institute of Molecular Medicine, Lisbon Medical School, Lisbon, Portugal, from September 18 to 20, 2013. To highlight the benefits and limitations of recent advances in cancer genomics, the meeting focused on how to better translate our gains in oncogenomics to cancer patients while engaging our younger colleagues in cancer medicine at large. Over two hundred participants actively discussed some of the most recent advances in the areas of cancer genomics, transcriptomics and cancer systems biology and how best to apply such knowledge to cancer therapeutics, biomarker discovery and drug development, as well as the essential role played by bio-banking throughout the process. In brief, the GCGC symposium provided a platform for students and translational cancer researchers to share their excitement and worries as we begin to translate the gains in oncogenomics into better treatment for cancer patients.

  19. Connecting Genomic Alterations to Cancer Biology with Proteomics: The NCI Clinical Proteomic Tumor Analysis Consortium

    SciTech Connect

    Ellis, Matthew; Gillette, Michael; Carr, Steven A.; Paulovich, Amanda G.; Smith, Richard D.; Rodland, Karin D.; Townsend, Reid; Kinsinger, Christopher; Mesri, Mehdi; Rodriguez, Henry; Liebler, Daniel

    2013-10-03

    The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium is applying the latest generation of proteomic technologies to genomically annotated tumors from The Cancer Genome Atlas (TCGA) program, a joint initiative of the NCI and the National Human Genome Research Institute. By providing a fully integrated accounting of DNA, RNA, and protein abnormalities in individual tumors, these datasets will illuminate the complex relationship between genomic abnormalities and cancer phenotypes, thus producing biologic insights as well as a wave of novel candidate biomarkers and therapeutic targets amenable to verification using targeted mass spectrometry methods.

  20. Clinical Sequencing Exploratory Research Consortium: Accelerating Evidence-Based Practice of Genomic Medicine.

    PubMed

    Green, Robert C; Goddard, Katrina A B; Jarvik, Gail P; Amendola, Laura M; Appelbaum, Paul S; Berg, Jonathan S; Bernhardt, Barbara A; Biesecker, Leslie G; Biswas, Sawona; Blout, Carrie L; Bowling, Kevin M; Brothers, Kyle B; Burke, Wylie; Caga-Anan, Charlisse F; Chinnaiyan, Arul M; Chung, Wendy K; Clayton, Ellen W; Cooper, Gregory M; East, Kelly; Evans, James P; Fullerton, Stephanie M; Garraway, Levi A; Garrett, Jeremy R; Gray, Stacy W; Henderson, Gail E; Hindorff, Lucia A; Holm, Ingrid A; Lewis, Michelle Huckaby; Hutter, Carolyn M; Janne, Pasi A; Joffe, Steven; Kaufman, David; Knoppers, Bartha M; Koenig, Barbara A; Krantz, Ian D; Manolio, Teri A; McCullough, Laurence; McEwen, Jean; McGuire, Amy; Muzny, Donna; Myers, Richard M; Nickerson, Deborah A; Ou, Jeffrey; Parsons, Donald W; Petersen, Gloria M; Plon, Sharon E; Rehm, Heidi L; Roberts, J Scott; Robinson, Dan; Salama, Joseph S; Scollon, Sarah; Sharp, Richard R; Shirts, Brian; Spinner, Nancy B; Tabor, Holly K; Tarczy-Hornoch, Peter; Veenstra, David L; Wagle, Nikhil; Weck, Karen; Wilfond, Benjamin S; Wilhelmsen, Kirk; Wolf, Susan M; Wynn, Julia; Yu, Joon-Ho

    2016-06-01

    Despite rapid technical progress and demonstrable effectiveness for some types of diagnosis and therapy, much remains to be learned about clinical genome and exome sequencing (CGES) and its role within the practice of medicine. The Clinical Sequencing Exploratory Research (CSER) consortium includes 18 extramural research projects, one National Human Genome Research Institute (NHGRI) intramural project, and a coordinating center funded by the NHGRI and National Cancer Institute. The consortium is exploring analytic and clinical validity and utility, as well as the ethical, legal, and social implications of sequencing via multidisciplinary approaches; it has thus far recruited 5,577 participants across a spectrum of symptomatic and healthy children and adults by utilizing both germline and cancer sequencing. The CSER consortium is analyzing data and creating publicly available procedures and tools related to participant preferences and consent, variant classification, disclosure and management of primary and secondary findings, health outcomes, and integration with electronic health records. Future research directions will refine measures of clinical utility of CGES in both germline and somatic testing, evaluate the use of CGES for screening in healthy individuals, explore the penetrance of pathogenic variants through extensive phenotyping, reduce discordances in public databases of genes and variants, examine social and ethnic disparities in the provision of genomics services, explore regulatory issues, and estimate the value and downstream costs of sequencing. The CSER consortium has established a shared community of research sites by using diverse approaches to pursue the evidence-based development of best practices in genomic medicine. PMID:27181682

  1. Draft Genome Sequence of Achromobacter sp. Strain AR476-2, Isolated from a Cellulolytic Consortium.

    PubMed

    Kurth, Daniel; Romero, Cintia M; Fernandez, Pablo M; Ferrero, Marcela A; Martinez, M Alejandra

    2016-01-01

    Achromobacter sp. AR476-2 is a noncellulolytic strain previously isolated from a cellulolytic consortium selected from samples of insect gut. Its genome sequence could contribute to the unraveling of the complex interaction of microorganisms and enzymes involved in the biodegradation of lignocellulosic biomass in nature. PMID:27340069

  2. Draft Genome Sequence of Achromobacter sp. Strain AR476-2, Isolated from a Cellulolytic Consortium

    PubMed Central

    Kurth, Daniel; Romero, Cintia M.; Fernandez, Pablo M.; Ferrero, Marcela A.

    2016-01-01

    Achromobacter sp. AR476-2 is a noncellulolytic strain previously isolated from a cellulolytic consortium selected from samples of insect gut. Its genome sequence could contribute to the unraveling of the complex interaction of microorganisms and enzymes involved in the biodegradation of lignocellulosic biomass in nature. PMID:27340069

  3. The Mycosphaerella Genomics Consortium: Understanding Pathogenicity and Adaptation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Mycosphaerella is one of the largest genera of plant pathogenic fungi. Due to their high economic importance, the genomes of two Mycosphaerella species have been sequenced and four others are in progress. Analysis of these genomes has potential practical benefits through identification of effectors ...

  4. Personal Genome Sequencing in Ostensibly Healthy Individuals and the PeopleSeq Consortium

    PubMed Central

    Linderman, Michael D.; Nielsen, Daiva E.; Green, Robert C.

    2016-01-01

    Thousands of ostensibly healthy individuals have had their exome or genome sequenced, but a much smaller number of these individuals have received any personal genomic results from that sequencing. We term those projects in which ostensibly healthy participants can receive sequencing-derived genetic findings and may also have access to their genomic data as participatory predispositional personal genome sequencing (PPGS). Here we are focused on genome sequencing applied in a pre-symptomatic context and so define PPGS to exclude diagnostic genome sequencing intended to identify the molecular cause of suspected or diagnosed genetic disease. In this report we describe the design of completed and underway PPGS projects, briefly summarize the results reported to date and introduce the PeopleSeq Consortium, a newly formed collaboration of PPGS projects designed to collect much-needed longitudinal outcome data. PMID:27023617

  5. High-Throughput Computational and Experimental Techniques in Structural Genomics

    PubMed Central

    Chance, Mark R.; Fiser, Andras; Sali, Andrej; Pieper, Ursula; Eswar, Narayanan; Xu, Guiping; Fajardo, J. Eduardo; Radhakannan, Thirumuruhan; Marinkovic, Nebojsa

    2004-01-01

    Structural genomics has as its goal the provision of structural information for all possible ORF sequences through a combination of experimental and computational approaches. The access to genome sequences and cloning resources from an ever-widening array of organisms is driving high-throughput structural studies by the New York Structural Genomics Research Consortium. In this report, we outline the progress of the Consortium in establishing its pipeline for structural genomics, and some of the experimental and bioinformatics efforts leading to structural annotation of proteins. The Consortium has established a pipeline for structural biology studies, automated modeling of ORF sequences using solved (template) structures, and a novel high-throughput approach (metallomics) to examining the metal binding to purified protein targets. The Consortium has so far produced 493 purified proteins from >1077 expression vectors. A total of 95 have resulted in crystal structures, and 81 are deposited in the Protein Data Bank (PDB). Comparative modeling of these structures has generated >40,000 structural models. We also initiated a high-throughput metal analysis of the purified proteins; this has determined that 10%-15% of the targets contain a stoichiometric structural or catalytic transition metal atom. The progress of the structural genomics centers in the U.S. and around the world suggests that the goal of providing useful structural information on most all ORF domains will be realized. This projected resource will provide structural biology information important to understanding the function of most proteins of the cell. PMID:15489337

  6. Functional Insights from Structural Genomics

    SciTech Connect

    Forouhar, F.; Kuzin, A.; Seetharaman, J.; Lee, I.; Zhou, W.; Abashidze, M.; Chen, Y.; Montelione, G.; Tong, L.; et al

    2007-01-01

    Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with an important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF_0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP_1951), and a 12-stranded β-barrel with a novel fold (V. parahaemolyticus VPA1032).

  7. Clinical utilization of genomics data produced by the international Pseudomonas aeruginosa consortium.

    PubMed

    Freschi, Luca; Jeukens, Julie; Kukavica-Ibrulj, Irena; Boyle, Brian; Dupont, Marie-Josée; Laroche, Jérôme; Larose, Stéphane; Maaroufi, Halim; Fothergill, Joanne L; Moore, Matthew; Winsor, Geoffrey L; Aaron, Shawn D; Barbeau, Jean; Bell, Scott C; Burns, Jane L; Camara, Miguel; Cantin, André; Charette, Steve J; Dewar, Ken; Déziel, Éric; Grimwood, Keith; Hancock, Robert E W; Harrison, Joe J; Heeb, Stephan; Jelsbak, Lars; Jia, Baofeng; Kenna, Dervla T; Kidd, Timothy J; Klockgether, Jens; Lam, Joseph S; Lamont, Iain L; Lewenza, Shawn; Loman, Nick; Malouin, François; Manos, Jim; McArthur, Andrew G; McKeown, Josie; Milot, Julie; Naghra, Hardeep; Nguyen, Dao; Pereira, Sheldon K; Perron, Gabriel G; Pirnay, Jean-Paul; Rainey, Paul B; Rousseau, Simon; Santos, Pedro M; Stephenson, Anne; Taylor, Véronique; Turton, Jane F; Waglechner, Nicholas; Williams, Paul; Thrane, Sandra W; Wright, Gerard D; Brinkman, Fiona S L; Tucker, Nicholas P; Tümmler, Burkhard; Winstanley, Craig; Levesque, Roger C

    2015-01-01

    The International Pseudomonas aeruginosa Consortium is sequencing over 1000 genomes and building an analysis pipeline for the study of Pseudomonas genome evolution, antibiotic resistance and virulence genes. Metadata, including genomic and phenotypic data for each isolate of the collection, are available through the International Pseudomonas Consortium Database (http://ipcd.ibis.ulaval.ca/). Here, we present our strategy and the results that emerged from the analysis of the first 389 genomes. With as yet unmatched resolution, our results confirm that P. aeruginosa strains can be divided into three major groups that are further divided into subgroups, some not previously reported in the literature. We also provide the first snapshot of P. aeruginosa strain diversity with respect to antibiotic resistance. Our approach will allow us to draw potential links between environmental strains and those implicated in human and animal infections, understand how patients become infected and how the infection evolves over time as well as identify prognostic markers for better evidence-based decisions on patient care. PMID:26483767

  8. Clinical utilization of genomics data produced by the international Pseudomonas aeruginosa consortium

    PubMed Central

    Freschi, Luca; Jeukens, Julie; Kukavica-Ibrulj, Irena; Boyle, Brian; Dupont, Marie-Josée; Laroche, Jérôme; Larose, Stéphane; Maaroufi, Halim; Fothergill, Joanne L.; Moore, Matthew; Winsor, Geoffrey L.; Aaron, Shawn D.; Barbeau, Jean; Bell, Scott C.; Burns, Jane L.; Camara, Miguel; Cantin, André; Charette, Steve J.; Dewar, Ken; Déziel, Éric; Grimwood, Keith; Hancock, Robert E. W.; Harrison, Joe J.; Heeb, Stephan; Jelsbak, Lars; Jia, Baofeng; Kenna, Dervla T.; Kidd, Timothy J.; Klockgether, Jens; Lam, Joseph S.; Lamont, Iain L.; Lewenza, Shawn; Loman, Nick; Malouin, François; Manos, Jim; McArthur, Andrew G.; McKeown, Josie; Milot, Julie; Naghra, Hardeep; Nguyen, Dao; Pereira, Sheldon K.; Perron, Gabriel G.; Pirnay, Jean-Paul; Rainey, Paul B.; Rousseau, Simon; Santos, Pedro M.; Stephenson, Anne; Taylor, Véronique; Turton, Jane F.; Waglechner, Nicholas; Williams, Paul; Thrane, Sandra W.; Wright, Gerard D.; Brinkman, Fiona S. L.; Tucker, Nicholas P.; Tümmler, Burkhard; Winstanley, Craig; Levesque, Roger C.

    2015-01-01

    The International Pseudomonas aeruginosa Consortium is sequencing over 1000 genomes and building an analysis pipeline for the study of Pseudomonas genome evolution, antibiotic resistance and virulence genes. Metadata, including genomic and phenotypic data for each isolate of the collection, are available through the International Pseudomonas Consortium Database (http://ipcd.ibis.ulaval.ca/). Here, we present our strategy and the results that emerged from the analysis of the first 389 genomes. With as yet unmatched resolution, our results confirm that P. aeruginosa strains can be divided into three major groups that are further divided into subgroups, some not previously reported in the literature. We also provide the first snapshot of P. aeruginosa strain diversity with respect to antibiotic resistance. Our approach will allow us to draw potential links between environmental strains and those implicated in human and animal infections, understand how patients become infected and how the infection evolves over time as well as identify prognostic markers for better evidence-based decisions on patient care. PMID:26483767

  9. The Psychiatric Genomics Consortium Posttraumatic Stress Disorder Workgroup: Posttraumatic Stress Disorder Enters the Age of Large-Scale Genomic Collaboration

    PubMed Central

    Logue, Mark W; Amstadter, Ananda B; Baker, Dewleen G; Duncan, Laramie; Koenen, Karestan C; Liberzon, Israel; Miller, Mark W; Morey, Rajendra A; Nievergelt, Caroline M; Ressler, Kerry J; Smith, Alicia K; Smoller, Jordan W; Stein, Murray B; Sumner, Jennifer A; Uddin, Monica

    2015-01-01

    The development of posttraumatic stress disorder (PTSD) is influenced by genetic factors. Although there have been some replicated candidates, the identification of risk variants for PTSD has lagged behind genetic research of other psychiatric disorders such as schizophrenia, autism, and bipolar disorder. Psychiatric genetics has moved beyond examination of specific candidate genes in favor of the genome-wide association study (GWAS) strategy of very large numbers of samples, which allows for the discovery of previously unsuspected genes and molecular pathways. The successes of genetic studies of schizophrenia and bipolar disorder have been aided by the formation of a large-scale GWAS consortium: the Psychiatric Genomics Consortium (PGC). In contrast, only a handful of GWAS of PTSD have appeared in the literature to date. Here we describe the formation of a group dedicated to large-scale study of PTSD genetics: the PGC-PTSD. The PGC-PTSD faces challenges related to the contingency on trauma exposure and the large degree of ancestral genetic diversity within and across participating studies. Using the PGC analysis pipeline supplemented by analyses tailored to address these challenges, we anticipate that our first large-scale GWAS of PTSD will comprise over 10 000 cases and 30 000 trauma-exposed controls. Following in the footsteps of our PGC forerunners, this collaboration—of a scope that is unprecedented in the field of traumatic stress—will lead the search for replicable genetic associations and new insights into the biological underpinnings of PTSD. PMID:25904361

  10. LaGomiCs-Lagomorph Genomics Consortium: An International Collaborative Effort for Sequencing the Genomes of an Entire Mammalian Order.

    PubMed

    Fontanesi, Luca; Di Palma, Federica; Flicek, Paul; Smith, Andrew T; Thulin, Carl-Gustaf; Alves, Paulo C

    2016-07-01

    The order Lagomorpha comprises about 90 living species, divided into 2 families: the pikas (Family Ochotonidae), and the rabbits, hares, and jackrabbits (Family Leporidae). Lagomorphs are important economically and scientifically as major human food resources, valued game species, pests of agricultural significance, model laboratory animals, and key elements in food webs. A quarter of the lagomorph species are listed as threatened. They are native to all continents except Antarctica, and occur up to 5000 m above sea level, from the equator to the Arctic, spanning a wide range of environmental conditions. The order has notable taxonomic problems presenting significant difficulties for defining a species due to broad phenotypic variation, overlap of morphological characteristics, and relatively recent speciation events. At present, only the genomes of 2 species, the European rabbit (Oryctolagus cuniculus) and American pika (Ochotona princeps), have been sequenced and assembled. Starting from a paucity of genome information, the main scientific aim of the Lagomorph Genomics Consortium (LaGomiCs), born from a cooperative initiative of the European COST Action "A Collaborative European Network on Rabbit Genome Biology-RGB-Net" and the World Lagomorph Society (WLS), is to provide an international framework for the sequencing of the genome of all extant and selected extinct lagomorphs. Sequencing the genomes of an entire order will provide a large amount of information to address biological problems not only related to lagomorphs but also to all mammals. We present current and planned sequencing programs and outline the final objective of LaGomiCs, made possible through broad international collaboration. PMID:26921276

  11. The peanut genome consortium and peanut genome sequence: Creating a better future through global food security

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The competitiveness of peanuts in domestic and global markets has been threatened by losses in productivity and quality that are attributed to diseases, pests, environmental stresses and allergy or food safety issues. The U.S. Peanut Genome Initiative (PGI) was launched in 2004, and expanded to a gl...

  12. Report of the 14th Genomic Standards Consortium Meeting, Oxford, UK, September 17-21, 2012.

    PubMed Central

    Davies, Neil; Field, Dawn; Amaral-Zettler, Linda; Barker, Katharine; Bicak, Mesude; Bourlat, Sarah; Coddington, Jonathan; Deck, John; Drummond, Alexei; Gilbert, Jack A.; Glöckner, Frank Oliver; Kottmann, Renzo; Meyer, Chris; Morrison, Norman; Obst, Matthias; Robbins, Robert; Schriml, Lynn; Sterk, Peter; Stones-Havas, Steven

    2014-01-01

    This report summarizes the proceedings of the 14th workshop of the Genomic Standards Consortium (GSC) held at the University of Oxford in September 2012. The primary goal of the workshop was to work towards the launch of the Genomic Observatories (GOs) Network under the GSC. For the first time, it brought together potential GOs sites, GSC members, and a range of interested partner organizations. It thus represented the first meeting of the GOs Network (GOs1). Key outcomes include the formation of a core group of “champions” ready to take the GOs Network forward, as well as the formation of working groups. The workshop also served as the first meeting of a wide range of participants in the Ocean Sampling Day (OSD) initiative, a first GOs action. Three projects with complementary interests – COST Action ES1103, MG4U and Micro B3 – organized joint sessions at the workshop. A two-day GSC Hackathon followed the main three days of meetings.

  13. Integrated Genomic Analysis of Diverse Induced Pluripotent Stem Cells from the Progenitor Cell Biology Consortium.

    PubMed

    Salomonis, Nathan; Dexheimer, Phillip J; Omberg, Larsson; Schroll, Robin; Bush, Stacy; Huo, Jeffrey; Schriml, Lynn; Ho Sui, Shannan; Keddache, Mehdi; Mayhew, Christopher; Shanmukhappa, Shiva Kumar; Wells, James; Daily, Kenneth; Hubler, Shane; Wang, Yuliang; Zambidis, Elias; Margolin, Adam; Hide, Winston; Hatzopoulos, Antonis K; Malik, Punam; Cancelas, Jose A; Aronow, Bruce J; Lutzko, Carolyn

    2016-07-12

    The rigorous characterization of distinct induced pluripotent stem cells (iPSC) derived from multiple reprogramming technologies, somatic sources, and donors is required to understand potential sources of variability and downstream potential. To achieve this goal, the Progenitor Cell Biology Consortium performed comprehensive experimental and genomic analyses of 58 iPSC from ten laboratories generated using a variety of reprogramming genes, vectors, and cells. Associated global molecular characterization studies identified functionally informative correlations in gene expression, DNA methylation, and/or copy-number variation among key developmental and oncogenic regulators as a result of donor, sex, line stability, reprogramming technology, and cell of origin. Furthermore, X-chromosome inactivation in PSC produced highly correlated differences in teratoma-lineage staining and regulator expression upon differentiation. All experimental results, and raw, processed, and metadata from these analyses, including powerful tools, are interactively accessible from a new online portal at https://www.synapse.org to serve as a reusable resource for the stem cell community. PMID:27293150

  14. Genomic analysis reveals key aspects of prokaryotic symbiosis in the phototrophic consortium “Chlorochromatium aggregatum”

    PubMed Central

    2013-01-01

    Background: ‘Chlorochromatium aggregatum’ is a phototrophic consortium, a symbiosis that may represent the highest degree of mutual interdependence between two unrelated bacteria not associated with a eukaryotic host. ‘Chlorochromatium aggregatum’ is a motile, barrel-shaped aggregate formed from a single cell of ‘Candidatus Symbiobacter mobilis’, a polarly flagellated, non-pigmented, heterotrophic bacterium, which is surrounded by approximately 15 epibiont cells of Chlorobium chlorochromatii, a non-motile photolithoautotrophic green sulfur bacterium. Results: We analyzed the complete genome sequences of both organisms to understand the basis for this symbiosis. Chl. chlorochromatii has acquired relatively few symbiosis-specific genes; most acquired genes are predicted to modify the cell wall or function in cell-cell adhesion. In striking contrast, ‘Ca. S. mobilis’ appears to have undergone massive gene loss, is probably no longer capable of independent growth, and thus may only reproduce when consortia divide. A detailed model for the energetic and metabolic bases of the dependency of ‘Ca. S. mobilis’ on Chl. chlorochromatii is described. Conclusions: Genomic analyses suggest that three types of interactions lead to a highly sophisticated relationship between these two organisms. Firstly, extensive metabolic exchange, involving carbon, nitrogen, and sulfur sources as well as vitamins, occurs from the epibiont to the central bacterium. Secondly, ‘Ca. S. mobilis’ can sense and move towards light and sulfide, resources that only directly benefit the epibiont. Thirdly, electron cycling mechanisms, particularly those mediated by quinones and potentially involving shared protonmotive force, could provide an important basis for energy exchange in this and other symbiotic relationships. PMID:24267588

  15. Athlome Project Consortium: a concerted effort to discover genomic and other "omic" markers of athletic performance.

    PubMed

    Pitsiladis, Yannis P; Tanaka, Masashi; Eynon, Nir; Bouchard, Claude; North, Kathryn N; Williams, Alun G; Collins, Malcolm; Moran, Colin N; Britton, Steven L; Fuku, Noriyuki; Ashley, Euan A; Klissouras, Vassilis; Lucia, Alejandro; Ahmetov, Ildus I; de Geus, Eco; Alsayrafi, Mohammed

    2016-03-01

    Despite numerous attempts to discover genetic variants associated with elite athletic performance, injury predisposition, and elite/world-class athletic status, there has been limited progress to date. Past reliance on candidate gene studies predominantly focusing on genotyping a limited number of single nucleotide polymorphisms or the insertion/deletion variants in small, often heterogeneous cohorts (i.e., made up of athletes of quite different sport specialties) has not generated the kind of results that could offer solid opportunities to bridge the gap between basic research in exercise sciences and deliverables in biomedicine. A retrospective view of genetic association studies with complex disease traits indicates that transition to hypothesis-free genome-wide approaches will be more fruitful. In studies of complex disease, it is well recognized that the magnitude of genetic association is often smaller than initially anticipated, and, as such, large sample sizes are required to identify the gene effects robustly. A symposium was held in Athens and on the Greek island of Santorini from 14-17 May 2015 to review the main findings in exercise genetics and genomics and to explore promising trends and possibilities. The symposium also offered a forum for the development of a position stand (the Santorini Declaration). Among the participants, many were involved in ongoing collaborative studies (e.g., ELITE, GAMES, Gene SMART, GENESIS, and POWERGENE). A consensus emerged among participants that it would be advantageous to bring together all current studies and those recently launched into one new large collaborative initiative, which was subsequently named the Athlome Project Consortium. PMID:26715623

  16. Global efforts in structural genomics.

    PubMed

    Stevens, R C; Yokoyama, S; Wilson, I A

    2001-10-01

    A worldwide initiative in structural genomics aims to capitalize on the recent successes of the genome projects. Substantial new investments in structural genomics in the past 2 years indicate the high level of support for these international efforts. Already, enormous progress has been made on high-throughput methodologies and technologies that will speed up macromolecular structure determinations. Recent international meetings have resulted in the formation of an International Structural Genomics Organization to formulate policy and foster cooperation between the public and private efforts. PMID:11588249

  17. DNA Methylation in Newborns and Maternal Smoking in Pregnancy: Genome-wide Consortium Meta-analysis.

    PubMed

    Joubert, Bonnie R; Felix, Janine F; Yousefi, Paul; Bakulski, Kelly M; Just, Allan C; Breton, Carrie; Reese, Sarah E; Markunas, Christina A; Richmond, Rebecca C; Xu, Cheng-Jian; Küpers, Leanne K; Oh, Sam S; Hoyo, Cathrine; Gruzieva, Olena; Söderhäll, Cilla; Salas, Lucas A; Baïz, Nour; Zhang, Hongmei; Lepeule, Johanna; Ruiz, Carlos; Ligthart, Symen; Wang, Tianyuan; Taylor, Jack A; Duijts, Liesbeth; Sharp, Gemma C; Jankipersadsing, Soesma A; Nilsen, Roy M; Vaez, Ahmad; Fallin, M Daniele; Hu, Donglei; Litonjua, Augusto A; Fuemmeler, Bernard F; Huen, Karen; Kere, Juha; Kull, Inger; Munthe-Kaas, Monica Cheng; Gehring, Ulrike; Bustamante, Mariona; Saurel-Coubizolles, Marie José; Quraishi, Bilal M; Ren, Jie; Tost, Jörg; Gonzalez, Juan R; Peters, Marjolein J; Håberg, Siri E; Xu, Zongli; van Meurs, Joyce B; Gaunt, Tom R; Kerkhof, Marjan; Corpeleijn, Eva; Feinberg, Andrew P; Eng, Celeste; Baccarelli, Andrea A; Benjamin Neelon, Sara E; Bradman, Asa; Merid, Simon Kebede; Bergström, Anna; Herceg, Zdenko; Hernandez-Vargas, Hector; Brunekreef, Bert; Pinart, Mariona; Heude, Barbara; Ewart, Susan; Yao, Jin; Lemonnier, Nathanaël; Franco, Oscar H; Wu, Michael C; Hofman, Albert; McArdle, Wendy; Van der Vlies, Pieter; Falahi, Fahimeh; Gillman, Matthew W; Barcellos, Lisa F; Kumar, Ashish; Wickman, Magnus; Guerra, Stefano; Charles, Marie-Aline; Holloway, John; Auffray, Charles; Tiemeier, Henning W; Smith, George Davey; Postma, Dirkje; Hivert, Marie-France; Eskenazi, Brenda; Vrijheid, Martine; Arshad, Hasan; Antó, Josep M; Dehghan, Abbas; Karmaus, Wilfried; Annesi-Maesano, Isabella; Sunyer, Jordi; Ghantous, Akram; Pershagen, Göran; Holland, Nina; Murphy, Susan K; DeMeo, Dawn L; Burchard, Esteban G; Ladd-Acosta, Christine; Snieder, Harold; Nystad, Wenche; Koppelman, Gerard H; Relton, Caroline L; Jaddoe, Vincent W V; Wilcox, Allen; Melén, Erik; London, Stephanie J

    2016-04-01

    Epigenetic modifications, including DNA methylation, represent a potential mechanism for environmental impacts on human disease. Maternal smoking in pregnancy remains an important public health problem that impacts child health in a myriad of ways and has potential lifelong consequences. The mechanisms are largely unknown, but epigenetics most likely plays a role. We formed the Pregnancy And Childhood Epigenetics (PACE) consortium and meta-analyzed, across 13 cohorts (n = 6,685), the association between maternal smoking in pregnancy and newborn blood DNA methylation at over 450,000 CpG sites (CpGs) by using the Illumina 450K BeadChip. Over 6,000 CpGs were differentially methylated in relation to maternal smoking at genome-wide statistical significance (false discovery rate, 5%), including 2,965 CpGs corresponding to 2,017 genes not previously related to smoking and methylation in either newborns or adults. Several genes are relevant to diseases that can be caused by maternal smoking (e.g., orofacial clefts and asthma) or adult smoking (e.g., certain cancers). A number of differentially methylated CpGs were associated with gene expression. We observed enrichment in pathways and processes critical to development. In older children (5 cohorts, n = 3,187), 100% of CpGs gave at least nominal levels of significance, far more than expected by chance (p value < 2.2 × 10^-16). Results were robust to different normalization methods used across studies and cell type adjustment. In this large scale meta-analysis of methylation data, we identified numerous loci involved in response to maternal smoking in pregnancy with persistence into later childhood and provide insights into mechanisms underlying effects of this important exposure. PMID:27040690
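
    The abstract above reports significance at a false discovery rate of 5% but does not name the procedure used. As an illustration only, the following minimal sketch assumes the widely used Benjamini-Hochberg step-up procedure; the function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the test is rejected at FDR alpha.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract does not
    say which FDR method the consortium applied.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank r such that p_(r) <= (r / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for four CpG sites.
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.5], alpha=0.05)
```

    The step-up rule rejects everything up to the largest rank that passes its threshold, so a site can be rejected even when its own p-value is above the threshold for its rank, provided a higher-ranked site passes.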

  18. Genome Clone Libraries and Data from the Integrated Molecular Analysis of Genomes and their Expression (I.M.A.G.E.) Consortium

    DOE Data Explorer

    The I.M.A.G.E. Consortium was initiated in 1993 by four academic groups on a collaborative basis after informal discussions led to a common vision of how to achieve an important goal in the study of the human genome. The Integrated Molecular Analysis of Genomes and their Expression Consortium's primary goal is to create arrayed cDNA libraries and associated bioinformatics tools, and make them publicly available to the research community. The primary organisms of interest include intensively studied mammalian species, including human, mouse, rat and non-human primate species. The Consortium has also focused on several commonly studied model organisms; as part of this effort it has arrayed cDNAs from zebrafish and Fugu (pufferfish) as well as Xenopus laevis and X. tropicalis (frog). Utilizing high-speed robotics, over nine million individual cDNA clones have been arrayed into 384-well microtiter plates, and sufficient replicas have been created to distribute copies both to sequencing centers and to a network of five distributors located worldwide. The I.M.A.G.E. Consortium represents the world's largest public cDNA collection, and works closely with the National Institutes of Health's Mammalian Gene Collection (MGC) to help it achieve its goal of creating a full-length cDNA clone for every human and mouse gene. I.M.A.G.E. is also a member of the ORFeome Collaboration, working to generate a complete set of expression-ready open reading frame clones representing each human gene. Custom informatics tools have been developed in support of these projects to better allow the research community to select clones of interest and track and collect all data deposited into public databases about those clones and their related sequences. I.M.A.G.E. clones are publicly available, free of any royalties, and may be used by anyone agreeing with the Consortium's guidelines.

  19. Complete Genome Sequence of a Phenanthrene Degrader, Mycobacterium sp. Strain EPa45 (NBRC 110737), Isolated from a Phenanthrene-Degrading Consortium

    PubMed Central

    Kato, Hiromi; Ogawa, Natsumi; Ohtsubo, Yoshiyuki; Oshima, Kenshiro; Toyoda, Atsushi; Mori, Hiroshi; Nagata, Yuji; Kurokawa, Ken; Hattori, Masahira; Fujiyama, Asao

    2015-01-01

    A phenanthrene degrader, Mycobacterium sp. EPa45, was isolated from a phenanthrene-degrading consortium. Here, we report the complete genome sequence of EPa45, which has a 6.2-Mb single circular chromosome. We propose a phenanthrene degradation pathway in EPa45 based on the complete genome sequence. PMID:26184940

  20. Informational laws of genome structures

    PubMed Central

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-01-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = log_2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws that all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined. PMID:27354155
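
    To make the k-mer idea in the abstract above concrete, the following minimal sketch counts the k-mers of a sequence and computes the Shannon entropy of their empirical distribution, with k taken as the base-2 logarithm of the sequence length. It is not the authors' implementation: the function names, the rounding of k to the nearest integer, and the toy sequence are assumptions, and the paper's specific indexes and comparisons with random genomes are not reproduced.

```python
import math
from collections import Counter


def suggested_k(genome_length: int) -> int:
    """k = log2(n); rounding to the nearest integer is an assumption here."""
    return max(1, round(math.log2(genome_length)))


def kmer_entropy(genome: str, k: int) -> float:
    """Shannon entropy, in bits, of the empirical k-mer distribution."""
    # Slide a window of length k over the sequence and count each k-mer.
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy sequence (hypothetical): 256 bases, so log2(n) = 8.
toy = "ACGT" * 64
k = suggested_k(len(toy))
h = kmer_entropy(toy, k)
```

    A strongly periodic toy sequence like this one contains only four distinct 8-mers, so its entropy stays close to 2 bits; a real genome of the same length would typically show far more distinct k-mers and a higher value.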

  1. Informational laws of genome structures.

    PubMed

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-01-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = log_2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws that all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined. PMID:27354155

  2. Informational laws of genome structures

    NASA Astrophysics Data System (ADS)

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-06-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = log_2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws that all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.

  3. Legal aspects of genetic databases for international biomedical research: the example of the International Cancer Genome Consortium (ICGC).

    PubMed

    Romeo-Casabona, Carlos; Nicolás, Pilar; Knoppers, Bartha Maria; Joly, Yann; Wallace, Susan E; Chalmers, Don; Dyke, Stephanie; Kennedy, Karen; Troncoso, Antonio; Kaan, Terry; Rial-Sebbag, Emmanuelle

    2012-01-01

    There is a noticeable lack of international regulation on personal data exchange and management in research. This article sheds light in this area by describing how the International Cancer Genome Consortium is developing policies and procedures to address the ethical and legal issues raised by the international transfer of data and results. These policies and procedures aim, first and most importantly, to safeguard the interests of the research participants and other involved stakeholders and, secondly, to facilitate the sharing of data and results to realize greater benefits from this kind of internationally collaborative genetic research. PMID:23520913

  4. GCAT-SEEKquence: Genome Consortium for Active Teaching of Undergraduates through Increased Faculty Access to Next-Generation Sequencing Data

    PubMed Central

    Buonaccorsi, Vincent P.; Boyle, Michael D.; Grove, Deborah; Praul, Craig; Sakk, Eric; Stuart, Ash; Tobin, Tammy; Hosler, Jay; Carney, Susan L.; Engle, Michael J.; Overton, Barry E.; Newman, Jeffrey D.; Pizzorno, Marie; Powell, Jennifer R.; Trun, Nancy

    2011-01-01

    To transform undergraduate biology education, faculty need to provide opportunities for students to engage in the process of science. The rise of research approaches using next-generation (NextGen) sequencing has been impressive, but incorporation of such approaches into the undergraduate curriculum remains a major challenge. In this paper, we report proceedings of a National Science Foundation–funded workshop held July 11–14, 2011, at Juniata College. The purpose of the workshop was to develop a regional research coordination network for undergraduate biology education (RCN/UBE). The network is collaborating with a genome-sequencing core facility located at Pennsylvania State University (University Park) to enable undergraduate students and faculty at small colleges to access state-of-the-art sequencing technology. We aim to create a database of references, protocols, and raw data related to NextGen sequencing, and to find innovative ways to reduce costs related to sequencing and bioinformatics analysis. It was agreed that our regional network for NextGen sequencing could operate more effectively if it were partnered with the Genome Consortium for Active Teaching (GCAT) as a new arm of that consortium, entitled GCAT-SEEK(quence). This step would also permit the approach to be replicated elsewhere. PMID:22135368

  5. GCAT-SEEKquence: genome consortium for active teaching of undergraduates through increased faculty access to next-generation sequencing data.

    PubMed

    Buonaccorsi, Vincent P; Boyle, Michael D; Grove, Deborah; Praul, Craig; Sakk, Eric; Stuart, Ash; Tobin, Tammy; Hosler, Jay; Carney, Susan L; Engle, Michael J; Overton, Barry E; Newman, Jeffrey D; Pizzorno, Marie; Powell, Jennifer R; Trun, Nancy

    2011-01-01

    To transform undergraduate biology education, faculty need to provide opportunities for students to engage in the process of science. The rise of research approaches using next-generation (NextGen) sequencing has been impressive, but incorporation of such approaches into the undergraduate curriculum remains a major challenge. In this paper, we report proceedings of a National Science Foundation-funded workshop held July 11-14, 2011, at Juniata College. The purpose of the workshop was to develop a regional research coordination network for undergraduate biology education (RCN/UBE). The network is collaborating with a genome-sequencing core facility located at Pennsylvania State University (University Park) to enable undergraduate students and faculty at small colleges to access state-of-the-art sequencing technology. We aim to create a database of references, protocols, and raw data related to NextGen sequencing, and to find innovative ways to reduce costs related to sequencing and bioinformatics analysis. It was agreed that our regional network for NextGen sequencing could operate more effectively if it were partnered with the Genome Consortium for Active Teaching (GCAT) as a new arm of that consortium, entitled GCAT-SEEK(quence). This step would also permit the approach to be replicated elsewhere. PMID:22135368

  6. Genome Sequence of Bacillus endophyticus and Analysis of Its Companion Mechanism in the Ketogulonigenium vulgare-Bacillus Strain Consortium.

    PubMed

    Jia, Nan; Du, Jin; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2015-01-01

    Bacillus strains have been widely used as the companion strain of Ketogulonigenium vulgare in the process of vitamin C fermentation. Different Bacillus strains generate different effects on the growth of K. vulgare and ultimately influence the productivity. First, we identified that Bacillus endophyticus Hbe603 was an appropriate strain to cooperate with K. vulgare and the product conversion rate exceeded 90% in industrial vitamin C fermentation. Here, we report the genome sequencing of the B. endophyticus Hbe603 industrial companion strain and speculate on its possible advantage in the consortium. The circular chromosome of B. endophyticus Hbe603 has a size of 4.87 Mb with GC content of 36.64% and has the highest similarity with that of Bacillus megaterium among all the bacteria with complete genomes. By comparing the distribution of COGs with that of Bacillus thuringiensis, Bacillus cereus and B. megaterium, B. endophyticus has fewer genes related to cell envelope biogenesis and signal transduction mechanisms, and more genes related to carbohydrate transport and metabolism, energy production and conversion, as well as lipid transport and metabolism. Genome-based functional studies revealed the specific capability of B. endophyticus in sporulation, transcription regulation, environmental resistance, membrane transportation, extracellular proteins and nutrients synthesis, which would be beneficial for K. vulgare. In particular, B. endophyticus lacks the Rap-Phr signal cascade system and, in part, spore coat related proteins. In addition, it has specific pathways for vitamin B12 synthesis and sorbitol metabolism. The genome analysis of the industrial B. endophyticus will help us understand its cooperative mechanism in the K. vulgare-Bacillus strain consortium to improve the fermentation of vitamin C. PMID:26248285
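
    The GC content quoted in the abstract above is the percentage of G and C bases in the chromosome sequence. As a minimal sketch of how such a figure can be computed, the function below counts bases in a plain sequence string; the function name, the choice to ignore ambiguous bases such as N, and the toy inputs are assumptions, and this is not the authors' pipeline.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    # Ambiguous symbols (for example N) are left out of the denominator.
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)


# Toy sequence (hypothetical), not the B. endophyticus chromosome.
toy_gc = gc_content("GGCCAATT")
```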

  7. Genome Sequence of Bacillus endophyticus and Analysis of Its Companion Mechanism in the Ketogulonigenium vulgare-Bacillus Strain Consortium

    PubMed Central

    Jia, Nan; Du, Jin; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2015-01-01

    Bacillus strains have been widely used as the companion strain of Ketogulonigenium vulgare in the process of vitamin C fermentation. Different Bacillus strains generate different effects on the growth of K. vulgare and ultimately influence the productivity. First, we identified that Bacillus endophyticus Hbe603 was an appropriate strain to cooperate with K. vulgare and the product conversion rate exceeded 90% in industrial vitamin C fermentation. Here, we report the genome sequencing of the B. endophyticus Hbe603 industrial companion strain and speculate on its possible advantage in the consortium. The circular chromosome of B. endophyticus Hbe603 has a size of 4.87 Mb with a GC content of 36.64% and has the highest similarity with that of Bacillus megaterium among all the bacteria with complete genomes. Compared with the COG distributions of Bacillus thuringiensis, Bacillus cereus and B. megaterium, B. endophyticus has fewer genes related to cell envelope biogenesis and signal transduction mechanisms, and more genes related to carbohydrate transport and metabolism, energy production and conversion, as well as lipid transport and metabolism. Genome-based functional studies revealed the specific capability of B. endophyticus in sporulation, transcription regulation, environmental resistance, membrane transportation, extracellular proteins and nutrient synthesis, which would be beneficial for K. vulgare. In particular, B. endophyticus lacks the Rap-Phr signal cascade system and, in part, spore coat related proteins. In addition, it has specific pathways for vitamin B12 synthesis and sorbitol metabolism. The genome analysis of the industrial B. endophyticus will help us understand its cooperative mechanism in the K. vulgare-Bacillus strain consortium to improve the fermentation of vitamin C. PMID:26248285

  8. New York-Structural GenomiX Research Consortium (NYSGXRC): a Large Scale Center for the Protein Structure Initiative

    SciTech Connect

    Bonanno, J.; Almo, S.; Bresnick, A.; Chance, M.; Fiser, A.; Swaminathan, S.; Jiang, J.; Studier, F.; Shapiro, L.; et al.

    2005-01-01

    Structural GenomiX, Inc. (SGX), four New York area institutions, and two University of California schools have formed the New York Structural GenomiX Research Consortium (NYSGXRC), an industrial/academic Research Consortium that exploits individual core competencies to support all aspects of the NIH-NIGMS funded Protein Structure Initiative (PSI), including protein family classification and target selection, generation of protein for biophysical analyses, sample preparation for structural studies, structure determination and analyses, and dissemination of results. At the end of the PSI Pilot Study Phase (PSI-1), the NYSGXRC will be capable of producing 100-200 experimentally determined protein structures annually. All Consortium activities can be scaled to increase production capacity significantly during the Production Phase of the PSI (PSI-2). The Consortium utilizes both centralized and de-centralized production teams with clearly defined deliverables and hand-off procedures that are supported by a web-based target/sample tracking system (SGX Laboratory Information Data Management System, LIMS, and NYSGXRC Internal Consortium Experimental Database, ICE-DB). Consortium management is provided by an Executive Committee, which is composed of the PI and all Co-PIs. Progress to date is tracked on a publicly available Consortium web site (http://www.nysgxrc.org) and all DNA/protein reagents and experimental protocols are distributed freely from the New York City Area institutions. In addition to meeting the requirements of the Pilot Study Phase and preparing for the Production Phase of the PSI, the NYSGXRC aims to develop modular technologies that are transferable to structural biology laboratories in both academe and industry. The NYSGXRC PI and Co-PIs intend the PSI to have a transforming effect on the disciplines of X-ray crystallography and NMR spectroscopy of biological macromolecules. Working with other PSI-funded Centers, the NYSGXRC seeks to create the

  9. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation of the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to the reference annotation.
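
    One common way to quantify discrepancies between a predicted annotation and a reference annotation is nucleotide-level sensitivity and precision. The sketch below is an illustration under stated assumptions, not the method used in the paper: the interval representation (1-based, inclusive coordinates on a single sequence), the function names and the toy coordinates are all hypothetical, and real Fgenesh, GenScan or GeneMark output would first need to be parsed into such intervals.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: 1-based inclusive (start, end) on one sequence.
Interval = Tuple[int, int]


def covered_positions(intervals: Iterable[Interval]) -> Set[int]:
    """Expand intervals into the set of nucleotide positions they cover."""
    positions: Set[int] = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(
    predicted: Iterable[Interval], reference: Iterable[Interval]
) -> Tuple[float, float]:
    """Return (sensitivity, precision) of predicted vs. reference coverage.

    sensitivity = shared positions / reference positions
    precision   = shared positions / predicted positions
    """
    pred = covered_positions(predicted)
    ref = covered_positions(reference)
    shared = len(pred & ref)
    sensitivity = shared / len(ref) if ref else 0.0
    precision = shared / len(pred) if pred else 0.0
    return sensitivity, precision


# Toy coordinates for illustration only.
reference_genes = [(100, 199), (300, 399)]
predicted_genes = [(150, 249), (300, 399)]
sens, prec = nucleotide_accuracy(predicted_genes, reference_genes)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

    Expanding intervals into position sets is simple but memory-hungry; for genome-scale data an interval-merging approach would be preferable.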

  10. Comparative genetic mapping between clementine, pummelo and sweet orange and the interspecific structure of the Clementine genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The availability of a saturated genetic map of Clementine was identified by the International Citrus Genome Consortium as an essential prerequisite to assist the assembly...

  11. Structural Genomics of Protein Phosphatases

    SciTech Connect

    Almo, S.; Bonanno, J.; Sauder, J.; Emtage, S.; Dilorenzo, T.; Malashkevich, V.; Wasserman, S.; Swaminathan, S.; Eswaramoorthy, S.; et al.

    2007-01-01

    The New York SGX Research Center for Structural Genomics (NYSGXRC) of the NIGMS Protein Structure Initiative (PSI) has applied its high-throughput X-ray crystallographic structure determination platform to systematic studies of all human protein phosphatases and protein phosphatases from biomedically-relevant pathogens. To date, the NYSGXRC has determined structures of 21 distinct protein phosphatases: 14 from human, 2 from mouse, 2 from the pathogen Toxoplasma gondii, 1 from Trypanosoma brucei, the parasite responsible for African sleeping sickness, and 2 from the principal mosquito vector of malaria in Africa, Anopheles gambiae. These structures provide insights into both normal and pathophysiologic processes, including transcriptional regulation, regulation of major signaling pathways, neural development, and type 1 diabetes. In conjunction with the contributions of other international structural genomics consortia, these efforts promise to provide an unprecedented database and materials repository for structure-guided experimental and computational discovery of inhibitors for all classes of protein phosphatases.

  12. The global cancer genomics consortium's symposium: new era of molecular medicine and epigenetic cancer medicine - cross section of genomics and epigenetics

    PubMed Central

    Toi, Masakazu; Pillai, M. Radhakrishna; Gupta, Sudeep; Badwe, Rajendra; Carmo-Fonseca, Maria; Costa, Luis; Chow, Louis WC; Knapp, Stefan; Kumar, Rakesh

    2015-01-01

    The Global Cancer Genomics Consortium (GCGC) colleagues continue to function together as an interactive multidisciplinary team of cancer biologists and oncologists with interests in genomics, building a bidirectional bridge between cancer clinics and laboratories while taking advantage of shared resources among its member scientists. The GCGC includes member scientists from six institutions in Lisbon, the United Kingdom, Japan, India and the United States, and was formed in December 2010 for a period of five years. Driven by valuable lessons learned from the previous symposiums, the fourth GCGC Symposium focused on a cross section of genomic and epigenetic cancer medicine, and for this reason we chose the conference theme: New Era of Molecular Medicine and Epigenetic Cancer Medicine: Cross Section of Genomics and Epigenetics. This year's symposium was co-organized by the Organization for Oncology and Translational Research (OOTR) and held at the Shiran Hall, Kyoto University, Kyoto, Japan, on November 14 and 15, 2014. The symposium attracted around 80 participants from 14 countries and featured 23 invited platform speakers. Scientific sessions included eight platform sessions, one poster session and three plenary lectures. The symposium focused on cancer stem cells and self-renewal, cancer transcriptome, tumor heterogeneity, tumor biology, breast cancer genomics, targeted therapeutics and personalized medicine. The issues of cancer stem cells and tumor heterogeneity were echoed in most of the scientific presentations. The meeting concluded with an oral presentation by the best poster awardee and closing remarks by the meeting co-chairs.

  13. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium

    PubMed Central

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in the two-step fermentation of vitamin C. The two Bacillus species have different morphologies, swarming motility and 2-keto-L-gulonic acid productivities when co-cultured with K. vulgare. Here, we report the complete genome sequencing of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomics analysis. B. thuringiensis Bc601, with a greater ability to respond to the external environment, was found to have more proteins related to two-component systems, sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. Conversely, B. endophyticus Hbe603, with a greater ability for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. The differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms of the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis enables us to further understand the cooperative mechanism with K. vulgare and facilitates the optimization of the bacterial consortium. PMID:27353048

  14. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium.

    PubMed

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in the two-step fermentation of vitamin C. The two Bacillus species have different morphologies, swarming motility and 2-keto-L-gulonic acid productivities when co-cultured with K. vulgare. Here, we report the complete genome sequencing of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomics analysis. B. thuringiensis Bc601, with a greater ability to respond to the external environment, was found to have more proteins related to two-component systems, sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. Conversely, B. endophyticus Hbe603, with a greater ability for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. The differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms of the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis enables us to further understand the cooperative mechanism with K. vulgare and facilitates the optimization of the bacterial consortium. PMID:27353048

  15. 1000 Genomes-based imputation identifies novel and refined associations for the Wellcome Trust Case Control Consortium phase 1 Data.

    PubMed

    Huang, Jie; Ellinghaus, David; Franke, Andre; Howie, Bryan; Li, Yun

    2012-07-01

    We hypothesize that imputation based on data from the 1000 Genomes Project can identify novel association signals on a genome-wide scale due to the dense marker map and the large number of haplotypes. To test the hypothesis, the Wellcome Trust Case Control Consortium (WTCCC) Phase I genotype data were imputed using 1000 genomes as reference (20100804 EUR), and seven case/control association studies were performed using imputed dosages. We observed two 'missed' disease-associated variants that were undetectable by the original WTCCC analysis, but were reported by later studies after the 2007 WTCCC publication. One is within the IL2RA gene for association with type 1 diabetes and the other in proximity with the CDKN2B gene for association with type 2 diabetes. We also identified two refined associations. One is SNP rs11209026 in exon 9 of IL23R for association with Crohn's disease, which is predicted to be probably damaging by PolyPhen2. The other refined variant is in the CUX2 gene region for association with type 1 diabetes, where the newly identified top SNP rs1265564 has an association P-value of 1.68 × 10^-16. The new lead SNP for the two refined loci provides a more plausible explanation for the disease association. We demonstrated that 1000 Genomes-based imputation could indeed identify both novel (in our case, 'missed' because they were detected and replicated by studies after 2007) and refined signals. We anticipate the findings derived from this study to provide timely information when individual groups and consortia are beginning to engage in 1000 genomes-based imputation. PMID:22293688

  16. The International Barley Sequencing Consortium — At the Threshold of Efficient Access to the Barley Genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Sequencing the genome of barley, an agriculturally and industrially important cereal crop and a useful diploid model for bread wheat, has become a realistic undertaking. Important steps have been initiated to improve genomics tools, build and anchor a physical map, develop a high-density genetic ma...

  17. Unmet Challenges of Structural Genomics

    PubMed Central

    Chruszcz, Maksymilian; Domagalski, Marcin; Osinski, Tomasz; Wlodawer, Alexander; Minor, Wladek

    2010-01-01

    Summary Structural genomics (SG) programs have developed during the last decade many novel methodologies for faster and more accurate structure determination. These new tools and approaches led to determination of thousands of protein structures. The generation of enormous amounts of experimental data resulted in significant improvements in the understanding of many biological processes at molecular levels. However, the amount of data collected so far is so large that traditional analysis methods are limiting the rate of extraction of biological and biochemical information from 3-D models. This situation has prompted us to review the challenges that remain unmet by structural genomics, as well as the areas in which the potential impact of SG could exceed what has been achieved so far. PMID:20810277

  18. 78 FR 47674 - Genome in a Bottle Consortium-Progress and Planning Workshop

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-06

    ...: select appropriate sources for whole genome RMs and identify or design synthetic DNA constructs that... and synthetic DNA RMs along with the methods (documentary standards) and reference data necessary...

  19. Genome-wide Association Studies of MRI-defined Brain Infarcts: Meta-analysis from the CHARGE Consortium

    PubMed Central

    Debette, Stephanie; Bis, Joshua C.; Fornage, Myriam; Schmidt, Helena; Ikram, M. Arfan; Sigurdsson, Sigurdur; Heiss, Gerardo; Struchalin, Maksim; Smith, Albert V.; van der Lugt, Aad; DeCarli, Charles; Lumley, Thomas; Knopman, David S.; Enzinger, Christian; Eiriksdottir, Gudny; Koudstaal, Peter J.; DeStefano, Anita L.; Psaty, Bruce M.; Dufouil, Carole; Catellier, Diane J.; Fazekas, Franz; Aspelund, Thor; Aulchenko, Yurii S.; Beiser, Alexa; Rotter, Jerome I.; Tzourio, Christophe; Shibata, Dean K.; Tscherner, Maria; Harris, Tamara B.; Rivadeneira, Fernando; Atwood, Larry D.; Rice, Kenneth; Gottesman, Rebecca F.; van Buchem, Mark A.; Uitterlinden, Andre G.; Kelly-Hayes, Margaret; Cushman, Mary; Zhu, Yicheng; Boerwinkle, Eric; Gudnason, Vilmundur; Hofman, Albert; Romero, Jose R.; Lopez, Oscar; van Duijn, Cornelia M.; Au, Rhoda; Heckbert, Susan R.; Wolf, Philip A.; Mosley, Thomas H.; Seshadri, Sudha; Breteler, Monique M.B.; Schmidt, Reinhold; Launer, Lenore J.; Longstreth, WT

    2010-01-01

    Background: Previous studies examining genetic associations with MRI-defined brain infarct have yielded inconsistent findings. We investigated genetic variation underlying covert MRI-infarct in persons without histories of transient ischemic attack or stroke. We performed a meta-analysis of genome-wide association studies of white participants in 6 studies comprising the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium. Methods: Using 2.2 million genotyped and imputed SNPs, each study performed cross-sectional genome-wide association analysis of MRI-infarct using age- and sex-adjusted logistic regression models. Study-specific findings were combined in an inverse-variance weighted meta-analysis, including 9401 participants with mean age 69.7, 19.4% of whom had ≥1 MRI-infarct. Results: The most significant association was found with rs2208454 (minor allele frequency: 20%), located in intron 3 of the MACRO Domain Containing 2 gene and in the downstream region of the Fibronectin Leucine Rich Transmembrane Protein 3 gene. Each copy of the minor allele was associated with lower risk of MRI-infarcts: odds ratio=0.76, 95% confidence interval=0.68–0.84, p=4.64×10^-7. Highly suggestive associations (p<1.0×10^-5) were also found for 22 other SNPs in linkage disequilibrium (r^2>0.64) with rs2208454. The association with rs2208454 did not replicate in independent samples of 1822 white and 644 African-American participants, although 4 SNPs within 200 kb of rs2208454 were associated with MRI-infarcts in the African-American sample. Conclusions: This first community-based, genome-wide association study on covert MRI-infarcts uncovered novel associations. Although replication of the association with the top SNP failed, possibly due to insufficient power, results in the African-American sample are encouraging, and further efforts at replication are needed. PMID:20044523
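
    The inverse-variance weighted meta-analysis mentioned in the Methods can be illustrated with a short fixed-effect sketch. It assumes each study reports a log odds ratio and its standard error for the same SNP; the function name and the per-study numbers below are hypothetical and are not CHARGE results, and the study's actual software and settings are not stated in the abstract.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_meta(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float, float]:
    """Fixed-effect inverse-variance meta-analysis.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors (must be positive)
    Returns (pooled beta, pooled standard error, two-sided p-value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se**2 for se in ses]  # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, p


# Hypothetical log odds ratios and standard errors from three studies.
study_betas = [-0.30, -0.25, -0.28]
study_ses = [0.10, 0.12, 0.09]
beta, se, p = inverse_variance_meta(study_betas, study_ses)
low, high = math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)
print(f"OR={math.exp(beta):.2f} 95% CI={low:.2f}-{high:.2f} p={p:.2e}")
```

    Studies with smaller standard errors receive larger weights, so larger cohorts dominate the pooled estimate.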

  20. Structural variations in plant genomes

    PubMed Central

    Edwards, David; Varshney, Rajeev K.

    2014-01-01

    Differences between plant genomes range from single nucleotide polymorphisms to large-scale duplications, deletions and rearrangements. The large polymorphisms are termed structural variants (SVs). SVs have received significant attention in human genetics and were found to be responsible for various chronic diseases. However, little effort has been directed towards understanding the role of SVs in plants. Many recent advances in plant genetics have resulted from improvements in high-resolution technologies for measuring SVs, including microarray-based techniques, and more recently, high-throughput DNA sequencing. In this review we describe recent reports of SV in plants and describe the genomic technologies currently used to measure these SVs. PMID:24907366

  1. THE PLANT ONTOLOGY CONSORTIUM AND PLANT ONTOLOGIES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The goal of the Plant Ontology™ Consortium is to produce structured controlled vocabularies, arranged in ontologies, that can be applied to plant-based database information even as knowledge of the biology of the relevant plant taxa (e.g., development, anatomy, morphology, genomics, proteomics) is ...

  2. A genome-wide approach to children's aggressive behavior: The EAGLE consortium.

    PubMed

    Pappa, Irene; St Pourcain, Beate; Benke, Kelly; Cavadino, Alana; Hakulinen, Christian; Nivard, Michel G; Nolte, Ilja M; Tiesler, Carla M T; Bakermans-Kranenburg, Marian J; Davies, Gareth E; Evans, David M; Geoffroy, Marie-Claude; Grallert, Harald; Groen-Blokhuis, Maria M; Hudziak, James J; Kemp, John P; Keltikangas-Järvinen, Liisa; McMahon, George; Mileva-Seitz, Viara R; Motazedi, Ehsan; Power, Christine; Raitakari, Olli T; Ring, Susan M; Rivadeneira, Fernando; Rodriguez, Alina; Scheet, Paul A; Seppälä, Ilkka; Snieder, Harold; Standl, Marie; Thiering, Elisabeth; Timpson, Nicholas J; Veenstra, René; Velders, Fleur P; Whitehouse, Andrew J O; Smith, George Davey; Heinrich, Joachim; Hypponen, Elina; Lehtimäki, Terho; Middeldorp, Christel M; Oldehinkel, Albertine J; Pennell, Craig E; Boomsma, Dorret I; Tiemeier, Henning

    2016-07-01

    Individual differences in aggressive behavior emerge in early childhood and predict persisting behavioral problems and disorders. Studies of antisocial and severe aggression in adulthood indicate substantial underlying biology. However, little attention has been given to genome-wide approaches of aggressive behavior in children. We analyzed data from nine population-based studies and assessed aggressive behavior using well-validated parent-reported questionnaires. This is the largest sample exploring children's aggressive behavior to date (N = 18,988), with measures in two developmental stages (N = 15,668 early childhood and N = 16,311 middle childhood/early adolescence). First, we estimated the additive genetic variance of children's aggressive behavior based on genome-wide SNP information, using genome-wide complex trait analysis (GCTA). Second, genetic associations within each study were assessed using a quasi-Poisson regression approach, capturing the highly right-skewed distribution of aggressive behavior. Third, we performed meta-analyses of genome-wide associations for both the total age-mixed sample and the two developmental stages. Finally, we performed a gene-based test using the summary statistics of the total sample. GCTA quantified variance tagged by common SNPs (10-54%). The meta-analysis of the total sample identified one region in chromosome 2 (2p12) at near genome-wide significance (top SNP rs11126630, P = 5.30 × 10^-8). The separate meta-analyses of the two developmental stages revealed suggestive evidence of association at the same locus. The gene-based analysis indicated association of variation within AVPR1A with aggressive behavior. We conclude that common variants at 2p12 show suggestive evidence for association with childhood aggression. Replication of these initial findings is needed, and further studies should clarify its biological meaning. © 2015 Wiley Periodicals, Inc. PMID:26087016

  3. 2004 Structural, Function and Evolutionary Genomics

    SciTech Connect

    Brutlag, Douglas L.; Gray, Nancy Ryan

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  4. Novel Loci Associated with Usual Sleep Duration: The CHARGE Consortium Genome-Wide Association Study

    PubMed Central

    Gottlieb, Daniel J.; Hek, Karin; Chen, Ting-hsu; Watson, Nathaniel F.; Eiriksdottir, Gudny; Byrne, Enda M.; Cornelis, Marilyn; Warby, Simon C.; Bandinelli, Stefania; Cherkas, Lynn; Evans, Daniel S.; Grabe, Hans J.; Lahti, Jari; Li, Man; Lehtimäki, Terho; Lumley, Thomas; Marciante, Kristin D.; Pérusse, Louis; Psaty, Bruce M.; Robbins, John; Tranah, Gregory J.; Vink, Jacqueline M.; Wilk, Jemma B.; Stafford, Jeanette M.; Bellis, Claire; Biffar, Reiner; Bouchard, Claude; Cade, Brian; Curhan, Gary C.; Eriksson, Johan G.; Ewert, Ralf; Ferrucci, Luigi; Fülöp, Tibor; Gehrman, Philip R.; Goodloe, Robert; Harris, Tamara B.; Heath, Andrew C.; Hernandez, Dena; Hofman, Albert; Hottenga, Jouke-Jan; Hunter, David J.; Jensen, Majken K.; Johnson, Andrew D.; Kähönen, Mika; Kao, Linda; Kraft, Peter; Larkin, Emma K.; Lauderdale, Diane S.; Luik, Annemarie I.; Medici, Marco; Montgomery, Grant W.; Palotie, Aarno; Patel, Sanjay R.; Pistis, Giorgio; Porcu, Eleonora; Quaye, Lydia; Raitakari, Olli; Redline, Susan; Rimm, Eric B.; Rotter, Jerome I.; Smith, Albert V.; Spector, Tim D.; Teumer, Alexander; Uitterlinden, André G.; Vohl, Marie-Claude; Widen, Elisabeth; Willemsen, Gonneke; Young, Terry; Zhang, Xiaoling; Liu, Yongmei; Blangero, John; Boomsma, Dorret I.; Gudnason, Vilmundur; Hu, Frank; Mangino, Massimo; Martin, Nicholas G.; O’Connor, George T.; Stone, Katie L.; Tanaka, Toshiko; Viikari, Jorma; Gharib, Sina A.; Punjabi, Naresh M.; Räikkönen, Katri; Völzke, Henry; Mignot, Emmanuel; Tiemeier, Henning

    2015-01-01

    Usual sleep duration is a heritable trait correlated with psychiatric morbidity, cardiometabolic disease and mortality, although little is known about the genetic variants influencing this trait. A genome-wide association study of usual sleep duration was conducted using 18 population-based cohorts totaling 47,180 individuals of European ancestry. Genome-wide significant association was identified at two loci. The strongest is located on chromosome 2, in an intergenic region 35–80 kb upstream from the thyroid-specific transcription factor PAX8 (lowest p=1.1 × 10^-9). This finding was replicated in an African-American sample of 4771 individuals (lowest p=9.3 × 10^-4). The strongest combined association was at rs1823125 (p=1.5 × 10^-10, minor allele frequency 0.26 in the discovery sample, 0.12 in the replication sample), with each copy of the minor allele associated with a sleep duration 3.1 minutes longer per night. The alleles associated with longer sleep duration were associated in previous genome-wide association studies with a more favorable metabolic profile and a lower risk of attention deficit hyperactivity disorder. Understanding the mechanisms underlying these associations may help elucidate biological mechanisms influencing sleep duration and its association with psychiatric, metabolic and cardiovascular disease. PMID:25469926

  5. Genome-wide Studies of Verbal Declarative Memory in Nondemented Older People: The Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium

    PubMed Central

    Debette, Stéphanie; Ibrahim Verbaas, Carla A.; Bressler, Jan; Schuur, Maaike; Smith, Albert; Bis, Joshua C.; Davies, Gail; Wolf, Christiane; Gudnason, Vilmundur; Chibnik, Lori B.; Yang, Qiong; deStefano, Anita L.; de Quervain, Dominique J.F.; Srikanth, Velandai; Lahti, Jari; Grabe, Hans J.; Smith, Jennifer A.; Priebe, Lutz; Yu, Lei; Karbalai, Nazanin; Hayward, Caroline; Wilson, James F.; Campbell, Harry; Petrovic, Katja; Fornage, Myriam; Chauhan, Ganesh; Yeo, Robin; Boxall, Ruth; Becker, James; Stegle, Oliver; Mather, Karen A.; Chouraki, Vincent; Sun, Qi; Rose, Lynda M.; Resnick, Susan; Oldmeadow, Christopher; Kirin, Mirna; Wright, Alan F.; Jonsdottir, Maria K.; Au, Rhoda; Becker, Albert; Amin, Najaf; Nalls, Mike A.; Turner, Stephen T.; Kardia, Sharon L.R.; Oostra, Ben; Windham, Gwen; Coker, Laura H.; Zhao, Wei; Knopman, David S.; Heiss, Gerardo; Griswold, Michael E.; Gottesman, Rebecca F.; Vitart, Veronique; Hastie, Nicholas D.; Zgaga, Lina; Rudan, Igor; Polasek, Ozren; Holliday, Elizabeth G.; Schofield, Peter; Choi, Seung Hoan; Tanaka, Toshiko; An, Yang; Perry, Rodney T.; Kennedy, Richard E.; Sale, Michèle M.; Wang, Jing; Wadley, Virginia G.; Liewald, David C.; Ridker, Paul M.; Gow, Alan J.; Pattie, Alison; Starr, John M.; Porteous, David; Liu, Xuan; Thomson, Russell; Armstrong, Nicola J.; Eiriksdottir, Gudny; Assareh, Arezoo A.; Kochan, Nicole A.; Widen, Elisabeth; Palotie, Aarno; Hsieh, Yi-Chen; Eriksson, Johan G.; Vogler, Christian; van Swieten, John C.; Shulman, Joshua M.; Beiser, Alexa; Rotter, Jerome; Schmidt, Carsten O.; Hoffmann, Wolfgang; Nöthen, Markus M.; Ferrucci, Luigi; Attia, John; Uitterlinden, Andre G.; Amouyel, Philippe; Dartigues, Jean-François; Amieva, Hélène; Räikkönen, Katri; Garcia, Melissa; Wolf, Philip A.; Hofman, Albert; Longstreth, W.T.; Psaty, Bruce M.; Boerwinkle, Eric; DeJager, Philip L.; Sachdev, Perminder S.; Schmidt, Reinhold; Breteler, Monique M.B.; Teumer, Alexander; Lopez, Oscar L.; Cichon, Sven; Chasman, Daniel I.; Grodstein, Francine; Müller-Myhsok, Bertram; Tzourio, Christophe; Papassotiropoulos, Andreas; Bennett, David A.; Ikram, Arfan M.; Deary, Ian J.; van Duijn, Cornelia M.; Launer, Lenore; Fitzpatrick, Annette L.; Seshadri, Sudha; Mosley, Thomas H.

    2015-01-01

    BACKGROUND: Memory performance in older persons can reflect genetic influences on cognitive function and dementing processes. We aimed to identify genetic contributions to verbal declarative memory in a community setting. METHODS: We conducted genome-wide association studies for paragraph or word list delayed recall in 19 cohorts from the Cohorts for Heart and Aging Research in Genomic Epidemiology consortium, comprising 29,076 dementia- and stroke-free individuals of European descent, aged ≥45 years. Replication of suggestive associations (p < 5 × 10^-6) was sought in 10,617 participants of European descent, 3811 African-Americans, and 1561 young adults. RESULTS: rs4420638, near APOE, was associated with poorer delayed recall performance in discovery (p = 5.57 × 10^-10) and replication cohorts (p = 5.65 × 10^-8). This association was stronger for paragraph than word list delayed recall and in the oldest persons. Two associations with specific tests, in subsets of the total sample, reached genome-wide significance in combined analyses of discovery and replication (rs11074779 [HS3ST4], p = 3.11 × 10^-8, and rs6813517 [SPOCK3], p = 2.58 × 10^-8) near genes involved in immune response. A genetic score combining 58 independent suggestive memory risk variants was associated with increasing Alzheimer disease pathology in 725 autopsy samples. Association of memory risk loci with gene expression in 138 human hippocampus samples showed cis-associations with WDR48 and CLDN5, both related to ubiquitin metabolism. CONCLUSIONS: This largest study to date exploring the genetics of memory function in ~40,000 older individuals revealed genome-wide associations and suggested an involvement of immune and ubiquitin pathways. PMID:25648963

  6. [Developments in cancer care with innovative genomics. 2008 report of the National Cancer Consortium].

    PubMed

    Tímár, József; Kásler, Miklós; Heringh, Alexandra; Soós, Miklós; Mathiász, Dóra; Romány, Anna; Józsa, Adrienn; Szilák, László; Forrai, Tamás; Patthy, László; Kovács, Gábor

    2009-12-01

    In the 3rd year of the program, 8 new molecular diagnostic services were introduced into clinical practice for the management of breast, lung and colorectal cancers, as well as GIST and melanoma. Two patents have been filed for innovative modulation of mito/motogenic signaling pathways in cancer cells. In preclinical models of human cancer, a functional imaging technique was developed to detect vascular effects of erythropoietin. Using a genomic approach, the sequential changes in human melanoma during systemic dissemination were determined, revealing several novel potential prognostic factors and some interesting novel targets for therapy. PMID:20071304

  7. Genome-wide high-density SNP linkage search for glioma susceptibility loci: results from the Gliogene Consortium

    PubMed Central

    Shete, Sanjay; Lau, Ching C; Houlston, Richard S; Claus, Elizabeth B; Barnholtz-Sloan, Jill; Lai, Rose; Il’yasova, Dora; Schildkraut, Joellen; Sadetzki, Siegal; Johansen, Christoffer; Bernstein, Jonine L; Olson, Sara H; Jenkins, Robert B; Yang, Ping; Vick, Nicholas A; Wrensch, Margaret; Davis, Faith G; McCarthy, Bridget J; Leung, Eastwood Hon-chiu; Davis, Caleb; Cheng, Rita; Hosking, Fay J; Armstrong, Georgina N; Liu, Yanhong; Yu, Robert K; Henriksson, Roger; Consortium, The Gliogene; Melin, Beatrice S; Bondy, Melissa L

    2011-01-01

    Gliomas, which generally have a poor prognosis, are the most common primary malignant brain tumors in adults. Recent genome-wide association studies have demonstrated that inherited susceptibility plays a role in the development of glioma. Although first-degree relatives of patients exhibit a two-fold increased risk of glioma, the search for susceptibility loci in familial forms of the disease has been challenging because the disease is relatively rare, fatal, and heterogeneous, making it difficult to collect sufficient biosamples from families for statistical power. To address this challenge, the Genetic Epidemiology of Glioma International Consortium (Gliogene) was formed to collect DNA samples from families with two or more cases of histologically confirmed glioma. In this study, we present results obtained from 46 U.S. families in which multipoint linkage analyses were undertaken using nonparametric (model-free) methods. After removal of high linkage disequilibrium SNPs, we obtained a maximum nonparametric linkage score (NPL) of 3.39 (P=0.0005) at 17q12–21.32 and the Z-score of 4.20 (P=0.000007). To replicate our findings, we genotyped 29 independent U.S. families and obtained a maximum NPL score of 1.26 (P=0.008) and the Z-score of 1.47 (P=0.035). Accounting for the genetic heterogeneity using the ordered subset analysis approach, the combined analyses of 75 families resulted in a maximum NPL score of 3.81 (P=0.00001). The genomic regions we have implicated in this study may offer novel insights into glioma susceptibility, focusing future work to identify genes that cause familial glioma. PMID:22037877

  8. Novel loci associated with usual sleep duration: the CHARGE Consortium Genome-Wide Association Study.

    PubMed

    Gottlieb, D J; Hek, K; Chen, T-H; Watson, N F; Eiriksdottir, G; Byrne, E M; Cornelis, M; Warby, S C; Bandinelli, S; Cherkas, L; Evans, D S; Grabe, H J; Lahti, J; Li, M; Lehtimäki, T; Lumley, T; Marciante, K D; Pérusse, L; Psaty, B M; Robbins, J; Tranah, G J; Vink, J M; Wilk, J B; Stafford, J M; Bellis, C; Biffar, R; Bouchard, C; Cade, B; Curhan, G C; Eriksson, J G; Ewert, R; Ferrucci, L; Fülöp, T; Gehrman, P R; Goodloe, R; Harris, T B; Heath, A C; Hernandez, D; Hofman, A; Hottenga, J-J; Hunter, D J; Jensen, M K; Johnson, A D; Kähönen, M; Kao, L; Kraft, P; Larkin, E K; Lauderdale, D S; Luik, A I; Medici, M; Montgomery, G W; Palotie, A; Patel, S R; Pistis, G; Porcu, E; Quaye, L; Raitakari, O; Redline, S; Rimm, E B; Rotter, J I; Smith, A V; Spector, T D; Teumer, A; Uitterlinden, A G; Vohl, M-C; Widen, E; Willemsen, G; Young, T; Zhang, X; Liu, Y; Blangero, J; Boomsma, D I; Gudnason, V; Hu, F; Mangino, M; Martin, N G; O'Connor, G T; Stone, K L; Tanaka, T; Viikari, J; Gharib, S A; Punjabi, N M; Räikkönen, K; Völzke, H; Mignot, E; Tiemeier, H

    2015-10-01

    Usual sleep duration is a heritable trait correlated with psychiatric morbidity, cardiometabolic disease and mortality, although little is known about the genetic variants influencing this trait. A genome-wide association study (GWAS) of usual sleep duration was conducted using 18 population-based cohorts totaling 47 180 individuals of European ancestry. Genome-wide significant association was identified at two loci. The strongest is located on chromosome 2, in an intergenic region 35- to 80-kb upstream from the thyroid-specific transcription factor PAX8 (lowest P=1.1 × 10(-9)). This finding was replicated in an African-American sample of 4771 individuals (lowest P=9.3 × 10(-4)). The strongest combined association was at rs1823125 (P=1.5 × 10(-10), minor allele frequency 0.26 in the discovery sample, 0.12 in the replication sample), with each copy of the minor allele associated with a sleep duration 3.1 min longer per night. The alleles associated with longer sleep duration were associated in previous GWAS with a more favorable metabolic profile and a lower risk of attention deficit hyperactivity disorder. Understanding the mechanisms underlying these associations may help elucidate biological mechanisms influencing sleep duration and its association with psychiatric, metabolic and cardiovascular disease. PMID:25469926

  9. A genome-wide association study for venous thromboembolism: the extended Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium

    PubMed Central

    Pankratz, Nathan; Leebeek, Frank W.; Paré, Guillaume; de Andrade, Mariza; Tzourio, Christophe; Psaty, Bruce M.; Basu, Saonli; Ruiter, Rikje; Rose, Lynda; Armasu, Sebastian M.; Lumley, Thomas; Heckbert, Susan R.; Uitterlinden, André G.; Lathrop, Mark; Rice, Kenneth M.; Cushman, Mary; Hofman, Albert; Lambert, Jean-Charles; Glazer, Nicole L.; Pankow, James S.; Witteman, Jacqueline C.; Amouyel, Philippe; Bis, Joshua C.; Bovill, Edwin G.; Kong, Xiaoxiao; Tracy, Russell P.; Boerwinkle, Eric; Rotter, Jerome I.; Trégouët, David-Alexandre; Loth, Daan W.

    2014-01-01

    Venous thromboembolism (VTE) is a common, heritable disease resulting in high rates of hospitalization and mortality. Yet few associations between VTE and genetic variants, all in the coagulation pathway, have been established. To identify additional genetic determinants of VTE, we conducted a 2-stage genome-wide association study (GWAS) among individuals of European ancestry in the extended CHARGE VTE consortium. The discovery GWAS comprised 1,618 incident VTE cases out of 44,499 participants from six community-based studies. Genotypes for genome-wide single-nucleotide polymorphisms (SNPs) were imputed to ~2.5 million SNPs in HapMap and association with VTE was assessed using study-design appropriate regression methods. Meta-analysis of these results identified two known loci, in F5 and ABO. The top 1,047 tag SNPs (p≤0.0016) from the discovery GWAS were tested for association in an additional 3,231 cases and 3,536 controls from three case-control studies. In the combined data from these two stages, additional genome-wide significant associations were observed on 4q35 at F11 (top SNP rs4253399, intronic to F11) and on 4q28 at FGG (rs6536024, 9.7 kb from FGG) (p<5.0×10(-13) for both). The associations at the FGG locus were not completely explained by previously reported variants. Loci at or near SUSD1 and OTUD7A showed borderline yet novel associations (p<5.0×10(-6)) and constitute new candidate genes. In conclusion, this large GWAS replicated key genetic associations in F5 and ABO, and confirmed the importance of F11 and FGG loci for VTE. Future studies are warranted to better characterize the associations with F11 and FGG and to replicate the new candidate associations. PMID:23650146

  10. An Integrated Functional Genomics Consortium to Increase Carbon Sequestration in Poplars: Optimizing Aboveground Carbon Gain

    SciTech Connect

    Karnosky, David F; Podila, G Krishna; Burton, Andrew J

    2009-02-17

    This project used gene expression patterns from two forest Free-Air CO2 Enrichment (FACE) experiments (Aspen FACE in northern Wisconsin and POPFACE in Italy) to examine ways to increase the aboveground carbon sequestration potential of poplars (Populus). The aim was to use patterns of global gene expression to identify candidate genes for increased carbon sequestration. Gene expression studies were linked to physiological measurements in order to elucidate bottlenecks in carbon acquisition in trees grown in elevated CO2 conditions. Delayed senescence, which allows additional carbon uptake late in the growing season, was also examined, and expression of target genes was tested in elite P. deltoides x P. trichocarpa hybrids. In Populus euramericana, gene expression was sensitive to elevated CO2, but the response depended on the developmental age of the leaves. Most differentially expressed genes were upregulated in elevated CO2 in young leaves, while most were downregulated in elevated CO2 in semi-mature leaves. In P. deltoides x P. trichocarpa hybrids, leaf development and leaf quality traits, including leaf area, leaf shape, epidermal cell area, stomatal number, specific leaf area, and canopy senescence, were sensitive to elevated CO2. Significant increases under elevated CO2 occurred for both above- and belowground growth in the F-2 generation. Three areas of the genome played a role in determining aboveground growth response to elevated CO2, with three additional areas of the genome important in determining belowground growth responses to elevated CO2. In Populus tremuloides, CO2-responsive genes in leaves were found to differ between two aspen clones that showed different growth responses, despite similarity in many physiological parameters (photosynthesis, stomatal conductance, and leaf area index). The CO2-responsive clone shunted C into pathways associated with active defense/response to stress, carbohydrate/starch biosynthesis and subsequent growth. The CO2

  11. Effect of trichloroethylene and tetrachloroethylene on methane oxidation and community structure of methanotrophic consortium.

    PubMed

    Choi, Sun-Ah; Lee, Eun-Hee; Cho, Kyung-Suk

    2013-01-01

    The methane oxidation rate and community structure of a methanotrophic consortium were analyzed to determine the effects of trichloroethylene (TCE) and tetrachloroethylene (PCE) on methane oxidation. The maximum methane oxidation rate (Vmax) of the consortium was 326.8 μmol·g-dry biomass(-1)·h(-1), and it had a half-saturation constant (Km) of 143.8 μM. The addition of TCE or PCE resulted in decreased methane oxidation rates, which were decreased from 101.73 to 5.47-24.64 μmol·g-dry biomass(-1)·h(-1) with an increase in the TCE-to-methane ratio, and to 61.95-67.43 μmol·g-dry biomass(-1)·h(-1) with an increase in the PCE-to-methane ratio. TCE and PCE were non-competitive inhibitors for methane oxidation, and their inhibition constants (Ki) were 33.4 and 132.0 μM, respectively. When the methanotrophic community was analyzed based on pmoA using quantitative real-time PCR (qRT-PCR), the pmoA gene copy numbers were shown to decrease from 7.3 ± 0.7 × 10(8) to 2.1-5.0 × 10(7) pmoA gene copy number · g-dry biomass(-1) with an increase in the TCE-to-methane ratio and to 2.5-7.0 × 10(7) pmoA gene copy number · g-dry biomass(-1) with an increase in the PCE-to-methane ratio. Community analysis by microarray demonstrated that Methylocystis (type II methanotrophs) were the most abundant in the methanotrophic community composition in the presence of TCE. These results suggest that toxic effects caused by TCE and PCE change not only methane oxidation rates but also the community structure of the methanotrophic consortium. PMID:23947712
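
    The kinetic constants reported in this abstract can be combined in a short sketch. The abstract states only that TCE and PCE act as non-competitive inhibitors; the classical rate law used here, v = Vmax·S / ((Km + S)·(1 + I/Ki)), is an assumption about the model that was fitted, and the function name and the example concentrations are illustrative only.

```python
import math

# Constants quoted in the abstract: concentrations in uM,
# rates in umol per g dry biomass per hour.
VMAX = 326.8
KM_UM = 143.8
KI_TCE_UM = 33.4
KI_PCE_UM = 132.0


def oxidation_rate(methane_um, inhibitor_um=0.0, ki_um=math.inf,
                   vmax=VMAX, km_um=KM_UM):
    """Michaelis-Menten rate under classical non-competitive inhibition.

    Assumed form (not confirmed by the abstract):
        v = Vmax * S / ((Km + S) * (1 + I / Ki))
    With the default ki_um of infinity the inhibition term is 1.
    """
    if methane_um < 0 or inhibitor_um < 0:
        raise ValueError("concentrations must be non-negative")
    inhibition_factor = 1.0 + inhibitor_um / ki_um
    return vmax * methane_um / ((km_um + methane_um) * inhibition_factor)


if __name__ == "__main__":
    # Hypothetical conditions: methane at Km, 50 uM of each inhibitor.
    for label, ki in (("TCE", KI_TCE_UM), ("PCE", KI_PCE_UM)):
        rate = oxidation_rate(KM_UM, inhibitor_um=50.0, ki_um=ki)
        print(f"{label}: {rate:.1f} umol/g-dry biomass/h")
```

    Under this assumed form, an inhibitor concentration equal to Ki halves the apparent Vmax and leaves Km unchanged, so the lower Ki reported for TCE corresponds to stronger inhibition at equal concentration.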

  12. The Seattle Structural Genomics Center for Infectious Disease (SSGCID)

    PubMed Central

    Myler, P.J.; Stacy, R.; Stewart, L.; Staker, B.L.; Van Voorhis, W.C.; Varani, G.; Buchko, G.W.

    2010-01-01

    The NIAID-funded Seattle Structural Genomics Center for Infectious Disease (SSGCID) is a consortium established to apply structural genomics approaches to potential drug targets from NIAID priority organisms for biodefense and emerging and re-emerging diseases. The mission of the SSGCID is to determine ~400 protein structures over five years ending in 2012. In order to maximize biomedical impact, ligand-based drug-lead discovery campaigns will be pursued for a small number of high-impact targets. Here we review the center’s target selection processes, which include pro-active engagement of the infectious disease research and drug therapy communities to identify drug targets, essential enzymes, virulence factors and vaccine candidates of biomedical relevance to combat infectious diseases. This is followed by a brief overview of the SSGCID structure determination pipeline and ligand screening methodology. Finally, specifics of our resources available to the scientific community are presented. Physical materials and data produced by SSGCID will be made available to the scientific community, with the aim that they will provide essential groundwork benefiting future research and drug discovery. PMID:19594426

  13. The National Astronomy Consortium Summer Student Research Program at NRAO-Socorro: Year 2 structure

    NASA Astrophysics Data System (ADS)

    Mills, Elisabeth A.; Sheth, Kartik; Giles, Faye; Perez, Laura M.; Arancibia, Demian; Burke-Spolaor, Sarah

    2016-01-01

    I will present a summary of the program structure used for the second year of hosting a summer student research cohort of the National Astronomy Consortium (NAC) at the National Radio Astronomy Observatory in Socorro, NM. The NAC is a program partnering physics and astronomy departments in majority and minority-serving institutions across the country. The primary aim of this program is to support traditionally underrepresented students interested in pursuing a career in STEM through a 9-10 week summer astronomy research project and a year of additional mentoring after they return to their home institution. I will describe the research, professional development, and inclusivity goals of the program, and show how these were used to create a weekly syllabus for the summer. I will also highlight several unique aspects of this program, including the recruitment of remote mentors for students to better balance the gender and racial diversity of available role models for the students, as well as the hosting of a contemporaneous series of visiting diversity speakers. Finally, I will discuss structures for continuing to engage, interact with, and mentor students in the academic year following the summer program. A goal of this work going forward is to be able to make instructional and organizational materials from this program available to other sites interested in joining the NAC or hosting similar programs at their own institution.

  14. Population Genomics of Cardiometabolic Traits: Design of the University College London-London School of Hygiene and Tropical Medicine-Edinburgh-Bristol (UCLEB) Consortium

    PubMed Central

    Wong, Andrew; Amuzu, Antoinette; Ong, Ken; Gaunt, Tom; Holmes, Michael V.; Warren, Helen; Davies, Teri-Louise; Drenos, Fotios; Cooper, Jackie; Sofat, Reecha; Caulfield, Mark; Ebrahim, Shah; Lawlor, Debbie A.; Talmud, Philippa J.; Humphries, Steve E.; Power, Christine; Hypponen, Elina; Richards, Marcus; Hardy, Rebecca; Kuh, Diana; Wareham, Nicholas; Ben-Shlomo, Yoav; Day, Ian N.; Whincup, Peter; Morris, Richard; Strachan, Mark W. J.; Price, Jacqueline; Kumari, Meena; Kivimaki, Mika; Plagnol, Vincent; Dudbridge, Frank; Whittaker, John C.; Casas, Juan P.; Hingorani, Aroon D.

    2013-01-01

    Substantial advances have been made in identifying common genetic variants influencing cardiometabolic traits and disease outcomes through genome wide association studies. Nevertheless, gaps in knowledge remain and new questions have arisen regarding the population relevance, mechanisms, and applications for healthcare. Using a new high-resolution custom single nucleotide polymorphism (SNP) array (Metabochip) incorporating dense coverage of genomic regions linked to cardiometabolic disease, the University College-London School-Edinburgh-Bristol (UCLEB) consortium of highly-phenotyped population-based prospective studies, aims to: (1) fine map functionally relevant SNPs; (2) precisely estimate individual absolute and population attributable risks based on individual SNPs and their combination; (3) investigate mechanisms leading to altered risk factor profiles and CVD events; and (4) use Mendelian randomisation to undertake studies of the causal role in CVD of a range of cardiovascular biomarkers to inform public health policy and help develop new preventative therapies. PMID:23977022

  15. Population genomics of cardiometabolic traits: design of the University College London-London School of Hygiene and Tropical Medicine-Edinburgh-Bristol (UCLEB) Consortium.

    PubMed

    Shah, Tina; Engmann, Jorgen; Dale, Caroline; Shah, Sonia; White, Jon; Giambartolomei, Claudia; McLachlan, Stela; Zabaneh, Delilah; Cavadino, Alana; Finan, Chris; Wong, Andrew; Amuzu, Antoinette; Ong, Ken; Gaunt, Tom; Holmes, Michael V; Warren, Helen; Swerdlow, Daniel I; Davies, Teri-Louise; Drenos, Fotios; Cooper, Jackie; Sofat, Reecha; Caulfield, Mark; Ebrahim, Shah; Lawlor, Debbie A; Talmud, Philippa J; Humphries, Steve E; Power, Christine; Hypponen, Elina; Richards, Marcus; Hardy, Rebecca; Kuh, Diana; Wareham, Nicholas; Langenberg, Claudia; Ben-Shlomo, Yoav; Day, Ian N; Whincup, Peter; Morris, Richard; Strachan, Mark W J; Price, Jacqueline; Kumari, Meena; Kivimaki, Mika; Plagnol, Vincent; Dudbridge, Frank; Whittaker, John C; Casas, Juan P; Hingorani, Aroon D

    2013-01-01

    Substantial advances have been made in identifying common genetic variants influencing cardiometabolic traits and disease outcomes through genome wide association studies. Nevertheless, gaps in knowledge remain and new questions have arisen regarding the population relevance, mechanisms, and applications for healthcare. Using a new high-resolution custom single nucleotide polymorphism (SNP) array (Metabochip) incorporating dense coverage of genomic regions linked to cardiometabolic disease, the University College-London School-Edinburgh-Bristol (UCLEB) consortium of highly-phenotyped population-based prospective studies, aims to: (1) fine map functionally relevant SNPs; (2) precisely estimate individual absolute and population attributable risks based on individual SNPs and their combination; (3) investigate mechanisms leading to altered risk factor profiles and CVD events; and (4) use Mendelian randomisation to undertake studies of the causal role in CVD of a range of cardiovascular biomarkers to inform public health policy and help develop new preventative therapies. PMID:23977022

  16. Towards a whole genome physical map in rainbow trout

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Over the last five years, tremendous genomic resources were developed in salmonids. In 2005, INRA joined formally the consortium for Genome Research on All Salmonids Program (cGRASP). This consortium (www.cgrasp.org) is the international collaborative structure for establishing needed pre- and post-...

  17. Oncofertility Consortium

    MedlinePlus

    ... September 15, 2016 National Physicians Cooperative Brigid Martz Smith July 21, 2016 Postdoctoral Position in Pediatric Fertility ... 2016 Oncofertility Consortium Clinic/Center Map Brigid Martz Smith June 30, 2016 Zika Virus Concerns Grow as ...

  18. Systematic Prioritization of Druggable Mutations in ∼5000 Genomes Across 16 Cancer Types Using a Structural Genomics-based Approach.

    PubMed

    Zhao, Junfei; Cheng, Feixiong; Wang, Yuanyuan; Arteaga, Carlos L; Zhao, Zhongming

    2016-02-01

    A massive number of somatic mutations have been cataloged in large-scale projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium projects. The majority of the somatic mutations found in tumor genomes are neutral "passenger" rather than damaging "driver" mutations. Now, understanding their biological consequences and prioritizing them for druggable targets are urgently needed. Thanks to the rapid advances in structural genomics technologies (e.g. X-ray), large-scale protein structural data has now been made available, providing critical information for deciphering functional roles of mutations in cancer and prioritizing those alterations that may mediate drug binding at atomic resolution and, as such, be druggable targets. We hypothesized that mutations at protein-ligand binding-site residues are likely to be druggable targets. Thus, to prioritize druggable mutations, we developed SGDriver, a structural genomics-based method incorporating the somatic missense mutations into protein-ligand binding-site residues using a Bayes inference statistical framework. We applied SGDriver to 746,631 missense mutations observed in 4997 tumor-normal pairs across 16 cancer types from The Cancer Genome Atlas. SGDriver detected 14,471 potential druggable mutations in 2091 proteins (including 1,516 recurrently mutated proteins) across 3558 cancer genomes (71.2%), and further identified 298 proteins harboring mutations that were significantly enriched at protein-ligand binding-site residues (adjusted p value < 0.05). The identified proteins are significantly enriched in both oncoproteins and tumor suppressors. The follow-up drug-target network analysis suggested 98 known and 126 repurposed druggable anticancer targets (e.g. SPOP and NR3C1). Furthermore, our integrative analysis indicated that 13% of patients might benefit from current targeted therapy, and this proportion would increase to 31% when considering drug repositioning. This study
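
    The idea of testing whether a protein's missense mutations are enriched at ligand binding-site residues can be illustrated with a much simpler stand-in. The sketch below is not SGDriver and does not use its Bayes inference framework; it is a plain one-sided binomial test under the assumption that, by chance, each mutation lands on a binding-site residue with probability equal to the fraction of residues that are binding-site residues. All function names and counts are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def binding_site_enrichment(n_site_mutations, n_mutations,
                            n_site_residues, protein_length):
    """One-sided p-value for excess mutations at binding-site residues.

    Background assumption: mutations fall uniformly along the protein.
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    background = n_site_residues / protein_length
    return binomial_upper_tail(n_site_mutations, n_mutations, background)


if __name__ == "__main__":
    # Hypothetical protein: 300 residues, 15 of them at a binding site,
    # 20 observed missense mutations, 8 of which hit the binding site.
    print(binding_site_enrichment(8, 20, 15, 300))
```

    A real analysis would also need multiple-testing correction across proteins, which the abstract reports as an adjusted p value.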

  19. Genome-wide Membrane Protein Structure Prediction

    PubMed Central

    Piccoli, Stefano; Suku, Eda; Garonzi, Marianna; Giorgetti, Alejandro

    2013-01-01

    Transmembrane proteins allow cells to extensively communicate with the external world in a very accurate and specific way. They form principal nodes in several signaling pathways and attract large interest in therapeutic intervention, as the majority of pharmaceutical compounds target membrane proteins. Thus, according to the current genome annotation methods, a detailed structural/functional characterization at the protein level of each of the elements codified in the genome is also required. The extreme difficulty in obtaining high-resolution three-dimensional structures calls for computational approaches. Here we review to what extent the efforts made in the last few years, combining the structural characterization of membrane proteins with protein bioinformatics techniques, could help describe membrane proteins at a genome-wide scale. In particular, we analyze the use of comparative modeling techniques as a way of overcoming the lack of high-resolution three-dimensional structures in the human membrane proteome. PMID:24403851

  20. The fractal structure of the mitochondrial genomes

    NASA Astrophysics Data System (ADS)

    Oiwa, Nestor N.; Glazier, James A.

    2002-08-01

    The mitochondrial DNA genome has a definite multifractal structure. We show that loops, hairpins and inverted palindromes are responsible for this self-similarity. We can thus establish a definite relation between the function of subsequences and their fractal dimension. Intriguingly, protein coding DNAs also exhibit palindromic structures, although they do not appear in the sequence of amino acids. These structures may reflect the stabilization and transcriptional control of DNA or the control of posttranscriptional editing of mRNA.
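
    As a small illustration of the sequence features named in this abstract, the sketch below scans a DNA string for windows that equal their own reverse complement. Treating "inverted palindromes" as reverse-complement palindromes is an assumption, and the function names and window limits are illustrative; the multifractal analysis itself is not reproduced here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
VALID_BASES = set("ACGT")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_palindromes(seq, min_len=4, max_len=12):
    """List (start, length) of windows equal to their reverse complement.

    Windows containing anything other than A, C, G or T are skipped so
    that ambiguity codes such as N cannot produce spurious matches.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq)):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > len(seq):
                break
            window = seq[start:end]
            if set(window) <= VALID_BASES and window == reverse_complement(window):
                hits.append((start, length))
    return hits


if __name__ == "__main__":
    # GAATTC is a well-known reverse-complement palindrome.
    print(find_inverted_palindromes("GAATTC"))
```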

  1. Genome Structure of the Legume, Lotus japonicus

    PubMed Central

    Sato, Shusei; Nakamura, Yasukazu; Kaneko, Takakazu; Asamizu, Erika; Kato, Tomohiko; Nakao, Mitsuteru; Sasamoto, Shigemi; Watanabe, Akiko; Ono, Akiko; Kawashima, Kumiko; Fujishiro, Tsunakazu; Katoh, Midori; Kohara, Mitsuyo; Kishida, Yoshie; Minami, Chiharu; Nakayama, Shinobu; Nakazaki, Naomi; Shimizu, Yoshimi; Shinpo, Sayaka; Takahashi, Chika; Wada, Tsuyuko; Yamada, Manabu; Ohmido, Nobuko; Hayashi, Makoto; Fukui, Kiichi; Baba, Tomoya; Nakamichi, Tomoko; Mori, Hirotada; Tabata, Satoshi

    2008-01-01

    The legume Lotus japonicus has been widely used as a model system to investigate the genetic background of legume-specific phenomena such as symbiotic nitrogen fixation. Here, we report structural features of the L. japonicus genome. The 315.1-Mb sequences determined in this and previous studies correspond to 67% of the genome (472 Mb), and are likely to cover 91.3% of the gene space. Linkage mapping anchored 130-Mb sequences onto the six linkage groups. A total of 10 951 complete and 19 848 partial structures of protein-encoding genes were assigned to the genome. Comparative analysis of these genes revealed the expansion of several functional domains and gene families that are characteristic of L. japonicus. Synteny analysis detected traces of whole-genome duplication and the presence of synteny blocks with other plant genomes to various degrees. This study provides the first opportunity to look into the complex and unique genetic system of legumes. PMID:18511435
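
    The headline figures in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted above; the share of sequence anchored to linkage groups and the total gene-model count are derived here and are not stated in the abstract.

```python
# Figures quoted in the abstract (sizes in megabases).
SEQUENCED_MB = 315.1
GENOME_MB = 472.0
ANCHORED_MB = 130.0
COMPLETE_GENES = 10_951
PARTIAL_GENES = 19_848


def fraction(part, whole):
    """Return part / whole, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


# Sequenced share of the estimated genome (the abstract rounds this to 67%).
genome_coverage = fraction(SEQUENCED_MB, GENOME_MB)
# Derived: share of the determined sequence anchored by linkage mapping.
anchored_share = fraction(ANCHORED_MB, SEQUENCED_MB)
# Derived: complete plus partial protein-encoding gene structures.
total_gene_models = COMPLETE_GENES + PARTIAL_GENES

if __name__ == "__main__":
    print(f"coverage: {genome_coverage:.1%}")
    print(f"anchored share of sequence: {anchored_share:.1%}")
    print(f"gene models: {total_gene_models}")
```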

  2. Using Genomics for Natural Product Structure Elucidation.

    PubMed

    Tietz, Jonathan I; Mitchell, Douglas A

    2016-01-01

    Natural products (NPs) are the most historically bountiful source of chemical matter for drug development, especially for anti-infectives. With insights gleaned from genome mining, interest in natural product discovery has been reinvigorated. An essential stage in NP discovery is structural elucidation, which sheds light not only on the chemical composition of a molecule but also its novelty, properties, and derivatization potential. The history of structure elucidation is replete with technique-based revolutions: combustion analysis, crystallography, UV, IR, MS, and NMR have each provided game-changing advances; the latest such advance is genomics. All natural products have a genetic basis, and the ability to obtain and interpret genomic information for structure elucidation is increasingly available at low cost to non-specialists. In this review, we describe the value of genomics as a structural elucidation technique, especially from the perspective of the natural product chemist approaching an unknown metabolite. Herein we first introduce the databases and programs of interest to the natural products chemist, with an emphasis on those currently most suited for general usability. We describe strategies for linking observed natural product-linked phenotypes to their corresponding gene clusters. We then discuss techniques for extracting structural information from genes, illustrated with numerous case examples. We also provide an analysis of the biases and limitations of the field with recommendations for future development. Our overview is not only aimed at biologically-oriented researchers already at ease with bioinformatic techniques, but also, in particular, at natural product, organic, and/or medicinal chemists not previously familiar with genomic techniques. PMID:26456468

  3. A data management system for structural genomics.

    PubMed

    Raymond, Stéphane; O'Toole, Nicholas; Cygler, Miroslaw

    2004-06-21

    BACKGROUND: Structural genomics (SG) projects aim to determine thousands of protein structures by the development of high-throughput techniques for all steps of the experimental structure determination pipeline. Crucial to the success of such endeavours is the careful tracking and archiving of experimental and external data on protein targets. RESULTS: We have developed a sophisticated data management system for structural genomics. Central to the system is an Oracle-based, SQL-interfaced database. The database schema deals with all facets of the structure determination process, from target selection to data deposition. Users access the database via any web browser. Experimental data is input by users with pre-defined web forms. Data can be displayed according to numerous criteria. A list of all current target proteins can be viewed, with links for each target to associated entries in external databases. To avoid unnecessary work on targets, our data management system matches protein sequences weekly using BLAST to entries in the Protein Data Bank and to targets of other SG centers worldwide. CONCLUSION: Our system is a working, effective and user-friendly data management tool for structural genomics projects. In this report we present a detailed summary of the various capabilities of the system, using real target data as examples, and indicate our plans for future enhancements. PMID:15210054

  4. From a consortium sequence to a unified sequence: the Bacillus subtilis 168 reference genome a decade later

    PubMed Central

    Barbe, Valérie; Cruveiller, Stéphane; Kunst, Frank; Lenoble, Patricia; Meurice, Guillaume; Sekowska, Agnieszka; Vallenet, David; Wang, Tingzhang; Moszer, Ivan; Médigue, Claudine; Danchin, Antoine

    2009-01-01

    Comparative genomics is the cornerstone of identification of gene functions. The immense number of living organisms precludes experimental identification of functions except in a handful of model organisms. The bacterial domain is split into large branches, among which the Firmicutes occupy a considerable space. Bacillus subtilis has been the model of Firmicutes for decades and its genome has been a reference for more than 10 years. Sequencing the genome involved more than 30 laboratories, with different areas of expertise, in an attempt to make the most of the experimental information that could be associated with the sequence. This had the expected drawback that the sequencing expertise was quite varied among the groups involved, especially at a time when sequencing genomes was extremely hard work. The recent development of very efficient, fast and accurate sequencing techniques, in parallel with the development of high-level annotation platforms, motivated the present resequencing work. The updated sequence has been reannotated in agreement with the UniProt protein knowledge base, keeping in perspective the split between the paleome (genes necessary for sustaining and perpetuating life) and the cenome (genes required for occupation of a niche, suggesting here that B. subtilis is an epiphyte). This should permit investigators to make reliable inferences to prepare validation experiments in a variety of domains of bacterial growth and development as well as build up accurate phylogenies. PMID:19383706

  5. Genome-wide association study of lifetime cannabis use based on a large meta-analytic sample of 32 330 subjects from the International Cannabis Consortium

    PubMed Central

    Stringer, S; Minică, C C; Verweij, K J H; Mbarek, H; Bernard, M; Derringer, J; van Eijk, K R; Isen, J D; Loukola, A; Maciejewski, D F; Mihailov, E; van der Most, P J; Sánchez-Mora, C; Roos, L; Sherva, R; Walters, R; Ware, J J; Abdellaoui, A; Bigdeli, T B; Branje, S J T; Brown, S A; Bruinenberg, M; Casas, M; Esko, T; Garcia-Martinez, I; Gordon, S D; Harris, J M; Hartman, C A; Henders, A K; Heath, A C; Hickie, I B; Hickman, M; Hopfer, C J; Hottenga, J J; Huizink, A C; Irons, D E; Kahn, R S; Korhonen, T; Kranzler, H R; Krauter, K; van Lier, P A C; Lubke, G H; Madden, P A F; Mägi, R; McGue, M K; Medland, S E; Meeus, W H J; Miller, M B; Montgomery, G W; Nivard, M G; Nolte, I M; Oldehinkel, A J; Pausova, Z; Qaiser, B; Quaye, L; Ramos-Quiroga, J A; Richarte, V; Rose, R J; Shin, J; Stallings, M C; Stiby, A I; Wall, T L; Wright, M J; Koot, H M; Paus, T; Hewitt, J K; Ribasés, M; Kaprio, J; Boks, M P; Snieder, H; Spector, T; Munafò, M R; Metspalu, A; Gelernter, J; Boomsma, D I; Iacono, W G; Martin, N G; Gillespie, N A; Derks, E M; Vink, J M

    2016-01-01

    Cannabis is the most widely produced and consumed illicit psychoactive substance worldwide. Occasional cannabis use can progress to frequent use, abuse and dependence with all known adverse physical, psychological and social consequences. Individual differences in cannabis initiation are heritable (40–48%). The International Cannabis Consortium was established with the aim to identify genetic risk variants of cannabis use. We conducted a meta-analysis of genome-wide association data of 13 cohorts (N=32 330) and four replication samples (N=5627). In addition, we performed a gene-based test of association, estimated single-nucleotide polymorphism (SNP)-based heritability and explored the genetic correlation between lifetime cannabis use and cigarette use using LD score regression. No individual SNPs reached genome-wide significance. Nonetheless, gene-based tests identified four genes significantly associated with lifetime cannabis use: NCAM1, CADM2, SCOC and KCNT2. Previous studies reported associations of NCAM1 with cigarette smoking and other substance use, and those of CADM2 with body mass index, processing speed and autism disorders, which are phenotypes previously reported to be associated with cannabis use. Furthermore, we showed that, combined across the genome, all common SNPs explained 13–20% (P<0.001) of the liability of lifetime cannabis use. Finally, there was a strong genetic correlation (rg=0.83; P=1.85 × 10(-8)) between lifetime cannabis use and lifetime cigarette smoking implying that the SNP effect sizes of the two traits are highly correlated. This is the largest meta-analysis of cannabis GWA studies to date, revealing important new insights into the genetic pathways of lifetime cannabis use. Future functional studies should explore the impact of the identified genes on the biological mechanisms of cannabis use. PMID:27023175

  6. Genome-wide association study of lifetime cannabis use based on a large meta-analytic sample of 32 330 subjects from the International Cannabis Consortium.

    PubMed

    Stringer, S; Minică, C C; Verweij, K J H; Mbarek, H; Bernard, M; Derringer, J; van Eijk, K R; Isen, J D; Loukola, A; Maciejewski, D F; Mihailov, E; van der Most, P J; Sánchez-Mora, C; Roos, L; Sherva, R; Walters, R; Ware, J J; Abdellaoui, A; Bigdeli, T B; Branje, S J T; Brown, S A; Bruinenberg, M; Casas, M; Esko, T; Garcia-Martinez, I; Gordon, S D; Harris, J M; Hartman, C A; Henders, A K; Heath, A C; Hickie, I B; Hickman, M; Hopfer, C J; Hottenga, J J; Huizink, A C; Irons, D E; Kahn, R S; Korhonen, T; Kranzler, H R; Krauter, K; van Lier, P A C; Lubke, G H; Madden, P A F; Mägi, R; McGue, M K; Medland, S E; Meeus, W H J; Miller, M B; Montgomery, G W; Nivard, M G; Nolte, I M; Oldehinkel, A J; Pausova, Z; Qaiser, B; Quaye, L; Ramos-Quiroga, J A; Richarte, V; Rose, R J; Shin, J; Stallings, M C; Stiby, A I; Wall, T L; Wright, M J; Koot, H M; Paus, T; Hewitt, J K; Ribasés, M; Kaprio, J; Boks, M P; Snieder, H; Spector, T; Munafò, M R; Metspalu, A; Gelernter, J; Boomsma, D I; Iacono, W G; Martin, N G; Gillespie, N A; Derks, E M; Vink, J M

    2016-01-01

    Cannabis is the most widely produced and consumed illicit psychoactive substance worldwide. Occasional cannabis use can progress to frequent use, abuse and dependence with all known adverse physical, psychological and social consequences. Individual differences in cannabis initiation are heritable (40-48%). The International Cannabis Consortium was established with the aim to identify genetic risk variants of cannabis use. We conducted a meta-analysis of genome-wide association data of 13 cohorts (N=32 330) and four replication samples (N=5627). In addition, we performed a gene-based test of association, estimated single-nucleotide polymorphism (SNP)-based heritability and explored the genetic correlation between lifetime cannabis use and cigarette use using LD score regression. No individual SNPs reached genome-wide significance. Nonetheless, gene-based tests identified four genes significantly associated with lifetime cannabis use: NCAM1, CADM2, SCOC and KCNT2. Previous studies reported associations of NCAM1 with cigarette smoking and other substance use, and those of CADM2 with body mass index, processing speed and autism disorders, which are phenotypes previously reported to be associated with cannabis use. Furthermore, we showed that, combined across the genome, all common SNPs explained 13-20% (P<0.001) of the liability of lifetime cannabis use. Finally, there was a strong genetic correlation (rg=0.83; P=1.85 × 10⁻⁸) between lifetime cannabis use and lifetime cigarette smoking implying that the SNP effect sizes of the two traits are highly correlated. This is the largest meta-analysis of cannabis GWA studies to date, revealing important new insights into the genetic pathways of lifetime cannabis use. Future functional studies should explore the impact of the identified genes on the biological mechanisms of cannabis use. PMID:27023175
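The abstract above does not say which pooling method the consortium used to combine its 13 cohorts. As a hedged illustration only, the sketch below shows fixed-effect inverse-variance weighting, one common way to combine per-cohort effect estimates for a single SNP. The function name `inverse_variance_meta` and the three cohort estimates are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(betas, ses):
    """Pool per-cohort effect estimates with fixed-effect inverse-variance weights.

    betas: per-cohort effect estimates for one SNP (e.g. log odds ratios)
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical estimates for one SNP in three cohorts (illustration only).
example_betas = [0.10, 0.05, 0.08]
example_ses = [0.04, 0.05, 0.03]
pooled_beta, pooled_se, z, p = inverse_variance_meta(example_betas, example_ses)
print(f"pooled beta={pooled_beta:.4f}, se={pooled_se:.4f}, z={z:.2f}, p={p:.3g}")
```

The pooled standard error is always smaller than the smallest cohort standard error, which is why pooling many cohorts can detect effects that no single cohort could.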

  7. Haemonchus contortus: Genome Structure, Organization and Comparative Genomics.

    PubMed

    Laing, R; Martinelli, A; Tracey, A; Holroyd, N; Gilleard, J S; Cotton, J A

    2016-01-01

    One of the first genome sequencing projects for a parasitic nematode was that for Haemonchus contortus. The open access data from the Wellcome Trust Sanger Institute provided a valuable early resource for the research community, particularly for the identification of specific genes and genetic markers. Later, a second sequencing project was initiated by the University of Melbourne, and the two draft genome sequences for H. contortus were published back-to-back in 2013. There is a pressing need for long-range genomic information for genetic mapping, population genetics and functional genomic studies, so we are continuing to improve the Wellcome Trust Sanger Institute assembly to provide a finished reference genome for H. contortus. This review describes this process, compares the H. contortus genome assemblies with draft genomes from other members of the strongylid group and discusses future directions for parasite genomics using the H. contortus model. PMID:27238013

  8. Meta-analysis of Genome-Wide Association Studies for Extraversion: Findings from the Genetics of Personality Consortium.

    PubMed

    van den Berg, Stéphanie M; de Moor, Marleen H M; Verweij, Karin J H; Krueger, Robert F; Luciano, Michelle; Arias Vasquez, Alejandro; Matteson, Lindsay K; Derringer, Jaime; Esko, Tõnu; Amin, Najaf; Gordon, Scott D; Hansell, Narelle K; Hart, Amy B; Seppälä, Ilkka; Huffman, Jennifer E; Konte, Bettina; Lahti, Jari; Lee, Minyoung; Miller, Mike; Nutile, Teresa; Tanaka, Toshiko; Teumer, Alexander; Viktorin, Alexander; Wedenoja, Juho; Abdellaoui, Abdel; Abecasis, Goncalo R; Adkins, Daniel E; Agrawal, Arpana; Allik, Jüri; Appel, Katja; Bigdeli, Timothy B; Busonero, Fabio; Campbell, Harry; Costa, Paul T; Smith, George Davey; Davies, Gail; de Wit, Harriet; Ding, Jun; Engelhardt, Barbara E; Eriksson, Johan G; Fedko, Iryna O; Ferrucci, Luigi; Franke, Barbara; Giegling, Ina; Grucza, Richard; Hartmann, Annette M; Heath, Andrew C; Heinonen, Kati; Henders, Anjali K; Homuth, Georg; Hottenga, Jouke-Jan; Iacono, William G; Janzing, Joost; Jokela, Markus; Karlsson, Robert; Kemp, John P; Kirkpatrick, Matthew G; Latvala, Antti; Lehtimäki, Terho; Liewald, David C; Madden, Pamela A F; Magri, Chiara; Magnusson, Patrik K E; Marten, Jonathan; Maschio, Andrea; Mbarek, Hamdi; Medland, Sarah E; Mihailov, Evelin; Milaneschi, Yuri; Montgomery, Grant W; Nauck, Matthias; Nivard, Michel G; Ouwens, Klaasjan G; Palotie, Aarno; Pettersson, Erik; Polasek, Ozren; Qian, Yong; Pulkki-Råback, Laura; Raitakari, Olli T; Realo, Anu; Rose, Richard J; Ruggiero, Daniela; Schmidt, Carsten O; Slutske, Wendy S; Sorice, Rossella; Starr, John M; St Pourcain, Beate; Sutin, Angelina R; Timpson, Nicholas J; Trochet, Holly; Vermeulen, Sita; Vuoksimaa, Eero; Widen, Elisabeth; Wouda, Jasper; Wright, Margaret J; Zgaga, Lina; Porteous, David; Minelli, Alessandra; Palmer, Abraham A; Rujescu, Dan; Ciullo, Marina; Hayward, Caroline; Rudan, Igor; Metspalu, Andres; Kaprio, Jaakko; Deary, Ian J; Räikkönen, Katri; Wilson, James F; Keltikangas-Järvinen, Liisa; Bierut, Laura J; Hettema, John M; Grabe, Hans J; Penninx, Brenda W J H; van Duijn, Cornelia M; Evans, David M; Schlessinger, David; Pedersen, Nancy L; Terracciano, Antonio; McGue, Matt; Martin, Nicholas G; Boomsma, Dorret I

    2016-03-01

    Extraversion is a relatively stable and heritable personality trait associated with numerous psychosocial, lifestyle and health outcomes. Despite its substantial heritability, no genetic variants have been detected in previous genome-wide association (GWA) studies, which may be due to relatively small sample sizes of those studies. Here, we report on a large meta-analysis of GWA studies for extraversion in 63,030 subjects in 29 cohorts. Extraversion item data from multiple personality inventories were harmonized across inventories and cohorts. No genome-wide significant associations were found at the single nucleotide polymorphism (SNP) level but there was one significant hit at the gene level for a long non-coding RNA site (LOC101928162). Genome-wide complex trait analysis in two large cohorts showed that the additive variance explained by common SNPs was not significantly different from zero, but polygenic risk scores, weighted using linkage information, significantly predicted extraversion scores in an independent cohort. These results show that extraversion is a highly polygenic personality trait, with an architecture possibly different from other complex human traits, including other personality traits. Future studies are required to further determine which genetic variants, by what modes of gene action, constitute the heritable nature of extraversion. PMID:26362575

  9. Glycoprotein Structural Genomics: Solving the Glycosylation Problem

    PubMed Central

    Chang, Veronica T.; Crispin, Max; Aricescu, A. Radu; Harvey, David J.; Nettleship, Joanne E.; Fennelly, Janet A.; Yu, Chao; Boles, Kent S.; Evans, Edward J.; Stuart, David I.; Dwek, Raymond A.; Jones, E. Yvonne; Owens, Raymond J.; Davis, Simon J.

    2007-01-01

    Summary. Glycoproteins present special problems for structural genomic analysis because they often require glycosylation in order to fold correctly, whereas their chemical and conformational heterogeneity generally inhibits crystallization. We show that the “glycosylation problem” can be solved by expressing glycoproteins transiently in mammalian cells in the presence of the N-glycosylation processing inhibitors, kifunensine or swainsonine. This allows the correct folding of the glycoproteins, but leaves them sensitive to enzymes, such as endoglycosidase H, that reduce the N-glycans to single residues, enhancing crystallization. Since the scalability of transient mammalian expression is now comparable to that of bacterial systems, this approach should relieve one of the major bottlenecks in structural genomic analysis. PMID:17355862

  10. The Quality and Validation of Structures from Structural Genomics

    PubMed Central

    Domagalski, Marcin J.; Zheng, Heping; Zimmerman, Matthew D.; Dauter, Zbigniew; Wlodawer, Alexander; Minor, Wladek

    2014-01-01

    Quality control of three-dimensional structures of macromolecules is a critical step to ensure the integrity of structural biology data, especially those produced by structural genomics centers. Whereas the Protein Data Bank (PDB) has proven to be a remarkable success overall, the inconsistent quality of structures reveals a lack of universal standards for structure/deposit validation. Here, we review the state-of-the-art methods used in macromolecular structure validation, focusing on validation of structures determined by X-ray crystallography. We describe some general protocols used in the rebuilding and re-refinement of problematic structural models. We also briefly discuss some frontier areas of structure validation, including refinement of protein–ligand complexes, automation of structure redetermination, and the use of NMR structures and computational models to solve X-ray crystal structures by molecular replacement. PMID:24203341

  11. Mechanisms underlying structural variant formation in genomic disorders

    PubMed Central

    Carvalho, Claudia M. B.; Lupski, James R.

    2016-01-01

    With the recent burst of technological developments in genomics, and the clinical implementation of genome-wide assays, our understanding of the molecular basis of genomic disorders, specifically the contribution of structural variation to disease burden, is evolving quickly. Ongoing studies have revealed a ubiquitous role for genome architecture in the formation of structural variants at a given locus, both in DNA recombination-based processes and in replication-based processes. These reports showcase the influence of repeat sequences on genomic stability and structural variant complexity and also highlight the tremendous plasticity and dynamic nature of our genome in evolution, health and disease susceptibility. PMID:26924765

  12. Consistent directions of effect for established type 2 diabetes risk variants across populations: the population architecture using Genomics and Epidemiology (PAGE) Consortium.

    PubMed

    Haiman, Christopher A; Fesinmeyer, Megan D; Spencer, Kylee L; Buzková, Petra; Voruganti, V Saroja; Wan, Peggy; Haessler, Jeff; Franceschini, Nora; Monroe, Kristine R; Howard, Barbara V; Jackson, Rebecca D; Florez, Jose C; Kolonel, Laurence N; Buyske, Steven; Goodloe, Robert J; Liu, Simin; Manson, Joann E; Meigs, James B; Waters, Kevin; Mukamal, Kenneth J; Pendergrass, Sarah A; Shrader, Peter; Wilkens, Lynne R; Hindorff, Lucia A; Ambite, Jose Luis; North, Kari E; Peters, Ulrike; Crawford, Dana C; Le Marchand, Loic; Pankow, James S

    2012-06-01

    Common genetic risk variants for type 2 diabetes (T2D) have primarily been identified in populations of European and Asian ancestry. We tested whether the direction of association with 20 T2D risk variants generalizes across six major racial/ethnic groups in the U.S. as part of the Population Architecture using Genomics and Epidemiology Consortium (16,235 diabetes case and 46,122 control subjects of European American, African American, Hispanic, East Asian, American Indian, and Native Hawaiian ancestry). The percentage of positive (odds ratio [OR] >1 for putative risk allele) associations ranged from 69% in American Indians to 100% in European Americans. Of the nine variants where we observed significant heterogeneity of effect by racial/ethnic group (P(heterogeneity) < 0.05), eight were positively associated with risk (OR >1) in at least five groups. The marked directional consistency of association observed for most genetic variants across populations implies a shared functional common variant in each region. Fine-mapping of all loci will be required to reveal markers of risk that are important within and across populations. PMID:22474029

  13. GMOL: An Interactive Tool for 3D Genome Structure Visualization.

    PubMed

    Nowotny, Jackson; Wells, Avery; Oluwadare, Oluwatosin; Xu, Lingfei; Cao, Renzhi; Trieu, Tuan; He, Chenfeng; Cheng, Jianlin

    2016-01-01

    It has been shown that genome spatial structures largely affect both genome activity and DNA function. Knowing this, many researchers are currently attempting to accurately model genome structures. Despite these increased efforts there still exists a shortage of tools dedicated to visualizing the genome. Creating a tool that can accurately visualize the genome can aid researchers by highlighting structural relationships that may not be obvious when examining the sequence information alone. Here we present a desktop application, known as GMOL, designed to effectively visualize genome structures so that researchers may better analyze genomic data. GMOL was developed based upon our multi-scale approach that allows a user to scale between six separate levels within the genome. With GMOL, a user can choose any unit at any scale and scale it up or down to visualize its structure and retrieve corresponding genome sequences. Users can also interactively manipulate and measure the whole genome structure and extract static images and machine-readable data files in PDB format from the multi-scale structure. By using GMOL researchers will be able to better understand and analyze genome structure models and the impact their structural relations have on genome activity and DNA function. PMID:26868282

  14. Genome structure of cottontail rabbit herpesvirus.

    PubMed

    Cebrian, J; Berthelot, N; Laithier, M

    1989-02-01

    The genome structure of a herpesvirus isolated from primary cultures of kidney cells from the cottontail rabbit Sylvilagus floridanus was elucidated by using electron microscopy and restriction enzyme analysis. The genome, which was about 150 kilobase pairs long and which had an average G + C composition of 45%, consisted of two regions with unique base sequences (54 and 47 kilobase pairs) enclosed by reiterations of a 925-base-pair sequence with a variable copy number. The internal repeats were in opposite polarity with respect to the terminal repeats, and both unique regions underwent inversion. The nucleotide sequence of the repeat unit was determined, and virion DNA termini were precisely localized within this sequence. Elements showing homology with the cleavage-packaging signals common to other herpesviruses were detected. The data indicate that this virus is different from the previously described herpesvirus sylvilagus. PMID:2911115

  15. Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays

    PubMed Central

    Mak, Angel C. Y.; Lai, Yvonne Y. Y.; Lam, Ernest T.; Kwok, Tsz-Piu; Leung, Alden K. Y.; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R.; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W. C.; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J. K.; Li, Catherine M. L.; Li, Jing-Woei; Yim, Aldrin K. Y.; Chan, Saki; Sibert, Justin; Džakula, Željko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y.; Xiao, Ming; Kwok, Pui-Yan

    2016-01-01

    Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation. PMID:26510793

  16. Genome-Wide Association Study for Incident Myocardial Infarction and Coronary Heart Disease in Prospective Cohort Studies: The CHARGE Consortium

    PubMed Central

    Cupples, L. Adrienne; Trompet, Stella; Chasman, Daniel I.; Lumley, Thomas; Völker, Uwe; Buckley, Brendan M.; Ding, Jingzhong; Jensen, Majken K.; Folsom, Aaron R.; Kritchevsky, Stephen B.; Girman, Cynthia J.; Ford, Ian; Dörr, Marcus; Salomaa, Veikko; Uitterlinden, André G.; Eiriksdottir, Gudny; Vasan, Ramachandran S.; Franceschini, Nora; Carty, Cara L.; Virtamo, Jarmo; Demissie, Serkalem; Amouyel, Philippe; Arveiler, Dominique; Heckbert, Susan R.; Ferrières, Jean; Ducimetière, Pierre; Smith, Nicholas L.; Wang, Ying A.; Siscovick, David S.; Rice, Kenneth M.; Wiklund, Per-Gunnar; Taylor, Kent D.; Evans, Alun; Kee, Frank; Rotter, Jerome I.; Karvanen, Juha; Kuulasmaa, Kari; Heiss, Gerardo; Kraft, Peter; Launer, Lenore J.; Hofman, Albert; Markus, Marcello R. P.; Rose, Lynda M.; Silander, Kaisa; Wagner, Peter; Benjamin, Emelia J.; Lohman, Kurt; Stott, David J.; Rivadeneira, Fernando; Harris, Tamara B.; Levy, Daniel; Liu, Yongmei; Rimm, Eric B.; Jukema, J. Wouter; Völzke, Henry; Ridker, Paul M.; Blankenberg, Stefan; Franco, Oscar H.; Gudnason, Vilmundur; Psaty, Bruce M.; Boerwinkle, Eric; O'Donnell, Christopher J.

    2016-01-01

    Background. Data are limited on genome-wide association studies (GWAS) for incident coronary heart disease (CHD). Moreover, it is not known whether genetic variants identified to date also associate with risk of CHD in a prospective setting. Methods. We performed a two-stage GWAS analysis of incident myocardial infarction (MI) and CHD in a total of 64,297 individuals (including 3898 MI cases, 5465 CHD cases). SNPs that passed an arbitrary threshold of 5×10⁻⁶ in Stage I were taken to Stage II for further discovery. Furthermore, in an analysis of prognosis, we studied whether known SNPs from former GWAS were associated with total mortality in individuals who experienced MI during follow-up. Results. In Stage I, 15 loci passed the threshold of 5×10⁻⁶; 8 loci for MI and 8 loci for CHD, for which one locus overlapped and none were reported in previous GWAS meta-analyses. We took 60 SNPs representing these 15 loci to Stage II of discovery. Four SNPs near QKI showed nominally significant association with MI (p-value<8.8×10⁻³) and three exceeded the genome-wide significance threshold when Stage I and Stage II results were combined (top SNP rs6941513: p = 6.2×10⁻⁹). Despite excellent power, the 9p21 locus SNP (rs1333049) was only modestly associated with MI (HR = 1.09, p-value = 0.02) and marginally with CHD (HR = 1.06, p-value = 0.08). Among an inception cohort of those who experienced MI during follow-up, the risk allele of rs1333049 was associated with a decreased risk of subsequent mortality (HR = 0.90, p-value = 3.2×10⁻³). Conclusions. QKI represents a novel locus that may serve as a predictor of incident CHD in prospective studies. The association of the 9p21 locus both with increased risk of first myocardial infarction and longer survival after MI highlights the importance of study design in investigating genetic determinants of complex disorders. PMID:26950853

  17. Chloroplast genome structure in Ilex (Aquifoliaceae)

    PubMed Central

    Yao, Xin; Tan, Yun-Hong; Liu, Ying-Ying; Song, Yu; Yang, Jun-Bo; Corlett, Richard T.

    2016-01-01

    Aquifoliaceae is the largest family in the campanulid order Aquifoliales. It consists of a single genus, Ilex, the hollies, which is the largest woody dioecious genus in the angiosperms. Most species are in East Asia or South America. The taxonomy and evolutionary history remain unclear due to the lack of a robust species-level phylogeny. We produced the first complete chloroplast genomes in this family, including seven Ilex species, by Illumina sequencing of long-range PCR products and subsequent reference-guided de novo assembly. These genomes have a typical bicyclic structure with a conserved genome arrangement and moderate divergence. The total length is 157,741 bp and there is one large single-copy region (LSC) with 87,109 bp, one small single-copy with 18,436 bp, and a pair of inverted repeat regions (IR) with 52,196 bp. A total of 144 genes were identified, including 96 protein-coding genes, 40 tRNA and 8 rRNA. Thirty-four repetitive sequences were identified in Ilex pubescens, with lengths >14 bp and identity >90%, and 11 divergence hotspot regions that could be targeted for phylogenetic markers. This study will contribute to improved resolution of deep branches of the Ilex phylogeny and facilitate identification of Ilex species. PMID:27378489
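The region lengths and gene counts quoted in the abstract above can be checked with simple arithmetic. The sketch below is only a consistency check on the reported figures; the variable names are illustrative, and it assumes that the 52,196 bp figure is the combined length of both inverted repeat copies, which is the reading under which the parts sum to the reported total.

```python
# Region lengths (bp) as reported in the abstract.
regions_bp = {
    "LSC": 87_109,      # large single-copy region
    "SSC": 18_436,      # small single-copy region
    "IR_pair": 52_196,  # assumption: both inverted repeat copies combined
}
reported_total_bp = 157_741

# Gene counts as reported in the abstract.
gene_counts = {"protein_coding": 96, "tRNA": 40, "rRNA": 8}
reported_gene_total = 144

computed_total_bp = sum(regions_bp.values())
computed_gene_total = sum(gene_counts.values())

print("region lengths consistent:", computed_total_bp == reported_total_bp)
print("gene counts consistent:", computed_gene_total == reported_gene_total)

# Share of the genome occupied by each region.
for name, length in regions_bp.items():
    print(f"{name}: {100 * length / reported_total_bp:.1f}% of the genome")
```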

  18. Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    PubMed Central

    Cherkasov, Artem; Ho Sui, Shannan J; Brunham, Robert C; Jones, Steven JM

    2004-01-01

    Background. We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results. We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions. The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics. PMID:15274750
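For readers unfamiliar with the Weibull function named in the abstract above, the sketch below writes out the standard two-parameter form and shows how the shape parameter β changes the hazard rate, which is the quantity reliability theory interprets. The abstract names only β; the scale parameter `eta`, the function names, and the sample values are assumptions made for illustration, and this is not the authors' fitting procedure.

```python
import math


def weibull_cdf(x, beta, eta):
    """Two-parameter Weibull CDF: F(x) = 1 - exp(-(x / eta) ** beta)."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-((x / eta) ** beta))


def weibull_hazard(x, beta, eta):
    """Weibull hazard rate: h(x) = (beta / eta) * (x / eta) ** (beta - 1)."""
    if x <= 0:
        raise ValueError("hazard is evaluated here only for x > 0")
    return (beta / eta) * (x / eta) ** (beta - 1)


# beta < 1: hazard falls with x; beta == 1: constant; beta > 1: hazard rises.
for beta in (0.5, 1.0, 2.0):
    hazards = [weibull_hazard(x, beta, eta=1.0) for x in (0.5, 1.0, 2.0)]
    print(f"beta={beta}: " + ", ".join(f"{h:.3f}" for h in hazards))
```

In reliability work, a shape parameter below 1 is read as a decreasing failure rate and a value above 1 as an increasing one, which is the kind of interpretation the abstract proposes to carry over to protein fold distributions.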

  19. Genome-wide analysis identifies novel loci associated with ovarian cancer outcomes: findings from the Ovarian Cancer Association Consortium

    PubMed Central

    Johnatty, Sharon E.; Tyrer, Jonathan P.; Kar, Siddhartha; Beesley, Jonathan; Lu, Yi; Gao, Bo; Fasching, Peter A.; Hein, Alexander; Ekici, Arif B.; Beckmann, Matthias W.; Lambrechts, Diether; Nieuwenhuysen, Els Van; Vergote, Ignace; Lambrechts, Sandrina; Rossing, Mary Anne; Doherty, Jennifer A.; Chang-Claude, Jenny; Modugno, Francesmary; Ness, Roberta B.; Moysich, Kirsten B.; Levine, Douglas A.; Kiemeney, Lambertus A.; Massuger, Leon F.A.G.; Gronwald, Jacek; Lubiński, Jan; Jakubowska, Anna; Cybulski, Cezary; Brinton, Louise; Lissowska, Jolanta; Wentzensen, Nicolas; Song, Honglin; Rhenius, Valerie; Campbell, Ian; Eccles, Diana; Sieh, Weiva; Whittemore, Alice S.; McGuire, Valerie; Rothstein, Joseph H.; Sutphen, Rebecca; Anton-Culver, Hoda; Ziogas, Argyrios; Gayther, Simon A.; Gentry-Maharaj, Aleksandra; Menon, Usha; Ramus, Susan J.; Pearce, Celeste L; Pike, Malcolm C; Stram, Daniel O.; Wu, Anna H.; Kupryjanczyk, Jolanta; Dansonka-Mieszkowska, Agnieszka; Rzepecka, Iwona K.; Spiewankiewicz, Beata; Goodman, Marc T.; Wilkens, Lynne R.; Carney, Michael E.; Thompson, Pamela J; Heitz, Florian; du Bois, Andreas; Schwaab, Ira; Harter, Philipp; Pisterer, Jacobus; Hillemanns, Peter; Karlan, Beth Y.; Walsh, Christine; Lester, Jenny; Orsulic, Sandra; Winham, Stacey J; Earp, Madalene; Larson, Melissa C.; Fogarty, Zachary C.; Høgdall, Estrid; Jensen, Allan; Kjaer, Susanne Kruger; Fridley, Brooke L.; Cunningham, Julie M.; Vierkant, Robert A.; Schildkraut, Joellen M.; Iversen, Edwin S.; Terry, Kathryn L.; Cramer, Daniel W.; Bandera, Elisa V.; Orlow, Irene; Pejovic, Tanja; Bean, Yukie; Høgdall, Claus; Lundvall, Lene; McNeish, Ian; Paul, James; Carty, Karen; Siddiqui, Nadeem; Glasspool, Rosalind; Sellers, Thomas; Kennedy, Catherine; Chiew, Yoke-Eng; Berchuck, Andrew; MacGregor, Stuart; deFazio, Anna; Pharoah, Paul D.P.; Goode, Ellen L.; deFazio, Anna; Webb, Penelope M.; Chenevix-Trench, Georgia

    2015-01-01

    Purpose. Chemotherapy resistance remains a major challenge in the treatment of ovarian cancer. We hypothesize that germline polymorphisms might be associated with clinical outcome. Experimental Design. We analyzed ~2.8 million genotyped and imputed SNPs from the iCOGS experiment for progression-free survival (PFS) and overall survival (OS) in 2,901 European epithelial ovarian cancer (EOC) patients who underwent first-line treatment of cytoreductive surgery and chemotherapy regardless of regimen, and in a subset of 1,098 patients treated with ≥4 cycles of paclitaxel and carboplatin at standard doses. We evaluated the top SNPs in 4,434 EOC patients including patients from The Cancer Genome Atlas. Additionally, we conducted pathway analysis of all intragenic SNPs and tested their association with PFS and OS using gene set enrichment analysis. Results. Five SNPs were significantly associated (p≤1.0×10⁻⁵) with poorer outcomes in at least one of the four analyses, three of which, rs4910232 (11p15.3), rs2549714 (16q23) and rs6674079 (1q22) were located in long non-coding RNAs (lncRNAs) RP11–179A10.1, RP11–314O13.1 and RP11–284F21.8 respectively (p≤7.1×10⁻⁶). ENCODE ChIP-seq data at 1q22 for normal ovary shows evidence of histone modification around RP11–284F21.8, and rs6674079 is perfectly correlated with another SNP within the super-enhancer MEF2D, expression levels of which were reportedly associated with prognosis in another solid tumor. YAP1- and WWTR1 (TAZ)-stimulated gene expression, and HDL-mediated lipid transport pathways were associated with PFS and OS, respectively, in the cohort who had standard chemotherapy (pGSEA≤6×10⁻³). Conclusion. We have identified SNPs in three lncRNAs that might be important targets for novel EOC therapies. PMID:26152742

  20. Target Selection and Annotation for the Structural Genomics of the Amidohydrolase and Enolase Superfamilies

    SciTech Connect

    Pieper, U.; Chiang, R; Seffernick, J; Brown, S; Glasner, M; Kelly, L; Eswar, N; Sauder, M; Bonanno, J; et al.

    2009-01-01

    To study the substrate specificity of enzymes, we use the amidohydrolase and enolase superfamilies as model systems; members of these superfamilies share a common TIM barrel fold and catalyze a wide range of chemical reactions. Here, we describe a collaboration between the Enzyme Specificity Consortium (ENSPEC) and the New York SGX Research Center for Structural Genomics (NYSGXRC) that aims to maximize the structural coverage of the amidohydrolase and enolase superfamilies. Using sequence- and structure-based protein comparisons, we first selected 535 target proteins from a variety of genomes for high-throughput structure determination by X-ray crystallography; 63 of these targets were not previously annotated as superfamily members. To date, 20 unique amidohydrolase and 41 unique enolase structures have been determined, increasing the fraction of sequences in the two superfamilies that can be modeled based on at least 30% sequence identity from 45% to 73%. We present case studies of proteins related to uronate isomerase (an amidohydrolase superfamily member) and mandelate racemase (an enolase superfamily member), to illustrate how this structure-focused approach can be used to generate hypotheses about sequence-structure-function relationships.

  1. A Meta-analysis of Four Genome-Wide Association Studies of Survival to Age 90 Years or Older: The Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium

    PubMed Central

    Walter, Stefan; Lunetta, Kathryn L.; Garcia, Melissa E.; Slagboom, P. Eline; Christensen, Kaare; Arnold, Alice M.; Aspelund, Thor; Aulchenko, Yurii S.; Benjamin, Emelia J.; Christiansen, Lene; D'Agostino, Ralph B.; Fitzpatrick, Annette L.; Franceschini, Nora; Glazer, Nicole L.; Gudnason, Vilmundur; Hofman, Albert; Kaplan, Robert; Karasik, David; Kelly-Hayes, Margaret; Kiel, Douglas P.; Launer, Lenore J.; Marciante, Kristin D.; Massaro, Joseph M.; Miljkovic, Iva; Nalls, Michael A.; Hernandez, Dena; Psaty, Bruce M.; Rivadeneira, Fernando; Rotter, Jerome; Seshadri, Sudha; Smith, Albert V.; Taylor, Kent D.; Tiemeier, Henning; Uh, Hae-Won; Uitterlinden, André G.; Vaupel, James W.; Walston, Jeremy; Westendorp, Rudi G. J.; Harris, Tamara B.; Lumley, Thomas; van Duijn, Cornelia M.; Murabito, Joanne M.

    2010-01-01

    Background. Genome-wide association studies (GWAS) may yield insights into longevity. Methods. We performed a meta-analysis of GWAS in Caucasians from four prospective cohort studies: the Age, Gene/Environment Susceptibility-Reykjavik Study, the Cardiovascular Health Study, the Framingham Heart Study, and the Rotterdam Study participating in the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium. Longevity was defined as survival to age 90 years or older (n = 1,836); the comparison group comprised cohort members who died between the ages of 55 and 80 years (n = 1,955). In a second discovery stage, additional genotyping was conducted in the Leiden Longevity Study cohort and the Danish 1905 cohort. Results. There were 273 single-nucleotide polymorphism (SNP) associations with p < .0001, but none reached the prespecified significance level of 5 × 10⁻⁸. Of the most significant SNPs, 24 were independent signals, and 16 of these SNPs were successfully genotyped in the second discovery stage, with one association for rs9664222, reaching 6.77 × 10⁻⁷ for the combined meta-analysis of CHARGE and the stage 2 cohorts. The SNP lies in a region near MINPP1 (chromosome 10), a well-conserved gene involved in regulation of cellular proliferation. The minor allele was associated with lower odds of survival past age 90 (odds ratio = 0.82). Associations of interest in a homologue of the longevity assurance gene (LASS3) and PAPPA2 were not strengthened in the second stage. Conclusion. Survival studies of larger size or more extreme or specific phenotypes may support or refine these initial findings. PMID:20304771
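The prespecified level of 5 × 10⁻⁸ quoted in the abstract above is the conventional genome-wide significance threshold. It is usually motivated as a Bonferroni correction of α = 0.05 for roughly one million independent common-variant tests. That rationale is general background rather than something the abstract states, and the names in the sketch below are illustrative.

```python
ALPHA = 0.05                   # family-wise error rate
INDEPENDENT_TESTS = 1_000_000  # assumption: approximate number of independent tests

# Bonferroni correction: divide the family-wise alpha by the number of tests.
genome_wide_threshold = ALPHA / INDEPENDENT_TESTS


def is_genome_wide_significant(p_value, threshold=genome_wide_threshold):
    """Return True when a p-value falls below the corrected threshold."""
    return p_value < threshold


# Combined p-value reported in the abstract for rs9664222.
combined_p_rs9664222 = 6.77e-7
print(f"threshold = {genome_wide_threshold:.1e}")
print("rs9664222 genome-wide significant:",
      is_genome_wide_significant(combined_p_rs9664222))
```

Under this threshold the combined p-value reported for rs9664222 does not qualify as genome-wide significant, since 6.77 × 10⁻⁷ is larger than 5 × 10⁻⁸.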

  2. The Single Nucleotide Polymorphism Consortium

    NASA Technical Reports Server (NTRS)

    Morgan, Michael

    2003-01-01

    I want to discuss both the Single Nucleotide Polymorphism (SNP) Consortium and the Human Genome Project. I am afraid most of my presentation will be thin on law and possibly too high on rhetoric. Having been engaged in a personal and direct way with these issues as a trained scientist, I find it quite difficult to be always as objective as I ought to be.

  3. Child Development and Structural Variation in the Human Genome

    ERIC Educational Resources Information Center

    Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.

    2013-01-01

    Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…

  4. Comparison of 6q25 Breast Cancer Hits from Asian and European Genome Wide Association Studies in the Breast Cancer Association Consortium (BCAC)

    PubMed Central

    Hein, Rebecca; Maranian, Melanie; Hopper, John L.; Kapuscinski, Miroslaw K.; Southey, Melissa C.; Park, Daniel J.; Schmidt, Marjanka K.; Broeks, Annegien; Hogervorst, Frans B. L.; Bueno-de-Mesquit, H. Bas; Muir, Kenneth R.; Lophatananon, Artitaya; Rattanamongkongul, Suthee; Puttawibul, Puttisak; Fasching, Peter A.; Hein, Alexander; Ekici, Arif B.; Beckmann, Matthias W.; Fletcher, Olivia; Johnson, Nichola; dos Santos Silva, Isabel; Peto, Julian; Sawyer, Elinor; Tomlinson, Ian; Kerin, Michael; Miller, Nicola; Marmee, Frederick; Schneeweiss, Andreas; Sohn, Christof; Burwinkel, Barbara; Guénel, Pascal; Cordina-Duverger, Emilie; Menegaux, Florence; Truong, Thérèse; Bojesen, Stig E.; Nordestgaard, Børge G.; Flyger, Henrik; Milne, Roger L.; Perez, Jose Ignacio Arias; Zamora, M. Pilar; Benítez, Javier; Anton-Culver, Hoda; Ziogas, Argyrios; Bernstein, Leslie; Clarke, Christina A.; Brenner, Hermann; Müller, Heiko; Arndt, Volker; Stegmaier, Christa; Rahman, Nazneen; Seal, Sheila; Turnbull, Clare; Renwick, Anthony; Meindl, Alfons; Schott, Sarah; Bartram, Claus R.; Schmutzler, Rita K.; Brauch, Hiltrud; Hamann, Ute; Ko, Yon-Dschun; Wang-Gohrke, Shan; Dörk, Thilo; Schürmann, Peter; Karstens, Johann H.; Hillemanns, Peter; Nevanlinna, Heli; Heikkinen, Tuomas; Aittomäki, Kristiina; Blomqvist, Carl; Bogdanova, Natalia V.; Zalutsky, Iosif V.; Antonenkova, Natalia N.; Bermisheva, Marina; Prokovieva, Darya; Farahtdinova, Albina; Khusnutdinova, Elza; Lindblom, Annika; Margolin, Sara; Mannermaa, Arto; Kataja, Vesa; Kosma, Veli-Matti; Hartikainen, Jaana; Chen, Xiaoqing; Beesley, Jonathan; Investigators, kConFab; Lambrechts, Diether; Zhao, Hui; Neven, Patrick; Wildiers, Hans; Nickels, Stefan; Flesch-Janys, Dieter; Radice, Paolo; Peterlongo, Paolo; Manoukian, Siranoush; Barile, Monica; Couch, Fergus J.; Olson, Janet E.; Wang, Xianshu; Fredericksen, Zachary; Giles, Graham G.; Baglietto, Laura; McLean, Catriona A.; Severi, Gianluca; Offit, Kenneth; Robson, Mark; Gaudet, Mia M.; Vijai, 
Joseph; Alnæs, Grethe Grenaker; Kristensen, Vessela; Børresen-Dale, Anne-Lise; John, Esther M.; Miron, Alexander; Winqvist, Robert; Pylkäs, Katri; Jukkola-Vuorinen, Arja; Grip, Mervi; Andrulis, Irene L.; Knight, Julia A.; Glendon, Gord; Mulligan, Anna Marie; Figueroa, Jonine D.; García-Closas, Montserrat; Lissowska, Jolanta; Sherman, Mark E.; Hooning, Maartje; Martens, John W. M.; Seynaeve, Caroline; Collée, Margriet; Hall, Per; Humpreys, Keith; Czene, Kamila; Liu, Jianjun; Cox, Angela; Brock, Ian W.; Cross, Simon S.; Reed, Malcolm W. R.; Ahmed, Shahana; Ghoussaini, Maya; Pharoah, Paul DP.; Kang, Daehee; Yoo, Keun-Young; Noh, Dong-Young; Jakubowska, Anna; Jaworska, Katarzyna; Durda, Katarzyna; Złowocka, Elżbieta; Sangrajrang, Suleeporn; Gaborieau, Valerie; Brennan, Paul; McKay, James; Shen, Chen-Yang; Yu, Jyh-Cherng; Hsu, Huan-Ming; Hou, Ming-Feng; Orr, Nick; Schoemaker, Minouk; Ashworth, Alan; Swerdlow, Anthony; Trentham-Dietz, Amy; Newcomb, Polly A.; Titus, Linda; Egan, Kathleen M.; Chenevix-Trench, Georgia; Antoniou, Antonis C.; Humphreys, Manjeet K.; Morrison, Jonathan; Chang-Claude, Jenny; Easton, Douglas F.; Dunning, Alison M.

    2012-01-01

    The 6q25.1 locus was first identified via a genome-wide association study (GWAS) in Chinese women and marked by single nucleotide polymorphism (SNP) rs2046210, approximately 180 Kb upstream of ESR1. There have been conflicting reports about the association of this locus with breast cancer in Europeans, and a GWAS in Europeans identified a different SNP, tagged here by rs12662670. We examined the associations of both SNPs in up to 61,689 cases and 58,822 controls from forty-four studies collaborating in the Breast Cancer Association Consortium, of which four studies were of Asian and 39 of European descent. Logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (CI). Case-only analyses were used to compare SNP effects in Estrogen Receptor positive (ER+) versus negative (ER−) tumours. Models including both SNPs were fitted to investigate whether the SNP effects were independent. Both SNPs are significantly associated with breast cancer risk in both ethnic groups. Per-allele ORs are higher in Asian than in European studies [rs2046210: OR (A/G) = 1.36 (95% CI 1.26–1.48), p = 7.6×10^−14 in Asians and 1.09 (95% CI 1.07–1.11), p = 6.8×10^−18 in Europeans. rs12662670: OR (G/T) = 1.29 (95% CI 1.19–1.41), p = 1.2×10^−9 in Asians and 1.12 (95% CI 1.08–1.17), p = 3.8×10^−9 in Europeans]. SNP rs2046210 is associated with a significantly greater risk of ER− than ER+ tumours in Europeans [OR (ER−) = 1.20 (95% CI 1.15–1.25), p = 1.8×10^−17 versus OR (ER+) = 1.07 (95% CI 1.04–1.1), p = 1.3×10^−7, p_heterogeneity = 5.1×10^−6]. In these Asian studies, by contrast, there is no clear evidence of a differential association by tumour receptor status. Each SNP is associated with risk after adjustment for the other SNP. These results suggest the presence of two variants at 6q25.1 each independently associated with breast cancer risk in Asians and in Europeans. Of these two, the one
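
    The odds ratios, confidence intervals and p-values reported in this record can be cross-checked against one another. The sketch below assumes a Wald-type interval on the log-odds scale (an assumption; the abstract only states that logistic regression was used) and backs out the standard error from the published 95% CI for rs2046210 in Asians. The function name `wald_check` is hypothetical.

```python
import math

def wald_check(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back out SE(log OR) from a 95% CI and return (se, z, two-sided p).

    Assumes the interval is symmetric on the log scale (Wald interval).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return se, z, p

# Values quoted in the abstract: OR = 1.36, 95% CI 1.26-1.48 (Asian studies).
se, z, p = wald_check(1.36, 1.26, 1.48)
print(f"SE(log OR) = {se:.4f}, z = {z:.2f}, p = {p:.1e}")
```

    Because the published interval is rounded to two decimals, the recomputed p-value should be close to, but not identical with, the reported one.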

  5. GAS STORAGE TECHNOLOGY CONSORTIUM

    SciTech Connect

    Robert W. Watson

    2004-04-23

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. To accomplish this objective, the project is divided into three phases that are managed and directed by the GSTC Coordinator. Base funding for the consortium is provided by the U.S. Department of Energy (DOE). In addition, funding is anticipated from the Gas Technology Institute (GTI). The first phase, Phase 1A, was initiated on September 30, 2003, and is scheduled for completion on March 31, 2004. 
Phase 1A of the project includes the creation of the GSTC structure, development of constitution (by-laws) for the consortium, and development and refinement of a technical approach (work plan) for

  6. GAS STORAGE TECHNOLOGY CONSORTIUM

    SciTech Connect

    Robert W. Watson

    2004-04-17

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. To accomplish this objective, the project is divided into three phases that are managed and directed by the GSTC Coordinator. Base funding for the consortium is provided by the U.S. Department of Energy (DOE). In addition, funding is anticipated from the Gas Technology Institute (GTI). The first phase, Phase 1A, was initiated on September 30, 2003, and is scheduled for completion on March 31, 2004. 
Phase 1A of the project includes the creation of the GSTC structure, development of constitution (by-laws) for the consortium, and development and refinement of a technical approach (work plan) for

  7. Gorilla genome structural variation reveals evolutionary parallelisms with chimpanzee.

    PubMed

    Ventura, Mario; Catacchio, Claudia R; Alkan, Can; Marques-Bonet, Tomas; Sajjadian, Saba; Graves, Tina A; Hormozdiari, Fereydoun; Navarro, Arcadi; Malig, Maika; Baker, Carl; Lee, Choli; Turner, Emily H; Chen, Lin; Kidd, Jeffrey M; Archidiacono, Nicoletta; Shendure, Jay; Wilson, Richard K; Eichler, Evan E

    2011-10-01

    Structural variation has played an important role in the evolutionary restructuring of human and great ape genomes. Recent analyses have suggested that the genomes of chimpanzee and human have been particularly enriched for this form of genetic variation. Here, we set out to assess the extent of structural variation in the gorilla lineage by generating 10-fold genomic sequence coverage from a western lowland gorilla and integrating these data into a physical and cytogenetic framework of structural variation. We discovered and validated over 7665 structural changes within the gorilla lineage, including sequence resolution of inversions, deletions, duplications, and mobile element insertions. A comparison with human and other ape genomes shows that the gorilla genome has been subjected to the highest rate of segmental duplication. We show that both the gorilla and chimpanzee genomes have experienced independent yet convergent patterns of structural mutation that have not occurred in humans, including the formation of subtelomeric heterochromatic caps, the hyperexpansion of segmental duplications, and bursts of retroviral integrations. Our analysis suggests that the chimpanzee and gorilla genomes are structurally more derived than either orangutan or human genomes. PMID:21685127

  8. International Lymphoma Epidemiology Consortium

    Cancer.gov

    The InterLymph Consortium, or formally the International Consortium of Investigators Working on Non-Hodgkin's Lymphoma Epidemiologic Studies, is an open scientific forum for epidemiologic research in non-Hodgkin's lymphoma.

  9. Multiple genome alignment for identifying the core structure among moderately related microbial genomes

    PubMed Central

    Uchiyama, Ikuo

    2008-01-01

    Background Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes where horizontal gene transfers are common. Although core genome identification appears to be obvious among very closely related genomes, it becomes more difficult when more distantly related genomes are compared. Here, we consider the core structure as a set of sufficiently long segments in which gene orders are conserved so that they are likely to have been inherited mainly through vertical transfer, and developed a method for identifying the core structure by finding the order of pre-identified orthologous groups (OGs) that maximally retains the conserved gene orders. Results The method was applied to genome comparisons of two well-characterized families, Bacillaceae and Enterobacteriaceae, and identified their core structures comprising 1438 and 2125 OGs, respectively. The core sets contained most of the essential genes and their related genes, which were primarily included in the intersection of the two core sets comprising around 700 OGs. The definition of the genomic core based on gene order conservation was demonstrated to be more robust than the simpler approach based only on gene conservation. We also investigated the core structures in terms of G+C content homogeneity and phylogenetic congruence, and found that the core genes primarily exhibited the expected characteristic, i.e., being indigenous and sharing the same history, more than the non-core genes. Conclusion The results demonstrate that our strategy of genome alignment based on gene order conservation can provide an effective approach to identify the genomic core among moderately related microbial genomes. PMID:18976470
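
    The idea of defining a core by conserved gene order, rather than by gene presence alone, can be illustrated with a deliberately simplified sketch. This is not the author's alignment method: it assumes each genome is given as an ordered list of orthologous-group (OG) identifiers, treats an adjacency as conserved only if it occurs in every genome (in either orientation), and reports runs of conserved adjacencies along the first genome. The function names and the `min_len` cutoff are hypothetical.

```python
def adjacent_pairs(genome):
    """Unordered pairs of neighbouring OGs, so inverted segments still match."""
    return {frozenset(pair) for pair in zip(genome, genome[1:])}

def conserved_segments(genomes, min_len=3):
    """Runs of OGs along genomes[0] whose adjacencies occur in every genome."""
    if not genomes or not genomes[0]:
        return []
    shared = set.intersection(*(adjacent_pairs(g) for g in genomes))
    reference = genomes[0]
    segments, current = [], [reference[0]]
    for prev, nxt in zip(reference, reference[1:]):
        if frozenset((prev, nxt)) in shared:
            current.append(nxt)          # adjacency conserved: extend the run
        else:
            if len(current) >= min_len:  # keep only sufficiently long runs
                segments.append(current)
            current = [nxt]
    if len(current) >= min_len:
        segments.append(current)
    return segments

genome_a = ["og1", "og2", "og3", "og4", "og9", "og5", "og6", "og7"]
genome_b = ["og7", "og6", "og5", "og1", "og2", "og3", "og4"]
print(conserved_segments([genome_a, genome_b]))
```

    In this toy input, og9 is present only in the first genome and interrupts the order, so it is excluded from any segment, while the inverted block og5-og6-og7 is still recognised.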

  10. THE FEDERAL INTEGRATED BIOTREATMENT RESEARCH CONSORTIUM (FLASK TO FIELD)

    EPA Science Inventory

    The Federal Integrated Biotreatment Research Consortium (Flask to Field) represented a 7-year concerted effort by several research laboratories to develop bioremediation technologies for contaminated DoD sites. The consortium structure consisted of a director and four thrust are...

  11. Genome Editing of Structural Variations: Modeling and Gene Correction.

    PubMed

    Park, Chul-Yong; Sung, Jin Jea; Kim, Dong-Wook

    2016-07-01

    The analysis of chromosomal structural variations (SVs), such as inversions and translocations, was made possible by the completion of the human genome project and the development of genome-wide sequencing technologies. SVs contribute to genetic diversity and evolution, although some SVs can cause diseases such as hemophilia A in humans. Genome engineering technology using programmable nucleases (e.g., ZFNs, TALENs, and CRISPR/Cas9) has been rapidly developed, enabling precise and efficient genome editing for SV research. Here, we review advances in modeling and gene correction of SVs, focusing on inversion, translocation, and nucleotide repeat expansion. PMID:27016031

  12. On the analysis of large-scale genomic structures.

    PubMed

    Oiwa, Nestor Norio; Goldman, Carla

    2005-01-01

    We apply methods from statistical physics (histograms, correlation functions, fractal dimensions, and singularity spectra) to characterize large-scale structure of the distribution of nucleotides along genomic sequences. We discuss the role of the extension of noncoding segments ("junk DNA") for the genomic organization, and the connection between the coding segment distribution and the high-eukaryotic chromatin condensation. The following sequences taken from GenBank were analyzed: complete genome of Xanthomonas campestris, complete genome of yeast, chromosome V of Caenorhabditis elegans, and human chromosome XVII around gene BRCA1. The results are compared with the random and periodic sequences and those generated by simple and generalized fractal Cantor sets. PMID:15858230

  13. Web-Based Arabidopsis Functional and Structural Genomics Resources

    PubMed Central

    Lu, Yan; Last, Robert L.

    2008-01-01

    As plant research moves to a “post-genomic” era, many diverse internet resources become available to the international research community. Arabidopsis thaliana, because of its small size, rapid life cycle and simple genome, has been a model system for decades, with much research funding and many projects devoted to creation of functional and structural genomics resources. Different types of data, including genome, transcriptome, proteome, phenome, metabolome and ionome are stored in these resources. In this chapter, a variety of genomics resources are introduced, with simple descriptions of how some can be accessed by laboratory researchers via the internet. PMID:22303243

  14. The Genomic Threading Database: a comprehensive resource for structural annotations of the genomes from key organisms.

    PubMed

    McGuffin, Liam J; Street, Stefano A; Bryson, Kevin; Sørensen, Søren-Aksel; Jones, David T

    2004-01-01

    Currently, the Genomic Threading Database (GTD) contains structural assignments for the proteins encoded within the genomes of nine eukaryotes and 101 prokaryotes. Structural annotations are carried out using a modified version of GenTHREADER, a reliable fold recognition method. The GenTHREADER annotation jobs are distributed across multiple clusters of processors using grid technology and the predictions are deposited in a relational database accessible via a web interface at http://bioinf.cs.ucl.ac.uk/GTD. Using this system, up to 84% of proteins encoded within a genome can be confidently assigned to known folds with 72% of the residues aligned. On average in the GTD, 64% of proteins encoded within a genome are confidently assigned to known folds and 58% of the residues are aligned to structures. PMID:14681393

  15. Structural Genomics of Minimal Organisms: Pipeline and Results

    SciTech Connect

    Kim, Sung-Hou; Shin, Dong-Hae; Kim, Rosalind; Adams, Paul; Chandonia, John-Marc

    2007-09-14

    The initial objective of the Berkeley Structural Genomics Center was to obtain near-complete three-dimensional (3D) structural information on all soluble proteins of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up, covering target selection, cloning, expression, purification, and ultimately structure determination. At the time of this writing, structural information for more than 93 percent of all soluble proteins of M. genitalium is available. This chapter summarizes the approaches taken by the authors' center.

  16. The bioleaching potential of a bacterial consortium.

    PubMed

    Latorre, Mauricio; Cortés, María Paz; Travisany, Dante; Di Genova, Alex; Budinich, Marko; Reyes-Jara, Angélica; Hödar, Christian; González, Mauricio; Parada, Pilar; Bobadilla-Fazzini, Roberto A; Cambiazo, Verónica; Maass, Alejandro

    2016-10-01

    This work presents the molecular foundation of a consortium of five efficient bacteria strains isolated from copper mines currently used in state of the art industrial-scale biotechnology. The strains Acidithiobacillus thiooxidans Licanantay, Acidiphilium multivorum Yenapatur, Leptospirillum ferriphilum Pañiwe, Acidithiobacillus ferrooxidans Wenelen and Sulfobacillus thermosulfidooxidans Cutipay were selected for genome sequencing based on metal tolerance, oxidation activity and bioleaching of copper efficiency. An integrated model of metabolic pathways representing the bioleaching capability of this consortium was generated. Results revealed that greater efficiency in copper recovery may be explained by the higher functional potential of L. ferriphilum Pañiwe and At. thiooxidans Licanantay to oxidize iron and reduced inorganic sulfur compounds. The consortium had a greater capacity to resist copper, arsenic and chloride ion compared to previously described biomining strains. Specialization and particular components in these bacteria provided the consortium a greater ability to bioleach copper sulfide ores. PMID:27416516

  17. Genomic structure of the human caldesmon gene.

    PubMed Central

    Hayashi, K; Yano, H; Hashida, T; Takeuchi, R; Takeda, O; Asada, K; Takahashi, E; Kato, I; Sobue, K

    1992-01-01

    The high molecular weight caldesmon (h-CaD) is predominantly expressed in smooth muscles, whereas the low molecular weight caldesmon (l-CaD) is widely distributed in nonmuscle tissues and cells. The changes in CaD isoform expression are closely correlated with the phenotypic modulation of smooth muscle cells. During a search for isoform diversity of human CaDs, l-CaD cDNAs were cloned from HeLa S3 cells. HeLa l-CaD I is composed of 558 amino acids, whereas 26 amino acids (residues 202-227 for HeLa l-CaD I) are deleted in HeLa l-CaD II. The short amino-terminal sequence of HeLa l-CaDs is different from that of fibroblast (WI-38) l-CaD II and human aorta h-CaD. We have also identified WI-38 l-CaD I, which contains a 26-amino acid insertion relative to WI-38 l-CaD II. To reveal the molecular events of the expressional regulation of the CaD isoforms, the genomic structure of the human CaD gene was determined. The human CaD gene is composed of 14 exons and was mapped to a single locus, 7q33-q34. The 26-amino acid insertion is encoded in exon 4 and is specifically spliced in the mRNAs for both h-CaD and l-CaDs I. Exon 3 is the exon that encodes the central repeating domain specific to h-CaD (residues 208-436) together with the common domain in all CaD (residues 73-207 for h-CaD and WI-38 l-CaDs, and residues 68-201 for HeLa l-CaDs). The regulation of h- and l-CaD expression is thought to depend on selection of the two 5' splice sites within exon 3. Thus, the change in expression between l-CaD and h-CaD might be caused by this splicing pathway. PMID:1465449

  18. Structural genomics-impact on biomedicine and drug discovery.

    PubMed

    Weigelt, Johan

    2010-05-01

    The field of structural genomics emerged as one of many 'omics disciplines more than a decade ago, and a multitude of large scale initiatives have been launched across the world. Development and implementation of methods for high-throughput structural biology represents a common denominator among different structural genomics programs. From another perspective a distinction between "biology-driven" versus "structure-driven" approaches can be made. This review outlines the general themes of structural genomics, its achievements and its impact on biomedicine and drug discovery. The growing number of high resolution structures of known and potential drug target proteins is expected to have tremendous value for future drug discovery programs. Moreover, the availability of large numbers of purified proteins enables generation of tool reagents, such as chemical probes and antibodies, to further explore protein function in the cell. PMID:20211166

  19. Strawberry Part 3 - structural and functional genomics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The area of strawberry genomics is rapidly changing because of the burgeoning interest in, and need for, reference plants for the Rosaceae family, which contains many important fruit, nut, ornamental and wood crops, including peach, apple, almond, rose and cherry. This chapter describes the current...

  20. PSI-2: Structural Genomics to Cover Protein Domain Family Space

    PubMed Central

    Dessailly, Benoît H.; Nair, Rajesh; Jaroszewski, Lukasz; Fajardo, J. Eduardo; Kouranov, Andrei; Lee, David; Fiser, Andras; Godzik, Adam; Rost, Burkhard; Orengo, Christine

    2010-01-01

    Summary One major objective of structural genomics efforts, including the NIH-funded Protein Structure Initiative (PSI), has been to increase the structural coverage of protein sequence space. Here, we present the target selection strategy used during the second phase of PSI (PSI-2). This strategy, jointly devised by the bioinformatics groups associated with the PSI-2 large-scale production centres, targets representatives from large, structurally uncharacterised protein domain families, and from structurally uncharacterised subfamilies in very large and diverse families with incomplete structural coverage. These very large families are extremely diverse both structurally and functionally, and are highly over-represented in known proteomes. On the basis of several metrics, we then discuss to what extent PSI-2, during its first three years, has increased the structural coverage of genomes, and contributed structural and functional novelty. Together, the results presented here suggest that PSI-2 is successfully meeting its objectives and provides useful insights into structural and functional space. PMID:19523904

  1. Terragenome: International Soil Metagenome Sequencing Consortium (GSC8 Meeting)

    ScienceCinema

    Jansson, Janet [LBNL

    2011-04-29

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year funding "Research Coordination Network" from the National Science Foundation and was held at the DOE Joint Genome Institute with organizational support provided by the JGI and by the University of California - San Diego. Janet Jansson of the Lawrence Berkeley National Laboratory discusses the Terragenome Initiative at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 9, 2009

  2. Structural and Operational Complexity of the Geobacter Sulfurreducens Genome

    SciTech Connect

    Qiu, Yu; Cho, Byung-Kwan; Park, Young S.; Lovley, Derek R.; Palsson, Bernhard O.; Zengler, Karsten

    2010-06-30

    Prokaryotic genomes can be annotated based on their structural, operational, and functional properties. These annotations provide the pivotal scaffold for understanding cellular functions on a genome-scale, such as metabolism and transcriptional regulation. Here, we describe a systems approach to simultaneously determine the structural and operational annotation of the Geobacter sulfurreducens genome. Integration of proteomics, transcriptomics, RNA polymerase, and sigma factor-binding information with deep-sequencing-based analysis of primary 5′-end transcripts allowed for a most precise annotation. The structural annotation is comprised of numerous previously undetected genes, noncoding RNAs, prevalent leaderless mRNA transcripts, and antisense transcripts. When compared with other prokaryotes, we found that the number of antisense transcripts reversely correlated with genome size. The operational annotation consists of 1453 operons, 22% of which have multiple transcription start sites that use different RNA polymerase holoenzymes. Several operons with multiple transcription start sites encoded genes with essential functions, giving insight into the regulatory complexity of the genome. The experimentally determined structural and operational annotations can be combined with functional annotation, yielding a new three-level annotation that greatly expands our understanding of prokaryotic genomes.

  3. Structure and replication of geminivirus genomes.

    PubMed

    Davies, J W; Stanley, J; Donson, J; Mullineaux, P M; Boulton, M I

    1987-01-01

    The geminiviruses are a group of plant viruses containing single-stranded (ss) DNA in particles comprising two quasi-icosahedral units. Some are transmitted by whiteflies, others by leafhoppers. Comparisons were made of the genome organization and expression of cassava latent virus (CLV) and maize streak virus (MSV) and beet curly top virus (BCTV), each with distinct host range and insect vector species characteristics. From these studies, several indications as to the replication mechanism(s) are suggested. PMID:3503890

  4. Radiogenomics Consortium (RGC)

    Cancer.gov

    The Radiogenomics Consortium's hypothesis is that a cancer patient's likelihood of developing toxicity to radiation therapy is influenced by common genetic variations, such as single nucleotide polymorphisms (SNPs).

  5. Community structure and PAH ring-hydroxylating dioxygenase genes of a marine pyrene-degrading microbial consortium.

    PubMed

    Gallego, Sara; Vila, Joaquim; Tauler, Margalida; Nieto, José María; Breugelmans, Philip; Springael, Dirk; Grifoll, Magdalena

    2014-07-01

    Marine microbial consortium UBF, enriched from a beach polluted by the Prestige oil spill and highly efficient in degrading this heavy fuel, was subcultured in pyrene minimal medium. The pyrene-degrading subpopulation (UBF-Py) mineralized 31 % of pyrene without accumulation of partially oxidized intermediates indicating the cooperation of different microbial components in substrate mineralization. The microbial community composition was characterized by culture dependent and PCR based methods (PCR-DGGE and clone libraries). Molecular analyses showed a highly stable community composed by Alphaproteobacteria (84 %, Breoghania, Thalassospira, Paracoccus, and Martelella) and Actinobacteria (16 %, Gordonia). The members of Thalassospira and Gordonia were not recovered as pure cultures, but five additional strains, not detected in the molecular analysis, that classified within the genera Novosphingobium, Sphingopyxis, Aurantimonas (Alphaproteobacteria), Alcanivorax (Gammaproteobacteria) and Micrococcus (Actinobacteria), were isolated. None of the isolates degraded pyrene or other PAHs in pure culture. PCR amplification of Gram-positive and Gram-negative dioxygenase genes did not produce results with any of the cultured strains. However, sequences related to the NidA3 pyrene dioxygenase present in mycobacterial strains were detected in UBF-Py consortium, suggesting the representative of Gordonia as the key pyrene degrader, which is consistent with a preeminent role of actinobacteria in pyrene removal in coastal environments affected by marine oil spills. PMID:24356981

  6. Breast and Prostate Cancer Cohort Consortium (BPC3)

    Cancer.gov

    Breast and Prostate Cancer Cohort Consortium collaborates with three genomic facilities, epidemiologists, population geneticists, and biostatisticians from multiple institutions to study hormone-related gene variants and environmental factors in breast and prostate cancers.

  7. Population-based 3D genome structure analysis reveals driving forces in spatial genome organization

    PubMed Central

    Li, Wenyuan; Kalhor, Reza; Dai, Chao; Hao, Shengli; Gong, Ke; Zhou, Yonggang; Li, Haochen; Zhou, Xianghong Jasmine; Le Gros, Mark A.; Larabell, Carolyn A.; Chen, Lin; Alber, Frank

    2016-01-01

    Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization. PMID:26951677

  8. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    NASA Astrophysics Data System (ADS)

    Chechetkin, V. R.; Lobzin, V. V.

    2004-07-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions.
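
    As a rough illustration of the quantity described in this abstract, the sketch below computes a spectral entropy from Fourier structure factors of a DNA string. It assumes one common formulation: the squared Fourier amplitudes of the four nucleotide indicator sequences are summed, normalized to unit total over the non-zero harmonics, and fed into the Shannon entropy. The function names, the normalization, and the choice of harmonics are assumptions made for illustration and may differ from the authors' exact definitions.

```python
import cmath
import math


def structure_factors(seq, base):
    """Squared DFT amplitudes of the 0/1 indicator sequence for one nucleotide.

    Returns values for harmonics k = 1 .. n//2; the k = 0 term (which only
    reflects base composition) is skipped.
    """
    n = len(seq)
    indicator = [1.0 if s == base else 0.0 for s in seq]
    factors = []
    for k in range(1, n // 2 + 1):
        amp = sum(
            x * cmath.exp(-2j * math.pi * k * m / n)
            for m, x in enumerate(indicator)
        )
        factors.append(abs(amp) ** 2)
    return factors


def spectral_entropy(seq, alphabet="ACGT"):
    """Shannon entropy (natural log) of the normalized, base-summed spectrum.

    Low values indicate power concentrated in a few harmonics (strong
    ordering); high values indicate a flat, disordered spectrum.
    """
    n = len(seq)
    total = [0.0] * (n // 2)
    for base in alphabet:
        for i, f in enumerate(structure_factors(seq, base)):
            total[i] += f
    norm = sum(total)
    # Illustrative guard: a sequence with no power outside k = 0
    # (e.g. a homopolymer) is treated as having zero entropy.
    if norm < 1e-12 * n * n:
        return 0.0
    probs = [f / norm for f in total]
    return -sum(p * math.log(p) for p in probs if p > 0.0)
```

    For a perfectly periodic input such as "ACGT" repeated eight times, the power falls on two harmonics of equal weight, so the sketch returns approximately ln 2. The direct O(N^2) transform is used only to keep the example dependency-free; an FFT would be the natural choice for genomic-scale sequences.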

  9. GAS STORAGE TECHNOLOGY CONSORTIUM

    SciTech Connect

    Robert W. Watson

    2004-10-18

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. To accomplish this objective, the project is divided into three phases that are managed and directed by the GSTC Coordinator. The first phase, Phase 1A, was initiated on September 30, 2003, and was completed on March 31, 2004. Phase 1A of the project included the creation of the GSTC structure, development and refinement of a technical approach (work plan) for deliverability enhancement and reservoir management. This report deals with Phase 1B and encompasses the period July 1, 2004, through September 30, 2004. During this time period there were three main activities. First was the ongoing

  10. GAS STORAGE TECHNOLOGY CONSORTIUM

    SciTech Connect

    Robert W. Watson

    2004-07-15

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. To accomplish this objective, the project is divided into three phases that are managed and directed by the GSTC Coordinator. Base funding for the consortium is provided by the U.S. Department of Energy (DOE). In addition, funding is anticipated from the Gas Technology Institute (GTI). The first phase, Phase 1A, was initiated on September 30, 2003, and was completed on March 31, 2004. Phase 1A of the project included the creation of the GSTC structure, development and refinement of a technical approach (work plan) for deliverability enhancement and reservoir management. This report deals with

  11. Structural Genomics and Drug Discovery for Infectious Diseases

    SciTech Connect

    Anderson, W.F.

    2010-09-03

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three-dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure-aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high-throughput structural biology technologies to the characterization of proteins from the National Institute of Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure-aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High-throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  12. Genome Pool Strategy for Structural Coverage of Protein Families

    SciTech Connect

    Jaroszewski, L.; Slabinski, L.; Wooley, J.; Deacon, A.M.; Lesley, S.A.; Wilson, I.A.; Godzik, A.

    2009-05-18

    Even closely homologous proteins often have different crystallization properties and propensities. This observation can be used to introduce an additional dimension into crystallization trials by simultaneously targeting multiple homologs in what we call a 'genome pool' strategy. We show that this strategy works because protein physicochemical properties correlated with crystallization success have a surprisingly broad distribution within most protein families. There are also easy and difficult families where this distribution is tilted in one direction. This leads to uneven structural coverage of protein families, with more easy ones solved. Increasing the size of the genome pool can improve chances of solving the difficult ones. In contrast, our analysis does not indicate that any specific genomes are easy or difficult. Finally, we show that the group of proteins with known 3D structures is systematically different from the general pool of known proteins and we assess the structural consequences of these differences.

  13. The Impact of Structural Genomics: Expectations and Outcomes

    SciTech Connect

    Chandonia, John-Marc; Brenner, Steven E.

    2005-12-21

    Structural Genomics (SG) projects aim to expand our structural knowledge of biological macromolecules, while lowering the average costs of structure determination. We quantitatively analyzed the novelty, cost, and impact of structures solved by SG centers, and contrasted these results with traditional structural biology. The first structure from a protein family is particularly important to reveal the fold and ancient relationships to other proteins. In the last year, approximately half of such structures were solved at an SG center rather than in a traditional laboratory. Furthermore, the cost of solving a structure at the most efficient U.S. center has now dropped to one-quarter the estimated cost of solving a structure by traditional methods. However, top structural biology laboratories are much more efficient than the average, and comparable to SG centers despite working on very challenging structures. Moreover, traditional structural biology papers are cited significantly more often, suggesting greater current impact.

  14. Benefits of Structural Genomics for Drug Discovery Research

    SciTech Connect

    Grabowski, M.; Chruszcz, M; Zimmerman, M; Kirillova, O; Minor, W

    2009-01-01

    While three-dimensional structures have long been used to search for new drug targets, only a fraction of new drugs coming to the market has been developed with the use of a structure-based drug discovery approach. However, recent years have brought not only an avalanche of new macromolecular structures, but also significant advances in the protein structure determination methodology only now making their way into structure-based drug discovery. In this paper, we review recent developments resulting from the Structural Genomics (SG) programs, focusing on the methods and results most likely to improve our understanding of the molecular foundation of human diseases. SG programs have been around for almost a decade, and in that time, have contributed a significant part of the structural coverage of both the genomes of pathogens causing infectious diseases and structurally uncharacterized biological processes in general. Perhaps most importantly, SG programs have developed new methodology at all steps of the structure determination process, not only to determine new structures highly efficiently, but also to screen protein/ligand interactions. We describe the methodologies, experience and technologies developed by SG, which range from improvements to cloning protocols to improved procedures for crystallographic structure solution that may be applied in 'traditional' structural biology laboratories, particularly those performing drug discovery. We also discuss the conditions that must be met to convert the present high-throughput structure determination pipeline into a high-output structure-based drug discovery system.

  15. Coevolution of the Organization and Structure of Prokaryotic Genomes.

    PubMed

    Touchon, Marie; Rocha, Eduardo P C

    2016-01-01

    The cytoplasm of prokaryotes contains many molecular machines interacting directly with the chromosome. These vital interactions depend on the chromosome structure, as a molecule, and on the genome organization, as a unit of genetic information. Strong selection for the organization of the genetic elements implicated in these interactions drives replicon ploidy, gene distribution, operon conservation, and the formation of replication-associated traits. The genomes of prokaryotes are also very plastic with high rates of horizontal gene transfer and gene loss. The evolutionary conflicts between plasticity and organization lead to the formation of regions with high genetic diversity whose impact on chromosome structure is poorly understood. Prokaryotic genomes are remarkable documents of natural history because they carry the imprint of all of these selective and mutational forces. Their study allows a better understanding of molecular mechanisms, their impact on microbial evolution, and how they can be tinkered with in synthetic biology. PMID:26729648

  16. Northeast Artificial Intelligence Consortium annual report, 1986. Volume 8. Part B. Parallel, structural, and optimal techniques in vision. Interim technical report, January-December 1986

    SciTech Connect

    Brown, C.M.

    1988-06-01

    The Northeast Artificial Intelligence Consortium's purpose is to conduct pertinent research in artificial intelligence and to perform activities ancillary to this research. These volumes describe progress that has been made in the second year of the existence of the NAIC on the technical research tasks undertaken at the member universities. The topics covered in general are: versatile expert system for equipment maintenance, distributed AI for communications system control, automatic photo interpretation, time-oriented problem solving, speech understanding systems, knowledge base maintenance, hardware architectures for very large systems, knowledge-based reasoning and planning, and a knowledge acquisition, assistance, and explanation system. This part addresses various aspects of parallel, structural, and optimal techniques in computer vision.

  17. Symbolic extensions applied to multiscale structure of genomes.

    PubMed

    Downarowicz, Tomasz; Travisany, Dante; Montecino, Martin; Maass, Alejandro

    2014-06-01

    A genome of a living organism consists of a long string of symbols over a finite alphabet carrying critical information for the organism. This includes its ability to control postnatal growth, homeostasis, adaptation to changes in the surrounding environment, or to biochemically respond at the cellular level to various specific regulatory signals. In this sense, a genome represents a symbolic encoding of a highly organized system of information whose functioning may be revealed as a natural multilayer structure in terms of complexity and prominence. In this paper we use the mathematical theory of symbolic extensions as a framework to shed light onto how this multilayer organization is reflected in the symbolic coding of the genome. The distribution of data in an element of a standard symbolic extension of a dynamical system has a specific form: the symbolic sequence is divided into several subsequences (which we call layers) encoding the dynamics on various "scales". We propose that a similar structure resides within the genomes, building our analogy on some of the most recent findings in the field of regulation of genomic DNA functioning. PMID:24728912

  18. Mitochondrial Disease Sequence Data Resource (MSeqDR): a global grass-roots consortium to facilitate deposition, curation, annotation, and integrated analysis of genomic data for the mitochondrial disease clinical and research communities.

    PubMed

    Falk, Marni J; Shen, Lishuang; Gonzalez, Michael; Leipzig, Jeremy; Lott, Marie T; Stassen, Alphons P M; Diroma, Maria Angela; Navarro-Gomez, Daniel; Yeske, Philip; Bai, Renkui; Boles, Richard G; Brilhante, Virginia; Ralph, David; DaRe, Jeana T; Shelton, Robert; Terry, Sharon F; Zhang, Zhe; Copeland, William C; van Oven, Mannis; Prokisch, Holger; Wallace, Douglas C; Attimonelli, Marcella; Krotoski, Danuta; Zuchner, Stephan; Gai, Xiaowu

    2015-03-01

    Success rates for genomic analyses of highly heterogeneous disorders can be greatly improved if a large cohort of patient data is assembled to enhance collective capabilities for accurate sequence variant annotation, analysis, and interpretation. Indeed, molecular diagnostics requires the establishment of robust data resources to enable data sharing that informs accurate understanding of genes, variants, and phenotypes. The "Mitochondrial Disease Sequence Data Resource (MSeqDR) Consortium" is a grass-roots effort facilitated by the United Mitochondrial Disease Foundation to identify and prioritize specific genomic data analysis needs of the global mitochondrial disease clinical and research community. A central Web portal (https://mseqdr.org) facilitates the coherent compilation, organization, annotation, and analysis of sequence data from both nuclear and mitochondrial genomes of individuals and families with suspected mitochondrial disease. This Web portal provides users with a flexible and expandable suite of resources to enable variant-, gene-, and exome-level sequence analysis in a secure, Web-based, and user-friendly fashion. Users can also elect to share data with other MSeqDR Consortium members, or even the general public, either by custom annotation tracks or through the use of a convenient distributed annotation system (DAS) mechanism. A range of data visualization and analysis tools are provided to facilitate user interrogation and understanding of genomic, and ultimately phenotypic, data of relevance to mitochondrial biology and disease. Currently available tools for nuclear and mitochondrial gene analyses include an MSeqDR GBrowse instance that hosts optimized mitochondrial disease and mitochondrial DNA (mtDNA) specific annotation tracks, as well as an MSeqDR locus-specific database (LSDB) that curates variant data on more than 1300 genes that have been implicated in mitochondrial disease and/or encode mitochondria-localized proteins. MSeqDR is

  19. NCI Cohort Consortium Membership

    Cancer.gov

    The NCI Cohort Consortium membership is international and includes investigators responsible for more than 40 high-quality cohorts who are studying large and diverse populations in more than 15 different countries.

  20. Genomic Structure and Evolution of Multigene Families: “Flowers” on the Human Genome

    PubMed Central

    Kim, Hie Lim; Iwase, Mineyo; Igawa, Takeshi; Nishioka, Tasuku; Kaneko, Satoko; Katsura, Yukako; Takahata, Naoyuki; Satta, Yoko

    2012-01-01

    We report the results of an extensive investigation of genomic structures in the human genome, with a particular focus on relatively large repeats (>50 kb) in adjacent chromosomal regions. We named such structures “Flowers” because the pattern observed on dot plots resembles a flower. We detected a total of 291 Flowers in the human genome. They were predominantly located in euchromatic regions. Flowers are gene-rich compared to the average gene density of the genome. Genes involved in systems receiving environmental information, such as immunity and detoxification, were overrepresented in Flowers. Within a Flower, the mean number of duplication units was approximately four. The maximum and minimum identities between homologs in a Flower showed different distributions; the maximum identity was often concentrated to 100% identity, while the minimum identity was evenly distributed in the range of 78% to 100%. Using a gene conversion detection test, we found frequent and/or recent gene conversion events within the tested Flowers. Interestingly, many of those converted regions contained protein-coding genes. Computer simulation studies suggest that one role of such frequent gene conversions is the elongation of the life span of gene families in a Flower by the resurrection of pseudogenes. PMID:22779033

  1. Genomic structure and evolution of multigene families: "flowers" on the human genome.

    PubMed

    Kim, Hie Lim; Iwase, Mineyo; Igawa, Takeshi; Nishioka, Tasuku; Kaneko, Satoko; Katsura, Yukako; Takahata, Naoyuki; Satta, Yoko

    2012-01-01

    We report the results of an extensive investigation of genomic structures in the human genome, with a particular focus on relatively large repeats (>50 kb) in adjacent chromosomal regions. We named such structures "Flowers" because the pattern observed on dot plots resembles a flower. We detected a total of 291 Flowers in the human genome. They were predominantly located in euchromatic regions. Flowers are gene-rich compared to the average gene density of the genome. Genes involved in systems receiving environmental information, such as immunity and detoxification, were overrepresented in Flowers. Within a Flower, the mean number of duplication units was approximately four. The maximum and minimum identities between homologs in a Flower showed different distributions; the maximum identity was often concentrated to 100% identity, while the minimum identity was evenly distributed in the range of 78% to 100%. Using a gene conversion detection test, we found frequent and/or recent gene conversion events within the tested Flowers. Interestingly, many of those converted regions contained protein-coding genes. Computer simulation studies suggest that one role of such frequent gene conversions is the elongation of the life span of gene families in a Flower by the resurrection of pseudogenes. PMID:22779033

  2. Hierarchical structure analysis describing abnormal base composition of genomes

    NASA Astrophysics Data System (ADS)

    Ouyang, Zhengqing; Liu, Jian-Kun; She, Zhen-Su

    2005-10-01

    Abnormal base compositional patterns of genomic DNA sequences are studied in the framework of a hierarchical structure (HS) model originally proposed for the study of fully developed turbulence [She and Lévêque, Phys. Rev. Lett. 72, 336 (1994)]. The HS similarity law is verified over scales between 10(3) bp and 10(5) bp, and the HS parameter β is proposed to describe the degree of heterogeneity in the base composition patterns. More than one hundred bacteria, archaea, virus, yeast, and human genome sequences have been analyzed and the results show that the HS analysis efficiently captures abnormal base composition patterns, and the parameter β is a characteristic measure of the genome. Detailed examination of the values of β reveals an intriguing link to the evolutionary events of genetic material transfer. Finally, a sequence complexity (S) measure is proposed to characterize the gradual increase of organizational complexity of the genome during evolution. The present study raises several interesting issues in the evolutionary history of genomes.

  3. Structural analysis of hepatitis C RNA genome using DNA microarrays

    PubMed Central

    Martell, María; Briones, Carlos; de Vicente, Aránzazu; Piron, María; Esteban, Juan I.; Esteban, Rafael; Guardia, Jaime; Gómez, Jordi

    2004-01-01

    Many studies have tried to identify specific nucleotide sequences in the quasispecies of hepatitis C virus (HCV) that determine resistance or sensitivity to interferon (IFN) therapy, unfortunately without conclusive results. Although viral proteins represent the most evident phenotype of the virus, genomic RNA sequences determine secondary and tertiary structures which are also part of the viral phenotype and can be involved in important biological roles. In this work, a method of RNA structure analysis has been developed based on the hybridization of labelled HCV transcripts to microarrays of complementary DNA oligonucleotides. Hybridizations were carried out under non-denaturing conditions, using appropriate temperature and buffer composition to allow binding to the immobilized probes of the RNA transcript without disturbing its secondary/tertiary structural motifs. Oligonucleotides printed onto the microarray covered the entire 5′ non-coding region (5′NCR), the first three-quarters of the core region, the E2–NS2 junction and the first 400 nt of the NS3 region. We document the use of this methodology to analyse the structural degree of a large region of HCV genomic RNA in two genotypes associated with different responses to IFN treatment. The results reported here show different degrees of structure along the genome regions analysed, and differential hybridization patterns for distinct genotypes in NS2 and NS3 HCV regions. PMID:15247323

  4. Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species

    PubMed Central

    Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

    2008-01-01

    Background: The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. Results: The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Ginkgo, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. Conclusion: The observed differences in genomic structure between C. japonica and other land plants, including

  5. Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53949).

    PubMed

    Davies, G; Armstrong, N; Bis, J C; Bressler, J; Chouraki, V; Giddaluru, S; Hofer, E; Ibrahim-Verbaas, C A; Kirin, M; Lahti, J; van der Lee, S J; Le Hellard, S; Liu, T; Marioni, R E; Oldmeadow, C; Postmus, I; Smith, A V; Smith, J A; Thalamuthu, A; Thomson, R; Vitart, V; Wang, J; Yu, L; Zgaga, L; Zhao, W; Boxall, R; Harris, S E; Hill, W D; Liewald, D C; Luciano, M; Adams, H; Ames, D; Amin, N; Amouyel, P; Assareh, A A; Au, R; Becker, J T; Beiser, A; Berr, C; Bertram, L; Boerwinkle, E; Buckley, B M; Campbell, H; Corley, J; De Jager, P L; Dufouil, C; Eriksson, J G; Espeseth, T; Faul, J D; Ford, I; Gottesman, R F; Griswold, M E; Gudnason, V; Harris, T B; Heiss, G; Hofman, A; Holliday, E G; Huffman, J; Kardia, S L R; Kochan, N; Knopman, D S; Kwok, J B; Lambert, J-C; Lee, T; Li, G; Li, S-C; Loitfelder, M; Lopez, O L; Lundervold, A J; Lundqvist, A; Mather, K A; Mirza, S S; Nyberg, L; Oostra, B A; Palotie, A; Papenberg, G; Pattie, A; Petrovic, K; Polasek, O; Psaty, B M; Redmond, P; Reppermund, S; Rotter, J I; Schmidt, H; Schuur, M; Schofield, P W; Scott, R J; Steen, V M; Stott, D J; van Swieten, J C; Taylor, K D; Trollor, J; Trompet, S; Uitterlinden, A G; Weinstein, G; Widen, E; Windham, B G; Jukema, J W; Wright, A F; Wright, M J; Yang, Q; Amieva, H; Attia, J R; Bennett, D A; Brodaty, H; de Craen, A J M; Hayward, C; Ikram, M A; Lindenberger, U; Nilsson, L-G; Porteous, D J; Räikkönen, K; Reinvang, I; Rudan, I; Sachdev, P S; Schmidt, R; Schofield, P R; Srikanth, V; Starr, J M; Turner, S T; Weir, D R; Wilson, J F; van Duijn, C; Launer, L; Fitzpatrick, A L; Seshadri, S; Mosley, T H; Deary, I J

    2015-02-01

    General cognitive function is substantially heritable across the human life course from adolescence to old age. We investigated the genetic contribution to variation in this important, health- and well-being-related trait in middle-aged and older adults. We conducted a meta-analysis of genome-wide association studies of 31 cohorts (N=53,949) in which the participants had undertaken multiple, diverse cognitive tests. A general cognitive function phenotype was tested for, and created in each cohort by principal component analysis. We report 13 genome-wide significant single-nucleotide polymorphism (SNP) associations in three genomic regions, 6q16.1, 14q12 and 19q13.32 (best SNP and closest gene, respectively: rs10457441, P=3.93 × 10(-9), MIR2113; rs17522122, P=2.55 × 10(-8), AKAP6; rs10119, P=5.67 × 10(-9), APOE/TOMM40). We report one gene-based significant association with the HMGN1 gene located on chromosome 21 (P=1 × 10(-6)). These genes have previously been associated with neuropsychiatric phenotypes. Meta-analysis results are consistent with a polygenic model of inheritance. To estimate SNP-based heritability, the genome-wide complex trait analysis procedure was applied to two large cohorts, the Atherosclerosis Risk in Communities Study (N=6617) and the Health and Retirement Study (N=5976). The proportion of phenotypic variation accounted for by all genotyped common SNPs was 29% (s.e.=5%) and 28% (s.e.=7%), respectively. Using polygenic prediction analysis, ~1.2% of the variance in general cognitive function was predicted in the Generation Scotland cohort (N=5487; P=1.5 × 10(-17)). In hypothesis-driven tests, there was significant association between general cognitive function and four genes previously associated with Alzheimer's disease: TOMM40, APOE, ABCG1 and MEF2C. PMID:25644384

  6. Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53 949)

    PubMed Central

    Davies, G; Armstrong, N; Bis, J C; Bressler, J; Chouraki, V; Giddaluru, S; Hofer, E; Ibrahim-Verbaas, C A; Kirin, M; Lahti, J; van der Lee, S J; Le Hellard, S; Liu, T; Marioni, R E; Oldmeadow, C; Postmus, I; Smith, A V; Smith, J A; Thalamuthu, A; Thomson, R; Vitart, V; Wang, J; Yu, L; Zgaga, L; Zhao, W; Boxall, R; Harris, S E; Hill, W D; Liewald, D C; Luciano, M; Adams, H; Ames, D; Amin, N; Amouyel, P; Assareh, A A; Au, R; Becker, J T; Beiser, A; Berr, C; Bertram, L; Boerwinkle, E; Buckley, B M; Campbell, H; Corley, J; De Jager, P L; Dufouil, C; Eriksson, J G; Espeseth, T; Faul, J D; Ford, I; Generation Scotland; Gottesman, R F; Griswold, M E; Gudnason, V; Harris, T B; Heiss, G; Hofman, A; Holliday, E G; Huffman, J; Kardia, S L R; Kochan, N; Knopman, D S; Kwok, J B; Lambert, J-C; Lee, T; Li, G; Li, S-C; Loitfelder, M; Lopez, O L; Lundervold, A J; Lundqvist, A; Mather, K A; Mirza, S S; Nyberg, L; Oostra, B A; Palotie, A; Papenberg, G; Pattie, A; Petrovic, K; Polasek, O; Psaty, B M; Redmond, P; Reppermund, S; Rotter, J I; Schmidt, H; Schuur, M; Schofield, P W; Scott, R J; Steen, V M; Stott, D J; van Swieten, J C; Taylor, K D; Trollor, J; Trompet, S; Uitterlinden, A G; Weinstein, G; Widen, E; Windham, B G; Jukema, J W; Wright, A F; Wright, M J; Yang, Q; Amieva, H; Attia, J R; Bennett, D A; Brodaty, H; de Craen, A J M; Hayward, C; Ikram, M A; Lindenberger, U; Nilsson, L-G; Porteous, D J; Räikkönen, K; Reinvang, I; Rudan, I; Sachdev, P S; Schmidt, R; Schofield, P R; Srikanth, V; Starr, J M; Turner, S T; Weir, D R; Wilson, J F; van Duijn, C; Launer, L; Fitzpatrick, A L; Seshadri, S; Mosley, T H; Deary, I J

    2015-01-01

    General cognitive function is substantially heritable across the human life course from adolescence to old age. We investigated the genetic contribution to variation in this important, health- and well-being-related trait in middle-aged and older adults. We conducted a meta-analysis of genome-wide association studies of 31 cohorts (N=53 949) in which the participants had undertaken multiple, diverse cognitive tests. A general cognitive function phenotype was tested for, and created in each cohort by principal component analysis. We report 13 genome-wide significant single-nucleotide polymorphism (SNP) associations in three genomic regions, 6q16.1, 14q12 and 19q13.32 (best SNP and closest gene, respectively: rs10457441, P=3.93 × 10(-9), MIR2113; rs17522122, P=2.55 × 10(-8), AKAP6; rs10119, P=5.67 × 10(-9), APOE/TOMM40). We report one gene-based significant association with the HMGN1 gene located on chromosome 21 (P=1 × 10(-6)). These genes have previously been associated with neuropsychiatric phenotypes. Meta-analysis results are consistent with a polygenic model of inheritance. To estimate SNP-based heritability, the genome-wide complex trait analysis procedure was applied to two large cohorts, the Atherosclerosis Risk in Communities Study (N=6617) and the Health and Retirement Study (N=5976). The proportion of phenotypic variation accounted for by all genotyped common SNPs was 29% (s.e.=5%) and 28% (s.e.=7%), respectively. Using polygenic prediction analysis, ~1.2% of the variance in general cognitive function was predicted in the Generation Scotland cohort (N=5487; P=1.5 × 10(-17)). In hypothesis-driven tests, there was significant association between general cognitive function and four genes previously associated with Alzheimer's disease: TOMM40, APOE, ABCG1 and MEF2C. PMID:25644384

  7. Unlocking the bovine genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The draft genome sequence of cattle (Bos taurus) has now been analyzed by the Bovine Genome Sequencing and Analysis Consortium and the Bovine HapMap Consortium, which together represent an extensive collaboration involving more than 300 scientists from 25 different countries. ...

  8. A Genome Wide Survey of SNP Variation Reveals the Genetic Structure of Sheep Breeds

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identi...

  9. Genome databases

    SciTech Connect

    Courteau, J.

    1991-10-11

    Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.

  10. Transcriptional consequences of genomic structural aberrations in breast cancer

    PubMed Central

    Inaki, Koichiro; Hillmer, Axel M.; Ukil, Leena; Yao, Fei; Woo, Xing Yi; Vardy, Leah A.; Zawack, Kelson Folkvard Braaten; Lee, Charlie Wah Heng; Ariyaratne, Pramila Nuwantha; Chan, Yang Sun; Desai, Kartiki Vasant; Bergh, Jonas; Hall, Per; Putti, Thomas Choudary; Ong, Wai Loon; Shahab, Atif; Cacheux-Rataboul, Valere; Karuturi, Radha Krishna Murthy; Sung, Wing-Kin; Ruan, Xiaoan; Bourque, Guillaume; Ruan, Yijun; Liu, Edison T.

    2011-01-01

    Using a long-span, paired-end deep sequencing strategy, we have comprehensively identified cancer genome rearrangements in eight breast cancer genomes. Herein, we show that 40%–54% of these structural genomic rearrangements result in different forms of fusion transcripts and that 44% are potentially translated. We find that single segmental tandem duplication spanning several genes is a major source of the fusion gene transcripts in both cell lines and primary tumors involving adjacent genes placed in the reverse-order position by the duplication event. Certain other structural mutations, however, tend to attenuate gene expression. From these candidate gene fusions, we have found a fusion transcript (RPS6KB1–VMP1) recurrently expressed in ∼30% of breast cancers associated with potential clinical consequences. This gene fusion is caused by tandem duplication on 17q23 and appears to be an indicator of local genomic instability altering the expression of oncogenic components such as MIR21 and RPS6KB1. PMID:21467264

  11. Evolution of genomic structures on Mammalian sex chromosomes.

    PubMed

    Katsura, Yukako; Iwase, Mineyo; Satta, Yoko

    2012-04-01

    Throughout mammalian evolution, recombination between the two sex chromosomes was suppressed in a stepwise manner. It is thought that the suppression of recombination led to an accumulation of deleterious mutations and frequent genomic rearrangements on the Y chromosome. In this article, we review three evolutionary aspects related to genomic rearrangements and structures, such as inverted repeats (IRs) and palindromes (PDs), on the mammalian sex chromosomes. First, we describe the stepwise manner in which recombination between the X and Y chromosomes was suppressed in placental mammals and discuss a genomic rearrangement that might have led to the formation of present pseudoautosomal boundaries (PAB). Second, we describe ectopic gene conversion between the X and Y chromosomes, and propose possible molecular causes. Third, we focus on the evolutionary mode and timing of PD formation on the X and Y chromosomes. The sequence of the chimpanzee Y chromosome was recently published by two groups. Both groups suggest that rapid evolution of genomic structure occurred on the Y chromosome. Our re-analysis of the sequences confirmed the species-specific mode of human and chimpanzee Y chromosomal evolution. Finally, we present a general outlook regarding the rapid evolution of mammalian sex chromosomes. PMID:23024603

  12. Six-layer structure for genomics and its applications.

    PubMed

    Kamatani, Naoyuki

    2016-03-01

    The term 'genetics' was coined before an understanding of DNA sequence data was achieved, and it is now insufficient to describe the broad areas in which DNA data have important roles. The term genomics is more broadly descriptive, but it does not provide a satisfactory conceptual framework that scientists can share. Here I propose a six-layer structure that describes the entire scientific field for 'genomics'. The proposed layers are 'life' as the uppermost layer, followed by 'species', 'population', 'family', 'individual' and finally 'cell' as the bottommost layer. In each pair of adjacent layers, each member of the upper layer comprises a set of members of the lower layer. In each layer, we can define consistent partial orders of members based on genomic data in the forms of phylogenic and pedigree trees. Although total orders such as those defined for time and space in physics cannot be defined in biology, defining consistent partial orders allows mathematical analysis to be performed. I will show that mathematical genetics studies can be understood as attempts to bridge gaps between layers of the proposed six-layer structure, while genetic tests can be understood as procedures to differentiate among members of each layer by using genomic data. PMID:26559752
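
    As a rough illustration of a partial order defined on a pedigree tree, the sketch below treats "is an ancestor of" as the ordering relation. This reading of the abstract is an assumption, and the pedigree and function names are hypothetical. The point is that some pairs, such as two siblings, cannot be ordered, which is what distinguishes a partial order from a total order.

```python
def ancestors(parents, member):
    """Return every ancestor of `member`, following parent links transitively."""
    seen = set()
    stack = list(parents.get(member, ()))
    while stack:
        person = stack.pop()
        if person not in seen:
            seen.add(person)
            stack.extend(parents.get(person, ()))
    return seen


def precedes(parents, a, b):
    """True if `a` is a strict ancestor of `b`."""
    return a in ancestors(parents, b)


def comparable(parents, a, b):
    """True if the two members are ordered relative to each other."""
    return a == b or precedes(parents, a, b) or precedes(parents, b, a)


# Hypothetical pedigree: each key maps to a tuple of known parents.
pedigree = {
    "child1": ("mother", "father"),
    "child2": ("mother", "father"),
    "mother": ("grandmother",),
}
```

    Here "grandmother" precedes "child1", while "child1" and "child2" are incomparable.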

  13. Genome-Wide Approaches for RNA Structure Probing.

    PubMed

    Silverman, Ian M; Berkowitz, Nathan D; Gosai, Sager J; Gregory, Brian D

    2016-01-01

    RNA molecules of all types fold into complex secondary and tertiary structures that are important for their function and regulation. Structural and catalytic RNAs such as ribosomal RNA (rRNA) and transfer RNA (tRNA) are central players in protein synthesis, and only function through their proper folding into intricate three-dimensional structures. Studies of messenger RNA (mRNA) regulation have also revealed that structural elements embedded within these RNA species are important for the proper regulation of their total level in the transcriptome. More recently, the discovery of microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) has shed light on the importance of RNA structure to genome, transcriptome, and proteome regulation. Due to the relatively small number, high conservation, and importance of structural and catalytic RNAs to all life, much early work in RNA structure analysis mapped out a detailed view of these molecules. Computational and physical methods were used in concert with enzymatic and chemical structure probing to create high-resolution models of these fundamental biological molecules. However, the recent expansion in our knowledge of the importance of RNA structure to coding and regulatory RNAs has left the field in need of faster and scalable methods for high-throughput structural analysis. To address this, nuclease and chemical RNA structure probing methodologies have been adapted for genome-wide analysis. These methods have been deployed to globally characterize thousands of RNA structures in a single experiment. Here, we review these experimental methodologies for high-throughput RNA structure determination and discuss the insights gained from each approach. PMID:27256381

  14. Genome-wide association study for refractive astigmatism reveals genetic co-determination with spherical equivalent refractive error: the CREAM consortium.

    PubMed

    Li, Qing; Wojciechowski, Robert; Simpson, Claire L; Hysi, Pirro G; Verhoeven, Virginie J M; Ikram, Mohammad Kamran; Höhn, René; Vitart, Veronique; Hewitt, Alex W; Oexle, Konrad; Mäkelä, Kari-Matti; MacGregor, Stuart; Pirastu, Mario; Fan, Qiao; Cheng, Ching-Yu; St Pourcain, Beaté; McMahon, George; Kemp, John P; Northstone, Kate; Rahi, Jugnoo S; Cumberland, Phillippa M; Martin, Nicholas G; Sanfilippo, Paul G; Lu, Yi; Wang, Ya Xing; Hayward, Caroline; Polašek, Ozren; Campbell, Harry; Bencic, Goran; Wright, Alan F; Wedenoja, Juho; Zeller, Tanja; Schillert, Arne; Mirshahi, Alireza; Lackner, Karl; Yip, Shea Ping; Yap, Maurice K H; Ried, Janina S; Gieger, Christian; Murgia, Federico; Wilson, James F; Fleck, Brian; Yazar, Seyhan; Vingerling, Johannes R; Hofman, Albert; Uitterlinden, André; Rivadeneira, Fernando; Amin, Najaf; Karssen, Lennart; Oostra, Ben A; Zhou, Xin; Teo, Yik-Ying; Tai, E Shyong; Vithana, Eranga; Barathi, Veluchamy; Zheng, Yingfeng; Siantar, Rosalynn Grace; Neelam, Kumari; Shin, Youchan; Lam, Janice; Yonova-Doing, Ekaterina; Venturini, Cristina; Hosseini, S Mohsen; Wong, Hoi-Suen; Lehtimäki, Terho; Kähönen, Mika; Raitakari, Olli; Timpson, Nicholas J; Evans, David M; Khor, Chiea-Chuen; Aung, Tin; Young, Terri L; Mitchell, Paul; Klein, Barbara; van Duijn, Cornelia M; Meitinger, Thomas; Jonas, Jost B; Baird, Paul N; Mackey, David A; Wong, Tien Yin; Saw, Seang-Mei; Pärssinen, Olavi; Stambolian, Dwight; Hammond, Christopher J; Klaver, Caroline C W; Williams, Cathy; Paterson, Andrew D; Bailey-Wilson, Joan E; Guggenheim, Jeremy A

    2015-02-01

    To identify genetic variants associated with refractive astigmatism in the general population, meta-analyses of genome-wide association studies were performed for: White Europeans aged at least 25 years (20 cohorts, N = 31,968); Asian subjects aged at least 25 years (7 cohorts, N = 9,295); White Europeans aged <25 years (4 cohorts, N = 5,640); and all independent individuals from the above three samples combined with a sample of Chinese subjects aged <25 years (N = 45,931). Participants were classified as cases with refractive astigmatism if the average cylinder power in their two eyes was at least 1.00 diopter and as controls otherwise. Genome-wide association analysis was carried out for each cohort separately using logistic regression. Meta-analysis was conducted using a fixed effects model. In the older European group the most strongly associated marker was downstream of the neurexin-1 (NRXN1) gene (rs1401327, P = 3.92E-8). No other region reached genome-wide significance, and association signals were lower for the younger European group and Asian group. In the meta-analysis of all cohorts, no marker reached genome-wide significance; the most strongly associated regions were NRXN1 (rs1401327, P = 2.93E-07), TOX (rs7823467, P = 3.47E-07) and LINC00340 (rs12212674, P = 1.49E-06). For 34 markers identified in prior GWAS for spherical equivalent refractive error, the beta coefficients for genotype versus spherical equivalent, and genotype versus refractive astigmatism, were highly correlated (r = -0.59, P = 2.10E-04). This work revealed no consistent or strong genetic signals for refractive astigmatism; however, the TOX gene region previously identified in GWAS for spherical equivalent refractive error was the second most strongly associated region. Analysis of additional markers provided evidence supporting widespread genetic co-susceptibility for spherical and astigmatic refractive errors. PMID:25367360
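
    The fixed effects meta-analysis named in this abstract is commonly implemented as inverse-variance weighting of per-cohort estimates. The sketch below assumes that approach and uses made-up per-cohort log odds ratios and standard errors. The abstract does not say which software or exact formula was used, so the function name and inputs are hypothetical.

```python
from math import erfc, sqrt


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted pooling of per-cohort effect estimates.

    betas: per-cohort effect sizes (e.g. log odds ratios for one SNP)
    ses:   matching standard errors
    Returns (pooled beta, pooled standard error, z score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = erfc(abs(z) / sqrt(2.0))
    return beta, se, z, p


# Made-up example: two cohorts with equal precision.
pooled_beta, pooled_se, z_score, p_value = fixed_effects_meta(
    betas=[0.10, 0.20], ses=[0.10, 0.10]
)
```

    Because both hypothetical cohorts have the same standard error, the pooled estimate is simply their average, and the pooled standard error is smaller than either input.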

  15. Target Selection and Determination of Function in Structural Genomics

    PubMed Central

    Watson, James D.; Todd, Annabel E.; Bray, James; Laskowski, Roman A.; Edwards, Aled; Joachimiak, Andrzej; Orengo, Christine A.; Thornton, Janet M.

    2011-01-01

    Summary: The first crucial step in any structural genomics project is the selection and prioritization of target proteins for structure determination. There may be a number of selection criteria to be satisfied, including that the proteins have novel folds, that they be representatives of large families for which no structure is known, and so on. The better the selection at this stage, the greater is the value of the structures obtained at the end of the experimental process. This value can be further enhanced once the protein structures have been solved if the functions of the given proteins can also be determined. Here we describe the methods used at either end of the experimental process: firstly, sensitive sequence comparison techniques for selecting a high-quality list of target proteins, and secondly the various computational methods that can be applied to the eventual 3D structures to determine the most likely biochemical function of the proteins in question. PMID:12880206

  16. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    SciTech Connect

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact in structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  17. Interactions Between Genome-wide Significant Genetic Variants and Circulating Concentrations of Insulin-like Growth Factor 1, Sex Hormones, and Binding Proteins in Relation to Prostate Cancer Risk in the National Cancer Institute Breast and Prostate Cancer Cohort Consortium

    PubMed Central

    Tsilidis, Konstantinos K.; Travis, Ruth C.; Appleby, Paul N.; Allen, Naomi E.; Lindstrom, Sara; Schumacher, Fredrick R.; Cox, David; Hsing, Ann W.; Ma, Jing; Severi, Gianluca; Albanes, Demetrius; Virtamo, Jarmo; Boeing, Heiner; Bueno-de-Mesquita, H. Bas; Johansson, Mattias; Quirós, J. Ramón; Riboli, Elio; Siddiq, Afshan; Tjønneland, Anne; Trichopoulos, Dimitrios; Tumino, Rosario; Gaziano, J. Michael; Giovannucci, Edward; Hunter, David J.; Kraft, Peter; Stampfer, Meir J.; Giles, Graham G.; Andriole, Gerald L.; Berndt, Sonja I.; Chanock, Stephen J.; Hayes, Richard B.; Key, Timothy J.

    2012-01-01

    Genome-wide association studies (GWAS) have identified many single nucleotide polymorphisms (SNPs) associated with prostate cancer risk. There is limited information on the mechanistic basis of these associations, particularly about whether they interact with circulating concentrations of growth factors and sex hormones, which may be important in prostate cancer etiology. Using conditional logistic regression, the authors compared per-allele odds ratios for prostate cancer for 39 GWAS-identified SNPs across thirds (tertile groups) of circulating concentrations of insulin-like growth factor 1 (IGF-1), insulin-like growth factor binding protein 3 (IGFBP-3), testosterone, androstenedione, androstanediol glucuronide, estradiol, and sex hormone-binding globulin (SHBG) for 3,043 cases and 3,478 controls in the Breast and Prostate Cancer Cohort Consortium. After allowing for multiple testing, none of the SNPs examined were significantly associated with growth factor or hormone concentrations, and the SNP-prostate cancer associations did not differ by these concentrations, although 4 interactions were marginally significant (MSMB-rs10993994 with androstenedione (uncorrected P = 0.008); CTBP2-rs4962416 with IGFBP-3 (uncorrected P = 0.003); 11q13.2-rs12418451 with IGF-1 (uncorrected P = 0.006); and 11q13.2-rs10896449 with SHBG (uncorrected P = 0.005)). The authors found no strong evidence that associations between GWAS-identified SNPs and prostate cancer are modified by circulating concentrations of IGF-1, sex hormones, or their major binding proteins. PMID:22459122
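
    The phrase "after allowing for multiple testing" can be made concrete with a small worked example. The abstract does not state which correction was applied, so the Bonferroni adjustment below is an assumption, as is counting one test per SNP-analyte pair (39 SNPs times 7 analytes). The four uncorrected P values are the ones reported in the abstract.

```python
# Assumption: one interaction test per SNP-analyte pair, Bonferroni-corrected.
n_snps = 39
# IGF-1, IGFBP-3, testosterone, androstenedione,
# androstanediol glucuronide, estradiol, SHBG
n_analytes = 7
n_tests = n_snps * n_analytes

alpha = 0.05
threshold = alpha / n_tests  # per-test significance level under Bonferroni

# Uncorrected P values for the four interactions listed in the abstract.
reported = {
    "MSMB-rs10993994 x androstenedione": 0.008,
    "CTBP2-rs4962416 x IGFBP-3": 0.003,
    "11q13.2-rs12418451 x IGF-1": 0.006,
    "11q13.2-rs10896449 x SHBG": 0.005,
}

# Interactions that would remain significant after the assumed correction.
passing = [name for name, p in reported.items() if p < threshold]
```

    Under these assumptions the per-test threshold is roughly 0.00018, so none of the four reported interactions would survive correction, which is consistent with the abstract describing them only as marginally significant.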

  18. The Mitochondrial Genome of Soybean Reveals Complex Genome Structures and Gene Evolution at Intercellular and Phylogenetic Levels

    PubMed Central

    Chang, Shengxin; Wang, Yankun; Lu, Jiangjie; Gai, Junyi; Li, Jijie; Chu, Pu; Guan, Rongzhan; Zhao, Tuanjie

    2013-01-01

    Determining mitochondrial genomes is important for elucidating vital activities of seed plants. Mitochondrial genomes are specific to each plant species because of their variable size, complex structures and patterns of gene losses and gains during evolution. This complexity has made research on the soybean mitochondrial genome difficult compared with its nuclear and chloroplast genomes. The present study helps to solve a 30-year mystery regarding the most complex mitochondrial genome structure, showing that pairwise rearrangements among the many large repeats may produce an enriched molecular pool of 760 circles in seed plants. The soybean mitochondrial genome harbors 58 genes of known function in addition to 52 predicted open reading frames of unknown function. The genome contains sequences of multiple identifiable origins, including 6.8 kb and 7.1 kb DNA fragments that have been transferred from the nuclear and chloroplast genomes, respectively, and some horizontal DNA transfers. The soybean mitochondrial genome has lost 16 genes, including nine protein-coding genes and seven tRNA genes; however, it has acquired five chloroplast-derived genes during evolution. Four tRNA genes, common among the three genomes, are derived from the chloroplast. Sizeable DNA transfers to the nucleus, with pericentromeric regions as hotspots, are observed, including DNA transfers of 125.0 kb and 151.6 kb identified unambiguously from the soybean mitochondrial and chloroplast genomes, respectively. The soybean nuclear genome has acquired five genes from its mitochondrial genome. These results provide biological insights into the mitochondrial genome of seed plants, and are especially helpful for deciphering vital activities in soybean. PMID:23431381

  19. California Space Grant Consortium

    NASA Technical Reports Server (NTRS)

    Kosmatka, John; Berger, Wolfgang; Wiskerchen, Michael J.

    2005-01-01

    The organizational and administrative structure of the CaSGC has the Consortium Headquarters Office (Principal Investigator - Dr. John Kosmatka, California Statewide Director - Dr. Michael Wiskerchen) at UC San Diego. Each affiliate member institution has a campus director and a scholarship/fellowship selection committee. Each affiliate campus director also serves on the CaSGC Advisory Council and coordinates CMIS data collection and submission. The CaSGC strives to maintain a balance between expanded affiliate membership and continued high quality in targeted program areas of aerospace research, education, workforce development, and public outreach. Associate members are encouraged to participate on a project-by-project basis that meets the needs of California and the goals and objectives of the CaSGC. Associate members have responsibilities relating only to the CaSGC projects they are directly engaged in. Each year, as part of the CaSGC Improvement Plan, the CaSGC Advisory Council evaluates the performance of the affiliate and associate membership in terms of contributions to the CaSGC Strategic Plan. These CaSGC membership evaluations provide a constructive means for elevating productive members and removing non-performing members. This Program Improvement and Results (PIR) report will document CaSGC program improvement results and impacts that directly respond to the specific needs of California in the area of aerospace-related education and human capital development and the Congressional mandate to "increase the understanding, assessment, development and utilization of space resources by promoting a strong education base, responsive research and training activities, and broad and prompt dissemination of knowledge and technology".

  20. Viral genome structures are optimal for capsid assembly

    PubMed Central

    Perlmutter, Jason D; Qiao, Cong; Hagan, Michael F

    2013-01-01

    Understanding how virus capsids assemble around their nucleic acid (NA) genomes could promote efforts to block viral propagation or to reengineer capsids for gene therapy applications. We develop a coarse-grained model of capsid proteins and NAs with which we investigate assembly dynamics and thermodynamics. In contrast to recent theoretical models, we find that capsids spontaneously ‘overcharge’; that is, the negative charge of the NA exceeds the positive charge on the capsid. When applied to specific viruses, the optimal NA lengths closely correspond to the natural genome lengths. Calculations based on linear polyelectrolytes rather than base-paired NAs underpredict the optimal length, demonstrating the importance of NA structure to capsid assembly. These results suggest that electrostatics, excluded volume, and NA tertiary structure are sufficient to predict assembly thermodynamics and that the ability of viruses to selectively encapsidate their genomic NAs can be explained, at least in part, on a thermodynamic basis. DOI: http://dx.doi.org/10.7554/eLife.00632.001 PMID:23795290

  1. Meet me halfway: when genomics meets structural bioinformatics.

    PubMed

    Gong, Sungsam; Worth, Catherine L; Cheng, Tammy M K; Blundell, Tom L

    2011-06-01

    The DNA sequencing technology developed by Frederick Sanger in the 1970s established genomics as the basis of comparative genetics. The recent invention of next-generation sequencing (NGS) platform has added a new dimension to genome research by generating ultra-fast and high-throughput sequencing data in an unprecedented manner. The advent of NGS technology also provides the opportunity to study genetic diseases where sequence variants or mutations are sought to establish a causal relationship with disease phenotypes. However, it is not a trivial task to seek genetic variants responsible for genetic diseases and even harder for complex diseases such as diabetes and cancers. In such polygenic diseases, multiple genes and alleles, which can exist in healthy individuals, come together to contribute to common disease phenotypes in a complex manner. Hence, it is desirable to have an approach that integrates omics data with both knowledge of protein structure and function and an understanding of networks/pathways, i.e. functional genomics and systems biology; in this way, genotype-phenotype relationships can be better understood. In this review, we bring this 'bottom-up' approach alongside the current NGS-driven genetic study of genetic variations and disease aetiology. We describe experimental and computational techniques for assessing genetic variants and their deleterious effects on protein structure and function. PMID:21350909

  2. QUANTITATIVE CHARACTERIZATION OF MICROBIAL BIOMASS AND COMMUNITY STRUCTURE IN SUBSURFACE MATERIAL: A PROKARYOTIC CONSORTIUM RESPONSIVE TO ORGANIC CONTAMINATION

    EPA Science Inventory

    Application of quantitative methods for microbial biomass, community structure, and nutritional status to the subsurface samples collected with careful attention to contamination reveals a group of microbes. The microbiota are sparse by several measures of biomass compared to sur...

  3. Rapid evolution and complex structural organization in genomic regions harboring multiple prolamin genes in the polyploid wheat genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genes encoding wheat prolamins belong to complicated multi-gene families in the wheat genome. To understand the structural complexity of storage protein loci, we sequenced and analyzed orthologous regions containing both gliadin and LMW-glutenin genes from the A and B genomes of a tetraploid wheat ...

  4. Genetic determinants of heel bone properties: genome-wide association meta-analysis and replication in the GEFOS/GENOMOS consortium.

    PubMed

    Moayyeri, Alireza; Hsu, Yi-Hsiang; Karasik, David; Estrada, Karol; Xiao, Su-Mei; Nielson, Carrie; Srikanth, Priya; Giroux, Sylvie; Wilson, Scott G; Zheng, Hou-Feng; Smith, Albert V; Pye, Stephen R; Leo, Paul J; Teumer, Alexander; Hwang, Joo-Yeon; Ohlsson, Claes; McGuigan, Fiona; Minster, Ryan L; Hayward, Caroline; Olmos, José M; Lyytikäinen, Leo-Pekka; Lewis, Joshua R; Swart, Karin M A; Masi, Laura; Oldmeadow, Chris; Holliday, Elizabeth G; Cheng, Sulin; van Schoor, Natasja M; Harvey, Nicholas C; Kruk, Marcin; del Greco M, Fabiola; Igl, Wilmar; Trummer, Olivia; Grigoriou, Efi; Luben, Robert; Liu, Ching-Ti; Zhou, Yanhua; Oei, Ling; Medina-Gomez, Carolina; Zmuda, Joseph; Tranah, Greg; Brown, Suzanne J; Williams, Frances M; Soranzo, Nicole; Jakobsdottir, Johanna; Siggeirsdottir, Kristin; Holliday, Kate L; Hannemann, Anke; Go, Min Jin; Garcia, Melissa; Polasek, Ozren; Laaksonen, Marika; Zhu, Kun; Enneman, Anke W; McEvoy, Mark; Peel, Roseanne; Sham, Pak Chung; Jaworski, Maciej; Johansson, Åsa; Hicks, Andrew A; Pludowski, Pawel; Scott, Rodney; Dhonukshe-Rutten, Rosalie A M; van der Velde, Nathalie; Kähönen, Mika; Viikari, Jorma S; Sievänen, Harri; Raitakari, Olli T; González-Macías, Jesús; Hernández, Jose L; Mellström, Dan; Ljunggren, Osten; Cho, Yoon Shin; Völker, Uwe; Nauck, Matthias; Homuth, Georg; Völzke, Henry; Haring, Robin; Brown, Matthew A; McCloskey, Eugene; Nicholson, Geoffrey C; Eastell, Richard; Eisman, John A; Jones, Graeme; Reid, Ian R; Dennison, Elaine M; Wark, John; Boonen, Steven; Vanderschueren, Dirk; Wu, Frederick C W; Aspelund, Thor; Richards, J Brent; Bauer, Doug; Hofman, Albert; Khaw, Kay-Tee; Dedoussis, George; Obermayer-Pietsch, Barbara; Gyllensten, Ulf; Pramstaller, Peter P; Lorenc, Roman S; Cooper, Cyrus; Kung, Annie Wai Chee; Lips, Paul; Alen, Markku; Attia, John; Brandi, Maria Luisa; de Groot, Lisette C P G M; Lehtimäki, Terho; Riancho, José A; Campbell, Harry; Liu, Yongmei; Harris, Tamara B; Akesson, Kristina; Karlsson, Magnus; Lee, Jong-Young; Wallaschofski, Henri; Duncan, Emma L; O'Neill, Terence W; Gudnason, Vilmundur; Spector, Timothy D; Rousseau, François; Orwoll, Eric; Cummings, Steven R; Wareham, Nick J; Rivadeneira, Fernando; Uitterlinden, Andre G; Prince, Richard L; Kiel, Douglas P; Reeve, Jonathan; Kaptoge, Stephen K

    2014-06-01

    Quantitative ultrasound of the heel captures heel bone properties that independently predict fracture risk and, with bone mineral density (BMD) assessed by X-ray (DXA), may be convenient alternatives for evaluating osteoporosis and fracture risk. We performed a meta-analysis of genome-wide association (GWA) studies to assess the genetic determinants of heel broadband ultrasound attenuation (BUA; n = 14 260), velocity of sound (VOS; n = 15 514) and BMD (n = 4566) in 13 discovery cohorts. Independent replication involved seven cohorts with GWA data (in silico n = 11 452) and new genotyping in 15 cohorts (de novo n = 24 902). In a combined random-effects meta-analysis of the discovery and replication cohorts, nine single nucleotide polymorphisms (SNPs) had genome-wide significant (P < 5 × 10^-8) associations with heel bone properties. Alongside SNPs within or near previously identified osteoporosis susceptibility genes including ESR1 (6q25.1: rs4869739, rs3020331, rs2982552), SPTBN1 (2p16.2: rs11898505), RSPO3 (6q22.33: rs7741021), WNT16 (7q31.31: rs2908007), DKK1 (10q21.1: rs7902708) and GPATCH1 (19q13.11: rs10416265), we identified a new locus on chromosome 11q14.2 (rs597319 close to TMEM135, a gene recently linked to osteoblastogenesis and longevity) significantly associated with both BUA and VOS (P < 8.23 × 10^-14). In meta-analyses involving 25 cohorts with up to 14 985 fracture cases, six of 10 SNPs associated with heel bone properties at P < 5 × 10^-6 also had the expected direction of association with any fracture (P < 0.05), including three SNPs with P < 0.005: 6q22.33 (rs7741021), 7q31.31 (rs2908007) and 10q21.1 (rs7902708). In conclusion, this GWA study reveals the effect of several genes common to central DXA-derived BMD and heel ultrasound/DXA measures and points to a new genetic locus with potential implications for better understanding of osteoporosis pathophysiology. PMID:24430505
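
    A random effects meta-analysis of the kind mentioned in this abstract allows the true effect to differ between cohorts. The sketch below assumes the widely used DerSimonian-Laird estimator of between-cohort variance. The abstract does not name the estimator, so this choice, the function name and the example inputs are all hypothetical.

```python
from math import sqrt


def random_effects_meta(betas, ses):
    """DerSimonian-Laird random effects pooling (assumed estimator).

    Returns (pooled beta, pooled standard error, tau squared), where
    tau squared is the estimated between-cohort variance.
    """
    k = len(betas)
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    # Fixed effects estimate, needed for the heterogeneity statistic Q.
    beta_fixed = sum(wi * b for wi, b in zip(w, betas)) / sum_w
    q = sum(wi * (b - beta_fixed) ** 2 for wi, b in zip(w, betas))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Re-weight each cohort by its total (within plus between) variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_w_star = sum(w_star)
    beta = sum(wi * b for wi, b in zip(w_star, betas)) / sum_w_star
    return beta, sqrt(1.0 / sum_w_star), tau2


# Made-up example: two cohorts whose estimates disagree.
re_beta, re_se, re_tau2 = random_effects_meta(betas=[0.0, 0.4], ses=[0.1, 0.1])
```

    When the cohorts agree, the estimated between-cohort variance is zero and the result matches a fixed effects analysis. When they disagree, as in the made-up example, the pooled standard error widens.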

  5. NCI Cohort Consortium

    Cancer.gov

    The NCI Cohort Consortium is an extramural-intramural partnership formed by the National Cancer Institute to address the need for large-scale collaborations to pool the large quantity of data and biospecimens necessary to conduct a wide range of cancer studies.

  6. The Idaho Consortium.

    ERIC Educational Resources Information Center

    Beaird, James H.

    The Idaho Consortium was established by the state board of education to remedy perceived needs involving insufficient certificated teachers, excessive teacher mobility, shortage of teacher candidates, inadequate inservice training, a low level of administrative leadership, and a lack of programs in special education, early childhood education,…

  7. Mitochondrial Disease Sequence Data Resource (MSeqDR): A global grass-roots consortium to facilitate deposition, curation, annotation, and integrated analysis of genomic data for the mitochondrial disease clinical and research communities

    PubMed Central

    Falk, Marni J.; Shen, Lishuang; Gonzalez, Michael; Leipzig, Jeremy; Lott, Marie T.; Stassen, Alphons P.M.; Diroma, Maria Angela; Navarro-Gomez, Daniel; Yeske, Philip; Bai, Renkui; Boles, Richard G.; Brilhante, Virginia; Ralph, David; DaRe, Jeana T.; Shelton, Robert; Terry, Sharon; Zhang, Zhe; Copeland, William C.; van Oven, Mannis; Prokisch, Holger; Wallace, Douglas C.; Attimonelli, Marcella; Krotoski, Danuta; Zuchner, Stephan; Gai, Xiaowu

    2014-01-01

    Success rates for genomic analyses of highly heterogeneous disorders can be greatly improved if a large cohort of patient data is assembled to enhance collective capabilities for accurate sequence variant annotation, analysis, and interpretation. Indeed, molecular diagnostics requires the establishment of robust data resources to enable data sharing that informs accurate understanding of genes, variants, and phenotypes. The “Mitochondrial Disease Sequence Data Resource (MSeqDR) Consortium” is a grass-roots effort facilitated by the United Mitochondrial Disease Foundation to identify and prioritize specific genomic data analysis needs of the global mitochondrial disease clinical and research community. A central Web portal (https://mseqdr.org) facilitates the coherent compilation, organization, annotation, and analysis of sequence data from both nuclear and mitochondrial genomes of individuals and families with suspected mitochondrial disease. This Web portal provides users with a flexible and expandable suite of resources to enable variant-, gene-, and exome-level sequence analysis in a secure, Web-based, and user-friendly fashion. Users can also elect to share data with other MSeqDR Consortium members, or even the general public, either by custom annotation tracks or through use of a convenient distributed annotation system (DAS) mechanism. A range of data visualization and analysis tools are provided to facilitate user interrogation and understanding of genomic, and ultimately phenotypic, data of relevance to mitochondrial biology and disease. Currently available tools for nuclear and mitochondrial gene analyses include an MSeqDR GBrowse instance that hosts optimized mitochondrial disease and mitochondrial DNA (mtDNA) specific annotation tracks, as well as an MSeqDR locus-specific database (LSDB) that curates variant data on more than 1,300 genes that have been implicated in mitochondrial disease and/or encode mitochondria-localized proteins. MSeqDR is

  8. Genomic Characterization of Prenatally Detected Chromosomal Structural Abnormalities Using Oligonucleotide Array Comparative Genomic Hybridization

    PubMed Central

    Li, Peining; Pomianowski, Pawel; DiMaio, Miriam S.; Florio, Joanne R.; Rossi, Michael R.; Xiang, Bixia; Xu, Fang; Yang, Hui; Geng, Qian; Xie, Jiansheng; Mahoney, Maurice J.

    2013-01-01

    Detection of chromosomal structural abnormalities using conventional cytogenetic methods poses a challenge for prenatal genetic counseling due to unpredictable clinical outcomes and risk of recurrence. Of the 1,726 prenatal cases in a 3-year period, we performed oligonucleotide array comparative genomic hybridization (aCGH) analysis on 11 cases detected with various structural chromosomal abnormalities. In nine cases, genomic aberrations and gene contents involving a 3p distal deletion, a marker chromosome from chromosome 4, a derivative chromosome 5 from a 5p/7q translocation, a de novo distal 6q deletion, a recombinant chromosome 8 comprised of an 8p duplication and an 8q deletion, an extra derivative chromosome 9 from an 8p/9q translocation, mosaicism for chromosome 12q with added material of initially unknown origin, an unbalanced 13q/15q rearrangement, and a distal 18q duplication and deletion were delineated. An absence of pathogenic copy number changes was noted in one case with a de novo 11q/14q translocation and in another with a familial insertion of 21q into a 19q. Genomic characterization of the structural abnormalities aided in the prediction of clinical outcomes. These results demonstrated the value of aCGH analysis in prenatal cases with subtle or complex chromosomal rearrangements. Furthermore, a retrospective analysis of clinical indications of our prenatal cases showed that approximately 20% of them had abnormal ultrasound findings and should be considered as high risk pregnancies for a combined chromosome and aCGH analysis. PMID:21671377

  9. X-ray scattering data and structural genomics

    NASA Astrophysics Data System (ADS)

    Doniach, Sebastian

    2003-03-01

    High throughput structural genomics has the ambitious goal of determining the structure of all, or a very large number, of protein folds using the high-resolution techniques of protein crystallography and NMR. However, the program is facing significant bottlenecks in reaching this goal, which include problems of protein expression and crystallization. In this talk, some preliminary results on how the low-resolution technique of small-angle X-ray solution scattering (SAXS) can help ameliorate some of these bottlenecks will be presented. One of the most significant bottlenecks arises from the difficulty of crystallizing integral membrane proteins, where only a handful of structures are available compared to thousands of structures for soluble proteins. By 3-dimensional reconstruction from SAXS data, the size and shape of detergent-solubilized integral membrane proteins can be characterized. This information can then be used to classify membrane proteins, which constitute some 25% of all genomes. SAXS may also be used to study the dependence of interparticle interference scattering on solvent conditions so that regions of the protein solution phase diagram which favor crystallization can be elucidated. As a further application, SAXS may be used to provide physical constraints on computational methods for protein structure prediction based on primary sequence information. This in turn can help in identifying structural homologs of a given protein, which can then give clues to its function. References: D. Walther, F. Cohen and S. Doniach, "Reconstruction of low resolution three-dimensional density maps from one-dimensional small angle x-ray scattering data for biomolecules," J. Appl. Cryst. 33(2):350-363 (2000); W.J. Zheng and S. Doniach, "Protein structure prediction constrained by solution X-ray scattering data and structural homology identification," J. Mol. Biol. 316(1):173-187 (2002).

  10. Structural Genomics: From Genes to Structures With Valuable Materials And Many Questions in Between

    SciTech Connect

    Fox, B.G.; Goulding, C.; Malkowski, M.G.; Stewart, L.; Deacon, A.; /SLAC, SSRL

    2009-04-30

    The Protein Structure Initiative (PSI), funded by the US National Institutes of Health (NIH), provides a framework for the development and systematic evaluation of methods to solve protein structures. Although the PSI and other structural genomics efforts around the world have led to the solution of many new protein structures as well as the development of new methods, methodological bottlenecks still exist and are being addressed in this 'production phase' of PSI.

  11. A genome-wide association study suggests that a locus within the ataxin 2 binding protein 1 gene is associated with hand osteoarthritis: the Treat-OA consortium

    PubMed Central

    Zhai, G; van Meurs, J B J; Livshits, G; Meulenbelt, I; Valdes, A M; Soranzo, N; Hart, D; Zhang, F; Kato, B S; Richards, J B; Williams, F M K; Inouye, M; Kloppenburg, M; Deloukas, P; Slagboom, E; Uitterlinden, A; Spector, T D

    2009-01-01

    To identify the susceptibility gene in hand osteoarthritis (OA) the authors used a two-stage genome-wide association study using two discovery samples (the TwinsUK cohort and the Rotterdam discovery subset; a total of 1804 subjects) and four replication samples (the Chingford Study, the Chuvasha Skeletal Aging Study, the Rotterdam replication subset and the Genetics, Arthrosis, and Progression (GARP) Study; a total of 3266 people). Five single-nucleotide polymorphisms (SNPs) had a likelihood of association with hand OA in the discovery stage and one of them (rs716508) was successfully confirmed in the replication stage (meta-analysis p = 1.81×10^−5). The C allele conferred a reduced risk of 33% to 41% using a case–control definition. The SNP is located in intron 1 of the A2BP1 gene. This study also found that the same allele of the SNP significantly reduced bone density at both the hip and spine (p<0.01), suggesting the potential mechanism of the gene in hand OA might be via effects on subchondral bone. The authors' findings provide a potential new insight into genetic mechanisms in the development of hand OA. PMID:19508968

  12. Genomic Heterogeneity and Structural Variation in Soybean Near Isogenic Lines

    PubMed Central

    Stec, Adrian O.; Bhaskar, Pudota B.; Bolon, Yung-Tsi; Nolan, Rebecca; Shoemaker, Randy C.; Vance, Carroll P.; Stupar, Robert M.

    2013-01-01

    Near isogenic lines (NILs) are a critical genetic resource for the soybean research community. The ability to identify and characterize the genes driving the phenotypic differences between NILs is limited by the degree to which differential genetic introgressions can be resolved. Furthermore, the genetic heterogeneity extant among NIL sub-lines is an unaddressed research topic that might have implications for how genomic and phenotypic data from NILs are utilized. In this study, a recently developed high-resolution comparative genomic hybridization (CGH) platform was used to investigate the structure and diversity of genetic introgressions in two classical soybean NIL populations, respectively varying in protein content and iron deficiency chlorosis (IDC) susceptibility. There were three objectives: assess the capacity for CGH to resolve genomic introgressions, identify introgressions that are heterogeneous among NIL sub-lines, and associate heterogeneous introgressions with susceptibility to IDC. Using the CGH approach, introgression boundaries were refined and previously unknown introgressions were revealed. Furthermore, heterogeneous introgressions were identified within seven sub-lines of the IDC NIL “IsoClark.” This included three distinct introgression haplotypes linked to the major iron susceptible locus on chromosome 03. A phenotypic assessment of the seven sub-lines did not reveal any differences in IDC susceptibility, indicating that the genetic heterogeneity among the lines does not have a significant impact on the primary NIL phenotype. PMID:23630538

  13. A Roadmap for Functional Structural Variants in the Soybean Genome

    PubMed Central

    Anderson, Justin E.; Kantar, Michael B.; Kono, Thomas Y.; Fu, Fengli; Stec, Adrian O.; Song, Qijian; Cregan, Perry B.; Specht, James E.; Diers, Brian W.; Cannon, Steven B.; McHale, Leah K.; Stupar, Robert M.

    2014-01-01

    Gene structural variation (SV) has recently emerged as a key genetic mechanism underlying several important phenotypic traits in crop species. We screened a panel of 41 soybean (Glycine max) accessions serving as parents in a soybean nested association mapping population for deletions and duplications in more than 53,000 gene models. Array hybridization and whole genome resequencing methods were used as complementary technologies to identify SV in 1528 genes, or approximately 2.8%, of the soybean gene models. Although SV occurs throughout the genome, SV enrichment was noted in families of biotic defense response genes. Among accessions, SV was nearly eightfold less frequent for gene models that have retained paralogs since the last whole genome duplication event, compared with genes that have not retained paralogs. Increases in gene copy number, similar to that described at the Rhg1 resistance locus, account for approximately one-fourth of the genic SV events. This assessment of soybean SV occurrence presents a target list of genes potentially responsible for rapidly evolving and/or adaptive traits. PMID:24855315

  14. The impact of structural genomics: the first quindecennial.

    PubMed

    Grabowski, Marek; Niedzialkowska, Ewa; Zimmerman, Matthew D; Minor, Wladek

    2016-03-01

    The period 2000-2015 brought the advent of high-throughput approaches to protein structure determination. With the overall funding on the order of $2 billion (in 2010 dollars), the structural genomics (SG) consortia established worldwide have developed pipelines for target selection, protein production, sample preparation, crystallization, and structure determination by X-ray crystallography and NMR. These efforts resulted in the determination of over 13,500 protein structures, mostly from unique protein families, and increased the structural coverage of the expanding protein universe. SG programs contributed over 4400 publications to the scientific literature. The NIH-funded Protein Structure Initiatives alone have produced over 2000 scientific publications, which to date have attracted more than 93,000 citations. Software and database developments that were necessary to handle high-throughput structure determination workflows have led to structures of better quality and improved integrity of the associated data. Organized and accessible data have a positive impact on the reproducibility of scientific experiments. Most of the experimental data generated by the SG centers are freely available to the community and have been utilized by scientists in various fields of research. SG projects have created, improved, streamlined, and validated many protocols for protein production and crystallization, data collection, and functional analysis, significantly benefiting biological and biomedical research. PMID:26935210

  15. Population structure and minimum core genome typing of Legionella pneumophila

    PubMed Central

    Qin, Tian; Zhang, Wen; Liu, Wenbin; Zhou, Haijian; Ren, Hongyu; Shao, Zhujun; Lan, Ruiting; Xu, Jianguo

    2016-01-01

    Legionella pneumophila is an important human pathogen causing Legionnaires’ disease. In this study, whole genome sequencing (WGS) was used to study the characteristics and population structure of L. pneumophila strains. We sequenced and compared 53 isolates of L. pneumophila covering different serogroups and sequence-based typing (SBT) types (STs). We found that 1,896 single-copy orthologous genes were shared by all isolates and were defined as the minimum core genome (MCG) of L. pneumophila. A total of 323,224 single-nucleotide polymorphisms (SNPs) were identified among the 53 strains. After excluding 314,059 SNPs which were likely to be results of recombination, the remaining 9,165 SNPs were referred to as MCG SNPs. Population structure analysis based on MCG divided the 53 L. pneumophila isolates into nine MCG groups. The within-group distances were much smaller than the between-group distances, indicating considerable divergence between MCG groups. MCG groups were also supported by phylogenetic analysis and may be considered as robust taxonomic units within L. pneumophila. Among the nine MCG groups, eight showed high intracellular growth ability while one showed low intracellular growth ability. Furthermore, MCG typing also showed high resolution in subtyping ST1 strains. The results obtained in this study provided significant insights into the evolution, population structure and pathogenicity of L. pneumophila. PMID:26888563
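
    The SNP counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration and is not the authors' pipeline; the function name retained_snps is hypothetical, and the three counts are the only values taken from the abstract.

```python
def retained_snps(total, excluded):
    """Return (retained count, retained fraction) after removing excluded SNPs.

    Hypothetical helper for illustration; not part of any published tool.
    """
    if excluded > total:
        raise ValueError("cannot exclude more SNPs than were identified")
    retained = total - excluded
    return retained, retained / total


# Counts as reported in the abstract.
total_snps = 323_224          # SNPs identified among the 53 strains
recombinant_snps = 314_059    # SNPs attributed to recombination

mcg_snps, fraction = retained_snps(total_snps, recombinant_snps)
print(mcg_snps)               # matches the 9,165 MCG SNPs reported
print(f"{fraction:.2%}")      # share of all SNPs kept for MCG typing
```

    On these figures, fewer than 3% of the identified SNPs survive the recombination filter and are used for MCG typing.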

  16. Population structure and minimum core genome typing of Legionella pneumophila.

    PubMed

    Qin, Tian; Zhang, Wen; Liu, Wenbin; Zhou, Haijian; Ren, Hongyu; Shao, Zhujun; Lan, Ruiting; Xu, Jianguo

    2016-01-01

    Legionella pneumophila is an important human pathogen causing Legionnaires' disease. In this study, whole genome sequencing (WGS) was used to study the characteristics and population structure of L. pneumophila strains. We sequenced and compared 53 isolates of L. pneumophila covering different serogroups and sequence-based typing (SBT) types (STs). We found that 1,896 single-copy orthologous genes were shared by all isolates and were defined as the minimum core genome (MCG) of L. pneumophila. A total of 323,224 single-nucleotide polymorphisms (SNPs) were identified among the 53 strains. After excluding 314,059 SNPs which were likely to be results of recombination, the remaining 9,165 SNPs were referred to as MCG SNPs. Population structure analysis based on MCG divided the 53 L. pneumophila isolates into nine MCG groups. The within-group distances were much smaller than the between-group distances, indicating considerable divergence between MCG groups. MCG groups were also supported by phylogenetic analysis and may be considered as robust taxonomic units within L. pneumophila. Among the nine MCG groups, eight showed high intracellular growth ability while one showed low intracellular growth ability. Furthermore, MCG typing also showed high resolution in subtyping ST1 strains. The results obtained in this study provided significant insights into the evolution, population structure and pathogenicity of L. pneumophila. PMID:26888563

  17. Simple repetitive sequences in the genome: structure and functional significance.

    PubMed

    Brahmachari, S K; Meera, G; Sarkar, P S; Balagurumoorthy, P; Tripathi, J; Raghavan, S; Shaligram, U; Pataskar, S

    1995-09-01

    The current explosion of DNA sequence information has generated increasing evidence for the claim that noncoding repetitive DNA sequences present within and around different genes could play an important role in genetic control processes, although the precise role and mechanism by which these sequences function are poorly understood. Several of the simple repetitive sequences which occur in a large number of loci throughout the human and other eukaryotic genomes satisfy the sequence criteria for forming non-B DNA structures in vitro. We have summarized some of the features of three different types of simple repeats that highlight the importance of repetitive DNA in the control of gene expression and chromatin organization. (i) (TG/CA)n repeats are widespread and conserved in many loci. These sequences are associated with nucleosomes of varying linker length and may play a role in chromatin organization. These Z-potential sequences can help absorb superhelical stress during transcription and aid in recombination. (ii) Human telomeric repeat (TTAGGG)n adopts a novel quadruplex structure and exhibits unusual chromatin organization. This unusual structural motif could explain chromosome pairing and stability. (iii) Intragenic amplification of (CTG)n/(CAG)n trinucleotide repeat, which is now known to be associated with several genetic disorders, could down-regulate gene expression in vivo. The overall implications of these findings vis-à-vis repetitive sequences in the genome are summarized. PMID:8582360

  18. Advanced Separation Consortium

    SciTech Connect

    2006-01-01

    The Center for Advanced Separation Technologies (CAST) was formed in 2001 under the sponsorship of the US Department of Energy to conduct fundamental research in advanced separation and to develop technologies that can be used to produce coal and minerals in an efficient and environmentally acceptable manner. The CAST consortium consists of seven universities - Virginia Tech, West Virginia University, University of Kentucky, Montana Tech, University of Utah, University of Nevada-Reno, and New Mexico Tech. The consortium brings together a broad range of expertise to solve problems facing the US coal industry and the mining sector in general. At present, a total of 60 research projects are under way. The article outlines some of these, on topics including innovative dewatering technologies, removal of mercury and other impurities, and modelling of the flotation process. 1 photo.

  19. Kansas Wind Energy Consortium

    SciTech Connect

    Gruenbacher, Don

    2015-12-31

    This project addresses both fundamental and applied research problems that will help with problems defined by the DOE “20% Wind by 2030 Report”. In particular, this work focuses on increasing the capacity of small or community wind generation capabilities that would be operated in a distributed generation approach. A consortium (KWEC – Kansas Wind Energy Consortium) of researchers from Kansas State University and Wichita State University aims to dramatically increase the penetration of wind energy via distributed wind power generation. We believe distributed generation through wind power will play a critical role in the ability to reach and extend the renewable energy production targets set by the Department of Energy. KWEC aims to find technical and economic solutions to enable widespread implementation of distributed renewable energy resources that would apply to wind.

  20. High-resolution haplotype block structure in the cattle genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Bovine HapMap Consortium has generated assay panels to genotype ~30,000 single nucleotide polymorphisms (SNPs) from 501 animals sampled from 19 worldwide taurine and indicine breeds, plus two outgroup species (Anoa and Water Buffalo). Within the larger set of SNPs we targeted 101 high density re...

  1. Structural genomic variation in childhood epilepsies with complex phenotypes

    PubMed Central

    Helbig, Ingo; Swinkels, Marielle E M; Aten, Emmelien; Caliebe, Almuth; van 't Slot, Ruben; Boor, Rainer; von Spiczak, Sarah; Muhle, Hiltrud; Jähn, Johanna A; van Binsbergen, Ellen; van Nieuwenhuizen, Onno; Jansen, Floor E; Braun, Kees P J; de Haan, Gerrit-Jan; Tommerup, Niels; Stephani, Ulrich; Hjalgrim, Helle; Poot, Martin; Lindhout, Dick; Brilstra, Eva H; Møller, Rikke S; Koeleman, Bobby PC

    2014-01-01

    A genetic contribution to a broad range of epilepsies has been postulated, and particularly copy number variations (CNVs) have emerged as significant genetic risk factors. However, the role of CNVs in patients with epilepsies with complex phenotypes is not known. Therefore, we investigated the role of CNVs in patients with unclassified epilepsies and complex phenotypes. A total of 222 patients from three European countries, including patients with structural lesions on magnetic resonance imaging (MRI), dysmorphic features, and multiple congenital anomalies, were clinically evaluated and screened for CNVs. MRI findings including acquired or developmental lesions and patient characteristics were subdivided and analyzed in subgroups. MRI data were available for 88.3% of patients, of whom 41.6% had abnormal MRI findings. Eighty-eight rare CNVs were discovered in 71 out of 222 patients (31.9%). Segregation of all identified variants could be assessed in 42 patients, 11 of which were de novo. The frequency of all structural variants and de novo variants was not statistically different between patients with or without MRI abnormalities or MRI subcategories. Patients with dysmorphic features were more likely to carry a rare CNV. Genome-wide screening methods for rare CNVs may provide clues for the genetic etiology in patients with a broader range of epilepsies than previously anticipated, including in patients with various brain anomalies detectable by MRI. Performing genome-wide screens for rare CNVs can be a valuable contribution to the routine diagnostic workup in patients with a broad range of childhood epilepsies. PMID:24281369

  2. Recognizing genes and other components of genomic structure

    SciTech Connect

    Burks, C.; Myers, E. (Dept. of Computer Science); Stormo, G.D. (Dept. of Molecular, Cellular and Developmental Biology)

    1991-01-01

    The Aspen Center for Physics (ACP) sponsored a three-week workshop, with 26 scientists participating, from 28 May to 15 June, 1990. The workshop, entitled Recognizing Genes and Other Components of Genomic Structure, focussed on discussion of current needs and future strategies for developing the ability to identify and predict the presence of complex functional units on sequenced, but otherwise uncharacterized, genomic DNA. We addressed the need for computationally-based, automatic tools for synthesizing available data about individual consensus sequences and local compositional patterns into the composite objects (e.g., genes) that are, as composite entities, the true object of interest when scanning DNA sequences. The workshop was structured to promote sustained informal contact and exchange of expertise between molecular biologists, computer scientists, and mathematicians. No participant stayed for less than one week, and most attended for two or three weeks. Computers, software, and databases were available for use as "electronic blackboards" and as the basis for collaborative exploration of ideas being discussed and developed at the workshop. 23 refs., 2 tabs.

  3. Structural constraints in the packaging of bluetongue virus genomic segments

    PubMed Central

    Burkhardt, Christiane; Sung, Po-Yu; Celma, Cristina C.

    2014-01-01

    The mechanism used by bluetongue virus (BTV) to ensure the sorting and packaging of its 10 genomic segments is still poorly understood. In this study, we investigated the packaging constraints for two BTV genomic segments from two different serotypes. Segment 4 (S4) of BTV serotype 9 was mutated sequentially and packaging of mutant ssRNAs was investigated by two newly developed RNA packaging assay systems, one in vivo and the other in vitro. Modelling of the mutated ssRNA followed by biochemical data analysis suggested that a conformational motif formed by interaction of the 5′ and 3′ ends of the molecule was necessary and sufficient for packaging. A similar structural signal was also identified in S8 of BTV serotype 1. Furthermore, the same conformational analysis of secondary structures for positive-sense ssRNAs was used to generate a chimeric segment that maintained the putative packaging motif but contained unrelated internal sequences. This chimeric segment was packaged successfully, confirming that the motif identified directs the correct packaging of the segment. PMID:24980574

  4. Secure web book to store structural genomics research data.

    PubMed

    Manjasetty, Babu A; Höppner, Klaus; Mueller, Uwe; Heinemann, Udo

    2003-01-01

    Recently established collaborative structural genomics programs aim at significantly accelerating the crystal structure analysis of proteins. These large-scale projects require efficient data management systems to ensure seamless collaboration between different groups of scientists working towards the same goal. Within the Berlin-based Protein Structure Factory, the synchrotron X-ray data collection and the subsequent crystal structure analysis tasks are located at BESSY, a third-generation synchrotron source. To organize file-based communication and data transfer at the BESSY site of the Protein Structure Factory, we have developed the web-based BCLIMS, the BESSY Crystallography Laboratory Information Management System. BCLIMS is a relational data management system which is powered by MySQL as the database engine and Apache HTTP as the web server. The database interface routines are written in the Python programming language. The software is freely available to academic users. Here we describe the storage, retrieval and manipulation of laboratory information, mainly pertaining to the synchrotron X-ray diffraction experiments and the subsequent protein structure analysis, using BCLIMS. PMID:14649296
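
    The abstract describes Python interface routines over a relational database but gives no schema or code. The sketch below only illustrates the general pattern (store, then retrieve, records of diffraction data sets) and makes several assumptions: it uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs without a server, and the table, column and function names, as well as the sample values, are hypothetical rather than taken from BCLIMS.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) datasets table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS datasets (
               id INTEGER PRIMARY KEY,
               crystal_id TEXT NOT NULL,
               beamline TEXT NOT NULL,
               resolution_angstrom REAL
           )"""
    )
    return conn


def add_dataset(conn, crystal_id, beamline, resolution_angstrom):
    """Store one diffraction data set and return its row id."""
    cur = conn.execute(
        "INSERT INTO datasets (crystal_id, beamline, resolution_angstrom) "
        "VALUES (?, ?, ?)",  # placeholders avoid SQL injection
        (crystal_id, beamline, resolution_angstrom),
    )
    conn.commit()
    return cur.lastrowid


def datasets_for_crystal(conn, crystal_id):
    """Return (beamline, resolution) rows for a crystal, best resolution first."""
    return conn.execute(
        "SELECT beamline, resolution_angstrom FROM datasets "
        "WHERE crystal_id = ? ORDER BY resolution_angstrom",
        (crystal_id,),
    ).fetchall()


conn = open_store()
add_dataset(conn, "xtal-1", "beamline-A", 2.1)
add_dataset(conn, "xtal-1", "beamline-B", 1.8)
add_dataset(conn, "xtal-2", "beamline-A", 2.5)
rows = datasets_for_crystal(conn, "xtal-1")
print(rows)
```

    With a real MySQL server, the same routines would go through a MySQL driver instead of sqlite3, and the placeholder syntax may differ by driver.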

  5. The impact of population structure on genomic prediction in stratified populations.

    PubMed

    Guo, Zhigang; Tucker, Dominic M; Basten, Christopher J; Gandhi, Harish; Ersoz, Elhan; Guo, Baohong; Xu, Zhanyou; Wang, Daolong; Gay, Gilles

    2014-03-01

    Impacts of population structure on the evaluation of genomic heritability and prediction were investigated and quantified using high-density markers in diverse panels in rice and maize. Population structure is an important factor affecting estimation of genomic heritability and assessment of genomic prediction in stratified populations. In this study, our first objective was to assess effects of population structure on estimations of genomic heritability using the diversity panels in rice and maize. Results indicate population structure explained 33 and 7.5% of genomic heritability for rice and maize, respectively, depending on traits, with the remaining heritability explained by within-subpopulation variation. Estimates of within-subpopulation heritability were higher than those derived from quantitative trait loci identified in genome-wide association studies, suggesting 65% improvement in genetic gains. The second objective was to evaluate effects of population structure on genomic prediction using cross-validation experiments. When population structure exists in both training and validation sets, correcting for population structure led to a significant decrease in accuracy with genomic prediction. In contrast, when prediction was limited to a specific subpopulation, population structure showed little effect on accuracy and within-subpopulation genetic variance dominated predictions. Finally, effects of genomic heritability on genomic prediction were investigated. Accuracies with genomic prediction increased with genomic heritability in both training and validation sets, with the former showing a slightly greater impact. In summary, our results suggest that the population structure contribution to genomic prediction varies based on prediction strategies, and is also affected by the genetic architectures of traits and populations. In practical breeding, these conclusions may be helpful to better understand and utilize the different genetic resources in genomic

  6. Backbone Solution Structures of Proteins Using Residual Dipolar Couplings: Application to a Novel Structural Genomics Target

    PubMed Central

    Valafar, H.; Mayer, K. L.; Bougault, C. M.; LeBlond, P. D.; Jenney, F. E.; Brereton, P. S.; Adams, M.W.W.; Prestegard, J.H.

    2006-01-01

    Structural genomics (or proteomics) activities are critically dependent on the availability of high-throughput structure determination methodology. Development of such methodology has been a particular challenge for NMR based structure determination because of the demands for isotopic labeling of proteins and the requirements for very long data acquisition times. We present here a methodology that gains efficiency from a focus on determination of backbone structures of proteins as opposed to full structures with all side chains in place. This focus is appropriate given the presumption that many protein structures in the future will be built using computational methods that start from representative fold family structures and replace as many as 70% of the side chains in the course of structure determination. The methodology we present is based primarily on residual dipolar couplings (RDCs), readily accessible NMR observables that constrain the orientation of backbone fragments irrespective of separation in space. A new software tool is described for the assembly of backbone fragments under RDC constraints and an application to a structural genomics target is presented. The target is an 8.7 kDa protein from Pyrococcus furiosus, PF1061, that was previously not well annotated, and had a nearest structurally characterized neighbor with only 33% sequence identity. The structure produced shows structural similarity to this sequence homologue, but also shows similarity to other proteins that suggests a functional role in sulfur transfer. Given the backbone structure and a possible functional link this should be an ideal target for development of modeling methods. PMID:15704012

  7. Studies on cattle genomic structural variation provide insights into ruminant speciation and adaptation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomic structural variations, including segmental duplications (SD) and copy number variations (CNV), contribute significantly to individual health and disease in primates and rodents. As a part of the bovine genome annotation effort, we performed the first genome-wide analysis of SD in cattle usin...

  8. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    PubMed Central

    2011-01-01

    Background: Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome. Results: Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella. Conclusions: When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution. PMID:21619600

  9. Knowledge Mobilization across Boundaries with the Use of Novel Organizational Structures, Conferencing Strategies, and Technological Tools: The Ontario Consortium of Undergraduate Biology Educators (oCUBE) Model

    ERIC Educational Resources Information Center

    Kajiura, Lovaye; Smit, Julie; Montpetit, Colin; Kelly, Tamara; Waugh, Jennifer; Rawle, Fiona; Clark, Julie; Neumann, Melody; French, Michelle

    2014-01-01

    The Ontario Consortium of Undergraduate Biology Educators (oCUBE) brings together over 50 biology educators from 18 Ontario universities with the common goal to improve the biology undergraduate experience for both students and educators. This goal is achieved through an innovative mix of highly interactive face-to-face meetings, online…

  10. Comparative genetics and genomics of nematodes: genome structure, development, and lifestyle.

    PubMed

    Sommer, Ralf J; Streit, Adrian

    2011-01-01

    Nematodes are found in virtually all habitats on earth. Many of them are parasites of plants and animals, including humans. The free-living nematode, Caenorhabditis elegans, is one of the genetically best-studied model organisms and was the first metazoan whose genome was fully sequenced. In recent years, the draft genome sequences of another six nematodes representing four of the five major clades of nematodes were published. Compared to mammalian genomes, all these genomes are very small. Nevertheless, they contain almost the same number of genes as the human genome. Nematodes are therefore a very attractive system for comparative genetic and genomic studies, with C. elegans as an excellent baseline. Here, we review the efforts that were made to extend genetic analysis to nematodes other than C. elegans, and we compare the seven available nematode genomes. One of the most striking findings is the unexpectedly high incidence of gene acquisition through horizontal gene transfer (HGT). PMID:21721943

  11. Midwest Superconductivity Consortium

    SciTech Connect

    Liedl, G.L.

    1992-01-01

    The mission of the Midwest Superconductivity Consortium (MISCON) is to advance the science and understanding of high-Tc superconductivity. Programmatic research focuses upon key materials-related problems: synthesis and processing; and limiting features in transport phenomena. During the past year, twenty-one projects produced over eighty-seven talks and seventy-two publications. Key achievements this past year expand our understanding of processing phenomena relating to crystallization and texture, metal superconductor composites, and modulated microstructures. Further noteworthy accomplishments include calculations on the 2-D superconductor-insulator transition, prediction of flux line lattice melting, and an expansion of our understanding and use of microwave phenomena as related to superconductors.

  12. Combustion Byproducts Recycling Consortium

    SciTech Connect

    Paul Ziemkiewicz; Tamara Vandivort; Debra Pflughoeft-Hassett; Y. Paul Chugh; James Hower

    2008-08-31

    The Combustion Byproducts Recycling Consortium (CBRC) program was developed as a focused program to remove and/or minimize the barriers for effective management of over 123 million tons of coal combustion byproducts (CCBs) annually generated in the USA. At the time of launching the CBRC in 1998, about 25% of CCBs were beneficially utilized while the remainder was disposed of in on-site or off-site landfills. During the ten-year tenure of the CBRC (1998-2008), after a critical review, 52 projects were funded nationwide. By region, the East, Midwest, and West had 21, 18, and 13 projects funded, respectively. Almost all projects were cooperative projects involving industry, government, and academia. The CBRC projects, to a large extent, successfully addressed the problems of large-scale utilization of CCBs. A few projects, such as the two Eastern Region projects that addressed the use of fly ash in foundry applications, might be thought of as a somewhat smaller application in comparison to construction and agricultural uses, but as a novel niche use, they set the stage to draw interest that fly ash substitution for Portland cement might not attract. With consideration of the large increase in flue gas desulfurization (FGD) gypsum in response to EPA regulations, agricultural uses of FGD gypsum hold promise for large-scale uses of a product currently directed to the (currently stagnant) home construction market. Outstanding achievements of the program are: (1) The CBRC successfully enhanced professional expertise in the area of CCBs throughout the nation. The enhanced capacity continues to provide technology and information transfer expertise to industry and regulatory agencies. (2) Several technologies were developed that can be used immediately. These include: (a) Use of CCBs for road base and sub-base applications; (b) full-depth, in situ stabilization of gravel roads or highway/pavement construction recycled materials; and (c) fired bricks containing up to 30%-40% F

  13. The genome-wide structure of the Jewish people.

    PubMed

    Behar, Doron M; Yunusbayev, Bayazit; Metspalu, Mait; Metspalu, Ene; Rosset, Saharon; Parik, Jüri; Rootsi, Siiri; Chaubey, Gyaneshwer; Kutuev, Ildus; Yudkovsky, Guennady; Khusnutdinova, Elza K; Balanovsky, Oleg; Semino, Ornella; Pereira, Luisa; Comas, David; Gurwitz, David; Bonne-Tamir, Batsheva; Parfitt, Tudor; Hammer, Michael F; Skorecki, Karl; Villems, Richard

    2010-07-01

    Contemporary Jews comprise an aggregate of ethno-religious communities whose worldwide members identify with each other through various shared religious, historical and cultural traditions. Historical evidence suggests common origins in the Middle East, followed by migrations leading to the establishment of communities of Jews in Europe, Africa and Asia, in what is termed the Jewish Diaspora. This complex demographic history imposes special challenges in attempting to address the genetic structure of the Jewish people. Although many genetic studies have shed light on Jewish origins and on diseases prevalent among Jewish communities, including studies focusing on uniparentally and biparentally inherited markers, genome-wide patterns of variation across the vast geographic span of Jewish Diaspora communities and their respective neighbours have yet to be addressed. Here we use high-density bead arrays to genotype individuals from 14 Jewish Diaspora communities and compare these patterns of genome-wide diversity with those from 69 Old World non-Jewish populations, of which 25 have not previously been reported. These samples were carefully chosen to provide comprehensive comparisons between Jewish and non-Jewish populations in the Diaspora, as well as with non-Jewish populations from the Middle East and north Africa. Principal component and structure-like analyses identify previously unrecognized genetic substructure within the Middle East. Most Jewish samples form a remarkably tight subcluster that overlies Druze and Cypriot samples but not samples from other Levantine populations or paired Diaspora host populations. In contrast, Ethiopian Jews (Beta Israel) and Indian Jews (Bene Israel and Cochini) cluster with neighbouring autochthonous populations in Ethiopia and western India, respectively, despite a clear paternal link between the Bene Israel and the Levant. 
These results cast light on the variegated genetic architecture of the Middle East, and trace the origins

  14. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populatio...

  15. Development of the international psyllid genome consortium

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Two of the most important emerging agricultural diseases in the USA are transmitted by two different insect species of psyllids from the Family Psyllidae. The Asian Citrus Psyllid (Diaphorina citri) is the principal vector of the intercellular, plant-pathogenic bacterium Liberibacter which cause Hua...

  16. Structural variation of the human genome: mechanisms, assays, and role in male infertility.

    PubMed

    Carvalho, Claudia M B; Zhang, Feng; Lupski, James R

    2011-02-01

    Genomic disorders are defined as diseases caused by rearrangements of the genome incited by a genomic architecture that conveys instability. Y-chromosome related dysfunctions such as male infertility are frequently associated with gross DNA rearrangements resulting from its peculiar genomic architecture. The Y-chromosome has evolved into a highly specialized chromosome to perform male functions, mainly spermatogenesis. Direct and inverted repeats, some of them palindromes with highly identical nucleotide sequences that can form DNA cruciform structures, characterize the genomic structure of the Y-chromosome long arm. Some particular Y chromosome genomic deletions can cause spermatogenic failure likely because of removal of one or more transcriptional units with a potential role in spermatogenesis. We describe mechanisms underlying the formation of human genomic rearrangements on autosomes and review Y-chromosome deletions associated with male infertility. PMID:21210740

  17. Portrait of a Consortium: ANKOS (Anatolian University Libraries Consortium)

    ERIC Educational Resources Information Center

    Erdogan, Phyllis; Karasozen, Bulent

    2009-01-01

    The Anatolian University Libraries Consortium (ANKOS) was created in 2001 with only a few members subscribed to nine e-journal collections and bibliographic databases. This Turkish library consortium had developed from one state and three private universities joining together for the purchase of two databases in 1999. Over time, the numbers of…

  18. Identification of novel RNA secondary structures within the hepatitis C virus genome reveals a cooperative involvement in genome packaging

    PubMed Central

    Stewart, H.; Bingham, R.J.; White, S. J.; Dykeman, E. C.; Zothner, C.; Tuplin, A. K.; Stockley, P. G.; Twarock, R.; Harris, M.

    2016-01-01

    The specific packaging of the hepatitis C virus (HCV) genome is hypothesised to be driven by Core-RNA interactions. To identify the regions of the viral genome involved in this process, we used SELEX (systematic evolution of ligands by exponential enrichment) to identify RNA aptamers which bind specifically to Core in vitro. Comparison of these aptamers to multiple HCV genomes revealed the presence of a conserved terminal loop motif within short RNA stem-loop structures. We postulated that interactions of these motifs, as well as sub-motifs which were present in HCV genomes at statistically significant levels, with the Core protein may drive virion assembly. We mutated 8 of these predicted motifs within the HCV infectious molecular clone JFH-1, thereby producing a range of mutant viruses predicted to possess altered RNA secondary structures. RNA replication and viral titre were unaltered in viruses possessing only one mutated structure. However, infectivity titres were decreased in viruses possessing a higher number of mutated regions. This work thus identified multiple novel RNA motifs which appear to contribute to genome packaging. We suggest that these structures act as cooperative packaging signals to drive specific RNA encapsidation during HCV assembly. PMID:26972799

  19. Hawaii Space Grant Consortium

    NASA Technical Reports Server (NTRS)

    Flynn, Luke P.

    2005-01-01

    The Hawai'i Space Grant Consortium is composed of ten institutions of higher learning including the University of Hawai'i at Manoa, the University of Hawai'i at Hilo, the University of Guam, and seven Community Colleges spread over the 4 main Hawaiian islands. Geographic separation is not the only obstacle that we face as a Consortium. Hawai'i has been mired in an economic downturn due to a lack of tourism for almost all of the period (2001 - 2004) covered by this report, although hotel occupancy rates and real estate sales have sky-rocketed in the last year. Our challenges have been many including providing quality educational opportunities in the face of shrinking State and Federal budgets, encouraging science and technology course instruction at the K-12 level in a public school system that is becoming less focused on high technology and more focused on developing basic reading and math skills, and assembling community college programs with instructors who are expected to teach more classes for the same salary. Motivated people can overcome these problems. Fortunately, the Hawai'i Space Grant Consortium (HSGC) consists of a group of highly motivated and talented individuals who have not only overcome these obstacles, but have excelled with the Program. We fill a critical need within the State of Hawai'i to provide our children with opportunities to pursue their dreams of becoming the next generation of NASA astronauts, engineers, and explorers. Our strength lies not only in our diligent and creative HSGC advisory board, but also with Hawai'i's teachers, students, parents, and industry executives who are willing to invest their time, effort, and resources into Hawai'i's future. Our operational philosophy is to FACE the Future, meaning that we will facilitate, administer, catalyze, and educate in order to achieve our objective of creating a highly technically capable workforce both here in Hawai'i and for NASA. In addition to administering to programs and

  20. SPring-8 Structural Biology Beamlines / Automatic Beamline Operation at RIKEN Structural Genomics Beamlines

    SciTech Connect

    Ueno, Go; Hasegawa, Kazuya; Okazaki, Nobuo; Sakai, Hisanobu; Kumasaka, Takashi; Yamamoto, Masaki

    2007-01-19

    The RIKEN Structural Genomics Beamlines (BL26B1 and BL26B2) at SPring-8 have been constructed for high-throughput protein crystallography. Beamline operation is automated in cooperation with a sample changer robot. The operation software provides centralized control using a client-server architecture. A sample management system with a networked database has been implemented to accept dry-shipped crystals from distant users.

  1. Large-scale structure of genomic methylation patterns.

    PubMed

    Rollins, Robert A; Haghighi, Fatemeh; Edwards, John R; Das, Rajdeep; Zhang, Michael Q; Ju, Jingyue; Bestor, Timothy H

    2006-02-01

    The mammalian genome depends on patterns of methylated cytosines for normal function, but the relationship between genomic methylation patterns and the underlying sequence is unclear. We have characterized the methylation landscape of the human genome by global analysis of patterns of CpG depletion and by direct sequencing of 3073 unmethylated domains and 2565 methylated domains from human brain DNA. The genome was found to consist of short (<4 kb) unmethylated domains embedded in a matrix of long methylated domains. Unmethylated domains were enriched in promoters, CpG islands, and first exons, while methylated domains comprised interspersed and tandem-repeated sequences, exons other than first exons, and non-annotated single-copy sequences that are depleted in the CpG dinucleotide. The enrichment of regulatory sequences in the relatively small unmethylated compartment suggests that cytosine methylation constrains the effective size of the genome through the selective exposure of regulatory sequences. This buffers regulatory networks against changes in total genome size and provides an explanation for the C value paradox, which concerns the wide variations in genome size that scale independently of gene number. This suggestion is compatible with the finding that cytosine methylation is universal among large-genome eukaryotes, while many eukaryotes with genome sizes <5 x 10(8) bp do not methylate their DNA. PMID:16365381
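
    CpG depletion, as discussed in the abstract above, is often quantified with an observed/expected CpG ratio. The sketch below uses one widely used definition (number of CpG dinucleotides times sequence length, divided by the product of the C and G counts). Whether the authors used exactly this measure is an assumption, and the function name and example sequences are hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence.

    Expected CpG count assumes C and G occur independently:
    expected = (#C * #G) / length, so the ratio is #CpG * length / (#C * #G).
    Returns 0.0 when the sequence contains no C or no G.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    n_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if n_c == 0 or n_g == 0:
        return 0.0
    return n_cpg * len(seq) / (n_c * n_g)


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    print(cpg_obs_exp("CGCGCGCG"))  # CpG-rich: ratio above 1
    print(cpg_obs_exp("CAGTCAGT"))  # contains C and G but no CpG: ratio 0
```

    A ratio well below 1 over a long window indicates CpG depletion, while ratios near or above 1 are typical of CpG-island-like sequence.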

  2. Advanced Lab Consortium "Conspiracy"

    NASA Astrophysics Data System (ADS)

    Reichert, Jonathan F.

    2006-03-01

    Advanced Laboratory instruction is a time-honored and essential element of an undergraduate physics education. But, from my vantage point, it has been neglected by the two major professional societies, APS and AAPT. At some schools, it has been replaced by "research experiences," but I contend that very few of these experiences in the research lab, particularly in the junior year, deliver what they promise. It is time to focus the attention of APS, AAPT, and the NSF on the advanced lab. We need to create an Advanced Lab Consortium (ALC) of faculty and staff to share experiments, suppliers, materials, pedagogy, ideas, in short to build a professional network for those committed to advanced lab instruction. The AAPT is currently in serious discussions on this topic and my company stands ready with both financial and personnel resources to support the effort. This talk is a plea for co-conspirators.

  3. Gas Storage Technology Consortium

    SciTech Connect

    Joel L. Morrison; Sharon L. Elder

    2007-06-30

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is crucial in meeting the needs of these new markets. To address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance the operational flexibility and deliverability of the nation's gas storage system, and provide a cost-effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of April 1, 2007 through June 30, 2007. Key activities during this time period included: (1) Organizing and hosting the 2007 GSTC Spring Meeting; (2) Identifying the 2007 GSTC projects, issuing award or declination letters, and begin drafting subcontracts; (3) 2007 project mentoring teams identified; (4) New NETL Project Manager; (5) Preliminary planning for the 2007 GSTC Fall Meeting; (6) Collecting and compiling the 2005 GSTC project final reports; and (7) Outreach and communications.

  4. Gas Storage Technology Consortium

    SciTech Connect

    Joel Morrison

    2005-09-14

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of April 1, 2005 through June 30, 2005. During this time period efforts were directed toward (1) GSTC administration changes, (2) participating in the American Gas Association Operations Conference and Biennial Exhibition, (3) issuing a Request for Proposals (RFP) for proposal solicitation for funding, and (4) organizing the proposal selection meeting.

  5. Gas Storage Technology Consortium

    SciTech Connect

    Joel L. Morrison; Sharon L. Elder

    2006-07-06

    Gas storage is a critical element in the natural gas industry. Producers, transmission & distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of April 1 to June 30, 2006. 
Key activities during this time period include: (1) Develop and process subcontract agreements for the eight projects selected for cofunding at the February 2006 GSTC Meeting; (2) Compiling and distributing the three 2004 project final reports to the GSTC Full members; (3) Develop template, compile listserv, and draft first GSTC Insider online newsletter; (4) Continue membership recruitment; (5) Identify projects and finalize agenda for the fall GSTC/AGA Underground Storage Committee Technology Transfer

  6. Gas Storage Technology Consortium

    SciTech Connect

    Joel L. Morrison; Sharon L. Elder

    2006-05-10

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of January 1, 2006 through March 31, 2006. Activities during this time period were: (1) Organize and host the 2006 Spring Meeting in San Diego, CA on February 21-22, 2006; (2) Award 8 projects for co-funding by GSTC for 2006; (3) New members recruitment; and (4) Improving communications.

  7. Gas Storage Technology Consortium

    SciTech Connect

    Joel L. Morrison; Sharon L. Elder

    2007-03-31

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is crucial in meeting the needs of these new markets. To address the gas storage needs of the natural gas industry, an industry-driven consortium was created - the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance the operational flexibility and deliverability of the nation's gas storage system, and provide a cost-effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of January 1, 2007 through March 31, 2007. Key activities during this time period included: (1) Drafting and distributing the 2007 RFP; (2) Identifying and securing a meeting site for the GSTC 2007 Spring Proposal Meeting; (3) Scheduling and participating in two (2) project mentoring conference calls; (4) Conducting elections for four Executive Council seats; (5) Collecting and compiling the 2005 GSTC Final Project Reports; and (6) Outreach and communications.

  8. Nuclear Fabrication Consortium

    SciTech Connect

    Levesque, Stephen

    2013-04-05

    This report summarizes the activities undertaken by EWI while under contract from the Department of Energy (DOE) Office of Nuclear Energy (NE) for the management and operation of the Nuclear Fabrication Consortium (NFC). The NFC was established by EWI to independently develop, evaluate, and deploy fabrication approaches and data that support the re-establishment of the U.S. nuclear industry: ensuring that the supply chain will be competitive on a global stage, enabling more cost-effective and reliable nuclear power in a carbon-constrained environment. The NFC provided a forum for member original equipment manufacturers (OEMs), fabricators, manufacturers, and materials suppliers to effectively engage with each other and rebuild the capacity of this supply chain by: identifying and removing impediments to the implementation of new construction and fabrication techniques and approaches for nuclear equipment, including system components and nuclear plants; providing and facilitating detailed scientific-based studies on new approaches and technologies that will have positive impacts on the cost of building nuclear plants; analyzing and disseminating information about future nuclear fabrication technologies and how they could impact the North American and the international nuclear marketplace; facilitating dialog and initiating alignment among fabricators, owners, trade associations, and government agencies; supporting industry in helping to create a larger qualified nuclear supplier network; acting as an unbiased technology resource to evaluate, develop, and demonstrate new manufacturing technologies; creating welder and inspector training programs to help enable the necessary workforce for the upcoming construction work; and serving as a focal point for technology, policy, and politically interested parties to share ideas and concepts associated with fabrication across the nuclear industry. The report the objectives and summaries of the Nuclear Fabrication Consortium

  9. Genomic structural variants are linked with intellectual disability.

    PubMed

    Bulayeva, Kazima; Lesch, Klaus-Peter; Bulayev, Oleg; Walsh, Christopher; Glatt, Stephen; Gurgenova, Farida; Omarova, Jamilja; Berdichevets, Irina; Thompson, Paul M

    2015-09-01

    Mutations in more than 500 genes have been associated with intellectual disability (ID) and related disorders of cognitive function, such as autism and schizophrenia. Here we aimed to unravel the molecular epidemiology of non-specific ID in a genetic isolate using a combination of population and molecular genetic approaches. A large multigenerational pedigree was ascertained within a Dagestan Genetic Heritage research program in a genetic isolate of indigenous ethnics. Clinical characteristics of the affected members were based on combining diagnoses from regional psychiatric hospitals with our own clinical assessment, using a Russian translation of the structured psychiatric interviews, the Diagnostic Interview for Genetic Studies and the Family Interview for Genetic Studies, based on DSM-IV criteria. The Weber/CHLC 9.0 STR set was used for multipoint parametric linkage analyses (Simwalk2.91). Next, we checked CNVs and LOH (based on Affymetrix SNP 5.0 data) in the ID-linked genomic regions with the aim of identifying candidate genes associated with mutations in linked regions. We detected statistically significant (p ≤ 0.05) suggestive linkage peaks with 1.3 < LOD < 3.0 in a total of 10 genomic regions: 1q41, 2p25.3-p24.2, 3p13-p12.1, 4q13.3, 10p11, 11q23, 12q24.22-q24.31, 17q24.2-q25.1, 21q22.13 and 22q12.3-q13.1. Three significant linkage signals with LOD >3 were obtained at 2p25.3-p24.2 under the dominant model, with a peak at 21 cM flanked by loci D2S2976 and D2S2952; at 12q24.22-q24.31 under the recessive model, with a peak at -120 cM flanked by markers D12S2070 and D12S395; and at 22q12.3 under the dominant model, with a peak at 32 cM flanked by markers D22S683 and D22S445.
    After a set of genes had been designated as possible candidates in these specific chromosomal regions, we conducted an exploratory search for LOH and CNV based on microarray data to detect structural genomic variants within five ID-linked regions with LOD scores between 2.0 and
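
    For readers unfamiliar with the thresholds quoted in this abstract, a LOD score is the base-10 logarithm of the likelihood ratio between linkage at recombination fraction theta and no linkage (theta = 0.5). The sketch below computes a simple two-point LOD score for phase-known, fully informative meioses. It is only an illustration: the function names and the counts are hypothetical, and it is not the multipoint parametric analysis (Simwalk2) that the study reports.

```python
import math


def two_point_lod(recombinants: int, nonrecombinants: int, theta: float) -> float:
    """Return log10[L(theta) / L(0.5)] for phase-known, fully informative meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    meioses = recombinants + nonrecombinants
    # Log-likelihood under linkage at recombination fraction theta.
    log_l_theta = (recombinants * math.log10(theta)
                   + nonrecombinants * math.log10(1.0 - theta))
    # Log-likelihood under free recombination (no linkage).
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


def max_lod(recombinants: int, nonrecombinants: int) -> tuple[float, float]:
    """Grid-search theta in steps of 0.001 and return (best LOD, best theta)."""
    grid = [i / 1000 for i in range(1, 501)]
    return max((two_point_lod(recombinants, nonrecombinants, t), t) for t in grid)


# Hypothetical counts: 1 recombinant among 10 informative meioses.
best_lod, best_theta = max_lod(1, 9)
print(f"max LOD = {best_lod:.2f} at theta = {best_theta:.3f}")
```

    Under this toy model, ten informative meioses with no recombinants give a maximum LOD close to 3, which is one way to see why LOD > 3 is conventionally treated as significant evidence of linkage.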

  10. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    SciTech Connect

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  11. Comparative Genomics of Sibling Fungal Pathogenic Taxa Identifies Adaptive Evolution without Divergence in Pathogenicity Genes or Genomic Structure

    PubMed Central

    Sillo, Fabiano; Garbelotto, Matteo; Friedman, Maria; Gonthier, Paolo

    2015-01-01

    It has been estimated that the sister plant pathogenic fungal species Heterobasidion irregulare and Heterobasidion annosum may have been allopatrically isolated for 34–41 Myr. They are now sympatric due to the introduction of the first species from North America into Italy, where they freely hybridize. We used a comparative genomic approach to 1) confirm that the two species are distinct at the genomic level; 2) determine which gene groups have diverged the most and the least between species; 3) show that their overall genomic structures are similar, as predicted by the viability of hybrids, and identify genomic regions that instead are incongruent; and 4) test the previously formulated hypothesis that genes involved in pathogenicity may be less divergent between the two species than genes involved in saprobic decay and sporulation. Results based on the sequencing of three genomes per species identified a high level of interspecific similarity, but clearly confirmed the status of the two as distinct taxa. Genes involved in pathogenicity were more conserved between species than genes involved in saprobic growth and sporulation, corroborating at the genomic level that invasiveness may be determined by the two latter traits, as documented by field and inoculation studies. Additionally, the majority of genes under positive selection and the majority of genes bearing interspecific structural variations were involved either in transcriptional or in mitochondrial functions. This study provides genomic-level evidence that invasiveness of pathogenic microbes can be attained without the high levels of pathogenicity presumed to exist for pathogens challenging naïve hosts. PMID:26527650

  12. Sexual structures in Aspergillus: morphology, importance and genomics.

    PubMed

    Geiser, David M

    2009-01-01

    The genus Aspergillus comprises a few hundred species sharing a common asexual spore forming structure, the aspergillum. Approximately one-third of these species also produce a sexual stage, all but five of which are known to be homothallic. Sexual stages associated with Aspergillus fall into approximately ten different genera, reflecting a tremendous degree of phylogenetic and biological diversity. Sexual stages in Aspergillus are plectomycetous, typical for the order in which it resides, the Eurotiales. Theoretically, a homothallic Aspergillus species can produce both asexual conidia and sexual ascospores in both clonal and recombinant fashion, although the actual significance of these potential modes of reproduction is unclear. Aspergillus species with known sexual stages tend to be minor players in infections of humans, perhaps because of their tendency to produce fewer asexual spores compared to their non-teleomorphic congeners. The discovery of population genetic and genomic evidence for sex in species with no known sexual stage indicates that no assumptions can be made about the clonal versus recombinant life histories of a species based on its known mitotic and/or meiotic reproductive modes. PMID:18608901

  13. Sequence, genomic structure, and chromosomal assignment of human DOC-2

    SciTech Connect

    Albertsen, H.M.; Williams, B.; Smith, S.A.

    1996-04-15

    DOC-2 is a human gene originally identified as a 767-bp cDNA fragment isolated from normal ovarian epithelial cells by differential display against ovarian carcinoma cells. We have now determined the complete cDNA sequence of the 3.2-kb DOC-2 transcript and localized the gene to chromosome 5. A 12.5-kb genomic fragment at the 5′-end of DOC-2 has also been sequenced, revealing the intron-exon structure of the first eight exons (788 bases) of the DOC-2 gene. Translation of the DOC-2 cDNA predicts a hydrophobic protein of 770 amino acid residues with a molecular weight of 82.5 kDa. Comparison of the DNA and amino acid sequences of DOC-2 to publicly accessible sequence databases revealed 83% identity to p96, a murine-responsive phosphoprotein. In addition, about 45% identity was observed between the first 140 N-terminal residues of DOC-2 and the Caenorhabditis elegans M110.5 and Drosophila melanogaster Dab genes. 14 refs., 3 figs.

  14. Fastbreak: a tool for analysis and visualization of structural variations in genomic data

    PubMed Central

    2012-01-01

    Genomic studies are now being undertaken on thousands of samples requiring new computational tools that can rapidly analyze data to identify clinically important features. Inferring structural variations in cancer genomes from mate-paired reads is a combinatorially difficult problem. We introduce Fastbreak, a fast and scalable toolkit that enables the analysis and visualization of large amounts of data from projects such as The Cancer Genome Atlas. PMID:23046488

  15. Automated search of natively folded protein fragments for high-throughput structure determination in structural genomics.

    PubMed Central

    Kuroda, Y.; Tani, K.; Matsuo, Y.; Yokoyama, S.

    2000-01-01

    Structural genomic projects envision almost routine protein structure determinations, which are currently imaginable only for small proteins with molecular weights below 25,000 Da. For larger proteins, structural insight can be obtained by breaking them into small segments of amino acid sequences that can fold into native structures, even when isolated from the rest of the protein. Such segments are autonomously folding units (AFU) and have sizes suitable for fast structural analyses. Here, we propose to expand an intuitive procedure often employed for identifying biologically important domains to an automatic method for detecting putative folded protein fragments. The procedure is based on the recognition that large proteins can be regarded as a combination of independent domains conserved among diverse organisms. We thus have developed a program that reorganizes the output of BLAST searches and detects regions with a large number of similar sequences. To automate the detection process, it is reduced to a simple geometrical problem of recognizing rectangular shaped elevations in a graph that plots the number of similar sequences at each residue of a query sequence. We used our program to quantitatively corroborate the premise that segments with conserved sequences correspond to domains that fold into native structures. We applied our program to a test data set composed of 99 amino acid sequences containing 150 segments with structures listed in the Protein Data Bank, and thus known to fold into native structures. Overall, the fragments identified by our program have an almost 50% probability of forming a native structure, and comparable results are observed with sequences containing domain linkers classified in SCOP. Furthermore, we verified that our program identifies AFU in libraries from various organisms, and we found a significant number of AFU candidates for structural analysis, covering an estimated 5 to 20% of the genomic databases. Altogether, these
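
    The detection step described in this abstract (plotting the number of similar sequences at each residue and looking for rectangular elevations) can be sketched roughly as follows. This is a simplified stand-in and not the authors' program: it assumes BLAST hits have already been reduced to 1-based inclusive (start, end) intervals on the query, and it replaces the geometric recognition of rectangles with a plain depth threshold. The names `coverage_profile`, `find_plateaus`, `min_hits` and `min_len` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end) on the query


def coverage_profile(query_len: int, hits: List[Interval]) -> List[int]:
    """Count how many hit intervals cover each residue of the query."""
    diff = [0] * (query_len + 2)  # difference array, index 0 unused
    for start, end in hits:
        start, end = max(1, start), min(query_len, end)
        if start <= end:
            diff[start] += 1
            diff[end + 1] -= 1
    profile, depth = [], 0
    for pos in range(1, query_len + 1):
        depth += diff[pos]
        profile.append(depth)
    return profile


def find_plateaus(profile: List[int], min_hits: int, min_len: int) -> List[Interval]:
    """Return maximal runs of residues covered by at least min_hits sequences."""
    segments, run_start = [], None
    for pos, depth in enumerate(profile, start=1):
        if depth >= min_hits:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            if pos - run_start >= min_len:
                segments.append((run_start, pos - 1))
            run_start = None
    if run_start is not None and len(profile) - run_start + 1 >= min_len:
        segments.append((run_start, len(profile)))
    return segments


# Invented hit intervals on a 120-residue query, for illustration only.
demo_hits = [(1, 50), (5, 48), (3, 52), (80, 120), (82, 118), (60, 65)]
demo_profile = coverage_profile(120, demo_hits)
print(find_plateaus(demo_profile, min_hits=2, min_len=20))
```

    The difference array keeps the profile computation linear in the number of hits plus the query length, which matters when a single query returns thousands of BLAST alignments.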

  16. Reuse at the Software Productivity Consortium

    NASA Technical Reports Server (NTRS)

    Weiss, David M.

    1989-01-01

    The Software Productivity Consortium is sponsored by 14 aerospace companies as a developer of software engineering methods and tools. Software reuse and prototyping are currently the major emphasis areas. The Methodology and Measurement Project in the Software Technology Exploration Division has developed some concepts for reuse which they intend to develop into a synthesis process. They have identified two approaches to software reuse: opportunistic and systematic. The assumptions underlying the systematic approach, phrased as hypotheses, are the following: the redevelopment hypothesis, i.e., software developers solve the same problems repeatedly; the oracle hypothesis, i.e., developers are able to predict variations from one redevelopment to others; and the organizational hypothesis, i.e., software must be organized according to behavior and structure to take advantage of the predictions that the developers make. The conceptual basis for reuse includes: program families, information hiding, abstract interfaces, uses and information hiding hierarchies, and process structure. The primary reusable software characteristics are black-box descriptions, structural descriptions, and composition and decomposition based on program families. Automated support can be provided for systematic reuse, and the Consortium is developing a prototype reuse library and guidebook. The software synthesis process that the Consortium is aiming toward includes modeling, refinement, prototyping, reuse, assessment, and new construction.

  17. Genome mapping in Capsicum and the evolution of genome structure in the Solanaceae.

    PubMed Central

    Livingstone, K D; Lackney, V K; Blauth, J R; van Wijk, R; Jahn, M K

    1999-01-01

    We have created a genetic map of Capsicum (pepper) from an interspecific F2 population consisting of 11 large (76.2-192.3 cM) and 2 small (19.1 and 12.5 cM) linkage groups that cover a total of 1245.7 cM. Many of the markers are tomato probes that were chosen to cover the tomato genome, allowing comparison of this pepper map to the genetic map of tomato. Hybridization of all tomato-derived probes included in this study to positions throughout the pepper map suggests that no major losses have occurred during the divergence of these genomes. Comparison of the pepper and tomato genetic maps showed that 18 homeologous linkage blocks cover 98.1% of the tomato genome and 95.0% of the pepper genome. Through these maps and the potato map, we determined the number and types of rearrangements that differentiate these species and reconstructed a hypothetical progenitor genome. We conclude there have been 30 breaks as part of 5 translocations, 10 paracentric inversions, 2 pericentric inversions, and 4 disassociations or associations of genomic regions that differentiate tomato, potato, and pepper, as well as an additional reciprocal translocation, nonreciprocal translocation, and a duplication or deletion that differentiate the two pepper mapping parents. PMID:10388833

  18. GenomeTools: a comprehensive software library for efficient processing of structured genome annotations.

    PubMed

    Gremme, Gordon; Steinbiss, Sascha; Kurtz, Stefan

    2013-01-01

    Genome annotations are often published as plain text files describing genomic features and their subcomponents by an implicit annotation graph. In this paper, we present the GenomeTools, a convenient and efficient software library and associated software tools for developing bioinformatics software intended to create, process or convert annotation graphs. The GenomeTools strictly follow the annotation graph approach, offering a unified graph-based representation. This gives the developer intuitive and immediate access to genomic features and tools for their manipulation. To process large annotation sets with low memory overhead, we have designed and implemented an efficient pull-based approach for sequential processing of annotations. This makes it possible to handle even the largest annotation sets, such as a complete catalogue of human variations. Our object-oriented C-based software library enables a developer to conveniently implement their own functionality on annotation graphs and to integrate it into larger workflows, simultaneously accessing compressed sequence data if required. The careful C implementation of the GenomeTools not only ensures a light-weight memory footprint while allowing full sequential as well as random access to the annotation graph, but also facilitates the creation of bindings to a variety of script programming languages (like Python and Ruby) sharing the same interface. PMID:24091398
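
    The pull-based idea mentioned in this abstract can be illustrated with plain Python generators: each stage asks the previous one for the next feature only when it needs it, so the whole file is never held in memory. The sketch below does not use the GenomeTools API or its Python bindings; `Feature`, `read_features`, `only_type`, `total_length` and the sample GFF3 text are all hypothetical, and the flat per-line view ignores the parent-child annotation graph that GenomeTools actually builds.

```python
import io
from typing import Iterable, Iterator, NamedTuple, TextIO


class Feature(NamedTuple):
    seqid: str
    type: str
    start: int
    end: int
    strand: str


def read_features(handle: TextIO) -> Iterator[Feature]:
    """Lazily yield one feature per GFF3 data line (source stage)."""
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        yield Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6])


def only_type(features: Iterable[Feature], wanted: str) -> Iterator[Feature]:
    """Filter stage: pass through features of a single type."""
    return (f for f in features if f.type == wanted)


def total_length(features: Iterable[Feature]) -> int:
    """Sink stage: pulls features one at a time and sums their lengths."""
    return sum(f.end - f.start + 1 for f in features)


# Invented three-feature annotation used only for demonstration.
SAMPLE_GFF3 = "\n".join([
    "##gff-version 3",
    "\t".join(["chr1", "demo", "gene", "1", "100", ".", "+", ".", "ID=gene1"]),
    "\t".join(["chr1", "demo", "exon", "1", "40", ".", "+", ".", "Parent=gene1"]),
    "\t".join(["chr1", "demo", "exon", "61", "100", ".", "+", ".", "Parent=gene1"]),
]) + "\n"

exon_bases = total_length(only_type(read_features(io.StringIO(SAMPLE_GFF3)), "exon"))
print(exon_bases)
```

    Because every stage is a generator, replacing `io.StringIO` with an open file handle streams an arbitrarily large annotation file with roughly constant memory use.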

  19. Gas Storage Technology Consortium

    SciTech Connect

    Joel Morrison; Elizabeth Wood; Barbara Robuck

    2010-09-30

    The EMS Energy Institute at The Pennsylvania State University (Penn State) has managed the Gas Storage Technology Consortium (GSTC) since its inception in 2003. The GSTC infrastructure provided a means to accomplish industry-driven research and development designed to enhance the operational flexibility and deliverability of the nation's gas storage system, and provide a cost-effective, safe, and reliable supply of natural gas to meet domestic demand. The GSTC received base funding from the U.S. Department of Energy's (DOE) National Energy Technology Laboratory (NETL) Oil & Natural Gas Supply Program. The GSTC base funds were highly leveraged with industry funding for individual projects. Since its inception, the GSTC has engaged 67 members. The GSTC membership base was diverse, coming from 19 states, the District of Columbia, and Canada. The membership was comprised of natural gas storage field operators, service companies, industry consultants, industry trade organizations, and academia. The GSTC organized and hosted a total of 18 meetings since 2003. Of these, 8 meetings were held to review, discuss, and select proposals submitted for funding consideration. The GSTC reviewed a total of 75 proposals and committed co-funding to support 31 industry-driven projects. The GSTC committed co-funding to 41.3% of the proposals that it received and reviewed. The 31 projects had a total project value of $6,203,071 of which the GSTC committed $3,205,978 in co-funding. The committed GSTC project funding represented an average program cost share of 51.7%. Project applicants provided an average program cost share of 48.3%. In addition to the GSTC co-funding, the consortium provided the domestic natural gas storage industry with a technology transfer and outreach infrastructure. The technology transfer and outreach were conducted by having project mentoring teams and a GSTC website, and by working closely with the Pipeline Research Council International (PRCI) to jointly host
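
    The percentages quoted in this abstract can be reproduced from its raw counts and dollar amounts. The short check below assumes the shares are simple aggregate ratios; the abstract calls them "average program cost share", which could also mean a per-project average, so this is only one plausible reading. The variable and function names are illustrative.

```python
# Figures taken from the abstract.
proposals_reviewed = 75
projects_funded = 31
total_project_value = 6_203_071   # USD
gstc_cofunding = 3_205_978        # USD


def pct(part: float, whole: float) -> float:
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100 * part / whole, 1)


funding_rate = pct(projects_funded, proposals_reviewed)
gstc_share = pct(gstc_cofunding, total_project_value)
applicant_share = round(100 - gstc_share, 1)

print(funding_rate, gstc_share, applicant_share)
```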

  20. Gas Storage Technology Consortium

    SciTech Connect

    Joel L. Morrison; Sharon L. Elder

    2006-09-30

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created, the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of July 1, 2006 to September 30, 2006. Key activities during this time period include:
      - Subaward contracts for all 2006 GSTC projects completed;
      - Implement a formal project mentoring process by a mentor team;
      - Upcoming Technology Transfer meetings:
          - Finalize agenda for the American Gas Association Fall Underground Storage Committee/GSTC Technology Transfer Meeting in San Francisco, CA on October 4, 2006;
          - Identify projects and finalize agenda for the Fall GSTC Technology

  1. Draft Genome of the Wheat Rust Pathogen (Puccinia triticina) Unravels Genome-Wide Structural Variations during Evolution.

    PubMed

    Kiran, Kanti; Rawal, Hukam C; Dubey, Himanshu; Jaswal, Rajdeep; Devanna, B N; Gupta, Deepak Kumar; Bhardwaj, Subhash C; Prasad, P; Pal, Dharam; Chhuneja, Parveen; Balasubramanian, P; Kumar, J; Swami, M; Solanke, Amolkumar U; Gaikwad, Kishor; Singh, Nagendra K; Sharma, Tilak Raj

    2016-01-01

    Leaf rust is one of the most important diseases of wheat and is caused by Puccinia triticina, a highly variable rust pathogen prevalent worldwide. Decoding the genome of this pathogen will help in unraveling the molecular basis of its evolution and in the identification of genes responsible for its various biological functions. We generated high quality draft genome sequences (approximately 100-106 Mb) of two races of P. triticina: the variable and virulent Race77 and the old, avirulent Race106. The genomes of Race77 and Race106 had 33X and 27X coverage, respectively. We predicted 27,678 and 26,384 genes, with average lengths of 1,129 and 1,086 bases in Race77 and Race106, respectively, and found that the genomes consisted of 37.49% and 39.99% repetitive sequences. Genome wide comparative analysis revealed that Race77 differs substantially from Race106 with regard to segmental duplication (SD), repeat element, and SNP/InDel characteristics. Comparative analyses showed that Race77 is a recent, highly variable and adapted race compared with Race106. Further sequence analyses of 13 additional pathotypes of Race77 clearly differentiated the recent, active and virulent pathotypes from the older pathotypes. Average densities of 2.4 SNPs and 0.32 InDels per kb were obtained for all P. triticina pathotypes. Secretome analysis demonstrated that Race77 has more virulence factors than Race106, which may be responsible for the greater degree of adaptation of this pathogen. We also found that genes under greater selection pressure were conserved in the genomes of both races, and may affect functions crucial for the higher levels of virulence factors in Race77. This study provides insights into the genome structure, genome organization, molecular basis of variation, and pathogenicity of P. triticina. The genome sequence data generated in this study have been submitted to public domain databases and will be an important resource for comparative genomics studies of the more than 4000 existing

  2. The impact of genome-wide supported schizophrenia risk variants in the neurogranin gene on brain structure and function.

    PubMed

    Walton, Esther; Geisler, Daniel; Hass, Johanna; Liu, Jingyu; Turner, Jessica; Yendiki, Anastasia; Smolka, Michael N; Ho, Beng-Choon; Manoach, Dara S; Gollub, Randy L; Roessner, Veit; Calhoun, Vince D; Ehrlich, Stefan

    2013-01-01

    The neural mechanisms underlying genetic risk for schizophrenia, a highly heritable psychiatric condition, are still under investigation. New schizophrenia risk genes discovered through genome-wide association studies (GWAS), such as neurogranin (NRGN), can be used to identify these mechanisms. In this study we examined the association of two common NRGN risk single nucleotide polymorphisms (SNPs) with functional and structural brain-based intermediate phenotypes for schizophrenia. We obtained structural, functional MRI and genotype data of 92 schizophrenia patients and 114 healthy volunteers from the multisite Mind Clinical Imaging Consortium study. Two schizophrenia-associated NRGN SNPs (rs12807809 and rs12541) were tested for association with working memory-elicited dorsolateral prefrontal cortex (DLPFC) activity and surface-wide cortical thickness. NRGN rs12541 risk allele homozygotes (TT) displayed increased working memory-related activity in several brain regions, including the left DLPFC, left insula, left somatosensory cortex and the cingulate cortex, when compared to non-risk allele carriers. NRGN rs12807809 non-risk allele (C) carriers showed reduced cortical gray matter thickness compared to risk allele homozygotes (TT) in an area comprising the right pericalcarine gyrus, the right cuneus, and the right lingual gyrus. Our study highlights the effects of schizophrenia risk variants in the NRGN gene on functional and structural brain-based intermediate phenotypes for schizophrenia. These results support recent GWAS findings and further implicate NRGN in the pathophysiology of schizophrenia by suggesting that genetic NRGN risk variants contribute to subtle changes in neural functioning and anatomy that can be quantified with neuroimaging methods. PMID:24098564

  3. Architecture and Secondary Structure of an Entire HIV-1 RNA Genome

    PubMed Central

    Watts, Joseph M.; Dang, Kristen K.; Gorelick, Robert J.; Leonard, Christopher W.; Bess, Julian W.; Swanstrom, Ronald; Burch, Christina L.; Weeks, Kevin M.

    2009-01-01

    Single-stranded RNA viruses encompass broad classes of infectious agents and cause the common cold, cancer, AIDS, and other serious health threats. Viral replication is regulated at many levels, including using conserved genomic RNA structures. Most potential regulatory elements within viral RNA genomes are uncharacterized. Here we report the structure of an entire HIV-1 genome at single nucleotide resolution using SHAPE, a high-throughput RNA analysis technology. The genome encodes protein structure at two levels. In addition to the correspondence between RNA and protein primary sequences, a correlation exists between high levels of RNA structure and sequences that encode inter-domain loops in HIV proteins. This correlation suggests RNA structure modulates ribosome elongation to promote native protein folding. Some simple genome elements previously shown to be important, including the ribosomal gag-pol frameshift stem-loop, are components of larger RNA motifs. We also identify organizational principles for unstructured RNA regions. Highly used splice acceptors lie in unstructured motifs and hypervariable regions are sequestered from flanking genome regions by stable insulator helices. These results emphasize that the HIV-1 genome and, potentially, many coding RNAs are punctuated by numerous previously unrecognized regulatory motifs and that extensive RNA structure may constitute an additional level of the genetic code. PMID:19661910

  4. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain.

    PubMed

    Sükösd, Zsuzsanna; Andersen, Ebbe S; Seemann, Stefan E; Jensen, Mads Krogh; Hansen, Mathias; Gorodkin, Jan; Kjems, Jørgen

    2015-12-01

    A distance constrained secondary structural model of the ≈10 kb RNA genome of HIV-1 has been predicted, but higher-order structures involving long distance interactions are currently unknown. We present the first global RNA secondary structure model for the HIV-1 genome, which integrates both comparative structure analysis and information from experimental data in a full-length prediction without distance constraints. Besides recovering known structural elements, we predict several novel structural elements that are conserved in HIV-1 evolution. Our results also indicate that the structure of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interestingly, a set of long distance interactions forms a core organizing structure (COS) that organizes the genome into three major structural domains. Despite overlapping protein-coding regions, the COS is supported by a particularly high frequency of compensatory base changes, suggesting functional importance for this element. This new structural element potentially organizes the whole genome into three major domains protruding from a conserved core structure with potential roles in replication and evolution for the virus. PMID:26476446

  5. Unsupervised pattern discovery in human chromatin structure through genomic segmentation

    PubMed Central

    Hoffman, Michael M.; Buske, Orion J.; Wang, Jie; Weng, Zhiping; Bilmes, Jeff A.; Noble, William Stafford

    2012-01-01

    We applied a dynamic Bayesian network method that identifies joint patterns from multiple functional genomics experiments to ChIP-seq histone modification and transcription factor data, and DNaseI-seq and FAIRE-seq open chromatin readouts from the human cell line K562. In an unsupervised fashion, we identified patterns associated with transcription start sites, gene ends, enhancers, CTCF elements, and repressed regions. Software and genome browser tracks are at http://noble.gs.washington.edu/proj/segway/. PMID:22426492

  6. Consortium-Based Genetic Studies of Kawasaki Disease in Korea: Korean Kawasaki Disease Genetics Consortium

    PubMed Central

    Hong, Young Mi; Jang, Gi Young; Yun, Sin Weon; Yu, Jeong Jin; Yoon, Kyung Lim; Lee, Kyung-Yil; Kil, Hong-Rang

    2015-01-01

    In order to perform large-scale genetic studies of Kawasaki disease (KD) in Korea, the Korean Kawasaki Disease Genetics Consortium (KKDGC) was formed in 2008 with 10 hospitals. Since the establishment of KKDGC, there has been a collection of clinical data from a total of 1198 patients, and approximately 5 mL of blood samples per patient (for genomic deoxyribonucleic acid and plasma isolation), using a standard clinical data collection form and a nation-wide networking system for blood sample pick-up. In the clinical risk factor analysis using the collected clinical data of 478 KD patients, it was found that incomplete KD type, intravenous immunoglobulin (IVIG) non-responsiveness, and long febrile days are major risk factors for coronary artery lesions development, whereas low serum albumin concentration is an independent risk factor for IVIG non-responsiveness. In addition, we identified a KD susceptibility locus at 1p31, a coronary artery aneurysm locus (KCNN2 gene), and the causal variant in the C-reactive protein (CRP) promoter region, as determining the increased CRP levels in KD patients, by means of genome-wide association studies. Currently, this consortium is continually collecting more clinical data and genomic samples to identify the clinical and genetic risk factors via a single nucleotide polymorphism chip and exome sequencing, as well as collaborating with several international KD genetics teams. The consortium-based approach for genetic studies of KD in Korea will be a very effective way to understand the unknown etiology and causal mechanism of KD, which may be affected by multiple genes and environmental factors. PMID:26617644

  7. Micro and nanofluidic structures for cell sorting and genomic analysis

    NASA Astrophysics Data System (ADS)

    Morton, Keith J.

    Microfluidic systems promise rapid analysis of small samples in a compact and inexpensive format. But direct scaling of lab bench protocols on-chip is challenging because laminar flows in typical microfluidic devices are characterized by non-mixing streamlines. Common microfluidic mixers and sorters work by diffusion, limiting application to objects that diffuse slowly such as cells and DNA. Recently Huang et al. developed a passive microfluidic element to continuously separate bio-particles deterministically. In Deterministic Lateral Displacement (DLD), objects are sorted by size as they transit an asymmetric array of microfabricated posts. This thesis further develops DLD arrays with applications in three broad new areas. First, the arrays are used not simply to sort particles but to move streams of cells through functional flows for chemical treatment, such as on-chip immunofluorescent labeling of blood cells with washing, and on-chip E. coli cell lysis with simultaneous chromosome extraction. Secondly, modular tiling of the basic DLD element is used to construct complex particle handling modes that include beam steering for jets of cells and beads. Thirdly, nanostructured DLD arrays are built using Nanoimprint Lithography (NIL) and continuous-flow separation of 100 nm and 200 nm size particles is demonstrated. Finally, a number of ancillary nanofabrication techniques were developed in support of these overall goals, including methods to interface nanofluidic structures with standard microfluidic components such as inlet channels and reservoirs, precision etching of ultra-high aspect ratio (>50:1) silicon nanostructures, and fabrication of narrow (~35 nm) channels used to stretch genomic length DNA.

  8. COnsortium of METabolomics Studies (COMETS)

    Cancer.gov

    The COnsortium of METabolomics Studies (COMETS) is an extramural-intramural partnership that promotes collaboration among prospective cohort studies that follow participants for a range of outcomes and perform metabolomic profiling of individuals.

  9. INTEGRATED PETROLEUM ENVIRONMENTAL CONSORTIUM (IPEC)

    EPA Science Inventory

    EPA GRANT NUMBER: R827015
    Title: Integrated Petroleum Environmental Consortium (IPEC)
    Investigator: Kerry L. Sublette
    Institution: University of Tulsa
    EPA Project Officer: S. Bala Krishnan
    Project Period: October 1, 19...

  10. A sequence-based survey of the complex structural organization of tumor genomes

    SciTech Connect

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  11. Hickory Consortium 2001 Final Report

    SciTech Connect

    Not Available

    2003-02-01

    As with all Building America Program consortia, systems thinking is the key to understanding the processes that Hickory Consortium hopes to improve. The Hickory Consortium applies this thinking to more than the whole-building concept. Their systems thinking embraces the meta process of how housing construction takes place in America. By understanding the larger picture, they are able to identify areas where improvements can be made and how to implement them.

  12. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis

    PubMed Central

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5’ portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids. PMID:26046631

  13. Combustion Byproducts Recycling Consortium

    SciTech Connect

    Paul Ziemkiewicz; Tamara Vandivort; Debra Pflughoeft-Hassett; Y. Paul Chugh; James Hower

    2008-08-31

    Each year, over 100 million tons of solid byproducts are produced by coal-burning electric utilities in the United States. Annual production of flue gas desulfurization (FGD) byproducts continues to increase as the result of more stringent sulfur emission restrictions. In addition, stricter limits on NOx emissions mandated by the 1990 Clean Air Act have resulted in utility burner/boiler modifications that frequently yield higher carbon concentrations in fly ash, which restricts the use of the ash as a cement replacement. Controlling ammonia in ash is also of concern. If newer, 'clean coal' combustion and gasification technologies are adopted, their byproducts may also present a management challenge. The objective of the Combustion Byproducts Recycling Consortium (CBRC) is to develop and demonstrate technologies to address issues related to the recycling of byproducts associated with coal combustion processes. A goal of CBRC is that these technologies, by the year 2010, will raise the overall ash utilization rate from the current 34% to 50% by such measures as increasing the current rate of FGD byproduct use and increasing the number of uses considered 'allowable' under state regulations. Another issue of interest to the CBRC would be to examine the environmental impact of both byproduct utilization and disposal. No byproduct utilization technology is likely to be adopted by industry unless it is more cost-effective than landfilling. Therefore, it is extremely important that the utility industry provide guidance to the R&D program. Government agencies and private-sector organizations that may be able to utilize these materials in the conduct of their missions should also provide input. The CBRC will serve as an effective vehicle for acquiring and maintaining guidance from these diverse organizations so that the proper balance in the R&D program is achieved.

  14. Combustion Byproducts Recycling Consortium

    SciTech Connect

    Ziemkiewicz, Paul; Vandivort, Tamara; Pflughoeft-Hassett, Debra; Chugh, Y Paul; Hower, James

    2008-08-31

    Each year, over 100 million tons of solid byproducts are produced by coal-burning electric utilities in the United States. Annual production of flue gas desulfurization (FGD) byproducts continues to increase as the result of more stringent sulfur emission restrictions. In addition, stricter limits on NOx emissions mandated by the 1990 Clean Air Act have resulted in utility burner/boiler modifications that frequently yield higher carbon concentrations in fly ash, which restricts the use of the ash as a cement replacement. Controlling ammonia in ash is also of concern. If newer, “clean coal” combustion and gasification technologies are adopted, their byproducts may also present a management challenge. The objective of the Combustion Byproducts Recycling Consortium (CBRC) is to develop and demonstrate technologies to address issues related to the recycling of byproducts associated with coal combustion processes. A goal of CBRC is that these technologies, by the year 2010, will raise the overall ash utilization rate from the current 34% to 50% by such measures as increasing the current rate of FGD byproduct use and increasing the number of uses considered “allowable” under state regulations. Another issue of interest to the CBRC would be to examine the environmental impact of both byproduct utilization and disposal. No byproduct utilization technology is likely to be adopted by industry unless it is more cost-effective than landfilling. Therefore, it is extremely important that the utility industry provide guidance to the R&D program. Government agencies and private-sector organizations that may be able to utilize these materials in the conduct of their missions should also provide input. The CBRC will serve as an effective vehicle for acquiring and maintaining guidance from these diverse organizations so that the proper balance in the R&D program is achieved.

  15. Local chromatin structure of heterochromatin regulates repeated DNA stability, nucleolus structure, and genome integrity

    SciTech Connect

    Peng, Jamy C.

    2007-05-05

    Heterochromatin constitutes a significant portion of the genome in higher eukaryotes; approximately 30% in Drosophila and human. Heterochromatin contains a high repeat DNA content and a low density of protein-encoding genes. In contrast, euchromatin is composed mostly of unique sequences and contains the majority of single-copy genes. Genetic and cytological studies demonstrated that heterochromatin exhibits regulatory roles in chromosome organization, centromere function and telomere protection. As an epigenetically regulated structure, heterochromatin formation is not defined by any DNA sequence consensus. Heterochromatin is characterized by its association with nucleosomes containing methylated-lysine 9 of histone H3 (H3K9me), heterochromatin protein 1 (HP1) that binds H3K9me, and Su(var)3-9, which methylates H3K9 and binds HP1. Heterochromatin formation and functions are influenced by HP1, Su(var)3-9, and the RNA interference (RNAi) pathway. My thesis project investigates how heterochromatin formation and function impact nuclear architecture, repeated DNA organization, and genome stability in Drosophila melanogaster. H3K9me-based chromatin reduces extrachromosomal DNA formation; most likely by restricting the access of repair machineries to repeated DNAs. Reducing extrachromosomal ribosomal DNA stabilizes rDNA repeats and the nucleolus structure. H3K9me-based chromatin also inhibits DNA damage in heterochromatin. Cells with compromised heterochromatin structure, due to Su(var)3-9 or dcr-2 (a component of the RNAi pathway) mutations, display severe DNA damage in heterochromatin compared to wild type. In these mutant cells, accumulated DNA damage leads to chromosomal defects such as translocations, defective DNA repair response, and activation of the G2-M DNA repair and mitotic checkpoints that ensure cellular and animal viability. My thesis research suggests that DNA replication, repair, and recombination mechanisms in heterochromatin differ from those in

  16. Computational structural variation discovery in genomes: state of the art and challenges

    NASA Astrophysics Data System (ADS)

    Osipowski, Paweł; Pawełkowicz, Magdalena; Przybecki, Zbigniew

    2014-11-01

    Identifying structural variations is crucial for a comprehensive understanding of genomic differentiation. The massive volume of data generated by current technologies compels researchers to use computational methods for variation discovery in genomes. Here we review state-of-the-art methods and software for structural variation discovery, focusing on their results and outlining the remaining challenges and possible future solutions.

  17. Diversity of Genome Structure in Salmonella enterica Serovar Typhi Populations

    PubMed Central

    Kothapalli, Sushma; Nair, Satheesh; Alokam, Suneetha; Pang, Tikki; Khakhria, Rasik; Woodward, David; Johnson, Wendy; Stocker, Bruce A. D.; Sanderson, Kenneth E.; Liu, Shu-Lin

    2005-01-01

    The genomes of most strains of Salmonella and Escherichia coli are highly conserved. In contrast, all 136 wild-type strains of Salmonella enterica serovar Typhi analyzed by partial digestion with I-CeuI (an endonuclease which cuts within the rrn operons) and pulsed-field gel electrophoresis and by PCR have rearrangements due to homologous recombination between the rrn operons leading to inversions and translocations. Recombination between rrn operons in culture is known to be equally frequent in S. enterica serovar Typhi and S. enterica serovar Typhimurium; thus, the recombinants in S. enterica serovar Typhi, but not those in S. enterica serovar Typhimurium, are able to survive in nature. However, even in S. enterica serovar Typhi the need for genome balance and the need for gene dosage impose limits on rearrangements. Of 100 strains of genome types 1 to 6, 72 were only 25.5 kb off genome balance (the relative lengths of the replichores during bidirectional replication from oriC to the termination of replication [Ter]), while 28 strains were less balanced (41 kb off balance), indicating that the survival of the best-balanced strains was greater. In addition, the need for appropriate gene dosage apparently selected against rearrangements which moved genes from their accustomed distance from oriC. Although rearrangements involving the seven rrn operons are very common in S. enterica serovar Typhi, other duplicated regions, such as the 25 IS200 elements, are very rarely involved in rearrangements. Large deletions and insertions in the genome are uncommon, except for deletions of Salmonella pathogenicity island 7 (usually 134 kb) from fragment I-CeuI-G and 40-kb insertions, possibly a prophage, in fragment I-CeuI-E. The phage types were determined, and the origins of the phage types appeared to be independent of the origins of the genome types. PMID:15805510
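
    The notion of genome balance in the abstract above can be illustrated with a small calculation. The sketch below is not the authors' procedure: the abstract does not say how "kb off balance" was computed, so the imbalance is assumed here to be the absolute difference between the two replichore lengths, and the chromosome length and the oriC and Ter coordinates are hypothetical values chosen only for illustration.

```python
def replichore_lengths(genome_len_kb, ori_kb, ter_kb):
    """Return the two replichore lengths (kb) of a circular chromosome.

    One replichore runs from oriC to Ter in the direction of increasing
    coordinates (wrapping around the circle if needed); the other is
    the remainder of the circle.
    """
    first = (ter_kb - ori_kb) % genome_len_kb
    second = genome_len_kb - first
    return first, second


def imbalance_kb(genome_len_kb, ori_kb, ter_kb):
    """Assumed imbalance measure: absolute difference of replichore lengths."""
    first, second = replichore_lengths(genome_len_kb, ori_kb, ter_kb)
    return abs(first - second)


# Hypothetical coordinates on a 4,800-kb circular chromosome.
balanced = imbalance_kb(4800.0, 100.0, 2500.0)  # Ter directly opposite oriC
skewed = imbalance_kb(4800.0, 0.0, 2000.0)      # Ter shifted toward oriC

print(f"balanced layout: {balanced:.1f} kb off balance")
print(f"skewed layout:   {skewed:.1f} kb off balance")
```

    Under this assumed definition, a rearrangement that moves Ter relative to oriC changes both replichore lengths at once, which is why inversions and translocations between rrn operons can shift a strain away from balance.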

  18. Genome structure of introgressive lines Triticum aestivum/Aegilops sharonensis.

    PubMed

    Antonyuk, M Z; Bodylyova, M V; Ternovskaya, T K

    2009-01-01

    The Triticum aestivum/Aegilops sharonensis lines were examined for the presence of introgressions in their genomes, the number of introgressions, and the homoeologous groups to which they belong. Chromosome associations in M1 of pollen mother cells in hybrids among the lines and between the lines and the recurrent common wheat genotype Avrora were compared with the assessment of the lines for chromosomal biochemical and morphological markers. The 26 lines were divided into six groups, each with a specific genome rearrangement relative to the recurrent genotype. PMID:20458978

  19. Duplex stem-loop-containing quadruplex motifs in the human genome: a combined genomic and structural study

    PubMed Central

    Lim, Kah Wai; Jenjaroenpun, Piroon; Low, Zhen Jie; Khong, Zi Jian; Ng, Yi Siang; Kuznetsov, Vladimir Andreevich; Phan, Anh Tuân

    2015-01-01

    Duplex stem-loops and four-stranded G-quadruplexes have been implicated in (patho)biological processes. Overlap of stem-loop- and quadruplex-forming sequences could give rise to quadruplex–duplex hybrids (QDH), which combine features of both structural forms and could exhibit unique properties. Here, we present a combined genomic and structural study of stem-loop-containing quadruplex sequences (SLQS) in the human genome. Based on a maximum loop length of 20 nt, our survey identified 80 307 SLQS, embedded within 60 172 unique clusters. Our analysis suggested that these should cover close to half of total SLQS in the entire genome. Among these, 48 508 SLQS were strand-specifically located in genic/promoter regions, with the majority of genes displaying a low number of SLQS. Notably, genes containing abundant SLQS clusters were strongly associated with brain tissues. Enrichment analysis of SLQS-positive genes and mapping of SLQS onto transcriptional/mutagenesis hotspots and cancer-associated genes, provided a statistical framework supporting the biological involvements of SLQS. In vitro formation of diverse QDH by selective SLQS hits were successfully verified by nuclear magnetic resonance spectroscopy. Folding topologies of two SLQS were elucidated in detail. We also demonstrated that sequence changes at mutation/single-nucleotide polymorphism loci could affect the structural conformations adopted by SLQS. Thus, our predicted SLQS offer novel insights into the potential involvement of QDH in diverse (patho)biological processes and could represent novel regulatory signals. PMID:25958397

  20. Divergence of the mitochondrial genome structure in the apicomplexan parasites, Babesia and Theileria.

    PubMed

    Hikosaka, Kenji; Watanabe, Yoh-Ichi; Tsuji, Naotoshi; Kita, Kiyoshi; Kishine, Hiroe; Arisue, Nobuko; Palacpac, Nirianne Marie Q; Kawazu, Shin-Ichiro; Sawai, Hiromi; Horii, Toshihiro; Igarashi, Ikuo; Tanabe, Kazuyuki

    2010-05-01

    Mitochondrial (mt) genomes from diverse phylogenetic groups vary considerably in size, structure, and organization. The genus Plasmodium, causative agent of malaria, of the phylum Apicomplexa, has the smallest mt genome in the form of a circular and/or tandemly repeated linear element of 6 kb, encoding only three protein genes (cox1, cox3, and cob). The closely related genera Babesia and Theileria also have small mt genomes (6.6 kb) that are monomeric linear with an organization distinct from Plasmodium. To elucidate the structural divergence and evolution of mt genomes between Babesia/Theileria and Plasmodium, we determined five new sequences from Babesia bigemina, B. caballi, B. gibsoni, Theileria orientalis, and T. equi. Together with previously reported sequences of B. bovis, T. annulata, and T. parva, all eight Babesia and Theileria mt genomes are linear molecules with terminal inverted repeats (TIRs) on both ends containing three protein-coding genes (cox1, cox3, and cob) and six large subunit (LSU) ribosomal RNA (rRNA) gene fragments. The organization and transcriptional direction of protein-coding genes and the rRNA gene fragments were completely conserved in the four Babesia species. In contrast, notable variation occurred in the four Theileria species. Although the genome structures of T. annulata and T. parva were nearly identical to those of Babesia, an inversion in the 3-kb central region was found in T. orientalis. Moreover, the T. equi mt genome is the largest (8.2 kb) and most divergent with unusually long TIR sequences, in which cox3 and two LSU rRNA gene fragments are located. The T. equi mt genome showed little synteny to the other species. These results suggest that the Theileria mt genome is highly diverse with lineage-specific evolution in two Theileria species: genome inversion in T. orientalis and gene-embedded long TIR in T. equi. PMID:20034997

  1. Mosaic Structure Of Foot-And-Mouth Disease Virus Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We report the results of a simple pairwise scanning analysis designed to identify inter-serotype recombination events applied to genome data from 144 isolates of foot-and-mouth disease virus (FMDV) representing all seven serotypes. We identify large numbers of candidate recombinant fragments from a...

  2. Mosaic Structure of Foot-and-Mouth Disease Virus Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We report the results of a simple pairwise scanning analysis designed to identify inter-serotype recombination events applied to genome data from 144 isolates of foot-and-mouth disease virus (FMDV) representing all seven serotypes. We identify large numbers of candidate recombinant fragments from al...

  3. Studying Cattle Genomic Structural Variations in the Green Economy Era

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Transgenic cattle carrying multiple genomic modifications have been produced by serial rounds of somatic cell chromatin transfer (cloning) of sequentially genetically targeted somatic cells. However, cloning efficiency tends to decline with the increase of rounds of cloning. It is possible that mult...

  4. Training set optimization under population structure in genomic selection

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The optimization of the training set (TRS) in genomic selection (GS) has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the Coefficient of D...

  5. Training set optimization under population structure in genomic selection

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The optimization of the training set (TRS) in genomic selection has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the coefficient of determ...

  6. Coverage of whole proteome by structural genomics observed through protein homology modeling database

    PubMed Central

    Yamaguchi, Akihiro; Go, Mitiko

    2006-01-01

    We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE (http://daisy.nagahama-i-bio.ac.jp/Famsbase/), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total and the current FAMSBASE contains protein 3D structures for approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers the whole of an ORF product are rare. When the portion of an ORF covered by a 3D structure is compared across the three kingdoms of life, approximately 60% of the ORFs in archaebacteria and eubacteria have modeled 3D structures covering almost the entire amino acid sequence, whereas the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with modeled 3D structures are calculated, the fraction of modeled 3D structures of soluble proteins has increased by 5% for archaebacteria and by 7% for eubacteria over the last 3 years. Assuming that this rate is maintained and that determination of 3D structures for predicted disordered regions is unattainable, whole soluble protein model structures of prokaryotes without the putative disordered regions will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures we will have at those times are not the 3D structures of the entire proteins encoded in single ORFs, but the 3D structures of separate structural domains. Measuring or predicting the spatial arrangements of structural domains in an ORF will then be the next issue for structural genomics. PMID:17146617
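
    The headline figures in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the abstract; the variable names are illustrative and do not come from FAMSBASE.

```python
# Genome counts per group, as quoted in the abstract.
genomes_by_group = {
    "archaebacterial": 17,
    "eubacterial": 130,
    "eukaryotic": 18,
    "phage": 111,
}

modeled_orfs = 368_724    # ORFs with a modeled 3D structure
predicted_orfs = 734_193  # ORFs predicted across all genomes

total_genomes = sum(genomes_by_group.values())
coverage = modeled_orfs / predicted_orfs

print(f"genomes: {total_genomes}")
print(f"ORFs with a modeled structure: {coverage:.1%}")
```

    The group counts sum to the 276 genomes stated, and the ratio of modeled to predicted ORFs is consistent with the "approximately 50%" quoted in the abstract.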

  7. ICONE: An International Consortium of Neuro Endovascular Centres

    PubMed Central

    Raymond, J.; White, P.; Kallmes, D.F.; Spears, J.; Marotta, T.; Roy, D.; Guilbert, F.; Weill, A.; Nguyen, T.; Molyneux, A.J.; Cloft, H.; Cekirge, S.; Saatci, I.; Bracard, S.; Meder, J.-F.; Moret, J.; Cognard, C.; Qureshi, A.I.; Turk, A.S.; Berenstein, A.

    2008-01-01

    Summary: The proliferation of new endovascular devices and therapeutic strategies calls for a prudent and rational evaluation of their clinical benefit. This evaluation must be done in an effective manner and in collaboration with industry. Such a research initiative requires organisational and methodological support to survive and thrive in a competitive environment. We propose the formation of an international consortium, an academic alliance committed to the pursuit of effective neurovascular therapies. Such a consortium would be dedicated to the design and execution of basic science, device development and clinical trials. The Consortium is owned and operated by its members. Members are international leaders in neurointerventional research and clinical practice. The Consortium brings competency, knowledge, and expertise to industry as well as to its membership across a spectrum of research initiatives such as: expedited review of clinical trials, protocol development, surveys and systematic reviews; laboratory expertise and support for research design and grant applications to public agencies. Once objectives and protocols are approved, the Consortium provides a stable network of centers capable of timely realization of clinical trials or preclinical investigations in an optimal environment. The Consortium is a non-profit organization. The potential revenue generated from client-sponsored financial agreements will be re-directed to the academic and research objectives of the organization. The Consortium wishes to work in concert with industry, to support emerging trends in neurovascular therapeutic development. The Consortium is a realistic endeavour optimally structured to promote excellence through scientific appraisal of our treatments, and to accelerate technical progress while maximizing patients’ safety and welfare. PMID:20557763

  8. Deeper insight into the structure of the anaerobic digestion microbial community; the biogas microbiome database is expanded with 157 new genomes.

    PubMed

    Treu, Laura; Kougias, Panagiotis G; Campanaro, Stefano; Bassani, Ilaria; Angelidaki, Irini

    2016-09-01

    This research aimed to better characterize the biogas microbiome by means of high throughput metagenomic sequencing and to elucidate the core microbial consortium existing in biogas reactors independently from the operational conditions. Assembly of shotgun reads followed by an established binning strategy resulted in the highest, up to now, extraction of microbial genomes involved in biogas producing systems. From the 236 extracted genome bins, it was remarkably found that the vast majority of them could only be characterized at high taxonomic levels. This result confirms that the biogas microbiome is comprised by a consortium of unknown species. A comparative analysis between the genome bins of the current study and those extracted from a previous metagenomic assembly demonstrated a similar phylogenetic distribution of the main taxa. Finally, this analysis led to the identification of a subset of common microbes that could be considered as the core essential group in biogas production. PMID:27243603

  9. Cell-of-Origin-Specific 3D Genome Structure Acquired during Somatic Cell Reprogramming

    PubMed Central

    Krijger, Peter Hugo Lodewijk; Di Stefano, Bruno; de Wit, Elzo; Limone, Francesco; van Oevelen, Chris; de Laat, Wouter; Graf, Thomas

    2016-01-01

    Summary: Forced expression of reprogramming factors can convert somatic cells into induced pluripotent stem cells (iPSCs). Here we studied genome topology dynamics during reprogramming of different somatic cell types with highly distinct genome conformations. We find large-scale topologically associated domain (TAD) repositioning and alterations of tissue-restricted genomic neighborhoods and chromatin loops, effectively erasing the somatic-cell-specific genome structures while establishing an embryonic stem-cell-like 3D genome. Yet, early passage iPSCs carry topological hallmarks that enable recognition of their cell of origin. These hallmarks are not remnants of somatic chromosome topologies. Instead, the distinguishing topological features are acquired during reprogramming, as we also find for cell-of-origin-dependent gene expression patterns. PMID:26971819

  10. Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    PubMed

    Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

    2005-12-01

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health. PMID:16341006

  11. Asymmetric cryo-EM reconstruction of phage MS2 reveals genome structure in situ.

    PubMed

    Koning, Roman I; Gomez-Blanco, Josue; Akopjana, Inara; Vargas, Javier; Kazaks, Andris; Tars, Kaspars; Carazo, José María; Koster, Abraham J

    2016-01-01

    In single-stranded ribonucleic acid (RNA) viruses, virus capsid assembly and genome packaging are intertwined processes. Using cryo-electron microscopy and single particle analysis we determined the asymmetric virion structure of bacteriophage MS2, which includes 178 copies of the coat protein, a single copy of the A-protein and the RNA genome. This reveals that in situ, the viral RNA genome can adopt a defined conformation. The RNA forms a branched network of stem-loops that almost all allocate near the capsid inner surface, while predominantly binding to coat protein dimers that are located in one-half of the capsid. This suggests that genomic RNA is highly involved in genome packaging and virion assembly. PMID:27561669

  12. The emerging biofuel crop Camelina sativa retains a highly undifferentiated hexaploid genome structure.

    PubMed

    Kagale, Sateesh; Koh, Chushin; Nixon, John; Bollina, Venkatesh; Clarke, Wayne E; Tuteja, Reetu; Spillane, Charles; Robinson, Stephen J; Links, Matthew G; Clarke, Carling; Higgins, Erin E; Huebert, Terry; Sharpe, Andrew G; Parkin, Isobel A P

    2014-01-01

    Camelina sativa is an oilseed with desirable agronomic and oil-quality attributes for a viable industrial oil platform crop. Here we generate the first chromosome-scale high-quality reference genome sequence for C. sativa and annotated 89,418 protein-coding genes, representing a whole-genome triplication event relative to the crucifer model Arabidopsis thaliana. C. sativa represents the first crop species to be sequenced from lineage I of the Brassicaceae. The well-preserved hexaploid genome structure of C. sativa surprisingly mirrors those of economically important amphidiploid Brassica crop species from lineage II as well as wheat and cotton. The three genomes of C. sativa show no evidence of fractionation bias and limited expression-level bias, both characteristics commonly associated with polyploid evolution. The highly undifferentiated polyploid genome of C. sativa presents significant consequences for breeding and genetic manipulation of this industrial oil crop. PMID:24759634

  13. The emerging biofuel crop Camelina sativa retains a highly undifferentiated hexaploid genome structure

    PubMed Central

    Kagale, Sateesh; Koh, Chushin; Nixon, John; Bollina, Venkatesh; Clarke, Wayne E.; Tuteja, Reetu; Spillane, Charles; Robinson, Stephen J.; Links, Matthew G.; Clarke, Carling; Higgins, Erin E.; Huebert, Terry; Sharpe, Andrew G.; Parkin, Isobel A. P.

    2014-01-01

    Camelina sativa is an oilseed with desirable agronomic and oil-quality attributes for a viable industrial oil platform crop. Here we generate the first chromosome-scale high-quality reference genome sequence for C. sativa and annotate 89,418 protein-coding genes, representing a whole-genome triplication event relative to the crucifer model Arabidopsis thaliana. C. sativa represents the first crop species to be sequenced from lineage I of the Brassicaceae. The well-preserved hexaploid genome structure of C. sativa surprisingly mirrors those of economically important amphidiploid Brassica crop species from lineage II as well as wheat and cotton. The three genomes of C. sativa show no evidence of fractionation bias and limited expression-level bias, both characteristics commonly associated with polyploid evolution. The highly undifferentiated polyploid genome of C. sativa presents significant consequences for breeding and genetic manipulation of this industrial oil crop. PMID:24759634

  14. Asymmetric cryo-EM reconstruction of phage MS2 reveals genome structure in situ

    PubMed Central

    Koning, Roman I; Gomez-Blanco, Josue; Akopjana, Inara; Vargas, Javier; Kazaks, Andris; Tars, Kaspars; Carazo, José María; Koster, Abraham J.

    2016-01-01

    In single-stranded ribonucleic acid (RNA) viruses, virus capsid assembly and genome packaging are intertwined processes. Using cryo-electron microscopy and single particle analysis we determined the asymmetric virion structure of bacteriophage MS2, which includes 178 copies of the coat protein, a single copy of the A-protein and the RNA genome. This reveals that in situ, the viral RNA genome can adopt a defined conformation. The RNA forms a branched network of stem-loops, almost all of which lie near the capsid inner surface, while predominantly binding to coat protein dimers that are located in one-half of the capsid. This suggests that genomic RNA is highly involved in genome packaging and virion assembly. PMID:27561669

  15. Mining 3D genome structure populations identifies major factors governing the stability of regulatory communities

    PubMed Central

    Dai, Chao; Li, Wenyuan; Tjong, Harianto; Hao, Shengli; Zhou, Yonggang; Li, Qingjiao; Chen, Lin; Zhu, Bing; Alber, Frank; Jasmine Zhou, Xianghong

    2016-01-01

    Three-dimensional (3D) genome structures vary from cell to cell even in an isogenic sample. Unlike protein structures, genome structures are highly plastic, posing a significant challenge for structure-function mapping. Here we report an approach to comprehensively identify 3D chromatin clusters that each occur frequently across a population of genome structures, either deconvoluted from ensemble-averaged Hi-C data or from a collection of single-cell Hi-C data. Applying our method to a population of genome structures (at the macrodomain resolution) of lymphoblastoid cells, we identify an atlas of stable inter-chromosomal chromatin clusters. A large number of these clusters are enriched in binding of specific regulatory factors and are therefore defined as ‘Regulatory Communities’. We reveal two major factors, centromere clustering and transcription factor binding, which significantly stabilize such communities. Finally, we show that the regulatory communities differ substantially from cell to cell, indicating that expression variability could be impacted by genome structures. PMID:27240697

  16. Hemipteran Mitochondrial Genomes: Features, Structures and Implications for Phylogeny

    PubMed Central

    Wang, Yuan; Chen, Jing; Jiang, Li-Yun; Qiao, Ge-Xia

    2015-01-01

    The study of Hemipteran mitochondrial genomes (mitogenomes) began with the Chagas disease vector, Triatoma dimidiata, in 2001. At present, 90 complete Hemipteran mitogenomes have been sequenced and annotated. This review examines the history of Hemipteran mitogenome research and summarizes their main features, including genome organization, nucleotide composition, protein-coding genes, tRNAs and rRNAs, and non-coding regions. Special attention is given to the comparative analysis of repeat regions. Gene rearrangements are an additional data type for a few families, and most mitogenomes are arranged in the same order as that of the proposed ancestral insect. We also discuss and provide insights on the phylogenetic analyses of a variety of taxonomic levels. This review is expected to further expand our understanding of research in this field and serve as a valuable reference resource. PMID:26039239

  17. Nanopatterned structures for biomolecular analysis toward genomic and proteomic applications

    NASA Astrophysics Data System (ADS)

    Chou, Chia-Fu; Gu, Jian; Wei, Qihuo; Liu, Yingjie; Gupta, Ravi; Nishio, Takeyoshi; Zenhausern, Frederic

    2005-01-01

    We report our fabrication of nanoscale devices using electron beam and nanoimprint lithography (NIL). We focus our study on the emerging fields of NIL, nanophotonics and nanobiotechnology and give a few examples of how these nanodevices may be applied toward genomic and proteomic applications for molecular analysis. The examples include reverse NIL-fabricated nanofluidic channels for DNA stretching, nanoscale molecular traps constructed from dielectric constrictions for DNA or protein focusing by dielectrophoresis, multi-layer nanoburger and nanoburger multiplets for optimized surface-plasmon enhanced Raman scattering for protein detection, and biomolecular motor-based nanosystems. The development of advanced nanopatterning techniques promises reliable and high-throughput manufacturing of nanodevices, which could significantly impact the areas of genomics, proteomics, drug discovery and molecular clinical diagnostics.

  18. Unsupervised pattern discovery in human chromatin structure through genomic segmentation.

    PubMed

    Hoffman, Michael M; Buske, Orion J; Wang, Jie; Weng, Zhiping; Bilmes, Jeff A; Noble, William Stafford

    2012-05-01

    We trained Segway, a dynamic Bayesian network method, simultaneously on chromatin data from multiple experiments, including positions of histone modifications, transcription-factor binding and open chromatin, all derived from a human chronic myeloid leukemia cell line. In an unsupervised fashion, we identified patterns associated with transcription start sites, gene ends, enhancers, transcriptional regulator CTCF-binding regions and repressed regions. Software and genome browser tracks are at http://noble.gs.washington.edu/proj/segway/. PMID:22426492

  19. Physical and genetic structure of the maize genome reflects its complex evolutionary history.

    PubMed

    Wei, Fusheng; Coe, Ed; Nelson, William; Bharti, Arvind K; Engler, Fred; Butler, Ed; Kim, HyeRan; Goicoechea, Jose Luis; Chen, Mingsheng; Lee, Seunghee; Fuks, Galina; Sanchez-Villeda, Hector; Schroeder, Steven; Fang, Zhiwei; McMullen, Michael; Davis, Georgia; Bowers, John E; Paterson, Andrew H; Schaeffer, Mary; Gardiner, Jack; Cone, Karen; Messing, Joachim; Soderlund, Carol; Wing, Rod A

    2007-07-01

    Maize (Zea mays L.) is one of the most important cereal crops and a model for the study of genetics, evolution, and domestication. To better understand maize genome organization and to build a framework for genome sequencing, we constructed a sequence-ready fingerprinted contig-based physical map that covers 93.5% of the genome, of which 86.1% is aligned to the genetic map. The fingerprinted contig map contains 25,908 genic markers that enabled us to align nearly 73% of the anchored maize genome to the rice genome. The distribution pattern of expressed sequence tags correlates to that of recombination. In collinear regions, 1 kb in rice corresponds to an average of 3.2 kb in maize, yet maize has a 6-fold genome size expansion. This can be explained by the fact that most rice regions correspond to two regions in maize as a result of its recent polyploid origin. Inversions account for the majority of chromosome structural variations during subsequent maize diploidization. We also find clear evidence of ancient genome duplication predating the divergence of the progenitors of maize and rice. Reconstructing the paleoethnobotany of the maize genome indicates that the progenitors of modern maize contained ten chromosomes. PMID:17658954
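
    The size argument in this abstract can be checked with back-of-the-envelope arithmetic. The sketch below assumes the two factors simply multiply (the per-region expansion in collinear regions times the number of maize regions per rice region); this is an illustrative reading of the abstract, not a calculation taken from the paper.

```latex
\[
\underbrace{3.2}_{\text{kb of maize per kb of rice (collinear regions)}}
\times
\underbrace{2}_{\text{maize regions per rice region}}
\;=\; 6.4 \;\approx\; 6\text{-fold genome size expansion}
\]
```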

  20. Informational structure of two closely related eukaryotic genomes

    NASA Astrophysics Data System (ADS)

    Dehnert, Manuel; Helm, Werner E.; Hütt, Marc-Thorsten

    2006-08-01

    Attempts to identify a species on the basis of its DNA sequence on purely statistical grounds have been formulated for more than a decade. The most prominent of such genome signatures relies on neighborhood correlations (i.e., dinucleotide frequencies) and, consequently, attributes species identification to mechanisms operating on the dinucleotide level (e.g., neighbor-dependent mutations). For the examples of Mus musculus and Rattus norvegicus we analyze short- and intermediate-range statistical correlations in DNA sequences. These correlation profiles are computed for all chromosomes of the two species. We find that with increasing range of correlations the capacity to distinguish between the species on the basis of this correlation profile is getting better and requires ever shorter sequence segments for obtaining a full species separation. This finding suggests that distinctive traits within the sequence are situated beyond the level of few nucleotides. The large-scale statistical patterning of DNA sequences on which such genome signatures are based is thus substantially determined by mobile elements (e.g., transposons and retrotransposons). The study and interspecies comparison of such correlation profiles can, therefore, reveal features of retrotransposition, segmental duplications, and other processes of genome evolution.
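
    The dinucleotide-level genome signature mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used odds-ratio form rho_xy = f_xy / (f_x * f_y); the function names and the simple distance measure are hypothetical, and the code does not reproduce the longer-range correlation profiles analyzed by the authors.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def dinucleotide_signature(seq: str) -> dict[str, float]:
    """Return rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides.

    Assumption: odds-ratio form of the signature; symbols other than
    A, C, G, T are ignored.
    """
    seq = seq.upper()
    # Single-base counts and overlapping dinucleotide counts.
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    signature = {}
    for x, y in product(BASES, repeat=2):
        fx = mono[x] / n_mono if n_mono else 0.0
        fy = mono[y] / n_mono if n_mono else 0.0
        fxy = di[x + y] / n_di if n_di else 0.0
        # Undefined ratios (a base never occurs) are reported as 0.0.
        signature[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return signature

def signature_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Mean absolute difference between two signatures (hypothetical measure)."""
    return sum(abs(a[k] - b[k]) for k in a) / len(a)

if __name__ == "__main__":
    s1 = dinucleotide_signature("ACGTACGTTTGACCA")
    s2 = dinucleotide_signature("GGGCGCGCATATATA")
    print(round(signature_distance(s1, s2), 3))
```

    Two sequences from the same species would be expected to give a small distance and sequences from different species a larger one; how well that separation works at a given segment length is the question the abstract addresses.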

  1. The Ocean Sampling Day Consortium

    SciTech Connect

    Kopf, Anna; Bicak, Mesude; Kottmann, Renzo; Schnetzer, Julia; Kostadinov, Ivaylo; Lehmann, Katja; Fernandez-Guerra, Antonio; Jeanthon, Christian; Rahav, Eyal; Ullrich, Matthias; Wichels, Antje; Gerdts, Gunnar; Polymenakou, Paraskevi; Kotoulas, Giorgos; Siam, Rania; Abdallah, Rehab Z.; Sonnenschein, Eva C.; Cariou, Thierry; O’Gara, Fergal; Jackson, Stephen; Orlic, Sandi; Steinke, Michael; Busch, Julia; Duarte, Bernardo; Caçador, Isabel; Canning-Clode, João; Bobrova, Oleksandra; Marteinsson, Viggo; Reynisson, Eyjolfur; Loureiro, Clara Magalhães; Luna, Gian Marco; Quero, Grazia Marina; Löscher, Carolin R.; Kremp, Anke; DeLorenzo, Marie E.; Øvreås, Lise; Tolman, Jennifer; LaRoche, Julie; Penna, Antonella; Frischer, Marc; Davis, Timothy; Katherine, Barker; Meyer, Christopher P.; Ramos, Sandra; Magalhães, Catarina; Jude-Lemeilleur, Florence; Aguirre-Macedo, Ma Leopoldina; Wang, Shiao; Poulton, Nicole; Jones, Scott; Collin, Rachel; Fuhrman, Jed A.; Conan, Pascal; Alonso, Cecilia; Stambler, Noga; Goodwin, Kelly; Yakimov, Michael M.; Baltar, Federico; Bodrossy, Levente; Van De Kamp, Jodie; Frampton, Dion M. F.; Ostrowski, Martin; Van Ruth, Paul; Malthouse, Paul; Claus, Simon; Deneudt, Klaas; Mortelmans, Jonas; Pitois, Sophie; Wallom, David; Salter, Ian; Costa, Rodrigo; Schroeder, Declan C.; Kandil, Mahrous M.; Amaral, Valentina; Biancalana, Florencia; Santana, Rafael; Pedrotti, Maria Luiza; Yoshida, Takashi; Ogata, Hiroyuki; Ingleton, Tim; Munnik, Kate; Rodriguez-Ezpeleta, Naiara; Berteaux-Lecellier, Veronique; Wecker, Patricia; Cancio, Ibon; Vaulot, Daniel; Bienhold, Christina; Ghazal, Hassan; Chaouni, Bouchra; Essayeh, Soumya; Ettamimi, Sara; Zaid, El Houcine; Boukhatem, Noureddine; Bouali, Abderrahim; Chahboune, Rajaa; Barrijal, Said; Timinouni, Mohammed; El Otmani, Fatima; Bennani, Mohamed; Mea, Marianna; Todorova, Nadezhda; Karamfilov, Ventzislav; ten Hoopen, Petra; Cochrane, Guy; L’Haridon, Stephane; Bizsel, Kemal Can; Vezzi, Alessandro; Lauro, Federico M.; Martin, Patrick; Jensen, Rachelle M.; Hinks, Jamie; Gebbels, Susan; Rosselli, Riccardo; De Pascale, Fabio; Schiavon, Riccardo; dos Santos, Antonina; Villar, Emilie; Pesant, Stéphane; Cataletto, Bruno; Malfatti, Francesca; Edirisinghe, Ranjith; Silveira, Jorge A. Herrera; Barbier, Michele; Turk, Valentina; Tinta, Tinkara; Fuller, Wayne J.; Salihoglu, Ilkay; Serakinci, Nedime; Ergoren, Mahmut Cerkez; Bresnan, Eileen; Iriberri, Juan; Nyhus, Paul Anders Fronth; Bente, Edvardsen; Karlsen, Hans Erik; Golyshin, Peter N.; Gasol, Josep M.; Moncheva, Snejana; Dzhembekova, Nina; Johnson, Zackary; Sinigalliano, Christopher David; Gidley, Maribeth Louise; Zingone, Adriana; Danovaro, Roberto; Tsiamis, George; Clark, Melody S.; Costa, Ana Cristina; El Bour, Monia; Martins, Ana M.; Collins, R. Eric; Ducluzeau, Anne-Lise; Martinez, Jonathan; Costello, Mark J.; Amaral-Zettler, Linda A.; Gilbert, Jack A.; Davies, Neil; Field, Dawn; Glöckner, Frank Oliver

    2015-06-19

    In this study, Ocean Sampling Day was initiated by the EU-funded Micro B3 (Marine Microbial Biodiversity, Bioinformatics, Biotechnology) project to obtain a snapshot of the marine microbial biodiversity and function of the world’s oceans. It is a simultaneous global mega-sequencing campaign aiming to generate the largest standardized microbial data set in a single day. This will be achievable only through the coordinated efforts of an Ocean Sampling Day Consortium, supportive partnerships and networks between sites. This commentary outlines the establishment, function and aims of the Consortium and describes our vision for a sustainable study of marine microbial communities and their embedded functional traits.

  2. The ocean sampling day consortium.

    PubMed

    Kopf, Anna; Bicak, Mesude; Kottmann, Renzo; Schnetzer, Julia; Kostadinov, Ivaylo; Lehmann, Katja; Fernandez-Guerra, Antonio; Jeanthon, Christian; Rahav, Eyal; Ullrich, Matthias; Wichels, Antje; Gerdts, Gunnar; Polymenakou, Paraskevi; Kotoulas, Giorgos; Siam, Rania; Abdallah, Rehab Z; Sonnenschein, Eva C; Cariou, Thierry; O'Gara, Fergal; Jackson, Stephen; Orlic, Sandi; Steinke, Michael; Busch, Julia; Duarte, Bernardo; Caçador, Isabel; Canning-Clode, João; Bobrova, Oleksandra; Marteinsson, Viggo; Reynisson, Eyjolfur; Loureiro, Clara Magalhães; Luna, Gian Marco; Quero, Grazia Marina; Löscher, Carolin R; Kremp, Anke; DeLorenzo, Marie E; Øvreås, Lise; Tolman, Jennifer; LaRoche, Julie; Penna, Antonella; Frischer, Marc; Davis, Timothy; Katherine, Barker; Meyer, Christopher P; Ramos, Sandra; Magalhães, Catarina; Jude-Lemeilleur, Florence; Aguirre-Macedo, Ma Leopoldina; Wang, Shiao; Poulton, Nicole; Jones, Scott; Collin, Rachel; Fuhrman, Jed A; Conan, Pascal; Alonso, Cecilia; Stambler, Noga; Goodwin, Kelly; Yakimov, Michael M; Baltar, Federico; Bodrossy, Levente; Van De Kamp, Jodie; Frampton, Dion Mf; Ostrowski, Martin; Van Ruth, Paul; Malthouse, Paul; Claus, Simon; Deneudt, Klaas; Mortelmans, Jonas; Pitois, Sophie; Wallom, David; Salter, Ian; Costa, Rodrigo; Schroeder, Declan C; Kandil, Mahrous M; Amaral, Valentina; Biancalana, Florencia; Santana, Rafael; Pedrotti, Maria Luiza; Yoshida, Takashi; Ogata, Hiroyuki; Ingleton, Tim; Munnik, Kate; Rodriguez-Ezpeleta, Naiara; Berteaux-Lecellier, Veronique; Wecker, Patricia; Cancio, Ibon; Vaulot, Daniel; Bienhold, Christina; Ghazal, Hassan; Chaouni, Bouchra; Essayeh, Soumya; Ettamimi, Sara; Zaid, El Houcine; Boukhatem, Noureddine; Bouali, Abderrahim; Chahboune, Rajaa; Barrijal, Said; Timinouni, Mohammed; El Otmani, Fatima; Bennani, Mohamed; Mea, Marianna; Todorova, Nadezhda; Karamfilov, Ventzislav; Ten Hoopen, Petra; Cochrane, Guy; L'Haridon, Stephane; Bizsel, Kemal Can; Vezzi, Alessandro; Lauro, Federico M; Martin, Patrick; 
Jensen, Rachelle M; Hinks, Jamie; Gebbels, Susan; Rosselli, Riccardo; De Pascale, Fabio; Schiavon, Riccardo; Dos Santos, Antonina; Villar, Emilie; Pesant, Stéphane; Cataletto, Bruno; Malfatti, Francesca; Edirisinghe, Ranjith; Silveira, Jorge A Herrera; Barbier, Michele; Turk, Valentina; Tinta, Tinkara; Fuller, Wayne J; Salihoglu, Ilkay; Serakinci, Nedime; Ergoren, Mahmut Cerkez; Bresnan, Eileen; Iriberri, Juan; Nyhus, Paul Anders Fronth; Bente, Edvardsen; Karlsen, Hans Erik; Golyshin, Peter N; Gasol, Josep M; Moncheva, Snejana; Dzhembekova, Nina; Johnson, Zackary; Sinigalliano, Christopher David; Gidley, Maribeth Louise; Zingone, Adriana; Danovaro, Roberto; Tsiamis, George; Clark, Melody S; Costa, Ana Cristina; El Bour, Monia; Martins, Ana M; Collins, R Eric; Ducluzeau, Anne-Lise; Martinez, Jonathan; Costello, Mark J; Amaral-Zettler, Linda A; Gilbert, Jack A; Davies, Neil; Field, Dawn; Glöckner, Frank Oliver

    2015-01-01

    Ocean Sampling Day was initiated by the EU-funded Micro B3 (Marine Microbial Biodiversity, Bioinformatics, Biotechnology) project to obtain a snapshot of the marine microbial biodiversity and function of the world's oceans. It is a simultaneous global mega-sequencing campaign aiming to generate the largest standardized microbial data set in a single day. This will be achievable only through the coordinated efforts of an Ocean Sampling Day Consortium, supportive partnerships and networks between sites. This commentary outlines the establishment, function and aims of the Consortium and describes our vision for a sustainable study of marine microbial communities and their embedded functional traits. PMID:26097697

  3. Three-dimensional structure of a viral genome-delivery portal vertex

    PubMed Central

    Olia, Adam S.; Prevelige, Peter E.; Johnson, John E.; Cingolani, Gino

    2011-01-01

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25 Å resolution structure of the portal protein core bound to twelve copies of gp4 reveals a ~1.1 MDa assembly formed by 24 proteins. Unexpectedly, a lower resolution structure of the full length portal protein unveils the unique topology of the C-terminal domain, which forms a ~200 Å long, α-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell. PMID:21499245

  4. Three-dimensional Structure of a Viral Genome-delivery Portal Vertex

    SciTech Connect

    A Olia; P Prevelige Jr.; J Johnson; G Cingolani

    2011-12-31

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here, we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25-Å-resolution structure of the portal-protein core bound to 12 copies of gene product 4 (gp4) reveals a ~1.1-MDa assembly formed by 24 proteins. Unexpectedly, a lower-resolution structure of the full-length portal protein unveils the unique topology of the C-terminal domain, which forms a ~200-Å-long α-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell.

  5. Detection of Genomic Structural Variants from Next-Generation Sequencing Data

    PubMed Central

    Tattini, Lorenzo; D’Aurizio, Romina; Magi, Alberto

    2015-01-01

    Structural variants are genomic rearrangements larger than 50 bp accounting for around 1% of the variation among human genomes. They impact on phenotypic diversity and play a role in various diseases including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next-generation sequencing data presents several challenges and a number of approaches have been proposed in the literature. In this mini review, we describe and summarize the latest tools – and their underlying algorithms – designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom captures, and amplicon sequencing data, pointing out the major advantages/drawbacks. We also report a summary of the most recent applications of third-generation sequencing platforms. This assessment provides a guided indication – with particular emphasis on human genetics and copy number variants – for researchers involved in the investigation of these genomic events. PMID:26161383

  6. Visualizing the global secondary structure of a viral RNA genome with cryo-electron microscopy

    PubMed Central

    Garmann, Rees F.; Gopal, Ajaykumar; Athavale, Shreyas S.; Knobler, Charles M.; Gelbart, William M.; Harvey, Stephen C.

    2015-01-01

    The lifecycle, and therefore the virulence, of single-stranded (ss)-RNA viruses is regulated not only by their particular protein gene products, but also by the secondary and tertiary structure of their genomes. The secondary structure of the entire genomic RNA of satellite tobacco mosaic virus (STMV) was recently determined by selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE). The SHAPE analysis suggested a single highly extended secondary structure with much less branching than occurs in the ensemble of structures predicted by purely thermodynamic algorithms. Here we examine the solution-equilibrated STMV genome by direct visualization with cryo-electron microscopy (cryo-EM), using an RNA of similar length transcribed from the yeast genome as a control. The cryo-EM data reveal an ensemble of branching patterns that are collectively consistent with the SHAPE-derived secondary structure model. Thus, our results both elucidate the statistical nature of the secondary structure of large ss-RNAs and give visual support for modern RNA structure determination methods. Additionally, this work introduces cryo-EM as a means to distinguish between competing secondary structure models if the models differ significantly in terms of the number and/or length of branches. Furthermore, with the latest advances in cryo-EM technology, we suggest the possibility of developing methods that incorporate restraints from cryo-EM into the next generation of algorithms for the determination of RNA secondary and tertiary structures. PMID:25752599

  7. The complete genome sequence and genome structure of passion fruit mosaic virus.

    PubMed

    Song, Yeon Sook; Ryu, Ki Hyun

    2011-06-01

    In this study, we determined the complete sequence of the genomic RNA of a Florida isolate of maracuja mosaic virus (MarMV-FL) and compared it to that of a Peru isolate of the virus (MarMV-P) and those of other known tobamoviruses. Complete sequence analysis revealed that the isolate should be considered a member of a new species and named passion fruit mosaic virus (PafMV). The genomic RNA of PafMV consists of 6,791 nucleotides and encodes four open reading frames (ORFs) coding for proteins of 125 kDa (1,101 aa), 184 kDa (1,612 aa), 34 kDa (311 aa) and 18 kDa (164 aa) in consecutive order from the 5' to the 3' end. The sequence homologies of the four ORFs of PafMV were from 78.8% to 81.6% to those of MarMV-P at the amino acid level. The sequence homologies of the four ORFs of PafMV ranged from 36.0% to 77.9% and from 21.7% to 81.6% to those of other tobamoviruses, at the nucleotide and amino acid level, respectively. Phylogenetic analysis revealed that these PafMV-encoded proteins are closely related to those of MarMV-P. In conclusion, the results indicate that PafMV and MarMV-P belong to different species within the genus Tobamovirus. PMID:21547441

  8. Structure and Genome Organization of Cherry Virus A (Capillovirus, Betaflexiviridae) from China Using Small RNA Sequencing

    PubMed Central

    Wang, Jiawei; Zhai, Ying; Liu, Weizhen; Dhingra, Amit

    2016-01-01

    Cherry virus A (CVA) (Capillovirus, Betaflexiviridae) is widely present in cherry-growing areas. We obtained the complete genome of a CVA isolate (CVA-TA) using small RNA deep sequencing, followed by overlapping reverse transcription-PCR (RT-PCR) and rapid amplification of cDNA ends (RACE). The newly identified 5′-untranslated region (5′-UTR) from CVA-TA may form additional hairpin and loop structures to stabilize the CVA genome. PMID:27174277

  9. Structure and Genome Organization of Cherry Virus A (Capillovirus, Betaflexiviridae) from China Using Small RNA Sequencing.

    PubMed

    Wang, Jiawei; Zhai, Ying; Liu, Weizhen; Dhingra, Amit; Pappu, Hanu R; Liu, Qingzhong

    2016-01-01

    Cherry virus A (CVA) (Capillovirus, Betaflexiviridae) is widely present in cherry-growing areas. We obtained the complete genome of a CVA isolate (CVA-TA) using small RNA deep sequencing, followed by overlapping reverse transcription-PCR (RT-PCR) and rapid amplification of cDNA ends (RACE). The newly identified 5'-untranslated region (5'-UTR) from CVA-TA may form additional hairpin and loop structures to stabilize the CVA genome. PMID:27174277

  10. Consortium for materials development in space

    NASA Technical Reports Server (NTRS)

    1993-01-01

    During fiscal 1993, the Consortium for Materials Development in Space (CMDS) maintained the organizational structure and project orientation established in prior years. The commercial objectives are improved materials, biomedical applications, and infrastructure and support hardware. Projects include nonlinear optical materials; space materials (specifically polymer foam/films, atomic oxygen and high temperature superconductors); alloyed and blended materials: sintered and alloyed materials; polymer and carbonate blends; electrodeposition; organic separation; materials dispersion and biodynamics; space carriers: Consort, COMET support, Spacehab utilization; and flight services: accelerometers, CMIX, USEC, ORSEP, and Space Experiment Facility (SEF).

  11. Midwest Superconductivity Consortium: 1994 Progress report

    SciTech Connect

    Not Available

    1995-01-01

    The mission of the Midwest Superconductivity Consortium, MISCON, is to advance the science and understanding of high-Tc superconductivity. During the past year, 27 projects produced over 123 talks and 139 publications. Group activities and interactions involved 2 MISCON group meetings (held in August and January), with the second MISCON Workshop held in August; 13 external speakers; 79 collaborations (with universities, industry, Federal laboratories, and foreign research centers); and 48 exchanges of samples and/or measurements. Research achievements this past year focused on understanding the effects of processing phenomena on structure-property interrelationships and the fundamental nature of transport properties in high-temperature superconductors.

  12. Human insulin genome sequence map, biochemical structure of insulin for recombinant DNA insulin.

    PubMed

    Chakraborty, Chiranjib; Mungantiwar, Ashish A

    2003-08-01

    Insulin is an essential molecule for type I diabetes that is marketed by very few companies. It was the first molecule made by recombinant technology, but the commercialization process is very difficult. Knowledge of the biochemical structure of insulin and of the human insulin genome sequence map is pivotal to large-scale manufacturing of recombinant DNA insulin. This paper reviews the human insulin genome sequence map, the amino acid sequence of porcine insulin, the crystal structure of porcine insulin, the insulin monomer, aggregation surfaces of insulin, conformational variation in the insulin monomer, and insulin X-ray structures for recombinant DNA technology in the synthesis of human insulin in Escherichia coli. PMID:12769691

  13. Terminal structures of West Nile virus genomic RNA and their interactions with viral NS5 protein

    SciTech Connect

    Dong Hongping; Zhang Bo; Shi Peiyong

    2008-11-10

    Genome cyclization is essential for flavivirus replication. We used RNases to probe the structures formed by the 5'-terminal 190 nucleotides and the 3'-terminal 111 nucleotides of the West Nile virus (WNV) genomic RNA. When analyzed individually, the two RNAs adopt stem-loop structures as predicted by the thermodynamic-folding program. However, when mixed together, the two RNAs form a duplex that is mediated through base-pairings of two sets of RNA elements (5'CS/3'CSI and 5'UAR/3'UAR). Formation of the RNA duplex facilitates a conformational change that leaves the 3'-terminal nucleotides of the genome (positions -8 to -16) single-stranded. Viral NS5 binds specifically to the 5'-terminal stem-loop (SL1) of the genomic RNA. The 5'SL1 RNA structure is essential for WNV replication. The study has provided further evidence to suggest that flavivirus genome cyclization and NS5/5'SL1 RNA interaction facilitate NS5 binding to the 3' end of the genome for the initiation of viral minus-strand RNA synthesis.

  14. Integrating Sequencing Technologies in Personal Genomics: Optimal Low Cost Reconstruction of Structural Variants

    PubMed Central

    Du, Jiang; Bjornson, Robert D.; Zhang, Zhengdong D.; Kong, Yong; Snyder, Michael; Gerstein, Mark B.

    2009-01-01

    The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of human genomes at

  15. CFD Parametric Study of Consortium Impeller

    NASA Technical Reports Server (NTRS)

    Cheng, Gary C.; Chen, Y. S.; Garcia, Roberto; Williams, Robert W.

    1993-01-01

    Due to the complexity of blade geometries, the TANDEM blade configurations were analyzed with the multi-zone grid structure. Both the 7.5 deg- and the 22.5 deg-clocking TANDEM blade cases utilized an 80K mesh system. The numerical results for the two TANDEM blade modifications indicate that the efficiency and the head are worse than those of the baseline case due to larger flow distortion. The gap between the TANDEM blade and the full blade allows the flow to pass through and heavily load the pressure side of the partial blade, such that flow reversal occurs near the suction side of the splitter. The flow split at the exit of the impeller blades is very non-uniform for the TANDEM blade cases, and this will greatly increase the side load on the diffuser. Therefore, the TANDEM blade modification in the present CFD analysis does not improve the performance of the consortium impeller.

  16. CFD parametric study of consortium impeller

    NASA Astrophysics Data System (ADS)

    Cheng, Gary C.; Chen, Y. S.; Garcia, Roberto; Williams, Robert W.

    1993-07-01

    Due to the complexity of the blade geometries, the TANDEM blade configurations were analyzed with a multi-zone grid structure. Both the 7.5 deg- and the 22.5 deg-clocking TANDEM blade cases used an 80K mesh system. The numerical results for the two TANDEM blade modifications indicate that the efficiency and the head are worse than those of the baseline case because of larger flow distortion. The gap between the TANDEM blade and the full blade allows the flow to pass through and heavily load the pressure side of the partial blade, such that flow reversal occurs near the suction side of the splitter. The flow split at the exit of the impeller blades is very non-uniform for the TANDEM blade cases, and this will induce a large side load on the diffuser. Therefore, the TANDEM blade modification in the present CFD analysis does not improve the performance of the consortium impeller.

  17. A Bivariate Whole Genome Linkage Study Identified Genomic Regions Influencing Both BMD and Bone Structure

    PubMed Central

    Liu, Xiao-Gang; Liu, Yong-Jun; Liu, Jianfeng; Pei, Yufang; Xiong, Dong-Hai; Shen, Hui; Deng, Hong-Yi; Papasian, Christopher J; Drees, Betty M; Hamilton, James J; Recker, Robert R; Deng, Hong-Wen

    2008-01-01

    Areal BMD (aBMD) and areal bone size (ABS) are biologically correlated traits and are each important determinants of bone strength and risk of fractures. Studies showed that aBMD and ABS are genetically correlated, indicating that they may share some common genetic factors, which, however, are largely unknown. To study the genetic factors influencing both aBMD and ABS, bivariate whole genome linkage analyses were conducted for aBMD-ABS at the femoral neck (FN), lumbar spine (LS), and ultradistal (UD)-forearm in a large sample of 451 white pedigrees made up of 4498 individuals. We detected significant linkage on chromosome Xq27 (LOD = 4.89) for LS aBMD-ABS. In addition, we detected suggestive linkages at 20q11 (LOD = 3.65) and Xp11 (LOD = 2.96) for FN aBMD-ABS; at 12p11 (LOD = 3.39) and 17q21 (LOD = 2.94) for LS aBMD-ABS; and at 5q23 (LOD = 3.54), 7p15 (LOD = 3.45), Xq27 (LOD = 2.93), and 12p11 (LOD = 2.92) for UD-forearm aBMD-ABS. Subsequent discrimination analyses indicated that quantitative trait loci (QTLs) at 12p11 and 17q21 may have pleiotropic effects on aBMD and ABS. This study identified several genomic regions that may contain QTLs important for both aBMD and ABS. Further endeavors are necessary to follow these regions to eventually pinpoint the genetic variants affecting bone strength and risk of fractures. PMID:18597637

  18. Federal Laboratory Consortium Resource Directory.

    ERIC Educational Resources Information Center

    Federal Laboratory Consortium, Washington, DC.

    Intended to assist both the private and public sectors to locate and utilize technological expertise within the federal laboratories, this directory lists the federal laboratories and centers that are affiliated with the Federal Laboratory Consortium and describes the area of technological expertise they can make available to solve problems. This…

  19. Federal Laboratory Consortium Resource Directory.

    ERIC Educational Resources Information Center

    Federal Laboratory Consortium, Washington, DC.

    Designed to bridge the communication gap between the Federal Laboratory Consortium (FLC) and public and private sectors of the country, this directory has been prepared as a compilation of scientific and technical research and development activities at federal laboratories, which are directing technology transfer efforts toward increasing the use…

  20. Brain Tumor Epidemiology Consortium (BTEC)

    Cancer.gov

    The Brain Tumor Epidemiology Consortium is an open scientific forum organized to foster the development of multi-center, international and inter-disciplinary collaborations that will lead to a better understanding of the etiology, outcomes, and prevention of brain tumors.

  1. An integrated map of structural variation in 2,504 human genomes.

    PubMed

    Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Hsi-Yang Fritz, Markus; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Paolo Casale, Francesco; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Jasmine Mu, Xinmeng; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki; Muzny, Donna M; Nelson, Bradley J; Noor, Amina; Parrish, Nicholas F; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A; Untergasser, Andreas; Walker, Jerilyn A; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A; McCarroll, Steven A; Mills, Ryan E; Gerstein, Mark B; Bashir, Ali; Stegle, Oliver; Devine, Scott E; Lee, Charles; Eichler, Evan E; Korbel, Jan O

    2015-10-01

    Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association. PMID:26432246

  2. Transcriptional regulatory network shapes the genome structure of Saccharomyces cerevisiae

    PubMed Central

    Li, Songling; Heermann, Dieter W.

    2013-01-01

    Among cellular processes, gene transcription is central. More and more evidence is mounting that transcription is tightly connected with the spatial organization of the chromosomes. Spatial proximity of genes sharing transcriptional machinery is one of the consequences of this organization. Motivated by information on the physical relationship among genes identified via chromosomal conformation capture methods, we complement the spatial organization with the idea that genes under similar transcription factor control, but possibly scattered throughout the genome, might be in physical proximity to facilitate the access of their commonly used transcription factors. Unlike the transcription factory model, “interacting” genes in our “Gene Proximity Model” are not necessarily immediate physical neighbors but are in spatial proximity. Considering the stochastic nature of TF-promoter binding, this local condensation mechanism could serve as a tie to recruit co-regulated genes to guarantee the swiftness of biological reactions. We tested this idea with a simple eukaryotic organism, Saccharomyces cerevisiae. Chromosomal interaction patterns and folding behavior generated by our model reconstruct those obtained from experiments. We show that the transcriptional regulatory network has a close linkage with the genome organization in budding yeast, which is fundamental and instrumental to later studies on other more complex eukaryotes. PMID:23674068

  3. Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

    SciTech Connect

    Condon, Bradford J.; Leng, Yueqiang; Wu, Dongliang; Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinIzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

    2013-01-24

    The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP-encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence.

  4. Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

    PubMed Central

    Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinIzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

    2013-01-01

    The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP–encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence. PMID:23357949

  5. Long-Range Order and Fractality in the Structure and Organization of Eukaryotic Genomes

    NASA Astrophysics Data System (ADS)

    Polychronopoulos, Dimitris; Tsiagkas, Giannis; Athanasopoulou, Labrini; Sellis, Diamantis; Almirantis, Yannis

    2014-12-01

    The late Professor J.S. Nicolis always emphasized, both in his writings and in presentations and discussions with students and friends, the relevance of a dynamical systems approach to biology. In particular, viewing the genome as a "biological text" captures the dynamical character of both the evolution and function of the organisms in the form of correlations indicating the presence of a long-range order. This genomic structure can be expressed in forms reminiscent of natural languages and several temporal and spatial traces left by the functioning of dynamical systems: Zipf laws, self-similarity and fractality. Here we review several works of our group and recent unpublished results, focusing on the chromosomal distribution of biologically active genomic components: genes and protein-coding segments, CpG islands, transposable elements belonging to all major classes and several types of conserved non-coding genomic elements. We report the systematic appearance of power-laws in the size distribution of the distances between elements belonging to each of these types of functional genomic elements. Moreover, fractality is also found in several cases, using box-counting and entropic scaling. We present here, for the first time in a unified way, an aggregative model of the genomic dynamics which can explain the observed patterns on the grounds of known phenomena accompanying genome evolution. Our results comply with recent findings about a "fractal globule" geometry of chromatin in the eukaryotic nucleus.
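
    As a rough illustration of the box-counting idea mentioned in this abstract, the sketch below estimates a box-counting dimension for a set of one-dimensional positions (for example, start coordinates of genomic elements along a chromosome). It is not the authors' implementation; the function name, the choice of box sizes, and the plain least-squares fit are all assumptions made for the example.

```python
import math


def box_counting_dimension(positions, box_sizes):
    """Estimate a box-counting dimension for integer positions on a line.

    Hypothetical helper for illustration only. For each box size s, count
    the boxes that contain at least one position, then fit the slope of
    log(count) against log(1/s) by ordinary least squares.
    Requires at least two distinct box sizes and a non-empty position list.
    """
    xs, ys = [], []
    for size in box_sizes:
        occupied = {p // size for p in positions}  # indices of non-empty boxes
        xs.append(math.log(1.0 / size))
        ys.append(math.log(len(occupied)))

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx  # slope of the log-log fit


if __name__ == "__main__":
    # A completely filled interval behaves like a one-dimensional object.
    filled = list(range(1024))
    print(box_counting_dimension(filled, [1, 2, 4, 8, 16, 32]))
```

    A completely filled interval gives a slope close to 1 and a single point gives 0; a set whose estimate falls stably between those values over a range of box sizes is the kind of pattern the abstract describes as fractal.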

  6. Insights into the genome structure and copy-number variation of Eimeria tenella

    PubMed Central

    2012-01-01

    of the genome of E. tenella from shotgun data, and to help reveal its overall structure. A preliminary assessment of copy-number variation (extra or missing copies of genomic segments) between strains of E. tenella was also carried out. The emerging picture is of a very unusual genome architecture displaying inter-strain copy-number variation. We suggest that these features may be related to the known ability of this parasite to rapidly develop drug resistance. PMID:22889016

  7. Structure and Genome Release Mechanism of the Human Cardiovirus Saffold Virus 3

    PubMed Central

    Mullapudi, Edukondalu; Nováček, Jiří; Pálková, Lenka; Kulich, Pavel; Lindberg, A. Michael; van Kuppeveld, Frank J. M.

    2016-01-01

    ABSTRACT In order to initiate an infection, viruses need to deliver their genomes into cells. This involves uncoating the genome and transporting it to the cytoplasm. The process of genome delivery is not well understood for nonenveloped viruses. We address this gap in our current knowledge by studying the uncoating of the nonenveloped human cardiovirus Saffold virus 3 (SAFV-3) of the family Picornaviridae. SAFVs cause diseases ranging from gastrointestinal disorders to meningitis. We present a structure of a native SAFV-3 virion determined to 2.5 Å by X-ray crystallography and an 11-Å-resolution cryo-electron microscopy reconstruction of an “altered” particle that is primed for genome release. The altered particles are expanded relative to the native virus and contain pores in the capsid that might serve as channels for the release of VP4 subunits, N termini of VP1, and the RNA genome. Unlike in the related enteroviruses, pores in SAFV-3 are located roughly between the icosahedral 3- and 5-fold axes at an interface formed by two VP1 and one VP3 subunit. Furthermore, in native conditions many cardioviruses contain a disulfide bond formed by cysteines that are separated by just one residue. The disulfide bond is located in a surface loop of VP3. We determined the structure of the SAFV-3 virion in which the disulfide bonds are reduced. Disruption of the bond had minimal effect on the structure of the loop, but it increased the stability and decreased the infectivity of the virus. Therefore, compounds specifically disrupting or binding to the disulfide bond might limit SAFV infection. IMPORTANCE A capsid assembled from viral proteins protects the virus genome during transmission from one cell to another. However, when a virus enters a cell the virus genome has to be released from the capsid in order to initiate infection. This process is not well understood for nonenveloped viruses. We address this gap in our current knowledge by studying the genome release of

  8. Structural Alterations from Multiple Displacement Amplification of a Human Genome Revealed by Mate-Pair Sequencing

    PubMed Central

    Jiao, Xiang; Rosenlund, Magnus; Hooper, Sean D.; Tellgren-Roth, Christian; He, Liqun; Fu, Yutao; Mangion, Jonathan; Sjöblom, Tobias

    2011-01-01

    Comprehensive identification of the acquired mutations that cause common cancers will require genomic analyses of large sets of tumor samples. Typically, the tissue material available from tumor specimens is limited, which creates a demand for accurate template amplification. We therefore evaluated whether phi29-mediated whole genome amplification introduces false positive structural mutations by massive mate-pair sequencing of a normal human genome before and after such amplification. Multiple displacement amplification led to a decrease in clone coverage and an increase by two orders of magnitude in the prevalence of inversions, but did not increase the prevalence of translocations. While multiple strand displacement amplification may find uses in translocation analyses, it is likely that alternative amplification strategies need to be developed to meet the demands of cancer genomics. PMID:21799804

  9. Genomic and structural organization of Drosophila melanogaster G elements.

    PubMed Central

    Di Nocera, P P; Graziani, F; Lavorgna, G

    1986-01-01

    The properties and the genomic organization of G elements, a moderately repeated DNA family of D. melanogaster, are reported. G elements lack terminal repeats, generate target site duplications at the point of insertion and exhibit at one end a stretch of A residues of variable length. In a large number of recombinant clones analyzed G elements occur in tandem arrays, interspersed with specific ribosomal DNA (rDNA) segments. This arrangement results from the insertion of members of the G family within the nontranscribed spacer (NTS) of rDNA units. Similarity of the site of integration of G elements to that of ribosomal DNA insertions suggests that distinct DNA sequences might have been inserted into rDNA through a partly common pathway. Images PMID:3003691

  10. VI. Genome structure and cognitive map of Williams syndrome.

    PubMed

    Korenberg, J R; Chen, X N; Hirota, H; Lai, Z; Bellugi, U; Burian, D; Roe, B; Matsuoka, R

    2000-01-01

    Williams syndrome (WMS) is a most compelling model of human cognition, of human genome organization, and of evolution. Due to a deletion in chromosome band 7q11.23, subjects have cardiovascular, connective tissue, and neurodevelopmental deficits. Given the striking peaks and valleys in neurocognition including deficits in visual-spatial and global processing, preserved language and face processing, hypersociability, and heightened affect, the goal of this work has been to identify the genes that are responsible, the cause of the deletion, and its origin in primate evolution. To do this, we have generated an integrated physical, genetic, and transcriptional map of the WMS and flanking regions using multicolor metaphase and interphase fluorescence in situ hybridization (FISH) of bacterial artificial chromosomes (BACs) and P1 artificial chromosomes (PACs), BAC end sequencing, PCR gene marker and microsatellite, large-scale sequencing, cDNA library, and database analyses. The results indicate the genomic organization of the WMS region as two nested duplicated regions flanking a largely single-copy region. There are at least two common deletion breakpoints, one in the centromeric and at least two in the telomeric repeated regions. Clones anchoring the unique to the repeated regions are defined along with three new pseudogene families. Primate studies indicate an evolutionary hot spot for chromosomal inversion in the WMS region. A cognitive phenotypic map of WMS is presented, which combines previous data with five further WMS subjects and three atypical WMS subjects with deletions; two larger (deleted for D7S489L) and one smaller, deleted for genes telomeric to FZD9, through LIMK1, but not WSCR1 or telomeric. The results establish regions and consequent gene candidates for WMS features including mental retardation, hypersociability, and facial features. The approach provides the basis for defining pathways linking genetic underpinnings with the neuroanatomical

  11. Toward a standard in structural genome annotation for prokaryotes

    SciTech Connect

    Tripp, H. James; Sutton, Granger; White, Owen; Wortman, Jennifer; Pati, Amrita; Mikhailova, Natalia; Ovchinnikova, Galina; Payne, Samuel H.; Kyrpides, Nikos C.; Ivanova, Natalia

    2015-07-25

    In an effort to identify the best practice for finding genes in prokaryotic genomes and propose it as a standard for automated annotation pipelines, we collected 1,004,576 peptides from various publicly available resources, and these were used as a basis to evaluate various gene-calling methods. The peptides came from 45 bacterial replicons with an average GC content from 31 % to 74 %, biased toward higher GC content genomes. Automated, manual, and semi-manual methods were used to tally errors in three widely used gene calling methods, as evidenced by peptides mapped outside the boundaries of called genes. We found that the consensus set of identical genes predicted by the three methods constitutes only about 70 % of the genes predicted by each individual method (with start and stop required to coincide). Peptide data was useful for evaluating some of the differences between gene callers, but not reliable enough to make the results conclusive, due to limitations inherent in any proteogenomic study. A single, unambiguous, unanimous best practice did not emerge from this analysis, since the available proteomics data were not adequate to provide an objective measurement of differences in the accuracy between these methods. However, as a result of this study, software, reference data, and procedures have been better matched among participants, representing a step toward a much-needed standard. In the absence of sufficient amount of experimental data to achieve a universal standard, our recommendation is that any of these methods can be used by the community, as long as a single method is employed across all datasets to be compared.
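
    The consensus figure quoted in this abstract (genes predicted identically by all three methods, with start and stop required to coincide) can be pictured as a set intersection. The sketch below is a minimal illustration under that reading; the tuple layout `(replicon, start, stop, strand)`, the method names, and the toy coordinates are assumptions, not data or code from the study.

```python
def consensus_gene_calls(calls_by_method):
    """Return genes called identically by every method, plus per-method fractions.

    `calls_by_method` maps a method name to a set of
    (replicon, start, stop, strand) tuples. Two calls count as identical
    only if every field matches, so start and stop must both coincide.
    """
    methods = list(calls_by_method)
    consensus = set(calls_by_method[methods[0]])
    for name in methods[1:]:
        consensus &= calls_by_method[name]

    fractions = {
        name: len(consensus) / len(calls) if calls else 0.0
        for name, calls in calls_by_method.items()
    }
    return consensus, fractions


if __name__ == "__main__":
    # Toy gene calls from three hypothetical gene callers.
    toy = {
        "caller_a": {("rep1", 1, 100, "+"), ("rep1", 200, 300, "+"),
                     ("rep1", 400, 500, "-"), ("rep1", 600, 700, "+")},
        "caller_b": {("rep1", 1, 100, "+"), ("rep1", 200, 300, "+"),
                     ("rep1", 410, 500, "-"), ("rep1", 600, 700, "+")},
        "caller_c": {("rep1", 1, 100, "+"), ("rep1", 200, 300, "+"),
                     ("rep1", 400, 500, "-"), ("rep1", 650, 700, "+"),
                     ("rep1", 800, 900, "+")},
    }
    consensus, fractions = consensus_gene_calls(toy)
    print(sorted(consensus))
    print(fractions)
```

    Requiring exact coordinate matches is deliberately strict: a call that differs only in its start codon drops out of the consensus, which is one reason such a consensus can be much smaller than any single method's output.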

  12. Toward a standard in structural genome annotation for prokaryotes

    DOE PAGESBeta

    Tripp, H. James; Sutton, Granger; White, Owen; Wortman, Jennifer; Pati, Amrita; Mikhailova, Natalia; Ovchinnikova, Galina; Payne, Samuel H.; Kyrpides, Nikos C.; Ivanova, Natalia

    2015-07-25

    In an effort to identify the best practice for finding genes in prokaryotic genomes and propose it as a standard for automated annotation pipelines, we collected 1,004,576 peptides from various publicly available resources, and these were used as a basis to evaluate various gene-calling methods. The peptides came from 45 bacterial replicons with an average GC content from 31 % to 74 %, biased toward higher GC content genomes. Automated, manual, and semi-manual methods were used to tally errors in three widely used gene calling methods, as evidenced by peptides mapped outside the boundaries of called genes. We found that the consensus set of identical genes predicted by the three methods constitutes only about 70 % of the genes predicted by each individual method (with start and stop required to coincide). Peptide data was useful for evaluating some of the differences between gene callers, but not reliable enough to make the results conclusive, due to limitations inherent in any proteogenomic study. A single, unambiguous, unanimous best practice did not emerge from this analysis, since the available proteomics data were not adequate to provide an objective measurement of differences in the accuracy between these methods. However, as a result of this study, software, reference data, and procedures have been better matched among participants, representing a step toward a much-needed standard. In the absence of sufficient amount of experimental data to achieve a universal standard, our recommendation is that any of these methods can be used by the community, as long as a single method is employed across all datasets to be compared.

  13. Structural and functional genome analysis using extended chromatin

    SciTech Connect

    Heaf, T.; Ward, D.C.

    1994-09-01

    Highly extended linear chromatin fibers (ECFs) produced by detergent and high-salt lysis and stretching of nuclear chromatin across the surface of a glass slide can be hybridized over physical distances of at least several Mb. This allows long-range FISH analysis of the human genome with excellent DNA resolution (<10 kb/µm). The insertion of Alu elements which are more than 50-fold underrepresented in centromeres can be seen within and near long tandem arrays of alpha-satellite DNA. Long tracts of trinucleotide repeats, i.e. (CCA)n, can be localized within larger genomic regions. The combined application of BrdU incorporation and ECFs allows one to study the spatio-temporal distribution of DNA replication sites in finer detail. DNA synthesis occurs at multiple discrete sites within Mb arrays of alpha-satellite. Replicating DNA is tightly associated with the nuclear matrix and highly resistant to stretching out, while ECFs containing newly replicated DNA are easily released. Asynchrony in replication timing is accompanied by differences in condensation of homologous DNA segments. Extended chromatin reveals differential packaging of active and inactive DNA. Upon transcriptional inactivation by AMD, the normally compact rRNA genes become much more susceptible to decondensation procedures. By extending the chromatin from pachytene spermatocytes, meiotic pairing and genetic exchange between homologs can be visualized directly. Histone depletion by high salt and detergent produces loop chromatin surrounding the nuclear matrix in a halo-like fashion. DNA halos can be used to map nuclear matrix attachment sites in somatic cells and in mature sperm. Alpha-satellite containing DNA loops appear to be attached to the sperm-cell matrix by CENP-B boxes, short 17 bp sequences found in a subset of alpha satellite monomers. Sperm telomeres almost always appear as hybridization doublets, suggesting the presence of already replicated chromosome ends.

  14. Analysis of retrotransposon structural diversity uncovers properties and propensities in angiosperm genome evolution

    PubMed Central

    Vitte, Clémentine; Bennetzen, Jeffrey L.

    2006-01-01

    Analysis of LTR retrotransposon structures in five diploid angiosperm genomes uncovered very different relative levels of different types of genomic diversity. All species exhibited recent LTR retrotransposon mobility and also high rates of DNA removal by unequal homologous recombination and illegitimate recombination. The larger plant genomes contained many LTR retrotransposon families with >10,000 copies per haploid genome, whereas the smaller genomes contained few or no LTR retrotransposon families with >1,000 copies, suggesting that this differential potential for retroelement amplification is a primary factor in angiosperm genome size variation. The average ratios of transition to transversion mutations (Ts/Tv) in diverging LTRs were >1.5 for each species studied, suggesting that these elements are mostly 5-methylated at cytosines in an epigenetically silenced state. However, the diploid wheat Triticum monococcum and barley have unusually low Ts/Tv values (respectively, 1.9 and 1.6) compared with maize (3.9), medicago (3.6), and lotus (2.5), suggesting that this silencing is less complete in the two Triticeae. Such characteristics as the ratios of point mutations to indels (insertions and deletions) and the relative efficiencies of DNA removal by unequal homologous recombination compared with illegitimate recombination were highly variable between species. These latter variations did not correlate with genome size or phylogenetic relatedness, indicating that they frequently change during the evolutionary descent of plant lineages. In sum, the results indicate that the different sizes, contents, and structures of angiosperm genomes are outcomes of the same suite of mechanistic processes, but acting with different relative efficiencies in different plant lineages. PMID:17101966
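
    The transition/transversion ratio (Ts/Tv) reported in this abstract can be illustrated with a short sketch that compares two aligned sequences, such as the two LTRs of a single element. This is a generic calculation, not the authors' pipeline; the function name and the toy sequences are assumptions, and alignment columns with gaps or ambiguous bases are simply skipped.

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ts_tv_counts(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Hypothetical helper for illustration. Columns where either sequence has
    a gap or a non-ACGT character are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    ts = tv = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if (a, b) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    return ts, tv


def ts_tv_ratio(seq_a, seq_b):
    """Return Ts/Tv, or None when no transversions were observed."""
    ts, tv = ts_tv_counts(seq_a, seq_b)
    return ts / tv if tv else None


if __name__ == "__main__":
    ltr_5prime = "AAGGCCTT"  # toy sequences, not real LTR data
    ltr_3prime = "GAGACATT"
    print(ts_tv_counts(ltr_5prime, ltr_3prime))
    print(ts_tv_ratio(ltr_5prime, ltr_3prime))
```

    With four bases there are twice as many possible transversions as transitions, so unbiased mutation would give a ratio near 0.5. Ratios well above that, like those quoted in the abstract, indicate an excess of transitions, which the authors attribute to cytosine methylation.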

  15. Development of Structural Neurobiology and Genomics Programs in the Neurogenetic Institute

    SciTech Connect

    Henderson, Brian E., M.D.

    2006-11-10

    The purpose of the DOE equipment-only grant was to purchase instrumentation in support of structural biology and genomics core facilities in the Zilkha Neurogenetic Institute (ZNI). The ZNI, a new laboratory facility (125,000 GSF) and a center of excellence at the Keck School of Medicine of USC, was opened in 2003. The goal of the ZNI is to recruit upwards of 30 new faculty investigators engaged in interdisciplinary research programs that will add breadth and depth to existing school strengths in neuroscience, epidemiology and genetics. Many of these faculty, and other faculty researchers at the Keck School will access structural biology and genomics facilities developed in the ZNI.

  16. Detecting Heterogeneity in Population Structure Across the Genome in Admixed Populations.

    PubMed

    McHugh, Caitlin; Brown, Lisa; Thornton, Timothy A

    2016-09-01

    The genetic structure of human populations is often characterized by aggregating measures of ancestry across the autosomal chromosomes. While it may be reasonable to assume that population structure patterns are similar genome-wide in relatively homogeneous populations, this assumption may not be appropriate for admixed populations, such as Hispanics and African-Americans, with recent ancestry from two or more continents. Recent studies have suggested that systematic ancestry differences can arise at genomic locations in admixed populations as a result of selection and nonrandom mating. Here, we propose a method, which we refer to as the chromosomal ancestry differences (CAnD) test, for detecting heterogeneity in population structure across the genome. CAnD can incorporate either local or chromosome-wide ancestry inferred from SNP genotype data to identify chromosomes harboring genomic regions with ancestry contributions that are significantly different than expected. In simulation studies with real genotype data from phase III of the HapMap Project, we demonstrate the validity and power of CAnD. We apply CAnD to the HapMap Mexican-American (MXL) and African-American (ASW) population samples; in this analysis the software RFMix is used to infer local ancestry at genomic regions, assuming admixing from Europeans, West Africans, and Native Americans. The CAnD test provides strong evidence of heterogeneity in population structure across the genome in the MXL sample ([Formula: see text]), which is largely driven by elevated Native American ancestry and deficit of European ancestry on the X chromosomes. Among the ASW, all chromosomes are largely African derived and no heterogeneity in population structure is detected in this sample. PMID:27440868

  17. 3D-GNOME: an integrated web service for structural modeling of the 3D genome

    PubMed Central

    Szalaj, Przemyslaw; Michalski, Paul J.; Wróblewski, Przemysław; Tang, Zhonghui; Kadlof, Michal; Mazzocco, Giovanni; Ruan, Yijun; Plewczynski, Dariusz

    2016-01-01

    Recent advances in high-throughput chromosome conformation capture (3C) technology, such as Hi-C and ChIA-PET, have demonstrated the importance of 3D genome organization in development, cell differentiation and transcriptional regulation. There is now a widespread need for computational tools to generate and analyze 3D structural models from 3C data. Here we introduce our 3D GeNOme Modeling Engine (3D-GNOME), a web service which generates 3D structures from 3C data and provides tools to visually inspect and annotate the resulting structures, in addition to a variety of statistical plots and heatmaps which characterize the selected genomic region. Users submit a bedpe (paired-end BED format) file containing the locations and strengths of long range contact points, and 3D-GNOME simulates the structure and provides a convenient user interface for further analysis. Alternatively, a user may generate structures using published ChIA-PET data for the GM12878 cell line by simply specifying a genomic region of interest. 3D-GNOME is freely available at http://3dgnome.cent.uw.edu.pl/. PMID:27185892
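    As an illustration of the kind of input described above, the following Python sketch reads a small BEDPE-style table of long-range contacts. The seven-column layout (two anchors plus a strength value), the names Contact, parse_bedpe and anchor_distance, and the sample rows are all assumptions made for this example; the abstract does not specify the exact columns that 3D-GNOME expects.

```python
import io
from typing import List, NamedTuple, Optional


class Contact(NamedTuple):
    """One long-range contact: two genomic anchors and a strength value."""
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    strength: float


def parse_bedpe(handle) -> List[Contact]:
    """Parse tab-separated BEDPE-style lines (assumed 7+ columns)."""
    contacts = []
    for line in handle:
        line = line.strip()
        # Skip blank lines and common header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 7:
            raise ValueError(f"expected at least 7 columns, got {len(fields)}")
        contacts.append(Contact(
            fields[0], int(fields[1]), int(fields[2]),
            fields[3], int(fields[4]), int(fields[5]),
            float(fields[6]),
        ))
    return contacts


def anchor_distance(contact: Contact) -> Optional[int]:
    """Distance between anchor midpoints, or None for different chromosomes."""
    if contact.chrom1 != contact.chrom2:
        return None
    mid1 = (contact.start1 + contact.end1) // 2
    mid2 = (contact.start2 + contact.end2) // 2
    return abs(mid2 - mid1)


# Hypothetical sample data, not taken from any published data set.
SAMPLE = (
    "chr1\t1000\t2000\tchr1\t50000\t51000\t12\n"
    "chr1\t3000\t4000\tchr2\t7000\t8000\t3\n"
)
contacts = parse_bedpe(io.StringIO(SAMPLE))
for c in contacts:
    print(c.chrom1, c.chrom2, c.strength, anchor_distance(c))
```

    A parser of this kind could serve as a sanity check on a contact file before submission, for example to confirm that coordinates are integers and that every row carries a strength value.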

  18. 3D-GNOME: an integrated web service for structural modeling of the 3D genome.

    PubMed

    Szalaj, Przemyslaw; Michalski, Paul J; Wróblewski, Przemysław; Tang, Zhonghui; Kadlof, Michal; Mazzocco, Giovanni; Ruan, Yijun; Plewczynski, Dariusz

    2016-07-01

    Recent advances in high-throughput chromosome conformation capture (3C) technology, such as Hi-C and ChIA-PET, have demonstrated the importance of 3D genome organization in development, cell differentiation and transcriptional regulation. There is now a widespread need for computational tools to generate and analyze 3D structural models from 3C data. Here we introduce our 3D GeNOme Modeling Engine (3D-GNOME), a web service which generates 3D structures from 3C data and provides tools to visually inspect and annotate the resulting structures, in addition to a variety of statistical plots and heatmaps which characterize the selected genomic region. Users submit a bedpe (paired-end BED format) file containing the locations and strengths of long range contact points, and 3D-GNOME simulates the structure and provides a convenient user interface for further analysis. Alternatively, a user may generate structures using published ChIA-PET data for the GM12878 cell line by simply specifying a genomic region of interest. 3D-GNOME is freely available at http://3dgnome.cent.uw.edu.pl/. PMID:27185892

  19. Midwest Superconductivity Consortium - Final Progress Report October 2001

    SciTech Connect

    Bement, Arden L.

    2001-10-23

    The basic mission of the Consortium was to advance the science and understanding of high-Tc superconductivity and to promote the development of new materials and improved processing technology. Focused group efforts were the key element of the research program. One program area was the understanding of the layered structures involved in candidate materials and the factors that control their formation, stability and relationship to superconductor properties. The other program area focused on factors that limit or control the transport properties, such as weak links, flux lattice behavior, and interfaces. Interactions among Consortium members and with industrial affiliates were an integral part of the program.

  20. GWIDD: a comprehensive resource for genome-wide structural modeling of protein-protein interactions

    PubMed Central

    2012-01-01

    Protein-protein interactions are a key component of life processes. Knowledge of the three-dimensional structure of these interactions is important for understanding protein function. The Genome-Wide Docking Database (http://gwidd.bioinformatics.ku.edu) offers an extensive source of data for structural studies of protein-protein complexes on a genome scale. The current release of the database combines the available experimental data on the structure and characteristics of protein interactions with structural modeling of protein complexes for 771 organisms spanning the entire universe of life from viruses to humans. The interactions are stored in a relational database with a user-friendly interface that includes various search options. The search results can be interactively previewed, and the structures can be downloaded along with the interaction characteristics. PMID:23245398

  1. Structure of Ljungan virus provides insight into genome packaging of this picornavirus

    NASA Astrophysics Data System (ADS)

    Zhu, Ling; Wang, Xiangxi; Ren, Jingshan; Porta, Claudine; Wenham, Hannah; Ekström, Jens-Ola; Panjwani, Anusha; Knowles, Nick J.; Kotecha, Abhay; Siebert, C. Alistair; Lindberg, A. Michael; Fry, Elizabeth E.; Rao, Zihe; Tuthill, Tobias J.; Stuart, David I.

    2015-10-01

    Picornaviruses are responsible for a range of human and animal diseases, but how their RNA genome is packaged remains poorly understood. A particularly poorly studied group within this family are those that lack the internal coat protein, VP4. Here we report the atomic structure of one such virus, Ljungan virus, the type member of the genus Parechovirus B, which has been linked to diabetes and myocarditis in humans. The 3.78-Å resolution cryo-electron microscopy structure shows remarkable features, including an extended VP1 C terminus, forming a major protuberance on the outer surface of the virus, and a basic motif at the N terminus of VP3, binding to which orders some 12% of the viral genome. This apparently charge-driven RNA attachment suggests that this branch of the picornaviruses uses a different mechanism of genome encapsidation, perhaps explored early in the evolution of picornaviruses.

  2. Structure of Ljungan virus provides insight into genome packaging of this picornavirus

    PubMed Central

    Zhu, Ling; Wang, Xiangxi; Ren, Jingshan; Porta, Claudine; Wenham, Hannah; Ekström, Jens-Ola; Panjwani, Anusha; Knowles, Nick J.; Kotecha, Abhay; Siebert, C. Alistair; Lindberg, A. Michael; Fry, Elizabeth E.; Rao, Zihe; Tuthill, Tobias J.; Stuart, David I.

    2015-01-01

    Picornaviruses are responsible for a range of human and animal diseases, but how their RNA genome is packaged remains poorly understood. A particularly poorly studied group within this family are those that lack the internal coat protein, VP4. Here we report the atomic structure of one such virus, Ljungan virus, the type member of the genus Parechovirus B, which has been linked to diabetes and myocarditis in humans. The 3.78-Å resolution cryo-electron microscopy structure shows remarkable features, including an extended VP1 C terminus, forming a major protuberance on the outer surface of the virus, and a basic motif at the N terminus of VP3, binding to which orders some 12% of the viral genome. This apparently charge-driven RNA attachment suggests that this branch of the picornaviruses uses a different mechanism of genome encapsidation, perhaps explored early in the evolution of picornaviruses. PMID:26446437

  3. The ISPRS Student Consortium: From launch to tenth anniversary

    NASA Astrophysics Data System (ADS)

    Kanjir, U.; Detchev, I.; Reyes, S. R.; Akkartal Aktas, A.; Lo, C. Y.; Miyazaki, H.

    2014-04-01

    The ISPRS Student Consortium is an international organization for students and young professionals in the fields of photogrammetry, remote sensing, and the geospatial information sciences. Since its start ten years ago, the number of members of the Student Consortium has been steadily growing, now reaching close to 1000. Its increased popularity, especially in recent years, is mainly due to the organization's worldwide involvement in student matters. The Student Consortium has helped organize numerous summer schools, youth forums, and student technical sessions at ISPRS-sponsored conferences. In addition, the organization publishes a newsletter and hosts several social media outlets in order to keep its global membership up to date on a regular basis. This paper will describe the structure of the organization and give some examples of its past student-related activities.

  4. Genome structure and primitive sex chromosome revealed in Populus

    SciTech Connect

    Tuskan, Gerald A; Yin, Tongming; Gunter, Lee E; Blaudez, D

    2008-01-01

    We constructed a comprehensive genetic map for Populus and ordered 332 Mb of sequence scaffolds along the 19 haploid chromosomes in order to compare chromosomal regions among diverse members of the genus. These efforts led us to conclude that chromosome XIX in Populus is evolving into a sex chromosome. Consistent segregation distortion in favor of the sub-genera Tacamahaca alleles provided evidence of divergent selection among species, particularly at the proximal end of chromosome XIX. A large microsatellite marker (SSR) cluster was detected in the distorted region even though the genome-wide distribution of SSR sites was uniform across the physical map. The differences between the genetic map and physical sequence data suggested recombination suppression was occurring in the distorted region. A gender-determination locus and an overabundance of NBS-LRR genes were also co-located to the distorted region and were put forth as the cause for divergent selection and recombination suppression. This hypothesis was verified by using fine-scale mapping of an integrated scaffold in the vicinity of the gender-determination locus. As such it appears that chromosome XIX in Populus is in the process of evolving from an autosome into a sex chromosome and that NBS-LRR genes may play an important role in the chromosomal diversification process in Populus.

  5. The Ocean Sampling Day Consortium

    DOE PAGESBeta

    Kopf, Anna; Bicak, Mesude; Kottmann, Renzo; Schnetzer, Julia; Kostadinov, Ivaylo; Lehmann, Katja; Fernandez-Guerra, Antonio; Jeanthon, Christian; Rahav, Eyal; Ullrich, Matthias; et al

    2015-06-19

    In this study, Ocean Sampling Day was initiated by the EU-funded Micro B3 (Marine Microbial Biodiversity, Bioinformatics, Biotechnology) project to obtain a snapshot of the marine microbial biodiversity and function of the world’s oceans. It is a simultaneous global mega-sequencing campaign aiming to generate the largest standardized microbial data set in a single day. This will be achievable only through the coordinated efforts of an Ocean Sampling Day Consortium, supportive partnerships and networks between sites. This commentary outlines the establishment, function and aims of the Consortium and describes our vision for a sustainable study of marine microbial communities and their embedded functional traits.

  6. Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation

    PubMed Central

    Sharma, Virag; Elghafari, Anas; Hiller, Michael

    2016-01-01

    Identifying coding genes is an essential step in genome annotation. Here, we utilize existing whole genome alignments to detect conserved coding exons and then map gene annotations from one genome to many aligned genomes. We show that genome alignments contain thousands of spurious frameshifts and splice site mutations in exons that are truly conserved. To overcome these limitations, we have developed CESAR (Coding Exon-Structure Aware Realigner) that realigns coding exons, while considering reading frame and splice sites of each exon. CESAR effectively avoids spurious frameshifts in conserved genes and detects 91% of shifted splice sites. This results in the identification of thousands of additional conserved exons and 99% of the exons that lack inactivating mutations match real exons. Finally, to demonstrate the potential of using CESAR for comparative gene annotation, we applied it to 188 788 exons of 19 865 human genes to annotate human genes in 99 other vertebrates. These comparative gene annotations are available as a resource (http://bds.mpi-cbg.de/hillerlab/CESAR/). CESAR (https://github.com/hillerlab/CESAR/) can readily be applied to other alignments to accurately annotate coding genes in many other vertebrate and invertebrate genomes. PMID:27016733

  7. Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation.

    PubMed

    Sharma, Virag; Elghafari, Anas; Hiller, Michael

    2016-06-20

    Identifying coding genes is an essential step in genome annotation. Here, we utilize existing whole genome alignments to detect conserved coding exons and then map gene annotations from one genome to many aligned genomes. We show that genome alignments contain thousands of spurious frameshifts and splice site mutations in exons that are truly conserved. To overcome these limitations, we have developed CESAR (Coding Exon-Structure Aware Realigner) that realigns coding exons, while considering reading frame and splice sites of each exon. CESAR effectively avoids spurious frameshifts in conserved genes and detects 91% of shifted splice sites. This results in the identification of thousands of additional conserved exons and 99% of the exons that lack inactivating mutations match real exons. Finally, to demonstrate the potential of using CESAR for comparative gene annotation, we applied it to 188 788 exons of 19 865 human genes to annotate human genes in 99 other vertebrates. These comparative gene annotations are available as a resource (http://bds.mpi-cbg.de/hillerlab/CESAR/). CESAR (https://github.com/hillerlab/CESAR/) can readily be applied to other alignments to accurately annotate coding genes in many other vertebrate and invertebrate genomes. PMID:27016733

  8. Genome-scale protein expression and structural biology of Plasmodium falciparum and related Apicomplexan organisms.

    PubMed

    Vedadi, Masoud; Lew, Jocelyne; Artz, Jennifer; Amani, Mehrnaz; Zhao, Yong; Dong, Aiping; Wasney, Gregory A; Gao, Mian; Hills, Tanya; Brokx, Stephen; Qiu, Wei; Sharma, Sujata; Diassiti, Angelina; Alam, Zahoor; Melone, Michelle; Mulichak, Anne; Wernimont, Amy; Bray, James; Loppnau, Peter; Plotnikova, Olga; Newberry, Kate; Sundararajan, Emayavaram; Houston, Simon; Walker, John; Tempel, Wolfram; Bochkarev, Alexey; Kozieradzki, Ivona; Edwards, Aled; Arrowsmith, Cheryl; Roos, David; Kain, Kevin; Hui, Raymond

    2007-01-01

    Parasites from the protozoan phylum Apicomplexa are responsible for diseases, such as malaria, toxoplasmosis and cryptosporidiosis, all of which have significantly higher rates of mortality and morbidity in economically underdeveloped regions of the world. Advances in vaccine development and drug discovery are urgently needed to control these diseases and can be facilitated by production of purified recombinant proteins from Apicomplexan genomes and determination of their 3D structures. To date, both heterologous expression and crystallization of Apicomplexan proteins have seen only limited success. In an effort to explore the effectiveness of producing and crystallizing proteins on a genome-scale using a standardized methodology, over 400 distinct Plasmodium falciparum target genes were chosen representing different cellular classes, along with select orthologues from four other Plasmodium species as well as Cryptosporidium parvum and Toxoplasma gondii. From a total of 1008 genes from the seven genomes, 304 (30.2%) produced purified soluble proteins and 97 (9.6%) crystallized, culminating in 36 crystal structures. These results demonstrate that, contrary to previous findings, a standardized platform using Escherichia coli can be effective for genome-scale production and crystallography of Apicomplexan proteins. Predictably, orthologous proteins from different Apicomplexan genomes behaved differently in expression, purification and crystallization, although the overall success rates of Plasmodium orthologues do not differ significantly. Their differences were effectively exploited to elevate the overall productivity to levels comparable to the most successful ongoing structural genomics projects: 229 of the 468 target genes produced purified soluble protein from one or more organisms, with 80 and 32 of the purified targets, respectively, leading to crystals and ultimately structures from one or more orthologues. PMID:17125854
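    The success rates quoted in this abstract can be checked with simple arithmetic. The Python sketch below recomputes them from the counts given above; the helper name percent and the dictionary labels are illustrative choices, and the rates over the 468 target genes are derived here and are not stated in the abstract itself.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 1)


# Counts over all 1008 genes from the seven genomes (from the abstract).
ALL_GENES = 1008
per_gene = {
    "purified soluble protein": percent(304, ALL_GENES),
    "crystallized": percent(97, ALL_GENES),
}

# Counts over the 468 target genes, pooling orthologues (from the abstract).
TARGET_GENES = 468
per_target = {
    "purified soluble protein": percent(229, TARGET_GENES),
    "crystals": percent(80, TARGET_GENES),
    "structures": percent(32, TARGET_GENES),
}

for label, value in per_gene.items():
    print(f"per gene   | {label}: {value}%")
for label, value in per_target.items():
    print(f"per target | {label}: {value}%")
```

    The first two values reproduce the 30.2% and 9.6% reported in the abstract. The per-target figures show how pooling orthologues raises the fraction of targets that yield purified protein to roughly half.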

  9. The AGTSR consortium: An update

    SciTech Connect

    Fant, D.B.; Golan, L.P.

    1995-10-01

    The Advanced Gas Turbine Systems Research (AGTSR) program is a collaborative University-Industry R&D Consortium that is managed and administered by the South Carolina Energy R&D Center. AGTSR is a nationwide consortium dedicated to advancing land-based gas turbine systems for improving future power generation capability. It directly supports the technology-research arm of the ATS program and targets industry-defined research needs in the areas of combustion, heat transfer, materials, aerodynamics, controls, alternative fuels, and advanced cycles. The consortium is organized to enhance U.S. competitiveness through close collaboration with universities, government, and industry at the R&D level. AGTSR is just finishing its third year of operation and is sponsored by the U.S. DOE - Morgantown Energy Technology Center. The program is scheduled to continue past the year 2000. At present, there are 78 performing member universities representing 36 states, and six cost-sharing U.S. gas turbine corporations. Three RFPs have been announced and the fourth RFP is expected to be released in December 1995. There are 31 research subcontracts underway at performing member universities. AGTSR has also organized three workshops, two in combustion and one in heat transfer. A materials workshop is in planning and is scheduled for February 1996. An industrial internship program was initiated this past summer, with one intern positioned at each of the sponsoring companies. The AGTSR consortium nurtures close industry-university-government collaboration to enhance synergism and the transition of research results, accelerate and promote evolutionary-revolutionary R&D, and strives to keep a prominent U.S. industry strong and on top well into the 21st century. This paper will present the objectives and benefits of the AGTSR program, progress achieved to date, and future planned activity in fiscal year 1996.

  10. John Glenn Biomedical Engineering Consortium

    NASA Technical Reports Server (NTRS)

    Nall, Marsha

    2004-01-01

    The John Glenn Biomedical Engineering Consortium is an inter-institutional research and technology development effort, beginning with ten projects in FY02 that are aimed at applying GRC expertise in fluid physics and sensor development, together with local biomedical expertise, to mitigate the risks of space flight to the health, safety, and performance of astronauts. It is anticipated that several new technologies will be developed that are applicable to medical needs both in space and on Earth.

  11. The Non-Territorial Imperative in a CBTE Consortium.

    ERIC Educational Resources Information Center

    Roseberry, Kent B.

    The problems associated with development of a field based teacher education program in the public schools are examined. Conflict between the college education department and the cooperating schools for control of time and curriculum frequently appears in such programs. A model is proposed for the structure and development of a consortium that…

  12. Appalachian clean coal technology consortium

    SciTech Connect

    Kutz, K.; Yoon, Roe-Hoan

    1995-11-01

    The Appalachian Clean Coal Technology Consortium (ACCTC) has been established to help U.S. coal producers, particularly those in the Appalachian region, increase the production of lower-sulfur coal. The cooperative research conducted as part of the consortium activities will help utilities meet the emissions standards established by the 1990 Clean Air Act Amendments, enhance the competitiveness of U.S. coals in the world market, create jobs in economically-depressed coal producing regions, and reduce U.S. dependence on foreign energy supplies. The research activities will be conducted in cooperation with coal companies, equipment manufacturers, and A&E firms working in the Appalachian coal fields. This approach is consistent with President Clinton's initiative in establishing Regional Technology Alliances to meet regional needs through technology development in cooperation with industry. The consortium activities are complementary to the High-Efficiency Preparation program of the Pittsburgh Energy Technology Center, but are broader in scope as they are inclusive of technology developments for both near-term and long-term applications, technology transfer, and training a highly-skilled work force.

  13. Evolution of the Exon-Intron Structure in Ciliate Genomes.

    PubMed

    Bondarenko, Vladyslav S; Gelfand, Mikhail S

    2016-01-01

    A typical eukaryotic gene is comprised of alternating stretches of regions, exons and introns, retained in and spliced out of a mature mRNA, respectively. Although the length of introns may vary substantially among organisms, a large fraction of genes contains short introns in many species. Notably, some Ciliates (Paramecium and Nyctotherus) possess only ultra-short introns, around 25 bp long. In Paramecium, ultra-short introns with length divisible by three (3n) are under strong evolutionary pressure and have a high frequency of in-frame stop codons, which, in the case of intron retention, cause premature termination of mRNA translation and consequent degradation of the mis-spliced mRNA by the nonsense-mediated decay mechanism. Here, we analyzed introns in five genera of Ciliates, Paramecium, Tetrahymena, Ichthyophthirius, Oxytricha, and Stylonychia. Introns can be classified into two length classes in Tetrahymena and Ichthyophthirius (with means 48 bp, 69 bp, and 55 bp, 64 bp, respectively), but, surprisingly, comprise three distinct length classes in Oxytricha and Stylonychia (with means 33-35 bp, 47-51 bp, and 78-80 bp). In most ranges of the intron lengths, 3n introns are underrepresented and have a high frequency of in-frame stop codons in all studied species. Introns of Paramecium, Tetrahymena, and Ichthyophthirius are preferentially located at the 5' and 3' ends of genes, whereas introns of Oxytricha and Stylonychia are strongly skewed towards the 5' end. Analysis of evolutionary conservation shows that, in each studied genome, a significant fraction of intron positions is conserved between the orthologs, but intron lengths are not correlated between the species. In summary, our study provides a detailed characterization of introns in several genera of Ciliates and highlights some of their distinctive properties, which, together, indicate that splicing spellchecking is a universal and evolutionarily conserved process in the biogenesis of short introns in

  14. Structural genomics for drug design against the pathogen Coxiella burnetii.

    PubMed

    Franklin, Matthew C; Cheung, Jonah; Rudolph, Michael J; Burshteyn, Fiana; Cassidy, Michael; Gary, Ebony; Hillerich, Brandan; Yao, Zhong-Ke; Carlier, Paul R; Totrov, Maxim; Love, James D

    2015-12-01

    Coxiella burnetii is a highly infectious bacterium and potential agent of bioterrorism. However, it has not been studied as extensively as other biological agents, and very few of its proteins have been structurally characterized. To address this situation, we undertook a study of critical metabolic enzymes in C. burnetii that have great potential as drug targets. We used high-throughput techniques to produce novel crystal structures of 48 of these proteins. We selected one protein, C. burnetii dihydrofolate reductase (CbDHFR), for additional work to demonstrate the value of these structures for structure-based drug design. This enzyme's structure reveals a feature in the substrate binding groove that is different between CbDHFR and human dihydrofolate reductase (hDHFR). We then identified a compound by in silico screening that exploits this binding groove difference, and demonstrated that this compound inhibits CbDHFR with at least 25-fold greater potency than hDHFR. Since this binding groove feature is shared by many other prokaryotes, the compound identified could form the basis of a novel antibacterial agent effective against a broad spectrum of pathogenic bacteria. PMID:26033498

  15. Genomic structural analysis of porcine fatty acid desaturase cluster on chromosome 2.

    PubMed

    Taniguchi, Masaaki; Arakawa, Aisaku; Motoyama, Michiyo; Nakajima, Ikuyo; Nii, Masahiro; Mikawa, Satoshi

    2015-04-01

    Fatty acid composition is an economically important trait in meat-producing livestock. To gain insight into the molecular genetics of fatty acid desaturase (FADS) genes in pigs, we investigated the genomic structure of the porcine FADS gene family on chromosome 2. We also examined the tissue distribution of FADS gene expression. The genomic structure of FADS family in mammals consists of three isoforms FADS1, FADS2 and FADS3. However, porcine FADS cluster in the latest pig genome assembly (Sscrofa 10.2) containing some gaps is distinct from that in other mammals. We therefore sought to determine the genomic structure, including the FADS cluster in a 200-kbp range by sequencing gap regions. The structure we obtained was similar to that in other mammals. We then investigated the porcine FADS1 transcription start site and identified a novel isoform named FADS1b. Phylogenetic analysis revealed that the three members of the FADS cluster were orthologous among mammals, whereas the various FADS1 isoforms identified in pigs, mice and cattle might be attributable to species-specific transcriptional regulation with alternative promoters. Porcine FADS1b and FADS3 isoforms were predominantly expressed in the inner layer of the subcutaneous adipose tissue. Additional analyses will reveal the effects of these functionally unknown isoforms on fatty acid composition in pig fat tissues. PMID:25409917

  16. Genome sequencing of disease and carriage isolates of nontypeable Haemophilus influenzae identifies discrete population structure.

    PubMed

    De Chiara, Matteo; Hood, Derek; Muzzi, Alessandro; Pickard, Derek J; Perkins, Tim; Pizza, Mariagrazia; Dougan, Gordon; Rappuoli, Rino; Moxon, E Richard; Soriani, Marco; Donati, Claudio

    2014-04-01

    One of the main hurdles for the development of an effective and broadly protective vaccine against nonencapsulated isolates of Haemophilus influenzae (NTHi) lies in the genetic diversity of the species, which renders the identification of cross-protective candidate antigens extremely difficult. To assess whether a population structure of NTHi could be defined, we performed genome sequencing of a collection of diverse clinical isolates representative of both carriage and disease and of the diversity of the natural population. Analysis of the distribution of polymorphic sites in the core genome and of the composition of the accessory genome defined distinct evolutionary clades and supported a predominantly clonal evolution of NTHi, with the majority of genetic information transmitted vertically within lineages. A correlation between the population structure and the presence of selected surface-associated proteins and lipooligosaccharide structure, known to contribute to virulence, was found. This high-resolution, genome-based population structure of NTHi provides the foundation for a better understanding of NTHi adaptation to the host, as well as its commensal and virulence behavior, that could facilitate intervention strategies against disease caused by this important human pathogen. PMID:24706866

  17. Comparative 3D Genome Structure Analysis of the Fission and the Budding Yeast

    PubMed Central

    Gong, Ke; Tjong, Harianto; Zhou, Xianghong Jasmine; Alber, Frank

    2015-01-01

    We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes represented as flexible polymer chains in a nuclear environment. This model is sufficient to explain in a statistical manner many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-locations of functionally related and co-expressed genes, such as genes expressed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly due to centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the fission and budding yeast genome structures, for which we previously detected a similar organizing principle. However, due to the different chromosome sizes and numbers, substantial differences are observed in the 3D structural genome organization between the two species, most notably in the nuclear locations of orthologous genes, and the extent of nuclear territories for genes and chromosomes. However, despite those differences, remarkably, functional similarities are maintained, which is evident when comparing spatial clustering of functionally related genes in both yeasts. Functionally related genes show a similar spatial clustering behavior in both yeasts, even though their nuclear locations are largely different between the yeast species. PMID:25799503

  18. Femtomole SHAPE reveals regulatory structures in the authentic XMRV RNA genome

    PubMed Central

    Grohman, Jacob K.; Kottegoda, Sumith; Gorelick, Robert J.; Allbritton, Nancy L.; Weeks, Kevin M.

    2011-01-01

    Higher-order structure influences critical functions in nearly all non-coding and coding RNAs. Most single-nucleotide resolution RNA structure determination technologies cannot be used to analyze RNA from scarce biological samples, like viral genomes. To make quantitative RNA structure analysis applicable to a much wider array of RNA structure-function problems, we developed and applied high-sensitivity selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) to structural analysis of authentic genomic RNA of the xenotropic murine leukemia virus-related virus (XMRV). For analysis of fluorescently labeled cDNAs generated in high-sensitivity SHAPE experiments, we developed a two-color capillary electrophoresis approach with zeptomole molecular detection limits and sub-femtomole sensitivity for complete SHAPE experiments involving hundreds of individual RNA structure measurements. High-sensitivity SHAPE data correlated closely (R = 0.89) with data obtained by conventional capillary electrophoresis. Using high-sensitivity SHAPE, we determined the dimeric structure of the XMRV packaging domain, examined dynamic interactions between a packaging domain RNA and viral nucleocapsid protein inside virion particles, and identified the packaging signal for this virus. Despite extensive sequence differences between XMRV and the intensively studied Moloney murine leukemia virus, architectures of the regulatory domains are similar and reveal common principles of gammaretrovirus RNA genome packaging. PMID:22126209

  19. A rapid classification protocol for the CATH Domain Database to support structural genomics

    PubMed Central

    Pearl, Frances M. G.; Martin, Nigel; Bray, James E.; Buchan, Daniel W. A.; Harrison, Andrew P.; Lee, David; Reeves, Gabrielle A.; Shepherd, Adrian J.; Sillitoe, Ian; Todd, Annabel E.; Thornton, Janet M.; Orengo, Christine A.

    2001-01-01

    In order to support the structural genomic initiatives, both by rapidly classifying newly determined structures and by suggesting suitable targets for structure determination, we have recently developed several new protocols for classifying structures in the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath). These aim to increase the speed of classification of new structures using fast algorithms for structure comparison (GRATH) and to improve the sensitivity in recognising distant structural relatives by incorporating sequence information from relatives in the genomes (DomainFinder). In order to ensure the integrity of the database given the expected increase in data, the CATH Protein Family Database (CATH-PFDB), which currently includes 25 320 structural domains and a further 160 000 sequence relatives has now been installed in a relational ORACLE database. This was essential for developing more rigorous validation procedures and for allowing efficient querying of the database, particularly for genome analysis. The associated Dictionary of Homologous Superfamilies [Bray,J.E., Todd,A.E., Pearl,F.M.G., Thornton,J.M. and Orengo,C.A. (2000) Protein Eng., 13, 153–165], which provides multiple structural alignments and functional information to assist in assigning new relatives, has also been expanded recently and now includes information for 903 homologous superfamilies. In order to improve coverage of known structures, preliminary classification levels are now provided for new structures at interim stages in the classification protocol. Since a large proportion of new structures can be rapidly classified using profile-based sequence analysis [e.g. PSI-BLAST: Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389–3402], this provides preliminary classification for easily recognisable homologues, which in the latest release of CATH (version 1.7) represented nearly three-quarters of

  20. A rapid classification protocol for the CATH Domain Database to support structural genomics.

    PubMed

    Pearl, F M; Martin, N; Bray, J E; Buchan, D W; Harrison, A P; Lee, D; Reeves, G A; Shepherd, A J; Sillitoe, I; Todd, A E; Thornton, J M; Orengo, C A

    2001-01-01

    In order to support the structural genomic initiatives, both by rapidly classifying newly determined structures and by suggesting suitable targets for structure determination, we have recently developed several new protocols for classifying structures in the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath). These aim to increase the speed of classification of new structures using fast algorithms for structure comparison (GRATH) and to improve the sensitivity in recognising distant structural relatives by incorporating sequence information from relatives in the genomes (DomainFinder). In order to ensure the integrity of the database given the expected increase in data, the CATH Protein Family Database (CATH-PFDB), which currently includes 25,320 structural domains and a further 160,000 sequence relatives has now been installed in a relational ORACLE database. This was essential for developing more rigorous validation procedures and for allowing efficient querying of the database, particularly for genome analysis. The associated Dictionary of Homologous Superfamilies [Bray,J.E., Todd,A.E., Pearl,F.M.G., Thornton,J.M. and Orengo,C.A. (2000) Protein Eng., 13, 153-165], which provides multiple structural alignments and functional information to assist in assigning new relatives, has also been expanded recently and now includes information for 903 homologous superfamilies. In order to improve coverage of known structures, preliminary classification levels are now provided for new structures at interim stages in the classification protocol. Since a large proportion of new structures can be rapidly classified using profile-based sequence analysis [e.g. PSI-BLAST: Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389-3402], this provides preliminary classification for easily recognisable homologues, which in the latest release of CATH (version 1.7) represented nearly three-quarters of the non

  1. A roadmap for functional structural variants in the soybean genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gene structural variation (SV) has recently emerged as a key genetic mechanism underlying several important phenotypic traits in crop species. We screened a panel of 41 soybean accessions serving as parents in a soybean nested association mapping population for deletions and duplications in over 53...

  2. High Density LD-Based Structural Variations Analysis in Cattle Genome

    PubMed Central

    Salomon-Torres, Ricardo; Matukumalli, Lakshmi K.; Van Tassell, Curtis P.; Villa-Angulo, Carlos; Gonzalez-Vizcarra, Víctor M.; Villa-Angulo, Rafael

    2014-01-01

    Genomic structural variations represent an important source of genetic variation in mammalian genomes and are therefore commonly related to phenotypic expression. In this work, ∼770,000 single nucleotide polymorphism (SNP) genotypes from 506 animals from 19 cattle breeds were analyzed. A simple LD-based structural variation was defined, and a genome-wide analysis was performed. After applying quality control filters, for each breed and each chromosome we calculated the short-range (≤100 Kb) linkage disequilibrium (r²). We sorted SNP pairs by distance and obtained a set of LD means (called the expected means) using bins of 5 Kb. We identified 15,246 segments of at least 1 Kb, among the 19 breeds, consisting of sets of at least 3 adjacent SNPs such that, for each SNP, the r² values with its neighbors within a 100 Kb range to the right of that SNP were all larger than, or all smaller than, the corresponding expected mean, and their P-values were significant after a Benjamini-Hochberg multiple testing correction. In addition, to account only for homogeneously distributed regions, we considered only SNPs having at least 15 SNP neighbors within 100 Kb. We defined such segments as structural variations. By grouping all variations across all animals in the sample we defined 9,146 regions, involving a total of 53,137 SNPs and representing 6.40% (160.98 Mb) of the bovine genome. The identified structural variations covered 3,109 genes. Clustering analysis showed the relatedness of breeds given the geographic region in which they are evolving. In summary, we present an analysis of structural variations based on the deviation from the expected short-range LD between SNPs in the bovine genome. With an intuitive and simple definition based only on SNP data it was possible to discern closeness of breeds due to grouping by geographic region in which they are evolving. PMID:25050984
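
    The binning step in this abstract (averaging r² of SNP pairs within 5 Kb distance bins up to 100 Kb, then comparing each pair with the mean of its bin) could be sketched roughly as below. This is an illustrative sketch only, not the authors' code: the function names, the `(distance_bp, r2)` input format, and the bin-boundary convention are all assumptions, and the significance testing and Benjamini-Hochberg correction are not shown.

```python
from collections import defaultdict

BIN_SIZE_BP = 5_000        # 5 Kb bins, as stated in the abstract
MAX_DISTANCE_BP = 100_000  # short-range LD limit (<= 100 Kb)


def expected_ld_means(pairs, bin_size=BIN_SIZE_BP, max_distance=MAX_DISTANCE_BP):
    """Return {bin_index: mean r2} for SNP pairs given as (distance_bp, r2).

    Hypothetical helper: the input format is an assumption for illustration.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, r2 in pairs:
        if distance <= 0 or distance > max_distance:
            continue  # keep only short-range pairs
        # Assumed convention: 1..5000 bp is bin 0, 5001..10000 bp is bin 1, ...
        index = (distance - 1) // bin_size
        sums[index] += r2
        counts[index] += 1
    return {index: sums[index] / counts[index] for index in sorted(sums)}


def deviation_sign(distance, r2, means, bin_size=BIN_SIZE_BP):
    """Return +1 if r2 is above the expected mean of its bin, -1 if below, 0 if equal."""
    expected = means[(distance - 1) // bin_size]
    return (r2 > expected) - (r2 < expected)
```

    A candidate segment in the sense of the abstract would then be a run of adjacent SNPs whose pairwise deviations all share the same sign; `deviation_sign` raises `KeyError` for a distance whose bin contained no pairs.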

  3. Genomic analysis of the hierarchical structure of regulatory networks

    PubMed Central

    Yu, Haiyuan; Gerstein, Mark

    2006-01-01

    A fundamental question in biology is how the cell uses transcription factors (TFs) to coordinate the expression of thousands of genes in response to various stimuli. The relationships between TFs and their target genes can be modeled in terms of directed regulatory networks. These relationships, in turn, can be readily compared with commonplace “chain-of-command” structures in social networks, which have characteristic hierarchical layouts. Here, we develop algorithms for identifying generalized hierarchies (allowing for various loop structures) and use these approaches to illuminate extensive pyramid-shaped hierarchical structures existing in the regulatory networks of representative prokaryotes (Escherichia coli) and eukaryotes (Saccharomyces cerevisiae), with most TFs at the bottom levels and only a few master TFs on top. These masters are situated near the center of the protein–protein interaction network, a different type of network from the regulatory one, and they receive most of the input for the whole regulatory hierarchy through protein interactions. Moreover, they have maximal influence over other genes, in terms of affecting expression-level changes. Surprisingly, however, TFs at the bottom of the regulatory hierarchy are more essential to the viability of the cell. Finally, one might think master TFs achieve their wide influence through directly regulating many targets, but TFs with most direct targets are in the middle of the hierarchy. We find, in fact, that these midlevel TFs are “control bottlenecks” in the hierarchy, and this great degree of control for “middle managers” has parallels in efficient social structures in various corporate and governmental settings. PMID:17003135
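
    The idea of placing TFs on hierarchy levels can be illustrated with a small sketch. It is not the algorithm of the paper, which allows for various loop structures; this simplified version only tolerates autoregulatory self-loops and rejects longer cycles. The function name, the dictionary input format, and the rule "one level above the highest-placed target" are assumptions made for illustration.

```python
def hierarchy_levels(regulates):
    """Assign a level to each TF in a directed TF -> TF regulatory network.

    `regulates` maps each TF name to the set of TFs it regulates.
    Self-loops (autoregulation) are ignored; any longer cycle raises
    ValueError, because collapsing loops is not implemented in this sketch.
    """
    nodes = set(regulates)
    for targets in regulates.values():
        nodes.update(targets)

    levels = {}
    in_progress = set()

    def level(tf):
        if tf in levels:
            return levels[tf]
        if tf in in_progress:
            raise ValueError(f"cycle through {tf!r}; collapse loops first")
        in_progress.add(tf)
        targets = [t for t in regulates.get(tf, ()) if t != tf]
        # Bottom-level TFs regulate no other TF (level 1); every other TF
        # sits one level above its highest-placed target.
        levels[tf] = 1 + max((level(t) for t in targets), default=0)
        in_progress.discard(tf)
        return levels[tf]

    for tf in sorted(nodes):
        level(tf)
    return levels
```

    For a toy network in which `master` regulates `mid1` and `mid2`, and these regulate `low1` and `low2`, the function places the low TFs on level 1, the mid TFs on level 2, and `master` on level 3, giving the pyramid shape described in the abstract.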

  4. Modeling Structural and Genomic Constraints in the Evolution of Proteins

    NASA Astrophysics Data System (ADS)

    Bastolla, Ugo; Porto, Markus

    Macromolecules influence the phenotype of the organism where they are expressed through their function, and in particular through their interactions. Nevertheless, it is very difficult to computationally predict protein function and interactions. Moreover, only a few residues take part in them. For these reasons, models of molecular evolution usually represent folded macromolecules such as RNA or proteins and identify the function of the molecule with the folded structure, whose stability determines the modeled fitness.

  5. Heterogeneous genome divergence, differential introgression, and the origin and structure of hybrid zones

    PubMed Central

    Harrison, Richard G; Larson, Erica L

    2016-01-01

    Hybrid zones have been promoted as windows on the evolutionary process and as laboratories for studying divergence and speciation. Patterns of divergence between hybridizing species can now be characterized on a genome-wide scale, and recent genome scans have focused on the presence of “islands” of divergence. Patterns of heterogeneous genomic divergence may reflect differential introgression following secondary contact and provide insights into which genome regions contribute to local adaptation, hybrid unfitness, and positive assortative mating. However, heterogeneous genome divergence can also arise in the absence of any gene flow, as a result of variation in selection and recombination across the genome. We suggest that to understand hybrid zone origins and dynamics, it is essential to distinguish between genome regions that are divergent between pure parental populations and regions that show restricted introgression where these populations interact in hybrid zones. The latter, more so than the former, reveal the likely genetic architecture of reproductive isolation. Mosaic hybrid zones, because of their complex structure and multiple contacts, are particularly good subjects for distinguishing primary intergradation from secondary contact. Comparisons among independent hybrid zones or transects that involve the “same” species pair can also help to distinguish between divergence with gene flow and secondary contact. However, data from replicate hybrid zones or replicate transects do not reveal consistent patterns; in a few cases, patterns of introgression are similar across independent transects, but for many taxa, there is distinct lack of concordance, presumably due to variation in environmental context and/or variation in the genetics of the interacting populations. PMID:26857437

  6. Draft Genome Sequence of Ruminoclostridium sp. Ne3, Clostridia from an Enrichment Culture Obtained from Australian Subterranean Termite, Nasutitermes exitiosus

    PubMed Central

    Lin, Hai; Tran-Dinh, Nai; Li, Dongmei; Greenfield, Paul; Midgley, David J.

    2015-01-01

    The draft genome sequence of Ruminoclostridium sp. Ne3 was reconstructed from the metagenome of a hydrogenogenic microbial consortium growing on xylan. The organism is likely the primary hemicellulose degrader within the consortium. PMID:25908130

  7. Draft Genome Sequence of Ruminoclostridium sp. Ne3, Clostridia from an Enrichment Culture Obtained from Australian Subterranean Termite, Nasutitermes exitiosus.

    PubMed

    Wang, Han; Lin, Hai; Tran-Dinh, Nai; Li, Dongmei; Greenfield, Paul; Midgley, David J

    2015-01-01

    The draft genome sequence of Ruminoclostridium sp. Ne3 was reconstructed from the metagenome of a hydrogenogenic microbial consortium growing on xylan. The organism is likely the primary hemicellulose degrader within the consortium. PMID:25908130

  8. Structural features of conopeptide genes inferred from partial sequences of the Conus tribblei genome.

    PubMed

    Barghi, Neda; Concepcion, Gisela P; Olivera, Baldomero M; Lluisma, Arturo O

    2016-02-01

    The evolvability of venom components (in particular, the gene-encoded peptide toxins) in venomous species serves as an adaptive strategy allowing them to target new prey types or respond to changes in the prey field. The structure, organization, and expression of the venom peptide genes may provide insights into the molecular mechanisms that drive the evolution of such genes. Conus is a particularly interesting group given the high chemical diversity of their venom peptides, and the rapid evolution of the conopeptide-encoding genes. Conus genomes, however, are large and characterized by a high proportion of repetitive sequences. As a result, the structure and organization of conopeptide genes have remained poorly known. In this study, a survey of the genome of Conus tribblei was undertaken to address this gap. A partial assembly of the C. tribblei genome was generated; the assembly, though consisting of a large number of fragments, accounted for 2160.5 Mb of sequence. A large number of repetitive genomic elements consisting of 642.6 Mb of retrotransposable elements, simple repeats, and novel interspersed repeats were observed. We characterized the structural organization and distribution of conotoxin genes in the genome. A significant number of conopeptide genes (estimated to be between 148 and 193) belonging to different superfamilies with complete or nearly complete exon regions were observed, ~60% of which were expressed. The unexpressed conopeptide genes represent hidden but significant conotoxin diversity. The conotoxin genes also differed in the frequency and length of the introns. The interruption of exons by long introns in the conopeptide genes and the presence of repeats in the introns may indicate the importance of introns in facilitating recombination, evolution and diversification of conotoxins. These findings advance our understanding of the structural framework that promotes the gene-level molecular evolution of venom peptides. PMID:26423067

  9. Structure of the Acidianus Filamentous Virus 3 and Comparative Genomics of Related Archaeal Lipothrixviruses

    PubMed Central

    Vestergaard, Gisle; Aramayo, Ricardo; Basta, Tamara; Häring, Monika; Peng, Xu; Brügger, Kim; Chen, Lanming; Rachel, Reinhard; Boisset, Nicolas; Garrett, Roger A.; Prangishvili, David

    2008-01-01

    Four novel filamentous viruses with double-stranded DNA genomes, namely, Acidianus filamentous virus 3 (AFV3), AFV6, AFV7, and AFV8, have been characterized from the hyperthermophilic archaeal genus Acidianus, and they are assigned to the Betalipothrixvirus genus of the family Lipothrixviridae. The structures of the approximately 2-μm-long virions are similar, and one of them, AFV3, was studied in detail. It consists of a cylindrical envelope containing globular subunits arranged in a helical formation that is unique for any known double-stranded DNA virus. The envelope is 3.1 nm thick and encases an inner core with two parallel rows of protein subunits arranged like a zipper. Each end of the virion is tapered and carries three short filaments. Two major structural proteins were identified as being common to all betalipothrixviruses. The viral genomes were sequenced and analyzed, and they reveal a high level of conservation in both gene content and gene order over large regions, with this similarity extending partly to the earlier described betalipothrixvirus Sulfolobus islandicus filamentous virus. A few predicted gene products of each virus, in addition to the structural proteins, could be assigned specific functions, including a putative helicase involved in Holliday junction branch migration, a nuclease, a protein phosphatase, transcriptional regulators, and glycosyltransferases. The AFV7 genome appears to have undergone intergenomic recombination with a large section of an AFV2-like viral genome, apparently resulting in phenotypic changes, as revealed by the presence of AFV2-like termini in the AFV7 virions. Shared features of the genomes include (i) large inverted terminal repeats exhibiting conserved, regularly spaced direct repeats; (ii) a highly conserved operon encoding the two major structural proteins; (iii) multiple overlapping open reading frames, which may be indicative of gene recoding; (iv) putative 12-bp genetic elements; and (v) partial gene

  10. The Drosophila Helicase MLE Targets Hairpin Structures in Genomic Transcripts.

    PubMed

    Cugusi, Simona; Li, Yujing; Jin, Peng; Lucchesi, John C

    2016-01-01

    RNA hairpins are a common type of secondary structures that play a role in every aspect of RNA biochemistry including RNA editing, mRNA stability, localization and translation of transcripts, and in the activation of the RNA interference (RNAi) and microRNA (miRNA) pathways. Participation in these functions often requires restructuring the RNA molecules by the association of single-strand (ss) RNA-binding proteins or by the action of helicases. The Drosophila MLE helicase has long been identified as a member of the MSL complex responsible for dosage compensation. The complex includes one of two long non-coding RNAs and MLE was shown to remodel the roX RNA hairpin structures in order to initiate assembly of the complex. Here we report that this function of MLE may apply to the hairpins present in the primary RNA transcripts that generate the small molecules responsible for RNA interference. Using stocks from the Transgenic RNAi Project and the Vienna Drosophila Research Center, we show that MLE specifically targets hairpin RNAs at their site of transcription. The association of MLE at these sites is independent of sequence and chromosome location. We use two functional assays to test the biological relevance of this association and determine that MLE participates in the RNAi pathway. PMID:26752049

  11. The Drosophila Helicase MLE Targets Hairpin Structures in Genomic Transcripts

    PubMed Central

    Cugusi, Simona; Li, Yujing; Jin, Peng; Lucchesi, John C.

    2016-01-01

    RNA hairpins are a common type of secondary structures that play a role in every aspect of RNA biochemistry including RNA editing, mRNA stability, localization and translation of transcripts, and in the activation of the RNA interference (RNAi) and microRNA (miRNA) pathways. Participation in these functions often requires restructuring the RNA molecules by the association of single-strand (ss) RNA-binding proteins or by the action of helicases. The Drosophila MLE helicase has long been identified as a member of the MSL complex responsible for dosage compensation. The complex includes one of two long non-coding RNAs and MLE was shown to remodel the roX RNA hairpin structures in order to initiate assembly of the complex. Here we report that this function of MLE may apply to the hairpins present in the primary RNA transcripts that generate the small molecules responsible for RNA interference. Using stocks from the Transgenic RNAi Project and the Vienna Drosophila Research Center, we show that MLE specifically targets hairpin RNAs at their site of transcription. The association of MLE at these sites is independent of sequence and chromosome location. We use two functional assays to test the biological relevance of this association and determine that MLE participates in the RNAi pathway. PMID:26752049

  12. 3D structures of membrane proteins from genomic sequencing

    PubMed Central

    Hopf, Thomas A.; Colwell, Lucy J.; Sheridan, Robert; Rost, Burkhard; Sander, Chris; Marks, Debora S.

    2012-01-01

    We show that amino acid co-variation in proteins, extracted from the evolutionary sequence record, can be used to fold transmembrane proteins. We use this technique to predict previously unknown 3D structures for 11 transmembrane proteins (with up to 14 helices) from their sequences alone. The prediction method (EVfold_membrane) applies a maximum entropy approach to infer evolutionary co-variation in pairs of sequence positions within a protein family and then generates all-atom models with the derived pairwise distance constraints. We benchmark the approach with blinded, de novo computation of known transmembrane protein structures from 23 families, demonstrating unprecedented accuracy of the method for large transmembrane proteins. We show how the method can predict oligomerization, functional sites, and conformational changes in transmembrane proteins. With the rapid rise in large-scale sequencing, more accurate and more comprehensive information on evolutionary constraints can be decoded from genetic variation, greatly expanding the repertoire of transmembrane proteins amenable to modelling by this method. PMID:22579045
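
    To make the notion of co-variation between pairs of sequence positions concrete, the sketch below scores two alignment columns by their mutual information. This is a deliberately simpler stand-in, not the maximum entropy approach of EVfold_membrane, and it does not separate direct from indirect couplings. The function name and the list-of-strings alignment format are assumptions.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `alignment` is a list of equal-length aligned sequences (strings).
    A simple co-variation score, used here instead of a maximum entropy model.
    """
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    i_counts = Counter(seq[i] for seq in alignment)
    j_counts = Counter(seq[j] for seq in alignment)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts to avoid extra divisions
        mi += p_ab * math.log2(p_ab * n * n / (i_counts[a] * j_counts[b]))
    return mi
```

    Columns that always change together score high, while independent columns score near zero; real pipelines typically also correct for phylogenetic and sampling bias, which this sketch omits.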

  13. Genome-wide patterns of population structure and admixture in West Africans and African Americans.

    PubMed

    Bryc, Katarzyna; Auton, Adam; Nelson, Matthew R; Oksenberg, Jorge R; Hauser, Stephen L; Williams, Scott; Froment, Alain; Bodo, Jean-Marie; Wambebe, Charles; Tishkoff, Sarah A; Bustamante, Carlos D

    2010-01-12

    Quantifying patterns of population structure in Africans and African Americans illuminates the history of human populations and is critical for undertaking medical genomic studies on a global scale. To obtain a fine-scale genome-wide perspective of ancestry, we analyze Affymetrix GeneChip 500K genotype data from African Americans (n = 365) and individuals with ancestry from West Africa (n = 203 from 12 populations) and Europe (n = 400 from 42 countries). We find that population structure within the West African sample reflects primarily language and secondarily geographical distance, echoing the Bantu expansion. Among African Americans, analysis of genomic admixture by a principal component-based approach indicates that the median proportion of European ancestry is 18.5% (25th-75th percentiles: 11.6-27.7%), with very large variation among individuals. In the African-American sample as a whole, few autosomal regions showed exceptionally high or low mean African ancestry, but the X chromosome showed elevated levels of African ancestry, consistent with a sex-biased pattern of gene flow with an excess of European male and African female ancestry. We also find that genomic profiles of individual African Americans afford personalized ancestry reconstructions differentiating ancient vs. recent European and African ancestry. Finally, patterns of genetic similarity among inferred African segments of African-American genomes and genomes of contemporary African populations included in this study suggest African ancestry is most similar to non-Bantu Niger-Kordofanian-speaking populations, consistent with historical documents of the African Diaspora and trans-Atlantic slave trade. PMID:20080753

  14. Characterization of the Genome, Proteome, and Structure of Yersiniophage ϕR1-37

    PubMed Central

    Hyytiäinen, Heidi J.; Happonen, Lotta J.; Kiljunen, Saija; Datta, Neeta; Mattinen, Laura; Williamson, Kirsty; Kristo, Paula; Szeliga, Magdalena; Kalin-Mänttäri, Laura; Ahola-Iivarinen, Elina; Kalkkinen, Nisse; Butcher, Sarah J.

    2012-01-01

    The bacteriophage vB_YecM-ϕR1-37 (ϕR1-37) is a lytic yersiniophage that can propagate naturally in different Yersinia species carrying the correct lipopolysaccharide receptor. This large-tailed phage has deoxyuridine (dU) instead of thymidine in its DNA. In this study, we determined the genomic sequence of phage ϕR1-37, mapped parts of the phage transcriptome, characterized the phage particle proteome, and characterized the virion structure by cryo-electron microscopy and image reconstruction. The 262,391-bp genome of ϕR1-37 is one of the largest sequenced phage genomes, and it contains 367 putative open reading frames (ORFs) and 5 tRNA genes. Mass-spectrometric analysis identified 69 phage particle structural proteins with the genes scattered throughout the genome. A total of 269 of the ORFs (73%) lack homologues in sequence databases. Based on terminator and promoter sequences identified from the intergenic regions, the phage genome was predicted to consist of 40 to 60 transcriptional units. Image reconstruction revealed that the ϕR1-37 capsid consists of hexameric capsomers arranged on a T=27 lattice similar to the bacteriophage ϕKZ. The tail of ϕR1-37 has a contractile sheath. We conclude that phage ϕR1-37 is a representative of a novel phage type that carries the dU-containing genome in a ϕKZ-like head. PMID:22973030

  15. An Isochore-Like Structure in the Genome of the Flatworm Schistosoma mansoni

    PubMed Central

    Lamolle, Guillermo; Protasio, Anna V.; Iriarte, Andrés; Jara, Eugenio; Simón, Diego; Musto, Héctor

    2016-01-01

    Eukaryotic genomes are compositionally heterogeneous, that is, composed of regions that differ in guanine–cytosine (GC) content (isochores). The best-documented case is that of vertebrates (mainly mammals), although it has also been noted among unicellular eukaryotes and invertebrates. In the human genome, regarded as a typical mammal, this heterogeneity is associated with several features. Specifically, genes located in GC-richest regions are the GC3-richest, display CpG islands and have shorter introns. Furthermore, these genes are more heavily expressed and tend to be located at the extremes of the chromosomes. Although the compositional heterogeneity seems to be widespread among eukaryotes, the associated properties noted in the human genome and other mammals have not been investigated in depth in other taxa. Here we provide evidence that the genome of the parasitic flatworm Schistosoma mansoni is compositionally heterogeneous and exhibits an isochore-like structure, displaying some features associated, until now, only with the human and other vertebrate genomes, with the exception of gene concentration. PMID:27435793
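
    The two quantities this abstract relies on, regional GC content and GC3 (GC content at third codon positions), can be computed as in the following minimal sketch. The function names and the 100 Kb default window are assumptions chosen for illustration; the abstract does not state which window size or tooling the authors used.

```python
def gc_fraction(sequence):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    at = sequence.count("A") + sequence.count("T")
    total = gc + at
    # Windows made only of ambiguous bases (e.g. N) have no defined GC content.
    return gc / total if total else float("nan")


def windowed_gc(sequence, window=100_000):
    """GC fraction of consecutive non-overlapping windows (last one may be shorter)."""
    return [gc_fraction(sequence[start:start + window])
            for start in range(0, len(sequence), window)]


def gc3(coding_sequence):
    """GC fraction at third codon positions of an in-frame coding sequence."""
    return gc_fraction(coding_sequence[2::3])
```

    Plotting `windowed_gc` along a chromosome is one way to look for the long, compositionally homogeneous stretches that an isochore-like structure implies, and `gc3` of each gene can then be compared with the GC content of the window that contains it.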

  16. An Isochore-Like Structure in the Genome of the Flatworm Schistosoma mansoni.

    PubMed

    Lamolle, Guillermo; Protasio, Anna V; Iriarte, Andrés; Jara, Eugenio; Simón, Diego; Musto, Héctor

    2016-01-01

    Eukaryotic genomes are compositionally heterogeneous, that is, composed of regions that differ in guanine-cytosine (GC) content (isochores). The best-documented case is that of vertebrates (mainly mammals), although it has also been noted among unicellular eukaryotes and invertebrates. In the human genome, regarded as a typical mammal, this heterogeneity is associated with several features. Specifically, genes located in GC-richest regions are the GC3-richest, display CpG islands and have shorter introns. Furthermore, these genes are more heavily expressed and tend to be located at the extremes of the chromosomes. Although the compositional heterogeneity seems to be widespread among eukaryotes, the associated properties noted in the human genome and other mammals have not been investigated in depth in other taxa. Here we provide evidence that the genome of the parasitic flatworm Schistosoma mansoni is compositionally heterogeneous and exhibits an isochore-like structure, displaying some features associated, until now, only with the human and other vertebrate genomes, with the exception of gene concentration. PMID:27435793

  17. Identification and classification of conserved RNA secondary structures in the human genome.

    PubMed

    Pedersen, Jakob Skou; Bejerano, Gill; Siepel, Adam; Rosenbloom, Kate; Lindblad-Toh, Kerstin; Lander, Eric S; Kent, Jim; Miller, Webb; Haussler, David

    2006-04-01

    The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set of 48,479 candidate RNA structures. This screen finds a large number of known functional RNAs, including 195 miRNAs, 62 histone 3'UTR stem loops, and various types of known genetic recoding elements. Among the highest-scoring new predictions are 169 new miRNA candidates, as well as new candidate selenocysteine insertion sites, RNA editing hairpins, RNAs involved in transcript auto regulation, and many folds that form singletons or small functional RNA families of completely unknown function. While the rate of false positives in the overall set is difficult to estimate and is likely to be substantial, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization. PMID:16628248

  18. Different effects of the TAR structure on HIV-1 and HIV-2 genomic RNA translation

    PubMed Central

    Soto-Rifo, Ricardo; Limousin, Taran; Rubilar, Paulina S.; Ricci, Emiliano P.; Décimo, Didier; Moncorgé, Olivier; Trabaud, Mary-Anne; André, Patrice; Cimarelli, Andrea; Ohlmann, Théophile

    2012-01-01

    The 5′-untranslated region (5′-UTR) of the genomic RNA of human immunodeficiency viruses type-1 (HIV-1) and type-2 (HIV-2) is composed of highly structured RNA motifs essential for viral replication that are expected to interfere with Gag and Gag-Pol translation. Here, we have analyzed and compared the properties by which the viral 5′-UTR drives translation from the genomic RNA of both human immunodeficiency viruses. Our results showed that translation from the HIV-2 gRNA was very poor compared to that of HIV-1. This was due to the intrinsic structural motifs in their respective 5′-UTRs, without involvement of any viral protein. Further investigation pointed to a different role of TAR RNA, which was strongly inhibitory for HIV-2 translation. Altogether, these data highlight important structural and functional differences between these two human pathogens. PMID:22121214

  19. A structure-based approach for targeting the HIV-1 genomic RNA dimerization initiation site.

    PubMed

    Ennifar, Eric; Paillart, Jean-Christophe; Bernacchi, Serena; Walter, Philippe; Pale, Patrick; Decout, Jean-Luc; Marquet, Roland; Dumas, Philippe

    2007-10-01

    Dimerization of the genomic RNA is an important step of the HIV-1 replication cycle. The Dimerization Initiation Site (DIS) promotes dimerization of the viral genome by forming a loop-loop complex between two DIS hairpins. Crystal structures of the DIS loop-loop complex revealed an unexpected and strong similarity to the bacterial 16S ribosomal aminoacyl-tRNA site (A site), which is the target of aminoglycoside antibiotics. As a consequence of these structural and sequence similarities, the HIV-1 DIS also binds some aminoglycosides, not only in vitro, but also ex vivo, in lymphoid cells and in viral particles. Crystal structures of the DIS loop-loop in complex with several aminoglycoside antibiotics provide a detailed view of the DIS/drug interaction and reveal some hints about possible modifications to increase the drug affinity and/or specificity. PMID:17434658

  20. The complete mitochondrial genome structure of the jaguar (Panthera onca).

    PubMed

    Caragiulo, Anthony; Dougherty, Eric; Soto, Sofia; Rabinowitz, Salisa; Amato, George

    2016-01-01

    The jaguar (Panthera onca) is the largest felid in the Western hemisphere, and the only member of the Panthera genus in the New World. The jaguar inhabits most countries within Central and South America, and is considered near threatened by the International Union for the Conservation of Nature. This study represents the first sequence of the entire jaguar mitogenome, which was the only Panthera mitogenome that had not been sequenced. The jaguar mitogenome is 17,049 bases and possesses the same molecular structure as other felid mitogenomes. Bayesian inference (BI) and maximum likelihood (ML) were used to determine the phylogenetic placement of the jaguar within the Panthera genus. Both BI and ML analyses revealed the jaguar to be sister to the tiger/leopard/snow leopard clade. PMID:25010076

  1. VAMDC Consortium: A Service to Astrophysics

    NASA Astrophysics Data System (ADS)

    Dubernet, M. L.; Moreau, N.; Zwoelf, C. M.; Ba, Y. A.

    2015-12-01

    The VAMDC Consortium is a worldwide consortium which federates Atomic and Molecular databases through an e-science infrastructure and a political organisation. About 90% of the inter-connected databases handle data that are used for the interpretation of spectra and for the modelling of media in many fields of astrophysics. This paper presents how the VAMDC Consortium is organised in order to provide a "service" to the astrophysics community.

  2. PanScan, the Pancreatic Cancer Cohort Consortium, and the Pancreatic Cancer Case-Control Consortium

    Cancer.gov

    The Pancreatic Cancer Cohort Consortium consists of more than a dozen prospective epidemiologic cohort studies within the NCI Cohort Consortium, whose leaders work together to investigate the etiology and natural history of pancreatic cancer.

  3. Exploring the genetic basis of stroke. Spanish stroke genetics consortium.

    PubMed

    Giralt-Steinhauer, E; Jiménez-Conde, J; Soriano Tárraga, C; Mola, M; Rodríguez-Campello, A; Cuadrado-Godia, E; Ois, A; Fernández-Cádenas, I; Carrera, C; Montaner, J; Díaz Navarro, R M; Vives-Bauzá, C; Roquer, J

    2014-01-01

    This article provides an overview of stroke genetics studies ranging from the candidate gene approach to more recent genome-wide association studies. It highlights the complexity of stroke owing to its different aetiopathogenic mechanisms, the difficulties in studying its genetic component, and the solutions provided to date. The study emphasises the importance of cooperation between the different centres, whether this takes place occasionally or through the creation of lasting consortiums. This strategy is currently essential to the completion of high-quality scientific studies that allow researchers to gain a better knowledge of the genetic component of stroke as it relates to aetiology, treatment, and prevention. PMID:23831412

  4. Midwest Superconductivity Consortium: 1995 Progress report

    SciTech Connect

    1996-01-01

    The mission of the Midwest Superconductivity Consortium, MISCON, is to advance the science and understanding of high Tc superconductivity. During the past year, 26 projects produced over 133 talks and 127 publications. Three Master's degrees and 9 Doctor of Philosophy degrees were granted to students working on MISCON projects. Group activities and interactions involved 2 MISCON group meetings (held in January and July); the third MISCON Summer School held in July; 12 external speakers; 81 collaborations (with universities, industry, Federal laboratories, and foreign research centers); and 54 exchanges of samples and/or measurements. Research achievements this past year focused on understanding the effects of processing phenomena on structure-property interrelationships and the fundamental nature of transport properties in high-temperature superconductors.

  5. Midwest Superconductivity Consortium. Progress report, 1992

    SciTech Connect

    Bement, A.L. Jr.

    1993-01-01

    The mission of the Midwest Superconductivity Consortium, MISCON, is to advance the science and understanding of high Tc superconductivity. Programmatic research focuses upon key materials-related problems; principally, synthesis and processing and properties limiting transport phenomena. During the past year, 26 projects produced over 133 talks and 113 publications. Two Master's degrees and one Ph.D. were granted to students working on MISCON projects. Group activities and interactions involved two MISCON group meetings (held in July and January), twenty external speakers, 36 collaborations, 10 exchanges of samples and/or measurements, and one gift of equipment from industry. Research achievements this past year expanded our understanding of the effects of processing phenomena on structure-property interrelationships and the fundamental nature of transport properties in high-temperature superconductors.

  6. Outside the coding genome, mammalian microRNAs confer structural and functional complexity

    PubMed Central

    Olive, Virginie; Minella, Alex C.; He, Lin

    2015-01-01

    MicroRNAs (miRNAs) comprise a class of small, regulatory noncoding RNAs (ncRNAs) with pivotal roles in post-transcriptional gene regulation. Since their initial discovery in 1993, numerous miRNAs have been identified in mammalian genomes, many of which play important roles in diverse cellular processes in development and disease. These small ncRNAs regulate the expression of many protein-coding genes post-transcriptionally, thus adding a substantial complexity to the molecular networks underlying physiological development and disease. In part, this complexity arises from the distinct gene structures, the extensive genomic redundancy, and the complex regulation of the expression and biogenesis of miRNAs. These characteristics contribute to the functional robustness and versatility of miRNAs and provide important clues to the functional significance of these small ncRNAs. The unique structure and function of miRNAs will continue to inspire many to explore the vast noncoding genome and to elucidate the molecular basis for the functional complexity of mammalian genomes. PMID:25783159

  7. Genome structures and transcriptomes signify niche adaptation for the multiple-ion-tolerant extremophyte Schrenkiella parvula.

    PubMed

    Oh, Dong-Ha; Hong, Hyewon; Lee, Sang Yeol; Yun, Dae-Jin; Bohnert, Hans J; Dassanayake, Maheshi

    2014-04-01

    Schrenkiella parvula (formerly Thellungiella parvula), a close relative of Arabidopsis (Arabidopsis thaliana) and Brassica crop species, thrives on the shores of Lake Tuz, Turkey, where soils accumulate high concentrations of multiple-ion salts. Despite the stark differences in adaptations to extreme salt stresses, the genomes of S. parvula and Arabidopsis show extensive synteny. S. parvula completes its life cycle in the presence of Na⁺, K⁺, Mg²⁺, Li⁺, and borate at soil concentrations lethal to Arabidopsis. Genome structural variations, including tandem duplications and translocations of genes, interrupt the colinearity observed throughout the S. parvula and Arabidopsis genomes. Structural variations distinguish homologous gene pairs characterized by divergent promoter sequences and basal-level expression strengths. Comparative RNA sequencing reveals the enrichment of ion-transport functions among genes with higher expression in S. parvula, while pathogen defense-related genes show higher expression in Arabidopsis. Key stress-related ion transporter genes in S. parvula showed increased copy number, higher transcript dosage, and evidence for subfunctionalization. This extremophyte offers a framework to identify the requisite adjustments of genomic architecture and expression control for a set of genes found in most plants in a way to support distinct niche adaptation and lifestyles. PMID:24563282

  8. PAPILLOMAVIRUS GENOME STRUCTURE, EXPRESSION, AND POST-TRANSCRIPTIONAL REGULATION

    PubMed Central

    Zheng, Zhi-Ming; Baker, Carl C.

    2006-01-01

    Papillomaviruses are a group of small non-enveloped DNA tumor viruses whose infection usually causes benign epithelial lesions (warts). Certain types of HPVs, such as HPV-16, HPV-18, and HPV-31, have been recognized as causative agents of cervical cancer and anal cancer and their infections, which arise via sexual transmission, are associated with more than 95% of cervical cancer. Papillomaviruses infect keratinocytes in the basal layer of stratified squamous epithelia and replicate in the nucleus of infected keratinocytes in a differentiation-dependent manner. Viral gene expression in infected cells depends on cell differentiation and is tightly regulated at the transcriptional and post-transcriptional levels. A noteworthy feature of all papillomavirus transcripts is that they are transcribed as a bicistronic or polycistronic form containing two or more ORFs and are polyadenylated at either an early or late poly(A) site. In the past ten years, remarkable progress has been made in understanding how this complex viral gene expression is regulated at the level of transcription (such as via DNA methylation) and particularly post-transcription (including RNA splicing, polyadenylation, and translation). Current knowledge of papillomavirus mRNA structure and RNA processing has provided some clues on how to control viral oncogene expression. However, we still have little knowledge about which mRNAs are used to translate each viral protein. Continuing research on post-transcriptional regulation of papillomavirus infection will remain as a future focus to provide more insights into papillomavirus-host interactions, the virus life-cycle, and viral oncogenesis. PMID:16720315

  9. Protein surface analysis for function annotation in high-throughput structural genomics pipeline

    PubMed Central

    Binkowski, T. Andrew; Joachimiak, Andrzej; Liang, Jie

    2005-01-01

    Structural genomics (SG) initiatives are expanding the universe of protein fold space by rapidly determining structures of proteins that were intentionally selected on the basis of low sequence similarity to proteins of known structure. Often these proteins have no associated biochemical or cellular functions. The SG success has resulted in an accelerated deposition of novel structures. In some cases the structural bioinformatics analysis applied to these novel structures has provided specific functional assignment. However, this approach has also uncovered limitations in the functional analysis of uncharacterized proteins using traditional sequence and backbone structure methodologies. A novel method, named pvSOAR (pocket and void Surface of Amino Acid Residues), of comparing the protein surfaces of geometrically defined pockets and voids was developed. pvSOAR was able to detect previously unrecognized and novel functional relationships between surface features of proteins. In this study, pvSOAR is applied to several structural genomics proteins. We examined the surfaces of YecM, BioH, and RpiB from Escherichia coli as well as the CBS domains from inosine-5′-monophosphate dehydrogenase from Streptococcus pyogenes, conserved hypothetical protein Ta549 from Thermoplasma acidophilum, and CBS domain protein mt1622 from Methanobacterium thermoautotrophicum with the goal of inferring information about their biochemical function. PMID:16322579

  10. Analysis of interactions between the epigenome and structural mutability of the genome using Genboree workbench tools

    PubMed Central

    2014-01-01

    Background: Interactions between the epigenome and structural genomic variation are potentially bi-directional. In one direction, structural variants may cause epigenomic changes in cis. In the other direction, specific local epigenomic states such as DNA hypomethylation associate with local genomic instability. Methods: To study these interactions, we have developed several tools and exposed them to the scientific community using the Software-as-a-Service model via the Genboree Workbench. One key tool is Breakout, an algorithm for fast and accurate detection of structural variants from mate pair sequencing data. Results: By applying Breakout and other Genboree Workbench tools we map breakpoints in breast and prostate cancer cell lines and tumors, discriminate between polymorphic breakpoints of germline origin and those of somatic origin, and analyze both types of breakpoints in the context of the Human Epigenome Atlas, ENCODE databases, and other sources of epigenomic profiles. We confirm previous findings that genomic instability in human germline associates with hypomethylation of DNA, binding sites of Suz12, a key member of the PRC2 Polycomb complex, and with PRC2-associated histone marks H3K27me3 and H3K9me3. Breakpoints in germline and in breast cancer associate with distal regulatory of active gene transcription. Breast cancer cell lines and tumors show distinct patterns of structural mutability depending on their ER, PR, or HER2 status. Conclusions: The patterns of association that we detected suggest that cell-type specific epigenomes may determine cell-type specific patterns of selective structural mutability of the genome. PMID:25080362

  11. A Genome Wide Survey of SNP Variation Reveals the Genetic Structure of Sheep Breeds

    PubMed Central

    Kijas, James W.; Townley, David; Dalrymple, Brian P.; Heaton, Michael P.; Maddox, Jillian F.; McGrath, Annette; Wilson, Peter; Ingersoll, Roxann G.; McCulloch, Russell; McWilliam, Sean; Tang, Dave; McEwan, John; Cockett, Noelle; Oddy, V. Hutton; Nicholas, Frank W.; Raadsma, Herman

    2009-01-01

    The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identifying the first genome-wide set of SNP for sheep, we report on levels of genetic variability both within and between a diverse sample of ovine populations. Then, using cluster analysis and the partitioning of genetic variation, we demonstrate sheep are characterised by weak phylogeographic structure, overlapping genetic similarity and generally low differentiation which is consistent with their short evolutionary history. The degree of population substructure was, however, sufficient to cluster individuals based on geographic origin and known breed history. Specifically, African and Asian populations clustered separately from breeds of European origin sampled from Australia, New Zealand, Europe and North America. Furthermore, we demonstrate the presence of stratification within some, but not all, ovine breeds. The results emphasize that careful documentation of genetic structure will be an essential prerequisite when mapping the genetic basis of complex traits. Furthermore, the identification of a subset of SNP able to assign individuals into broad groupings demonstrates even a small panel of markers may be suitable for applications such as traceability. PMID:19270757

  12. Adaptive potential of genomic structural variation in human and mammalian evolution.

    PubMed

    Radke, David W; Lee, Charles

    2015-09-01

    Because phenotypic innovations must be genetically heritable for biological evolution to proceed, it is natural to consider new mutation events as well as standing genetic variation as sources for their birth. Previous research has identified a number of single-nucleotide polymorphisms that underlie a subset of adaptive traits in organisms. However, another well-known class of variation, genomic structural variation, could have even greater potential to produce adaptive phenotypes, due to the variety of possible types of alterations (deletions, insertions, duplications, among others) at different genomic positions and with variable lengths. It is from these dramatic genomic alterations, and selection on their phenotypic consequences, that adaptations leading to biological diversification could be derived. In this review, using studies in humans and other mammals, we highlight examples of how phenotypic variation from structural variants might become adaptive in populations and potentially enable biological diversification. Phenotypic change arising from structural variants will be described according to their immediate effect on organismal metabolic processes, immunological response and physical features. Study of population dynamics of segregating structural variation can therefore provide a window into understanding current and historical biological diversification. PMID:26003631

  13. The Structure of Human Parechovirus 1 Reveals an Association of the RNA Genome with the Capsid

    PubMed Central

    Kalynych, Sergei; Pálková, Lenka

    2015-01-01

    Parechoviruses are human pathogens that cause diseases ranging from gastrointestinal disorders to encephalitis. Unlike those of most picornaviruses, parechovirus capsids are composed of only three subunits: VP0, VP1, and VP3. Here, we present the structure of a human parechovirus 1 (HPeV-1) virion determined to a resolution of 3.1 Å. We found that interactions among pentamers in the HPeV-1 capsid are mediated by the N termini of VP0s, which correspond to the capsid protein VP4 and the N-terminal part of the capsid protein VP2 of other picornaviruses. In order to facilitate delivery of the virus genome into the cytoplasm, the N termini of VP0s have to be released from contacts between pentamers and exposed at the particle surface, resulting in capsid disruption. A hydrophobic pocket, which can be targeted by capsid-binding antiviral compounds in many other picornaviruses, is not present in HPeV-1. However, we found that interactions between the HPeV-1 single-stranded RNA genome and subunits VP1 and VP3 in the virion impose a partial icosahedral ordering on the genome. The residues involved in RNA binding are conserved among all parechoviruses, suggesting a putative role of the genome in virion stability or assembly. Therefore, putative small molecules that could disrupt HPeV RNA-capsid protein interactions could be developed into antiviral inhibitors. Importance: Human parechoviruses (HPeVs) are pathogens that cause diseases ranging from respiratory and gastrointestinal disorders to encephalitis. Recently, there have been outbreaks of HPeV infections in Western Europe and North America. We present the first atomic structure of parechovirus HPeV-1 determined by X-ray crystallography. The structure explains why HPeVs cannot be targeted by antiviral compounds that are effective against other picornaviruses. Furthermore, we found that the interactions of the HPeV-1 genome with the capsid resulted in a partial icosahedral ordering of the genome. The residues

  14. Structure and genome release of Twort-like Myoviridae phage with a double-layered baseplate.

    PubMed

    Nováček, Jiří; Šiborová, Marta; Benešík, Martin; Pantůček, Roman; Doškař, Jiří; Plevka, Pavel

    2016-08-16

    Bacteriophages from the family Myoviridae use double-layered contractile tails to infect bacteria. Contraction of the tail sheath enables the tail tube to penetrate through the bacterial cell wall and serve as a channel for the transport of the phage genome into the cytoplasm. However, the mechanisms controlling the tail contraction and genome release of phages with "double-layered" baseplates were unknown. We used cryo-electron microscopy to show that the binding of the Twort-like phage phi812 to the Staphylococcus aureus cell wall requires a 210° rotation of the heterohexameric receptor-binding and tripod protein complexes within its baseplate about an axis perpendicular to the sixfold axis of the tail. This rotation reorients the receptor-binding proteins to point away from the phage head, and also results in disruption of the interaction of the tripod proteins with the tail sheath, hence triggering its contraction. However, the tail sheath contraction of Myoviridae phages is not sufficient to induce genome ejection. We show that the end of the phi812 double-stranded DNA genome is bound to one protein subunit from a connector complex that also forms an interface between the phage head and tail. The tail sheath contraction induces conformational changes of the neck and connector that result in disruption of the DNA binding. The genome penetrates into the neck, but is stopped at a bottleneck before the tail tube. A subsequent structural change of the tail tube induced by its interaction with the S. aureus cell is required for the genome's release. PMID:27469164

  15. Long-Range Correlations in Genomic DNA: A Signature of the Nucleosomal Structure

    NASA Astrophysics Data System (ADS)

    Audit, B.; Thermes, C.; Vaillant, C.; D'Aubenton-Carafa, Y.; Muzy, J. F.; Arneodo, A.

    2001-03-01

    We use the "wavelet transform microscope" to carry out a comparative statistical analysis of DNA bending profiles and of the corresponding DNA texts. In all three kingdoms, both signals reveal a characteristic scale of 100-200 bp that separates two different regimes of power-law correlations (PLC). In the small-scale regime, PLC are observed in eukaryotic, in double-strand DNA viral, and in archaeal genomes, which contrasts with their total absence in the genomes of eubacteria and their viruses. This strongly suggests that small-scale PLC are related to the mechanisms underlying the wrapping of DNA in the nucleosomal structure. We further speculate that the large-scale PLC are the signature of the higher-order structure and dynamics of chromatin.

  16. Universal Internucleotide Statistics in Full Genomes: A Footprint of the DNA Structure and Packaging?

    PubMed Central

    Bogachev, Mikhail I.; Kayumov, Airat R.; Bunde, Armin

    2014-01-01

    Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms from Archaea and Bacteria to H. sapiens, aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleotide interval distributions exhibit the same q-exponential form. While in prokaryotes a single q-exponential function makes the best fit, in eukaryotes the PDF contains additionally a second q-exponential, which in the human genome makes a perfect approximation over nearly 10 decades. We suggest that this functional form is a footprint of the heterogeneous DNA structure, where the first q-exponential reflects the universal helical pitch that appears both in pro- and eukaryotic DNA, while the second q-exponential is a specific marker of the large-scale eukaryotic DNA organization. PMID:25438044
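
    The interval statistic analyzed in this abstract can be illustrated with a short sketch. The Python below is not the authors' code. It collects the distances between consecutive occurrences of one nucleotide and evaluates the standard (Tsallis) q-exponential, [1 - (1 - q)*beta*x]^(1/(1 - q)), which reduces to exp(-beta*x) as q approaches 1. The function names, the parameters q and beta, and the toy sequence are assumptions made for illustration; the abstract does not give the exact parametrization or fitting procedure.

```python
import math
from collections import Counter


def internucleotide_intervals(sequence, nucleotide):
    """Distances between consecutive occurrences of `nucleotide` in `sequence`."""
    target = nucleotide.upper()
    positions = [i for i, base in enumerate(sequence.upper()) if base == target]
    return [b - a for a, b in zip(positions, positions[1:])]


def q_exponential(x, q, beta):
    """Unnormalised q-exponential decay (hypothetical parametrization)."""
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1 recovers the ordinary exponential.
        return math.exp(-beta * x)
    base = 1.0 - (1.0 - q) * beta * x
    # Outside its support the q-exponential is defined as zero.
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0


# Toy sequence, not real genomic data.
intervals = internucleotide_intervals("ACGTAACA", "A")
histogram = Counter(intervals)  # empirical interval counts, the input to any fit
```

    On a complete genome the histogram would be normalised to a probability density before comparing it with one or two q-exponential terms.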

  17. Global MLST of Salmonella Typhi Revisited in Post-genomic Era: Genetic Conservation, Population Structure, and Comparative Genomics of Rare Sequence Types

    PubMed Central

    Yap, Kien-Pong; Ho, Wing S.; Gan, Han M.; Chai, Lay C.; Thong, Kwai L.

    2016-01-01

    Typhoid fever, caused by Salmonella enterica serovar Typhi, remains an important public health burden in Southeast Asia and other endemic countries. Various genotyping methods have been applied to study the genetic variations of this human-restricted pathogen. Multilocus sequence typing (MLST) is one of the widely accepted methods, and recently there has been growing interest in the re-application of MLST in the post-genomic era. In this study, we provide the global MLST distribution of S. Typhi utilizing 1,826 publicly available S. Typhi genome sequences in addition to performing conventional MLST on S. Typhi strains isolated from various endemic regions spanning over a century. Our global MLST analysis confirms the predominance of two sequence types (ST1 and ST2) co-existing in the endemic regions. Interestingly, S. Typhi strains with ST8 are currently confined within the African continent. Comparative genomic analyses of ST8 and other rare STs with genomes of ST1/ST2 revealed unique mutations in important virulence genes such as flhB, sipC, and tviD that may explain the variations that differentiate between seemingly successful (widespread) and unsuccessful (poor dissemination) S. Typhi populations. Large scale whole-genome phylogeny demonstrated evidence of phylogeographical structuring and showed that ST8 may have diverged from the earlier ancestral population of ST1 and ST2, which later lost some of its fitness advantages, leading to poor worldwide dissemination. In response to the unprecedented increase in genomic data, this study demonstrates and highlights the utility of large-scale genome-based MLST as a quick and effective approach to narrow the scope of in-depth comparative genomic analysis and consequently provide new insights into the fine scale of pathogen evolution and population structure. PMID:26973639
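
    The core step of MLST, mapping a strain's allele numbers at a fixed set of loci to a sequence type, can be sketched as a table lookup. The Python below is an illustration and not the pipeline used in the study. The locus names are assumed to be those of the seven-gene Salmonella scheme, and the allele profiles and ST labels are invented placeholders; real profiles come from a curated MLST database.

```python
# Assumed locus order for a seven-gene scheme (treat the names as an assumption).
LOCI = ("aroC", "dnaN", "hemD", "hisD", "purE", "sucA", "thrA")

# Hypothetical allele profiles; the numbers and labels are NOT real database entries.
PROFILES = {
    (1, 1, 1, 1, 1, 1, 1): "ST-example-A",
    (1, 1, 2, 1, 1, 1, 1): "ST-example-B",
}


def assign_sequence_type(alleles, profiles=PROFILES):
    """Return the sequence type for a dict of locus -> allele number.

    A profile absent from the table is reported as 'novel', which in practice
    would trigger submission to the database curators.
    """
    missing = [locus for locus in LOCI if locus not in alleles]
    if missing:
        raise KeyError(f"missing loci: {', '.join(missing)}")
    key = tuple(alleles[locus] for locus in LOCI)
    return profiles.get(key, "novel")


strain = dict(zip(LOCI, (1, 1, 2, 1, 1, 1, 1)))
sequence_type = assign_sequence_type(strain)
```

    With whole-genome data, the allele numbers would first be obtained by matching each locus sequence against the database's allele definitions; the lookup itself stays the same.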

  18. Genome structure of a Saccharomyces cerevisiae strain widely used in bioethanol production

    PubMed Central

    Argueso, Juan Lucas; Carazzolle, Marcelo F.; Mieczkowski, Piotr A.; Duarte, Fabiana M.; Netto, Osmar V.C.; Missawa, Silvia K.; Galzerani, Felipe; Costa, Gustavo G.L.; Vidal, Ramon O.; Noronha, Melline F.; Dominska, Margaret; Andrietta, Maria G.S.; Andrietta, Sílvio R.; Cunha, Anderson F.; Gomes, Luiz H.; Tavares, Flavio C.A.; Alcarde, André R.; Dietrich, Fred S.; McCusker, John H.; Petes, Thomas D.; Pereira, Gonçalo A.G.

    2009-01-01

    Bioethanol is a biofuel produced mainly from the fermentation of carbohydrates derived from agricultural feedstocks by the yeast Saccharomyces cerevisiae. One of the most widely adopted strains is PE-2, a heterothallic diploid naturally adapted to the sugar cane fermentation process used in Brazil. Here we report the molecular genetic analysis of a PE-2 derived diploid (JAY270), and the complete genome sequence of a haploid derivative (JAY291). The JAY270 genome is highly heterozygous (∼2 SNPs/kb) and has several structural polymorphisms between homologous chromosomes. These chromosomal rearrangements are confined to the peripheral regions of the chromosomes, with breakpoints within repetitive DNA sequences. Despite its complex karyotype, this diploid, when sporulated, had a high frequency of viable spores. Hybrid diploids formed by outcrossing with the laboratory strain S288c also displayed good spore viability. Thus, the rearrangements that exist near the ends of chromosomes do not impair meiosis, as they do not span regions that contain essential genes. This observation is consistent with a model in which the peripheral regions of chromosomes represent plastic domains of the genome that are free to recombine ectopically and experiment with alternative structures. We also explored features of the JAY270 and JAY291 genomes that help explain their high adaptation to industrial environments, exhibiting desirable phenotypes such as high ethanol and cell mass production and high temperature and oxidative stress tolerance. The genomic manipulation of such strains could enable the creation of a new generation of industrial organisms, ideally suited for use as delivery vehicles for future bioenergy technologies. PMID:19812109

  19. Whole genome comparison between table and wine grapes reveals a comprehensive catalog of structural variants

    PubMed Central

    2014-01-01

    Background: Grapevine (Vitis vinifera L.) is the most important Mediterranean fruit crop, used to produce both wine and spirits as well as table grapes and raisins. Wine and table grape cultivars represent two divergent germplasm pools with different origins and domestication history, as well as differential characteristics for berry size, cluster architecture and berry chemical profile, among others. ‘Sultanina’ plays a pivotal role in modern table grape breeding, providing the main source of seedlessness. This cultivar is also one of the most planted for fresh consumption and raisin production. Given its importance, we sequenced it and implemented a novel strategy for the de novo assembly of its highly heterozygous genome. Results: Our approach produced a draft genome of 466 Mb, recovering 82% of the genes present in the grapevine reference genome; in addition, we identified 240 novel genes. A large number of structural variants and SNPs were identified. Among them, 45 (21 SNPs and 24 INDELs) were experimentally confirmed in ‘Sultanina’ and six SNPs in 23 other table grape varieties. Transposable elements corresponded to ca. 80% of the repetitive sequences involved in structural variants and more than 2,000 genes were affected in their structure by these variants. Some of these genes are likely involved in embryo development, suggesting that they may contribute to seedlessness, a key trait for table grapes. Conclusions: This work produced the first structural variants and SNPs catalog for grapevine, constituting a novel and very powerful tool for genomic studies in this key fruit crop, particularly useful to support marker assisted breeding in table grapes. PMID:24397443

  20. A Structural Model of the Genome Packaging Process in a Membrane-Containing Double Stranded DNA Virus

    PubMed Central

    Hong, Chuan; Oksanen, Hanna M.; Liu, Xiangan; Jakana, Joanita; Bamford, Dennis H.; Chiu, Wah

    2014-01-01

    Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds) DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM) and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimer of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after the P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion. PMID:25514469

  1. The complete mitochondrial genome sequence of the tubeworm Lamellibrachia satsuma and structural conservation in the mitochondrial genome control regions of Order Sabellida.

    PubMed

    Patra, Ajit Kumar; Kwon, Yong Min; Kang, Sung Gyun; Fujiwara, Yoshihiro; Kim, Sang-Jin

    2016-04-01

    The control region of the mitochondrial genomes shows high variation in conserved sequence organizations, which follow distinct evolutionary patterns in different species or taxa. In this study, we sequenced the complete mitochondrial genome of Lamellibrachia satsuma from the cold-seep region of Kagoshima Bay as part of a whole-genome study, and extensively studied the structural features and patterns of the control region sequences. We obtained 15,037 bp of mitochondrial genome using Illumina sequencing and identified the non-coding AT-rich region or control region (354 bp, AT=83.9%) located between trnH and trnR. We found 7 conserved sequence blocks (CSB), scattered throughout the control region of L. satsuma and other taxa of Annelida. The poly-TA stretches, which commonly form the stem of multiple stem-loop structures, are most conserved in the CSB-I and CSB-II regions. The mitochondrial genome of L. satsuma encodes a unique repetitive sequence in the control region, which forms a unique secondary structure in comparison to Lamellibrachia luymesi. Phylogenetic analyses of all protein-coding genes indicate that L. satsuma forms a monophyletic clade with L. luymesi along with other tubeworms found in cold-seep regions (genera: Lamellibrachia, Escarpia, and Seepiophila). In general, the control region sequences of Annelida could be aligned with certainty within each genus, and to some extent within the family, but with a higher rate of variation in conserved regions. PMID:26776396
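
    The AT percentage quoted for the control region is a simple base-composition ratio. The Python sketch below shows the calculation; the function name is an assumption, and the test string is synthetic (297 A/T bases out of 354, chosen only so that it matches the reported length and rounds to the reported 83.9%), not the actual L. satsuma sequence.

```python
def at_content(sequence):
    """Percentage of A and T bases in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    at_bases = seq.count("A") + seq.count("T")
    return 100.0 * at_bases / len(seq)


# Synthetic 354-base string standing in for a control region.
synthetic_region = "A" * 150 + "T" * 147 + "G" * 30 + "C" * 27
percent_at = round(at_content(synthetic_region), 1)
```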

  2. Infer Metagenomic Abundance and Reveal Homologous Genomes Based on the Structure of Taxonomy Tree.

    PubMed

    Qiu, Yu-Qing; Tian, Xue; Zhang, Shihua

    2015-01-01

    Metagenomic research uses sequencing technologies to investigate the genetic biodiversity of microbiomes present in various ecosystems or animal tissues. The composition of a microbial community is highly associated with the environment in which the organisms exist. As large numbers of short sequencing reads from microorganism genomes are obtained, accurately estimating the abundance of microorganisms within a metagenomic sample is becoming an increasing challenge in bioinformatics. In this paper, we describe a hierarchical taxonomy tree-based mixture model (HTTMM) for estimating the abundance of taxa within a microbial community by incorporating the structure of the taxonomy tree. In this model, genome-specific short reads and homologous short reads among genomes can be distinguished and represented by leaf and intermediate nodes in the taxonomy tree, respectively. We adopt an expectation-maximization algorithm to solve this model. Using simulated and real-world data, we demonstrate that the proposed method is superior to both flat mixture model and lowest common ancestry-based methods. Moreover, this model can reveal previously unaddressed homologous genomes. PMID:26451823
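
    To make the expectation-maximization step concrete, here is a minimal sketch of EM for the flat mixture model that the abstract uses as a baseline. It is not the authors' HTTMM, which additionally uses the taxonomy tree. The function name `em_abundance` and the likelihood matrix are illustrative assumptions, and every read is assumed to have a nonzero likelihood for at least one genome.

```python
import numpy as np


def em_abundance(read_likelihoods, n_iter=200, tol=1e-10):
    """Estimate genome abundances (mixing proportions) with EM.

    read_likelihoods[i, j] is a hypothetical likelihood of read i
    given genome j (e.g. 1 if the read aligns to j, else 0).
    """
    lik = np.asarray(read_likelihoods, dtype=float)
    n_genomes = lik.shape[1]
    pi = np.full(n_genomes, 1.0 / n_genomes)  # start from uniform abundances
    for _ in range(n_iter):
        # E-step: posterior probability that each read came from each genome.
        weighted = lik * pi
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: new abundance is the average responsibility per genome.
        new_pi = resp.mean(axis=0)
        converged = np.abs(new_pi - pi).max() < tol
        pi = new_pi
        if converged:
            break
    return pi


if __name__ == "__main__":
    # Two genome-specific reads for genome 0, one for genome 1,
    # and one homologous read that matches both genomes.
    print(em_abundance([[1, 0], [1, 0], [0, 1], [1, 1]]))
```

    In this toy case the shared read is split in proportion to the current abundance estimates, so the estimate settles near two thirds for the first genome and one third for the second.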

  3. Transposon Insertions, Structural Variations, and SNPs Contribute to the Evolution of the Melon Genome.

    PubMed

    Sanseverino, Walter; Hénaff, Elizabeth; Vives, Cristina; Pinosio, Sara; Burgos-Paz, William; Morgante, Michele; Ramos-Onsins, Sebastián E; Garcia-Mas, Jordi; Casacuberta, Josep Maria

    2015-10-01

    The availability of extensive databases of crop genome sequences should allow analysis of crop variability at an unprecedented scale, which should have an important impact in plant breeding. However, up to now the analysis of genetic variability at the whole-genome scale has been mainly restricted to single nucleotide polymorphisms (SNPs). This is a strong limitation as structural variation (SV) and transposon insertion polymorphisms are frequent in plant species and have had an important mutational role in crop domestication and breeding. Here, we present the first comprehensive analysis of melon genetic diversity, which includes a detailed analysis of SNPs, SV, and transposon insertion polymorphisms. The variability found among seven melon varieties, representing the species diversity and including wild accessions and highly bred lines, is relatively high due in part to the marked divergence of some lineages. The diversity is distributed nonuniformly across the genome, being lower at the extremes of the chromosomes and higher in the pericentromeric regions, which is compatible with the effect of purifying selection and recombination forces over functional regions. Additionally, this variability is greatly reduced among elite varieties, probably due to selection during breeding. We have found some chromosomal regions showing a high differentiation of the elite varieties versus the rest, which could be considered as strongly selected candidate regions. Our data also suggest that transposons and SV may be at the origin of an important fraction of the variability in melon, which highlights the importance of analyzing all types of genetic variability to understand crop genome evolution. PMID:26174143

  4. Genomic Structure of an Economically Important Cyanobacterium, Arthrospira (Spirulina) platensis NIES-39

    PubMed Central

    Fujisawa, Takatomo; Narikawa, Rei; Okamoto, Shinobu; Ehira, Shigeki; Yoshimura, Hidehisa; Suzuki, Iwane; Masuda, Tatsuru; Mochimaru, Mari; Takaichi, Shinichi; Awai, Koichiro; Sekine, Mitsuo; Horikawa, Hiroshi; Yashiro, Isao; Omata, Seiha; Takarada, Hiromi; Katano, Yoko; Kosugi, Hiroki; Tanikawa, Satoshi; Ohmori, Kazuko; Sato, Naoki; Ikeuchi, Masahiko; Fujita, Nobuyuki; Ohmori, Masayuki

    2010-01-01

    A filamentous non-N2-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. Almost the complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total 612 kb of the genome comprise group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were determined. Unique features in the gene composition were noted, particularly in a large number of genes for adenylate cyclase and haemolysin-like Ca2+-binding proteins and in chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis. PMID:20203057

  5. Tri-District Arts Consortium Summer Program.

    ERIC Educational Resources Information Center

    Kirby, Charlotte O.

    1990-01-01

    The Tri-District Arts Consortium in South Carolina was formed to serve artistically gifted students in grades six-nine. The consortium developed a summer program offering music, dance, theatre, and visual arts instruction through a curriculum of intense training, performing, and hands-on experiences with faculty members and guest artists. (JDD)

  6. The Salix Consortium in New York

    SciTech Connect

    Wulf, T.; Jones, J.

    1998-09-28

    Energy crops for electrical production are being given a boost by the Salix Consortium, an association of 20 corporations and industrial, government, farming, and research organizations. The consortium supports commercial development of willows for generating electricity, which are being grown for utilities across the Northeast region of the U.S. for use in cofiring with coal in existing power plants.

  7. Increasing Sales by Developing Production Consortiums.

    ERIC Educational Resources Information Center

    Smith, Christopher A.; Russo, Robert

    Intended to help rehabilitation facility administrators increase organizational income from manufacturing and/or contracted service sources, this document provides a decision-making model for the development of a production consortium. The document consists of five chapters and two appendices. Chapter 1 defines the consortium concept, explains…

  8. Structure, evolution, and comparative genomics of tetraploid cotton based on a high-density genetic linkage map

    PubMed Central

    Li, Ximei; Jin, Xin; Wang, Hantao; Zhang, Xianlong; Lin, Zhongxu

    2016-01-01

    A high-density linkage map was constructed using 1,885 newly obtained loci and 3,747 previously published loci, which included 5,152 loci with 4696.03 cM in total length and 0.91 cM in mean distance. Homology analysis in the cotton genome further confirmed the 13 expected homologous chromosome pairs and revealed an obvious inversion on Chr10 or Chr20 and repeated inversions on Chr07 or Chr16. In addition, two reciprocal translocations between Chr02 and Chr03 and between Chr04 and Chr05 were confirmed. Comparative genomics between the tetraploid cotton and the diploid cottons showed that no major structural changes exist between DT and D chromosomes but rather between AT and A chromosomes. Blast analysis between the tetraploid cotton genome and the mixed genome of two diploid cottons showed that most AD chromosomes, regardless of whether it is from the AT or DT genome, preferentially matched with the corresponding homologous chromosome in the diploid A genome, and then the corresponding homologous chromosome in the diploid D genome, indicating that the diploid D genome underwent converted evolution by the diploid A genome to form the DT genome during polyploidization. In addition, the results reflected that a series of chromosomal translocations occurred among Chr01/Chr15, Chr02/Chr14, Chr03/Chr17, Chr04/Chr22, and Chr05/Chr19. PMID:27084896

  9. Combining Functional and Structural Genomics to Sample the Essential Burkholderia Structome

    PubMed Central

    Baugh, Loren; Gallagher, Larry A.; Patrapuvich, Rapatbhorn; Clifton, Matthew C.; Gardberg, Anna S.; Edwards, Thomas E.; Armour, Brianna; Begley, Darren W.; Dieterich, Shellie H.; Dranow, David M.; Abendroth, Jan; Fairman, James W.; Fox, David; Staker, Bart L.; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W.; Stacy, Robin; Myler, Peter J.; Stewart, Lance J.; Manoil, Colin; Van Voorhis, Wesley C.

    2013-01-01

    Background: The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. Methodology/Principal Findings: We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an “ortholog rescue” strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. Conclusions/Significance: This collection of structures, solubility and experimental essentiality data

  10. Genomic organization of the crested ibis MHC provides new insight into ancestral avian MHC structure

    PubMed Central

    Chen, Li-Cheng; Lan, Hong; Sun, Li; Deng, Yan-Li; Tang, Ke-Yi; Wan, Qiu-Hong

    2015-01-01

    The major histocompatibility complex (MHC) plays an important role in immune response. Avian MHCs are not well characterized, only reporting highly compact Galliformes MHCs and extensively fragmented zebra finch MHC. We report the first genomic structure of an endangered Pelecaniformes (crested ibis) MHC containing 54 genes in three regions spanning ~500 kb. In contrast to the loose BG (26 loci within 265 kb) and Class I (11 within 150) genomic structures, the Core Region is condensed (17 within 85). Furthermore, this Region exhibits a COL11A2 gene, followed by four tandem MHC class II αβ dyads retaining two suites of anciently duplicated “αβ” lineages. Thus, the crested ibis MHC structure is entirely different from the known avian MHC architectures but similar to that of mammalian MHCs, suggesting that the fundamental structure of ancestral avian class II MHCs should be “COL11A2-IIαβ1-IIαβ2.” The gene structures, residue characteristics, and expression levels of the five class I genes reveal inter-locus functional divergence. However, phylogenetic analysis indicates that these five genes generate a well-supported intra-species clade, showing evidence for recent duplications. Our analyses suggest dramatic structural variation among avian MHC lineages, help elucidate avian MHC evolution, and provide a foundation for future conservation studies. PMID:25608659

  11. In silico prediction and screening of modular crystal structures via a high-throughput genomic approach

    PubMed Central

    Li, Yi; Li, Xu; Liu, Jiancong; Duan, Fangzheng; Yu, Jihong

    2015-01-01

    High-throughput computational methods capable of predicting, evaluating and identifying promising synthetic candidates with desired properties are highly appealing to today's scientists. Despite some successes, in silico design of crystalline materials with complex three-dimensionally extended structures remains challenging. Here we demonstrate the application of a new genomic approach to ABC-6 zeolites, a family of industrially important catalysts whose structures are built from the stacking of modular six-ring layers. The sequences of layer stacking, which we deem the genes of this family, determine the structures and the properties of ABC-6 zeolites. By enumerating these gene-like stacking sequences, we have identified 1,127 most realizable new ABC-6 structures out of 78 groups of 84,292 theoretical ones, and experimentally realized 2 of them. Our genomic approach can extract crucial structural information directly from these gene-like stacking sequences, enabling high-throughput identification of synthetic targets with desired properties among a large number of candidate structures. PMID:26395233

  12. Joint modeling of RNase footprint sequencing profiles for genome-wide inference of RNA structure.

    PubMed

    Zou, Chenchen; Ouyang, Zhengqing

    2015-10-30

    Recent studies have revealed significant roles of RNA structure in almost every step of RNA processing, including transcription, splicing, transport and translation. RNase footprint sequencing (RNase-seq) has emerged to dissect RNA structures at the genome scale. However, it remains challenging to analyze RNase-seq data because of the issues of signal sparsity, variability and correlations among various RNases. We present a probabilistic framework, joint Poisson-gamma mixture (JPGM), for integrative modeling of multiple RNase-seq profiles. Combining JPGM with hidden Markov model allows genome-wide inference of RNA structures. We apply the joint modeling approach for inferring base pairing states on simulated data sets and RNase-seq profiles of the double-strand specific RNase V1 and single-strand specific RNase S1 in yeast. We demonstrate that joint analysis of V1 and S1 profiles outputs interpretable RNA structure states, while approaches that analyze each profile separately do not. The joint modeling approach predicts the structure states of all nucleotides in 3196 transcripts of yeast without compromising accuracy, while the simple thresholding approach misses 43% of the nucleotides. Furthermore, the posterior probabilities outputted by our model are able to resolve the structural ambiguity of ≈300 000 nucleotides with overlapping V1 and S1 cleavage sites. Our model also generates RNA accessibilities, which are associated with three-dimensional conformations. PMID:26400167

  13. Modeling and small-angle neutron scattering spectra of chromatin supernucleosomal structures at genome scale

    NASA Astrophysics Data System (ADS)

    Ilatovskiy, Andrey V.; Lebedev, Dmitry V.; Filatov, Michael V.; Grigoriev, Mikhail; Petukhov, Michael G.; Isaev-Ivanov, Vladimir V.

    2011-11-01

    The eukaryotic genome is a highly compacted nucleoprotein complex organized in a hierarchical structure based on nucleosomes. The detailed organization of this structure remains unknown. In the present work we developed algorithms for geometry modeling of the supernucleosomal chromatin structure and for computing distance distribution functions and small-angle neutron scattering (SANS) spectra of the genome-scale (~10^6 nucleosomes) chromatin structure at residue resolution. Our physical nucleosome model was based on the mononucleosome crystal structure. A nucleosome was assumed to be rigid within a local coordinate system. Interface parameters between nucleosomes can be set for each nucleosome independently. Pair distance distributions were computed with Monte Carlo simulation. SANS spectra were calculated with Fourier transformation of the weighted distance distribution; the concentration of heavy water in solvent and the probability of H/D exchange were taken into account. Two main modes of supernucleosomal structure generation were used. In a free generation mode all interface parameters were chosen randomly, whereas nucleosome self-intersections were not allowed. The second generation mode (generation in volume) enabled spherical or cubical wall restrictions. It was shown that calculated SANS spectra for a number of our models were in general agreement with available experimental data.
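
    The two computational steps named in the abstract, Monte Carlo estimation of a pair distance distribution and its transformation into a scattering curve, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: all scatterers carry equal weight (no scattering-length, heavy-water or H/D-exchange weighting), and the function names and random coordinates are hypothetical.

```python
import numpy as np


def pair_distance_histogram(coords, n_pairs, bins, rng):
    """Monte Carlo estimate of the pair distance distribution P(r).

    Randomly samples pairs of points instead of enumerating all pairs.
    """
    coords = np.asarray(coords, dtype=float)
    i = rng.integers(0, len(coords), size=n_pairs)
    j = rng.integers(0, len(coords), size=n_pairs)
    keep = i != j  # discard self-pairs
    dist = np.linalg.norm(coords[i[keep]] - coords[j[keep]], axis=1)
    hist, edges = np.histogram(dist, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, hist / hist.sum()  # normalised so that sum(P) = 1


def scattering_intensity(q, r, p):
    """Debye-type transform: I(q) = sum_r P(r) * sin(q r) / (q r)."""
    qr = np.outer(np.asarray(q, dtype=float), r)
    # np.sinc(x) is sin(pi x) / (pi x), so sinc(qr / pi) equals sin(qr) / (qr)
    # and evaluates to 1 at qr = 0.
    return (np.sinc(qr / np.pi) * p).sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(500, 3))  # toy coordinates, not a chromatin model
    r, p = pair_distance_histogram(points, n_pairs=20000, bins=60, rng=rng)
    print(scattering_intensity([0.0, 0.5, 1.0, 2.0], r, p))
```

    With this normalisation the intensity at q = 0 equals 1, which gives a quick sanity check on the histogram.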

  14. In silico prediction and screening of modular crystal structures via a high-throughput genomic approach

    NASA Astrophysics Data System (ADS)

    Li, Yi; Li, Xu; Liu, Jiancong; Duan, Fangzheng; Yu, Jihong

    2015-09-01

    High-throughput computational methods capable of predicting, evaluating and identifying promising synthetic candidates with desired properties are highly appealing to today's scientists. Despite some successes, in silico design of crystalline materials with complex three-dimensionally extended structures remains challenging. Here we demonstrate the application of a new genomic approach to ABC-6 zeolites, a family of industrially important catalysts whose structures are built from the stacking of modular six-ring layers. The sequences of layer stacking, which we deem the genes of this family, determine the structures and the properties of ABC-6 zeolites. By enumerating these gene-like stacking sequences, we have identified 1,127 most realizable new ABC-6 structures out of 78 groups of 84,292 theoretical ones, and experimentally realized 2 of them. Our genomic approach can extract crucial structural information directly from these gene-like stacking sequences, enabling high-throughput identification of synthetic targets with desired properties among a large number of candidate structures.

  15. The AGTSR consortium: An update

    SciTech Connect

    Fant, D.B.; Golan, L.P.

    1995-12-31

    The Advanced Gas Turbine Systems Research program is a nationwide consortium dedicated to advancing land-based gas turbine systems for improving future power generation capability. It directly supports the technology-research arm of the ATS program and targets industry- defined research needs in the areas of combustion, heat transfer, materials, aerodynamics, controls, alternative fuels, and advanced cycles. It is organized to enhance U.S. competitiveness through close collaboration with universities, government, and industry at the R&D level. AGTSR is just finishing its third year of operation; it is scheduled to continue past the year 2000. This update reviews the AGTSR triad, which consists of university/industry R&D activities, technology transfer programs, and trial student programs.

  16. A 4-gigabase physical map unlocks the structure and evolution of the complex genome of Aegilops tauschii, the wheat D-genome progenitor

    PubMed Central

    Luo, Ming-Cheng; Gu, Yong Q.; You, Frank M.; Deal, Karin R.; Ma, Yaqin; Hu, Yuqin; Huo, Naxin; Wang, Yi; Wang, Jirui; Chen, Shiyong; Jorgensen, Chad M.; Zhang, Yong; McGuire, Patrick E.; Pasternak, Shiran; Stein, Joshua C.; Ware, Doreen; Kramer, Melissa; McCombie, W. Richard; Kianian, Shahryar F.; Martis, Mihaela M.; Mayer, Klaus F. X.; Sehgal, Sunish K.; Li, Wanlong; Gill, Bikram S.; Bevan, Michael W.; Šimková, Hana; Doležel, Jaroslav; Weining, Song; Lazo, Gerard R.; Anderson, Olin D.; Dvorak, Jan

    2013-01-01

    The current limitations in genome sequencing technology require the construction of physical maps for high-quality draft sequences of large plant genomes, such as that of Aegilops tauschii, the wheat D-genome progenitor. To construct a physical map of the Ae. tauschii genome, we fingerprinted 461,706 bacterial artificial chromosome clones, assembled contigs, designed a 10K Ae. tauschii Infinium SNP array, constructed a 7,185-marker genetic map, and anchored on the map contigs totaling 4.03 Gb. Using whole genome shotgun reads, we extended the SNP marker sequences and found 17,093 genes and gene fragments. We showed that collinearity of the Ae. tauschii genes with Brachypodium distachyon, rice, and sorghum decreased with phylogenetic distance and that structural genome evolution rates have been high across all investigated lineages in subfamily Pooideae, including that of Brachypodieae. We obtained additional information about the evolution of the seven Triticeae chromosomes from 12 ancestral chromosomes and uncovered a pattern of centromere inactivation accompanying nested chromosome insertions in grasses. We showed that the density of noncollinear genes along the Ae. tauschii chromosomes positively correlates with recombination rates, suggested a cause, and showed that new genes, exemplified by disease resistance genes, are preferentially located in high-recombination chromosome regions. PMID:23610408

  17. The SNP Consortium website: past, present and future.

    PubMed

    Thorisson, Gudmundur A; Stein, Lincoln D

    2003-01-01

    The SNP Consortium website (http://snp.cshl.org) has undergone many changes since its initial conception three years ago. The database back end has been changed from the venerable ACeDB to the more scalable MySQL engine. Users can access the data via gene or single nucleotide polymorphism (SNP) keyword searches and browse or dump SNP data to textfiles. A graphical genome browsing interface shows SNPs mapped onto the genome assembly in the context of externally available gene predictions and other features. SNP allele frequency and genotype data are available via FTP-download and on individual SNP report web pages. SNP linkage maps are available for download and for browsing in a comparative map viewer. All software components of the data coordinating center (DCC) website (http://snp.cshl.org) are open source. PMID:12519964

  18. Genomic islands of divergence and their consequences for the resolution of spatial structure in an exploited marine fish

    PubMed Central

    Bradbury, Ian R; Hubert, Sophie; Higgins, Brent; Bowman, Sharen; Borza, Tudor; Paterson, Ian G; Snelgrove, Paul V R; Morris, Corey J; Gregory, Robert S; Hardie, David; Hutchings, Jeffrey A; Ruzzante, Daniel E; Taggart, Christopher T; Bentzen, Paul

    2013-01-01

    As populations diverge, genomic regions associated with adaptation display elevated differentiation. These genomic islands of adaptive divergence can inform conservation efforts in exploited species, by refining the delineation of management units, and providing genomic tools for more precise and effective population monitoring and the successful assignment of individuals and products. We explored heterogeneity in genomic divergence and its impact on the resolution of spatial population structure in exploited populations of Atlantic cod, Gadus morhua, using genome wide expressed sequence derived single nucleotide polymorphisms in 466 individuals sampled across the range. Outlier tests identified elevated divergence at 5.2% of SNPs, consistent with directional selection in one-third of linkage groups. Genomic regions of elevated divergence ranged in size from a single position to several cM. Structuring at neutral loci was associated with geographic features, whereas outlier SNPs revealed genetic discontinuities in both the eastern and western Atlantic. This fine-scale geographic differentiation enhanced assignment to region of origin, and through the identification of adaptive diversity, fundamentally changes how these populations should be conserved. This work demonstrates the utility of genome scans for adaptive divergence in the delineation of stock structure, the traceability of individuals and products, and ultimately a role for population genomics in fisheries conservation. PMID:23745137

  19. Genomic islands of divergence and their consequences for the resolution of spatial structure in an exploited marine fish.

    PubMed

    Bradbury, Ian R; Hubert, Sophie; Higgins, Brent; Bowman, Sharen; Borza, Tudor; Paterson, Ian G; Snelgrove, Paul V R; Morris, Corey J; Gregory, Robert S; Hardie, David; Hutchings, Jeffrey A; Ruzzante, Daniel E; Taggart, Christopher T; Bentzen, Paul

    2013-04-01

    As populations diverge, genomic regions associated with adaptation display elevated differentiation. These genomic islands of adaptive divergence can inform conservation efforts in exploited species, by refining the delineation of management units, and providing genomic tools for more precise and effective population monitoring and the successful assignment of individuals and products. We explored heterogeneity in genomic divergence and its impact on the resolution of spatial population structure in exploited populations of Atlantic cod, Gadus morhua, using genome wide expressed sequence derived single nucleotide polymorphisms in 466 individuals sampled across the range. Outlier tests identified elevated divergence at 5.2% of SNPs, consistent with directional selection in one-third of linkage groups. Genomic regions of elevated divergence ranged in size from a single position to several cM. Structuring at neutral loci was associated with geographic features, whereas outlier SNPs revealed genetic discontinuities in both the eastern and western Atlantic. This fine-scale geographic differentiation enhanced assignment to region of origin, and through the identification of adaptive diversity, fundamentally changes how these populations should be conserved. This work demonstrates the utility of genome scans for adaptive divergence in the delineation of stock structure, the traceability of individuals and products, and ultimately a role for population genomics in fisheries conservation. PMID:23745137

  20. Population genomic structure and linkage disequilibrium analysis of South African goat breeds using genome-wide SNP data.

    PubMed

    Mdladla, K; Dzomba, E F; Huson, H J; Muchadeyi, F C

    2016-08-01

    The sustainability of goat farming in marginal areas of southern Africa depends on local breeds that are adapted to specific agro-ecological conditions. Unimproved non-descript goats are the main genetic resources used for the development of commercial meat-type breeds of South Africa. Little is known about genetic diversity and the genetics of adaptation of these indigenous goat populations. This study investigated the genetic diversity, population structure and breed relations, linkage disequilibrium, effective population size and persistence of gametic phase in goat populations of South Africa. Three locally developed meat-type breeds of the Boer (n = 33), Savanna (n = 31), Kalahari Red (n = 40), a feral breed of Tankwa (n = 25) and unimproved non-descript village ecotypes (n = 110) from four goat-producing provinces of the Eastern Cape, KwaZulu-Natal, Limpopo and North West were assessed using the Illumina Goat 50K SNP Bead Chip assay. The proportion of SNPs with minor allele frequencies >0.05 ranged from 84.22% in the Tankwa to 97.58% in the Xhosa ecotype, with a mean of 0.32 ± 0.13 across populations. Principal components analysis, admixture and pairwise FST identified Tankwa as a genetically distinct population and supported clustering of the populations according to their historical origins. Genome-wide FST identified 101 markers potentially under positive selection in the Tankwa. Average linkage disequilibrium was highest in the Tankwa (r^2 = 0.25 ± 0.26) and lowest in the village ecotypes (r^2 range = 0.09 ± 0.12 to 0.11 ± 0.14). We observed an effective population size of <150 for all populations 13 generations ago. The estimated correlations for all breed pairs were lower than 0.80 at marker distances >100 kb with the exception of those in Savanna and Tswana populations. This study highlights the high level of genetic diversity in South African indigenous goats as well as the utility of the genome-wide SNP marker panels in
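
    Two of the statistics reported above, minor allele frequency and the linkage disequilibrium measure r^2, can be illustrated with a short sketch. The record does not say which software or estimator the authors used, so this is an assumption: r^2 is computed here as the squared Pearson correlation of genotype dosages (0, 1 or 2 copies of one allele), a common approach for unphased SNP-chip data. The function names and toy genotypes are hypothetical.

```python
import numpy as np


def minor_allele_frequency(genotypes):
    """Minor allele frequency from genotype dosages coded 0, 1 or 2."""
    g = np.asarray(genotypes, dtype=float)
    p = g.mean() / 2.0  # frequency of the counted allele
    return min(p, 1.0 - p)


def genotype_r2(g1, g2):
    """Squared correlation of genotype dosages at two SNPs.

    Returns NaN if either SNP is monomorphic (zero variance).
    """
    r = np.corrcoef(np.asarray(g1, dtype=float), np.asarray(g2, dtype=float))[0, 1]
    return r * r


if __name__ == "__main__":
    # Toy genotypes for six animals at two SNPs, not real goat data.
    snp_a = [0, 1, 2, 1, 0, 2]
    snp_b = [0, 1, 2, 2, 0, 1]
    print(minor_allele_frequency(snp_a), genotype_r2(snp_a, snp_b))
```

    Averaging such pairwise r^2 values over SNP pairs within a distance window would give a per-population summary comparable in form to the averages quoted in the abstract.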

  1. Genome structure affects the rate of autosyndesis and allosyndesis in AABC, BBAC and CCAB Brassica interspecific hybrids.

    PubMed

    Mason, Annaliese S; Huteau, Virginie; Eber, Frédérique; Coriton, Olivier; Yan, Guijun; Nelson, Matthew N; Cowling, Wallace A; Chèvre, Anne-Marie

    2010-09-01

    Gene introgression into allopolyploid crop species from diploid or polyploid ancestors can be accomplished through homologous or homoeologous chromosome pairing during meiosis. We produced trigenomic Brassica interspecific hybrids (genome complements AABC, BBAC and CCAB) from the amphidiploid species Brassica napus (AACC), Brassica juncea (AABB) and Brassica carinata (BBCC) in order to test whether the structure of each genome affects frequencies of homologous and homoeologous (both allosyndetic and autosyndetic) pairing during meiosis. AABC hybrids produced from three genotypes of B. napus were included to assess the genetic control of homoeologous pairing. Multi-colour fluorescent in situ hybridisation was used to quantify homologous pairing (e.g. A-genome bivalents in AABC), allosyndetic associations (e.g. B-C in AABC) and autosyndetic associations (e.g. B-B in AABC) at meiosis. A high percentage of homologous chromosomes formed pairs (97.5-99.3%), although many pairs were also involved in autosyndetic and allosyndetic associations. Allosyndesis was observed most frequently as A-C genome associations (mean 4.0 per cell) and less frequently as A-B genome associations (0.8 per cell) and B-C genome associations (0.3 per cell). Autosyndesis occurred most frequently in the haploid A genome (0.75 A-A per cell) and least frequently in the haploid B genome (0.13 B-B per cell). The frequency of C-C autosyndesis was greater in BBAC hybrids (0.75 per cell) than in any other hybrid. The frequency of A-B, A-C and B-C allosyndesis was affected by the genomic structure of the trigenomic hybrids. Frequency of allosyndesis was also influenced by the genotype of the B. napus paternal parent for the three AABC (B. juncea × B. napus) hybrid types. Homoeologous pairing between the Brassica A, B and C genomes in interspecific hybrids may be influenced by complex interactions between genome structure and allelic composition. PMID:20571876

  2. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    PubMed

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) is largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signature as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogene through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  3. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma

    PubMed Central

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-01-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, the characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remain largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SV-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  4. Structure and genome release of Twort-like Myoviridae phage with a double-layered baseplate

    PubMed Central

    Nováček, Jiří; Šiborová, Marta; Benešík, Martin; Pantůček, Roman; Doškař, Jiří; Plevka, Pavel

    2016-01-01

    Bacteriophages from the family Myoviridae use double-layered contractile tails to infect bacteria. Contraction of the tail sheath enables the tail tube to penetrate through the bacterial cell wall and serve as a channel for the transport of the phage genome into the cytoplasm. However, the mechanisms controlling the tail contraction and genome release of phages with “double-layered” baseplates were unknown. We used cryo-electron microscopy to show that the binding of the Twort-like phage phi812 to the Staphylococcus aureus cell wall requires a 210° rotation of the heterohexameric receptor-binding and tripod protein complexes within its baseplate about an axis perpendicular to the sixfold axis of the tail. This rotation reorients the receptor-binding proteins to point away from the phage head, and also results in disruption of the interaction of the tripod proteins with the tail sheath, hence triggering its contraction. However, the tail sheath contraction of Myoviridae phages is not sufficient to induce genome ejection. We show that the end of the phi812 double-stranded DNA genome is bound to one protein subunit from a connector complex that also forms an interface between the phage head and tail. The tail sheath contraction induces conformational changes of the neck and connector that result in disruption of the DNA binding. The genome penetrates into the neck, but is stopped at a bottleneck before the tail tube. A subsequent structural change of the tail tube induced by its interaction with the S. aureus cell is required for the genome’s release. PMID:27469164

  5. Improving safety of aircraft engines: a consortium approach

    NASA Astrophysics Data System (ADS)

    Brasche, Lisa J. H.

    1996-11-01

    With over seven million departures per year, air transportation has become not a luxury, but a standard mode of transportation for the United States. A critical aspect of modern air transport is the jet engine, a complex engineered component that has enabled the rapid travel to which we have all become accustomed. One of the enabling technologies for safe air travel is nondestructive evaluation, or NDE, which includes various inspection techniques used to assess the health or integrity of a structure, component, or material. The Engine Titanium Consortium (ETC) was established in 1993 to respond to recommendations made by the Federal Aviation Administration (FAA) Titanium Rotating Components Review Team (TRCRT) for improvements in inspection of engine titanium. Several recent accomplishments of the ETC are detailed in this paper. The objective of the Engine Titanium Consortium is to provide the FAA and the manufacturers with reliable and cost-effective new methods and/or improvements in mature methods for detecting cracks, inclusions, and imperfections in titanium. The consortium consists of a team of researchers from academia and industry (namely, Iowa State University, Allied Signal Propulsion Engines, General Electric Aircraft Engines, and Pratt & Whitney Engines), who work together to develop program priorities, organize a program plan, conduct the research, and implement the solutions. The true advantage of the consortium approach is that it brings together the research talents of academia and the engineering talents of industry to tackle a technology-base problem. In bringing industrial competitors together, the consortium ensures that the research results, which have safety implications and result from FAA funds, are shared and become part of the public domain.

  6. High Resolution Genetic Mapping by Genome Sequencing Reveals Genome Duplication and Tetraploid Genetic Structure of the Diploid Miscanthus sinensis

    PubMed Central

    Ma, Xue-Feng; Jensen, Elaine; Alexandrov, Nickolai; Troukhan, Maxim; Zhang, Liping; Thomas-Jones, Sian; Farrar, Kerrie; Clifton-Brown, John; Donnison, Iain; Swaller, Timothy; Flavell, Richard

    2012-01-01

    We have created a high-resolution linkage map of Miscanthus sinensis, using genotyping-by-sequencing (GBS), identifying all 19 linkage groups for the first time. The result is technically significant since Miscanthus has a very large and highly heterozygous genome, but has little or no genomic information available to date. The composite linkage map containing markers from both parental linkage maps is composed of 3,745 SNP markers spanning 2,396 cM on 19 linkage groups with a 0.64 cM average resolution. Comparative genomics analyses of the M. sinensis composite linkage map to the genomes of sorghum, maize, rice, and Brachypodium distachyon indicate that sorghum has the closest syntenic relationship to Miscanthus compared to other species. The comparative results revealed that each pair of the 19 M. sinensis linkages aligned to one sorghum chromosome, except for LG8, which mapped to two sorghum chromosomes (4 and 7), presumably due to a chromosome fusion event after genome duplication. The data also revealed several other chromosome rearrangements relative to sorghum, including two telomere-centromere inversions of the sorghum syntenic chromosome 7 in LG8 of M. sinensis and two paracentric inversions of sorghum syntenic chromosome 4 in LG7 and LG8 of M. sinensis. The results clearly demonstrate, for the first time, that the diploid M. sinensis is of tetraploid origin, consisting of two sub-genomes. This complete and high resolution composite linkage map will not only serve as a useful resource for novel QTL discoveries, but also enable informed deployment of the wealth of existing genomics resources of other species to the improvement of Miscanthus as a high biomass energy crop. In addition, it has utility as a reference for genome sequence assembly for the forthcoming whole genome sequencing of the Miscanthus genus. PMID:22439001
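    The 0.64 cM average resolution quoted in this abstract can be checked from the other two figures it gives. The short sketch below is only an arithmetic illustration; the variable names are ours, and the abstract does not say whether the average was taken per marker or per interval between adjacent markers, so both are computed.

```python
# Figures quoted in the abstract above.
n_markers = 3745         # SNP markers on the composite linkage map
map_length_cm = 2396.0   # total map length in centimorgans (cM)
n_linkage_groups = 19    # each group has one fewer interval than markers

# Average spacing if the map length is divided by the number of markers.
spacing_per_marker = map_length_cm / n_markers

# Average spacing if divided by the number of intervals between
# adjacent markers (an assumption about how "resolution" is defined).
n_intervals = n_markers - n_linkage_groups
spacing_per_interval = map_length_cm / n_intervals

print(f"per marker:   {spacing_per_marker:.2f} cM")
print(f"per interval: {spacing_per_interval:.2f} cM")
```

    Under either reading the result rounds to the 0.64 cM reported.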

  7. High resolution genetic mapping by genome sequencing reveals genome duplication and tetraploid genetic structure of the diploid Miscanthus sinensis.

    PubMed

    Ma, Xue-Feng; Jensen, Elaine; Alexandrov, Nickolai; Troukhan, Maxim; Zhang, Liping; Thomas-Jones, Sian; Farrar, Kerrie; Clifton-Brown, John; Donnison, Iain; Swaller, Timothy; Flavell, Richard

    2012-01-01

    We have created a high-resolution linkage map of Miscanthus sinensis, using genotyping-by-sequencing (GBS), identifying all 19 linkage groups for the first time. The result is technically significant since Miscanthus has a very large and highly heterozygous genome, but has little or no genomic information available to date. The composite linkage map containing markers from both parental linkage maps is composed of 3,745 SNP markers spanning 2,396 cM on 19 linkage groups with a 0.64 cM average resolution. Comparative genomics analyses of the M. sinensis composite linkage map to the genomes of sorghum, maize, rice, and Brachypodium distachyon indicate that sorghum has the closest syntenic relationship to Miscanthus compared to other species. The comparative results revealed that each pair of the 19 M. sinensis linkages aligned to one sorghum chromosome, except for LG8, which mapped to two sorghum chromosomes (4 and 7), presumably due to a chromosome fusion event after genome duplication. The data also revealed several other chromosome rearrangements relative to sorghum, including two telomere-centromere inversions of the sorghum syntenic chromosome 7 in LG8 of M. sinensis and two paracentric inversions of sorghum syntenic chromosome 4 in LG7 and LG8 of M. sinensis. The results clearly demonstrate, for the first time, that the diploid M. sinensis is of tetraploid origin, consisting of two sub-genomes. This complete and high resolution composite linkage map will not only serve as a useful resource for novel QTL discoveries, but also enable informed deployment of the wealth of existing genomics resources of other species to the improvement of Miscanthus as a high biomass energy crop. In addition, it has utility as a reference for genome sequence assembly for the forthcoming whole genome sequencing of the Miscanthus genus. PMID:22439001

  8. Genomic and supragenomic structure of the nucleotide-like G-protein-coupled receptor GPR34.

    PubMed

    Engemaier, Eva; Römpler, Holger; Schöneberg, Torsten; Schulz, Angela

    2006-02-01

    Directed cloning approaches and large-scale sequencing of several vertebrate genomes unveiled many new members of the G-protein-coupled receptor (GPCR) superfamily, among them GPR34. Initial studies showed that GPR34 is an evolutionarily old GPCR structurally related to a group of ADP-like receptors. To gain insight into the genomic organization, regulation of expression, and supragenomic diversification of GPR34, several vertebrate species were analyzed. In contrast to the obviously intronless coding region, GPR34 displays an evolutionarily preserved 5' noncoding intron-exon structure. Further, an alternatively used cryptic intron was identified within the coding region, which shortens the N terminus by 47 amino acids. Ubiquitous expression of GPR34 is driven by genomic sequences upstream of at least two transcriptional start regions in mouse and rat but only one region in human. In rodents, both promoters are active in all tissues investigated, but the level of activity is tissue-specific. At the translational level, several conserved in-frame AUGs within the first 150 bp of the coding region may serve as start points for translation in human and other mammals. Combinatory mutagenesis and expression of reporter constructs confirmed these multiple translational start points and revealed a preference for the second in-frame AUG in human GPR34. Our data show that multiple translation initiation starts and alternative splicing contribute to the supragenomic diversification of GPR34. PMID:16338117

  9. Dinoflagellate Gene Structure and Intron Splice Sites in a Genomic Tandem Array.

    PubMed

    Mendez, Gregory S; Delwiche, Charles F; Apt, Kirk E; Lippmeier, J Casey

    2015-01-01

    Dinoflagellates are one of the last major lineages of eukaryotes for which little is known about genome structure and organization. We report here the sequence and gene structure of a clone isolated from a cosmid library which, to our knowledge, represents the largest contiguously sequenced dinoflagellate genomic tandem gene array. These data, combined with information from a large transcriptomic library, allowed a high level of confidence in every base pair call. This degree of confidence is not possible with PCR-based contigs. The sequence contains an intron-rich set of five highly expressed gene repeats arranged in tandem. One of the tandem repeat gene members contains an intron 26,372 bp long. This study characterizes a splice site consensus sequence for dinoflagellate introns. Two to nine base pairs around the 3' splice site are repeated by an identical two to nine base pairs around the 5' splice site. The 5' and 3' splice sites are in the same locations within each repeat so that the repeat is found only once in the mature mRNA. This identically repeated intron boundary sequence might be useful in gene modeling and annotation of genomes. PMID:25963315
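    The repeated intron boundary described in this abstract can be illustrated with a toy example. The sketch below is not the authors' method: the sequences are invented, and `boundary_repeat` is a hypothetical helper that extends outward from an annotated intron's two splice sites for as long as the bases at both sites are identical.

```python
def boundary_repeat(genomic: str, intron_start: int, intron_end: int) -> str:
    """Return the sequence shared around both splice sites.

    intron_start is the index of the first intron base and intron_end
    is the index one past the last intron base (both hypothetical
    annotations supplied by the caller).
    """
    intron_len = intron_end - intron_start

    # Extend to the left of both splice sites while the bases agree.
    left = 0
    while (left < intron_len
           and intron_start - left - 1 >= 0
           and genomic[intron_start - left - 1] == genomic[intron_end - left - 1]):
        left += 1

    # Extend to the right of both splice sites while the bases agree.
    right = 0
    while (right < intron_len
           and intron_end + right < len(genomic)
           and genomic[intron_start + right] == genomic[intron_end + right]):
        right += 1

    return genomic[intron_start - left:intron_start + right]


# Invented toy gene: exon, repeat, intron interior, repeat, exon.
exon1, repeat, interior, exon2 = "AAACC", "GATTG", "TTTTTTTT", "CCAAA"
genomic = exon1 + repeat + interior + repeat + exon2

# Splice two bases into each copy of the repeat (same offset in both).
intron_start = len(exon1) + 2
intron_end = len(exon1) + len(repeat) + len(interior) + 2

mrna = genomic[:intron_start] + genomic[intron_end:]
found = boundary_repeat(genomic, intron_start, intron_end)
print(found, mrna)
```

    Because the two splice sites sit at the same offset within each copy, the spliced toy transcript keeps exactly one copy of the repeat, matching the behavior the abstract describes.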

  10. Whole-genome mutational landscape and characterization of noncoding and structural mutations in liver cancer.

    PubMed

    Fujimoto, Akihiro; Furuta, Mayuko; Totoki, Yasushi; Tsunoda, Tatsuhiko; Kato, Mamoru; Shiraishi, Yuichi; Tanaka, Hiroko; Taniguchi, Hiroaki; Kawakami, Yoshiiku; Ueno, Masaki; Gotoh, Kunihito; Ariizumi, Shun-Ichi; Wardell, Christopher P; Hayami, Shinya; Nakamura, Toru; Aikata, Hiroshi; Arihiro, Koji; Boroevich, Keith A; Abe, Tetsuo; Nakano, Kaoru; Maejima, Kazuhiro; Sasaki-Oku, Aya; Ohsawa, Ayako; Shibuya, Tetsuo; Nakamura, Hiromi; Hama, Natsuko; Hosoda, Fumie; Arai, Yasuhito; Ohashi, Shoko; Urushidate, Tomoko; Nagae, Genta; Yamamoto, Shogo; Ueda, Hiroki; Tatsuno, Kenji; Ojima, Hidenori; Hiraoka, Nobuyoshi; Okusaka, Takuji; Kubo, Michiaki; Marubashi, Shigeru; Yamada, Terumasa; Hirano, Satoshi; Yamamoto, Masakazu; Ohdan, Hideki; Shimada, Kazuaki; Ishikawa, Osamu; Yamaue, Hiroki; Chayama, Kazuki; Miyano, Satoru; Aburatani, Hiroyuki; Shibata, Tatsuhiro; Nakagawa, Hidewaki

    2016-05-01

    Liver cancer, which is most often associated with virus infection, is prevalent worldwide, and its underlying etiology and genomic structure are heterogeneous. Here we provide a whole-genome landscape of somatic alterations in 300 liver cancers from Japanese individuals. Our comprehensive analysis identified point mutations, structural variations (STVs), and virus integrations, in noncoding and coding regions. We discovered mutational signatures related to liver carcinogenesis and recurrently mutated coding and noncoding regions, such as long intergenic noncoding RNA genes (NEAT1 and MALAT1), promoters, CTCF-binding sites, and regulatory regions. STV analysis found a significant association with replication timing and identified known (CDKN2A, CCND1, APC, and TERT) and new (ASH1L, NCOR1, and MACROD2) cancer-related genes that were recurrently affected by STVs, leading to altered expression. These results emphasize the value of whole-genome sequencing analysis in discovering cancer driver mutations and understanding comprehensive molecular profiles of liver cancer, especially with regard to STVs and noncoding mutations. PMID:27064257

  11. FANCJ is essential to maintain microsatellite structure genome-wide during replication stress.

    PubMed

    Barthelemy, Joanna; Hanenberg, Helmut; Leffak, Michael

    2016-08-19

    Microsatellite DNAs that form non-B structures are implicated in replication fork stalling, DNA double strand breaks (DSBs) and human disease. Fanconi anemia (FA) is an inherited disorder in which mutations in at least nineteen genes are responsible for the phenotypes of genome instability and cancer predisposition. FA pathway proteins are active in the resolution of non-B DNA structures including interstrand crosslinks, G quadruplexes and DNA triplexes. In FANCJ helicase depleted cells, we show that hydroxyurea or aphidicolin treatment leads to loss of microsatellite polymerase chain reaction signals and to chromosome recombination at an ectopic hairpin forming CTG/CAG repeat in the HeLa genome. Moreover, diverse endogenous microsatellite signals were also lost upon replication stress after FANCJ depletion, and in FANCJ null patient cells. The phenotype of microsatellite signal instability is specific for FANCJ apart from the intact FA pathway, and is consistent with DSBs at microsatellites genome-wide in FANCJ depleted cells following replication stress. PMID:27179029

  12. Physical mapping and genomic structure of the human TNFR2 gene

    SciTech Connect

    Beltinger, C.P.; White, P.S.; Maris, J.M.

    1996-07-01

    The tumor necrosis factor receptor 2 (TNFR2) gene localizes to 1p36.2, a genomic region characteristically deleted in neuroblastomas and other malignancies. In addition, TNFR2 is the principal mediator of the effects of TNF on cellular immunity, and it may cooperate with TNFR1 in the killing of nonlymphoid cells. Therefore, we undertook an analysis of the genomic structure and precise physical mapping of this gene. The TNFR2 gene is contained on 10 exons that span 26 kb. Most of the functional domains of TNFR2 are encoded by separate exons, and each of the repeats of the extracellular cysteine-rich domain is interrupted by an intron. The genomic structure reveals a close relationship to TNFR1, another member of the TNFR superfamily. Based on electrophoretic analysis of yeast artificial chromosomes, TNFR2 maps within 400 kb of the genetic marker D1S434. In addition, we have identified a new polymorphic dinucleotide repeat within intron 4 of TNFR2. The genetic sequence information and exon-intron boundaries we have determined will facilitate mutational analysis of this gene to determine its potential role in neuroblastoma, as well as in other cancers with characteristic deletions or rearrangements of 1p36. 52 refs., 3 figs., 1 tab.

  13. Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation

    PubMed Central

    Chen, Jieming; Zheng, Houfeng; Bei, Jin-Xin; Sun, Liangdan; Jia, Wei-hua; Li, Tao; Zhang, Furen; Seielstad, Mark; Zeng, Yi-Xin; Zhang, Xuejun; Liu, Jianjun

    2009-01-01

    Population stratification is a potential problem for genome-wide association studies (GWAS), confounding results and causing spurious associations. Hence, understanding how allele frequencies vary across geographic regions or among subpopulations is an important prelude to analyzing GWAS data. Using over 350,000 genome-wide autosomal SNPs in over 6000 Han Chinese samples from ten provinces of China, our study revealed a one-dimensional “north-south” population structure and a close correlation between geography and the genetic structure of the Han Chinese. The north-south population structure is consistent with the historical migration pattern of the Han Chinese population. Metropolitan cities in China were, however, more diffused “outliers,” probably because of the impact of modern migration of peoples. At a very local scale within the Guangdong province, we observed evidence of population structure among dialect groups, probably on account of endogamy within these dialects. Via simulation, we show that empirical levels of population structure observed across modern China can cause spurious associations in GWAS if not properly handled. In the Han Chinese, geographic matching is a good proxy for genetic matching, particularly in validation and candidate-gene studies in which population stratification cannot be directly accessed and accounted for because of the lack of genome-wide data, with the exception of the metropolitan cities, where geographical location is no longer a good indicator of ancestral origin. Our findings are important for designing GWAS in the Chinese population, an activity that is expected to intensify greatly in the near future. PMID:19944401
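    How stratification produces a spurious association can be seen in a small worked example. The counts below are invented for illustration and are not taken from the study: two hypothetical subpopulations differ in both allele-carrier frequency and disease prevalence, yet the allele has no effect within either one.

```python
def odds_ratio(case_carrier, control_carrier, case_noncarrier, control_noncarrier):
    """Odds ratio for a 2x2 table of carrier status against disease status."""
    return (case_carrier * control_noncarrier) / (control_carrier * case_noncarrier)


# Invented counts: (case carriers, control carriers,
#                   case non-carriers, control non-carriers).
# "North": 80% carriers, 20% prevalence regardless of carrier status.
north = (160, 640, 40, 160)
# "South": 20% carriers, 5% prevalence regardless of carrier status.
south = (10, 190, 40, 760)

# Pooling the two groups ignores the population structure.
pooled = tuple(n + s for n, s in zip(north, south))

or_north = odds_ratio(*north)
or_south = odds_ratio(*south)
or_pooled = odds_ratio(*pooled)

print(f"north:  {or_north:.2f}")
print(f"south:  {or_south:.2f}")
print(f"pooled: {or_pooled:.2f}")
```

    Within each subpopulation the odds ratio is 1, but the pooled table yields an odds ratio above 2, which is the kind of false signal that matching or adjusting for ancestry is meant to remove.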

  14. Structure and Genome Organization of AFV2, a Novel Archaeal Lipothrixvirus with Unusual Terminal and Core Structures

    PubMed Central

    Häring, Monika; Vestergaard, Gisle; Brügger, Kim; Rachel, Reinhard; Garrett, Roger A.; Prangishvili, David

    2005-01-01

    A novel filamentous virus, AFV2, from the hyperthermophilic archaeal genus Acidianus shows structural similarity to lipothrixviruses but differs from them in its unusual terminal and core structures. The double-stranded DNA genome contains 31,787 bp and carries eight open reading frames homologous to those of other lipothrixviruses, a single tRNA(Lys) gene containing a 12-bp archaeal intron, and a 1,008-bp repeat-rich region near the center of the genome. PMID:15901711

  15. Maintenance of genome stability in plants: repairing DNA double strand breaks and chromatin structure stability.

    PubMed

    Roy, Sujit

    2014-01-01

    Plant cells are subject to high levels of DNA damage resulting from plants' obligatory dependence on sunlight and the associated exposure to environmental stresses like solar UV radiation, high soil salinity, drought, chilling injury, and other air and soil pollutants including heavy metals and metabolic by-products from endogenous processes. Irreversible DNA damage generated by environmental and genotoxic stresses affects plant growth and development, reproduction, and crop productivity. Thus, for maintaining genome stability, plants have developed an extensive array of mechanisms for the detection and repair of DNA damage. This review will focus on recent advances in our understanding of mechanisms regulating plant genome stability in the context of the repair of double strand breaks and chromatin structure maintenance. PMID:25295048

  16. An integrated map of structural variation in 2,504 human genomes

    PubMed Central

    Jun, Goo; Fritz, Markus Hsi-Yang; Konkel, Miriam K.; Malhotra, Ankit; Stütz, Adrian M.; Shi, Xinghua; Casale, Francesco Paolo; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J.P.; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y. K.; Mu, Xinmeng Jasmine; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M.; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A.; Marth, Gabor; Mason, Christopher E.; Menelaou, Androniki; Muzny, Donna M.; Nelson, Bradley J.; Noor, Amina; Parrish, Nicholas F.; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E.; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A.; Untergasser, Andreas; Walker, Jerilyn A.; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A.; McCarroll, Steven A.; Mills, Ryan E.; Gerstein, Mark B.; Bashir, Ali; Stegle, Oliver; Devine, Scott E.; Lee, Charles; Eichler, Evan E.; Korbel, Jan O.

    2015-01-01

    Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype-blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association. PMID:26432246

  17. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

    PubMed Central

    Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  18. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.

    PubMed

    Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  19. SUNrises on the International Plant Nucleus Consortium

    PubMed Central

    Graumann, Katja; Bass, Hank W.; Parry, Geraint

    2013-01-01

    The nuclear periphery is a dynamic, structured environment, whose precise functions are essential for global processes—from nuclear, to cellular, to organismal. Its main components—the nuclear envelope (NE) with inner and outer nuclear membranes (INM and ONM), nuclear pore complexes (NPC), associated cytoskeletal and nucleoskeletal components as well as chromatin are conserved across eukaryotes (Fig. 1). In metazoans in particular, the structure and functions of nuclear periphery components are intensely researched partly because of their involvement in various human diseases. While far less is known about these in plants, the last few years have seen a significant increase in research activity in this area. Plant biologists are not only catching up with the animal field, but recent findings are pushing our advances in this field globally. In recognition of this developing field, the Annual Society of Experimental Biology Meeting in Salzburg kindly hosted a session co-organized by Katja Graumann and David E. Evans (Oxford Brookes University) highlighting new insights into plant nuclear envelope proteins and their interactions. This session brought together leading researchers with expertise in topics such as epigenetics, meiosis, nuclear pore structure and functions, nucleoskeleton and nuclear envelope composition. An open and friendly exchange of ideas was fundamental to the success of the meeting, which resulted in founding the International Plant Nucleus Consortium. This review highlights new developments in plant nuclear envelope research presented at the conference and their importance for the wider understanding of metazoan, yeast and plant nuclear envelope functions and properties. PMID:23324458

  20. Morphology, Genome Sequence, and Structural Proteome of Type Phage P335 from Lactococcus lactis

    PubMed Central

    Labrie, Simon J.; Josephsen, Jytte; Neve, Horst; Vogensen, Finn K.; Moineau, Sylvain

    2008-01-01

    Lactococcus lactis phage P335 is a virulent type phage for the species that bears its name and belongs to the Siphoviridae family. Morphologically, P335 resembled the L. lactis phages TP901-1 and Tuc2009, except for a shorter tail and a different collar/whisker structure. Its 33,613-bp double-stranded DNA genome had 50 open reading frames. Putative functions were assigned to 29 of them. Unlike other sequenced genomes from lactococcal phages belonging to this species, P335 did not have a lysogeny module. However, it did carry a dUTPase gene, the most conserved gene among this phage species. Comparative genomic analyses revealed a high level of identity between the morphogenesis modules of the phages P335, ul36, TP901-1, and Tuc2009 and two putative prophages of L. lactis SK11. Differences were noted in genes coding for receptor-binding proteins, in agreement with their distinct host ranges. Sixteen structural proteins of phage P335 were identified by liquid chromatography-tandem mass spectrometry. A 2.8-kb insertion was recognized between the putative genes coding for the activator of late transcription (Alt) and the small terminase subunit (TerS). Four genes within this region were autonomously late transcribed and possibly under the control of Alt. Three of the four deduced proteins had similarities with proteins from Streptococcus pyogenes prophages, suggesting that P335 acquired this module from another phage genome. The genetic diversity of the P335 species indicates that they are exceptional models for studying the modular theory of phage evolution. PMID:18539805

  1. Fourth Progress and Information Report of the Vocational-Technical Education Consortium of States. (V-TECS.)

    ERIC Educational Resources Information Center

    Lee, Connie W.; And Others

    Five areas of concern relating to the Vocational-Technical Education Consortium of States (V-TECS) are documented in this report. First, following an introduction which discusses the purpose of the V-TECS system (to develop materials for performance based instruction), the organizational structure of the sixteen state consortium is presented,…

  2. Damming the genomic data flood using a comprehensive analysis and storage data structure

    PubMed Central

    Bouffard, Marc; Phillips, Michael S.; Brown, Andrew M.K.; Marsh, Sharon; Tardif, Jean-Claude; van Rooij, Tibor

    2010-01-01

    Data generation, driven by rapid advances in genomic technologies, is fast outpacing our analysis capabilities. Faced with this flood of data, more hardware and software resources are added to accommodate data sets whose structure has not specifically been designed for analysis. This leads to unnecessarily lengthy processing times and excessive data handling and storage costs. Current efforts to address this have centered on developing new indexing schemas and analysis algorithms, whereas the root of the problem lies in the format of the data itself. We have developed a new data structure for storing and analyzing genotype and phenotype data. By leveraging data normalization techniques, database management system capabilities and the use of a novel multi-table, multidimensional database structure we have eliminated the following: (i) unnecessarily large data set size due to high levels of redundancy, (ii) sequential access to these data sets and (iii) common bottlenecks in analysis times. The resulting novel data structure horizontally divides the data to circumvent traditional problems associated with the use of databases for very large genomic data sets. The resulting data set required 86% less disk space and performed analytical calculations 6248 times faster compared to a standard approach without any loss of information. Database URL: http://castor.pharmacogenomics.ca PMID:21159730
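
    The normalization idea in the abstract above can be illustrated with a small sketch. It does not reproduce the authors' multi-table, multidimensional schema, which the abstract does not specify; the table names, column names and toy rows below are all hypothetical. It only shows how splitting repeated SNP metadata out of a flat genotype file removes redundancy and lets a database answer a per-SNP question directly.

```python
import sqlite3

# Hypothetical flat input: SNP metadata is repeated on every row.
# Columns: (sample, snp, chromosome, position, genotype call).
flat_rows = [
    ("S1", "rs1", "1", 1000, "AA"),
    ("S1", "rs2", "1", 2000, "AG"),
    ("S2", "rs1", "1", 1000, "AG"),
    ("S2", "rs2", "1", 2000, "GG"),
]


def build_normalized(rows):
    """Load flat rows into a normalized in-memory SQLite database."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE snp (
            snp_id TEXT PRIMARY KEY, chromosome TEXT, position INTEGER
        );
        CREATE TABLE sample (sample_id TEXT PRIMARY KEY);
        CREATE TABLE genotype (
            sample_id TEXT REFERENCES sample(sample_id),
            snp_id TEXT REFERENCES snp(snp_id),
            allele_call TEXT,
            PRIMARY KEY (sample_id, snp_id)
        );
        """
    )
    for sample_id, snp_id, chrom, pos, allele_call in rows:
        # Sample and SNP metadata are stored once, however often they recur.
        con.execute("INSERT OR IGNORE INTO sample VALUES (?)", (sample_id,))
        con.execute(
            "INSERT OR IGNORE INTO snp VALUES (?, ?, ?)", (snp_id, chrom, pos)
        )
        con.execute(
            "INSERT INTO genotype VALUES (?, ?, ?)",
            (sample_id, snp_id, allele_call),
        )
    con.commit()
    return con


def genotype_counts(con, snp_id):
    """Count genotype calls for one SNP with a single SQL query."""
    cur = con.execute(
        "SELECT allele_call, COUNT(*) FROM genotype "
        "WHERE snp_id = ? GROUP BY allele_call",
        (snp_id,),
    )
    return dict(cur.fetchall())


con = build_normalized(flat_rows)
print(genotype_counts(con, "rs1"))
```

    In this toy layout each SNP's chromosome and position are stored once instead of once per sample, which is the kind of redundancy the abstract attributes to conventional formats.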

  3. Probing Retroviral and Retrotransposon Genome Structures: The “SHAPE” of Things to Come

    PubMed Central

    Sztuba-Solinska, Joanna; Le Grice, Stuart F. J.

    2012-01-01

    Understanding the nuances of RNA structure as they pertain to biological function remains a formidable challenge for retrovirus research and development of RNA-based therapeutics, an area of particular importance with respect to combating HIV infection. Although a variety of chemical and enzymatic RNA probing techniques have been successfully employed for more than 30 years, they primarily interrogate small (100–500 nt) RNAs that have been removed from their biological context, potentially eliminating long-range tertiary interactions (such as kissing loops and pseudoknots) that may play a critical regulatory role. Selective 2′ hydroxyl acylation analyzed by primer extension (SHAPE), pioneered recently by Merino and colleagues, represents a facile, user-friendly technology capable of interrogating RNA structure with a single reagent and, combined with automated capillary electrophoresis, can analyze an entire 10,000-nucleotide RNA genome in a matter of weeks. Despite these obvious advantages, SHAPE essentially provides a nucleotide “connectivity map,” conversion of which into a 3-D structure requires a variety of complementary approaches. This paper summarizes contributions from SHAPE towards our understanding of the structure of retroviral genomes, modifications to the technology that have been developed to address some of its limitations, and future challenges. PMID:22685659

  4. Genome-wide functional annotation and structural verification of metabolic ORFeome of Chlamydomonas reinhardtii

    PubMed Central

    2011-01-01

    Background Recent advances in the field of metabolic engineering have been expedited by the availability of genome sequences and metabolic modelling approaches. The complete sequencing of the C. reinhardtii genome has made this unicellular alga a good candidate for metabolic engineering studies; however, the annotation of the relevant genes has not been validated and the much-needed metabolic ORFeome is currently unavailable. We describe our efforts on the functional annotation of the ORF models released by the Joint Genome Institute (JGI), prediction of their subcellular localizations, and experimental verification of their structural annotation at the genome scale. Results We assigned enzymatic functions to the translated JGI ORF models of C. reinhardtii by reciprocal BLAST searches of the putative proteome against the UniProt and AraCyc enzyme databases. The best match for each translated ORF was identified and the EC numbers were transferred onto the ORF models. Enzymatic functional assignment was extended to the paralogs of the ORFs by clustering ORFs using BLASTCLUST. In total, we assigned 911 enzymatic functions, including 886 EC numbers, to 1,427 transcripts. We further annotated the enzymatic ORFs by prediction of their subcellular localization. The majority of the ORFs are predicted to be compartmentalized in the cytosol and chloroplast. We verified the structure of the metabolism-related ORF models by reverse transcription-PCR of the functionally annotated ORFs. Following amplification and cloning, we carried out 454FLX and Sanger sequencing of the ORFs. Based on alignment of the 454FLX reads to the ORF predicted sequences, we obtained more than 90% coverage for more than 80% of the ORFs. In total, 1,087 ORF models were verified by 454 and Sanger sequencing methods. We obtained expression evidence for 98% of the metabolic ORFs in the algal cells grown under constant light in the presence of acetate. 
Conclusions We functionally annotated approximately 1
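
    The annotation step described in the Results above (reciprocal BLAST searches followed by transfer of EC numbers) can be sketched as follows. This is a minimal illustration that assumes "reciprocal" means the common reciprocal-best-hit rule; the abstract does not state the exact criterion, and the identifiers, scores and function names below are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bit_score) tuples, for
    example parsed from tabular BLAST output.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_hits, reverse_hits):
    """Keep query/subject pairs that are each other's best hit."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return {q: s for q, s in forward.items() if reverse.get(s) == q}


def transfer_ec_numbers(pairs, ec_by_subject):
    """Copy the EC number of the matched enzyme onto each ORF model."""
    return {
        orf: ec_by_subject[subject]
        for orf, subject in pairs.items()
        if subject in ec_by_subject
    }


# Hypothetical toy data: ORF models searched against an enzyme database
# (forward) and the enzymes searched back against the ORF models (reverse).
forward = [("orf1", "enzA", 250.0), ("orf1", "enzB", 90.0), ("orf2", "enzB", 120.0)]
reverse = [("enzA", "orf1", 245.0), ("enzB", "orf3", 300.0), ("enzB", "orf2", 118.0)]
ec_numbers = {"enzA": "1.1.1.1", "enzB": "2.7.1.1"}

pairs = reciprocal_best_hits(forward, reverse)
print(transfer_ec_numbers(pairs, ec_numbers))
```

    Here orf2 receives no EC number because its best hit, enzB, matches orf3 better in the reverse search.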

  5. Genomic-scale comparison of sequence- and structure-based methods of function prediction: Does structure provide additional insight?

    PubMed Central

    Fetrow, Jacquelyn S.; Siew, Naomi; Di Gennaro, Jeannine A.; Martinez-Yamout, Maria; Dyson, H. Jane; Skolnick, Jeffrey

    2001-01-01

    A function annotation method using the sequence-to-structure-to-function paradigm is applied to the identification of all disulfide oxidoreductases in the Saccharomyces cerevisiae genome. The method identifies 27 sequences as potential disulfide oxidoreductases. All previously known thioredoxins, glutaredoxins, and disulfide isomerases are correctly identified. Three of the 27 predictions are probable false-positives. Three novel predictions, which subsequently have been experimentally validated, are presented. Two additional novel predictions suggest a disulfide oxidoreductase regulatory mechanism for two subunits (OST3 and OST6) of the yeast oligosaccharyltransferase complex. Based on homology, this prediction can be extended to a potential tumor suppressor gene, N33, in humans, whose biochemical function was not previously known. Attempts to obtain a folded, active N33 construct to test the prediction were unsuccessful. The results show that structure prediction coupled with biochemically relevant structural motifs is a powerful method for the function annotation of genome sequences and can provide more detailed, robust predictions than function prediction methods that rely on sequence comparison alone. PMID:11316881

  6. Structural heterogeneity and functional diversity of topologically associating domains in mammalian genomes

    PubMed Central

    Wang, Xiao-Tao; Dong, Peng-Fei; Zhang, Hong-Yu; Peng, Cheng

    2015-01-01

    Recent chromosome conformation capture (3C) derived techniques have revealed that the topologically associating domain (TAD) is a pervasive element in chromatin three-dimensional (3D) organization. However, there is currently no parameter to quantitatively measure the structural characteristics of TADs, thus obscuring our understanding of the structural and functional differences among TADs. Based on our finding that there exist intrinsic chromatin interaction patterns in TADs, we define a theoretical parameter, called aggregation preference (AP), to characterize TAD structures by capturing the interaction aggregation degree. Applying this defined parameter to 11 Hi-C data sets generated by both traditional and in situ Hi-C experimental pipelines, our analyses reveal that heterogeneous structures exist among TADs, and this structural heterogeneity is significantly correlated to DNA sequences, epigenomic signals and gene expressions. Although TADs can be stable in genomic positions across cell lines, structural comparisons show that a considerable number of stable TADs undergo significant structural rearrangements during cell changes. Moreover, the structural change of a TAD is tightly associated with its transcription remodeling. Altogether, the theoretical parameter defined in this work provides a quantitative method to link structural characteristics and biological functions of TADs, and this linkage implies that chromatin interaction pattern has the potential to mark transcription activity in TADs. PMID:26150425

  7. The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability

    PubMed Central

    Hamperl, Stephan; Cimprich, Karlene A.

    2014-01-01

    Accurate DNA replication and DNA repair are crucial for the maintenance of genome stability, and it is generally accepted that failure of these processes is a major source of DNA damage in cells. Intriguingly, recent evidence suggests that DNA damage is more likely to occur at genomic loci with high transcriptional activity. Furthermore, loss of certain RNA processing factors in eukaryotic cells is associated with increased formation of co-transcriptional RNA:DNA hybrid structures known as R-loops, resulting in double-strand breaks (DSBs) and DNA damage. However, the molecular mechanisms by which R-loop structures ultimately lead to DNA breaks and genome instability are not well understood. In this review, we summarize the current knowledge about the formation, recognition and processing of RNA:DNA hybrids, and discuss possible mechanisms by which these structures contribute to DNA damage and genome instability in the cell. PMID:24746923

  8. The genomic structure of human BTK, the defective gene in X-linked agammaglobulinemia

    SciTech Connect

    Rohrer, J.; Parolini, O.; Conley, M.E.; Belmont, J.W.

    1994-12-31

    It has recently been demonstrated that mutations in the gene for Bruton's tyrosine kinase (BTK) are responsible for X-linked agammaglobulinemia. Southern blot analysis and sequencing of cDNA were used to document deletions, insertions, and single base pair substitutions. To facilitate analysis of BTK regulation and to permit the development of assays that could be used to screen genomic DNA for mutations in BTK, the authors determined the genomic organization of this gene. Subcloning of a cosmid and a yeast artificial chromosome showed that BTK is divided into 19 exons spanning 37 kilobases of genomic DNA. Analysis of the region 5′ to the first untranslated exon revealed no consensus TATAA or CAAT boxes; however, three retinoic acid binding sites were identified in this region. Comparison of the structure of BTK with that of other nonreceptor tyrosine kinases, including SRC, FES, and CSK, demonstrated a lack of conservation of exon borders. Information obtained in this study will contribute to understanding of the evolution of nonreceptor tyrosine kinases. It will also be useful in diagnostic studies, including carrier detection, and in studies directed towards gene therapy or gene replacement. 29 refs., 2 figs., 2 tabs.

  9. Paleogenomics. Genomic structure in Europeans dating back at least 36,200 years.

    PubMed

    Seguin-Orlando, Andaine; Korneliussen, Thorfinn S; Sikora, Martin; Malaspinas, Anna-Sapfo; Manica, Andrea; Moltke, Ida; Albrechtsen, Anders; Ko, Amy; Margaryan, Ashot; Moiseyev, Vyacheslav; Goebel, Ted; Westaway, Michael; Lambert, David; Khartanovich, Valeri; Wall, Jeffrey D; Nigst, Philip R; Foley, Robert A; Lahr, Marta Mirazon; Nielsen, Rasmus; Orlando, Ludovic; Willerslev, Eske

    2014-11-28

    The origin of contemporary Europeans remains contentious. We obtained a genome sequence from Kostenki 14 in European Russia dating from 38,700 to 36,200 years ago, one of the oldest fossils of anatomically modern humans from Europe. We find that Kostenki 14 shares a close ancestry with the 24,000-year-old Mal'ta boy from central Siberia, European Mesolithic hunter-gatherers, some contemporary western Siberians, and many Europeans, but not eastern Asians. Additionally, the Kostenki 14 genome shows evidence of shared ancestry with a population basal to all Eurasians that also relates to later European Neolithic farmers. We find that Kostenki 14 contains more Neandertal DNA that is contained in longer tracts than present Europeans. Our findings reveal the timing of divergence of western Eurasians and East Asians to be more than 36,200 years ago and that European genomic structure today dates back to the Upper Paleolithic and derives from a metapopulation that at times stretched from Europe to central Asia. PMID:25378462

  10. Viral genome structures, charge, and sequences are optimal for capsid assembly

    NASA Astrophysics Data System (ADS)

    Hagan, Michael

    2014-03-01

    For many viruses, the spontaneous assembly of a capsid shell around the nucleic acid (NA) genome is an essential step in the viral life cycle. Capsid formation is a multicomponent, out-of-equilibrium assembly process for which kinetic effects and thermodynamic constraints compete to determine the outcome. Understanding how viral components drive highly efficient assembly under these constraints could promote biomedical efforts to block viral propagation, and would elucidate the factors controlling assembly in a wide range of systems containing proteins and polyelectrolytes. This talk will describe coarse-grained models of capsid proteins and NAs with which we investigate the dynamics and thermodynamics of virus assembly. In contrast to recent theoretical models, we find that capsids spontaneously 'overcharge'; that is, the NA length which is kinetically and thermodynamically optimal possesses a negative charge greater than the positive charge of the capsid. When applied to specific virus capsids, the calculated optimal NA lengths closely correspond to the natural viral genome lengths. These results suggest that the features included in this model (i.e. electrostatics, excluded volume, and NA tertiary structure) play key roles in determining assembly thermodynamics and consequently exert selective pressure on viral evolution. I will then discuss mechanisms by which sequence-specific interactions between NAs and capsid proteins promote selective encapsidation of the viral genome. This work was supported by NIH R01GM108021 and the Brandeis MRSEC NSF-MRSEC-0820492.

  11. Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

    PubMed Central

    Huang, Yongjie; Mrázek, Jan

    2014-01-01

    Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877

  12. Assessing diversity of DNA structure-related sequence features in prokaryotic genomes.

    PubMed

    Huang, Yongjie; Mrázek, Jan

    2014-06-01

    Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877
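
    One of the sequence features surveyed above, the DNA palindrome, has a simple operational definition: a stretch of sequence equal to its own reverse complement. The sketch below only illustrates that definition on a toy sequence. It is not the authors' survey method, and it does not implement their null model; the function names are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_palindromes(seq, length):
    """Return (start, window) for every window of the given length
    that equals its own reverse complement (0-based start positions)."""
    seq = seq.upper()
    found = []
    for start in range(len(seq) - length + 1):
        window = seq[start:start + length]
        if window == reverse_complement(window):
            found.append((start, window))
    return found


# GAATTC reads the same on both strands, so it is reported at offset 2.
print(find_palindromes("TTGAATTCAA", 6))
```

    Judging whether such patterns are over- or under-represented, as the abstract does, additionally requires comparing observed counts with expectations under a null model, which is outside the scope of this sketch.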

  13. Primary structure of the human follistatin precursor and its genomic organization

    SciTech Connect

    Shimasaki, Shunichi; Koga, Makoto; Esch, F.; Cooksey, K.; Mercado, M.; Koba, A.; Ueno, Naoto; Ying, Shaoyao; Ling, N.; Guillemin, R.

    1988-06-01

    Follistatin is a single-chain gonadal protein that specifically inhibits follicle-stimulating hormone release. By use of the recently characterized porcine follistatin cDNA as a probe to screen a human testis cDNA library and a genomic library, the structure of the complete human follistatin precursor as well as its genomic organization have been determined. Three of eight cDNA clones that were sequenced predicted a precursor with 344 amino acids, whereas the remaining five cDNA clones encoded a 317 amino acid precursor, resulting from alternative splicing of the precursor mRNA. Mature follistatins contain four contiguous domains that are encoded by precisely separated exons; three of the domains are highly similar to each other, as well as to human epidermal growth factor and human pancreatic secretory trypsin inhibitor. The genomic organization of the human follistatin is similar to that of the human epidermal growth factor gene and thus supports the notion of exon shuffling during evolution.

  14. Structural Plasticity of the Protein Plug That Traps Newly Packaged Genomes in Podoviridae Virions.

    PubMed

    Bhardwaj, Anshul; Sankhala, Rajeshwer S; Olia, Adam S; Brooke, Dewey; Casjens, Sherwood R; Taylor, Derek J; Prevelige, Peter E; Cingolani, Gino

    2016-01-01

    Bacterial viruses of the P22-like family encode a specialized tail needle essential for genome stabilization after DNA packaging and implicated in Gram-negative cell envelope penetration. The atomic structure of P22 tail needle (gp26) crystallized at acidic pH reveals a slender fiber containing an N-terminal "trimer of hairpins" tip. Although the length and composition of tail needles vary significantly in Podoviridae, unexpectedly, the amino acid sequence of the N-terminal tip is exceptionally conserved in more than 200 genomes of P22-like phages and prophages. In this paper, we used x-ray crystallography and EM to investigate the neutral pH structure of three tail needles from bacteriophage P22, HK620, and Sf6. In all cases, we found that the N-terminal tip is poorly structured, in stark contrast to the compact trimer of hairpins seen in gp26 crystallized at acidic pH. Hydrogen-deuterium exchange mass spectrometry, limited proteolysis, circular dichroism spectroscopy, and gel filtration chromatography revealed that the N-terminal tip is highly dynamic in solution and unlikely to adopt a stable trimeric conformation at physiological pH. This is supported by the cryo-EM reconstruction of P22 mature virion tail, where the density of gp26 N-terminal tip is incompatible with a trimer of hairpins. We propose the tail needle N-terminal tip exists in two conformations: a pre-ejection extended conformation, which seals the portal vertex after genome packaging, and a postejection trimer of hairpins, which forms upon its release from the virion. The conformational plasticity of the tail needle N-terminal tip is built in the amino acid sequence, explaining its extraordinary conservation in nature. PMID:26574546

  15. Genome analysis: Assigning protein coding regions to three-dimensional structures.

    PubMed Central

    Salamov, A. A.; Suwa, M.; Orengo, C. A.; Swindells, M. B.

    1999-01-01

    We describe the results of a procedure for maximizing the number of sequences that can be reliably linked to a protein of known three-dimensional structure. Unlike other methods, which try to increase sensitivity through the use of fold recognition software, we only use conventional sequence alignment tools, but apply them in a manner that significantly increases the number of relationships detected. We analyzed 11 genomes and found that, depending on the genome, between 23 and 32% of the ORFs had significant matches to proteins of known structure. In all cases, the aligned region consisted of either >100 residues or >50% of the smaller sequence. Slightly higher percentages could be attained if smaller motifs were also included. This is significantly higher than most previously reported methods, even those that have a fold-recognition component. We survey the biochemical and structural characteristics of the most frequently occurring proteins, and discuss the extent to which alignment methods can realistically assign function to gene products. PMID:10211823
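
    The acceptance rule stated in the abstract above (an aligned region of more than 100 residues, or more than 50% of the smaller sequence) is easy to express as a predicate. The sketch below encodes only that length rule; the function and parameter names are hypothetical, and the statistical significance test that the authors also apply is not modelled.

```python
def passes_length_rule(aligned_length, query_length, subject_length):
    """Return True if an alignment covers more than 100 residues or
    more than half of the shorter of the two sequences."""
    smaller = min(query_length, subject_length)
    return aligned_length > 100 or aligned_length > 0.5 * smaller


# A 60-residue alignment is accepted against an 80-residue domain
# (more than 50% of the smaller sequence) but not when both
# sequences are 300 residues long.
print(passes_length_rule(60, 400, 80))
print(passes_length_rule(60, 300, 300))
```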

  16. Structuring the bacterial genome: Y1-transposases associated with REP-BIME sequences†

    PubMed Central

    Ton-Hoang, Bao; Siguier, Patricia; Quentin, Yves; Onillon, Séverine; Marty, Brigitte; Fichant, Gwennaele; Chandler, Mick

    2012-01-01

    REPs are highly repeated intergenic palindromic sequences often clustered into structures called BIMEs including two individual REPs separated by short linker of variable length. They play a variety of key roles in the cell. REPs also resemble the sub-terminal hairpins of the atypical IS200/605 family of insertion sequences which encode Y1 transposases (TnpAIS200/IS605). These belong to the HUH endonuclease family, carry a single catalytic tyrosine (Y) and promote single strand transposition. Recently, a new clade of Y1 transposases (TnpAREP) was found associated with REP/BIME in structures called REPtrons. It has been suggested that TnpAREP is responsible for REP/BIME proliferation over genomes. We analysed and compared REP distribution and REPtron structure in numerous available E. coli and Shigella strains. Phylogenetic analysis clearly indicated that tnpAREP was acquired early in the species radiation and was lost later in some strains. To understand REP/BIME behaviour within the host genome, we also studied E. coli K12 TnpAREP activity in vitro and demonstrated that it catalyses cleavage and recombination of BIMEs. While TnpAREP shared the same general organization and similar catalytic characteristics with TnpAIS200/IS605 transposases, it exhibited distinct properties potentially important in the creation of BIME variability and in their amplification. TnpAREP may therefore be one of the first examples of transposase domestication in prokaryotes. PMID:22199259

  17. Evolutionary genomics reveals conserved structural determinants of signaling and adaptation in microbial chemoreceptors

    SciTech Connect

    Alexander, Roger P; Jouline, Igor B

    2007-01-01

    As an important model for transmembrane signaling, methyl-accepting chemotaxis proteins (MCPs) have been extensively studied by using genetic, biochemical, and structural techniques. However, details of the molecular mechanism of signaling are still not well understood. The availability of genomic information for hundreds of species enables the identification of features in protein sequences that are conserved over long evolutionary distances and thus are critically important for function. We carried out a large-scale comparative genomic analysis of the MCP signaling and adaptation domain family and identified features that appear to be critical for receptor structure and function. Based on domain length and sequence conservation, we identified seven major MCP classes and three distinct structural regions within the cytoplasmic domain: signaling, methylation, and flexible bundle subdomains. The flexible bundle subdomain, not previously recognized in MCPs, is a conserved element that appears to be important for signal transduction. Remarkably, the N- and C-terminal helical arms of the cytoplasmic domain maintain symmetry in length and register despite dramatic variation, from 24 to 64 7-aa heptads in overall domain length. Loss of symmetry is observed in some MCPs, where it is concomitant with specific changes in the sensory module. Each major MCP class has a distinct pattern of predicted methylation sites that is well supported by experimental data. Our findings indicate that signaling and adaptation functions within the MCP cytoplasmic domain are tightly coupled, and that their coevolution has contributed to the significant diversity in chemotaxis mechanisms among different organisms.

  18. Update on the Pfam5000 Strategy for Selection of Structural Genomics Targets

    SciTech Connect

    Chandonia, John-Marc; Brenner, Steven E.

    2005-06-27

    Structural Genomics is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy that is medically and biologically relevant, of good financial value, and tractable. In 2003, we presented the "Pfam5000" strategy, which involves selecting the 5,000 most important families from the Pfam database as sources for targets. In this update, we show that although both the Pfam database and the number of sequenced genomes have increased in size, the expected benefits of the Pfam5000 strategy have not changed substantially. Solving the structures of proteins from the 5,000 largest Pfam families would allow accurate fold assignment for approximately 65 percent of all prokaryotic proteins (covering 54 percent of residues) and 63 percent of eukaryotic proteins (42 percent of residues). Fewer than 2,300 of the largest families on this list remain to be solved, making the project feasible in the next five years given the expected throughput to be achieved in the production phase of the Protein Structure Initiative.

  19. Building a local research consortium.

    PubMed

    Martin, P A

    1994-05-01

    Although state, regional, and national networking often are critical to nurse researchers, local support that is broader than what is found in any single agency may be the foundation needed by clinicians who want "more" research than that prescribed by their current role. More formal consortiums have successfully implemented a variety of research projects and are another possibility to explore (Beaman & Strader, 1990; Bolton, 1991; Chenitz et al., 1990; Keefe et al., 1988; Thiele, 1989). Another option is some state nurses' associations that have formal research assemblies (e.g., Ohio Nurses Association, Assembly of Nurse Researchers). However, forming a local, less formal group with a few expert advisors may supply the energy and momentum necessary for both using and conducting research at a grassroots level. The expert advisors should be research-trained nurses (almost always with a PhD or DNS) who are active group members. Although Fitzpatrick encouraged collaborative research and writing early in the history of Applied Nursing Research (Fitzpatrick, 1989), in 1993 approximately two thirds of the articles in Applied Nursing Research still were single authored. Nurses are not using collaboration to its fullest extent. An informal group in one community has been one way to release the scholarship that was latent in many nurses. PMID:8031105

  20. Gene Ontology Consortium: going forward.

    PubMed

    2015-01-01

    The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology. PMID:25428369

  1. Gene Ontology Consortium: going forward

    PubMed Central

    2015-01-01

    The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology. PMID:25428369

  2. Establishing an International Soil Modelling Consortium

    NASA Astrophysics Data System (ADS)

    Vereecken, Harry; Schnepf, Andrea; Vanderborght, Jan

    2015-04-01

    Soil is one of the most critical life-supporting compartments of the Biosphere. Soil provides numerous ecosystem services such as a habitat for biodiversity, water and nutrients, as well as producing food, feed, fiber and energy. To feed the rapidly growing world population in 2050, agricultural food production must be doubled using the same land resources footprint. At the same time, soil resources are threatened due to improper management and climate change. Soil is not only essential for establishing a sustainable bio-economy, but also plays a key role in a broad range of societal challenges including 1) climate change mitigation and adaptation, 2) land use change, 3) water resource protection, 4) biotechnology for human health, 5) biodiversity and ecological sustainability, and 6) combating desertification. Soils regulate and support water, mass and energy fluxes between the land surface, the vegetation, the atmosphere and the deep subsurface and control storage and release of organic matter affecting climate regulation and biogeochemical cycles. Despite the many important functions of soil, many fundamental knowledge gaps remain, regarding the role of soil biota and biodiversity on ecosystem services, the structure and dynamics of soil communities, the interplay between hydrologic and biotic processes, the quantification of soil biogeochemical processes and soil structural processes, the resilience and recovery of soils from stress, as well as the prediction of soil development and the evolution of soils in the landscape, to name a few. Soil models have long played an important role in quantifying and predicting soil processes and related ecosystem services.
However, a new generation of soil models based on a whole systems approach comprising all physical, mechanical, chemical and biological processes is now required to address these critical knowledge gaps and thus contribute to the preservation of ecosystem services, improve our understanding of climate

  3. WILLIAMSBURG BROOKLYN ASTHMA AND ENVIRONMENT CONSORTIUM

    EPA Science Inventory

    The Consortium expects to develop a family health promotion model in which organized residents have access to easily understood, scientifically accurate, community-specific information about their health, their environment, and the relationship between the two,...

  4. International Mouse Phenotyping Consortium (IMPC)

    Cancer.gov

    The International Mouse Phenotyping Consortium (IMPC) comprises a group of major mouse genetics research institutions along with national funding organisations formed to address the challenge of developing an encyclopedia of mammalian gene function.

  5. International Radical Cystectomy Consortium: A way forward

    PubMed Central

    Raza, Syed Johar; Field, Erinn; Kibel, Adam S.; Mottrie, Alex; Weizer, Alon Z.; Wagner, Andrew; Hemal, Ashok K.; Scherr, Douglas S.; Schanne, Francis; Gaboardi, Franco; Wu, Guan; Peabody, James O.; Koauk, Jihad; Redorta, Joan Palou; Pattaras, John G.; Rha, Koon-Ho; Richstone, Lee; Balbay, M. Derya; Menon, Mani; Hayn, Mathew; Stoeckle, Micheal; Wiklund, Peter; Dasgupta, Prokar; Pruthi, Raj; Ghavamian, Reza; Khan, Shamim; Siemer, Stephan; Maatman, Thomas; Wilson, Timothy; Poulakis, Vassilis; Wilding, Greg; Guru, Khurshid A.

    2014-01-01

    Robot-assisted radical cystectomy (RARC) is an emerging operative alternative to open surgery for the management of invasive bladder cancer. Studies from single institutions provide limited data due to the small number of patients. In order to better understand the related outcomes, a world-wide consortium was established in 2006 of patients undergoing RARC, called the International Robotic Cystectomy Consortium (IRCC). Thus far, the IRCC has reported its findings on various areas of operative interest and continues to expand its capacity to include other operative modalities and transform it into the International Radical Cystectomy Consortium. This article summarizes the findings of the IRCC and highlights the future direction of the consortium. PMID:25097319

  6. CORAL DISEASE & HEALTH CONSORTIUM: FINDING SOLUTIONS

    EPA Science Inventory

    The National Oceanic and Atmospheric Administration (NOAA), the Environmental Protection Agency (EPA), and the Department of Interior (DOI) developed the framework for a Coral Disease and Health Consortium (CDHC) for the United States Coral Reef Task Force (USCRTF) through an interag...

  7. International Lymphoma Epidemiology Consortium (InterLymph)

    Cancer.gov

    A consortium designed to enhance collaboration among epidemiologists studying lymphoma, to provide a forum for the exchange of research ideas, and to create a framework for collaborating on analyses that pool data from multiple studies

  8. The LBNL/JSU/AGMUS Science Consortium

    SciTech Connect

    1996-04-01

    This report discusses the 11 years of accomplishments of the science consortium of minority graduates from Jackson State University and Ana G. Mendez University at the Lawrence Berkeley National Laboratory.

  9. Comparison of SIV and HIV-1 genomic RNA structures reveals impact of sequence evolution on conserved and non-conserved structural motifs.

    PubMed

    Pollom, Elizabeth; Dang, Kristen K; Potter, E Lake; Gorelick, Robert J; Burch, Christina L; Weeks, Kevin M; Swanstrom, Ronald

    2013-01-01

    RNA secondary structure plays a central role in the replication and metabolism of all RNA viruses, including retroviruses like HIV-1. However, structures with known function represent only a fraction of the secondary structure reported for HIV-1(NL4-3). One tool to assess the importance of RNA structures is to examine their conservation over evolutionary time. To this end, we used SHAPE to model the secondary structure of a second primate lentiviral genome, SIVmac239, which shares only 50% sequence identity at the nucleotide level with HIV-1(NL4-3). Only about half of the paired nucleotides are paired in both genomic RNAs and, across the genome, just 71 base pairs form with the same pairing partner in both genomes. On average the RNA secondary structure is thus evolving at a much faster rate than the sequence. Structure at the Gag-Pro-Pol frameshift site is maintained but in a significantly altered form, while the impact of selection for maintaining a protein binding interaction can be seen in the conservation of pairing partners in the small RRE stems where Rev binds. Structures that are conserved between SIVmac239 and HIV-1(NL4-3) also occur at the 5' polyadenylation sequence, in the plus strand primer sites, PPT and cPPT, and in the stem-loop structure that includes the first splice acceptor site. The two genomes are adenosine-rich and cytidine-poor. The structured regions are enriched in guanosines, while unpaired regions are enriched in adenosines, and functionally important structures have stronger base pairing than nonconserved structures. We conclude that much of the secondary structure is the result of fortuitous pairing in a metastable state that reforms during sequence evolution. However, secondary structure elements with important function are stabilized by higher guanosine content that allows regions of structure to persist as sequence evolution proceeds, and, within the confines of selective pressure, allows structures to evolve. PMID:23593004

  10. The genome and structural proteome of an ocean siphovirus: a new window into the cyanobacterial ‘mobilome’

    PubMed Central

    Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W

    2009-01-01

    Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The ∼108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element

  11. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

    PubMed

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

    2015-01-01

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. PMID:25919952

  12. Target Selection and Deselection at the Berkeley Structural Genomics Center

    SciTech Connect

    Chandonia, John-Marc; Kim, Sung-Hou; Brenner, Steven E.

    2005-03-22

    At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a near-complete structural complement of proteins in the minimal organisms Mycoplasma genitalium and M. pneumoniae, two closely related pathogens. Current targets for structure determination have been selected in six major stages, starting with those predicted to be most tractable to high throughput study and likely to yield new structural information. We report on the process used to select these proteins, as well as our target deselection procedure. Target deselection reduces experimental effort by eliminating targets similar to those recently solved by the structural biology community or other centers. We measure the impact of the 69 structures solved at the BSGC as of July 2004 on structure prediction coverage of the M. pneumoniae and M. genitalium proteomes. The number of Mycoplasma proteins for which the fold could first be reliably assigned based on structures solved at the BSGC (24 M. pneumoniae and 21 M. genitalium) is approximately 25 percent of the total resulting from work at all structural genomics centers and the worldwide structural biology community (94 M. pneumoniae and 86 M. genitalium) during the same period. As the number of structures contributed by the BSGC during that period is less than 1 percent of the total worldwide output, the benefits of a focused target selection strategy are apparent. If the structures of all current targets were solved, the percentage of M. pneumoniae proteins for which folds could be reliably assigned would increase from approximately 57 percent (391 of 687) at present to around 80 percent (550 of 687), and the percentage of the proteome that could be accurately modeled would increase from around 37 percent (254 of 687) to about 64 percent (438 of 687). In M. genitalium, the percentage of the proteome that could be structurally annotated based on structures of our remaining targets would rise from 72 percent (348 of 486) to around 76 percent (371 of 486), with the
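
    The coverage figures quoted in the abstract above can be rechecked directly from the counts it gives. The short Python sketch below is an illustrative addition rather than part of the original record; the helper name `percent` and the dictionary labels are assumptions chosen for readability, and every number comes from the abstract text.

```python
# Recompute the coverage percentages quoted in the BSGC abstract.
# All counts are copied from the abstract; the names are illustrative only.

def percent(part: int, total: int) -> float:
    """Return part/total expressed as a percentage."""
    return 100.0 * part / total

M_PNEUMONIAE_PROTEOME = 687  # proteins in M. pneumoniae (from the abstract)
M_GENITALIUM_PROTEOME = 486  # proteins in M. genitalium (from the abstract)

coverage = {
    "M. pneumoniae, folds assigned (present)": percent(391, M_PNEUMONIAE_PROTEOME),
    "M. pneumoniae, folds assigned (all targets solved)": percent(550, M_PNEUMONIAE_PROTEOME),
    "M. pneumoniae, accurately modeled (present)": percent(254, M_PNEUMONIAE_PROTEOME),
    "M. pneumoniae, accurately modeled (all targets solved)": percent(438, M_PNEUMONIAE_PROTEOME),
    "M. genitalium, structurally annotated (present)": percent(348, M_GENITALIUM_PROTEOME),
    "M. genitalium, structurally annotated (remaining targets)": percent(371, M_GENITALIUM_PROTEOME),
}

# Share of first-time fold assignments attributable to BSGC structures:
# (24 + 21) BSGC assignments out of (94 + 86) assignments worldwide.
bsgc_share = percent(24 + 21, 94 + 86)

for label, value in coverage.items():
    print(f"{label}: {value:.1f}%")
print(f"BSGC share of first fold assignments: {bsgc_share:.1f}%")
```

    Rounded to whole numbers, the computed values agree with the 57, 80, 37, 64, 72 and 76 percent figures and with the "approximately 25 percent" share stated in the abstract.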

  13. LDRD Report FY 03: Structure and Function of Regulatory DNA: A Next Major Challenge in Genomics

    SciTech Connect

    Stubbs, L

    2003-02-18

    With the human genome sequence now available and high quality draft sequences of mouse, rat and many other creatures recently or soon to be released, the field of Genomics has entered an especially exciting phase. The raw materials for locating the approximately 30-40,000 human genes and understanding their basic structure are now online; next, the research community must begin to unravel the mechanisms through which those genes create the complexity of life. Laboratories around the world are already beginning to focus on cataloguing the times, sites and conditions under which each gene is active; others are racing to predict, and then experimentally analyze, the structures of proteins that human genes encode. These activities are extremely important, but they will not reveal the mechanisms through which the correct proteins are activated precisely in the specific cells and at the particular time that is required for normal development, health, and response to the environment. Although we understand well the three-letter code through which genes dictate the production of proteins, the codes through which genes are turned on and off in precise, cell-specific patterns remain a mystery. Unraveling these codes is essential to understanding the functions of genes and the role of human genetic diversity in disease and environmental susceptibility. This problem also represents one of the most exciting challenges in modern biology, drawing in scientists from every discipline to develop the needed biological datasets, measurement technologies and algorithms. The LDRD effort that is the subject of this report was focused on establishing the basic technical and scientific foundations of a well-rounded program in gene regulatory biology at LLNL. The motivation for building these foundations was based on several drivers. First, with the sea-change in genomics, we sought to develop a new, exciting and forward-thinking research focus for the LLNL genomics team, which could

  14. Structure and evolution of the atypical mitochondrial genome of Armadillidium vulgare (Isopoda, Crustacea).

    PubMed

    Marcadé, Isabelle; Cordaux, Richard; Doublet, Vincent; Debenest, Catherine; Bouchon, Didier; Raimond, Roland

    2007-12-01

    The crustacean isopod Armadillidium vulgare is characterized by an unusual approximately 42-kb-long mitochondrial genome consisting of two molecules co-occurring in mitochondria: a circular approximately 28-kb dimer formed by two approximately 14-kb monomers fused in opposite polarities and a linear approximately 14-kb monomer. Here we determined the nucleotide sequence of the fundamental monomeric unit of A. vulgare mitochondrial genome, to gain new insight into its structure and evolution. Our results suggest that the junction zone between monomers of the dimer structure is located in or near the control region. Direct sequencing indicated that the nucleotide sequences of the different monomer units are virtually identical. This suggests that gene conversion and/or replication processes play an important role in shaping nucleotide sequence variation in this mitochondrial genome. The only heteroplasmic site we identified predicts an alloacceptor tRNA change from tRNA(Ala) to tRNA(Val). Therefore, in A. vulgare, tRNA(Ala) and tRNA(Val) are found at the same locus in different monomers, ensuring that both tRNAs are present in mitochondria. The presence of this heteroplasmic site in all sequenced individuals suggests that the polymorphism is selectively maintained, probably because of the necessity of both tRNAs for maintaining proper mitochondrial functions. Thus, our results provide empirical evidence for the tRNA gene recruitment model of tRNA evolution. Moreover, interspecific comparisons showed that the A. vulgare mitochondrial gene order is highly derived compared to the putative ancestral arthropod type. By contrast, an overall high conservation of mitochondrial gene order is observed within crustacean isopods. PMID:17906827

  15. The complete plastid genome sequence of the parasitic green alga Helicosporidium sp. is highly reduced and structured

    PubMed Central

    de Koning, Audrey P; Keeling, Patrick J

    2006-01-01

    Background Loss of photosynthesis has occurred independently in several plant and algal lineages, and represents a major metabolic shift with potential consequences for the content and structure of plastid genomes. To investigate such changes, we sequenced the complete plastid genome of the parasitic, non-photosynthetic green alga, Helicosporidium. Results The Helicosporidium plastid genome is among the smallest known (37.5 kb), and like other plastids from non-photosynthetic organisms it lacks all genes for proteins that function in photosynthesis. Its reduced size results from more than just loss of genes, however; it has little non-coding DNA, with only one intron and tiny intergenic spaces, and no inverted repeat (no duplicated genes at all). It encodes precisely the minimal complement of tRNAs needed to translate the universal genetic code, and has eliminated all redundant isoacceptors. The Helicosporidium plastid genome is also highly structured, with each half of the circular genome containing nearly all genes on one strand. Helicosporidium is known to be related to trebouxiophyte green algae, but the genome is structured and compacted in a manner more reminiscent of the non-photosynthetic plastids of apicomplexan parasites. Conclusion Helicosporidium contributes significantly to our understanding of the evolution of plastid DNA because it illustrates the highly ordered reduction that occurred following the loss of a major metabolic function. The convergence of plastid genome structure in Helicosporidium and the Apicomplexa raises the interesting possibility that there are common forces that shape plastid genomes, subsequent to the loss of photosynthesis in an organism. PMID:16630350

  16. Chloroplast Genome Sequence of the Moss Tortula ruralis: Gene Content and Structural Arrangement Relative to Other Green Plant Chloroplast Genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Tortula ruralis, a widely distributed moss species in the family Pottiaceae, is increasingly being used as a model organism for the study of desiccation tolerance and mechanisms of cellular repair. In this paper, we present the chloroplast genome sequence of Tortula ruralis, only the second publishe...

  17. Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing | Office of Cancer Genomics

    Cancer.gov

    Abstract: Diffuse large B-cell lymphoma (DLBCL) is a genetically heterogeneous cancer comprising at least two molecular subtypes that differ in gene expression and distribution of mutations. Recently, application of genome/exome sequencing and RNA-seq to DLBCL has revealed numerous genes that are recurrent targets of somatic point mutation in this disease.

  18. Universal internucleotide statistics in full genomes: a footprint of the DNA structure and packaging?

    PubMed

    Bogachev, Mikhail I; Kayumov, Airat R; Bunde, Armin

    2014-01-01

    Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms from Archaea and Bacteria to H. sapiens, aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleotide interval distributions exhibit the same [Formula: see text]-exponential form. While in prokaryotes a single [Formula: see text]-exponential function makes the best fit, in eukaryotes the PDF contains additionally a second [Formula: see text]-exponential, which in the human genome makes a perfect approximation over nearly 10 decades. We suggest that this functional form is a footprint of the heterogeneous DNA structure, where the first [Formula: see text]-exponential reflects the universal helical pitch that appears both in pro- and eukaryotic DNA, while the second [Formula: see text]-exponential is a specific marker of the large-scale eukaryotic DNA organization. PMID:25438044

  19. Deciphering the fine-structure of tribal admixture in the Bedouin population using genomic data

    PubMed Central

    Markus, B; Alshafee, I; Birk, O S

    2014-01-01

    The Bedouin Israeli population is highly inbred and structured with a very high prevalence of recessive diseases. Many studies in the past two decades focused on linkage analysis in large, multiple consanguineous pedigrees of this population. The advent of high-throughput technologies motivated researchers to search for rare variants shared between smaller pedigrees, integrating data from clinically similar yet seemingly non-related sporadic cases. However, such analyses are challenging because, without pedigree data, there is no prior knowledge regarding possible relatedness between the sporadic cases. Here, we describe models and techniques for the study of relationships between pedigrees and use them for the inference of tribal co-ancestry, delineating the complex social interactions between different tribes in the Negev Bedouins of southern Israel. Through our analysis, we differentiate between tribes that share many yet small genomic segments because of co-ancestry versus tribes that share larger segments because of recent admixture. The emergent pattern is well correlated with the prevalence of rare mutations in the different tribes. Tribes that do not intermarry, mostly because of social restrictions, hold private mutations, whereas tribes that do intermarry demonstrate a genetic flow of mutations between them. Thus, social structure within an inbred community can be delineated through genomic data, with implications to genetic counseling and genetic mapping. PMID:24084643

  20. Genome Scan for Selection in Structured Layer Chicken Populations Exploiting Linkage Disequilibrium Information

    PubMed Central

    Gholami, Mahmood; Reimer, Christian; Erbe, Malena; Preisinger, Rudolf; Weigend, Annett; Weigend, Steffen; Servin, Bertrand; Simianer, Henner

    2015-01-01

    Interest is increasing in the detection of genes, or genomic regions, that have been targeted by selection because identifying signatures of selection can lead to a better understanding of genotype-phenotype relationships. A common strategy for the detection of selection signatures is to compare samples from distinct populations and to search for genomic regions with outstanding genetic differentiation. The aim of this study was to detect selective signatures in layer chicken populations using a recently proposed approach, hapFLK, which exploits linkage disequilibrium information while accounting appropriately for the hierarchical structure of populations. We performed the analysis on 70 individuals from three commercial layer breeds (White Leghorn, White Rock and Rhode Island Red), genotyped for approximately 1 million SNPs. We found a total of 41 and 107 regions with outstanding differentiation or similarity using hapFLK and its single SNP counterpart FLK respectively. Annotation of selection signature regions revealed various genes and QTL corresponding to production traits, for which layer breeds were selected. A number of the detected genes were associated with growth and carcass traits, including IGF-1R, AGRP and STAT5B. We also annotated an interesting gene associated with the dark brown feather color mutational phenotype in chickens (SOX10). We compared FST, FLK and hapFLK and demonstrated that exploiting linkage disequilibrium information and accounting for hierarchical population structure decreased the false detection rate. PMID:26151449
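
    Of the three statistics compared in the abstract above, FST is the simplest to illustrate. The Python sketch below is an illustrative addition, not the authors' pipeline: it computes Wright's FST for a single biallelic SNP from per-population allele frequencies, weighting populations equally. It does not implement FLK or hapFLK, which additionally model the population tree and, for hapFLK, haplotype information. The function name `wright_fst` and the example frequencies are hypothetical.

```python
# Minimal sketch of a per-SNP differentiation statistic (Wright's FST).
# Assumption: one biallelic SNP, equal weight per population.
# This is NOT FLK or hapFLK; it only illustrates the baseline statistic.
from statistics import mean


def wright_fst(freqs):
    """FST = (H_T - H_S) / H_T from per-population allele frequencies."""
    p_bar = mean(freqs)                    # pooled allele frequency
    h_total = 2 * p_bar * (1 - p_bar)      # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                         # SNP is monomorphic everywhere
    h_within = mean(2 * p * (1 - p) for p in freqs)  # mean within-population
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies for one SNP in three breeds.
example_freqs = [0.10, 0.15, 0.90]
print(f"FST = {wright_fst(example_freqs):.3f}")
```

    Values near 0 indicate similar allele frequencies across populations, and values near 1 indicate near-fixation of different alleles. A genome scan in this spirit would compute the statistic per SNP and flag outlying regions.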

  1. Mapping of a gene coding for a major late structural polypeptide on the vaccinia virus genome.

    PubMed Central

    Wittek, R; Hänggi, M; Hiller, G

    1984-01-01

    Cell-free translation of total RNA isolated from vaccinia virus-infected cells late in infection results in a complex mixture of polypeptides. A monospecific antibody directed against one of the major structural proteins of the virus particle immunoprecipitated a single polypeptide with a molecular weight of 11,000 (11K) from this mixture. Immunoprecipitation was therefore used to identify the structural polypeptide among the in vitro translation products of RNA purified by hybridization selection to restriction fragments of the vaccinia virus genome. This allowed us to map the mRNA coding for the 11K polypeptide to the extreme left-hand end of the HindIII E fragment. Detailed transcriptional mapping of this region of the genome by nuclease S1 analysis revealed the presence of a late RNA transcribed from the rightward-reading strand. Its 5' end mapped at ca. 130 base pairs to the left of the HindIII site at the junction between the HindIII F and E fragments. The map position of this RNA coincided precisely with the map position of the late message coding for the 11K polypeptide. PMID:6319738

  2. Complete mitogenome of the edible sea urchin Loxechinus albus: genetic structure and comparative genomics within Echinozoa.

    PubMed

    Cea, Graciela; Gaitán-Espitia, Juan Diego; Cárdenas, Leyla

    2015-06-01

    The edible Chilean red sea urchin, Loxechinus albus, is the only species of its genus and endemic to the Southeastern Pacific. In this study, we reconstructed the mitochondrial genome of L. albus by combining Sanger and pyrosequencing technologies. The mtDNA genome had a length of 15,737 bp and encoded the same 13 protein-coding genes, 22 transfer RNA genes, and two ribosomal RNA genes as other animal mtDNAs. The size of this mitogenome was similar to those of other Echinodermata species. Structural comparisons showed a highly conserved structure, composition, and gene order within Echinoidea and Holothuroidea, and nearly identical gene organization to that found in Asteroidea and Crinoidea, with the majority of differences explained by the inversions of some tRNA genes. Phylogenetic reconstruction supported the monophyly of Echinozoa and recovered the monophyletic relationship of Holothuroidea and Echinoidea. Within Holothuroidea, Bayesian and maximum likelihood analyses recovered a sister-group relationship between Dendrochirotacea and Aspidochirotida. Similarly within Echinoidea, these analyses revealed that L. albus was closely related to Paracentrotus lividus, both being part of a sister group to Strongylocentrotidae and Echinometridae. In addition, two major clades were found within Strongylocentrotidae. One of these clades comprised all of the representative species Strongylocentrotus and Hemicentrotus, whereas the other included species of Mesocentrotus and Pseudocentrotus. PMID:25433433

  3. Structural Maintenance of Chromosome (SMC) Proteins Link Microtubule Stability to Genome Integrity

    PubMed Central

    Laflamme, Guillaume; Tremblay-Boudreault, Thierry; Roy, Marc-André; Andersen, Parker; Bonneil, Éric; Atchia, Kaleem; Thibault, Pierre; D'Amours, Damien; Kwok, Benjamin H.

    2014-01-01

    Structural maintenance of chromosome (SMC) proteins are key organizers of chromosome architecture and are essential for genome integrity. They act by binding to chromatin and connecting distinct parts of chromosomes together. Interestingly, their potential role in providing connections between chromatin and the mitotic spindle has not been explored. Here, we show that yeast SMC proteins bind directly to microtubules and can provide a functional link between microtubules and DNA. We mapped the microtubule-binding region of Smc5 and generated a mutant with impaired microtubule binding activity. This mutant is viable in yeast but exhibited a cold-specific conditional lethality associated with mitotic arrest, aberrant spindle structures, and chromosome segregation defects. In an in vitro reconstitution assay, this Smc5 mutant also showed a compromised ability to protect microtubules from cold-induced depolymerization. Collectively, these findings demonstrate that SMC proteins can bind to and stabilize microtubules and that SMC-microtubule interactions are essential to establish a robust system to maintain genome integrity. PMID:25135640

  4. HSA: integrating multi-track Hi-C data for genome-scale reconstruction of 3D chromatin structure.

    PubMed

    Zou, Chenchen; Zhang, Yuping; Ouyang, Zhengqing

    2016-01-01

    Genome-wide 3C technologies (Hi-C) are being increasingly employed to study three-dimensional (3D) genome conformations. Existing computational approaches are unable to integrate accumulating data to facilitate studying 3D chromatin structure and function. We present HSA (http://ouyanglab.jax.org/hsa/), a flexible tool that jointly analyzes multiple contact maps to infer 3D chromatin structure at the genome scale. HSA globally searches the latent structure underlying different cleavage footprints. Its robustness and accuracy outperform or rival existing tools on extensive simulations and orthogonal experiment validations. Applying HSA to recent in situ Hi-C data, we found the 3D chromatin structures are highly conserved across various human cell types. PMID:26936376

  5. The Childhood Leukemia International Consortium

    PubMed Central

    Metayer, Catherine; Milne, Elizabeth; Clavel, Jacqueline; Infante-Rivard, Claire; Petridou, Eleni; Taylor, Malcolm; Schüz, Joachim; Spector, Logan G.; Dockerty, John D.; Magnani, Corrado; Pombo-de-Oliveira, Maria S.; Sinnett, Daniel; Murphy, Michael; Roman, Eve; Monge, Patricia; Ezzat, Sameera; Mueller, Beth A.; Scheurer, Michael E.; Armstrong, Bruce K.; Birch, Jill; Kaatsch, Peter; Koifman, Sergio; Lightfoot, Tracy; Bhatti, Parveen; Bondy, Melissa L.; Rudant, Jérémie; O’Neill, Kate; Miligi, Lucia; Dessypris, Nick; Kang, Alice Y.; Buffler, Patricia A.

    2013-01-01

    Background Acute leukemia is the most common cancer in children under 15 years of age; 80% are acute lymphoblastic leukemia (ALL) and 17% are acute myeloid leukemia (AML). Childhood leukemia shows further diversity based on cytogenetic and molecular characteristics, which may relate to distinct etiologies. Case–control studies conducted worldwide, particularly of ALL, have collected a wealth of data on potential risk factors and in some studies, biospecimens. There is growing evidence for the role of infectious/immunologic factors, fetal growth, and several environmental factors in the etiology of childhood ALL. The risk of childhood leukemia, like other complex diseases, is likely to be influenced both by independent and interactive effects of genes and environmental exposures. While some studies have analyzed the role of genetic variants, few have been sufficiently powered to investigate gene–environment interactions. Objectives The Childhood Leukemia International Consortium (CLIC) was established in 2007 to promote investigations of rarer exposures, gene–environment interactions and subtype-specific associations through the pooling of data from independent studies. Methods By September 2012, CLIC included 22 studies (recruitment period: 1962–present) from 12 countries, totaling approximately 31 000 cases and 50 000 controls. Of these, 19 case–control studies have collected detailed epidemiologic data, and DNA samples have been collected from children and child–parent trios in 15 and 13 of these studies, respectively. Two registry-based studies and one study comprising hospital records routinely obtained at birth and/or diagnosis have limited interview data or biospecimens. Conclusions CLIC provides a unique opportunity to fill gaps in knowledge about the role of environmental and genetic risk factors, critical windows of exposure, the effects of gene–environment interactions and associations among specific leukemia subtypes in different ethnic

  6. Consortium for Materials Development in Space

    NASA Technical Reports Server (NTRS)

    1999-01-01

    During FY99 the Consortium for Materials Development in Space (CMDS) was reorganized around the following guidelines: industry driven, product focus, an industry led advisory council, focus on University of Alabama in Huntsville (UAH) core competencies, linkage to regional investment firms to assist commercialization and to take advantage of space flights. The organizational structure of the CMDS changed considerably during the year. The decision was made to reduce the organization to a Director and an Administrative Assistant. The various research projects, including the employees, were transferred to the appropriate UAH research center or college. In addition, an advisory council was established to provide direction and guidance to the CMDS to ensure a strong commercial focus. The council will (i) review CMDS commercial development plans and provide feedback, (ii) perform an annual evaluation of the Center's progress and present the results of this review to the UAH Vice President for Research, (iii) serve as an avenue of communication between the CMDS and its commercial partners, and (iv) serve as an ambassador and advocate for the CMDS.

  7. The International Pea Genome Sequencing Project: Sequencing and Assembly Progresses Updates

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The International Consortium for the Pea Genome Sequencing (ICPG) includes scientists from six countries around the world. Its aim is to provide a high quality reference of the pea genome to the scientific community as well as to the pea breeder community. The consortium proposed a strategy that int...

  8. From Genome to Structure and Back Again: A Family Portrait of the Transcarbamylases

    PubMed Central

    Shi, Dashuang; Allewell, Norma M.; Tuchman, Mendel

    2015-01-01

    Enzymes in the transcarbamylase family catalyze the transfer of a carbamyl group from carbamyl phosphate (CP) to an amino group of a second substrate. The two best-characterized members, aspartate transcarbamylase (ATCase) and ornithine transcarbamylase (OTCase), are present in most organisms from bacteria to humans. Recently, structures of four new transcarbamylase members, N-acetyl-l-ornithine transcarbamylase (AOTCase), N-succinyl-l-ornithine transcarbamylase (SOTCase), ygeW encoded transcarbamylase (YTCase) and putrescine transcarbamylase (PTCase) have also been determined. Crystal structures of these enzymes have shown that they have a common overall fold with a trimer as their basic biological unit. The monomer structures share a common CP binding site in their N-terminal domain, but have different second substrate binding sites in their C-terminal domain. The discovery of three new transcarbamylases, l-2,3-diaminopropionate transcarbamylase (DPTCase), l-2,4-diaminobutyrate transcarbamylase (DBTCase) and ureidoglycine transcarbamylase (UGTCase), demonstrates that our knowledge and understanding of the spectrum of the transcarbamylase family is still incomplete. In this review, we summarize studies on the structures and function of transcarbamylases demonstrating how structural information helps to define biological function and how small structural differences govern enzyme specificity. Such information is important for correctly annotating transcarbamylase sequences in the genome databases and for identifying new members of the transcarbamylase family. PMID:26274952

  9. ViVar: A Comprehensive Platform for the Analysis and Visualization of Structural Genomic Variation

    PubMed Central

    Sante, Tom; Vergult, Sarah; Volders, Pieter-Jan; Kloosterman, Wigard P.; Trooskens, Geert; De Preter, Katleen; Dheedene, Annelies; Speleman, Frank; De Meyer, Tim; Menten, Björn

    2014-01-01

    Structural genomic variations play an important role in human disease and phenotypic diversity. With the rise of high-throughput sequencing tools, mate-pair/paired-end/single-read sequencing has become an important technique for the detection and exploration of structural variation. Several analysis tools exist to handle different parts and aspects of such sequencing-based structural variation analysis pipelines. A comprehensive analysis platform to handle all steps, from processing the sequencing data to the discovery and visualization of structural variants, is missing. The ViVar platform is built to handle the discovery of structural variants, from depth-of-coverage analysis and aberrant read-pair clustering to split-read analysis. ViVar provides you with powerful visualization options, enables easy reporting of results and better usability and data management. The platform facilitates the processing, analysis and visualization of structural variation based on massively parallel sequencing data, enabling the rapid identification of disease loci or genes. ViVar allows you to scale your analysis with your work load over multiple (cloud) servers, has user access control to keep your data safe and is easily expandable as analysis techniques advance. URL: https://www.cmgg.be/vivar/ PMID:25503062

  10. The Impact of Spatial Structure on Viral Genomic Diversity Generated during Adaptation to Thermal Stress

    PubMed Central

    Ally, Dilara; Wiss, Valorie R.; Deckert, Gail E.; Green, Danielle; Roychoudhury, Pavitra; Wichman, Holly A.; Brown, Celeste J.; Krone, Stephen M.

    2014-01-01

    Background Most clinical and natural microbial communities live and evolve in spatially structured environments. When changes in environmental conditions trigger evolutionary responses, spatial structure can impact the types of adaptive response and the extent to which they spread. In particular, localized competition in a spatial landscape can lead to the emergence of a larger number of different adaptive trajectories than would be found in well-mixed populations. Our goal was to determine how two levels of spatial structure affect genomic diversity in a population and how this diversity is manifested spatially. Methodology/Principal Findings We serially transferred bacteriophage populations growing at high temperatures (40°C) on agar plates for 550 generations at two levels of spatial structure. The level of spatial structure was determined by whether the physical locations of the phage subsamples were preserved or disrupted at each passage to fresh bacterial host populations. When spatial structure of the phage populations was preserved, there was significantly greater diversity on a global scale with restricted and patchy distribution. When spatial structure was disrupted with passaging to fresh hosts, beneficial mutants were spread across the entire plate. This resulted in reduced diversity, possibly due to clonal interference as the most fit mutants entered into competition on a global scale. Almost all substitutions present at the end of the adaptation in the populations with disrupted spatial structure were also present in the populations with structure preserved. Conclusions/Significance Our results are consistent with the patchy nature of the spread of adaptive mutants in a spatial landscape. Spatial structure enhances diversity and slows fixation of beneficial mutants. This added diversity could be beneficial in fluctuating environments. We also connect observed substitutions and their effects on fitness to aspects of phage biology, and we provide

  11. Niches, Population Structure and Genome Reduction in Ochrobactrum intermedium: Clues to Technology-Driven Emergence of Pathogens

    PubMed Central

    Aujoulat, Fabien; Romano-Bertrand, Sara; Masnou, Agnès; Marchandin, Hélène; Jumas-Bilak, Estelle

    2014-01-01

    Ochrobactrum intermedium is considered an emerging human environmental opportunistic pathogen with mild virulence. The distribution of isolates and sequences described in literature and databases showed frequent association with human beings and polluted environments. As population structures are related to bacterial lifestyles, we investigated by a multi-locus approach the genetic structure of a population of 65 isolates representative of the known natural distribution of O. intermedium. The population was further surveyed for genome dynamics using pulsed-field gel electrophoresis and genomics. The population displayed a clonal epidemic structure with events of recombination that occurred mainly in clonal complexes. Concerning biogeography, clones were shared by humans and environments and were both cosmopolitan and local. The main cosmopolitan clone was genetically and genomically stable, and grouped isolates that all harbored an atypical insertion in the rrs. Ubiquitism and stability of this major clone suggested a clonal success in a particular niche. Events of genomic reduction were detected in the population and the deleted genomic content was described for one isolate. O. intermedium displayed allopatric characters associated with a tendency toward genome reduction, suggesting a specialization process. Considering its relatedness with Brucella, this specialization might be a commitment toward a pathogenic life-style that could be driven by technological selective pressure related to medical and industrial technologies. PMID:24465379

  12. Genome-Wide Study of Structural Variants in Bovine Holstein, Montbéliarde and Normande Dairy Breeds

    PubMed Central

    Boussaha, Mekki; Esquerré, Diane; Barbieri, Johanna; Djari, Anis; Pinton, Alain; Letaief, Rabia; Salin, Gérald; Escudié, Frédéric; Roulet, Alain; Fritz, Sébastien; Samson, Franck; Grohs, Cécile; Bernard, Maria; Klopp, Christophe; Boichard, Didier; Rocha, Dominique

    2015-01-01

    High-throughput sequencing technologies have offered in recent years new opportunities to study genome variations. These studies have mostly focused on single nucleotide polymorphisms, small insertions or deletions and on copy number variants. Other structural variants, such as large insertions or deletions, tandem duplications, translocations, and inversions are less well-studied, even though some have an important impact on phenotypes. In the present study, we performed a large-scale survey of structural variants in cattle. We report the identification of 6,426 putative structural variants in cattle extracted from whole-genome sequence data of 62 bulls representing the three major French dairy breeds. These genomic variants affect DNA segments greater than 50 base pairs and correspond to deletions, inversions and tandem duplications. Out of these, we identified a total of 547 deletions and 410 tandem duplications which could potentially code for CNVs. Experimental validation was carried out on 331 structural variants using a novel high-throughput genotyping method. Out of these, 255 structural variants (77%) generated good quality genotypes and 191 (75%) of them were validated. Gene content analyses in structural variant regions revealed 941 large deletions completely removing one or several genes, including 10 single-copy genes. In addition, some of the structural variants are located within quantitative trait loci for dairy traits. This study is a pan-genome assessment of genomic variations in cattle and may provide a new glimpse into the bovine genome architecture. Our results may also help to study the effects of structural variants on gene expression and consequently their effect on certain phenotypes of interest. PMID:26317361
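
    The two validation percentages in this abstract use different denominators: 77% is the share of the 331 tested variants that produced good-quality genotypes, and 75% is the share of those 255 genotyped variants that were validated. A minimal sketch of the arithmetic follows; the function name `percent` and the variable names are illustrative assumptions, and only the three counts come from the abstract.

```python
# Counts taken from the abstract; everything else is illustrative.
TESTED = 331      # structural variants submitted to experimental validation
GENOTYPED = 255   # variants that generated good-quality genotypes
VALIDATED = 191   # genotyped variants that were confirmed


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


genotyping_rate = percent(GENOTYPED, TESTED)      # denominator: all tested
validation_rate = percent(VALIDATED, GENOTYPED)   # denominator: genotyped only
overall_rate = percent(VALIDATED, TESTED)         # not stated in the abstract

if __name__ == "__main__":
    print(f"good-quality genotypes: {genotyping_rate:.1f}% of tested")
    print(f"validated:              {validation_rate:.1f}% of genotyped")
    print(f"validated:              {overall_rate:.1f}% of all tested")
```

    The third figure, the validated share of all tested variants, is not reported in the abstract and is derived here only to show how the choice of denominator changes the rate.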

  13. Genome structure and metabolic features in the red seaweed Chondrus crispus shed light on evolution of the Archaeplastida

    PubMed Central

    Collén, Jonas; Porcel, Betina; Carré, Wilfrid; Ball, Steven G.; Chaparro, Cristian; Tonon, Thierry; Barbeyron, Tristan; Michel, Gurvan; Noel, Benjamin; Valentin, Klaus; Elias, Marek; Artiguenave, François; Arun, Alok; Aury, Jean-Marc; Barbosa-Neto, José F.; Bothwell, John H.; Bouget, François-Yves; Brillet, Loraine; Cabello-Hurtado, Francisco; Capella-Gutiérrez, Salvador; Charrier, Bénédicte; Cladière, Lionel; Cock, J. Mark; Coelho, Susana M.; Colleoni, Christophe; Czjzek, Mirjam; Da Silva, Corinne; Delage, Ludovic; Denoeud, France; Deschamps, Philippe; Dittami, Simon M.; Gabaldón, Toni; Gachon, Claire M. M.; Groisillier, Agnès; Hervé, Cécile; Jabbari, Kamel; Katinka, Michael; Kloareg, Bernard; Kowalczyk, Nathalie; Labadie, Karine; Leblanc, Catherine; Lopez, Pascal J.; McLachlan, Deirdre H.; Meslet-Cladiere, Laurence; Moustafa, Ahmed; Nehr, Zofia; Nyvall Collén, Pi; Panaud, Olivier; Partensky, Frédéric; Poulain, Julie; Rensing, Stefan A.; Rousvoal, Sylvie; Samson, Gaelle; Symeonidi, Aikaterini; Weissenbach, Jean; Zambounis, Antonios; Wincker, Patrick; Boyen, Catherine

    2013-01-01

    Red seaweeds are key components of coastal ecosystems and are economically important as food and as a source of gelling agents, but their genes and genomes have received little attention. Here we report the sequencing of the 105-Mbp genome of the florideophyte Chondrus crispus (Irish moss) and the annotation of the 9,606 genes. The genome features an unusual structure characterized by gene-dense regions surrounded by repeat-rich regions dominated by transposable elements. Despite its fairly large size, this genome shows features typical of compact genomes, e.g., on average only 0.3 introns per gene, short introns, low median distance between genes, small gene families, and no indication of large-scale genome duplication. The genome also gives insights into the metabolism of marine red algae and adaptations to the marine environment, including genes related to halogen metabolism, oxylipins, and multicellularity (microRNA processing and transcription factors). Particularly interesting are features related to carbohydrate metabolism, which include a minimalistic gene set for starch biosynthesis, the presence of cellulose synthases acquired before the primary endosymbiosis showing the polyphyly of cellulose synthesis in Archaeplastida, and cellulases absent in terrestrial plants as well as the occurrence of a mannosylglycerate synthase potentially originating from a marine bacterium. To explain the observations on genome structure and gene content, we propose an evolutionary scenario involving an ancestral red alga that was driven by early ecological forces to lose genes, introns, and intergenic DNA; this loss was followed by an expansion of genome size as a consequence of activity of transposable elements. PMID:23503846

  14. Population structure and linkage disequilibrium in oat (Avena sativa L.): implications for genome-wide association studies

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The level of population structure and the extent of linkage disequilibrium (LD) can have large impacts on the power, resolution, and design of genome-wide association studies (GWAS) in plants. Until recently, the topics of LD and population structure have not been explored in oat due to the lack of...

  15. MPE-seq, a new method for the genome-wide analysis of chromatin structure.

    PubMed

    Ishii, Haruhiko; Kadonaga, James T; Ren, Bing

    2015-07-01

    The analysis of chromatin structure is essential for the understanding of transcriptional regulation in eukaryotes. Here we describe methidiumpropyl-EDTA sequencing (MPE-seq), a method for the genome-wide characterization of chromatin that involves the digestion of nuclei with MPE-Fe(II) followed by massively parallel sequencing. Like micrococcal nuclease (MNase), MPE-Fe(II) preferentially cleaves the linker DNA between nucleosomes. However, there are differences in the cleavage of nuclear chromatin by MPE-Fe(II) relative to MNase. Most notably, immediately upstream of the transcription start site of active promoters, we frequently observed nucleosome-sized (141-190 bp) and subnucleosome-sized (such as 101-140 bp) peaks of digested chromatin fragments with MPE-seq but not with MNase-seq. These peaks also correlate with the presence of core histones and could thus be due, at least in part, to noncanonical chromatin structures such as labile nucleosome-like particles that have been observed in other contexts. The subnucleosome-sized MPE-seq peaks exhibit a particularly distinct association with active promoters. In addition, unlike MNase, MPE-Fe(II) cleaves nuclear DNA with little sequence bias. In this regard, we found that DNA sequences at RNA splice sites are hypersensitive to digestion by MNase but not by MPE-Fe(II). This phenomenon may have affected the analysis of nucleosome occupancy over exons. These findings collectively indicate that MPE-seq provides a unique and straightforward means for the genome-wide analysis of chromatin structure with minimal DNA sequence bias. In particular, the combined use of MPE-seq and MNase-seq enables the identification of noncanonical chromatin structures that are likely to be important for the regulation of gene expression. PMID:26080409

  16. Population genomics of dengue virus serotype 4: insights into genetic structure and evolution.

    PubMed

    Waman, Vaishali P; Kasibhatla, Sunitha Manjari; Kale, Mohan M; Kulkarni-Kale, Urmila

    2016-08-01

    The spread of dengue disease has become a global public health concern. Dengue is caused by dengue virus, which is a mosquito-borne arbovirus of the genus Flavivirus, family Flaviviridae. There are four dengue virus serotypes (1-4), each of which is known to trigger mild to severe disease. Dengue virus serotype 4 (DENV-4) has four genotypes and is increasingly being reported to be re-emerging in various parts of the world. Therefore, the population structure and factors shaping the evolution of DENV-4 strains across the world were studied using genome-based population genetic, phylogenetic and selection pressure analysis methods. The population genomics study helped to reveal the spatiotemporal structure of the DENV-4 population and its primary division into two spatially distinct clusters: American and Asian. These spatial clusters show further time-dependent subdivisions within genotypes I and II. Thus, the DENV-4 population is observed to be stratified into eight genetically distinct lineages, two of which are formed by American strains and six of which are formed by Asian strains. Episodic positive selection was observed in the structural (E) and non-structural (NS2A and NS3) genes, which appears to be responsible for diversification of Asian lineages in general and that of modern lineages of genotype I and II in particular. In summary, the global DENV-4 population is stratified into eight genetically distinct lineages, in a spatiotemporal manner with limited recombination. The significant role of adaptive evolution in causing diversification of DENV-4 lineages is discussed. The evolution of DENV-4 appears to be governed by interplay between spatiotemporal distribution, episodic positive selection and intra/inter-genotype recombination. PMID:27169727

  17. Report on three Genomes to Life Workshops: Data Infrastructure, Modeling and Simulation, and Protein Structure Prediction

    SciTech Connect

    Geist, GA

    2003-09-16

    On July 22, 23, 24, 2003, three one-day workshops were held in Gaithersburg, Maryland. Each was attended by about 30 computational biologists, mathematicians, and computer scientists who were experts in the respective workshop areas. The first workshop discussed the data infrastructure needs for the Genomes to Life (GTL) program with the objective to identify gaps in the present GTL data infrastructure and define the GTL data infrastructure required for the success of the proposed GTL facilities. The second workshop discussed the modeling and simulation needs for the next phase of the GTL program and defined how these relate to the experimental data generated by genomics, proteomics, and metabolomics. The third workshop identified emerging technical challenges in computational protein structure prediction for DOE missions and outlined specific goals for the next phase of GTL. The workshops were attended by representatives from both OBER and OASCR. The invited experts at each of the workshops made short presentations on what they perceived as the key needs in the GTL data infrastructure, modeling and simulation, and structure prediction respectively. Each presentation was followed by a lively discussion by all the workshop attendees. The following findings and recommendations were derived from the three workshops. A seamless integration of GTL data spanning the entire range of genomics, proteomics, and metabolomics will be extremely challenging but it has to be treated as a first-class component of the GTL program to assure GTL's chances for success. High-throughput GTL facilities and ultrascale computing will make it possible to address the ultimate goal of modern biology: to achieve a fundamental, comprehensive, and systematic understanding of life. But first the GTL community needs to address the problem of the massive quantities and increased complexity of biological data produced by experiments and computations. Genome-scale collection, analysis

  18. Structure of the conserved hypothetical protein MAL13P1.257 from Plasmodium falciparum

    PubMed Central

    Holmes, Margaret A.; Buckner, Frederick S.; Van Voorhis, Wesley C.; Mehlin, Christopher; Boni, Erica; Earnest, Thomas N.; DeTitta, George; Luft, Joseph; Lauricella, Angela; Anderson, Lori; Kalyuzhniy, Oleksandr; Zucker, Frank; Schoenfeld, Lori W.; Hol, Wim G. J.; Merritt, Ethan A.

    2006-01-01

    The structure of a conserved hypothetical protein, PlasmoDB sequence MAL13P1.257 from Plasmodium falciparum, Pfam sequence family PF05907, has been determined as part of the structural genomics effort of the Structural Genomics of Pathogenic Protozoa consortium. The structure was determined by multiple-wavelength anomalous dispersion at 2.17 Å resolution. The structure is almost entirely β-sheet; it consists of 15 β-strands and one short 3₁₀-helix and represents a new protein fold. The packing of the two monomers in the asymmetric unit indicates that the biological unit may be a dimer. PMID:16511296

  19. Population genomic analysis of ancient and modern genomes yields new insights into the genetic ancestry of the Tyrolean Iceman and the genetic structure of Europe.

    PubMed

    Sikora, Martin; Carpenter, Meredith L; Moreno-Estrada, Andres; Henn, Brenna M; Underhill, Peter A; Sánchez-Quinto, Federico; Zara, Ilenia; Pitzalis, Maristella; Sidore, Carlo; Busonero, Fabio; Maschio, Andrea; Angius, Andrea; Jones, Chris; Mendoza-Revilla, Javier; Nekhrizov, Georgi; Dimitrova, Diana; Theodossiev, Nikola; Harkins, Timothy T; Keller, Andreas; Maixner, Frank; Zink, Albert; Abecasis, Goncalo; Sanna, Serena; Cucca, Francesco; Bustamante, Carlos D

    2014-05-01

    Genome sequencing of the 5,300-year-old mummy of the Tyrolean Iceman, found in 1991 on a glacier near the border of Italy and Austria, has yielded new insights into his origin and relationship to modern European populations. A key finding of that study was an apparent recent common ancestry with individuals from Sardinia, based largely on the Y chromosome haplogroup and common autosomal SNP variation. Here, we compiled and analyzed genomic datasets from both modern and ancient Europeans, including genome sequence data from over 400 Sardinians and two ancient Thracians from Bulgaria, to investigate this result in greater detail and determine its implications for the genetic structure of Neolithic Europe. Using whole-genome sequencing data, we confirm that the Iceman is, indeed, most closely related to Sardinians. Furthermore, we show that this relationship extends to other individuals from cultural contexts associated with the spread of agriculture during the Neolithic transition, in contrast to individuals from a hunter-gatherer context. We hypothesize that this genetic affinity of ancient samples from different parts of Europe with Sardinians represents a common genetic component that was geographically widespread across Europe during the Neolithic, likely related to migrations and population expansions associated with the spread of agriculture. PMID:24809476

  20. Comprehensive long-span paired-end-tag mapping reveals characteristic patterns of structural variations in epithelial cancer genomes

    PubMed Central

    Hillmer, Axel M.; Yao, Fei; Inaki, Koichiro; Lee, Wah Heng; Ariyaratne, Pramila N.; Teo, Audrey S.M.; Woo, Xing Yi; Zhang, Zhenshui; Zhao, Hao; Ukil, Leena; Chen, Jieqi P.; Zhu, Feng; So, Jimmy B.Y.; Salto-Tellez, Manuel; Poh, Wan Ting; Zawack, Kelson F.B.; Nagarajan, Niranjan; Gao, Song; Li, Guoliang; Kumar, Vikrant; Lim, Hui Ping J.; Sia, Yee Yen; Chan, Chee Seng; Leong, See Ting; Neo, Say Chuan; Choi, Poh Sum D.; Thoreau, Hervé; Tan, Patrick B.O.; Shahab, Atif; Ruan, Xiaoan; Bergh, Jonas; Hall, Per; Cacheux-Rataboul, Valère; Wei, Chia-Lin; Yeoh, Khay Guan; Sung, Wing-Kin; Bourque, Guillaume; Liu, Edison T.; Ruan, Yijun

    2011-01-01

    Somatic genome rearrangements are thought to play important roles in cancer development. We optimized a long-span paired-end-tag (PET) sequencing approach using 10-Kb genomic DNA inserts to study human genome structural variations (SVs). The use of a 10-Kb insert size allows the identification of breakpoints within repetitive or homology-containing regions of a few kilobases in size and results in a higher physical coverage compared with small insert libraries with the same sequencing effort. We have applied this approach to comprehensively characterize the SVs of 15 cancer and two noncancer genomes and used a filtering approach to strongly enrich for somatic SVs in the cancer genomes. Our analyses revealed that most inversions, deletions, and insertions are germ-line SVs, whereas tandem duplications, unpaired inversions, interchromosomal translocations, and complex rearrangements are over-represented among somatic rearrangements in cancer genomes. We demonstrate that the quantitative and connective nature of DNA–PET data is precise in delineating the genealogy of complex rearrangement events, we observe signatures that are compatible with breakage-fusion-bridge cycles, and we discover that large duplications are among the initial rearrangements that trigger genome instability for extensive amplification in epithelial cancers. PMID:21467267
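
    The abstract's claim that a 10-Kb insert gives higher physical coverage than a small-insert library "with the same sequencing effort" follows from how the two coverages are usually defined: sequence coverage counts only the bases actually read, while physical coverage counts the whole genomic span between the two tags of each pair. A minimal sketch, assuming that standard definition; the pair count, tag length, genome size and the 500-bp comparison library are hypothetical values, not figures from the study.

```python
def sequence_coverage(n_pairs: int, tag_length: int, genome_size: int) -> float:
    """Average number of times a base is covered by sequenced tags (2 tags per pair)."""
    return n_pairs * 2 * tag_length / genome_size


def physical_coverage(n_pairs: int, insert_size: int, genome_size: int) -> float:
    """Average number of paired-end inserts spanning a given genomic position."""
    return n_pairs * insert_size / genome_size


# Hypothetical library parameters, chosen only for illustration.
GENOME_SIZE = 3_100_000_000   # approximate human genome size in bp (assumption)
N_PAIRS = 10_000_000          # same sequencing effort for both libraries
TAG_LENGTH = 25               # bases read from each end (assumption)

small_insert = physical_coverage(N_PAIRS, 500, GENOME_SIZE)
large_insert = physical_coverage(N_PAIRS, 10_000, GENOME_SIZE)
bases_read = sequence_coverage(N_PAIRS, TAG_LENGTH, GENOME_SIZE)

if __name__ == "__main__":
    print(f"sequence coverage (both libraries): {bases_read:.2f}x")
    print(f"physical coverage, 500-bp inserts:  {small_insert:.2f}x")
    print(f"physical coverage, 10-Kb inserts:   {large_insert:.2f}x")
```

    Under these assumptions both libraries read the same number of bases, but the 10-Kb library spans twenty times more of the genome per pair, which is why more rearrangement breakpoints fall between paired tags.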

  1. Genome characterization and population genetic structure of the zoonotic pathogen, Streptococcus canis

    PubMed Central

    2012-01-01

    Background Streptococcus canis is an important opportunistic pathogen of dogs and cats that can also infect a wide range of additional mammals including cows where it can cause mastitis. It is also an emerging human pathogen. Results Here we provide characterization of the first genome sequence for this species, strain FSL S3-227 (milk isolate from a cow with an intra-mammary infection). A diverse array of putative virulence factors was encoded by the S. canis FSL S3-227 genome. Approximately 75% of these gene sequences were homologous to known streptococcal virulence factors involved in invasion, evasion, and colonization. Present in the genome are multiple potentially mobile genetic elements (MGEs) [plasmid, phage, integrative conjugative element (ICE)] and comparison to other species provided convincing evidence for lateral gene transfer (LGT) between S. canis and two additional bovine mastitis-causing pathogens (Streptococcus agalactiae and Streptococcus dysgalactiae subsp. dysgalactiae), with this transfer possibly contributing to host adaptation. Population structure among isolates obtained from Europe and the USA [bovine = 56, canine = 26, and feline = 1] was explored. Ribotyping of all isolates and multilocus sequence typing (MLST) of a subset of the isolates (n = 45) detected significant differentiation between bovine and canine isolates (Fisher exact test: P = 0.0000 [ribotypes], P = 0.0030 [sequence types]), suggesting possible host adaptation of some genotypes. Concurrently, the ancestral clonal complex (54% of isolates) occurred in many tissue types, all hosts, and all geographic locations suggesting the possibility of a wide and diverse niche. Conclusion This study provides evidence highlighting the importance of LGT in the evolution of the bacterium S. canis, specifically, its possible role in host adaptation and acquisition of virulence factors. Furthermore, recent LGT detected between S. canis and human bacteria (Streptococcus

  2. Comparative Genome Analyses Reveal Distinct Structure in the Saltwater Crocodile MHC

    PubMed Central

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M.; Shan, Xueyan; Peterson, Daniel G.; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M.; Isberg, Sally R.; Higgins, Damien P.; Chong, Amanda Y.; John, John St; Glenn, Travis C.; Ray, David A.; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionarily distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2–6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled that in the eutherians compared but was absent in avian MHC, suggesting that the saltwater crocodile MHC has a gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs. PMID:25503521

  3. Comparative genome analyses reveal distinct structure in the saltwater crocodile MHC.

    PubMed

    Jaratlerdsiri, Weerachai; Deakin, Janine; Godinez, Ricardo M; Shan, Xueyan; Peterson, Daniel G; Marthey, Sylvain; Lyons, Eric; McCarthy, Fiona M; Isberg, Sally R; Higgins, Damien P; Chong, Amanda Y; John, John St; Glenn, Travis C; Ray, David A; Gongora, Jaime

    2014-01-01

    The major histocompatibility complex (MHC) is a dynamic genome region with an essential role in the adaptive immunity of vertebrates, especially antigen presentation. The MHC is generally divided into subregions (classes I, II and III) containing genes of similar function across species, but with different gene number and organisation. Crocodylia (crocodilians) are widely distributed and represent an evolutionarily distinct group among higher vertebrates, but the genomic organisation of MHC within this lineage has been largely unexplored. Here, we studied the MHC region of the saltwater crocodile (Crocodylus porosus) and compared it with that of other taxa. We characterised genomic clusters encompassing MHC class I and class II genes in the saltwater crocodile based on sequencing of bacterial artificial chromosomes. Six gene clusters spanning ∼452 kb were identified to contain nine MHC class I genes, six MHC class II genes, three TAP genes, and a TRIM gene. These MHC class I and class II genes were in separate scaffold regions and were greater in length (2-6 times longer) than their counterparts in well-studied fowl B loci, suggesting that the compaction of avian MHC occurred after the crocodilian-avian split. Comparative analyses between the saltwater crocodile MHC and that from the alligator and gharial showed large syntenic areas (>80% identity) with similar gene order. Comparisons with other vertebrates showed that the saltwater crocodile had MHC class I genes located along with TAP, consistent with birds studied. Linkage between MHC class I and TRIM39 observed in the saltwater crocodile resembled that in the eutherians compared but was absent in avian MHC, suggesting that the saltwater crocodile MHC has a gene organisation intermediate between these two lineages. These observations suggest that the structure of the saltwater crocodile MHC, and other crocodilians, can help determine the MHC that was present in the ancestors of archosaurs. PMID:25503521

  4. Population genomic structure and adaptation in the zoonotic malaria parasite Plasmodium knowlesi.

    PubMed

    Assefa, Samuel; Lim, Caeul; Preston, Mark D; Duffy, Craig W; Nair, Mridul B; Adroub, Sabir A; Kadir, Khamisah A; Goldberg, Jonathan M; Neafsey, Daniel E; Divis, Paul; Clark, Taane G; Duraisingh, Manoj T; Conway, David J; Pain, Arnab; Singh, Balbir

    2015-10-20

    Malaria cases caused by the zoonotic parasite Plasmodium knowlesi are being increasingly reported throughout Southeast Asia and in travelers returning from the region. To test for evidence of signatures of selection or unusual population structure in this parasite, we surveyed genome sequence diversity in 48 clinical isolates recently sampled from Malaysian Borneo and in five lines maintained in laboratory rhesus macaques after isolation in the 1960s from Peninsular Malaysia and the Philippines. Overall genomewide nucleotide diversity (π = 6.03 × 10⁻³) was much higher than has been seen in worldwide samples of either of the major endemic malaria parasite species Plasmodium falciparum and Plasmodium vivax. A remarkable substructure is revealed within P. knowlesi, consisting of two major sympatric clusters of the clinical isolates and a third cluster comprising the laboratory isolates. There was deep differentiation between the two clusters of clinical isolates [mean genomewide fixation index (FST) = 0.21, with 9,293 SNPs having fixed differences of FST = 1.0]. This differentiation showed marked heterogeneity across the genome, with mean FST values of different chromosomes ranging from 0.08 to 0.34 and with further significant variation across regions within several chromosomes. Analysis of the largest cluster (cluster 1, 38 isolates) indicated long-term population growth, with negatively skewed allele frequency distributions (genomewide average Tajima's D = -1.35). Against this background there was evidence of balancing selection on particular genes, including the circumsporozoite protein (csp) gene, which had the top Tajima's D value (1.57), and scans of haplotype homozygosity implicate several genomic regions as being under recent positive selection. PMID:26438871
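
    Illustrative note (not part of the indexed record): two of the summary statistics quoted in this abstract, nucleotide diversity (π) and Tajima's D, can be computed from aligned haplotypes roughly as sketched below. This is a minimal Python sketch that assumes equal-length, gap-free sequences with no missing data. The function names are hypothetical, and it is not the pipeline the authors used.

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / len(pairs)


def nucleotide_diversity(seqs):
    """Per-site nucleotide diversity (pi): mean pairwise differences / alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def segregating_sites(seqs):
    """Number of alignment columns that carry more than one allele."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajimas_d(seqs):
    """Tajima's D (Tajima 1989). Returns None when the statistic is undefined."""
    n = len(seqs)
    s = segregating_sites(seqs)
    # The variance term is zero for n < 4, and D is undefined without variation.
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the two diversity estimators, scaled by its standard deviation.
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

    In this sketch a sample made up only of singleton variants gives a negative D (an excess of rare alleles, the direction reported for cluster 1), while a sample split evenly between two haplotypes gives a positive D (the direction reported for csp).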

  5. Extensive sequencing of seven human genomes to characterize benchmark reference materials.

    PubMed

    Zook, Justin M; Catoe, David; McDaniel, Jennifer; Vang, Lindsay; Spies, Noah; Sidow, Arend; Weng, Ziming; Liu, Yuling; Mason, Christopher E; Alexander, Noah; Henaff, Elizabeth; McIntyre, Alexa B R; Chandramohan, Dhruva; Chen, Feng; Jaeger, Erich; Moshrefi, Ali; Pham, Khoa; Stedman, William; Liang, Tiffany; Saghbini, Michael; Dzakula, Zeljko; Hastie, Alex; Cao, Han; Deikus, Gintaras; Schadt, Eric; Sebra, Robert; Bashir, Ali; Truty, Rebecca M; Chang, Christopher C; Gulbahce, Natali; Zhao, Keyan; Ghosh, Srinka; Hyland, Fiona; Fu, Yutao; Chaisson, Mark; Xiao, Chunlin; Trow, Jonathan; Sherry, Stephen T; Zaranek, Alexander W; Ball, Madeleine; Bobe, Jason; Estep, Preston; Church, George M; Marks, Patrick; Kyriazopoulou-Panagiotopoulou, Sofia; Zheng, Grace X Y; Schnall-Levin, Michael; Ordonez, Heather S; Mudivarti, Patrice A; Giorda, Kristina; Sheng, Ying; Rypdal, Karoline Bjarnesdatter; Salit, Marc

    2016-01-01

    The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST), is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly. PMID:27271295

  6. Extensive sequencing of seven human genomes to characterize benchmark reference materials

    PubMed Central

    Zook, Justin M.; Catoe, David; McDaniel, Jennifer; Vang, Lindsay; Spies, Noah; Sidow, Arend; Weng, Ziming; Liu, Yuling; Mason, Christopher E.; Alexander, Noah; Henaff, Elizabeth; McIntyre, Alexa B.R.; Chandramohan, Dhruva; Chen, Feng; Jaeger, Erich; Moshrefi, Ali; Pham, Khoa; Stedman, William; Liang, Tiffany; Saghbini, Michael; Dzakula, Zeljko; Hastie, Alex; Cao, Han; Deikus, Gintaras; Schadt, Eric; Sebra, Robert; Bashir, Ali; Truty, Rebecca M.; Chang, Christopher C.; Gulbahce, Natali; Zhao, Keyan; Ghosh, Srinka; Hyland, Fiona; Fu, Yutao; Chaisson, Mark; Xiao, Chunlin; Trow, Jonathan; Sherry, Stephen T.; Zaranek, Alexander W.; Ball, Madeleine; Bobe, Jason; Estep, Preston; Church, George M.; Marks, Patrick; Kyriazopoulou-Panagiotopoulou, Sofia; Zheng, Grace X.Y.; Schnall-Levin, Michael; Ordonez, Heather S.; Mudivarti, Patrice A.; Giorda, Kristina; Sheng, Ying; Rypdal, Karoline Bjarnesdatter; Salit, Marc

    2016-01-01

    The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST), is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly. PMID:27271295

  7. Genome-wide analysis of core promoter structures in Schizosaccharomyces pombe with DeepCAGE.

    PubMed

    Li, Hua; Hou, Jingyi; Bai, Ling; Hu, Chuansheng; Tong, Pan; Kang, Yani; Zhao, Xiaodong; Shao, Zhifeng

    2015-01-01

    The core promoter, which immediately flanks the transcription start site (TSS), plays a critical role in transcriptional regulation of eukaryotes. Recent studies on higher eukaryotes have revealed an unprecedented complexity of core promoter structures that underscores diverse regulatory mechanisms of gene expression. For unicellular eukaryotes, however, the structures of core promoters have not been investigated in detail. As an important model organism, Schizosaccharomyces pombe still lacks precise annotation of TSSs, thus hampering the analysis of core promoter structures and their relationship to higher eukaryotes. Here we used a deep sequencing-based approach (DeepCAGE) to generate 16 million uniquely mapped tags, corresponding to 93,736 positions in the S. pombe genome. The high-resolution TSS landscape enabled identification of over 8,000 core promoters, characterization of 4 promoter classes and observation of widespread alternative promoters. The landscape also allowed precise determination of the representative TSSs within core promoters, thus redefining the 5' UTR for 82.8% of S. pombe genes. We further identified the consensus initiator (Inr) sequence PyPyPuN(A/C)(C/A), the TATA-enriched region (between positions -25 and -37) and an Inr immediate downstream motif, CC(T/A)(T/C)(T/C/A)(A/G)CCA(A/T/C), all of which were associated with highly expressed promoters. In conclusion, the detailed analysis of core promoters not only significantly improves the genome annotation of S. pombe, but also reveals that this unicellular eukaryote shares a highly similar organization in the core promoters with higher eukaryotes. These findings lend additional evidence for the power of this model system in delineating complex regulatory processes in multicellular organisms, despite its perceived simplicity. PMID:25747261
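
    Illustrative note (not part of the indexed record): the consensus notation used in this abstract can be turned into a regular expression for scanning sequences. The sketch below is a minimal Python example that assumes Py means pyrimidine (C or T), Pu means purine (A or G), N means any base, and (X/Y) means either listed base. The helper names are hypothetical, only the given strand is scanned, and this is not the authors' motif-discovery method.

```python
import re

# One token of the consensus notation: a named class, a parenthesised choice, or a literal base.
TOKEN = re.compile(r"Py|Pu|N|\(([ACGT](?:/[ACGT])*)\)|[ACGT]")
CLASSES = {"Py": "[CT]", "Pu": "[AG]", "N": "[ACGT]"}


def consensus_to_regex(consensus):
    """Translate a string such as 'PyPyPuN(A/C)(C/A)' into a regex pattern."""
    parts = []
    position = 0
    for match in TOKEN.finditer(consensus):
        if match.start() != position:
            raise ValueError(f"unrecognised notation at offset {position}")
        position = match.end()
        token = match.group(0)
        if token in CLASSES:
            parts.append(CLASSES[token])
        elif match.group(1):
            # '(A/C)' becomes the character class '[AC]'.
            parts.append("[" + match.group(1).replace("/", "") + "]")
        else:
            parts.append(token)
    if position != len(consensus):
        raise ValueError(f"unrecognised notation at offset {position}")
    return "".join(parts)


def find_motif(sequence, consensus):
    """Return 0-based start positions of all matches, including overlapping ones."""
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]
```

    For example, consensus_to_regex("PyPyPuN(A/C)(C/A)") yields the pattern [CT][CT][AG][ACGT][AC][CA]. The lookahead in find_motif is a design choice that lets overlapping occurrences be reported rather than skipped.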

  8. Genome-wide analysis of core promoter structures in Schizosaccharomyces pombe with DeepCAGE

    PubMed Central

    Li, Hua; Hou, Jingyi; Bai, Ling; Hu, Chuansheng; Tong, Pan; Kang, Yani; Zhao, Xiaodong; Shao, Zhifeng

    2015-01-01

    The core promoter, which immediately flanks the transcription start site (TSS), plays a critical role in transcriptional regulation of eukaryotes. Recent studies on higher eukaryotes have revealed an unprecedented complexity of core promoter structures that underscores diverse regulatory mechanisms of gene expression. For unicellular eukaryotes, however, the structures of core promoters have not been investigated in detail. As an important model organism, Schizosaccharomyces pombe still lacks the precise annotation for TSSs, thus hampering the analysis of core promoter structures and their relationship to higher eukaryotes. Here we used a deep sequencing-based approach (DeepCAGE) to generate 16 million uniquely mapped tags, corresponding to 93,736 positions in the S. pombe genome. The high-resolution TSS landscape enabled identification of over 8,000 core promoters, characterization of 4 promoter classes and observation of widespread alternative promoters. The landscape also allowed precise determination of the representative TSSs within core promoters, thus redefining the 5' UTR for 82.8% of S. pombe genes. We further identified the consensus initiator (Inr) sequence – PyPyPuN(A/C)(C/A), the TATA-enriched region (between position −25 and −37) and an Inr immediate downstream motif – CC(T/A)(T/C)(T/C/A)(A/G)CCA(A/T/C), all of which were associated with highly expressed promoters. In conclusion, the detailed analysis of core promoters not only significantly improves the genome annotation of S. pombe, but also reveals that this unicellular eukaryote shares a highly similar organization in the core promoters with higher eukaryotes. These findings lend additional evidence for the power of this model system in delineating complex regulatory processes in multicellular organisms, despite its perceived simplicity. PMID:25747261

  9. The population genomics of begomoviruses: global scale population structure and gene flow

    PubMed Central

    2010-01-01

    Background The rapidly growing availability of diverse full genome sequences from across the world is increasing the feasibility of studying the large-scale population processes that underlie observable patterns of virus diversity. In particular, characterizing the genetic structure of virus populations could potentially reveal much about how factors such as geographical distributions, host ranges and gene flow between populations combine to produce the discontinuous patterns of genetic diversity that we perceive as distinct virus species. Among the richest and most diverse full genome datasets available is that for the genus Begomovirus, in the family Geminiviridae, whose members infect dicotyledonous plants. The begomoviruses all share the same whitefly vector, are highly recombinogenic and are distributed throughout tropical and subtropical regions where they seriously threaten the food security of the world's poorest people. Results We