Science.gov

Sample records for structural genomics consortium

  1. Genome Structure Gallery from the Mycobacterium Tuberculosis Structural Genomics Consortium

    DOE Data Explorer

    The TB Structural Genomics Consortium works with the structures of proteins from M. tuberculosis, analyzing these structures in the context of functional information that currently exists and that the Consortium generates. The database of linked structural and functional information constructed from this project will form a lasting basis for understanding M. tuberculosis pathogenesis and for structure-based drug design. The Consortium's structural and functional information is publicly available. The Structures Gallery makes more than 650 total structures available by PDB identifier. Some of these are not consortium targets, but all are viewable in 3D color and can be manipulated in various ways by Jmol, an open-source Java viewer for chemical structures in 3D from http://www.jmol.org/

  2. The TB Structural Genomics Consortium: A decade of progress

    PubMed Central

    Chim, Nicholas; Habel, Jeff E.; Johnston, Jodie M.; Krieger, Inna; Miallau, Linda; Sankaranarayanan, Ramasamy; Morse, Robert P.; Bruning, John; Swanson, Stephanie; Kim, Haelee; Kim, Chang-Yub; Li, Hongye; Bulloch, Esther M.; Payne, Richard J.; Manos-Turvey, Alexandra; Hung, Li-Wei; Baker, Edward N.; Lott, J. Shaun; James, Michael N.G.; Terwilliger, Thomas C.; Eisenberg, David S.; Sacchettini, James C.; Goulding, Celia W.

    2012-01-01

    The TB Structural Genomics Consortium is a worldwide organization of collaborators whose mission is the comprehensive structural determination and analyses of Mycobacterium tuberculosis proteins to ultimately aid in tuberculosis diagnosis and treatment. Congruent to the overall vision, Consortium members have additionally established an integrated facilities core to streamline M. tuberculosis structural biology and developed bioinformatics resources for data mining. This review aims to share the latest Consortium developments with the TB community, including recent structures of proteins that play significant roles within M. tuberculosis. Atomic resolution details may unravel mechanistic insights and reveal unique and novel protein features, as well as important protein-protein and protein-ligand interactions, which ultimately lead to a better understanding of M. tuberculosis biology and may be exploited for rational, structure-based therapeutics design. PMID:21247804

  3. Genomic standards consortium projects.

    PubMed

    Field, Dawn; Sterk, Peter; Kottmann, Renzo; De Smet, J Wim; Amaral-Zettler, Linda; Cochrane, Guy; Cole, James R; Davies, Neil; Dawyndt, Peter; Garrity, George M; Gilbert, Jack A; Glöckner, Frank Oliver; Hirschman, Lynette; Klenk, Hans-Peter; Knight, Rob; Kyrpides, Nikos; Meyer, Folker; Karsch-Mizrachi, Ilene; Morrison, Norman; Robbins, Robert; San Gil, Inigo; Sansone, Susanna; Schriml, Lynn; Tatusova, Tatiana; Ussery, Dave; Yilmaz, Pelin; White, Owen; Wooley, John; Caporaso, Gregory

    2014-06-15

    The Genomic Standards Consortium (GSC) is an open-membership community that was founded in 2005 to work towards the development, implementation and harmonization of standards in the field of genomics. Starting with the defined task of establishing a minimal set of descriptions, the GSC has evolved into an active standards-setting body that currently has 18 ongoing projects, with additional projects regularly proposed from within and outside the GSC. Here we describe our recently enacted policy for proposing new activities that are intended to be taken on by the GSC, along with the template for proposing such new activities.

  4. Fragment-based cocktail crystallography by the Medical Structural Genomics of Pathogenic Protozoa Consortium

    PubMed Central

    Verlinde, Christophe L.M.J.; Fan, Erkang; Shibata, Sayaka; Zhang, Zongsheng; Sun, Zhihua; Deng, Wei; Ross, Jennifer; Kim, Jessica; Xiao, Liren; Arakaki, Tracy L.; Bosch, Jürgen; Caruthers, Jonathan M.; Larson, Eric T.; LeTrong, Isolde; Napuli, Alberto; Kelly, Angela; Mueller, Natasha; Zucker, Frank; Van Voorhis, Wesley C.; Buckner, Frederick S.; Merritt, Ethan A.; Hol, Wim G.J.

    2010-01-01

    The history of fragment-based drug discovery, with an emphasis on crystallographic methods, is sketched, illuminating various contributions, including our own, which preceded the industrial development of the method. Subsequently, the creation of the BMSC fragment cocktails library is described. The BMSC collection currently comprises 68 cocktails of 10 compounds that are shape-wise diverse. The utility of these cocktails for initiating lead discovery in structure-based drug design has been explored by soaking numerous protein crystals obtained by our MSGPP (Medical Structural Genomics of Pathogenic Protozoa) consortium. Details of the fragment selection and cocktail design procedures, as well as examples of the successes obtained are given. The BMSC Fragment Cocktail recipes are available free of charge and are in use in over 20 academic labs. PMID:19929835

  5. The High-Throughput Protein Sample Production Platform of the Northeast Structural Genomics Consortium

    PubMed Central

    Xiao, Rong; Anderson, Stephen; Aramini, James; Belote, Rachel; Buchwald, William A.; Ciccosanti, Colleen; Conover, Ken; Everett, John K.; Hamilton, Keith; Huang, Yuanpeng Janet; Janjua, Haleema; Jiang, Mei; Kornhaber, Gregory J.; Lee, Dong Yup; Locke, Jessica Y.; Ma, Li-Chung; Maglaqui, Melissa; Mao, Lei; Mitra, Saheli; Patel, Dayaban; Rossi, Paolo; Sahdev, Seema; Sharma, Seema; Shastry, Ritu; Swapna, G.V.T.; Tong, Saichu N.; Wang, Dongyan; Wang, Huang; Zhao, Li; Montelione, Gaetano T.; Acton, Thomas B.

    2014-01-01

    We describe the core Protein Production Platform of the Northeast Structural Genomics Consortium (NESG) and outline the strategies used for producing high-quality protein samples. The platform is centered on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems. The 6X-His tag allows for similar purification procedures for most targets and implementation of high-throughput (HTP) parallel methods. In most cases, the 6X-His-tagged proteins are sufficiently purified (> 97% homogeneity) using a HTP two-step purification protocol for most structural studies. Using this platform, the open reading frames of over 16,000 different targeted proteins (or domains) have been cloned as > 26,000 constructs. Over the past nine years, more than 16,000 of these expressed protein, and more than 4,400 proteins (or domains) have been purified to homogeneity in tens of milligram quantities (see Summary Statistics, http://nesg.org/statistics.html). Using these samples, the NESG has deposited more than 900 new protein structures to the Protein Data Bank (PDB). The methods described here are effective in producing eukaryotic and prokaryotic protein samples in E. coli. This paper summarizes some of the updates made to the protein production pipeline in the last five years, corresponding to phase 2 of the NIGMS Protein Structure Initiative (PSI-2) project. The NESG Protein Production Platform is suitable for implementation in a large individual laboratory or by a small group of collaborating investigators. These advanced automated and/or parallel cloning, expression, purification, and biophysical screening technologies are of broad value to the structural biology, functional proteomics, and structural genomics communities. PMID:20688167
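
    The stage counts quoted in this abstract can be turned into rough pipeline ratios. The sketch below is illustrative only: the figures are the lower bounds stated above ("over", "more than"), the variable names are invented here, and it assumes that "16,000 of these" refers to the cloned constructs.

```python
# Lower-bound figures quoted in the abstract ("over" / "more than"), so the
# ratios below are rough estimates, not statistics reported by the NESG.
targets_cloned = 16_000   # targeted proteins or domains that were cloned
constructs = 26_000       # constructs made for those targets
expressed = 16_000        # assumed to count constructs that expressed protein
purified = 4_400          # proteins or domains purified to homogeneity
structures = 900          # structures deposited to the PDB


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, rejecting an empty stage."""
    if denominator == 0:
        raise ValueError("denominator stage is empty")
    return numerator / denominator


pipeline = {
    "constructs per target": ratio(constructs, targets_cloned),
    "expressed / constructs": ratio(expressed, constructs),
    "purified / targets": ratio(purified, targets_cloned),
    "structures / purified": ratio(structures, purified),
}

for label, value in pipeline.items():
    print(f"{label}: {value:.3f}")
```

    Because every input is a lower bound, the printed ratios indicate orders of magnitude only; the statistics page cited in the abstract is the place to look for exact counts.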

  6. The Global Cancer Genomics Consortium: interfacing genomics and cancer medicine.

    PubMed

    2012-08-01

    The Global Cancer Genomics Consortium (GCGC) is an international collaborative platform that amalgamates cancer biologists, cutting-edge genomics, and high-throughput expertise with medical oncologists and surgical oncologists; they address the most important translational questions that are central to cancer research and treatment. The annual GCGC symposium was held at the Advanced Centre for Treatment Research and Education in Cancer, Mumbai, India, from November 9 to 11, 2011. The symposium showcased international next-generation sequencing efforts that explore cancer-specific transcriptomic changes, single-nucleotide polymorphism, and copy number variations in various types of cancers, as well as the structural genomics approach to develop new therapeutic targets and chemical probes. From the spectrum of studies presented at the symposium, it is evident that the translation of emerging cancer genomics knowledge into clinical applications can only be achieved through the integration of multidisciplinary expertise. In summary, the GCGC symposium provided practical knowledge on structural and cancer genomics approaches, as well as an exclusive platform for focused cancer genomics endeavors. PMID:22628426

  8. 77 FR 43237 - Genome in a Bottle Consortium-Work Plan Review Workshop

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-24

    ... National Institute of Standards and Technology Genome in a Bottle Consortium--Work Plan Review Workshop.... SUMMARY: NIST announces the Genome in a Bottle Consortium meeting to be held on Thursday and Friday, August 16 and 17, 2012. The Genome in a Bottle Consortium is planning to develop the reference...

  9. Retirement Plan Consortium Structures for K-12

    ERIC Educational Resources Information Center

    Kevin, John

    2012-01-01

    As school districts continue to seek administrative efficiencies and cost reductions in the wake of severe budget pressures, the resources they devote to creating or expanding retirement plan consortia are increasing. Understanding how to structure a retirement plan consortium is paramount to successfully achieving the many objectives of…

  10. Meeting report: the fourth Genomic Standards Consortium (GSC) workshop.

    PubMed

    Field, Dawn; Glöckner, Frank Oliver; Garrity, George M; Gray, Tanya; Sterk, Peter; Cochrane, Guy; Vaughan, Robert; Kolker, Eugene; Kottmann, Renzo; Kyrpides, Nikos; Angiuoli, Sam; Dawyndt, Peter; Guralnick, Robert; Goldstein, Philip; Hall, Neil; Hirschman, Lynette; Kravitz, Saul; Lister, Allyson L; Markowitz, Victor; Thomson, Nick; Whetzel, Trish

    2008-06-01

    This meeting report summarizes the proceedings of the "eGenomics: Cataloguing our Complete Genome Collection IV" workshop held June 6-8, 2007, at the National Institute for Environmental eScience (NIEeS), Cambridge, United Kingdom. This fourth workshop of the Genomic Standards Consortium (GSC) was a mix of short presentations, strategy discussions, and technical sessions. Speakers provided progress reports on the development of the "Minimum Information about a Genome Sequence" (MIGS) specification and the closely integrated "Minimum Information about a Metagenome Sequence" (MIMS) specification. The key outcome of the workshop was consensus on the next version of the MIGS/MIMS specification (v1.2). This drove further definition and restructuring of the MIGS/MIMS XML schema (syntax). With respect to semantics, a term vetting group was established to ensure that terms are properly defined and submitted to the appropriate ontology projects. Perhaps the single most important outcome of the workshop was a proposal to move beyond the concept of "minimum" to create a far richer XML schema that would define a "Genomic Contextual Data Markup Language" (GCDML) suitable for wider semantic integration across databases. GCDML will contain not only curated information (e.g., compliant with MIGS/MIMS), but also be extended to include a variety of data processing and calculations. Further information about the Genomic Standards Consortium and its range of activities can be found at http://gensc.org.

  11. Genome Analyses and Supplement Data from the International Populus Genome Consortium (IPGC)

    DOE Data Explorer

    International Populus Genome Consortium (IPGC)

    The sequencing of the first tree genome, that of Populus, was a project initiated by the Office of Biological and Environmental Research in DOE’s Office of Science. The International Populus Genome Consortium (IPGC) was formed to help develop and guide post-sequence activities. The IPGC website, hosted at the Oak Ridge National Laboratory, provides draft sequence data as it is made available from DOE Joint Genome Institute, genome analyses for Populus, lists of related publications and resources, and the science plan. The data are available at http://www.ornl.gov/sci/ipgc/ssr_resource.htm.

  12. Effective electron-density map improvement and structure validation on a Linux multi-CPU web cluster: The TB Structural Genomics Consortium Bias Removal Web Service.

    PubMed

    Reddy, Vinod; Swanson, Stanley M; Segelke, Brent; Kantardjieff, Katherine A; Sacchettini, James C; Rupp, Bernhard

    2003-12-01

    Anticipating a continuing increase in the number of structures solved by molecular replacement in high-throughput crystallography and drug-discovery programs, a user-friendly web service for automated molecular replacement, map improvement, bias removal and real-space correlation structure validation has been implemented. The service is based on an efficient bias-removal protocol, Shake&wARP, and implemented using EPMR and the CCP4 suite of programs, combined with various shell scripts and Fortran90 routines. The service returns improved maps, converted data files and real-space correlation and B-factor plots. User data are uploaded through a web interface and the CPU-intensive iteration cycles are executed on a low-cost Linux multi-CPU cluster using the Condor job-queuing package. Examples of map improvement at various resolutions are provided and include model completion and reconstruction of absent parts, sequence correction, and ligand validation in drug-target structures.
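
    A real-space correlation is commonly computed as the Pearson correlation between observed and model-calculated electron density sampled at the same grid points. The sketch below shows that general definition only; the abstract does not describe how the web service computes it, and the function name and sample values are made up for illustration.

```python
import math
from typing import Sequence


def real_space_cc(rho_obs: Sequence[float], rho_calc: Sequence[float]) -> float:
    """Pearson correlation between two density samples on the same grid points."""
    if len(rho_obs) != len(rho_calc) or len(rho_obs) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    n = len(rho_obs)
    mean_obs = sum(rho_obs) / n
    mean_calc = sum(rho_calc) / n
    # Covariance and variances are left unnormalised; the 1/n factors cancel.
    cov = sum((o - mean_obs) * (c - mean_calc) for o, c in zip(rho_obs, rho_calc))
    var_obs = sum((o - mean_obs) ** 2 for o in rho_obs)
    var_calc = sum((c - mean_calc) ** 2 for c in rho_calc)
    if var_obs == 0 or var_calc == 0:
        raise ValueError("flat density: correlation is undefined")
    return cov / math.sqrt(var_obs * var_calc)


# Invented density values around one residue (arbitrary units).
observed = [0.1, 0.4, 0.9, 0.3]
calculated = [0.2, 0.8, 1.8, 0.6]  # same shape, different scale
print(round(real_space_cc(observed, calculated), 6))
```

    The correlation is insensitive to overall scale, which is why a model map that differs from the observed map only by a constant factor still scores close to 1.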

  13. Meeting Report from the Genomic Standards Consortium (GSC) Workshop 10.

    PubMed

    Glass, Elizabeth; Meyer, Folker; Gilbert, Jack A; Field, Dawn; Hunter, Sarah; Kottmann, Renzo; Kyrpides, Nikos; Sansone, Susanna; Schriml, Lynn; Sterk, Peter; White, Owen; Wooley, John

    2010-01-01

    This report summarizes the proceedings of the 10th workshop of the Genomic Standards Consortium (GSC), held at Argonne National Laboratory, IL, USA. It was the second GSC workshop to have open registration and attracted over 60 participants who worked together to progress the full range of projects ongoing within the GSC. Overall, the primary focus of the workshop was on advancing the M5 platform for next-generation collaborative computational infrastructures. Other key outcomes included the formation of a GSC working group focused on MIGS/MIMS/MIENS compliance using the ISA software suite and the formal launch of the GSC Developer Working Group. Further information about the GSC and its range of activities can be found at http://gensc.org/.

  14. Meeting report: the fifth Genomic Standards Consortium (GSC) workshop.

    PubMed

    Field, Dawn; Garrity, George M; Sansone, Susanna-Assunta; Sterk, Peter; Gray, Tanya; Kyrpides, Nikos; Hirschman, Lynette; Glöckner, Frank Oliver; Kottmann, Renzo; Angiuoli, Sam; White, Owen; Dawyndt, Peter; Thomson, Nick; Gil, Inigo San; Morrison, Norman; Tatusova, Tatiana; Mizrachi, Ilene; Vaughan, Robert; Cochrane, Guy; Kagan, Leonid; Murphy, Sean; Schriml, Lynn

    2008-06-01

    This meeting report summarizes the proceedings of the fifth Genomic Standards Consortium (GSC) workshop held December 12-14, 2007, at the European Bioinformatics Institute (EBI), Cambridge, UK. This fifth workshop served as a milestone event in the evolution of the GSC (launched in September 2005); the key outcome of the workshop was the finalization of a stable version of the MIGS specification (v2.0) for publication. This accomplishment enables, and also in some cases necessitates, downstream activities, which are described in the multiauthor, consensus-driven articles in this special issue of OMICS produced as a direct result of the workshop. This report briefly summarizes the workshop and overviews the special issue. In particular, it aims to explain how the various GSC-led projects are working together to help this community achieve its stated mission of further standardizing the descriptions of genomes and metagenomes and implementing improved mechanisms of data exchange and integration to enable more accurate comparative analyses. Further information about the GSC and its range of activities can be found at http://gensc.org.

  15. The Teleprasenz Consortium: Structure and intentions

    NASA Technical Reports Server (NTRS)

    Blauert, Jens

    1991-01-01

    The Teleprasenz-Consortium is an open group of currently 37 scientists of different disciplines who devote a major part of their research activities to the foundations of telepresence technology. Telepresence technology is basically understood as a means to bridge spatial and temporal gaps as well as certain kinds of concealment, inaccessibility and danger of exposure. The activities of the consortium are organized into three main branches: virtual environment, surveillance and control systems, and speech and language technology. A brief summary of the main activities in these areas is given.

  16. The Tennessee Mouse Genome Consortium: Identification of ocular mutants

    SciTech Connect

    Jablonski, Monica M.; Wang, Xiaofei; Lu, Lu; Miller, Darla R; Rinchik, Eugene M; Williams, Robert; Goldowitz, Daniel

    2005-06-01

    The Tennessee Mouse Genome Consortium (TMGC) is in its fifth year of an ethylnitrosourea (ENU)-based mutagenesis screen to detect recessive mutations that affect the eye and brain. Each pedigree is tested by various phenotyping domains including the eye, neurohistology, behavior, aging, ethanol, drug, social behavior, auditory, and epilepsy domains. The utilization of a highly efficient breeding protocol and coordination of various universities across Tennessee makes it possible for mice with ENU-induced mutations to be evaluated by nine distinct phenotyping domains within this large-scale project known as the TMGC. Our goal is to create mutant lines that model human diseases and disease syndromes and to make the mutant mice available to the scientific research community. Within the eye domain, mice are screened for anterior and posterior segment abnormalities using slit-lamp biomicroscopy, indirect ophthalmoscopy, fundus photography, eye weight, histology, and immunohistochemistry. As of January 2005, we have screened 958 pedigrees and 4800 mice, excluding those used in mapping studies. We have thus far identified seven pedigrees with primary ocular abnormalities. Six of the mutant pedigrees have retinal or subretinal aberrations, while the remaining pedigree presents with an abnormal eye size. Continued characterization of these mutant mice should in most cases lead to the identification of the mutated gene, as well as provide insight into the function of each gene. Mice from each of these pedigrees of mutant mice are available for distribution to researchers for independent study.

  17. Enriching public descriptions of marine phages using the Genomic Standards Consortium MIGS standard

    PubMed Central

    Duhaime, Melissa Beth; Kottmann, Renzo; Field, Dawn; Glöckner, Frank Oliver

    2011-01-01

    In any sequencing project, the possible depth of comparative analysis is determined largely by the amount and quality of the accompanying contextual data. The structure, content, and storage of this contextual data should be standardized to ensure consistent coverage of all sequenced entities and facilitate comparisons. The Genomic Standards Consortium (GSC) has developed the “Minimum Information about Genome/Metagenome Sequences (MIGS/MIMS)” checklist for the description of genomes and here we annotate all 30 publicly available marine bacteriophage sequences to the MIGS standard. These annotations build on existing International Nucleotide Sequence Database Collaboration (INSDC) records, and confirm, as expected, that current submissions lack most MIGS fields. MIGS fields were manually curated from the literature and placed in XML format as specified by the Genomic Contextual Data Markup Language (GCDML). These “machine-readable” reports were then analyzed to highlight patterns describing this collection of genomes. Completed reports are provided in GCDML. This work represents one step towards the annotation of our complete collection of genome sequences and shows the utility of capturing richer metadata along with raw sequences. PMID:21677864
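
    The idea of checking a machine-readable report against a minimum-information checklist can be sketched as follows. This is a hypothetical illustration: the element layout, the field names and the sample values are invented here and do not reproduce the actual GCDML schema or the MIGS checklist.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical checklist; NOT the real MIGS field list.
REQUIRED_FIELDS = [
    "investigation_type",
    "project_name",
    "collection_date",
    "geo_loc_name",
    "isolation_source",
]

# Made-up report in an invented layout (not GCDML).
SAMPLE_REPORT = """<report id="phage_example">
  <field name="project_name">Example marine phage</field>
  <field name="collection_date">2005-06-01</field>
  <field name="isolation_source"></field>
</report>"""


def missing_fields(xml_text: str, required: List[str]) -> List[str]:
    """Return required field names that are absent or empty in the report."""
    root = ET.fromstring(xml_text)
    present = {
        element.get("name")
        for element in root.iter("field")
        if (element.text or "").strip()  # empty elements count as missing
    }
    return [name for name in required if name not in present]


missing = missing_fields(SAMPLE_REPORT, REQUIRED_FIELDS)
print(missing)
```

    Running the same check over a set of reports gives a per-field tally of gaps, which is the kind of pattern a collection-wide analysis of contextual data would surface.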

  18. Enriching public descriptions of marine phages using the Genomic Standards Consortium MIGS standard.

    PubMed

    Duhaime, Melissa Beth; Kottmann, Renzo; Field, Dawn; Glöckner, Frank Oliver

    2011-04-29

    In any sequencing project, the possible depth of comparative analysis is determined largely by the amount and quality of the accompanying contextual data. The structure, content, and storage of this contextual data should be standardized to ensure consistent coverage of all sequenced entities and facilitate comparisons. The Genomic Standards Consortium (GSC) has developed the "Minimum Information about Genome/Metagenome Sequences (MIGS/MIMS)" checklist for the description of genomes and here we annotate all 30 publicly available marine bacteriophage sequences to the MIGS standard. These annotations build on existing International Nucleotide Sequence Database Collaboration (INSDC) records, and confirm, as expected, that current submissions lack most MIGS fields. MIGS fields were manually curated from the literature and placed in XML format as specified by the Genomic Contextual Data Markup Language (GCDML). These "machine-readable" reports were then analyzed to highlight patterns describing this collection of genomes. Completed reports are provided in GCDML. This work represents one step towards the annotation of our complete collection of genome sequences and shows the utility of capturing richer metadata along with raw sequences.

  19. Genome Consortium for Active Teaching: Meeting the Goals of BIO2010

    ERIC Educational Resources Information Center

    Campbell, A. Malcolm; Ledbetter, Mary Lee S.; Hoopes, Laura L. M.; Eckdahl, Todd T.; Heyer, Laurie J.; Rosenwald, Anne; Fowlks, Edison; Tonidandel, Scott; Bucholtz, Brooke; Gottfried, Gail

    2007-01-01

    The Genome Consortium for Active Teaching (GCAT) facilitates the use of modern genomics methods in undergraduate education. Initially focused on microarray technology, but with an eye toward diversification, GCAT is a community working to improve the education of tomorrow's life science professionals. GCAT participants have access to affordable…

  1. Draft Genome Sequence of Achromobacter sp. Strain AR476-2, Isolated from a Cellulolytic Consortium

    PubMed Central

    Kurth, Daniel; Romero, Cintia M.; Fernandez, Pablo M.; Ferrero, Marcela A.

    2016-01-01

    Achromobacter sp. AR476-2 is a noncellulolytic strain previously isolated from a cellulolytic consortium selected from samples of insect gut. Its genome sequence could contribute to the unraveling of the complex interaction of microorganisms and enzymes involved in the biodegradation of lignocellulosic biomass in nature. PMID:27340069

  2. Draft Genome Sequence of Achromobacter sp. Strain AR476-2, Isolated from a Cellulolytic Consortium.

    PubMed

    Kurth, Daniel; Romero, Cintia M; Fernandez, Pablo M; Ferrero, Marcela A; Martinez, M Alejandra

    2016-01-01

    Achromobacter sp. AR476-2 is a noncellulolytic strain previously isolated from a cellulolytic consortium selected from samples of insect gut. Its genome sequence could contribute to the unraveling of the complex interaction of microorganisms and enzymes involved in the biodegradation of lignocellulosic biomass in nature. PMID:27340069

  3. Clinical Sequencing Exploratory Research Consortium: Accelerating Evidence-Based Practice of Genomic Medicine.

    PubMed

    Green, Robert C; Goddard, Katrina A B; Jarvik, Gail P; Amendola, Laura M; Appelbaum, Paul S; Berg, Jonathan S; Bernhardt, Barbara A; Biesecker, Leslie G; Biswas, Sawona; Blout, Carrie L; Bowling, Kevin M; Brothers, Kyle B; Burke, Wylie; Caga-Anan, Charlisse F; Chinnaiyan, Arul M; Chung, Wendy K; Clayton, Ellen W; Cooper, Gregory M; East, Kelly; Evans, James P; Fullerton, Stephanie M; Garraway, Levi A; Garrett, Jeremy R; Gray, Stacy W; Henderson, Gail E; Hindorff, Lucia A; Holm, Ingrid A; Lewis, Michelle Huckaby; Hutter, Carolyn M; Janne, Pasi A; Joffe, Steven; Kaufman, David; Knoppers, Bartha M; Koenig, Barbara A; Krantz, Ian D; Manolio, Teri A; McCullough, Laurence; McEwen, Jean; McGuire, Amy; Muzny, Donna; Myers, Richard M; Nickerson, Deborah A; Ou, Jeffrey; Parsons, Donald W; Petersen, Gloria M; Plon, Sharon E; Rehm, Heidi L; Roberts, J Scott; Robinson, Dan; Salama, Joseph S; Scollon, Sarah; Sharp, Richard R; Shirts, Brian; Spinner, Nancy B; Tabor, Holly K; Tarczy-Hornoch, Peter; Veenstra, David L; Wagle, Nikhil; Weck, Karen; Wilfond, Benjamin S; Wilhelmsen, Kirk; Wolf, Susan M; Wynn, Julia; Yu, Joon-Ho

    2016-06-01

    Despite rapid technical progress and demonstrable effectiveness for some types of diagnosis and therapy, much remains to be learned about clinical genome and exome sequencing (CGES) and its role within the practice of medicine. The Clinical Sequencing Exploratory Research (CSER) consortium includes 18 extramural research projects, one National Human Genome Research Institute (NHGRI) intramural project, and a coordinating center funded by the NHGRI and National Cancer Institute. The consortium is exploring analytic and clinical validity and utility, as well as the ethical, legal, and social implications of sequencing via multidisciplinary approaches; it has thus far recruited 5,577 participants across a spectrum of symptomatic and healthy children and adults by utilizing both germline and cancer sequencing. The CSER consortium is analyzing data and creating publicly available procedures and tools related to participant preferences and consent, variant classification, disclosure and management of primary and secondary findings, health outcomes, and integration with electronic health records. Future research directions will refine measures of clinical utility of CGES in both germline and somatic testing, evaluate the use of CGES for screening in healthy individuals, explore the penetrance of pathogenic variants through extensive phenotyping, reduce discordances in public databases of genes and variants, examine social and ethnic disparities in the provision of genomics services, explore regulatory issues, and estimate the value and downstream costs of sequencing. The CSER consortium has established a shared community of research sites by using diverse approaches to pursue the evidence-based development of best practices in genomic medicine. PMID:27181682

  4. Connecting Genomic Alterations to Cancer Biology with Proteomics: The NCI Clinical Proteomic Tumor Analysis Consortium

    SciTech Connect

    Ellis, Matthew; Gillette, Michael; Carr, Steven A.; Paulovich, Amanda G.; Smith, Richard D.; Rodland, Karin D.; Townsend, Reid; Kinsinger, Christopher; Mesri, Mehdi; Rodriguez, Henry; Liebler, Daniel

    2013-10-03

    The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium is applying the latest generation of proteomic technologies to genomically annotated tumors from The Cancer Genome Atlas (TCGA) program, a joint initiative of the NCI and the National Human Genome Research Institute. By providing a fully integrated accounting of DNA, RNA, and protein abnormalities in individual tumors, these datasets will illuminate the complex relationship between genomic abnormalities and cancer phenotypes, thus producing biologic insights as well as a wave of novel candidate biomarkers and therapeutic targets amenable to verification using targeted mass spectrometry methods.

  5. Personal Genome Sequencing in Ostensibly Healthy Individuals and the PeopleSeq Consortium.

    PubMed

    Linderman, Michael D; Nielsen, Daiva E; Green, Robert C

    2016-03-25

    Thousands of ostensibly healthy individuals have had their exome or genome sequenced, but a much smaller number of these individuals have received any personal genomic results from that sequencing. We term those projects in which ostensibly healthy participants can receive sequencing-derived genetic findings and may also have access to their genomic data as participatory predispositional personal genome sequencing (PPGS). Here we are focused on genome sequencing applied in a pre-symptomatic context and so define PPGS to exclude diagnostic genome sequencing intended to identify the molecular cause of suspected or diagnosed genetic disease. In this report we describe the design of completed and underway PPGS projects, briefly summarize the results reported to date and introduce the PeopleSeq Consortium, a newly formed collaboration of PPGS projects designed to collect much-needed longitudinal outcome data.

  6. Personal Genome Sequencing in Ostensibly Healthy Individuals and the PeopleSeq Consortium

    PubMed Central

    Linderman, Michael D.; Nielsen, Daiva E.; Green, Robert C.

    2016-01-01

    Thousands of ostensibly healthy individuals have had their exome or genome sequenced, but a much smaller number of these individuals have received any personal genomic results from that sequencing. We term those projects in which ostensibly healthy participants can receive sequencing-derived genetic findings and may also have access to their genomic data as participatory predispositional personal genome sequencing (PPGS). Here we are focused on genome sequencing applied in a pre-symptomatic context and so define PPGS to exclude diagnostic genome sequencing intended to identify the molecular cause of suspected or diagnosed genetic disease. In this report we describe the design of completed and underway PPGS projects, briefly summarize the results reported to date and introduce the PeopleSeq Consortium, a newly formed collaboration of PPGS projects designed to collect much-needed longitudinal outcome data. PMID:27023617

  7. Functional Insights from Structural Genomics

    SciTech Connect

    Forouhar, F.; Kuzin, A.; Seetharaman, J.; Lee, I.; Zhou, W.; Abashidze, M.; Chen, Y.; Montelione, G.; Tong, L.; et al.

    2007-01-01

    Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with an important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF_0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP_1951), and a 12-stranded β-barrel with a novel fold (V. parahaemolyticus VPA1032).

  8. Clinical utilization of genomics data produced by the international Pseudomonas aeruginosa consortium.

    PubMed

    Freschi, Luca; Jeukens, Julie; Kukavica-Ibrulj, Irena; Boyle, Brian; Dupont, Marie-Josée; Laroche, Jérôme; Larose, Stéphane; Maaroufi, Halim; Fothergill, Joanne L; Moore, Matthew; Winsor, Geoffrey L; Aaron, Shawn D; Barbeau, Jean; Bell, Scott C; Burns, Jane L; Camara, Miguel; Cantin, André; Charette, Steve J; Dewar, Ken; Déziel, Éric; Grimwood, Keith; Hancock, Robert E W; Harrison, Joe J; Heeb, Stephan; Jelsbak, Lars; Jia, Baofeng; Kenna, Dervla T; Kidd, Timothy J; Klockgether, Jens; Lam, Joseph S; Lamont, Iain L; Lewenza, Shawn; Loman, Nick; Malouin, François; Manos, Jim; McArthur, Andrew G; McKeown, Josie; Milot, Julie; Naghra, Hardeep; Nguyen, Dao; Pereira, Sheldon K; Perron, Gabriel G; Pirnay, Jean-Paul; Rainey, Paul B; Rousseau, Simon; Santos, Pedro M; Stephenson, Anne; Taylor, Véronique; Turton, Jane F; Waglechner, Nicholas; Williams, Paul; Thrane, Sandra W; Wright, Gerard D; Brinkman, Fiona S L; Tucker, Nicholas P; Tümmler, Burkhard; Winstanley, Craig; Levesque, Roger C

    2015-01-01

    The International Pseudomonas aeruginosa Consortium is sequencing over 1000 genomes and building an analysis pipeline for the study of Pseudomonas genome evolution, antibiotic resistance and virulence genes. Metadata, including genomic and phenotypic data for each isolate of the collection, are available through the International Pseudomonas Consortium Database (http://ipcd.ibis.ulaval.ca/). Here, we present our strategy and the results that emerged from the analysis of the first 389 genomes. With as yet unmatched resolution, our results confirm that P. aeruginosa strains can be divided into three major groups that are further divided into subgroups, some not previously reported in the literature. We also provide the first snapshot of P. aeruginosa strain diversity with respect to antibiotic resistance. Our approach will allow us to draw potential links between environmental strains and those implicated in human and animal infections, understand how patients become infected and how the infection evolves over time as well as identify prognostic markers for better evidence-based decisions on patient care. PMID:26483767

  9. Clinical utilization of genomics data produced by the international Pseudomonas aeruginosa consortium

    PubMed Central

    Freschi, Luca; Jeukens, Julie; Kukavica-Ibrulj, Irena; Boyle, Brian; Dupont, Marie-Josée; Laroche, Jérôme; Larose, Stéphane; Maaroufi, Halim; Fothergill, Joanne L.; Moore, Matthew; Winsor, Geoffrey L.; Aaron, Shawn D.; Barbeau, Jean; Bell, Scott C.; Burns, Jane L.; Camara, Miguel; Cantin, André; Charette, Steve J.; Dewar, Ken; Déziel, Éric; Grimwood, Keith; Hancock, Robert E. W.; Harrison, Joe J.; Heeb, Stephan; Jelsbak, Lars; Jia, Baofeng; Kenna, Dervla T.; Kidd, Timothy J.; Klockgether, Jens; Lam, Joseph S.; Lamont, Iain L.; Lewenza, Shawn; Loman, Nick; Malouin, François; Manos, Jim; McArthur, Andrew G.; McKeown, Josie; Milot, Julie; Naghra, Hardeep; Nguyen, Dao; Pereira, Sheldon K.; Perron, Gabriel G.; Pirnay, Jean-Paul; Rainey, Paul B.; Rousseau, Simon; Santos, Pedro M.; Stephenson, Anne; Taylor, Véronique; Turton, Jane F.; Waglechner, Nicholas; Williams, Paul; Thrane, Sandra W.; Wright, Gerard D.; Brinkman, Fiona S. L.; Tucker, Nicholas P.; Tümmler, Burkhard; Winstanley, Craig; Levesque, Roger C.

    2015-01-01

    The International Pseudomonas aeruginosa Consortium is sequencing over 1000 genomes and building an analysis pipeline for the study of Pseudomonas genome evolution, antibiotic resistance and virulence genes. Metadata, including genomic and phenotypic data for each isolate of the collection, are available through the International Pseudomonas Consortium Database (http://ipcd.ibis.ulaval.ca/). Here, we present our strategy and the results that emerged from the analysis of the first 389 genomes. With as yet unmatched resolution, our results confirm that P. aeruginosa strains can be divided into three major groups that are further divided into subgroups, some not previously reported in the literature. We also provide the first snapshot of P. aeruginosa strain diversity with respect to antibiotic resistance. Our approach will allow us to draw potential links between environmental strains and those implicated in human and animal infections, understand how patients become infected and how the infection evolves over time as well as identify prognostic markers for better evidence-based decisions on patient care. PMID:26483767

  10. Meeting Report from the Genomic Standards Consortium (GSC) Workshops 6 and 7.

    PubMed

    Field, Dawn; Sterk, Peter; Kyrpides, Nikos; Kottmann, Renzo; Glöckner, Frank Oliver; Hirschman, Lynette; Garrity, George M; Wooley, John; Gilna, Paul

    2009-01-01

    This report summarizes the proceedings of the 6th and 7th workshops of the Genomic Standards Consortium (GSC), held back-to-back in 2008. GSC 6 focused on furthering the activities of GSC working groups, while GSC 7 focused on outreach to the wider community. GSC 6 was held October 10-14, 2008 at the European Bioinformatics Institute, Cambridge, United Kingdom and included a two-day workshop focused on the refinement of the Genomic Contextual Data Markup Language (GCDML). GSC 7 was held as the opening day of the International Congress on Metagenomics 2008 in San Diego, California. Major achievements of these combined meetings included an agreement from the International Nucleotide Sequence Database Consortium (INSDC) to create a "MIGS" keyword for capturing "Minimum Information about a Genome Sequence" compliant information within INSDC (DDBJ/EMBL/GenBank) records, launch of GCDML 1.0, MIGS compliance of the first set of "Genomic Encyclopedia of Bacteria and Archaea" project genomes, approval of a proposal to extend MIGS to 16S rRNA sequences within a "Minimum Information about an Environmental Sequence", finalization of plans for the GSC eJournal, "Standards in Genomic Sciences" (SIGS), and the formation of a GSC Board. Subsequently, the GSC has been awarded a Research Co-ordination Network (RCN4GSC) grant from the National Science Foundation, held the first SIGS workshop and launched the journal. The GSC will also be hosting outreach workshops at both ISMB 2009 and PSB 2010 focused on "Metagenomics, Metadata and MetaAnalysis" (M³). Further information about the GSC and its range of activities can be found at http://gensc.org, including videos of all the presentations at GSC 7.

  11. Genome Consortium for Active Teaching: Meeting the Goals of BIO2010

    PubMed Central

    Ledbetter, Mary Lee S.; Hoopes, Laura L.M.; Eckdahl, Todd T.; Heyer, Laurie J.; Rosenwald, Anne; Fowlks, Edison; Tonidandel, Scott; Bucholtz, Brooke; Gottfried, Gail

    2007-01-01

    The Genome Consortium for Active Teaching (GCAT) facilitates the use of modern genomics methods in undergraduate education. Initially focused on microarray technology, but with an eye toward diversification, GCAT is a community working to improve the education of tomorrow's life science professionals. GCAT participants have access to affordable microarrays, microarray scanners, free software for data analysis, and faculty workshops. Microarrays provided by GCAT have been used by 141 faculty on 134 campuses, including 21 faculty that serve large numbers of underrepresented minority students. An estimated 9480 undergraduates a year will have access to microarrays by 2009 as a direct result of GCAT faculty workshops. Gains for students include significantly improved comprehension of topics in functional genomics and increased interest in research. Faculty reported improved access to new technology and gains in understanding thanks to their involvement with GCAT. GCAT's network of supportive colleagues encourages faculty to explore genomics through student research and to learn a new and complex method with their undergraduates. GCAT is meeting important goals of BIO2010 by making research methods accessible to undergraduates, training faculty in genomics and bioinformatics, integrating mathematics into the biology curriculum, and increasing participation by underrepresented minority students. PMID:17548873

  12. Genome Consortium for Active Teaching: meeting the goals of BIO2010.

    PubMed

    Campbell, A Malcolm; Ledbetter, Mary Lee S; Hoopes, Laura L M; Eckdahl, Todd T; Heyer, Laurie J; Rosenwald, Anne; Fowlks, Edison; Tonidandel, Scott; Bucholtz, Brooke; Gottfried, Gail

    2007-01-01

    The Genome Consortium for Active Teaching (GCAT) facilitates the use of modern genomics methods in undergraduate education. Initially focused on microarray technology, but with an eye toward diversification, GCAT is a community working to improve the education of tomorrow's life science professionals. GCAT participants have access to affordable microarrays, microarray scanners, free software for data analysis, and faculty workshops. Microarrays provided by GCAT have been used by 141 faculty on 134 campuses, including 21 faculty that serve large numbers of underrepresented minority students. An estimated 9480 undergraduates a year will have access to microarrays by 2009 as a direct result of GCAT faculty workshops. Gains for students include significantly improved comprehension of topics in functional genomics and increased interest in research. Faculty reported improved access to new technology and gains in understanding thanks to their involvement with GCAT. GCAT's network of supportive colleagues encourages faculty to explore genomics through student research and to learn a new and complex method with their undergraduates. GCAT is meeting important goals of BIO2010 by making research methods accessible to undergraduates, training faculty in genomics and bioinformatics, integrating mathematics into the biology curriculum, and increasing participation by underrepresented minority students.

  13. The Psychiatric Genomics Consortium Posttraumatic Stress Disorder Workgroup: Posttraumatic Stress Disorder Enters the Age of Large-Scale Genomic Collaboration.

    PubMed

    Logue, Mark W; Amstadter, Ananda B; Baker, Dewleen G; Duncan, Laramie; Koenen, Karestan C; Liberzon, Israel; Miller, Mark W; Morey, Rajendra A; Nievergelt, Caroline M; Ressler, Kerry J; Smith, Alicia K; Smoller, Jordan W; Stein, Murray B; Sumner, Jennifer A; Uddin, Monica

    2015-09-01

    The development of posttraumatic stress disorder (PTSD) is influenced by genetic factors. Although there have been some replicated candidates, the identification of risk variants for PTSD has lagged behind genetic research of other psychiatric disorders such as schizophrenia, autism, and bipolar disorder. Psychiatric genetics has moved beyond examination of specific candidate genes in favor of the genome-wide association study (GWAS) strategy of very large numbers of samples, which allows for the discovery of previously unsuspected genes and molecular pathways. The successes of genetic studies of schizophrenia and bipolar disorder have been aided by the formation of a large-scale GWAS consortium: the Psychiatric Genomics Consortium (PGC). In contrast, only a handful of GWAS of PTSD have appeared in the literature to date. Here we describe the formation of a group dedicated to large-scale study of PTSD genetics: the PGC-PTSD. The PGC-PTSD faces challenges related to the contingency on trauma exposure and the large degree of ancestral genetic diversity within and across participating studies. Using the PGC analysis pipeline supplemented by analyses tailored to address these challenges, we anticipate that our first large-scale GWAS of PTSD will comprise over 10 000 cases and 30 000 trauma-exposed controls. Following in the footsteps of our PGC forerunners, this collaboration, of a scope that is unprecedented in the field of traumatic stress, will lead the search for replicable genetic associations and new insights into the biological underpinnings of PTSD.

  14. The Psychiatric Genomics Consortium Posttraumatic Stress Disorder Workgroup: Posttraumatic Stress Disorder Enters the Age of Large-Scale Genomic Collaboration

    PubMed Central

    Logue, Mark W; Amstadter, Ananda B; Baker, Dewleen G; Duncan, Laramie; Koenen, Karestan C; Liberzon, Israel; Miller, Mark W; Morey, Rajendra A; Nievergelt, Caroline M; Ressler, Kerry J; Smith, Alicia K; Smoller, Jordan W; Stein, Murray B; Sumner, Jennifer A; Uddin, Monica

    2015-01-01

    The development of posttraumatic stress disorder (PTSD) is influenced by genetic factors. Although there have been some replicated candidates, the identification of risk variants for PTSD has lagged behind genetic research of other psychiatric disorders such as schizophrenia, autism, and bipolar disorder. Psychiatric genetics has moved beyond examination of specific candidate genes in favor of the genome-wide association study (GWAS) strategy of very large numbers of samples, which allows for the discovery of previously unsuspected genes and molecular pathways. The successes of genetic studies of schizophrenia and bipolar disorder have been aided by the formation of a large-scale GWAS consortium: the Psychiatric Genomics Consortium (PGC). In contrast, only a handful of GWAS of PTSD have appeared in the literature to date. Here we describe the formation of a group dedicated to large-scale study of PTSD genetics: the PGC-PTSD. The PGC-PTSD faces challenges related to the contingency on trauma exposure and the large degree of ancestral genetic diversity within and across participating studies. Using the PGC analysis pipeline supplemented by analyses tailored to address these challenges, we anticipate that our first large-scale GWAS of PTSD will comprise over 10 000 cases and 30 000 trauma-exposed controls. Following in the footsteps of our PGC forerunners, this collaboration—of a scope that is unprecedented in the field of traumatic stress—will lead the search for replicable genetic associations and new insights into the biological underpinnings of PTSD. PMID:25904361

  15. Lessons from Structural Genomics

    PubMed Central

    Terwilliger, Thomas C.; Stuart, David; Yokoyama, Shigeyuki

    2010-01-01

    A decade of structural genomics, the large-scale determination of protein structures, has generated a wealth of data and many important lessons for structural biology and for future large-scale projects. These lessons include a confirmation that it is possible to construct large-scale facilities that can determine the structures of a hundred or more proteins per year, that these structures can be of high quality, and that these structures can have an important impact. Technology development has played a critical role in structural genomics, the difficulties at each step of determining a structure of a particular protein can be quantified, and validation of technologies is nearly as important as the technologies themselves. Finally, rapid deposition of data in public databases has increased the impact and usefulness of the data and international cooperation has advanced the field and improved data sharing. PMID:19416074

  16. LaGomiCs-Lagomorph Genomics Consortium: An International Collaborative Effort for Sequencing the Genomes of an Entire Mammalian Order.

    PubMed

    Fontanesi, Luca; Di Palma, Federica; Flicek, Paul; Smith, Andrew T; Thulin, Carl-Gustaf; Alves, Paulo C

    2016-07-01

    The order Lagomorpha comprises about 90 living species, divided into 2 families: the pikas (Family Ochotonidae), and the rabbits, hares, and jackrabbits (Family Leporidae). Lagomorphs are important economically and scientifically as major human food resources, valued game species, pests of agricultural significance, model laboratory animals, and key elements in food webs. A quarter of the lagomorph species are listed as threatened. They are native to all continents except Antarctica, and occur up to 5000 m above sea level, from the equator to the Arctic, spanning a wide range of environmental conditions. The order has notable taxonomic problems presenting significant difficulties for defining a species due to broad phenotypic variation, overlap of morphological characteristics, and relatively recent speciation events. At present, only the genomes of 2 species, the European rabbit (Oryctolagus cuniculus) and American pika (Ochotona princeps), have been sequenced and assembled. Starting from a paucity of genome information, the main scientific aim of the Lagomorph Genomics Consortium (LaGomiCs), born from a cooperative initiative of the European COST Action "A Collaborative European Network on Rabbit Genome Biology-RGB-Net" and the World Lagomorph Society (WLS), is to provide an international framework for the sequencing of the genome of all extant and selected extinct lagomorphs. Sequencing the genomes of an entire order will provide a large amount of information to address biological problems not only related to lagomorphs but also to all mammals. We present current and planned sequencing programs and outline the final objective of LaGomiCs, made possible through broad international collaboration. PMID:26921276

  17. Integration of gene ontology pathways with North American Rheumatoid Arthritis Consortium genome-wide association data via linear modeling.

    PubMed

    Lebrec, Jérémie JP; Huizinga, Tom WJ; Toes, René EM; Houwing-Duistermaat, Jeanine J; van Houwelingen, Hans C

    2009-01-01

    We describe an empirical Bayesian linear model for integration of functional gene annotation data with genome-wide association data. Using case-control study data from the North American Rheumatoid Arthritis Consortium and gene annotation data from the Gene Ontology, we illustrate how the method can be used to prioritize candidate genes for further investigation.

  18. Integration of gene ontology pathways with North American Rheumatoid Arthritis Consortium genome-wide association data via linear modeling

    PubMed Central

    2009-01-01

    We describe an empirical Bayesian linear model for integration of functional gene annotation data with genome-wide association data. Using case-control study data from the North American Rheumatoid Arthritis Consortium and gene annotation data from the Gene Ontology, we illustrate how the method can be used to prioritize candidate genes for further investigation. PMID:20018091

  19. Extending Standards for Genomics and Metagenomics Data: A Research Coordination Network for the Genomic Standards Consortium (RCN4GSC).

    PubMed

    Wooley, John C; Field, Dawn; Glöckner, Frank-Oliver

    2009-07-20

    Through a newly established Research Coordination Network for the Genomic Standards Consortium (RCN4GSC), the GSC will continue its leadership in establishing and integrating genomic standards through community-based efforts. These efforts, undertaken in the context of genomic and metagenomic research, aim to ensure the electronic capture of all genomic data and to facilitate the achievement of a community consensus around collecting and managing relevant contextual information connected to the sequence data. The GSC operates as an open, inclusive organization, welcoming inspired biologists with a commitment to community service. Within the collaborative framework of the ongoing, international activities of the GSC, the RCN will expand the range of research domains engaged in these standardization efforts and sustain scientific networking to encourage active participation by the broader community. The RCN4GSC, funded for five years by the US National Science Foundation, will primarily support outcome-focused working meetings and the exchange of early-career scientists between GSC research groups in order to advance key standards contributions such as GCDML. Focusing on the timely delivery of the extant GSC core projects, the RCN will also extend the pioneering efforts of the GSC to engage researchers active in developing ecological, environmental and biodiversity data standards. As the initial goals of the GSC are increasingly achieved, promoting the comprehensive use of effective standards will be essential to ensure the effective use of sequence and associated data, to provide access for all biologists to all of the information, and to create interdisciplinary opportunities for discovery. The RCN will facilitate these implementation activities through participation in major scientific conferences and presentations on scientific advances enabled by community usage of genomic standards.

  20. The peanut genome consortium and peanut genome sequence: Creating a better future through global food security

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The competitiveness of peanuts in domestic and global markets has been threatened by losses in productivity and quality that are attributed to diseases, pests, environmental stresses and allergy or food safety issues. The U.S. Peanut Genome Initiative (PGI) was launched in 2004, and expanded to a gl...

  1. Report of the 14th Genomic Standards Consortium Meeting, Oxford, UK, September 17-21, 2012.

    PubMed Central

    Davies, Neil; Field, Dawn; Amaral-Zettler, Linda; Barker, Katharine; Bicak, Mesude; Bourlat, Sarah; Coddington, Jonathan; Deck, John; Drummond, Alexei; Gilbert, Jack A.; Glöckner, Frank Oliver; Kottmann, Renzo; Meyer, Chris; Morrison, Norman; Obst, Matthias; Robbins, Robert; Schriml, Lynn; Sterk, Peter; Stones-Havas, Steven

    2014-01-01

    This report summarizes the proceedings of the 14th workshop of the Genomic Standards Consortium (GSC) held at the University of Oxford in September 2012. The primary goal of the workshop was to work towards the launch of the Genomic Observatories (GOs) Network under the GSC. For the first time, it brought together potential GOs sites, GSC members, and a range of interested partner organizations. It thus represented the first meeting of the GOs Network (GOs1). Key outcomes include the formation of a core group of “champions” ready to take the GOs Network forward, as well as the formation of working groups. The workshop also served as the first meeting of a wide range of participants in the Ocean Sampling Day (OSD) initiative, a first GOs action. Three projects with complementary interests – COST Action ES1103, MG4U and Micro B3 – organized joint sessions at the workshop. A two-day GSC Hackathon followed the main three days of meetings.

  2. Integrated Genomic Analysis of Diverse Induced Pluripotent Stem Cells from the Progenitor Cell Biology Consortium.

    PubMed

    Salomonis, Nathan; Dexheimer, Phillip J; Omberg, Larsson; Schroll, Robin; Bush, Stacy; Huo, Jeffrey; Schriml, Lynn; Ho Sui, Shannan; Keddache, Mehdi; Mayhew, Christopher; Shanmukhappa, Shiva Kumar; Wells, James; Daily, Kenneth; Hubler, Shane; Wang, Yuliang; Zambidis, Elias; Margolin, Adam; Hide, Winston; Hatzopoulos, Antonis K; Malik, Punam; Cancelas, Jose A; Aronow, Bruce J; Lutzko, Carolyn

    2016-07-12

    The rigorous characterization of distinct induced pluripotent stem cells (iPSC) derived from multiple reprogramming technologies, somatic sources, and donors is required to understand potential sources of variability and downstream potential. To achieve this goal, the Progenitor Cell Biology Consortium performed comprehensive experimental and genomic analyses of 58 iPSC from ten laboratories generated using a variety of reprogramming genes, vectors, and cells. Associated global molecular characterization studies identified functionally informative correlations in gene expression, DNA methylation, and/or copy-number variation among key developmental and oncogenic regulators as a result of donor, sex, line stability, reprogramming technology, and cell of origin. Furthermore, X-chromosome inactivation in PSC produced highly correlated differences in teratoma-lineage staining and regulator expression upon differentiation. All experimental results, and raw, processed, and metadata from these analyses, including powerful tools, are interactively accessible from a new online portal at https://www.synapse.org to serve as a reusable resource for the stem cell community.

  3. Athlome Project Consortium: a concerted effort to discover genomic and other "omic" markers of athletic performance.

    PubMed

    Pitsiladis, Yannis P; Tanaka, Masashi; Eynon, Nir; Bouchard, Claude; North, Kathryn N; Williams, Alun G; Collins, Malcolm; Moran, Colin N; Britton, Steven L; Fuku, Noriyuki; Ashley, Euan A; Klissouras, Vassilis; Lucia, Alejandro; Ahmetov, Ildus I; de Geus, Eco; Alsayrafi, Mohammed

    2016-03-01

    Despite numerous attempts to discover genetic variants associated with elite athletic performance, injury predisposition, and elite/world-class athletic status, there has been limited progress to date. Past reliance on candidate gene studies predominantly focusing on genotyping a limited number of single nucleotide polymorphisms or the insertion/deletion variants in small, often heterogeneous cohorts (i.e., made up of athletes of quite different sport specialties) has not generated the kind of results that could offer solid opportunities to bridge the gap between basic research in exercise sciences and deliverables in biomedicine. A retrospective view of genetic association studies with complex disease traits indicates that transition to hypothesis-free genome-wide approaches will be more fruitful. In studies of complex disease, it is well recognized that the magnitude of genetic association is often smaller than initially anticipated, and, as such, large sample sizes are required to identify the gene effects robustly. A symposium was held in Athens and on the Greek island of Santorini from 14-17 May 2015 to review the main findings in exercise genetics and genomics and to explore promising trends and possibilities. The symposium also offered a forum for the development of a position stand (the Santorini Declaration). Among the participants, many were involved in ongoing collaborative studies (e.g., ELITE, GAMES, Gene SMART, GENESIS, and POWERGENE). A consensus emerged among participants that it would be advantageous to bring together all current studies and those recently launched into one new large collaborative initiative, which was subsequently named the Athlome Project Consortium.

  4. DNA Methylation in Newborns and Maternal Smoking in Pregnancy: Genome-wide Consortium Meta-analysis.

    PubMed

    Joubert, Bonnie R; Felix, Janine F; Yousefi, Paul; Bakulski, Kelly M; Just, Allan C; Breton, Carrie; Reese, Sarah E; Markunas, Christina A; Richmond, Rebecca C; Xu, Cheng-Jian; Küpers, Leanne K; Oh, Sam S; Hoyo, Cathrine; Gruzieva, Olena; Söderhäll, Cilla; Salas, Lucas A; Baïz, Nour; Zhang, Hongmei; Lepeule, Johanna; Ruiz, Carlos; Ligthart, Symen; Wang, Tianyuan; Taylor, Jack A; Duijts, Liesbeth; Sharp, Gemma C; Jankipersadsing, Soesma A; Nilsen, Roy M; Vaez, Ahmad; Fallin, M Daniele; Hu, Donglei; Litonjua, Augusto A; Fuemmeler, Bernard F; Huen, Karen; Kere, Juha; Kull, Inger; Munthe-Kaas, Monica Cheng; Gehring, Ulrike; Bustamante, Mariona; Saurel-Coubizolles, Marie José; Quraishi, Bilal M; Ren, Jie; Tost, Jörg; Gonzalez, Juan R; Peters, Marjolein J; Håberg, Siri E; Xu, Zongli; van Meurs, Joyce B; Gaunt, Tom R; Kerkhof, Marjan; Corpeleijn, Eva; Feinberg, Andrew P; Eng, Celeste; Baccarelli, Andrea A; Benjamin Neelon, Sara E; Bradman, Asa; Merid, Simon Kebede; Bergström, Anna; Herceg, Zdenko; Hernandez-Vargas, Hector; Brunekreef, Bert; Pinart, Mariona; Heude, Barbara; Ewart, Susan; Yao, Jin; Lemonnier, Nathanaël; Franco, Oscar H; Wu, Michael C; Hofman, Albert; McArdle, Wendy; Van der Vlies, Pieter; Falahi, Fahimeh; Gillman, Matthew W; Barcellos, Lisa F; Kumar, Ashish; Wickman, Magnus; Guerra, Stefano; Charles, Marie-Aline; Holloway, John; Auffray, Charles; Tiemeier, Henning W; Smith, George Davey; Postma, Dirkje; Hivert, Marie-France; Eskenazi, Brenda; Vrijheid, Martine; Arshad, Hasan; Antó, Josep M; Dehghan, Abbas; Karmaus, Wilfried; Annesi-Maesano, Isabella; Sunyer, Jordi; Ghantous, Akram; Pershagen, Göran; Holland, Nina; Murphy, Susan K; DeMeo, Dawn L; Burchard, Esteban G; Ladd-Acosta, Christine; Snieder, Harold; Nystad, Wenche; Koppelman, Gerard H; Relton, Caroline L; Jaddoe, Vincent W V; Wilcox, Allen; Melén, Erik; London, Stephanie J

    2016-04-01

    Epigenetic modifications, including DNA methylation, represent a potential mechanism for environmental impacts on human disease. Maternal smoking in pregnancy remains an important public health problem that impacts child health in a myriad of ways and has potential lifelong consequences. The mechanisms are largely unknown, but epigenetics most likely plays a role. We formed the Pregnancy And Childhood Epigenetics (PACE) consortium and meta-analyzed, across 13 cohorts (n = 6,685), the association between maternal smoking in pregnancy and newborn blood DNA methylation at over 450,000 CpG sites (CpGs) by using the Illumina 450K BeadChip. Over 6,000 CpGs were differentially methylated in relation to maternal smoking at genome-wide statistical significance (false discovery rate, 5%), including 2,965 CpGs corresponding to 2,017 genes not previously related to smoking and methylation in either newborns or adults. Several genes are relevant to diseases that can be caused by maternal smoking (e.g., orofacial clefts and asthma) or adult smoking (e.g., certain cancers). A number of differentially methylated CpGs were associated with gene expression. We observed enrichment in pathways and processes critical to development. In older children (5 cohorts, n = 3,187), 100% of CpGs gave at least nominal levels of significance, far more than expected by chance (p value < 2.2 × 10^-16). Results were robust to different normalization methods used across studies and cell type adjustment. In this large scale meta-analysis of methylation data, we identified numerous loci involved in response to maternal smoking in pregnancy with persistence into later childhood and provide insights into mechanisms underlying effects of this important exposure.
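
    The abstract reports significance at a false discovery rate of 5% but does not say which procedure was used. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure, the thresholding over a list of per-CpG p values could look like this (the function name and the example p values are hypothetical):

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the test is declared
    significant at false discovery rate `alpha` (Benjamini-Hochberg).
    This is an assumed procedure; the abstract does not name one."""
    m = len(p_values)
    # Indices of the p values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every test ranked at or below max_rank is rejected.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p values for four CpG sites.
flags = benjamini_hochberg([0.039, 0.001, 0.205, 0.008], alpha=0.05)
```

    With hundreds of thousands of CpGs the same logic applies unchanged; only the list of p values grows.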

  5. DNA Methylation in Newborns and Maternal Smoking in Pregnancy: Genome-wide Consortium Meta-analysis

    PubMed Central

    Joubert, Bonnie R.; Felix, Janine F.; Yousefi, Paul; Bakulski, Kelly M.; Just, Allan C.; Breton, Carrie; Reese, Sarah E.; Markunas, Christina A.; Richmond, Rebecca C.; Xu, Cheng-Jian; Küpers, Leanne K.; Oh, Sam S.; Hoyo, Cathrine; Gruzieva, Olena; Söderhäll, Cilla; Salas, Lucas A.; Baïz, Nour; Zhang, Hongmei; Lepeule, Johanna; Ruiz, Carlos; Ligthart, Symen; Wang, Tianyuan; Taylor, Jack A.; Duijts, Liesbeth; Sharp, Gemma C.; Jankipersadsing, Soesma A.; Nilsen, Roy M.; Vaez, Ahmad; Fallin, M. Daniele; Hu, Donglei; Litonjua, Augusto A.; Fuemmeler, Bernard F.; Huen, Karen; Kere, Juha; Kull, Inger; Munthe-Kaas, Monica Cheng; Gehring, Ulrike; Bustamante, Mariona; Saurel-Coubizolles, Marie José; Quraishi, Bilal M.; Ren, Jie; Tost, Jörg; Gonzalez, Juan R.; Peters, Marjolein J.; Håberg, Siri E.; Xu, Zongli; van Meurs, Joyce B.; Gaunt, Tom R.; Kerkhof, Marjan; Corpeleijn, Eva; Feinberg, Andrew P.; Eng, Celeste; Baccarelli, Andrea A.; Benjamin Neelon, Sara E.; Bradman, Asa; Merid, Simon Kebede; Bergström, Anna; Herceg, Zdenko; Hernandez-Vargas, Hector; Brunekreef, Bert; Pinart, Mariona; Heude, Barbara; Ewart, Susan; Yao, Jin; Lemonnier, Nathanaël; Franco, Oscar H.; Wu, Michael C.; Hofman, Albert; McArdle, Wendy; Van der Vlies, Pieter; Falahi, Fahimeh; Gillman, Matthew W.; Barcellos, Lisa F.; Kumar, Ashish; Wickman, Magnus; Guerra, Stefano; Charles, Marie-Aline; Holloway, John; Auffray, Charles; Tiemeier, Henning W.; Smith, George Davey; Postma, Dirkje; Hivert, Marie-France; Eskenazi, Brenda; Vrijheid, Martine; Arshad, Hasan; Antó, Josep M.; Dehghan, Abbas; Karmaus, Wilfried; Annesi-Maesano, Isabella; Sunyer, Jordi; Ghantous, Akram; Pershagen, Göran; Holland, Nina; Murphy, Susan K.; DeMeo, Dawn L.; Burchard, Esteban G.; Ladd-Acosta, Christine; Snieder, Harold; Nystad, Wenche; Koppelman, Gerard H.; Relton, Caroline L.; Jaddoe, Vincent W.V.; Wilcox, Allen; Melén, Erik; London, Stephanie J.

    2016-01-01

    Epigenetic modifications, including DNA methylation, represent a potential mechanism for environmental impacts on human disease. Maternal smoking in pregnancy remains an important public health problem that impacts child health in a myriad of ways and has potential lifelong consequences. The mechanisms are largely unknown, but epigenetics most likely plays a role. We formed the Pregnancy And Childhood Epigenetics (PACE) consortium and meta-analyzed, across 13 cohorts (n = 6,685), the association between maternal smoking in pregnancy and newborn blood DNA methylation at over 450,000 CpG sites (CpGs) by using the Illumina 450K BeadChip. Over 6,000 CpGs were differentially methylated in relation to maternal smoking at genome-wide statistical significance (false discovery rate, 5%), including 2,965 CpGs corresponding to 2,017 genes not previously related to smoking and methylation in either newborns or adults. Several genes are relevant to diseases that can be caused by maternal smoking (e.g., orofacial clefts and asthma) or adult smoking (e.g., certain cancers). A number of differentially methylated CpGs were associated with gene expression. We observed enrichment in pathways and processes critical to development. In older children (5 cohorts, n = 3,187), 100% of CpGs gave at least nominal levels of significance, far more than expected by chance (p value < 2.2 × 10^-16). Results were robust to different normalization methods used across studies and cell type adjustment. In this large scale meta-analysis of methylation data, we identified numerous loci involved in response to maternal smoking in pregnancy with persistence into later childhood and provide insights into mechanisms underlying effects of this important exposure. PMID:27040690

  7. Genome Clone Libraries and Data from the Integrated Molecular Analysis of Genomes and their Expression (I.M.A.G.E.) Consortium

    DOE Data Explorer

    The I.M.A.G.E. Consortium was initiated in 1993 by four academic groups on a collaborative basis after informal discussions led to a common vision of how to achieve an important goal in the study of the human genome. The primary goal of the Integrated Molecular Analysis of Genomes and their Expression Consortium is to create arrayed cDNA libraries and associated bioinformatics tools, and to make them publicly available to the research community. The primary organisms of interest include intensively studied mammalian species, including human, mouse, rat and non-human primate species. The Consortium has also focused on several commonly studied model organisms; as part of this effort it has arrayed cDNAs from zebrafish and Fugu (pufferfish) as well as Xenopus laevis and X. tropicalis (frog). Utilizing high speed robotics, over nine million individual cDNA clones have been arrayed into 384-well microtiter plates, and sufficient replicas have been created to distribute copies both to sequencing centers and to a network of five distributors located worldwide. The I.M.A.G.E. Consortium represents the world's largest public cDNA collection, and works closely with the National Institutes of Health's Mammalian Gene Collection (MGC) to help it achieve its goal of creating a full-length cDNA clone for every human and mouse gene. I.M.A.G.E. is also a member of the ORFeome Collaboration, working to generate a complete set of expression-ready open reading frame clones representing each human gene. Custom informatics tools have been developed in support of these projects to better allow the research community to select clones of interest and track and collect all data deposited into public databases about those clones and their related sequences. I.M.A.G.E. clones are publicly available, free of any royalties, and may be used by anyone agreeing with the Consortium's guidelines.

  8. Informational laws of genome structures

    NASA Astrophysics Data System (ADS)

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-06-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
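
    To make the k-mer idea concrete, the sketch below computes the empirical Shannon entropy of the k-mer distribution of a sequence, with k chosen from the sequence length. It assumes that lg2(n) denotes the base-2 logarithm and that k is rounded to the nearest integer; both are assumptions, and the function names are hypothetical. The indexes defined in the study are more elaborate than this single quantity.

```python
import math
from collections import Counter


def suggested_k(genome):
    """k = log2(n), rounded to an integer (rounding is an assumption)."""
    return max(1, round(math.log2(len(genome))))


def kmer_entropy(genome, k):
    """Empirical Shannon entropy (in bits) of the k-mers of `genome`,
    counted over all overlapping windows of length k."""
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy sequence of length 16, so the suggested k is 4.
toy = "ACGT" * 4
k = suggested_k(toy)
entropy_at_k = kmer_entropy(toy, k)
```

    A highly repetitive sequence such as the toy example yields a low entropy relative to the maximum possible for its length, which is the kind of contrast with random genomes that the abstract alludes to.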

  9. Informational laws of genome structures

    PubMed Central

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-01-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined. PMID:27354155

  10. Informational laws of genome structures.

    PubMed

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-01-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined. PMID:27354155

  11. Multivariate Analysis of Anthropometric Traits Using Summary Statistics of Genome-Wide Association Studies from GIANT Consortium

    PubMed Central

    Zhu, Xiaofeng

    2016-01-01

    Meta-analysis of a single trait across multiple cohorts has been used to increase statistical power in genome-wide association studies (GWASs). Although hundreds of variants have been identified by GWAS, these variants explain only a small fraction of phenotypic variation. Cross-phenotype association analysis (CPASSOC) can further improve statistical power by searching for variants that contribute to multiple traits, which is often relevant to pleiotropy. In this study, we performed CPASSOC analysis on the summary statistics from the Genetic Investigation of ANthropometric Traits (GIANT) consortium using a novel method recently developed by our group. Sex-specific meta-analysis data for height, body mass index (BMI), and waist-to-hip ratio adjusted for BMI (WHRadjBMI) from the discovery phase of the GIANT consortium study were combined using CPASSOC for each trait as well as for the 3 traits together. The conventional meta-analysis results from the discovery phase data of the GIANT consortium studies were compared with those from the CPASSOC analysis. The CPASSOC analysis was able to identify 17 loci associated with anthropometric traits that were missed by conventional meta-analysis. Among these loci, 16 have been reported in the literature by including additional samples and 1 is novel. We also demonstrated that CPASSOC is able to detect pleiotropic effects when analyzing multiple traits. PMID:27701450
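
    For orientation, the conventional single-trait meta-analysis that the abstract uses as its baseline is commonly done by inverse-variance weighting of per-cohort effect estimates. The sketch below shows that standard fixed-effect calculation; it is not the CPASSOC method, and the function name and the input numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance fixed-effect meta-analysis of one variant.

    betas: per-cohort effect estimates for the same trait and variant.
    standard_errors: their standard errors.
    Returns (pooled beta, pooled standard error, z score).
    """
    # Each cohort is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Two hypothetical cohorts with equal precision.
beta, se, z = fixed_effect_meta([0.1, 0.3], [0.1, 0.1])
```

    Pooling shrinks the standard error below that of either cohort, which is the gain in power the first sentence of the abstract refers to; cross-phenotype methods extend the idea across traits as well as cohorts.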

  12. GCAT-SEEKquence: Genome Consortium for Active Teaching of Undergraduates through Increased Faculty Access to Next-Generation Sequencing Data

    PubMed Central

    Buonaccorsi, Vincent P.; Boyle, Michael D.; Grove, Deborah; Praul, Craig; Sakk, Eric; Stuart, Ash; Tobin, Tammy; Hosler, Jay; Carney, Susan L.; Engle, Michael J.; Overton, Barry E.; Newman, Jeffrey D.; Pizzorno, Marie; Powell, Jennifer R.; Trun, Nancy

    2011-01-01

    To transform undergraduate biology education, faculty need to provide opportunities for students to engage in the process of science. The rise of research approaches using next-generation (NextGen) sequencing has been impressive, but incorporation of such approaches into the undergraduate curriculum remains a major challenge. In this paper, we report proceedings of a National Science Foundation–funded workshop held July 11–14, 2011, at Juniata College. The purpose of the workshop was to develop a regional research coordination network for undergraduate biology education (RCN/UBE). The network is collaborating with a genome-sequencing core facility located at Pennsylvania State University (University Park) to enable undergraduate students and faculty at small colleges to access state-of-the-art sequencing technology. We aim to create a database of references, protocols, and raw data related to NextGen sequencing, and to find innovative ways to reduce costs related to sequencing and bioinformatics analysis. It was agreed that our regional network for NextGen sequencing could operate more effectively if it were partnered with the Genome Consortium for Active Teaching (GCAT) as a new arm of that consortium, entitled GCAT-SEEK(quence). This step would also permit the approach to be replicated elsewhere. PMID:22135368

  13. Genome Sequence of Bacillus endophyticus and Analysis of Its Companion Mechanism in the Ketogulonigenium vulgare-Bacillus Strain Consortium

    PubMed Central

    Jia, Nan; Du, Jin; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2015-01-01

    Bacillus strains have been widely used as the companion strain of Ketogulonigenium vulgare in the process of vitamin C fermentation. Different Bacillus strains generate different effects on the growth of K. vulgare and ultimately influence the productivity. First, we identified that Bacillus endophyticus Hbe603 was an appropriate strain to cooperate with K. vulgare and the product conversion rate exceeded 90% in industrial vitamin C fermentation. Here, we report the genome sequencing of the B. endophyticus Hbe603 industrial companion strain and speculate on its possible advantage in the consortium. The circular chromosome of B. endophyticus Hbe603 has a size of 4.87 Mb with GC content of 36.64% and has the highest similarity with that of Bacillus megaterium among all the bacteria with complete genomes. Compared with Bacillus thuringiensis, Bacillus cereus and B. megaterium in terms of the distribution of COGs, B. endophyticus has fewer genes related to cell envelope biogenesis and signal transduction mechanisms, and more genes related to carbohydrate transport and metabolism, energy production and conversion, as well as lipid transport and metabolism. Genome-based functional studies revealed the specific capability of B. endophyticus in sporulation, transcription regulation, environmental resistance, membrane transportation, extracellular proteins and nutrients synthesis, which would be beneficial for K. vulgare. In particular, B. endophyticus lacks the Rap-Phr signal cascade system and, in part, spore coat related proteins. In addition, it has specific pathways for vitamin B12 synthesis and sorbitol metabolism. The genome analysis of the industrial B. endophyticus will help us understand its cooperative mechanism in the K. vulgare-Bacillus strain consortium to improve the fermentation of vitamin C. PMID:26248285

  14. Genome Sequence of Bacillus endophyticus and Analysis of Its Companion Mechanism in the Ketogulonigenium vulgare-Bacillus Strain Consortium.

    PubMed

    Jia, Nan; Du, Jin; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2015-01-01

    Bacillus strains have been widely used as the companion strain of Ketogulonigenium vulgare in the process of vitamin C fermentation. Different Bacillus strains generate different effects on the growth of K. vulgare and ultimately influence the productivity. First, we identified that Bacillus endophyticus Hbe603 was an appropriate strain to cooperate with K. vulgare and the product conversion rate exceeded 90% in industrial vitamin C fermentation. Here, we report the genome sequencing of the B. endophyticus Hbe603 industrial companion strain and speculate on its possible advantage in the consortium. The circular chromosome of B. endophyticus Hbe603 has a size of 4.87 Mb with GC content of 36.64% and has the highest similarity with that of Bacillus megaterium among all the bacteria with complete genomes. Compared with Bacillus thuringiensis, Bacillus cereus and B. megaterium in terms of the distribution of COGs, B. endophyticus has fewer genes related to cell envelope biogenesis and signal transduction mechanisms, and more genes related to carbohydrate transport and metabolism, energy production and conversion, as well as lipid transport and metabolism. Genome-based functional studies revealed the specific capability of B. endophyticus in sporulation, transcription regulation, environmental resistance, membrane transportation, extracellular proteins and nutrients synthesis, which would be beneficial for K. vulgare. In particular, B. endophyticus lacks the Rap-Phr signal cascade system and, in part, spore coat related proteins. In addition, it has specific pathways for vitamin B12 synthesis and sorbitol metabolism. The genome analysis of the industrial B. endophyticus will help us understand its cooperative mechanism in the K. vulgare-Bacillus strain consortium to improve the fermentation of vitamin C. PMID:26248285

  16. The discrepancies in the results of bioinformatics tools for genomic structural annotation

    NASA Astrophysics Data System (ADS)

    Pawełkowicz, Magdalena; Nowak, Robert; Osipowski, Paweł; Rymuszka, Jacek; Świerkula, Katarzyna; Wojcieszek, Michał; Przybecki, Zbigniew

    2014-11-01

    A major focus of sequencing projects is to identify genes in genomes. However, it is necessary to define the variety of genes and the criteria for identifying them. In this work we present discrepancies and dependencies arising from the application of different bioinformatic programs for structural annotation to the cucumber data set from the Polish Consortium of Cucumber Genome Sequencing. We used Fgenesh, GenScan and GeneMark for automated structural annotation, and the results were compared to the reference annotation.
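
    The abstract does not state how predictions were compared with the reference annotation. One common way to quantify such discrepancies is nucleotide-level sensitivity and precision over annotated exon intervals; the sketch below illustrates that measure under this assumption, with hypothetical function names and coordinates.

```python
def covered_positions(intervals):
    """Set of positions covered by inclusive (start, end) intervals."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(predicted, reference):
    """Return (sensitivity, precision) of predicted exon intervals
    against reference exon intervals, counted per nucleotide."""
    pred = covered_positions(predicted)
    ref = covered_positions(reference)
    true_positives = len(pred & ref)
    sensitivity = true_positives / len(ref) if ref else 0.0
    precision = true_positives / len(pred) if pred else 0.0
    return sensitivity, precision


# Hypothetical exon coordinates: the prediction overlaps half of the reference.
sens, prec = nucleotide_accuracy(predicted=[(51, 150)], reference=[(1, 100)])
```

    Running the same comparison for each gene finder against one reference gives a simple table of per-tool agreement; set-based counting is adequate for short regions, while interval arithmetic would be preferable for whole chromosomes.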

  17. Social and behavioral research in genomic sequencing: approaches from the Clinical Sequencing Exploratory Research Consortium Outcomes and Measures Working Group.

    PubMed

    Gray, Stacy W; Martins, Yolanda; Feuerman, Lindsay Z; Bernhardt, Barbara A; Biesecker, Barbara B; Christensen, Kurt D; Joffe, Steven; Rini, Christine; Veenstra, David; McGuire, Amy L

    2014-10-01

    The routine use of genomic sequencing in clinical medicine has the potential to dramatically alter patient care and medical outcomes. To fully understand the psychosocial and behavioral impact of sequencing integration into clinical practice, it is imperative that we identify the factors that influence sequencing-related decision making and patient outcomes. In an effort to develop a collaborative and conceptually grounded approach to studying sequencing adoption, members of the National Human Genome Research Institute's Clinical Sequencing Exploratory Research Consortium formed the Outcomes and Measures Working Group. Here we highlight the priority areas of investigation and psychosocial and behavioral outcomes identified by the Working Group. We also review some of the anticipated challenges to measurement in social and behavioral research related to genomic sequencing; opportunities for instrument development; and the importance of qualitative, quantitative, and mixed-method approaches. This work represents the early, shared efforts of multiple research teams as we strive to understand individuals' experiences with genomic sequencing. The resulting body of knowledge will guide recommendations for the optimal use of sequencing in clinical practice.

  18. Comparative genetic mapping between clementine, pummelo and sweet orange and the interspecific structure of the Clementine genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The availability of a saturated genetic map of Clementine was identified by the International Citrus Genome Consortium as an essential prerequisite to assist the assembly...

  19. Structural Genomics of Protein Phosphatases

    SciTech Connect

    Almo, S.; Bonanno, J.; Sauder, J.; Emtage, S.; Dilorenzo, T.; Malashkevich, V.; Wasserman, S.; Swaminathan, S.; Eswaramoorthy, S.; et al

    2007-01-01

    The New York SGX Research Center for Structural Genomics (NYSGXRC) of the NIGMS Protein Structure Initiative (PSI) has applied its high-throughput X-ray crystallographic structure determination platform to systematic studies of all human protein phosphatases and protein phosphatases from biomedically-relevant pathogens. To date, the NYSGXRC has determined structures of 21 distinct protein phosphatases: 14 from human, 2 from mouse, 2 from the pathogen Toxoplasma gondii, 1 from Trypanosoma brucei, the parasite responsible for African sleeping sickness, and 2 from the principal mosquito vector of malaria in Africa, Anopheles gambiae. These structures provide insights into both normal and pathophysiologic processes, including transcriptional regulation, regulation of major signaling pathways, neural development, and type 1 diabetes. In conjunction with the contributions of other international structural genomics consortia, these efforts promise to provide an unprecedented database and materials repository for structure-guided experimental and computational discovery of inhibitors for all classes of protein phosphatases.

  20. The global cancer genomics consortium's symposium: new era of molecular medicine and epigenetic cancer medicine - cross section of genomics and epigenetics

    PubMed Central

    Toi, Masakazu; Pillai, M. Radhakrishna; Gupta, Sudeep; Badwe, Rajendra; Carmo-Fonseca, Maria; Costa, Luis; Chow, Louis WC; Knapp, Stefan; Kumar, Rakesh

    2015-01-01

    The Global Cancer Genomics Consortium (GCGC) colleagues continue to function together as an interactive multidisciplinary team of cancer biologists and oncologists with interests in genomics and in building a bidirectional bridge between cancer clinics and laboratories while taking advantage of shared resources among its member scientists. The GCGC includes member scientists from six institutions in Lisbon, the United Kingdom, Japan, India and the United States, and was formed in December 2010 for a period of five years. Driven by valuable lessons learned from the previous symposiums, the fourth GCGC Symposium focused on a cross section of genomic and epigenetic cancer medicine, and for this reason we chose the conference theme "New Era of Molecular Medicine and Epigenetic Cancer Medicine: Cross Section of Genomics and Epigenetics". This year's symposium was co-organized by the Organization for Oncology and Translational Research (OOTR) and held at the Shiran Hall, Kyoto University, Kyoto, Japan, on November 14 and 15, 2014. The symposium attracted around 80 participants from 14 countries and included 23 invited platform speakers. The scientific program included eight platform sessions, one poster session, and three plenary lectures. The symposium focused on cancer stem cells and self-renewal, cancer transcriptome, tumor heterogeneity, tumor biology, breast cancer genomics, targeted therapeutics and personalized medicine. The issues of cancer stem cells and tumor heterogeneity were echoed in most of the scientific presentations. The meeting concluded with an oral presentation by the best poster awardee and closing remarks by meeting co-chairs.

  1. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium.

    PubMed

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria, which cooperate with Ketogulonigenium vulgare in vitamin C two-step fermentation. The two Bacillus species have different morphologies, swarming motility and 2-keto-L-gulonic acid productivities when they are co-cultured with K. vulgare. Here, we report the complete genome sequencing of B. thuringiensis Bc601 and eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomics analysis. We found that B. thuringiensis Bc601, which has a greater ability to respond to the external environment, has more two-component system, sporulation coat and peptidoglycan biosynthesis related proteins than B. endophyticus Hbe603, whereas B. endophyticus Hbe603, which has a greater ability for nutrient biosynthesis, has more alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation related proteins than B. thuringiensis Bc601. Differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms of the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis enables us to further understand the cooperative mechanism with K. vulgare, and facilitates the optimization of the bacterial consortium. PMID:27353048

  2. Comparative genomics analysis of the companion mechanisms of Bacillus thuringiensis Bc601 and Bacillus endophyticus Hbe603 in bacterial consortium

    PubMed Central

    Jia, Nan; Ding, Ming-Zhu; Gao, Feng; Yuan, Ying-Jin

    2016-01-01

    Bacillus thuringiensis and Bacillus endophyticus both act as companion bacteria that cooperate with Ketogulonigenium vulgare in the two-step fermentation of vitamin C. The two Bacillus species differ in morphology, swarming motility and 2-keto-L-gulonic acid productivity when co-cultured with K. vulgare. Here, we report the complete genome sequence of B. thuringiensis Bc601 and of eight plasmids of B. endophyticus Hbe603, and carry out a comparative genomic analysis. B. thuringiensis Bc601, which has a greater ability to respond to the external environment, was found to have more proteins related to two-component systems, sporulation coat and peptidoglycan biosynthesis than B. endophyticus Hbe603. B. endophyticus Hbe603, which has a greater capacity for nutrient biosynthesis, was found to have more proteins related to alpha-galactosidase, propanoate, glutathione and inositol phosphate metabolism, and amino acid degradation than B. thuringiensis Bc601. These differences in swarming motility, response to the external environment and nutrient biosynthesis may reflect different companion mechanisms in the two Bacillus species. Comparative genomic analysis of B. endophyticus and B. thuringiensis helps us to further understand their cooperative mechanism with K. vulgare and facilitates the optimization of the bacterial consortium. PMID:27353048

  3. Unmet Challenges of Structural Genomics

    PubMed Central

    Chruszcz, Maksymilian; Domagalski, Marcin; Osinski, Tomasz; Wlodawer, Alexander; Minor, Wladek

    2010-01-01

    Summary Structural genomics (SG) programs have developed during the last decade many novel methodologies for faster and more accurate structure determination. These new tools and approaches led to determination of thousands of protein structures. The generation of enormous amounts of experimental data resulted in significant improvements in the understanding of many biological processes at molecular levels. However, the amount of data collected so far is so large that traditional analysis methods are limiting the rate of extraction of biological and biochemical information from 3-D models. This situation has prompted us to review the challenges that remain unmet by structural genomics, as well as the areas in which the potential impact of SG could exceed what has been achieved so far. PMID:20810277

  4. 78 FR 47674 - Genome in a Bottle Consortium-Progress and Planning Workshop

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-06

    ...: select appropriate sources for whole genome RMs and identify or design synthetic DNA constructs that... and synthetic DNA RMs along with the methods (documentary standards) and reference data necessary...

  5. Genome-wide Association Studies of MRI-defined Brain Infarcts: Meta-analysis from the CHARGE Consortium

    PubMed Central

    Debette, Stephanie; Bis, Joshua C.; Fornage, Myriam; Schmidt, Helena; Ikram, M. Arfan; Sigurdsson, Sigurdur; Heiss, Gerardo; Struchalin, Maksim; Smith, Albert V.; van der Lugt, Aad; DeCarli, Charles; Lumley, Thomas; Knopman, David S.; Enzinger, Christian; Eiriksdottir, Gudny; Koudstaal, Peter J.; DeStefano, Anita L.; Psaty, Bruce M.; Dufouil, Carole; Catellier, Diane J.; Fazekas, Franz; Aspelund, Thor; Aulchenko, Yurii S.; Beiser, Alexa; Rotter, Jerome I.; Tzourio, Christophe; Shibata, Dean K.; Tscherner, Maria; Harris, Tamara B.; Rivadeneira, Fernando; Atwood, Larry D.; Rice, Kenneth; Gottesman, Rebecca F.; van Buchem, Mark A.; Uitterlinden, Andre G.; Kelly-Hayes, Margaret; Cushman, Mary; Zhu, Yicheng; Boerwinkle, Eric; Gudnason, Vilmundur; Hofman, Albert; Romero, Jose R.; Lopez, Oscar; van Duijn, Cornelia M.; Au, Rhoda; Heckbert, Susan R.; Wolf, Philip A.; Mosley, Thomas H.; Seshadri, Sudha; Breteler, Monique M.B.; Schmidt, Reinhold; Launer, Lenore J.; Longstreth, WT

    2010-01-01

    Background: Previous studies examining genetic associations with MRI-defined brain infarct have yielded inconsistent findings. We investigated genetic variation underlying covert MRI-infarct in persons without histories of transient ischemic attack or stroke. We performed a meta-analysis of genome-wide association studies of white participants in 6 studies comprising the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium. Methods: Using 2.2 million genotyped and imputed SNPs, each study performed cross-sectional genome-wide association analysis of MRI-infarct using age- and sex-adjusted logistic regression models. Study-specific findings were combined in an inverse-variance weighted meta-analysis, including 9401 participants with mean age 69.7, 19.4% of whom had ≥1 MRI-infarct. Results: The most significant association was found with rs2208454 (minor allele frequency: 20%), located in intron 3 of the MACRO Domain Containing 2 gene and in the downstream region of the Fibronectin Leucine Rich Transmembrane Protein 3 gene. Each copy of the minor allele was associated with lower risk of MRI-infarcts: odds ratio=0.76, 95% confidence interval=0.68–0.84, p=4.64×10^−7. Highly suggestive associations (p<1.0×10^−5) were also found for 22 other SNPs in linkage disequilibrium (r^2>0.64) with rs2208454. The association with rs2208454 did not replicate in independent samples of 1822 white and 644 African-American participants, although 4 SNPs within 200 kb of rs2208454 were associated with MRI-infarcts in the African-American sample. Conclusions: This first community-based, genome-wide association study on covert MRI-infarcts uncovered novel associations. Although replication of the association with the top SNP failed, possibly due to insufficient power, results in the African-American sample are encouraging, and further efforts at replication are needed. PMID:20044523
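
    The inverse-variance weighted meta-analysis named in the Methods can be sketched as follows. This is a minimal fixed-effect illustration, not the consortium's actual pipeline: the function names and the per-study inputs are hypothetical, and real GWAS meta-analyses are normally run per SNP with dedicated software.

```python
import math


def inverse_variance_meta(estimates, std_errors):
    """Pool per-study effect estimates with fixed-effect inverse-variance weights.

    Each study is weighted by 1 / SE^2, so more precise studies count more.
    Returns (pooled_estimate, pooled_standard_error).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


def odds_ratio_ci(log_or, se, z=1.96):
    """Convert a log odds ratio and its SE to (OR, lower, upper) for a ~95% CI."""
    return (math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se))


# Hypothetical per-study log odds ratios and standard errors (NOT from the paper).
study_log_or = [-0.30, -0.22, -0.35]
study_se = [0.10, 0.15, 0.20]

pooled_log_or, pooled_se = inverse_variance_meta(study_log_or, study_se)
odds_ratio, ci_low, ci_high = odds_ratio_ci(pooled_log_or, pooled_se)
print(f"pooled OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

    Pooling on the log-odds scale and exponentiating at the end keeps the confidence interval symmetric where the estimate is approximately normal; the pooled standard error is always smaller than that of the most precise single study.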

  6. A genome-wide approach to children's aggressive behavior: The EAGLE consortium.

    PubMed

    Pappa, Irene; St Pourcain, Beate; Benke, Kelly; Cavadino, Alana; Hakulinen, Christian; Nivard, Michel G; Nolte, Ilja M; Tiesler, Carla M T; Bakermans-Kranenburg, Marian J; Davies, Gareth E; Evans, David M; Geoffroy, Marie-Claude; Grallert, Harald; Groen-Blokhuis, Maria M; Hudziak, James J; Kemp, John P; Keltikangas-Järvinen, Liisa; McMahon, George; Mileva-Seitz, Viara R; Motazedi, Ehsan; Power, Christine; Raitakari, Olli T; Ring, Susan M; Rivadeneira, Fernando; Rodriguez, Alina; Scheet, Paul A; Seppälä, Ilkka; Snieder, Harold; Standl, Marie; Thiering, Elisabeth; Timpson, Nicholas J; Veenstra, René; Velders, Fleur P; Whitehouse, Andrew J O; Smith, George Davey; Heinrich, Joachim; Hypponen, Elina; Lehtimäki, Terho; Middeldorp, Christel M; Oldehinkel, Albertine J; Pennell, Craig E; Boomsma, Dorret I; Tiemeier, Henning

    2016-07-01

    Individual differences in aggressive behavior emerge in early childhood and predict persisting behavioral problems and disorders. Studies of antisocial and severe aggression in adulthood indicate substantial underlying biology. However, little attention has been given to genome-wide approaches of aggressive behavior in children. We analyzed data from nine population-based studies and assessed aggressive behavior using well-validated parent-reported questionnaires. This is the largest sample exploring children's aggressive behavior to date (N = 18,988), with measures in two developmental stages (N = 15,668 early childhood and N = 16,311 middle childhood/early adolescence). First, we estimated the additive genetic variance of children's aggressive behavior based on genome-wide SNP information, using genome-wide complex trait analysis (GCTA). Second, genetic associations within each study were assessed using a quasi-Poisson regression approach, capturing the highly right-skewed distribution of aggressive behavior. Third, we performed meta-analyses of genome-wide associations for both the total age-mixed sample and the two developmental stages. Finally, we performed a gene-based test using the summary statistics of the total sample. GCTA quantified variance tagged by common SNPs (10-54%). The meta-analysis of the total sample identified one region in chromosome 2 (2p12) at near genome-wide significance (top SNP rs11126630, P = 5.30 × 10(-8) ). The separate meta-analyses of the two developmental stages revealed suggestive evidence of association at the same locus. The gene-based analysis indicated association of variation within AVPR1A with aggressive behavior. We conclude that common variants at 2p12 show suggestive evidence for association with childhood aggression. Replication of these initial findings is needed, and further studies should clarify its biological meaning. © 2015 Wiley Periodicals, Inc.

  7. Structural genomics of infectious disease drug targets: the SSGCID

    PubMed Central

    Stacy, Robin; Begley, Darren W.; Phan, Isabelle; Staker, Bart L.; Van Voorhis, Wesley C.; Varani, Gabriele; Buchko, Garry W.; Stewart, Lance J.; Myler, Peter J.

    2011-01-01

    The Seattle Structural Genomics Center for Infectious Disease (SSGCID) is a consortium of researchers at Seattle BioMed, Emerald BioStructures, the University of Washington and Pacific Northwest National Laboratory that was established to apply structural genomics approaches to drug targets from infectious disease organisms. The SSGCID is currently funded over a five-year period by the National Institute of Allergy and Infectious Diseases (NIAID) to determine the three-dimensional structures of 400 proteins from a variety of Category A, B and C pathogens. Target selection engages the infectious disease research and drug-therapy communities to identify drug targets, essential enzymes, virulence factors and vaccine candidates of biomedical relevance to combat infectious diseases. The protein-expression systems, purified proteins, ligand screens and three-dimensional structures produced by SSGCID constitute a valuable resource for drug-discovery research, all of which is made freely available to the greater scientific community. This issue of Acta Crystallographica Section F, entirely devoted to the work of the SSGCID, covers the details of the high-throughput pipeline and presents a series of structures from a broad array of pathogenic organisms. Here, a background is provided on the structural genomics of infectious disease, the essential components of the SSGCID pipeline are discussed and a survey of progress to date is presented. PMID:21904037

  8. 2004 Structural, Function and Evolutionary Genomics

    SciTech Connect

    Douglas L. Brutlag; Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  9. Successive changes in community structure of an ethylbenzene-degrading sulfate-reducing consortium.

    PubMed

    Nakagawa, Tatsunori; Sato, Shinya; Yamamoto, Yoko; Fukui, Manabu

    2002-06-01

    The microbial community structure and its successive changes in a mesophilic ethylbenzene-degrading sulfate-reducing consortium were clarified for the first time by denaturing gradient gel electrophoresis (DGGE) analysis of PCR-amplified 16S rRNA gene fragments. At least ten bands were detected on the DGGE gel in the stationary phase. Phylogenetic analysis of the DGGE bands revealed that the consortium consisted of different eubacterial phyla, including the delta subgroup of Proteobacteria, the order Sphingobacteriales, the order Spirochaetales, and an unknown bacterium. The most abundant band, C, was closely related to strain mXyS1, an m-xylene-degrading sulfate-reducing bacterium (SRB), and occurred as the sole band on DGGE gels in the logarithmic growth phase, during which 40% of the ethylbenzene was consumed with accompanying sulfide production. During further prolonged incubation, the dominance of band C did not change. These results suggest that the SRB corresponding to the most abundant band C contributes mainly to the degradation of ethylbenzene coupled with sulfate reduction.

  10. Genome-wide Studies of Verbal Declarative Memory in Nondemented Older People: The Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium

    PubMed Central

    Debette, Stéphanie; Ibrahim Verbaas, Carla A.; Bressler, Jan; Schuur, Maaike; Smith, Albert; Bis, Joshua C.; Davies, Gail; Wolf, Christiane; Gudnason, Vilmundur; Chibnik, Lori B.; Yang, Qiong; deStefano, Anita L.; de Quervain, Dominique J.F.; Srikanth, Velandai; Lahti, Jari; Grabe, Hans J.; Smith, Jennifer A.; Priebe, Lutz; Yu, Lei; Karbalai, Nazanin; Hayward, Caroline; Wilson, James F.; Campbell, Harry; Petrovic, Katja; Fornage, Myriam; Chauhan, Ganesh; Yeo, Robin; Boxall, Ruth; Becker, James; Stegle, Oliver; Mather, Karen A.; Chouraki, Vincent; Sun, Qi; Rose, Lynda M.; Resnick, Susan; Oldmeadow, Christopher; Kirin, Mirna; Wright, Alan F.; Jonsdottir, Maria K.; Au, Rhoda; Becker, Albert; Amin, Najaf; Nalls, Mike A.; Turner, Stephen T.; Kardia, Sharon L.R.; Oostra, Ben; Windham, Gwen; Coker, Laura H.; Zhao, Wei; Knopman, David S.; Heiss, Gerardo; Griswold, Michael E.; Gottesman, Rebecca F.; Vitart, Veronique; Hastie, Nicholas D.; Zgaga, Lina; Rudan, Igor; Polasek, Ozren; Holliday, Elizabeth G.; Schofield, Peter; Choi, Seung Hoan; Tanaka, Toshiko; An, Yang; Perry, Rodney T.; Kennedy, Richard E.; Sale, Michèle M.; Wang, Jing; Wadley, Virginia G.; Liewald, David C.; Ridker, Paul M.; Gow, Alan J.; Pattie, Alison; Starr, John M.; Porteous, David; Liu, Xuan; Thomson, Russell; Armstrong, Nicola J.; Eiriksdottir, Gudny; Assareh, Arezoo A.; Kochan, Nicole A.; Widen, Elisabeth; Palotie, Aarno; Hsieh, Yi-Chen; Eriksson, Johan G.; Vogler, Christian; van Swieten, John C.; Shulman, Joshua M.; Beiser, Alexa; Rotter, Jerome; Schmidt, Carsten O.; Hoffmann, Wolfgang; Nöthen, Markus M.; Ferrucci, Luigi; Attia, John; Uitterlinden, Andre G.; Amouyel, Philippe; Dartigues, Jean-François; Amieva, Hélène; Räikkönen, Katri; Garcia, Melissa; Wolf, Philip A.; Hofman, Albert; Longstreth, W.T.; Psaty, Bruce M.; Boerwinkle, Eric; DeJager, Philip L.; Sachdev, Perminder S.; Schmidt, Reinhold; Breteler, Monique M.B.; Teumer, Alexander; Lopez, Oscar L.; Cichon, Sven; Chasman, Daniel I.; 
    Grodstein, Francine; Müller-Myhsok, Bertram; Tzourio, Christophe; Papassotiropoulos, Andreas; Bennett, David A.; Ikram, Arfan M.; Deary, Ian J.; van Duijn, Cornelia M.; Launer, Lenore; Fitzpatrick, Annette L.; Seshadri, Sudha; Mosley, Thomas H.

    2015-01-01

    BACKGROUND: Memory performance in older persons can reflect genetic influences on cognitive function and dementing processes. We aimed to identify genetic contributions to verbal declarative memory in a community setting. METHODS: We conducted genome-wide association studies for paragraph or word list delayed recall in 19 cohorts from the Cohorts for Heart and Aging Research in Genomic Epidemiology consortium, comprising 29,076 dementia- and stroke-free individuals of European descent, aged ≥45 years. Replication of suggestive associations (p < 5 × 10^−6) was sought in 10,617 participants of European descent, 3811 African-Americans, and 1561 young adults. RESULTS: rs4420638, near APOE, was associated with poorer delayed recall performance in discovery (p = 5.57 × 10^−10) and replication cohorts (p = 5.65 × 10^−8). This association was stronger for paragraph than word list delayed recall and in the oldest persons. Two associations with specific tests, in subsets of the total sample, reached genome-wide significance in combined analyses of discovery and replication (rs11074779 [HS3ST4], p = 3.11 × 10^−8, and rs6813517 [SPOCK3], p = 2.58 × 10^−8) near genes involved in immune response. A genetic score combining 58 independent suggestive memory risk variants was associated with increasing Alzheimer disease pathology in 725 autopsy samples. Association of memory risk loci with gene expression in 138 human hippocampus samples showed cis-associations with WDR48 and CLDN5, both related to ubiquitin metabolism. CONCLUSIONS: This study, the largest to date exploring the genetics of memory function, covered ~40,000 older individuals, revealed genome-wide associations and suggested an involvement of immune and ubiquitin pathways. PMID:25648963

  11. A genome-wide association study for venous thromboembolism: the extended cohorts for heart and aging research in genomic epidemiology (CHARGE) consortium.

    PubMed

    Tang, Weihong; Teichert, Martina; Chasman, Daniel I; Heit, John A; Morange, Pierre-Emmanuel; Li, Guo; Pankratz, Nathan; Leebeek, Frank W; Paré, Guillaume; de Andrade, Mariza; Tzourio, Christophe; Psaty, Bruce M; Basu, Saonli; Ruiter, Rikje; Rose, Lynda; Armasu, Sebastian M; Lumley, Thomas; Heckbert, Susan R; Uitterlinden, André G; Lathrop, Mark; Rice, Kenneth M; Cushman, Mary; Hofman, Albert; Lambert, Jean-Charles; Glazer, Nicole L; Pankow, James S; Witteman, Jacqueline C; Amouyel, Philippe; Bis, Joshua C; Bovill, Edwin G; Kong, Xiaoxiao; Tracy, Russell P; Boerwinkle, Eric; Rotter, Jerome I; Trégouët, David-Alexandre; Loth, Daan W; Stricker, Bruno H Ch; Ridker, Paul M; Folsom, Aaron R; Smith, Nicholas L

    2013-07-01

    Venous thromboembolism (VTE) is a common, heritable disease resulting in high rates of hospitalization and mortality. Yet few associations between VTE and genetic variants, all in the coagulation pathway, have been established. To identify additional genetic determinants of VTE, we conducted a two-stage genome-wide association study (GWAS) among individuals of European ancestry in the extended cohorts for heart and aging research in genomic epidemiology (CHARGE) VTE consortium. The discovery GWAS comprised 1,618 incident VTE cases out of 44,499 participants from six community-based studies. Genotypes for genome-wide single-nucleotide polymorphisms (SNPs) were imputed to approximately 2.5 million SNPs in HapMap and association with VTE assessed using study-design appropriate regression methods. Meta-analysis of these results identified two known loci, in F5 and ABO. Top 1,047 tag SNPs (P ≤ 0.0016) from the discovery GWAS were tested for association in an additional 3,231 cases and 3,536 controls from three case-control studies. In the combined data from these two stages, additional genome-wide significant associations were observed on 4q35 at F11 (top SNP rs4253399, intronic to F11) and on 4q28 at FGG (rs6536024, 9.7 kb from FGG; P < 5.0 × 10(-13) for both). The associations at the FGG locus were not completely explained by previously reported variants. Loci at or near SUSD1 and OTUD7A showed borderline yet novel associations (P < 5.0 × 10(-6) ) and constitute new candidate genes. In conclusion, this large GWAS replicated key genetic associations in F5 and ABO, and confirmed the importance of F11 and FGG loci for VTE. Future studies are warranted to better characterize the associations with F11 and FGG and to replicate the new candidate associations.

  12. NRXN3 Is a Novel Locus for Waist Circumference: A Genome-Wide Association Study from the CHARGE Consortium

    PubMed Central

    Aspelund, Thor; Eiriksdottir, Gudny; Garcia, Melissa; Launer, Lenore J.; Smith, Albert V.; Mitchell, Braxton D.; McArdle, Patrick F.; Shuldiner, Alan R.; Bielinski, Suzette J.; Boerwinkle, Eric; Brancati, Fred; Demerath, Ellen W.; Pankow, James S.; Arnold, Alice M.; Chen, Yii-Der Ida; Glazer, Nicole L.; McKnight, Barbara; Psaty, Bruce M.; Rotter, Jerome I.; Amin, Najaf; Campbell, Harry; Gyllensten, Ulf; Pattaro, Cristian; Pramstaller, Peter P.; Rudan, Igor; Struchalin, Maksim; Vitart, Veronique; Gao, Xiaoyi; Kraja, Aldi; Province, Michael A.; Zhang, Qunyuan; Atwood, Larry D.; Dupuis, Josée; Hirschhorn, Joel N.; Jaquish, Cashell E.; O'Donnell, Christopher J.; Vasan, Ramachandran S.; White, Charles C.; Aulchenko, Yurii S.; Estrada, Karol; Hofman, Albert; Rivadeneira, Fernando; Uitterlinden, André G.; Witteman, Jacqueline C. M.; Oostra, Ben A.; Kaplan, Robert C.; Gudnason, Vilmundur; O'Connell, Jeffrey R.; Borecki, Ingrid B.; van Duijn, Cornelia M.; Cupples, L. Adrienne; Fox, Caroline S.; North, Kari E.

    2009-01-01

    Central abdominal fat is a strong risk factor for diabetes and cardiovascular disease. To identify common variants influencing central abdominal fat, we conducted a two-stage genome-wide association analysis for waist circumference (WC). In total, three loci reached genome-wide significance. In stage 1, 31,373 individuals of Caucasian descent from eight cohort studies confirmed the role of FTO and MC4R and identified one novel locus associated with WC in the neurexin 3 gene [NRXN3 (rs10146997, p = 6.4×10^−7)]. The association with NRXN3 was confirmed in stage 2 by combining stage 1 results with those from 38,641 participants in the GIANT consortium (p = 0.009 in GIANT only, p = 5.3×10^−8 for combined analysis, n = 70,014). Mean WC increase per copy of the G allele was 0.0498 z-score units (0.65 cm). This SNP was also associated with body mass index (BMI) [p = 7.4×10^−6, 0.024 z-score units (0.10 kg/m^2) per copy of the G allele] and the risk of obesity (odds ratio 1.13, 95% CI 1.07–1.19; p = 3.2×10^−5 per copy of the G allele). The NRXN3 gene has been previously implicated in addiction and reward behavior, lending further evidence that common forms of obesity may be a central nervous system-mediated disorder. Our findings establish that common variants in NRXN3 are associated with WC, BMI, and obesity. PMID:19557197
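
    The abstract quotes each per-allele effect twice, once in z-score units and once in raw units, which allows a rough back-calculation of the trait standard deviation used for standardization. The sketch below is illustrative only: the helper names are hypothetical, the inputs are the rounded figures from the abstract, and the individual cohorts may well have used different standard deviations.

```python
def implied_sd(effect_raw, effect_z):
    """Return the trait SD implied by one effect given in raw and z-score units."""
    if effect_z == 0:
        raise ValueError("z-score effect must be non-zero")
    return effect_raw / effect_z


def z_to_raw(effect_z, sd):
    """Convert a per-allele effect from z-score units to raw trait units."""
    return effect_z * sd


# Per-allele effects quoted in the abstract (rounded there, so results are approximate).
wc_sd = implied_sd(0.65, 0.0498)   # waist circumference: cm per z-score unit
bmi_sd = implied_sd(0.10, 0.024)   # BMI: kg/m^2 per z-score unit

print(f"implied WC SD  ~ {wc_sd:.1f} cm")
print(f"implied BMI SD ~ {bmi_sd:.1f} kg/m^2")
```

    Both implied values fall in a plausible range for adult populations, which is a quick consistency check on the two ways the effect is reported.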

  13. Novel loci associated with usual sleep duration: the CHARGE Consortium Genome-Wide Association Study.

    PubMed

    Gottlieb, D J; Hek, K; Chen, T-H; Watson, N F; Eiriksdottir, G; Byrne, E M; Cornelis, M; Warby, S C; Bandinelli, S; Cherkas, L; Evans, D S; Grabe, H J; Lahti, J; Li, M; Lehtimäki, T; Lumley, T; Marciante, K D; Pérusse, L; Psaty, B M; Robbins, J; Tranah, G J; Vink, J M; Wilk, J B; Stafford, J M; Bellis, C; Biffar, R; Bouchard, C; Cade, B; Curhan, G C; Eriksson, J G; Ewert, R; Ferrucci, L; Fülöp, T; Gehrman, P R; Goodloe, R; Harris, T B; Heath, A C; Hernandez, D; Hofman, A; Hottenga, J-J; Hunter, D J; Jensen, M K; Johnson, A D; Kähönen, M; Kao, L; Kraft, P; Larkin, E K; Lauderdale, D S; Luik, A I; Medici, M; Montgomery, G W; Palotie, A; Patel, S R; Pistis, G; Porcu, E; Quaye, L; Raitakari, O; Redline, S; Rimm, E B; Rotter, J I; Smith, A V; Spector, T D; Teumer, A; Uitterlinden, A G; Vohl, M-C; Widen, E; Willemsen, G; Young, T; Zhang, X; Liu, Y; Blangero, J; Boomsma, D I; Gudnason, V; Hu, F; Mangino, M; Martin, N G; O'Connor, G T; Stone, K L; Tanaka, T; Viikari, J; Gharib, S A; Punjabi, N M; Räikkönen, K; Völzke, H; Mignot, E; Tiemeier, H

    2015-10-01

    Usual sleep duration is a heritable trait correlated with psychiatric morbidity, cardiometabolic disease and mortality, although little is known about the genetic variants influencing this trait. A genome-wide association study (GWAS) of usual sleep duration was conducted using 18 population-based cohorts totaling 47 180 individuals of European ancestry. Genome-wide significant association was identified at two loci. The strongest is located on chromosome 2, in an intergenic region 35- to 80-kb upstream from the thyroid-specific transcription factor PAX8 (lowest P=1.1 × 10(-9)). This finding was replicated in an African-American sample of 4771 individuals (lowest P=9.3 × 10(-4)). The strongest combined association was at rs1823125 (P=1.5 × 10(-10), minor allele frequency 0.26 in the discovery sample, 0.12 in the replication sample), with each copy of the minor allele associated with a sleep duration 3.1 min longer per night. The alleles associated with longer sleep duration were associated in previous GWAS with a more favorable metabolic profile and a lower risk of attention deficit hyperactivity disorder. Understanding the mechanisms underlying these associations may help elucidate biological mechanisms influencing sleep duration and its association with psychiatric, metabolic and cardiovascular disease.

  14. Effect of trichloroethylene and tetrachloroethylene on methane oxidation and community structure of methanotrophic consortium.

    PubMed

    Choi, Sun-Ah; Lee, Eun-Hee; Cho, Kyung-Suk

    2013-01-01

    The methane oxidation rate and community structure of a methanotrophic consortium were analyzed to determine the effects of trichloroethylene (TCE) and tetrachloroethylene (PCE) on methane oxidation. The maximum methane oxidation rate (Vmax) of the consortium was 326.8 μmol·g-dry biomass(-1)·h(-1), and it had a half-saturation constant (Km) of 143.8 μM. The addition of TCE or PCE resulted in decreased methane oxidation rates, which were decreased from 101.73 to 5.47-24.64 μmol·g-dry biomass(-1)·h(-1) with an increase in the TCE-to-methane ratio, and to 61.95-67.43 μmol·g-dry biomass(-1)·h(-1) with an increase in the PCE-to-methane ratio. TCE and PCE were non-competitive inhibitors for methane oxidation, and their inhibition constants (Ki) were 33.4 and 132.0 μM, respectively. When the methanotrophic community was analyzed based on pmoA using quantitative real-time PCR (qRT-PCR), the pmoA gene copy numbers were shown to decrease from 7.3 ± 0.7 × 10(8) to 2.1-5.0 × 10(7) pmoA gene copy number · g-dry biomass(-1) with an increase in the TCE-to-methane ratio and to 2.5-7.0 × 10(7) pmoA gene copy number · g-dry biomass(-1) with an increase in the PCE-to-methane ratio. Community analysis by microarray demonstrated that Methylocystis (type II methanotrophs) were the most abundant in the methanotrophic community composition in the presence of TCE. These results suggest that toxic effects caused by TCE and PCE change not only methane oxidation rates but also the community structure of the methanotrophic consortium.
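
    To illustrate how the reported kinetic constants interact, the sketch below evaluates the textbook rate law for pure non-competitive inhibition, v = Vmax·S / ((Km + S)(1 + I/Ki)). Using this particular form is an assumption: the abstract states only that inhibition was non-competitive, so the authors' fitted model may differ. The function name and the methane and inhibitor concentrations are hypothetical.

```python
def methane_oxidation_rate(substrate_um, inhibitor_um, vmax, km_um, ki_um):
    """Rate under pure non-competitive inhibition (textbook form, assumed here).

    v = Vmax * S / ((Km + S) * (1 + I / Ki))
    The inhibitor lowers the apparent Vmax and leaves Km unchanged.
    """
    if substrate_um < 0 or inhibitor_um < 0:
        raise ValueError("concentrations must be non-negative")
    return vmax * substrate_um / ((km_um + substrate_um) * (1.0 + inhibitor_um / ki_um))


# Kinetic constants quoted in the abstract.
VMAX = 326.8                      # umol per g dry biomass per hour
KM = 143.8                        # uM methane
KI = {"TCE": 33.4, "PCE": 132.0}  # uM

# Hypothetical concentrations, chosen only for illustration.
methane_um = 100.0
inhibitor_um = 50.0

for name, ki in KI.items():
    v = methane_oxidation_rate(methane_um, inhibitor_um, VMAX, KM, ki)
    print(f"{name}: v = {v:.1f} umol/g-dry biomass/h")
```

    Because the Ki reported for TCE is about a quarter of that for PCE, the same inhibitor concentration suppresses the rate more strongly for TCE under this model, in line with the larger rate decrease the abstract reports for TCE.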

  15. Genome-wide association studies of cerebral white matter lesion burden: the CHARGE Consortium

    PubMed Central

    Fornage, Myriam; Debette, Stephanie; Bis, Joshua C.; Schmidt, Helena; Ikram, M. Arfan; Dufouil, Carole; Sigurdsson, Sigurdur; Lumley, Thomas; DeStefano, Anita L.; Fazekas, Franz; Vrooman, Henri A.; Shibata, Dean K.; Maillard, Pauline; Zijdenbos, Alex; Smith, Albert V.; Gudnason, Haukur; de Boer, Renske; Cushman, Mary; Mazoyer, Bernard; Heiss, Gerardo; Vernooij, Meike W.; Enzinger, Christian; Glazer, Nicole L.; Beiser, Alexa; Knopman, David S.; Cavalieri, Margherita; Niessen, Wiro J.; Harris, Tamara B.; Petrovic, Katja; Lopez, Oscar L.; Au, Rhoda; Lambert, Jean-Charles; Hofman, Albert; Gottesman, Rebecca F.; Garcia, Melissa; Heckbert, Susan R.; Atwood, Larry D.; Catellier, Diane J.; Uitterlinden, Andre G.; Yang, Qiong; Smith, Nicholas L.; Aspelund, Thor; Romero, Jose R.; Rice, Kenneth; Taylor, Kent D.; Nalls, Michael A.; Rotter, Jerome I.; Sharret, Richey; van Duijn, Cornelia M.; Amouyel, Philippe; Wolf, Philip A.; Gudnason, Vilmundur; van der Lugt, Aad; Boerwinkle, Eric; Psaty, Bruce M.; Seshadri, Sudha; Tzourio, Christophe; Breteler, Monique M.B.; Mosley, Thomas H.; Schmidt, Reinhold; Longstreth, W.T.; DeCarli, Charles; Launer, Lenore J.

    2011-01-01

    Objective: White matter hyperintensities (WMH) detectable by magnetic resonance imaging (MRI) are part of the spectrum of vascular injury associated with aging of the brain and are thought to reflect ischemic damage to the small deep cerebral vessels. WMH are associated with an increased risk of cognitive and motor dysfunction, dementia, depression, and stroke. Despite a significant heritability, few genetic loci influencing WMH burden have been identified. Methods: We performed a meta-analysis of genome-wide association studies (GWAS) for WMH burden in 9,361 stroke-free individuals of European descent from 7 community-based cohorts. Significant findings were tested for replication in 3,024 individuals from 2 additional cohorts. Results: We identified 6 novel risk-associated single nucleotide polymorphisms (SNPs) in one locus on chromosome 17q25 encompassing 6 known genes including WBP2, TRIM65, TRIM47, MRPL38, FBF1, and ACOX1. The most significant association was for rs3744028 (P_discovery = 4.0×10^−9; P_replication = 1.3×10^−7; P_combined = 4.0×10^−15). Other SNPs in this region also reaching genome-wide significance are rs9894383 (P=5.3×10^−9), rs11869977 (P=5.7×10^−9), rs936393 (P=6.8×10^−9), rs3744017 (P=7.3×10^−9), and rs1055129 (P=4.1×10^−8). Variant alleles at these loci conferred a small increase in WMH burden (4–8% of the overall mean WMH burden in the sample). Interpretation: This large GWAS of WMH burden in community-based cohorts of individuals of European descent identifies a novel locus on chromosome 17. Further characterization of this locus may provide novel insights into the pathogenesis of cerebral WMH. PMID:21681796

  16. A genome-wide association study for venous thromboembolism: the extended Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium

    PubMed Central

    Pankratz, Nathan; Leebeek, Frank W.; Paré, Guillaume; de Andrade, Mariza; Tzourio, Christophe; Psaty, Bruce M.; Basu, Saonli; Ruiter, Rikje; Rose, Lynda; Armasu, Sebastian M.; Lumley, Thomas; Heckbert, Susan R.; Uitterlinden, André G.; Lathrop, Mark; Rice, Kenneth M.; Cushman, Mary; Hofman, Albert; Lambert, Jean-Charles; Glazer, Nicole L.; Pankow, James S.; Witteman, Jacqueline C.; Amouyel, Philippe; Bis, Joshua C.; Bovill, Edwin G.; Kong, Xiaoxiao; Tracy, Russell P.; Boerwinkle, Eric; Rotter, Jerome I.; Trégouët, David-Alexandre; Loth, Daan W.

    2014-01-01

    Venous thromboembolism (VTE) is a common, heritable disease resulting in high rates of hospitalization and mortality. Yet few associations between VTE and genetic variants, all in the coagulation pathway, have been established. To identify additional genetic determinants of VTE, we conducted a 2-stage genome-wide association study (GWAS) among individuals of European ancestry in the extended CHARGE VTE consortium. The discovery GWAS comprised 1,618 incident VTE cases out of 44,499 participants from six community-based studies. Genotypes for genome-wide single-nucleotide polymorphisms (SNPs) were imputed to ~2.5 million SNPs in HapMap and association with VTE assessed using study-design appropriate regression methods. Meta-analysis of these results identified two known loci, in F5 and ABO. Top 1,047 tag SNPs (p≤0.0016) from the discovery GWAS were tested for association in an additional 3,231 cases and 3,536 controls from three case-control studies. In the combined data from these two stages, additional genome-wide significant associations were observed on 4q35 at F11 (top SNP rs4253399, intronic to F11) and on 4q28 at FGG (rs6536024, 9.7 kb from FGG) (p<5.0×10^−13 for both). The associations at the FGG locus were not completely explained by previously reported variants. Loci at or near SUSD1 and OTUD7A showed borderline yet novel associations (p<5.0×10^−6) and constitute new candidate genes. In conclusion, this large GWAS replicated key genetic associations in F5 and ABO, and confirmed the importance of F11 and FGG loci for VTE. Future studies are warranted to better characterize the associations with F11 and FGG and to replicate the new candidate associations. PMID:23650146

  17. An Integrated Functional Genomics Consortium to Increase Carbon Sequestration in Poplars: Optimizing Aboveground Carbon Gain

    SciTech Connect

    Karnosky, David F; Podila, G Krishna; Burton, Andrew J

    2009-02-17

    This project used gene expression patterns from two forest Free-Air CO2 Enrichment (FACE) experiments (Aspen FACE in northern Wisconsin and POPFACE in Italy) to examine ways to increase the aboveground carbon sequestration potential of poplars (Populus). The aim was to use patterns of global gene expression to identify candidate genes for increased carbon sequestration. Gene expression studies were linked to physiological measurements in order to elucidate bottlenecks in carbon acquisition in trees grown in elevated CO2 conditions. Delayed senescence allowing additional carbon uptake late in the growing season, was also examined, and expression of target genes was tested in elite P. deltoides x P. trichocarpa hybrids. In Populus euramericana, gene expression was sensitive to elevated CO2, but the response depended on the developmental age of the leaves. Most differentially expressed genes were upregulated in elevated CO2 in young leaves, while most were downregulated in elevated CO2 in semi-mature leaves. In P. deltoides x P. trichocarpa hybrids, leaf development and leaf quality traits, including leaf area, leaf shape, epidermal cell area, stomatal number, specific leaf area, and canopy senescence were sensitive to elevated CO2. Significant increases under elevated CO2 occurred for both above- and belowground growth in the F-2 generation. Three areas of the genome played a role in determining aboveground growth response to elevated CO2, with three additional areas of the genome important in determining belowground growth responses to elevated CO2. In Populus tremuloides, CO2-responsive genes in leaves were found to differ between two aspen clones that showed different growth responses, despite similarity in many physiological parameters (photosynthesis, stomatal conductance, and leaf area index). The CO2-responsive clone shunted C into pathways associated with active defense/response to stress, carbohydrate/starch biosynthesis and subsequent growth. The CO2

  18. The National Astronomy Consortium Summer Student Research Program at NRAO-Socorro: Year 2 structure

    NASA Astrophysics Data System (ADS)

    Mills, Elisabeth A.; Sheth, Kartik; Giles, Faye; Perez, Laura M.; Arancibia, Demian; Burke-Spolaor, Sarah

    2016-01-01

    I will present a summary of the program structure used for the second year of hosting a summer student research cohort of the National Astronomy Consortium (NAC) at the National Radio Astronomy Observatory in Socorro, NM. The NAC is a program partnering physics and astronomy departments in majority and minority-serving institutions across the country. The primary aim of this program is to support traditionally underrepresented students interested in pursuing a career in STEM through a 9-10 week summer astronomy research project and a year of additional mentoring after they return to their home institution. I will describe the research, professional development, and inclusivity goals of the program, and show how these were used to create a weekly syllabus for the summer. I will also highlight several unique aspects of this program, including the recruitment of remote mentors for students to better balance the gender and racial diversity of available role models for the students, as well as the hosting of a contemporaneous series of visiting diversity speakers. Finally, I will discuss structures for continuing to engage, interact with, and mentor students in the academic year following the summer program. A goal of this work going forward is to be able to make instructional and organizational materials from this program available to other sites interested in joining the NAC or hosting similar programs at their own institution.

  19. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    PubMed

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most SVs, such as inversions, deletions and translocations, have been studied largely in the context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have a profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into the organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much finer resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing.

  20. Population genomics of cardiometabolic traits: design of the University College London-London School of Hygiene and Tropical Medicine-Edinburgh-Bristol (UCLEB) Consortium.

    PubMed

    Shah, Tina; Engmann, Jorgen; Dale, Caroline; Shah, Sonia; White, Jon; Giambartolomei, Claudia; McLachlan, Stela; Zabaneh, Delilah; Cavadino, Alana; Finan, Chris; Wong, Andrew; Amuzu, Antoinette; Ong, Ken; Gaunt, Tom; Holmes, Michael V; Warren, Helen; Swerdlow, Daniel I; Davies, Teri-Louise; Drenos, Fotios; Cooper, Jackie; Sofat, Reecha; Caulfield, Mark; Ebrahim, Shah; Lawlor, Debbie A; Talmud, Philippa J; Humphries, Steve E; Power, Christine; Hypponen, Elina; Richards, Marcus; Hardy, Rebecca; Kuh, Diana; Wareham, Nicholas; Langenberg, Claudia; Ben-Shlomo, Yoav; Day, Ian N; Whincup, Peter; Morris, Richard; Strachan, Mark W J; Price, Jacqueline; Kumari, Meena; Kivimaki, Mika; Plagnol, Vincent; Dudbridge, Frank; Whittaker, John C; Casas, Juan P; Hingorani, Aroon D

    2013-01-01

    Substantial advances have been made in identifying common genetic variants influencing cardiometabolic traits and disease outcomes through genome wide association studies. Nevertheless, gaps in knowledge remain and new questions have arisen regarding the population relevance, mechanisms, and applications for healthcare. Using a new high-resolution custom single nucleotide polymorphism (SNP) array (Metabochip) incorporating dense coverage of genomic regions linked to cardiometabolic disease, the University College-London School-Edinburgh-Bristol (UCLEB) consortium of highly-phenotyped population-based prospective studies aims to: (1) fine map functionally relevant SNPs; (2) precisely estimate individual absolute and population attributable risks based on individual SNPs and their combination; (3) investigate mechanisms leading to altered risk factor profiles and CVD events; and (4) use Mendelian randomisation to undertake studies of the causal role in CVD of a range of cardiovascular biomarkers to inform public health policy and help develop new preventative therapies.

  1. Chapter 6: Structural variation and medical genomics.

    PubMed

    Raphael, Benjamin J

    2012-01-01

    Differences between individual human genomes, or between human and cancer genomes, range in scale from single nucleotide variants (SNVs) through intermediate and large-scale duplications, deletions, and rearrangements of genomic segments. The latter class, called structural variants (SVs), has received considerable attention in the past several years as a previously underappreciated source of variation in human genomes. Much of this recent attention is the result of the availability of higher-resolution technologies for measuring these variants, including both microarray-based techniques and, more recently, high-throughput DNA sequencing. We describe the genomic technologies and computational techniques currently used to measure SVs, focusing on applications in human and cancer genomics.

  2. Oncofertility Consortium

    MedlinePlus

    ... September 15, 2016 National Physicians Cooperative Brigid Martz Smith July 21, 2016 Postdoctoral Position in Pediatric Fertility ... 2016 Oncofertility Consortium Clinic/Center Map Brigid Martz Smith June 30, 2016 Zika Virus Concerns Grow as ...

  3. Structural genomics for science and society.

    PubMed

    Hol, W G

    2000-11-01

    The field of robotics is affecting structural biology, enabling the era of structural genomics. The potential impact on protein fold prediction, biology, protein engineering and medicine is immense. Unraveling mysteries in the protein structure universe will require a dedicated effort for decades to come, with computational toxicology as possibly a century-long challenge.

  4. Towards a whole genome physical map in rainbow trout

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Over the last five years, tremendous genomic resources were developed in salmonids. In 2005, INRA formally joined the consortium for Genome Research on All Salmonids Program (cGRASP). This consortium (www.cgrasp.org) is the international collaborative structure for establishing needed pre- and post-...

  5. Genome-wide Membrane Protein Structure Prediction

    PubMed Central

    Piccoli, Stefano; Suku, Eda; Garonzi, Marianna; Giorgetti, Alejandro

    2013-01-01

    Transmembrane proteins allow cells to extensively communicate with the external world in a very accurate and specific way. They form principal nodes in several signaling pathways and attract large interest in therapeutic intervention, as the majority of pharmaceutical compounds target membrane proteins. Thus, according to the current genome annotation methods, a detailed structural/functional characterization at the protein level of each of the elements codified in the genome is also required. The extreme difficulty in obtaining high-resolution three-dimensional structures calls for computational approaches. Here we review to what extent the efforts made in the last few years, combining the structural characterization of membrane proteins with protein bioinformatics techniques, could help describe membrane proteins at a genome-wide scale. In particular we analyze the use of comparative modeling techniques as a way of overcoming the lack of high-resolution three-dimensional structures in the human membrane proteome. PMID:24403851

  6. The fractal structure of the mitochondrial genomes

    NASA Astrophysics Data System (ADS)

    Oiwa, Nestor N.; Glazier, James A.

    2002-08-01

    The mitochondrial DNA genome has a definite multifractal structure. We show that loops, hairpins and inverted palindromes are responsible for this self-similarity. We can thus establish a definite relation between the function of subsequences and their fractal dimension. Intriguingly, protein coding DNAs also exhibit palindromic structures, although they do not appear in the sequence of amino acids. These structures may reflect the stabilization and transcriptional control of DNA or the control of posttranscriptional editing of mRNA.

  7. An integrated approach to structural genomics.

    PubMed

    Heinemann, U; Frevert, J; Hofmann, K; Illing, G; Maurer, C; Oschkinat, H; Saenger, W

    2000-01-01

    Structural genomics aims at determining a set of protein structures that will represent all domain folds present in the biosphere. These structures can be used as the basis for the homology modelling of the majority of all remaining protein domains or, indeed, proteins. Structural genomics therefore promises to provide a comprehensive structural description of the protein universe. To achieve this, a broad scientific effort is required. The Berlin-based "Protein Structure Factory" (PSF) plans to contribute to this effort by setting up a local infrastructure for the low-cost, high-throughput analysis of soluble human proteins. In close collaboration with the German Human Genome Project (DHGP) protein-coding genes will be expressed in Escherichia coli or yeast. Affinity-tagged proteins will be purified semi-automatically for biophysical characterization and structure analysis by X-ray diffraction methods and NMR spectroscopy. In all steps of the structure analysis process, possibilities for automation, parallelization and standardization will be explored. Major new facilities that are created for the PSF include a robotic station for large-scale protein crystallization, an NMR center and an experimental station for protein crystallography at the synchrotron storage ring BESSY II in Berlin. PMID:11063780

  8. Circular structures in retroviral and cellular genomes.

    PubMed

    Albert, F G; Bronson, E C; Fitzgerald, D J; Anderson, J N

    1995-10-01

    A computer program for predicting DNA bending from nucleotide sequence was used to identify circular structures in retroviral and cellular genomes. An 830-base pair circular structure was located in a control region near the center of the genome of the human immunodeficiency virus type I (HIV-I). This unusual structure displayed relatively smooth planar bending throughout its length. The structure is conserved in diverse isolates of HIV-I, HIV-II, and simian immunodeficiency viruses, which implies that it is under selective constraints. A search of all sequences in the GenBank data base was carried out in order to identify similar circular structures in cellular DNA. The results revealed that the structures are associated with a wide range of sequences that undergo recombination, including most known examples of DNA inversion and subtelomeric translocation systems. Circular structures were also associated with replication and transposition systems where DNA looping has been implicated in the generation of large protein-DNA complexes. Experimental evidence for the structures was provided by studies which demonstrated that two sequences detected as circular by computer preferentially formed covalently closed circles during ligation reactions in vitro when compared to nonbent fragments, bent fragments with noncircular shapes, and total genomic DNA. In addition, a single T-->C substitution in one of these sequences rendered it less planar as seen by computer analysis and significantly reduced its rate of ligase-catalyzed cyclization. These results permit us to speculate that intrinsically circular structures facilitate DNA looping during formation of the large protein-DNA complexes that are involved in site- and region-specific recombination and in other genomic processes. PMID:7559522

  9. Genome Structure of the Legume, Lotus japonicus

    PubMed Central

    Sato, Shusei; Nakamura, Yasukazu; Kaneko, Takakazu; Asamizu, Erika; Kato, Tomohiko; Nakao, Mitsuteru; Sasamoto, Shigemi; Watanabe, Akiko; Ono, Akiko; Kawashima, Kumiko; Fujishiro, Tsunakazu; Katoh, Midori; Kohara, Mitsuyo; Kishida, Yoshie; Minami, Chiharu; Nakayama, Shinobu; Nakazaki, Naomi; Shimizu, Yoshimi; Shinpo, Sayaka; Takahashi, Chika; Wada, Tsuyuko; Yamada, Manabu; Ohmido, Nobuko; Hayashi, Makoto; Fukui, Kiichi; Baba, Tomoya; Nakamichi, Tomoko; Mori, Hirotada; Tabata, Satoshi

    2008-01-01

    The legume Lotus japonicus has been widely used as a model system to investigate the genetic background of legume-specific phenomena such as symbiotic nitrogen fixation. Here, we report structural features of the L. japonicus genome. The 315.1-Mb sequences determined in this and previous studies correspond to 67% of the genome (472 Mb), and are likely to cover 91.3% of the gene space. Linkage mapping anchored 130-Mb sequences onto the six linkage groups. A total of 10 951 complete and 19 848 partial structures of protein-encoding genes were assigned to the genome. Comparative analysis of these genes revealed the expansion of several functional domains and gene families that are characteristic of L. japonicus. Synteny analysis detected traces of whole-genome duplication and the presence of synteny blocks with other plant genomes to various degrees. This study provides the first opportunity to look into the complex and unique genetic system of legumes. PMID:18511435
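    The coverage figure quoted in this abstract can be checked with simple arithmetic. The sketch below uses only the numbers stated in the record (315.1 Mb sequenced, 472 Mb genome, 130 Mb anchored, 10 951 complete and 19 848 partial gene structures); the variable names and the `percent` helper are illustrative assumptions, not part of the original study.

```python
# Figures quoted in the abstract above (sizes in megabases).
sequenced_mb = 315.1   # sequence determined in this and previous studies
genome_mb = 472.0      # estimated genome size
anchored_mb = 130.0    # sequence anchored onto the six linkage groups

complete_genes = 10951
partial_genes = 19848


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the genome covered by the determined sequence.
coverage_pct = percent(sequenced_mb, genome_mb)

# Share of the determined sequence that was anchored by linkage mapping.
anchored_of_sequenced_pct = percent(anchored_mb, sequenced_mb)

# Total number of protein-encoding gene structures assigned to the genome.
total_gene_structures = complete_genes + partial_genes

print(f"coverage: {coverage_pct:.1f}% of the genome")
print(f"anchored: {anchored_of_sequenced_pct:.1f}% of the determined sequence")
print(f"gene structures: {total_gene_structures}")
```

    The computed coverage rounds to the 67% reported in the abstract.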

  10. Using Genomics for Natural Product Structure Elucidation.

    PubMed

    Tietz, Jonathan I; Mitchell, Douglas A

    2016-01-01

    Natural products (NPs) are the most historically bountiful source of chemical matter for drug development, especially for anti-infectives. With insights gleaned from genome mining, interest in natural product discovery has been reinvigorated. An essential stage in NP discovery is structural elucidation, which sheds light not only on the chemical composition of a molecule but also its novelty, properties, and derivatization potential. The history of structure elucidation is replete with technique-based revolutions: combustion analysis, crystallography, UV, IR, MS, and NMR have each provided game-changing advances; the latest such advance is genomics. All natural products have a genetic basis, and the ability to obtain and interpret genomic information for structure elucidation is increasingly available at low cost to non-specialists. In this review, we describe the value of genomics as a structural elucidation technique, especially from the perspective of the natural product chemist approaching an unknown metabolite. Herein we first introduce the databases and programs of interest to the natural products chemist, with an emphasis on those currently most suited for general usability. We describe strategies for linking observed natural product-linked phenotypes to their corresponding gene clusters. We then discuss techniques for extracting structural information from genes, illustrated with numerous case examples. We also provide an analysis of the biases and limitations of the field with recommendations for future development. Our overview is not only aimed at biologically-oriented researchers already at ease with bioinformatic techniques, but also, in particular, at natural product, organic, and/or medicinal chemists not previously familiar with genomic techniques.

  11. A data management system for structural genomics

    PubMed Central

    Raymond, Stéphane; O'Toole, Nicholas; Cygler, Miroslaw

    2004-01-01

    Background Structural genomics (SG) projects aim to determine thousands of protein structures by the development of high-throughput techniques for all steps of the experimental structure determination pipeline. Crucial to the success of such endeavours is the careful tracking and archiving of experimental and external data on protein targets. Results We have developed a sophisticated data management system for structural genomics. Central to the system is an Oracle-based, SQL-interfaced database. The database schema deals with all facets of the structure determination process, from target selection to data deposition. Users access the database via any web browser. Experimental data is input by users with pre-defined web forms. Data can be displayed according to numerous criteria. A list of all current target proteins can be viewed, with links for each target to associated entries in external databases. To avoid unnecessary work on targets, our data management system matches protein sequences weekly using BLAST to entries in the Protein Data Bank and to targets of other SG centers worldwide. Conclusion Our system is a working, effective and user-friendly data management tool for structural genomics projects. In this report we present a detailed summary of the various capabilities of the system, using real target data as examples, and indicate our plans for future enhancements. PMID:15210054

  12. A data management system for structural genomics.

    PubMed

    Raymond, Stéphane; O'Toole, Nicholas; Cygler, Miroslaw

    2004-06-21

    BACKGROUND: Structural genomics (SG) projects aim to determine thousands of protein structures by the development of high-throughput techniques for all steps of the experimental structure determination pipeline. Crucial to the success of such endeavours is the careful tracking and archiving of experimental and external data on protein targets. RESULTS: We have developed a sophisticated data management system for structural genomics. Central to the system is an Oracle-based, SQL-interfaced database. The database schema deals with all facets of the structure determination process, from target selection to data deposition. Users access the database via any web browser. Experimental data is input by users with pre-defined web forms. Data can be displayed according to numerous criteria. A list of all current target proteins can be viewed, with links for each target to associated entries in external databases. To avoid unnecessary work on targets, our data management system matches protein sequences weekly using BLAST to entries in the Protein Data Bank and to targets of other SG centers worldwide. CONCLUSION: Our system is a working, effective and user-friendly data management tool for structural genomics projects. In this report we present a detailed summary of the various capabilities of the system, using real target data as examples, and indicate our plans for future enhancements.

  13. Genome-wide association study of lifetime cannabis use based on a large meta-analytic sample of 32 330 subjects from the International Cannabis Consortium

    PubMed Central

    Stringer, S; Minică, C C; Verweij, K J H; Mbarek, H; Bernard, M; Derringer, J; van Eijk, K R; Isen, J D; Loukola, A; Maciejewski, D F; Mihailov, E; van der Most, P J; Sánchez-Mora, C; Roos, L; Sherva, R; Walters, R; Ware, J J; Abdellaoui, A; Bigdeli, T B; Branje, S J T; Brown, S A; Bruinenberg, M; Casas, M; Esko, T; Garcia-Martinez, I; Gordon, S D; Harris, J M; Hartman, C A; Henders, A K; Heath, A C; Hickie, I B; Hickman, M; Hopfer, C J; Hottenga, J J; Huizink, A C; Irons, D E; Kahn, R S; Korhonen, T; Kranzler, H R; Krauter, K; van Lier, P A C; Lubke, G H; Madden, P A F; Mägi, R; McGue, M K; Medland, S E; Meeus, W H J; Miller, M B; Montgomery, G W; Nivard, M G; Nolte, I M; Oldehinkel, A J; Pausova, Z; Qaiser, B; Quaye, L; Ramos-Quiroga, J A; Richarte, V; Rose, R J; Shin, J; Stallings, M C; Stiby, A I; Wall, T L; Wright, M J; Koot, H M; Paus, T; Hewitt, J K; Ribasés, M; Kaprio, J; Boks, M P; Snieder, H; Spector, T; Munafò, M R; Metspalu, A; Gelernter, J; Boomsma, D I; Iacono, W G; Martin, N G; Gillespie, N A; Derks, E M; Vink, J M

    2016-01-01

    Cannabis is the most widely produced and consumed illicit psychoactive substance worldwide. Occasional cannabis use can progress to frequent use, abuse and dependence with all known adverse physical, psychological and social consequences. Individual differences in cannabis initiation are heritable (40–48%). The International Cannabis Consortium was established with the aim to identify genetic risk variants of cannabis use. We conducted a meta-analysis of genome-wide association data of 13 cohorts (N=32 330) and four replication samples (N=5627). In addition, we performed a gene-based test of association, estimated single-nucleotide polymorphism (SNP)-based heritability and explored the genetic correlation between lifetime cannabis use and cigarette use using LD score regression. No individual SNPs reached genome-wide significance. Nonetheless, gene-based tests identified four genes significantly associated with lifetime cannabis use: NCAM1, CADM2, SCOC and KCNT2. Previous studies reported associations of NCAM1 with cigarette smoking and other substance use, and those of CADM2 with body mass index, processing speed and autism disorders, which are phenotypes previously reported to be associated with cannabis use. Furthermore, we showed that, combined across the genome, all common SNPs explained 13–20% (P<0.001) of the liability of lifetime cannabis use. Finally, there was a strong genetic correlation (rg=0.83; P=1.85 × 10^−8) between lifetime cannabis use and lifetime cigarette smoking implying that the SNP effect sizes of the two traits are highly correlated. This is the largest meta-analysis of cannabis GWA studies to date, revealing important new insights into the genetic pathways of lifetime cannabis use. Future functional studies should explore the impact of the identified genes on the biological mechanisms of cannabis use. PMID:27023175

  14. Genome-wide association study of lifetime cannabis use based on a large meta-analytic sample of 32 330 subjects from the International Cannabis Consortium.

    PubMed

    Stringer, S; Minică, C C; Verweij, K J H; Mbarek, H; Bernard, M; Derringer, J; van Eijk, K R; Isen, J D; Loukola, A; Maciejewski, D F; Mihailov, E; van der Most, P J; Sánchez-Mora, C; Roos, L; Sherva, R; Walters, R; Ware, J J; Abdellaoui, A; Bigdeli, T B; Branje, S J T; Brown, S A; Bruinenberg, M; Casas, M; Esko, T; Garcia-Martinez, I; Gordon, S D; Harris, J M; Hartman, C A; Henders, A K; Heath, A C; Hickie, I B; Hickman, M; Hopfer, C J; Hottenga, J J; Huizink, A C; Irons, D E; Kahn, R S; Korhonen, T; Kranzler, H R; Krauter, K; van Lier, P A C; Lubke, G H; Madden, P A F; Mägi, R; McGue, M K; Medland, S E; Meeus, W H J; Miller, M B; Montgomery, G W; Nivard, M G; Nolte, I M; Oldehinkel, A J; Pausova, Z; Qaiser, B; Quaye, L; Ramos-Quiroga, J A; Richarte, V; Rose, R J; Shin, J; Stallings, M C; Stiby, A I; Wall, T L; Wright, M J; Koot, H M; Paus, T; Hewitt, J K; Ribasés, M; Kaprio, J; Boks, M P; Snieder, H; Spector, T; Munafò, M R; Metspalu, A; Gelernter, J; Boomsma, D I; Iacono, W G; Martin, N G; Gillespie, N A; Derks, E M; Vink, J M

    2016-01-01

    Cannabis is the most widely produced and consumed illicit psychoactive substance worldwide. Occasional cannabis use can progress to frequent use, abuse and dependence with all known adverse physical, psychological and social consequences. Individual differences in cannabis initiation are heritable (40-48%). The International Cannabis Consortium was established with the aim to identify genetic risk variants of cannabis use. We conducted a meta-analysis of genome-wide association data of 13 cohorts (N=32 330) and four replication samples (N=5627). In addition, we performed a gene-based test of association, estimated single-nucleotide polymorphism (SNP)-based heritability and explored the genetic correlation between lifetime cannabis use and cigarette use using LD score regression. No individual SNPs reached genome-wide significance. Nonetheless, gene-based tests identified four genes significantly associated with lifetime cannabis use: NCAM1, CADM2, SCOC and KCNT2. Previous studies reported associations of NCAM1 with cigarette smoking and other substance use, and those of CADM2 with body mass index, processing speed and autism disorders, which are phenotypes previously reported to be associated with cannabis use. Furthermore, we showed that, combined across the genome, all common SNPs explained 13-20% (P<0.001) of the liability of lifetime cannabis use. Finally, there was a strong genetic correlation (rg=0.83; P=1.85 × 10(-8)) between lifetime cannabis use and lifetime cigarette smoking implying that the SNP effect sizes of the two traits are highly correlated. This is the largest meta-analysis of cannabis GWA studies to date, revealing important new insights into the genetic pathways of lifetime cannabis use. Future functional studies should explore the impact of the identified genes on the biological mechanisms of cannabis use. PMID:27023175

  15. The Quality and Validation of Structures from Structural Genomics

    PubMed Central

    Domagalski, Marcin J.; Zheng, Heping; Zimmerman, Matthew D.; Dauter, Zbigniew; Wlodawer, Alexander; Minor, Wladek

    2014-01-01

    Quality control of three-dimensional structures of macromolecules is a critical step to ensure the integrity of structural biology data, especially those produced by structural genomics centers. Whereas the Protein Data Bank (PDB) has proven to be a remarkable success overall, the inconsistent quality of structures reveals a lack of universal standards for structure/deposit validation. Here, we review the state-of-the-art methods used in macromolecular structure validation, focusing on validation of structures determined by X-ray crystallography. We describe some general protocols used in the rebuilding and re-refinement of problematic structural models. We also briefly discuss some frontier areas of structure validation, including refinement of protein–ligand complexes, automation of structure redetermination, and the use of NMR structures and computational models to solve X-ray crystal structures by molecular replacement. PMID:24203341

  16. Mechanisms underlying structural variant formation in genomic disorders

    PubMed Central

    Carvalho, Claudia M. B.; Lupski, James R.

    2016-01-01

    With the recent burst of technological developments in genomics, and the clinical implementation of genome-wide assays, our understanding of the molecular basis of genomic disorders, specifically the contribution of structural variation to disease burden, is evolving quickly. Ongoing studies have revealed a ubiquitous role for genome architecture in the formation of structural variants at a given locus, both in DNA recombination-based processes and in replication-based processes. These reports showcase the influence of repeat sequences on genomic stability and structural variant complexity and also highlight the tremendous plasticity and dynamic nature of our genome in evolution, health and disease susceptibility. PMID:26924765

  17. Structural genomics reveals EVE as a new ASCH/PUA-related domain.

    PubMed

    Bertonati, Claudia; Punta, Marco; Fischer, Markus; Yachdav, Guy; Forouhar, Farhad; Zhou, Weihong; Kuzin, Alexander P; Seetharaman, Jayaraman; Abashidze, Mariam; Ramelot, Theresa A; Kennedy, Michael A; Cort, John R; Belachew, Adam; Hunt, John F; Tong, Liang; Montelione, Gaetano T; Rost, Burkhard

    2009-05-15

    We report on several proteins recently solved by structural genomics consortia, in particular by the Northeast Structural Genomics consortium (NESG). The proteins considered in this study differ substantially in their sequences but they share a similar structural core, characterized by a pseudobarrel five-stranded beta sheet. This core corresponds to the PUA domain-like architecture in the SCOP database. By connecting sequence information with structural knowledge, we characterize a new subgroup of these proteins that we propose to be distinctly different from previously described PUA domain-like domains such as PUA proper or ASCH. We refer to these newly defined domains as EVE. Although EVE may have retained the ability of PUA domains to bind RNA, the available experimental and computational data suggest that both the details of its molecular function and its cellular function differ from those of other PUA domain-like domains. This study of EVE and its relatives illustrates how the combination of structure and genomics creates new insights by connecting a cornucopia of structures that map to the same evolutionary potential. Primary sequence information alone would not have been sufficient to reveal these evolutionary links.

  18. The structural code of cyanobacterial genomes

    PubMed Central

    Lehmann, Robert; Machné, Rainer; Herzel, Hanspeter

    2014-01-01

    A periodic bias in nucleotide frequency with a period of about 11 bp is characteristic of bacterial genomes. This signal is commonly interpreted to relate to the helical pitch of negatively supercoiled DNA. Functions in supercoiling-dependent RNA transcription or as a ‘structural code’ for DNA packaging have been suggested. Cyanobacterial genomes showed especially strong periodic signals and, on the other hand, DNA supercoiling and supercoiling-dependent transcription are highly dynamic and underlie circadian rhythms of these phototrophic bacteria. Focusing on this phylum and dinucleotides, we find that a minimal motif of AT-tracts (AT2) yields the strongest signal. Strong genome-wide periodicity is ancestral to a clade of unicellular and polyploid species but lost upon morphological transitions into two baeocyte-forming and a symbiotic species. The signal is intermediate in heterocystous species and weak in monoploid picocyanobacteria. A pronounced ‘structural code’ may support efficient nucleoid condensation and segregation in polyploid cells. The major sources of the AT2 signal are protein-coding regions, where it is encoded preferentially in the first and third codon positions. The signal shows only few relations to supercoiling-dependent and diurnal RNA transcription in Synechocystis sp. PCC 6803. Strong and specific signals in two distinct transposons suggest roles in transposase transcription and transpososome formation. PMID:25056315
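    As a rough illustration of what a periodic bias in dinucleotide frequency means, the sketch below scans a sequence for dinucleotides made only of A or T and measures the Fourier power of that indicator signal at integer periods. This is a minimal toy under stated assumptions, not the authors' method: the function names, the 5 to 20 bp scan range, and the synthetic test sequence (random bases with "AA" planted exactly every 11 bp) are all hypothetical choices made for the example.

```python
import cmath
import random


def ww_indicator(seq):
    """Return 1 at each position where a dinucleotide of only A/T starts, else 0."""
    return [1 if seq[i] in "AT" and seq[i + 1] in "AT" else 0
            for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Squared magnitude of the Fourier component of `signal` at `period` (in bp)."""
    mean = sum(signal) / len(signal)
    total = sum((x - mean) * cmath.exp(-2j * cmath.pi * i / period)
                for i, x in enumerate(signal))
    return abs(total) ** 2 / len(signal)


def strongest_period(seq, periods=range(5, 21)):
    """Return the integer period in `periods` with the highest spectral power."""
    signal = ww_indicator(seq)
    return max(periods, key=lambda p: power_at_period(signal, p))


def synthetic_sequence(length=2200, period=11, seed=0):
    """Random DNA with 'AA' planted every `period` bp (toy data, not a real genome)."""
    rng = random.Random(seed)
    bases = [rng.choice("ACGT") for _ in range(length)]
    for start in range(0, length - 1, period):
        bases[start:start + 2] = "AA"
    return "".join(bases)


toy = synthetic_sequence()
best = strongest_period(toy)
print("strongest period in toy sequence:", best, "bp")
```

    On real genomes the signal would be far weaker than in this planted example, and the true period need not be an integer, so a finer period grid and a comparison against shuffled sequences would be needed.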

  19. Chloroplast genome structure in Ilex (Aquifoliaceae)

    PubMed Central

    Yao, Xin; Tan, Yun-Hong; Liu, Ying-Ying; Song, Yu; Yang, Jun-Bo; Corlett, Richard T.

    2016-01-01

    Aquifoliaceae is the largest family in the campanulid order Aquifoliales. It consists of a single genus, Ilex, the hollies, which is the largest woody dioecious genus in the angiosperms. Most species are in East Asia or South America. The taxonomy and evolutionary history remain unclear due to the lack of a robust species-level phylogeny. We produced the first complete chloroplast genomes in this family, including seven Ilex species, by Illumina sequencing of long-range PCR products and subsequent reference-guided de novo assembly. These genomes have a typical bicyclic structure with a conserved genome arrangement and moderate divergence. The total length is 157,741 bp and there is one large single-copy region (LSC) with 87,109 bp, one small single-copy with 18,436 bp, and a pair of inverted repeat regions (IR) with 52,196 bp. A total of 144 genes were identified, including 96 protein-coding genes, 40 tRNA and 8 rRNA. Thirty-four repetitive sequences were identified in Ilex pubescens, with lengths >14 bp and identity >90%, and 11 divergence hotspot regions that could be targeted for phylogenetic markers. This study will contribute to improved resolution of deep branches of the Ilex phylogeny and facilitate identification of Ilex species. PMID:27378489
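    The region lengths and gene counts quoted in this abstract are internally consistent, which can be confirmed with a short check. The sketch below uses only numbers from the record; the variable names are illustrative, and treating the 52,196 bp figure as the combined length of both inverted repeat copies is an assumption that the sum supports.

```python
# Region lengths quoted in the abstract above (base pairs).
lsc_bp = 87109        # large single-copy region
ssc_bp = 18436        # small single-copy region
ir_pair_bp = 52196    # inverted repeats, assumed to be the total for both copies
total_bp = 157741     # reported total genome length

# Gene counts quoted in the abstract.
protein_coding = 96
trna = 40
rrna = 8
reported_genes = 144

# The four regions should add up to the reported total length.
summed_bp = lsc_bp + ssc_bp + ir_pair_bp

# Length of a single inverted repeat copy under the assumption above.
single_ir_bp = ir_pair_bp // 2

# The gene categories should add up to the reported gene count.
summed_genes = protein_coding + trna + rrna

print("regions sum to total:", summed_bp == total_bp)
print("single IR copy:", single_ir_bp, "bp")
print("gene categories sum to total:", summed_genes == reported_genes)
```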

  20. Chloroplast genome structure in Ilex (Aquifoliaceae).

    PubMed

    Yao, Xin; Tan, Yun-Hong; Liu, Ying-Ying; Song, Yu; Yang, Jun-Bo; Corlett, Richard T

    2016-01-01

    Aquifoliaceae is the largest family in the campanulid order Aquifoliales. It consists of a single genus, Ilex, the hollies, which is the largest woody dioecious genus in the angiosperms. Most species are in East Asia or South America. The taxonomy and evolutionary history remain unclear due to the lack of a robust species-level phylogeny. We produced the first complete chloroplast genomes in this family, including seven Ilex species, by Illumina sequencing of long-range PCR products and subsequent reference-guided de novo assembly. These genomes have a typical bicyclic structure with a conserved genome arrangement and moderate divergence. The total length is 157,741 bp and there is one large single-copy region (LSC) with 87,109 bp, one small single-copy with 18,436 bp, and a pair of inverted repeat regions (IR) with 52,196 bp. A total of 144 genes were identified, including 96 protein-coding genes, 40 tRNA and 8 rRNA. Thirty-four repetitive sequences were identified in Ilex pubescens, with lengths >14 bp and identity >90%, and 11 divergence hotspot regions that could be targeted for phylogenetic markers. This study will contribute to improved resolution of deep branches of the Ilex phylogeny and facilitate identification of Ilex species. PMID:27378489

  1. Genome-Wide Association Study for Incident Myocardial Infarction and Coronary Heart Disease in Prospective Cohort Studies: The CHARGE Consortium

    PubMed Central

    Cupples, L. Adrienne; Trompet, Stella; Chasman, Daniel I.; Lumley, Thomas; Völker, Uwe; Buckley, Brendan M.; Ding, Jingzhong; Jensen, Majken K.; Folsom, Aaron R.; Kritchevsky, Stephen B.; Girman, Cynthia J.; Ford, Ian; Dörr, Marcus; Salomaa, Veikko; Uitterlinden, André G.; Eiriksdottir, Gudny; Vasan, Ramachandran S.; Franceschini, Nora; Carty, Cara L.; Virtamo, Jarmo; Demissie, Serkalem; Amouyel, Philippe; Arveiler, Dominique; Heckbert, Susan R.; Ferrières, Jean; Ducimetière, Pierre; Smith, Nicholas L.; Wang, Ying A.; Siscovick, David S.; Rice, Kenneth M.; Wiklund, Per-Gunnar; Taylor, Kent D.; Evans, Alun; Kee, Frank; Rotter, Jerome I.; Karvanen, Juha; Kuulasmaa, Kari; Heiss, Gerardo; Kraft, Peter; Launer, Lenore J.; Hofman, Albert; Markus, Marcello R. P.; Rose, Lynda M.; Silander, Kaisa; Wagner, Peter; Benjamin, Emelia J.; Lohman, Kurt; Stott, David J.; Rivadeneira, Fernando; Harris, Tamara B.; Levy, Daniel; Liu, Yongmei; Rimm, Eric B.; Jukema, J. Wouter; Völzke, Henry; Ridker, Paul M.; Blankenberg, Stefan; Franco, Oscar H.; Gudnason, Vilmundur; Psaty, Bruce M.; Boerwinkle, Eric; O'Donnell, Christopher J.

    2016-01-01

    Background: Data are limited on genome-wide association studies (GWAS) for incident coronary heart disease (CHD). Moreover, it is not known whether genetic variants identified to date also associate with risk of CHD in a prospective setting. Methods: We performed a two-stage GWAS analysis of incident myocardial infarction (MI) and CHD in a total of 64,297 individuals (including 3898 MI cases, 5465 CHD cases). SNPs that passed an arbitrary threshold of 5×10⁻⁶ in Stage I were taken to Stage II for further discovery. Furthermore, in an analysis of prognosis, we studied whether known SNPs from former GWAS were associated with total mortality in individuals who experienced MI during follow-up. Results: In Stage I 15 loci passed the threshold of 5×10⁻⁶; 8 loci for MI and 8 loci for CHD, for which one locus overlapped and none were reported in previous GWAS meta-analyses. We took 60 SNPs representing these 15 loci to Stage II of discovery. Four SNPs near QKI showed nominally significant association with MI (p-value<8.8×10⁻³) and three exceeded the genome-wide significance threshold when Stage I and Stage II results were combined (top SNP rs6941513: p = 6.2×10⁻⁹). Despite excellent power, the 9p21 locus SNP (rs1333049) was only modestly associated with MI (HR = 1.09, p-value = 0.02) and marginally with CHD (HR = 1.06, p-value = 0.08). Among an inception cohort of those who experienced MI during follow-up, the risk allele of rs1333049 was associated with a decreased risk of subsequent mortality (HR = 0.90, p-value = 3.2×10⁻³). Conclusions: QKI represents a novel locus that may serve as a predictor of incident CHD in prospective studies. The association of the 9p21 locus both with increased risk of first myocardial infarction and longer survival after MI highlights the importance of study design in investigating genetic determinants of complex disorders. PMID:26950853

  2. Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays

    PubMed Central

    Mak, Angel C. Y.; Lai, Yvonne Y. Y.; Lam, Ernest T.; Kwok, Tsz-Piu; Leung, Alden K. Y.; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R.; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W. C.; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J. K.; Li, Catherine M. L.; Li, Jing-Woei; Yim, Aldrin K. Y.; Chan, Saki; Sibert, Justin; Džakula, Željko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y.; Xiao, Ming; Kwok, Pui-Yan

    2016-01-01

    Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation. PMID:26510793

  3. Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays.

    PubMed

    Mak, Angel C Y; Lai, Yvonne Y Y; Lam, Ernest T; Kwok, Tsz-Piu; Leung, Alden K Y; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W C; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J K; Li, Catherine M L; Li, Jing-Woei; Yim, Aldrin K Y; Chan, Saki; Sibert, Justin; Džakula, Željko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan

    2016-01-01

    Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation.

  4. The Isochore Structure of the Human Genome

    NASA Astrophysics Data System (ADS)

    Petrov, Dimitri; Arndt, Peter F.; Hwa, Terence

    2002-03-01

    Most of the genome of warm-blooded vertebrates is a mosaic of very long (>200,000 bp) DNA segments, the isochores. These isochores are fairly homogeneous in base composition and are distinguished by their guanine-cytosine (GC) content. With the emergence of sequence data from different organisms, we were able to study the isochore structure on scales up to the length of chromosomes. We observed interesting long-range correlations and explored the possible mechanism(s) using sequence evolution models with mutation rates measured from the repetitive elements in the different isochores.

  5. Epigenomics and the structure of the living genome.

    PubMed

    Friedman, Nir; Rando, Oliver J

    2015-10-01

    Eukaryotic genomes are packaged into an extensively folded state known as chromatin. Analysis of the structure of eukaryotic chromosomes has been revolutionized by development of a suite of genome-wide measurement technologies, collectively termed "epigenomics." We review major advances in epigenomic analysis of eukaryotic genomes, covering aspects of genome folding at scales ranging from whole chromosome folding down to nucleotide-resolution assays that provide structural insights into protein-DNA interactions. We then briefly outline several challenges remaining and highlight new developments such as single-cell epigenomic assays that will help provide us with a high-resolution structural understanding of eukaryotic genomes.

  6. Genome-wide analysis identifies novel loci associated with ovarian cancer outcomes: findings from the Ovarian Cancer Association Consortium

    PubMed Central

    Johnatty, Sharon E.; Tyrer, Jonathan P.; Kar, Siddhartha; Beesley, Jonathan; Lu, Yi; Gao, Bo; Fasching, Peter A.; Hein, Alexander; Ekici, Arif B.; Beckmann, Matthias W.; Lambrechts, Diether; Nieuwenhuysen, Els Van; Vergote, Ignace; Lambrechts, Sandrina; Rossing, Mary Anne; Doherty, Jennifer A.; Chang-Claude, Jenny; Modugno, Francesmary; Ness, Roberta B.; Moysich, Kirsten B.; Levine, Douglas A.; Kiemeney, Lambertus A.; Massuger, Leon F.A.G.; Gronwald, Jacek; Lubiński, Jan; Jakubowska, Anna; Cybulski, Cezary; Brinton, Louise; Lissowska, Jolanta; Wentzensen, Nicolas; Song, Honglin; Rhenius, Valerie; Campbell, Ian; Eccles, Diana; Sieh, Weiva; Whittemore, Alice S.; McGuire, Valerie; Rothstein, Joseph H.; Sutphen, Rebecca; Anton-Culver, Hoda; Ziogas, Argyrios; Gayther, Simon A.; Gentry-Maharaj, Aleksandra; Menon, Usha; Ramus, Susan J.; Pearce, Celeste L; Pike, Malcolm C; Stram, Daniel O.; Wu, Anna H.; Kupryjanczyk, Jolanta; Dansonka-Mieszkowska, Agnieszka; Rzepecka, Iwona K.; Spiewankiewicz, Beata; Goodman, Marc T.; Wilkens, Lynne R.; Carney, Michael E.; Thompson, Pamela J; Heitz, Florian; du Bois, Andreas; Schwaab, Ira; Harter, Philipp; Pisterer, Jacobus; Hillemanns, Peter; Karlan, Beth Y.; Walsh, Christine; Lester, Jenny; Orsulic, Sandra; Winham, Stacey J; Earp, Madalene; Larson, Melissa C.; Fogarty, Zachary C.; Høgdall, Estrid; Jensen, Allan; Kjaer, Susanne Kruger; Fridley, Brooke L.; Cunningham, Julie M.; Vierkant, Robert A.; Schildkraut, Joellen M.; Iversen, Edwin S.; Terry, Kathryn L.; Cramer, Daniel W.; Bandera, Elisa V.; Orlow, Irene; Pejovic, Tanja; Bean, Yukie; Høgdall, Claus; Lundvall, Lene; McNeish, Ian; Paul, James; Carty, Karen; Siddiqui, Nadeem; Glasspool, Rosalind; Sellers, Thomas; Kennedy, Catherine; Chiew, Yoke-Eng; Berchuck, Andrew; MacGregor, Stuart; deFazio, Anna; Pharoah, Paul D.P.; Goode, Ellen L.; deFazio, Anna; Webb, Penelope M.; Chenevix-Trench, Georgia

    2015-01-01

    Purpose: Chemotherapy resistance remains a major challenge in the treatment of ovarian cancer. We hypothesize that germline polymorphisms might be associated with clinical outcome. Experimental Design: We analyzed ~2.8 million genotyped and imputed SNPs from the iCOGS experiment for progression-free survival (PFS) and overall survival (OS) in 2,901 European epithelial ovarian cancer (EOC) patients who underwent first-line treatment of cytoreductive surgery and chemotherapy regardless of regimen, and in a subset of 1,098 patients treated with ≥4 cycles of paclitaxel and carboplatin at standard doses. We evaluated the top SNPs in 4,434 EOC patients including patients from The Cancer Genome Atlas. Additionally we conducted pathway analysis of all intragenic SNPs and tested their association with PFS and OS using gene set enrichment analysis. Results: Five SNPs were significantly associated (p≤1.0×10⁻⁵) with poorer outcomes in at least one of the four analyses, three of which, rs4910232 (11p15.3), rs2549714 (16q23) and rs6674079 (1q22) were located in long non-coding RNAs (lncRNAs) RP11–179A10.1, RP11–314O13.1 and RP11–284F21.8 respectively (p≤7.1×10⁻⁶). ENCODE ChIP-seq data at 1q22 for normal ovary shows evidence of histone modification around RP11–284F21.8, and rs6674079 is perfectly correlated with another SNP within the super-enhancer MEF2D, expression levels of which were reportedly associated with prognosis in another solid tumor. YAP1- and WWTR1 (TAZ)-stimulated gene expression, and HDL-mediated lipid transport pathways were associated with PFS and OS, respectively, in the cohort who had standard chemotherapy (pGSEA≤6×10⁻³). Conclusion: We have identified SNPs in three lncRNAs that might be important targets for novel EOC therapies. PMID:26152742

  7. Molecular Genetic Evidence for Genetic Overlap between General Cognitive Ability and Risk for Schizophrenia: A Report from the Cognitive Genomics Consortium (COGENT)

    PubMed Central

    Lencz, Todd; Knowles, Emma; Davies, Gail; Guha, Saurav; Liewald, David C; Starr, John M; Djurovic, Srdjan; Melle, Ingrid; Sundet, Kjetil; Christoforou, Andrea; Reinvang, Ivar; Mukherjee, Semanti; Lundervold, Astri; Steen, Vidar M.; John, Majnu; Espeseth, Thomas; Räikkönen, Katri; Widen, Elisabeth; Palotie, Aarno; Eriksson, Johan G; Giegling, Ina; Konte, Bettina; Ikeda, Masashi; Roussos, Panos; Giakoumaki, Stella; Burdick, Katherine E.; Payton, Antony; Ollier, William; Horan, Mike; Donohoe, Gary; Morris, Derek; Corvin, Aiden; Gill, Michael; Pendleton, Neil; Iwata, Nakao; Darvasi, Ariel; Bitsios, Panos; Rujescu, Dan; Lahti, Jari; Hellard, Stephanie Le; Keller, Matthew C.; Andreassen, Ole A.; Deary, Ian J; Glahn, David C.; Malhotra, Anil K.

    2014-01-01

    It has long been recognized that generalized deficits in cognitive ability represent a core component of schizophrenia, evident prior to full illness onset and independent of medication. The possibility of genetic overlap between risk for schizophrenia and cognitive phenotypes has been suggested by the presence of cognitive deficits in first-degree relatives of patients with schizophrenia; however, until recently, molecular genetic approaches to test this overlap have been lacking. Within the last few years, large-scale genome-wide association studies (GWAS) of schizophrenia have demonstrated that a substantial proportion of the heritability of the disorder is explained by a polygenic component consisting of many common SNPs of extremely small effect. Similar results have been reported in GWAS of general cognitive ability. The primary aim of the present study is to provide the first molecular genetic test of the classic endophenotype hypothesis, which states that alleles associated with reduced cognitive ability should also serve to increase risk for schizophrenia. We tested the endophenotype hypothesis by applying polygenic SNP scores derived from a large-scale cognitive GWAS meta-analysis (~5000 individuals from 9 non-clinical cohorts comprising the COGENT consortium) to four schizophrenia case-control cohorts. As predicted, cases had significantly lower cognitive polygenic scores compared to controls. In parallel, polygenic risk scores for schizophrenia were associated with lower general cognitive ability. Additionally, using our large cognitive meta-analytic dataset, we identified nominally significant cognitive associations for several SNPs that have previously been robustly associated with schizophrenia susceptibility. Results provide molecular confirmation of the genetic overlap between schizophrenia and general cognitive ability, and may provide additional insight into pathophysiology of the disorder. PMID:24342994
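
    The polygenic scoring step described in this abstract can be illustrated with a minimal sketch. It assumes the common additive form, in which an individual's score is the sum over SNPs of a GWAS-derived effect weight multiplied by that individual's count of the effect allele; the abstract does not give the exact formula used, and the SNP identifiers, weights and genotypes below are hypothetical.

```python
def polygenic_score(weights, allele_counts):
    """Additive polygenic score: sum of weight * effect-allele count.

    weights:       dict of SNP id -> effect weight from a discovery GWAS
    allele_counts: dict of SNP id -> 0, 1 or 2 copies of the effect allele
    SNPs missing from allele_counts are skipped (a simplifying assumption).
    """
    return sum(
        weight * allele_counts[snp]
        for snp, weight in weights.items()
        if snp in allele_counts
    )

# Hypothetical weights and genotypes, for illustration only.
weights = {"snp_a": 0.10, "snp_b": -0.20, "snp_c": 0.05}
person = {"snp_a": 2, "snp_b": 1}  # snp_c was not genotyped

print(polygenic_score(weights, person))
```

    In a design like the one described, such scores would be computed for every case and control and the two groups compared.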

  8. MiRNA-Related SNPs and Risk of Esophageal Adenocarcinoma and Barrett's Esophagus: Post Genome-Wide Association Analysis in the BEACON Consortium.

    PubMed

    Buas, Matthew F; Onstad, Lynn; Levine, David M; Risch, Harvey A; Chow, Wong-Ho; Liu, Geoffrey; Fitzgerald, Rebecca C; Bernstein, Leslie; Ye, Weimin; Bird, Nigel C; Romero, Yvonne; Casson, Alan G; Corley, Douglas A; Shaheen, Nicholas J; Wu, Anna H; Gammon, Marilie D; Reid, Brian J; Hardie, Laura J; Peters, Ulrike; Whiteman, David C; Vaughan, Thomas L

    2015-01-01

    Incidence of esophageal adenocarcinoma (EA) has increased substantially in recent decades. Multiple risk factors have been identified for EA and its precursor, Barrett's esophagus (BE), such as reflux, European ancestry, male sex, obesity, and tobacco smoking, and several germline genetic variants were recently associated with disease risk. Using data from the Barrett's and Esophageal Adenocarcinoma Consortium (BEACON) genome-wide association study (GWAS) of 2,515 EA cases, 3,295 BE cases, and 3,207 controls, we examined single nucleotide polymorphisms (SNPs) that potentially affect the biogenesis or biological activity of microRNAs (miRNAs), small non-coding RNAs implicated in post-transcriptional gene regulation, and deregulated in many cancers, including EA. Polymorphisms in three classes of genes were examined for association with risk of EA or BE: miRNA biogenesis genes (157 SNPs, 21 genes); miRNA gene loci (234 SNPs, 210 genes); and miRNA-targeted mRNAs (177 SNPs, 158 genes). Nominal associations (P<0.05) of 29 SNPs with EA risk, and 25 SNPs with BE risk, were observed. None remained significant after correction for multiple comparisons (FDR q>0.50), and we did not find evidence for interactions between variants analyzed and two risk factors for EA/BE (smoking and obesity). This analysis provides the most extensive assessment to date of miRNA-related SNPs in relation to risk of EA and BE. While common genetic variants within components of the miRNA biogenesis core pathway appear unlikely to modulate susceptibility to EA or BE, further studies may be warranted to examine potential associations between unassessed variants in miRNA genes and targets with disease risk.
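
    The multiple-comparison correction mentioned in this abstract can be sketched as follows. The abstract reports FDR q-values but does not name the procedure; the example assumes the Benjamini-Hochberg method, and the p-values are made up for illustration.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

# Hypothetical nominal p-values for four SNPs.
pvals = [0.01, 0.04, 0.03, 0.5]
for p, q in zip(pvals, bh_qvalues(pvals)):
    print(f"p = {p:.3f}  q = {q:.3f}")
```

    A SNP can be nominally significant (P<0.05) and still have a large q-value when many tests are performed, which is the pattern the abstract reports.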

  9. Pseudomonas aeruginosa Genomic Structure and Diversity

    PubMed Central

    Klockgether, Jens; Cramer, Nina; Wiehlmann, Lutz; Davenport, Colin F.; Tümmler, Burkhard

    2011-01-01

    The Pseudomonas aeruginosa genome (G + C content 65–67%, size 5.5–7 Mbp) is made up of a single circular chromosome and a variable number of plasmids. Sequencing of complete genomes or blocks of the accessory genome has revealed that the genome encodes a large repertoire of transporters, transcriptional regulators, and two-component regulatory systems which reflects its metabolic diversity to utilize a broad range of nutrients. The conserved core component of the genome is largely collinear among P. aeruginosa strains and exhibits an interclonal sequence diversity of 0.5–0.7%. Only a few loci of the core genome are subject to diversifying selection. Genome diversity is mainly caused by accessory DNA elements located in 79 regions of genome plasticity that are scattered around the genome and show an anomalous usage of mono- to tetradecanucleotides. Genomic islands of the pKLC102/PAGI-2 family that integrate into tRNALys or tRNAGly genes represent hotspots of inter- and intraclonal genomic diversity. The individual islands differ in their repertoire of metabolic genes that make a large contribution to the pangenome. In order to unravel intraclonal diversity of P. aeruginosa, the genomes of two members of the PA14 clonal complex from diverse habitats and geographic origin were compared. The genome sequences differed by less than 0.01% from each other. One hundred ninety-eight of the 231 single nucleotide substitutions (SNPs) were non-randomly distributed in the genome. Non-synonymous SNPs were mainly found in an integrated Pf1-like phage and in genes involved in transcriptional regulation, membrane and extracellular constituents, transport, and secretion. In summary, P. aeruginosa is endowed with a highly conserved core genome of low sequence diversity and a highly variable accessory genome that communicates with other pseudomonads and genera via horizontal gene transfer. PMID:21808635

  10. The Single Nucleotide Polymorphism Consortium

    NASA Technical Reports Server (NTRS)

    Morgan, Michael

    2003-01-01

    I want to discuss both the Single Nucleotide Polymorphism (SNP) Consortium and the Human Genome Project. I am afraid most of my presentation will be thin on law and possibly too high on rhetoric. Having been engaged in a personal and direct way with these issues as a trained scientist, I find it quite difficult to be always as objective as I ought to be.

  11. Shape memory alloy consortium (SMAC)

    NASA Astrophysics Data System (ADS)

    Jacot, A. Dean

    1999-07-01

    The application of smart structures to helicopter rotors has received widespread study in recent years. This is one of the major thrusts of the Shape Memory Alloy Consortium (SMAC) program. SMAC includes three companies and four universities in a cost-sharing consortium funded under the DARPA Smart Materials and Structures program. This paper describes the objective of the SMAC effort, and its relationship to a previous DARPA smart structure rotorcraft program from which it originated. The SMAC program includes NiTinol fatigue/characterization studies, SMA actuator development, and ferromagnetic SMA material development. The paper summarizes the SMAC effort, and includes background and details on Boeing's development of an SMA torsional actuator for rotorcraft applications. SMA actuation is used to retwist the rotorcraft blade in flight, and results in a significant payload increase for either helicopters or tiltrotors. This paper is also augmented by several other papers in this conference with specific results from other SMAC consortium members.

  12. GAS STORAGE TECHNOLOGY CONSORTIUM

    SciTech Connect

    Robert W. Watson

    2004-04-17

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. To accomplish this objective, the project is divided into three phases that are managed and directed by the GSTC Coordinator. Base funding for the consortium is provided by the U.S. Department of Energy (DOE). In addition, funding is anticipated from the Gas Technology Institute (GTI). The first phase, Phase 1A, was initiated on September 30, 2003, and is scheduled for completion on March 31, 2004. Phase 1A of the project includes the creation of the GSTC structure, development of constitution (by-laws) for the consortium, and development and refinement of a technical approach (work plan) for

  13. GAS STORAGE TECHNOLOGY CONSORTIUM

    SciTech Connect

    Robert W. Watson

    2004-04-23

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. To accomplish this objective, the project is divided into three phases that are managed and directed by the GSTC Coordinator. Base funding for the consortium is provided by the U.S. Department of Energy (DOE). In addition, funding is anticipated from the Gas Technology Institute (GTI). The first phase, Phase 1A, was initiated on September 30, 2003, and is scheduled for completion on March 31, 2004. Phase 1A of the project includes the creation of the GSTC structure, development of constitution (by-laws) for the consortium, and development and refinement of a technical approach (work plan) for

  14. Child Development and Structural Variation in the Human Genome

    ERIC Educational Resources Information Center

    Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.

    2013-01-01

    Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…

  15. International Lymphoma Epidemiology Consortium

    Cancer.gov

    The InterLymph Consortium, or formally the International Consortium of Investigators Working on Non-Hodgkin's Lymphoma Epidemiologic Studies, is an open scientific forum for epidemiologic research in non-Hodgkin's lymphoma.

  16. THE FEDERAL INTEGRATED BIOTREATMENT RESEARCH CONSORTIUM (FLASK TO FIELD)

    EPA Science Inventory

    The Federal Integrated Biotreatment Research Consortium (Flask to Field) represented a 7-year concerted effort by several research laboratories to develop bioremediation technologies for contaminated DoD sites. The consortium structure consisted of a director and four thrust are...

  17. Genome structure analysis of molluscs revealed whole genome duplication and lineage specific repeat variation.

    PubMed

    Yoshida, Masa-aki; Ishikura, Yukiko; Moritaki, Takeya; Shoguchi, Eiichi; Shimizu, Kentaro K; Sese, Jun; Ogura, Atsushi

    2011-09-01

    Comparative genome structure analysis allows us to identify novel genes, repetitive sequences and gene duplications. To explore lineage-specific genomic changes in molluscs, which are a good model for the development of the nervous system in invertebrates, we conducted comparative genome structure analyses of three molluscs (pygmy squid, nautilus and scallop) using partial genome shotgun sequencing. The factors with the greatest effect on genome structural change are repetitive elements (REs), which cause expansion of genome size, and whole genome duplication, which produces large numbers of novel functional genes. Therefore, we investigated the variation and proportion of REs and whole genome duplication. We first identified variations of REs in the three molluscan genomes by homology-based and de novo RE detection. The proportions of REs were 9.2%, 4.0%, and 3.8% in the pygmy squid, nautilus and scallop, respectively. We then estimated the genome sizes of the species as 2.1, 4.2 and 1.8 Gb, respectively, with 2× coverage frequency and DNA sequencing theory. We also performed a gene duplication assay based on coding genes, and found that large-scale duplication events occurred after divergence from the limpet Lottia, an out-group of the three molluscan species. Comparison of all the results suggested that RE expansion was not related to the increase in genome size of nautilus. Despite its close relationship to nautilus, the squid has the largest proportion of REs and a smaller genome size than nautilus. We also identified lineage-specific RE and gene-family expansions, possibly related to acquisition of the most complicated eye and brain systems in the three species.

  18. Genome alignment with graph data structures: a comparison

    PubMed Central

    2014-01-01

    Background Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. Results We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. Conclusion We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools. PMID:24712884

  19. The bioleaching potential of a bacterial consortium.

    PubMed

    Latorre, Mauricio; Cortés, María Paz; Travisany, Dante; Di Genova, Alex; Budinich, Marko; Reyes-Jara, Angélica; Hödar, Christian; González, Mauricio; Parada, Pilar; Bobadilla-Fazzini, Roberto A; Cambiazo, Verónica; Maass, Alejandro

    2016-10-01

    This work presents the molecular foundation of a consortium of five efficient bacterial strains isolated from copper mines and currently used in state-of-the-art industrial-scale biotechnology. The strains Acidithiobacillus thiooxidans Licanantay, Acidiphilium multivorum Yenapatur, Leptospirillum ferriphilum Pañiwe, Acidithiobacillus ferrooxidans Wenelen and Sulfobacillus thermosulfidooxidans Cutipay were selected for genome sequencing based on metal tolerance, oxidation activity and copper bioleaching efficiency. An integrated model of metabolic pathways representing the bioleaching capability of this consortium was generated. Results revealed that greater efficiency in copper recovery may be explained by the higher functional potential of L. ferriphilum Pañiwe and At. thiooxidans Licanantay to oxidize iron and reduced inorganic sulfur compounds. The consortium had a greater capacity to resist copper, arsenic and chloride ion compared to previously described biomining strains. Specialization and particular components in these bacteria provided the consortium a greater ability to bioleach copper sulfide ores. PMID:27416516

  20. Structural Genomics of Minimal Organisms: Pipeline and Results

    SciTech Connect

    Kim, Sung-Hou; Shin, Dong-Hae; Kim, Rosalind; Adams, Paul; Chandonia, John-Marc

    2007-09-14

    The initial objective of the Berkeley Structural Genomics Center was to obtain near-complete three-dimensional (3D) structural information for all soluble proteins of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up covering target selection, cloning, expression, purification, and ultimately structure determination. At the time of this writing, structural information for more than 93% of all soluble proteins of M. genitalium is available. This chapter summarizes the approaches taken by the authors' center.

  1. Structural genomics of eukaryotic targets at a laboratory scale.

    PubMed

    Busso, Didier; Poussin-Courmontagne, Pierre; Rosé, David; Ripp, Raymond; Litt, Alain; Thierry, Jean-Claude; Moras, Dino

    2005-01-01

    Structural genomics programs are distributed worldwide and funded by large institutions such as the NIH in the United States, RIKEN in Japan or the European Commission through the SPINE network in Europe. Such initiatives, essentially managed by large consortia, led to technology and method developments at the different steps required to produce biological samples compatible with structural studies. Besides specific applications, method developments resulted mainly from miniaturization and parallelization. The challenge that academic laboratories face in pursuing structural genomics programs is to produce protein samples at a higher rate. The Structural Biology and Genomics Department (IGBMC - Illkirch - France) is involved in a structural genomics program on higher eukaryotes whose goal is to solve crystal structures of proteins and their complexes (including large complexes) related to human health and biotechnology. To achieve such a challenging goal, the Department has established a medium-throughput pipeline for producing protein samples suitable for structural biology studies. Here, we describe the setting up of our initiative from cloning to crystallization, and we demonstrate that structural genomics may be manageable by academic laboratories through strategic investments in robotics and by adapting classical bench protocols and new developments, in particular in the field of protein expression, to parallelization.

  2. Radiogenomics Consortium (RGC)

    Cancer.gov

    The Radiogenomics Consortium's hypothesis is that a cancer patient's likelihood of developing toxicity to radiation therapy is influenced by common genetic variations, such as single nucleotide polymorphisms (SNPs).

  3. Terragenome: International Soil Metagenome Sequencing Consortium (GSC8 Meeting)

    ScienceCinema

    Jansson, Janet [LBNL]

    2016-07-12

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year "Research Coordination Network" grant from the National Science Foundation and was held at the DOE Joint Genome Institute with organizational support provided by the JGI and by the University of California - San Diego. Janet Jansson of the Lawrence Berkeley National Laboratory discusses the Terragenome Initiative at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 9, 2009.

  4. Structural and Operational Complexity of the Geobacter Sulfurreducens Genome

    SciTech Connect

    Qiu, Yu; Cho, Byung-Kwan; Park, Young S.; Lovley, Derek R.; Palsson, Bernhard O.; Zengler, Karsten

    2010-06-30

    Prokaryotic genomes can be annotated based on their structural, operational, and functional properties. These annotations provide the pivotal scaffold for understanding cellular functions on a genome-scale, such as metabolism and transcriptional regulation. Here, we describe a systems approach to simultaneously determine the structural and operational annotation of the Geobacter sulfurreducens genome. Integration of proteomics, transcriptomics, RNA polymerase, and sigma factor-binding information with deep-sequencing-based analysis of primary 5′-end transcripts allowed for a most precise annotation. The structural annotation is comprised of numerous previously undetected genes, noncoding RNAs, prevalent leaderless mRNA transcripts, and antisense transcripts. When compared with other prokaryotes, we found that the number of antisense transcripts reversely correlated with genome size. The operational annotation consists of 1453 operons, 22% of which have multiple transcription start sites that use different RNA polymerase holoenzymes. Several operons with multiple transcription start sites encoded genes with essential functions, giving insight into the regulatory complexity of the genome. The experimentally determined structural and operational annotations can be combined with functional annotation, yielding a new three-level annotation that greatly expands our understanding of prokaryotic genomes.

  5. Structural and operational complexity of the Geobacter sulfurreducens genome

    PubMed Central

    Qiu, Yu; Cho, Byung-Kwan; Park, Young Seoub; Lovley, Derek; Palsson, Bernhard Ø.; Zengler, Karsten

    2010-01-01

    Prokaryotic genomes can be annotated based on their structural, operational, and functional properties. These annotations provide the pivotal scaffold for understanding cellular functions on a genome-scale, such as metabolism and transcriptional regulation. Here, we describe a systems approach to simultaneously determine the structural and operational annotation of the Geobacter sulfurreducens genome. Integration of proteomics, transcriptomics, RNA polymerase, and sigma factor-binding information with deep-sequencing-based analysis of primary 5′-end transcripts allowed for a most precise annotation. The structural annotation is comprised of numerous previously undetected genes, noncoding RNAs, prevalent leaderless mRNA transcripts, and antisense transcripts. When compared with other prokaryotes, we found that the number of antisense transcripts reversely correlated with genome size. The operational annotation consists of 1453 operons, 22% of which have multiple transcription start sites that use different RNA polymerase holoenzymes. Several operons with multiple transcription start sites encoded genes with essential functions, giving insight into the regulatory complexity of the genome. The experimentally determined structural and operational annotations can be combined with functional annotation, yielding a new three-level annotation that greatly expands our understanding of prokaryotic genomes. PMID:20592237

  6. Breast and Prostate Cancer Cohort Consortium (BPC3)

    Cancer.gov

    Breast and Prostate Cancer Cohort Consortium collaborates with three genomic facilities, epidemiologists, population geneticists, and biostatisticians from multiple institutions to study hormone-related gene variants and environmental factors in breast and prostate cancers.

  7. GAS STORAGE TECHNOLOGY CONSORTIUM

    SciTech Connect

    Robert W. Watson

    2004-10-18

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. To accomplish this objective, the project is divided into three phases that are managed and directed by the GSTC Coordinator. The first phase, Phase 1A, was initiated on September 30, 2003, and was completed on March 31, 2004. Phase 1A of the project included the creation of the GSTC structure, development and refinement of a technical approach (work plan) for deliverability enhancement and reservoir management. This report deals with Phase 1B and encompasses the period July 1, 2004, through September 30, 2004. During this time period there were three main activities. First was the ongoing

  8. GAS STORAGE TECHNOLOGY CONSORTIUM

    SciTech Connect

    Robert W. Watson

    2004-07-15

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. To accomplish this objective, the project is divided into three phases that are managed and directed by the GSTC Coordinator. Base funding for the consortium is provided by the U.S. Department of Energy (DOE). In addition, funding is anticipated from the Gas Technology Institute (GTI). The first phase, Phase 1A, was initiated on September 30, 2003, and was completed on March 31, 2004. Phase 1A of the project included the creation of the GSTC structure, development and refinement of a technical approach (work plan) for deliverability enhancement and reservoir management. 
    This report deals with

  9. Characterization of the Poplar Pan-Genome by Genome-Wide Identification of Structural Variation.

    PubMed

    Pinosio, Sara; Giacomello, Stefania; Faivre-Rampant, Patricia; Taylor, Gail; Jorge, Veronique; Le Paslier, Marie Christine; Zaina, Giusi; Bastien, Catherine; Cattonaro, Federica; Marroni, Fabio; Morgante, Michele

    2016-10-01

    Many recent studies have emphasized the important role of structural variation (SV) in determining human genetic and phenotypic variation. In plants, studies aimed at elucidating the extent of SV are still in their infancy. Evidence has indicated a high presence and an active role of SV in driving plant genome evolution in different plant species. With the aim of characterizing the size and the composition of the poplar pan-genome, we performed a genome-wide analysis of structural variation in three intercrossable poplar species: Populus nigra, Populus deltoides, and Populus trichocarpa. We detected a total of 7,889 deletions and 10,586 insertions relative to the P. trichocarpa reference genome, covering respectively 33.2 Mb and 62.9 Mb of genomic sequence, and 3,230 genes affected by copy number variation (CNV). The majority of the detected variants are inter-specific in agreement with a recent origin following separation of species. Insertions and deletions (INDELs) were preferentially located in low-gene density regions of the poplar genome and were, for the majority, associated with the activity of transposable elements. Genes affected by SV showed lower-than-average expression levels and higher levels of dN/dS, suggesting that they are subject to relaxed selective pressure or correspond to pseudogenes. Functional annotation of genes affected by INDELs showed over-representation of categories associated with transposable elements activity, while genes affected by genic CNVs showed enrichment in categories related to resistance to stress and pathogens. This study provides a genome-wide catalogue of SV and the first insight on functional and structural properties of the poplar pan-genome. PMID:27499133

  10. Characterization of the Poplar Pan-Genome by Genome-Wide Identification of Structural Variation

    PubMed Central

    Pinosio, Sara; Giacomello, Stefania; Faivre-Rampant, Patricia; Taylor, Gail; Jorge, Veronique; Le Paslier, Marie Christine; Zaina, Giusi; Bastien, Catherine; Cattonaro, Federica; Marroni, Fabio; Morgante, Michele

    2016-01-01

    Many recent studies have emphasized the important role of structural variation (SV) in determining human genetic and phenotypic variation. In plants, studies aimed at elucidating the extent of SV are still in their infancy. Evidence has indicated a high presence and an active role of SV in driving plant genome evolution in different plant species. With the aim of characterizing the size and the composition of the poplar pan-genome, we performed a genome-wide analysis of structural variation in three intercrossable poplar species: Populus nigra, Populus deltoides, and Populus trichocarpa. We detected a total of 7,889 deletions and 10,586 insertions relative to the P. trichocarpa reference genome, covering respectively 33.2 Mb and 62.9 Mb of genomic sequence, and 3,230 genes affected by copy number variation (CNV). The majority of the detected variants are inter-specific in agreement with a recent origin following separation of species. Insertions and deletions (INDELs) were preferentially located in low-gene density regions of the poplar genome and were, for the majority, associated with the activity of transposable elements. Genes affected by SV showed lower-than-average expression levels and higher levels of dN/dS, suggesting that they are subject to relaxed selective pressure or correspond to pseudogenes. Functional annotation of genes affected by INDELs showed over-representation of categories associated with transposable elements activity, while genes affected by genic CNVs showed enrichment in categories related to resistance to stress and pathogens. This study provides a genome-wide catalogue of SV and the first insight on functional and structural properties of the poplar pan-genome. PMID:27499133

  11. The effect of the introduction of exogenous strain Acidithiobacillus thiooxidans A01 on functional gene expression, structure and function of indigenous consortium during pyrite bioleaching.

    PubMed

    Liu, Yi; Yin, Huaqun; Zeng, Weimin; Liang, Yili; Liu, Yao; Baba, Ngom; Qiu, Guanzhou; Shen, Li; Fu, Xian; Liu, Xueduan

    2011-09-01

    Acidithiobacillus thiooxidans A01 was added to a consortium of bioleaching bacteria including Acidithiobacillus caldus, Leptospirillum ferriphilum, Acidithiobacillus ferrooxidans, Sulfobacillus thermosulfidooxidans, Acidiphilium spp., and Ferroplasma thermophilum cultured in modified 9K medium containing 0.5% (w/v) pyrite, and a 10.7% increase in bioleaching rate was observed. Changes in community structure and gene expression were monitored with real-time PCR and functional gene arrays (FGAs). Real-time PCR showed that addition of At. thiooxidans increased the numbers of all consortium members except At. caldus, and that At. caldus, L. ferriphilum, and F. thermophilum remained dominant in this community. FGA results showed that after addition of At. thiooxidans, most genes involved in iron, sulfur, carbon, and nitrogen metabolism, metal resistance, electron transport, and extracellular polymeric substances of L. ferriphilum, F. thermophilum, and Acidiphilium spp. were up-regulated, while most of these genes were down-regulated at 70-78 h in At. caldus and up-regulated in At. ferrooxidans, then down-regulated at 82-86 h.

  12. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    NASA Astrophysics Data System (ADS)

    Chechetkin, V. R.; Lobzin, V. V.

    2004-07-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions.
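
    As a rough illustration of the quantity described above, the sketch below computes a spectral entropy for one symbol of a sequence from its Fourier structure factors. The normalization (structure factors over harmonics 1..M/2 rescaled to sum to one, then a Shannon entropy) is an assumption made for this example and may differ from the criteria defined by the authors.

```python
import cmath
import math


def structure_factors(seq, symbol):
    """Squared Fourier amplitudes of the indicator sequence of `symbol`.

    Harmonics n = 1 .. M // 2 are used, with wave number q_n = 2*pi*n / M.
    """
    M = len(seq)
    indicator = [1.0 if s == symbol else 0.0 for s in seq]
    factors = []
    for n in range(1, M // 2 + 1):
        q = 2.0 * math.pi * n / M
        amplitude = sum(x * cmath.exp(-1j * q * m) for m, x in enumerate(indicator))
        factors.append(abs(amplitude) ** 2 / M)
    return factors


def spectral_entropy(seq, symbol):
    """Shannon entropy of the normalized structure factors (assumed form)."""
    factors = structure_factors(seq, symbol)
    total = sum(factors)
    if total == 0.0:  # symbol absent from the sequence
        return 0.0
    return -sum((f / total) * math.log(f / total) for f in factors if f > 0.0)


# A strictly periodic sequence concentrates its spectrum in few harmonics
# (low entropy); a single isolated symbol gives a flat spectrum (high entropy).
ordered = spectral_entropy("ACGT" * 8, "A")
flat = spectral_entropy("A" + "C" * 31, "A")
```

    Under this assumed normalization, lower values indicate stronger periodic ordering, so the periodic example scores well below the flat-spectrum one.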

  13. Consortium Proves Adage.

    ERIC Educational Resources Information Center

    Seidel, Kim

    1997-01-01

    Describes the Minnesota Preparatory Schools, a secondary-level consortium formed by Cotter High School, Saint Mary's University, the Minnesota Academy of Mathematics and Science, De La Salle Language Institute, and the Minnesota Conservatory for the Arts. Indicates that the consortium provides students with flexible schedules geared toward their…

  14. Minnesota Educational Computing Consortium.

    ERIC Educational Resources Information Center

    Haugo, John E.

    The state of Minnesota has established the Minnesota Educational Computing Consortium (MECC) to coordinate the state's educational computing activities. The Consortium is governed by a board of directors representing the State Department of Education, the State Junior Colleges, the State Colleges, the State University and the public and is…

  15. Population-based 3D genome structure analysis reveals driving forces in spatial genome organization

    PubMed Central

    Li, Wenyuan; Kalhor, Reza; Dai, Chao; Hao, Shengli; Gong, Ke; Zhou, Yonggang; Li, Haochen; Zhou, Xianghong Jasmine; Le Gros, Mark A.; Larabell, Carolyn A.; Chen, Lin; Alber, Frank

    2016-01-01

    Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization. PMID:26951677

  16. Data structures and compression algorithms for genomic sequence data

    PubMed Central

    Brandon, Marty C.; Wallace, Douglas C.; Baldi, Pierre

    2009-01-01

    Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function and evolution, but also for the storage, navigation and privacy of genomic data. Here, we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and protecting the data. Results: The general idea is to encode only the differences between a genome sequence and a reference sequence, using absolute or relative coordinates for the location of the differences. These locations and the corresponding differential variants can be encoded into binary strings using various entropy coding methods, from fixed codes such as Golomb and Elias codes, to variable codes, such as Huffman codes. We demonstrate the approach and various tradeoffs using highly variable human mitochondrial genome sequences as a testbed. With only a partial level of optimization, 3615 genome sequences occupying 56 MB in GenBank are compressed down to only 167 KB, achieving a 345-fold compression rate, using the revised Cambridge Reference Sequence as the reference sequence. Using the consensus sequence as the reference sequence, the data can be stored using only 133 KB, corresponding to a 433-fold level of compression, roughly a 23% improvement. Extensions to nuclear genomes and high-throughput sequencing data are discussed. Availability: Data are publicly available from GenBank, the HapMap web site, and the MITOMAP database. Supplementary materials with additional results, statistics, and software implementations are available from http://mammag.web.uci.edu/bin/view/Mitowiki/ProjectDNACompression. Contact: pfbaldi@ics.uci.edu PMID:19447783
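
    The general idea in the abstract above, storing only differences from a reference with relative coordinates and a fixed integer code, can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes substitutions only (equal-length sequences, no insertions or deletions), uses Elias gamma codes for the gaps between variant positions, and uses a hypothetical 2-bit table for the substituted base.

```python
BASE_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}  # illustrative 2-bit table
BITS_BASE = {bits: base for base, bits in BASE_BITS.items()}


def elias_gamma(n):
    """Elias gamma code of a positive integer, returned as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma is defined for integers >= 1")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def encode(seq, ref):
    """Encode `seq` as substitutions relative to `ref` (equal lengths assumed)."""
    if len(seq) != len(ref):
        raise ValueError("this sketch handles substitutions only")
    bits = []
    previous = -1
    for position, (ref_base, base) in enumerate(zip(ref, seq)):
        if ref_base != base:
            bits.append(elias_gamma(position - previous))  # relative coordinate, >= 1
            bits.append(BASE_BITS[base])
            previous = position
    return "".join(bits)


def decode(bits, ref):
    """Rebuild the sequence from the bit string and the reference."""
    seq = list(ref)
    i = 0
    position = -1
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":  # unary length prefix of the gamma code
            zeros += 1
            i += 1
        gap = int(bits[i:i + zeros + 1], 2)
        i += zeros + 1
        position += gap
        seq[position] = BITS_BASE[bits[i:i + 2]]
        i += 2
    return "".join(seq)


reference = "ACGTACGTAC"   # made-up sequences for illustration
sample = "ACGAACGTCC"      # differs from the reference at positions 3 and 8
encoded = encode(sample, reference)
restored = decode(encoded, reference)
```

    Here two substitutions cost 14 bits in total, while a sequence identical to the reference encodes to an empty string; handling indels and choosing between Golomb, Elias or Huffman codes are the tradeoffs the abstract refers to.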

  17. Structural Genomics and Drug Discovery for Infectious Diseases

    SciTech Connect

    Anderson, W.F.

    2010-09-03

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  18. Genome Pool Strategy for Structural Coverage of Protein Families

    SciTech Connect

    Jaroszewski, L.; Slabinski, L.; Wooley, J.; Deacon, A.M.; Lesley, S.A.; Wilson, I.A.; Godzik, A.

    2009-05-18

    Even closely homologous proteins often have different crystallization properties and propensities. This observation can be used to introduce an additional dimension into crystallization trials by simultaneous targeting multiple homologs in what we call a 'genome pool' strategy. We show that this strategy works because protein physicochemical properties correlated with crystallization success have a surprisingly broad distribution within most protein families. There are also easy and difficult families where this distribution is tilted in one direction. This leads to uneven structural coverage of protein families, with more easy ones solved. Increasing the size of the genome pool can improve chances of solving the difficult ones. In contrast, our analysis does not indicate that any specific genomes are easy or difficult. Finally, we show that the group of proteins with known 3D structures is systematically different from the general pool of known proteins and we assess the structural consequences of these differences.
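
    The benefit of enlarging the genome pool can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each homolog crystallizes independently with its own success probability; both the independence assumption and the probability values are hypothetical and are not taken from the paper.

```python
import math


def pool_success_probability(probabilities):
    """Chance that at least one homolog in the pool yields crystals,
    assuming independent trials (an illustrative simplification)."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


# Hypothetical "difficult" family: each homolog succeeds 10% of the time.
single = pool_success_probability([0.1])
pool_of_five = pool_success_probability([0.1] * 5)
```

    With these made-up numbers, the chance of obtaining at least one crystal rises from 0.1 for a single target to about 0.41 for a pool of five homologs.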

  19. The Impact of Structural Genomics: Expectations and Outcomes

    SciTech Connect

    Chandonia, John-Marc; Brenner, Steven E.

    2005-12-21

    Structural Genomics (SG) projects aim to expand our structural knowledge of biological macromolecules, while lowering the average costs of structure determination. We quantitatively analyzed the novelty, cost, and impact of structures solved by SG centers, and contrasted these results with traditional structural biology. The first structure from a protein family is particularly important to reveal the fold and ancient relationships to other proteins. In the last year, approximately half of such structures were solved at an SG center rather than in a traditional laboratory. Furthermore, the cost of solving a structure at the most efficient U.S. center has now dropped to one-quarter the estimated cost of solving a structure by traditional methods. However, top structural biology laboratories are much more efficient than the average, and comparable to SG centers despite working on very challenging structures. Moreover, traditional structural biology papers are cited significantly more often, suggesting greater current impact.

  20. Benefits of Structural Genomics for Drug Discovery Research

    SciTech Connect

    Grabowski, M.; Chruszcz, M; Zimmerman, M; Kirillova, O; Minor, W

    2009-01-01

    While three dimensional structures have long been used to search for new drug targets, only a fraction of new drugs coming to the market has been developed with the use of a structure-based drug discovery approach. However, the recent years have brought not only an avalanche of new macromolecular structures, but also significant advances in the protein structure determination methodology only now making their way into structure-based drug discovery. In this paper, we review recent developments resulting from the Structural Genomics (SG) programs, focusing on the methods and results most likely to improve our understanding of the molecular foundation of human diseases. SG programs have been around for almost a decade, and in that time, have contributed a significant part of the structural coverage of both the genomes of pathogens causing infectious diseases and structurally uncharacterized biological processes in general. Perhaps most importantly, SG programs have developed new methodology at all steps of the structure determination process, not only to determine new structures highly efficiently, but also to screen protein/ligand interactions. We describe the methodologies, experience and technologies developed by SG, which range from improvements to cloning protocols to improved procedures for crystallographic structure solution that may be applied in 'traditional' structural biology laboratories particularly those performing drug discovery. We also discuss the conditions that must be met to convert the present high-throughput structure determination pipeline into a high-output structure-based drug discovery system.

  1. CYP2D6: novel genomic structures and alleles

    PubMed Central

    Kramer, Whitney E.; Walker, Denise L.; O’Kane, Dennis J.; Mrazek, David A.; Fisher, Pamela K.; Dukek, Brian A.; Bruflat, Jamie K.; Black, John L.

    2010-01-01

    Objective CYP2D6 is a polymorphic gene. It has been observed to be deleted, to be duplicated and to undergo recombination events involving the CYP2D7 pseudogene and surrounding sequences. The objective of this study was to discover the genomic structure of CYP2D6 recombinants that interfere with clinical genotyping platforms that are available today. Methods Clinical samples containing rare homozygous CYP2D6 alleles, ambiguous readouts, and those with duplication signals and two different alleles were analyzed by long-range PCR amplification of individual genes, PCR fragment analysis, allele-specific primer extension assay, and DNA sequencing to characterize alleles and genomic structure. Results Novel alleles, genomic structures, and the DNA sequence of these structures are described. Interestingly, in 49 of 50 DNA samples that had CYP2D6 gene duplications or multiplications where two alleles were detected, the chromosome containing the duplication or multiplication had identical tandem alleles. Conclusion Several new CYP2D6 alleles and genomic structures are described which will be useful for CYP2D6 genotyping. The findings suggest that the recombination events responsible for CYP2D6 duplications and multiplications are because of mechanisms other than interchromosomal crossover during meiosis. PMID:19741566

  2. Coevolution of the Organization and Structure of Prokaryotic Genomes.

    PubMed

    Touchon, Marie; Rocha, Eduardo P C

    2016-01-04

    The cytoplasm of prokaryotes contains many molecular machines interacting directly with the chromosome. These vital interactions depend on the chromosome structure, as a molecule, and on the genome organization, as a unit of genetic information. Strong selection for the organization of the genetic elements implicated in these interactions drives replicon ploidy, gene distribution, operon conservation, and the formation of replication-associated traits. The genomes of prokaryotes are also very plastic with high rates of horizontal gene transfer and gene loss. The evolutionary conflicts between plasticity and organization lead to the formation of regions with high genetic diversity whose impact on chromosome structure is poorly understood. Prokaryotic genomes are remarkable documents of natural history because they carry the imprint of all of these selective and mutational forces. Their study allows a better understanding of molecular mechanisms, their impact on microbial evolution, and how they can be tinkered in synthetic biology.

  3. NCI Cohort Consortium Membership

    Cancer.gov

    The NCI Cohort Consortium membership is international and includes investigators responsible for more than 40 high-quality cohorts who are studying large and diverse populations in more than 15 different countries.

  4. Symbolic extensions applied to multiscale structure of genomes.

    PubMed

    Downarowicz, Tomasz; Travisany, Dante; Montecino, Martin; Maass, Alejandro

    2014-06-01

    A genome of a living organism consists of a long string of symbols over a finite alphabet carrying critical information for the organism. This includes its ability to control post natal growth, homeostasis, adaptation to changes in the surrounding environment, or to biochemically respond at the cellular level to various specific regulatory signals. In this sense, a genome represents a symbolic encoding of a highly organized system of information whose functioning may be revealed as a natural multilayer structure in terms of complexity and prominence. In this paper we use the mathematical theory of symbolic extensions as a framework to shed light onto how this multilayer organization is reflected in the symbolic coding of the genome. The distribution of data in an element of a standard symbolic extension of a dynamical system has a specific form: the symbolic sequence is divided into several subsequences (which we call layers) encoding the dynamics on various "scales". We propose that a similar structure resides within the genomes, building our analogy on some of the most recent findings in the field of regulation of genomic DNA functioning. PMID:24728912

  5. Structural classification of proteins and structural genomics: new insights into protein folding and evolution

    PubMed Central

    Andreeva, Antonina; Murzin, Alexey G.

    2010-01-01

    During the past decade, the Protein Structure Initiative (PSI) centres have become major contributors of new families, superfamilies and folds to the Structural Classification of Proteins (SCOP) database. The PSI results have increased the diversity of protein structural space and accelerated our understanding of it. This review article surveys a selection of protein structures determined by the Joint Center for Structural Genomics (JCSG). It presents previously undescribed β-sheet architectures such as the double barrel and spiral β-roll and discusses new examples of unusual topologies and peculiar structural features observed in proteins characterized by the JCSG and other Structural Genomics centres. PMID:20944210

  6. Mitochondrial Disease Sequence Data Resource (MSeqDR): a global grass-roots consortium to facilitate deposition, curation, annotation, and integrated analysis of genomic data for the mitochondrial disease clinical and research communities.

    PubMed

    Falk, Marni J; Shen, Lishuang; Gonzalez, Michael; Leipzig, Jeremy; Lott, Marie T; Stassen, Alphons P M; Diroma, Maria Angela; Navarro-Gomez, Daniel; Yeske, Philip; Bai, Renkui; Boles, Richard G; Brilhante, Virginia; Ralph, David; DaRe, Jeana T; Shelton, Robert; Terry, Sharon F; Zhang, Zhe; Copeland, William C; van Oven, Mannis; Prokisch, Holger; Wallace, Douglas C; Attimonelli, Marcella; Krotoski, Danuta; Zuchner, Stephan; Gai, Xiaowu

    2015-03-01

    Success rates for genomic analyses of highly heterogeneous disorders can be greatly improved if a large cohort of patient data is assembled to enhance collective capabilities for accurate sequence variant annotation, analysis, and interpretation. Indeed, molecular diagnostics requires the establishment of robust data resources to enable data sharing that informs accurate understanding of genes, variants, and phenotypes. The "Mitochondrial Disease Sequence Data Resource (MSeqDR) Consortium" is a grass-roots effort facilitated by the United Mitochondrial Disease Foundation to identify and prioritize specific genomic data analysis needs of the global mitochondrial disease clinical and research community. A central Web portal (https://mseqdr.org) facilitates the coherent compilation, organization, annotation, and analysis of sequence data from both nuclear and mitochondrial genomes of individuals and families with suspected mitochondrial disease. This Web portal provides users with a flexible and expandable suite of resources to enable variant-, gene-, and exome-level sequence analysis in a secure, Web-based, and user-friendly fashion. Users can also elect to share data with other MSeqDR Consortium members, or even the general public, either by custom annotation tracks or through the use of a convenient distributed annotation system (DAS) mechanism. A range of data visualization and analysis tools are provided to facilitate user interrogation and understanding of genomic, and ultimately phenotypic, data of relevance to mitochondrial biology and disease. Currently available tools for nuclear and mitochondrial gene analyses include an MSeqDR GBrowse instance that hosts optimized mitochondrial disease and mitochondrial DNA (mtDNA) specific annotation tracks, as well as an MSeqDR locus-specific database (LSDB) that curates variant data on more than 1300 genes that have been implicated in mitochondrial disease and/or encode mitochondria-localized proteins. MSeqDR is

  7. Genomic structure and evolution of multigene families: "flowers" on the human genome.

    PubMed

    Kim, Hie Lim; Iwase, Mineyo; Igawa, Takeshi; Nishioka, Tasuku; Kaneko, Satoko; Katsura, Yukako; Takahata, Naoyuki; Satta, Yoko

    2012-01-01

    We report the results of an extensive investigation of genomic structures in the human genome, with a particular focus on relatively large repeats (>50 kb) in adjacent chromosomal regions. We named such structures "Flowers" because the pattern observed on dot plots resembles a flower. We detected a total of 291 Flowers in the human genome. They were predominantly located in euchromatic regions. Flowers are gene-rich compared to the average gene density of the genome. Genes involved in systems receiving environmental information, such as immunity and detoxification, were overrepresented in Flowers. Within a Flower, the mean number of duplication units was approximately four. The maximum and minimum identities between homologs in a Flower showed different distributions; the maximum identity was often concentrated to 100% identity, while the minimum identity was evenly distributed in the range of 78% to 100%. Using a gene conversion detection test, we found frequent and/or recent gene conversion events within the tested Flowers. Interestingly, many of those converted regions contained protein-coding genes. Computer simulation studies suggest that one role of such frequent gene conversions is the elongation of the life span of gene families in a Flower by the resurrection of pseudogenes. PMID:22779033

  8. Genome structure of bdelloid rotifers: shaped by asexuality or desiccation?

    PubMed

    Gladyshev, Eugene A; Arkhipova, Irina R

    2010-01-01

    Bdelloid rotifers are microscopic invertebrate animals best known for their ancient asexuality and the ability to survive desiccation at any life stage. Both factors are expected to have a profound influence on their genome structure. Recent molecular studies demonstrated that, although the gene-rich regions of bdelloid genomes are organized as colinear pairs of closely related sequences and depleted in repetitive DNA, subtelomeric regions harbor diverse transposable elements and horizontally acquired genes of foreign origin. Although asexuality is expected to result in depletion of deleterious transposons, only desiccation appears to have the power to produce all the uncovered genomic peculiarities. Repair of desiccation-induced DNA damage would require the presence of a homologous template, maintaining colinear pairs in gene-rich regions and selecting against insertion of repetitive DNA that might cause chromosomal rearrangements. Desiccation may also induce a transient state of competence in recovering animals, allowing them to acquire environmental DNA. Even if bdelloids engage in rare or obscure forms of sexual reproduction, all these features could still be present. The relative contribution of asexuality and desiccation to genome organization may be clarified by analyzing whole-genome sequences and comparing foreign gene and transposon content in species which lost the ability to survive desiccation.

  9. Genome structure and gene content in protist mitochondrial DNAs.

    PubMed

    Gray, M W; Lang, B F; Cedergren, R; Golding, G B; Lemieux, C; Sankoff, D; Turmel, M; Brossard, N; Delage, E; Littlejohn, T G; Plante, I; Rioux, P; Saint-Louis, D; Zhu, Y; Burger, G

    1998-02-15

    Although the collection of completely sequenced mitochondrial genomes is expanding rapidly, only recently has a phylogenetically broad representation of mtDNA sequences from protists (mostly unicellular eukaryotes) become available. This review surveys the 23 complete protist mtDNA sequences that have been determined to date, commenting on such aspects as mitochondrial genome structure, gene content, ribosomal RNA, introns, transfer RNAs and the genetic code and phylogenetic implications. We also illustrate the utility of a comparative genomics approach to gene identification by providing evidence that orfB in plant and protist mtDNAs is the homolog of atp8, the gene in animal and fungal mtDNA that encodes subunit 8 of the F0 portion of mitochondrial ATP synthase. Although several protist mtDNAs, like those of animals and most fungi, are seen to be highly derived, others appear to have retained a number of features of the ancestral, proto-mitochondrial genome. Some of these ancestral features are also shared with plant mtDNA, although the latter have evidently expanded considerably in size, if not in gene content, in the course of evolution. Comparative analysis of protist mtDNAs is providing a new perspective on mtDNA evolution: how the original mitochondrial genome was organized, what genes it contained, and in what ways it must have changed in different eukaryotic phyla.

  10. Structural analysis of hepatitis C RNA genome using DNA microarrays

    PubMed Central

    Martell, María; Briones, Carlos; de Vicente, Aránzazu; Piron, María; Esteban, Juan I.; Esteban, Rafael; Guardia, Jaime; Gómez, Jordi

    2004-01-01

    Many studies have tried to identify specific nucleotide sequences in the quasispecies of hepatitis C virus (HCV) that determine resistance or sensitivity to interferon (IFN) therapy, unfortunately without conclusive results. Although viral proteins represent the most evident phenotype of the virus, genomic RNA sequences determine secondary and tertiary structures which are also part of the viral phenotype and can be involved in important biological roles. In this work, a method of RNA structure analysis has been developed based on the hybridization of labelled HCV transcripts to microarrays of complementary DNA oligonucleotides. Hybridizations were carried out at non-denaturing conditions, using appropriate temperature and buffer composition to allow binding to the immobilized probes of the RNA transcript without disturbing its secondary/tertiary structural motifs. Oligonucleotides printed onto the microarray covered the entire 5′ non-coding region (5′NCR), the first three-quarters of the core region, the E2–NS2 junction and the first 400 nt of the NS3 region. We document the use of this methodology to analyse the structural degree of a large region of HCV genomic RNA in two genotypes associated with different responses to IFN treatment. The results reported here show different structural degree along the genome regions analysed, and differential hybridization patterns for distinct genotypes in NS2 and NS3 HCV regions. PMID:15247323

  11. Genetic contributions to variation in general cognitive function: a meta-analysis of genome-wide association studies in the CHARGE consortium (N=53 949)

    PubMed Central

    Davies, G; Armstrong, N; Bis, J C; Bressler, J; Chouraki, V; Giddaluru, S; Hofer, E; Ibrahim-Verbaas, C A; Kirin, M; Lahti, J; van der Lee, S J; Le Hellard, S; Liu, T; Marioni, R E; Oldmeadow, C; Postmus, I; Smith, A V; Smith, J A; Thalamuthu, A; Thomson, R; Vitart, V; Wang, J; Yu, L; Zgaga, L; Zhao, W; Boxall, R; Harris, S E; Hill, W D; Liewald, D C; Luciano, M; Adams, H; Ames, D; Amin, N; Amouyel, P; Assareh, A A; Au, R; Becker, J T; Beiser, A; Berr, C; Bertram, L; Boerwinkle, E; Buckley, B M; Campbell, H; Corley, J; De Jager, P L; Dufouil, C; Eriksson, J G; Espeseth, T; Faul, J D; Ford, I; Scotland, Generation; Gottesman, R F; Griswold, M E; Gudnason, V; Harris, T B; Heiss, G; Hofman, A; Holliday, E G; Huffman, J; Kardia, S L R; Kochan, N; Knopman, D S; Kwok, J B; Lambert, J-C; Lee, T; Li, G; Li, S-C; Loitfelder, M; Lopez, O L; Lundervold, A J; Lundqvist, A; Mather, K A; Mirza, S S; Nyberg, L; Oostra, B A; Palotie, A; Papenberg, G; Pattie, A; Petrovic, K; Polasek, O; Psaty, B M; Redmond, P; Reppermund, S; Rotter, J I; Schmidt, H; Schuur, M; Schofield, P W; Scott, R J; Steen, V M; Stott, D J; van Swieten, J C; Taylor, K D; Trollor, J; Trompet, S; Uitterlinden, A G; Weinstein, G; Widen, E; Windham, B G; Jukema, J W; Wright, A F; Wright, M J; Yang, Q; Amieva, H; Attia, J R; Bennett, D A; Brodaty, H; de Craen, A J M; Hayward, C; Ikram, M A; Lindenberger, U; Nilsson, L-G; Porteous, D J; Räikkönen, K; Reinvang, I; Rudan, I; Sachdev, P S; Schmidt, R; Schofield, P R; Srikanth, V; Starr, J M; Turner, S T; Weir, D R; Wilson, J F; van Duijn, C; Launer, L; Fitzpatrick, A L; Seshadri, S; Mosley, T H; Deary, I J

    2015-01-01

    General cognitive function is substantially heritable across the human life course from adolescence to old age. We investigated the genetic contribution to variation in this important, health- and well-being-related trait in middle-aged and older adults. We conducted a meta-analysis of genome-wide association studies of 31 cohorts (N=53 949) in which the participants had undertaken multiple, diverse cognitive tests. A general cognitive function phenotype was tested for, and created in each cohort by principal component analysis. We report 13 genome-wide significant single-nucleotide polymorphism (SNP) associations in three genomic regions, 6q16.1, 14q12 and 19q13.32 (best SNP and closest gene, respectively: rs10457441, P=3.93 × 10^−9, MIR2113; rs17522122, P=2.55 × 10^−8, AKAP6; rs10119, P=5.67 × 10^−9, APOE/TOMM40). We report one gene-based significant association with the HMGN1 gene located on chromosome 21 (P=1 × 10^−6). These genes have previously been associated with neuropsychiatric phenotypes. Meta-analysis results are consistent with a polygenic model of inheritance. To estimate SNP-based heritability, the genome-wide complex trait analysis procedure was applied to two large cohorts, the Atherosclerosis Risk in Communities Study (N=6617) and the Health and Retirement Study (N=5976). The proportion of phenotypic variation accounted for by all genotyped common SNPs was 29% (s.e.=5%) and 28% (s.e.=7%), respectively. Using polygenic prediction analysis, ~1.2% of the variance in general cognitive function was predicted in the Generation Scotland cohort (N=5487; P=1.5 × 10^−17). In hypothesis-driven tests, there was significant association between general cognitive function and four genes previously associated with Alzheimer's disease: TOMM40, APOE, ABCG1 and MEF2C. PMID:25644384

  12. Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species

    PubMed Central

    Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

    2008-01-01

    Background The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. Results The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Ginkgo, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. Conclusion The observed differences in genomic structure between C. japonica and other land plants, including

  13. Genome-Wide Approaches for RNA Structure Probing.

    PubMed

    Silverman, Ian M; Berkowitz, Nathan D; Gosai, Sager J; Gregory, Brian D

    2016-01-01

    RNA molecules of all types fold into complex secondary and tertiary structures that are important for their function and regulation. Structural and catalytic RNAs such as ribosomal RNA (rRNA) and transfer RNA (tRNA) are central players in protein synthesis, and only function through their proper folding into intricate three-dimensional structures. Studies of messenger RNA (mRNA) regulation have also revealed that structural elements embedded within these RNA species are important for the proper regulation of their total level in the transcriptome. More recently, the discovery of microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) has shed light on the importance of RNA structure to genome, transcriptome, and proteome regulation. Due to the relatively small number, high conservation, and importance of structural and catalytic RNAs to all life, much early work in RNA structure analysis mapped out a detailed view of these molecules. Computational and physical methods were used in concert with enzymatic and chemical structure probing to create high-resolution models of these fundamental biological molecules. However, the recent expansion in our knowledge of the importance of RNA structure to coding and regulatory RNAs has left the field in need of faster and scalable methods for high-throughput structural analysis. To address this, nuclease and chemical RNA structure probing methodologies have been adapted for genome-wide analysis. These methods have been deployed to globally characterize thousands of RNA structures in a single experiment. Here, we review these experimental methodologies for high-throughput RNA structure determination and discuss the insights gained from each approach. PMID:27256381

  14. Genome databases

    SciTech Connect

    Courteau, J.

    1991-10-11

    Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.

  15. Gene3D: Structural Assignment for Whole Genes and Genomes Using the CATH Domain Structure Database

    PubMed Central

    Buchan, Daniel W.A.; Shepherd, Adrian J.; Lee, David; Pearl, Frances M.G.; Rison, Stuart C.G.; Thornton, Janet M.; Orengo, Christine A.

    2002-01-01

    We present a novel web-based resource, Gene3D, of precalculated structural assignments to gene sequences and whole genomes. This resource assigns structural domains from the CATH database to whole genes and links these to their curated functional and structural annotations within the CATH domain structure database, the functional Dictionary of Homologous Superfamilies (DHS) and PDBsum. Currently Gene3D provides annotation for 36 complete genomes (two eukaryotes, six archaea, and 28 bacteria). On average, between 30% and 40% of the genes of a given genome can be structurally annotated. Matches to structural domains are found using the profile-based method (PSI-BLAST), and a novel protocol, DRange, is used to resolve conflicts in matches involving different homologous superfamilies. PMID:11875040

  16. The impact of extremophiles on structural genomics (and vice versa).

    PubMed

    Jenney, Francis E; Adams, Michael W W

    2008-01-01

    The advent of the complete genome sequences of various organisms in the mid-1990s raised the issue of how one could determine the function of hypothetical proteins. While insight might be obtained from a 3D structure, the chances of being able to predict such a structure are limited for the deduced amino acid sequence of any uncharacterized gene. A template for modeling is required, but there was only a low probability of finding a protein closely related in sequence with an available structure. Thus, in the late 1990s, an international effort known as structural genomics (SG) was initiated, its primary goal to "fill sequence-structure space" by determining the 3D structures of representatives of all known protein families. This was to be achieved mainly by X-ray crystallography and it was estimated that at least 5,000 new structures would be required. While the proteins (genes) for SG have subsequently been derived from hundreds of different organisms, extremophiles and particularly thermophiles have been specifically targeted due to the increased stability and ease of handling of their proteins, relative to those from mesophiles. This review summarizes the significant impact that extremophiles and proteins derived from them have had on SG projects worldwide. To what extent SG has influenced the field of extremophile research is also discussed.

  17. Sequence, structure, function, immunity: structural genomics of costimulation

    PubMed Central

    Chattopadhyay, Kausik; Lazar-Molnar, Eszter; Yan, Qingrong; Rubinstein, Rotem; Zhan, Chenyang; Vigdorovich, Vladimir; Ramagopal, Udupi A.; Bonanno, Jeffrey; Nathenson, Stanley G.; Almo, Steven C.

    2010-01-01

    Summary Costimulatory receptors and ligands trigger the signaling pathways that are responsible for modulating the strength, course and duration of an immune response. High-resolution structures have provided invaluable mechanistic insights by defining the chemical and physical features underlying costimulatory receptor/ligand specificity, affinity, oligomeric state, and valency. Furthermore, these structures revealed general architectural features that are important for the integration of these interactions and their associated signaling pathways into overall cellular physiology. Recent technological advances in structural biology promise unprecedented opportunities for furthering our understanding of the structural features and mechanisms that govern costimulation. In this review we highlight unique insights that have been revealed by structures of costimulatory molecules from the immunoglobulin and tumor necrosis factor superfamilies, and describe a vision for future structural and mechanistic analysis of costimulation. This vision includes simple strategies for the selection of candidate molecules for structure determination and highlights the critical role of structure in the design of mutant costimulatory molecules for the generation of in vivo structure-function correlations in a mammalian model system. This integrated ‘atoms-to-animals’ paradigm provides a comprehensive approach for defining atomic and molecular mechanisms. PMID:19426233

  18. The PlaNet Consortium: A Network of European Plant Databases Connecting Plant Genome Data in an Integrated Biological Knowledge Resource

    PubMed Central

    Ernst, R.; Mayer, K. F. X.

    2004-01-01

    The completion of the Arabidopsis genome and the large collections of other plant sequences generated in recent years have sparked extensive functional genomics efforts. However, the utilization of this data is inefficient, as data sources are distributed and heterogeneous and efforts at data integration are lagging behind. PlaNet aims to overcome the limitations of individual efforts as well as the limitations of heterogeneous, independent data collections. PlaNet is a distributed effort among European bioinformatics groups and plant molecular biologists to establish a comprehensive integrated database in a collaborative network. Objectives are the implementation of infrastructure and data sources to capture plant genomic information into a comprehensive, integrated platform. This will facilitate the systematic exploration of Arabidopsis and other plants. New methods for data exchange, database integration and access are being developed to create a highly integrated, federated data resource for research. The connection between the individual resources is realized with BioMOBY. BioMOBY provides an architecture for the discovery and distribution of biological data through web services. While knowledge is centralized, data is maintained at its primary source without a need for warehousing. To standardize nomenclature and data representation, ontologies and generic data models are defined in interaction with the relevant communities. Minimal data models should make it simple to allow broad integration, while inheritance allows detail and depth to be added to more complex data objects without losing integration. To allow expert annotation and keep databases curated, local and remote annotation interfaces are provided. Easy and direct access to all data is key to the project. PMID:18629059

  19. Genome-wide association study for refractive astigmatism reveals genetic co-determination with spherical equivalent refractive error: the CREAM consortium.

    PubMed

    Li, Qing; Wojciechowski, Robert; Simpson, Claire L; Hysi, Pirro G; Verhoeven, Virginie J M; Ikram, Mohammad Kamran; Höhn, René; Vitart, Veronique; Hewitt, Alex W; Oexle, Konrad; Mäkelä, Kari-Matti; MacGregor, Stuart; Pirastu, Mario; Fan, Qiao; Cheng, Ching-Yu; St Pourcain, Beaté; McMahon, George; Kemp, John P; Northstone, Kate; Rahi, Jugnoo S; Cumberland, Phillippa M; Martin, Nicholas G; Sanfilippo, Paul G; Lu, Yi; Wang, Ya Xing; Hayward, Caroline; Polašek, Ozren; Campbell, Harry; Bencic, Goran; Wright, Alan F; Wedenoja, Juho; Zeller, Tanja; Schillert, Arne; Mirshahi, Alireza; Lackner, Karl; Yip, Shea Ping; Yap, Maurice K H; Ried, Janina S; Gieger, Christian; Murgia, Federico; Wilson, James F; Fleck, Brian; Yazar, Seyhan; Vingerling, Johannes R; Hofman, Albert; Uitterlinden, André; Rivadeneira, Fernando; Amin, Najaf; Karssen, Lennart; Oostra, Ben A; Zhou, Xin; Teo, Yik-Ying; Tai, E Shyong; Vithana, Eranga; Barathi, Veluchamy; Zheng, Yingfeng; Siantar, Rosalynn Grace; Neelam, Kumari; Shin, Youchan; Lam, Janice; Yonova-Doing, Ekaterina; Venturini, Cristina; Hosseini, S Mohsen; Wong, Hoi-Suen; Lehtimäki, Terho; Kähönen, Mika; Raitakari, Olli; Timpson, Nicholas J; Evans, David M; Khor, Chiea-Chuen; Aung, Tin; Young, Terri L; Mitchell, Paul; Klein, Barbara; van Duijn, Cornelia M; Meitinger, Thomas; Jonas, Jost B; Baird, Paul N; Mackey, David A; Wong, Tien Yin; Saw, Seang-Mei; Pärssinen, Olavi; Stambolian, Dwight; Hammond, Christopher J; Klaver, Caroline C W; Williams, Cathy; Paterson, Andrew D; Bailey-Wilson, Joan E; Guggenheim, Jeremy A

    2015-02-01

    To identify genetic variants associated with refractive astigmatism in the general population, meta-analyses of genome-wide association studies were performed for: White Europeans aged at least 25 years (20 cohorts, N = 31,968); Asian subjects aged at least 25 years (7 cohorts, N = 9,295); White Europeans aged <25 years (4 cohorts, N = 5,640); and all independent individuals from the above three samples combined with a sample of Chinese subjects aged <25 years (N = 45,931). Participants were classified as cases with refractive astigmatism if the average cylinder power in their two eyes was at least 1.00 diopter and as controls otherwise. Genome-wide association analysis was carried out for each cohort separately using logistic regression. Meta-analysis was conducted using a fixed effects model. In the older European group the most strongly associated marker was downstream of the neurexin-1 (NRXN1) gene (rs1401327, P = 3.92E-8). No other region reached genome-wide significance, and association signals were lower for the younger European group and Asian group. In the meta-analysis of all cohorts, no marker reached genome-wide significance; the most strongly associated regions were NRXN1 (rs1401327, P = 2.93E-07), TOX (rs7823467, P = 3.47E-07) and LINC00340 (rs12212674, P = 1.49E-06). For 34 markers identified in prior GWAS for spherical equivalent refractive error, the beta coefficients for genotype versus spherical equivalent, and genotype versus refractive astigmatism, were highly correlated (r = -0.59, P = 2.10E-04). This work revealed no consistent or strong genetic signals for refractive astigmatism; however, the TOX gene region previously identified in GWAS for spherical equivalent refractive error was the second most strongly associated region. Analysis of additional markers provided evidence supporting widespread genetic co-susceptibility for spherical and astigmatic refractive errors.
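
    The abstract above states that per-cohort logistic-regression results were combined with a fixed effects model but does not give the weighting scheme. The sketch below assumes the common inverse-variance form, in which each cohort's effect estimate is weighted by one over its squared standard error. The function name and the input numbers are hypothetical and are not taken from the study.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance fixed-effect meta-analysis (assumed scheme, see lead-in).

    betas           -- per-cohort effect estimates (e.g. log odds ratios)
    standard_errors -- per-cohort standard errors of those estimates
    Returns (pooled_beta, pooled_se, z_score, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z_score, p_value


# Hypothetical per-cohort results for a single marker.
example_betas = [0.12, 0.08, 0.15]
example_ses = [0.05, 0.04, 0.10]
beta, se, z, p = fixed_effect_meta(example_betas, example_ses)
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.3g}")
```

    Under this scheme, cohorts with smaller standard errors dominate the pooled estimate, and the pooled standard error is never larger than the smallest cohort standard error.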

  1. Elucidation of operon structures across closely related bacterial genomes.

    PubMed

    Zhou, Chuan; Ma, Qin; Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topologies, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as the ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components. PMID:24959722
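
    The graph construction described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy operons, and the rule used to label a component as a path or a star (degree counts on a tree) are assumptions made for illustration only.

```python
def build_ogg_graph(operons, gene_to_ogg):
    """Build an undirected graph whose nodes are orthologous gene groups (OGGs).

    operons     -- operons pooled from all genomes; each is a list of gene ids
                   in chromosomal order
    gene_to_ogg -- dict mapping every gene id to its OGG id
    Two OGGs are linked if any of their genes are immediate neighbors
    within the same operon in any genome.
    """
    graph = {}
    for operon in operons:
        for gene in operon:
            graph.setdefault(gene_to_ogg[gene], set())
        for left, right in zip(operon, operon[1:]):
            a, b = gene_to_ogg[left], gene_to_ogg[right]
            if a != b:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def connected_components(graph):
    """Return the connected components of the graph as a list of node sets."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def classify_component(graph, component):
    """Label a connected component as 'path', 'star', 'tree' or 'other'."""
    n = len(component)
    degrees = [len(graph[node]) for node in component]
    if sum(degrees) // 2 != n - 1:
        return "other"  # a connected graph is a tree only with n - 1 edges
    if max(degrees, default=0) <= 2:
        return "path"   # a tree with no branching node
    if max(degrees) == n - 1:
        return "star"   # one central node linked to every other node
    return "tree"


# Toy data (hypothetical): gene ids are prefixed by genome, OGG ids are labels.
gene_to_ogg = {
    "a1": "G1", "a2": "G2", "a3": "G3",
    "b2": "G2", "b3": "G3", "b4": "G4",
    "a5": "P", "a6": "S1", "b7": "S2", "b5": "P", "c5": "P", "c8": "S3",
}
operons = [["a1", "a2", "a3"], ["b2", "b3", "b4"],
           ["a5", "a6"], ["b7", "b5"], ["c5", "c8"]]

graph = build_ogg_graph(operons, gene_to_ogg)
labels = sorted(classify_component(graph, c) for c in connected_components(graph))
print(labels)
```

    In the toy data, the groups G1 to G4 chain together across two genomes, while P plays the role of a central node (such as a permease) with three different neighbors in three genomes.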

  2. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    SciTech Connect

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact on structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  3. The genome sequence of the plant pathogen Xylella fastidiosa. The Xylella fastidiosa Consortium of the Organization for Nucleotide Sequencing and Analysis.

    PubMed

    Simpson, A J; Reinach, F C; Arruda, P; Abreu, F A; Acencio, M; Alvarenga, R; Alves, L M; Araya, J E; Baia, G S; Baptista, C S; Barros, M H; Bonaccorsi, E D; Bordin, S; Bové, J M; Briones, M R; Bueno, M R; Camargo, A A; Camargo, L E; Carraro, D M; Carrer, H; Colauto, N B; Colombo, C; Costa, F F; Costa, M C; Costa-Neto, C M; Coutinho, L L; Cristofani, M; Dias-Neto, E; Docena, C; El-Dorry, H; Facincani, A P; Ferreira, A J; Ferreira, V C; Ferro, J A; Fraga, J S; França, S C; Franco, M C; Frohme, M; Furlan, L R; Garnier, M; Goldman, G H; Goldman, M H; Gomes, S L; Gruber, A; Ho, P L; Hoheisel, J D; Junqueira, M L; Kemper, E L; Kitajima, J P; Krieger, J E; Kuramae, E E; Laigret, F; Lambais, M R; Leite, L C; Lemos, E G; Lemos, M V; Lopes, S A; Lopes, C R; Machado, J A; Machado, M A; Madeira, A M; Madeira, H M; Marino, C L; Marques, M V; Martins, E A; Martins, E M; Matsukuma, A Y; Menck, C F; Miracca, E C; Miyaki, C Y; Monteriro-Vitorello, C B; Moon, D H; Nagai, M A; Nascimento, A L; Netto, L E; Nhani, A; Nobrega, F G; Nunes, L R; Oliveira, M A; de Oliveira, M C; de Oliveira, R C; Palmieri, D A; Paris, A; Peixoto, B R; Pereira, G A; Pereira, H A; Pesquero, J B; Quaggio, R B; Roberto, P G; Rodrigues, V; de M Rosa, A J; de Rosa, V E; de Sá, R G; Santelli, R V; Sawasaki, H E; da Silva, A C; da Silva, A M; da Silva, F R; da Silva, W A; da Silveira, J F; Silvestri, M L; Siqueira, W J; de Souza, A A; de Souza, A P; Terenzi, M F; Truffi, D; Tsai, S M; Tsuhako, M H; Vallada, H; Van Sluys, M A; Verjovski-Almeida, S; Vettore, A L; Zago, M A; Zatz, M; Meidanis, J; Setubal, J C

    2000-07-13

    Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis--a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and two plasmids of 51,158 bp and 1,285 bp. We can assign putative functions to 47% of the 2,904 predicted coding regions. Efficient metabolic functions are predicted, with sugars as the principal energy and carbon source, supporting existence in the nutrient-poor xylem sap. The mechanisms associated with pathogenicity and virulence involve toxins, antibiotics and ion sequestration systems, as well as bacterium-bacterium and bacterium-host interactions mediated by a range of proteins. Orthologues of some of these proteins have only been identified in animal and human pathogens; their presence in X. fastidiosa indicates that the molecular basis for bacterial pathogenicity is both conserved and independent of host. At least 83 genes are bacteriophage-derived and include virulence-associated genes from other bacteria, providing direct evidence of phage-mediated horizontal gene transfer. PMID:10910347

  4. Patent protection for structural genomics-related inventions.

    PubMed

    Vinarov, Sara D

    2003-01-01

    Recently there have been some important developments with respect to the patentability of inventions in the field of structural genomics. The leaders of the European Patent Office (EPO), Japan Patent Office (JPO) and the United States Patent and Trademark Office (USPTO) came together for a trilateral meeting to conduct a comparative study on protein 3-dimensional (3-D) structure related claims in an effort to come to a mutual understanding about the examination of such inventions. The three patent offices were presented with eight different cases: 1) 3-D structural data of a protein per se; 2) computer-readable storage medium encoded with structural data of a protein; 3) protein defined by its tertiary structure; 4) crystals of known proteins; 5) binding pockets and protein domains; 6) and 7) in silico screening methods directed to a specific protein; and 8) pharmacophores. The preliminary conclusions reached at the trilateral meeting provide clarity regarding the types of inventions that may be patentable given a specific set of scientific facts in a patent application. Therefore, the guidance provided by this study will help inventors, attorneys and other patent practitioners who file for patent protection on structural genomics-based inventions both here and abroad comply with the patentability requirements of each office.

  5. California Space Grant Consortium

    NASA Technical Reports Server (NTRS)

    Kosmatka, John; Berger, Wolfgang; Wiskerchen, Michael J.

    2005-01-01

    The organizational and administrative structure of the CaSGC has the Consortium Headquarters Office (Principal Investigator - Dr. John Kosmatka, California Statewide Director - Dr. Michael Wiskerchen) at UC San Diego. Each affiliate member institution has a campus director and a scholarship/fellowship selection committee. Each affiliate campus director also serves on the CaSGC Advisory Council and coordinates CMIS data collection and submission. The CaSGC strives to maintain a balance between expanded affiliate membership and continued high quality in targeted program areas of aerospace research, education, workforce development, and public outreach. Associate members are encouraged to participate on a project-by-project basis that meets the needs of California and the goals and objectives of the CaSGC. Associate members have responsibilities relating only to the CaSGC projects they are directly engaged in. Each year, as part of the CaSGC Improvement Plan, the CaSGC Advisory Council evaluates the performance of the affiliate and associate membership in terms of contributions to the CaSGC Strategic Plan. These CaSGC membership evaluations provide a constructive means for elevating productive members and removing non-performing members. This Program Improvement and Results (PIR) report will document CaSGC program improvement results and impacts that directly respond to the specific needs of California in the area of aerospace-related education and human capital development and the Congressional mandate to "increase the understanding, assessment, development and utilization of space resources by promoting a strong education base, responsive research and training activities, and broad and prompt dissemination of knowledge and technology".

  6. The Mitochondrial Genome of Soybean Reveals Complex Genome Structures and Gene Evolution at Intercellular and Phylogenetic Levels

    PubMed Central

    Chang, Shengxin; Wang, Yankun; Lu, Jiangjie; Gai, Junyi; Li, Jijie; Chu, Pu; Guan, Rongzhan; Zhao, Tuanjie

    2013-01-01

    Determining mitochondrial genomes is important for elucidating vital activities of seed plants. Mitochondrial genomes are specific to each plant species because of their variable size, complex structures and patterns of gene losses and gains during evolution. This complexity has made research on the soybean mitochondrial genome difficult compared with its nuclear and chloroplast genomes. The present study helps to solve a 30-year mystery regarding the most complex mitochondrial genome structure, showing that pairwise rearrangements among the many large repeats may produce an enriched molecular pool of 760 circles in seed plants. The soybean mitochondrial genome harbors 58 genes of known function in addition to 52 predicted open reading frames of unknown function. The genome contains sequences of multiple identifiable origins, including 6.8 kb and 7.1 kb DNA fragments that have been transferred from the nuclear and chloroplast genomes, respectively, and some horizontal DNA transfers. The soybean mitochondrial genome has lost 16 genes, including nine protein-coding genes and seven tRNA genes; however, it has acquired five chloroplast-derived genes during evolution. Four tRNA genes, common among the three genomes, are derived from the chloroplast. Sizeable DNA transfers to the nucleus, with pericentromeric regions as hotspots, are observed, including DNA transfers of 125.0 kb and 151.6 kb identified unambiguously from the soybean mitochondrial and chloroplast genomes, respectively. The soybean nuclear genome has acquired five genes from its mitochondrial genome. These results provide biological insights into the mitochondrial genome of seed plants, and are especially helpful for deciphering vital activities in soybean. PMID:23431381

  7. Gene3D: comprehensive structural and functional annotation of genomes.

    PubMed

    Yeats, Corin; Lees, Jonathan; Reid, Adam; Kellam, Paul; Martin, Nigel; Liu, Xinhui; Orengo, Christine

    2008-01-01

    Gene3D provides comprehensive structural and functional annotation of most available protein sequences, including the UniProt, RefSeq and Integr8 resources. The main structural annotation is generated through scanning these sequences against the CATH structural domain database profile-HMM library. CATH is a database of manually derived PDB-based structural domains, placed within a hierarchy reflecting topology, homology and conservation and is able to infer more ancient and divergent homology relationships than sequence-based approaches. This data is supplemented with Pfam-A, other non-domain structural predictions (i.e. coiled coils) and experimental data from UniProt. In order to enhance the investigations possible with this data, we have also incorporated a variety of protein annotation resources, including protein-protein interaction data, GO functional assignments, KEGG pathways, FUNCAT functional descriptions and links to microarray expression data. All of this data can be accessed through a newly re-designed website that has a focus on flexibility and clarity, with searches that can be restricted to a single genome or across the entire sequence database. Currently Gene3D contains over 3.5 million domain assignments for nearly 5 million proteins including 527 completed genomes. This is available at: http://gene3d.biochem.ucl.ac.uk/ PMID:18032434

  8. Glycan array data management at Consortium for Functional Glycomics.

    PubMed

    Venkataraman, Maha; Sasisekharan, Ram; Raman, Rahul

    2015-01-01

    Glycomics, or the study of structure-function relationships of complex glycans, has reshaped post-genomics biology. Glycans mediate fundamental biological functions via their specific interactions with a variety of proteins. Recognizing the importance of glycomics, large-scale research initiatives such as the Consortium for Functional Glycomics (CFG) were established to address these challenges. Over the past decade, the CFG has generated novel reagents and technologies for glycomics analyses, which in turn have led to generation of diverse datasets. These datasets have contributed to understanding glycan diversity and structure-function relationships at molecular (glycan-protein interactions), cellular (gene expression and glycan analysis), and whole organism (mouse phenotyping) levels. Among these analyses and datasets, screening of glycan-protein interactions on glycan array platforms has gained much prominence and has contributed to cross-disciplinary realization of the importance of glycomics in areas such as immunology, infectious diseases, cancer biomarkers, etc. This manuscript outlines methodologies for capturing data from glycan array experiments and online tools to access and visualize glycan array data implemented at the CFG.

  9. Meet me halfway: when genomics meets structural bioinformatics.

    PubMed

    Gong, Sungsam; Worth, Catherine L; Cheng, Tammy M K; Blundell, Tom L

    2011-06-01

    The DNA sequencing technology developed by Frederick Sanger in the 1970s established genomics as the basis of comparative genetics. The recent invention of next-generation sequencing (NGS) platforms has added a new dimension to genome research by generating ultra-fast and high-throughput sequencing data in an unprecedented manner. The advent of NGS technology also provides the opportunity to study genetic diseases where sequence variants or mutations are sought to establish a causal relationship with disease phenotypes. However, it is not a trivial task to seek genetic variants responsible for genetic diseases and even harder for complex diseases such as diabetes and cancers. In such polygenic diseases, multiple genes and alleles, which can exist in healthy individuals, come together to contribute to common disease phenotypes in a complex manner. Hence, it is desirable to have an approach that integrates omics data with both knowledge of protein structure and function and an understanding of networks/pathways, i.e. functional genomics and systems biology; in this way, genotype-phenotype relationships can be better understood. In this review, we bring this 'bottom-up' approach alongside the current NGS-driven genetic study of genetic variations and disease aetiology. We describe experimental and computational techniques for assessing genetic variants and their deleterious effects on protein structure and function. PMID:21350909

  10. Meet me halfway: when genomics meets structural bioinformatics.

    PubMed

    Gong, Sungsam; Worth, Catherine L; Cheng, Tammy M K; Blundell, Tom L

    2011-06-01

    The DNA sequencing technology developed by Frederick Sanger in the 1970s established genomics as the basis of comparative genetics. The recent invention of next-generation sequencing (NGS) platforms has added a new dimension to genome research by generating ultra-fast and high-throughput sequencing data in an unprecedented manner. The advent of NGS technology also provides the opportunity to study genetic diseases where sequence variants or mutations are sought to establish a causal relationship with disease phenotypes. However, it is not a trivial task to seek genetic variants responsible for genetic diseases and even harder for complex diseases such as diabetes and cancers. In such polygenic diseases, multiple genes and alleles, which can exist in healthy individuals, come together to contribute to common disease phenotypes in a complex manner. Hence, it is desirable to have an approach that integrates omics data with both knowledge of protein structure and function and an understanding of networks/pathways, i.e. functional genomics and systems biology; in this way, genotype-phenotype relationships can be better understood. In this review, we bring this 'bottom-up' approach alongside the current NGS-driven genetic study of genetic variations and disease aetiology. We describe experimental and computational techniques for assessing genetic variants and their deleterious effects on protein structure and function.

  11. Primate genome architecture influences structural variation mechanisms and functional consequences.

    PubMed

    Gokcumen, Omer; Tischler, Verena; Tica, Jelena; Zhu, Qihui; Iskow, Rebecca C; Lee, Eunjung; Fritz, Markus Hsi-Yang; Langdon, Amy; Stütz, Adrian M; Pavlidis, Pavlos; Benes, Vladimir; Mills, Ryan E; Park, Peter J; Lee, Charles; Korbel, Jan O

    2013-09-24

    Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages.

  12. Primate genome architecture influences structural variation mechanisms and functional consequences

    PubMed Central

    Gokcumen, Omer; Tischler, Verena; Tica, Jelena; Zhu, Qihui; Iskow, Rebecca C.; Lee, Eunjung; Fritz, Markus Hsi-Yang; Langdon, Amy; Stütz, Adrian M.; Pavlidis, Pavlos; Benes, Vladimir; Mills, Ryan E.; Park, Peter J.; Lee, Charles; Korbel, Jan O.

    2013-01-01

    Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages. PMID:24014587

  13. NCI Cohort Consortium

    Cancer.gov

    The NCI Cohort Consortium is an extramural-intramural partnership formed by the National Cancer Institute to address the need for large-scale collaborations to pool the large quantity of data and biospecimens necessary to conduct a wide range of cancer studies.

  14. The Idaho Consortium.

    ERIC Educational Resources Information Center

    Beaird, James H.

    The Idaho Consortium was established by the state board of education to remedy perceived needs involving insufficient certificated teachers, excessive teacher mobility, shortage of teacher candidates, inadequate inservice training, a low level of administrative leadership, and a lack of programs in special education, early childhood education,…

  15. Genome structure and transcriptional regulation of human coronavirus NL63

    PubMed Central

    Pyrc, Krzysztof; Jebbink, Maarten F; Berkhout, Ben; van der Hoek, Lia

    2004-01-01

    Background: Two human coronaviruses have been known since the 1960s: HCoV-229E and HCoV-OC43. SARS-CoV was discovered in the early spring of 2003, followed by the identification of HCoV-NL63, the fourth member of the Coronaviridae family that infects humans. In this study, we describe the genome structure and the transcription strategy of HCoV-NL63 by experimental analysis of the viral subgenomic mRNAs. Results: The genome of HCoV-NL63 has the following gene order: 1a-1b-S-ORF3-E-M-N. The GC content of the HCoV-NL63 genome is extremely low (34%) compared to other coronaviruses, and we therefore performed additional analysis of the nucleotide composition. Overall, the RNA genome is very low in C and high in U, and this is also reflected in the codon usage. Inspection of the nucleotide composition along the genome indicates that the C-count increases significantly in the last one-third of the genome at the expense of U and G. We document the production of subgenomic (sg) mRNAs coding for the S, ORF3, E, M and N proteins. We did not detect any additional sg mRNA. Furthermore, we sequenced the 5' end of all sg mRNAs, confirming the presence of an identical leader sequence in each sg mRNA. Northern blot analysis indicated that the expression level among the sg mRNAs differs significantly, with the sg mRNA encoding nucleocapsid (N) being the most abundant. Conclusions: The presented data give insight into the viral evolution and mutational patterns in the coronaviral genome. Furthermore, our data show that HCoV-NL63 employs the discontinuous replication strategy with generation of subgenomic mRNAs during the (-) strand synthesis. Because HCoV-NL63 has a low pathogenicity and is able to grow easily in cell culture, this virus can be a powerful tool to study SARS coronavirus pathogenesis. PMID:15548333
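    The kind of nucleotide-composition analysis mentioned in this abstract (overall GC content, and composition along the genome) can be illustrated with a short sketch. The function names, window size, and toy sequence below are assumptions made for illustration; the abstract does not state how the authors computed their profiles.

```python
from collections import Counter

BASES = "ACGU"


def base_fractions(seq):
    """Return the fraction of A, C, G and U in a sequence.

    DNA input is accepted too: T is treated as U. Characters other than
    the four bases (e.g. N) are ignored in the denominator.
    """
    counts = Counter(seq.upper().replace("T", "U"))
    total = sum(counts[base] for base in BASES)
    if total == 0:
        return {base: 0.0 for base in BASES}
    return {base: counts[base] / total for base in BASES}


def gc_content(seq):
    """Fraction of G plus C among the recognized bases."""
    fractions = base_fractions(seq)
    return fractions["G"] + fractions["C"]


def sliding_composition(seq, window, step):
    """Composition in successive windows, as (start, fractions) pairs."""
    return [
        (start, base_fractions(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence (not real HCoV-NL63 data): U-rich start, C-rich end.
toy = "UUUUAUUGUUUA" + "CCACCCGCCUCC"
print(round(gc_content(toy), 3))
for start, fractions in sliding_composition(toy, window=12, step=12):
    print(start, round(fractions["C"], 3), round(fractions["U"], 3))
```

    Plotting the per-window C and U fractions against the window start position would give a profile of the sort the abstract describes, where a rise in C toward the 3' end shows up as a shift between windows.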

  16. Genetic determinants of heel bone properties: genome-wide association meta-analysis and replication in the GEFOS/GENOMOS consortium.

    PubMed

    Moayyeri, Alireza; Hsu, Yi-Hsiang; Karasik, David; Estrada, Karol; Xiao, Su-Mei; Nielson, Carrie; Srikanth, Priya; Giroux, Sylvie; Wilson, Scott G; Zheng, Hou-Feng; Smith, Albert V; Pye, Stephen R; Leo, Paul J; Teumer, Alexander; Hwang, Joo-Yeon; Ohlsson, Claes; McGuigan, Fiona; Minster, Ryan L; Hayward, Caroline; Olmos, José M; Lyytikäinen, Leo-Pekka; Lewis, Joshua R; Swart, Karin M A; Masi, Laura; Oldmeadow, Chris; Holliday, Elizabeth G; Cheng, Sulin; van Schoor, Natasja M; Harvey, Nicholas C; Kruk, Marcin; del Greco M, Fabiola; Igl, Wilmar; Trummer, Olivia; Grigoriou, Efi; Luben, Robert; Liu, Ching-Ti; Zhou, Yanhua; Oei, Ling; Medina-Gomez, Carolina; Zmuda, Joseph; Tranah, Greg; Brown, Suzanne J; Williams, Frances M; Soranzo, Nicole; Jakobsdottir, Johanna; Siggeirsdottir, Kristin; Holliday, Kate L; Hannemann, Anke; Go, Min Jin; Garcia, Melissa; Polasek, Ozren; Laaksonen, Marika; Zhu, Kun; Enneman, Anke W; McEvoy, Mark; Peel, Roseanne; Sham, Pak Chung; Jaworski, Maciej; Johansson, Åsa; Hicks, Andrew A; Pludowski, Pawel; Scott, Rodney; Dhonukshe-Rutten, Rosalie A M; van der Velde, Nathalie; Kähönen, Mika; Viikari, Jorma S; Sievänen, Harri; Raitakari, Olli T; González-Macías, Jesús; Hernández, Jose L; Mellström, Dan; Ljunggren, Osten; Cho, Yoon Shin; Völker, Uwe; Nauck, Matthias; Homuth, Georg; Völzke, Henry; Haring, Robin; Brown, Matthew A; McCloskey, Eugene; Nicholson, Geoffrey C; Eastell, Richard; Eisman, John A; Jones, Graeme; Reid, Ian R; Dennison, Elaine M; Wark, John; Boonen, Steven; Vanderschueren, Dirk; Wu, Frederick C W; Aspelund, Thor; Richards, J Brent; Bauer, Doug; Hofman, Albert; Khaw, Kay-Tee; Dedoussis, George; Obermayer-Pietsch, Barbara; Gyllensten, Ulf; Pramstaller, Peter P; Lorenc, Roman S; Cooper, Cyrus; Kung, Annie Wai Chee; Lips, Paul; Alen, Markku; Attia, John; Brandi, Maria Luisa; de Groot, Lisette C P G M; Lehtimäki, Terho; Riancho, José A; Campbell, Harry; Liu, Yongmei; Harris, Tamara B; Akesson, Kristina; Karlsson, Magnus; Lee, Jong-Young; Wallaschofski, Henri; Duncan, Emma L; O'Neill, Terence W; Gudnason, Vilmundur; Spector, Timothy D; Rousseau, François; Orwoll, Eric; Cummings, Steven R; Wareham, Nick J; Rivadeneira, Fernando; Uitterlinden, Andre G; Prince, Richard L; Kiel, Douglas P; Reeve, Jonathan; Kaptoge, Stephen K

    2014-06-01

    Quantitative ultrasound of the heel captures heel bone properties that independently predict fracture risk and, with bone mineral density (BMD) assessed by X-ray (DXA), may be convenient alternatives for evaluating osteoporosis and fracture risk. We performed a meta-analysis of genome-wide association (GWA) studies to assess the genetic determinants of heel broadband ultrasound attenuation (BUA; n = 14 260), velocity of sound (VOS; n = 15 514) and BMD (n = 4566) in 13 discovery cohorts. Independent replication involved seven cohorts with GWA data (in silico n = 11 452) and new genotyping in 15 cohorts (de novo n = 24 902). In combined random-effects meta-analysis of the discovery and replication cohorts, nine single nucleotide polymorphisms (SNPs) had genome-wide significant (P < 5 × 10^-8) associations with heel bone properties. Alongside SNPs within or near previously identified osteoporosis susceptibility genes including ESR1 (6q25.1: rs4869739, rs3020331, rs2982552), SPTBN1 (2p16.2: rs11898505), RSPO3 (6q22.33: rs7741021), WNT16 (7q31.31: rs2908007), DKK1 (10q21.1: rs7902708) and GPATCH1 (19q13.11: rs10416265), we identified a new locus on chromosome 11q14.2 (rs597319 close to TMEM135, a gene recently linked to osteoblastogenesis and longevity) significantly associated with both BUA and VOS (P < 8.23 × 10^-14). In meta-analyses involving 25 cohorts with up to 14 985 fracture cases, six of 10 SNPs associated with heel bone properties at P < 5 × 10^-6 also had the expected direction of association with any fracture (P < 0.05), including three SNPs with P < 0.005: 6q22.33 (rs7741021), 7q31.31 (rs2908007) and 10q21.1 (rs7902708). In conclusion, this GWA study reveals the effect of several genes common to central DXA-derived BMD and heel ultrasound/DXA measures and points to a new genetic locus with potential implications for better understanding of osteoporosis pathophysiology.
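    To make the idea of a random-effects meta-analysis with a genome-wide significance threshold concrete, here is a small sketch for a single SNP. It assumes the DerSimonian-Laird estimator of between-cohort variance, which is a common choice but is not named in the abstract; the function name and the per-cohort effect sizes are invented for illustration.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # threshold quoted in the abstract


def random_effects_meta(betas, standard_errors):
    """Pool per-cohort effect estimates for one SNP.

    Uses inverse-variance weights with a DerSimonian-Laird estimate of
    the between-cohort variance (an assumption of this sketch).
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    weight_sum = sum(weights)
    fixed = sum(w * b for w, b in zip(weights, betas)) / weight_sum

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, betas))
    k = len(betas)
    c = weight_sum - sum(w ** 2 for w in weights) / weight_sum
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each cohort's variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    re_sum = sum(re_weights)
    pooled = sum(w * b for w, b in zip(re_weights, betas)) / re_sum
    pooled_se = math.sqrt(1.0 / re_sum)

    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return pooled, pooled_se, p_value


# Invented per-cohort estimates for one hypothetical SNP.
beta, se, p = random_effects_meta([0.10, 0.10, 0.10], [0.01, 0.01, 0.01])
print(beta, se, p < GENOME_WIDE_ALPHA)
```

    When the cohorts agree closely, the between-cohort variance estimate falls to zero and the result coincides with a fixed-effect analysis; heterogeneous cohorts widen the pooled standard error and make the threshold harder to reach.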

  17. Structural Genomics: From Genes to Structures With Valuable Materials And Many Questions in Between

    SciTech Connect

    Fox, B.G.; Goulding, C.; Malkowski, M.G.; Stewart, L.; Deacon, A.; /SLAC, SSRL

    2009-04-30

    The Protein Structure Initiative (PSI), funded by the US National Institutes of Health (NIH), provides a framework for the development and systematic evaluation of methods to solve protein structures. Although the PSI and other structural genomics efforts around the world have led to the solution of many new protein structures as well as the development of new methods, methodological bottlenecks still exist and are being addressed in this 'production phase' of PSI.

  18. X-ray scattering data and structural genomics

    NASA Astrophysics Data System (ADS)

    Doniach, Sebastian

    2003-03-01

    High throughput structural genomics has the ambitious goal of determining the structure of all, or a very large number of protein folds using the high-resolution techniques of protein crystallography and NMR. However, the program is facing significant bottlenecks in reaching this goal, which include problems of protein expression and crystallization. In this talk, some preliminary results on how the low-resolution technique of small-angle X-ray solution scattering (SAXS) can help ameliorate some of these bottlenecks will be presented. One of the most significant bottlenecks arises from the difficulty of crystallizing integral membrane proteins, where only a handful of structures are available compared to thousands of structures for soluble proteins. By 3-dimensional reconstruction from SAXS data, the size and shape of detergent-solubilized integral membrane proteins can be characterized. This information can then be used to classify membrane proteins which constitute some 25% of all genomes. SAXS may also be used to study the dependence of interparticle interference scattering on solvent conditions so that regions of the protein solution phase diagram which favor crystallization can be elucidated. As a further application, SAXS may be used to provide physical constraints on computational methods for protein structure prediction based on primary sequence information. This in turn can help in identifying structural homologs of a given protein, which can then give clues to its function. D. Walther, F. Cohen and S. Doniach. "Reconstruction of low resolution three-dimensional density maps from one-dimensional small angle x-ray scattering data for biomolecules." J. Appl. Cryst. 33(2):350-363 (2000). W.J. Zheng and S. Doniach. "Protein structure prediction constrained by solution X-ray scattering data and structural homology identification." J. Mol. Biol. 316(1):173-187 (2002).

  19. Mitochondrial Disease Sequence Data Resource (MSeqDR): A global grass-roots consortium to facilitate deposition, curation, annotation, and integrated analysis of genomic data for the mitochondrial disease clinical and research communities

    PubMed Central

    Falk, Marni J.; Shen, Lishuang; Gonzalez, Michael; Leipzig, Jeremy; Lott, Marie T.; Stassen, Alphons P.M.; Diroma, Maria Angela; Navarro-Gomez, Daniel; Yeske, Philip; Bai, Renkui; Boles, Richard G.; Brilhante, Virginia; Ralph, David; DaRe, Jeana T.; Shelton, Robert; Terry, Sharon; Zhang, Zhe; Copeland, William C.; van Oven, Mannis; Prokisch, Holger; Wallace, Douglas C.; Attimonelli, Marcella; Krotoski, Danuta; Zuchner, Stephan; Gai, Xiaowu

    2014-01-01

    Success rates for genomic analyses of highly heterogeneous disorders can be greatly improved if a large cohort of patient data is assembled to enhance collective capabilities for accurate sequence variant annotation, analysis, and interpretation. Indeed, molecular diagnostics requires the establishment of robust data resources to enable data sharing that informs accurate understanding of genes, variants, and phenotypes. The “Mitochondrial Disease Sequence Data Resource (MSeqDR) Consortium” is a grass-roots effort facilitated by the United Mitochondrial Disease Foundation to identify and prioritize specific genomic data analysis needs of the global mitochondrial disease clinical and research community. A central Web portal (https://mseqdr.org) facilitates the coherent compilation, organization, annotation, and analysis of sequence data from both nuclear and mitochondrial genomes of individuals and families with suspected mitochondrial disease. This Web portal provides users with a flexible and expandable suite of resources to enable variant-, gene-, and exome-level sequence analysis in a secure, Web-based, and user-friendly fashion. Users can also elect to share data with other MSeqDR Consortium members, or even the general public, either by custom annotation tracks or through use of a convenient distributed annotation system (DAS) mechanism. A range of data visualization and analysis tools are provided to facilitate user interrogation and understanding of genomic, and ultimately phenotypic, data of relevance to mitochondrial biology and disease. Currently available tools for nuclear and mitochondrial gene analyses include an MSeqDR GBrowse instance that hosts optimized mitochondrial disease and mitochondrial DNA (mtDNA) specific annotation tracks, as well as an MSeqDR locus-specific database (LSDB) that curates variant data on more than 1,300 genes that have been implicated in mitochondrial disease and/or encode mitochondria-localized proteins. MSeqDR is

  20. The evolution of chloroplast genome structure in ferns.

    PubMed

    Wolf, Paul G; Roper, Jessie M; Duffy, Aaron M

    2010-09-01

    The plastid genome (plastome) is a rich source of phylogenetic and other comparative data in plants. Most land plants possess a plastome of similar structure. However, in a major group of plants, the ferns, a unique plastome structure has evolved. The gene order in ferns has been explained by a series of genomic inversions relative to the plastome organization of seed plants. Here, we examine for the first time the structure of the plastome across fern phylogeny. We used a PCR-based strategy to map and partially sequence plastomes. We found that a pair of partially overlapping inversions in the region of the inverted repeat occurred in the common ancestor of most ferns. However, the ancestral (seed plant) structure is still found in early diverging branches leading to the osmundoid and filmy fern lineages. We found that a second pair of overlapping inversions occurred on a branch leading to the core leptosporangiates. We also found that the unique placement of the gene matK in ferns (lacking a flanking intron) is not a result of a large-scale inversion, as previously thought. This is because the intron loss maps to an earlier point on the phylogeny than the nearby inversion. We speculate on why inversions may occur in pairs and what this may mean for the dynamics of plastome evolution.

  1. The impact of structural genomics: the first quindecennial.

    PubMed

    Grabowski, Marek; Niedzialkowska, Ewa; Zimmerman, Matthew D; Minor, Wladek

    2016-03-01

    The period 2000-2015 brought the advent of high-throughput approaches to protein structure determination. With overall funding on the order of $2 billion (in 2010 dollars), the structural genomics (SG) consortia established worldwide have developed pipelines for target selection, protein production, sample preparation, crystallization, and structure determination by X-ray crystallography and NMR. These efforts resulted in the determination of over 13,500 protein structures, mostly from unique protein families, and increased the structural coverage of the expanding protein universe. SG programs contributed over 4400 publications to the scientific literature. The NIH-funded Protein Structure Initiatives alone have produced over 2000 scientific publications, which to date have attracted more than 93,000 citations. Software and database developments that were necessary to handle high-throughput structure determination workflows have led to structures of better quality and improved integrity of the associated data. Organized and accessible data have a positive impact on the reproducibility of scientific experiments. Most of the experimental data generated by the SG centers are freely available to the community and have been utilized by scientists in various fields of research. SG projects have created, improved, streamlined, and validated many protocols for protein production and crystallization, data collection, and functional analysis, significantly benefiting biological and biomedical research. PMID:26935210

  2. Refining the Structure and Content of Clinical Genomic Reports

    PubMed Central

    DORSCHNER, MICHAEL O.; AMENDOLA, LAURA M.; SHIRTS, BRIAN H.; KIEDROWSKI, LESLI; SALAMA, JOSEPH; GORDON, ADAM S.; FULLERTON, STEPHANIE M.; TARCZY-HORNOCH, PETER; BYERS, PETER H.; JARVIK, GAIL P.

    2014-01-01

    To effectively articulate the results of exome and genome sequencing, we refined the structure and content of molecular test reports. To communicate results of a randomized controlled trial aimed at the evaluation of exome sequencing for clinical medicine, we developed a structured narrative report. With feedback from genetics and non-genetics professionals, we developed separate indication-specific and incidental findings reports. Standard test report elements were supplemented with research study-specific language, which highlighted the limitations of exome sequencing and provided detailed, structured results and interpretations. The report format we developed to communicate research results can easily be transformed for clinical use by removal of research-specific statements and disclaimers. The development of clinical reports for exome sequencing has shown that accurate and open communication between the clinician and laboratory is ideally an ongoing process to address the increasing complexity of molecular genetic testing. PMID:24616401

  3. Population structure and minimum core genome typing of Legionella pneumophila

    PubMed Central

    Qin, Tian; Zhang, Wen; Liu, Wenbin; Zhou, Haijian; Ren, Hongyu; Shao, Zhujun; Lan, Ruiting; Xu, Jianguo

    2016-01-01

    Legionella pneumophila is an important human pathogen causing Legionnaires’ disease. In this study, whole genome sequencing (WGS) was used to study the characteristics and population structure of L. pneumophila strains. We sequenced and compared 53 isolates of L. pneumophila covering different serogroups and sequence-based typing (SBT) types (STs). We found that 1,896 single-copy orthologous genes were shared by all isolates and were defined as the minimum core genome (MCG) of L. pneumophila. A total of 323,224 single-nucleotide polymorphisms (SNPs) were identified among the 53 strains. After excluding 314,059 SNPs which were likely to be results of recombination, the remaining 9,165 SNPs were referred to as MCG SNPs. Population structure analysis based on MCG divided the 53 L. pneumophila isolates into nine MCG groups. The within-group distances were much smaller than the between-group distances, indicating considerable divergence between MCG groups. MCG groups were also supported by phylogenetic analysis and may be considered as robust taxonomic units within L. pneumophila. Among the nine MCG groups, eight showed high intracellular growth ability while one showed low intracellular growth ability. Furthermore, MCG typing also showed high resolution in subtyping ST1 strains. The results obtained in this study provided significant insights into the evolution, population structure and pathogenicity of L. pneumophila. PMID:26888563
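
    As a quick consistency check on the SNP figures quoted in this abstract, the filtering step can be reproduced with simple arithmetic. This is only an illustrative sketch: the variable names are hypothetical, and nothing here reflects how the authors actually detected recombination.

```python
# Figures quoted in the abstract above; variable names are illustrative only.
total_snps = 323_224         # SNPs identified among the 53 isolates
recombinant_snps = 314_059   # SNPs attributed to recombination and excluded

# SNPs retained for minimum core genome (MCG) typing
mcg_snps = total_snps - recombinant_snps

# Share of all SNPs that the recombination filter removed
excluded_fraction = recombinant_snps / total_snps

print(f"MCG SNPs retained: {mcg_snps}")
print(f"Fraction excluded: {excluded_fraction:.1%}")
```

    The subtraction agrees with the 9,165 MCG SNPs reported, and it shows that the large majority of the observed SNPs were set aside as likely recombination products.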

  4. Kansas Wind Energy Consortium

    SciTech Connect

    Gruenbacher, Don

    2015-12-31

    This project addresses both fundamental and applied research problems that bear on the challenges defined by the DOE “20% Wind by 2030 Report”. In particular, this work focuses on increasing the capacity of small or community wind generation capabilities that would be operated in a distributed generation approach. A consortium (KWEC – Kansas Wind Energy Consortium) of researchers from Kansas State University and Wichita State University aims to dramatically increase the penetration of wind energy via distributed wind power generation. We believe distributed generation through wind power will play a critical role in the ability to reach and extend the renewable energy production targets set by the Department of Energy. KWEC aims to find technical and economic solutions to enable widespread implementation of distributed renewable energy resources that would apply to wind.

  5. Discovery and genotyping of genome structural polymorphism by sequencing on a population scale

    PubMed Central

    Handsaker, Robert E.; Korn, Joshua M.; Nemesh, James; McCarroll, Steven A.

    2016-01-01

    Accurate and complete analysis of genome variation in large populations will be required to understand the role of genome variation in complex disease. We present an analytical framework for characterizing genome deletion polymorphism in populations, using sequence data that are distributed across hundreds or thousands of genomes. Our approach uses population-level relationships to re-interpret the technical features of sequence data that often reflect structural variation. In the 1000 Genomes Project pilot, this approach identified deletion polymorphism across 168 genomes (sequenced at 4x average coverage) with sensitivity and specificity unmatched by other algorithms. We also describe a way to determine the allelic state or genotype of each deletion polymorphism in each genome; the 1000 Genomes Project used this approach to type 13,826 deletion polymorphisms (48 bp – 960 kbp) at high accuracy in populations. These methods offer a way to relate genome structural polymorphism to complex disease in populations. PMID:21317889

  6. The Complete Chloroplast Genome Sequence of Podocarpus lambertii: Genome Structure, Evolutionary Aspects, Gene Content and SSR Detection

    PubMed Central

    Vieira, Leila do Nascimento; Faoro, Helisson; Rogalski, Marcelo; Fraga, Hugo Pacheco de Freitas; Cardoso, Rodrigo Luis Alves; de Souza, Emanuel Maltempi; de Oliveira Pedrosa, Fábio; Nodari, Rubens Onofre; Guerra, Miguel Pedro

    2014-01-01

    Background Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. Methodology/Principal Findings The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. Conclusion The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in the Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of this genus. PMID

  7. The effect of heavy metals on microbial community structure of a sulfidogenic consortium in anaerobic semi-continuous stirred tank reactors.

    PubMed

    Kieu, Hoa T Q; Horn, Harald; Müller, Elisabeth

    2014-03-01

    The effect of heavy metals on community structure of a heavy metal tolerant sulfidogenic consortium was evaluated by using a combination of denaturing gradient gel electrophoresis (DGGE) of 16S rRNA gene and dissimilatory sulfite reductase (dsrB) gene fragments, 16S rRNA gene cloning analysis and fluorescence in situ hybridization (FISH). For this purpose, four anaerobic semi-continuous stirred tank reactors (referred to as R1-R4) were run in parallel for 12 weeks at heavy metal loading rates of 1.5, 3, 4.5 and 7.5 mg l(-1) d(-1) each of Cu(2+), Ni(2+), Zn(2+), and Cr(6+), respectively. The abundance ratio of Desulfovibrio vulgaris detected by FISH to total cell counts was consistent with the obtained results of cloning and DGGE. This indicated that D. vulgaris was dominant in all analyzed samples and played a key role in heavy metal removal in R1, R2, and R3. In contrast, after 4 weeks of operation of R4, a distinct biomass loss was observed and no positive hybridized cells were detected by specific probes for the domain Bacteria, sulfate-reducing bacteria and D. vulgaris. High removal efficiencies of heavy metals were achieved in R1, R2 and R3 after 12 weeks, whereas the precipitation of heavy metals in R4 was significantly decreased after 4 weeks and almost not observed after 6 weeks of operation. In addition, the anaerobic bacteria, such as Petrimonas sulfuriphila, Clostridium sp., Citrobacter amalonaticus, and Klebsiella sp., identified from DGGE bands and clone library were hypothesized as heavy metal resistant bacteria at a loading rate of 1.5 mg l(-1) d(-1) of Cu(2+), Ni(2+), Zn(2+), and Cr(6+).

  8. The complete mitochondrial genome structure of snow leopard Panthera uncia.

    PubMed

    Wei, Lei; Wu, Xiaobing; Jiang, Zhigang

    2009-05-01

    The complete mitochondrial genome (mtDNA) of snow leopard Panthera uncia was obtained by using the polymerase chain reaction (PCR) technique based on the PCR fragments of 30 primers we designed. The entire mtDNA sequence was 16 773 base pairs (bp) in length, and the base composition was: A-5,357 bp (31.9%); C-4,444 bp (26.5%); G-2,428 bp (14.5%); T-4,544 bp (27.1%). The structural characteristics of the P. uncia mitochondrial genome were highly similar to those of Felis catus, Acinonyx jubatus, Neofelis nebulosa and other mammals. However, we found several distinctive features of the mitochondrial genome of Panthera uncia. First, the termination codon of COIII was TAA, which differed from those of F. catus, A. jubatus and N. nebulosa. Second, tRNA(Ser)(AGY), which lacked the "DHU" arm, could not be folded into the typical cloverleaf-shaped structure. Third, in the control region, a long repetitive sequence in the RS-2 (32 bp) region was found with 2 repeats while one short repetitive segment (9 bp) was found with 15 repeats in the RS-3 region. We performed phylogenetic analysis based on a 3 816 bp concatenated sequence of 12S rRNA, 16S rRNA, ND2, ND4, ND5, Cyt b and ATP8 for P. uncia and other related species; the result indicated that P. uncia and P. leo were sister species, which differed from previous findings. PMID:18431688
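
    The base composition reported for the P. uncia mitochondrial genome can be checked against the stated genome length with a few lines of arithmetic. The sketch below uses only the counts quoted in the abstract; the variable names are illustrative, and the GC-content figure is derived here rather than reported by the authors.

```python
# Base counts and genome length as quoted in the abstract above.
counts = {"A": 5357, "C": 4444, "G": 2428, "T": 4544}
reported_length = 16_773

# The four base counts should add up to the full genome length.
total = sum(counts.values())

# Percent composition per base, rounded to one decimal as in the abstract.
percentages = {base: round(100 * n / total, 1) for base, n in counts.items()}

# GC content is a derived quantity, not a figure stated in the abstract.
gc_content = 100 * (counts["G"] + counts["C"]) / total

print(total == reported_length)
print(percentages)
print(f"GC content: {gc_content:.1f}%")
```

    The counts sum to the reported 16 773 bp, and the rounded percentages match those given in the record.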

  10. Recognizing genes and other components of genomic structure

    SciTech Connect

    Burks, C.; Myers, E. (Dept. of Computer Science); Stormo, G.D. (Dept. of Molecular, Cellular and Developmental Biology)

    1991-01-01

    The Aspen Center for Physics (ACP) sponsored a three-week workshop, with 26 scientists participating, from 28 May to 15 June, 1990. The workshop, entitled Recognizing Genes and Other Components of Genomic Structure, focussed on discussion of current needs and future strategies for developing the ability to identify and predict the presence of complex functional units on sequenced, but otherwise uncharacterized, genomic DNA. We addressed the need for computationally-based, automatic tools for synthesizing available data about individual consensus sequences and local compositional patterns into the composite objects (e.g., genes) that are -- as composite entities -- the true object of interest when scanning DNA sequences. The workshop was structured to promote sustained informal contact and exchange of expertise between molecular biologists, computer scientists, and mathematicians. No participant stayed for less than one week, and most attended for two or three weeks. Computers, software, and databases were available for use as "electronic blackboards" and as the basis for collaborative exploration of ideas being discussed and developed at the workshop. 23 refs., 2 tabs.

  11. Genomic structure of the human prion protein gene.

    PubMed Central

    Puckett, C; Concannon, P; Casey, C; Hood, L

    1991-01-01

    Creutzfeldt-Jakob disease and Gerstmann-Sträussler syndrome are rare degenerative disorders of the nervous system which have been genetically linked to the prion protein (PrP) gene. The PrP gene encodes a host glycoprotein of unknown function and is located on the short arm of chromosome 20, a region with few known genes or anonymous markers. The complete structure of the PrP gene in man has not been determined despite considerable interest in its relationship to these unusual disorders. We have determined that the human PrP gene has the same simple genomic structure seen in the hamster gene and consists of two exons and a single intron. In contrast to the hamster PrP gene, the human gene appears to have a single major transcriptional start site. The region immediately 5' of the transcriptional start site of the human PrP gene demonstrates the GC-rich features commonly seen in housekeeping genes. Curiously, the genomic clone we have isolated contains a 24-bp deletion that removes one of five octameric peptide repeats predicted to form a β-pleated sheet in this region of the PrP. We have also identified 5' of the PrP gene an RFLP which has a high degree of heterozygosity and which should serve as a useful marker for the pter-12 region of human chromosome 20. PMID:1678248

  12. Knowledge Mobilization across Boundaries with the Use of Novel Organizational Structures, Conferencing Strategies, and Technological Tools: The Ontario Consortium of Undergraduate Biology Educators (oCUBE) Model

    ERIC Educational Resources Information Center

    Kajiura, Lovaye; Smit, Julie; Montpetit, Colin; Kelly, Tamara; Waugh, Jennifer; Rawle, Fiona; Clark, Julie; Neumann, Melody; French, Michelle

    2014-01-01

    The Ontario Consortium of Undergraduate Biology Educators (oCUBE) brings together over 50 biology educators from 18 Ontario universities with the common goal to improve the biology undergraduate experience for both students and educators. This goal is achieved through an innovative mix of highly interactive face-to-face meetings, online…

  13. Studies on cattle genomic structural variation provide insights into ruminant speciation and adaptation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomic structural variations, including segmental duplications (SD) and copy number variations (CNV), contribute significantly to individual health and disease in primates and rodents. As a part of the bovine genome annotation effort, we performed the first genome-wide analysis of SD in cattle usin...

  14. Evidence of structural genomic region recombination in Hepatitis C virus

    PubMed Central

    Cristina, Juan; Colina, Rodney

    2006-01-01

    Background/Aim Hepatitis C virus (HCV) has been the subject of intense research and clinical investigation as its major role in human disease has emerged. Although homologous recombination has been demonstrated in many members of the family Flaviviridae, to which HCV belongs, there have been few studies reporting recombination in natural populations of HCV. Recombination break-points have been identified in non-structural proteins of the HCV genome. Given the implications that recombination has for RNA virus evolution, it is clearly important to determine the extent to which recombination plays a role in HCV evolution. In order to gain insight into these matters, we have performed a phylogenetic analysis of 89 full-length HCV strains from all types and sub-types, isolated all over the world, in order to detect possible recombination events. Method Putative recombinant sequences were identified with the use of the SimPlot program. Recombination events were confirmed by bootscanning, using the putative recombinant sequence as a query. Results Two crossing over events were identified in the E1/E2 structural region of an intra-typic (1a/1c) recombinant strain. Conclusion Only one of 89 full-length strains studied proved to be a recombinant HCV strain, revealing that homologous recombination does not play an extensive role in HCV evolution. Nevertheless, this mechanism cannot be ruled out as a source for generating genetic diversity in natural populations of HCV, since a new intra-typic recombinant strain was found. Moreover, the recombination break-points were found in the structural region of the HCV genome. PMID:16813646
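
    The general idea behind a similarity plot, the kind of analysis this abstract attributes to SimPlot, can be sketched as a sliding-window percent-identity scan of a query against each reference in an alignment. The code below is a toy illustration under stated assumptions: the function name `window_identity`, the made-up sequences, and the window settings are all hypothetical, and this is not SimPlot's implementation or the authors' pipeline.

```python
def window_identity(query, reference, window, step):
    """Percent identity of query vs. reference in sliding windows.

    Both sequences are assumed to be pre-aligned to the same length.
    Returns a list of (window_start, percent_identity) tuples.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    profile = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        matches = sum(a == b for a, b in zip(q, r))
        profile.append((start, 100.0 * matches / window))
    return profile


# Made-up reference sequences standing in for two subtypes (toy data only).
ref_a = "ACGTACGTACGTACGTACGT"
ref_c = "TGCATGCATGCATGCATGCA"

# A toy recombinant: first half from ref_a, second half from ref_c.
query = ref_a[:10] + ref_c[10:]

profile_a = window_identity(query, ref_a, window=5, step=5)
profile_c = window_identity(query, ref_c, window=5, step=5)

for (start, id_a), (_, id_c) in zip(profile_a, profile_c):
    closest = "ref_a" if id_a >= id_c else "ref_c"
    print(f"window at {start:2d}: ref_a={id_a:5.1f}%  ref_c={id_c:5.1f}%  closest={closest}")
```

    In this toy case the closest reference switches from `ref_a` to `ref_c` at position 10, which is the signature a putative break-point would leave in such a plot. Real analyses use much longer windows, evolutionary distance models, and bootstrap support (bootscanning) rather than raw identity.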

  15. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    PubMed Central

    2011-01-01

    Background Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome. Results Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella. Conclusions When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution. PMID:21619600

  16. Combustion Byproducts Recycling Consortium

    SciTech Connect

    Paul Ziemkiewicz; Tamara Vandivort; Debra Pflughoeft-Hassett; Y. Paul Chugh; James Hower

    2008-08-31

    The Combustion Byproducts Recycling Consortium (CBRC) program was developed as a focused program to remove and/or minimize the barriers for effective management of over 123 million tons of coal combustion byproducts (CCBs) annually generated in the USA. At the time of launching the CBRC in 1998, about 25% of CCBs were beneficially utilized while the remainder was disposed of in on-site or off-site landfills. During the ten (10) year tenure of CBRC (1998-2008), after a critical review, 52 projects were funded nationwide. By region, the East, Midwest, and West had 21, 18, and 13 projects funded, respectively. Almost all projects were cooperative projects involving industry, government, and academia. The CBRC projects, to a large extent, successfully addressed the problems of large-scale utilization of CCBs. A few projects, such as the two Eastern Region projects that addressed the use of fly ash in foundry applications, might be thought of as a somewhat smaller application in comparison to construction and agricultural uses, but as a novel niche use, they set the stage to draw interest that fly ash substitution for Portland cement might not attract. With consideration of the large increase in flue gas desulfurization (FGD) gypsum in response to EPA regulations, agricultural uses of FGD gypsum hold promise for large-scale uses of a product currently directed to the (currently stagnant) home construction market. Outstanding achievements of the program are: (1) The CBRC successfully enhanced professional expertise in the area of CCBs throughout the nation. The enhanced capacity continues to provide technology and information transfer expertise to industry and regulatory agencies. (2) Several technologies were developed that can be used immediately. These include: (a) Use of CCBs for road base and sub-base applications; (b) full-depth, in situ stabilization of gravel roads or highway/pavement construction recycled materials; and (c) fired bricks containing up to 30%-40% F

  17. Comparative genetics and genomics of nematodes: genome structure, development, and lifestyle.

    PubMed

    Sommer, Ralf J; Streit, Adrian

    2011-01-01

    Nematodes are found in virtually all habitats on earth. Many of them are parasites of plants and animals, including humans. The free-living nematode, Caenorhabditis elegans, is one of the genetically best-studied model organisms and was the first metazoan whose genome was fully sequenced. In recent years, the draft genome sequences of another six nematodes representing four of the five major clades of nematodes were published. Compared to mammalian genomes, all these genomes are very small. Nevertheless, they contain almost the same number of genes as the human genome. Nematodes are therefore a very attractive system for comparative genetic and genomic studies, with C. elegans as an excellent baseline. Here, we review the efforts that were made to extend genetic analysis to nematodes other than C. elegans, and we compare the seven available nematode genomes. One of the most striking findings is the unexpectedly high incidence of gene acquisition through horizontal gene transfer (HGT). PMID:21721943

  18. 2003 NIH protein structure initiative workshop in protein production and crystallization for structural and functional genomics.

    SciTech Connect

    Adams, M.; Joachimiak, A.; Kim, R.; Montelione, G. T.; Norvell, J.; Biosciences Division; University of Georgia; LBNL; Rutgers Univ.; Robert Wood Johnson Medical School

    2004-03-01

    The United States National Institutes of Health (NIH) Protein Structure Initiative (PSI) is a joint government, university, and industry effort, organized and supported by the National Institute of General Medical Sciences (NIGMS), and aimed at reducing the costs and increasing the speed of protein structure determination. Its long-range goal is to make the three-dimensional atomic-level structures of most proteins in nature easily obtainable from knowledge of their corresponding DNA sequences (http://www.nigms.gov/psi). It is the primary U.S. component of a broad international effort in structural genomics, involving at least 20 projects throughout the world. The PSI is now in its fourth year. Nine PSI pilot research centers have been funded to explore the feasibility and impact of genomic scale protein structure analysis. To date, over 500 3D protein structures, providing the first structural representatives for hundreds of protein domain families, have been completed and deposited by the NIH centers into the public Protein Data Bank. In addition, new technologies for protein sample production, data organization, and structure analysis by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy have been developed. These technologies increase the efficiency of protein structure determination both for structural genomics and for the broader structural biology community. Although progress has been substantial, PSI pilot research centers have identified a number of important bottlenecks that need to be solved to meet the goals of the program. For example, it is now accepted that a major challenge to high-throughput protein structure determination is the fact that for some 70% of targeted proteins, it is difficult to produce protein samples and crystals suitable for structural analysis. In an effort to facilitate an effective exchange of developments and advancements between pilot centers, the NIGMS organized a workshop on gene cloning, protein

  19. Development of the international psyllid genome consortium

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Two of the most important emerging agricultural diseases in the USA are transmitted by two different insect species of psyllids from the Family Psyllidae. The Asian Citrus Psyllid (Diaphorina citri) is the principal vector of the intercellular, plant-pathogenic bacterium Liberibacter which cause Hua...

  20. Portrait of a Consortium: ANKOS (Anatolian University Libraries Consortium)

    ERIC Educational Resources Information Center

    Erdogan, Phyllis; Karasozen, Bulent

    2009-01-01

    The Anatolian University Libraries Consortium (ANKOS) was created in 2001 with only a few members subscribed to nine e-journal collections and bibliographic databases. This Turkish library consortium had developed from one state and three private universities joining together for the purchase of two databases in 1999. Over time, the numbers of…

  1. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populatio...

  2. Effects of genome structure variation, homeologous genes and repetitive DNA on polyploid crop research in the age of genomics.

    PubMed

    Fu, Donghui; Mason, Annaliese S; Xiao, Meili; Yan, Hui

    2016-01-01

    Compared to diploid species, allopolyploid crop species possess more complex genomes, higher productivity, and greater adaptability to changing environments. Next generation sequencing techniques have produced high-density genetic maps, whole genome sequences, transcriptomes and epigenomes for important polyploid crops. However, several problems interfere with the full application of next generation sequencing techniques to these crops. Firstly, different types of genomic variation affect sequence assembly and QTL mapping. Secondly, duplicated or homoeologous genes can diverge in function and then lead to emergence of many minor QTL, which increases difficulties in fine mapping, cloning and marker assisted selection. Thirdly, repetitive DNA sequences arising in polyploid crop genomes also impact sequence assembly, and are increasingly being shown to produce small RNAs to regulate gene expression and hence phenotypic traits. We propose that these three key features should be considered together when analyzing polyploid crop genomes. It is apparent that dissection of genomic structural variation, elucidation of the function and mechanism of interaction of homoeologous genes, and investigation of the de novo roles of repeat sequences in agronomic traits are necessary for genomics-based crop breeding in polyploids.

  3. Identification of novel RNA secondary structures within the hepatitis C virus genome reveals a cooperative involvement in genome packaging

    PubMed Central

    Stewart, H.; Bingham, R.J.; White, S. J.; Dykeman, E. C.; Zothner, C.; Tuplin, A. K.; Stockley, P. G.; Twarock, R.; Harris, M.

    2016-01-01

    The specific packaging of the hepatitis C virus (HCV) genome is hypothesised to be driven by Core-RNA interactions. To identify the regions of the viral genome involved in this process, we used SELEX (systematic evolution of ligands by exponential enrichment) to identify RNA aptamers which bind specifically to Core in vitro. Comparison of these aptamers to multiple HCV genomes revealed the presence of a conserved terminal loop motif within short RNA stem-loop structures. We postulated that interactions of these motifs, as well as sub-motifs which were present in HCV genomes at statistically significant levels, with the Core protein may drive virion assembly. We mutated 8 of these predicted motifs within the HCV infectious molecular clone JFH-1, thereby producing a range of mutant viruses predicted to possess altered RNA secondary structures. RNA replication and viral titre were unaltered in viruses possessing only one mutated structure. However, infectivity titres were decreased in viruses possessing a higher number of mutated regions. This work thus identified multiple novel RNA motifs which appear to contribute to genome packaging. We suggest that these structures act as cooperative packaging signals to drive specific RNA encapsidation during HCV assembly. PMID:26972799

  4. Structural variation of the human genome: mechanisms, assays, and role in male infertility

    PubMed Central

    Carvalho, Claudia M.B.; Zhang, Feng; Lupski, James R.

    2011-01-01

    Genomic disorders are defined as diseases caused by rearrangements of the genome incited by a genomic architecture that conveys instability. Y-chromosome related dysfunctions such as male infertility are frequently associated with gross DNA rearrangements resulting from its peculiar genomic architecture. The Y-chromosome has evolved into a highly specialized chromosome to perform male functions, mainly spermatogenesis. Direct and inverted repeats, some of them palindromes with highly identical nucleotide sequences that can form DNA cruciform structures, characterize the genomic structure of the Y-chromosome long arm. Some particular Y chromosome genomic deletions can cause spermatogenic failure likely because of removal of one or more transcriptional units with a potential role in spermatogenesis. We describe mechanisms underlying the formation of human genomic rearrangements on autosomes and review Y-chromosome deletions associated with male infertility. PMID:21210740

  5. Structural Variation Mutagenesis of the Human Genome: Impact on Disease and Evolution

    PubMed Central

    Lupski, James R.

    2015-01-01

    Watson-Crick base-pair changes, or single-nucleotide variants (SNV), have long been known as a source of mutations. However, the extent to which DNA structural variation, including duplication and deletion copy number variants (CNV) and copy number neutral inversions and translocations, contributes to human genome variation and disease has been appreciated only recently. Moreover, the potential complexity of structural variants (SV) was not envisioned; thus, the frequency of complex genomic rearrangements (CGR) and how such events form remained a mystery. The concept of genomic disorders, diseases due to genomic rearrangements and not sequence-based changes for which genomic architecture incites genomic instability, delineated a new category of conditions distinct from chromosomal syndromes and single-gene Mendelian diseases. Nevertheless, it is the mechanistic understanding of CNV/SV formation that has promoted further understanding of human biology and disease and provided insights into human genome and gene evolution. PMID:25892534

  6. Hawaii Space Grant Consortium

    NASA Technical Reports Server (NTRS)

    Flynn, Luke P.

    2005-01-01

    The Hawai'i Space Grant Consortium is composed of ten institutions of higher learning including the University of Hawai'i at Manoa, the University of Hawai'i at Hilo, the University of Guam, and seven Community Colleges spread over the 4 main Hawaiian islands. Geographic separation is not the only obstacle that we face as a Consortium. Hawai'i has been mired in an economic downturn due to a lack of tourism for almost all of the period (2001 - 2004) covered by this report, although hotel occupancy rates and real estate sales have sky-rocketed in the last year. Our challenges have been many including providing quality educational opportunities in the face of shrinking State and Federal budgets, encouraging science and technology course instruction at the K-12 level in a public school system that is becoming less focused on high technology and more focused on developing basic reading and math skills, and assembling community college programs with instructors who are expected to teach more classes for the same salary. Motivated people can overcome these problems. Fortunately, the Hawai'i Space Grant Consortium (HSGC) consists of a group of highly motivated and talented individuals who have not only overcome these obstacles, but have excelled with the Program. We fill a critical need within the State of Hawai'i to provide our children with opportunities to pursue their dreams of becoming the next generation of NASA astronauts, engineers, and explorers. Our strength lies not only in our diligent and creative HSGC advisory board, but also with Hawai'i's teachers, students, parents, and industry executives who are willing to invest their time, effort, and resources into Hawai'i's future. Our operational philosophy is to FACE the Future, meaning that we will facilitate, administer, catalyze, and educate in order to achieve our objective of creating a highly technically capable workforce both here in Hawai'i and for NASA. In addition to administering to programs and

  7. NECOR: New research consortium

    NASA Astrophysics Data System (ADS)

    Richman, Barbara T.

    Three major marine research institutes in the northeastern United States have entered into a formal agreement to coordinate the operation and scheduling of five seagoing oceanographic vessels. NECOR (Northeast Consortium Research Fleet) consists of the Woods Hole Oceanographic Institution, the University of Rhode Island, and Columbia University's Lamont-Doherty Geological Observatory. NECOR was established, in part, to save money during a time of drastic funding reductions for ship support, explained Jules Hirshman, marine science coordinator at Lamont. Budget axings for 1982 chipped off 10-12% (constant dollars) from 1981's ship funding, estimates Robert Dinsmore, chairman of facilities and marine operations at Woods Hole and chairman of NECOR's executive committee. Steadily rising fuel costs (Eos, June 23, 1981, p. 549) aggravate the funding problem.

  8. SPring-8 Structural Biology Beamlines / Automatic Beamline Operation at RIKEN Structural Genomics Beamlines

    SciTech Connect

    Ueno, Go; Hasegawa, Kazuya; Okazaki, Nobuo; Sakai, Hisanobu; Kumasaka, Takashi; Yamamoto, Masaki

    2007-01-19

    RIKEN Structural Genomics Beamlines (BL26B1 and BL26B2) at SPring-8 have been constructed for high-throughput protein crystallography. Beamline operation is automated in cooperation with a sample changer robot. The operation software provides centralized control using a client-server architecture. A sample management system with a networked database has been implemented to accept dry-shipped crystals from distant users.

  9. The Effect of Stress on Genome Regulation and Structure

    PubMed Central

    Madlung, Andreas; Comai, Luca

    2004-01-01

    • Background Stresses exert evolutionary pressures on all organisms, which have developed sophisticated responses to cope and survive. These responses involve cellular physiology, gene regulation and genome remodelling. • Scope In this review, the effects of stress on genomes and the connected responses are considered. Recent developments in our understanding of epigenetic genome regulation, including the role of RNA interference (RNAi), suggest a function for this in stress initiation and response. We review our knowledge of how different stresses, tissue culture, pathogen attack, abiotic stress, and hybridization, affect genomes. Using allopolyploid hybridization as an example, we examine mechanisms that may mediate genomic responses, focusing on RNAi-mediated perturbations. • Conclusions A common response to stresses may be the relaxation of epigenetic regulation, leading to activation of suppressed sequences and secondary effects as regulatory systems attempt to re-establish genomic order. PMID:15319229

  10. Gas Storage Technology Consortium

    SciTech Connect

    Joel L. Morrison; Sharon L. Elder

    2007-03-31

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is crucial in meeting the needs of these new markets. To address the gas storage needs of the natural gas industry, an industry-driven consortium was created - the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance the operational flexibility and deliverability of the nation's gas storage system, and provide a cost-effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of January 1, 2007 through March 31, 2007. Key activities during this time period included: (1) Drafting and distributing the 2007 RFP; (2) Identifying and securing a meeting site for the GSTC 2007 Spring Proposal Meeting; (3) Scheduling and participating in two (2) project mentoring conference calls; (4) Conducting elections for four Executive Council seats; (5) Collecting and compiling the 2005 GSTC Final Project Reports; and (6) Outreach and communications.

  11. Gas Storage Technology Consortium

    SciTech Connect

    Joel Morrison

    2005-09-14

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of April 1, 2005 through June 30, 2005. During this time period efforts were directed toward (1) GSTC administration changes, (2) participating in the American Gas Association Operations Conference and Biennial Exhibition, (3) issuing a Request for Proposals (RFP) for proposal solicitation for funding, and (4) organizing the proposal selection meeting.

  12. Gas Storage Technology Consortium

    SciTech Connect

    Joel L. Morrison; Sharon L. Elder

    2006-07-06

    Gas storage is a critical element in the natural gas industry. Producers, transmission & distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of April 1 to June 30, 2006. Key activities during this time period include: (1) Develop and process subcontract agreements for the eight projects selected for cofunding at the February 2006 GSTC Meeting; (2) Compile and distribute the three 2004 project final reports to the GSTC Full members; (3) Develop template, compile listserv, and draft first GSTC Insider online newsletter; (4) Continue membership recruitment; (5) Identify projects and finalize agenda for the fall GSTC/AGA Underground Storage Committee Technology Transfer

  13. Gas Storage Technology Consortium

    SciTech Connect

    Joel L. Morrison; Sharon L. Elder

    2006-05-10

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of January 1, 2006 through March 31, 2006. Activities during this time period were: (1) Organize and host the 2006 Spring Meeting in San Diego, CA on February 21-22, 2006; (2) Award 8 projects for co-funding by GSTC for 2006; (3) New members recruitment; and (4) Improving communications.

  14. Nuclear Fabrication Consortium

    SciTech Connect

    Levesque, Stephen

    2013-04-05

    This report summarizes the activities undertaken by EWI while under contract from the Department of Energy (DOE) Office of Nuclear Energy (NE) for the management and operation of the Nuclear Fabrication Consortium (NFC). The NFC was established by EWI to independently develop, evaluate, and deploy fabrication approaches and data that support the re-establishment of the U.S. nuclear industry: ensuring that the supply chain will be competitive on a global stage, enabling more cost-effective and reliable nuclear power in a carbon constrained environment. The NFC provided a forum for member original equipment manufacturers (OEM), fabricators, manufacturers, and materials suppliers to effectively engage with each other and rebuild the capacity of this supply chain by: identifying and removing impediments to the implementation of new construction and fabrication techniques and approaches for nuclear equipment, including system components and nuclear plants; providing and facilitating detailed scientific-based studies on new approaches and technologies that will have positive impacts on the cost of building of nuclear plants; analyzing and disseminating information about future nuclear fabrication technologies and how they could impact the North American and the International Nuclear Marketplace; facilitating dialog and initiating alignment among fabricators, owners, trade associations, and government agencies; supporting industry in helping to create a larger qualified nuclear supplier network; acting as an unbiased technology resource to evaluate, develop, and demonstrate new manufacturing technologies; creating welder and inspector training programs to help enable the necessary workforce for the upcoming construction work; and serving as a focal point for technology, policy, and politically interested parties to share ideas and concepts associated with fabrication across the nuclear industry. The report the objectives and summaries of the Nuclear Fabrication Consortium

  15. DNA-guided genome editing using structure-guided endonucleases.

    PubMed

    Varshney, Gaurav K; Burgess, Shawn M

    2016-01-01

    The search for novel ways to target and alter the genomes of living organisms accelerated rapidly this decade with the discovery of CRISPR/Cas9. Since the initial discovery, efforts to find alternative methods for altering the genome have expanded. A new study demonstrates an alternative approach that uses flap endonuclease 1 (FEN-1) fused to the Fok1 endonuclease and shows potential for DNA-guided genome targeting in vivo. PMID:27640875

  16. Structure and content of the Entamoeba histolytica genome.

    PubMed

    Clark, C G; Alsmark, U C M; Tazreiter, M; Saito-Nakano, Y; Ali, V; Marion, S; Weber, C; Mukherjee, C; Bruchhaus, I; Tannich, E; Leippe, M; Sicheritz-Ponten, T; Foster, P G; Samuelson, J; Noël, C J; Hirt, R P; Embley, T M; Gilchrist, C A; Mann, B J; Singh, U; Ackers, J P; Bhattacharya, S; Bhattacharya, A; Lohia, A; Guillén, N; Duchêne, M; Nozaki, T; Hall, N

    2007-01-01

    The intestinal parasite Entamoeba histolytica is one of the first protists for which a draft genome sequence has been published. Although the genome is still incomplete, it is unlikely that many genes are missing from the list of those already identified. In this chapter we summarise the features of the genome as they are currently understood and provide previously unpublished analyses of many of the genes.

  17. Integrated consensus map of cultivated peanut and wild relatives reveals structures of the A and B genomes of Arachis and divergence of the legume genomes.

    PubMed

    Shirasawa, Kenta; Bertioli, David J; Varshney, Rajeev K; Moretzsohn, Marcio C; Leal-Bertioli, Soraya C M; Thudi, Mahendar; Pandey, Manish K; Rami, Jean-Francois; Foncéka, Daniel; Gowda, Makanahally V C; Qin, Hongde; Guo, Baozhu; Hong, Yanbin; Liang, Xuanqiang; Hirakawa, Hideki; Tabata, Satoshi; Isobe, Sachiko

    2013-04-01

    The complex, tetraploid genome structure of peanut (Arachis hypogaea) has obstructed advances in genetics and genomics in the species. The aim of this study is to understand the genome structure of Arachis by developing a high-density integrated consensus map. Three recombinant inbred line populations derived from crosses between the A genome diploid species, Arachis duranensis and Arachis stenosperma; the B genome diploid species, Arachis ipaënsis and Arachis magna; and between the AB genome tetraploids, A. hypogaea and an artificial amphidiploid (A. ipaënsis × A. duranensis)(4×), were used to construct genetic linkage maps: 10 linkage groups (LGs) of 544 cM with 597 loci for the A genome; 10 LGs of 461 cM with 798 loci for the B genome; and 20 LGs of 1442 cM with 1469 loci for the AB genome. The resultant maps plus 13 published maps were integrated into a consensus map covering 2651 cM with 3693 marker loci which was anchored to 20 consensus LGs corresponding to the A and B genomes. The comparative genomics with genome sequences of Cajanus cajan, Glycine max, Lotus japonicus, and Medicago truncatula revealed that the Arachis genome has segmented synteny relationship to the other legumes. The comparative maps in legumes, integrated tetraploid consensus maps, and genome-specific diploid maps will increase the genetic and genomic understanding of Arachis and should facilitate molecular breeding. PMID:23315685

  18. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    SciTech Connect

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  19. Reuse at the Software Productivity Consortium

    NASA Technical Reports Server (NTRS)

    Weiss, David M.

    1989-01-01

    The Software Productivity Consortium is sponsored by 14 aerospace companies as a developer of software engineering methods and tools. Software reuse and prototyping are currently the major emphasis areas. The Methodology and Measurement Project in the Software Technology Exploration Division has developed some concepts for reuse which they intend to develop into a synthesis process. They have identified two approaches to software reuse: opportunistic and systematic. The assumptions underlying the systematic approach, phrased as hypotheses, are the following: the redevelopment hypothesis, i.e., software developers solve the same problems repeatedly; the oracle hypothesis, i.e., developers are able to predict variations from one redevelopment to others; and the organizational hypothesis, i.e., software must be organized according to behavior and structure to take advantage of the predictions that the developers make. The conceptual basis for reuse includes: program families, information hiding, abstract interfaces, uses and information hiding hierarchies, and process structure. The primary reusable software characteristics are black-box descriptions, structural descriptions, and composition and decomposition based on program families. Automated support can be provided for systematic reuse, and the Consortium is developing a prototype reuse library and guidebook. The software synthesis process that the Consortium is aiming toward includes modeling, refinement, prototyping, reuse, assessment, and new construction.

  20. Sequence, genomic structure, and chromosomal assignment of human DOC-2

    SciTech Connect

    Albertsen, H.M.; Williams, B.; Smith, S.A.

    1996-04-15

    DOC-2 is a human gene originally identified as a 767-bp cDNA fragment isolated from normal ovarian epithelial cells by differential display against ovarian carcinoma cells. We have now determined the complete cDNA sequence of the 3.2-kb DOC-2 transcript and localized the gene to chromosome 5. A 12.5-kb genomic fragment at the 5′-end of DOC-2 has also been sequenced, revealing the intron-exon structure of the first eight exons (788 bases) of the DOC-2 gene. Translation of the DOC-2 cDNA predicts a hydrophobic protein of 770 amino acid residues with a molecular weight of 82.5 kDa. Comparison of the DNA and amino acid sequences of DOC-2 to publicly accessible sequence databases revealed 83% identity to p96, a murine-responsive phosphoprotein. In addition, about 45% identity was observed between the first 140 N-terminal residues of DOC-2 and the Caenorhabditis elegans M110.5 and Drosophila melanogaster Dab genes. 14 refs., 3 figs.

  1. Integrated database of information from structural genomics experiments.

    PubMed

    Asada, Yukuhiko; Sugahara, Michihiro; Mizutani, Hisashi; Naitow, Hisashi; Tanaka, Tomoyuki; Matsuura, Yoshinori; Agari, Yoshihiro; Ebihara, Akio; Shinkai, Akeo; Kuramitsu, Seiki; Yokoyama, Shigeyuki; Kaminuma, Eri; Kobayashi, Norio; Nishikata, Koro; Shimoyama, Sayoko; Toyoda, Tetsuro; Ishikawa, Tetsuya; Kunishima, Naoki

    2013-05-01

    Information from structural genomics experiments at the RIKEN SPring-8 Center, Japan has been compiled and published as an integrated database. The contents of the database are (i) experimental data from nine species of bacteria that cover a large variety of protein molecules in terms of both evolution and properties (http://database.riken.jp/db/bacpedia), (ii) experimental data from mutant proteins that were designed systematically to study the influence of mutations on the diffraction quality of protein crystals (http://database.riken.jp/db/bacpedia) and (iii) experimental data from heavy-atom-labelled proteins from the heavy-atom database HATODAS (http://database.riken.jp/db/hatodas). The database integration adopts the semantic web, which is suitable for data reuse and automatic processing, thereby allowing batch downloads of full data and data reconstruction to produce new databases. In addition, to enhance the use of data (i) and (ii) by general researchers in biosciences, a comprehensible user interface, Bacpedia (http://bacpedia.harima.riken.jp), has been developed.

  2. Structure, Function, and Evolution of the Thiomonas spp. Genome

    PubMed Central

    Arsène-Ploetze, Florence; Koechler, Sandrine; Marchal, Marie; Coppée, Jean-Yves; Chandler, Michael; Bonnefoy, Violaine; Brochier-Armanet, Céline; Barakat, Mohamed; Barbe, Valérie; Battaglia-Brunet, Fabienne; Bruneel, Odile; Bryan, Christopher G.; Cleiss-Arnold, Jessica; Cruveiller, Stéphane; Erhardt, Mathieu; Heinrich-Salmeron, Audrey; Hommais, Florence; Joulian, Catherine; Krin, Evelyne; Lieutaud, Aurélie; Lièvremont, Didier; Michel, Caroline; Muller, Daniel; Ortet, Philippe; Proux, Caroline; Siguier, Patricia; Roche, David; Rouy, Zoé; Salvignol, Grégory; Slyemi, Djamila; Talla, Emmanuel; Weiss, Stéphanie; Weissenbach, Jean; Médigue, Claudine; Bertin, Philippe N.

    2010-01-01

    Bacteria of the Thiomonas genus are ubiquitous in extreme environments, such as arsenic-rich acid mine drainage (AMD). The genome of one of these strains, Thiomonas sp. 3As, was sequenced, annotated, and examined, revealing specific adaptations allowing this bacterium to survive and grow in its highly toxic environment. In order to explore genomic diversity as well as genetic evolution in Thiomonas spp., a comparative genomic hybridization (CGH) approach was used on eight different strains of the Thiomonas genus, including five strains of the same species. Our results suggest that the Thiomonas genome has evolved through the gain or loss of genomic islands and that this evolution is influenced by the specific environmental conditions in which the strains live. PMID:20195515

  3. Progress in understanding and sequencing the genome of Brassica rapa.

    PubMed

    Hong, Chang Pyo; Kwon, Soo-Jin; Kim, Jung Sun; Yang, Tae-Jin; Park, Beom-Seok; Lim, Yong Pyo

    2008-01-01

    Brassica rapa, which is closely related to Arabidopsis thaliana, is an important crop and a model plant for studying genome evolution via polyploidization. We report the current understanding of the genome structure of B. rapa and efforts for the whole-genome sequencing of the species. The tribe Brassicaceae, which comprises ca. 240 species, descended from a common hexaploid ancestor with a basic genome similar to that of Arabidopsis. Chromosome rearrangements, including fusions and/or fissions, resulted in the present-day "diploid" Brassica species with variation in chromosome number and phenotype. Triplicated genomic segments of B. rapa are collinear to those of A. thaliana with InDels. The genome triplication has led to an approximately 1.7-fold increase in the B. rapa gene number compared to that of A. thaliana. Repetitive DNA of B. rapa has also been extensively amplified and has diverged from that of A. thaliana. For its whole-genome sequencing, the Brassica rapa Genome Sequencing Project (BrGSP) consortium has developed suitable genomic resources and constructed genetic and physical maps. Ten chromosomes of B. rapa are being allocated to BrGSP consortium participants, and each chromosome will be sequenced by a BAC-by-BAC approach. Genome sequencing of B. rapa will offer a new perspective for plant biology and evolution in the context of polyploidization.

  4. Gas Storage Technology Consortium

    SciTech Connect

    Joel L. Morrison; Sharon L. Elder

    2006-09-30

    Gas storage is a critical element in the natural gas industry. Producers, transmission and distribution companies, marketers, and end users all benefit directly from the load balancing function of storage. The unbundling process has fundamentally changed the way storage is used and valued. As an unbundled service, the value of storage is being recovered at rates that reflect its value. Moreover, the marketplace has differentiated between various types of storage services, and has increasingly rewarded flexibility, safety, and reliability. The size of the natural gas market has increased and is projected to continue to increase towards 30 trillion cubic feet (TCF) over the next 10 to 15 years. Much of this increase is projected to come from electric generation, particularly peaking units. Gas storage, particularly the flexible services that are most suited to electric loads, is critical in meeting the needs of these new markets. In order to address the gas storage needs of the natural gas industry, an industry-driven consortium was created--the Gas Storage Technology Consortium (GSTC). The objective of the GSTC is to provide a means to accomplish industry-driven research and development designed to enhance operational flexibility and deliverability of the Nation's gas storage system, and provide a cost effective, safe, and reliable supply of natural gas to meet domestic demand. This report addresses the activities for the quarterly period of July 1, 2006 to September 30, 2006. Key activities during this time period include: (1) Subaward contracts for all 2006 GSTC projects completed; (2) Implement a formal project mentoring process by a mentor team; (3) Upcoming Technology Transfer meetings: (a) Finalize agenda for the American Gas Association Fall Underground Storage Committee/GSTC Technology Transfer Meeting in San Francisco, CA on October 4, 2006; (b) Identify projects and finalize agenda for the Fall GSTC Technology

  5. Gas Storage Technology Consortium

    SciTech Connect

    Joel Morrison; Elizabeth Wood; Barbara Robuck

    2010-09-30

    The EMS Energy Institute at The Pennsylvania State University (Penn State) has managed the Gas Storage Technology Consortium (GSTC) since its inception in 2003. The GSTC infrastructure provided a means to accomplish industry-driven research and development designed to enhance the operational flexibility and deliverability of the nation's gas storage system, and provide a cost-effective, safe, and reliable supply of natural gas to meet domestic demand. The GSTC received base funding from the U.S. Department of Energy's (DOE) National Energy Technology Laboratory (NETL) Oil & Natural Gas Supply Program. The GSTC base funds were highly leveraged with industry funding for individual projects. Since its inception, the GSTC has engaged 67 members. The GSTC membership base was diverse, coming from 19 states, the District of Columbia, and Canada. The membership was comprised of natural gas storage field operators, service companies, industry consultants, industry trade organizations, and academia. The GSTC organized and hosted a total of 18 meetings since 2003. Of these, 8 meetings were held to review, discuss, and select proposals submitted for funding consideration. The GSTC reviewed a total of 75 proposals and committed co-funding to support 31 industry-driven projects. The GSTC committed co-funding to 41.3% of the proposals that it received and reviewed. The 31 projects had a total project value of $6,203,071 of which the GSTC committed $3,205,978 in co-funding. The committed GSTC project funding represented an average program cost share of 51.7%. Project applicants provided an average program cost share of 48.3%. In addition to the GSTC co-funding, the consortium provided the domestic natural gas storage industry with a technology transfer and outreach infrastructure. The technology transfer and outreach were conducted by having project mentoring teams and a GSTC website, and by working closely with the Pipeline Research Council International (PRCI) to jointly host
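
    The funding percentages quoted in the abstract above can be re-derived from the counts and dollar amounts it reports. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative assumptions and do not come from the report.

```python
# Figures as quoted in the abstract above.
proposals_reviewed = 75
projects_funded = 31
total_project_value = 6_203_071  # USD, total value of the 31 projects
gstc_cofunding = 3_205_978       # USD, co-funding committed by the GSTC


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of reviewed proposals that received co-funding.
funding_rate = percent(projects_funded, proposals_reviewed)

# Program cost share carried by the GSTC and by the project applicants.
gstc_share = percent(gstc_cofunding, total_project_value)
applicant_share = percent(total_project_value - gstc_cofunding, total_project_value)

print(funding_rate, gstc_share, applicant_share)
```

    Under these inputs the three values agree with the 41.3%, 51.7%, and 48.3% stated in the abstract.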

  6. Evolution of genomic structural variation and genomic architecture in the adaptive radiations of African cichlid fishes.

    PubMed

    Fan, Shaohua; Meyer, Axel

    2014-01-01

    African cichlid fishes are an ideal system for studying explosive rates of speciation and the origin of diversity in adaptive radiation. Within the last few million years, more than 2000 species have evolved in the Great Lakes of East Africa, the largest adaptive radiation in vertebrates. These young species show spectacular diversity in their coloration, morphology and behavior. However, little is known about the genomic basis of this astonishing diversity. Recently, five African cichlid genomes were sequenced, including that of the Nile Tilapia (Oreochromis niloticus), a basal and only relatively moderately diversified lineage, and the genomes of four representative endemic species of the adaptive radiations, Neolamprologus brichardi, Astatotilapia burtoni, Metriaclima zebra, and Pundamilia nyererei. Using the Tilapia genome as a reference genome, we generated a high-resolution genomic variation map, consisting of single nucleotide polymorphisms (SNPs), short insertions and deletions (indels), inversions and deletions. In total, around 18.8, 17.7, 17.0, and 17.0 million SNPs, 2.3, 2.2, 1.4, and 1.9 million indels, 262, 306, 162, and 154 inversions, and 3509, 2705, 2710, and 2634 deletions were inferred to have evolved in N. brichardi, A. burtoni, P. nyererei, and M. zebra, respectively. Many of these variations affected the annotated gene regions in the genome. Different patterns of genetic variation were detected during the adaptive radiation of African cichlid fishes. For SNPs, the highest rate of evolution was detected in the common ancestor of N. brichardi, A. burtoni, P. nyererei, and M. zebra. However, for the evolution of inversions and deletions, we found that the rates at the terminal taxa are substantially higher than the rates at the ancestral lineages. The high-resolution map provides an ideal opportunity to understand the genomic bases of the adaptive radiation of African cichlid fishes.

  7. GenomeTools: a comprehensive software library for efficient processing of structured genome annotations.

    PubMed

    Gremme, Gordon; Steinbiss, Sascha; Kurtz, Stefan

    2013-01-01

    Genome annotations are often published as plain text files describing genomic features and their subcomponents by an implicit annotation graph. In this paper, we present the GenomeTools, a convenient and efficient software library and associated software tools for developing bioinformatics software intended to create, process or convert annotation graphs. The GenomeTools strictly follow the annotation graph approach, offering a unified graph-based representation. This gives the developer intuitive and immediate access to genomic features and tools for their manipulation. To process large annotation sets with low memory overhead, we have designed and implemented an efficient pull-based approach for sequential processing of annotations. This makes it possible to handle even the largest annotation sets, such as a complete catalogue of human variations. Our object-oriented C-based software library enables a developer to conveniently implement their own functionality on annotation graphs and to integrate it into larger workflows, simultaneously accessing compressed sequence data if required. The careful C implementation of the GenomeTools not only ensures a light-weight memory footprint while allowing full sequential as well as random access to the annotation graph, but also facilitates the creation of bindings to a variety of script programming languages (like Python and Ruby) sharing the same interface. PMID:24091398
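
    The pull-based idea mentioned in the abstract above can be illustrated with Python generators: each stage requests the next item from the stage before it, so only one record needs to be in memory at a time. This is a simplified sketch and not the GenomeTools API. It assumes tab-separated GFF3-style input and works on single lines rather than on full annotation graphs, and the names `Feature`, `feature_stream`, `only_type`, and `SAMPLE` are hypothetical.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Feature(NamedTuple):
    seqid: str
    ftype: str
    start: int
    end: int
    strand: str


def feature_stream(handle: TextIO) -> Iterator[Feature]:
    """Lazily yield one Feature per nine-column data line read from handle."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # this sketch silently ignores malformed lines
        yield Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6])


def only_type(stream: Iterator[Feature], ftype: str) -> Iterator[Feature]:
    """Pull features from an upstream stage and pass on those of one type."""
    return (f for f in stream if f.ftype == ftype)


# Tiny made-up annotation used only to exercise the pipeline.
SAMPLE = (
    "##gff-version 3\n"
    "chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1\n"
    "chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n"
    "chr1\tdemo\texon\t100\t300\t.\t+\t.\tParent=mrna1\n"
    "chr1\tdemo\texon\t500\t900\t.\t+\t.\tParent=mrna1\n"
)

# The final consumer drives the whole chain by pulling from the last stage.
exons = list(only_type(feature_stream(io.StringIO(SAMPLE)), "exon"))
total_exon_length = sum(f.end - f.start + 1 for f in exons)
print(len(exons), total_exon_length)
```

    Because nothing is read until the consumer asks for it, the same chain could be pointed at a large file handle without loading the file as a whole.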

  8. Optoelectronic technology consortium

    NASA Astrophysics Data System (ADS)

    Hibbs-Brenner, Mary

    1992-12-01

    The Optoelectronics Technology Consortium has been established to position U.S. industry as the world leader in optical interconnect technology by developing, fabricating, integrating and demonstrating the producibility of optoelectronic components for high-density/high-data-rate processors and accelerating the insertion of this technology into military and commercial applications. This objective will be accomplished by a program focused in three areas. (1) Demonstrated performance: OETC will demonstrate an aggregate data transfer rate of 16 Gbit/s between single transmitter and receiver packages, as well as the expandability of this technology by combining four links in parallel to achieve a 64 Gbit/s link. (2) Accelerated development: By collaborating during the precompetitive technology development stage, OETC will advance the development of optical components and produce links for a multiboard processor testbed demonstration. (3) Producibility: OETC's technology will achieve this performance by using components that are affordable and reliable, with a line BER less than 10^-15 and MTTF greater than 10^6 hours.

  9. The impact of genome-wide supported schizophrenia risk variants in the neurogranin gene on brain structure and function.

    PubMed

    Walton, Esther; Geisler, Daniel; Hass, Johanna; Liu, Jingyu; Turner, Jessica; Yendiki, Anastasia; Smolka, Michael N; Ho, Beng-Choon; Manoach, Dara S; Gollub, Randy L; Roessner, Veit; Calhoun, Vince D; Ehrlich, Stefan

    2013-01-01

    The neural mechanisms underlying genetic risk for schizophrenia, a highly heritable psychiatric condition, are still under investigation. New schizophrenia risk genes discovered through genome-wide association studies (GWAS), such as neurogranin (NRGN), can be used to identify these mechanisms. In this study we examined the association of two common NRGN risk single nucleotide polymorphisms (SNPs) with functional and structural brain-based intermediate phenotypes for schizophrenia. We obtained structural, functional MRI and genotype data of 92 schizophrenia patients and 114 healthy volunteers from the multisite Mind Clinical Imaging Consortium study. Two schizophrenia-associated NRGN SNPs (rs12807809 and rs12541) were tested for association with working memory-elicited dorsolateral prefrontal cortex (DLPFC) activity and surface-wide cortical thickness. NRGN rs12541 risk allele homozygotes (TT) displayed increased working memory-related activity in several brain regions, including the left DLPFC, left insula, left somatosensory cortex and the cingulate cortex, when compared to non-risk allele carriers. NRGN rs12807809 non-risk allele (C) carriers showed reduced cortical gray matter thickness compared to risk allele homozygotes (TT) in an area comprising the right pericalcarine gyrus, the right cuneus, and the right lingual gyrus. Our study highlights the effects of schizophrenia risk variants in the NRGN gene on functional and structural brain-based intermediate phenotypes for schizophrenia. These results support recent GWAS findings and further implicate NRGN in the pathophysiology of schizophrenia by suggesting that genetic NRGN risk variants contribute to subtle changes in neural functioning and anatomy that can be quantified with neuroimaging methods. PMID:24098564

  10. COnsortium of METabolomics Studies (COMETS)

    Cancer.gov

    The COnsortium of METabolomics Studies (COMETS) is an extramural-intramural partnership that promotes collaboration among prospective cohort studies that follow participants for a range of outcomes and perform metabolomic profiling of individuals.

  11. Draft Genome of the Wheat Rust Pathogen (Puccinia triticina) Unravels Genome-Wide Structural Variations during Evolution.

    PubMed

    Kiran, Kanti; Rawal, Hukam C; Dubey, Himanshu; Jaswal, Rajdeep; Devanna, B N; Gupta, Deepak Kumar; Bhardwaj, Subhash C; Prasad, P; Pal, Dharam; Chhuneja, Parveen; Balasubramanian, P; Kumar, J; Swami, M; Solanke, Amolkumar U; Gaikwad, Kishor; Singh, Nagendra K; Sharma, Tilak Raj

    2016-01-01

    Leaf rust is one of the most important diseases of wheat and is caused by Puccinia triticina, a highly variable rust pathogen prevalent worldwide. Decoding the genome of this pathogen will help in unraveling the molecular basis of its evolution and in the identification of genes responsible for its various biological functions. We generated high quality draft genome sequences (approximately 100-106 Mb) of two races of P. triticina: the variable and virulent Race77 and the old, avirulent Race106. The genomes of races 77 and 106 had 33X and 27X coverage, respectively. We predicted 27678 and 26384 genes, with average lengths of 1,129 and 1,086 bases in races 77 and 106, respectively, and found that the genomes consisted of 37.49% and 39.99% repetitive sequences. Genome wide comparative analysis revealed that Race77 differs substantially from Race106 with regard to segmental duplication (SD), repeat element, and SNP/InDel characteristics. Comparative analyses showed that Race77 is a recent, highly variable and adapted race compared with Race106. Further sequence analyses of 13 additional pathotypes of Race77 clearly differentiated the recent, active and virulent, from the older pathotypes. Average densities of 2.4 SNPs and 0.32 InDels per kb were obtained for all P. triticina pathotypes. Secretome analysis demonstrated that Race77 has more virulence factors than Race106, which may be responsible for the greater degree of adaptation of this pathogen. We also found that genes under greater selection pressure were conserved in the genomes of both races, and may affect functions crucial for the higher levels of virulence factors in Race77. This study provides insights into the genome structure, genome organization, molecular basis of variation, and pathogenicity of P. triticina. The genome sequence data generated in this study have been submitted to public domain databases and will be an important resource for comparative genomics studies of the more than 4000 existing

  12. Hickory Consortium 2001 Final Report

    SciTech Connect

    Not Available

    2003-02-01

    As with all Building America Program consortia, systems thinking is the key to understanding the processes that Hickory Consortium hopes to improve. The Hickory Consortium applies this thinking to more than the whole-building concept. Their systems thinking embraces the meta process of how housing construction takes place in America. By understanding the larger picture, they are able to identify areas where improvements can be made and how to implement them.

  13. Micro and nanofluidic structures for cell sorting and genomic analysis

    NASA Astrophysics Data System (ADS)

    Morton, Keith J.

    Microfluidic systems promise rapid analysis of small samples in a compact and inexpensive format. But direct scaling of lab bench protocols on-chip is challenging because laminar flows in typical microfluidic devices are characterized by non-mixing streamlines. Common microfluidic mixers and sorters work by diffusion, limiting application to objects that diffuse slowly, such as cells and DNA. Recently, Huang et al. developed a passive microfluidic element to continuously separate bio-particles deterministically. In Deterministic Lateral Displacement (DLD), objects are sorted by size as they transit an asymmetric array of microfabricated posts. This thesis further develops DLD arrays with applications in three broad new areas. First, the arrays are used not simply to sort particles but to move streams of cells through functional flows for chemical treatment, such as on-chip immunofluorescent labeling of blood cells with washing and on-chip E. coli cell lysis with simultaneous chromosome extraction. Secondly, modular tiling of the basic DLD element is used to construct complex particle handling modes that include beam steering for jets of cells and beads. Thirdly, nanostructured DLD arrays are built using Nanoimprint Lithography (NIL), and continuous-flow separation of 100 nm and 200 nm size particles is demonstrated. Finally, a number of ancillary nanofabrication techniques were developed in support of these overall goals, including methods to interface nanofluidic structures with standard microfluidic components such as inlet channels and reservoirs, precision etching of ultra-high aspect ratio (>50:1) silicon nanostructures, and fabrication of narrow (~35 nm) channels used to stretch genomic-length DNA.

  14. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain.

    PubMed

    Sükösd, Zsuzsanna; Andersen, Ebbe S; Seemann, Stefan E; Jensen, Mads Krogh; Hansen, Mathias; Gorodkin, Jan; Kjems, Jørgen

    2015-12-01

    A distance-constrained secondary structural model of the ≈10 kb RNA genome of HIV-1 has been predicted, but higher-order structures involving long-distance interactions are currently unknown. We present the first global RNA secondary structure model for the HIV-1 genome, which integrates both comparative structure analysis and information from experimental data in a full-length prediction without distance constraints. Besides recovering known structural elements, we predict several novel structural elements that are conserved in HIV-1 evolution. Our results also indicate that the structure of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interestingly, a set of long-distance interactions forms a core organizing structure (COS) that organizes the genome into three major structural domains. Despite overlapping protein-coding regions, the COS is supported by a particularly high frequency of compensatory base changes, suggesting functional importance for this element. This new structural element potentially organizes the whole genome into three major domains protruding from a conserved core structure with potential roles in replication and evolution for the virus. PMID:26476446

  15. The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis

    PubMed Central

    Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

    2015-01-01

    Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5’ portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16-taxon angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids. PMID:26046631

  16. A sequence-based survey of the complex structural organization of tumor genomes

    SciTech Connect

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  17. Combustion Byproducts Recycling Consortium

    SciTech Connect

    Paul Ziemkiewicz; Tamara Vandivort; Debra Pflughoeft-Hassett; Y. Paul Chugh; James Hower

    2008-08-31

    Each year, over 100 million tons of solid byproducts are produced by coal-burning electric utilities in the United States. Annual production of flue gas desulfurization (FGD) byproducts continues to increase as the result of more stringent sulfur emission restrictions. In addition, stricter limits on NOx emissions mandated by the 1990 Clean Air Act have resulted in utility burner/boiler modifications that frequently yield higher carbon concentrations in fly ash, which restricts the use of the ash as a cement replacement. Controlling ammonia in ash is also of concern. If newer, 'clean coal' combustion and gasification technologies are adopted, their byproducts may also present a management challenge. The objective of the Combustion Byproducts Recycling Consortium (CBRC) is to develop and demonstrate technologies to address issues related to the recycling of byproducts associated with coal combustion processes. A goal of CBRC is that these technologies, by the year 2010, will raise the overall ash utilization rate from the current 34% to 50% by such measures as increasing the current rate of FGD byproduct use and increasing the number of uses considered 'allowable' under state regulations. Another issue of interest to the CBRC would be to examine the environmental impact of both byproduct utilization and disposal. No byproduct utilization technology is likely to be adopted by industry unless it is more cost-effective than landfilling. Therefore, it is extremely important that the utility industry provide guidance to the R&D program. Government agencies and private-sector organizations that may be able to utilize these materials in the conduct of their missions should also provide input. The CBRC will serve as an effective vehicle for acquiring and maintaining guidance from these diverse organizations so that the proper balance in the R&D program is achieved.

  18. Combustion Byproducts Recycling Consortium

    SciTech Connect

    Ziemkiewicz, Paul; Vandivort, Tamara; Pflughoeft-Hassett, Debra; Chugh, Y Paul; Hower, James

    2008-08-31

    Each year, over 100 million tons of solid byproducts are produced by coal-burning electric utilities in the United States. Annual production of flue gas desulfurization (FGD) byproducts continues to increase as the result of more stringent sulfur emission restrictions. In addition, stricter limits on NOx emissions mandated by the 1990 Clean Air Act have resulted in utility burner/boiler modifications that frequently yield higher carbon concentrations in fly ash, which restricts the use of the ash as a cement replacement. Controlling ammonia in ash is also of concern. If newer, “clean coal” combustion and gasification technologies are adopted, their byproducts may also present a management challenge. The objective of the Combustion Byproducts Recycling Consortium (CBRC) is to develop and demonstrate technologies to address issues related to the recycling of byproducts associated with coal combustion processes. A goal of CBRC is that these technologies, by the year 2010, will raise the overall ash utilization rate from the current 34% to 50% by such measures as increasing the current rate of FGD byproduct use and increasing the number of uses considered “allowable” under state regulations. Another issue of interest to the CBRC would be to examine the environmental impact of both byproduct utilization and disposal. No byproduct utilization technology is likely to be adopted by industry unless it is more cost-effective than landfilling. Therefore, it is extremely important that the utility industry provide guidance to the R&D program. Government agencies and private-sector organizations that may be able to utilize these materials in the conduct of their missions should also provide input. The CBRC will serve as an effective vehicle for acquiring and maintaining guidance from these diverse organizations so that the proper balance in the R&D program is achieved.

  19. Defining Genome Project Standards in a New Era of Sequencing

    SciTech Connect

    Chain, Patrick

    2009-05-27

    Patrick Chain of the DOE Joint Genome Institute gives a talk on behalf of the International Genome Sequencing Standards Consortium on the need for intermediate genome classifications between "draft" and "finished".

  20. Local chromatin structure of heterochromatin regulates repeated DNA stability, nucleolus structure, and genome integrity

    SciTech Connect

    Peng, Jamy C.

    2007-01-01

    Heterochromatin constitutes a significant portion of the genome in higher eukaryotes; approximately 30% in Drosophila and human. Heterochromatin contains a high repeat DNA content and a low density of protein-encoding genes. In contrast, euchromatin is composed mostly of unique sequences and contains the majority of single-copy genes. Genetic and cytological studies demonstrated that heterochromatin exhibits regulatory roles in chromosome organization, centromere function and telomere protection. As an epigenetically regulated structure, heterochromatin formation is not defined by any DNA sequence consensus. Heterochromatin is characterized by its association with nucleosomes containing methylated-lysine 9 of histone H3 (H3K9me), heterochromatin protein 1 (HP1) that binds H3K9me, and Su(var)3-9, which methylates H3K9 and binds HP1. Heterochromatin formation and functions are influenced by HP1, Su(var)3-9, and the RNA interference (RNAi) pathway. My thesis project investigates how heterochromatin formation and function impact nuclear architecture, repeated DNA organization, and genome stability in Drosophila melanogaster. H3K9me-based chromatin reduces extrachromosomal DNA formation; most likely by restricting the access of repair machineries to repeated DNAs. Reducing extrachromosomal ribosomal DNA stabilizes rDNA repeats and the nucleolus structure. H3K9me-based chromatin also inhibits DNA damage in heterochromatin. Cells with compromised heterochromatin structure, due to Su(var)3-9 or dcr-2 (a component of the RNAi pathway) mutations, display severe DNA damage in heterochromatin compared to wild type. In these mutant cells, accumulated DNA damage leads to chromosomal defects such as translocations, defective DNA repair response, and activation of the G2-M DNA repair and mitotic checkpoints that ensure cellular and animal viability. My thesis research suggests that DNA replication, repair, and recombination mechanisms in heterochromatin differ from those in

  1. Computational structural variation discovery in genomes: state of the art and challenges

    NASA Astrophysics Data System (ADS)

    Osipowski, Paweł; Pawełkowicz, Magdalena; Przybecki, Zbigniew

    2014-11-01

    Identifying structural variations is crucial to obtaining comprehensive knowledge of genomic differentiation. The massive volume of data generated by present technologies compels researchers to use computational methods for variation discovery in genomes. Here we review state-of-the-art methods and software used for structural variation discovery, focusing on results and attempting to identify the remaining challenges and possible future solutions.

  2. Duplex stem-loop-containing quadruplex motifs in the human genome: a combined genomic and structural study.

    PubMed

    Lim, Kah Wai; Jenjaroenpun, Piroon; Low, Zhen Jie; Khong, Zi Jian; Ng, Yi Siang; Kuznetsov, Vladimir Andreevich; Phan, Anh Tuân

    2015-06-23

    Duplex stem-loops and four-stranded G-quadruplexes have been implicated in (patho)biological processes. Overlap of stem-loop- and quadruplex-forming sequences could give rise to quadruplex-duplex hybrids (QDH), which combine features of both structural forms and could exhibit unique properties. Here, we present a combined genomic and structural study of stem-loop-containing quadruplex sequences (SLQS) in the human genome. Based on a maximum loop length of 20 nt, our survey identified 80 307 SLQS, embedded within 60 172 unique clusters. Our analysis suggested that these should cover close to half of total SLQS in the entire genome. Among these, 48 508 SLQS were strand-specifically located in genic/promoter regions, with the majority of genes displaying a low number of SLQS. Notably, genes containing abundant SLQS clusters were strongly associated with brain tissues. Enrichment analysis of SLQS-positive genes and mapping of SLQS onto transcriptional/mutagenesis hotspots and cancer-associated genes provided a statistical framework supporting the biological involvements of SLQS. In vitro formation of diverse QDH by selective SLQS hits was successfully verified by nuclear magnetic resonance spectroscopy. Folding topologies of two SLQS were elucidated in detail. We also demonstrated that sequence changes at mutation/single-nucleotide polymorphism loci could affect the structural conformations adopted by SLQS. Thus, our predicted SLQS offer novel insights into the potential involvement of QDH in diverse (patho)biological processes and could represent novel regulatory signals.

  3. Duplex stem-loop-containing quadruplex motifs in the human genome: a combined genomic and structural study

    PubMed Central

    Lim, Kah Wai; Jenjaroenpun, Piroon; Low, Zhen Jie; Khong, Zi Jian; Ng, Yi Siang; Kuznetsov, Vladimir Andreevich; Phan, Anh Tuân

    2015-01-01

    Duplex stem-loops and four-stranded G-quadruplexes have been implicated in (patho)biological processes. Overlap of stem-loop- and quadruplex-forming sequences could give rise to quadruplex–duplex hybrids (QDH), which combine features of both structural forms and could exhibit unique properties. Here, we present a combined genomic and structural study of stem-loop-containing quadruplex sequences (SLQS) in the human genome. Based on a maximum loop length of 20 nt, our survey identified 80 307 SLQS, embedded within 60 172 unique clusters. Our analysis suggested that these should cover close to half of total SLQS in the entire genome. Among these, 48 508 SLQS were strand-specifically located in genic/promoter regions, with the majority of genes displaying a low number of SLQS. Notably, genes containing abundant SLQS clusters were strongly associated with brain tissues. Enrichment analysis of SLQS-positive genes and mapping of SLQS onto transcriptional/mutagenesis hotspots and cancer-associated genes provided a statistical framework supporting the biological involvements of SLQS. In vitro formation of diverse QDH by selective SLQS hits was successfully verified by nuclear magnetic resonance spectroscopy. Folding topologies of two SLQS were elucidated in detail. We also demonstrated that sequence changes at mutation/single-nucleotide polymorphism loci could affect the structural conformations adopted by SLQS. Thus, our predicted SLQS offer novel insights into the potential involvement of QDH in diverse (patho)biological processes and could represent novel regulatory signals. PMID:25958397

  4. Training set optimization under population structure in genomic selection

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The optimization of the training set (TRS) in genomic selection (GS) has received much interest in both animal and plant breeding, because it is critical to the accuracy of the prediction models. In this study, five different TRS sampling algorithms, stratified sampling, mean of the Coefficient of D...

  5. Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution

    PubMed Central

    Yap, Jia-Yee S.; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y. H.; Wilkins, Marc R.; Rossetto, Maurizio; Delaney, Sven K.

    2015-01-01

    The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine. PMID:26061691

  6. RNA structural constraints in the evolution of the influenza A virus genome NP segment

    PubMed Central

    Gultyaev, Alexander P; Tsyganov-Bodounov, Anton; Spronken, Monique IJ; van der Kooij, Sander; Fouchier, Ron AM; Olsthoorn, René CL

    2014-01-01

    Conserved RNA secondary structures were predicted in the nucleoprotein (NP) segment of the influenza A virus genome using comparative sequence and structure analysis. A number of structural elements exhibiting nucleotide covariations were identified over the whole segment length, including protein-coding regions. Calculations of mutual information values at the paired nucleotide positions demonstrate that these structures impose considerable constraints on the virus genome evolution. Functional importance of a pseudoknot structure, predicted in the NP packaging signal region, was confirmed by plaque assays of the mutant viruses with disrupted structure and those with restored folding using compensatory substitutions. Possible functions of the conserved RNA folding patterns in the influenza A virus genome are discussed. PMID:25180940
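
    As a rough illustration of the mutual-information calculation mentioned in the abstract above, the following minimal Python sketch scores covariation between two alignment columns. The function name `mutual_information` and the toy columns are assumptions made for illustration; the record does not give the authors' exact procedure, so this is only the textbook definition.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j hold the nucleotide seen at two positions,
    one character per aligned sequence (hypothetical input format).
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must come from the same set of sequences")
    n = len(col_i)
    freq_i = Counter(col_i)              # single-position counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))  # joint counts of the pair
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Toy columns: every change at one position is compensated at the other
# (G-C, A-U, U-A, C-G), as expected for a conserved base pair.
covarying = mutual_information("GGAAUUCC", "CCUUAAGG")
# Fully conserved positions carry no covariation signal.
conserved = mutual_information("GGGG", "CCCC")
```

    Under this definition, perfectly compensated substitutions give a high score, while invariant or independently varying positions give zero, which is why high values at predicted paired positions are read as evidence of a structural constraint.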

  7. Comprehensive characterization of complex structural variations in cancer by directly comparing genome sequence reads.

    PubMed

    Moncunill, Valentí; Gonzalez, Santi; Beà, Sílvia; Andrieux, Lise O; Salaverria, Itziar; Royo, Cristina; Martinez, Laura; Puiggròs, Montserrat; Segura-Wang, Maia; Stütz, Adrian M; Navarro, Alba; Royo, Romina; Gelpí, Josep L; Gut, Ivo G; López-Otín, Carlos; Orozco, Modesto; Korbel, Jan O; Campo, Elias; Puente, Xose S; Torrents, David

    2014-11-01

    The development of high-throughput sequencing technologies has advanced our understanding of cancer. However, characterizing somatic structural variants in tumor genomes is still challenging because current strategies depend on the initial alignment of reads to a reference genome. Here, we describe SMUFIN (somatic mutation finder), a single program that directly compares sequence reads from normal and tumor genomes to accurately identify and characterize a range of somatic sequence variation, from single-nucleotide variants (SNV) to large structural variants at base pair resolution. Performance tests on modeled tumor genomes showed average sensitivity of 92% and 74% for SNVs and structural variants, with specificities of 95% and 91%, respectively. Analyses of aggressive forms of solid and hematological tumors revealed that SMUFIN identifies breakpoints associated with chromothripsis and chromoplexy with high specificity. SMUFIN provides an integrated solution for the accurate, fast and comprehensive characterization of somatic sequence variation in cancer. PMID:25344728

  8. Deeper insight into the structure of the anaerobic digestion microbial community; the biogas microbiome database is expanded with 157 new genomes.

    PubMed

    Treu, Laura; Kougias, Panagiotis G; Campanaro, Stefano; Bassani, Ilaria; Angelidaki, Irini

    2016-09-01

    This research aimed to better characterize the biogas microbiome by means of high-throughput metagenomic sequencing and to elucidate the core microbial consortium existing in biogas reactors independently of the operational conditions. Assembly of shotgun reads followed by an established binning strategy yielded the largest number of microbial genomes extracted to date from biogas-producing systems. Remarkably, the vast majority of the 236 extracted genome bins could only be characterized at high taxonomic levels. This result confirms that the biogas microbiome is composed of a consortium of unknown species. A comparative analysis between the genome bins of the current study and those extracted from a previous metagenomic assembly demonstrated a similar phylogenetic distribution of the main taxa. Finally, this analysis led to the identification of a subset of common microbes that could be considered the core essential group in biogas production. PMID:27243603

  9. Mining 3D genome structure populations identifies major factors governing the stability of regulatory communities

    PubMed Central

    Dai, Chao; Li, Wenyuan; Tjong, Harianto; Hao, Shengli; Zhou, Yonggang; Li, Qingjiao; Chen, Lin; Zhu, Bing; Alber, Frank; Jasmine Zhou, Xianghong

    2016-01-01

    Three-dimensional (3D) genome structures vary from cell to cell even in an isogenic sample. Unlike protein structures, genome structures are highly plastic, posing a significant challenge for structure-function mapping. Here we report an approach to comprehensively identify 3D chromatin clusters that each occurs frequently across a population of genome structures, either deconvoluted from ensemble-averaged Hi-C data or from a collection of single-cell Hi-C data. Applying our method to a population of genome structures (at the macrodomain resolution) of lymphoblastoid cells, we identify an atlas of stable inter-chromosomal chromatin clusters. A large number of these clusters are enriched in binding of specific regulatory factors and are therefore defined as ‘Regulatory Communities.’ We reveal two major factors, centromere clustering and transcription factor binding, which significantly stabilize such communities. Finally, we show that the regulatory communities differ substantially from cell to cell, indicating that expression variability could be impacted by genome structures. PMID:27240697

  10. Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    PubMed

    Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

    2005-12-01

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.

  11. Asymmetric cryo-EM reconstruction of phage MS2 reveals genome structure in situ.

    PubMed

    Koning, Roman I; Gomez-Blanco, Josue; Akopjana, Inara; Vargas, Javier; Kazaks, Andris; Tars, Kaspars; Carazo, José María; Koster, Abraham J

    2016-01-01

    In single-stranded ribonucleic acid (RNA) viruses, virus capsid assembly and genome packaging are intertwined processes. Using cryo-electron microscopy and single particle analysis we determined the asymmetric virion structure of bacteriophage MS2, which includes 178 copies of the coat protein, a single copy of the A-protein and the RNA genome. This reveals that in situ, the viral RNA genome can adopt a defined conformation. The RNA forms a branched network of stem-loops, almost all of which are located near the capsid inner surface, while predominantly binding to coat protein dimers that are located in one-half of the capsid. This suggests that genomic RNA is highly involved in genome packaging and virion assembly. PMID:27561669

  12. Asymmetric cryo-EM reconstruction of phage MS2 reveals genome structure in situ

    PubMed Central

    Koning, Roman I; Gomez-Blanco, Josue; Akopjana, Inara; Vargas, Javier; Kazaks, Andris; Tars, Kaspars; Carazo, José María; Koster, Abraham J.

    2016-01-01

    In single-stranded ribonucleic acid (RNA) viruses, virus capsid assembly and genome packaging are intertwined processes. Using cryo-electron microscopy and single particle analysis we determined the asymmetric virion structure of bacteriophage MS2, which includes 178 copies of the coat protein, a single copy of the A-protein and the RNA genome. This reveals that in situ, the viral RNA genome can adopt a defined conformation. The RNA forms a branched network of stem-loops, almost all of which are located near the capsid inner surface, while predominantly binding to coat protein dimers that are located in one-half of the capsid. This suggests that genomic RNA is highly involved in genome packaging and virion assembly. PMID:27561669

  13. Cell-of-Origin-Specific 3D Genome Structure Acquired during Somatic Cell Reprogramming

    PubMed Central

    Krijger, Peter Hugo Lodewijk; Di Stefano, Bruno; de Wit, Elzo; Limone, Francesco; van Oevelen, Chris; de Laat, Wouter; Graf, Thomas

    2016-01-01

    Summary Forced expression of reprogramming factors can convert somatic cells into induced pluripotent stem cells (iPSCs). Here we studied genome topology dynamics during reprogramming of different somatic cell types with highly distinct genome conformations. We find large-scale topologically associated domain (TAD) repositioning and alterations of tissue-restricted genomic neighborhoods and chromatin loops, effectively erasing the somatic-cell-specific genome structures while establishing an embryonic stem-cell-like 3D genome. Yet, early passage iPSCs carry topological hallmarks that enable recognition of their cell of origin. These hallmarks are not remnants of somatic chromosome topologies. Instead, the distinguishing topological features are acquired during reprogramming, as we also find for cell-of-origin-dependent gene expression patterns. PMID:26971819

  14. Cell-of-Origin-Specific 3D Genome Structure Acquired during Somatic Cell Reprogramming.

    PubMed

    Krijger, Peter Hugo Lodewijk; Di Stefano, Bruno; de Wit, Elzo; Limone, Francesco; van Oevelen, Chris; de Laat, Wouter; Graf, Thomas

    2016-05-01

    Forced expression of reprogramming factors can convert somatic cells into induced pluripotent stem cells (iPSCs). Here we studied genome topology dynamics during reprogramming of different somatic cell types with highly distinct genome conformations. We find large-scale topologically associated domain (TAD) repositioning and alterations of tissue-restricted genomic neighborhoods and chromatin loops, effectively erasing the somatic-cell-specific genome structures while establishing an embryonic stem-cell-like 3D genome. Yet, early passage iPSCs carry topological hallmarks that enable recognition of their cell of origin. These hallmarks are not remnants of somatic chromosome topologies. Instead, the distinguishing topological features are acquired during reprogramming, as we also find for cell-of-origin-dependent gene expression patterns.

  15. The ocean sampling day consortium.

    PubMed

    Kopf, Anna; Bicak, Mesude; Kottmann, Renzo; Schnetzer, Julia; Kostadinov, Ivaylo; Lehmann, Katja; Fernandez-Guerra, Antonio; Jeanthon, Christian; Rahav, Eyal; Ullrich, Matthias; Wichels, Antje; Gerdts, Gunnar; Polymenakou, Paraskevi; Kotoulas, Giorgos; Siam, Rania; Abdallah, Rehab Z; Sonnenschein, Eva C; Cariou, Thierry; O'Gara, Fergal; Jackson, Stephen; Orlic, Sandi; Steinke, Michael; Busch, Julia; Duarte, Bernardo; Caçador, Isabel; Canning-Clode, João; Bobrova, Oleksandra; Marteinsson, Viggo; Reynisson, Eyjolfur; Loureiro, Clara Magalhães; Luna, Gian Marco; Quero, Grazia Marina; Löscher, Carolin R; Kremp, Anke; DeLorenzo, Marie E; Øvreås, Lise; Tolman, Jennifer; LaRoche, Julie; Penna, Antonella; Frischer, Marc; Davis, Timothy; Katherine, Barker; Meyer, Christopher P; Ramos, Sandra; Magalhães, Catarina; Jude-Lemeilleur, Florence; Aguirre-Macedo, Ma Leopoldina; Wang, Shiao; Poulton, Nicole; Jones, Scott; Collin, Rachel; Fuhrman, Jed A; Conan, Pascal; Alonso, Cecilia; Stambler, Noga; Goodwin, Kelly; Yakimov, Michael M; Baltar, Federico; Bodrossy, Levente; Van De Kamp, Jodie; Frampton, Dion Mf; Ostrowski, Martin; Van Ruth, Paul; Malthouse, Paul; Claus, Simon; Deneudt, Klaas; Mortelmans, Jonas; Pitois, Sophie; Wallom, David; Salter, Ian; Costa, Rodrigo; Schroeder, Declan C; Kandil, Mahrous M; Amaral, Valentina; Biancalana, Florencia; Santana, Rafael; Pedrotti, Maria Luiza; Yoshida, Takashi; Ogata, Hiroyuki; Ingleton, Tim; Munnik, Kate; Rodriguez-Ezpeleta, Naiara; Berteaux-Lecellier, Veronique; Wecker, Patricia; Cancio, Ibon; Vaulot, Daniel; Bienhold, Christina; Ghazal, Hassan; Chaouni, Bouchra; Essayeh, Soumya; Ettamimi, Sara; Zaid, El Houcine; Boukhatem, Noureddine; Bouali, Abderrahim; Chahboune, Rajaa; Barrijal, Said; Timinouni, Mohammed; El Otmani, Fatima; Bennani, Mohamed; Mea, Marianna; Todorova, Nadezhda; Karamfilov, Ventzislav; Ten Hoopen, Petra; Cochrane, Guy; L'Haridon, Stephane; Bizsel, Kemal Can; Vezzi, Alessandro; Lauro, Federico M; Martin, Patrick; 
Jensen, Rachelle M; Hinks, Jamie; Gebbels, Susan; Rosselli, Riccardo; De Pascale, Fabio; Schiavon, Riccardo; Dos Santos, Antonina; Villar, Emilie; Pesant, Stéphane; Cataletto, Bruno; Malfatti, Francesca; Edirisinghe, Ranjith; Silveira, Jorge A Herrera; Barbier, Michele; Turk, Valentina; Tinta, Tinkara; Fuller, Wayne J; Salihoglu, Ilkay; Serakinci, Nedime; Ergoren, Mahmut Cerkez; Bresnan, Eileen; Iriberri, Juan; Nyhus, Paul Anders Fronth; Bente, Edvardsen; Karlsen, Hans Erik; Golyshin, Peter N; Gasol, Josep M; Moncheva, Snejana; Dzhembekova, Nina; Johnson, Zackary; Sinigalliano, Christopher David; Gidley, Maribeth Louise; Zingone, Adriana; Danovaro, Roberto; Tsiamis, George; Clark, Melody S; Costa, Ana Cristina; El Bour, Monia; Martins, Ana M; Collins, R Eric; Ducluzeau, Anne-Lise; Martinez, Jonathan; Costello, Mark J; Amaral-Zettler, Linda A; Gilbert, Jack A; Davies, Neil; Field, Dawn; Glöckner, Frank Oliver

    2015-01-01

    Ocean Sampling Day was initiated by the EU-funded Micro B3 (Marine Microbial Biodiversity, Bioinformatics, Biotechnology) project to obtain a snapshot of the marine microbial biodiversity and function of the world's oceans. It is a simultaneous global mega-sequencing campaign aiming to generate the largest standardized microbial data set in a single day. This will be achievable only through the coordinated efforts of an Ocean Sampling Day Consortium, supportive partnerships and networks between sites. This commentary outlines the establishment, function and aims of the Consortium and describes our vision for a sustainable study of marine microbial communities and their embedded functional traits. PMID:26097697

  16. The Ocean Sampling Day Consortium

    SciTech Connect

    Kopf, Anna; Bicak, Mesude; Kottmann, Renzo; Schnetzer, Julia; Kostadinov, Ivaylo; Lehmann, Katja; Fernandez-Guerra, Antonio; Jeanthon, Christian; Rahav, Eyal; Ullrich, Matthias; Wichels, Antje; Gerdts, Gunnar; Polymenakou, Paraskevi; Kotoulas, Giorgos; Siam, Rania; Abdallah, Rehab Z.; Sonnenschein, Eva C.; Cariou, Thierry; O’Gara, Fergal; Jackson, Stephen; Orlic, Sandi; Steinke, Michael; Busch, Julia; Duarte, Bernardo; Caçador, Isabel; Canning-Clode, João; Bobrova, Oleksandra; Marteinsson, Viggo; Reynisson, Eyjolfur; Loureiro, Clara Magalhães; Luna, Gian Marco; Quero, Grazia Marina; Löscher, Carolin R.; Kremp, Anke; DeLorenzo, Marie E.; Øvreås, Lise; Tolman, Jennifer; LaRoche, Julie; Penna, Antonella; Frischer, Marc; Davis, Timothy; Katherine, Barker; Meyer, Christopher P.; Ramos, Sandra; Magalhães, Catarina; Jude-Lemeilleur, Florence; Aguirre-Macedo, Ma Leopoldina; Wang, Shiao; Poulton, Nicole; Jones, Scott; Collin, Rachel; Fuhrman, Jed A.; Conan, Pascal; Alonso, Cecilia; Stambler, Noga; Goodwin, Kelly; Yakimov, Michael M.; Baltar, Federico; Bodrossy, Levente; Van De Kamp, Jodie; Frampton, Dion M. 
F.; Ostrowski, Martin; Van Ruth, Paul; Malthouse, Paul; Claus, Simon; Deneudt, Klaas; Mortelmans, Jonas; Pitois, Sophie; Wallom, David; Salter, Ian; Costa, Rodrigo; Schroeder, Declan C.; Kandil, Mahrous M.; Amaral, Valentina; Biancalana, Florencia; Santana, Rafael; Pedrotti, Maria Luiza; Yoshida, Takashi; Ogata, Hiroyuki; Ingleton, Tim; Munnik, Kate; Rodriguez-Ezpeleta, Naiara; Berteaux-Lecellier, Veronique; Wecker, Patricia; Cancio, Ibon; Vaulot, Daniel; Bienhold, Christina; Ghazal, Hassan; Chaouni, Bouchra; Essayeh, Soumya; Ettamimi, Sara; Zaid, El Houcine; Boukhatem, Noureddine; Bouali, Abderrahim; Chahboune, Rajaa; Barrijal, Said; Timinouni, Mohammed; El Otmani, Fatima; Bennani, Mohamed; Mea, Marianna; Todorova, Nadezhda; Karamfilov, Ventzislav; ten Hoopen, Petra; Cochrane, Guy; L’Haridon, Stephane; Bizsel, Kemal Can; Vezzi, Alessandro; Lauro, Federico M.; Martin, Patrick; Jensen, Rachelle M.; Hinks, Jamie; Gebbels, Susan; Rosselli, Riccardo; De Pascale, Fabio; Schiavon, Riccardo; dos Santos, Antonina; Villar, Emilie; Pesant, Stéphane; Cataletto, Bruno; Malfatti, Francesca; Edirisinghe, Ranjith; Silveira, Jorge A. Herrera; Barbier, Michele; Turk, Valentina; Tinta, Tinkara; Fuller, Wayne J.; Salihoglu, Ilkay; Serakinci, Nedime; Ergoren, Mahmut Cerkez; Bresnan, Eileen; Iriberri, Juan; Nyhus, Paul Anders Fronth; Bente, Edvardsen; Karlsen, Hans Erik; Golyshin, Peter N.; Gasol, Josep M.; Moncheva, Snejana; Dzhembekova, Nina; Johnson, Zackary; Sinigalliano, Christopher David; Gidley, Maribeth Louise; Zingone, Adriana; Danovaro, Roberto; Tsiamis, George; Clark, Melody S.; Costa, Ana Cristina; El Bour, Monia; Martins, Ana M.; Collins, R. Eric; Ducluzeau, Anne-Lise; Martinez, Jonathan; Costello, Mark J.; Amaral-Zettler, Linda A.; Gilbert, Jack A.; Davies, Neil; Field, Dawn; Glöckner, Frank Oliver

    2015-06-19

    In this study, Ocean Sampling Day was initiated by the EU-funded Micro B3 (Marine Microbial Biodiversity, Bioinformatics, Biotechnology) project to obtain a snapshot of the marine microbial biodiversity and function of the world’s oceans. It is a simultaneous global mega-sequencing campaign aiming to generate the largest standardized microbial data set in a single day. This will be achievable only through the coordinated efforts of an Ocean Sampling Day Consortium, supportive partnerships and networks between sites. This commentary outlines the establishment, function and aims of the Consortium and describes our vision for a sustainable study of marine microbial communities and their embedded functional traits.

  17. The ocean sampling day consortium.

    PubMed

    Kopf, Anna; Bicak, Mesude; Kottmann, Renzo; Schnetzer, Julia; Kostadinov, Ivaylo; Lehmann, Katja; Fernandez-Guerra, Antonio; Jeanthon, Christian; Rahav, Eyal; Ullrich, Matthias; Wichels, Antje; Gerdts, Gunnar; Polymenakou, Paraskevi; Kotoulas, Giorgos; Siam, Rania; Abdallah, Rehab Z; Sonnenschein, Eva C; Cariou, Thierry; O'Gara, Fergal; Jackson, Stephen; Orlic, Sandi; Steinke, Michael; Busch, Julia; Duarte, Bernardo; Caçador, Isabel; Canning-Clode, João; Bobrova, Oleksandra; Marteinsson, Viggo; Reynisson, Eyjolfur; Loureiro, Clara Magalhães; Luna, Gian Marco; Quero, Grazia Marina; Löscher, Carolin R; Kremp, Anke; DeLorenzo, Marie E; Øvreås, Lise; Tolman, Jennifer; LaRoche, Julie; Penna, Antonella; Frischer, Marc; Davis, Timothy; Katherine, Barker; Meyer, Christopher P; Ramos, Sandra; Magalhães, Catarina; Jude-Lemeilleur, Florence; Aguirre-Macedo, Ma Leopoldina; Wang, Shiao; Poulton, Nicole; Jones, Scott; Collin, Rachel; Fuhrman, Jed A; Conan, Pascal; Alonso, Cecilia; Stambler, Noga; Goodwin, Kelly; Yakimov, Michael M; Baltar, Federico; Bodrossy, Levente; Van De Kamp, Jodie; Frampton, Dion Mf; Ostrowski, Martin; Van Ruth, Paul; Malthouse, Paul; Claus, Simon; Deneudt, Klaas; Mortelmans, Jonas; Pitois, Sophie; Wallom, David; Salter, Ian; Costa, Rodrigo; Schroeder, Declan C; Kandil, Mahrous M; Amaral, Valentina; Biancalana, Florencia; Santana, Rafael; Pedrotti, Maria Luiza; Yoshida, Takashi; Ogata, Hiroyuki; Ingleton, Tim; Munnik, Kate; Rodriguez-Ezpeleta, Naiara; Berteaux-Lecellier, Veronique; Wecker, Patricia; Cancio, Ibon; Vaulot, Daniel; Bienhold, Christina; Ghazal, Hassan; Chaouni, Bouchra; Essayeh, Soumya; Ettamimi, Sara; Zaid, El Houcine; Boukhatem, Noureddine; Bouali, Abderrahim; Chahboune, Rajaa; Barrijal, Said; Timinouni, Mohammed; El Otmani, Fatima; Bennani, Mohamed; Mea, Marianna; Todorova, Nadezhda; Karamfilov, Ventzislav; Ten Hoopen, Petra; Cochrane, Guy; L'Haridon, Stephane; Bizsel, Kemal Can; Vezzi, Alessandro; Lauro, Federico M; Martin, Patrick; 
Jensen, Rachelle M; Hinks, Jamie; Gebbels, Susan; Rosselli, Riccardo; De Pascale, Fabio; Schiavon, Riccardo; Dos Santos, Antonina; Villar, Emilie; Pesant, Stéphane; Cataletto, Bruno; Malfatti, Francesca; Edirisinghe, Ranjith; Silveira, Jorge A Herrera; Barbier, Michele; Turk, Valentina; Tinta, Tinkara; Fuller, Wayne J; Salihoglu, Ilkay; Serakinci, Nedime; Ergoren, Mahmut Cerkez; Bresnan, Eileen; Iriberri, Juan; Nyhus, Paul Anders Fronth; Bente, Edvardsen; Karlsen, Hans Erik; Golyshin, Peter N; Gasol, Josep M; Moncheva, Snejana; Dzhembekova, Nina; Johnson, Zackary; Sinigalliano, Christopher David; Gidley, Maribeth Louise; Zingone, Adriana; Danovaro, Roberto; Tsiamis, George; Clark, Melody S; Costa, Ana Cristina; El Bour, Monia; Martins, Ana M; Collins, R Eric; Ducluzeau, Anne-Lise; Martinez, Jonathan; Costello, Mark J; Amaral-Zettler, Linda A; Gilbert, Jack A; Davies, Neil; Field, Dawn; Glöckner, Frank Oliver

    2015-01-01

    Ocean Sampling Day was initiated by the EU-funded Micro B3 (Marine Microbial Biodiversity, Bioinformatics, Biotechnology) project to obtain a snapshot of the marine microbial biodiversity and function of the world's oceans. It is a simultaneous global mega-sequencing campaign aiming to generate the largest standardized microbial data set in a single day. This will be achievable only through the coordinated efforts of an Ocean Sampling Day Consortium, supportive partnerships and networks between sites. This commentary outlines the establishment, function and aims of the Consortium and describes our vision for a sustainable study of marine microbial communities and their embedded functional traits.

  18. Hemipteran Mitochondrial Genomes: Features, Structures and Implications for Phylogeny

    PubMed Central

    Wang, Yuan; Chen, Jing; Jiang, Li-Yun; Qiao, Ge-Xia

    2015-01-01

    The study of Hemipteran mitochondrial genomes (mitogenomes) began with the Chagas disease vector, Triatoma dimidiata, in 2001. At present, 90 complete Hemipteran mitogenomes have been sequenced and annotated. This review examines the history of Hemipteran mitogenome research and summarizes their main features, including genome organization, nucleotide composition, protein-coding genes, tRNAs and rRNAs, and non-coding regions. Special attention is given to the comparative analysis of repeat regions. Gene rearrangements are an additional data type for a few families, and most mitogenomes are arranged in the same order as that of the proposed ancestral insect. We also discuss and provide insights on the phylogenetic analyses of a variety of taxonomic levels. This review is expected to further expand our understanding of research in this field and serve as a valuable reference resource. PMID:26039239

  19. The structure of the protein universe and genome evolution.

    PubMed

    Koonin, Eugene V; Wolf, Yuri I; Karev, Georgy P

    2002-11-14

    Despite the practically unlimited number of possible protein sequences, the number of basic shapes in which proteins fold seems not only to be finite, but also to be relatively small, with probably no more than 10,000 folds in existence. Moreover, the distribution of proteins among these folds is highly non-homogeneous -- some folds and superfamilies are extremely abundant, but most are rare. Protein folds and families encoded in diverse genomes show similar size distributions with notable mathematical properties, which also extend to the number of connections between domains in multidomain proteins. All these distributions follow asymptotic power laws, such as have been identified in a wide variety of biological and physical systems, and which are typically associated with scale-free networks. These findings suggest that genome evolution is driven by extremely general mechanisms based on the preferential attachment principle.
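
    To make the preferential attachment idea in the abstract above more concrete, here is a minimal Python sketch of a generic "rich get richer" process: each new gene either founds a new family or joins an existing one with probability proportional to that family's size. This is a standard textbook-style simulation, not the authors' model; the function name `simulate_family_sizes` and the parameters `p_new` and `seed` are assumptions introduced for illustration.

```python
import random
from collections import Counter


def simulate_family_sizes(n_genes, p_new=0.1, seed=0):
    """Grow gene families by preferential attachment.

    Returns a Counter mapping family id -> number of member genes.
    p_new is the (assumed) chance that a new gene founds a new family.
    """
    rng = random.Random(seed)
    family_of_gene = [0]   # gene 0 founds family 0
    n_families = 1
    for _ in range(n_genes - 1):
        if rng.random() < p_new:
            # Innovation: the new gene starts its own family.
            family_of_gene.append(n_families)
            n_families += 1
        else:
            # Copying the family of a uniformly chosen existing gene
            # selects a family with probability proportional to its size.
            family_of_gene.append(rng.choice(family_of_gene))
    return Counter(family_of_gene)


sizes = simulate_family_sizes(5000)
ordered = sorted(sizes.values())
median_size = ordered[len(ordered) // 2]
largest_size = ordered[-1]
```

    In runs of this kind a few families become very large while most stay small, the qualitative pattern (a few abundant folds, many rare ones) that the abstract associates with power-law distributions.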

  20. Primary structure of a genomic zein sequence of maize.

    PubMed Central

    Hu, N T; Peifer, M A; Heidecker, G; Messing, J; Rubenstein, I

    1982-01-01

    The nucleotide sequence of a genomic clone (termed Z4) of the zein multigene family was compared to the nucleotide sequence of related cDNA clones of zein mRNAs. A tandem duplication of a 96-bp sequence is found in the genomic clone that is not present in the related cDNA clones. When the duplication is disregarded, the nucleotide sequence homology between Z4 and its related cDNAs was approximately 97%. The nucleotide sequence is also compared to other isolated cDNAs. No introns in the coding region of the zein gene are detected. The first nucleotide of a putative TATA box, TATAAATA, was located 88 nucleotides upstream of the first nucleotide of the first ATG codon which initiated the open reading frame. The first nucleotide of a putative CCAAT box, CAAAAT, appeared 45 nucleotides upstream of the first nucleotide of the zein cDNA clones in the 3' non-coding region also appeared in the genomic sequence at the same locations. The amino acid composition of the polypeptide specified by the Z4 nucleotide sequence is similar to the known composition of zein proteins. PMID:6233138

  1. Associations between inverted repeats and the structural evolution of bacterial genomes.

    PubMed Central

    Achaz, Guillaume; Coissac, Eric; Netter, Pierre; Rocha, Eduardo P C

    2003-01-01

    The stability of the structure of bacterial genomes is challenged by recombination events. Since major rearrangements (i.e., inversions) are thought to frequently operate by homologous recombination between inverted repeats, we analyzed the presence and distribution of such repeats in bacterial genomes and their relation to the conservation of chromosomal structure. First, we show that there is a strong under-representation of inverted repeats, relative to direct repeats, in most chromosomes, especially among the ones regarded as most stable. Second, we show that the avoidance of repeats is frequently associated with the stability of the genomes. Closely related genomes reported to differ in terms of stability are also found to differ in the number of inverted repeats. Third, when using replication strand bias as a proxy for genome stability, we find a significant negative correlation between this strand bias and the abundance of inverted repeats. Fourth, when measuring the recombining potential of inverted repeats and their eventual impact on different features of the chromosomal structure, we observe a tendency of repeats to be located in the chromosome in such a way that rearrangements produce a smaller strand switch and smaller asymmetries than expected by chance. Finally, we discuss the limitations of our analysis and the influence of factors such as the nature of repeats, e.g., transposases, or the differences in the recombination machinery among bacteria. These results shed light on the challenges imposed on the genome structure by the presence of inverted repeats. PMID:12930739

  2. Midwest Superconductivity Consortium: 1994 Progress report

    SciTech Connect

    Not Available

    1995-01-01

    The mission of the Midwest Superconductivity Consortium, MISCON, is to advance the science and understanding of high-Tc superconductivity. During the past year, 27 projects produced over 123 talks and 139 publications. Group activities and interactions involved 2 MISCON group meetings (held in August and January), with the second MISCON Workshop held in August; 13 external speakers; 79 collaborations (with universities, industry, Federal laboratories, and foreign research centers); and 48 exchanges of samples and/or measurements. Research achievements this past year focused on understanding the effects of processing phenomena on structure-property interrelationships and the fundamental nature of transport properties in high-temperature superconductors.

  3. Consortium for materials development in space

    NASA Technical Reports Server (NTRS)

    1993-01-01

    During fiscal 1993, the Consortium for Materials Development in Space (CMDS) maintained the organizational structure and project orientation established in prior years. The commercial objectives are improved materials, biomedical applications, and infrastructure and support hardware. Projects include nonlinear optical materials; space materials (specifically polymer foam/films, atomic oxygen and high temperature superconductors); alloyed and blended materials: sintered and alloyed materials; polymer and carbonate blends; electrodeposition; organic separation; materials dispersion and biodynamics; space carriers: Consort, COMET support, Spacehab utilization; and flight services: accelerometers, CMIX, USEC, ORSEP, and Space Experiment Facility (SEF).

  4. The genome and structural proteome of an ocean siphovirus: a new window into the cyanobacterial 'mobilome'.

    PubMed

    Sullivan, Matthew B; Krastins, Bryan; Hughes, Jennifer L; Kelly, Libusha; Chase, Michael; Sarracino, David; Chisholm, Sallie W

    2009-11-01

    Prochlorococcus, an abundant phototroph in the oceans, are infected by members of three families of viruses: myo-, podo- and siphoviruses. Genomes of myo- and podoviruses isolated on Prochlorococcus contain DNA replication machinery and virion structural genes homologous to those from coliphages T4 and T7 respectively. They also contain a suite of genes of cyanobacterial origin, most notably photosynthesis genes, which are expressed during infection and appear integral to the evolutionary trajectory of both host and phage. Here we present the first genome of a cyanobacterial siphovirus, P-SS2, which was isolated from Atlantic slope waters using a Prochlorococcus host (MIT9313). The P-SS2 genome is larger than, and considerably divergent from, previously sequenced siphoviruses. It appears most closely related to lambdoid siphoviruses, with which it shares 13 functional homologues. The approximately 108 kb P-SS2 genome encodes 131 predicted proteins and notably lacks photosynthesis genes which have consistently been found in other marine cyanophage, but does contain 14 other cyanobacterial homologues. While only six structural proteins were identified from the genome sequence, 35 proteins were detected experimentally; these mapped onto capsid and tail structural modules in the genome. P-SS2 is potentially capable of integration into its host as inferred from bioinformatically identified genetic machinery int, bet, exo and a 53 bp attachment site. The host attachment site appears to be a genomic island that is tied to insertion sequence (IS) activity that could facilitate mobility of a gene involved in the nitrogen-stress response. The homologous region and a secondary IS-element hot-spot in Synechococcus RS9917 are further evidence of IS-mediated genome evolution coincident with a probable relic prophage integration event. This siphovirus genome provides a glimpse into the biology of a deep-photic zone phage as well as the ocean cyanobacterial prophage and IS element

  5. Physical and Genetic Structure of the Maize Genome Reflects Its Complex Evolutionary History

    PubMed Central

    Wei, Fusheng; Coe, Ed; Nelson, William; Bharti, Arvind K; Engler, Fred; Butler, Ed; Kim, HyeRan; Goicoechea, Jose Luis; Chen, Mingsheng; Lee, Seunghee; Fuks, Galina; Sanchez-Villeda, Hector; Schroeder, Steven; Fang, Zhiwei; McMullen, Michael; Davis, Georgia; Bowers, John E; Paterson, Andrew H; Schaeffer, Mary; Gardiner, Jack; Cone, Karen; Messing, Joachim; Soderlund, Carol; Wing, Rod A

    2007-01-01

    Maize (Zea mays L.) is one of the most important cereal crops and a model for the study of genetics, evolution, and domestication. To better understand maize genome organization and to build a framework for genome sequencing, we constructed a sequence-ready fingerprinted contig-based physical map that covers 93.5% of the genome, of which 86.1% is aligned to the genetic map. The fingerprinted contig map contains 25,908 genic markers that enabled us to align nearly 73% of the anchored maize genome to the rice genome. The distribution pattern of expressed sequence tags correlates to that of recombination. In collinear regions, 1 kb in rice corresponds to an average of 3.2 kb in maize, yet maize has a 6-fold genome size expansion. This can be explained by the fact that most rice regions correspond to two regions in maize as a result of its recent polyploid origin. Inversions account for the majority of chromosome structural variations during subsequent maize diploidization. We also find clear evidence of ancient genome duplication predating the divergence of the progenitors of maize and rice. Reconstructing the paleoethnobotany of the maize genome indicates that the progenitors of modern maize contained ten chromosomes. PMID:17658954
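
    The way the two figures in the abstract fit together can be written as a rough back-of-the-envelope check. It assumes, as an idealization, that every rice region corresponds to exactly two maize regions and that each is expanded by the average collinear ratio; the symbols G_maize and G_rice for the two genome sizes are introduced here for illustration only.

```latex
\[
\frac{G_{\text{maize}}}{G_{\text{rice}}}
  \;\approx\;
  \underbrace{2}_{\text{maize regions per rice region}}
  \times
  \underbrace{3.2}_{\text{kb maize per kb rice}}
  \;=\; 6.4 \;\approx\; 6
\]
```

    Because the abstract says only that most rice regions have two maize counterparts, the estimate of 6.4 is an upper-end approximation of the reported 6-fold expansion rather than an exact identity.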

  6. Three-dimensional Structure of a Viral Genome-delivery Portal Vertex

    SciTech Connect

    A Olia; P Prevelige Jr.; J Johnson; G Cingolani

    2011-12-31

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here, we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25-Å-resolution structure of the portal-protein core bound to 12 copies of gene product 4 (gp4) reveals a ~1.1-MDa assembly formed by 24 proteins. Unexpectedly, a lower-resolution structure of the full-length portal protein unveils the unique topology of the C-terminal domain, which forms a ~200-Å-long α-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell.

  7. CFD parametric study of consortium impeller

    NASA Astrophysics Data System (ADS)

    Cheng, Gary C.; Chen, Y. S.; Garcia, Roberto; Williams, Robert W.

    1993-07-01

    Due to the complexity of blade geometries, the TANDEM blade configurations were analyzed with the multi-zone grid structure. Both the 7.5 deg- and the 22.5 deg-clocking TANDEM blade cases utilized an 80K mesh system. The numerical results for the two TANDEM blade modifications indicate that the efficiency and the head are worse than those of the baseline case due to larger flow distortion. The gap between the TANDEM blade and the full blade allows the flow to pass through and heavily load the pressure side of the partial blade, such that flow reversal occurs near the suction side of the splitter. The flow split at the exit of the impeller blades is very non-uniform for the TANDEM blade cases, and this will greatly increase the side load on the diffuser.

  8. Visualizing the global secondary structure of a viral RNA genome with cryo-electron microscopy.

    PubMed

    Garmann, Rees F; Gopal, Ajaykumar; Athavale, Shreyas S; Knobler, Charles M; Gelbart, William M; Harvey, Stephen C

    2015-05-01

    The lifecycle, and therefore the virulence, of single-stranded (ss)-RNA viruses is regulated not only by their particular protein gene products, but also by the secondary and tertiary structure of their genomes. The secondary structure of the entire genomic RNA of satellite tobacco mosaic virus (STMV) was recently determined by selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE). The SHAPE analysis suggested a single highly extended secondary structure with much less branching than occurs in the ensemble of structures predicted by purely thermodynamic algorithms. Here we examine the solution-equilibrated STMV genome by direct visualization with cryo-electron microscopy (cryo-EM), using an RNA of similar length transcribed from the yeast genome as a control. The cryo-EM data reveal an ensemble of branching patterns that are collectively consistent with the SHAPE-derived secondary structure model. Thus, our results both elucidate the statistical nature of the secondary structure of large ss-RNAs and give visual support for modern RNA structure determination methods. Additionally, this work introduces cryo-EM as a means to distinguish between competing secondary structure models if the models differ significantly in terms of the number and/or length of branches. Furthermore, with the latest advances in cryo-EM technology, we suggest the possibility of developing methods that incorporate restraints from cryo-EM into the next generation of algorithms for the determination of RNA secondary and tertiary structures.

  9. Brain Tumor Epidemiology Consortium (BTEC)

    Cancer.gov

    The Brain Tumor Epidemiology Consortium is an open scientific forum organized to foster the development of multi-center, international and inter-disciplinary collaborations that will lead to a better understanding of the etiology, outcomes, and prevention of brain tumors.

  10. The Virginia Home Visiting Consortium

    ERIC Educational Resources Information Center

    Bodkin, Catherine

    2010-01-01

    The Virginia Home Visiting Consortium (HVC) is a collaboration of public and private organizations which work to improve the effectiveness and efficiency of home visiting services throughout the state. The HVC identified service needs and gaps and has focused on increasing the interagency state and local partnerships so that resources are…

  11. Identification of repeat structure in large genomes using repeat probability clouds.

    PubMed

    Gu, Wanjun; Castoe, Todd A; Hedges, Dale J; Batzer, Mark A; Pollock, David D

    2008-09-01

    The identification of repeat structure in eukaryotic genomes can be time-consuming and difficult because of the large amount of information (approximately 3 × 10^9 bp) that needs to be processed and compared. We introduce a new approach based on exact word counts to evaluate, de novo, the repeat structure present within large eukaryotic genomes. This approach avoids sequence alignment and similarity search, two of the most time-consuming components of traditional methods for repeat identification. Algorithms were implemented to efficiently calculate exact counts for any length oligonucleotide in large genomes. Based on these oligonucleotide counts, oligonucleotide excess probability clouds, or "P-clouds," were constructed. P-clouds are composed of clusters of related oligonucleotides that occur, as a group, more often than expected by chance. After construction, P-clouds were mapped back onto the genome, and regions of high P-cloud density were identified as repetitive regions based on a sliding window approach. This efficient method is capable of analyzing the repeat content of the entire human genome on a single desktop computer in less than half a day, at least 10-fold faster than current approaches. The predicted repetitive regions strongly overlap with known repeat elements as well as other repetitive regions such as gene families, pseudogenes, and segmental duplicons. This method should be extremely useful as a tool for use in de novo identification of repeat structure in large newly sequenced genomes.
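
    The two ingredients named in the abstract, exact word counts and a sliding window over the genome, can be sketched in a few lines. This is a simplified illustration and not the P-clouds algorithm itself: it flags individual high-copy words rather than clustering related oligonucleotides into clouds, and the function names and thresholds (`min_count`, `window`, `min_fraction`) are assumptions made for the example.

```python
from collections import Counter


def count_kmers(seq, k):
    """Exact counts of every length-k word in seq (no alignment needed)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def high_copy_windows(seq, k, min_count, window, min_fraction):
    """Return start positions of windows dense in high-copy k-mers.

    A k-mer is 'high-copy' if it occurs at least min_count times in seq.
    A window of `window` consecutive k-mer positions is reported when at
    least min_fraction of its k-mers are high-copy.
    """
    counts = count_kmers(seq, k)
    n = len(seq) - k + 1
    flags = [1 if counts[seq[i:i + k]] >= min_count else 0 for i in range(n)]

    hits = []
    running = sum(flags[:window])  # high-copy k-mers in the first window
    for start in range(n - window + 1):
        if running / window >= min_fraction:
            hits.append(start)
        if start + window < n:
            # Slide the window one position to the right.
            running += flags[start + window] - flags[start]
    return hits
```

    For a genome-scale input the counting step would need a more memory-efficient structure than a Python `Counter`; the sketch only shows the logic.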

  12. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    PubMed

    Du, Jiang; Bjornson, Robert D; Zhang, Zhengdong D; Kong, Yong; Snyder, Michael; Gerstein, Mark B

    2009-07-01

    The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of human genomes at

  13. Evolution of recombination and genome structure in eusocial insects.

    PubMed

    Kent, Clement F; Zayed, Amro

    2013-03-01

    Eusocial Hymenoptera, such as the European honey bee, Apis mellifera, have the highest recombination rates of multicellular animals.(1) Recently, we showed(2) that a side-effect of recombination in the honey bee, GC biased gene conversion (bGC), helps maintain the unusual bimodal GC-content distribution of the bee genome by increasing GC-content in high recombination areas while low recombination areas are losing GC-content because of biased AT mutations and low rates of bGC. Although the very high recombination rate of A. mellifera makes GC-content evolution easier to study, the pattern is consistent with results found in many other species including mammals and yeast.(3) Also consistent across phyla is the association of higher genetic diversity and divergence with high GC and high recombination areas.(4,5) Finally, we showed that genes overexpressed in the brains of workers cluster in GC-rich genomic areas with the highest rates of recombination and molecular evolution.(2) In this Addendum we present a conceptual model of how eusociality and high recombination rates may co-evolve.

  14. Structural analysis of a carcinogen-induced genomic rearrangement event

    SciTech Connect

    Barr, F.G.; Davis, R.J.; Eichenfield, L.; Emanuel, B.S. (Univ. of Pennsylvania, Philadelphia)

    1992-02-01

    The authors have explored the mechanism of genomic rearrangement in a hamster fibroblast cell culture system in which rearrangements are induced 5′ to the endogenous thymidine kinase gene by chemical carcinogen treatment. The wild-type region around one rearrangement breakpoint was cloned and sequenced. With this sequence information, the carcinogen-induced rearrangement was cloned from the corresponding rearranged cell line by the inverse polymerase chain reaction. After the breakpoint fragment was sequenced, the wild-type rearrangement partner (RP15) was isolated by a second inverse polymerase chain reaction of unrearranged DNA. Comparison of the sequence of the rearrangement breakpoint with the wild-type RP15 and 5′ thymidine kinase gene regions revealed short repeats directly at the breakpoint, as well as nearby A+T-rich regions in the rearrangement partner. Therefore, these studies reveal interesting sequence and chromatin features near the rearrangement breakpoints and suggest a role for nuclear organization in the mechanism of carcinogen-induced genomic rearrangement.

  15. An integrated map of structural variation in 2,504 human genomes.

    PubMed

    Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Hsi-Yang Fritz, Markus; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Paolo Casale, Francesco; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Jasmine Mu, Xinmeng; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki; Muzny, Donna M; Nelson, Bradley J; Noor, Amina; Parrish, Nicholas F; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A; Untergasser, Andreas; Walker, Jerilyn A; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A; McCarroll, Steven A; Mills, Ryan E; Gerstein, Mark B; Bashir, Ali; Stegle, Oliver; Devine, Scott E; Lee, Charles; Eichler, Evan E; Korbel, Jan O

    2015-10-01

    Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association. PMID:26432246

  17. RNA structure: merging chemistry and genomics for a holistic perspective.

    PubMed

    Kubota, Miles; Chan, Dalen; Spitale, Robert C

    2015-10-01

    The advent of deep sequencing technology has unexpectedly advanced our structural understanding of molecules composed of nucleic acids. A significant amount of progress has been made recently extrapolating the chemical methods to probe RNA structure into sequencing methods. Herein we review some of the canonical methods to analyze RNA structure, and then we outline how these have been used to probe the structure of many RNAs in parallel. The key is the transformation of structural biology problems into sequencing problems, whereby sequencing power can be interpreted to understand nucleic acid proximity, nucleic acid conformation, or nucleic acid-protein interactions. Utilizing such technologies in this way has the promise to provide novel structural insights into the mechanisms that control normal cellular physiology and provide insight into how structure could be perturbed in disease.

  18. 77 FR 38770 - Notice of Consortium on “nSoft Consortium”

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-29

    ... thereafter. Non-profit organizations in lieu of membership fees will contribute personal expertise and... meeting, revisions have been made to the membership fee structure and the initial period of time for the consortium. Also, the consortium is open to a limited number of for-profit and not-for-profit...

  19. Genomic structural variation contributes to phenotypic change of industrial bioethanol yeast Saccharomyces cerevisiae.

    PubMed

    Zhang, Ke; Zhang, Li-Jie; Fang, Ya-Hong; Jin, Xin-Na; Qi, Lei; Wu, Xue-Chang; Zheng, Dao-Qiong

    2016-03-01

    Genomic structural variation (GSV) is a ubiquitous phenomenon observed in the genomes of Saccharomyces cerevisiae strains with different genetic backgrounds; however, the physiological and phenotypic effects of GSV are not well understood. Here, we first revealed the genetic characteristics of a widely used industrial S. cerevisiae strain, ZTW1, by whole genome sequencing. ZTW1 was identified as an aneuploidy strain and a large-scale GSV was observed in the ZTW1 genome compared with the genome of a diploid strain YJS329. These GSV events led to copy number variations (CNVs) in many chromosomal segments as well as one whole chromosome in the ZTW1 genome. Changes in the DNA dosage of certain functional genes directly affected their expression levels and the resultant ZTW1 phenotypes. Moreover, CNVs of large chromosomal regions triggered an aneuploidy stress in ZTW1. This stress decreased the proliferation ability and tolerance of ZTW1 to various stresses, while aneuploidy response stress may also provide some benefits to the fermentation performance of the yeast, including increased fermentation rates and decreased byproduct generation. This work reveals genomic characters of the bioethanol S. cerevisiae strain ZTW1 and suggests that GSV is an important kind of mutation that changes the traits of industrial S. cerevisiae strains.

  20. Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

    SciTech Connect

    Condon, Bradford J.; Leng, Yueqiang; Wu, Dongliang; Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinIzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

    2013-01-24

    The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP-encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence.

  1. Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

    PubMed Central

    Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinIzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

    2013-01-01

    The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP–encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence. PMID:23357949

  2. Genome structure of Abelson murine leukemia virus variants: proviruses in fibroblasts and lymphoid cells.

    PubMed Central

    Goff, S P; Witte, O N; Gilboa, E; Rosenberg, N; Baltimore, D

    1981-01-01

    We have prepared full-length DNA clones of the Abelson murine leukemia virus (A-MuLV) genome. A specific probe homologous to the central portion of the A-MuLV genome was prepared by nick translation of a subcloned restriction fragment from the cloned DNA. The probe was used to examine the genome structure of several A-MuLV variants. The conclusions are: (i) three viruses coding for Abelson-specific proteins of molecular weight 120,000, 100,000, and 90,000 had genomes indistinguishable in size, suggesting that the shorter proteins are the result of early translational termination; (ii) compared with the genome encoding the 120,000-dalton (120K) protein, a genome coding for a 160K protein was 0.8 kilobase larger in the A-MuLV-specific region; and (iii) a genome coding for a 92K protein had a 700-base pair deletion internal to the coding region. This mutant was transformation defective: its 92K protein lacked the protein kinase activity normally associated with the A-MuLV protein, and cells containing the virus were not morphologically transformed. In addition, we determined the number of A-MuLV proviruses in each of several transformed fibroblast and lymphoid cells prepared by infection in vitro. These experiments show that a single copy of the A-MuLV provirus is sufficient to transform both types of cells and that nonproducer cells generally have only one integrated provirus. PMID:6264122

  3. Genomic and structural organization of Drosophila melanogaster G elements.

    PubMed Central

    Di Nocera, P P; Graziani, F; Lavorgna, G

    1986-01-01

    The properties and the genomic organization of G elements, a moderately repeated DNA family of D. melanogaster, are reported. G elements lack terminal repeats, generate target site duplications at the point of insertion and exhibit at one end a stretch of A residues of variable length. In a large number of recombinant clones analyzed G elements occur in tandem arrays, interspersed with specific ribosomal DNA (rDNA) segments. This arrangement results from the insertion of members of the G family within the nontranscribed spacer (NTS) of rDNA units. Similarity of the site of integration of G elements to that of ribosomal DNA insertions suggests that distinct DNA sequences might have been inserted into rDNA through a partly common pathway. PMID:3003691

  4. Toward a standard in structural genome annotation for prokaryotes

    DOE PAGESBeta

    Tripp, H. James; Sutton, Granger; White, Owen; Wortman, Jennifer; Pati, Amrita; Mikhailova, Natalia; Ovchinnikova, Galina; Payne, Samuel H.; Kyrpides, Nikos C.; Ivanova, Natalia

    2015-07-25

    In an effort to identify the best practice for finding genes in prokaryotic genomes and propose it as a standard for automated annotation pipelines, we collected 1,004,576 peptides from various publicly available resources, and these were used as a basis to evaluate various gene-calling methods. The peptides came from 45 bacterial replicons with an average GC content from 31 % to 74 %, biased toward higher GC content genomes. Automated, manual, and semi-manual methods were used to tally errors in three widely used gene calling methods, as evidenced by peptides mapped outside the boundaries of called genes. We found that the consensus set of identical genes predicted by the three methods constitutes only about 70 % of the genes predicted by each individual method (with start and stop required to coincide). Peptide data was useful for evaluating some of the differences between gene callers, but not reliable enough to make the results conclusive, due to limitations inherent in any proteogenomic study. A single, unambiguous, unanimous best practice did not emerge from this analysis, since the available proteomics data were not adequate to provide an objective measurement of differences in the accuracy between these methods. However, as a result of this study, software, reference data, and procedures have been better matched among participants, representing a step toward a much-needed standard. In the absence of sufficient amount of experimental data to achieve a universal standard, our recommendation is that any of these methods can be used by the community, as long as a single method is employed across all datasets to be compared.

  5. Toward a standard in structural genome annotation for prokaryotes

    SciTech Connect

    Tripp, H. James; Sutton, Granger; White, Owen; Wortman, Jennifer; Pati, Amrita; Mikhailova, Natalia; Ovchinnikova, Galina; Payne, Samuel H.; Kyrpides, Nikos C.; Ivanova, Natalia

    2015-07-25

    In an effort to identify the best practice for finding genes in prokaryotic genomes and propose it as a standard for automated annotation pipelines, we collected 1,004,576 peptides from various publicly available resources, and these were used as a basis to evaluate various gene-calling methods. The peptides came from 45 bacterial replicons with an average GC content from 31 % to 74 %, biased toward higher GC content genomes. Automated, manual, and semi-manual methods were used to tally errors in three widely used gene calling methods, as evidenced by peptides mapped outside the boundaries of called genes. We found that the consensus set of identical genes predicted by the three methods constitutes only about 70 % of the genes predicted by each individual method (with start and stop required to coincide). Peptide data was useful for evaluating some of the differences between gene callers, but not reliable enough to make the results conclusive, due to limitations inherent in any proteogenomic study. A single, unambiguous, unanimous best practice did not emerge from this analysis, since the available proteomics data were not adequate to provide an objective measurement of differences in the accuracy between these methods. However, as a result of this study, software, reference data, and procedures have been better matched among participants, representing a step toward a much-needed standard. In the absence of sufficient amount of experimental data to achieve a universal standard, our recommendation is that any of these methods can be used by the community, as long as a single method is employed across all datasets to be compared.
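
    The consensus measure described in this abstract (genes called identically by every method, with start and stop required to coincide) can be sketched as a set intersection. This is a minimal illustration only, not the authors' pipeline: the Gene tuple, the function names, and the toy coordinates are all hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """A called gene; two calls are 'identical' only if every field matches."""
    replicon: str
    start: int
    stop: int
    strand: str


def consensus(call_sets):
    """Return the genes present, with identical coordinates, in every call set."""
    return set.intersection(*(set(calls) for calls in call_sets))


def consensus_fraction(call_sets):
    """Fraction of each caller's genes that belong to the consensus set."""
    shared = consensus(call_sets)
    return [len(shared) / len(set(calls)) for calls in call_sets]


# Toy call sets for three hypothetical gene callers (coordinates are made up).
caller_a = [Gene("rep1", 100, 400, "+"), Gene("rep1", 500, 900, "-"),
            Gene("rep1", 1000, 1300, "+")]
caller_b = [Gene("rep1", 100, 400, "+"), Gene("rep1", 530, 900, "-"),  # different start
            Gene("rep1", 1000, 1300, "+")]
caller_c = [Gene("rep1", 100, 400, "+"), Gene("rep1", 500, 900, "-"),
            Gene("rep1", 1000, 1300, "+"), Gene("rep1", 1500, 1800, "+")]

call_sets = [caller_a, caller_b, caller_c]
shared = consensus(call_sets)
fractions = consensus_fraction(call_sets)
```

    Requiring exact agreement of both ends is deliberately strict: a call that differs only in its start codon, as for the second gene of caller_b above, drops out of the consensus set.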

  6. Structural and functional genome analysis using extended chromatin

    SciTech Connect

    Heaf, T.; Ward, D.C.

    1994-09-01

    Highly extended linear chromatin fibers (ECFs) produced by detergent and high-salt lysis and stretching of nuclear chromatin across the surface of a glass slide can be hybridized over physical distances of at least several Mb. This allows long-range FISH analysis of the human genome with excellent DNA resolution (<10 kb/μm). The insertion of Alu elements which are more than 50-fold underrepresented in centromeres can be seen within and near long tandem arrays of alpha-satellite DNA. Long tracts of trinucleotide repeats, i.e. (CCA)n, can be localized within larger genomic regions. The combined application of BrdU incorporation and ECFs allows one to study the spatio-temporal distribution of DNA replication sites in finer detail. DNA synthesis occurs at multiple discrete sites within Mb arrays of alpha-satellite. Replicating DNA is tightly associated with the nuclear matrix and highly resistant to stretching out, while ECFs containing newly replicated DNA are easily released. Asynchrony in replication timing is accompanied by differences in condensation of homologous DNA segments. Extended chromatin reveals differential packaging of active and inactive DNA. Upon transcriptional inactivation by AMD, the normally compact rRNA genes become much more susceptible to decondensation procedures. By extending the chromatin from pachytene spermatocytes, meiotic pairing and genetic exchange between homologs can be visualized directly. Histone depletion by high salt and detergent produces loop chromatin surrounding the nuclear matrix in a halo-like fashion. DNA halos can be used to map nuclear matrix attachment sites in somatic cells and in mature sperm. Alpha-satellite containing DNA loops appear to be attached to the sperm-cell matrix by CENP-B boxes, short 17 bp sequences found in a subset of alpha satellite monomers. Sperm telomeres almost always appear as hybridization doublets, suggesting the presence of already replicated chromosome ends.

  7. Structure and Genome Release Mechanism of the Human Cardiovirus Saffold Virus 3

    PubMed Central

    Mullapudi, Edukondalu; Nováček, Jiří; Pálková, Lenka; Kulich, Pavel; Lindberg, A. Michael; van Kuppeveld, Frank J. M.

    2016-01-01

    ABSTRACT In order to initiate an infection, viruses need to deliver their genomes into cells. This involves uncoating the genome and transporting it to the cytoplasm. The process of genome delivery is not well understood for nonenveloped viruses. We address this gap in our current knowledge by studying the uncoating of the nonenveloped human cardiovirus Saffold virus 3 (SAFV-3) of the family Picornaviridae. SAFVs cause diseases ranging from gastrointestinal disorders to meningitis. We present a structure of a native SAFV-3 virion determined to 2.5 Å by X-ray crystallography and an 11-Å-resolution cryo-electron microscopy reconstruction of an “altered” particle that is primed for genome release. The altered particles are expanded relative to the native virus and contain pores in the capsid that might serve as channels for the release of VP4 subunits, N termini of VP1, and the RNA genome. Unlike in the related enteroviruses, pores in SAFV-3 are located roughly between the icosahedral 3- and 5-fold axes at an interface formed by two VP1 and one VP3 subunit. Furthermore, in native conditions many cardioviruses contain a disulfide bond formed by cysteines that are separated by just one residue. The disulfide bond is located in a surface loop of VP3. We determined the structure of the SAFV-3 virion in which the disulfide bonds are reduced. Disruption of the bond had minimal effect on the structure of the loop, but it increased the stability and decreased the infectivity of the virus. Therefore, compounds specifically disrupting or binding to the disulfide bond might limit SAFV infection. IMPORTANCE A capsid assembled from viral proteins protects the virus genome during transmission from one cell to another. However, when a virus enters a cell the virus genome has to be released from the capsid in order to initiate infection. This process is not well understood for nonenveloped viruses. We address this gap in our current knowledge by studying the genome release of

  8. Assessment of phylogenetic structure in genome size--gene content correlations.

    PubMed

    Prasad, Vibhu Ranjan; Isler, Karin

    2012-05-01

    Gene content and gene-coding percentage can be predicted from genome size in newly sequenced organisms. Here, we investigate whether these predictions are influenced by phylogenetic relationships between the involved species. Combining a highly resolved phylogenetic tree with a large compilation of gene content data, our results reveal the presence of significant phylogenetic structure in the correlations between genome size and gene content in both bacteria and eukaryotes. The variation in log(gene content) explained by log(genome size) in combination with phylogeny was found to be 97% in bacteria and 55% in eukaryotes. Further, in bacteria, gene-coding percentages are only significantly correlated to genome size if phylogenetic information is taken into account in the analyses. These findings support the usage of phylogenetic correlation models for gene content predictions.

  9. Development of Structural Neurobiology and Genomics Programs in the Neurogenetic Institute

    SciTech Connect

    Henderson, Brian E., M.D.

    2006-11-10

    The purpose of the DOE equipment-only grant was to purchase instrumentation in support of structural biology and genomics core facilities in the Zilkha Neurogenetic Institute (ZNI). The ZNI, a new laboratory facility (125,000 GSF) and a center of excellence at the Keck School of Medicine of USC, was opened in 2003. The goal of the ZNI is to recruit upwards of 30 new faculty investigators engaged in interdisciplinary research programs that will add breadth and depth to existing school strengths in neuroscience, epidemiology and genetics. Many of these faculty, and other faculty researchers at the Keck School will access structural biology and genomics facilities developed in the ZNI.

  10. Structural dynamics of cereal mitochondrial genomes as revealed by complete nucleotide sequencing of the wheat mitochondrial genome.

    PubMed

    Ogihara, Yasunari; Yamazaki, Yukiko; Murai, Koji; Kanno, Akira; Terachi, Toru; Shiina, Takashi; Miyashita, Naohiko; Nasuda, Shuhei; Nakamura, Chiharu; Mori, Naoki; Takumi, Shigeo; Murata, Minoru; Futo, Satoshi; Tsunewaki, Koichiro

    2005-01-01

    The application of a new gene-based strategy for sequencing the wheat mitochondrial genome shows its structure to be a 452 528 bp circular molecule, and provides nucleotide-level evidence of intra-molecular recombination. Single, reciprocal and double recombinant products, and the nucleotide sequences of the repeats that mediate their formation have been identified. The genome has 55 genes with exons, including 35 protein-coding, 3 rRNA and 17 tRNA genes. Nucleotide sequences of seven wheat genes have been determined here for the first time. Nine genes have an exon-intron structure. Gene amplification responsible for the production of multicopy mitochondrial genes, in general, is species-specific, suggesting the recent origin of these genes. About 16, 17, 15, 3.0 and 0.2% of wheat mitochondrial DNA (mtDNA) may be of genic (including introns), open reading frame, repetitive sequence, chloroplast and retro-element origin, respectively. The gene order of the wheat mitochondrial gene map shows little synteny to the rice and maize maps, indicative that thorough gene shuffling occurred during speciation. Almost all unique mtDNA sequences of wheat, as compared with rice and maize mtDNAs, are redundant DNA. Features of the gene-based strategy are discussed, and a mechanistic model of mitochondrial gene amplification is proposed. PMID:16260473

  11. Structural dynamics of cereal mitochondrial genomes as revealed by complete nucleotide sequencing of the wheat mitochondrial genome

    PubMed Central

    Ogihara, Yasunari; Yamazaki, Yukiko; Murai, Koji; Kanno, Akira; Terachi, Toru; Shiina, Takashi; Miyashita, Naohiko; Nasuda, Shuhei; Nakamura, Chiharu; Mori, Naoki; Takumi, Shigeo; Murata, Minoru; Futo, Satoshi; Tsunewaki, Koichiro

    2005-01-01

    The application of a new gene-based strategy for sequencing the wheat mitochondrial genome shows its structure to be a 452 528 bp circular molecule, and provides nucleotide-level evidence of intra-molecular recombination. Single, reciprocal and double recombinant products, and the nucleotide sequences of the repeats that mediate their formation have been identified. The genome has 55 genes with exons, including 35 protein-coding, 3 rRNA and 17 tRNA genes. Nucleotide sequences of seven wheat genes have been determined here for the first time. Nine genes have an exon–intron structure. Gene amplification responsible for the production of multicopy mitochondrial genes, in general, is species-specific, suggesting the recent origin of these genes. About 16, 17, 15, 3.0 and 0.2% of wheat mitochondrial DNA (mtDNA) may be of genic (including introns), open reading frame, repetitive sequence, chloroplast and retro-element origin, respectively. The gene order of the wheat mitochondrial gene map shows little synteny to the rice and maize maps, indicative that thorough gene shuffling occurred during speciation. Almost all unique mtDNA sequences of wheat, as compared with rice and maize mtDNAs, are redundant DNA. Features of the gene-based strategy are discussed, and a mechanistic model of mitochondrial gene amplification is proposed. PMID:16260473

  12. The ISPRS Student Consortium: From launch to tenth anniversary

    NASA Astrophysics Data System (ADS)

    Kanjir, U.; Detchev, I.; Reyes, S. R.; Akkartal Aktas, A.; Lo, C. Y.; Miyazaki, H.

    2014-04-01

    The ISPRS Student Consortium is an international organization for students and young professionals in the fields of photogrammetry, remote sensing, and the geospatial information sciences. Since its start ten years ago, the number of members of the Student Consortium has been steadily growing, now reaching close to 1000. Its increased popularity, especially in recent years, is mainly due to the organization's worldwide involvement in student matters. The Student Consortium has helped organize numerous summer schools, youth forums, and student technical sessions at ISPRS sponsored conferences. In addition, the organization publishes a newsletter, and hosts several social media outlets in order to keep its global membership up-to-date on a regular basis. This paper will describe the structure of the organization, and it will give some example of its past student related activities.

  13. Meeting report of the RNA Ontology Consortium January 8-9, 2011

    PubMed Central

    Clemente, Jose C.; Desai, Narayan; Gilbert, Jack; Gonzalez, Antonio; Kyrpides, Nikos; Meyer, Folker; Nawrocki, Eric; Sterk, Peter; Stombaugh, Jesse; Weinberg, Zasha; Wendel, Doug; Leontis, Neocles B.; Zirbel, Craig; Knight, Rob; Laederach, Alain

    2011-01-01

    This report summarizes the proceedings of the structure mapping working group meeting of the RNA Ontology Consortium (ROC), held in Kona, Hawaii on January 8-9, 2011. The ROC hosted this workshop to facilitate collaborations among those researchers formalizing concepts in RNA, those developing RNA-related software, and those performing genome annotation and standardization. The workshop included three software presentations, extended round-table discussions, and the constitution of two new working groups, the first to address the need for better software integration and the second to discuss standardization and benchmarking of existing RNA annotation pipelines. These working groups have subsequently pursued concrete implementation of actions suggested during the discussion. Further information about the ROC and its activities can be found at http://roc.bgsu.edu/. PMID:21677862

  14. The Ocean Sampling Day Consortium

    DOE PAGESBeta

    Kopf, Anna; Bicak, Mesude; Kottmann, Renzo; Schnetzer, Julia; Kostadinov, Ivaylo; Lehmann, Katja; Fernandez-Guerra, Antonio; Jeanthon, Christian; Rahav, Eyal; Ullrich, Matthias; et al

    2015-06-19

    In this study, Ocean Sampling Day was initiated by the EU-funded Micro B3 (Marine Microbial Biodiversity, Bioinformatics, Biotechnology) project to obtain a snapshot of the marine microbial biodiversity and function of the world’s oceans. It is a simultaneous global mega-sequencing campaign aiming to generate the largest standardized microbial data set in a single day. This will be achievable only through the coordinated efforts of an Ocean Sampling Day Consortium, supportive partnerships and networks between sites. This commentary outlines the establishment, function and aims of the Consortium and describes our vision for a sustainable study of marine microbial communities and their embedded functional traits.

  15. Mitochondrial genome structure and evolution in the living fossil vampire squid, Vampyroteuthis infernalis, and extant cephalopods.

    PubMed

    Yokobori, Shin-ichi; Lindsay, Dhugal J; Yoshida, Mari; Tsuchiya, Kotaro; Yamagishi, Akihiko; Maruyama, Tadashi; Oshima, Tairo

    2007-08-01

    Complete nucleotide sequences of mitochondrial (mt) genomes of the "living fossil" cephalopod Vampyroteuthis infernalis (Vampyromorpha) and the cuttlefish Sepia esculenta (Sepiida) were determined. The V. infernalis mt genome structure is identical to the incirrate octopod Octopus vulgaris mt genome structure, and is therefore more similar to that of the polyplacophoran Katharina tunicata, than to that of the other "living fossil" cephalopod Nautilus macromphalus. The mt genome structure of S. esculenta is identical to that of Sepia officinalis. Molecular phylogenetic analyses based on the mt protein genes from the completely sequenced cephalopod mt genomes suggested the monophyletic relationship of two myopsid squids Loligo bleekeri and Sepiotheuthis lessoniana, and the monophyletic relationship of two oegopsid squids Watasenia scintillans, and Todarodes pacificus. Sepiida appeared as the sister group of Teuthida (Myopsida + Oegopsida). The phylogenetic position of Vampyromorpha appeared as the sister group of Octopoda, although the monophyly of Vampyromorpha and Decapodiformes cannot be rejected outright by our phylogenetic analyses. The hypothesis that Vampyromorpha is basal among the coleoid cephalopods can be rejected because of low statistical support. Therefore, it is reasonable to recognize three major groups in Coleoidea--Vampyromorpha, Octopoda, and Decapodiformes.

  16. Detecting Heterogeneity in Population Structure Across the Genome in Admixed Populations.

    PubMed

    McHugh, Caitlin; Brown, Lisa; Thornton, Timothy A

    2016-09-01

    The genetic structure of human populations is often characterized by aggregating measures of ancestry across the autosomal chromosomes. While it may be reasonable to assume that population structure patterns are similar genome-wide in relatively homogeneous populations, this assumption may not be appropriate for admixed populations, such as Hispanics and African-Americans, with recent ancestry from two or more continents. Recent studies have suggested that systematic ancestry differences can arise at genomic locations in admixed populations as a result of selection and nonrandom mating. Here, we propose a method, which we refer to as the chromosomal ancestry differences (CAnD) test, for detecting heterogeneity in population structure across the genome. CAnD can incorporate either local or chromosome-wide ancestry inferred from SNP genotype data to identify chromosomes harboring genomic regions with ancestry contributions that are significantly different than expected. In simulation studies with real genotype data from phase III of the HapMap Project, we demonstrate the validity and power of CAnD. We apply CAnD to the HapMap Mexican-American (MXL) and African-American (ASW) population samples; in this analysis the software RFMix is used to infer local ancestry at genomic regions, assuming admixing from Europeans, West Africans, and Native Americans. The CAnD test provides strong evidence of heterogeneity in population structure across the genome in the MXL sample ([Formula: see text]), which is largely driven by elevated Native American ancestry and deficit of European ancestry on the X chromosomes. Among the ASW, all chromosomes are largely African derived and no heterogeneity in population structure is detected in this sample. PMID:27440868

  17. The AGTSR consortium: An update

    SciTech Connect

    Fant, D.B.; Golan, L.P.

    1995-10-01

    The Advanced Gas Turbine Systems Research (AGTSR) program is a collaborative University-Industry R&D Consortium that is managed and administered by the South Carolina Energy R&D Center. AGTSR is a nationwide consortium dedicated to advancing land-based gas turbine systems for improving future power generation capability. It directly supports the technology-research arm of the ATS program and targets industry-defined research needs in the areas of combustion, heat transfer, materials, aerodynamics, controls, alternative fuels, and advanced cycles. The consortium is organized to enhance U.S. competitiveness through close collaboration with universities, government, and industry at the R&D level. AGTSR is just finishing its third year of operation and is sponsored by the U.S. DOE - Morgantown Energy Technology Center. The program is scheduled to continue past the year 2000. At present, there are 78 performing member universities representing 36 states, and six cost-sharing U.S. gas turbine corporations. Three RFPs have been announced and the fourth RFP is expected to be released in December, 1995. There are 31 research subcontracts underway at performing member universities. AGTSR has also organized three workshops, two in combustion and one in heat transfer. A materials workshop is in planning and is scheduled for February, 1996. An industrial internship program was initiated this past summer, with one intern positioned at each of the sponsoring companies. The AGTSR consortium nurtures close industry-university-government collaboration to enhance synergism and the transition of research results, accelerate and promote evolutionary-revolutionary R&D, and strives to keep a prominent U.S. industry strong and on top well into the 21st century. This paper will present the objectives and benefits of the AGTSR program, progress achieved to date, and future planned activity in fiscal year 1996.

  18. John Glenn Biomedical Engineering Consortium

    NASA Technical Reports Server (NTRS)

    Nall, Marsha

    2004-01-01

    The John Glenn Biomedical Engineering Consortium is an inter-institutional research and technology development, beginning with ten projects in FY02 that are aimed at applying GRC expertise in fluid physics and sensor development with local biomedical expertise to mitigate the risks of space flight on the health, safety, and performance of astronauts. It is anticipated that several new technologies will be developed that are applicable to both medical needs in space and on earth.

  19. Genome structure and primitive sex chromosome revealed in Populus

    SciTech Connect

    Tuskan, Gerald A; Yin, Tongming; Gunter, Lee E; Blaudez, D

    2008-01-01

    We constructed a comprehensive genetic map for Populus and ordered 332 Mb of sequence scaffolds along the 19 haploid chromosomes in order to compare chromosomal regions among diverse members of the genus. These efforts led us to conclude that chromosome XIX in Populus is evolving into a sex chromosome. Consistent segregation distortion in favor of the sub-genera Tacamahaca alleles provided evidence of divergent selection among species, particularly at the proximal end of chromosome XIX. A large microsatellite marker (SSR) cluster was detected in the distorted region even though the genome-wide distribution of SSR sites was uniform across the physical map. The differences between the genetic map and physical sequence data suggested recombination suppression was occurring in the distorted region. A gender-determination locus and an overabundance of NBS-LRR genes were also co-located to the distorted region and were put forth as the cause for divergent selection and recombination suppression. This hypothesis was verified by using fine-scale mapping of an integrated scaffold in the vicinity of the gender-determination locus. As such it appears that chromosome XIX in Populus is in the process of evolving from an autosome into a sex chromosome and that NBS-LRR genes may play an important role in the chromosomal diversification process in Populus.

  20. The genomic structure of the human UFO receptor.

    PubMed

    Schulz, A S; Schleithoff, L; Faust, M; Bartram, C R; Janssen, J W

    1993-02-01

    Using a DNA transfection-tumorigenicity assay we have recently identified the UFO oncogene. It encodes a tyrosine kinase receptor characterized by the juxtaposition of two immunoglobulin-like and two fibronectin type III repeats in its extracellular domain. Here we describe the genomic organization of the human UFO locus. The UFO receptor is encoded by 20 exons that are distributed over a region of 44 kb. Different isoforms of UFO mRNA are generated by alternative splicing of exon 10 and differential usage of two imperfect polyadenylation sites resulting in the presence or absence of 1.5-kb 3' untranslated sequences. Primer extension and S1 nuclease analyses revealed multiple transcriptional initiation sites including a major site 169 bp upstream of the translation start site. The promoter region is GC rich, lacks TATA and CAAT boxes, but contains potential recognition sites for a variety of trans-acting factors, including Sp1, AP-2 and the cyclic AMP response element-binding protein. Proto-UFO and its oncogenic counterpart exhibit identical cDNA and promoter region sequences. Possible modes of UFO activation are discussed.

  1. 3D-GNOME: an integrated web service for structural modeling of the 3D genome.

    PubMed

    Szalaj, Przemyslaw; Michalski, Paul J; Wróblewski, Przemysław; Tang, Zhonghui; Kadlof, Michal; Mazzocco, Giovanni; Ruan, Yijun; Plewczynski, Dariusz

    2016-07-01

    Recent advances in high-throughput chromosome conformation capture (3C) technology, such as Hi-C and ChIA-PET, have demonstrated the importance of 3D genome organization in development, cell differentiation and transcriptional regulation. There is now a widespread need for computational tools to generate and analyze 3D structural models from 3C data. Here we introduce our 3D GeNOme Modeling Engine (3D-GNOME), a web service which generates 3D structures from 3C data and provides tools to visually inspect and annotate the resulting structures, in addition to a variety of statistical plots and heatmaps which characterize the selected genomic region. Users submit a bedpe (paired-end BED format) file containing the locations and strengths of long range contact points, and 3D-GNOME simulates the structure and provides a convenient user interface for further analysis. Alternatively, a user may generate structures using published ChIA-PET data for the GM12878 cell line by simply specifying a genomic region of interest. 3D-GNOME is freely available at http://3dgnome.cent.uw.edu.pl/.

  2. 3D-GNOME: an integrated web service for structural modeling of the 3D genome

    PubMed Central

    Szalaj, Przemyslaw; Michalski, Paul J.; Wróblewski, Przemysław; Tang, Zhonghui; Kadlof, Michal; Mazzocco, Giovanni; Ruan, Yijun; Plewczynski, Dariusz

    2016-01-01

    Recent advances in high-throughput chromosome conformation capture (3C) technology, such as Hi-C and ChIA-PET, have demonstrated the importance of 3D genome organization in development, cell differentiation and transcriptional regulation. There is now a widespread need for computational tools to generate and analyze 3D structural models from 3C data. Here we introduce our 3D GeNOme Modeling Engine (3D-GNOME), a web service which generates 3D structures from 3C data and provides tools to visually inspect and annotate the resulting structures, in addition to a variety of statistical plots and heatmaps which characterize the selected genomic region. Users submit a bedpe (paired-end BED format) file containing the locations and strengths of long range contact points, and 3D-GNOME simulates the structure and provides a convenient user interface for further analysis. Alternatively, a user may generate structures using published ChIA-PET data for the GM12878 cell line by simply specifying a genomic region of interest. 3D-GNOME is freely available at http://3dgnome.cent.uw.edu.pl/. PMID:27185892
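
    As a rough illustration of the kind of input this abstract describes, the sketch below reads contact records in the general BEDPE layout (chrom1, start1, end1, chrom2, start2, end2, then optional name and score columns). It is an assumption-laden example, not 3D-GNOME code: the Contact class, the helper names, the default score of 1.0, and the sample rows are hypothetical, and the exact columns the web service expects should be checked against its own documentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Contact:
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    score: float  # contact strength; 1.0 when the column is absent or "."


def parse_bedpe(text: str) -> List[Contact]:
    """Parse whitespace-separated BEDPE records, skipping blank and header lines."""
    contacts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        f = line.split()
        if len(f) < 6:
            raise ValueError(f"expected at least 6 columns, got {len(f)}: {line!r}")
        score = float(f[7]) if len(f) > 7 and f[7] != "." else 1.0
        contacts.append(Contact(f[0], int(f[1]), int(f[2]),
                                f[3], int(f[4]), int(f[5]), score))
    return contacts


def span(c: Contact) -> Optional[int]:
    """Distance between anchor midpoints, or None for interchromosomal contacts."""
    if c.chrom1 != c.chrom2:
        return None
    return abs((c.start2 + c.end2) // 2 - (c.start1 + c.end1) // 2)


# Made-up example records: one intrachromosomal loop, one interchromosomal contact.
sample = """\
# chrom1 start1 end1 chrom2 start2 end2 name score
chr1 1000 2000 chr1 51000 52000 loopA 12
chr1 3000 4000 chr2 8000 9000 loopB 3
"""
contacts = parse_bedpe(sample)
```

    Splitting on arbitrary whitespace keeps the sketch tolerant of tabs or spaces; a stricter reader would split on tabs only, since BEDPE is nominally tab-delimited.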

  3. Detection and interpretation of genomic structural variation in health and disease.

    PubMed

    Vandeweyer, Geert; Kooy, R Frank

    2013-01-01

    Recent technological advances in the detection of genomic structural variation have revolutionized the field of medical genetics. Genome-wide screening for copy-number variants in routine molecular diagnostics unveiled the presence of an unforeseen amount of structural variation in the genome. Owing to the massive number of patients analyzed, the analysis of the resulting data became exponentially more complex. Simultaneously, novel insights in the impact of structural variation on the phenotype forced the re-evaluation of the pathogenicity of copy-number variations in more complex inheritance models. As a consequence, the challenge of today's genetics shifted from the mere detection of structural variation to the correct annotation and interpretation of the data. Various databases and data mining tools are available to help in the interpretation of the data, but making decisions on the pathogenicity of the variation is still challenging. This review provides an overview of current laboratory techniques to detect structural variation, options to analyze and annotate data from genome-wide methods and caveats to take into account in interpretation of results.

  4. Appalachian clean coal technology consortium

    SciTech Connect

    Kutz, K.; Yoon, Roe-Hoan

    1995-11-01

    The Appalachian Clean Coal Technology Consortium (ACCTC) has been established to help U.S. coal producers, particularly those in the Appalachian region, increase the production of lower-sulfur coal. The cooperative research conducted as part of the consortium activities will help utilities meet the emissions standards established by the 1990 Clean Air Act Amendments, enhance the competitiveness of U.S. coals in the world market, create jobs in economically-depressed coal producing regions, and reduce U.S. dependence on foreign energy supplies. The research activities will be conducted in cooperation with coal companies, equipment manufacturers, and A&E firms working in the Appalachian coal fields. This approach is consistent with President Clinton's initiative in establishing Regional Technology Alliances to meet regional needs through technology development in cooperation with industry. The consortium activities are complementary to the High-Efficiency Preparation program of the Pittsburgh Energy Technology Center, but are broader in scope as they are inclusive of technology developments for both near-term and long-term applications, technology transfer, and training a highly-skilled work force.

  5. Structure of Ljungan virus provides insight into genome packaging of this picornavirus

    PubMed Central

    Zhu, Ling; Wang, Xiangxi; Ren, Jingshan; Porta, Claudine; Wenham, Hannah; Ekström, Jens-Ola; Panjwani, Anusha; Knowles, Nick J.; Kotecha, Abhay; Siebert, C. Alistair; Lindberg, A. Michael; Fry, Elizabeth E.; Rao, Zihe; Tuthill, Tobias J.; Stuart, David I.

    2015-01-01

    Picornaviruses are responsible for a range of human and animal diseases, but how their RNA genome is packaged remains poorly understood. A particularly poorly studied group within this family are those that lack the internal coat protein, VP4. Here we report the atomic structure of one such virus, Ljungan virus, the type member of the genus Parechovirus B, which has been linked to diabetes and myocarditis in humans. The 3.78-Å resolution cryo-electron microscopy structure shows remarkable features, including an extended VP1 C terminus, forming a major protuberance on the outer surface of the virus, and a basic motif at the N terminus of VP3, binding to which orders some 12% of the viral genome. This apparently charge-driven RNA attachment suggests that this branch of the picornaviruses uses a different mechanism of genome encapsidation, perhaps explored early in the evolution of picornaviruses. PMID:26446437

  6. Structure of Ljungan virus provides insight into genome packaging of this picornavirus

    NASA Astrophysics Data System (ADS)

    Zhu, Ling; Wang, Xiangxi; Ren, Jingshan; Porta, Claudine; Wenham, Hannah; Ekström, Jens-Ola; Panjwani, Anusha; Knowles, Nick J.; Kotecha, Abhay; Siebert, C. Alistair; Lindberg, A. Michael; Fry, Elizabeth E.; Rao, Zihe; Tuthill, Tobias J.; Stuart, David I.

    2015-10-01

    Picornaviruses are responsible for a range of human and animal diseases, but how their RNA genome is packaged remains poorly understood. A particularly poorly studied group within this family are those that lack the internal coat protein, VP4. Here we report the atomic structure of one such virus, Ljungan virus, the type member of the genus Parechovirus B, which has been linked to diabetes and myocarditis in humans. The 3.78-Å resolution cryo-electron microscopy structure shows remarkable features, including an extended VP1 C terminus, forming a major protuberance on the outer surface of the virus, and a basic motif at the N terminus of VP3, binding to which orders some 12% of the viral genome. This apparently charge-driven RNA attachment suggests that this branch of the picornaviruses uses a different mechanism of genome encapsidation, perhaps explored early in the evolution of picornaviruses.

  7. Structural genomics for drug design against the pathogen Coxiella burnetii.

    PubMed

    Franklin, Matthew C; Cheung, Jonah; Rudolph, Michael J; Burshteyn, Fiana; Cassidy, Michael; Gary, Ebony; Hillerich, Brandan; Yao, Zhong-Ke; Carlier, Paul R; Totrov, Maxim; Love, James D

    2015-12-01

    Coxiella burnetii is a highly infectious bacterium and potential agent of bioterrorism. However, it has not been studied as extensively as other biological agents, and very few of its proteins have been structurally characterized. To address this situation, we undertook a study of critical metabolic enzymes in C. burnetii that have great potential as drug targets. We used high-throughput techniques to produce novel crystal structures of 48 of these proteins. We selected one protein, C. burnetii dihydrofolate reductase (CbDHFR), for additional work to demonstrate the value of these structures for structure-based drug design. This enzyme's structure reveals a feature in the substrate binding groove that is different between CbDHFR and human dihydrofolate reductase (hDHFR). We then identified a compound by in silico screening that exploits this binding groove difference, and demonstrated that this compound inhibits CbDHFR with at least 25-fold greater potency than hDHFR. Since this binding groove feature is shared by many other prokaryotes, the compound identified could form the basis of a novel antibacterial agent effective against a broad spectrum of pathogenic bacteria.

  8. Efficient de novo assembly of large genomes using compressed data structures

    PubMed Central

    Simpson, Jared T.; Durbin, Richard

    2012-01-01

    De novo genome sequence assembly is important both to generate new sequence assemblies for previously uncharacterized genomes and to identify the genome sequence of individuals in a reference-unbiased way. We present memory-efficient data structures and algorithms for assembly using the FM-index derived from the compressed Burrows-Wheeler transform, and a new assembler based on these called SGA (String Graph Assembler). We describe algorithms to error-correct, assemble, and scaffold large sets of sequence data. SGA uses the overlap-based string graph model of assembly, unlike most de novo assemblers that rely on de Bruijn graphs, and is simply parallelizable. We demonstrate the error correction and assembly performance of SGA on 1.2 billion sequence reads from a human genome, which we are able to assemble using 54 GB of memory. The resulting contigs are highly accurate and contiguous, while covering 95% of the reference genome (excluding contigs <200 bp in length). Because of the low memory requirements and parallelization without requiring inter-process communication, SGA provides, to our knowledge, the first practical assembler for a mammalian-sized genome on a low-end computing cluster. PMID:22156294
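    The FM-index named in this abstract supports counting pattern occurrences by backward search over the Burrows-Wheeler transform. The sketch below illustrates that general technique only; it is not SGA's implementation. The class and function names are hypothetical, the transform is built naively from sorted suffixes, and the occurrence table is stored uncompressed, so it shows none of the memory savings the abstract reports.

```python
from collections import Counter


def bwt_via_suffix_array(text: str) -> str:
    """Burrows-Wheeler transform of text + '$', built naively from sorted suffixes."""
    s = text + "$"  # '$' sorts before the letters A, C, G, T in ASCII
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # The BWT character for a suffix is the character that precedes it
    # (s[-1] wraps around to '$' for the suffix starting at position 0).
    return "".join(s[i - 1] for i in suffix_array)


class FMIndex:
    """Toy FM-index: uncompressed occurrence table, count queries only."""

    def __init__(self, text: str):
        self.bwt = bwt_via_suffix_array(text)
        counts = Counter(self.bwt)
        # first[c] = number of characters in the text strictly smaller than c
        self.first = {}
        total = 0
        for c in sorted(counts):
            self.first[c] = total
            total += counts[c]
        # occ[c][i] = number of occurrences of c in bwt[:i]
        self.occ = {c: [0] * (len(self.bwt) + 1) for c in counts}
        for i, ch in enumerate(self.bwt):
            for c in self.occ:
                self.occ[c][i + 1] = self.occ[c][i] + (1 if c == ch else 0)

    def count(self, pattern: str) -> int:
        """Number of occurrences of pattern, found by backward search."""
        lo, hi = 0, len(self.bwt)  # half-open suffix-array interval
        for c in reversed(pattern):
            if c not in self.first:
                return 0
            lo = self.first[c] + self.occ[c][lo]
            hi = self.first[c] + self.occ[c][hi]
            if lo >= hi:
                return 0
        return hi - lo
```

    For example, FMIndex("ACGTACGTAC").count("ACG") counts the two places where ACG occurs. A practical index would sample and compress the occurrence table rather than store it in full.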

  9. Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation.

    PubMed

    Sharma, Virag; Elghafari, Anas; Hiller, Michael

    2016-06-20

    Identifying coding genes is an essential step in genome annotation. Here, we utilize existing whole genome alignments to detect conserved coding exons and then map gene annotations from one genome to many aligned genomes. We show that genome alignments contain thousands of spurious frameshifts and splice site mutations in exons that are truly conserved. To overcome these limitations, we have developed CESAR (Coding Exon-Structure Aware Realigner) that realigns coding exons, while considering reading frame and splice sites of each exon. CESAR effectively avoids spurious frameshifts in conserved genes and detects 91% of shifted splice sites. This results in the identification of thousands of additional conserved exons and 99% of the exons that lack inactivating mutations match real exons. Finally, to demonstrate the potential of using CESAR for comparative gene annotation, we applied it to 188 788 exons of 19 865 human genes to annotate human genes in 99 other vertebrates. These comparative gene annotations are available as a resource (http://bds.mpi-cbg.de/hillerlab/CESAR/). CESAR (https://github.com/hillerlab/CESAR/) can readily be applied to other alignments to accurately annotate coding genes in many other vertebrate and invertebrate genomes. PMID:27016733

  10. Heterogeneous genome divergence, differential introgression, and the origin and structure of hybrid zones.

    PubMed

    Harrison, Richard G; Larson, Erica L

    2016-06-01

    Hybrid zones have been promoted as windows on the evolutionary process and as laboratories for studying divergence and speciation. Patterns of divergence between hybridizing species can now be characterized on a genomewide scale, and recent genome scans have focused on the presence of 'islands' of divergence. Patterns of heterogeneous genomic divergence may reflect differential introgression following secondary contact and provide insights into which genome regions contribute to local adaptation, hybrid unfitness and positive assortative mating. However, heterogeneous genome divergence can also arise in the absence of any gene flow, as a result of variation in selection and recombination across the genome. We suggest that to understand hybrid zone origins and dynamics, it is essential to distinguish between genome regions that are divergent between pure parental populations and regions that show restricted introgression where these populations interact in hybrid zones. The latter, more so than the former, reveal the likely genetic architecture of reproductive isolation. Mosaic hybrid zones, because of their complex structure and multiple contacts, are particularly good subjects for distinguishing primary intergradation from secondary contact. Comparisons among independent hybrid zones or transects that involve the 'same' species pair can also help to distinguish between divergence with gene flow and secondary contact. However, data from replicate hybrid zones or replicate transects do not reveal consistent patterns; in a few cases, patterns of introgression are similar across independent transects, but for many taxa, there is a distinct lack of concordance, presumably due to variation in environmental context and/or variation in the genetics of the interacting populations.

  11. Evolution of the Exon-Intron Structure in Ciliate Genomes.

    PubMed

    Bondarenko, Vladyslav S; Gelfand, Mikhail S

    2016-01-01

    A typical eukaryotic gene is comprised of alternating stretches of regions, exons and introns, retained in and spliced out of a mature mRNA, respectively. Although the length of introns may vary substantially among organisms, a large fraction of genes contains short introns in many species. Notably, some Ciliates (Paramecium and Nyctotherus) possess only ultra-short introns, around 25 bp long. In Paramecium, ultra-short introns with length divisible by three (3n) are under strong evolutionary pressure and have a high frequency of in-frame stop codons, which, in the case of intron retention, cause premature termination of mRNA translation and consequent degradation of the mis-spliced mRNA by the nonsense-mediated decay mechanism. Here, we analyzed introns in five genera of Ciliates, Paramecium, Tetrahymena, Ichthyophthirius, Oxytricha, and Stylonychia. Introns can be classified into two length classes in Tetrahymena and Ichthyophthirius (with means of 48 and 69 bp, and 55 and 64 bp, respectively), but, surprisingly, comprise three distinct length classes in Oxytricha and Stylonychia (with means 33-35 bp, 47-51 bp, and 78-80 bp). In most ranges of the intron lengths, 3n introns are underrepresented and have a high frequency of in-frame stop codons in all studied species. Introns of Paramecium, Tetrahymena, and Ichthyophthirius are preferentially located at the 5' and 3' ends of genes, whereas introns of Oxytricha and Stylonychia are strongly skewed towards the 5' end. Analysis of evolutionary conservation shows that, in each studied genome, a significant fraction of intron positions is conserved between the orthologs, but intron lengths are not correlated between the species. In summary, our study provides a detailed characterization of introns in several genera of Ciliates and highlights some of their distinctive properties, which, together, indicate that splicing spellchecking is a universal and evolutionarily conserved process in the biogenesis of short introns in

  12. Evolution of the Exon-Intron Structure in Ciliate Genomes

    PubMed Central

    Gelfand, Mikhail S.

    2016-01-01

    A typical eukaryotic gene is comprised of alternating stretches of regions, exons and introns, retained in and spliced out of a mature mRNA, respectively. Although the length of introns may vary substantially among organisms, a large fraction of genes contains short introns in many species. Notably, some Ciliates (Paramecium and Nyctotherus) possess only ultra-short introns, around 25 bp long. In Paramecium, ultra-short introns with length divisible by three (3n) are under strong evolutionary pressure and have a high frequency of in-frame stop codons, which, in the case of intron retention, cause premature termination of mRNA translation and consequent degradation of the mis-spliced mRNA by the nonsense-mediated decay mechanism. Here, we analyzed introns in five genera of Ciliates, Paramecium, Tetrahymena, Ichthyophthirius, Oxytricha, and Stylonychia. Introns can be classified into two length classes in Tetrahymena and Ichthyophthirius (with means of 48 and 69 bp, and 55 and 64 bp, respectively), but, surprisingly, comprise three distinct length classes in Oxytricha and Stylonychia (with means 33–35 bp, 47–51 bp, and 78–80 bp). In most ranges of the intron lengths, 3n introns are underrepresented and have a high frequency of in-frame stop codons in all studied species. Introns of Paramecium, Tetrahymena, and Ichthyophthirius are preferentially located at the 5' and 3' ends of genes, whereas introns of Oxytricha and Stylonychia are strongly skewed towards the 5' end. Analysis of evolutionary conservation shows that, in each studied genome, a significant fraction of intron positions is conserved between the orthologs, but intron lengths are not correlated between the species. In summary, our study provides a detailed characterization of introns in several genera of Ciliates and highlights some of their distinctive properties, which, together, indicate that splicing spellchecking is a universal and evolutionarily conserved process in the biogenesis of short introns in

  13. Structured States of Disordered Proteins from Genomic Sequences.

    PubMed

    Toth-Petroczy, Agnes; Palmedo, Perry; Ingraham, John; Hopf, Thomas A; Berger, Bonnie; Sander, Chris; Marks, Debora S

    2016-09-22

    Protein flexibility ranges from simple hinge movements to functional disorder. Around half of all human proteins contain apparently disordered regions with little 3D or functional information, and many of these proteins are associated with disease. Building on the evolutionary couplings approach previously successful in predicting 3D states of ordered proteins and RNA, we developed a method to predict the potential for ordered states for all apparently disordered proteins with sufficiently rich evolutionary information. The approach is highly accurate (79%) for residue interactions as tested in more than 60 known disordered regions captured in a bound or specific condition. Assessing the potential for structure of more than 1,000 apparently disordered regions of human proteins reveals a continuum of structural order with at least 50% with clear propensity for three- or two-dimensional states. Co-evolutionary constraints reveal hitherto unseen structures of functional importance in apparently disordered proteins. PMID:27662088

  15. Protein Production for Structural Genomics Using E. coli Expression

    PubMed Central

    Makowska-Grzyska, Magdalena; Kim, Youngchang; Maltseva, Natalia; Li, Hui; Zhou, Min; Joachimiak, Grazyna; Babnigg, Gyorgy; Joachimiak, Andrzej

    2014-01-01

    The goal of structural biology is to reveal details of the molecular structure of proteins in order to understand their function and mechanism. X-ray crystallography and NMR are the two best methods for atomic level structure determination. However, these methods require milligram quantities of proteins. In this chapter a reproducible methodology for large-scale protein production applicable to a diverse set of proteins is described. The approach is based on protein expression in E. coli as a fusion with a cleavable affinity tag that was tested on over 20,000 proteins. Specifically, a protocol for fermentation of large quantities of native proteins in disposable culture vessels is presented. A modified protocol that allows for the production of selenium-labeled proteins in defined media is also offered. Finally, a method for the purification of His6-tagged proteins on immobilized metal affinity chromatography columns that generates high-purity material is described in detail. PMID:24590711

  16. Structural and functional comparative mapping between the Brassica A genomes in allotetraploid Brassica napus and diploid Brassica rapa.

    PubMed

    Jiang, Congcong; Ramchiary, Nirala; Ma, Yongbiao; Jin, Mina; Feng, Ji; Li, Ruiyuan; Wang, Hao; Long, Yan; Choi, Su Ryun; Zhang, Chunyu; Cowling, Wallace A; Park, Beom Seok; Lim, Yong Pyo; Meng, Jinling

    2011-10-01

    Brassica napus (AACC genome) is an important oilseed crop that was formed by the fusion of the diploids B. rapa (AA) and B. oleracea (CC). The complete genomic sequence of the Brassica A genome will be available soon from the B. rapa genome sequencing project, but it is not clear how informative the A genome sequence in B. rapa (A(r)) will be for predicting the structure and function of the A subgenome in the allotetraploid Brassica species B. napus (A(n)). In this paper, we report the results of structural and functional comparative mapping between the A subgenomes of B. napus and B. rapa based on genetic maps that were anchored with bacterial artificial chromosome (BAC) sequences of B. rapa. We identified segmental conservation, represented by syntenic blocks, in over one third of the A genome; meanwhile, comparative mapping of quantitative trait loci for seed quality traits identified a dozen homologous regions with conserved function in the A genome of the two species. However, several genomic rearrangement events, such as inversions and intra- and inter-chromosomal translocations, were also observed between allotetraploid B. napus and diploid B. rapa, covering in total at least 5% of the A genome. Based on these results, the A genomes of B. rapa and B. napus are mostly functionally conserved, but caution will be necessary in applying the full sequence data from B. rapa to B. napus as a result of genomic rearrangements in the A genome between the two species.

  17. A roadmap for functional structural variants in the soybean genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gene structural variation (SV) has recently emerged as a key genetic mechanism underlying several important phenotypic traits in crop species. We screened a panel of 41 soybean accessions serving as parents in a soybean nested association mapping population for deletions and duplications in over 53...

  18. Comparative 3D genome structure analysis of the fission and the budding yeast.

    PubMed

    Gong, Ke; Tjong, Harianto; Zhou, Xianghong Jasmine; Alber, Frank

    2015-01-01

    We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes represented as flexible polymer chains in a nuclear environment. This model is sufficient to explain in a statistical manner many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-locations of functionally related and co-expressed genes, such as genes expressed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly due to centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the fission and budding yeast genome structures, for which we previously detected a similar organizing principle. However, due to the different chromosome sizes and numbers, substantial differences are observed in the 3D structural genome organization between the two species, most notably in the nuclear locations of orthologous genes, and the extent of nuclear territories for genes and chromosomes. However, despite those differences, remarkably, functional similarities are maintained, which is evident when comparing spatial clustering of functionally related genes in both yeasts. Functionally related genes show a similar spatial clustering behavior in both yeasts, even though their nuclear locations are largely different between the yeast species.

  19. Comprehensive analysis of glycosyltransferases in eukaryotic genomes for structural and functional characterization of glycans.

    PubMed

    Hashimoto, Kosuke; Tokimatsu, Toshiaki; Kawano, Shin; Yoshizawa, Akiyasu C; Okuda, Shujiro; Goto, Susumu; Kanehisa, Minoru

    2009-05-12

    Glycosyltransferases comprise highly divergent groups of enzymes, which play a central role in the synthesis of complex glycans. Because the repertoire of glycosyltransferases in the genome determines the range of synthesizable glycans, and because an increasing amount of genome sequence data is now available, it is essential to examine these enzymes across organisms to explore possible structures and functions of the glycoconjugates. In this study, we systematically investigated 36 eukaryotic genomes and obtained 3426 glycosyltransferase homologs for biosynthesis of major glycans, classified into 53 families based on sequence similarity. The families were further grouped into six functional categories based on the biosynthetic pathways, which revealed characteristic patterns among organism groups in the degree of conservation and in the number of paralogs. The results also revealed a strong correlation between the number of glycosyltransferases and the number of coding genes in each genome. We then predicted the ability to synthesize major glycan structures including N-glycan precursors and GPI-anchors in each organism from the combination of the glycosyltransferase families. This indicates that not only parasitic protists but also some algae are likely to synthesize smaller structures than the structures known to be conserved among a wide range of eukaryotes. Finally we discuss the functions of two large families, sialyltransferases and beta 4-glycosyltransferases, by performing finer classifications into subfamilies. Our findings suggest that universality and diversity of glycans originate from two types of evolution of glycosyltransferase families, namely conserved families with few paralogs and diverged families with many paralogs.

  20. Characterization and Correction of Error in Genome-Wide IBD Estimation for Samples with Population Structure

    PubMed Central

    Morrison, Jean

    2014-01-01

    The proportion of the genome that is shared identical by descent (IBD) between pairs of individuals is often estimated in studies involving genome-wide SNP data. These estimates can be used to check pedigrees, estimate heritability, and adjust association analyses. We focus on the method of moments technique as implemented in PLINK [Purcell et al., 2007] and other software that estimates the proportions of the genome at which two individuals share 0, 1, or 2 alleles IBD. This technique is based on the assumption that the study sample is drawn from a single, homogeneous, randomly mating population. This assumption is violated if pedigree founders are drawn from multiple populations or include admixed individuals. In the presence of population structure, the method of moments estimator has an inflated variance and can be biased because it relies on sample-based allele frequency estimates. In the case of the PLINK estimator, which truncates genome-wide sharing estimates at zero and one to generate biologically interpretable results, the bias is most often towards over-estimation of relatedness between ancestrally similar individuals. Using simulated pedigrees, we are able to demonstrate and quantify the behavior of the PLINK method of moments estimator under different population structure conditions. We also propose a simple method based on SNP pruning for improving genome-wide IBD estimates when the assumption of a single, homogeneous population is violated. PMID:23740691
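    The three sharing proportions described in this abstract are commonly summarised as a single relatedness value, P(IBD=2) + 0.5 × P(IBD=1). The sketch below computes that summary from already-estimated proportions; it does not implement the method of moments estimator itself. The function name is hypothetical, and the clip-and-renormalise step is an assumed, simplified stand-in for the truncation at zero and one that the abstract mentions, not PLINK's exact procedure.

```python
def pi_hat(z0: float, z1: float, z2: float) -> float:
    """Proportion of the genome shared IBD: P(IBD=2) + 0.5 * P(IBD=1)."""
    # Assumption: clip each raw estimate into [0, 1], then renormalise so the
    # three proportions sum to one. This is a simplified illustration of
    # truncation, not the bounding procedure used by any particular program.
    clipped = [min(1.0, max(0.0, z)) for z in (z0, z1, z2)]
    total = sum(clipped)
    if total == 0:
        raise ValueError("all IBD estimates are zero after clipping")
    _, z1c, z2c = (z / total for z in clipped)
    return z2c + 0.5 * z1c
```

    Under this convention, the expected full-sibling proportions (0.25, 0.5, 0.25) and the parent-offspring proportions (0, 1, 0) both give 0.5, and unrelated pairs (1, 0, 0) give 0.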

  1. Femtomole SHAPE reveals regulatory structures in the authentic XMRV RNA genome

    PubMed Central

    Grohman, Jacob K.; Kottegoda, Sumith; Gorelick, Robert J.; Allbritton, Nancy L.; Weeks, Kevin M.

    2011-01-01

    Higher-order structure influences critical functions in nearly all non-coding and coding RNAs. Most single-nucleotide resolution RNA structure determination technologies cannot be used to analyze RNA from scarce biological samples, like viral genomes. To make quantitative RNA structure analysis applicable to a much wider array of RNA structure-function problems, we developed and applied high-sensitivity selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) to structural analysis of authentic genomic RNA of the xenotropic murine leukemia virus-related virus (XMRV). For analysis of fluorescently labeled cDNAs generated in high-sensitivity SHAPE experiments, we developed a two-color capillary electrophoresis approach with zeptomole molecular detection limits and sub-femtomole sensitivity for complete SHAPE experiments involving hundreds of individual RNA structure measurements. High-sensitivity SHAPE data correlated closely (R = 0.89) with data obtained by conventional capillary electrophoresis. Using high-sensitivity SHAPE, we determined the dimeric structure of the XMRV packaging domain, examined dynamic interactions between a packaging domain RNA and viral nucleocapsid protein inside virion particles, and identified the packaging signal for this virus. Despite extensive sequence differences between XMRV and the intensively studied Moloney murine leukemia virus, architectures of the regulatory domains are similar and reveal common principles of gammaretrovirus RNA genome packaging. PMID:22126209

  2. High Density LD-Based Structural Variations Analysis in Cattle Genome

    PubMed Central

    Salomon-Torres, Ricardo; Matukumalli, Lakshmi K.; Van Tassell, Curtis P.; Villa-Angulo, Carlos; Gonzalez-Vizcarra, Víctor M.; Villa-Angulo, Rafael

    2014-01-01

    Genomic structural variations represent an important source of genetic variation in mammal genomes, and they are commonly related to phenotypic expressions. In this work, ∼770,000 single nucleotide polymorphism genotypes from 506 animals from 19 cattle breeds were analyzed. A simple LD-based structural variation was defined, and a genome-wide analysis was performed. After applying some quality control filters, for each breed and each chromosome we calculated the short-range (≤100 Kb) linkage disequilibrium (r²). We sorted SNP pairs by distance and obtained a set of LD means (called the expected means) using bins of 5 Kb. We identified 15,246 segments of at least 1 Kb, among the 19 breeds, consisting of sets of at least 3 adjacent SNPs such that, for each SNP, the r² values with its neighbors in a 100 Kb range to the right side of that SNP were all bigger than, or all smaller than, the corresponding expected mean, and their P-values were significant after a Benjamini-Hochberg multiple testing correction. In addition, to account only for homogeneously distributed regions we considered only SNPs having at least 15 SNP neighbors within 100 Kb. We defined such segments as structural variations. By grouping all variations across all animals in the sample we defined 9,146 regions, involving a total of 53,137 SNPs and representing 6.40% (160.98 Mb) of the bovine genome. The identified structural variations covered 3,109 genes. Clustering analysis showed the relatedness of breeds given the geographic region in which they are evolving. In summary, we present an analysis of structural variations based on the deviation from the expected short-range LD between SNPs in the bovine genome. With an intuitive and simple definition based only on SNP data it was possible to discern closeness of breeds due to grouping by geographic region in which they are evolving. PMID:25050984
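    Two of the steps in this abstract, averaging LD in 5 Kb distance bins and applying a Benjamini-Hochberg correction, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the function names and the (distance, r²) input layout are hypothetical, and the segment-calling logic is not reproduced.

```python
from collections import defaultdict


def expected_ld_by_bin(pairs, bin_size=5_000, max_dist=100_000):
    """Mean r^2 per distance bin; pairs is an iterable of (distance_bp, r2)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for dist, r2 in pairs:
        if dist <= max_dist:  # keep short-range pairs only
            b = dist // bin_size  # bin 0 covers 0-4,999 bp, bin 1 covers 5,000-9,999 bp
            sums[b] += r2
            counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

    A SNP pair's observed r² would then be compared with the mean of the bin matching its distance, and a test would be called significant when its adjusted p-value falls below the chosen threshold.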

  3. Mapping the structure and dynamics of genomics-related MeSH terms complex networks.

    PubMed

    Siqueiros-García, Jesús M; Hernández-Lemus, Enrique; García-Herrera, Rodrigo; Robina-Galatas, Andrea

    2014-01-01

    It has been proposed that the history and evolution of scientific ideas may reflect certain aspects of the underlying socio-cognitive frameworks in which science itself is developing. Systematic analyses of the development of scientific knowledge may help us to construct models of the collective dynamics of science. Aiming at scientific rigor, these models should be built upon solid empirical evidence, analyzed with formal tools leading to ever-improving results that support the related conclusions. Along these lines we studied the dynamics and structure of the development of research in genomics as represented by the entire collection of genomics-related scientific papers contained in the PubMed database. The analyzed corpus consisted of more than 49,000 articles published from 1987 (first appearance of the term Genomics) to 2011, categorized by means of the Medical Subject Headings (MeSH) content descriptors. Complex networks were built in which two MeSH terms were connected if they were descriptors of the same article(s). The analysis of such networks revealed a complex structure and dynamics that to a certain extent resembled small-world networks. The evolution of such networks in time reflected interesting phenomena in the historical development of genomic research, including what seems to be a phase transition in a period marked by the completion of the first draft of the Human Genome Project. We also found that different disciplinary areas have different dynamic evolution patterns in their MeSH connectivity networks. In areas related to science, changes in topology were somewhat fast while retaining a certain core structure; in the humanities, the evolution was fairly slow and the structure was highly redundant; and in technology-related issues, the evolution was very fast and the structure remained tree-like with almost no overlapping terms. PMID:24699262
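    The network construction described here, in which two descriptors are linked when they annotate the same article, can be sketched as a co-occurrence count. The code below is a minimal illustration, not the authors' software; the function names and the input layout (one list of descriptor strings per article) are assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(articles):
    """Count, for every pair of descriptors, the number of articles listing both."""
    edges = Counter()
    for terms in articles:
        # Deduplicate and sort so each undirected pair has one canonical key.
        for a, b in combinations(sorted(set(terms)), 2):
            edges[(a, b)] += 1
    return edges


def degrees(edges):
    """Number of distinct neighbours of each descriptor in the network."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg
```

    Building one such edge set per publication year would give a sequence of networks whose topology can be compared over time, which is the kind of analysis the abstract reports.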

  5. Large-insert genome analysis technology detects structural variation in Pseudomonas aeruginosa clinical strains from cystic fibrosis patients.

    PubMed

    Hayden, Hillary S; Gillett, Will; Saenphimmachak, Channakhone; Lim, Regina; Zhou, Yang; Jacobs, Michael A; Chang, Jean; Rohmer, Laurence; D'Argenio, David A; Palmieri, Anthony; Levy, Ruth; Haugen, Eric; Wong, Gane K S; Brittnacher, Mitch J; Burns, Jane L; Miller, Samuel I; Olson, Maynard V; Kaul, Rajinder

    2008-06-01

    Large-insert genome analysis (LIGAN) is a broadly applicable, high-throughput technology designed to characterize genome-scale structural variation. Fosmid paired-end sequences and DNA fingerprints from a query genome are compared to a reference sequence using the Genomic Variation Analysis (GenVal) suite of software tools to pinpoint locations of insertions, deletions, and rearrangements. Fosmids spanning regions that contain new structural variants can then be sequenced. Clonal pairs of Pseudomonas aeruginosa isolates from four cystic fibrosis patients were used to validate the LIGAN technology. Approximately 1.5 Mb of inserted sequences were identified, including 743 kb containing 615 ORFs that are absent from published P. aeruginosa genomes. Six rearrangement breakpoints and 220 kb of deleted sequences were also identified. Our study expands the "genome universe" of P. aeruginosa and validates a technology that complements emerging, short-read sequencing methods that are better suited to characterizing single-nucleotide polymorphisms than structural variation. PMID:18445516

  6. Comparative genomics reveals 104 candidate structured RNAs from bacteria, archaea, and their metagenomes

    PubMed Central

    2010-01-01

    Background: Structured noncoding RNAs perform many functions that are essential for protein synthesis, RNA processing, and gene regulation. Structured RNAs can be detected by comparative genomics, in which homologous sequences are identified and inspected for mutations that conserve RNA secondary structure. Results: By applying a comparative genomics-based approach to genome and metagenome sequences from bacteria and archaea, we identified 104 candidate structured RNAs and inferred putative functions for many of these. Twelve candidate metabolite-binding RNAs were identified, three of which were validated, including one reported herein that binds the coenzyme S-adenosylmethionine. Newly identified cis-regulatory RNAs are implicated in photosynthesis or nitrogen regulation in cyanobacteria, purine and one-carbon metabolism, stomach infection by Helicobacter, and many other physiological processes. A candidate riboswitch termed crcB is represented in both bacteria and archaea. Another RNA motif may control gene expression from 3'-untranslated regions of mRNAs, which is unusual for bacteria. Many noncoding RNAs that likely act in trans are also revealed, and several of the noncoding RNA candidates are found mostly or exclusively in metagenome DNA sequences. Conclusions: This work greatly expands the variety of highly structured noncoding RNAs known to exist in bacteria and archaea and provides a starting point for biochemical and genetic studies needed to validate their biologic functions. Given the sustained rate of RNA discovery over several similar projects, we expect that far more structured RNAs remain to be discovered from bacterial and archaeal organisms. PMID:20230605

  7. Insights into RNA structure and function from genome-wide studies.

    PubMed

    Mortimer, Stefanie A; Kidwell, Mary Anne; Doudna, Jennifer A

    2014-07-01

    A comprehensive understanding of RNA structure will provide fundamental insights into the cellular function of both coding and non-coding RNAs. Although many RNA structures have been analysed by traditional biophysical and biochemical methods, the low-throughput nature of these approaches has prevented investigation of the vast majority of cellular transcripts. Triggered by advances in sequencing technology, genome-wide approaches for probing the transcriptome are beginning to reveal how RNA structure affects each step of protein expression and RNA stability. In this Review, we discuss the emerging relationships between RNA structure and the regulation of gene expression. PMID:24821474

  8. Unique genomic structure and distinct mitotic behavior of ring chromosome 21 in two unrelated cases.

    PubMed

    Zhang, H Z; Xu, F; Seashore, M; Li, P

    2012-01-01

    A ring chromosome replacing a normal chromosome could involve variable structural rearrangements and mitotic instability. However, most previously reported cases lacked further genomic characterization. High-resolution oligonucleotide array comparative genomic hybridization with single-nucleotide polymorphism typing (aCGH+SNP) was used to study 2 unrelated cases with a ring chromosome 21. Case 1 had severe myopia, hypotonia, joint hypermobility, speech delay, and dysmorphic features. aCGH detected a 1.275-Mb duplication of 21q22.12-q22.13 and a 6.731-Mb distal deletion at 21q22.2. Case 2 showed severe growth and developmental retardations, intractable seizures, and dysmorphic features. aCGH revealed a contiguous pattern of a 3.612-Mb deletion of 21q22.12-q22.2, a 4.568-Mb duplication of 21q22.2-q22.3, and a 2.243-Mb distal deletion at 21q22.3. Mitotic instability was noted in 13, 30, and 76% of in vitro cultured metaphase cells, interphase cells, and leukocyte DNA, respectively. The different phenotypes of these 2 cases are likely associated with the unique genomic structure and distinct mitotic behavior of their ring chromosome 21. These 2 cases represent a subtype of ring chromosome 21 probably involving somatic dicentric ring breakage and reunion. A cytogenomic approach is proposed for characterizing the genomic structure and mitotic instability of ring chromosome abnormalities.

  9. 3D structures of membrane proteins from genomic sequencing

    PubMed Central

    Hopf, Thomas A.; Colwell, Lucy J.; Sheridan, Robert; Rost, Burkhard; Sander, Chris; Marks, Debora S.

    2012-01-01

    Summary We show that amino acid co-variation in proteins, extracted from the evolutionary sequence record, can be used to fold transmembrane proteins. We use this technique to predict previously unknown 3D structures for 11 transmembrane proteins (with up to 14 helices) from their sequences alone. The prediction method (EVfold_membrane) applies a maximum entropy approach to infer evolutionary co-variation in pairs of sequence positions within a protein family and then generates all-atom models with the derived pairwise distance constraints. We benchmark the approach with blinded, de novo computation of known transmembrane protein structures from 23 families, demonstrating unprecedented accuracy of the method for large transmembrane proteins. We show how the method can predict oligomerization, functional sites, and conformational changes in transmembrane proteins. With the rapid rise in large-scale sequencing, more accurate and more comprehensive information on evolutionary constraints can be decoded from genetic variation, greatly expanding the repertoire of transmembrane proteins amenable to modelling by this method. PMID:22579045

  10. The Drosophila Helicase MLE Targets Hairpin Structures in Genomic Transcripts

    PubMed Central

    Cugusi, Simona; Li, Yujing; Jin, Peng; Lucchesi, John C.

    2016-01-01

    RNA hairpins are a common type of secondary structures that play a role in every aspect of RNA biochemistry including RNA editing, mRNA stability, localization and translation of transcripts, and in the activation of the RNA interference (RNAi) and microRNA (miRNA) pathways. Participation in these functions often requires restructuring the RNA molecules by the association of single-strand (ss) RNA-binding proteins or by the action of helicases. The Drosophila MLE helicase has long been identified as a member of the MSL complex responsible for dosage compensation. The complex includes one of two long non-coding RNAs and MLE was shown to remodel the roX RNA hairpin structures in order to initiate assembly of the complex. Here we report that this function of MLE may apply to the hairpins present in the primary RNA transcripts that generate the small molecules responsible for RNA interference. Using stocks from the Transgenic RNAi Project and the Vienna Drosophila Research Center, we show that MLE specifically targets hairpin RNAs at their site of transcription. The association of MLE at these sites is independent of sequence and chromosome location. We use two functional assays to test the biological relevance of this association and determine that MLE participates in the RNAi pathway. PMID:26752049

  11. Draft Genome Sequence of Ruminoclostridium sp. Ne3, Clostridia from an Enrichment Culture Obtained from Australian Subterranean Termite, Nasutitermes exitiosus

    PubMed Central

    Lin, Hai; Tran-Dinh, Nai; Li, Dongmei; Greenfield, Paul; Midgley, David J.

    2015-01-01

    The draft genome sequence of Ruminoclostridium sp. Ne3 was reconstructed from the metagenome of a hydrogenogenic microbial consortium growing on xylan. The organism is likely the primary hemicellulose degrader within the consortium. PMID:25908130

  12. VAMDC Consortium: A Service to Astrophysics

    NASA Astrophysics Data System (ADS)

    Dubernet, M. L.; Moreau, N.; Zwoelf, C. M.; Ba, Y. A.

    2015-12-01

    The VAMDC Consortium is a worldwide consortium which federates Atomic and Molecular databases through an e-science infrastructure and a political organisation. About 90% of the inter-connected databases handle data that are used for the interpretation of spectra and for the modelling of media in many fields of astrophysics. This paper presents how the VAMDC Consortium is organised in order to provide a "service" to the astrophysics community.

  13. A high-quality human reference panel reveals the complexity and distribution of genomic structural variants

    PubMed Central

    Hehir-Kwa, Jayne Y.; Marschall, Tobias; Kloosterman, Wigard P.; Francioli, Laurent C.; Baaijens, Jasmijn A.; Dijkstra, Louis J.; Abdellaoui, Abdel; Koval, Vyacheslav; Thung, Djie Tjwan; Wardenaar, René; Renkens, Ivo; Coe, Bradley P.; Deelen, Patrick; de Ligt, Joep; Lameijer, Eric-Wubbo; van Dijk, Freerk; Hormozdiari, Fereydoun; Bovenberg, Jasper A.; de Craen, Anton J. M.; Beekman, Marian; Hofman, Albert; Willemsen, Gonneke; Wolffenbuttel, Bruce; Platteel, Mathieu; Du, Yuanping; Chen, Ruoyan; Cao, Hongzhi; Cao, Rui; Sun, Yushen; Cao, Jeremy Sujie; Neerincx, Pieter B. T.; Dijkstra, Martijn; Byelas, George; Kanterakis, Alexandros; Bot, Jan; Vermaat, Martijn; Laros, Jeroen F. J.; den Dunnen, Johan T.; de Knijff, Peter; Karssen, Lennart C.; van Leeuwen, Elisa M.; Amin, Najaf; Rivadeneira, Fernando; Estrada, Karol; Hottenga, Jouke-Jan; Kattenberg, V. Mathijs; van Enckevort, David; Mei, Hailiang; Santcroos, Mark; van Schaik, Barbera D. C.; Handsaker, Robert E.; McCarroll, Steven A.; Ko, Arthur; Sudmant, Peter; Nijman, Isaac J.; Uitterlinden, André G.; van Duijn, Cornelia M.; Eichler, Evan E.; de Bakker, Paul I. W.; Swertz, Morris A.; Wijmenga, Cisca; van Ommen, Gert-Jan B.; Slagboom, P. Eline; Boomsma, Dorret I.; Schönhuth, Alexander; Ye, Kai; Guryev, Victor

    2016-01-01

    Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, the majority of studies neither provide a global view of the full spectrum of these variants nor integrate them into reference panels of genetic variation. Here, we analyse whole genome sequencing data of 769 individuals from 250 Dutch families, and provide a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion are previously underreported variants sized between 21 and 100 bp. We detect 4 megabases of novel sequence, encoding 11 new transcripts. Finally, we show 191 known, trait-associated SNPs to be in strong linkage disequilibrium with SVs and demonstrate that our panel facilitates accurate imputation of SVs in unrelated individuals. PMID:27708267

  14. Genome resequencing reveals multiscale geographic structure and extensive linkage disequilibrium in the forest tree Populus trichocarpa

    SciTech Connect

    Slavov, Gancho; DiFazio, Stephen P; Martin, Joel R; Schackwitz, Wendy; Muchero, Wellington; Rodgers-Melnick, Eli; Lipphardt, Mindie; Pennacchio, Christa; Hellsten, Uffe; Pennacchio, Len; Gunter, Lee; Ranjan, Priya; Strauss, Steven; Rokhsar, Daniel; Tuskan, Gerald A

    2012-01-01

    Population genomics of forest trees provides crucial information for breeding, conservation, and bioenergy feedstock development. As part of a large-scale association study, we resequenced 16 genomes of the model tree Populus trichocarpa to an average depth of 39×. Analyses of the resulting data revealed surprisingly extensive population genetic structure and decay of linkage disequilibrium over much larger physical distances than expected based on previous, smaller-scale studies. Rates of recombination varied widely across the genome but were largely predictable based on DNA sequence and methylation patterns. Our results suggest that genomewide association studies and accurate prediction of phenotypes from DNA data are more feasible in Populus than previously assumed, thereby laying the foundation for a step change in our understanding of tree biology.

  15. Heterogeneous genome divergence, differential introgression, and the origin and structure of hybrid zones

    PubMed Central

    Harrison, Richard G; Larson, Erica L

    2016-01-01

    Hybrid zones have been promoted as windows on the evolutionary process and as laboratories for studying divergence and speciation. Patterns of divergence between hybridizing species can now be characterized on a genome-wide scale, and recent genome scans have focused on the presence of “islands” of divergence. Patterns of heterogeneous genomic divergence may reflect differential introgression following secondary contact and provide insights into which genome regions contribute to local adaptation, hybrid unfitness, and positive assortative mating. However, heterogeneous genome divergence can also arise in the absence of any gene flow, as a result of variation in selection and recombination across the genome. We suggest that to understand hybrid zone origins and dynamics, it is essential to distinguish between genome regions that are divergent between pure parental populations and regions that show restricted introgression where these populations interact in hybrid zones. The latter, more so than the former, reveal the likely genetic architecture of reproductive isolation. Mosaic hybrid zones, because of their complex structure and multiple contacts, are particularly good subjects for distinguishing primary intergradation from secondary contact. Comparisons among independent hybrid zones or transects that involve the “same” species pair can also help to distinguish between divergence with gene flow and secondary contact. However, data from replicate hybrid zones or replicate transects do not reveal consistent patterns; in a few cases, patterns of introgression are similar across independent transects, but for many taxa, there is distinct lack of concordance, presumably due to variation in environmental context and/or variation in the genetics of the interacting populations. PMID:26857437

  16. PanScan, the Pancreatic Cancer Cohort Consortium, and the Pancreatic Cancer Case-Control Consortium

    Cancer.gov

    The Pancreatic Cancer Cohort Consortium consists of more than a dozen prospective epidemiologic cohort studies within the NCI Cohort Consortium, whose leaders work together to investigate the etiology and natural history of pancreatic cancer.

  18. The complete mitochondrial genome structure of the jaguar (Panthera onca).

    PubMed

    Caragiulo, Anthony; Dougherty, Eric; Soto, Sofia; Rabinowitz, Salisa; Amato, George

    2016-01-01

    The jaguar (Panthera onca) is the largest felid in the Western hemisphere, and the only member of the Panthera genus in the New World. The jaguar inhabits most countries within Central and South America, and is considered near threatened by the International Union for the Conservation of Nature. This study represents the first sequence of the entire jaguar mitogenome, which was the only Panthera mitogenome that had not been sequenced. The jaguar mitogenome is 17,049 bases and possesses the same molecular structure as other felid mitogenomes. Bayesian inference (BI) and maximum likelihood (ML) were used to determine the phylogenetic placement of the jaguar within the Panthera genus. Both BI and ML analyses revealed the jaguar to be sister to the tiger/leopard/snow leopard clade. PMID:25010076

  19. Structural features of conopeptide genes inferred from partial sequences of the Conus tribblei genome.

    PubMed

    Barghi, Neda; Concepcion, Gisela P; Olivera, Baldomero M; Lluisma, Arturo O

    2016-02-01

    The evolvability of venom components (in particular, the gene-encoded peptide toxins) in venomous species serves as an adaptive strategy allowing them to target new prey types or respond to changes in the prey field. The structure, organization, and expression of the venom peptide genes may provide insights into the molecular mechanisms that drive the evolution of such genes. Conus is a particularly interesting group given the high chemical diversity of their venom peptides, and the rapid evolution of the conopeptide-encoding genes. Conus genomes, however, are large and characterized by a high proportion of repetitive sequences. As a result, the structure and organization of conopeptide genes have remained poorly known. In this study, a survey of the genome of Conus tribblei was undertaken to address this gap. A partial assembly of C. tribblei genome was generated; the assembly, though consisting of a large number of fragments, accounted for 2160.5 Mb of sequence. A large number of repetitive genomic elements consisting of 642.6 Mb of retrotransposable elements, simple repeats, and novel interspersed repeats were observed. We characterized the structural organization and distribution of conotoxin genes in the genome. A significant number of conopeptide genes (estimated to be between 148 and 193) belonging to different superfamilies with complete or nearly complete exon regions were observed, ~60 % of which were expressed. The unexpressed conopeptide genes represent hidden but significant conotoxin diversity. The conotoxin genes also differed in the frequency and length of the introns. The interruption of exons by long introns in the conopeptide genes and the presence of repeats in the introns may indicate the importance of introns in facilitating recombination, evolution and diversification of conotoxins. These findings advance our understanding of the structural framework that promotes the gene-level molecular evolution of venom peptides. PMID:26423067

  1. Mapping the Structure and Dynamics of Genomics-Related MeSH Terms Complex Networks

    PubMed Central

    Siqueiros-García, Jesús M.; Hernández-Lemus, Enrique; García-Herrera, Rodrigo; Robina-Galatas, Andrea

    2014-01-01

    It has been proposed that the history and evolution of scientific ideas may reflect certain aspects of the underlying socio-cognitive frameworks in which science itself is developing. Systematic analyses of the development of scientific knowledge may help us to construct models of the collective dynamics of science. Aiming at scientific rigor, these models should be built upon solid empirical evidence, analyzed with formal tools leading to ever-improving results that support the related conclusions. Along these lines we studied the dynamics and structure of the development of research in genomics as represented by the entire collection of genomics-related scientific papers contained in the PubMed database. The analyzed corpus consisted of more than 49,000 articles published from 1987 (first appearance of the term Genomics) to 2011, categorized by means of the Medical Subject Headings (MeSH) content descriptors. Complex networks were built in which two MeSH terms were connected if they were descriptors of the same article(s). The analysis of such networks revealed a complex structure and dynamics that to a certain extent resembled small-world networks. The evolution of such networks in time reflected interesting phenomena in the historical development of genomic research, including what seems to be a phase transition in a period marked by the completion of the first draft of the Human Genome Project. We also found that different disciplinary areas have different dynamic evolution patterns in their MeSH connectivity networks. In areas related to science, changes in topology were somewhat fast while retaining a certain core structure; in the humanities, the evolution was slow and the structure was highly redundant; and in technology-related issues, the evolution was very fast and the structure remained tree-like with almost no overlapping terms. PMID:24699262
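
    The network construction rule stated in this abstract (two MeSH terms are linked when they describe the same article) can be sketched in a few lines. This is a minimal illustration, not the authors' code; the function names, the dictionary-based graph representation, and the three sample articles are assumptions made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_network(articles):
    """Return {term: {neighbor: shared_article_count}} for an undirected term network.

    articles maps an article identifier to its list of descriptor terms.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for terms in articles.values():
        # Every unordered pair of distinct terms on one article gets (or strengthens) an edge.
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return {term: dict(neighbors) for term, neighbors in graph.items()}


def degree(graph, term):
    """Number of distinct terms that co-occur with the given term."""
    return len(graph.get(term, {}))


# Hypothetical sample data, not taken from the study.
articles = {
    "article_1": ["Genomics", "Humans", "Proteomics"],
    "article_2": ["Genomics", "Humans"],
    "article_3": ["Genomics", "Computational Biology"],
}
network = build_cooccurrence_network(articles)
```

    Building one such network per publication year and comparing degree distributions or clustering across years would be one way to follow the temporal evolution the abstract describes.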

  2. Structure of the Acidianus Filamentous Virus 3 and Comparative Genomics of Related Archaeal Lipothrixviruses

    PubMed Central

    Vestergaard, Gisle; Aramayo, Ricardo; Basta, Tamara; Häring, Monika; Peng, Xu; Brügger, Kim; Chen, Lanming; Rachel, Reinhard; Boisset, Nicolas; Garrett, Roger A.; Prangishvili, David

    2008-01-01

    Four novel filamentous viruses with double-stranded DNA genomes, namely, Acidianus filamentous virus 3 (AFV3), AFV6, AFV7, and AFV8, have been characterized from the hyperthermophilic archaeal genus Acidianus, and they are assigned to the Betalipothrixvirus genus of the family Lipothrixviridae. The structures of the approximately 2-μm-long virions are similar, and one of them, AFV3, was studied in detail. It consists of a cylindrical envelope containing globular subunits arranged in a helical formation that is unique for any known double-stranded DNA virus. The envelope is 3.1 nm thick and encases an inner core with two parallel rows of protein subunits arranged like a zipper. Each end of the virion is tapered and carries three short filaments. Two major structural proteins were identified as being common to all betalipothrixviruses. The viral genomes were sequenced and analyzed, and they reveal a high level of conservation in both gene content and gene order over large regions, with this similarity extending partly to the earlier described betalipothrixvirus Sulfolobus islandicus filamentous virus. A few predicted gene products of each virus, in addition to the structural proteins, could be assigned specific functions, including a putative helicase involved in Holliday junction branch migration, a nuclease, a protein phosphatase, transcriptional regulators, and glycosyltransferases. The AFV7 genome appears to have undergone intergenomic recombination with a large section of an AFV2-like viral genome, apparently resulting in phenotypic changes, as revealed by the presence of AFV2-like termini in the AFV7 virions. Shared features of the genomes include (i) large inverted terminal repeats exhibiting conserved, regularly spaced direct repeats; (ii) a highly conserved operon encoding the two major structural proteins; (iii) multiple overlapping open reading frames, which may be indicative of gene recoding; (iv) putative 12-bp genetic elements; and (v) partial gene

  3. Overview of the Type I Diabetes Genetics Consortium.

    PubMed

    Rich, S S; Akolkar, B; Concannon, P; Erlich, H; Hilner, J E; Julier, C; Morahan, G; Nerup, J; Nierras, C; Pociot, F; Todd, J A

    2009-12-01

    The Type I Diabetes Genetics Consortium (T1DGC) is an international, multicenter research program with two primary goals. The first goal is to identify genomic regions and candidate genes whose variants modify an individual's risk of type I diabetes (T1D) and help explain the clustering of the disease in families. The second goal is to make research data available to the research community and to establish resources that can be used by, and that are fully accessible to, the research community. To facilitate the access to these resources, the T1DGC has developed a Consortium Agreement (http://www.t1dgc.org) that specifies the rights and responsibilities of investigators who participate in Consortium activities. The T1DGC has assembled a resource of affected sib-pair families, parent-child trios, and case-control collections with banks of DNA, serum, plasma, and EBV-transformed cell lines. In addition, both candidate gene and genome-wide (linkage and association) studies have been performed and displayed in T1DBase (http://www.t1dbase.org) for all researchers to use in their own investigations. In this supplement, a subset of the T1DGC collection has been used to investigate earlier published candidate genes for T1D, to confirm the results from a genome-wide association scan for T1D, and to determine associations with candidate genes for other autoimmune diseases or with type II diabetes that may be involved with beta-cell function.

  4. Midwest Superconductivity Consortium. Progress report, 1992

    SciTech Connect

    Bement, A.L. Jr.

    1993-01-01

    The mission of the Midwest Superconductivity Consortium (MISCON) is to advance the science and understanding of high Tc superconductivity. Programmatic research focuses upon key materials-related problems; principally, synthesis and processing, and properties limiting transport phenomena. During the past year, 26 projects produced over 133 talks and 113 publications. Two Master's degrees and one Ph.D. were granted to students working on MISCON projects. Group activities and interactions involved two MISCON group meetings (held in July and January), twenty external speakers, 36 collaborations, 10 exchanges of samples and/or measurements, and one gift of equipment from industry. Research achievements this past year expanded our understanding of processing phenomena on structure-property interrelationships and the fundamental nature of transport properties in high-temperature superconductors.

  5. Midwest Superconductivity Consortium: 1995 Progress report

    SciTech Connect

    1996-01-01

    The mission of the Midwest Superconductivity Consortium (MISCON) is to advance the science and understanding of high Tc superconductivity. During the past year, 26 projects produced over 133 talks and 127 publications. Three Master's degrees and 9 Doctor of Philosophy degrees were granted to students working on MISCON projects. Group activities and interactions involved 2 MISCON group meetings (held in January and July); the third MISCON Summer School held in July; 12 external speakers; 81 collaborations (with universities, industry, Federal laboratories, and foreign research centers); and 54 exchanges of samples and/or measurements. Research achievements this past year focused on understanding the effects of processing phenomena on structure-property interrelationships and the fundamental nature of transport properties in high-temperature superconductors.

  6. Characterization of the Genome, Proteome, and Structure of Yersiniophage ϕR1-37

    PubMed Central

    Hyytiäinen, Heidi J.; Happonen, Lotta J.; Kiljunen, Saija; Datta, Neeta; Mattinen, Laura; Williamson, Kirsty; Kristo, Paula; Szeliga, Magdalena; Kalin-Mänttäri, Laura; Ahola-Iivarinen, Elina; Kalkkinen, Nisse; Butcher, Sarah J.

    2012-01-01

    The bacteriophage vB_YecM-ϕR1-37 (ϕR1-37) is a lytic yersiniophage that can propagate naturally in different Yersinia species carrying the correct lipopolysaccharide receptor. This large-tailed phage has deoxyuridine (dU) instead of thymidine in its DNA. In this study, we determined the genomic sequence of phage ϕR1-37, mapped parts of the phage transcriptome, characterized the phage particle proteome, and characterized the virion structure by cryo-electron microscopy and image reconstruction. The 262,391-bp genome of ϕR1-37 is one of the largest sequenced phage genomes, and it contains 367 putative open reading frames (ORFs) and 5 tRNA genes. Mass-spectrometric analysis identified 69 phage particle structural proteins with the genes scattered throughout the genome. A total of 269 of the ORFs (73%) lack homologues in sequence databases. Based on terminator and promoter sequences identified from the intergenic regions, the phage genome was predicted to consist of 40 to 60 transcriptional units. Image reconstruction revealed that the ϕR1-37 capsid consists of hexameric capsomers arranged on a T=27 lattice similar to the bacteriophage ϕKZ. The tail of ϕR1-37 has a contractile sheath. We conclude that phage ϕR1-37 is a representative of a novel phage type that carries the dU-containing genome in a ϕKZ-like head. PMID:22973030

  7. An Isochore-Like Structure in the Genome of the Flatworm Schistosoma mansoni

    PubMed Central

    Lamolle, Guillermo; Protasio, Anna V.; Iriarte, Andrés; Jara, Eugenio; Simón, Diego; Musto, Héctor

    2016-01-01

    Eukaryotic genomes are compositionally heterogeneous, that is, composed of regions that differ in guanine–cytosine (GC) content (isochores). The best-documented case is that of vertebrates (mainly mammals), although it has also been noted among unicellular eukaryotes and invertebrates. In the human genome, regarded as a typical mammal, this heterogeneity is associated with several features. Specifically, genes located in GC-richest regions are the GC3-richest, display CpG islands and have shorter introns. Furthermore, these genes are more heavily expressed and tend to be located at the extremes of the chromosomes. Although the compositional heterogeneity seems to be widespread among eukaryotes, the associated properties noted in the human genome and other mammals have not been investigated in depth in other taxa. Here we provide evidence that the genome of the parasitic flatworm Schistosoma mansoni is compositionally heterogeneous and exhibits an isochore-like structure, displaying some features associated, until now, only with the human and other vertebrate genomes, with the exception of gene concentration. PMID:27435793
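
    The two quantities this abstract relies on, regional GC content and GC3 (GC content at third codon positions), can be computed as in the sketch below. It is an illustration under stated assumptions rather than the authors' method: the function names, the non-overlapping default window, and the decision to ignore ambiguous bases such as N are all choices made for the example.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T); 0.0 if there are none."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq, window, step=None):
    """GC content of successive windows as (start, gc) tuples.

    Windows are non-overlapping unless a smaller step is given; a trailing
    partial window is dropped.
    """
    step = step or window
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def gc3(cds):
    """GC content at third codon positions of an in-frame coding sequence."""
    return gc_content(cds[2::3])
```

    Plotting the output of `windowed_gc` along a chromosome is a simple way to look for long stretches of relatively uniform composition, and comparing `gc3` of each gene with the GC content of its surrounding window tests the association the abstract mentions.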

  9. Identification and classification of conserved RNA secondary structures in the human genome.

    PubMed

    Pedersen, Jakob Skou; Bejerano, Gill; Siepel, Adam; Rosenbloom, Kate; Lindblad-Toh, Kerstin; Lander, Eric S; Kent, Jim; Miller, Webb; Haussler, David

    2006-04-01

    The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set of 48,479 candidate RNA structures. This screen finds a large number of known functional RNAs, including 195 miRNAs, 62 histone 3'UTR stem loops, and various types of known genetic recoding elements. Among the highest-scoring new predictions are 169 new miRNA candidates, as well as new candidate selenocysteine insertion sites, RNA editing hairpins, RNAs involved in transcript auto regulation, and many folds that form singletons or small functional RNA families of completely unknown function. While the rate of false positives in the overall set is difficult to estimate and is likely to be substantial, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization. PMID:16628248

  10. An Isochore-Like Structure in the Genome of the Flatworm Schistosoma mansoni.

    PubMed

    Lamolle, Guillermo; Protasio, Anna V; Iriarte, Andrés; Jara, Eugenio; Simón, Diego; Musto, Héctor

    2016-01-01

    Eukaryotic genomes are compositionally heterogeneous, that is, composed of regions that differ in guanine-cytosine (GC) content (isochores). The best-documented case is that of vertebrates (mainly mammals), although it has also been noted among unicellular eukaryotes and invertebrates. In the human genome, regarded as a typical mammal, this heterogeneity is associated with several features. Specifically, genes located in GC-richest regions are the GC3-richest, display CpG islands and have shorter introns. Furthermore, these genes are more heavily expressed and tend to be located at the extremes of the chromosomes. Although the compositional heterogeneity seems to be widespread among eukaryotes, the associated properties noted in the human genome and other mammals have not been investigated in depth in other taxa. Here we provide evidence that the genome of the parasitic flatworm Schistosoma mansoni is compositionally heterogeneous and exhibits an isochore-like structure, displaying some features associated, until now, only with the human and other vertebrate genomes, with the exception of gene concentration. PMID:27435793
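
    The two quantities this abstract relies on, regional GC content and GC3 (GC content at third codon positions), can be illustrated with a minimal Python sketch. The function names, the non-overlapping window scheme, and the handling of ambiguous bases are assumptions made for illustration; the abstract does not describe the authors' actual procedure.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Sequences with no unambiguous bases are reported as 0.0 by convention here.
    return gc / total if total else 0.0


def windowed_gc(seq, window, step=None):
    """GC fraction per window, as a list of (start, gc) pairs.

    With step left as None the windows do not overlap (an assumption).
    """
    step = step or window
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def gc3(cds):
    """GC fraction at third codon positions of an in-frame coding sequence."""
    return gc_content(cds[2::3])
```

    Plotting the output of windowed_gc along a chromosome is one simple way to look for long stretches of relatively uniform GC content, which is the sense in which a genome is called isochore-like.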

  11. Genomic structure of the human PAX2 gene

    SciTech Connect

    Sanyanusin, P.; Norrish, J.H.; Ward, T.A.

    1996-07-01

    Recent evidence indicates that Fgf8 is expressed during vertebrate development in multiple locations involved in the patterning and outgrowth of important embryo structures. Cloning and analysis of the murine gene revealed at least eight potential protein isoforms that share a common carboxyl region, encoded by exons 2 and 3, but possess different amino termini, generated by alternative splicing of RNA encoded by multiple 5′ exons (exons 1A, 1B, 1C, the human FGF8 gene). Human FGF-8 isoforms are identical to their murine counterparts in the common carboxyl region. Four of the human isoforms are identical to, or very similar to, the murine isoforms in the amino termini. However, four of the potential murine isoforms do not have corresponding human isoforms due to marked sequence divergence, leading to a blocked reading frame in exon 1B of FGF8. The lack of the four murine isoforms in humans raises the question of their function in murine development. 18 refs., 2 figs.

  12. Molecular characterization of a toluene-degrading methanogenic consortium

    SciTech Connect

    Ficker, M.; Krastel, K.; Orlicky, S.; Edwards, E.

    1999-12-01

    A toluene-degrading methanogenic consortium enriched from creosote-contaminated aquifer material was maintained on toluene as the sole carbon and energy source for 10 years. The species in the consortium were characterized by using a molecular approach. Total genomic DNA was isolated, and 16S rRNA genes were amplified by using PCR performed with kingdom-specific primers that were specific for 16S rRNA genes from either members of the kingdom Bacteria or members of the kingdom Archaea. A total of 90 eubacterial clones and 75 archaeal clones were grouped by performing a restriction fragment length polymorphism (RFLP) analysis. Six eubacterial sequences and two archaeal sequences were found in the greatest abundance (in six or more clones) based on the RFLP analysis. The relative abundance of each putative species was estimated by using fluorescent in situ hybridization (FISH), and the presence of putative species was determined qualitatively by performing slot blot hybridization with consortium DNA. Both archaeal species and two of the six eubacterial species were detected in the DNA and FISH hybridization experiments. A phylogenetic analysis of these four dominant organisms suggested that the two archaeal species are related to the genera Methanosaeta and Methanospirillum. One of the eubacterial species is related to the genus Desulfotomaculum, while the other is not related to any previously described genus. By elimination, the authors propose that the last organism probably initiates the attack on toluene.

  13. Genome structures and transcriptomes signify niche adaptation for the multiple-ion-tolerant extremophyte Schrenkiella parvula.

    PubMed

    Oh, Dong-Ha; Hong, Hyewon; Lee, Sang Yeol; Yun, Dae-Jin; Bohnert, Hans J; Dassanayake, Maheshi

    2014-04-01

    Schrenkiella parvula (formerly Thellungiella parvula), a close relative of Arabidopsis (Arabidopsis thaliana) and Brassica crop species, thrives on the shores of Lake Tuz, Turkey, where soils accumulate high concentrations of multiple-ion salts. Despite the stark differences in adaptations to extreme salt stresses, the genomes of S. parvula and Arabidopsis show extensive synteny. S. parvula completes its life cycle in the presence of Na⁺, K⁺, Mg²⁺, Li⁺, and borate at soil concentrations lethal to Arabidopsis. Genome structural variations, including tandem duplications and translocations of genes, interrupt the colinearity observed throughout the S. parvula and Arabidopsis genomes. Structural variations distinguish homologous gene pairs characterized by divergent promoter sequences and basal-level expression strengths. Comparative RNA sequencing reveals the enrichment of ion-transport functions among genes with higher expression in S. parvula, while pathogen defense-related genes show higher expression in Arabidopsis. Key stress-related ion transporter genes in S. parvula showed increased copy number, higher transcript dosage, and evidence for subfunctionalization. This extremophyte offers a framework to identify the requisite adjustments of genomic architecture and expression control for a set of genes found in most plants in a way to support distinct niche adaptation and lifestyles. PMID:24563282

  14. Nonclinical and clinical Enterococcus faecium strains, but not Enterococcus faecalis strains, have distinct structural and functional genomic features.

    PubMed

    Kim, Eun Bae; Marco, Maria L

    2014-01-01

    Certain strains of Enterococcus faecium and Enterococcus faecalis contribute beneficially to animal health and food production, while others are associated with nosocomial infections. To determine whether there are structural and functional genomic features that are distinct between nonclinical (NC) and clinical (CL) strains of those species, we analyzed the genomes of 31 E. faecium and 38 E. faecalis strains. Hierarchical clustering of 7,017 orthologs found in the E. faecium pangenome revealed that NC strains clustered into two clades and are distinct from CL strains. NC E. faecium genomes are significantly smaller than CL genomes, and this difference was partly explained by significantly fewer mobile genetic elements (ME), virulence factors (VF), and antibiotic resistance (AR) genes. E. faecium ortholog comparisons identified 68 and 153 genes that are enriched for NC and CL strains, respectively. Proximity analysis showed that CL-enriched loci, and not NC-enriched loci, are more frequently colocalized on the genome with ME. In CL genomes, AR genes are also colocalized with ME, and VF are more frequently associated with CL-enriched loci. Genes in 23 functional groups are also differentially enriched between NC and CL E. faecium genomes. In contrast, differences were not observed between NC and CL E. faecalis genomes despite their having larger genomes than E. faecium. Our findings show that unlike E. faecalis, NC and CL E. faecium strains are equipped with distinct structural and functional genomic features indicative of adaptation to different environments.

  15. The Structure of Human Parechovirus 1 Reveals an Association of the RNA Genome with the Capsid

    PubMed Central

    Kalynych, Sergei; Pálková, Lenka

    2015-01-01

    ABSTRACT Parechoviruses are human pathogens that cause diseases ranging from gastrointestinal disorders to encephalitis. Unlike those of most picornaviruses, parechovirus capsids are composed of only three subunits: VP0, VP1, and VP3. Here, we present the structure of a human parechovirus 1 (HPeV-1) virion determined to a resolution of 3.1 Å. We found that interactions among pentamers in the HPeV-1 capsid are mediated by the N termini of VP0s, which correspond to the capsid protein VP4 and the N-terminal part of the capsid protein VP2 of other picornaviruses. In order to facilitate delivery of the virus genome into the cytoplasm, the N termini of VP0s have to be released from contacts between pentamers and exposed at the particle surface, resulting in capsid disruption. A hydrophobic pocket, which can be targeted by capsid-binding antiviral compounds in many other picornaviruses, is not present in HPeV-1. However, we found that interactions between the HPeV-1 single-stranded RNA genome and subunits VP1 and VP3 in the virion impose a partial icosahedral ordering on the genome. The residues involved in RNA binding are conserved among all parechoviruses, suggesting a putative role of the genome in virion stability or assembly. Therefore, putative small molecules that could disrupt HPeV RNA-capsid protein interactions could be developed into antiviral inhibitors. IMPORTANCE Human parechoviruses (HPeVs) are pathogens that cause diseases ranging from respiratory and gastrointestinal disorders to encephalitis. Recently, there have been outbreaks of HPeV infections in Western Europe and North America. We present the first atomic structure of parechovirus HPeV-1 determined by X-ray crystallography. The structure explains why HPeVs cannot be targeted by antiviral compounds that are effective against other picornaviruses. Furthermore, we found that the interactions of the HPeV-1 genome with the capsid resulted in a partial icosahedral ordering of the genome. The residues

  16. Structure and genome release of Twort-like Myoviridae phage with a double-layered baseplate.

    PubMed

    Nováček, Jiří; Šiborová, Marta; Benešík, Martin; Pantůček, Roman; Doškař, Jiří; Plevka, Pavel

    2016-08-16

    Bacteriophages from the family Myoviridae use double-layered contractile tails to infect bacteria. Contraction of the tail sheath enables the tail tube to penetrate through the bacterial cell wall and serve as a channel for the transport of the phage genome into the cytoplasm. However, the mechanisms controlling the tail contraction and genome release of phages with "double-layered" baseplates were unknown. We used cryo-electron microscopy to show that the binding of the Twort-like phage phi812 to the Staphylococcus aureus cell wall requires a 210° rotation of the heterohexameric receptor-binding and tripod protein complexes within its baseplate about an axis perpendicular to the sixfold axis of the tail. This rotation reorients the receptor-binding proteins to point away from the phage head, and also results in disruption of the interaction of the tripod proteins with the tail sheath, hence triggering its contraction. However, the tail sheath contraction of Myoviridae phages is not sufficient to induce genome ejection. We show that the end of the phi812 double-stranded DNA genome is bound to one protein subunit from a connector complex that also forms an interface between the phage head and tail. The tail sheath contraction induces conformational changes of the neck and connector that result in disruption of the DNA binding. The genome penetrates into the neck, but is stopped at a bottleneck before the tail tube. A subsequent structural change of the tail tube induced by its interaction with the S. aureus cell is required for the genome's release. PMID:27469164

  17. Universal Internucleotide Statistics in Full Genomes: A Footprint of the DNA Structure and Packaging?

    PubMed Central

    Bogachev, Mikhail I.; Kayumov, Airat R.; Bunde, Armin

    2014-01-01

    Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms from Archaea and Bacteria to H. sapiens aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleotide interval distributions exhibit the same q-exponential form. While in prokaryotes a single q-exponential function makes the best fit, in eukaryotes the PDF contains additionally a second q-exponential, which in the human genome makes a perfect approximation over nearly 10 decades. We suggest that this functional form is a footprint of the heterogeneous DNA structure, where the first q-exponential reflects the universal helical pitch that appears both in pro- and eukaryotic DNA, while the second q-exponential is a specific marker of the large-scale eukaryotic DNA organization. PMID:25438044
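
    The functional form discussed in this abstract is commonly written as the q-exponential, e_q(x) = [1 + (1 - q) x]^(1/(1 - q)), which reduces to the ordinary exponential as q approaches 1. The Python sketch below shows that function together with one plausible way of extracting internucleotide intervals. Defining an interval as the distance between consecutive occurrences of the same nucleotide is an assumption made here for illustration, as are the function names; the abstract does not give the authors' exact definitions or fitting procedure.

```python
import math


def q_exp(x, q):
    """q-exponential: [1 + (1 - q) * x] ** (1 / (1 - q)); exp(x) when q == 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        # Usual cutoff convention: zero for q < 1, divergent for q > 1.
        return 0.0 if q < 1.0 else math.inf
    return base ** (1.0 / (1.0 - q))


def internucleotide_intervals(seq, base):
    """Distances between consecutive occurrences of `base` in `seq`.

    This definition of an interval is an illustrative assumption.
    """
    positions = [i for i, b in enumerate(seq.upper()) if b == base.upper()]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]
```

    For q > 1 and negative arguments, q_exp decays as a power law rather than exponentially, which is why this family is often used to describe heavy-tailed interval distributions.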

  18. Long-Range Correlations in Genomic DNA: A Signature of the Nucleosomal Structure

    NASA Astrophysics Data System (ADS)

    Audit, B.; Thermes, C.; Vaillant, C.; D'Aubenton-Carafa, Y.; Muzy, J. F.; Arneodo, A.

    2001-03-01

    We use the "wavelet transform microscope" to carry out a comparative statistical analysis of DNA bending profiles and of the corresponding DNA texts. In the three kingdoms, one reveals on both signals a characteristic scale of 100-200 bp that separates two different regimes of power-law correlations (PLC). In the small-scale regime, PLC are observed in eukaryotic, in double-strand DNA viral, and in archaeal genomes, which contrasts with their total absence in the genomes of eubacteria and their viruses. This strongly suggests that small-scale PLC are related to the mechanisms underlying the wrapping of DNA in the nucleosomal structure. We further speculate that the large scale PLC are the signature of the higher-order structure and dynamics of chromatin.

  19. Protein structure similarity clustering (PSSC) and natural product structure as inspiration sources for drug development and chemical genomics.

    PubMed

    Dekker, Frank J; Koch, Marcus A; Waldmann, Herbert

    2005-06-01

    Finding small molecules that modulate protein function is of primary importance in drug development and in the emerging field of chemical genomics. To facilitate the identification of such molecules, we developed a novel strategy making use of structural conservatism found in protein domain architecture and natural product inspired compound library design. Domains and proteins identified as being structurally similar in their ligand-sensing cores are grouped in a protein structure similarity cluster (PSSC). Natural products can be considered as evolutionary pre-validated ligands for multiple proteins and therefore natural products that are known to interact with one of the PSSC member proteins are selected as guiding structures for compound library synthesis. Application of this novel strategy for compound library design provided enhanced hit rates in small compound libraries for structurally similar proteins.

  20. Increasing Sales by Developing Production Consortiums.

    ERIC Educational Resources Information Center

    Smith, Christopher A.; Russo, Robert

    Intended to help rehabilitation facility administrators increase organizational income from manufacturing and/or contracted service sources, this document provides a decision-making model for the development of a production consortium. The document consists of five chapters and two appendices. Chapter 1 defines the consortium concept, explains…

  1. Tri-District Arts Consortium Summer Program.

    ERIC Educational Resources Information Center

    Kirby, Charlotte O.

    1990-01-01

    The Tri-District Arts Consortium in South Carolina was formed to serve artistically gifted students in grades six-nine. The consortium developed a summer program offering music, dance, theatre, and visual arts instruction through a curriculum of intense training, performing, and hands-on experiences with faculty members and guest artists. (JDD)

  2. The tomato genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The tomato genome sequence was undertaken at a time when state-of-the-art sequencing methodologies were undergoing a transition to so-called next generation methodologies. The result was an international consortium undertaking a strategy merging both old and new approaches. Because biologists were...

  3. Whole genome comparison between table and wine grapes reveals a comprehensive catalog of structural variants

    PubMed Central

    2014-01-01

    Background Grapevine (Vitis vinifera L.) is the most important Mediterranean fruit crop, used to produce both wine and spirits as well as table grapes and raisins. Wine and table grape cultivars represent two divergent germplasm pools with different origins and domestication history, as well as differential characteristics for berry size, cluster architecture and berry chemical profile, among others. ‘Sultanina’ plays a pivotal role in modern table grape breeding, providing the main source of seedlessness. This cultivar is also one of the most planted for fresh consumption and raisin production. Given its importance, we sequenced it and implemented a novel strategy for the de novo assembly of its highly heterozygous genome. Results Our approach produced a draft genome of 466 Mb, recovering 82% of the genes present in the grapevine reference genome; in addition, we identified 240 novel genes. A large number of structural variants and SNPs were identified. Among them, 45 (21 SNPs and 24 INDELs) were experimentally confirmed in ‘Sultanina’ and six SNPs in 23 other table grape varieties. Transposable elements corresponded to ca. 80% of the repetitive sequences involved in structural variants and more than 2,000 genes were affected in their structure by these variants. Some of these genes are likely involved in embryo development, suggesting that they may contribute to seedlessness, a key trait for table grapes. Conclusions This work produced the first structural variants and SNPs catalog for grapevine, constituting a novel and very powerful tool for genomic studies in this key fruit crop, particularly useful to support marker assisted breeding in table grapes. PMID:24397443

  4. Toward genomic identification of β-barrel membrane proteins: Composition and architecture of known structures

    PubMed Central

    Wimley, William C.

    2002-01-01

    The amino acid composition and architecture of all β-barrel membrane proteins of known three-dimensional structure have been examined to generate information that will be useful in identifying β-barrels in genome databases. The database consists of 15 nonredundant structures, including several novel, recent structures. Known structures include monomeric, dimeric, and trimeric β-barrels with between 8 and 22 membrane-spanning β-strands each. For this analysis the membrane-interacting surfaces of the β-barrels were identified with an experimentally derived, whole-residue hydrophobicity scale, and then the barrels were aligned normal to the bilayer and the position of the bilayer midplane was determined for each protein from the hydrophobicity profile. The abundance of each amino acid, relative to the genomic abundance, was calculated for the barrel exterior and interior. The architecture and diversity of known β-barrels was also examined. For example, the distribution of rise-per-residue values perpendicular to the bilayer plane was found to be 2.7 ± 0.25 Å per residue, or about 10 ± 1 residues across the membrane. Also, as noted by other authors, nearly every known membrane-spanning β-barrel strand was found to have a short loop of seven residues or less connecting it to at least one adjacent strand. Using this information we have begun to generate rapid screening algorithms for the identification of β-barrel membrane proteins in genomic databases. Application of one algorithm to the genomes of Escherichia coli and Pseudomonas aeruginosa confirms its ability to identify β-barrels, and reveals dozens of unidentified open reading frames that potentially code for β-barrel outer membrane proteins. PMID:11790840
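
    Two of the quantities mentioned in this abstract lend themselves to a short worked example: the abundance of each amino acid relative to a background (genomic) abundance, and the membrane span implied by the reported rise of about 2.7 Å per residue over about 10 residues. The Python sketch below is illustrative only. The function names and the use of plain one-letter sequences as input are assumptions, and it does not reproduce the author's hydrophobicity-based alignment.

```python
from collections import Counter


def composition(seq):
    """Fractional amino acid composition of a one-letter sequence."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def relative_abundance(region, background):
    """Frequency of each residue in `region` divided by its frequency in `background`.

    Values above 1 indicate enrichment in the region; residues absent
    from the region get 0.0. Only residues seen in the background are reported.
    """
    region_freq = composition(region)
    background_freq = composition(background)
    return {
        aa: region_freq.get(aa, 0.0) / freq
        for aa, freq in background_freq.items()
    }


def strand_span(residues, rise_per_residue=2.7):
    """Distance in angstroms covered perpendicular to the bilayer.

    The default rise is the 2.7 angstrom per residue figure quoted above.
    """
    return residues * rise_per_residue
```

    With the figures quoted above, strand_span(10) gives roughly 27 Å, which is the span implied by 10 residues at 2.7 Å each.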

  5. Finding genome-transcriptome-phenome association with structured association mapping and visualization in GenAMap.

    PubMed

    Curtis, Ross E; Yin, Junming; Kinnaird, Peter; Xing, Eric P

    2012-01-01

    Despite the success of genome-wide association studies in detecting novel disease variants, we are still far from a complete understanding of the mechanisms through which variants cause disease. Most previous studies have considered only genome-phenome associations. However, the integration of transcriptome data may help further elucidate the mechanisms through which genetic mutations lead to disease and uncover potential pathways to target for treatment. We present a novel structured association mapping strategy for finding genome-transcriptome-phenome associations when SNP, gene-expression, and phenotype data are available for the same cohort. We do so via a two-step procedure where genome-transcriptome associations are identified by GFlasso, a sparse regression technique presented previously. Transcriptome-phenome associations are then found by a novel proposed method called gGFlasso, which leverages structure inherent in the genes and phenotypic traits. Due to the complex nature of three-way association results, visualization tools can aid in the discovery of causal SNPs and regulatory mechanisms affecting diseases. Using well-grounded visualization techniques, we have designed new visualizations that filter through large three-way association results to detect interesting SNPs and associated genes and traits. The two-step GFlasso-gGFlasso algorithmic approach and new visualizations are integrated into GenAMap, a visual analytics system for structured association mapping. Results on simulated datasets show that our approach has the potential to increase the sensitivity and specificity of association studies, compared to existing procedures that do not exploit the full structural information of the data. We report results from an analysis on a publicly available mouse dataset, showing that identified SNP-gene-trait associations are compatible with known biology.

  6. The structure of the Morganella morganii lipopolysaccharide core region and identification of its genomic loci.

    PubMed

    Vinogradov, Evgeny; Nash, John H E; Foote, Simon; Young, N Martin

    2015-01-30

    The core region of the lipopolysaccharide of Morganella morganii serotype O:1ab was obtained by hydrolysis of the LPS and studied by 2D NMR, ESI MS, and chemical methods. Its structure was highly homologous to those from the two major members of the same Proteeae tribe, Proteus mirabilis and Providencia alcalifaciens, and analysis of the M. morganii genome disclosed that the loci for its outer core, lipid A and Ara4N moieties are similarly conserved.

  7. Genome structure of a Saccharomyces cerevisiae strain widely used in bioethanol production

    PubMed Central

    Argueso, Juan Lucas; Carazzolle, Marcelo F.; Mieczkowski, Piotr A.; Duarte, Fabiana M.; Netto, Osmar V.C.; Missawa, Silvia K.; Galzerani, Felipe; Costa, Gustavo G.L.; Vidal, Ramon O.; Noronha, Melline F.; Dominska, Margaret; Andrietta, Maria G.S.; Andrietta, Sílvio R.; Cunha, Anderson F.; Gomes, Luiz H.; Tavares, Flavio C.A.; Alcarde, André R.; Dietrich, Fred S.; McCusker, John H.; Petes, Thomas D.; Pereira, Gonçalo A.G.

    2009-01-01

    Bioethanol is a biofuel produced mainly from the fermentation of carbohydrates derived from agricultural feedstocks by the yeast Saccharomyces cerevisiae. One of the most widely adopted strains is PE-2, a heterothallic diploid naturally adapted to the sugar cane fermentation process used in Brazil. Here we report the molecular genetic analysis of a PE-2 derived diploid (JAY270), and the complete genome sequence of a haploid derivative (JAY291). The JAY270 genome is highly heterozygous (∼2 SNPs/kb) and has several structural polymorphisms between homologous chromosomes. These chromosomal rearrangements are confined to the peripheral regions of the chromosomes, with breakpoints within repetitive DNA sequences. Despite its complex karyotype, this diploid, when sporulated, had a high frequency of viable spores. Hybrid diploids formed by outcrossing with the laboratory strain S288c also displayed good spore viability. Thus, the rearrangements that exist near the ends of chromosomes do not impair meiosis, as they do not span regions that contain essential genes. This observation is consistent with a model in which the peripheral regions of chromosomes represent plastic domains of the genome that are free to recombine ectopically and experiment with alternative structures. We also explored features of the JAY270 and JAY291 genomes that help explain their high adaptation to industrial environments, exhibiting desirable phenotypes such as high ethanol and cell mass production and high temperature and oxidative stress tolerance. The genomic manipulation of such strains could enable the creation of a new generation of industrial organisms, ideally suited for use as delivery vehicles for future bioenergy technologies. PMID:19812109

  8. Genome structure of a Saccharomyces cerevisiae strain widely used in bioethanol production.

    PubMed

    Argueso, Juan Lucas; Carazzolle, Marcelo F; Mieczkowski, Piotr A; Duarte, Fabiana M; Netto, Osmar V C; Missawa, Silvia K; Galzerani, Felipe; Costa, Gustavo G L; Vidal, Ramon O; Noronha, Melline F; Dominska, Margaret; Andrietta, Maria G S; Andrietta, Sílvio R; Cunha, Anderson F; Gomes, Luiz H; Tavares, Flavio C A; Alcarde, André R; Dietrich, Fred S; McCusker, John H; Petes, Thomas D; Pereira, Gonçalo A G

    2009-12-01

    Bioethanol is a biofuel produced mainly from the fermentation of carbohydrates derived from agricultural feedstocks by the yeast Saccharomyces cerevisiae. One of the most widely adopted strains is PE-2, a heterothallic diploid naturally adapted to the sugar cane fermentation process used in Brazil. Here we report the molecular genetic analysis of a PE-2 derived diploid (JAY270), and the complete genome sequence of a haploid derivative (JAY291). The JAY270 genome is highly heterozygous (approximately 2 SNPs/kb) and has several structural polymorphisms between homologous chromosomes. These chromosomal rearrangements are confined to the peripheral regions of the chromosomes, with breakpoints within repetitive DNA sequences. Despite its complex karyotype, this diploid, when sporulated, had a high frequency of viable spores. Hybrid diploids formed by outcrossing with the laboratory strain S288c also displayed good spore viability. Thus, the rearrangements that exist near the ends of chromosomes do not impair meiosis, as they do not span regions that contain essential genes. This observation is consistent with a model in which the peripheral regions of chromosomes represent plastic domains of the genome that are free to recombine ectopically and experiment with alternative structures. We also explored features of the JAY270 and JAY291 genomes that help explain their high adaptation to industrial environments, exhibiting desirable phenotypes such as high ethanol and cell mass production and high temperature and oxidative stress tolerance. The genomic manipulation of such strains could enable the creation of a new generation of industrial organisms, ideally suited for use as delivery vehicles for future bioenergy technologies.

  9. Global MLST of Salmonella Typhi Revisited in Post-genomic Era: Genetic Conservation, Population Structure, and Comparative Genomics of Rare Sequence Types.

    PubMed

    Yap, Kien-Pong; Ho, Wing S; Gan, Han M; Chai, Lay C; Thong, Kwai L

    2016-01-01

    Typhoid fever, caused by Salmonella enterica serovar Typhi, remains an important public health burden in Southeast Asia and other endemic countries. Various genotyping methods have been applied to study the genetic variations of this human-restricted pathogen. Multilocus sequence typing (MLST) is one of the widely accepted methods, and recently there has been growing interest in the re-application of MLST in the post-genomic era. In this study, we provide the global MLST distribution of S. Typhi by utilizing 1,826 publicly available S. Typhi genome sequences in addition to performing conventional MLST on S. Typhi strains isolated from various endemic regions spanning over a century. Our global MLST analysis confirms the predominance of two sequence types (ST1 and ST2) co-existing in the endemic regions. Interestingly, S. Typhi strains with ST8 are currently confined within the African continent. Comparative genomic analyses of ST8 and other rare STs with genomes of ST1/ST2 revealed unique mutations in important virulence genes such as flhB, sipC, and tviD that may explain the variations that differentiate between seemingly successful (widespread) and unsuccessful (poor dissemination) S. Typhi populations. Large scale whole-genome phylogeny demonstrated evidence of phylogeographical structuring and showed that ST8 may have diverged from the earlier ancestral population of ST1 and ST2, which later lost some of its fitness advantages, leading to poor worldwide dissemination. In response to the unprecedented increase in genomic data, this study demonstrates and highlights the utility of large-scale genome-based MLST as a quick and effective approach to narrow the scope of in-depth comparative genomic analysis and consequently provide new insights into the fine scale of pathogen evolution and population structure. PMID:26973639
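
    Genome-based MLST, as described above, amounts to reading the allele present at each of a small set of housekeeping loci and looking up the resulting allele profile in a scheme that maps profiles to sequence types. The Python sketch below illustrates only that lookup step. The seven loci listed are those commonly used for Salmonella MLST and are included as an assumption, since the abstract does not name them; the allele numbers and the labels "ST-A" and "ST-B" are invented placeholders, not real scheme entries.

```python
# Loci commonly used for Salmonella MLST (an assumption; not named in the abstract).
LOCI = ("aroC", "dnaN", "hemD", "hisD", "purE", "sucA", "thrA")

# Hypothetical scheme: allele numbers and labels are placeholders for illustration.
EXAMPLE_SCHEME = {
    (1, 1, 1, 1, 1, 1, 1): "ST-A",
    (1, 1, 2, 1, 1, 1, 1): "ST-B",
}


def assign_sequence_type(profile, scheme):
    """Return the sequence type for an allele profile, or None if unknown.

    `profile` maps each locus name to its allele number; a profile that is
    missing from the scheme would represent a novel or untyped combination.
    """
    key = tuple(profile[locus] for locus in LOCI)
    return scheme.get(key)
```

    Because two profiles that differ at a single locus map to different sequence types, this lookup is fast enough to apply to thousands of genomes before any deeper comparative analysis is attempted.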

  10. A Structural Model of the Genome Packaging Process in a Membrane-Containing Double Stranded DNA Virus

    PubMed Central

    Hong, Chuan; Oksanen, Hanna M.; Liu, Xiangan; Jakana, Joanita; Bamford, Dennis H.; Chiu, Wah

    2014-01-01

    Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds) DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM) and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimers of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion. PMID:25514469

  11. The AGTSR consortium: An update

    SciTech Connect

    Fant, D.B.; Golan, L.P.

    1995-12-31

    The Advanced Gas Turbine Systems Research program is a nationwide consortium dedicated to advancing land-based gas turbine systems for improving future power generation capability. It directly supports the technology-research arm of the ATS program and targets industry-defined research needs in the areas of combustion, heat transfer, materials, aerodynamics, controls, alternative fuels, and advanced cycles. It is organized to enhance U.S. competitiveness through close collaboration with universities, government, and industry at the R&D level. AGTSR is just finishing its third year of operation; it is scheduled to continue past the year 2000. This update reviews the AGTSR triad, which consists of university/industry R&D activities, technology transfer programs, and trial student programs.

  13. Genomic Structure of an Economically Important Cyanobacterium, Arthrospira (Spirulina) platensis NIES-39

    PubMed Central

    Fujisawa, Takatomo; Narikawa, Rei; Okamoto, Shinobu; Ehira, Shigeki; Yoshimura, Hidehisa; Suzuki, Iwane; Masuda, Tatsuru; Mochimaru, Mari; Takaichi, Shinichi; Awai, Koichiro; Sekine, Mitsuo; Horikawa, Hiroshi; Yashiro, Isao; Omata, Seiha; Takarada, Hiromi; Katano, Yoko; Kosugi, Hiroki; Tanikawa, Satoshi; Ohmori, Kazuko; Sato, Naoki; Ikeuchi, Masahiko; Fujita, Nobuyuki; Ohmori, Masayuki

    2010-01-01

    A filamentous non-N2-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. Almost the complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total of 612 kb of the genome comprises group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were determined. Unique features in the gene composition were noted, particularly in a large number of genes for adenylate cyclase and haemolysin-like Ca2+-binding proteins and in chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis. PMID:20203057

  14. Infer Metagenomic Abundance and Reveal Homologous Genomes Based on the Structure of Taxonomy Tree.

    PubMed

    Qiu, Yu-Qing; Tian, Xue; Zhang, Shihua

    2015-01-01

    Metagenomic research uses sequencing technologies to investigate the genetic biodiversity of microbiomes present in various ecosystems or animal tissues. The composition of a microbial community is highly associated with the environment in which the organisms exist. As large amounts of short sequencing reads from microorganism genomes are obtained, accurately estimating the abundance of microorganisms within a metagenomic sample is becoming an increasing challenge in bioinformatics. In this paper, we describe a hierarchical taxonomy tree-based mixture model (HTTMM) for estimating the abundance of taxa within a microbial community by incorporating the structure of the taxonomy tree. In this model, genome-specific short reads and homologous short reads among genomes can be distinguished and represented by leaf and intermediate nodes in the taxonomy tree, respectively. We adopt an expectation-maximization algorithm to solve this model. Using simulated and real-world data, we demonstrate that the proposed method is superior to both flat mixture model and lowest common ancestry-based methods. Moreover, this model can reveal previously unaddressed homologous genomes.
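
    The expectation-maximization step mentioned in this abstract can be illustrated with a minimal sketch. It implements only the simpler flat mixture model that the abstract names as a baseline, not the authors' HTTMM; the function name, the likelihood matrix layout, and the toy reads are assumptions made for illustration.

```python
def em_abundance(likelihoods, n_iter=200, tol=1e-9):
    """Estimate taxon proportions with EM under a flat mixture model.

    likelihoods[i][j] is the (hypothetical) likelihood that read i
    came from taxon j; a value of 0 means the read cannot map there.
    """
    n_taxa = len(likelihoods[0])
    pi = [1.0 / n_taxa] * n_taxa  # start from uniform abundances
    for _ in range(n_iter):
        counts = [0.0] * n_taxa
        # E-step: split each read among taxa in proportion to pi * likelihood
        for row in likelihoods:
            weights = [p * l for p, l in zip(pi, row)]
            total = sum(weights)
            if total == 0.0:
                continue  # read is unexplained by every taxon; skip it
            for j, w in enumerate(weights):
                counts[j] += w / total
        assigned = sum(counts)
        if assigned == 0.0:
            raise ValueError("no read can be assigned to any taxon")
        # M-step: new abundances are the normalized expected read counts
        new_pi = [c / assigned for c in counts]
        shift = max(abs(a - b) for a, b in zip(pi, new_pi))
        pi = new_pi
        if shift < tol:
            break
    return pi


# Toy data (invented): 6 reads unique to taxon A, 2 unique to taxon B,
# and 2 homologous reads that map equally well to both.
reads = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 2 + [[1.0, 1.0]] * 2
abundance = em_abundance(reads)
print([round(p, 3) for p in abundance])
```

    In this toy case the shared reads are divided in proportion to the current abundance estimates, so the estimate settles near 0.75 for taxon A and 0.25 for taxon B. A flat model of this kind cannot say where in a taxonomy the shared reads belong, which is the gap the abstract says the tree-based model addresses.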

  15. Transposon Insertions, Structural Variations, and SNPs Contribute to the Evolution of the Melon Genome.

    PubMed

    Sanseverino, Walter; Hénaff, Elizabeth; Vives, Cristina; Pinosio, Sara; Burgos-Paz, William; Morgante, Michele; Ramos-Onsins, Sebastián E; Garcia-Mas, Jordi; Casacuberta, Josep Maria

    2015-10-01

    The availability of extensive databases of crop genome sequences should allow analysis of crop variability at an unprecedented scale, which should have an important impact in plant breeding. However, up to now the analysis of genetic variability at the whole-genome scale has been mainly restricted to single nucleotide polymorphisms (SNPs). This is a strong limitation as structural variation (SV) and transposon insertion polymorphisms are frequent in plant species and have had an important mutational role in crop domestication and breeding. Here, we present the first comprehensive analysis of melon genetic diversity, which includes a detailed analysis of SNPs, SV, and transposon insertion polymorphisms. The variability found among seven melon varieties, representing the species diversity and including wild accessions and highly bred lines, is relatively high due in part to the marked divergence of some lineages. The diversity is distributed nonuniformly across the genome, being lower at the extremes of the chromosomes and higher in the pericentromeric regions, which is compatible with the effect of purifying selection and recombination forces over functional regions. Additionally, this variability is greatly reduced among elite varieties, probably due to selection during breeding. We have found some chromosomal regions showing a high differentiation of the elite varieties versus the rest, which could be considered as strongly selected candidate regions. Our data also suggest that transposons and SV may be at the origin of an important fraction of the variability in melon, which highlights the importance of analyzing all types of genetic variability to understand crop genome evolution. PMID:26174143

  16. Genome-based discovery, structure prediction and functional analysis of cyclic lipopeptide antibiotics in Pseudomonas species.

    PubMed

    de Bruijn, Irene; de Kock, Maarten J D; Yang, Meng; de Waard, Pieter; van Beek, Teris A; Raaijmakers, Jos M

    2007-01-01

    Analysis of microbial genome sequences has revealed numerous genes involved in antibiotic biosynthesis. In Pseudomonads, several gene clusters encoding non-ribosomal peptide synthetases (NRPSs) were predicted to be involved in the synthesis of cyclic lipopeptide (CLP) antibiotics. Most of these predictions, however, are untested and the association between genome sequence and biological function of the predicted metabolite is lacking. Here we report the genome-based identification of previously unknown CLP gene clusters in plant pathogenic Pseudomonas syringae strains B728a and DC3000 and in plant beneficial Pseudomonas fluorescens Pf0-1 and SBW25. For P. fluorescens SBW25, a model strain in studying bacterial evolution and adaptation, the structure of the CLP with a predicted 9-amino acid peptide moiety was confirmed by chemical analyses. Mutagenesis confirmed that the three identified NRPS genes are essential for CLP synthesis in strain SBW25. CLP production was shown to play a key role in motility, biofilm formation and in activity of SBW25 against zoospores of Phytophthora infestans. This is the first time that an antimicrobial metabolite is identified from strain SBW25. The results indicate that genome mining may enable the discovery of unknown gene clusters and traits that are highly relevant in the lifestyle of plant beneficial and plant pathogenic bacteria.

  17. Joint modeling of RNase footprint sequencing profiles for genome-wide inference of RNA structure

    PubMed Central

    Zou, Chenchen; Ouyang, Zhengqing

    2015-01-01

    Recent studies have revealed significant roles of RNA structure in almost every step of RNA processing, including transcription, splicing, transport and translation. RNase footprint sequencing (RNase-seq) has emerged to dissect RNA structures at the genome scale. However, it remains challenging to analyze RNase-seq data because of the issues of signal sparsity, variability and correlations among various RNases. We present a probabilistic framework, joint Poisson-gamma mixture (JPGM), for integrative modeling of multiple RNase-seq profiles. Combining JPGM with hidden Markov model allows genome-wide inference of RNA structures. We apply the joint modeling approach for inferring base pairing states on simulated data sets and RNase-seq profiles of the double-strand specific RNase V1 and single-strand specific RNase S1 in yeast. We demonstrate that joint analysis of V1 and S1 profiles outputs interpretable RNA structure states, while approaches that analyze each profile separately do not. The joint modeling approach predicts the structure states of all nucleotides in 3196 transcripts of yeast without compromising accuracy, while the simple thresholding approach misses 43% of the nucleotides. Furthermore, the posterior probabilities outputted by our model are able to resolve the structural ambiguity of ≈300 000 nucleotides with overlapping V1 and S1 cleavage sites. Our model also generates RNA accessibilities, which are associated with three-dimensional conformations. PMID:26400167

  18. Structure-infectivity analysis of the human rhinovirus genomic RNA 3' non-coding region.

    PubMed Central

    Todd, S; Semler, B L

    1996-01-01

    The specific recognition of genomic positive strand RNAs as templates for the synthesis of intermediate negative strands by the picornavirus replication machinery is presumably mediated by cis-acting sequences within the genomic RNA 3' non-coding region (NCR). A structure-infectivity analysis was conducted on the 44 nt human rhinovirus 14 (HRV14) 3' NCR to identify the primary sequence and/or secondary structure determinants required for viral replication. Using biochemical RNA secondary structure probing techniques, we have demonstrated the existence of a single stem-loop structure contained entirely within the 3' NCR, which appears to be phylogenetically conserved within the rhinovirus genus. We also report the in vivo analysis of a number of 3' NCR deletion mutations engineered into infectious cDNA clones which were designed to disrupt the stem-loop secondary structure to varying degrees. Large deletions (up to 37 nt) resulted in defective growth phenotypes, although they were not lethal. We propose that the absolute requirements for initiation of negative strand synthesis are less stringent than previously postulated, even though defined RNA secondary structure determinants may have evolved to facilitate and/or regulate the process of viral RNA replication. PMID:8668546

  19. In silico prediction and screening of modular crystal structures via a high-throughput genomic approach

    PubMed Central

    Li, Yi; Li, Xu; Liu, Jiancong; Duan, Fangzheng; Yu, Jihong

    2015-01-01

    High-throughput computational methods capable of predicting, evaluating and identifying promising synthetic candidates with desired properties are highly appealing to today's scientists. Despite some successes, in silico design of crystalline materials with complex three-dimensionally extended structures remains challenging. Here we demonstrate the application of a new genomic approach to ABC-6 zeolites, a family of industrially important catalysts whose structures are built from the stacking of modular six-ring layers. The sequences of layer stacking, which we deem the genes of this family, determine the structures and the properties of ABC-6 zeolites. By enumerating these gene-like stacking sequences, we have identified 1,127 most realizable new ABC-6 structures out of 78 groups of 84,292 theoretical ones, and experimentally realized 2 of them. Our genomic approach can extract crucial structural information directly from these gene-like stacking sequences, enabling high-throughput identification of synthetic targets with desired properties among a large number of candidate structures. PMID:26395233

  1. Genomic organization of the crested ibis MHC provides new insight into ancestral avian MHC structure

    PubMed Central

    Chen, Li-Cheng; Lan, Hong; Sun, Li; Deng, Yan-Li; Tang, Ke-Yi; Wan, Qiu-Hong

    2015-01-01

    The major histocompatibility complex (MHC) plays an important role in immune response. Avian MHCs are not well characterized, with reports limited to the highly compact Galliformes MHCs and the extensively fragmented zebra finch MHC. We report the first genomic structure of an endangered Pelecaniformes (crested ibis) MHC containing 54 genes in three regions spanning ~500 kb. In contrast to the loose BG (26 loci within 265 kb) and Class I (11 within 150) genomic structures, the Core Region is condensed (17 within 85). Furthermore, this Region exhibits a COL11A2 gene, followed by four tandem MHC class II αβ dyads retaining two suites of anciently duplicated “αβ” lineages. Thus, the crested ibis MHC structure is entirely different from the known avian MHC architectures but similar to that of mammalian MHCs, suggesting that the fundamental structure of ancestral avian class II MHCs should be “COL11A2-IIαβ1-IIαβ2.” The gene structures, residue characteristics, and expression levels of the five class I genes reveal inter-locus functional divergence. However, phylogenetic analysis indicates that these five genes generate a well-supported intra-species clade, showing evidence for recent duplications. Our analyses suggest dramatic structural variation among avian MHC lineages, help elucidate avian MHC evolution, and provide a foundation for future conservation studies. PMID:25608659
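
    The "loose" versus "condensed" contrast in this abstract can be made concrete by dividing each region's locus count by its span. The short sketch below does that arithmetic; it assumes the spans given without units (150 and 85) are in kb like the first one, and the variable and function names are illustrative only.

```python
# (loci, span in kb) per region, taken from the abstract; the kb unit for
# Class I and the Core Region is an assumption inferred from context.
regions = {
    "BG": (26, 265),
    "Class I": (11, 150),
    "Core Region": (17, 85),
}


def gene_density(n_loci, span_kb):
    """Return loci per kb for one region."""
    return n_loci / span_kb


densities = {name: gene_density(*values) for name, values in regions.items()}
total_loci = sum(n for n, _ in regions.values())
total_span_kb = sum(span for _, span in regions.values())

for name, density in densities.items():
    print(f"{name}: {density:.3f} loci/kb")
print(f"total: {total_loci} loci over {total_span_kb} kb")
```

    Under that unit assumption the three regions add up to the 54 genes and ~500 kb quoted in the abstract, and the Core Region comes out at about twice the density of BG and nearly three times that of Class I.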

  2. The complete mitochondrial genome sequence of the tubeworm Lamellibrachia satsuma and structural conservation in the mitochondrial genome control regions of Order Sabellida.

    PubMed

    Patra, Ajit Kumar; Kwon, Yong Min; Kang, Sung Gyun; Fujiwara, Yoshihiro; Kim, Sang-Jin

    2016-04-01

    The control region of the mitochondrial genomes shows high variation in conserved sequence organizations, which follow distinct evolutionary patterns in different species or taxa. In this study, we sequenced the complete mitochondrial genome of Lamellibrachia satsuma from the cold-seep region of Kagoshima Bay, as part of a whole-genome study, and extensively studied the structural features and patterns of the control region sequences. We obtained 15,037 bp of mitochondrial genome using Illumina sequencing and identified the non-coding AT-rich region or control region (354 bp, AT=83.9%) located between trnH and trnR. We found 7 conserved sequence blocks (CSB), scattered throughout the control region of L. satsuma and other taxa of Annelida. The poly-TA stretches, which commonly form the stem of multiple stem-loop structures, are most conserved in the CSB-I and CSB-II regions. The mitochondrial genome of L. satsuma encodes a unique repetitive sequence in the control region, which forms a unique secondary structure in comparison to Lamellibrachia luymesi. Phylogenetic analyses of all protein-coding genes indicate that L. satsuma forms a monophyletic clade with L. luymesi along with other tubeworms found in cold-seep regions (genera: Lamellibrachia, Escarpia, and Seepiophila). In general, the control region sequences of Annelida could be aligned with certainty within each genus, and to some extent within the family, but with a higher rate of variation in conserved regions. PMID:26776396
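
    The AT content quoted for the control region (354 bp, AT=83.9%) is a simple base-composition percentage. A minimal sketch of that calculation follows; the function name and the toy sequences are hypothetical and are not the L. satsuma sequence.

```python
def at_content(sequence):
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    at_bases = seq.count("A") + seq.count("T")
    return 100.0 * at_bases / len(seq)


# Toy example: 4 of 6 bases are A or T.
toy_percent = at_content("ATATGC")

# Arithmetic check only: 297 A/T bases in a 354 bp stretch would round
# to the 83.9% figure. The count of 297 is inferred, not reported.
inferred_percent = at_content("A" * 297 + "G" * 57)
print(round(toy_percent, 1), round(inferred_percent, 1))
```

    The second call only shows that roughly 297 A or T bases out of 354 is consistent with the reported percentage; the abstract itself does not give the base counts.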

  3. Protein production from the structural genomics perspective: achievements and future needs

    PubMed Central

    Almo, Steven C; Garforth, Scott J; Hillerich, Brandan S; Love, James D; Seidel, Ronald D; Burley, Stephen K

    2014-01-01

    Despite a multitude of recent technical breakthroughs speeding high-resolution structural analysis of biological macromolecules, production of sufficient quantities of well-behaved, active protein continues to represent the rate-limiting step in many structure determination efforts. These challenges are only amplified when considered in the context of ongoing structural genomics efforts, which are now contending with multi-domain eukaryotic proteins, secreted proteins, and ever-larger macromolecular assemblies. Exciting new developments in eukaryotic expression platforms, including insect and mammalian-based systems, promise enhanced opportunities for structural approaches to some of the most important biological problems. Development and implementation of automated eukaryotic expression techniques promises to significantly improve production of materials for structural, functional, and biomedical research applications. PMID:23642905

  4. Genome-Wide Probing of RNA Structures In Vitro Using Nucleases and Deep Sequencing.

    PubMed

    Wan, Yue; Qu, Kun; Ouyang, Zhengqing; Chang, Howard Y

    2016-01-01

    RNA structure probing is an important technique that studies the secondary and tertiary conformations of an RNA. While it was traditionally performed on one RNA at a time, recent advances in deep sequencing have enabled the secondary structure mapping of thousands of RNAs simultaneously. Here, we describe the method Parallel Analysis for RNA Structures (PARS), which couples double and single strand specific nuclease probing to high throughput sequencing. Upon cloning of the cleavage sites into a cDNA library, deep sequencing and mapping of reads to the transcriptome, the position of paired and unpaired bases along cellular RNAs can be identified. PARS can be performed under diverse solution conditions and on different organismal RNAs to provide genome-wide RNA structural information. This information can also be further used to constrain computational predictions to provide better RNA structure models under different conditions. PMID:26483021

  5. Structural genomics: keeping up with expanding knowledge of the protein universe.

    PubMed

    Grabowski, Marek; Joachimiak, Andrzej; Otwinowski, Zbyszek; Minor, Wladek

    2007-06-01

    Structural characterization of the protein universe is the main mission of Structural Genomics (SG) programs. However, progress in gene sequencing technology, set in motion in the 1990s, has resulted in rapid expansion of protein sequence space--a twelvefold increase in the past seven years. For the SG field, this creates new challenges and necessitates a re-assessment of its strategies. Nevertheless, despite the growth of sequence space, at present nearly half of the content of the Swiss-Prot database and over 40% of Pfam protein families can be structurally modeled based on structures determined so far, with SG projects making an increasingly significant contribution. The SG contribution of new Pfam structures nearly doubled from 27.2% in 2003 to 51.6% in 2006.
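
    The "twelvefold increase in the past seven years" can be converted into an equivalent yearly growth factor and doubling time. The sketch below does so under the assumption, not stated in the abstract, that growth was steadily exponential over the period.

```python
import math

fold_increase = 12.0  # from the abstract
years = 7.0           # from the abstract

# Assuming constant exponential growth: factor ** years == fold_increase
annual_factor = fold_increase ** (1.0 / years)
doubling_time_years = years * math.log(2.0) / math.log(fold_increase)

print(f"annual growth factor: {annual_factor:.3f}")
print(f"doubling time: {doubling_time_years:.2f} years")
```

    Under that assumption, sequence space grew by a little over 40% per year and doubled in just under two years, which gives a sense of the pace structural genomics programs were trying to match.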

  6. Population genomic structure and linkage disequilibrium analysis of South African goat breeds using genome-wide SNP data.

    PubMed

    Mdladla, K; Dzomba, E F; Huson, H J; Muchadeyi, F C

    2016-08-01

    The sustainability of goat farming in marginal areas of southern Africa depends on local breeds that are adapted to specific agro-ecological conditions. Unimproved non-descript goats are the main genetic resources used for the development of commercial meat-type breeds of South Africa. Little is known about genetic diversity and the genetics of adaptation of these indigenous goat populations. This study investigated the genetic diversity, population structure and breed relations, linkage disequilibrium, effective population size and persistence of gametic phase in goat populations of South Africa. Three locally developed meat-type breeds of the Boer (n = 33), Savanna (n = 31), Kalahari Red (n = 40), a feral breed of Tankwa (n = 25) and unimproved non-descript village ecotypes (n = 110) from four goat-producing provinces of the Eastern Cape, KwaZulu-Natal, Limpopo and North West were assessed using the Illumina Goat 50K SNP Bead Chip assay. The proportion of SNPs with minor allele frequencies >0.05 ranged from 84.22% in the Tankwa to 97.58% in the Xhosa ecotype, with a mean of 0.32 ± 0.13 across populations. Principal components analysis, admixture and pairwise FST identified Tankwa as a genetically distinct population and supported clustering of the populations according to their historical origins. Genome-wide FST identified 101 markers potentially under positive selection in the Tankwa. Average linkage disequilibrium was highest in the Tankwa (r² = 0.25 ± 0.26) and lowest in the village ecotypes (r² range = 0.09 ± 0.12 to 0.11 ± 0.14). We observed an effective population size of <150 for all populations 13 generations ago. The estimated correlations for all breed pairs were lower than 0.80 at marker distances >100 kb with the exception of those in Savanna and Tswana populations. This study highlights the high level of genetic diversity in South African indigenous goats as well as the utility of the genome-wide SNP marker panels in

  8. Genomics and the Human Genome Project: implications for psychiatry.

    PubMed

    Kelsoe, John R

    2004-11-01

    In the past decade the Human Genome Project has made extraordinary strides in understanding of fundamental human genetics. The complete human genetic sequence has been determined, and the chromosomal location of almost all human genes identified. Presently, a large international consortium, the HapMap Project, is working to identify a large portion of genetic variation in different human populations and the structure and relationship of these variants to each other. The Human Genome Project has approached human genetics on a scale not previously seen in biology. This has been made possible by dramatic advances in high throughput technology and bio-informatics. Tools such as gene chips and micro-arrays have spawned an entirely new strategy to examine the function and expression of genes in a massively parallel fashion. Together these tools have dramatically advanced our knowledge about the human genome. They promise powerful new approaches to complex genetic traits such as psychiatric illness. The goals and progress of the Human Genome Project and the technology involved are reviewed. The implications of this science for psychiatric genetics are discussed.

  9. SUNrises on the International Plant Nucleus Consortium

    PubMed Central

    Graumann, Katja; Bass, Hank W.; Parry, Geraint

    2013-01-01

    The nuclear periphery is a dynamic, structured environment, whose precise functions are essential for global processes—from nuclear, to cellular, to organismal. Its main components—the nuclear envelope (NE) with inner and outer nuclear membranes (INM and ONM), nuclear pore complexes (NPC), associated cytoskeletal and nucleoskeletal components as well as chromatin are conserved across eukaryotes (Fig. 1). In metazoans in particular, the structure and functions of nuclear periphery components are intensely researched partly because of their involvement in various human diseases. While far less is known about these in plants, the last few years have seen a significant increase in research activity in this area. Plant biologists are not only catching up with the animal field, but recent findings are pushing our advances in this field globally. In recognition of this developing field, the Annual Society of Experimental Biology Meeting in Salzburg kindly hosted a session co-organized by Katja Graumann and David E. Evans (Oxford Brookes University) highlighting new insights into plant nuclear envelope proteins and their interactions. This session brought together leading researchers with expertise in topics such as epigenetics, meiosis, nuclear pore structure and functions, nucleoskeleton and nuclear envelope composition. An open and friendly exchange of ideas was fundamental to the success of the meeting, which resulted in founding the International Plant Nucleus Consortium. This review highlights new developments in plant nuclear envelope research presented at the conference and their importance for the wider understanding of metazoan, yeast and plant nuclear envelope functions and properties. PMID:23324458

  10. A 4-gigabase physical map unlocks the structure and evolution of the complex genome of Aegilops tauschii, the wheat D-genome progenitor

    PubMed Central

    Luo, Ming-Cheng; Gu, Yong Q.; You, Frank M.; Deal, Karin R.; Ma, Yaqin; Hu, Yuqin; Huo, Naxin; Wang, Yi; Wang, Jirui; Chen, Shiyong; Jorgensen, Chad M.; Zhang, Yong; McGuire, Patrick E.; Pasternak, Shiran; Stein, Joshua C.; Ware, Doreen; Kramer, Melissa; McCombie, W. Richard; Kianian, Shahryar F.; Martis, Mihaela M.; Mayer, Klaus F. X.; Sehgal, Sunish K.; Li, Wanlong; Gill, Bikram S.; Bevan, Michael W.; Šimková, Hana; Doležel, Jaroslav; Weining, Song; Lazo, Gerard R.; Anderson, Olin D.; Dvorak, Jan

    2013-01-01

    The current limitations in genome sequencing technology require the construction of physical maps for high-quality draft sequences of large plant genomes, such as that of Aegilops tauschii, the wheat D-genome progenitor. To construct a physical map of the Ae. tauschii genome, we fingerprinted 461,706 bacterial artificial chromosome clones, assembled contigs, designed a 10K Ae. tauschii Infinium SNP array, constructed a 7,185-marker genetic map, and anchored on the map contigs totaling 4.03 Gb. Using whole genome shotgun reads, we extended the SNP marker sequences and found 17,093 genes and gene fragments. We showed that collinearity of the Ae. tauschii genes with Brachypodium distachyon, rice, and sorghum decreased with phylogenetic distance and that structural genome evolution rates have been high across all investigated lineages in subfamily Pooideae, including that of Brachypodieae. We obtained additional information about the evolution of the seven Triticeae chromosomes from 12 ancestral chromosomes and uncovered a pattern of centromere inactivation accompanying nested chromosome insertions in grasses. We showed that the density of noncollinear genes along the Ae. tauschii chromosomes positively correlates with recombination rates, suggested a cause, and showed that new genes, exemplified by disease resistance genes, are preferentially located in high-recombination chromosome regions. PMID:23610408

  12. Novel proteases from the genome of the carnivorous plant Drosera capensis: Structural prediction and comparative analysis.

    PubMed

    Butts, Carter T; Bierma, Jan C; Martin, Rachel W

    2016-10-01

    In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a "ferment" similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. Proteins 2016; 84:1517-1533. © 2016 Wiley Periodicals, Inc. PMID:27353064

  13. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    PubMed

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remain largely uncharacterized. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SV-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms have preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  14. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma

    PubMed Central

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-01-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remain largely uncharacterized. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SV-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms have preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  15. Structure and genome release of Twort-like Myoviridae phage with a double-layered baseplate

    PubMed Central

    Nováček, Jiří; Šiborová, Marta; Benešík, Martin; Pantůček, Roman; Doškař, Jiří; Plevka, Pavel

    2016-01-01

    Bacteriophages from the family Myoviridae use double-layered contractile tails to infect bacteria. Contraction of the tail sheath enables the tail tube to penetrate through the bacterial cell wall and serve as a channel for the transport of the phage genome into the cytoplasm. However, the mechanisms controlling the tail contraction and genome release of phages with “double-layered” baseplates were unknown. We used cryo-electron microscopy to show that the binding of the Twort-like phage phi812 to the Staphylococcus aureus cell wall requires a 210° rotation of the heterohexameric receptor-binding and tripod protein complexes within its baseplate about an axis perpendicular to the sixfold axis of the tail. This rotation reorients the receptor-binding proteins to point away from the phage head, and also results in disruption of the interaction of the tripod proteins with the tail sheath, hence triggering its contraction. However, the tail sheath contraction of Myoviridae phages is not sufficient to induce genome ejection. We show that the end of the phi812 double-stranded DNA genome is bound to one protein subunit from a connector complex that also forms an interface between the phage head and tail. The tail sheath contraction induces conformational changes of the neck and connector that result in disruption of the DNA binding. The genome penetrates into the neck, but is stopped at a bottleneck before the tail tube. A subsequent structural change of the tail tube induced by its interaction with the S. aureus cell is required for the genome’s release. PMID:27469164

  16. Dinoflagellate Gene Structure and Intron Splice Sites in a Genomic Tandem Array.

    PubMed

    Mendez, Gregory S; Delwiche, Charles F; Apt, Kirk E; Lippmeier, J Casey

    2015-01-01

    Dinoflagellates are one of the last major lineages of eukaryotes for which little is known about genome structure and organization. We report here the sequence and gene structure of a clone isolated from a cosmid library which, to our knowledge, represents the largest contiguously sequenced dinoflagellate genomic tandem gene array. These data, combined with information from a large transcriptomic library, allowed a high level of confidence in every base pair call. This degree of confidence is not possible with PCR-based contigs. The sequence contains an intron-rich set of five highly expressed gene repeats arranged in tandem. One of the tandem repeat gene members contains an intron 26,372 bp long. This study characterizes a splice site consensus sequence for dinoflagellate introns. Two to nine base pairs around the 3' splice site are repeated by an identical two to nine base pairs around the 5' splice site. The 5' and 3' splice sites are in the same locations within each repeat so that the repeat is found only once in the mature mRNA. This identically repeated intron boundary sequence might be useful in gene modeling and annotation of genomes.
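
    The repeated intron boundary described in this abstract can be illustrated with a minimal sketch. It assumes a toy genomic string and an annotated intron given as a half-open interval; the function names, the toy sequence, and the 9 bp cap are illustrative assumptions, not the authors' annotation method.

```python
def splice(genomic, start, end):
    """Remove the intron genomic[start:end] and return the spliced sequence."""
    return genomic[:start] + genomic[end:]


def boundary_repeat(genomic, start, end, max_len=9):
    """Return the sequence repeated at both boundaries of intron genomic[start:end].

    The repeat is found by extending a match in both directions between the
    5' boundary (around `start`) and the 3' boundary (around `end`).
    """
    # Extend to the right: first intron bases vs. first bases after the intron.
    right = 0
    while (right < max_len
           and start + right < end
           and end + right < len(genomic)
           and genomic[start + right] == genomic[end + right]):
        right += 1
    # Extend to the left: bases before the intron vs. last bases of the intron.
    left = 0
    while (left + right < max_len
           and start - left - 1 >= 0
           and end - left - 1 >= start
           and genomic[start - left - 1] == genomic[end - left - 1]):
        left += 1
    return genomic[start - left:start + right]


# Hypothetical toy locus: exon1 + repeat + intron core + repeat + exon2.
exon1, repeat, core, exon2 = "ATGGCA", "AGGT", "CCCCTTTTGGGG", "GATTAA"
genomic = exon1 + repeat + core + repeat + exon2
start = len(exon1) + len(repeat)          # intron annotated after the first repeat
end = start + len(core) + len(repeat)     # intron ends after the second repeat

found = boundary_repeat(genomic, start, end)
# Shifting both splice sites across the repeat leaves the mature mRNA unchanged.
same_mrna = splice(genomic, start, end) == splice(genomic, start - len(found), end - len(found))
print(found, same_mrna)
```

    Because the two boundaries carry the same bases, any pair of splice sites shifted by the same offset within the repeat yields the same mature sequence, which is why the repeat appears only once in the mRNA.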

  17. Genomic and supragenomic structure of the nucleotide-like G-protein-coupled receptor GPR34.

    PubMed

    Engemaier, Eva; Römpler, Holger; Schöneberg, Torsten; Schulz, Angela

    2006-02-01

    Directed cloning approaches and large-scale sequencing of several vertebrate genomes unveiled many new members of the G-protein-coupled receptor (GPCR) superfamily, among them GPR34. Initial studies showed that GPR34 is an evolutionarily old GPCR structurally related to a group of ADP-like receptors. To gain insight into the genomic organization, regulation of expression, and supragenomic diversification of GPR34, several vertebrate species were analyzed. In contrast to the obviously intronless coding region, GPR34 displays an evolutionarily preserved 5' noncoding intron-exon structure. Further, an alternatively used cryptic intron was identified within the coding region, which shortens the N terminus by 47 amino acids. Ubiquitous expression of GPR34 is driven by genomic sequences upstream of at least two transcriptional start regions in mouse and rat but only one region in human. In rodents, both promoters are active in all tissues investigated, but the level of activity is tissue-specific. At the translational level, several conserved in-frame AUGs within the first 150 bp of the coding region may serve as start points for translation in human and other mammals. Combinatory mutagenesis and expression of reporter constructs confirmed these multiple translational start points and revealed a preference for the second in-frame AUG in human GPR34. Our data show that multiple translation initiation starts and alternative splicing contribute to the supragenomic diversification of GPR34. PMID:16338117
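
    The idea of candidate in-frame start codons within the first 150 bp of a coding region can be sketched as follows. This is only an illustration: the function name and the toy sequence are assumptions, and the sequence is not the GPR34 coding region.

```python
def in_frame_starts(cds, window=150):
    """Return 0-based positions of in-frame ATG/AUG codons in the first `window` bases.

    `cds` is assumed to begin at the first start codon, so frame 0 is the
    reading frame. RNA input (AUG) is accepted by mapping U to T.
    """
    seq = cds.upper().replace("U", "T")
    limit = min(window, len(seq))
    # Step through complete codons in frame 0 only.
    return [i for i in range(0, limit - 2, 3) if seq[i:i + 3] == "ATG"]


# Hypothetical toy sequence: ATG at 0 and 6 are in frame; ATG at 11 is not.
toy_cds = "ATGAAAATGCCATGA"
print(in_frame_starts(toy_cds))        # in-frame candidates within 150 bp
print(in_frame_starts(toy_cds, 6))     # restrict the window to the first 6 bases
```

    A scan like this only lists candidates; which AUG is actually used has to be established experimentally, as the abstract describes for the second in-frame AUG in human GPR34.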

  18. Anti-infectious drug repurposing using an integrated chemical genomics and structural systems biology approach.

    PubMed

    Ng, Clara; Hauptman, Ruth; Zhang, Yinliang; Bourne, Philip E; Xie, Lei

    2014-01-01

    The emergence of multi-drug and extensive drug resistance of microbes to antibiotics poses a great threat to human health. Although drug repurposing is a promising solution for accelerating the drug development process, its application to anti-infectious drug discovery is limited by the scope of existing phenotype-, ligand-, or target-based methods. In this paper we introduce a new computational strategy to determine the genome-wide molecular targets of bioactive compounds in both human and bacterial genomes. Our method is based on the use of a novel algorithm, ligand Enrichment of Network Topological Similarity (ligENTS), to map the chemical universe to its global pharmacological space. ligENTS outperforms the state-of-the-art algorithms in identifying novel drug-target relationships. Furthermore, we integrate ligENTS with our structural systems biology platform to identify drug repurposing opportunities via target similarity profiling. Using this integrated strategy, we have identified novel P. falciparum targets of drug-like active compounds from the Malaria Box, and suggest that a number of approved drugs may be active against malaria. This study demonstrates the potential of an integrative chemical genomics and structural systems biology approach to drug repurposing.

  19. Matrix attachment regions and structural colinearity in the genomes of two grass species.

    PubMed Central

    Avramova, Z; Tikhonov, A; Chen, M; Bennetzen, J L

    1998-01-01

    In order to gain insights into the relationship between spatial organization of the genome and genome function, we have initiated studies of the co-linear Sh2/A1-homologous regions of rice (30 kb) and sorghum (50 kb). We have identified the locations of matrix attachment regions (MARs) in these homologous chromosome segments, which could serve as anchors for individual structural units or loops. Despite the fact that the nucleotide sequences serving as MARs were not detectably conserved, the general organizational patterns of MARs relative to the neighboring genes were preserved. All identified genes were placed in individual loops that were of comparable size for homologous genes. Hence, gene composition, gene orientation, gene order and the placement of genes into structural units have been evolutionarily conserved in this region. Our analysis demonstrated that the occurrence of various 'MAR motifs' is not indicative of MAR location. However, most of the MARs discovered in the two genomic regions were found to co-localize with miniature inverted repeat transposable elements (MITEs), suggesting that MITEs preferentially insert near MARs and/or that they can serve as MARs. PMID:9443968

  20. Genomic instability: Crossing pathways at the origin of structural and numerical chromosome changes.

    PubMed

    Russo, Antonella; Pacchierotti, Francesca; Cimini, Daniela; Ganem, Neil J; Genescà, Anna; Natarajan, Adayapalam T; Pavanello, Sofia; Valle, Giorgio; Degrassi, Francesca

    2015-08-01

    Genomic instability leads to a wide spectrum of genetic changes, including single nucleotide mutations, structural chromosome alterations, and numerical chromosome changes. The accepted view on how these events are generated predicts that separate cellular mechanisms and genetic events explain the occurrence of these types of genetic variation. Recently, new findings have shed light on the complexity of the mechanisms leading to structural and numerical chromosome aberrations, their intertwining pathways, and their dynamic evolution, in somatic as well as in germ cells. In this review, we present a critical analysis of recent discoveries in this area, with the aim of contributing to a deeper knowledge of the molecular networks leading to adverse outcomes in humans following exposure to environmental factors. The review illustrates how several technological advances, including DNA sequencing methods, bioinformatics, and live-cell imaging approaches, have contributed to a renewed concept of the mechanisms causing genomic instability. Special attention is also given to the specific pathways causing genomic instability in mammalian germ cells. Remarkably, the same scenario emerged from some pioneering studies published in the 1980s to 1990s, when the evolution of polyploidy, the chromosomal effects of spindle poisons, and the fate of micronuclei were intuitively proposed to share mechanisms and pathways. Thus, an old working hypothesis has eventually found proper validation.

  1. Physical mapping and genomic structure of the human TNFR2 gene

    SciTech Connect

    Beltinger, C.P.; White, P.S.; Maris, J.M.

    1996-07-01

    The tumor necrosis factor receptor 2 (TNFR2) gene localizes to 1p36.2, a genomic region characteristically deleted in neuroblastomas and other malignancies. In addition, TNFR2 is the principal mediator of the effects of TNF on cellular immunity, and it may cooperate with TNFR1 in the killing of nonlymphoid cells. Therefore, we undertook an analysis of the genomic structure and precise physical mapping of this gene. The TNFR2 gene is contained on 10 exons that span 26 kb. Most of the functional domains of TNFR2 are encoded by separate exons, and each of the repeats of the extracellular cysteine-rich domain is interrupted by an intron. The genomic structure reveals a close relationship to TNFR1, another member of the TNFR superfamily. Based on electrophoretic analysis of yeast artificial chromosomes, TNFR2 maps within 400 kb of the genetic marker D1S434. In addition, we have identified a new polymorphic dinucleotide repeat within intron 4 of TNFR2. The genetic sequence information and exon-intron boundaries we have determined will facilitate mutational analysis of this gene to determine its potential role in neuroblastoma, as well as in other cancers with characteristic deletions or rearrangements of 1p36. 52 refs., 3 figs., 1 tab.

  2. Overview of PSB track on gene structure identification in large-scale genomic sequence

    SciTech Connect

    Uberbacher, E.C.; Xu, Y.

    1998-12-31

    The recent funding of more than a dozen major genome centers to begin community-wide high-throughput sequencing of the human genome has created a significant new challenge for the computational analysis of DNA sequence and the prediction of gene structure and function. It has been estimated that on average from 1996 to 2003, approximately 2 million bases of newly finished DNA sequence will be produced every day and be made available on the Internet and in central databases. The finished (fully assembled) sequence generated each day will represent approximately 75 new genes (and their respective proteins), and many times this number will be represented in partially completed sequences. The information contained in these is of immeasurable value to medical research, biotechnology, the pharmaceutical industry and researchers in a host of fields ranging from microorganism metabolism, to structural biology, to bioremediation. Sequencing of microorganisms and other model organisms is also ramping up at a very rapid rate. The genomes for yeast and several microorganisms such as H. influenzae have recently been fully sequenced, although the significance of many genes remains to be determined.
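
    The two daily estimates quoted in this abstract imply an average amount of finished sequence per new gene. A short worked calculation, using only the figures stated above (the variable names are illustrative):

```python
# Figures quoted in the abstract (estimated averages for 1996 to 2003).
BASES_PER_DAY = 2_000_000   # newly finished bases per day
GENES_PER_DAY = 75          # new genes represented in that sequence per day

# Implied average amount of finished sequence per new gene.
bases_per_gene = BASES_PER_DAY / GENES_PER_DAY
print(f"{bases_per_gene:,.0f} bases of finished sequence per gene")
```

    This is a rough average of about 27 kb of finished sequence per gene, not a statement about the size of any individual gene.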

  3. Whole-genome mutational landscape and characterization of noncoding and structural mutations in liver cancer.

    PubMed

    Fujimoto, Akihiro; Furuta, Mayuko; Totoki, Yasushi; Tsunoda, Tatsuhiko; Kato, Mamoru; Shiraishi, Yuichi; Tanaka, Hiroko; Taniguchi, Hiroaki; Kawakami, Yoshiiku; Ueno, Masaki; Gotoh, Kunihito; Ariizumi, Shun-Ichi; Wardell, Christopher P; Hayami, Shinya; Nakamura, Toru; Aikata, Hiroshi; Arihiro, Koji; Boroevich, Keith A; Abe, Tetsuo; Nakano, Kaoru; Maejima, Kazuhiro; Sasaki-Oku, Aya; Ohsawa, Ayako; Shibuya, Tetsuo; Nakamura, Hiromi; Hama, Natsuko; Hosoda, Fumie; Arai, Yasuhito; Ohashi, Shoko; Urushidate, Tomoko; Nagae, Genta; Yamamoto, Shogo; Ueda, Hiroki; Tatsuno, Kenji; Ojima, Hidenori; Hiraoka, Nobuyoshi; Okusaka, Takuji; Kubo, Michiaki; Marubashi, Shigeru; Yamada, Terumasa; Hirano, Satoshi; Yamamoto, Masakazu; Ohdan, Hideki; Shimada, Kazuaki; Ishikawa, Osamu; Yamaue, Hiroki; Chayama, Kazuki; Miyano, Satoru; Aburatani, Hiroyuki; Shibata, Tatsuhiro; Nakagawa, Hidewaki

    2016-05-01

    Liver cancer, which is most often associated with virus infection, is prevalent worldwide, and its underlying etiology and genomic structure are heterogeneous. Here we provide a whole-genome landscape of somatic alterations in 300 liver cancers from Japanese individuals. Our comprehensive analysis identified point mutations, structural variations (STVs), and virus integrations, in noncoding and coding regions. We discovered mutational signatures related to liver carcinogenesis and recurrently mutated coding and noncoding regions, such as long intergenic noncoding RNA genes (NEAT1 and MALAT1), promoters, CTCF-binding sites, and regulatory regions. STV analysis found a significant association with replication timing and identified known (CDKN2A, CCND1, APC, and TERT) and new (ASH1L, NCOR1, and MACROD2) cancer-related genes that were recurrently affected by STVs, leading to altered expression. These results emphasize the value of whole-genome sequencing analysis in discovering cancer driver mutations and understanding comprehensive molecular profiles of liver cancer, especially with regard to STVs and noncoding mutations. PMID:27064257

  4. BSSV: Bayesian based somatic structural variation identification with whole genome DNA-seq data.

    PubMed

    Chen, Xi; Shi, Xu; Shajahan, Ayesha N; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua

    2014-01-01

    High-coverage whole-genome DNA sequencing of paired tumor and normal samples makes somatic structural variation (SSV) easier to identify. Recent studies show that simultaneous analysis of paired samples provides a better resolution of SSV detection than subtracting shared SVs. However, available tools can neither identify all types of SSVs nor provide any rank information regarding their somatic features. In this paper, we have developed a Bayesian framework, called BSSV, that integrates read alignment information from both tumor and normal samples to calculate the significance of each SSV. Tested on simulated data, the precision of BSSV is comparable to that of available tools and the false negative rate is significantly lowered. We have also applied this approach to The Cancer Genome Atlas breast cancer data for SSV detection. Many known breast cancer specific mutated genes like RAD51, BRIP1, ER, PGR and PTPRD have been successfully identified.

  5. An integrated map of structural variation in 2,504 human genomes

    PubMed Central

    Jun, Goo; Fritz, Markus Hsi-Yang; Konkel, Miriam K.; Malhotra, Ankit; Stütz, Adrian M.; Shi, Xinghua; Casale, Francesco Paolo; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J.P.; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y. K.; Mu, Xinmeng Jasmine; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M.; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A.; Marth, Gabor; Mason, Christopher E.; Menelaou, Androniki; Muzny, Donna M.; Nelson, Bradley J.; Noor, Amina; Parrish, Nicholas F.; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E.; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A.; Untergasser, Andreas; Walker, Jerilyn A.; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A.; McCarroll, Steven A.; Mills, Ryan E.; Gerstein, Mark B.; Bashir, Ali; Stegle, Oliver; Devine, Scott E.; Lee, Charles; Eichler, Evan E.; Korbel, Jan O.

    2015-01-01

    Summary: Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype-blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association. PMID:26432246

  6. Structure and Genome Organization of AFV2, a Novel Archaeal Lipothrixvirus with Unusual Terminal and Core Structures

    PubMed Central

    Häring, Monika; Vestergaard, Gisle; Brügger, Kim; Rachel, Reinhard; Garrett, Roger A.; Prangishvili, David

    2005-01-01

    A novel filamentous virus, AFV2, from the hyperthermophilic archaeal genus Acidianus shows structural similarity to lipothrixviruses but differs from them in its unusual terminal and core structures. The double-stranded DNA genome contains 31,787 bp and carries eight open reading frames homologous to those of other lipothrixviruses, a single tRNALys gene containing a 12-bp archaeal intron, and a 1,008-bp repeat-rich region near the center of the genome. PMID:15901711

  7. Structure and mechanism of the ATPase that powers viral genome packaging.

    PubMed

    Hilbert, Brendan J; Hayes, Janelle A; Stone, Nicholas P; Duffy, Caroline M; Sankaran, Banumathi; Kelch, Brian A

    2015-07-21

    Many viruses package their genomes into procapsids using an ATPase machine that is among the most powerful known biological motors. However, how this motor couples ATP hydrolysis to DNA translocation is still unknown. Here, we introduce a model system with unique properties for studying motor structure and mechanism. We describe crystal structures of the packaging motor ATPase domain that exhibit nucleotide-dependent conformational changes involving a large rotation of an entire subdomain. We also identify the arginine finger residue that catalyzes ATP hydrolysis in a neighboring motor subunit, illustrating that previous models for motor structure need revision. Our findings allow us to derive a structural model for the motor ring, which we validate using small-angle X-ray scattering and comparisons with previously published data. We illustrate the model's predictive power by identifying the motor's DNA-binding and assembly motifs. Finally, we integrate our results to propose a mechanistic model for DNA translocation by this molecular machine. PMID:26150523

  8. The genome-wide structure of two economically important indigenous Sicilian cattle breeds.

    PubMed

    Mastrangelo, S; Saura, M; Tolone, M; Salces-Ortiz, J; Di Gerlando, R; Bertolini, F; Fontanesi, L; Sardina, M T; Serrano, M; Portolano, B

    2014-11-01

    Genomic technologies, such as high-throughput genotyping based on SNP arrays, provided background information concerning genome structure in domestic animals. The aim of this work was to investigate the genetic structure, the genome-wide estimates of inbreeding, coancestry, effective population size (Ne), and the patterns of linkage disequilibrium (LD) in 2 economically important Sicilian local cattle breeds, Cinisara (CIN) and Modicana (MOD), using the Illumina Bovine SNP50K v2 BeadChip. To understand the genetic relationship and to place both Sicilian breeds in a global context, genotypes from 134 other domesticated bovid breeds were used. Principal component analysis showed that the Sicilian cattle breeds were closer to individuals of Bos taurus taurus from Eurasia and formed nonoverlapping clusters with other breeds. Between the Sicilian cattle breeds, MOD was the most differentiated, whereas the animals belonging to the CIN breed showed a lower value of assignment, the presence of substructure, and genetic links with the MOD breed. The average molecular inbreeding and coancestry coefficients were moderately high, and the current estimates of Ne were low in both breeds. These values indicated a low genetic variability. Considering levels of LD between adjacent markers, the average r² in the MOD breed was comparable to those reported for other cattle breeds, whereas CIN showed a lower value. Therefore, these results support the need for denser SNP arrays for high-power association mapping and genomic selection efficiency, particularly for the CIN cattle breed. Controlling molecular inbreeding and coancestry would restrict inbreeding depression, the probability of losing beneficial rare alleles, and therefore the risk of extinction. The results generated from this study have important implications for the development of conservation and/or selection breeding programs in these 2 local cattle breeds.
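
    The LD statistic referred to in this abstract, the squared correlation between two markers, can be sketched for two biallelic SNPs using the standard textbook definition r² = D² / (pA(1-pA) pB(1-pB)) with D = pAB - pA·pB. The sketch assumes phased haplotypes coded as 0/1 pairs; the function name and the toy data are illustrative, and the study's own pipeline for array genotypes is not specified here.

```python
def ld_r2(haplotypes):
    """Squared correlation (r^2) between two biallelic loci.

    `haplotypes` is a list of (a, b) pairs, where a and b are 0/1 allele
    codes at the first and second locus on the same phased haplotype.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n            # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Toy data (hypothetical): complete LD versus no LD between two markers.
complete_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(complete_ld), ld_r2(no_ld))
```

    Averaging this value over pairs of adjacent markers gives the kind of breed-level summary the abstract compares between MOD and CIN.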

  9. Genomic-scale comparison of sequence- and structure-based methods of function prediction: Does structure provide additional insight?

    PubMed Central

    Fetrow, Jacquelyn S.; Siew, Naomi; Di Gennaro, Jeannine A.; Martinez-Yamout, Maria; Dyson, H. Jane; Skolnick, Jeffrey

    2001-01-01

    A function annotation method using the sequence-to-structure-to-function paradigm is applied to the identification of all disulfide oxidoreductases in the Saccharomyces cerevisiae genome. The method identifies 27 sequences as potential disulfide oxidoreductases. All previously known thioredoxins, glutaredoxins, and disulfide isomerases are correctly identified. Three of the 27 predictions are probable false-positives. Three novel predictions, which subsequently have been experimentally validated, are presented. Two additional novel predictions suggest a disulfide oxidoreductase regulatory mechanism for two subunits (OST3 and OST6) of the yeast oligosaccharyltransferase complex. Based on homology, this prediction can be extended to a potential tumor suppressor gene, N33, in humans, whose biochemical function was not previously known. Attempts to obtain a folded, active N33 construct to test the prediction were unsuccessful. The results show that structure prediction coupled with biochemically relevant structural motifs is a powerful method for the function annotation of genome sequences and can provide more detailed, robust predictions than function prediction methods that rely on sequence comparison alone. PMID:11316881

  10. Structural heterogeneity and functional diversity of topologically associating domains in mammalian genomes

    PubMed Central

    Wang, Xiao-Tao; Dong, Peng-Fei; Zhang, Hong-Yu; Peng, Cheng

    2015-01-01

    Recent chromosome conformation capture (3C) derived techniques have revealed that the topologically associating domain (TAD) is a pervasive element in chromatin three-dimensional (3D) organization. However, there is currently no parameter to quantitatively measure the structural characteristics of TADs, thus obscuring our understanding of the structural and functional differences among TADs. Based on our finding that there exist intrinsic chromatin interaction patterns in TADs, we define a theoretical parameter, called aggregation preference (AP), to characterize TAD structures by capturing the interaction aggregation degree. Applying this defined parameter to 11 Hi-C data sets generated by both traditional and in situ Hi-C experimental pipelines, our analyses reveal that heterogeneous structures exist among TADs, and this structural heterogeneity is significantly correlated with DNA sequences, epigenomic signals and gene expression. Although TADs can be stable in genomic positions across cell lines, structural comparisons show that a considerable number of stable TADs undergo significant structural rearrangements during cell changes. Moreover, the structural change of a TAD is tightly associated with its transcription remodeling. Altogether, the theoretical parameter defined in this work provides a quantitative method to link structural characteristics and biological functions of TADs, and this linkage implies that the chromatin interaction pattern has the potential to mark transcription activity in TADs. PMID:26150425

  11. Structural heterogeneity and functional diversity of topologically associating domains in mammalian genomes.

    PubMed

    Wang, Xiao-Tao; Dong, Peng-Fei; Zhang, Hong-Yu; Peng, Cheng

    2015-09-01

    Recent chromosome conformation capture (3C) derived techniques have revealed that the topologically associating domain (TAD) is a pervasive element in chromatin three-dimensional (3D) organization. However, there is currently no parameter to quantitatively measure the structural characteristics of TADs, thus obscuring our understanding of the structural and functional differences among TADs. Based on our finding that there exist intrinsic chromatin interaction patterns in TADs, we define a theoretical parameter, called aggregation preference (AP), to characterize TAD structures by capturing the interaction aggregation degree. Applying this defined parameter to 11 Hi-C data sets generated by both traditional and in situ Hi-C experimental pipelines, our analyses reveal that heterogeneous structures exist among TADs, and this structural heterogeneity is significantly correlated with DNA sequences, epigenomic signals and gene expression. Although TADs can be stable in genomic positions across cell lines, structural comparisons show that a considerable number of stable TADs undergo significant structural rearrangements during cell changes. Moreover, the structural change of a TAD is tightly associated with its transcription remodeling. Altogether, the theoretical parameter defined in this work provides a quantitative method to link structural characteristics and biological functions of TADs, and this linkage implies that the chromatin interaction pattern has the potential to mark transcription activity in TADs. PMID:26150425

  12. Genome-wide functional annotation and structural verification of metabolic ORFeome of Chlamydomonas reinhardtii

    PubMed Central

    2011-01-01

    Background Recent advances in the field of metabolic engineering have been expedited by the availability of genome sequences and metabolic modelling approaches. The complete sequencing of the C. reinhardtii genome has made this unicellular alga a good candidate for metabolic engineering studies; however, the annotation of the relevant genes has not been validated and the much-needed metabolic ORFeome is currently unavailable. We describe our efforts on the functional annotation of the ORF models released by the Joint Genome Institute (JGI), prediction of their subcellular localizations, and experimental verification of their structural annotation at the genome scale. Results We assigned enzymatic functions to the translated JGI ORF models of C. reinhardtii by reciprocal BLAST searches of the putative proteome against the UniProt and AraCyc enzyme databases. The best match for each translated ORF was identified and the EC numbers were transferred onto the ORF models. Enzymatic functional assignment was extended to the paralogs of the ORFs by clustering ORFs using BLASTCLUST. In total, we assigned 911 enzymatic functions, including 886 EC numbers, to 1,427 transcripts. We further annotated the enzymatic ORFs by prediction of their subcellular localization. The majority of the ORFs are predicted to be compartmentalized in the cytosol and chloroplast. We verified the structure of the metabolism-related ORF models by reverse transcription-PCR of the functionally annotated ORFs. Following amplification and cloning, we carried out 454FLX and Sanger sequencing of the ORFs. Based on alignment of the 454FLX reads to the ORF predicted sequences, we obtained more than 90% coverage for more than 80% of the ORFs. In total, 1,087 ORF models were verified by 454 and Sanger sequencing methods. We obtained expression evidence for 98% of the metabolic ORFs in the algal cells grown under constant light in the presence of acetate. 
Conclusions We functionally annotated approximately 1
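
    The reciprocal best-match step described in the Results above can be sketched as follows. This is an illustration only, assuming the best BLAST hit in each direction has already been parsed into a plain mapping; the function names, ORF identifiers, and accession-like strings are hypothetical and are not the authors' pipeline.

```python
def reciprocal_best_hits(forward, reverse):
    """Keep ORF/reference pairs that are each other's best hit.

    forward: ORF id -> best-matching reference protein id
    reverse: reference protein id -> best-matching ORF id
    (Both mappings are assumed to come from parsed BLAST output.)
    """
    return {orf: ref for orf, ref in forward.items()
            if reverse.get(ref) == orf}


def transfer_ec(pairs, ec_by_reference):
    """Copy the EC number of each matched reference protein onto its ORF."""
    return {orf: ec_by_reference[ref]
            for orf, ref in pairs.items()
            if ref in ec_by_reference}


# Toy inputs (made-up identifiers).
forward_hits = {"orf1": "P1", "orf2": "P2", "orf3": "P1"}
reverse_hits = {"P1": "orf1", "P2": "orf9"}
reference_ec = {"P1": "1.1.1.1"}

pairs = reciprocal_best_hits(forward_hits, reverse_hits)
annotated = transfer_ec(pairs, reference_ec)
```

    Requiring agreement in both directions is a conservative design choice: `orf3` also hits `P1`, but it is dropped because `P1` points back to `orf1`.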

  13. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

    PubMed Central

    Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  14. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine.

    PubMed

    Elsik, Christine G; Tayal, Aditi; Diesh, Colin M; Unni, Deepak R; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  16. The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability

    PubMed Central

    Hamperl, Stephan; Cimprich, Karlene A.

    2014-01-01

    Accurate DNA replication and DNA repair are crucial for the maintenance of genome stability, and it is generally accepted that failure of these processes is a major source of DNA damage in cells. Intriguingly, recent evidence suggests that DNA damage is more likely to occur at genomic loci with high transcriptional activity. Furthermore, loss of certain RNA processing factors in eukaryotic cells is associated with increased formation of co-transcriptional RNA:DNA hybrid structures known as R-loops, resulting in double-strand breaks (DSBs) and DNA damage. However, the molecular mechanisms by which R-loop structures ultimately lead to DNA breaks and genome instability are not well understood. In this review, we summarize the current knowledge about the formation, recognition and processing of RNA:DNA hybrids, and discuss possible mechanisms by which these structures contribute to DNA damage and genome instability in the cell. PMID:24746923

  17. Gene Ontology Consortium: going forward

    PubMed Central

    2015-01-01

    The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology. PMID:25428369

  18. Establishing an International Soil Modelling Consortium

    NASA Astrophysics Data System (ADS)

    Vereecken, Harry; Schnepf, Andrea; Vanderborght, Jan

    2015-04-01

    Soil is one of the most critical life-supporting compartments of the Biosphere. Soil provides numerous ecosystem services such as a habitat for biodiversity, water and nutrients, as well as producing food, feed, fiber and energy. To feed the rapidly growing world population in 2050, agricultural food production must be doubled using the same land resources footprint. At the same time, soil resources are threatened due to improper management and climate change. Soil is not only essential for establishing a sustainable bio-economy, but also plays a key role in a broad range of societal challenges including 1) climate change mitigation and adaptation, 2) land use change, 3) water resource protection, 4) biotechnology for human health, 5) biodiversity and ecological sustainability, and 6) combating desertification. Soils regulate and support water, mass and energy fluxes between the land surface, the vegetation, the atmosphere and the deep subsurface and control storage and release of organic matter affecting climate regulation and biogeochemical cycles. Despite the many important functions of soil, many fundamental knowledge gaps remain regarding the role of soil biota and biodiversity in ecosystem services, the structure and dynamics of soil communities, the interplay between hydrologic and biotic processes, the quantification of soil biogeochemical processes and soil structural processes, the resilience and recovery of soils from stress, as well as the prediction of soil development and the evolution of soils in the landscape, to name a few. Soil models have long played an important role in quantifying and predicting soil processes and related ecosystem services. 
However, a new generation of soil models based on a whole systems approach comprising all physical, mechanical, chemical and biological processes is now required to address these critical knowledge gaps and thus contribute to the preservation of ecosystem services, improve our understanding of climate

  19. The LBNL/JSU/AGMUS Science Consortium

    SciTech Connect

    1996-04-01

    This report discusses the 11 years of accomplishments of the science consortium of minority graduates from Jackson State University and Ana G. Mendez University at the Lawrence Berkeley National Laboratory.

  20. NASA Space Radiation Transport Code Development Consortium.

    PubMed

    Townsend, Lawrence W

    2005-01-01

    Recently, NASA established a consortium involving the University of Tennessee (lead institution), the University of Houston, Roanoke College and various government and national laboratories, to accelerate the development of a standard set of radiation transport computer codes for NASA human exploration applications. This effort involves further improvements of the Monte Carlo codes HETC and FLUKA and the deterministic code HZETRN, including developing nuclear reaction databases necessary to extend the Monte Carlo codes to carry out heavy ion transport, and extending HZETRN to three dimensions. The improved codes will be validated by comparing predictions with measured laboratory transport data, provided by an experimental measurements consortium, and measurements in the upper atmosphere on the balloon-borne Deep Space Test Bed (DSTB). In this paper, we present an overview of the consortium members and the current status and future plans of consortium efforts to meet the research goals and objectives of this extensive undertaking.

  1. International Lymphoma Epidemiology Consortium (InterLymph)

    Cancer.gov

    A consortium designed to enhance collaboration among epidemiologists studying lymphoma, to provide a forum for the exchange of research ideas, and to create a framework for collaborating on analyses that pool data from multiple studies

  2. International Mouse Phenotyping Consortium (IMPC)

    Cancer.gov

    The International Mouse Phenotyping Consortium (IMPC) comprises a group of major mouse genetics research institutions along with national funding organisations formed to address the challenge of developing an encyclopedia of mammalian gene function.

  3. 32 CFR 37.1255 - Consortium.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... TECHNOLOGY INVESTMENT AGREEMENTS Definitions of Terms Used in This Part § 37.1255 Consortium. A group of... carry out a research project (see definition of “articles of collaboration,” in § 37.1225)....

  4. 32 CFR 37.1255 - Consortium.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... TECHNOLOGY INVESTMENT AGREEMENTS Definitions of Terms Used in This Part § 37.1255 Consortium. A group of... carry out a research project (see definition of “articles of collaboration,” in § 37.1225)....

  5. CORAL DISEASE & HEALTH CONSORTIUM: FINDING SOLUTIONS

    EPA Science Inventory

    The National Oceanic Atmospheric Administration (NOAA), the Environmental Protection Agency (EPA), and the Department of Interior (DOI) developed the framework for a Coral Disease and Health Consortium (CDHC) for the United States Coral Reef Task Force (USCRTF) through an interag...

  6. Paleogenomics. Genomic structure in Europeans dating back at least 36,200 years.

    PubMed

    Seguin-Orlando, Andaine; Korneliussen, Thorfinn S; Sikora, Martin; Malaspinas, Anna-Sapfo; Manica, Andrea; Moltke, Ida; Albrechtsen, Anders; Ko, Amy; Margaryan, Ashot; Moiseyev, Vyacheslav; Goebel, Ted; Westaway, Michael; Lambert, David; Khartanovich, Valeri; Wall, Jeffrey D; Nigst, Philip R; Foley, Robert A; Lahr, Marta Mirazon; Nielsen, Rasmus; Orlando, Ludovic; Willerslev, Eske

    2014-11-28

    The origin of contemporary Europeans remains contentious. We obtained a genome sequence from Kostenki 14 in European Russia dating from 38,700 to 36,200 years ago, one of the oldest fossils of anatomically modern humans from Europe. We find that Kostenki 14 shares a close ancestry with the 24,000-year-old Mal'ta boy from central Siberia, European Mesolithic hunter-gatherers, some contemporary western Siberians, and many Europeans, but not eastern Asians. Additionally, the Kostenki 14 genome shows evidence of shared ancestry with a population basal to all Eurasians that also relates to later European Neolithic farmers. We find that Kostenki 14 contains more Neandertal DNA, in longer tracts, than present-day Europeans. Our findings reveal the timing of divergence of western Eurasians and East Asians to be more than 36,200 years ago and that European genomic structure today dates back to the Upper Paleolithic and derives from a metapopulation that at times stretched from Europe to central Asia.

  7. Genomic structure and chromosomal mapping of the murine CD40 gene

    SciTech Connect

    Grimaldi, J.C.; Chang, R.; Howard, M.; Cockayne, D.A.; Torres, R.; Clark, E.A.; Kozak, C.A.

    1992-12-15

    The B cell-associated surface molecule, CD40, is likely to play a central role in the expansion of Ag-stimulated B cells, and their interaction with activated Th cells. In this study the authors have isolated genomic clones of murine CD40 from a mouse liver genomic DNA library. Comparison with the murine CD40 cDNA sequence revealed the presence of nine exons that together contain the entire murine CD40 coding region, and span approximately 16.3 kb of genomic DNA. The intron/exon structure of the CD40 gene resembles that of the low affinity nerve growth factor receptor gene, a close homolog of both human and murine CD40. In both cases the functional domains of the receptor molecules are separated onto different exons throughout the genes. Southern blot analysis demonstrated that murine CD40 is a single copy gene that maps in the distal region of mouse chromosome 2. 58 refs., 4 figs., 1 tab.

  8. Viral genome structures, charge, and sequences are optimal for capsid assembly

    NASA Astrophysics Data System (ADS)

    Hagan, Michael

    2014-03-01

    For many viruses, the spontaneous assembly of a capsid shell around the nucleic acid (NA) genome is an essential step in the viral life cycle. Capsid formation is a multicomponent, out-of-equilibrium assembly process for which kinetic effects and thermodynamic constraints compete to determine the outcome. Understanding how viral components drive highly efficient assembly under these constraints could promote biomedical efforts to block viral propagation, and would elucidate the factors controlling assembly in a wide range of systems containing proteins and polyelectrolytes. This talk will describe coarse-grained models of capsid proteins and NAs with which we investigate the dynamics and thermodynamics of virus assembly. In contrast to recent theoretical models, we find that capsids spontaneously 'overcharge'; that is, the NA length which is kinetically and thermodynamically optimal possesses a negative charge greater than the positive charge of the capsid. When applied to specific virus capsids, the calculated optimal NA lengths closely correspond to the natural viral genome lengths. These results suggest that the features included in this model (i.e. electrostatics, excluded volume, and NA tertiary structure) play key roles in determining assembly thermodynamics and consequently exert selective pressure on viral evolution. I will then discuss mechanisms by which sequence-specific interactions between NAs and capsid proteins promote selective encapsidation of the viral genome. This work was supported by NIH R01GM108021 and the Brandeis MRSEC NSF-MRSEC-0820492.

  9. Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

    PubMed Central

    Huang, Yongjie; Mrázek, Jan

    2014-01-01

    Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877

  10. Primary structure of the human follistatin precursor and its genomic organization

    SciTech Connect

    Shimasaki, Shunichi; Koga, Makoto; Esch, F.; Cooksey, K.; Mercado, M.; Koba, A.; Ueno, Naoto; Ying, Shaoyao; Ling, N.; Guillemin, R.

    1988-06-01

    Follistatin is a single-chain gonadal protein that specifically inhibits follicle-stimulating hormone release. By use of the recently characterized porcine follistatin cDNA as a probe to screen a human testis cDNA library and a genomic library, the structure of the complete human follistatin precursor as well as its genomic organization have been determined. Three of eight cDNA clones that were sequenced predicted a precursor with 344 amino acids, whereas the remaining five cDNA clones encoded a 317 amino acid precursor, resulting from alternative splicing of the precursor mRNA. Mature follistatins contain four contiguous domains that are encoded by precisely separated exons; three of the domains are highly similar to each other, as well as to human epidermal growth factor and human pancreatic secretory trypsin inhibitor. The genomic organization of the human follistatin is similar to that of the human epidermal growth factor gene and thus supports the notion of exon shuffling during evolution.

  11. Genomic diversity, population structure, and migration following rapid range expansion in the Balsam poplar, Populus balsamifera.

    PubMed

    Keller, Stephen R; Olson, Matthew S; Silim, Salim; Schroeder, William; Tiffin, Peter

    2010-03-01

    Rapid range expansions can cause pervasive changes in the genetic diversity and structure of populations. The postglacial history of the Balsam Poplar, Populus balsamifera, involved the colonization of most of northern North America, an area largely covered by continental ice sheets during the last glacial maximum. To characterize how this expansion shaped genomic diversity within and among populations, we developed 412 SNP markers that we assayed for a range-wide sample of 474 individuals sampled from 34 populations. We complemented the SNP data set with DNA sequence data from 11 nuclear loci from 94 individuals, and used coalescent analyses to estimate historical population size, demographic growth, and patterns of migration. Bayesian clustering identified three geographically separated demes found in the Northern, Central, and Eastern portions of the species' range. These demes varied significantly in nucleotide diversity, the abundance of private polymorphisms, and population substructure. Most measures supported the Central deme as descended from the primary refuge of diversity. Both SNPs and sequence data suggested recent population growth, and coalescent analyses of historical migration suggested a massive expansion from the Centre to the North and East. Collectively, these data demonstrate the strong influence that range expansions exert on genomic diversity, both within local populations and across the range. Our results suggest that an in-depth knowledge of nucleotide diversity following expansion requires sampling within multiple populations, and highlight the utility of combining insights from different data types in population genomic studies.

  12. Evolutionary genomics reveals conserved structural determinants of signaling and adaptation in microbial chemoreceptors

    SciTech Connect

    Alexander, Roger P; Jouline, Igor B

    2007-01-01

    As an important model for transmembrane signaling, methyl-accepting chemotaxis proteins (MCPs) have been extensively studied by using genetic, biochemical, and structural techniques. However, details of the molecular mechanism of signaling are still not well understood. The availability of genomic information for hundreds of species enables the identification of features in protein sequences that are conserved over long evolutionary distances and thus are critically important for function. We carried out a large-scale comparative genomic analysis of the MCP signaling and adaptation domain family and identified features that appear to be critical for receptor structure and function. Based on domain length and sequence conservation, we identified seven major MCP classes and three distinct structural regions within the cytoplasmic domain: signaling, methylation, and flexible bundle subdomains. The flexible bundle subdomain, not previously recognized in MCPs, is a conserved element that appears to be important for signal transduction. Remarkably, the N- and C-terminal helical arms of the cytoplasmic domain maintain symmetry in length and register despite dramatic variation, from 24 to 64 7-aa heptads in overall domain length. Loss of symmetry is observed in some MCPs, where it is concomitant with specific changes in the sensory module. Each major MCP class has a distinct pattern of predicted methylation sites that is well supported by experimental data. Our findings indicate that signaling and adaptation functions within the MCP cytoplasmic domain are tightly coupled, and that their coevolution has contributed to the significant diversity in chemotaxis mechanisms among different organisms.

  13. Update on the Pfam5000 Strategy for Selection of Structural Genomics Targets

    SciTech Connect

    Chandonia, John-Marc; Brenner, Steven E.

    2005-06-27

    Structural Genomics is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy that is medically and biologically relevant, of good financial value, and tractable. In 2003, we presented the "Pfam5000" strategy, which involves selecting the 5,000 most important families from the Pfam database as sources for targets. In this update, we show that although both the Pfam database and the number of sequenced genomes have increased in size, the expected benefits of the Pfam5000 strategy have not changed substantially. Solving the structures of proteins from the 5,000 largest Pfam families would allow accurate fold assignment for approximately 65 percent of all prokaryotic proteins (covering 54 percent of residues) and 63 percent of eukaryotic proteins (42 percent of residues). Fewer than 2,300 of the largest families on this list remain to be solved, making the project feasible in the next five years given the expected throughput to be achieved in the production phase of the Protein Structure Initiative.

  14. Class-level relationships in the phylum Cnidaria: evidence from mitochondrial genome structure.

    PubMed Central

    Bridge, D; Cunningham, C W; Schierwater, B; DeSalle, R; Buss, L W

    1992-01-01

    The phylogenetic relationships of the Recent cnidarian classes remain one of the classic problems in invertebrate zoology. We survey the structure of the mitochondrial genome in representatives of the four extant cnidarian classes and in the phylum Ctenophora. We find that all anthozoan species tested possess mtDNA in the form of circular molecules, whereas all scyphozoan, cubozoan, and hydrozoan species tested display mtDNA in the form of linear molecules. Because ctenophore and all other known metazoan mtDNA is circular, the shared occurrence of linear mtDNA in three of the four cnidarian classes suggests a basal position for the Anthozoa within the phylum. PMID:1356268

  15. Comparison of SIV and HIV-1 genomic RNA structures reveals impact of sequence evolution on conserved and non-conserved structural motifs.

    PubMed

    Pollom, Elizabeth; Dang, Kristen K; Potter, E Lake; Gorelick, Robert J; Burch, Christina L; Weeks, Kevin M; Swanstrom, Ronald

    2013-01-01

    RNA secondary structure plays a central role in the replication and metabolism of all RNA viruses, including retroviruses like HIV-1. However, structures with known function represent only a fraction of the secondary structure reported for HIV-1(NL4-3). One tool to assess the importance of RNA structures is to examine their conservation over evolutionary time. To this end, we used SHAPE to model the secondary structure of a second primate lentiviral genome, SIVmac239, which shares only 50% sequence identity at the nucleotide level with HIV-1(NL4-3). Only about half of the paired nucleotides are paired in both genomic RNAs and, across the genome, just 71 base pairs form with the same pairing partner in both genomes. On average the RNA secondary structure is thus evolving at a much faster rate than the sequence. Structure at the Gag-Pro-Pol frameshift site is maintained but in a significantly altered form, while the impact of selection for maintaining a protein binding interaction can be seen in the conservation of pairing partners in the small RRE stems where Rev binds. Structures that are conserved between SIVmac239 and HIV-1(NL4-3) also occur at the 5' polyadenylation sequence, in the plus strand primer sites, PPT and cPPT, and in the stem-loop structure that includes the first splice acceptor site. The two genomes are adenosine-rich and cytidine-poor. The structured regions are enriched in guanosines, while unpaired regions are enriched in adenosines, and functionally important structures have stronger base pairing than nonconserved structures. We conclude that much of the secondary structure is the result of fortuitous pairing in a metastable state that reforms during sequence evolution. However, secondary structure elements with important function are stabilized by higher guanosine content that allows regions of structure to persist as sequence evolution proceeds, and, within the confines of selective pressure, allows structures to evolve. PMID:23593004

  16. ArchDB: automated protein loop classification as a tool for structural genomics.

    PubMed

    Espadaler, Jordi; Fernandez-Fuentes, Narcis; Hermoso, Antonio; Querol, Enrique; Aviles, Francesc X; Sternberg, Michael J E; Oliva, Baldomero

    2004-01-01

    The annotation of protein function has become a crucial problem with the advent of sequence and structural genomics initiatives. A large body of evidence suggests that protein structural information is frequently encoded in local sequences, and that folds are mainly made up of a number of simple local units of super-secondary structural motifs, consisting of a few secondary structures and their connecting loops. Moreover, protein loops play an important role in protein function. Here we present ArchDB, a classification database of structural motifs, consisting of one loop plus its bracing secondary structures. ArchDB currently contains 12,665 super-secondary elements classified into 1496 motif subclasses. The database provides an easy way to retrieve functional information from protein structures sharing a common motif, to search motifs found in a given SCOP family, superfamily or fold, or to search by keywords on proteins with classified loops. The ArchDB database of loops is located at http://sbi.imim.es/archdb. PMID:14681390

  17. Genome-wide analysis of enzyme structure-function combination across three domains of life.

    PubMed

    Zhang, Ziding; Tang, Yu-Rong

    2007-01-01

    To investigate diverse enzyme structure-function combination (SFC) types in different species, 34 different genome sequences were annotated using the protein catalytic domain database SCOPEC (http://www.enzome.com/enzome/), in which both the structure and function for each entry are known. Annotated enzymes with catalytic domains from the same SCOP superfamily are considered to have an identical structure. Annotated enzymes sharing the identical three-digit EC number are considered to have the same enzymatic function. Results reveal that the different SFC types for enzymes identified in archaea, bacteria and eukaryota are 137, 300 and 313, respectively. About 80% of the SFCs identified in archaea can be consistently found in bacteria and eukaryota species, whereas 28% and 35% combination types in bacteria and eukaryota respectively are unique to their corresponding groups. The number of functions per structure and the number of structures per function for the annotated sequences were measured in different species. Furthermore, a new concept was proposed to represent enzymatic structures as a functional similarity network. Thus, the current study will be helpful to enhance the global view on the evolution of enzymatic structure and function.
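
    The definition of a structure-function combination used here (same SCOP superfamily, same three-digit EC number) lends itself to a short sketch. The annotation tuples below are placeholders rather than SCOPEC records, and the helper names are assumptions made for illustration.

```python
from collections import defaultdict


def three_digit_ec(ec_number):
    """Truncate an EC number such as '2.7.1.1' to its first three digits."""
    return ".".join(ec_number.split(".")[:3])


def sfc_types(annotations):
    """Collapse (superfamily, EC number) annotations into distinct SFC types."""
    return {(sf, three_digit_ec(ec)) for sf, ec in annotations}


def functions_per_structure(sfcs):
    """Count distinct three-digit EC functions for each superfamily."""
    table = defaultdict(set)
    for sf, ec3 in sfcs:
        table[sf].add(ec3)
    return {sf: len(funcs) for sf, funcs in table.items()}


# Placeholder annotations: (SCOP superfamily label, full EC number).
archaea = [("c.1.8", "3.2.1.1"), ("c.1.8", "3.2.1.4"), ("c.37.1", "2.7.1.1")]
bacteria = [("c.1.8", "3.2.1.21"), ("c.37.1", "2.7.4.3"), ("c.37.1", "2.7.1.2")]

arch = sfc_types(archaea)
bact = sfc_types(bacteria)

shared_fraction = len(arch & bact) / len(arch)  # archaeal SFCs also seen in bacteria
unique_to_bacteria = bact - arch                # SFCs found only in bacteria
per_structure = functions_per_structure(bact)
```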

  18. Target Selection and Deselection at the Berkeley Structural Genomics Center

    SciTech Connect

    Chandonia, John-Marc; Kim, Sung-Hou; Brenner, Steven E.

    2005-03-22

    At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a near-complete structural complement of proteins in the minimal organisms Mycoplasma genitalium and M. pneumoniae, two closely related pathogens. Current targets for structure determination have been selected in six major stages, starting with those predicted to be most tractable to high throughput study and likely to yield new structural information. We report on the process used to select these proteins, as well as our target deselection procedure. Target deselection reduces experimental effort by eliminating targets similar to those recently solved by the structural biology community or other centers. We measure the impact of the 69 structures solved at the BSGC as of July 2004 on structure prediction coverage of the M. pneumoniae and M. genitalium proteomes. The number of Mycoplasma proteins for which the fold could first be reliably assigned based on structures solved at the BSGC (24 M. pneumoniae and 21 M. genitalium) is approximately 25 percent of the total resulting from work at all structural genomics centers and the worldwide structural biology community (94 M. pneumoniae and 86 M. genitalium) during the same period. As the number of structures contributed by the BSGC during that period is less than 1 percent of the total worldwide output, the benefits of a focused target selection strategy are apparent. If the structures of all current targets were solved, the percentage of M. pneumoniae proteins for which folds could be reliably assigned would increase from approximately 57 percent (391 of 687) at present to around 80 percent (550 of 687), and the percentage of the proteome that could be accurately modeled would increase from around 37 percent (254 of 687) to about 64 percent (438 of 687). In M. genitalium, the percentage of the proteome that could be structurally annotated based on structures of our remaining targets would rise from 72 percent (348 of 486) to around 76 percent (371 of 486), with the
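
    The coverage percentages quoted in this record follow directly from the protein counts it gives. The sketch below simply redoes that arithmetic; the variable names are illustrative, and only counts stated in the abstract are used.

```python
def percent(part, total):
    """Return part/total as a percentage."""
    return 100.0 * part / total


M_PNEUMONIAE_PROTEINS = 687
M_GENITALIUM_PROTEINS = 486

# M. pneumoniae: fold assignment and accurate modeling, now versus all targets solved.
folds_now = percent(391, M_PNEUMONIAE_PROTEINS)
folds_all_targets = percent(550, M_PNEUMONIAE_PROTEINS)
modeled_now = percent(254, M_PNEUMONIAE_PROTEINS)
modeled_all_targets = percent(438, M_PNEUMONIAE_PROTEINS)

# M. genitalium: structural annotation, now versus remaining targets solved.
annotated_now = percent(348, M_GENITALIUM_PROTEINS)
annotated_all_targets = percent(371, M_GENITALIUM_PROTEINS)

# Share of first-time fold assignments attributable to BSGC structures.
bsgc_share_pneumoniae = percent(24, 94)
bsgc_share_genitalium = percent(21, 86)
```

    Rounded to whole numbers, these reproduce the 57, 80, 37, 64, 72 and 76 percent figures, and both BSGC shares fall close to the "approximately 25 percent" stated.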

  19. Consortium for Materials Development in Space

    NASA Technical Reports Server (NTRS)

    1999-01-01

    During FY99 the Consortium for Materials Development in Space (CMDS) was reorganized around the following guidelines: industry driven, product focus, an industry led advisory council, focus on University of Alabama in Huntsville (UAH) core competencies, linkage to regional investment firms to assist commercialization and to take advantage of space flights. The organizational structure of the CMDS changed considerably during the year. The decision was made to reduce the organization to a Director and an Administrative Assistant. The various research projects, including the employees, were transferred to the appropriate UAH research center or college. In addition, an advisory council was established to provide direction and guidance to the CMDS to ensure a strong commercial focus. The council will (i) review CMDS commercial development plans and provide feedback, (ii) perform an annual evaluation of the Center's progress and present the results of this review to the UAH Vice President for Research, (iii) serve as an avenue of communication between the CMDS and its commercial partners, and (iv) serve as an ambassador and advocate for the CMDS.

  20. LDRD Report FY 03: Structure and Function of Regulatory DNA: A Next Major Challenge in Genomics

    SciTech Connect

    Stubbs, L

    2003-02-18

    With the human genome sequence now available and high quality draft sequences of mouse, rat and many other creatures recently or soon to be released, the field of Genomics has entered an especially exciting phase. The raw materials for locating the approximately 30,000-40,000 human genes and understanding their basic structure are now online; next, the research community must begin to unravel the mechanisms through which those genes create the complexity of life. Laboratories around the world are already beginning to focus on cataloguing the times, sites and conditions under which each gene is active; others are racing to predict, and then experimentally analyze, the structures of proteins that human genes encode. These activities are extremely important, but they will not reveal the mechanisms through which the correct proteins are activated precisely in the specific cells and at the particular time that is required for normal development, health, and response to the environment. Although we understand well the three-letter code through which genes dictate the production of proteins, the codes through which genes are turned on and off in precise, cell-specific patterns remain a mystery. Unraveling these codes is essential to understanding the functions of genes and the role of human genetic diversity in disease and environmental susceptibility. This problem also represents one of the most exciting challenges in modern biology, drawing in scientists from every discipline to develop the needed biological datasets, measurement technologies and algorithms. The LDRD effort that is the subject of this report was focused on establishing the basic technical and scientific foundations of a well-rounded program in gene regulatory biology at LLNL. The motivation for building these foundations was based on several drivers. First, with the sea-change in genomics, we sought to develop a new, exciting and forward-thinking research focus for the LLNL genomics team, which could

  1. Discovery of new enzymes and metabolic pathways by using structure and genome context.

    PubMed

    Zhao, Suwen; Kumar, Ritesh; Sakai, Ayano; Vetting, Matthew W; Wood, B McKay; Brown, Shoshana; Bonanno, Jeffery B; Hillerich, Brandan S; Seidel, Ronald D; Babbitt, Patricia C; Almo, Steven C; Sweedler, Jonathan V; Gerlt, John A; Cronan, John E; Jacobson, Matthew P

    2013-10-31

    Assigning valid functions to proteins identified in genome projects is challenging: overprediction and database annotation errors are the principal concerns. We and others are developing computation-guided strategies for functional discovery with 'metabolite docking' to experimentally derived or homology-based three-dimensional structures. Bacterial metabolic pathways often are encoded by 'genome neighbourhoods' (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by 'predicting' the intermediates in the glycolytic pathway in Escherichia coli. Metabolite docking to multiple binding proteins and enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. Here we report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and also the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt concentrations was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guided functional predictions to enable the discovery of new metabolic pathways.

  2. Assessment of Genetic Heterogeneity in Structured Plant Populations Using Multivariate Whole-Genome Regression Models

    PubMed Central

    Lehermeier, Christina; Schön, Chris-Carolin; de los Campos, Gustavo

    2015-01-01

    Plant breeding populations exhibit varying levels of structure and admixture; these features are likely to induce heterogeneity of marker effects across subpopulations. Traditionally, structure has been dealt with as a potential confounder, and various methods exist to “correct” for population stratification. However, these methods induce a mean correction that does not account for heterogeneity of marker effects. The animal breeding literature offers a few recent studies that consider modeling genetic heterogeneity in multibreed data, using multivariate models. However, these methods have received little attention in plant breeding where population structure can have different forms. In this article we address the problem of analyzing data from heterogeneous plant breeding populations, using three approaches: (a) a model that ignores population structure [A-genome-based best linear unbiased prediction (A-GBLUP)], (b) a stratified (i.e., within-group) analysis (W-GBLUP), and (c) a multivariate approach that uses multigroup data and accounts for heterogeneity (MG-GBLUP). The performance of the three models was assessed on three different data sets: a diversity panel of rice (Oryza sativa), a maize (Zea mays L.) half-sib panel, and a wheat (Triticum aestivum L.) data set that originated from plant breeding programs. The estimated genomic correlations between subpopulations varied from null to moderate, depending on the genetic distance between subpopulations and traits. Our assessment of prediction accuracy features cases where ignoring population structure leads to a parsimonious more powerful model as well as others where the multivariate and stratified approaches have higher predictive power. In general, the multivariate approach appeared slightly more robust than either the A- or the W-GBLUP. PMID:26122758

  3. Structure and evolution of the atypical mitochondrial genome of Armadillidium vulgare (Isopoda, Crustacea).

    PubMed

    Marcadé, Isabelle; Cordaux, Richard; Doublet, Vincent; Debenest, Catherine; Bouchon, Didier; Raimond, Roland

    2007-12-01

    The crustacean isopod Armadillidium vulgare is characterized by an unusual approximately 42-kb-long mitochondrial genome consisting of two molecules co-occurring in mitochondria: a circular approximately 28-kb dimer formed by two approximately 14-kb monomers fused in opposite polarities and a linear approximately 14-kb monomer. Here we determined the nucleotide sequence of the fundamental monomeric unit of A. vulgare mitochondrial genome, to gain new insight into its structure and evolution. Our results suggest that the junction zone between monomers of the dimer structure is located in or near the control region. Direct sequencing indicated that the nucleotide sequences of the different monomer units are virtually identical. This suggests that gene conversion and/or replication processes play an important role in shaping nucleotide sequence variation in this mitochondrial genome. The only heteroplasmic site we identified predicts an alloacceptor tRNA change from tRNA(Ala) to tRNA(Val). Therefore, in A. vulgare, tRNA(Ala) and tRNA(Val) are found at the same locus in different monomers, ensuring that both tRNAs are present in mitochondria. The presence of this heteroplasmic site in all sequenced individuals suggests that the polymorphism is selectively maintained, probably because of the necessity of both tRNAs for maintaining proper mitochondrial functions. Thus, our results provide empirical evidence for the tRNA gene recruitment model of tRNA evolution. Moreover, interspecific comparisons showed that the A. vulgare mitochondrial gene order is highly derived compared to the putative ancestral arthropod type. By contrast, an overall high conservation of mitochondrial gene order is observed within crustacean isopods. PMID:17906827

  4. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

    PubMed

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

    2015-01-01

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. PMID:25919952
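
    The rarefaction analysis mentioned in this abstract can be illustrated in outline. The sketch below averages, over random orderings, the cumulative number of distinct gene families seen as genomes are added; a curve that keeps rising suggests an open population. The gene-family sets, the function name, and the permutation count are all hypothetical, and this is not the procedure used by the authors.

```python
import random


def rarefaction_curve(genomes, n_permutations=200, seed=0):
    """Mean cumulative count of distinct gene families versus genomes sampled.

    genomes: list of sets, each holding the gene-family labels of one genome.
    """
    rng = random.Random(seed)
    totals = [0] * len(genomes)
    for _ in range(n_permutations):
        order = genomes[:]
        rng.shuffle(order)
        seen = set()
        for i, gene_set in enumerate(order):
            seen |= gene_set
            totals[i] += len(seen)
    return [t / n_permutations for t in totals]


# Toy phage genomes described by gene-family labels (illustrative only).
toy_genomes = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "e", "f"},
    {"b", "g"},
    {"h", "i", "j"},
]

curve = rarefaction_curve(toy_genomes)
```

    The final point of the curve always equals the total number of distinct families in the input, since every ordering ends with all genomes included.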

  5. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

    PubMed

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

    2015-04-28

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery.

  6. Protein structure similarity clustering and natural product structure as guiding principles for chemical genomics.

    PubMed

    Koch, M A; Waldmann, H

    2006-01-01

    The majority of all proteins are modularly built from a limited set of approximately 1,000 structural domains. The knowledge of a common protein fold topology in the ligand-sensing cores of protein domains can be exploited for the design of small-molecule libraries in the development of inhibitors and ligands. Thus, a novel strategy of clustering protein domain cores based exclusively on structure similarity considerations (protein structure similarity clustering, PSSC) has been successfully applied to the development of small-molecule inhibitors of acetylcholinesterase and the 11beta-hydroxysteroid dehydrogenases based on the structure of a naturally occurring Cdc25 inhibitor. The efficiency of making use of the scaffolds of natural products as biologically prevalidated starting points for the design of compound libraries is further highlighted by the development of benzopyran-based FXR ligands.

  7. The Impact of Spatial Structure on Viral Genomic Diversity Generated during Adaptation to Thermal Stress

    PubMed Central

    Ally, Dilara; Wiss, Valorie R.; Deckert, Gail E.; Green, Danielle; Roychoudhury, Pavitra; Wichman, Holly A.; Brown, Celeste J.; Krone, Stephen M.

    2014-01-01

    Background Most clinical and natural microbial communities live and evolve in spatially structured environments. When changes in environmental conditions trigger evolutionary responses, spatial structure can impact the types of adaptive response and the extent to which they spread. In particular, localized competition in a spatial landscape can lead to the emergence of a larger number of different adaptive trajectories than would be found in well-mixed populations. Our goal was to determine how two levels of spatial structure affect genomic diversity in a population and how this diversity is manifested spatially. Methodology/Principal Findings We serially transferred bacteriophage populations growing at high temperatures (40°C) on agar plates for 550 generations at two levels of spatial structure. The level of spatial structure was determined by whether the physical locations of the phage subsamples were preserved or disrupted at each passage to fresh bacterial host populations. When spatial structure of the phage populations was preserved, there was significantly greater diversity on a global scale with restricted and patchy distribution. When spatial structure was disrupted with passaging to fresh hosts, beneficial mutants were spread across the entire plate. This resulted in reduced diversity, possibly due to clonal interference as the most fit mutants entered into competition on a global scale. Almost all substitutions present at the end of the adaptation in the populations with disrupted spatial structure were also present in the populations with structure preserved. Conclusions/Significance Our results are consistent with the patchy nature of the spread of adaptive mutants in a spatial landscape. Spatial structure enhances diversity and slows fixation of beneficial mutants. This added diversity could be beneficial in fluctuating environments. We also connect observed substitutions and their effects on fitness to aspects of phage biology, and we provide

  8. From Genome to Structure and Back Again: A Family Portrait of the Transcarbamylases

    PubMed Central

    Shi, Dashuang; Allewell, Norma M.; Tuchman, Mendel

    2015-01-01

    Enzymes in the transcarbamylase family catalyze the transfer of a carbamyl group from carbamyl phosphate (CP) to an amino group of a second substrate. The two best-characterized members, aspartate transcarbamylase (ATCase) and ornithine transcarbamylase (OTCase), are present in most organisms from bacteria to humans. Recently, structures of four new transcarbamylase members, N-acetyl-l-ornithine transcarbamylase (AOTCase), N-succinyl-l-ornithine transcarbamylase (SOTCase), ygeW encoded transcarbamylase (YTCase) and putrescine transcarbamylase (PTCase) have also been determined. Crystal structures of these enzymes have shown that they have a common overall fold with a trimer as their basic biological unit. The monomer structures share a common CP binding site in their N-terminal domain, but have different second substrate binding sites in their C-terminal domain. The discovery of three new transcarbamylases, l-2,3-diaminopropionate transcarbamylase (DPTCase), l-2,4-diaminobutyrate transcarbamylase (DBTCase) and ureidoglycine transcarbamylase (UGTCase), demonstrates that our knowledge and understanding of the spectrum of the transcarbamylase family is still incomplete. In this review, we summarize studies on the structures and function of transcarbamylases demonstrating how structural information helps to define biological function and how small structural differences govern enzyme specificity. Such information is important for correctly annotating transcarbamylase sequences in the genome databases and for identifying new members of the transcarbamylase family. PMID:26274952

  9. RNATOPS-W: a web server for RNA structure searches of genomes

    PubMed Central

    Wang, Yingfeng; Huang, Zhibin; Wu, Yong; Malmberg, Russell L.; Cai, Liming

    2009-01-01

    Summary: RNATOPS-W is a web server to search sequences for RNA secondary structures including pseudoknots. The server accepts an annotated RNA multiple structural alignment as a structural profile and genomic or other sequences to search. It is built upon RNATOPS, a command line C++ software package for the same purpose, in which filters to speed up search are manually selected. RNATOPS-W improves upon RNATOPS by adding the function of automatic selection of a hidden Markov model (HMM) filter and also a friendly user interface for selection of a substructure filter by the user. In addition, RNATOPS-W complements existing RNA secondary structure search web servers that either use built-in structure profiles or are not able to detect pseudoknots. RNATOPS-W inherits the efficiency of RNATOPS in detecting large, complex RNA structures. Availability: The web server RNATOPS-W is available at the web site www.uga.edu/RNA-Informatics/?f=software&p=RNATOPS-w. The underlying search program RNATOPS can be downloaded at www.uga.edu/RNA-Informatics/?f=software&p=RNATOPS. Contact: cai@cs.uga.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19269988

  10. ViVar: A Comprehensive Platform for the Analysis and Visualization of Structural Genomic Variation

    PubMed Central

    Sante, Tom; Vergult, Sarah; Volders, Pieter-Jan; Kloosterman, Wigard P.; Trooskens, Geert; De Preter, Katleen; Dheedene, Annelies; Speleman, Frank; De Meyer, Tim; Menten, Björn

    2014-01-01

    Structural genomic variations play an important role in human disease and phenotypic diversity. With the rise of high-throughput sequencing tools, mate-pair/paired-end/single-read sequencing has become an important technique for the detection and exploration of structural variation. Several analysis tools exist to handle different parts and aspects of such sequencing-based structural variation analysis pipelines. A comprehensive analysis platform to handle all steps, from processing the sequencing data to the discovery and visualization of structural variants, is missing. The ViVar platform is built to handle the discovery of structural variants, from Depth Of Coverage analysis and aberrant read pair clustering to split read analysis. ViVar provides you with powerful visualization options, enables easy reporting of results and better usability and data management. The platform facilitates the processing, analysis and visualization of structural variation based on massively parallel sequencing data, enabling the rapid identification of disease loci or genes. ViVar allows you to scale your analysis with your work load over multiple (cloud) servers, has user access control to keep your data safe and is easily expandable as analysis techniques advance. URL: https://www.cmgg.be/vivar/ PMID:25503062

  11. Core genome conservation of Staphylococcus haemolyticus limits sequence based population structure analysis.

    PubMed

    Cavanagh, Jorunn Pauline; Klingenberg, Claus; Hanssen, Anne-Merethe; Fredheim, Elizabeth Aarag; Francois, Patrice; Schrenzel, Jacques; Flægstad, Trond; Sollid, Johanna Ericson

    2012-06-01

    The notoriously multi-resistant Staphylococcus haemolyticus is an emerging pathogen causing serious infections in immunocompromised patients. Defining the population structure is important to detect outbreaks and spread of antimicrobial resistant clones. Currently, the standard typing technique is pulsed-field gel electrophoresis (PFGE). In this study we describe novel molecular typing schemes for S. haemolyticus using multi locus sequence typing (MLST) and multi locus variable number of tandem repeats (VNTR) analysis. Seven housekeeping genes (MLST) and five VNTR loci (MLVF) were selected for the novel typing schemes. A panel of 45 human and veterinary S. haemolyticus isolates was investigated. The collection had diverse PFGE patterns (38 PFGE types) and was sampled over a 20-year period from eight countries. MLST resolved 17 sequence types (Simpson's index of diversity [SID]=0.877) and MLVF resolved 14 repeat types (SID=0.831). We found low sequence diversity. Phylogenetic analysis clustered the isolates in three (MLST) and one (MLVF) clonal complexes, respectively. Taken together, neither the MLST nor the MLVF scheme was suitable to resolve the population structure of this S. haemolyticus collection. Future MLVF and MLST schemes will benefit from addition of more variable core genome sequences identified by comparing different fully sequenced S. haemolyticus genomes. PMID:22484086
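
    Simpson's index of diversity, reported above for both typing schemes, is commonly computed for typing data as 1 - sum(n_j(n_j - 1)) / (N(N - 1)), where n_j is the number of isolates of type j and N is the total number of isolates. The sketch below assumes that form; the isolate labels are invented and do not reproduce the study's 0.877 or 0.831 values, which depend on the actual type distribution.

```python
from collections import Counter


def simpsons_index_of_diversity(type_assignments):
    """Probability that two isolates drawn without replacement differ in type."""
    assignments = list(type_assignments)
    n = len(assignments)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(assignments)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical panel: ten isolates assigned to four sequence types.
isolates = ["ST1"] * 4 + ["ST2"] * 3 + ["ST3"] * 2 + ["ST4"]
sid = simpsons_index_of_diversity(isolates)
```

    The index is 0 when every isolate shares one type and 1 when every isolate has its own type, so values near 0.8 to 0.9 indicate that a scheme separates most, but not all, isolates.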

  12. Deciphering the fine-structure of tribal admixture in the Bedouin population using genomic data.

    PubMed

    Markus, B; Alshafee, I; Birk, O S

    2014-02-01

    The Bedouin Israeli population is highly inbred and structured with a very high prevalence of recessive diseases. Many studies in the past two decades focused on linkage analysis in large, multiple consanguineous pedigrees of this population. The advent of high-throughput technologies motivated researchers to search for rare variants shared between smaller pedigrees, integrating data from clinically similar yet seemingly non-related sporadic cases. However, such analyses are challenging because, without pedigree data, there is no prior knowledge regarding possible relatedness between the sporadic cases. Here, we describe models and techniques for the study of relationships between pedigrees and use them for the inference of tribal co-ancestry, delineating the complex social interactions between different tribes in the Negev Bedouins of southern Israel. Through our analysis, we differentiate between tribes that share many yet small genomic segments because of co-ancestry versus tribes that share larger segments because of recent admixture. The emergent pattern is well correlated with the prevalence of rare mutations in the different tribes. Tribes that do not intermarry, mostly because of social restrictions, hold private mutations, whereas tribes that do intermarry demonstrate a genetic flow of mutations between them. Thus, social structure within an inbred community can be delineated through genomic data, with implications to genetic counseling and genetic mapping.

  13. Deciphering the fine-structure of tribal admixture in the Bedouin population using genomic data

    PubMed Central

    Markus, B; Alshafee, I; Birk, O S

    2014-01-01

    The Bedouin Israeli population is highly inbred and structured with a very high prevalence of recessive diseases. Many studies in the past two decades focused on linkage analysis in large, multiple consanguineous pedigrees of this population. The advent of high-throughput technologies motivated researchers to search for rare variants shared between smaller pedigrees, integrating data from clinically similar yet seemingly non-related sporadic cases. However, such analyses are challenging because, without pedigree data, there is no prior knowledge regarding possible relatedness between the sporadic cases. Here, we describe models and techniques for the study of relationships between pedigrees and use them for the inference of tribal co-ancestry, delineating the complex social interactions between different tribes in the Negev Bedouins of southern Israel. Through our analysis, we differentiate between tribes that share many yet small genomic segments because of co-ancestry versus tribes that share larger segments because of recent admixture. The emergent pattern is well correlated with the prevalence of rare mutations in the different tribes. Tribes that do not intermarry, mostly because of social restrictions, hold private mutations, whereas tribes that do intermarry demonstrate a genetic flow of mutations between them. Thus, social structure within an inbred community can be delineated through genomic data, with implications to genetic counseling and genetic mapping. PMID:24084643

  14. Genome Scan for Selection in Structured Layer Chicken Populations Exploiting Linkage Disequilibrium Information.

    PubMed

    Gholami, Mahmood; Reimer, Christian; Erbe, Malena; Preisinger, Rudolf; Weigend, Annett; Weigend, Steffen; Servin, Bertrand; Simianer, Henner

    2015-01-01

    An increasing interest is being placed in the detection of genes, or genomic regions, that have been targeted by selection because identifying signatures of selection can lead to a better understanding of genotype-phenotype relationships. A common strategy for the detection of selection signatures is to compare samples from distinct populations and to search for genomic regions with outstanding genetic differentiation. The aim of this study was to detect selective signatures in layer chicken populations using a recently proposed approach, hapFLK, which exploits linkage disequilibrium information while accounting appropriately for the hierarchical structure of populations. We performed the analysis on 70 individuals from three commercial layer breeds (White Leghorn, White Rock and Rhode Island Red), genotyped for approximately 1 million SNPs. We found a total of 41 and 107 regions with outstanding differentiation or similarity using hapFLK and its single SNP counterpart FLK respectively. Annotation of selection signature regions revealed various genes and QTL corresponding to production traits, for which layer breeds were selected. A number of the detected genes were associated with growth and carcass traits, including IGF-1R, AGRP and STAT5B. We also annotated an interesting gene associated with the dark brown feather color mutational phenotype in chickens (SOX10). We compared FST, FLK and hapFLK and demonstrated that exploiting linkage disequilibrium information and accounting for hierarchical population structure decreased the false detection rate.
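
    Of the three statistics compared here, FST is the simplest to illustrate. The sketch below computes a textbook single-SNP FST as (HT - HS) / HT from per-population allele frequencies, assuming equal weighting of populations and a biallelic marker. It is not FLK or hapFLK, which additionally model the population tree and haplotype information, and the frequencies shown are invented.

```python
def fst(allele_freqs):
    """Single-SNP FST from the reference-allele frequency in each population."""
    k = len(allele_freqs)
    p_bar = sum(allele_freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity in the pooled population
    if h_t == 0:
        return 0.0                 # monomorphic site: no differentiation to measure
    h_s = sum(2 * p * (1 - p) for p in allele_freqs) / k  # mean within-population value
    return (h_t - h_s) / h_t


# Invented allele frequencies in three breeds at three SNPs.
undifferentiated = fst([0.5, 0.5, 0.5])
moderately_differentiated = fst([0.1, 0.5, 0.9])
fixed_difference = fst([0.0, 1.0])
```

    A genome scan based on this statistic would rank SNPs or windows by FST and inspect the upper tail; the abstract reports that methods accounting for linkage disequilibrium and hierarchical structure produced fewer false detections than this kind of unstructured comparison.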

  15. Structural and Functional Characterization of an Influenza Virus RNA Polymerase-Genomic RNA Complex

    PubMed Central

    Resa-Infante, Patricia; Recuero-Checa, María Ángeles; Zamarreño, Noelia; Llorca, Óscar; Ortín, Juan

    2010-01-01

    The replication and transcription of influenza A virus are carried out by ribonucleoproteins (RNPs) containing each genomic RNA segment associated with nucleoprotein monomers and the heterotrimeric polymerase complex. These RNPs are responsible for virus transcription and replication in the infected cell nucleus. Here we have expressed, purified, and analyzed, structurally and functionally, for the first time, polymerase-RNA template complexes obtained after replication in vivo. These complexes were generated by the cotransfection of plasmids expressing the polymerase subunits and a genomic plasmid expressing a minimal template of positive or negative polarity. Their generation in vivo was strictly dependent on the polymerase activity; they contained mainly negative-polarity viral RNA (vRNA) and could transcribe and replicate in vitro. The three-dimensional structure of the monomeric polymerase-vRNA complexes was similar to that of the RNP-associated polymerase and distinct from that of the polymerase devoid of template. These results suggest that the interaction with the template is sufficient to induce a significant conformation switch in the polymerase complex. PMID:20702645

  16. Complete mitogenome of the edible sea urchin Loxechinus albus: genetic structure and comparative genomics within Echinozoa.

    PubMed

    Cea, Graciela; Gaitán-Espitia, Juan Diego; Cárdenas, Leyla

    2015-06-01

    The edible Chilean red sea urchin, Loxechinus albus, is the only species of its genus and endemic to the Southeastern Pacific. In this study, we reconstructed the mitochondrial genome of L. albus by combining Sanger and pyrosequencing technologies. The mtDNA genome had a length of 15,737 bp and encoded the same 13 protein-coding genes, 22 transfer RNA genes, and two ribosomal RNA genes as other animal mtDNAs. The size of this mitogenome was similar to those of other Echinodermata species. Structural comparisons showed a highly conserved structure, composition, and gene order within Echinoidea and Holothuroidea, and nearly identical gene organization to that found in Asteroidea and Crinoidea, with the majority of differences explained by the inversions of some tRNA genes. Phylogenetic reconstruction supported the monophyly of Echinozoa and recovered the monophyletic relationship of Holothuroidea and Echinoidea. Within Holothuroidea, Bayesian and maximum likelihood analyses recovered a sister-group relationship between Dendrochirotacea and Aspidochirotida. Similarly within Echinoidea, these analyses revealed that L. albus was closely related to Paracentrotus lividus, both being part of a sister group to Strongylocentrotidae and Echinometridae. In addition, two major clades were found within Strongylocentrotidae. One of these clades comprised all of the representative species of Strongylocentrotus and Hemicentrotus, whereas the other included species of Mesocentrotus and Pseudocentrotus.

  17. Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing

    Cancer.gov

    Abstract: Diffuse large B-cell lymphoma (DLBCL) is a genetically heterogeneous cancer comprising at least two molecular subtypes that differ in gene expression and distribution of mutations. Recently, application of genome/exome sequencing and RNA-seq to DLBCL has revealed numerous genes that are recurrent targets of somatic point mutation in this disease.

  18. The complete plastid genome sequence of the parasitic green alga Helicosporidium sp. is highly reduced and structured

    PubMed Central

    de Koning, Audrey P; Keeling, Patrick J

    2006-01-01

    Background: Loss of photosynthesis has occurred independently in several plant and algal lineages, and represents a major metabolic shift with potential consequences for the content and structure of plastid genomes. To investigate such changes, we sequenced the complete plastid genome of the parasitic, non-photosynthetic green alga, Helicosporidium. Results: The Helicosporidium plastid genome is among the smallest known (37.5 kb), and like other plastids from non-photosynthetic organisms it lacks all genes for proteins that function in photosynthesis. Its reduced size results from more than just loss of genes, however; it has little non-coding DNA, with only one intron and tiny intergenic spaces, and no inverted repeat (no duplicated genes at all). It encodes precisely the minimal complement of tRNAs needed to translate the universal genetic code, and has eliminated all redundant isoacceptors. The Helicosporidium plastid genome is also highly structured, with each half of the circular genome containing nearly all genes on one strand. Helicosporidium is known to be related to trebouxiophyte green algae, but the genome is structured and compacted in a manner more reminiscent of the non-photosynthetic plastids of apicomplexan parasites. Conclusion: Helicosporidium contributes significantly to our understanding of the evolution of plastid DNA because it illustrates the highly ordered reduction that occurred following the loss of a major metabolic function. The convergence of plastid genome structure in Helicosporidium and the Apicomplexa raises the interesting possibility that there are common forces that shape plastid genomes, subsequent to the loss of photosynthesis in an organism. PMID:16630350

  19. HSA: integrating multi-track Hi-C data for genome-scale reconstruction of 3D chromatin structure.

    PubMed

    Zou, Chenchen; Zhang, Yuping; Ouyang, Zhengqing

    2016-03-02

    Genome-wide 3C technologies (Hi-C) are being increasingly employed to study three-dimensional (3D) genome conformations. Existing computational approaches are unable to integrate accumulating data to facilitate studying 3D chromatin structure and function. We present HSA (http://ouyanglab.jax.org/hsa/), a flexible tool that jointly analyzes multiple contact maps to infer 3D chromatin structure at the genome scale. HSA globally searches the latent structure underlying different cleavage footprints. Its robustness and accuracy outperform or rival existing tools on extensive simulations and orthogonal experiment validations. Applying HSA to recent in situ Hi-C data, we found the 3D chromatin structures are highly conserved across various human cell types.

  20. JCGGDB: Japan Consortium for Glycobiology and Glycotechnology Database.

    PubMed

    Maeda, Masako; Fujita, Noriaki; Suzuki, Yoshinori; Sawaki, Hiromichi; Shikanai, Toshihide; Narimatsu, Hisashi

    2015-01-01

    The biological significance of glycans has been widely studied and reported in the past. However, most achievements of our predecessors are not readily available in existing databases. JCGGDB is a meta-database involving 15 original databases in AIST and 5 cooperative databases in alliance with JCGG: Japan Consortium for Glycobiology and Glycotechnology. It centers on a glycan structure database and accumulates information such as glycan preferences of lectins, glycosylation sites in proteins, and genes related to glycan syntheses from glycoscience and related fields. This chapter illustrates how to use three major search interfaces (Keyword Search, Structure Search, and GlycoChem Explorer) available in JCGGDB to search across multiple databases.

  1. Genomic organization, structure, regulation and pathogenic role of pilus constituents in major pathogenic Streptococci and Enterococci.

    PubMed

    Kreikemeyer, Bernd; Gámez, Gustavo; Margarit, Immaculada; Giard, Jean-Christophe; Hammerschmidt, Sven; Hartke, Axel; Podbielski, Andreas

    2011-03-01

    Oligocomponent pilus structures, recently discovered in many important Gram-positive pathogens, represent a new class of virulence factors with adhesive and matrix protein-binding activity. Some of these proteins have emerged as very promising lead components of protein-based vaccines against Streptococci. These extended surface structures play key roles in host cell and tissue adherence, paracellular translocation, and biofilm formation of major Gram-positive pathogens such as Streptococcus pyogenes, S. agalactiae, S. pneumoniae as well as in opportunistic and nosocomial pathogens like Enterococci. Here, we discuss the similarities and differences of: (1) the genomic organization of the various regions encoding pilus proteins, (2) the number, type, and assembly of the proteins constituting the pili, (3) their expression and regulation mechanisms, (4) their role in bacterial virulence, and (5) their potential as vaccine candidate antigens.

  2. GO-FAANG meeting: A gathering on functional annotation of animal genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The FAANG (Functional Annotation of Animal Genomes) Consortium recently held a Gathering On FAANG (GO-FAANG) Workshop in Washington, DC on October 7-8, 2015. This consortium is a grass-roots organization formed to advance the annotation of newly assembled genomes of non-model organisms (www.faang.or...

  3. SCHEMA computational design of virus capsid chimeras: calibrating how genome packaging, protection, and transduction correlate with calculated structural disruption.

    PubMed

    Ho, Michelle L; Adler, Benjamin A; Torre, Michael L; Silberg, Jonathan J; Suh, Junghae

    2013-12-20

    Adeno-associated virus (AAV) recombination can result in chimeric capsid protein subunits whose ability to assemble into an oligomeric capsid, package a genome, and transduce cells depends on the inheritance of sequence from different AAV parents. To develop quantitative design principles for guiding site-directed recombination of AAV capsids, we have examined how capsid structural perturbations predicted by the SCHEMA algorithm correlate with experimental measurements of disruption in seventeen chimeric capsid proteins. In our small chimera population, created by recombining AAV serotypes 2 and 4, we found that protection of viral genomes and cellular transduction were inversely related to calculated disruption of the capsid structure. Interestingly, however, we did not observe a correlation between genome packaging and calculated structural disruption; a majority of the chimeric capsid proteins formed at least partially assembled capsids and more than half packaged genomes, including those with the highest SCHEMA disruption. These results suggest that the sequence space accessed by recombination of divergent AAV serotypes is rich in capsid chimeras that assemble into 60-mer capsids and package viral genomes. Overall, the SCHEMA algorithm may be useful for delineating quantitative design principles to guide the creation of libraries enriched in genome-protecting virus nanoparticles that can effecti