Science.gov

Sample records for functional genomic resources

  1. Genetic resources offer efficient tools for rice functional genomics research.

    PubMed

    Lo, Shuen-Fang; Fan, Ming-Jen; Hsing, Yue-Ie; Chen, Liang-Jwu; Chen, Shu; Wen, Ien-Chie; Liu, Yi-Lun; Chen, Ku-Ting; Jiang, Mirng-Jier; Lin, Ming-Kuang; Rao, Meng-Yen; Yu, Lin-Chih; Ho, Tuan-Hua David; Yu, Su-May

    2016-05-01

    Rice is an important crop and major model plant for monocot functional genomics studies. With the establishment of various genetic resources for rice genomics, the next challenge is to systematically assign functions to predicted genes in the rice genome. Compared with the robustness of genome sequencing and bioinformatics techniques, progress in understanding the function of rice genes has lagged, hampering the utilization of rice genes for cereal crop improvement. The use of transfer DNA (T-DNA) insertional mutagenesis offers the advantage of uniform distribution throughout the rice genome, but preferentially in gene-rich regions, resulting in direct gene knockout or activation of genes within 20-30 kb up- and downstream of the T-DNA insertion site and high gene tagging efficiency. Here, we summarize the recent progress in functional genomics using the T-DNA-tagged rice mutant population. We also discuss important features of T-DNA activation- and knockout-tagging and promoter-trapping of the rice genome in relation to mutant and candidate gene characterizations and how to more efficiently utilize rice mutant populations and datasets for high-throughput functional genomics and phenomics studies by forward and reverse genetics approaches. These studies may facilitate the translation of rice functional genomics research to improvements of rice and other cereal crops. © 2015 John Wiley & Sons Ltd.

  2. Tetrahymena functional genomics database (TetraFGD): an integrated resource for Tetrahymena functional genomics.

    PubMed

    Xiong, Jie; Lu, Yuming; Feng, Jinmei; Yuan, Dongxia; Tian, Miao; Chang, Yue; Fu, Chengjie; Wang, Guangying; Zeng, Honghui; Miao, Wei

    2013-01-01

    The ciliated protozoan Tetrahymena thermophila is a useful unicellular model organism for studies of eukaryotic cellular and molecular biology. Researches on T. thermophila have contributed to a series of remarkable basic biological principles. After the macronuclear genome was sequenced, substantial progress has been made in functional genomics research on T. thermophila, including genome-wide microarray analysis of the T. thermophila life cycle, a T. thermophila gene network analysis based on the microarray data and transcriptome analysis by deep RNA sequencing. To meet the growing demands for the Tetrahymena research community, we integrated these data to provide a public access database: Tetrahymena functional genomics database (TetraFGD). TetraFGD contains three major resources, including the RNA-Seq transcriptome, microarray and gene networks. The RNA-Seq data define gene structures and transcriptome, with special emphasis on exon-intron boundaries; the microarray data describe gene expression of 20 time points during three major stages of the T. thermophila life cycle; the gene network data identify potential gene-gene interactions of 15 049 genes. The TetraFGD provides user-friendly search functions that assist researchers in accessing gene models, transcripts, gene expression data and gene-gene relationships. In conclusion, the TetraFGD is an important functional genomic resource for researchers who focus on the Tetrahymena or other ciliates. Database URL: http://tfgd.ihb.ac.cn/

  3. Resources for Functional Genomics Studies in Drosophila melanogaster

    PubMed Central

    Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

    2014-01-01

    Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003

  4. proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes

    PubMed Central

    Mende, Daniel R.; Letunic, Ivica; Huerta-Cepas, Jaime; Li, Simone S.; Forslund, Kristoffer; Sunagawa, Shinichi; Bork, Peer

    2017-01-01

    The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de. PMID:28053165

  5. proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes.

    PubMed

    Mende, Daniel R; Letunic, Ivica; Huerta-Cepas, Jaime; Li, Simone S; Forslund, Kristoffer; Sunagawa, Shinichi; Bork, Peer

    2017-01-04

    The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de.

  6. Wheat EST resources for functional genomics of abiotic stress

    PubMed Central

    Houde, Mario; Belcaid, Mahdi; Ouellet, François; Danyluk, Jean; Monroy, Antonio F; Dryanova, Ani; Gulick, Patrick; Bergeron, Anne; Laroche, André; Links, Matthew G; MacCarthy, Luke; Crosby, William L; Sarhan, Fathey

    2006-01-01

    Background Wheat is an excellent species to study freezing tolerance and other abiotic stresses. However, the sequence of the wheat genome has not been completely characterized due to its complexity and large size. To circumvent this obstacle and identify genes involved in cold acclimation and associated stresses, a large scale EST sequencing approach was undertaken by the Functional Genomics of Abiotic Stress (FGAS) project. Results We generated 73,521 quality-filtered ESTs from eleven cDNA libraries constructed from wheat plants exposed to various abiotic stresses and at different developmental stages. In addition, 196,041 ESTs for which tracefiles were available from the National Science Foundation wheat EST sequencing program and DuPont were also quality-filtered and used in the analysis. Clustering of the combined ESTs with d2_cluster and TGICL yielded a few large clusters containing several thousand ESTs that were refractory to routine clustering techniques. To resolve this problem, the sequence proximity and "bridges" were identified by an e-value distance graph to manually break clusters into smaller groups. Assembly of the resolved ESTs generated a 75,488 unique sequence set (31,580 contigs and 43,908 singletons/singlets). Digital expression analyses indicated that the FGAS dataset is enriched in stress-regulated genes compared to the other public datasets. Over 43% of the unique sequence set was annotated and classified into functional categories according to Gene Ontology. Conclusion We have annotated 29,556 different sequences, an almost 5-fold increase in annotated sequences compared to the available wheat public databases. Digital expression analysis combined with gene annotation helped in the identification of several pathways associated with abiotic stress. The genomic resources and knowledge developed by this project will contribute to a better understanding of the different mechanisms that govern stress tolerance in wheat and other cereals. PMID

  7. Wheat EST resources for functional genomics of abiotic stress.

    PubMed

    Houde, Mario; Belcaid, Mahdi; Ouellet, François; Danyluk, Jean; Monroy, Antonio F; Dryanova, Ani; Gulick, Patrick; Bergeron, Anne; Laroche, André; Links, Matthew G; MacCarthy, Luke; Crosby, William L; Sarhan, Fathey

    2006-06-13

    Wheat is an excellent species to study freezing tolerance and other abiotic stresses. However, the sequence of the wheat genome has not been completely characterized due to its complexity and large size. To circumvent this obstacle and identify genes involved in cold acclimation and associated stresses, a large scale EST sequencing approach was undertaken by the Functional Genomics of Abiotic Stress (FGAS) project. We generated 73,521 quality-filtered ESTs from eleven cDNA libraries constructed from wheat plants exposed to various abiotic stresses and at different developmental stages. In addition, 196,041 ESTs for which tracefiles were available from the National Science Foundation wheat EST sequencing program and DuPont were also quality-filtered and used in the analysis. Clustering of the combined ESTs with d2_cluster and TGICL yielded a few large clusters containing several thousand ESTs that were refractory to routine clustering techniques. To resolve this problem, the sequence proximity and "bridges" were identified by an e-value distance graph to manually break clusters into smaller groups. Assembly of the resolved ESTs generated a 75,488 unique sequence set (31,580 contigs and 43,908 singletons/singlets). Digital expression analyses indicated that the FGAS dataset is enriched in stress-regulated genes compared to the other public datasets. Over 43% of the unique sequence set was annotated and classified into functional categories according to Gene Ontology. We have annotated 29,556 different sequences, an almost 5-fold increase in annotated sequences compared to the available wheat public databases. Digital expression analysis combined with gene annotation helped in the identification of several pathways associated with abiotic stress. The genomic resources and knowledge developed by this project will contribute to a better understanding of the different mechanisms that govern stress tolerance in wheat and other cereals.

  8. An Italian functional genomic resource for Medicago truncatula

    PubMed Central

    Porceddu, Andrea; Panara, Francesco; Calderini, Ornella; Molinari, Lorna; Taviani, Paola; Lanfaloni, Luisa; Scotti, Carla; Carelli, Maria; Scaramelli, Laura; Bruschi, Gianluca; Cosson, Viviane; Ratet, Pascal; de Larembergue, Henri; Duc, Gerard; Piano, Efisio; Arcioni, Sergio

    2008-01-01

    Background Medicago truncatula is a model species for legumes. Its functional genomics have been considerably boosted in recent years due to initiatives based both in Europe and US. Collections of mutants are becoming increasingly available and this will help unravel the genetic control of important traits for many species of legumes. Findings Our report is on the production of three complementary mutant collections of the model species Medicago truncatula produced in Italy in the frame of a national genomic initiative. Well established strategies were used: Tnt1 mutagenesis, TILLING and activation tagging. Both forward and reverse genetics screenings proved the efficiency of the mutagenesis approaches adopted, enabling the isolation of interesting mutants which are in course of characterization. We anticipate that the reported collections will be complementary to the recently established functional genomics tools developed for Medicago truncatula both in Europe and in the United States. PMID:19077311

  9. A Populus EST resource for plant functional genomics

    PubMed Central

    Sterky, Fredrik; Bhalerao, Rupali R.; Unneberg, Per; Segerman, Bo; Nilsson, Peter; Brunner, Amy M.; Charbonnel-Campaa, Laurence; Lindvall, Jenny Jonsson; Tandre, Karolina; Strauss, Steven H.; Sundberg, Björn; Gustafsson, Petter; Uhlén, Mathias; Bhalerao, Rishikesh P.; Nilsson, Ove; Sandberg, Göran; Karlsson, Jan; Lundeberg, Joakim; Jansson, Stefan

    2004-01-01

    Trees present a life form of paramount importance for terrestrial ecosystems and human societies because of their ecological structure and physiological function and provision of energy and industrial materials. The genus Populus is the internationally accepted model for molecular tree biology. We have analyzed 102,019 Populus ESTs that clustered into 11,885 clusters and 12,759 singletons. We also provide >4,000 assembled full clone sequences to serve as a basis for the upcoming annotation of the Populus genome sequence. A public web-based EST database (populusdb) provides digital expression profiles for 18 tissues that comprise the majority of differentiated organs. The coding content of Populus and Arabidopsis genomes shows very high similarity, indicating that differences between these annual and perennial angiosperm life forms result primarily from differences in gene regulation. The high similarity between Populus and Arabidopsis will allow studies of Populus to directly benefit from the detailed functional genomic information generated for Arabidopsis, enabling detailed insights into tree development and adaptation. These data will also valuable for functional genomic efforts in Arabidopsis. PMID:15353603

  10. Comprehensive Resources for Tomato Functional Genomics Based on the Miniature Model Tomato Micro-Tom

    PubMed Central

    Matsukura, C; Aoki, K; Fukuda, N; Mizoguchi, T; Asamizu, E; Saito, T; Shibata, D; Ezura, H

    2008-01-01

    Tomato (Solanum lycopersicum L., Solanaceae) is an excellent model plant for genomic research of solanaceous plants, as well as for studying the development, ripening, and metabolism of fruit. In 2003, the International Solanaceae Project (SOL, www.sgn.cornell.edu ) was initiated by members from more than 30 countries, and the tomato genome-sequencing project is currently underway. Genome sequence of tomato obtained by this project will provide a firm foundation for forthcoming genomic studies such as the comparative analysis of genes conserved among the Solanaceae species and the elucidation of the functions of unknown tomato genes. To exploit the wealth of the genome sequence information, there is an urgent need for novel resources and analytical tools for tomato functional genomics. Here, we present an overview of the development of genetic and genomic resources of tomato in the last decade, with a special focus on the activities of Japan SOL and the National Bio-Resource Project in the development of functional genomic resources of a model cultivar, Micro-Tom. PMID:19506732

  11. A diploid wheat TILLING resource for wheat functional genomics.

    PubMed

    Rawat, Nidhi; Sehgal, Sunish K; Joshi, Anupama; Rothe, Nolan; Wilson, Duane L; McGraw, Nathan; Vadlani, Praveen V; Li, Wanlong; Gill, Bikram S

    2012-11-07

    Triticum monococcum L., an A genome diploid einkorn wheat, was the first domesticated crop. As a diploid, it is attractive genetic model for the study of gene structure and function of wheat-specific traits. Diploid wheat is currently not amenable to reverse genetics approaches such as insertion mutagenesis and post-transcriptional gene silencing strategies. However, TILLING offers a powerful functional genetics approach for wheat gene analysis. We developed a TILLING population of 1,532 M2 families using EMS as a mutagen. A total of 67 mutants were obtained for the four genes studied. Waxy gene mutation frequencies are known to be 1/17.6 - 34.4 kb DNA in polyploid wheat TILLING populations. The T. monococcum diploid wheat TILLING population had a mutation frequency of 1/90 kb for the same gene. Lignin biosynthesis pathway genes- COMT1, HCT2, and 4CL1 had mutation frequencies of 1/86 kb, 1/92 kb and 1/100 kb, respectively. The overall mutation frequency of the diploid wheat TILLING population was 1/92 kb. The mutation frequency of a diploid wheat TILLING population was found to be higher than that reported for other diploid grasses. The rate, however, is lower than tetraploid and hexaploid wheat TILLING populations because of the higher tolerance of polyploids to mutations. Unlike polyploid wheat, most mutants in diploid wheat have a phenotype amenable to forward and reverse genetic analysis and establish diploid wheat as an attractive model to study gene function in wheat. We estimate that a TILLING population of 5, 520 will be needed to get a non-sense mutation for every wheat gene of interest with 95% probability.

  12. A diploid wheat TILLING resource for wheat functional genomics

    PubMed Central

    2012-01-01

    Background Triticum monococcum L., an A genome diploid einkorn wheat, was the first domesticated crop. As a diploid, it is attractive genetic model for the study of gene structure and function of wheat-specific traits. Diploid wheat is currently not amenable to reverse genetics approaches such as insertion mutagenesis and post-transcriptional gene silencing strategies. However, TILLING offers a powerful functional genetics approach for wheat gene analysis. Results We developed a TILLING population of 1,532 M2 families using EMS as a mutagen. A total of 67 mutants were obtained for the four genes studied. Waxy gene mutation frequencies are known to be 1/17.6 - 34.4 kb DNA in polyploid wheat TILLING populations. The T. monococcum diploid wheat TILLING population had a mutation frequency of 1/90 kb for the same gene. Lignin biosynthesis pathway genes- COMT1, HCT2, and 4CL1 had mutation frequencies of 1/86 kb, 1/92 kb and 1/100 kb, respectively. The overall mutation frequency of the diploid wheat TILLING population was 1/92 kb. Conclusion The mutation frequency of a diploid wheat TILLING population was found to be higher than that reported for other diploid grasses. The rate, however, is lower than tetraploid and hexaploid wheat TILLING populations because of the higher tolerance of polyploids to mutations. Unlike polyploid wheat, most mutants in diploid wheat have a phenotype amenable to forward and reverse genetic analysis and establish diploid wheat as an attractive model to study gene function in wheat. We estimate that a TILLING population of 5, 520 will be needed to get a non-sense mutation for every wheat gene of interest with 95% probability. PMID:23134614

  13. MIPS plant genome information resources.

    PubMed

    Spannagl, Manuel; Haberer, Georg; Ernst, Rebecca; Schoof, Heiko; Mayer, Klaus F X

    2007-01-01

    The Munich Institute for Protein Sequences (MIPS) has been involved in maintaining plant genome databases since the Arabidopsis thaliana genome project. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable data sets for model plant genomes as a backbone against which experimental data, for example from high-throughput functional genomics, can be organized and evaluated. In addition, model genomes also form a scaffold for comparative genomics, and much can be learned from genome-wide evolutionary studies.

  14. Genomic Resources for Gene Discovery, Functional Genome Annotation, and Evolutionary Studies of Maize and Its Close Relatives

    PubMed Central

    Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S.S.; Kudrna, David A.; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A.; Luo, Meizhong

    2013-01-01

    Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics. PMID:24037269

  15. Genomic resources for gene discovery, functional genome annotation, and evolutionary studies of maize and its close relatives.

    PubMed

    Wang, Chao; Shi, Xue; Liu, Lin; Li, Haiyan; Ammiraju, Jetty S S; Kudrna, David A; Xiong, Wentao; Wang, Hao; Dai, Zhaozhao; Zheng, Yonglian; Lai, Jinsheng; Jin, Weiwei; Messing, Joachim; Bennetzen, Jeffrey L; Wing, Rod A; Luo, Meizhong

    2013-11-01

    Maize is one of the most important food crops and a key model for genetics and developmental biology. A genetically anchored and high-quality draft genome sequence of maize inbred B73 has been obtained to serve as a reference sequence. To facilitate evolutionary studies in maize and its close relatives, much like the Oryza Map Alignment Project (OMAP) (www.OMAP.org) bacterial artificial chromosome (BAC) resource did for the rice community, we constructed BAC libraries for maize inbred lines Zheng58, Chang7-2, and Mo17 and maize wild relatives Zea mays ssp. parviglumis and Tripsacum dactyloides. Furthermore, to extend functional genomic studies to maize and sorghum, we also constructed binary BAC (BIBAC) libraries for the maize inbred B73 and the sorghum landrace Nengsi-1. The BAC/BIBAC vectors facilitate transfer of large intact DNA inserts from BAC clones to the BIBAC vector and functional complementation of large DNA fragments. These seven Zea Map Alignment Project (ZMAP) BAC/BIBAC libraries have average insert sizes ranging from 92 to 148 kb, organellar DNA from 0.17 to 2.3%, empty vector rates between 0.35 and 5.56%, and genome equivalents of 4.7- to 8.4-fold. The usefulness of the Parviglumis and Tripsacum BAC libraries was demonstrated by mapping clones to the reference genome. Novel genes and alleles present in these ZMAP libraries can now be used for functional complementation studies and positional or homology-based cloning of genes for translational genomics.

  16. A reversible haploid mouse embryonic stem cell biobank resource for functional genomics.

    PubMed

    Elling, Ulrich; Wimmer, Reiner A; Leibbrandt, Andreas; Burkard, Thomas; Michlits, Georg; Leopoldi, Alexandra; Micheler, Thomas; Abdeen, Dana; Zhuk, Sergei; Aspalter, Irene M; Handl, Cornelia; Liebergesell, Julia; Hubmann, Maria; Husa, Anna-Maria; Kinzer, Manuela; Schuller, Nicole; Wetzel, Ellen; van de Loo, Nina; Martinez, Jorge Arturo Zepeda; Estoppey, David; Riedl, Ralph; Yang, Fengtang; Fu, Beiyuan; Dechat, Thomas; Ivics, Zoltán; Agu, Chukwuma A; Bell, Oliver; Blaas, Dieter; Gerhardt, Holger; Hoepfner, Dominic; Stark, Alexander; Penninger, Josef M

    2017-10-05

    The ability to directly uncover the contributions of genes to a given phenotype is fundamental for biology research. However, ostensibly homogeneous cell populations exhibit large clonal variance that can confound analyses and undermine reproducibility. Here we used genome-saturated mutagenesis to create a biobank of over 100,000 individual haploid mouse embryonic stem (mES) cell lines targeting 16,970 genes with genetically barcoded, conditional and reversible mutations. This Haplobank is, to our knowledge, the largest resource of hemi/homozygous mutant mES cells to date and is available to all researchers. Reversible mutagenesis overcomes clonal variance by permitting functional annotation of the genome directly in sister cells. We use the Haplobank in reverse genetic screens to investigate the temporal resolution of essential genes in mES cells, and to identify novel genes that control sprouting angiogenesis and lineage specification of blood vessels. Furthermore, a genome-wide forward screen with Haplobank identified PLA2G16 as a host factor that is required for cytotoxicity by rhinoviruses, which cause the common cold. Therefore, clones from the Haplobank combined with the use of reversible technologies enable high-throughput, reproducible, functional annotation of the genome.

  17. TriTrypDB: a functional genomic resource for the Trypanosomatidae

    PubMed Central

    Aslett, Martin; Aurrecoechea, Cristina; Berriman, Matthew; Brestelli, John; Brunk, Brian P.; Carrington, Mark; Depledge, Daniel P.; Fischer, Steve; Gajria, Bindu; Gao, Xin; Gardner, Malcolm J.; Gingle, Alan; Grant, Greg; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Houston, Robin; Innamorato, Frank; Iodice, John; Kissinger, Jessica C.; Kraemer, Eileen; Li, Wei; Logan, Flora J.; Miller, John A.; Mitra, Siddhartha; Myler, Peter J.; Nayak, Vishal; Pennington, Cary; Phan, Isabelle; Pinney, Deborah F.; Ramasamy, Gowthaman; Rogers, Matthew B.; Roos, David S.; Ross, Chris; Sivam, Dhileep; Smith, Deborah F.; Srinivasamoorthy, Ganesh; Stoeckert, Christian J.; Subramanian, Sandhya; Thibodeau, Ryan; Tivey, Adrian; Treatman, Charles; Velarde, Giles; Wang, Haiming

    2010-01-01

    TriTrypDB (http://tritrypdb.org) is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. ‘User Comments’ may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate. PMID:19843604

  18. NCBI viral genomes resource.

    PubMed

    Brister, J Rodney; Ako-Adjei, Danso; Bao, Yiming; Blinkova, Olga

    2015-01-01

    Recent technological innovations have ignited an explosion in virus genome sequencing that promises to fundamentally alter our understanding of viral biology and profoundly impact public health policy. Yet, any potential benefits from the billowing cloud of next generation sequence data hinge upon well implemented reference resources that facilitate the identification of sequences, aid in the assembly of sequence reads and provide reference annotation sources. The NCBI Viral Genomes Resource is a reference resource designed to bring order to this sequence shockwave and improve usability of viral sequence data. The resource can be accessed at http://www.ncbi.nlm.nih.gov/genome/viruses/ and catalogs all publicly available virus genome sequences and curates reference genome sequences. As the number of genome sequences has grown, so too have the difficulties in annotating and maintaining reference sequences. The rapid expansion of the viral sequence universe has forced a recalibration of the data model to better provide extant sequence representation and enhanced reference sequence products to serve the needs of the various viral communities. This, in turn, has placed increased emphasis on leveraging the knowledge of individual scientific communities to identify important viral sequences and develop well annotated reference virus genome sets. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.

  19. NCBI Viral Genomes Resource

    PubMed Central

    Brister, J. Rodney; Ako-adjei, Danso; Bao, Yiming; Blinkova, Olga

    2015-01-01

    Recent technological innovations have ignited an explosion in virus genome sequencing that promises to fundamentally alter our understanding of viral biology and profoundly impact public health policy. Yet, any potential benefits from the billowing cloud of next generation sequence data hinge upon well implemented reference resources that facilitate the identification of sequences, aid in the assembly of sequence reads and provide reference annotation sources. The NCBI Viral Genomes Resource is a reference resource designed to bring order to this sequence shockwave and improve usability of viral sequence data. The resource can be accessed at http://www.ncbi.nlm.nih.gov/genome/viruses/ and catalogs all publicly available virus genome sequences and curates reference genome sequences. As the number of genome sequences has grown, so too have the difficulties in annotating and maintaining reference sequences. The rapid expansion of the viral sequence universe has forced a recalibration of the data model to better provide extant sequence representation and enhanced reference sequence products to serve the needs of the various viral communities. This, in turn, has placed increased emphasis on leveraging the knowledge of individual scientific communities to identify important viral sequences and develop well annotated reference virus genome sets. PMID:25428358

  20. Ensembl comparative genomics resources

    PubMed Central

    Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  1. Ensembl comparative genomics resources.

    PubMed

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. © The Author(s) 2016. Published by Oxford University Press.

  2. Function-driven discovery of disease genes in zebrafish using an integrated genomics big data resource.

    PubMed

    Shim, Hongseok; Kim, Ji Hyun; Kim, Chan Yeong; Hwang, Sohyun; Kim, Hyojin; Yang, Sunmo; Lee, Ji Eun; Lee, Insuk

    2016-11-16

    Whole exome sequencing (WES) accelerates disease gene discovery using rare genetic variants, but further statistical and functional evidence is required to avoid false-discovery. To complement variant-driven disease gene discovery, here we present function-driven disease gene discovery in zebrafish (Danio rerio), a promising human disease model owing to its high anatomical and genomic similarity to humans. To facilitate zebrafish-based function-driven disease gene discovery, we developed a genome-scale co-functional network of zebrafish genes, DanioNet (www.inetbio.org/danionet), which was constructed by Bayesian integration of genomics big data. Rigorous statistical assessment confirmed the high prediction capacity of DanioNet for a wide variety of human diseases. To demonstrate the feasibility of the function-driven disease gene discovery using DanioNet, we predicted genes for ciliopathies and performed experimental validation for eight candidate genes. We also validated the existence of heterozygous rare variants in the candidate genes of individuals with ciliopathies yet not in controls derived from the UK10K consortium, suggesting that these variants are potentially involved in enhancing the risk of ciliopathies. These results showed that an integrated genomics big data for a model animal of diseases can expand our opportunity for harnessing WES data in disease gene discovery. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Function-driven discovery of disease genes in zebrafish using an integrated genomics big data resource

    PubMed Central

    Shim, Hongseok; Kim, Ji Hyun; Kim, Chan Yeong; Hwang, Sohyun; Kim, Hyojin; Yang, Sunmo; Lee, Ji Eun; Lee, Insuk

    2016-01-01

    Whole exome sequencing (WES) accelerates disease gene discovery using rare genetic variants, but further statistical and functional evidence is required to avoid false-discovery. To complement variant-driven disease gene discovery, here we present function-driven disease gene discovery in zebrafish (Danio rerio), a promising human disease model owing to its high anatomical and genomic similarity to humans. To facilitate zebrafish-based function-driven disease gene discovery, we developed a genome-scale co-functional network of zebrafish genes, DanioNet (www.inetbio.org/danionet), which was constructed by Bayesian integration of genomics big data. Rigorous statistical assessment confirmed the high prediction capacity of DanioNet for a wide variety of human diseases. To demonstrate the feasibility of the function-driven disease gene discovery using DanioNet, we predicted genes for ciliopathies and performed experimental validation for eight candidate genes. We also validated the existence of heterozygous rare variants in the candidate genes of individuals with ciliopathies yet not in controls derived from the UK10K consortium, suggesting that these variants are potentially involved in enhancing the risk of ciliopathies. These results showed that an integrated genomics big data for a model animal of diseases can expand our opportunity for harnessing WES data in disease gene discovery. PMID:27903883

  4. Generation, functional annotation and comparative analysis of black spruce (Picea mariana) ESTs: an important conifer genomic resource

    PubMed Central

    2013-01-01

    Background EST (expressed sequence tag) sequences and their annotation provide a highly valuable resource for gene discovery, genome sequence annotation, and other genomics studies that can be applied in genetics, breeding and conservation programs for non-model organisms. Conifers are long-lived plants that are ecologically and economically important globally, and have a large genome size. Black spruce (Picea mariana), is a transcontinental species of the North American boreal and temperate forests. However, there are limited transcriptomic and genomic resources for this species. The primary objective of our study was to develop a black spruce transcriptomic resource to facilitate on-going functional genomics projects related to growth and adaptation to climate change. Results We conducted bidirectional sequencing of cDNA clones from a standard cDNA library constructed from black spruce needle tissues. We obtained 4,594 high quality (2,455 5' end and 2,139 3' end) sequence reads, with an average read-length of 532 bp. Clustering and assembly of ESTs resulted in 2,731 unique sequences, consisting of 2,234 singletons and 497 contigs. Approximately two-thirds (63%) of unique sequences were functionally annotated. Genes involved in 36 molecular functions and 90 biological processes were discovered, including 24 putative transcription factors and 232 genes involved in photosynthesis. Most abundantly expressed transcripts were associated with photosynthesis, growth factors, stress and disease response, and transcription factors. A total of 216 full-length genes were identified. About 18% (493) of the transcripts were novel, representing an important addition to the Genbank EST database (dbEST). Fifty-seven di-, tri-, tetra- and penta-nucleotide simple sequence repeats were identified. Conclusions We have developed the first high quality EST resource for black spruce and identified 493 novel transcripts, which may be species-specific related to life history and

  5. Generation, functional annotation and comparative analysis of black spruce (Picea mariana) ESTs: an important conifer genomic resource.

    PubMed

    Mann, Ishminder K; Wegrzyn, Jill L; Rajora, Om P

    2013-10-11

    EST (expressed sequence tag) sequences and their annotation provide a highly valuable resource for gene discovery, genome sequence annotation, and other genomics studies that can be applied in genetics, breeding and conservation programs for non-model organisms. Conifers are long-lived plants that are ecologically and economically important globally, and have a large genome size. Black spruce (Picea mariana), is a transcontinental species of the North American boreal and temperate forests. However, there are limited transcriptomic and genomic resources for this species. The primary objective of our study was to develop a black spruce transcriptomic resource to facilitate on-going functional genomics projects related to growth and adaptation to climate change. We conducted bidirectional sequencing of cDNA clones from a standard cDNA library constructed from black spruce needle tissues. We obtained 4,594 high quality (2,455 5' end and 2,139 3' end) sequence reads, with an average read-length of 532 bp. Clustering and assembly of ESTs resulted in 2,731 unique sequences, consisting of 2,234 singletons and 497 contigs. Approximately two-thirds (63%) of unique sequences were functionally annotated. Genes involved in 36 molecular functions and 90 biological processes were discovered, including 24 putative transcription factors and 232 genes involved in photosynthesis. Most abundantly expressed transcripts were associated with photosynthesis, growth factors, stress and disease response, and transcription factors. A total of 216 full-length genes were identified. About 18% (493) of the transcripts were novel, representing an important addition to the Genbank EST database (dbEST). Fifty-seven di-, tri-, tetra- and penta-nucleotide simple sequence repeats were identified. We have developed the first high quality EST resource for black spruce and identified 493 novel transcripts, which may be species-specific related to life history and ecological traits. We have also

  6. A large-scale zebrafish gene knockout resource for the genome-wide study of gene function.

    PubMed

    Varshney, Gaurav K; Lu, Jing; Gildea, Derek E; Huang, Haigen; Pei, Wuhong; Yang, Zhongan; Huang, Sunny C; Schoenfeld, David; Pho, Nam H; Casero, David; Hirase, Takashi; Mosbrook-Davis, Deborah; Zhang, Suiyuan; Jao, Li-En; Zhang, Bo; Woods, Ian G; Zimmerman, Steven; Schier, Alexander F; Wolfsberg, Tyra G; Pellegrini, Matteo; Burgess, Shawn M; Lin, Shuo

    2013-04-01

    With the completion of the zebrafish genome sequencing project, it becomes possible to analyze the function of zebrafish genes in a systematic way. The first step in such an analysis is to inactivate each protein-coding gene by targeted or random mutation. Here we describe a streamlined pipeline using proviral insertions coupled with high-throughput sequencing and mapping technologies to widely mutagenize genes in the zebrafish genome. We also report the first 6144 mutagenized and archived F1's predicted to carry up to 3776 mutations in annotated genes. Using in vitro fertilization, we have rescued and characterized ~0.5% of the predicted mutations, showing mutation efficacy and a variety of phenotypes relevant to both developmental processes and human genetic diseases. Mutagenized fish lines are being made freely available to the public through the Zebrafish International Resource Center. These fish lines establish an important milestone for zebrafish genetics research and should greatly facilitate systematic functional studies of the vertebrate genome.

  7. HelmCoP: an online resource for helminth functional genomics and drug and vaccine targets prioritization.

    PubMed

    Abubucker, Sahar; Martin, John; Taylor, Christina M; Mitreva, Makedonka

    2011-01-01

    A vast majority of the burden from neglected tropical diseases result from helminth infections (nematodes and platyhelminthes). Parasitic helminthes infect over 2 billion, exerting a high collective burden that rivals high-mortality conditions such as AIDS or malaria, and cause devastation to crops and livestock. The challenges to improve control of parasitic helminth infections are multi-fold and no single category of approaches will meet them all. New information such as helminth genomics, functional genomics and proteomics coupled with innovative bioinformatic approaches provide fundamental molecular information about these parasites, accelerating both basic research as well as development of effective diagnostics, vaccines and new drugs. To facilitate such studies we have developed an online resource, HelmCoP (Helminth Control and Prevention), built by integrating functional, structural and comparative genomic data from plant, animal and human helminthes, to enable researchers to develop strategies for drug, vaccine and pesticide prioritization, while also providing a useful comparative genomics platform. HelmCoP encompasses genomic data from several hosts, including model organisms, along with a comprehensive suite of structural and functional annotations, to assist in comparative analyses and to study host-parasite interactions. The HelmCoP interface, with a sophisticated query engine as a backbone, allows users to search for multi-factorial combinations of properties and serves readily accessible information that will assist in the identification of various genes of interest. HelmCoP is publicly available at: http://www.nematode.net/helmcop.html.

  8. WheatGenome.info: A Resource for Wheat Genomics Resource.

    PubMed

    Lai, Kaitao

    2016-01-01

    An integrated database with a variety of Web-based systems named WheatGenome.info hosting wheat genome and genomic data has been developed to support wheat research and crop improvement. The resource includes multiple Web-based applications, which are implemented as a variety of Web-based systems. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This portal provides links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/ .

  9. The Homeodomain Resource: a comprehensive collection of sequence, structure, interaction, genomic and functional information on the homeodomain protein family

    PubMed Central

    Moreland, R. Travis; Ryan, Joseph F.; Pan, Christopher; Baxevanis, Andreas D.

    2009-01-01

    The Homeodomain Resource is a curated collection of sequence, structure, interaction, genomic and functional information on the homeodomain family. The current version builds upon previous versions by the addition of new, complete sets of homeodomain sequences from fully sequenced genomes, the expansion of existing curated homeodomain information and the improvement of data accessibility through better search tools and more complete data integration. This release contains 1534 full-length homeodomain-containing sequences, 93 experimentally derived homeodomain structures, 101 homeodomain protein–protein interactions, 107 homeodomain DNA-binding sites and 206 homeodomain proteins implicated in human genetic disorders. Database URL: The Homeodomain Resource is freely available and can be accessed at http://research.nhgri.nih.gov/homeodomain/ PMID:20157477

  10. The Biofuel Feedstock Genomics Resource: a web-based portal and database to enable functional genomics of plant biofuel feedstock species.

    PubMed

    Childs, Kevin L; Konganti, Kranti; Buell, C Robin

    2012-01-01

    Major feedstock sources for future biofuel production are likely to be high biomass producing plant species such as poplar, pine, switchgrass, sorghum and maize. One active area of research in these species is genome-enabled improvement of lignocellulosic biofuel feedstock quality and yield. To facilitate genomic-based investigations in these species, we developed the Biofuel Feedstock Genomic Resource (BFGR), a database and web-portal that provides high-quality, uniform and integrated functional annotation of gene and transcript assembly sequences from species of interest to lignocellulosic biofuel feedstock researchers. The BFGR includes sequence data from 54 species and permits researchers to view, analyze and obtain annotation at the gene, transcript, protein and genome level. Annotation of biochemical pathways permits the identification of key genes and transcripts central to the improvement of lignocellulosic properties in these species. The integrated nature of the BFGR in terms of annotation methods, orthologous/paralogous relationships and linkage to seven species with complete genome sequences allows comparative analyses for biofuel feedstock species with limited sequence resources. Database URL: http://bfgr.plantbiology.msu.edu.

  11. Gene Overexpression Resources in Cereals for Functional Genomics and Discovery of Useful Genes

    PubMed Central

    Abe, Kiyomi; Ichikawa, Hiroaki

    2016-01-01

    Identification and elucidation of functions of plant genes is valuable for both basic and applied research. In addition to natural variation in model plants, numerous loss-of-function resources have been produced by mutagenesis with chemicals, irradiation, or insertions of transposable elements or T-DNA. However, we may be unable to observe loss-of-function phenotypes for genes with functionally redundant homologs and for those essential for growth and development. To offset such disadvantages, gain-of-function transgenic resources have been exploited. Activation-tagged lines have been generated using obligatory overexpression of endogenous genes by random insertion of an enhancer. Recent progress in DNA sequencing technology and bioinformatics has enabled the preparation of genomewide collections of full-length cDNAs (fl-cDNAs) in some model species. Using the fl-cDNA clones, a novel gain-of-function strategy, Fl-cDNA OvereXpressor gene (FOX)-hunting system, has been developed. A mutant phenotype in a FOX line can be directly attributed to the overexpressed fl-cDNA. Investigating a large population of FOX lines could reveal important genes conferring favorable phenotypes for crop breeding. Alternatively, a unique loss-of-function approach Chimeric REpressor gene Silencing Technology (CRES-T) has been developed. In CRES-T, overexpression of a chimeric repressor, composed of the coding sequence of a transcription factor (TF) and short peptide designated as the repression domain, could interfere with the action of endogenous TF in plants. Although plant TFs usually consist of gene families, CRES-T is effective, in principle, even for the TFs with functional redundancy. In this review, we focus on the current status of the gene-overexpression strategies and resources for identifying and elucidating novel functions of cereal genes. We discuss the potential of these research tools for identifying useful genes and phenotypes for application in crop breeding. PMID

  12. Gramene 2013: Comparative plant genomics resources

    USDA-ARS?s Scientific Manuscript database

    Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework fo...

  13. The DNA Data Bank of Japan launches a new resource, the DDBJ Omics Archive of functional genomics experiments.

    PubMed

    Kodama, Yuichi; Mashima, Jun; Kaminuma, Eli; Gojobori, Takashi; Ogasawara, Osamu; Takagi, Toshihisa; Okubo, Kousaku; Nakamura, Yasukazu

    2012-01-01

    The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchanged with EBI and NCBI within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). In 2011, DDBJ launched two new resources: the 'DDBJ Omics Archive' (DOR; http://trace.ddbj.nig.ac.jp/dor) and BioProject (http://trace.ddbj.nig.ac.jp/bioproject). DOR is an archival database of functional genomics data generated by microarray and highly parallel new generation sequencers. Data are exchanged between the ArrayExpress at EBI and DOR in the common MAGE-TAB format. BioProject provides an organizational framework to access metadata about research projects and the data from the projects that are deposited into different databases. In this article, we describe major changes and improvements introduced to the DDBJ services, and the launch of two new resources: DOR and BioProject.

  14. A large-scale zebrafish gene knockout resource for the genome-wide study of gene function

    PubMed Central

    Varshney, Gaurav K.; Lu, Jing; Gildea, Derek E.; Huang, Haigen; Pei, Wuhong; Yang, Zhongan; Huang, Sunny C.; Schoenfeld, David; Pho, Nam H.; Casero, David; Hirase, Takashi; Mosbrook-Davis, Deborah; Zhang, Suiyuan; Jao, Li-En; Zhang, Bo; Woods, Ian G.; Zimmerman, Steven; Schier, Alexander F.; Wolfsberg, Tyra G.; Pellegrini, Matteo; Burgess, Shawn M.; Lin, Shuo

    2013-01-01

    With the completion of the zebrafish genome sequencing project, it becomes possible to analyze the function of zebrafish genes in a systematic way. The first step in such an analysis is to inactivate each protein-coding gene by targeted or random mutation. Here we describe a streamlined pipeline using proviral insertions coupled with high-throughput sequencing and mapping technologies to widely mutagenize genes in the zebrafish genome. We also report the first 6144 mutagenized and archived F1's predicted to carry up to 3776 mutations in annotated genes. Using in vitro fertilization, we have rescued and characterized ∼0.5% of the predicted mutations, showing mutation efficacy and a variety of phenotypes relevant to both developmental processes and human genetic diseases. Mutagenized fish lines are being made freely available to the public through the Zebrafish International Resource Center. These fish lines establish an important milestone for zebrafish genetics research and should greatly facilitate systematic functional studies of the vertebrate genome. PMID:23382537

  15. Genomic Resources for Cancer Epidemiology

    Cancer.gov

    This page provides links to research resources, complied by the Epidemiology and Genomics Research Program, that may be of interest to genetic epidemiologists conducting cancer research, but is not exhaustive.

  16. Rat Genome and Model Resources.

    PubMed

    Shimoyama, Mary; Smith, Jennifer R; Bryda, Elizabeth; Kuramoto, Takashi; Saba, Laura; Dwinell, Melinda

    2017-07-01

    Rats remain a major model for studying disease mechanisms and discovery, validation, and testing of new compounds to improve human health. The rat's value continues to grow as indicated by the more than 1.4 million publications (second to human) at PubMed documenting important discoveries using this model. Advanced sequencing technologies, genome modification techniques, and the development of embryonic stem cell protocols ensure the rat remains an important mammalian model for disease studies. The 2004 release of the reference genome has been followed by the production of complete genomes for more than two dozen individual strains utilizing NextGen sequencing technologies; their analyses have identified over 80 million variants. This explosion in genomic data has been accompanied by the ability to selectively edit the rat genome, leading to hundreds of new strains through multiple technologies. A number of resources have been developed to provide investigators with access to precision rat models, comprehensive datasets, and sophisticated software tools necessary for their research. Those profiled here include the Rat Genome Database, PhenoGen, Gene Editing Rat Resource Center, Rat Resource and Research Center, and the National BioResource Project for the Rat in Japan. © The Author 2017. Published by Oxford University Press.

  17. Fungal genome resources at NCBI

    PubMed Central

    Robbertse, B.; Tatusova, T.

    2011-01-01

    The National Center for Biotechnology Information (NCBI) is well known for the nucleotide sequence archive, GenBank and sequence analysis tool BLAST. However, NCBI integrates many types of biomolecular data from variety of sources and makes it available to the scientific community as interactive web resources as well as organized releases of bulk data. These tools are available to explore and compare fungal genomes. Searching all databases with Fungi [organism] at http://www.ncbi.nlm.nih.gov/ is the quickest way to find resources of interest with fungal entries. Some tools though are resources specific and can be indirectly accessed from a particular database in the Entrez system. These include graphical viewers and comparative analysis tools such as TaxPlot, TaxMap and UniGene DDD (found via UniGene Homepage). Gene and BioProject pages also serve as portals to external data such as community annotation websites, BioGrid and UniProt. There are many different ways of accessing genomic data at NCBI. Depending on the focus and goal of research projects or the level of interest, a user would select a particular route for accessing genomic databases and resources. This review article describes methods of accessing fungal genome data and provides examples that illustrate the use of analysis tools. PMID:22737589

  18. Fungal genome resources at NCBI.

    PubMed

    Robbertse, B; Tatusova, T

    2011-09-01

    The National Center for Biotechnology Information (NCBI) is well known for the nucleotide sequence archive, GenBank and sequence analysis tool BLAST. However, NCBI integrates many types of biomolecular data from variety of sources and makes it available to the scientific community as interactive web resources as well as organized releases of bulk data. These tools are available to explore and compare fungal genomes. Searching all databases with Fungi [organism] at http://www.ncbi.nlm.nih.gov/ is the quickest way to find resources of interest with fungal entries. Some tools though are resources specific and can be indirectly accessed from a particular database in the Entrez system. These include graphical viewers and comparative analysis tools such as TaxPlot, TaxMap and UniGene DDD (found via UniGene Homepage). Gene and BioProject pages also serve as portals to external data such as community annotation websites, BioGrid and UniProt. There are many different ways of accessing genomic data at NCBI. Depending on the focus and goal of research projects or the level of interest, a user would select a particular route for accessing genomic databases and resources. This review article describes methods of accessing fungal genome data and provides examples that illustrate the use of analysis tools.

  19. A panel of induced pluripotent stem cells from chimpanzees: a resource for comparative functional genomics

    PubMed Central

    Gallego Romero, Irene; Pavlovic, Bryan J; Hernando-Herraez, Irene; Zhou, Xiang; Ward, Michelle C; Banovich, Nicholas E; Kagan, Courtney L; Burnett, Jonathan E; Huang, Constance H; Mitrano, Amy; Chavarria, Claudia I; Friedrich Ben-Nun, Inbar; Li, Yingchun; Sabatini, Karen; Leonardo, Trevor R; Parast, Mana; Marques-Bonet, Tomas; Laurent, Louise C; Loring, Jeanne F; Gilad, Yoav

    2015-01-01

    Comparative genomics studies in primates are restricted due to our limited access to samples. In order to gain better insight into the genetic processes that underlie variation in complex phenotypes in primates, we must have access to faithful model systems for a wide range of cell types. To facilitate this, we generated a panel of 7 fully characterized chimpanzee induced pluripotent stem cell (iPSC) lines derived from healthy donors. To demonstrate the utility of comparative iPSC panels, we collected RNA-sequencing and DNA methylation data from the chimpanzee iPSCs and the corresponding fibroblast lines, as well as from 7 human iPSCs and their source lines, which encompass multiple populations and cell types. We observe much less within-species variation in iPSCs than in somatic cells, indicating the reprogramming process erases many inter-individual differences. The low within-species regulatory variation in iPSCs allowed us to identify many novel inter-species regulatory differences of small magnitude. DOI: http://dx.doi.org/10.7554/eLife.07103.001 PMID:26102527

  20. A panel of induced pluripotent stem cells from chimpanzees: a resource for comparative functional genomics.

    PubMed

    Gallego Romero, Irene; Pavlovic, Bryan J; Hernando-Herraez, Irene; Zhou, Xiang; Ward, Michelle C; Banovich, Nicholas E; Kagan, Courtney L; Burnett, Jonathan E; Huang, Constance H; Mitrano, Amy; Chavarria, Claudia I; Friedrich Ben-Nun, Inbar; Li, Yingchun; Sabatini, Karen; Leonardo, Trevor R; Parast, Mana; Marques-Bonet, Tomas; Laurent, Louise C; Loring, Jeanne F; Gilad, Yoav

    2015-06-23

    Comparative genomics studies in primates are restricted due to our limited access to samples. In order to gain better insight into the genetic processes that underlie variation in complex phenotypes in primates, we must have access to faithful model systems for a wide range of cell types. To facilitate this, we generated a panel of 7 fully characterized chimpanzee induced pluripotent stem cell (iPSC) lines derived from healthy donors. To demonstrate the utility of comparative iPSC panels, we collected RNA-sequencing and DNA methylation data from the chimpanzee iPSCs and the corresponding fibroblast lines, as well as from 7 human iPSCs and their source lines, which encompass multiple populations and cell types. We observe much less within-species variation in iPSCs than in somatic cells, indicating the reprogramming process erases many inter-individual differences. The low within-species regulatory variation in iPSCs allowed us to identify many novel inter-species regulatory differences of small magnitude.

  1. Genomics and Bioinformatics Resources for Crop Improvement

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2010-01-01

    Recent remarkable innovations in platforms for omics-based research and application development provide crucial resources to promote research in model and applied plant species. A combinatorial approach using multiple omics platforms and integration of their outcomes is now an effective strategy for clarifying molecular systems integral to improving plant productivity. Furthermore, promotion of comparative genomics among model and applied plants allows us to grasp the biological properties of each species and to accelerate gene discovery and functional analyses of genes. Bioinformatics platforms and their associated databases are also essential for the effective design of approaches making the best use of genomic resources, including resource integration. We review recent advances in research platforms and resources in plant omics together with related databases and advances in technology. PMID:20208064

  2. Reannotation and extended community resources for the genome of the non-seed plant Physcomitrella patens provide insights into the evolution of plant gene structures and functions

    PubMed Central

    2013-01-01

    Background The moss Physcomitrella patens as a model species provides an important reference for early-diverging lineages of plants and the release of the genome in 2008 opened the doors to genome-wide studies. The usability of a reference genome greatly depends on the quality of the annotation and the availability of centralized community resources. Therefore, in the light of accumulating evidence for missing genes, fragmentary gene structures, false annotations and a low rate of functional annotations on the original release, we decided to improve the moss genome annotation. Results Here, we report the complete moss genome re-annotation (designated V1.6) incorporating the increased transcript availability from a multitude of developmental stages and tissue types. We demonstrate the utility of the improved P. patens genome annotation for comparative genomics and new extensions to the cosmoss.org resource as a central repository for this plant “flagship” genome. The structural annotation of 32,275 protein-coding genes results in 8387 additional loci including 1456 loci with known protein domains or homologs in Plantae. This is the first release to include information on transcript isoforms, suggesting alternative splicing events for at least 10.8% of the loci. Furthermore, this release now also provides information on non-protein-coding loci. Functional annotations were improved regarding quality and coverage, resulting in 58% annotated loci (previously: 41%) that comprise also 7200 additional loci with GO annotations. Access and manual curation of the functional and structural genome annotation is provided via the http://www.cosmoss.org model organism database. Conclusions Comparative analysis of gene structure evolution along the green plant lineage provides novel insights, such as a comparatively high number of loci with 5’-UTR introns in the moss. Comparative analysis of functional annotations reveals expansions of moss house-keeping and metabolic genes

  3. Functional genomics resources for the North Atlantic copepod, Calanus finmarchicus: EST database and physiological microarray

    PubMed Central

    Lenz, Petra H.; Unal, Ebru; Hassett, R. Patrick; Smith, Christine M.; Bucklin, Ann; Christie, Andrew E.; Towle, David W.

    2012-01-01

    The copepod, Calanus finmarchicus is a keystone species for the North Atlantic. Because of recent changes in the geographic distribution of this species, there are questions as to how this organism responds physiologically to environmental cues. Molecular techniques allow for examination and new understanding of these physiological changes. Here, we describe the development of a microarray for high-throughput studies of the physiological ecology of C. finmarchicus. An EST database was generated for this species using a normalized cDNA library derived from adult and sub-adult individuals. Sequence data were clustered into contigs and annotated using Blastx. Target transcripts were selected, and unique, 50 base-pair, oligomer probes were generated for 995 genes. Blast2GO processing provided detailed information on gene function. The selected targets included broad representation of biological processes, cellular components, and molecular functions. The microarray was tested in two sets of comparisons: adult females maintained at different food concentrations and field-caught sub-adults showing differences in lipid storage. Up-regulated and down-regulated transcripts were identified for both comparisons. Only a small subset of the genes up-regulated in low food individuals were also up-regulated in lipid-poor animals; no overlap was seen between the genes down-regulated in the two comparisons. PMID:22277925

  4. High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

    PubMed Central

    Seaver, Samuel M. D.; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M. T.; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D.; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D.; Henry, Christopher S.

    2014-01-01

    The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today’s annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed. PMID:24927599

  5. High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource.

    PubMed

    Seaver, Samuel M D; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M T; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D; Henry, Christopher S

    2014-07-01

    The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.

  6. Gramene 2013: comparative plant genomics resources

    PubMed Central

    Monaco, Marcela K.; Stein, Joshua; Naithani, Sushma; Wei, Sharon; Dharmawardhana, Palitha; Kumari, Sunita; Amarasinghe, Vindhya; Youens-Clark, Ken; Thomason, James; Preece, Justin; Pasternak, Shiran; Olson, Andrew; Jiao, Yinping; Lu, Zhenyuan; Bolser, Dan; Kerhornou, Arnaud; Staines, Dan; Walts, Brandon; Wu, Guanming; D’Eustachio, Peter; Haw, Robin; Croft, David; Kersey, Paul J.; Stein, Lincoln; Jaiswal, Pankaj; Ware, Doreen

    2014-01-01

    Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. Whole-genome alignments complemented by phylogenetic gene family trees help infer syntenic and orthologous relationships. Genetic variation data, sequences and genome mappings available for 10 species, including Arabidopsis, rice and maize, help infer putative variant effects on genes and transcripts. The pathways section also hosts 10 species-specific metabolic pathways databases developed in-house or by our collaborators using Pathway Tools software, which facilitates searches for pathway, reaction and metabolite annotations, and allows analyses of user-defined expression datasets. Recently, we released a Plant Reactome portal featuring 133 curated rice pathways. This portal will be expanded for Arabidopsis, maize and other plant species. We continue to provide genetic and QTL maps and marker datasets developed by crop researchers. The project provides a unique community platform to support scientific research in plant genomics including studies in evolution, genetics, plant breeding, molecular biology, biochemistry and systems biology. PMID:24217918

  7. Saccharomyces Genome Database: the genomics resource of budding yeast

    PubMed Central

    Cherry, J. Michael; Hong, Eurie L.; Amundsen, Craig; Balakrishnan, Rama; Binkley, Gail; Chan, Esther T.; Christie, Karen R.; Costanzo, Maria C.; Dwight, Selina S.; Engel, Stacia R.; Fisk, Dianna G.; Hirschman, Jodi E.; Hitz, Benjamin C.; Karra, Kalpana; Krieger, Cynthia J.; Miyasato, Stuart R.; Nash, Rob S.; Park, Julie; Skrzypek, Marek S.; Simison, Matt; Weng, Shuai; Wong, Edith D.

    2012-01-01

    The Saccharomyces Genome Database (SGD, http://www.yeastgenome.org) is the community resource for the budding yeast Saccharomyces cerevisiae. The SGD project provides the highest-quality manually curated information from peer-reviewed literature. The experimental results reported in the literature are extracted and integrated within a well-developed database. These data are combined with quality high-throughput results and provided through Locus Summary pages, a powerful query engine and rich genome browser. The acquisition, integration and retrieval of these data allow SGD to facilitate experimental design and analysis by providing an encyclopedia of the yeast genome, its chromosomal features, their functions and interactions. Public access to these data is provided to researchers and educators via web pages designed for optimal ease of use. PMID:22110037

  8. Gramene database: navigating plant comparative genomics resources

    USDA-ARS?s Scientific Manuscript database

    Gramene (http://www.gramene.org) is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationship...

  9. Plant functional genomics

    NASA Astrophysics Data System (ADS)

    Holtorf, Hauke; Guitton, Marie-Christine; Reski, Ralf

    2002-04-01

    Functional genome analysis of plants has entered the high-throughput stage. The complete genome information from key species such as Arabidopsis thaliana and rice is now available and will further boost the application of a range of new technologies to functional plant gene analysis. To broadly assign functions to unknown genes, different fast and multiparallel approaches are currently used and developed. These new technologies are based on known methods but are adapted and improved to accommodate for comprehensive, large-scale gene analysis, i.e. such techniques are novel in the sense that their design allows researchers to analyse many genes at the same time and at an unprecedented pace. Such methods allow analysis of the different constituents of the cell that help to deduce gene function, namely the transcripts, proteins and metabolites. Similarly the phenotypic variations of entire mutant collections can now be analysed in a much faster and more efficient way than before. The different methodologies have developed to form their own fields within the functional genomics technological platform and are termed transcriptomics, proteomics, metabolomics and phenomics. Gene function, however, cannot solely be inferred by using only one such approach. Rather, it is only by bringing together all the information collected by different functional genomic tools that one will be able to unequivocally assign functions to unknown plant genes. This review focuses on current technical developments and their impact on the field of plant functional genomics. The lower plant Physcomitrella is introduced as a new model system for gene function analysis, owing to its high rate of homologous recombination.

  10. Drosophila and genome-wide association studies: a review and resource for the functional dissection of human complex traits

    PubMed Central

    Wangler, Michael F.; Hu, Yanhui

    2017-01-01

    ABSTRACT Human genome-wide association studies (GWAS) have successfully identified thousands of susceptibility loci for common diseases with complex genetic etiologies. Although the susceptibility variants identified by GWAS usually have only modest effects on individual disease risk, they contribute to a substantial burden of trait variation in the overall population. GWAS also offer valuable clues to disease mechanisms that have long proven to be elusive. These insights could lead the way to breakthrough treatments; however, several challenges hinder progress, making innovative approaches to accelerate the follow-up of results from GWAS an urgent priority. Here, we discuss the largely untapped potential of the fruit fly, Drosophila melanogaster, for functional investigation of findings from human GWAS. We highlight selected examples where strong genomic conservation with humans along with the rapid and powerful genetic tools available for flies have already facilitated fine mapping of association signals, elucidated gene mechanisms, and revealed novel disease-relevant biology. We emphasize current research opportunities in this rapidly advancing field, and present bioinformatic analyses that systematically explore the applicability of Drosophila for interrogation of susceptibility signals implicated in more than 1000 human traits, based on all GWAS completed to date. Thus, our discussion is targeted at both human geneticists seeking innovative strategies for experimental validation of findings from GWAS, as well as the Drosophila research community, by whom ongoing investigations of the implicated genes will powerfully inform our understanding of human disease. PMID:28151408

  11. Gramene 2016: comparative plant genomics and pathway resources

    USDA-ARS?s Scientific Manuscript database

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the data...

  12. Genomic education resources for nursing faculty.

    PubMed

    Tonkin, Emma; Calzone, Kathleen; Jenkins, Jean; Lea, Dale; Prows, Cynthia

    2011-12-01

    The increasing recognition regarding the relevance of genomics across the scope of nursing healthcare practice has resulted in the drive to integrate appropriate genomic knowledge and skills into nurse education and training. In this final article of the series Genetics-Genomics and Nursing Education, we will look at genetic and genomic education resources and the factors that influence both their creation and use. In considering nurse education from faculty and student perspectives, four identified areas of need have been used as the organizing constructs: guidance (what should be taught and at what level of complexity); support and training; access to genetics professionals and service users; and quality resources. This paper sets out to address the following points: (a) why there is a need for quality genomics education resources to support nurse education; (b) what is required from a resource to make it "useful" for the user; and (c) how the quality and impact of a resource can be measured. While not exhaustive, information is provided to a number of globally accessible resources, along with detailed descriptions of selected teaching or learning tools. Strategies for evaluating the suitability of a resource and suggestions on how genomic resources can be used within nurse education are provided. The use of clinically relevant resources that link theory to professional practice and which meet predefined learning outcomes and practice indicators for nurse education and training will facilitate the integration of genomics into curricula by nurse faculty. Providing clinically meaningful education and training in genomics is central to enabling every nurse to develop the appropriate knowledge and skills in genomics in order to provide optimum care to individuals and families now, and to facilitate the integration of new information and technology as it becomes available across mainstream healthcare services. No claim to original US government works.

  13. Genome-Wide Association Study of Cardiac Structure and Systolic Function in African Americans: The Candidate Gene Association Resource (CARe) Study

    PubMed Central

    Fox, Ervin R.; Musani, Solomon K.; Barbalic, Maja; Lin, Honghuang; Yu, Bing; Ogunyankin, Kofo O.; Smith, Nicholas L.; Kutlar, Abdullah; Glazer, Nicole L.; Post, Wendy S.; Paltoo, Dina N.; Dries, Daniel L.; Farlow, Deborah N.; Duarte, Christine W.; Kardia, Sharon L.; Meyers, Kristin J.; Sun, Yan V.; Arnett, Donna K.; Patki, Amit A.; Sha, Jin; Cui, Xiangqui; Samdarshi, Tandaw E.; Penman, Alan D.; Bibbins-Domingo, Kirsten; Bůžková, Petra; Benjamin, Emelia J.; Bluemke, David A.; Morrison, Alanna C.; Heiss, Gerardo; Carr, J. Jeffrey; Tracy, Russell P.; Mosley, Thomas H.; Taylor, Herman A.; Psaty, Bruce M.; Heckbert, Susan R.; Cappola, Thomas P.; Vasan, Ramachandran S.

    2013-01-01

    Background Using data from four community-based cohorts of African Americans (AA), we tested the association between genome-wide markers (SNPs) and cardiac phenotypes in the Candidate-gene Association REsource (CARe) study. Methods and Results Among 6,765 AA, we related age, sex, height and weight-adjusted residuals for nine cardiac phenotypes (assessed by echocardiogram or MRI) to 2.5 million SNPs genotyped using Genome-Wide Affymetrix Human SNP Array 6.0 (Affy6.0) and the remainder imputed. Within cohort genome-wide association analysis was conducted followed by meta-analysis across cohorts using inverse variance weights (genome-wide significance threshold=4.0 ×10−07). Supplementary pathway analysis was performed. We attempted replication in 3 smaller cohorts of African ancestry and tested look-ups in one consortium of European ancestry (EchoGEN). Across the 9 phenotypes, variants in 4 genetic loci reached genome-wide significance: rs4552931 in UBE2V2 (p=1.43 × 10−07) for left ventricular mass (LVM); rs7213314 in WIPI1 (p=1.68 × 10−07) for LV internal diastolic diameter (LVIDD); rs1571099 in PPAPDC1A (p= 2.57 × 10−08) for interventricular septal wall thickness (IVST); and rs9530176 in KLF5 (p=4.02 × 10−07) for ejection fraction (EF). Associated variants were enriched in three signaling pathways involved in cardiac remodeling. None of the 4 loci replicated in cohorts of African ancestry were confirmed in look-ups in EchoGEN. Conclusions In the largest GWAS of cardiac structure and function to date in AA, we identified 4 genetic loci related to LVM, IVST, LVIDD and EF that reached genome-wide significance. Replication results suggest that these loci may represent unique to individuals of African ancestry. Additional large-scale studies are warranted for these complex phenotypes. PMID:23275298

  14. Enabling functional genomics with genome engineering.

    PubMed

    Hilton, Isaac B; Gersbach, Charles A

    2015-10-01

    Advances in genome engineering technologies have made the precise control over genome sequence and regulation possible across a variety of disciplines. These tools can expand our understanding of fundamental biological processes and create new opportunities for therapeutic designs. The rapid evolution of these methods has also catalyzed a new era of genomics that includes multiple approaches to functionally characterize and manipulate the regulation of genomic information. Here, we review the recent advances of the most widely adopted genome engineering platforms and their application to functional genomics. This includes engineered zinc finger proteins, TALEs/TALENs, and the CRISPR/Cas9 system as nucleases for genome editing, transcription factors for epigenome editing, and other emerging applications. We also present current and potential future applications of these tools, as well as their current limitations and areas for future advances. © 2015 Hilton and Gersbach; Published by Cold Spring Harbor Laboratory Press.

  15. Enabling functional genomics with genome engineering

    PubMed Central

    Hilton, Isaac B.; Gersbach, Charles A.

    2015-01-01

    Advances in genome engineering technologies have made the precise control over genome sequence and regulation possible across a variety of disciplines. These tools can expand our understanding of fundamental biological processes and create new opportunities for therapeutic designs. The rapid evolution of these methods has also catalyzed a new era of genomics that includes multiple approaches to functionally characterize and manipulate the regulation of genomic information. Here, we review the recent advances of the most widely adopted genome engineering platforms and their application to functional genomics. This includes engineered zinc finger proteins, TALEs/TALENs, and the CRISPR/Cas9 system as nucleases for genome editing, transcription factors for epigenome editing, and other emerging applications. We also present current and potential future applications of these tools, as well as their current limitations and areas for future advances. PMID:26430154

  16. Plant database resources at The Institute for Genomic Research.

    PubMed

    Chan, Agnes P; Rabinowicz, Pablo D; Quackenbush, John; Buell, C Robin; Town, Chris D

    2007-01-01

    With the completion of the genome sequences of the model plants Arabidopsis and rice, and the continuing sequencing efforts of other economically important crop plants, an unprecedented amount of genome sequence data is now available for large-scale genomics studies and analyses, such as the identification and discovery of novel genes, comparative genomics, and functional genomics. Efficient utilization of these large data sets is critically dependent on the ease of access and organization of the data. The plant databases at The Institute for Genomic Research (TIGR) have been set up to maintain various data types including genomic sequence, annotation and analyses, expressed transcript assemblies and analyses, and gene expression profiles from microarray studies. We present here an overview of the TIGR database resources for plant genomics and describe methods to access the data.

  17. Resources | Office of Cancer Genomics

    Cancer.gov

    OCG provides a variety of scientific and educational resources for both cancer researchers and members of the general public. These resources are divided into the following types: OCG-Supported Resources: Tools, databases, and reagents generated by initiated and completed OCG programs for researchers, educators, and students. (Note: Databases for current OCG programs are available through program-specific data matrices)

  18. Avianbase: a community resource for bird genomics.

    PubMed

    Eöry, Lél; Gilbert, M Thomas P; Li, Cai; Li, Bo; Archibald, Alan; Aken, Bronwen L; Zhang, Guojie; Jarvis, Erich; Flicek, Paul; Burt, David W

    2015-01-29

    Giving access to sequence and annotation data for genome assemblies is important because, while facilitating research, it places both assembly and annotation quality under scrutiny, resulting in improvements to both. Therefore we announce Avianbase, a resource for bird genomics, which provides access to data released by the Avian Phylogenomics Consortium.

  19. Gramene: a growing plant comparative genomics resource

    USDA-ARS?s Scientific Manuscript database

    Gramene (www.gramene.org) is a curated genetic, genomic and comparative genome analysis resource for the major crop species, such as rice, maize, wheat and many other plant (mainly grass) species. Gramene is an open-source project, with all data and software freely downloadable through the ftp site ...

  20. High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

    USDA-ARS?s Scientific Manuscript database

    The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic mode...

  1. Navigating yeast genome maintenance with functional genomics.

    PubMed

    Measday, Vivien; Stirling, Peter C

    2016-03-01

    Maintenance of genome integrity is a fundamental requirement of all organisms. To address this, organisms have evolved extremely faithful modes of replication, DNA repair and chromosome segregation to combat the deleterious effects of an unstable genome. Nonetheless, a small amount of genome instability is the driver of evolutionary change and adaptation, and thus a low level of instability is permitted in populations. While defects in genome maintenance almost invariably reduce fitness in the short term, they can create an environment where beneficial mutations are more likely to occur. The importance of this fact is clearest in the development of human cancer, where genome instability is a well-established enabling characteristic of carcinogenesis. This raises the crucial question: what are the cellular pathways that promote genome maintenance and what are their mechanisms? Work in model organisms, in particular the yeast Saccharomyces cerevisiae, has provided the global foundations of genome maintenance mechanisms in eukaryotes. The development of pioneering genomic tools inS. cerevisiae, such as the systematic creation of mutants in all nonessential and essential genes, has enabled whole-genome approaches to identifying genes with roles in genome maintenance. Here, we review the extensive whole-genome approaches taken in yeast, with an emphasis on functional genomic screens, to understand the genetic basis of genome instability, highlighting a range of genetic and cytological screening modalities. By revealing the biological pathways and processes regulating genome integrity, these analyses contribute to the systems-level map of the yeast cell and inform studies of human disease, especially cancer. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  2. Coral life history and symbiosis: Functional genomic resources for two reef building Caribbean corals, Acropora palmata and Montastraea faveolata

    PubMed Central

    Schwarz, Jodi A; Brokstein, Peter B; Voolstra, Christian; Terry, Astrid Y; Miller, David J; Szmant, Alina M; Coffroth, Mary Alice; Medina, Mónica

    2008-01-01

    Background Scleractinian corals are the foundation of reef ecosystems in tropical marine environments. Their great success is due to interactions with endosymbiotic dinoflagellates (Symbiodinium spp.), with which they are obligately symbiotic. To develop a foundation for studying coral biology and coral symbiosis, we have constructed a set of cDNA libraries and generated and annotated ESTs from two species of corals, Acropora palmata and Montastraea faveolata. Results We generated 14,588 (Ap) and 3,854 (Mf) high quality ESTs from five life history/symbiosis stages (spawned eggs, early-stage planula larvae, late-stage planula larvae either infected with symbionts or uninfected, and adult coral). The ESTs assembled into a set of primarily stage-specific clusters, producing 4,980 (Ap), and 1,732 (Mf) unigenes. The egg stage library, relative to the other developmental stages, was enriched in genes functioning in cell division and proliferation, transcription, signal transduction, and regulation of protein function. Fifteen unigenes were identified as candidate symbiosis-related genes as they were expressed in all libraries constructed from the symbiotic stages and were absent from all of the non symbiotic stages. These include several DNA interacting proteins, and one highly expressed unigene (containing 17 cDNAs) with no significant protein-coding region. A significant number of unigenes (25) encode potential pattern recognition receptors (lectins, scavenger receptors, and others), as well as genes that may function in signaling pathways involved in innate immune responses (toll-like signaling, NFkB p105, and MAP kinases). Comparison between the A. palmata and an A. millepora EST dataset identified ferritin as a highly expressed gene in both datasets that appears to be undergoing adaptive evolution. Five unigenes appear to be restricted to the Scleractinia, as they had no homology to any sequences in the nr databases nor to the non-scleractinian cnidarians

  3. Tomato functional genomics database (TFGD): a comprehensive collection and analysis package for tomato functional genomics

    USDA-ARS?s Scientific Manuscript database

    Tomato Functional Genomics Database (TFGD; http://ted.bti.cornell.edu) provides a comprehensive systems biology resource to store, mine, analyze, visualize and integrate large-scale tomato functional genomics datasets. The database is expanded from the previously described Tomato Expression Database...

  4. MARRVEL: Integration of Human and Model Organism Genetic Resources to Facilitate Functional Annotation of the Human Genome.

    PubMed

    Wang, Julia; Al-Ouran, Rami; Hu, Yanhui; Kim, Seon-Young; Wan, Ying-Wooi; Wangler, Michael F; Yamamoto, Shinya; Chao, Hsiao-Tuan; Comjean, Aram; Mohr, Stephanie E; Perrimon, Norbert; Liu, Zhandong; Bellen, Hugo J

    2017-06-01

    One major challenge encountered with interpreting human genetic variants is the limited understanding of the functional impact of genetic alterations on biological processes. Furthermore, there remains an unmet demand for an efficient survey of the wealth of information on human homologs in model organisms across numerous databases. To efficiently assess the large volume of publically available information, it is important to provide a concise summary of the most relevant information in a rapid user-friendly format. To this end, we created MARRVEL (model organism aggregated resources for rare variant exploration). MARRVEL is a publicly available website that integrates information from six human genetic databases and seven model organism databases. For any given variant or gene, MARRVEL displays information from OMIM, ExAC, ClinVar, Geno2MP, DGV, and DECIPHER. Importantly, it curates model organism-specific databases to concurrently display a concise summary regarding the human gene homologs in budding and fission yeast, worm, fly, fish, mouse, and rat on a single webpage. Experiment-based information on tissue expression, protein subcellular localization, biological process, and molecular function for the human gene and homologs in the seven model organisms are arranged into a concise output. Hence, rather than visiting multiple separate databases for variant and gene analysis, users can obtain important information by searching once through MARRVEL. Altogether, MARRVEL dramatically improves efficiency and accessibility to data collection and facilitates analysis of human genes and variants by cross-disciplinary integration of 18 million records available in public databases to facilitate clinical diagnosis and basic research. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  5. Nongenetic functions of the genome.

    PubMed

    Bustin, Michael; Misteli, Tom

    2016-05-06

    The primary function of the genome is to store, propagate, and express the genetic information that gives rise to a cell's architectural and functional machinery. However, the genome is also a major structural component of the cell. Besides its genetic roles, the genome affects cellular functions by nongenetic means through its physical and structural properties, particularly by exerting mechanical forces and by serving as a scaffold for binding of cellular components. Major cellular processes affected by nongenetic functions of the genome include establishment of nuclear structure, signal transduction, mechanoresponses, cell migration, and vision in nocturnal animals. We discuss the concept, mechanisms, and implications of nongenetic functions of the genome.

  6. The coffee genome hub: a resource for coffee genomes

    PubMed Central

    Dereeper, Alexis; Bocs, Stéphanie; Rouard, Mathieu; Guignon, Valentin; Ravel, Sébastien; Tranchant-Dubreuil, Christine; Poncet, Valérie; Garsmeur, Olivier; Lashermes, Philippe; Droc, Gaëtan

    2015-01-01

    The whole genome sequence of Coffea canephora, the perennial diploid species known as Robusta, has been recently released. In the context of the C. canephora genome sequencing project and to support post-genomics efforts, we developed the Coffee Genome Hub (http://coffee-genome.org/), an integrative genome information system that allows centralized access to genomics and genetics data and analysis tools to facilitate translational and applied research in coffee. We provide the complete genome sequence of C. canephora along with gene structure, gene product information, metabolism, gene families, transcriptomics, syntenic blocks, genetic markers and genetic maps. The hub relies on generic software (e.g. GMOD tools) for easy querying, visualizing and downloading research data. It includes a Genome Browser enhanced by a Community Annotation System, enabling the improvement of automatic gene annotation through an annotation editor. In addition, the hub aims at developing interoperability among other existing South Green tools managing coffee data (phylogenomics resources, SNPs) and/or supporting data analyses with the Galaxy workflow manager. PMID:25392413

  7. The coffee genome hub: a resource for coffee genomes.

    PubMed

    Dereeper, Alexis; Bocs, Stéphanie; Rouard, Mathieu; Guignon, Valentin; Ravel, Sébastien; Tranchant-Dubreuil, Christine; Poncet, Valérie; Garsmeur, Olivier; Lashermes, Philippe; Droc, Gaëtan

    2015-01-01

    The whole genome sequence of Coffea canephora, the perennial diploid species known as Robusta, has been recently released. In the context of the C. canephora genome sequencing project and to support post-genomics efforts, we developed the Coffee Genome Hub (http://coffee-genome.org/), an integrative genome information system that allows centralized access to genomics and genetics data and analysis tools to facilitate translational and applied research in coffee. We provide the complete genome sequence of C. canephora along with gene structure, gene product information, metabolism, gene families, transcriptomics, syntenic blocks, genetic markers and genetic maps. The hub relies on generic software (e.g. GMOD tools) for easy querying, visualizing and downloading research data. It includes a Genome Browser enhanced by a Community Annotation System, enabling the improvement of automatic gene annotation through an annotation editor. In addition, the hub aims at developing interoperability among other existing South Green tools managing coffee data (phylogenomics resources, SNPs) and/or supporting data analyses with the Galaxy workflow manager. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.

    PubMed

    Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X

    2016-01-01

    PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels.The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.

  9. Phenotypic and genomic analysis of a fast neutron mutant population resource in soybean

    USDA-ARS?s Scientific Manuscript database

    Mutagenized populations have become indispensable resources for introducing variation and studying gene function in plant genomics research. We utilized fast neutron radiation to induce deletion mutations in the soybean genome and phenotypically screened the resulting population. We exposed approxim...

  10. Developing genomic resources for the apiaceae

    USDA-ARS?s Scientific Manuscript database

    The Apiaceae family includes carrot, celery, cilantro, dill, fennel and numerous other spice and medicinal crops. Carrot is the most economically important member of the Apiaceae with an annual value of $600 M in the United States alone. There are few genomic resources for carrot or other Apiaceae, ...

  11. Genome-scale resources for Thermoanaerobacterium saccharolyticum

    DOE PAGES

    Currie, Devin H.; Raman, Babu; Gowen, Christopher M.; ...

    2015-06-26

    Thermoanaerobacterium saccharolyticum is a hemicellulose-degrading thermophilic anaerobe that was previously engineered to produce ethanol at high yield. For this research, a major project was undertaken to develop this organism into an industrial biocatalyst, but the lack of genome information and resources were recognized early on as a key limitation.

  12. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  13. Genomic resources for the study of neuropsychiatric disorders.

    PubMed

    Senthil, G; Dutka, T; Bingaman, L; Lehner, T

    2017-03-21

    The National Institute of Mental Health (NIMH) has made sustained investments in the development of genomic resources over the last two decades. These investments have led to the development of the largest biorepository for psychiatric genetics as a centralized national resource. In the realm of genomic resources, NIMH has been supporting large team science (TS) consortia focused on gene discovery, fine mapping of loci, and functional genomics using state-of-the-art technologies. The scientific output from these efforts has not only begun to transform our understanding of the genetic architecture of neuropsychiatric disorders, but it has also led to a broader cultural change among the investigator community towards deeper collaborations and broad pre-publication sharing of data and resources. The NIMH supported efforts have led to a vast increase in the amount of genetic and genomic resources available to the mental health research community. Here we provide an account of the existing resources and estimates of the scale and scope of what will be available in the near future. All biosamples and data described are intended for broad sharing with researchers worldwide, as allowed by the subject consent and applicable laws.Molecular Psychiatry advance online publication, 21 March 2017; doi:10.1038/mp.2017.29.

  14. Gramene 2016: comparative plant genomics and pathway resources.

    PubMed

    Tello-Ruiz, Marcela K; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A; Huerta, Laura; Keays, Maria; Tang, Y Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J; Jaiswal, Pankaj; Ware, Doreen

    2016-01-04

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼ 200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  15. Gramene 2016: comparative plant genomics and pathway resources

    PubMed Central

    Tello-Ruiz, Marcela K.; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M.; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A.; Huerta, Laura; Keays, Maria; Tang, Y. Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J.; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  16. Floral gene resources from basal angiosperms for comparative genomics research.

    PubMed

    Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; DePamphilis, Claude W; Leebens-Mack, James H

    2005-03-30

    The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and

  17. Reconciling resource utilization and resource selection functions

    USGS Publications Warehouse

    Hooten, Mevin B.; Hanks, Ephraim M.; Johnson, Devin S.; Alldredge, Mat W.

    2013-01-01

    Summary: 1. Analyses based on utilization distributions (UDs) have been ubiquitous in animal space use studies, largely because they are computationally straightforward and relatively easy to employ. Conventional applications of resource utilization functions (RUFs) suggest that estimates of UDs can be used as response variables in a regression involving spatial covariates of interest. 2. It has been claimed that contemporary implementations of RUFs can yield inference about resource selection, although to our knowledge, an explicit connection has not been described. 3. We explore the relationships between RUFs and resource selection functions from a hueristic and simulation perspective. We investigate several sources of potential bias in the estimation of resource selection coefficients using RUFs (e.g. the spatial covariance modelling that is often used in RUF analyses). 4. Our findings illustrate that RUFs can, in fact, serve as approximations to RSFs and are capable of providing inference about resource selection, but only with some modification and under specific circumstances. 5. Using real telemetry data as an example, we provide guidance on which methods for estimating resource selection may be more appropriate and in which situations. In general, if telemetry data are assumed to arise as a point process, then RSF methods may be preferable to RUFs; however, modified RUFs may provide less biased parameter estimates when the data are subject to location error.

  18. Functional genomics of intracellular bacteria.

    PubMed

    de Barsy, Marie; Greub, Gilbert

    2013-07-01

    During the genomic era, a large amount of whole-genome sequences accumulated, which identified many hypothetical proteins of unknown function. Rapidly, functional genomics, which is the research domain that assign a function to a given gene product, has thus been developed. Functional genomics of intracellular pathogenic bacteria exhibit specific peculiarities due to the fastidious growth of most of these intracellular micro-organisms, due to the close interaction with the host cell, due to the risk of contamination of experiments with host cell proteins and, for some strict intracellular bacteria such as Chlamydia, due to the absence of simple genetic system to manipulate the bacterial genome. To identify virulence factors of intracellular pathogenic bacteria, functional genomics often rely on bioinformatic analyses compared with model organisms such as Escherichia coli and Bacillus subtilis. The use of heterologous expression is another common approach. Given the intracellular lifestyle and the many effectors that are used by the intracellular bacteria to corrupt host cell functions, functional genomics is also often targeting the identification of new effectors such as those of the T4SS of Brucella and Legionella.

  19. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

    PubMed

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-02-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.

  20. Functional genomics in reproductive medicine.

    PubMed

    Barratt, Christopher L R; Hughes, David C; Afnan, Masoud; Brewis, Ian A

    2002-02-01

    The British Fertility Society organised a workshop on Functional Genomics in Reproductive Medicine at the University of Birmingham on 13-14 September 2001. The primary aim was to inform delegates about the power of the technology that has been made available after completion of the sequencing of the human genome, and to stimulate debate about using functional genomics to address both clinical and scientific questions in reproductive medicine. Three specific areas were addressed: proteomics, gene expression and bioinformatics. Although the sophistication and plethora of techniques available were obvious, major limitations in the technology were also discussed. The future promises to be very challenging indeed.

  1. Functional Profiling of Human Fungal Pathogen Genomes

    PubMed Central

    Goranov, Alexi I.; Madhani, Hiten D.

    2015-01-01

    Fungal infections are challenging to diagnose and often difficult to treat, with only a handful of drug classes existing. Understanding the molecular mechanisms by which pathogenic fungi cause human disease is imperative. Here, we discuss how the development and use of genome-scale genetic resources, such as whole-genome knockout collections, can address this unmet need. Using work in Saccharomcyes cerevisiae as a guide, studies of Cryptococcus neoformans and Candida albicans have shown how the challenges of large-scale gene deletion can be overcome, and how such collections can be effectively used to obtain insights into mechanisms of pathogenesis. We conclude that, with concerted efforts, full genome-wide functional analysis of human fungal pathogen genomes is within reach. PMID:25377143

  2. MycoCosm, an Integrated Fungal Genomics Resource

    SciTech Connect

    Shabalov, Igor; Grigoriev, Igor

    2012-03-16

    MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.

  3. The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data

    PubMed Central

    Clarke, Laura; Fairley, Susan; Zheng-Bradley, Xiangqun; Streeter, Ian; Perry, Emily; Lowy, Ernesto; Tassé, Anne-Marie; Flicek, Paul

    2017-01-01

    The International Genome Sample Resource (IGSR; http://www.internationalgenome.org) expands in data type and population diversity the resources from the 1000 Genomes Project. IGSR represents the largest open collection of human variation data and provides easy access to these resources. IGSR was established in 2015 to maintain and extend the 1000 Genomes Project data, which has been widely used as a reference set of human variation and by researchers developing analysis methods. IGSR has mapped all of the 1000 Genomes sequence to the newest human reference (GRCh38), and will release updated variant calls to ensure maximal usefulness of the existing data. IGSR is collecting new structural variation data on the 1000 Genomes samples from long read sequencing and other technologies, and will collect relevant functional data into a single comprehensive resource. IGSR is extending coverage with new populations sequenced by collaborating groups. Here, we present the new data and analysis that IGSR has made available. We have also introduced a new data portal that increases discoverability of our data—previously only browseable through our FTP site—by focusing on particular samples, populations or data sets of interest. PMID:27638885

  4. The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data.

    PubMed

    Clarke, Laura; Fairley, Susan; Zheng-Bradley, Xiangqun; Streeter, Ian; Perry, Emily; Lowy, Ernesto; Tassé, Anne-Marie; Flicek, Paul

    2017-01-04

    The International Genome Sample Resource (IGSR; http://www.internationalgenome.org) expands in data type and population diversity the resources from the 1000 Genomes Project. IGSR represents the largest open collection of human variation data and provides easy access to these resources. IGSR was established in 2015 to maintain and extend the 1000 Genomes Project data, which has been widely used as a reference set of human variation and by researchers developing analysis methods. IGSR has mapped all of the 1000 Genomes sequence to the newest human reference (GRCh38), and will release updated variant calls to ensure maximal usefulness of the existing data. IGSR is collecting new structural variation data on the 1000 Genomes samples from long read sequencing and other technologies, and will collect relevant functional data into a single comprehensive resource. IGSR is extending coverage with new populations sequenced by collaborating groups. Here, we present the new data and analysis that IGSR has made available. We have also introduced a new data portal that increases discoverability of our data-previously only browseable through our FTP site-by focusing on particular samples, populations or data sets of interest. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Genomic resources for flatfish research and their applications.

    PubMed

    Cerdà, J; Douglas, S; Reith, M

    2010-10-01

    Flatfishes are a group of teleosts of high commercial and environmental interest, whose biology is still poorly understood. The recent rapid development of different 'omic' technologies is, however, enhancing the knowledge of the complex genetic control underlying different physiological processes of flatfishes. This review describes the different functional genomic approaches and resources currently available for flatfish research and summarizes different areas where microarray-based gene expression analysis has been applied. The increase in genome sequencing data has also allowed the construction of genetic linkage maps in different flatfish species; these maps are invaluable for investigating genome organization and identifying genetic traits of commercial interest. Despite the significant progress in this field, the genomic resources currently available for flatfish are still scarce. Further intensive research should be carried out to develop larger genomic sequence databases, high-density microarrays and, more detailed, complete linkage maps, using second-generation sequencing platforms. These tools will be crucial for further expanding the knowledge of flatfish physiology, and it is predicted that they will have important implications for wild fish population management, improved fish welfare and increased productivity in aquaculture.

  6. Kiwifruit Information Resource (KIR): a comparative platform for kiwifruit genomics

    PubMed Central

    Yue, Junyang; Liu, Jian; Ban, Rongjun; Tang, Wei; Deng, Lin; Fei, Zhangjun; Liu, Yongsheng

    2015-01-01

    The Kiwifruit Information Resource (KIR) is dedicated to maintain and integrate comprehensive datasets on genomics, functional genomics and transcriptomics of kiwifruit (Actinidiaceae). KIR serves as a central access point for existing/new genomic and genetic data. KIR also provides researchers with a variety of visualization and analysis tools. Current developments include the updated genome structure of Actinidia chinensis cv. Hongyang and its newest genome annotation, putative transcripts, gene expression, physical markers of genetic traits as well as relevant publications based on the latest genome assembly. Nine thousand five hundred and forty-seven new transcripts are detected and 21 132 old transcripts are changed. At the present release, the next-generation transcriptome sequencing data has been incorporated into gene models and splice variants. Protein–protein interactions are also identified based on experimentally determined orthologous interactions. Furthermore, the experimental results reported in peer-reviewed literature are manually extracted and integrated within a well-developed query page. In total, 122 identifications are currently associated, including commonly used gene names and symbols. All KIR datasets are helpful to facilitate a broad range of kiwifruit research topics and freely available to the research community. Database URL: http://bdg.hfut.edu.cn/kir/index.html. PMID:26656885

  7. Kiwifruit Information Resource (KIR): a comparative platform for kiwifruit genomics.

    PubMed

    Yue, Junyang; Liu, Jian; Ban, Rongjun; Tang, Wei; Deng, Lin; Fei, Zhangjun; Liu, Yongsheng

    2015-01-01

    The Kiwifruit Information Resource (KIR) is dedicated to maintain and integrate comprehensive datasets on genomics, functional genomics and transcriptomics of kiwifruit (Actinidiaceae). KIR serves as a central access point for existing/new genomic and genetic data. KIR also provides researchers with a variety of visualization and analysis tools. Current developments include the updated genome structure of Actinidia chinensis cv. Hongyang and its newest genome annotation, putative transcripts, gene expression, physical markers of genetic traits as well as relevant publications based on the latest genome assembly. Nine thousand five hundred and forty-seven new transcripts are detected and 21 132 old transcripts are changed. At the present release, the next-generation transcriptome sequencing data has been incorporated into gene models and splice variants. Protein-protein interactions are also identified based on experimentally determined orthologous interactions. Furthermore, the experimental results reported in peer-reviewed literature are manually extracted and integrated within a well-developed query page. In total, 122 identifications are currently associated, including commonly used gene names and symbols. All KIR datasets are helpful to facilitate a broad range of kiwifruit research topics and freely available to the research community. Database URL: http://bdg.hfut.edu.cn/kir/index.html.

  8. Floral gene resources from basal angiosperms for comparative genomics research

    PubMed Central

    Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

    2005-01-01

    Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and

  9. Success stories in genomic medicine from resource-limited countries.

    PubMed

    Mitropoulos, Konstantinos; Al Jaibeji, Hayat; Forero, Diego A; Laissue, Paul; Wonkam, Ambroise; Lopez-Correa, Catalina; Mohamed, Zahurin; Chantratita, Wasun; Lee, Ming Ta Michael; Llerena, Adrian; Brand, Angela; Ali, Bassam R; Patrinos, George P

    2015-06-18

    In recent years, the translation of genomic discoveries into mainstream medical practice and public health has gained momentum, facilitated by the advent of new technologies. However, there are often major discrepancies in the pace of implementation of genomic medicine between developed and developing/resource-limited countries. The main reason does not only lie in the limitation of resources but also in the slow pace of adoption of the new findings and the poor understanding of the potential that this new discipline offers to rationalize medical diagnosis and treatment. Here, we present and critically discuss examples from the successful implementation of genomic medicine in resource-limited countries, focusing on pharmacogenomics, genome informatics, and public health genomics, emphasizing in the latter case genomic education, stakeholder analysis, and economics in pharmacogenomics. These examples can be considered as model cases and be readily replicated for the wide implementation of pharmacogenomics and genomic medicine in other resource-limited environments.

  10. MorusDB: a resource for mulberry genomics and genome biology

    PubMed Central

    Li, Tian; Qi, Xiwu; Zeng, Qiwei; Xiang, Zhonghuai; He, Ningjia

    2014-01-01

    Mulberry is an important cultivated plant that has received the attention of biologists interested in sericulture and plant–insect interaction. Morus notabilis, a wild mulberry species with a minimal chromosome number is an ideal material for whole-genome sequencing and assembly. The genome and transcriptome of M. notabilis were sequenced and analyzed. In this article, a web-based and open-access database, the Morus Genome Database (MorusDB), was developed to enable easy-to-access and data mining. The MorusDB provides an integrated data source and an easy accession of mulberry large-scale genomic sequencing and assembly, predicted genes and functional annotations, expressed sequence tags (ESTs), transposable elements (TEs), Gene Ontology (GO) terms, horizontal gene transfers between mulberry and silkworm and ortholog and paralog groups. Transcriptome sequencing data for M. notabilis root, leaf, bark, winter bud and male flower can also be searched and downloaded. Furthermore, MorusDB provides an analytical workbench with some built-in tools and pipelines, such as BLAST, Search GO, Mulberry GO and Mulberry GBrowse, to facilitate genomic studies and comparative genomics. The MorusDB provides important genomic resources for scientists working with mulberry and other Moraceae species, which include many important fruit crops. Designed as a basic platform and accompanied by the SilkDB, MorusDB strives to be a comprehensive platform for the silkworm–mulberry interaction studies. Database URL: http://morus.swu.edu.cn/morusdb. PMID:24923822

  11. Application of resequencing to rice genomics, functional genomics and evolutionary analysis

    PubMed Central

    2014-01-01

    Rice is a model system used for crop genomics studies. The completion of the rice genome draft sequences in 2002 not only accelerated functional genome studies, but also initiated a new era of resequencing rice genomes. Based on the reference genome in rice, next-generation sequencing (NGS) using the high-throughput sequencing system can efficiently accomplish whole genome resequencing of various genetic populations and diverse germplasm resources. Resequencing technology has been effectively utilized in evolutionary analysis, rice genomics and functional genomics studies. This technique is beneficial for both bridging the knowledge gap between genotype and phenotype and facilitating molecular breeding via gene design in rice. Here, we also discuss the limitation, application and future prospects of rice resequencing. PMID:25006357

  12. Genomic resources in mungbean for future breeding programs

    PubMed Central

    Kim, Sue K.; Nair, Ramakrishnan M.; Lee, Jayern; Lee, Suk-Ha

    2015-01-01

    Among the legume family, mungbean (Vigna radiata) has become one of the important crops in Asia, showing a steady increase in global production. It provides a good source of protein and contains most notably folate and iron. Beyond the nutritional value of mungbean, certain features make it a well-suited model organism among legume plants because of its small genome size, short life-cycle, self-pollinating, and close genetic relationship to other legumes. In the past, there have been several efforts to develop molecular markers and linkage maps associated with agronomic traits for the genetic improvement of mungbean and, ultimately, breeding for cultivar development to increase the average yields of mungbean. The recent release of a reference genome of the cultivated mungbean (V. radiata var. radiata VC1973A) and an additional de novo sequencing of a wild relative mungbean (V. radiata var. sublobata) has provided a framework for mungbean genetic and genome research, that can further be used for genome-wide association and functional studies to identify genes related to specific agronomic traits. Moreover, the diverse gene pool of wild mungbean comprises valuable genetic resources of beneficial genes that may be helpful in widening the genetic diversity of cultivated mungbean. This review paper covers the research progress on molecular and genomics approaches and the current status of breeding programs that have developed to move toward the ultimate goal of mungbean improvement. PMID:26322067

  13. Functional genomics of pathogenic bacteria.

    PubMed Central

    Moxon, E R; Hood, D W; Saunders, N J; Schweda, E K H; Richards, J C

    2002-01-01

    Microbial diseases remain the commonest cause of global mortality and morbidity. Automated-DNA sequencing has revolutionized the investigation of pathogenic microbes by making the immense fund of information contained in their genomes available at reasonable cost. The challenge is how this information can be used to increase current understanding of the biology of commensal and virulence behaviour of pathogens with particular emphasis on in vivo function and novel approaches to prevention. One example of the application of whole-genome-sequence information is afforded by investigations of the pathogenic role of Haemophilus influenzae lipopolysaccharide and its candidacy as a vaccine. PMID:11839188

  14. Plant Comparative and Functional Genomics

    DOE PAGES

    Yang, Xiaohan; Leebens-Mack, Jim; Chen, Feng; ...

    2015-01-01

    Plants form the foundation for our global ecosystem and are essential for environmental and human health. An increasing number of available plant genomes and tractable experimental systems, comparative and functional plant genomics research is greatly expanding our knowledge of the molecular basis of economically and nutritionally important traits in crop plants. Inferences drawn from comparative genomics are motivating experimental investigations of gene function and gene interactions. In this special issue aims to highlight recent advances made in comparative and functional genomics research in plants. Nine original research articles in this special issue cover five important topics: (1) transcription factor genemore » families relevant to abiotic stress tolerance; (2) plant secondary metabolism; (3) transcriptomebased markers for quantitative trait locus; (4) epigenetic modifications in plant-microbe interactions; and (5) computational prediction of protein-protein interactions. Finally, we studied the plant species in these articles which include model species as well as nonmodel plant species of economic importance (e.g., food crops and medicinal plants).« less

  15. Functional genomics approaches in parasitic helminths.

    PubMed

    Hagen, J; Lee, E F; Fairlie, W D; Kalinna, B H

    2012-01-01

    As research on parasitic helminths is moving into the post-genomic era, an enormous effort is directed towards deciphering gene function and to achieve gene annotation. The sequences that are available in public databases undoubtedly hold information that can be utilized for new interventions and control but the exploitation of these resources has until recently remained difficult. Only now, with the emergence of methods to genetically manipulate and transform parasitic worms will it be possible to gain a comprehensive understanding of the molecular mechanisms involved in nutrition, metabolism, developmental switches/maturation and interaction with the host immune system. This review focuses on functional genomics approaches in parasitic helminths that are currently used, to highlight potential applications of these technologies in the areas of cell biology, systems biology and immunobiology of parasitic helminths. © 2011 Blackwell Publishing Ltd.

  16. Functional Annotation Analytics of Rhodopseudomonas palustris Genomes.

    PubMed

    Simmons, Shaneka S; Isokpehi, Raphael D; Brown, Shyretha D; McAllister, Donee L; Hall, Charnia C; McDuffy, Wanaki M; Medley, Tamara L; Udensi, Udensi K; Rajnarayanan, Rajendram V; Ayensu, Wellington K; Cohly, Hari H P

    2011-01-01

    Rhodopseudomonas palustris, a nonsulphur purple photosynthetic bacteria, has been extensively investigated for its metabolic versatility including ability to produce hydrogen gas from sunlight and biomass. The availability of the finished genome sequences of six R. palustris strains (BisA53, BisB18, BisB5, CGA009, HaA2 and TIE-1) combined with online bioinformatics software for integrated analysis presents new opportunities to determine the genomic basis of metabolic versatility and ecological lifestyles of the bacteria species. The purpose of this investigation was to compare the functional annotations available for multiple R. palustris genomes to identify annotations that can be further investigated for strain-specific or uniquely shared phenotypic characteristics. A total of 2,355 protein family Pfam domain annotations were clustered based on presence or absence in the six genomes. The clustering process identified groups of functional annotations including those that could be verified as strain-specific or uniquely shared phenotypes. For example, genes encoding water/glycerol transport were present in the genome sequences of strains CGA009 and BisB5, but absent in strains BisA53, BisB18, HaA2 and TIE-1. Protein structural homology modeling predicted that the two orthologous 240 aa R. palustris aquaporins have water-specific transport function. Based on observations in other microbes, the presence of aquaporin in R. palustris strains may improve freeze tolerance in natural conditions of rapid freezing such as nitrogen fixation at low temperatures where access to liquid water is a limiting factor for nitrogenase activation. In the case of adaptive loss of aquaporin genes, strains may be better adapted to survive in conditions of high-sugar content such as fermentation of biomass for biohydrogen production. Finally, web-based resources were developed to allow for interactive, user-defined selection of the relationship between protein family annotations and the R

  17. Functional Annotation Analytics of Rhodopseudomonas palustris Genomes

    PubMed Central

    Simmons, Shaneka S.; Isokpehi, Raphael D.; Brown, Shyretha D.; McAllister, Donee L.; Hall, Charnia C.; McDuffy, Wanaki M.; Medley, Tamara L.; Udensi, Udensi K.; Rajnarayanan, Rajendram V.; Ayensu, Wellington K.; Cohly, Hari H.P.

    2011-01-01

    Rhodopseudomonas palustris, a nonsulphur purple photosynthetic bacteria, has been extensively investigated for its metabolic versatility including ability to produce hydrogen gas from sunlight and biomass. The availability of the finished genome sequences of six R. palustris strains (BisA53, BisB18, BisB5, CGA009, HaA2 and TIE-1) combined with online bioinformatics software for integrated analysis presents new opportunities to determine the genomic basis of metabolic versatility and ecological lifestyles of the bacteria species. The purpose of this investigation was to compare the functional annotations available for multiple R. palustris genomes to identify annotations that can be further investigated for strain-specific or uniquely shared phenotypic characteristics. A total of 2,355 protein family Pfam domain annotations were clustered based on presence or absence in the six genomes. The clustering process identified groups of functional annotations including those that could be verified as strain-specific or uniquely shared phenotypes. For example, genes encoding water/glycerol transport were present in the genome sequences of strains CGA009 and BisB5, but absent in strains BisA53, BisB18, HaA2 and TIE-1. Protein structural homology modeling predicted that the two orthologous 240 aa R. palustris aquaporins have water-specific transport function. Based on observations in other microbes, the presence of aquaporin in R. palustris strains may improve freeze tolerance in natural conditions of rapid freezing such as nitrogen fixation at low temperatures where access to liquid water is a limiting factor for nitrogenase activation. In the case of adaptive loss of aquaporin genes, strains may be better adapted to survive in conditions of high-sugar content such as fermentation of biomass for biohydrogen production. Finally, web-based resources were developed to allow for interactive, user-defined selection of the relationship between protein family annotations and the R

  18. 2004 Structural, Function and Evolutionary Genomics

    SciTech Connect

    Douglas L. Brutlag Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  19. GEneSTATION 1.0: a synthetic resource of diverse evolutionary and functional genomic data for studying the evolution of pregnancy-associated tissues and phenotypes.

    PubMed

    Kim, Mara; Cooper, Brian A; Venkat, Rohit; Phillips, Julie B; Eidem, Haley R; Hirbo, Jibril; Nutakki, Sashank; Williams, Scott M; Muglia, Louis J; Capra, J Anthony; Petren, Kenneth; Abbot, Patrick; Rokas, Antonis; McGary, Kriston L

    2016-01-04

    Mammalian gestation and pregnancy are fast evolving processes that involve the interaction of the fetal, maternal and paternal genomes. Version 1.0 of the GEneSTATION database (http://genestation.org) integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and to accelerate the translation of discoveries from model organisms to humans. GEneSTATION is built using tools from the Generic Model Organism Database project, including the biology-aware database CHADO, new tools for rapid data integration, and algorithms that streamline synthesis and user access. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 high-quality mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g. gene age, population genetic and molecular evolutionary statistics), organismal (e.g. tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g. Gene Ontology Annotation, protein interactions), as well as links to many general (e.g. Entrez, PubMed) and pregnancy disease-specific (e.g. PTBgene, dbPTB) databases. By facilitating the synthesis of diverse functional and evolutionary data in pregnancy-associated tissues and phenotypes and enabling their quick, intuitive, accurate and customized meta-analysis, GEneSTATION provides a novel platform for comprehensive investigation of the function and evolution of mammalian pregnancy.

  20. GEneSTATION 1.0: a synthetic resource of diverse evolutionary and functional genomic data for studying the evolution of pregnancy-associated tissues and phenotypes

    PubMed Central

    Kim, Mara; Cooper, Brian A.; Venkat, Rohit; Phillips, Julie B.; Eidem, Haley R.; Hirbo, Jibril; Nutakki, Sashank; Williams, Scott M.; Muglia, Louis J.; Capra, J. Anthony; Petren, Kenneth; Abbot, Patrick; Rokas, Antonis; McGary, Kriston L.

    2016-01-01

    Mammalian gestation and pregnancy are fast evolving processes that involve the interaction of the fetal, maternal and paternal genomes. Version 1.0 of the GEneSTATION database (http://genestation.org) integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and to accelerate the translation of discoveries from model organisms to humans. GEneSTATION is built using tools from the Generic Model Organism Database project, including the biology-aware database CHADO, new tools for rapid data integration, and algorithms that streamline synthesis and user access. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 high-quality mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g. gene age, population genetic and molecular evolutionary statistics), organismal (e.g. tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g. Gene Ontology Annotation, protein interactions), as well as links to many general (e.g. Entrez, PubMed) and pregnancy disease-specific (e.g. PTBgene, dbPTB) databases. By facilitating the synthesis of diverse functional and evolutionary data in pregnancy-associated tissues and phenotypes and enabling their quick, intuitive, accurate and customized meta-analysis, GEneSTATION provides a novel platform for comprehensive investigation of the function and evolution of mammalian pregnancy. PMID:26567549

  1. Update on RefSeq microbial genomes resources

    PubMed Central

    Tatusova, Tatiana; Ciufo, Stacy; Federhen, Scott; Fedorov, Boris; McVeigh, Richard; O'Neill, Kathleen; Tolstoy, Igor; Zaslavsky, Leonid

    2015-01-01

    NCBI RefSeq genome collection http://www.ncbi.nlm.nih.gov/genome represents all three major domains of life: Eukarya, Bacteria and Archaea as well as Viruses. Prokaryotic genome sequences are the most rapidly growing part of the collection. During the year of 2014 more than 10 000 microbial genome assemblies have been publicly released bringing the total number of prokaryotic genomes close to 30 000. We continue to improve the quality and usability of the microbial genome resources by providing easy access to the data and the results of the pre-computed analysis, and improving analysis and visualization tools. A number of improvements have been incorporated into the Prokaryotic Genome Annotation Pipeline. Several new features have been added to RefSeq prokaryotic genomes data processing pipeline including the calculation of genome groups (clades) and the optimization of protein clusters generation using pan-genome approach. PMID:25510495

  2. The Brachypodium genome sequence: a resource for oat genomics research

    USDA-ARS?s Scientific Manuscript database

    Oat (Avena sativa) is an important cereal crop used as both an animal feed and for human consumption. Genetic and genomic research on oat is hindered because it is hexaploid and possesses a large (13 Gb) genome. Diploid Avena relatives have been employed for genetic and genomic studies, but only mod...

  3. Sources for Comparative Studies of Placentation. II. Genomic Resources

    PubMed Central

    Wildman, Derek E.

    2008-01-01

    The genomes of dozens of placental mammal species are now publicly available. These genome sequences have the potential to provide insight into the development and evolution of the placenta. In particular, the variable anatomy of the placenta has likely been affected by natural selection on the genomes of living and extinct mammals. In this note the current availability of mammal genome sequences is reviewed, and strengths and limitations of these data are discussed. Additionally, museums, zoos, and commercial entities are available to provide genomic resources to the placental research community. Recommendations for tissue storage conditions of placentas in genomic research are given. PMID:18155141

  4. EuPathDB: the eukaryotic pathogen genomics database resource

    PubMed Central

    Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie

    2017-01-01

    The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906

  5. EuPathDB: the eukaryotic pathogen genomics database resource.

    PubMed

    Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y; Brestelli, John; Brunk, Brian P; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C; Lawrence, Cris; Li, Wei; Pinney, Deborah F; Pulman, Jane A; Roos, David S; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie

    2017-01-04

    The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host-pathogen interactions.

  6. Protein function annotation using protein domain family resources.

    PubMed

    Das, Sayoni; Orengo, Christine A

    2016-01-15

    As a result of the genome sequencing and structural genomics initiatives, we have a wealth of protein sequence and structural data. However, only about 1% of these proteins have experimental functional annotations. As a result, computational approaches that can predict protein functions are essential in bridging this widening annotation gap. This article reviews the current approaches of protein function prediction using structure and sequence based classification of protein domain family resources with a special focus on functional families in the CATH-Gene3D resource.

  7. Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA

    PubMed Central

    Prakash, Ashwin; Bechtel, Jason; Fedorov, Alexei

    2011-01-01

    Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al. 2009). Here we demonstrate a freely available Internet resource -- the Genomic MRI program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al. 2008). This package also allows generation of randomized sequences with various properties and level of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition. PMID:21610667

  8. The Mouse Genome Database (MGD): comprehensive resource for genetics and genomics of the laboratory mouse

    PubMed Central

    Eppig, Janan T.; Blake, Judith A.; Bult, Carol J.; Kadin, James A.; Richardson, Joel E.

    2012-01-01

    The Mouse Genome Database (MGD, http://www.informatics.jax.org) is the international community resource for integrated genetic, genomic and biological data about the laboratory mouse. Data in MGD are obtained through loads from major data providers and experimental consortia, electronic submissions from laboratories and from the biomedical literature. MGD maintains a comprehensive, unified, non-redundant catalog of mouse genome features generated by distilling gene predictions from NCBI, Ensembl and VEGA. MGD serves as the authoritative source for the nomenclature of mouse genes, mutations, alleles and strains. MGD is the primary source for evidence-supported functional annotations for mouse genes and gene products using the Gene Ontology (GO). MGD provides full annotation of phenotypes and human disease associations for mouse models (genotypes) using terms from the Mammalian Phenotype Ontology and disease names from the Online Mendelian Inheritance in Man (OMIM) resource. MGD is freely accessible online through our website, where users can browse and search interactively, access data in bulk using Batch Query or BioMart, download data files or use our web services Application Programming Interface (API). Improvements to MGD include expanded genome feature classifications, inclusion of new mutant allele sets and phenotype associations and extensions of GO to include new relationships and a new stream of annotations via phylogenetic-based approaches. PMID:22075990

  9. The Mouse Genome Database (MGD): comprehensive resource for genetics and genomics of the laboratory mouse.

    PubMed

    Eppig, Janan T; Blake, Judith A; Bult, Carol J; Kadin, James A; Richardson, Joel E

    2012-01-01

    The Mouse Genome Database (MGD, http://www.informatics.jax.org) is the international community resource for integrated genetic, genomic and biological data about the laboratory mouse. Data in MGD are obtained through loads from major data providers and experimental consortia, electronic submissions from laboratories and from the biomedical literature. MGD maintains a comprehensive, unified, non-redundant catalog of mouse genome features generated by distilling gene predictions from NCBI, Ensembl and VEGA. MGD serves as the authoritative source for the nomenclature of mouse genes, mutations, alleles and strains. MGD is the primary source for evidence-supported functional annotations for mouse genes and gene products using the Gene Ontology (GO). MGD provides full annotation of phenotypes and human disease associations for mouse models (genotypes) using terms from the Mammalian Phenotype Ontology and disease names from the Online Mendelian Inheritance in Man (OMIM) resource. MGD is freely accessible online through our website, where users can browse and search interactively, access data in bulk using Batch Query or BioMart, download data files or use our web services Application Programming Interface (API). Improvements to MGD include expanded genome feature classifications, inclusion of new mutant allele sets and phenotype associations and extensions of GO to include new relationships and a new stream of annotations via phylogenetic-based approaches.

  10. Phenotypic and genomic analyses of a fast neutron mutant population resource in soybean

    USDA-ARS?s Scientific Manuscript database

    Mutagenized populations have become indispensable resources for introducing variation and studying gene function in plant genomics research. In this study, fast neutron (FN) radiation was used to induce deletion mutations in the soybean (Glycine max (L.) Merrill) genome. Approximately 120,000 soybea...

  11. From data to function: functional modeling of poultry genomics data.

    PubMed

    McCarthy, F M; Lyons, E

    2013-09-01

    One of the challenges of functional genomics is to create a better understanding of the biological system being studied so that the data produced are leveraged to provide gains for agriculture, human health, and the environment. Functional modeling enables researchers to make sense of these data as it reframes a long list of genes or gene products (mRNA, ncRNA, and proteins) by grouping based upon function, be it individual molecular functions or interactions between these molecules or broader biological processes, including metabolic and signaling pathways. However, poultry researchers have been hampered by a lack of functional annotation data, tools, and training to use these data and tools. Moreover, this lack is becoming more critical as new sequencing technologies enable us to generate data not only for an increasingly diverse range of species but also individual genomes and populations of individuals. We discuss the impact of these new sequencing technologies on poultry research, with a specific focus on what functional modeling resources are available for poultry researchers. We also describe key strategies for researchers who wish to functionally model their own data, providing background information about functional modeling approaches, the data and tools to support these approaches, and the strengths and limitations of each. Specifically, we describe methods for functional analysis using Gene Ontology (GO) functional summaries, functional enrichment analysis, and pathways and network modeling. As annotation efforts begin to provide the fundamental data that underpin poultry functional modeling (such as improved gene identification, standardized gene nomenclature, temporal and spatial expression data and gene product function), tool developers are incorporating these data into new and existing tools that are used for functional modeling, and cyberinfrastructure is being developed to provide the necessary extendibility and scalability for storing and

  12. IGD: a resource for intronless genes in the human genome.

    PubMed

    Louhichi, Amel; Fourati, Ahmed; Rebaï, Ahmed

    2011-11-15

    Intronless genes (IGs) fraction varies between 2.7 and 97.7% in eukaryotic genomes. Although many databases on exons and introns exist, there was no curated database for such genes which allowed their study in a concerted manner. Such a database would be useful to identify the functional features and the distribution of these genes across the genome. Here, a new database of IGs in eukaryotes based on GenBank data was described. This database, called IGD (Intronless Gene Database), is a collection of gene sequences that were annotated and curated. The current version of IGD contains 687 human intronless genes with their protein and CDS sequences. Some features of the entries are given in this paper. Data was extracted from GenBank release 183 using a Perl script. Data extraction was followed by a manual curation step. Intronless genes were then analyzed based on their RefSeq annotation and Gene Ontology functional class. IGD represents a useful resource for retrieval and in silico study of intronless genes. IGD is available at http://www.bioinfo-cbs.org/igd with comprehensive help and FAQ pages that illustrate the main uses of this resource.

  13. BrucellaBase: Genome information resource.

    PubMed

    Sankarasubramanian, Jagadesan; Vishnu, Udayakumar S; Khader, L K M Abdul; Sridhar, Jayavel; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash

    2016-09-01

    Brucella sp. causes a major zoonotic disease, brucellosis. Brucella belongs to the family Brucellaceae under the order Rhizobiales of Alphaproteobacteria. We present BrucellaBase, a web-based platform, providing features of a genome database together with unique analysis tools. We have developed a web version of the multilocus sequence typing (MLST) (Whatmore et al., 2007) and phylogenetic analysis of Brucella spp. BrucellaBase currently contains genome data of 510 Brucella strains along with the user interfaces for BLAST, VFDB, CARD, pairwise genome alignment and MLST typing. Availability of these tools will enable the researchers interested in Brucella to get meaningful information from Brucella genome sequences. BrucellaBase will regularly be updated with new genome sequences, new features along with improvements in genome annotations. BrucellaBase is available online at http://www.dbtbrucellosis.in/brucellabase.html or http://59.99.226.203/brucellabase/homepage.html. Copyright © 2016 Elsevier B.V. All rights reserved.

  14. Genomic resources and their influence on the detection of the signal of positive selection in genome scans.

    PubMed

    Manel, S; Perrier, C; Pratlong, M; Abi-Rached, L; Paganini, J; Pontarotti, P; Aurelle, D

    2016-01-01

    Genome scans represent powerful approaches to investigate the action of natural selection on the genetic variation of natural populations and to better understand local adaptation. This is very useful, for example, in the field of conservation biology and evolutionary biology. Thanks to Next Generation Sequencing, genomic resources are growing exponentially, improving genome scan analyses in non-model species. Thousands of SNPs called using Reduced Representation Sequencing are increasingly used in genome scans. Besides, genome sequences are also becoming increasingly available, allowing better processing of short-read data, offering physical localization of variants, and improving haplotype reconstruction and data imputation. Ultimately, genome sequences are also becoming the raw material for selection inferences. Here, we discuss how the increasing availability of such genomic resources, notably genome sequences, influences the detection of signals of selection. Mainly, increasing data density and having the information of physical linkage data expand genome scans by (i) improving the overall quality of the data, (ii) helping the reconstruction of demographic history for the population studied to decrease false-positive rates and (iii) improving the statistical power of methods to detect the signal of selection. Of particular importance, the availability of a high-quality reference genome can improve the detection of the signal of selection by (i) allowing matching the potential candidate loci to linked coding regions under selection, (ii) rapidly moving the investigation to the gene and function and (iii) ensuring that the highly variable regions of the genomes that include functional genes are also investigated. For all those reasons, using reference genomes in genome scan analyses is highly recommended. © 2015 John Wiley & Sons Ltd.

  15. HopBase: a unified resource for Humulus genomics.

    PubMed

    Hill, Steven T; Sudarsanam, Ramcharan; Henning, John; Hendrix, David

    2017-01-01

    Hop (Humulus lupulus L. var lupulus) is a dioecious plant of worldwide significance, used primarily for bittering and flavoring in brewing beer. Studies on the medicinal properties of several unique compounds produced by hop have led to additional interest from pharmacy and healthcare industries as well as livestock production as a natural antibiotic. Genomic research in hop has resulted a published draft genome and transcriptome assemblies. As research into the genomics of hop has gained interest, there is a critical need for centralized online genomic resources. To support the growing research community, we report the development of an online resource "HopBase.org." In addition to providing a gene annotation to the existing Shinsuwase draft genome, HopBase makes available genome assemblies and annotations for both the cultivar "Teamaker" and male hop accession number USDA 21422M. These genome assemblies, gene annotations, along with other common data, coupled with a genome browser and BLAST database enable the hop community to enter the genomic age. The HopBase genomic resource is accessible at http://hopbase.org and http://hopbase.cgrb.oregonstate.edu. © The Author 2017. Published by Oxford University Press.

  16. RNA-Seq of Guar (Cyamopsis tetragonoloba, L. Taub.) Leaves: De novo Transcriptome Assembly, Functional Annotation and Development of Genomic Resources

    PubMed Central

    Tanwar, Umesh K.; Pruthi, Vikas; Randhawa, Gursharn S.

    2017-01-01

    Genetic improvement in industrially important guar (Cyamopsis tetragonoloba, L. Taub.) crop has been hindered due to the lack of sufficient genomic or transcriptomic resources. In this study, RNA-Seq technology was employed to characterize the transcriptome of leaf tissues from two guar varieties, namely, M-83 and RGC-1066. Approximately 30 million high-quality pair-end reads of each variety generated by Illumina HiSeq platform were used for de novo assembly by Trinity program. A total of 62,146 non-redundant unigenes with an average length of 679 bp were obtained. The quality assessment of assembled unigenes revealed 87.50% of complete and 97.18% partial core eukaryotic genes (CEGs). Sequence similarity analyses and annotation of the unigenes against non-redundant protein (Nr) and Gene Ontology (GO) databases identified 175,882 GO annotations. A total of 11,308 guar unigenes were annotated with various enzyme codes (EC) and categorized in six categories with 55 subclasses. The annotation of biochemical pathways resulted in a total of 11,971 unigenes assigned with 145 KEGG maps and 1759 enzyme codes. The species distribution analysis of the unigenes showed highest similarity with Glycine max genes. A total of 5773 potential simple sequence repeats (SSRs) and 3594 high-quality single nucleotide polymorphisms (SNPs) were identified. Out of 20 randomly selected SSRs for wet laboratory validation, 13 showed consistent PCR amplification in both guar varieties. In silico studies identified 145 polymorphic SSR markers in two varieties. To the best of our knowledge, this is the first report on transcriptome analysis and SNPs identification in guar till date. PMID:28210265

  17. MorusDB: a resource for mulberry genomics and genome biology.

    PubMed

    Li, Tian; Qi, Xiwu; Zeng, Qiwei; Xiang, Zhonghuai; He, Ningjia

    2014-01-01

    Mulberry is an important cultivated plant that has received the attention of biologists interested in sericulture and plant-insect interaction. Morus notabilis, a wild mulberry species with a minimal chromosome number is an ideal material for whole-genome sequencing and assembly. The genome and transcriptome of M. notabilis were sequenced and analyzed. In this article, a web-based and open-access database, the Morus Genome Database (MorusDB), was developed to enable easy-to-access and data mining. The MorusDB provides an integrated data source and an easy accession of mulberry large-scale genomic sequencing and assembly, predicted genes and functional annotations, expressed sequence tags (ESTs), transposable elements (TEs), Gene Ontology (GO) terms, horizontal gene transfers between mulberry and silkworm and ortholog and paralog groups. Transcriptome sequencing data for M. notabilis root, leaf, bark, winter bud and male flower can also be searched and downloaded. Furthermore, MorusDB provides an analytical workbench with some built-in tools and pipelines, such as BLAST, Search GO, Mulberry GO and Mulberry GBrowse, to facilitate genomic studies and comparative genomics. The MorusDB provides important genomic resources for scientists working with mulberry and other Moraceae species, which include many important fruit crops. Designed as a basic platform and accompanied by the SilkDB, MorusDB strives to be a comprehensive platform for the silkworm-mulberry interaction studies. Database URL: http://morus.swu.edu.cn/morusdb. © The Author(s) 2014. Published by Oxford University Press.

  18. Gene Chips and Functional Genomics

    NASA Astrophysics Data System (ADS)

    Hamadeh, Hisham; Afshari, Cynthia

    2000-11-01

    These past few years of scientific discovery will undoubtedly be remembered as the "genomics era," the period in which biologists succeeded in enumerating the sequence of nucleotides making up all, or at least most, of human DNA. And while this achievement has been heralded as a technological feat equal to the moon landing, it is only the first of many advances in DNA technology. Scientists are now faced with the task of understanding the meaning of the DNA sequence. Specifically, they want to learn how the DNA code relates to protein function. An important tool in the study of "functional genomics," is the cDNA microarray—also known as the gene chip. Inspired by computer microchips, gene chips allow scientists to monitor the expression of hundreds, even thousands, of genes in a fraction of the time it used to take to monitor the expression of a single one. By altering the conditions under which a particular tissue expresses genes—say, by exposing it to toxins or growth factors—scientists can determine the suite of genes expressed in different situations and hence start to get a handle on the function of these genes. The authors discuss this important new technology and some of its practical applications.

  19. Application of Functional Genomics for Bovine Respiratory Disease Diagnostics.

    PubMed

    Rai, Aswathy N; Epperson, William B; Nanduri, Bindu

    2015-01-01

    Bovine respiratory disease (BRD) is the most common economically important disease affecting cattle. For developing accurate diagnostics that can predict disease susceptibility/resistance and stratification, it is necessary to identify the molecular mechanisms that underlie BRD. To study the complex interactions among the bovine host and the multitude of viral and bacterial pathogens, as well as the environmental factors associated with BRD etiology, genome-scale high-throughput functional genomics methods such as microarrays, RNA-seq, and proteomics are helpful. In this review, we summarize the progress made in our understanding of BRD using functional genomics approaches. We also discuss some of the available bioinformatics resources for analyzing high-throughput data, in the context of biological pathways and molecular interactions. Although resources for studying host response to infection are avail-able, the corresponding information is lacking for majority of BRD pathogens, impeding progress in identifying diagnostic signatures for BRD using functional genomics approaches.

  20. Application of Functional Genomics for Bovine Respiratory Disease Diagnostics

    PubMed Central

    Rai, Aswathy N.; Epperson, William B.; Nanduri, Bindu

    2015-01-01

    Bovine respiratory disease (BRD) is the most common economically important disease affecting cattle. For developing accurate diagnostics that can predict disease susceptibility/resistance and stratification, it is necessary to identify the molecular mechanisms that underlie BRD. To study the complex interactions among the bovine host and the multitude of viral and bacterial pathogens, as well as the environmental factors associated with BRD etiology, genome-scale high-throughput functional genomics methods such as microarrays, RNA-seq, and proteomics are helpful. In this review, we summarize the progress made in our understanding of BRD using functional genomics approaches. We also discuss some of the available bioinformatics resources for analyzing high-throughput data, in the context of biological pathways and molecular interactions. Although resources for studying host response to infection are avail-able, the corresponding information is lacking for majority of BRD pathogens, impeding progress in identifying diagnostic signatures for BRD using functional genomics approaches. PMID:26526746

  1. CottonDB: A resource for cotton genome research

    USDA-ARS?s Scientific Manuscript database

    CottonDB (http://cottondb.org/) is a database and web resource for cotton genomic and genetic research. Created in 1995, CottonDB was among the first plant genome databases established by the USDA-ARS. Accessed through a website interface, the database aims to be a convenient, inclusive medium of ...

  2. Evolution, language and analogy in functional genomics

    NASA Technical Reports Server (NTRS)

    Benner, S. A.; Gaucher, E. A.

    2001-01-01

    Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.

  3. Evolution, language and analogy in functional genomics

    NASA Technical Reports Server (NTRS)

    Benner, S. A.; Gaucher, E. A.

    2001-01-01

    Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.

  4. Evolution, language and analogy in functional genomics.

    PubMed

    Benner, S A; Gaucher, E A

    2001-07-01

    Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.

  5. The KEGG resource for deciphering the genome.

    PubMed

    Kanehisa, Minoru; Goto, Susumu; Kawashima, Shuichi; Okuno, Yasushi; Hattori, Masahiro

    2004-01-01

    A grand challenge in the post-genomic era is a complete computer representation of the cell and the organism, which will enable computational prediction of higher-level complexity of cellular processes and organism behavior from genomic information. Toward this end we have been developing a knowledge-based approach for network prediction, which is to predict, given a complete set of genes in the genome, the protein interaction networks that are responsible for various cellular processes. KEGG at http://www.genome.ad.jp/kegg/ is the reference knowledge base that integrates current knowledge on molecular interaction networks such as pathways and complexes (PATHWAY database), information about genes and proteins generated by genome projects (GENES/SSDB/KO databases) and information about biochemical compounds and reactions (COMPOUND/GLYCAN/REACTION databases). These three types of database actually represent three graph objects, called the protein network, the gene universe and the chemical universe. New efforts are being made to abstract knowledge, both computationally and manually, about ortholog clusters in the KO (KEGG Orthology) database, and to collect and analyze carbohydrate structures in the GLYCAN database.

  6. The KEGG resource for deciphering the genome

    PubMed Central

    Kanehisa, Minoru; Goto, Susumu; Kawashima, Shuichi; Okuno, Yasushi; Hattori, Masahiro

    2004-01-01

    A grand challenge in the post-genomic era is a complete computer representation of the cell and the organism, which will enable computational prediction of higher-level complexity of cellular processes and organism behavior from genomic information. Toward this end we have been developing a knowledge-based approach for network prediction, which is to predict, given a complete set of genes in the genome, the protein interaction networks that are responsible for various cellular processes. KEGG at http://www.genome.ad.jp/kegg/ is the reference knowledge base that integrates current knowledge on molecular interaction networks such as pathways and complexes (PATHWAY database), information about genes and proteins generated by genome projects (GENES/SSDB/KO databases) and information about biochemical compounds and reactions (COMPOUND/GLYCAN/REACTION databases). These three types of database actually represent three graph objects, called the protein network, the gene universe and the chemical universe. New efforts are being made to abstract knowledge, both computationally and manually, about ortholog clusters in the KO (KEGG Orthology) database, and to collect and analyze carbohydrate structures in the GLYCAN database. PMID:14681412

  7. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.

    PubMed

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to assist users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/. © The Author(s) 2015. Published by Oxford University Press.

  8. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources

    PubMed Central

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to assist users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/ PMID:26589635

  9. The Functional Genomics Initiative at Oak Ridge National Laboratory

    SciTech Connect

    Johnson, Dabney; Justice, Monica; Beattle, Ken; Buchanan, Michelle; Ramsey, Michael; Ramsey, Rose; Paulus, Michael; Ericson, Nance; Allison, David; Kress, Reid; Mural, Richard; Uberbacher, Ed; Mann, Reinhold

    1997-12-31

    The Functional Genomics Initiative at the Oak Ridge National Laboratory integrates outstanding capabilities in mouse genetics, bioinformatics, and instrumentation. The 50 year investment by the DOE in mouse genetics/mutagenesis has created a one-of-a-kind resource for generating mutations and understanding their biological consequences. It is generally accepted that, through the mouse as a surrogate for human biology, we will come to understand the function of human genes. In addition to this world class program in mammalian genetics, ORNL has also been a world leader in developing bioinformatics tools for the analysis, management and visualization of genomic data. Combining this expertise with new instrumentation technologies will provide a unique capability to understand the consequences of mutations in the mouse at both the organism and molecular levels. The goal of the Functional Genomics Initiative is to develop the technology and methodology necessary to understand gene function on a genomic scale and apply these technologies to megabase regions of the human genome. The effort is scoped so as to create an effective and powerful resource for functional genomics. ORNL is partnering with the Joint Genome Institute and other large scale sequencing centers to sequence several multimegabase regions of both human and mouse genomic DNA, to identify all the genes in these regions, and to conduct fundamental surveys to examine gene function at the molecular and organism level. The Initiative is designed to be a pilot for larger scale deployment in the post-genome era. Technologies will be applied to the examination of gene expression and regulation, metabolism, gene networks, physiology and development.

  10. The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

    PubMed

    Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

    2011-01-01

    The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and trancriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.

  11. Update on RefSeq microbial genomes resources.

    PubMed

    Tatusova, Tatiana; Ciufo, Stacy; Federhen, Scott; Fedorov, Boris; McVeigh, Richard; O'Neill, Kathleen; Tolstoy, Igor; Zaslavsky, Leonid

    2015-01-01

    NCBI RefSeq genome collection http://www.ncbi.nlm.nih.gov/genome represents all three major domains of life: Eukarya, Bacteria and Archaea as well as Viruses. Prokaryotic genome sequences are the most rapidly growing part of the collection. During the year of 2014 more than 10,000 microbial genome assemblies have been publicly released bringing the total number of prokaryotic genomes close to 30,000. We continue to improve the quality and usability of the microbial genome resources by providing easy access to the data and the results of the pre-computed analysis, and improving analysis and visualization tools. A number of improvements have been incorporated into the Prokaryotic Genome Annotation Pipeline. Several new features have been added to RefSeq prokaryotic genomes data processing pipeline including the calculation of genome groups (clades) and the optimization of protein clusters generation using pan-genome approach. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.

  12. Fungal genetic resource centres and the genomic challenge.

    PubMed

    Ryan, Matthew J; Smith, David

    2004-12-01

    Fungal research and education has for many years been supported by public service genetic resource centres, whose roles have been to maintain, preserve and supply living cultures to the research community. In the genomic era, genetic resource centres are perhaps more important than ever before. The cultures held, many of which are described and validated by expert biosystematists, are valuable resources for the future. There is a need to supply genomic and proteomic research programmes with fully characterised organisms, as usage of organisms from unreliable sources can prove disastrous, not least in economical terms. However, mycologists often require more than just the organisms, for example, their associated information is vital for bioinformatic applications and some researchers may only require genomic DNA from the organism rather than the organism per se. Genetic resource centres are continually adapting to meet the needs of their users and the wider mycological research community, this associated with OECD international initiatives should ensure they exist to support research for many years to come. This review considers the impact of such initiatives, the current roles of fungal genetic resource centres, the mechanisms used to preserve organisms in a stable manner and the range of resources that are offered for genomic research.

  13. The Perennial Ryegrass GenomeZipper: Targeted Use of Genome Resources for Comparative Grass Genomics1[C][W

    PubMed Central

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F.X.; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-01-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species. PMID:23184232

  14. The 19 Genomes of Drosophila: A BAC Library Resource for Genus-Wide and Genome-Scale Comparative Evolutionary Research

    PubMed Central

    Song, Xiang; Goicoechea, Jose Luis; Ammiraju, Jetty S. S.; Luo, Meizhong; He, Ruifeng; Lin, Jinke; Lee, So-Jeong; Sisneros, Nicholas; Watts, Tom; Kudrna, David A.; Golser, Wolfgang; Ashley, Elizabeth; Collura, Kristi; Braidotti, Michele; Yu, Yeisoo; Matzkin, Luciano M.; McAllister, Bryant F.; Markow, Therese Ann; Wing, Rod A.

    2011-01-01

    The genus Drosophila has been the subject of intense comparative phylogenomics characterization to provide insights into genome evolution under diverse biological and ecological contexts and to functionally annotate the Drosophila melanogaster genome, a model system for animal and insect genetics. Recent sequencing of 11 additional Drosophila species from various divergence points of the genus is a first step in this direction. However, to fully reap the benefits of this resource, the Drosophila community is faced with two critical needs: i.e., the expansion of genomic resources from a much broader range of phylogenetic diversity and the development of additional resources to aid in finishing the existing draft genomes. To address these needs, we report the first synthesis of a comprehensive set of bacterial artificial chromosome (BAC) resources for 19 Drosophila species from all three subgenera. Ten libraries were derived from the exact source used to generate 10 of the 12 draft genomes, while the rest were generated from a strategically selected set of species on the basis of salient ecological and life history features and their phylogenetic positions. The majority of the new species have at least one sequenced reference genome for immediate comparative benefit. This 19-BAC library set was rigorously characterized and shown to have large insert sizes (125–168 kb), low nonrecombinant clone content (0.3–5.3%), and deep coverage (9.1–42.9×). Further, we demonstrated the utility of this BAC resource for generating physical maps of targeted loci, refining draft sequence assemblies and identifying potential genomic rearrangements across the phylogeny. PMID:21321134

  15. The 19 genomes of Drosophila: a BAC library resource for genus-wide and genome-scale comparative evolutionary research.

    PubMed

    Song, Xiang; Goicoechea, Jose Luis; Ammiraju, Jetty S S; Luo, Meizhong; He, Ruifeng; Lin, Jinke; Lee, So-Jeong; Sisneros, Nicholas; Watts, Tom; Kudrna, David A; Golser, Wolfgang; Ashley, Elizabeth; Collura, Kristi; Braidotti, Michele; Yu, Yeisoo; Matzkin, Luciano M; McAllister, Bryant F; Markow, Therese Ann; Wing, Rod A

    2011-04-01

    The genus Drosophila has been the subject of intense comparative phylogenomics characterization to provide insights into genome evolution under diverse biological and ecological contexts and to functionally annotate the Drosophila melanogaster genome, a model system for animal and insect genetics. Recent sequencing of 11 additional Drosophila species from various divergence points of the genus is a first step in this direction. However, to fully reap the benefits of this resource, the Drosophila community is faced with two critical needs: i.e., the expansion of genomic resources from a much broader range of phylogenetic diversity and the development of additional resources to aid in finishing the existing draft genomes. To address these needs, we report the first synthesis of a comprehensive set of bacterial artificial chromosome (BAC) resources for 19 Drosophila species from all three subgenera. Ten libraries were derived from the exact source used to generate 10 of the 12 draft genomes, while the rest were generated from a strategically selected set of species on the basis of salient ecological and life history features and their phylogenetic positions. The majority of the new species have at least one sequenced reference genome for immediate comparative benefit. This 19-BAC library set was rigorously characterized and shown to have large insert sizes (125-168 kb), low nonrecombinant clone content (0.3-5.3%), and deep coverage (9.1-42.9×). Further, we demonstrated the utility of this BAC resource for generating physical maps of targeted loci, refining draft sequence assemblies and identifying potential genomic rearrangements across the phylogeny.

  16. Resources for Biological Annotation of the Drosophila Genome

    SciTech Connect

    Gerald M. Rubin

    2005-08-08

    This project supported seed money for the development of cDNA and genetic resources to support studies of the Drosophila melanogaster genome. Key publications supported by this work that provide additional detail: (1) ''The Drosophila gene collection: identification of putative full-length cDNAs for 70% of D. melanogaster genes''; and (2) ''The Berkeley Drosophila Genome Project gene disruption project: Single P-element insertions mutating 25% of vital Drosophila genes''.

  17. MBGD update 2010: toward a comprehensive resource for exploring microbial genome diversity.

    PubMed

    Uchiyama, Ikuo; Higuchi, Toshio; Kawai, Mikihiko

    2010-01-01

    The microbial genome database (MBGD) for comparative analysis is a platform for microbial comparative genomics based on automated ortholog group identification. A prominent feature of MBGD is that it allows users to create ortholog groups using a specified subgroup of organisms. The database is constantly updated and now contains almost 1000 genomes. To utilize the MBGD database as a comprehensive resource for investigating microbial genome diversity, we have developed the following advanced functionalities: (i) enhanced assignment of functional annotation, including external database links to each orthologous group, (ii) interface for choosing a set of genomes to compare based on phenotypic properties, (iii) the addition of more eukaryotic microbial genomes (fungi and protists) and some higher eukaryotes as references and (iv) enhancement of the MyMBGD mode, which allows users to add their own genomes to MBGD and now accepts raw genomic sequences without any annotation (in such a case, it runs a gene-finding procedure before identifying the orthologs). Some analysis functions, such as the function to find orthologs with similar phylogenetic patterns, have also been improved. MBGD is accessible at http://mbgd.genome.ad.jp/.

  18. Open chromatin reveals the functional maize genome

    PubMed Central

    Rodgers-Melnick, Eli; Vera, Daniel L.; Bass, Hank W.

    2016-01-01

    Cellular processes mediated through nuclear DNA must contend with chromatin. Chromatin structural assays can efficiently integrate information across diverse regulatory elements, revealing the functional noncoding genome. In this study, we use a differential nuclease sensitivity assay based on micrococcal nuclease (MNase) digestion to discover open chromatin regions in the maize genome. We find that maize MNase-hypersensitive (MNase HS) regions localize around active genes and within recombination hotspots, focusing biased gene conversion at their flanks. Although MNase HS regions map to less than 1% of the genome, they consistently explain a remarkably large amount (∼40%) of heritable phenotypic variance in diverse complex traits. MNase HS regions are therefore on par with coding sequences as annotations that demarcate the functional parts of the maize genome. These results imply that less than 3% of the maize genome (coding and MNase HS regions) may give rise to the overwhelming majority of phenotypic variation, greatly narrowing the scope of the functional genome. PMID:27185945

  19. Enhancing faba bean (Vicia faba L.) genome resources

    PubMed Central

    Cooper, James W.; Wilson, Michael H.; Derks, Martijn F. L.; Smit, Sandra; Kunert, Karl J.; Cullis, Christopher

    2017-01-01

    Abstract Grain legume improvement is currently impeded by a lack of genomic resources. The paucity of genome information for faba bean can be attributed to the intrinsic difficulties of assembling/annotating its giant (~13 Gb) genome. In order to address this challenge, RNA-sequencing analysis was performed on faba bean (cv. Wizard) leaves. Read alignment to the faba bean reference transcriptome identified 16 300 high quality unigenes. In addition, Illumina paired-end sequencing was used to establish a baseline for genomic information assembly. Genomic reads were assembled de novo into contigs with a size range of 50–5000 bp. Over 85% of sequences did not align to known genes, of which ~10% could be aligned to known repetitive genetic elements. Over 26 000 of the reference transcriptome unigenes could be aligned to DNA-sequencing (DNA-seq) reads with high confidence. Moreover, this comparison identified 56 668 potential splice points in all identified unigenes. Sequence length data were extended at 461 putative loci through alignment of DNA-seq contigs to full-length, publicly available linkage marker sequences. Reads also yielded coverages of 3466× and 650× for the chloroplast and mitochondrial genomes, respectively. Inter- and intraspecies organelle genome comparisons established core legume organelle gene sets, and revealed polymorphic regions of faba bean organelle genomes. PMID:28419381

  20. e-Fungi: a data resource for comparative analysis of fungal genomes

    PubMed Central

    Hedeler, Cornelia; Wong, Han Min; Cornell, Michael J; Alam, Intikhab; Soanes, Darren M; Rattray, Magnus; Hubbard, Simon J; Talbot, Nicholas J; Oliver, Stephen G; Paton, Norman W

    2007-01-01

    Background The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tends to be distributed in various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. Description To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information has been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. Conclusion The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the genomes stored in the database

  1. PhytoPath: an integrative resource for plant pathogen genomics

    PubMed Central

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D.; Staines, Daniel M.; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species. PMID:26476449

  2. Assembly: a resource for assembled genomes at NCBI

    PubMed Central

    Kitts, Paul A.; Church, Deanna M.; Thibaud-Nissen, Françoise; Choi, Jinna; Hem, Vichet; Sapojnikov, Victor; Smith, Robert G.; Tatusova, Tatiana; Xiang, Charlie; Zherikov, Andrey; DiCuccio, Michael; Murphy, Terence D.; Pruitt, Kim D.; Kimchi, Avi

    2016-01-01

    The NCBI Assembly database (www.ncbi.nlm.nih.gov/assembly/) provides stable accessioning and data tracking for genome assembly data. The model underlying the database can accommodate a range of assembly structures, including sets of unordered contig or scaffold sequences, bacterial genomes consisting of a single complete chromosome, or complex structures such as a human genome with modeled allelic variation. The database provides an assembly accession and version to unambiguously identify the set of sequences that make up a particular version of an assembly, and tracks changes to updated genome assemblies. The Assembly database reports metadata such as assembly names, simple statistical reports of the assembly (number of contigs and scaffolds, contiguity metrics such as contig N50, total sequence length and total gap length) as well as the assembly update history. The Assembly database also tracks the relationship between an assembly submitted to the International Nucleotide Sequence Database Consortium (INSDC) and the assembly represented in the NCBI RefSeq project. Users can find assemblies of interest by querying the Assembly Resource directly or by browsing available assemblies for a particular organism. Links in the Assembly Resource allow users to easily download sequence and annotations for current versions of genome assemblies from the NCBI genomes FTP site. PMID:26578580

  3. Assembly: a resource for assembled genomes at NCBI.

    PubMed

    Kitts, Paul A; Church, Deanna M; Thibaud-Nissen, Françoise; Choi, Jinna; Hem, Vichet; Sapojnikov, Victor; Smith, Robert G; Tatusova, Tatiana; Xiang, Charlie; Zherikov, Andrey; DiCuccio, Michael; Murphy, Terence D; Pruitt, Kim D; Kimchi, Avi

    2016-01-04

    The NCBI Assembly database (www.ncbi.nlm.nih.gov/assembly/) provides stable accessioning and data tracking for genome assembly data. The model underlying the database can accommodate a range of assembly structures, including sets of unordered contig or scaffold sequences, bacterial genomes consisting of a single complete chromosome, or complex structures such as a human genome with modeled allelic variation. The database provides an assembly accession and version to unambiguously identify the set of sequences that make up a particular version of an assembly, and tracks changes to updated genome assemblies. The Assembly database reports metadata such as assembly names, simple statistical reports of the assembly (number of contigs and scaffolds, contiguity metrics such as contig N50, total sequence length and total gap length) as well as the assembly update history. The Assembly database also tracks the relationship between an assembly submitted to the International Nucleotide Sequence Database Consortium (INSDC) and the assembly represented in the NCBI RefSeq project. Users can find assemblies of interest by querying the Assembly Resource directly or by browsing available assemblies for a particular organism. Links in the Assembly Resource allow users to easily download sequence and annotations for current versions of genome assemblies from the NCBI genomes FTP site. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  4. Genome resource for the Indonesian coelacanth, Latimeria menadoensis.

    PubMed

    Danke, Joshua; Miyake, Tsutomu; Powers, Thomas; Schein, Jacqueline; Shin, Heesun; Bosdet, Ian; Erdmann, Mark; Caldwell, Roy; Amemiya, Chris T

    2004-03-01

    We have generated a BAC library from the Indonesian coelacanth, Latimeria menadoensis. This library was generated using genomic DNA of nuclei isolated from heart tissue, and has an average insert size of 171 kb. There are a total of 288 384-well microtiter dishes in the library (110,592 clones) and its genomic representation is estimated to encompass > or = 7X coverage based on the amount of DNA presumably cloned in the library as well as via hybridization with probes to a small set of single copy genes. This genomic resource has been made available to the public and should prove useful to the scientific community for many applications, including comparative genomics, molecular evolution and conservation genetics.

  5. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    PubMed Central

    2011-01-01

    Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and compliment these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives

  6. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

    PubMed

    Straub, Shannon C K; Fishbein, Mark; Livshultz, Tatyana; Foster, Zachary; Parks, Matthew; Weitemier, Kevin; Cronn, Richard C; Liston, Aaron

    2011-05-04

    Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and compliment these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first

  7. Plant Genome Resources at the National Center for Biotechnology Information

    PubMed Central

    Wheeler, David L.; Smith-White, Brian; Chetvernin, Vyacheslav; Resenchuk, Sergei; Dombrowski, Susan M.; Pechous, Steven W.; Tatusova, Tatiana; Ostell, James

    2005-01-01

    The National Center for Biotechnology Information (NCBI) integrates data from more than 20 biological databases through a flexible search and retrieval system called Entrez. A core Entrez database, Entrez Nucleotide, includes GenBank and is tightly linked to the NCBI Taxonomy database, the Entrez Protein database, and the scientific literature in PubMed. A suite of more specialized databases for genomes, genes, gene families, gene expression, gene variation, and protein domains dovetails with the core databases to make Entrez a powerful system for genomic research. Linked to the full range of Entrez databases is the NCBI Map Viewer, which displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allow maps from all plant genomes covered by the Map Viewer to be searched in tandem to produce a display of aligned maps from several species. PlantBLAST searches against the sequences shown in the Map Viewer allow BLAST alignments to be viewed within a genomic context. In addition, precomputed sequence similarities, such as those for proteins offered by BLAST Link, enable fluid navigation from unannotated to annotated sequences, quickening the pace of discovery. NCBI Web pages for plants, such as Plant Genome Central, complete the system by providing centralized access to NCBI's genomic resources as well as links to organism-specific Web pages beyond NCBI. PMID:16010002

  8. The National Microbial Pathogen Database Resource (NMPDR): a genomics platform based on subsystem annotation

    PubMed Central

    McNeil, Leslie Klis; Reich, Claudia; Aziz, Ramy K.; Bartels, Daniela; Cohoon, Matthew; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Hwang, Kaitlyn; Kubal, Michael; Margaryan, Gohar Rem; Meyer, Folker; Mihalo, William; Olsen, Gary J.; Olson, Robert; Osterman, Andrei; Paarmann, Daniel; Paczian, Tobias; Parrello, Bruce; Pusch, Gordon D.; Rodionov, Dmitry A.; Shi, Xinghua; Vassieva, Olga; Vonstein, Veronika; Zagnitko, Olga; Xia, Fangfang; Zinner, Jenifer; Overbeek, Ross; Stevens, Rick

    2007-01-01

    The National Microbial Pathogen Data Resource (NMPDR) () is a National Institute of Allergy and Infections Disease (NIAID)-funded Bioinformatics Resource Center that supports research in selected Category B pathogens. NMPDR contains the complete genomes of ∼50 strains of pathogenic bacteria that are the focus of our curators, as well as >400 other genomes that provide a broad context for comparative analysis across the three phylogenetic Domains. NMPDR integrates complete, public genomes with expertly curated biological subsystems to provide the most consistent genome annotations. Subsystems are sets of functional roles related by a biologically meaningful organizing principle, which are built over large collections of genomes; they provide researchers with consistent functional assignments in a biologically structured context. Investigators can browse subsystems and reactions to develop accurate reconstructions of the metabolic networks of any sequenced organism. NMPDR provides a comprehensive bioinformatics platform, with tools and viewers for genome analysis. Results of precomputed gene clustering analyses can be retrieved in tabular or graphic format with one-click tools. NMPDR tools include Signature Genes, which finds the set of genes in common or that differentiates two groups of organisms. Essentiality data collated from genome-wide studies have been curated. Drug target identification and high-throughput, in silico, compound screening are in development. PMID:17145713

  9. Functional profiling of the Saccharomyces cerevisiae genome.

    PubMed

    Giaever, Guri; Chu, Angela M; Ni, Li; Connelly, Carla; Riles, Linda; Véronneau, Steeve; Dow, Sally; Lucau-Danila, Ankuta; Anderson, Keith; André, Bruno; Arkin, Adam P; Astromoff, Anna; El-Bakkoury, Mohamed; Bangham, Rhonda; Benito, Rocio; Brachat, Sophie; Campanaro, Stefano; Curtiss, Matt; Davis, Karen; Deutschbauer, Adam; Entian, Karl-Dieter; Flaherty, Patrick; Foury, Francoise; Garfinkel, David J; Gerstein, Mark; Gotte, Deanna; Güldener, Ulrich; Hegemann, Johannes H; Hempel, Svenja; Herman, Zelek; Jaramillo, Daniel F; Kelly, Diane E; Kelly, Steven L; Kötter, Peter; LaBonte, Darlene; Lamb, David C; Lan, Ning; Liang, Hong; Liao, Hong; Liu, Lucy; Luo, Chuanyun; Lussier, Marc; Mao, Rong; Menard, Patrice; Ooi, Siew Loon; Revuelta, Jose L; Roberts, Christopher J; Rose, Matthias; Ross-Macdonald, Petra; Scherens, Bart; Schimmack, Greg; Shafer, Brenda; Shoemaker, Daniel D; Sookhai-Mahadeo, Sharon; Storms, Reginald K; Strathern, Jeffrey N; Valle, Giorgio; Voet, Marleen; Volckaert, Guido; Wang, Ching-yun; Ward, Teresa R; Wilhelmy, Julie; Winzeler, Elizabeth A; Yang, Yonghong; Yen, Grace; Youngman, Elaine; Yu, Kexin; Bussey, Howard; Boeke, Jef D; Snyder, Michael; Philippsen, Peter; Davis, Ronald W; Johnston, Mark

    2002-07-25

    Determining the effect of gene deletion is a fundamental approach to understanding gene function. Conventional genetic screens exhibit biases, and genes contributing to a phenotype are often missed. We systematically constructed a nearly complete collection of gene-deletion mutants (96% of annotated open reading frames, or ORFs) of the yeast Saccharomyces cerevisiae. DNA sequences dubbed 'molecular bar codes' uniquely identify each strain, enabling their growth to be analysed in parallel and the fitness contribution of each gene to be quantitatively assessed by hybridization to high-density oligonucleotide arrays. We show that previously known and new genes are necessary for optimal growth under six well-studied conditions: high salt, sorbitol, galactose, pH 8, minimal medium and nystatin treatment. Less than 7% of genes that exhibit a significant increase in messenger RNA expression are also required for optimal growth in four of the tested conditions. Our results validate the yeast gene-deletion collection as a valuable resource for functional genomics.

  10. The Sleipnir library for computational functional genomics.

    PubMed

    Huttenhower, Curtis; Schroeder, Mark; Chikina, Maria D; Troyanskaya, Olga G

    2008-07-01

    Biological data generation has accelerated to the point where hundreds or thousands of whole-genome datasets of various types are available for many model organisms. This wealth of data can lead to valuable biological insights when analyzed in an integrated manner, but the computational challenge of managing such large data collections is substantial. In order to mine these data efficiently, it is necessary to develop methods that use storage, memory and processing resources carefully. The Sleipnir C++ library implements a variety of machine learning and data manipulation algorithms with a focus on heterogeneous data integration and efficiency for very large biological data collections. Sleipnir allows microarray processing, functional ontology mining, clustering, Bayesian learning and inference and support vector machine tasks to be performed for heterogeneous data on scales not previously practical. In addition to the library, which can easily be integrated into new computational systems, prebuilt tools are provided to perform a variety of common tasks. Many tools are multithreaded for parallelization in desktop or high-throughput computing environments, and most tasks can be performed in minutes for hundreds of datasets using a standard personal computer. Source code (C++) and documentation are available at http://function.princeton.edu/sleipnir and compiled binaries are available from the authors on request.

  11. Integrating genomic resources of flatfish (Pleuronectiformes) to boost aquaculture production.

    PubMed

    Robledo, Diego; Hermida, Miguel; Rubiolo, Juan A; Fernández, Carlos; Blanco, Andrés; Bouza, Carmen; Martínez, Paulino

    2017-03-01

    Flatfish have a high market acceptance thus representing a profitable aquaculture production. The main farmed species is the turbot (Scophthalmus maximus) followed by Japanese flounder (Paralichthys olivaceous) and tongue sole (Cynoglossus semilaevis), but other species like Atlantic halibut (Hippoglossus hippoglossus), Senegalese sole (Solea senegalensis) and common sole (Solea solea) also register an important production and are very promising for farming. Important genomic resources are available for most of these species including whole genome sequencing projects, genetic maps and transcriptomes. In this work, we integrate all available genomic information of these species within a common framework, taking as reference the whole assembled genomes of turbot and tongue sole (>210× coverage). New insights related to the genetic basis of productive traits and new data useful to understand the evolutionary origin and diversification of this group were obtained. Despite a general 1:1 chromosome syntenic relationship between species, the comparison of turbot and tongue sole genomes showed huge intrachromosomic reorganizations. The integration of available mapping information supported specific chromosome fusions along flatfish evolution and facilitated the comparison between species of previously reported genetic associations for productive traits. When comparing transcriptomic resources of the six species, a common set of ~2500 othologues and ~150 common miRNAs were identified, and specific sets of putative missing genes were detected in flatfish transcriptomes, likely reflecting their evolutionary diversification. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. A public resource facilitating clinical use of genomes.

    PubMed

    Ball, Madeleine P; Thakuria, Joseph V; Zaranek, Alexander Wait; Clegg, Tom; Rosenbaum, Abraham M; Wu, Xiaodi; Angrist, Misha; Bhak, Jong; Bobe, Jason; Callow, Matthew J; Cano, Carlos; Chou, Michael F; Chung, Wendy K; Douglas, Shawn M; Estep, Preston W; Gore, Athurva; Hulick, Peter; Labarga, Alberto; Lee, Je-Hyuk; Lunshof, Jeantine E; Kim, Byung Chul; Kim, Jong-Il; Li, Zhe; Murray, Michael F; Nilsen, Geoffrey B; Peters, Brock A; Raman, Anugraha M; Rienhoff, Hugh Y; Robasky, Kimberly; Wheeler, Matthew T; Vandewege, Ward; Vorhaus, Daniel B; Yang, Joyce L; Yang, Luhan; Aach, John; Ashley, Euan A; Drmanac, Radoje; Kim, Seong-Jin; Li, Jin Billy; Peshkin, Leonid; Seidman, Christine E; Seo, Jeong-Sun; Zhang, Kun; Rehm, Heidi L; Church, George M

    2012-07-24

    Rapid advances in DNA sequencing promise to enable new diagnostics and individualized therapies. Achieving personalized medicine, however, will require extensive research on highly reidentifiable, integrated datasets of genomic and health information. To assist with this, participants in the Personal Genome Project choose to forgo privacy via our institutional review board- approved "open consent" process. The contribution of public data and samples facilitates both scientific discovery and standardization of methods. We present our findings after enrollment of more than 1,800 participants, including whole-genome sequencing of 10 pilot participant genomes (the PGP-10). We introduce the Genome-Environment-Trait Evidence (GET-Evidence) system. This tool automatically processes genomes and prioritizes both published and novel variants for interpretation. In the process of reviewing the presumed healthy PGP-10 genomes, we find numerous literature references implying serious disease. Although it is sometimes impossible to rule out a late-onset effect, stringent evidence requirements can address the high rate of incidental findings. To that end we develop a peer production system for recording and organizing variant evaluations according to standard evidence guidelines, creating a public forum for reaching consensus on interpretation of clinically relevant variants. Genome analysis becomes a two-step process: using a prioritized list to record variant evaluations, then automatically sorting reviewed variants using these annotations. Genome data, health and trait information, participant samples, and variant interpretations are all shared in the public domain-we invite others to review our results using our participant samples and contribute to our interpretations. We offer our public resource and methods to further personalized medical research.

  13. A public resource facilitating clinical use of genomes

    PubMed Central

    Ball, Madeleine P.; Thakuria, Joseph V.; Zaranek, Alexander Wait; Clegg, Tom; Rosenbaum, Abraham M.; Wu, Xiaodi; Angrist, Misha; Bhak, Jong; Bobe, Jason; Callow, Matthew J.; Cano, Carlos; Chou, Michael F.; Chung, Wendy K.; Douglas, Shawn M.; Estep, Preston W.; Gore, Athurva; Hulick, Peter; Labarga, Alberto; Lee, Je-Hyuk; Lunshof, Jeantine E.; Kim, Byung Chul; Kim, Jong-Il; Li, Zhe; Murray, Michael F.; Nilsen, Geoffrey B.; Peters, Brock A.; Raman, Anugraha M.; Rienhoff, Hugh Y.; Robasky, Kimberly; Wheeler, Matthew T.; Vandewege, Ward; Vorhaus, Daniel B.; Yang, Joyce L.; Yang, Luhan; Aach, John; Ashley, Euan A.; Drmanac, Radoje; Kim, Seong-Jin; Li, Jin Billy; Peshkin, Leonid; Seidman, Christine E.; Seo, Jeong-Sun; Zhang, Kun; Rehm, Heidi L.; Church, George M.

    2012-01-01

    Rapid advances in DNA sequencing promise to enable new diagnostics and individualized therapies. Achieving personalized medicine, however, will require extensive research on highly reidentifiable, integrated datasets of genomic and health information. To assist with this, participants in the Personal Genome Project choose to forgo privacy via our institutional review board- approved “open consent” process. The contribution of public data and samples facilitates both scientific discovery and standardization of methods. We present our findings after enrollment of more than 1,800 participants, including whole-genome sequencing of 10 pilot participant genomes (the PGP-10). We introduce the Genome-Environment-Trait Evidence (GET-Evidence) system. This tool automatically processes genomes and prioritizes both published and novel variants for interpretation. In the process of reviewing the presumed healthy PGP-10 genomes, we find numerous literature references implying serious disease. Although it is sometimes impossible to rule out a late-onset effect, stringent evidence requirements can address the high rate of incidental findings. To that end we develop a peer production system for recording and organizing variant evaluations according to standard evidence guidelines, creating a public forum for reaching consensus on interpretation of clinically relevant variants. Genome analysis becomes a two-step process: using a prioritized list to record variant evaluations, then automatically sorting reviewed variants using these annotations. Genome data, health and trait information, participant samples, and variant interpretations are all shared in the public domain—we invite others to review our results using our participant samples and contribute to our interpretations. We offer our public resource and methods to further personalized medical research. PMID:22797899

  14. GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research

    PubMed Central

    Jung, Sook; Jesudurai, Christopher; Staton, Margaret; Du, Zhidian; Ficklin, Stephen; Cho, Ilhyung; Abbott, Albert; Tomkins, Jeffrey; Main, Dorrie

    2004-01-01

    Background Peach is being developed as a model organism for Rosaceae, an economically important family that includes fruits and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. The genomics and genetics data of peach can play a significant role in the gene discovery and the genetic understanding of related species. The effective utilization of these peach resources, however, requires the development of an integrated and centralized database with associated analysis tools. Description The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database. GDR contains comprehensive data of the genetically anchored peach physical map, an annotated peach EST database, Rosaceae maps and markers and all publicly available Rosaceae sequences. Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position to the peach physical map where applicable. Our integrated map viewer provides graphical interface to the genetic, transcriptome and physical mapping information. ESTs, BACs and markers can be queried by various categories and the search result sites are linked to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users can compare their sequences with the annotated GDR sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm. To demonstrate the utility of the integrated and fully annotated database and analysis tools, we describe a case study where we anchored Rosaceae sequences to the peach physical and genetic map by sequence similarity. Conclusions The GDR has been initiated to meet the major deficiency in Rosaceae genomics and genetics research, namely a centralized web database and bioinformatics tools for data storage, analysis and exchange. GDR can be accessed at . PMID:15357877

  15. Ensembl core software resources: storage and programmatic access for DNA sequence and genome annotation

    PubMed Central

    Kähäri, Andreas; Komorowska, Monika; Keenan, Stephen; Laird, Matthew; Longden, Ian; Proctor, Glenn; Searle, Steve; Staines, Daniel; Taylor, Kieron; Vullo, Alessandro; Yates, Andrew; Zerbino, Daniel

    2017-01-01

    Abstract The Ensembl software resources are a stable infrastructure to store, access and manipulate genome assemblies and their functional annotations. The Ensembl ‘Core’ database and Application Programming Interface (API) was our first major piece of software infrastructure and remains at the centre of all of our genome resources. Since its initial design more than fifteen years ago, the number of publicly available genomic, transcriptomic and proteomic datasets has grown enormously, accelerated by continuous advances in DNA-sequencing technology. Initially intended to provide annotation for the reference human genome, we have extended our framework to support the genomes of all species as well as richer assembly models. Cross-referenced links to other informatics resources facilitate searching our database with a variety of popular identifiers such as UniProt and RefSeq. Our comprehensive and robust framework storing a large diversity of genome annotations in one location serves as a platform for other groups to generate and maintain their own tailored annotation. We welcome reuse and contributions: our databases and APIs are publicly available, all of our source code is released with a permissive Apache v2.0 licence at http://github.com/Ensembl and we have an active developer mailing list (http://www.ensembl.org/info/about/contact/index.html). Database URL: http://www.ensembl.org PMID:28365736

  16. CoryneBase: Corynebacterium Genomic Resources and Analysis Tools at Your Fingertips

    PubMed Central

    Tan, Mui Fern; Jakubovics, Nick S.; Wee, Wei Yee; Mutha, Naresh V. R.; Wong, Guat Jah; Ang, Mia Yang; Yazdi, Amir Hessam; Choo, Siew Woh

    2014-01-01

    Corynebacteria are used for a wide variety of industrial purposes but some species are associated with human diseases. With increasing number of corynebacterial genomes having been sequenced, comparative analysis of these strains may provide better understanding of their biology, phylogeny, virulence and taxonomy that may lead to the discoveries of beneficial industrial strains or contribute to better management of diseases. To facilitate the ongoing research of corynebacteria, a specialized central repository and analysis platform for the corynebacterial research community is needed to host the fast-growing amount of genomic data and facilitate the analysis of these data. Here we present CoryneBase, a genomic database for Corynebacterium with diverse functionality for the analysis of genomes aimed to provide: (1) annotated genome sequences of Corynebacterium where 165,918 coding sequences and 4,180 RNAs can be found in 27 species; (2) access to comprehensive Corynebacterium data through the use of advanced web technologies for interactive web interfaces; and (3) advanced bioinformatic analysis tools consisting of standard BLAST for homology search, VFDB BLAST for sequence homology search against the Virulence Factor Database (VFDB), Pairwise Genome Comparison (PGC) tool for comparative genomic analysis, and a newly designed Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomic analysis. CoryneBase offers the access of a range of Corynebacterium genomic resources as well as analysis tools for comparative genomics and pathogenomics. It is publicly available at http://corynebacterium.um.edu.my/. PMID:24466021

  17. Munich Information Center for Protein Sequences Plant Genome Resources. A Framework for Integrative and Comparative Analyses1[w

    PubMed Central

    Schoof, Heiko; Spannagl, Manuel; Yang, Li; Ernst, Rebecca; Gundlach, Heidrun; Haase, Dirk; Haberer, Georg; Mayer, Klaus F.X.

    2005-01-01

    With several plant genomes sequenced, the power of comparative genome analysis can now be applied. However, genome-scale cross-species analyses are limited by the effort for data integration. To develop an integrated cross-species plant genome resource, we maintain comprehensive databases for model plant genomes, including Arabidopsis (Arabidopsis thaliana), maize (Zea mays), Medicago truncatula, and rice (Oryza sativa). Integration of data and resources is emphasized, both in house as well as with external partners and databases. Manual curation and state-of-the-art bioinformatic analysis are combined to achieve quality data. Easy access to the data is provided through Web interfaces and visualization tools, bulk downloads, and Web services for application-level access. This allows a consistent view of the model plant genomes for comparative and evolutionary studies, the transfer of knowledge between species, and the integration with functional genomics data. PMID:16010004

  18. Draft Genome Sequence of Mentha longifolia and Development of Resources for Mint Cultivar Improvement.

    PubMed

    Vining, Kelly J; Johnson, Sean R; Ahkami, Amirhossein; Lange, Iris; Parrish, Amber N; Trapp, Susan C; Croteau, Rodney B; Straub, Shannon C K; Pandelova, Iovanna; Lange, B Markus

    2017-02-13

    The genus Mentha encompasses mint species cultivated for their essential oils, which are formulated into a vast array of consumer products. Desirable oil characteristics and resistance to the fungal disease Verticillium wilt are top priorities for the mint industry. However, cultivated mints have complex polyploid genomes and are sterile. Breeding efforts, therefore, require the development of genomic resources for fertile mint species. Here, we present draft de novo genome and plastome assemblies for a wilt-resistant South African accession of Mentha longifolia (L.) Huds., a diploid species ancestral to cultivated peppermint and spearmint. The 353 Mb genome contains 35 597 predicted protein-coding genes, including 292 disease resistance gene homologs, and nine genes determining essential oil characteristics. A genetic linkage map ordered 1397 genome scaffolds on 12 pseudochromosomes. More than two million simple sequence repeats were identified, which will facilitate molecular marker development. The M. longifolia genome is a valuable resource for both metabolic engineering and molecular breeding. This is exemplified by employing the genome sequence to clone and functionally characterize the promoters in a peppermint cultivar, and demonstrating the utility of a glandular trichome-specific promoter to increase expression of a biosynthetic gene, thereby modulating essential oil composition. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.

  19. A factor analysis model for functional genomics

    PubMed Central

    Kustra, Rafal; Shioda, Romy; Zhu, Mu

    2006-01-01

    Background Expression array data are used to predict biological functions of uncharacterized genes by comparing their expression profiles to those of characterized genes. While biologically plausible, this is both statistically and computationally challenging. Typical approaches are computationally expensive and ignore correlations among expression profiles and functional categories. Results We propose a factor analysis model (FAM) for functional genomics and give a two-step algorithm, using genome-wide expression data for yeast and a subset of Gene-Ontology Biological Process functional annotations. We show that the predictive performance of our method is comparable to the current best approach while our total computation time was faster by a factor of 4000. We discuss the unique challenges in performance evaluation of algorithms used for genome-wide functions genomics. Finally, we discuss extensions to our method that can incorporate the inherent correlation structure of the functional categories to further improve predictive performance. Conclusion Our factor analysis model is a computationally efficient technique for functional genomics and provides a clear and unified statistical framework with potential for incorporating important gene ontology information to improve predictions. PMID:16630343

  20. Knocking down the obstacles to functional genomics data sharing.

    PubMed

    Simpson, Kaylene J; Smith, Jennifer A

    2017-03-01

    This week, Scientific Data published a collection of eight papers that describe datasets from high-throughput functional genomics screens, primarily utilizing RNA interference (RNAi). The publications explore host-pathogen dependencies, innate immune response, disease pathways, and cell morphology and motility at the genome-level. All data, including raw images from the high content screens, are publically available in PubChem BioAssay, figshare, Harvard Dataverse or the Image Data Resource (IDR). Detailed data descriptors enable use of these data for analysis algorithm design, machine learning, data comparisons, as well as generating new scientific hypotheses.

  1. Knocking down the obstacles to functional genomics data sharing

    PubMed Central

    Simpson, Kaylene J.; Smith, Jennifer A.

    2017-01-01

    This week, Scientific Data published a collection of eight papers that describe datasets from high-throughput functional genomics screens, primarily utilizing RNA interference (RNAi). The publications explore host-pathogen dependencies, innate immune response, disease pathways, and cell morphology and motility at the genome-level. All data, including raw images from the high content screens, are publically available in PubChem BioAssay, figshare, Harvard Dataverse or the Image Data Resource (IDR). Detailed data descriptors enable use of these data for analysis algorithm design, machine learning, data comparisons, as well as generating new scientific hypotheses. PMID:28248922

  2. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    Treesearch

    Shannon C.K. Straub; Mark Fishbein; Tatyana Livshult; Zachary Foster; Matthew Parks; Kevin Weitemier; Richard C. Cronn; Aaron. Liston

    2011-01-01

    Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and compliment these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in...

  3. Searching the Mouse Genome Informatics (MGI) Resources for Information on Mouse Biology from Genotype to Phenotype.

    PubMed

    Shaw, David R

    2016-12-08

    The Mouse Genome Informatics (MGI) resource provides the research community with access to information on the genetics, genomics, and biology of the laboratory mouse. Core data in MGI include gene characterization and function, phenotype and disease model descriptions, DNA and protein sequence data, gene expression data, vertebrate homologies, SNPs, mapping data, and links to other bioinformatics databases. Semantic integration is supported through the use of standardized nomenclature, and through the use of controlled vocabularies such as the mouse Anatomical Dictionary, the Mammalian Phenotype Ontology, and the Gene Ontologies. MGI extracts and organizes data from primary literature. MGI data are shared with and widely displayed from other bioinformatics resources. The database is updated weekly with curated annotations, and regularly adds new datasets and features. This unit provides a guide to using the MGI bioinformatics resource. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.

  4. A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy

    PubMed Central

    Brenton, Zachary W.; Cooper, Elizabeth A.; Myers, Mathew T.; Boyles, Richard E.; Shakoor, Nadia; Zielinski, Kelsey J.; Rauh, Bradley L.; Bridges, William C.; Morris, Geoffrey P.; Kresovich, Stephen

    2016-01-01

    With high productivity and stress tolerance, numerous grass genera of the Andropogoneae have emerged as candidates for bioenergy production. To optimize these candidates, research examining the genetic architecture of yield, carbon partitioning, and composition is required to advance breeding objectives. Significant progress has been made developing genetic and genomic resources for Andropogoneae, and advances in comparative and computational genomics have enabled research examining the genetic basis of photosynthesis, carbon partitioning, composition, and sink strength. To provide a pivotal resource aimed at developing a comparative understanding of key bioenergy traits in the Andropogoneae, we have established and characterized an association panel of 390 racially, geographically, and phenotypically diverse Sorghum bicolor accessions with 232,303 genetic markers. Sorghum bicolor was selected because of its genomic simplicity, phenotypic diversity, significant genomic tools, and its agricultural productivity and resilience. We have demonstrated the value of sorghum as a functional model for candidate gene discovery for bioenergy Andropogoneae by performing genome-wide association analysis for two contrasting phenotypes representing key components of structural and non-structural carbohydrates. We identified potential genes, including a cellulase enzyme and a vacuolar transporter, associated with increased non-structural carbohydrates that could lead to bioenergy sorghum improvement. Although our analysis identified genes with potentially clear functions, other candidates did not have assigned functions, suggesting novel molecular mechanisms for carbon partitioning traits. These results, combined with our characterization of phenotypic and genetic diversity and the public accessibility of each accession and genomic data, demonstrate the value of this resource and provide a foundation for future improvement of sorghum and related grasses for bioenergy production

  5. A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy.

    PubMed

    Brenton, Zachary W; Cooper, Elizabeth A; Myers, Mathew T; Boyles, Richard E; Shakoor, Nadia; Zielinski, Kelsey J; Rauh, Bradley L; Bridges, William C; Morris, Geoffrey P; Kresovich, Stephen

    2016-09-01

    With high productivity and stress tolerance, numerous grass genera of the Andropogoneae have emerged as candidates for bioenergy production. To optimize these candidates, research examining the genetic architecture of yield, carbon partitioning, and composition is required to advance breeding objectives. Significant progress has been made developing genetic and genomic resources for Andropogoneae, and advances in comparative and computational genomics have enabled research examining the genetic basis of photosynthesis, carbon partitioning, composition, and sink strength. To provide a pivotal resource aimed at developing a comparative understanding of key bioenergy traits in the Andropogoneae, we have established and characterized an association panel of 390 racially, geographically, and phenotypically diverse Sorghum bicolor accessions with 232,303 genetic markers. Sorghum bicolor was selected because of its genomic simplicity, phenotypic diversity, significant genomic tools, and its agricultural productivity and resilience. We have demonstrated the value of sorghum as a functional model for candidate gene discovery for bioenergy Andropogoneae by performing genome-wide association analysis for two contrasting phenotypes representing key components of structural and non-structural carbohydrates. We identified potential genes, including a cellulase enzyme and a vacuolar transporter, associated with increased non-structural carbohydrates that could lead to bioenergy sorghum improvement. Although our analysis identified genes with potentially clear functions, other candidates did not have assigned functions, suggesting novel molecular mechanisms for carbon partitioning traits. These results, combined with our characterization of phenotypic and genetic diversity and the public accessibility of each accession and genomic data, demonstrate the value of this resource and provide a foundation for future improvement of sorghum and related grasses for bioenergy production.

  6. Genetic Transformation and Genomic Resources for Next-Generation Precise Genome Engineering in Vegetable Crops

    PubMed Central

    Cardi, Teodoro; D’Agostino, Nunzio; Tripodi, Pasquale

    2017-01-01

    In the frame of modern agriculture facing the predicted increase of population and general environmental changes, the securement of high quality food remains a major challenge to deal with. Vegetable crops include a large number of species, characterized by multiple geographical origins, large genetic variability and diverse reproductive features. Due to their nutritional value, they have an important place in human diet. In recent years, many crop genomes have been sequenced permitting the identification of genes and superior alleles associated with desirable traits. Furthermore, innovative biotechnological approaches allow to take a step forward towards the development of new improved cultivars harboring precise genome modifications. Sequence-based knowledge coupled with advanced biotechnologies is supporting the widespread application of new plant breeding techniques to enhance the success in modification and transfer of useful alleles into target varieties. Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 system, zinc-finger nucleases, and transcription activator-like effector nucleases represent the main methods available for plant genome engineering through targeted modifications. Such technologies, however, require efficient transformation protocols as well as extensive genomic resources and accurate knowledge before they can be efficiently exploited in practical breeding programs. In this review, we revise the state of the art in relation to availability of such scientific and technological resources in various groups of vegetables, describe genome editing results obtained so far and discuss the implications for future applications. PMID:28275380

  7. Genetic Transformation and Genomic Resources for Next-Generation Precise Genome Engineering in Vegetable Crops.

    PubMed

    Cardi, Teodoro; D'Agostino, Nunzio; Tripodi, Pasquale

    2017-01-01

    In the frame of modern agriculture facing the predicted increase of population and general environmental changes, the securement of high quality food remains a major challenge to deal with. Vegetable crops include a large number of species, characterized by multiple geographical origins, large genetic variability and diverse reproductive features. Due to their nutritional value, they have an important place in human diet. In recent years, many crop genomes have been sequenced permitting the identification of genes and superior alleles associated with desirable traits. Furthermore, innovative biotechnological approaches allow to take a step forward towards the development of new improved cultivars harboring precise genome modifications. Sequence-based knowledge coupled with advanced biotechnologies is supporting the widespread application of new plant breeding techniques to enhance the success in modification and transfer of useful alleles into target varieties. Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 system, zinc-finger nucleases, and transcription activator-like effector nucleases represent the main methods available for plant genome engineering through targeted modifications. Such technologies, however, require efficient transformation protocols as well as extensive genomic resources and accurate knowledge before they can be efficiently exploited in practical breeding programs. In this review, we revise the state of the art in relation to availability of such scientific and technological resources in various groups of vegetables, describe genome editing results obtained so far and discuss the implications for future applications.

  8. Functional innovations of three chronological mesohexaploid Brassica rapa genomes.

    PubMed

    Kim, Jungeun; Lee, Jeongyeo; Choi, Jae-Pil; Park, Inkyu; Yang, Kyungbong; Kim, Min Keun; Lee, Young Han; Nou, Ill-Sup; Kim, Dae-Soo; Min, Sung Ran; Park, Sang Un; Kim, HyeRan

    2014-07-18

    other possible events including nonfunctionalization and neofunctionalization. We refined Brassicaceae GB information using the latest genomic resources, and distinguished three chronologically ordered B. rapa genomes. B. rapa genes were categorized into fast evolving, single- and multi-retention genes, and long retention genes by their substitution rates and retention patterns. Representative functions of the categorized genes were elucidated, providing better understanding of B. rapa evolution and the Brassica genus.

  9. An evolutionary classification of genomic function.

    PubMed

    Graur, Dan; Zheng, Yichen; Azevedo, Ricardo B R

    2015-01-28

    The pronouncements of the ENCODE Project Consortium regarding "junk DNA" exposed the need for an evolutionary classification of genomic elements according to their selected-effect function. In the classification scheme presented here, we divide the genome into "functional DNA," that is, DNA sequences that have a selected-effect function, and "rubbish DNA," that is, sequences that do not. Functional DNA is further subdivided into "literal DNA" and "indifferent DNA." In literal DNA, the order of nucleotides is under selection; in indifferent DNA, only the presence or absence of the sequence is under selection. Rubbish DNA is further subdivided into "junk DNA" and "garbage DNA." Junk DNA neither contributes to nor detracts from the fitness of the organism and, hence, evolves under selective neutrality. Garbage DNA, on the other hand, decreases the fitness of its carriers. Garbage DNA exists in the genome only because natural selection is neither omnipotent nor instantaneous. Each of these four functional categories can be 1) transcribed and translated, 2) transcribed but not translated, or 3) not transcribed. The affiliation of a DNA segment to a particular functional category may change during evolution: Functional DNA may become junk DNA, junk DNA may become garbage DNA, rubbish DNA may become functional DNA, and so on; however, determining the functionality or nonfunctionality of a genomic sequence must be based on its present status rather than on its potential to change (or not to change) in the future. Changes in functional affiliation are divided into pseudogenes, Lazarus DNA, zombie DNA, and Jekyll-to-Hyde DNA.

  10. An Evolutionary Classification of Genomic Function

    PubMed Central

    Graur, Dan; Zheng, Yichen; Azevedo, Ricardo B.R.

    2015-01-01

    The pronouncements of the ENCODE Project Consortium regarding “junk DNA” exposed the need for an evolutionary classification of genomic elements according to their selected-effect function. In the classification scheme presented here, we divide the genome into “functional DNA,” that is, DNA sequences that have a selected-effect function, and “rubbish DNA,” that is, sequences that do not. Functional DNA is further subdivided into “literal DNA” and “indifferent DNA.” In literal DNA, the order of nucleotides is under selection; in indifferent DNA, only the presence or absence of the sequence is under selection. Rubbish DNA is further subdivided into “junk DNA” and “garbage DNA.” Junk DNA neither contributes to nor detracts from the fitness of the organism and, hence, evolves under selective neutrality. Garbage DNA, on the other hand, decreases the fitness of its carriers. Garbage DNA exists in the genome only because natural selection is neither omnipotent nor instantaneous. Each of these four functional categories can be 1) transcribed and translated, 2) transcribed but not translated, or 3) not transcribed. The affiliation of a DNA segment to a particular functional category may change during evolution: Functional DNA may become junk DNA, junk DNA may become garbage DNA, rubbish DNA may become functional DNA, and so on; however, determining the functionality or nonfunctionality of a genomic sequence must be based on its present status rather than on its potential to change (or not to change) in the future. Changes in functional affiliation are divided into pseudogenes, Lazarus DNA, zombie DNA, and Jekyll-to-Hyde DNA. PMID:25635041

  11. InsectBase: a resource for insect genomes and transcriptomes.

    PubMed

    Yin, Chuanlin; Shen, Gengyu; Guo, Dianhao; Wang, Shuping; Ma, Xingzhou; Xiao, Huamei; Liu, Jinding; Zhang, Zan; Liu, Ying; Zhang, Yiqun; Yu, Kaixiang; Huang, Shuiqing; Li, Fei

    2016-01-04

    The genomes and transcriptomes of hundreds of insects have been sequenced. However, insect community lacks an integrated, up-to-date collection of insect gene data. Here, we introduce the first release of InsectBase, available online at http://www.insect-genome.com. The database encompasses 138 insect genomes, 116 insect transcriptomes, 61 insect gene sets, 36 gene families of 60 insects, 7544 miRNAs of 69 insects, 96,925 piRNAs of Drosophila melanogaster and Chilo suppressalis, 2439 lncRNA of Nilaparvata lugens, 22,536 pathways of 78 insects, 678,881 untranslated regions (UTR) of 84 insects and 160,905 coding sequences (CDS) of 70 insects. This release contains over 12 million sequences and provides search functionality, a BLAST server, GBrowse, insect pathway construction, a Facebook-like network for the insect community (iFacebook), and phylogenetic analysis of selected genes.

  12. InsectBase: a resource for insect genomes and transcriptomes

    PubMed Central

    Yin, Chuanlin; Shen, Gengyu; Guo, Dianhao; Wang, Shuping; Ma, Xingzhou; Xiao, Huamei; Liu, Jinding; Zhang, Zan; Liu, Ying; Zhang, Yiqun; Yu, Kaixiang; Huang, Shuiqing; Li, Fei

    2016-01-01

    The genomes and transcriptomes of hundreds of insects have been sequenced. However, insect community lacks an integrated, up-to-date collection of insect gene data. Here, we introduce the first release of InsectBase, available online at http://www.insect-genome.com. The database encompasses 138 insect genomes, 116 insect transcriptomes, 61 insect gene sets, 36 gene families of 60 insects, 7544 miRNAs of 69 insects, 96 925 piRNAs of Drosophila melanogaster and Chilo suppressalis, 2439 lncRNA of Nilaparvata lugens, 22 536 pathways of 78 insects, 678 881 untranslated regions (UTR) of 84 insects and 160 905 coding sequences (CDS) of 70 insects. This release contains over 12 million sequences and provides search functionality, a BLAST server, GBrowse, insect pathway construction, a Facebook-like network for the insect community (iFacebook), and phylogenetic analysis of selected genes. PMID:26578584

  13. CyanoClust: comparative genome resources of cyanobacteria and plastids.

    PubMed

    Sasaki, Naobumi V; Sato, Naoki

    2010-01-01

    Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases, such as Clusters of Orthologous Proteins (COG). CyanoClust is a database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. We have developed a web-server system for the protein homology database featuring cyanobacteria and plastids. Database URL: http://cyanoclust.c.u-tokyo.ac.jp/.

  14. Whole Genome Sequencing of Elite Rice Cultivars as a Comprehensive Information Resource for Marker Assisted Selection

    PubMed Central

    Duitama, Jorge; Silva, Alexander; Sanabria, Yamid; Cruz, Daniel Felipe; Quintero, Constanza; Ballen, Carolina; Lorieux, Mathias; Scheffler, Brian; Farmer, Andrew; Torres, Edgar; Oard, James; Tohme, Joe

    2015-01-01

    Current advances in sequencing technologies and bioinformatics revealed the genomic background of rice, a staple food for the poor people, and provided the basis to develop large genomic variation databases for thousands of cultivars. Proper analysis of this massive resource is expected to give novel insights into the structure, function, and evolution of the rice genome, and to aid the development of rice varieties through marker assisted selection or genomic selection. In this work we present sequencing and bioinformatics analyses of 104 rice varieties belonging to the major subspecies of Oryza sativa. We identified repetitive elements and recurrent copy number variation covering about 200 Mbp of the rice genome. Genotyping of over 18 million polymorphic locations within O. sativa allowed us to reconstruct the individual haplotype patterns shaping the genomic background of elite varieties used by farmers throughout the Americas. Based on a reconstruction of the alleles for the gene GBSSI, we could identify novel genetic markers for selection of varieties with high amylose content. We expect that both the analysis methods and the genomic information described here would be of great use for the rice research community and for other groups carrying on similar sequencing efforts in other crops. PMID:25923345

  15. Development of genomic resources for Nothofagus species using next-generation sequencing data.

    PubMed

    El Mujtar, V A; Gallo, L A; Lang, T; Garnier-Géré, P

    2014-11-01

    Using next-generation sequencing, we developed the first whole-genome resources for two hybridizing Nothofagus species of the Patagonian forests that crucially lack genomic data, despite their ecological and industrial value. A de novo assembly strategy combining base quality control and optimization of the putative chloroplast gene map yielded ~32,000 contigs from 43% of the reads produced. With 12.5% of assembled reads, we covered ~96% of the chloroplast genome and ~70% of the mitochondrial gene content, providing functional and structural annotations for 112 and 52 genes, respectively. Functional annotation was possible on 15% of the contigs, with ~1750 potentially novel nuclear genes identified for Nothofagus species. We estimated that the new resources (13.41 Mb in total) included ~4000 gene regions representing ~6.5% of the expected genic partition of the genome, the remaining contigs potentially being nongenic DNA. A high-quality single nucleotide polymorphisms resource was developed by comparing various filtering methods, and preliminary results indicate a strong conservation of cpDNA genomes in contrast to numerous exclusive nuclear polymorphisms in both species. Finally, we characterized 2274 potential simple sequence repeat (SSR) loci, designed primers for 769 of them and validated nine of 29 loci in 42 individuals per species. Nothofagus obliqua had more alleles (4.89) on average than N. nervosa (2.89), 8 SSRs were efficient to discriminate species, and three were successfully transferred in three other Nothofagus species. These resources will greatly help for future inferences of demographic, adaptive and hybridizing events in Nothofagus species, and for conserving and managing natural populations. © 2014 John Wiley & Sons Ltd.

  16. Fluorescence-based resource for semi-automated genomic analyses

    SciTech Connect

    Kiser, M.B.; Dragwa, C.; Jedlicka, A.E.

    1994-09-01

    To facilitate the practical application of highly efficient semi-automated methods for general application in genomic analyses, we have developed a fluorescence-based marker resource. Ninety highly polymorphic simple tandem repeat markers were combined to provide a rapid, accurate, and highly efficient initial genome-wide screening system. These markers are spaced on average every 33 recombination units, with a mean heterozygosity of 81% (range 65-94%), covering 22 autosomes and the X and Y chromosomes. Less than 3% of the genome lies beyond 30 cM of the nearest marker. Markers were placed in a vertical ladder that we have termed a SET according to the size of the PCR fragments they produce during electrophoresis. Each SET was designed to avoid overlap between loci during gel separations to assure accuracy when scoring genotypes. We have constructed 15 SETS of markers. Three SETS, each labelled with one of three fluors, were combined into what we have termed a GROUP, which is co-electrophoresed with internal size standards that are labelled with a fourth flour. Five GROUPS of markers were assembled that contain a total of 15 SETS of markers. Each GROUP cover 18 regions of the genome that can be detected simultaneously, since this genomic analysis system is fully compatible with automated fragment analyzers using simultaneous four-color fluorescence-based detection systems. This allows for multiplex detection and a throughput of 1,944 genotypes daily per instrument. This system will be highly beneficial in a number of clinical and research applications including: linkage, cancer genetics, forensics, and cytogenetics.

  17. Open chromatin reveals the functional maize genome

    USDA-ARS?s Scientific Manuscript database

    Every cellular process mediated through nuclear DNA must contend with chromatin. As results from ENCODE show, open chromatin assays can efficiently integrate across diverse regulatory elements, revealing functional non-coding genome. In this study, we use a MNase hypersensitivity assay to discover o...

  18. Learning about the Human Genome. Part 2: Resources for Science Educators. ERIC Digest.

    ERIC Educational Resources Information Center

    Haury, David L.

    This ERIC Digest identifies how the human genome project fits into the "National Science Education Standards" and lists Human Genome Project Web sites found on the World Wide Web. It is a resource companion to "Learning about the Human Genome. Part 1: Challenge to Science Educators" (Haury 2001). The Web resources and…

  19. Functional Genomics of Drought Tolerance in Bioenergy Crops

    SciTech Connect

    Yin, Hengfu; Chen, Rick; Yang, Jun; Weston, David; Chen, Jay; Muchero, Wellington; Ye, Ning; Tschaplinski, Timothy J; Wullschleger, Stan D; Cheng, Zong-Ming; Tuskan, Gerald A; Yang, Xiaohan

    2014-01-01

    With the predicted trends in climate change, drought will increasingly impose a grand challenge to biomass production. Most of the bioenergy crops have some degree of drought susceptibility with low water-use efficiency (WUE). It is imperative to improve drought tolerance and WUE in bioenergy crops for sustainable biomass production in arid and semi-arid regions with minimal water input. Genetics and functional genomics can play a critical role in generating knowledge to inform and aid genetic improvement of drought tolerance in bioenergy crops. The molecular aspect of drought response has been extensively investigated in model plants like Arabidopsis, yet our understanding of the molecular mechanisms underlying drought tolerance in bioenergy crops are limited. Crops exhibit various responses to drought stress depending on species and genotype. A rational strategy for studying drought tolerance in bioenergy crops is to translate the knowledge from model plants and pinpoint the unique features associated with individual species and genotypes. In this review, we summarize the general knowledge about drought responsive pathways in plants, with a focus on the identification of commonality and specialty in drought responsive mechanisms among different species and/or genotypes. We describe the genomic resources developed for bioenergy crops and discuss genetic and epigenetic regulation of drought responses. We also examine comparative and evolutionary genomics to leverage the ever-increasing genomics resources and provide new insights beyond what has been known from studies on individual species. Finally, we outline future exploration of drought tolerance using the emerging new technologies.

  20. RNAi for functional genomics in plants.

    PubMed

    McGinnis, Karen M

    2010-03-01

    RNAi refers to several different types of gene silencing mediated by small, dsRNA molecules. Over the course of 20 years, the scientific understanding of RNAi has developed from the initial observation of unexpected expression patterns to a sophisticated understanding of a multi-faceted, evolutionarily conserved network of mechanisms that regulate gene expression in many organisms. It has also been developed as a genetic tool that can be exploited in a wide range of species. Because transgene-induced RNAi has been effective at silencing one or more genes in a wide range of plants, this technology also bears potential as a powerful functional genomics tool across the plant kingdom. Transgene-induced RNAi has indeed been shown to be an effective mechanism for silencing many genes in many organisms, but the results from multiple projects which attempted to exploit RNAi on a genome-wide scale suggest that there is a great deal of variation in the silencing efficacy between transgenic events, silencing targets and silencing-induced phenotype. The results from these projects indicate several important variables that should be considered in experimental design prior to the initiation of functional genomics efforts based on RNAi silencing. In recent years, alternative strategies have been developed for targeted gene silencing, and a combination of approaches may also enhance the use of targeted gene silencing for functional genomics.

  1. A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome.

    PubMed

    Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

    2012-06-15

    The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development.

  2. Development of peanut expessed sequence tag-based genomic resources and tools

    USDA-ARS?s Scientific Manuscript database

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  3. Ascribing Functions to Genes: Journey Towards Genetic Improvement of Rice Via Functional Genomics

    PubMed Central

    Mustafiz, Ananda; Kumari, Sumita; Karan, Ratna

    2016-01-01

    Rice, one of the most important cereal crops for mankind, feeds more than half the world population. Rice has been heralded as a model cereal owing to its small genome size, amenability to easy transformation, high synteny to other cereal crops and availability of complete genome sequence. Moreover, sequence wealth in rice is getting more refined and precise due to resequencing efforts. This humungous resource of sequence data has confronted research fraternity with a herculean challenge as well as an excellent opportunity to functionally validate expressed as well as regulatory portions of the genome. This will not only help us in understanding the genetic basis of plant architecture and physiology but would also steer us towards developing improved cultivars. No single technique can achieve such a mammoth task. Functional genomics through its diverse tools viz. loss and gain of function mutants, multifarious omics strategies like transcriptomics, proteomics, metabolomics and phenomics provide us with the necessary handle. A paradigm shift in technological advances in functional genomics strategies has been instrumental in generating considerable amount of information w.r.t functionality of rice genome. We now have several databases and online resources for functionally validated genes but despite that we are far from reaching the desired milestone of functionally characterizing each and every rice gene. There is an urgent need for a common platform, for information already available in rice, and collaborative efforts between researchers in a concerted manner as well as healthy public-private partnership, for genetic improvement of rice crop better able to handle the pressures of climate change and exponentially increasing population. PMID:27252584

  4. Ascribing Functions to Genes: Journey Towards Genetic Improvement of Rice Via Functional Genomics.

    PubMed

    Mustafiz, Ananda; Kumari, Sumita; Karan, Ratna

    2016-06-01

    Rice, one of the most important cereal crops for mankind, feeds more than half the world population. Rice has been heralded as a model cereal owing to its small genome size, amenability to easy transformation, high synteny to other cereal crops and availability of complete genome sequence. Moreover, sequence wealth in rice is getting more refined and precise due to resequencing efforts. This humungous resource of sequence data has confronted research fraternity with a herculean challenge as well as an excellent opportunity to functionally validate expressed as well as regulatory portions of the genome. This will not only help us in understanding the genetic basis of plant architecture and physiology but would also steer us towards developing improved cultivars. No single technique can achieve such a mammoth task. Functional genomics through its diverse tools viz. loss and gain of function mutants, multifarious omics strategies like transcriptomics, proteomics, metabolomics and phenomics provide us with the necessary handle. A paradigm shift in technological advances in functional genomics strategies has been instrumental in generating considerable amount of information w.r.t functionality of rice genome. We now have several databases and online resources for functionally validated genes but despite that we are far from reaching the desired milestone of functionally characterizing each and every rice gene. There is an urgent need for a common platform, for information already available in rice, and collaborative efforts between researchers in a concerted manner as well as healthy public-private partnership, for genetic improvement of rice crop better able to handle the pressures of climate change and exponentially increasing population.

  5. Genomic resources for sea lice: analysis of ESTs and mitochondrial genomes.

    PubMed

    Yasuike, Motoshige; Leong, Jong; Jantzen, Stuart G; von Schalburg, Kristian R; Nilsen, Frank; Jones, Simon R M; Koop, Ben F

    2012-04-01

    Sea lice are common parasites of both farmed and wild salmon. Salmon farming constitutes an important economic market in North America, South America, and Northern Europe. Infections with sea lice can result in significant production losses. A compilation of genomic information on different genera of sea lice is an important resource for understanding their biology as well as for the study of population genetics and control strategies. We report on over 150,000 expressed sequence tags (ESTs) from five different species (Pacific Lepeophtheirus salmonis (49,672 new ESTs in addition to 14,994 previously reported ESTs), Atlantic L. salmonis (57,349 ESTs), Caligus clemensi (14,821 ESTs), Caligus rogercresseyi (32,135 ESTs), and Lernaeocera branchialis (16,441 ESTs)). For each species, ESTs were assembled into complete or partial genes and annotated by comparisons to known proteins in public databases. In addition, whole mitochondrial (mt) genome sequences of C. clemensi (13,440 bp) and C. rogercresseyi (13,468 bp) were determined and compared to L. salmonis. Both nuclear and mtDNA genes show very high levels of sequence divergence between these ectoparastic copepods suggesting that the different species of sea lice have been in existence for 37-113 million years and that parasitic association with salmonids is also quite ancient. Our ESTs and mtDNA data provide a novel resource for the study of sea louse biology, population genetics, and control strategies. This genomic information provides the material basis for the development of a 38K sea louse microarray that can be used in conjunction with our existing 44K salmon microarray to study host-parasite interactions at the molecular level. This report represents the largest genomic resource for any copepod species to date.

  6. Mouse Genome Database (MGD)-2017: community knowledge resource for the laboratory mouse

    PubMed Central

    Blake, Judith A.; Eppig, Janan T.; Kadin, James A.; Richardson, Joel E.; Smith, Cynthia L.; Bult, Carol J.

    2017-01-01

    The Mouse Genome Database (MGD: http://www.informatics.jax.org) is the primary community data resource for the laboratory mouse. It provides a highly integrated and highly curated system offering a comprehensive view of current knowledge about mouse genes, genetic markers and genomic features as well as the associations of those features with sequence, phenotypes, functional and comparative information, and their relationships to human diseases. MGD continues to enhance access to these data, to extend the scope of data content and visualizations, and to provide infrastructure and user support that ensures effective and efficient use of MGD in the advancement of scientific knowledge. Here, we report on recent enhancements made to the resource and new features. PMID:27899570

  7. Utilizing Genomics to Study Entomopathogenicity in the Fungal Phylum Entomophthoromycota: A Review of Current Genetic Resources.

    PubMed

    De Fine Licht, H H; Hajek, A E; Eilenberg, J; Jensen, A B

    2016-01-01

    The order Entomophthorales, which formerly contained c.280 species, has recently been recognized as a separate phylum, Entomophthoromycota, consisting of three recognized classes and six families. Many genera in this group contain obligate insect-pathogenic species with narrow host ranges, capable of producing epizootics in natural insect populations. Available sequence information from the phylum Entomophthoromycota can be classified into three main categories: first, partial gene regions (exons+introns) used for phylogenetic inference; second, protein coding gene regions obtained using degenerate primers, expressed sequence tag methodology or de novo transcriptome sequencing with molecular function inferred by homology analysis; and third, primarily forthcoming whole-genome sequencing data sets. Here we summarize the current genetic resources for Entomophthoromycota and identify research areas that are likely to be significantly advanced from the availability of new whole-genome resources.

  8. A New System for Comparative Functional Genomics of Saccharomyces Yeasts

    PubMed Central

    Caudy, Amy A.; Guan, Yuanfang; Jia, Yue; Hansen, Christina; DeSevo, Chris; Hayes, Alicia P.; Agee, Joy; Alvarez-Dominguez, Juan R.; Arellano, Hugo; Barrett, Daniel; Bauerle, Cynthia; Bisaria, Namita; Bradley, Patrick H.; Breunig, J. Scott; Bush, Erin; Cappel, David; Capra, Emily; Chen, Walter; Clore, John; Combs, Peter A.; Doucette, Christopher; Demuren, Olukunle; Fellowes, Peter; Freeman, Sam; Frenkel, Evgeni; Gadala-Maria, Daniel; Gawande, Richa; Glass, David; Grossberg, Samuel; Gupta, Anita; Hammonds-Odie, Latanya; Hoisos, Aaron; Hsi, Jenny; Hsu, Yu-Han Huang; Inukai, Sachi; Karczewski, Konrad J.; Ke, Xiaobo; Kojima, Mina; Leachman, Samuel; Lieber, Danny; Liebowitz, Anna; Liu, Julia; Liu, Yufei; Martin, Trevor; Mena, Jose; Mendoza, Rosa; Myhrvold, Cameron; Millian, Christian; Pfau, Sarah; Raj, Sandeep; Rich, Matt; Rokicki, Joe; Rounds, William; Salazar, Michael; Salesi, Matthew; Sharma, Rajani; Silverman, Sanford; Singer, Cara; Sinha, Sandhya; Staller, Max; Stern, Philip; Tang, Hanlin; Weeks, Sharon; Weidmann, Maxwell; Wolf, Ashley; Young, Carmen; Yuan, Jie; Crutchfield, Christopher; McClean, Megan; Murphy, Coleen T.; Llinás, Manuel; Botstein, David; Troyanskaya, Olga G.; Dunham, Maitreya J.

    2013-01-01

    Whole-genome sequencing, particularly in fungi, has progressed at a tremendous rate. More difficult, however, is experimental testing of the inferences about gene function that can be drawn from comparative sequence analysis alone. We present a genome-wide functional characterization of a sequenced but experimentally understudied budding yeast, Saccharomyces bayanus var. uvarum (henceforth referred to as S. bayanus), allowing us to map changes over the 20 million years that separate this organism from S. cerevisiae. We first created a suite of genetic tools to facilitate work in S. bayanus. Next, we measured the gene-expression response of S. bayanus to a diverse set of perturbations optimized using a computational approach to cover a diverse array of functionally relevant biological responses. The resulting data set reveals that gene-expression patterns are largely conserved, but significant changes may exist in regulatory networks such as carbohydrate utilization and meiosis. In addition to regulatory changes, our approach identified gene functions that have diverged. The functions of genes in core pathways are highly conserved, but we observed many changes in which genes are involved in osmotic stress, peroxisome biogenesis, and autophagy. A surprising number of genes specific to S. bayanus respond to oxidative stress, suggesting the organism may have evolved under different selection pressures than S. cerevisiae. This work expands the scope of genome-scale evolutionary studies from sequence-based analysis to rapid experimental characterization and could be adopted for functional mapping in any lineage of interest. Furthermore, our detailed characterization of S. bayanus provides a valuable resource for comparative functional genomics studies in yeast. PMID:23852385

  9. A new system for comparative functional genomics of Saccharomyces yeasts.

    PubMed

    Caudy, Amy A; Guan, Yuanfang; Jia, Yue; Hansen, Christina; DeSevo, Chris; Hayes, Alicia P; Agee, Joy; Alvarez-Dominguez, Juan R; Arellano, Hugo; Barrett, Daniel; Bauerle, Cynthia; Bisaria, Namita; Bradley, Patrick H; Breunig, J Scott; Bush, Erin; Cappel, David; Capra, Emily; Chen, Walter; Clore, John; Combs, Peter A; Doucette, Christopher; Demuren, Olukunle; Fellowes, Peter; Freeman, Sam; Frenkel, Evgeni; Gadala-Maria, Daniel; Gawande, Richa; Glass, David; Grossberg, Samuel; Gupta, Anita; Hammonds-Odie, Latanya; Hoisos, Aaron; Hsi, Jenny; Hsu, Yu-Han Huang; Inukai, Sachi; Karczewski, Konrad J; Ke, Xiaobo; Kojima, Mina; Leachman, Samuel; Lieber, Danny; Liebowitz, Anna; Liu, Julia; Liu, Yufei; Martin, Trevor; Mena, Jose; Mendoza, Rosa; Myhrvold, Cameron; Millian, Christian; Pfau, Sarah; Raj, Sandeep; Rich, Matt; Rokicki, Joe; Rounds, William; Salazar, Michael; Salesi, Matthew; Sharma, Rajani; Silverman, Sanford; Singer, Cara; Sinha, Sandhya; Staller, Max; Stern, Philip; Tang, Hanlin; Weeks, Sharon; Weidmann, Maxwell; Wolf, Ashley; Young, Carmen; Yuan, Jie; Crutchfield, Christopher; McClean, Megan; Murphy, Coleen T; Llinás, Manuel; Botstein, David; Troyanskaya, Olga G; Dunham, Maitreya J

    2013-09-01

    Whole-genome sequencing, particularly in fungi, has progressed at a tremendous rate. More difficult, however, is experimental testing of the inferences about gene function that can be drawn from comparative sequence analysis alone. We present a genome-wide functional characterization of a sequenced but experimentally understudied budding yeast, Saccharomyces bayanus var. uvarum (henceforth referred to as S. bayanus), allowing us to map changes over the 20 million years that separate this organism from S. cerevisiae. We first created a suite of genetic tools to facilitate work in S. bayanus. Next, we measured the gene-expression response of S. bayanus to a diverse set of perturbations optimized using a computational approach to cover a diverse array of functionally relevant biological responses. The resulting data set reveals that gene-expression patterns are largely conserved, but significant changes may exist in regulatory networks such as carbohydrate utilization and meiosis. In addition to regulatory changes, our approach identified gene functions that have diverged. The functions of genes in core pathways are highly conserved, but we observed many changes in which genes are involved in osmotic stress, peroxisome biogenesis, and autophagy. A surprising number of genes specific to S. bayanus respond to oxidative stress, suggesting the organism may have evolved under different selection pressures than S. cerevisiae. This work expands the scope of genome-scale evolutionary studies from sequence-based analysis to rapid experimental characterization and could be adopted for functional mapping in any lineage of interest. Furthermore, our detailed characterization of S. bayanus provides a valuable resource for comparative functional genomics studies in yeast.

  10. Applying functional genomics research to the study of pig reproduction.

    PubMed

    Pomp, D; Caetano, A R; Bertani, G R; Gladney, C D; Johnson, R K

    2001-01-01

    Functional genomics is an experimental approach that incorporates genome-wide or system-wide experimentation, expanding the scope of biological investigation from studying single genes to studying potentially all genes at once in a systematic manner. This technology is highly appealing because of its high throughput and relatively low cost. Furthermore, analysis of gene expression using microarrays is likely to be more biologically relevant than the conventional paradigm of reductionism, because it has the potential to uncover new biological connections between genes and biochemical pathways. However, functional genomics is still in its infancy, especially with regard to the study of pig reproduction. Currently, efforts are centred on developing the necessary resources to enable high throughput evaluation and comparison of gene expression. However, it is clear that in the near future functional genomics will be applied on a large scale to study the biology and physiology of reproduction in pigs, and to understand better the complex nature of genetic control over polygenic characteristics, such as ovulation rate and litter size. We can look forward to generating a significant amount of new data on differences in gene expression between genotypes, treatments, or at various temporal and spatial coordinates within a variety of reproductively relevant systems. Along with this capability will be the challenge of collating, analysing and interpreting datasets that are orders of magnitude more extensive and complex than those currently used. Furthermore, integration of functional genomics with traditional genetic approaches and with detailed analysis of the proteome and relevant whole animal phenotypes will be required to make full use of this powerful new experimental paradigm as a beneficial research tool.

  11. openSNP–A Crowdsourced Web Resource for Personal Genomics

    PubMed Central

    Greshake, Bastian; Bayer, Philipp E.; Rausch, Helge; Reda, Julia

    2014-01-01

    Genome-Wide Association Studies are widely used to correlate phenotypic traits with genetic variants. These studies usually compare the genetic variation between two groups to single out certain Single Nucleotide Polymorphisms (SNPs) that are linked to a phenotypic variation in one of the groups. However, it is necessary to have a large enough sample size to find statistically significant correlations. Direct-To-Consumer (DTC) genetic testing can supply additional data: DTC-companies offer the analysis of a large amount of SNPs for an individual at low cost without the need to consult a physician or geneticist. Over 100,000 people have already been genotyped through Direct-To-Consumer genetic testing companies. However, this data is not public for a variety of reasons and thus cannot be used in research. It seems reasonable to create a central open data repository for such data. Here we present the web platform openSNP, an open database which allows participants of Direct-To-Consumer genetic testing to publish their genetic data at no cost along with phenotypic information. Through this crowdsourced effort of collecting genetic and phenotypic information, openSNP has become a resource for a wide area of studies, including Genome-Wide Association Studies. openSNP is hosted at http://www.opensnp.org, and the code is released under MIT-license at http://github.com/gedankenstuecke/snpr. PMID:24647222

  12. Orchidstra: an integrated orchid functional genomics database.

    PubMed

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-02-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species.

  13. Valuation of ecological resources and functions

    SciTech Connect

    Scott, M.J.; Bilyard, G.R.; Link, S.O.

    1998-01-01

    Ecological resources are natural resources that provide certain necessary but overlooked system maintenance functions within ecosystems. Environmental economics is in search of an appropriate analysis framework to determine economic values of such resources, This paper presents a framework that estimates and compiles the components of value for a natural ecosystem. The framework begins with the ecological processes involved, which provide functions within the ecosystem and services valued by humans. We discuss the additive or competive nature of these values, and estimate these values through conventional and unconventional techniques. We apply the framework to ecological resources in a shrub-steppe dryland habitat being displaced by development. We first determine which functions and services are mutually exclusive (e.g., farming vs soil stabilization) and which are complementary or products of joint production (e.g., soil stabilization and maintenance of species). We then apply benefit transfer principles with contingent valuation methodology (CVM), travel cost methodology (TCM), and hedonic damage pricing (HDP). Finally, we derive upper-limit values for more difficult-to-value functions through the use of human analogs, which we argue are the most appropriate method of valuation under some circumstances. The highest values of natural shrub-steppe habitat appear to be derived from soil stabilization. 59 refs., 6 figs., 5 tabs.

  14. RSIADB, a collective resource for genome and transcriptome analyses in Rhizoctonia solani AG1 IA.

    PubMed

    Chen, Lei; Ai, Peng; Zhang, Jinfeng; Deng, Qiming; Wang, Shiquan; Li, Shuangcheng; Zhu, Jun; Li, Ping; Zheng, Aiping

    2016-01-01

    Rice [Oryza sativa (L.)] feeds more than half of the world's population. Rhizoctonia solaniis a major fungal pathogen of rice causing extreme crop losses in all rice-growing regions of the world. R. solani AG1 IA is a major cause of sheath blight in rice. In this study, we constructed a comprehensive and user-friendly web-based database, RSIADB, to analyse its draft genome and transcriptome. The database was built using the genome sequence (10,489 genes) and annotation information for R. solani AG1 IA. A total of six RNAseq samples of R. solani AG1 IA were also analysed, corresponding to 10, 18, 24, 32, 48 and 72 h after infection of rice leaves. The RSIADB database enables users to search, browse, and download gene sequences for R. solani AG1 IA, and mine the data using BLAST, Sequence Extractor, Browse and Construction Diagram tools that were integrated into the database. RSIADB is an important genomic resource for scientists working with R. solani AG1 IA and will assist researchers in analysing the annotated genome and transcriptome of this pathogen. This resource will facilitate studies on gene function, pathogenesis factors and secreted proteins, as well as provide an avenue for comparative analyses of genes expressed during different stages of infection. Database URL:http://genedenovoweb.ticp.net:81/rsia/index.php.

  15. Functional genomic screening to enhance oncolytic virotherapy.

    PubMed

    Mahoney, D J; Stojdl, D F

    2013-02-05

    Functional genomic screening has emerged as a powerful approach for understanding complex biological phenomena. Of the available tools, genome-wide RNA interference (RNAi) technology is unquestionably the most incisive, as it directly probes gene function. Recent applications of RNAi screening have been impressive. Notable amongst these are its use in elucidated mechanism(s) for signal transduction, various aspects of cell biology, tumourigenesis and metastasis, resistance to cancer therapeutics, and the host's response to a pathogen. Herein we discuss how recent RNAi screening efforts have helped turn our attention to the targetability of non-oncogene support pathways for cancer treatment, with a particular focus on a recent study that identified a non-oncogene addiction to the ER stress response as a synergist target for oncolytic virus therapy (OVT). Moreover, we give our thoughts on the future of RNAi screening as a tool to enhance OVT and describe recent technical improvements that are poised to make genome-scale RNAi experiments more sensitive, less noisy, more applicable in vivo, and more easily validated in clinically relevant animal models.

  16. MicroScope in 2017: an expanding and evolving integrated resource for community expertise of microbial genomes

    PubMed Central

    Vallenet, David; Calteau, Alexandra; Cruveiller, Stéphane; Gachet, Mathieu; Lajus, Aurélie; Josso, Adrien; Mercier, Jonathan; Renaux, Alexandre; Rollin, Johan; Rouy, Zoe; Roche, David; Scarpelli, Claude; Médigue, Claudine

    2017-01-01

    The annotation of genomes from NGS platforms needs to be automated and fully integrated. However, maintaining consistency and accuracy in genome annotation is a challenging problem because millions of protein database entries are not assigned reliable functions. This shortcoming limits the knowledge that can be extracted from genomes and metabolic models. Launched in 2005, the MicroScope platform (http://www.genoscope.cns.fr/agc/microscope) is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis. Effective comparative analysis requires a consistent and complete view of biological data, and therefore, support for reviewing the quality of functional annotation is critical. MicroScope allows users to analyze microbial (meta)genomes together with post-genomic experiment results if any (i.e. transcriptomics, re-sequencing of evolved strains, mutant collections, phenotype data). It combines tools and graphical interfaces to analyze genomes and to perform the expert curation of gene functions in a comparative context. Starting with a short overview of the MicroScope system, this paper focuses on some major improvements of the Web interface, mainly for the submission of genomic data and on original tools and pipelines that have been developed and integrated in the platform: computation of pan-genomes and prediction of biosynthetic gene clusters. Today the resource contains data for more than 6000 microbial genomes, and among the 2700 personal accounts (65% of which are now from foreign countries), 14% of the users are performing expert annotations, on at least a weekly basis, contributing to improve the quality of microbial genome annotations. PMID:27899624

  17. MicroScope in 2017: an expanding and evolving integrated resource for community expertise of microbial genomes.

    PubMed

    Vallenet, David; Calteau, Alexandra; Cruveiller, Stéphane; Gachet, Mathieu; Lajus, Aurélie; Josso, Adrien; Mercier, Jonathan; Renaux, Alexandre; Rollin, Johan; Rouy, Zoe; Roche, David; Scarpelli, Claude; Médigue, Claudine

    2017-01-04

    The annotation of genomes from NGS platforms needs to be automated and fully integrated. However, maintaining consistency and accuracy in genome annotation is a challenging problem because millions of protein database entries are not assigned reliable functions. This shortcoming limits the knowledge that can be extracted from genomes and metabolic models. Launched in 2005, the MicroScope platform (http://www.genoscope.cns.fr/agc/microscope) is an integrative resource that supports systematic and efficient revision of microbial genome annotation, data management and comparative analysis. Effective comparative analysis requires a consistent and complete view of biological data, and therefore, support for reviewing the quality of functional annotation is critical. MicroScope allows users to analyze microbial (meta)genomes together with post-genomic experiment results if any (i.e. transcriptomics, re-sequencing of evolved strains, mutant collections, phenotype data). It combines tools and graphical interfaces to analyze genomes and to perform the expert curation of gene functions in a comparative context. Starting with a short overview of the MicroScope system, this paper focuses on some major improvements of the Web interface, mainly for the submission of genomic data and on original tools and pipelines that have been developed and integrated in the platform: computation of pan-genomes and prediction of biosynthetic gene clusters. Today the resource contains data for more than 6000 microbial genomes, and among the 2700 personal accounts (65% of which are now from foreign countries), 14% of the users are performing expert annotations, on at least a weekly basis, contributing to improve the quality of microbial genome annotations.

  18. The chickpea genomic web resource: visualization and analysis of the desi-type Cicer arietinum nuclear genome for comparative exploration of legumes.

    PubMed

    Misra, Gopal; Priya, Piyush; Bandhiwal, Nitesh; Bareja, Neha; Jain, Mukesh; Bhatia, Sabhyata; Chattopadhyay, Debasis; Tyagi, Akhilesh K; Yadav, Gitanjali

    2014-12-18

    Availability of the draft nuclear genome sequences of small-seeded desi-type legume crop Cicer arietinum has provided an opportunity for investigating unique chickpea genomic features and evaluation of their biological significance. The increasing number of legume genome sequences also presents a challenge for developing reliable and information-driven bioinformatics applications suitable for comparative exploration of this important class of crop plants. The Chickpea Genomic Web Resource (CGWR) is an implementation of a suite of web-based applications dedicated to chickpea genome visualization and comparative analysis, based on next generation sequencing and assembly of Cicer arietinum desi-type genotype ICC4958. CGWR has been designed and configured for mapping, scanning and browsing the significant chickpea genomic features in view of the important existing and potential roles played by the various legume genome projects in mutant mapping and cloning. It also enables comparative informatics of ICC4958 DNA sequence analysis with other wild and cultivated genotypes of chickpea, various other leguminous species as well as several non-leguminous model plants, to enable investigations into evolutionary processes that shape legume genomes. CGWR is an online database offering a comprehensive visual and functional genomic analysis of the chickpea genome, along with customized maps and gene-clustering options. It is also the only plant based web resource supporting display and analysis of nucleosome positioning patterns in the genome. The usefulness of CGWR has been demonstrated with discoveries of biological significance made using this server. The CGWR is compatible with all available operating systems and browsers, and is available freely under the open source license at http://www.nipgr.res.in/CGWR/home.php.

  19. WormBase: a multi-species resource for nematode biology and genomics

    PubMed Central

    Harris, Todd W.; Chen, Nansheng; Cunningham, Fiona; Tello-Ruiz, Marcela; Antoshechkin, Igor; Bastiani, Carol; Bieri, Tamberlyn; Blasiar, Darin; Bradnam, Keith; Chan, Juancarlos; Chen, Chao-Kung; Chen, Wen J.; Davis, Paul; Kenny, Eimear; Kishore, Ranjana; Lawson, Daniel; Lee, Raymond; Muller, Hans-Michael; Nakamura, Cecilia; Ozersky, Philip; Petcherski, Andrei; Rogers, Anthony; Sabo, Aniko; Schwarz, Erich M.; Van Auken, Kimberly; Wang, Qinghua; Durbin, Richard; Spieth, John; Sternberg, Paul W.; Stein, Lincoln D.

    2004-01-01

    WormBase (http://www.wormbase.org/) is the central data repository for information about Caenorhabditis elegans and related nematodes. As a model organism database, WormBase extends beyond the genomic sequence, integrating experimental results with extensively annotated views of the genome. The WormBase Consortium continues to expand the biological scope and utility of WormBase with the inclusion of large-scale genomic analyses, through active data and literature curation, through new analysis and visualization tools, and through refinement of the user interface. Over the past year, the nearly complete genomic sequence and comparative analyses of the closely related species Caenorhabditis briggsae have been integrated into WormBase, including gene predictions, ortholog assignments and a new synteny viewer to display the relationships between the two species. Extensive site-wide refinement of the user interface now provides quick access to the most frequently accessed resources and a consistent browsing experience across the site. Unified single-page views now provide complete summaries of commonly accessed entries like genes. These advances continue to increase the utility of WormBase for C.elegans researchers, as well as for those researchers exploring problems in functional and comparative genomics in the context of a powerful genetic system. PMID:14681445

  20. Soybean Knowledge Base (SoyKB): a Web Resource for Soybean Translational Genomics

    SciTech Connect

    Joshi, Trupti; Patil, Kapil; Fitzpatrick, Michael R.; Franklin, Levi D.; Yao, Qiuming; Cook, Jeffrey R.; Wang, Zhem; Libault, Marc; Brechenmacher, Laurent; Valliyodan, Babu; Wu, Xiaolei; Cheng, Jianlin; Stacey, Gary; Nguyen, Henry T.; Xu, Dong

    2012-01-17

    Background: Soybean Knowledge Base (SoyKB) is a comprehensive all-inclusive web resource for soybean translational genomics. SoyKB is designed to handle the management and integration of soybean genomics, transcriptomics, proteomics and metabolomics data along with annotation of gene function and biological pathway. It contains information on four entities, namely genes, microRNAs, metabolites and single nucleotide polymorphisms (SNPs). Methods: SoyKB has many useful tools such as Affymetrix probe ID search, gene family search, multiple gene/ metabolite search supporting co-expression analysis, and protein 3D structure viewer as well as download and upload capacity for experimental data and annotations. It has four tiers of registration, which control different levels of access to public and private data. It allows users of certain levels to share their expertise by adding comments to the data. It has a user-friendly web interface together with genome browser and pathway viewer, which display data in an intuitive manner to the soybean researchers, producers and consumers. Conclusions: SoyKB addresses the increasing need of the soybean research community to have a one-stop-shop functional and translational omics web resource for information retrieval and analysis in a user-friendly way. SoyKB can be publicly accessed at http://soykb.org/.

  1. Human-mouse comparative genomics: successes and failures to reveal functional regions of the human genome

    SciTech Connect

    Pennacchio, Len A.; Baroukh, Nadine; Rubin, Edward M.

    2003-05-15

    Deciphering the genetic code embedded within the human genome remains a significant challenge despite the human genome consortium's recent success at defining its linear sequence (Lander et al. 2001; Venter et al. 2001). While useful strategies exist to identify a large percentage of protein encoding regions, efforts to accurately define functional sequences in the remaining {approx}97 percent of the genome lag. Our primary interest has been to utilize the evolutionary relationship and the universal nature of genomic sequence information in vertebrates to reveal functional elements in the human genome. This has been achieved through the combined use of vertebrate comparative genomics to pinpoint highly conserved sequences as candidates for biological activity and transgenic mouse studies to address the functionality of defined human DNA fragments. Accordingly, we describe strategies and insights into functional sequences in the human genome through the use of comparative genomics coupled wit h functional studies in the mouse.

  2. Microbial Genome Analysis and Comparisons: Web-based Protocols and Resources

    USDA-ARS?s Scientific Manuscript database

    Fully annotated genome sequences of many microorganisms are publicly available as a resource. However, in-depth analysis of these genomes using specialized tools is required to derive meaningful information. We describe here the utility of three powerful publicly available genome databases and ana...

  3. Exploring the human genome with functional maps

    PubMed Central

    Huttenhower, Curtis; Haley, Erin M.; Hibbs, Matthew A.; Dumeaux, Vanessa; Barrett, Daniel R.; Coller, Hilary A.; Troyanskaya, Olga G.

    2009-01-01

    Human genomic data of many types are readily available, but the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it from a systems level, and apply it to the study of specific pathways or genetic disorders. An investigator could best explore a particular protein, pathway, or disease if given a functional map summarizing the data and interactions most relevant to his or her area of interest. Using a regularized Bayesian integration system, we provide maps of functional activity and interaction networks in over 200 areas of human cellular biology, each including information from ∼30,000 genome-scale experiments pertaining to ∼25,000 human genes. Key to these analyses is the ability to efficiently summarize this large data collection from a variety of biologically informative perspectives: prediction of protein function and functional modules, cross-talk among biological processes, and association of novel genes and pathways with known genetic disorders. In addition to providing maps of each of these areas, we also identify biological processes active in each data set. Experimental investigation of five specific genes, AP3B1, ATP6AP1, BLOC1S1, LAMP2, and RAB11A, has confirmed novel roles for these proteins in the proper initiation of macroautophagy in amino acid-starved human fibroblasts. Our functional maps can be explored using HEFalMp (Human Experimental/Functional Mapper), a web interface allowing interactive visualization and investigation of this large body of information. PMID:19246570

  4. Current and future resources for functional metagenomics.

    PubMed

    Lam, Kathy N; Cheng, Jiujun; Engel, Katja; Neufeld, Josh D; Charles, Trevor C

    2015-01-01

    Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries-physical libraries that contain DNA cloned from environmental metagenomes. The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research.

  5. flyDIVaS: A Comparative Genomics Resource for Drosophila Divergence and Selection

    PubMed Central

    Stanley, Craig E.; Kulathinal, Rob J.

    2016-01-01

    With arguably the best finished and expertly annotated genome assembly, Drosophila melanogaster is a formidable genetics model to study all aspects of biology. Nearly a decade ago, the 12 Drosophila genomes project expanded D. melanogaster’s breadth as a comparative model through the community-development of an unprecedented genus- and genome-wide comparative resource. However, since its inception, these datasets for evolutionary inference and biological discovery have become increasingly outdated, outmoded, and inaccessible. Here, we provide an updated and upgradable comparative genomics resource of Drosophila divergence and selection, flyDIVaS, based on the latest genomic assemblies, curated FlyBase annotations, and recent OrthoDB orthology calls. flyDIVaS is an online database containing D. melanogaster-centric orthologous gene sets, CDS and protein alignments, divergence statistics (% gaps, dN, dS, dN/dS), and codon-based tests of positive Darwinian selection. Out of 13,920 protein-coding D. melanogaster genes, ∼80% have one aligned ortholog in the closely related species, D. simulans, and ∼50% have 1–1 12-way alignments in the original 12 sequenced species that span over 80 million yr of divergence. Genes and their orthologs can be chosen from four different taxonomic datasets differing in phylogenetic depth and coverage density, and visualized via interactive alignments and phylogenetic trees. Users can also batch download entire comparative datasets. A functional survey finds conserved mitotic and neural genes, highly diverged immune and reproduction-related genes, more conspicuous signals of divergence across tissue-specific genes, and an enrichment of positive selection among highly diverged genes. flyDIVaS will be regularly updated and can be freely accessed at www.flydivas.info. We encourage researchers to regularly use this resource as a tool for biological inference and discovery, and in their classrooms to help train the next generation of

  6. Mouse Genome Informatics (MGI): Resources for Mining Mouse Genetic, Genomic, and Biological Data in Support of Primary and Translational Research.

    PubMed

    Eppig, Janan T; Smith, Cynthia L; Blake, Judith A; Ringwald, Martin; Kadin, James A; Richardson, Joel E; Bult, Carol J

    2017-01-01

    The Mouse Genome Informatics (MGI), resource ( www.informatics.jax.org ) has existed for over 25 years, and over this time its data content, informatics infrastructure, and user interfaces and tools have undergone dramatic changes (Eppig et al., Mamm Genome 26:272-284, 2015). Change has been driven by scientific methodological advances, rapid improvements in computational software, growth in computer hardware capacity, and the ongoing collaborative nature of the mouse genomics community in building resources and sharing data. Here we present an overview of the current data content of MGI, describe its general organization, and provide examples using simple and complex searches, and tools for mining and retrieving sets of data.

  7. A cricket Gene Index: a genomic resource for studying neurobiology, speciation, and molecular evolution

    PubMed Central

    Danley, Patrick D; Mullen, Sean P; Liu, Fenglong; Nene, Vishvanath; Quackenbush, John; Shaw, Kerry L

    2007-01-01

    Background As the developmental costs of genomic tools decline, genomic approaches to non-model systems are becoming more feasible. Many of these systems may lack advanced genetic tools but are extremely valuable models in other biological fields. Here we report the development of expressed sequence tags (EST's) in an orthopteroid insect, a model for the study of neurobiology, speciation, and evolution. Results We report the sequencing of 14,502 EST's from clones derived from a nerve cord cDNA library, and the subsequent construction of a Gene Index from these sequences, from the Hawaiian trigonidiine cricket Laupala kohalensis. The Gene Index contains 8607 unique sequences comprised of 2575 tentative consensus (TC) sequences and 6032 singletons. For each of the unique sequences, an attempt was made to assign a provisional annotation and to categorize its function using a Gene Ontology-based classification through a sequence-based comparison to known proteins. In addition, a set of unique 70 base pair oligomers that can be used for DNA microarrays was developed. All Gene Index information is posted at the DFCI Gene Indices web page Conclusion Orthopterans are models used to understand the neurophysiological basis of complex motor patterns such as flight and stridulation. The sequences presented in the cricket Gene Index will provide neurophysiologists with many genetic tools that have been largely absent in this field. The cricket Gene Index is one of only two gene indices to be developed in an evolutionary model system. Species within the genus Laupala have speciated recently, rapidly, and extensively. Therefore, the genes identified in the cricket Gene Index can be used to study the genomics of speciation. Furthermore, this gene index represents a significant EST resources for basal insects. As such, this resource is a valuable comparative tool for the understanding of invertebrate molecular evolution. The sequences presented here will provide much needed genomic

  8. The Culture Collection of Algae and Protozoa (CCAP): a biological resource for protistan genomics.

    PubMed

    Gachon, Claire M M; Day, John G; Campbell, Christine N; Pröschold, Thomas; Saxon, Rachel J; Küpper, Frithjof C

    2007-12-30

    CCAP, the largest European protistan culture collection, is based at the Scottish Association for Marine Science near Oban, Scotland (http://www.ccap.ac.uk). The Collection comprises more than 2700 strains in the public domain, of which 1050 are marine algae, 1300 freshwater algae, and 350 protozoa. The primary mission of CCAP is to maintain and distribute defined cultures and their associated information to its customers. It also has a support and advisory function on all aspects of protistan science. In addition, it is involved in the training of students and researchers in algal identification and culture techniques. In light of the increasing number of fully sequenced protists, the CCAP is striving to provide targeted services and support to workers involved in all aspects of genomic research. At present, the Collection holds several hundred strains of genomic model taxa including: Acanthamoeba, Cafeteria, Cercomonas, Chlamydomonas, Chlorella, Cyanophora, Dictyostelium, Dunaliella, Ectocarpus, Emiliania, Euglena, Micromonas, Naegleria, Nephroselmis, Paramecium, Pavlova, Phaeodactylum, Porphyra, Pseudendoclonium, Pylaiella, Rhodomonas, Scenedesmus, Staurastrum, Tetrahymena, Thalassiosira, Volvox and Zygnema. These strains provide a defined representation of natural variation within model organisms, an increasingly useful resource for post-genomics approaches. Our aim over the next 2-5 years is to add value to the Collection by increasing the number of genome model species, and by offering an integrated, up-to-date, easy-to-use resource that would provide curated information on our strain holdings. In collaboration with other major Biological Resource Centres worldwide, we intend to build a hub providing access to both protistan cultures and their associated bioinformatics data.

  9. A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome

    PubMed Central

    Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

    2012-01-01

    The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development. PMID:22508961

  10. The Aspergillus Genome Database (AspGD): recent developments in comprehensive multispecies curation, comparative genomics and community resources.

    PubMed

    Arnaud, Martha B; Cerqueira, Gustavo C; Inglis, Diane O; Skrzypek, Marek S; Binkley, Jonathan; Chibucos, Marcus C; Crabtree, Jonathan; Howarth, Clinton; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Binkley, Gail; Miyasato, Stuart R; Simison, Matt; Sherlock, Gavin; Wortman, Jennifer R

    2012-01-01

    The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available, web-based resource for researchers studying fungi of the genus Aspergillus, which includes organisms of clinical, agricultural and industrial importance. AspGD curators have now completed comprehensive review of the entire published literature about Aspergillus nidulans and Aspergillus fumigatus, and this annotation is provided with streamlined, ortholog-based navigation of the multispecies information. AspGD facilitates comparative genomics by providing a full-featured genomics viewer, as well as matched and standardized sets of genomic information for the sequenced aspergilli. AspGD also provides resources to foster interaction and dissemination of community information and resources. We welcome and encourage feedback at aspergillus-curator@lists.stanford.edu.

  11. From bacterial genome to functionality; case bifidobacteria.

    PubMed

    Ventura, Marco; O'Connell-Motherway, Mary; Leahy, Sinead; Moreno-Munoz, Jose Antonio; Fitzgerald, Gerald F; van Sinderen, Douwe

    2007-11-30

    The availability of complete bacterial genome sequences has significantly furthered our understanding of the genetics, physiology and biochemistry of the microorganisms in question, particularly those that have commercially important applications. Bifidobacteria are among such microorganisms, as they constitute mammalian commensals of biotechnological significance due to their perceived role in maintaining a balanced gastrointestinal (GIT) microflora. Bifidobacteria are therefore frequently used as health-promoting or probiotic components in functional food products. A fundamental understanding of the metabolic activities employed by these commensal bacteria, in particular their capability to utilize a wide range of complex oligosaccharides, can reveal ways to provide in vivo growth advantages relative to other competing gut bacteria or pathogens. Furthermore, an in depth analysis of adaptive responses to nutritional or environmental stresses may provide methodologies to retain viability and improve functionality during commercial preparation, storage and delivery of the probiotic organism.

  12. The Aspergillus Genome Database, a curated comparative genomics resource for gene, protein and sequence information for the Aspergillus research community.

    PubMed

    Arnaud, Martha B; Chibucos, Marcus C; Costanzo, Maria C; Crabtree, Jonathan; Inglis, Diane O; Lotia, Adil; Orvis, Joshua; Shah, Prachi; Skrzypek, Marek S; Binkley, Gail; Miyasato, Stuart R; Wortman, Jennifer R; Sherlock, Gavin

    2010-01-01

    The Aspergillus Genome Database (AspGD) is an online genomics resource for researchers studying the genetics and molecular biology of the Aspergilli. AspGD combines high-quality manual curation of the experimental scientific literature examining the genetics and molecular biology of Aspergilli, cutting-edge comparative genomics approaches to iteratively refine and improve structural gene annotations across multiple Aspergillus species, and web-based research tools for accessing and exploring the data. All of these data are freely available at http://www.aspgd.org. We welcome feedback from users and the research community at aspergillus-curator@genome.stanford.edu.

  13. A genome-wide resource for the analysis of protein localisation in Drosophila.

    PubMed

    Sarov, Mihail; Barz, Christiane; Jambor, Helena; Hein, Marco Y; Schmied, Christopher; Suchold, Dana; Stender, Bettina; Janosch, Stephan; K J, Vinay Vikas; Krishnan, R T; Krishnamoorthy, Aishwarya; Ferreira, Irene R S; Ejsmont, Radoslaw K; Finkl, Katja; Hasse, Susanne; Kämpfer, Philipp; Plewka, Nicole; Vinis, Elisabeth; Schloissnig, Siegfried; Knust, Elisabeth; Hartenstein, Volker; Mann, Matthias; Ramaswami, Mani; VijayRaghavan, K; Tomancak, Pavel; Schnorrer, Frank

    2016-02-20

    The Drosophila genome contains >13000 protein-coding genes, the majority of which remain poorly investigated. Important reasons include the lack of antibodies or reporter constructs to visualise these proteins. Here, we present a genome-wide fosmid library of 10000 GFP-tagged clones, comprising tagged genes and most of their regulatory information. For 880 tagged proteins, we created transgenic lines, and for a total of 207 lines, we assessed protein expression and localisation in ovaries, embryos, pupae or adults by stainings and live imaging approaches. Importantly, we visualised many proteins at endogenous expression levels and found a large fraction of them localising to subcellular compartments. By applying genetic complementation tests, we estimate that about two-thirds of the tagged proteins are functional. Moreover, these tagged proteins enable interaction proteomics from developing pupae and adult flies. Taken together, this resource will boost systematic analysis of protein expression and localisation in various cellular and developmental contexts.

  14. Selfish drive can trump function when animal mitochondrial genomes compete

    PubMed Central

    Ma, Hansong; O’Farrell, Patrick H.

    2016-01-01

    Mitochondrial genomes compete for transmission from mother to progeny. We explored this competition by introducing a second genome into Drosophila melanogaster to follow transmission. Competitions between closely related genomes favored those functional in electron transport, resulting in a host-beneficial purifying selection1. Contrastingly, matchups between distant genomes often favored those with negligible, negative or lethal consequences, indicating selfish selection. Exhibiting powerful selfish selection, a genome carrying a detrimental mutation displaced a complementing genome leading to population death after several generations. In a different pairing, opposing selfish and purifying selection counterbalanced to give stable transmission of two genomes. Sequencing of recombinant mitochondrial genomes revealed that the non-coding region, containing origins of replication, governs selfish transmission. Uniparental inheritance prevents encounters between distantly related genomes. Nonetheless, within each maternal lineage, constant competition among sibling genomes selects for super-replicators. We suggest that this relentless competition drives positive selection promoting change in the sequences influencing transmission. PMID:27270106

  15. Selfish drive can trump function when animal mitochondrial genomes compete.

    PubMed

    Ma, Hansong; O'Farrell, Patrick H

    2016-07-01

    Mitochondrial genomes compete for transmission from mother to progeny. We explored this competition by introducing a second genome into Drosophila melanogaster to follow transmission. Competitions between closely related genomes favored those functional in electron transport, resulting in a host-beneficial purifying selection. In contrast, matchups between distantly related genomes often favored those with negligible, negative or lethal consequences, indicating selfish selection. Exhibiting powerful selfish selection, a genome carrying a detrimental mutation displaced a complementing genome, leading to population death after several generations. In a different pairing, opposing selfish and purifying selection counterbalanced to give stable transmission of two genomes. Sequencing of recombinant mitochondrial genomes showed that the noncoding region, containing origins of replication, governs selfish transmission. Uniparental inheritance prevents encounters between distantly related genomes. Nonetheless, in each maternal lineage, constant competition among sibling genomes selects for super-replicators. We suggest that this relentless competition drives positive selection, promoting change in the sequences influencing transmission.

  16. Longitudinal Relationships Between Resources, Motivation, and Functioning

    PubMed Central

    Emery, Lisa; Neupert, Shevaun D.

    2012-01-01

    Objectives. We investigated how fluctuations and linear changes in health and cognitive resources influence the motivation to engage in complex cognitive activity and the extent to which motivation mediated the relationship between changing resources and cognitively demanding activities. Method. Longitudinal data from 332 adults aged 20–85 years were examined. Motivation was assessed using a composite of Need for Cognition and Personal Need for Structure and additional measures of health, sensory functioning, cognitive ability, and self-reported activity engagement. Results. Multilevel modeling revealed that age-typical changes in health, sensory functions, and ability were associated with changes in motivation, with the impact of declining health on motivation being particularly strong in older adulthood. Changes in motivation, in turn, predicted involvement in cognitive and social activities as well as changes in cognitive ability. Finally, motivation was observed to partially mediate the relationship between changes in resources and cognitively demanding activities. Discussion. Our results suggest that motivation may play an important role in determining the course of cognitive change and involvement in cognitively demanding everyday activities in adulthood. PMID:21926400

  17. CucCAP - Developing genomic resources for the cucurbit community

    USDA-ARS?s Scientific Manuscript database

    The U.S. cucurbit community has initiated a USDA-SCRI funded cucurbit genomics project, CucCAP: Leveraging applied genomics to increase disease resistance in cucurbit crops. Our primary objectives are: develop genomic and bioinformatic breeding tool kits for accelerated crop improvement across the...

  18. The Human OligoGenome Resource: a database of oligonucleotide capture probes for resequencing target regions across the human genome.

    PubMed

    Newburger, Daniel E; Natsoulis, Georges; Grimes, Sue; Bell, John M; Davis, Ronald W; Batzoglou, Serafim; Ji, Hanlee P

    2012-01-01

    Recent exponential growth in the throughput of next-generation DNA sequencing platforms has dramatically spurred the use of accessible and scalable targeted resequencing approaches. This includes candidate region diagnostic resequencing and novel variant validation from whole genome or exome sequencing analysis. We have previously demonstrated that selective genomic circularization is a robust in-solution approach for capturing and resequencing thousands of target human genome loci such as exons and regulatory sequences. To facilitate the design and production of customized capture assays for any given region in the human genome, we developed the Human OligoGenome Resource (http://oligogenome.stanford.edu/). This online database contains over 21 million capture oligonucleotide sequences. It enables one to create customized and highly multiplexed resequencing assays of target regions across the human genome and is not restricted to coding regions. In total, this resource provides 92.1% in silico coverage of the human genome. The online server allows researchers to download a complete repository of oligonucleotide probes and design customized capture assays to target multiple regions throughout the human genome. The website has query tools for selecting and evaluating capture oligonucleotides from specified genomic regions.

  19. Gramene: A Resource for Comparative Analysis of Plants Genomes and Pathways.

    PubMed

    Tello-Ruiz, Marcela Karey; Stein, Joshua; Wei, Sharon; Youens-Clark, Ken; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene is an integrated informatics resource for accessing, visualizing, and comparing plant genomes and biological pathways. Originally targeting grasses, Gramene has grown to host annotations for economically important and research model crops, including wheat, potato, tomato, banana, grape, poplar, and Chlamydomonas. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. This chapter outlines system requirements for end users and database hosting, data types and basic navigation within Gramene, and provides examples of how to (1) view a phylogenetic tree for a family of transcription factors, (2) explore genetic variation in the orthologues of a gene with a known trait association, and (3) upload, visualize, and privately share end user data into a new genome browser track.Moreover, this is the first publication describing Gramene's new web interface-intended to provide a simplified portal to the most complete and up-to-date set of plant genome and pathway annotations.

  20. MIPS Arabidopsis thaliana Database (MAtDB): an integrated biological knowledge resource for plant genomics.

    PubMed

    Schoof, Heiko; Ernst, Rebecca; Nazarov, Vladimir; Pfeifer, Lukas; Mewes, Hans-Werner; Mayer, Klaus F X

    2004-01-01

    Arabidopsis thaliana is the most widely studied model plant. Functional genomics is intensively underway in many laboratories worldwide. Beyond the basic annotation of the primary sequence data, the annotated genetic elements of Arabidopsis must be linked to diverse biological data and higher order information such as metabolic or regulatory pathways. The MIPS Arabidopsis thaliana database MAtDB aims to provide a comprehensive resource for Arabidopsis as a genome model that serves as a primary reference for research in plants and is suitable for transfer of knowledge to other plants, especially crops. The genome sequence as a common backbone serves as a scaffold for the integration of data, while, in a complementary effort, these data are enhanced through the application of state-of-the-art bioinformatics tools. This information is visualized on a genome-wide and a gene-by-gene basis with access both for web users and applications. This report updates the information given in a previous report and provides an outlook on further developments. The MAtDB web interface can be accessed at http://mips.gsf.de/proj/thal/db.

  1. Prokaryotic Virus Orthologous Groups (pVOGs): a resource for comparative genomics and protein family annotation

    PubMed Central

    Grazziotin, Ana Laura; Koonin, Eugene V.; Kristensen, David M.

    2017-01-01

    Viruses are the most abundant and diverse biological entities on earth, and while most of this diversity remains completely unexplored, advances in genome sequencing have provided unprecedented glimpses into the virosphere. The Prokaryotic Virus Orthologous Groups (pVOGs, formerly called Phage Orthologous Groups, POGs) resource has aided in this task over the past decade by using automated methods to keep pace with the rapid increase in genomic data. The uses of pVOGs include functional annotation of viral proteins, identification of genes and viruses in uncharacterized DNA samples, phylogenetic analysis, large-scale comparative genomics projects, and more. The pVOGs database represents a comprehensive set of orthologous gene families shared across multiple complete genomes of viruses that infect bacterial or archaeal hosts (viruses of eukaryotes will be added at a future date). The pVOGs are constructed within the Clusters of Orthologous Groups (COGs) framework that is widely used for orthology identification in prokaryotes. Since the previous release of the POGs, the size has tripled to nearly 3000 genomes and 300 000 proteins, and the number of conserved orthologous groups doubled to 9518. User-friendly webpages are available, including multiple sequence alignments and HMM profiles for each VOG. These changes provide major improvements to the pVOGs database, at a time of rapid advances in virus genomics. The pVOGs database is hosted jointly at the University of Iowa at http://dmk-brain.ecn.uiowa.edu/pVOGs and the NCBI at ftp://ftp.ncbi.nlm.nih.gov/pub/kristensen/pVOGs/home.html. PMID:27789703

  2. Prokaryotic Virus Orthologous Groups (pVOGs): a resource for comparative genomics and protein family annotation.

    PubMed

    Grazziotin, Ana Laura; Koonin, Eugene V; Kristensen, David M

    2017-01-04

    Viruses are the most abundant and diverse biological entities on earth, and while most of this diversity remains completely unexplored, advances in genome sequencing have provided unprecedented glimpses into the virosphere. The Prokaryotic Virus Orthologous Groups (pVOGs, formerly called Phage Orthologous Groups, POGs) resource has aided in this task over the past decade by using automated methods to keep pace with the rapid increase in genomic data. The uses of pVOGs include functional annotation of viral proteins, identification of genes and viruses in uncharacterized DNA samples, phylogenetic analysis, large-scale comparative genomics projects, and more. The pVOGs database represents a comprehensive set of orthologous gene families shared across multiple complete genomes of viruses that infect bacterial or archaeal hosts (viruses of eukaryotes will be added at a future date). The pVOGs are constructed within the Clusters of Orthologous Groups (COGs) framework that is widely used for orthology identification in prokaryotes. Since the previous release of the POGs, the size has tripled to nearly 3000 genomes and 300 000 proteins, and the number of conserved orthologous groups doubled to 9518. User-friendly webpages are available, including multiple sequence alignments and HMM profiles for each VOG. These changes provide major improvements to the pVOGs database, at a time of rapid advances in virus genomics. The pVOGs database is hosted jointly at the University of Iowa at http://dmk-brain.ecn.uiowa.edu/pVOGs and the NCBI at ftp://ftp.ncbi.nlm.nih.gov/pub/kristensen/pVOGs/home.html. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  3. The function of genomes in bioenergetic organelles.

    PubMed Central

    Allen, John F

    2003-01-01

    Mitochondria and chloroplasts are energy-transducing organelles of the cytoplasm of eukaryotic cells. They originated as bacterial symbionts whose host cells acquired respiration from the precursor of the mitochondrion, and oxygenic photosynthesis from the precursor of the chloroplast. The host cells also acquired genetic information from their symbionts, eventually incorporating much of it into their own genomes. Genes of the eukaryotic cell nucleus now encode most mitochondrial and chloroplast proteins. Genes are copied and moved between cellular compartments with relative ease, and there is no obvious obstacle to successful import of any protein precursor from the cytosol. So why are any genes at all retained in cytoplasmic organelles? One proposal is that these small but functional genomes provide a location for genes that is close to, and in the same compartment as, their gene products. This co-location facilitates rapid and direct regulatory coupling. Redox control of synthesis de novo is put forward as the common property of those proteins that must be encoded and synthesized within mitochondria and chloroplasts. This testable hypothesis is termed CORR, for co-location for redox regulation. Principles, predictions and consequences of CORR are examined in the context of competing hypotheses and current evidence. PMID:12594916

  4. A Functional Simulator of Spacecraft Resources

    NASA Technical Reports Server (NTRS)

    Liceaga, Carlos A.; Troutman, Patrick A.

    1997-01-01

    The SPAcecraft SIMulator (SPASIM) simulates the functions and resources of a spacecraft to quickly perform Phase A trade-off analyses and uncover any operational bottlenecks during any part of the mission. Failure modes and operational contingencies can be evaluated allowing optimization for a range of mission scenarios. The payloads and subsystems are simulated, using a hierarchy of graphical models, in terms of how their functions affect resources such as propellant, power, and data. Any of the inputs and outputs of the payloads and subsystems can be plotted during the simulation. Most trade-off analyses, including those that compare current versus advanced technology, can be performed by changing values in the parameter menus. However, when a component is replaced by one with a different functional architecture, its graphical model can also be modified or replaced by drawing from a component library. SPASIM has been validated using several spacecraft designs which were at least at the Critical Design Review level. The user and programmer guide, including figures, is available on line as a hyper text document. This is an easy-to-use and expand tool which is based on MATLAB and SIMULINK. It runs on SGI workstations and PCs under Windows 95 or NT.

  5. MicroScope--an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data.

    PubMed

    Vallenet, David; Belda, Eugeni; Calteau, Alexandra; Cruveiller, Stéphane; Engelen, Stefan; Lajus, Aurélie; Le Fèvre, François; Longin, Cyrille; Mornico, Damien; Roche, David; Rouy, Zoé; Salvignol, Gregory; Scarpelli, Claude; Thil Smith, Adam Alexander; Weiman, Marion; Médigue, Claudine

    2013-01-01

    MicroScope is an integrated platform dedicated to both the methodical updating of microbial genome annotation and to comparative analysis. The resource provides data from completed and ongoing genome projects (automatic and expert annotations), together with data sources from post-genomic experiments (i.e. transcriptomics, mutant collections) allowing users to perfect and improve the understanding of gene functions. MicroScope (http://www.genoscope.cns.fr/agc/microscope) combines tools and graphical interfaces to analyse genomes and to perform the manual curation of gene annotations in a comparative context. Since its first publication in January 2006, the system (previously named MaGe for Magnifying Genomes) has been continuously extended both in terms of data content and analysis tools. The last update of MicroScope was published in 2009 in the Database journal. Today, the resource contains data for >1600 microbial genomes, of which ∼300 are manually curated and maintained by biologists (1200 personal accounts today). Expert annotations are continuously gathered in the MicroScope database (∼50 000 a year), contributing to the improvement of the quality of microbial genomes annotations. Improved data browsing and searching tools have been added, original tools useful in the context of expert annotation have been developed and integrated and the website has been significantly redesigned to be more user-friendly. Furthermore, in the context of the European project Microme (Framework Program 7 Collaborative Project), MicroScope is becoming a resource providing for the curation and analysis of both genomic and metabolic data. An increasing number of projects are related to the study of environmental bacterial (meta)genomes that are able to metabolize a large variety of chemical compounds that may be of high industrial interest.

  6. MicroScope—an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data

    PubMed Central

    Vallenet, David; Belda, Eugeni; Calteau, Alexandra; Cruveiller, Stéphane; Engelen, Stefan; Lajus, Aurélie; Le Fèvre, François; Longin, Cyrille; Mornico, Damien; Roche, David; Rouy, Zoé; Salvignol, Gregory; Scarpelli, Claude; Thil Smith, Adam Alexander; Weiman, Marion; Médigue, Claudine

    2013-01-01

    MicroScope is an integrated platform dedicated to both the methodical updating of microbial genome annotation and to comparative analysis. The resource provides data from completed and ongoing genome projects (automatic and expert annotations), together with data sources from post-genomic experiments (i.e. transcriptomics, mutant collections) allowing users to perfect and improve the understanding of gene functions. MicroScope (http://www.genoscope.cns.fr/agc/microscope) combines tools and graphical interfaces to analyse genomes and to perform the manual curation of gene annotations in a comparative context. Since its first publication in January 2006, the system (previously named MaGe for Magnifying Genomes) has been continuously extended both in terms of data content and analysis tools. The last update of MicroScope was published in 2009 in the Database journal. Today, the resource contains data for >1600 microbial genomes, of which ∼300 are manually curated and maintained by biologists (1200 personal accounts today). Expert annotations are continuously gathered in the MicroScope database (∼50 000 a year), contributing to the improvement of the quality of microbial genomes annotations. Improved data browsing and searching tools have been added, original tools useful in the context of expert annotation have been developed and integrated and the website has been significantly redesigned to be more user-friendly. Furthermore, in the context of the European project Microme (Framework Program 7 Collaborative Project), MicroScope is becoming a resource providing for the curation and analysis of both genomic and metabolic data. An increasing number of projects are related to the study of environmental bacterial (meta)genomes that are able to metabolize a large variety of chemical compounds that may be of high industrial interest. PMID:23193269

  7. Population perspectives on functional genomic variation in yeast.

    PubMed

    Skelly, Daniel A; Magwene, Paul M

    2016-03-01

    Advances in high-throughput sequencing have facilitated large-scale surveys of genomic variation in the budding yeast,Saccharomyces cerevisiae These surveys have revealed extensive sequence variation between yeast strains. However, much less is known about how such variation influences the amount and nature of variation for functional genomic traits within and between yeast lineages. We review population-level studies of functional genomic variation, with a particular focus on how population functional genomic approaches can provide insights into both genome function and the evolutionary process. Although variation in functional genomics phenotypes is pervasive, our understanding of the consequences of this variation, either in physiological or evolutionary terms, is still rudimentary and thus motivates increased attention to appropriate null models. To date, much of the focus of population functional genomic studies has been on gene expression variation, but other functional genomic data types are just as likely to reveal important insights at the population level, suggesting a pressing need for more studies that go beyond transcription. Finally, we discuss how a population functional genomic perspective can be a powerful approach for developing a mechanistic understanding of the processes that link genomic variation to organismal phenotypes through gene networks.

  8. Large-scale reduction of the Bacillus subtilis genome: consequences for the transcriptional network, resource allocation, and metabolism.

    PubMed

    Reuß, Daniel R; Altenbuchner, Josef; Mäder, Ulrike; Rath, Hermann; Ischebeck, Till; Sappa, Praveen Kumar; Thürmer, Andrea; Guérin, Cyprien; Nicolas, Pierre; Steil, Leif; Zhu, Bingyao; Feussner, Ivo; Klumpp, Stefan; Daniel, Rolf; Commichau, Fabian M; Völker, Uwe; Stülke, Jörg

    2017-02-01

    Understanding cellular life requires a comprehensive knowledge of the essential cellular functions, the components involved, and their interactions. Minimized genomes are an important tool to gain this knowledge. We have constructed strains of the model bacterium, Bacillus subtilis, whose genomes have been reduced by ∼36%. These strains are fully viable, and their growth rates in complex medium are comparable to those of wild type strains. An in-depth multi-omics analysis of the genome reduced strains revealed how the deletions affect the transcription regulatory network of the cell, translation resource allocation, and metabolism. A comparison of gene counts and resource allocation demonstrates drastic differences in the two parameters, with 50% of the genes using as little as 10% of translation capacity, whereas the 6% essential genes require 57% of the translation resources. Taken together, the results are a valuable resource on gene dispensability in B. subtilis, and they suggest the roads to further genome reduction to approach the final aim of a minimal cell in which all functions are understood.

  9. The FunGenES database: a genomics resource for mouse embryonic stem cell differentiation.

    PubMed

    Schulz, Herbert; Kolde, Raivo; Adler, Priit; Aksoy, Irène; Anastassiadis, Konstantinos; Bader, Michael; Billon, Nathalie; Boeuf, Hélène; Bourillot, Pierre-Yves; Buchholz, Frank; Dani, Christian; Doss, Michael Xavier; Forrester, Lesley; Gitton, Murielle; Henrique, Domingos; Hescheler, Jürgen; Himmelbauer, Heinz; Hübner, Norbert; Karantzali, Efthimia; Kretsovali, Androniki; Lubitz, Sandra; Pradier, Laurent; Rai, Meena; Reimand, Jüri; Rolletschek, Alexandra; Sachinidis, Agapios; Savatier, Pierre; Stewart, Francis; Storm, Mike P; Trouillas, Marina; Vilo, Jaak; Welham, Melanie J; Winkler, Johannes; Wobus, Anna M; Hatzopoulos, Antonis K

    2009-09-03

    Embryonic stem (ES) cells have high self-renewal capacity and the potential to differentiate into a large variety of cell types. To investigate gene networks operating in pluripotent ES cells and their derivatives, the "Functional Genomics in Embryonic Stem Cells" consortium (FunGenES) has analyzed the transcriptome of mouse ES cells in eleven diverse settings representing sixty-seven experimental conditions. To better illustrate gene expression profiles in mouse ES cells, we have organized the results in an interactive database with a number of features and tools. Specifically, we have generated clusters of transcripts that behave the same way under the entire spectrum of the sixty-seven experimental conditions; we have assembled genes in groups according to their time of expression during successive days of ES cell differentiation; we have included expression profiles of specific gene classes such as transcription regulatory factors and Expressed Sequence Tags; transcripts have been arranged in "Expression Waves" and juxtaposed to genes with opposite or complementary expression patterns; we have designed search engines to display the expression profile of any transcript during ES cell differentiation; gene expression data have been organized in animated graphs of KEGG signaling and metabolic pathways; and finally, we have incorporated advanced functional annotations for individual genes or gene clusters of interest and links to microarray and genomic resources. The FunGenES database provides a comprehensive resource for studies into the biology of ES cells.

  10. The FunGenES Database: A Genomics Resource for Mouse Embryonic Stem Cell Differentiation

    PubMed Central

    Adler, Priit; Aksoy, Irène; Anastassiadis, Konstantinos; Bader, Michael; Billon, Nathalie; Boeuf, Hélène; Bourillot, Pierre-Yves; Buchholz, Frank; Dani, Christian; Doss, Michael Xavier; Forrester, Lesley; Gitton, Murielle; Henrique, Domingos; Hescheler, Jürgen; Himmelbauer, Heinz; Hübner, Norbert; Karantzali, Efthimia; Kretsovali, Androniki; Lubitz, Sandra; Pradier, Laurent; Rai, Meena; Reimand, Jüri; Rolletschek, Alexandra; Sachinidis, Agapios; Savatier, Pierre; Stewart, Francis; Storm, Mike P.; Trouillas, Marina; Vilo, Jaak; Welham, Melanie J.; Winkler, Johannes; Wobus, Anna M.; Hatzopoulos, Antonis K.

    2009-01-01

    Embryonic stem (ES) cells have high self-renewal capacity and the potential to differentiate into a large variety of cell types. To investigate gene networks operating in pluripotent ES cells and their derivatives, the “Functional Genomics in Embryonic Stem Cells” consortium (FunGenES) has analyzed the transcriptome of mouse ES cells in eleven diverse settings representing sixty-seven experimental conditions. To better illustrate gene expression profiles in mouse ES cells, we have organized the results in an interactive database with a number of features and tools. Specifically, we have generated clusters of transcripts that behave the same way under the entire spectrum of the sixty-seven experimental conditions; we have assembled genes in groups according to their time of expression during successive days of ES cell differentiation; we have included expression profiles of specific gene classes such as transcription regulatory factors and Expressed Sequence Tags; transcripts have been arranged in “Expression Waves” and juxtaposed to genes with opposite or complementary expression patterns; we have designed search engines to display the expression profile of any transcript during ES cell differentiation; gene expression data have been organized in animated graphs of KEGG signaling and metabolic pathways; and finally, we have incorporated advanced functional annotations for individual genes or gene clusters of interest and links to microarray and genomic resources. The FunGenES database provides a comprehensive resource for studies into the biology of ES cells. PMID:19727443

  11. Physical mapping resources for large plant genomes: radiation hybrids for wheat D-genome progenitor Aegilops tauschii

    PubMed Central

    2012-01-01

    lines with an average deletion frequency of ~10% were identified for developing high density marker scaffolds of the D-genome. Conclusions The RH panel reported here is the first developed for any wild ancestor of a major cultivated plant species. The results provided insight into various aspects of RH mapping in plants, including the genetically effective cell number for wheat (for the first time) and the potential implementation of this technique in other plant species. This RH panel will be an invaluable resource for mapping gene based markers, developing a complete marker scaffold for the whole genome sequence assembly, fine mapping of markers and functional characterization of genes and gene networks present on the D-genome. PMID:23127207

  12. Physical mapping resources for large plant genomes: radiation hybrids for wheat D-genome progenitor Aegilops tauschii.

    PubMed

    Kumar, Ajay; Simons, Kristin; Iqbal, Muhammad J; de Jiménez, Monika Michalak; Bassi, Filippo M; Ghavami, Farhad; Al-Azzam, Omar; Drader, Thomas; Wang, Yi; Luo, Ming-Cheng; Gu, Yong Q; Denton, Anne; Lazo, Gerard R; Xu, Steven S; Dvorak, Jan; Kianian, Penny M A; Kianian, Shahryar F

    2012-11-05

    deletion frequency of ~10% were identified for developing high density marker scaffolds of the D-genome. The RH panel reported here is the first developed for any wild ancestor of a major cultivated plant species. The results provided insight into various aspects of RH mapping in plants, including the genetically effective cell number for wheat (for the first time) and the potential implementation of this technique in other plant species. This RH panel will be an invaluable resource for mapping gene based markers, developing a complete marker scaffold for the whole genome sequence assembly, fine mapping of markers and functional characterization of genes and gene networks present on the D-genome.

  13. Genomic resources for songbird research and their use in characterizing gene expression during brain development

    PubMed Central

    Li, XiaoChing; Wang, Xiu-Jie; Tannenhauser, Jonathan; Podell, Sheila; Mukherjee, Piali; Hertel, Moritz; Biane, Jeremy; Masuda, Shoko; Nottebohm, Fernando; Gaasterland, Terry

    2007-01-01

    Vocal learning and neuronal replacement have been studied extensively in songbirds, but until recently, few molecular and genomic tools for songbird research existed. Here we describe new molecular/genomic resources developed in our laboratory. We made cDNA libraries from zebra finch (Taeniopygia guttata) brains at different developmental stages. A total of 11,000 cDNA clones from these libraries, representing 5,866 unique gene transcripts, were randomly picked and sequenced from the 3′ ends. A web-based database was established for clone tracking, sequence analysis, and functional annotations. Our cDNA libraries were not normalized. Sequencing ESTs without normalization produced many developmental stage-specific sequences, yielding insights into patterns of gene expression at different stages of brain development. In particular, the cDNA library made from brains at posthatching day 30–50, corresponding to the period of rapid song system development and song learning, has the most diverse and richest set of genes expressed. We also identified five microRNAs whose sequences are highly conserved between zebra finch and other species. We printed cDNA microarrays and profiled gene expression in the high vocal center of both adult male zebra finches and canaries (Serinus canaria). Genes differentially expressed in the high vocal center were identified from the microarray hybridization results. Selected genes were validated by in situ hybridization. Networks among the regulated genes were also identified. These resources provide songbird biologists with tools for genome annotation, comparative genomics, and microarray gene expression analysis. PMID:17426146

  14. Fishing for function: zebrafish BAC transgenics for functional genomics.

    PubMed

    Chatterjee, Sumantra; Lufkin, Thomas

    2011-08-01

    Transgenics using bacterial artificial chromosomes (BACs) offers a great opportunity to look at gene regulation in a developing embryo. The modified BAC containing a reporter inserted just before the translational start site of the gene of interest allows for the visualization of spatio-temporal gene expression. Though this method has been used in the mouse model extensively, its utility in zebrafish studies is relatively new. This review aims to look at the utility of making BAC transgenics in zebrafish and its applications in functional genomics. We look at the various methods to modify the BAC, some limitations and what the future holds.

  15. Fishing for function: zebrafish BAC transgenics for functional genomics

    PubMed Central

    Chatterjee, Sumantra; Lufkin, Thomas

    2012-01-01

    Transgenics using bacterial artificial chromosomes (BACs) offers a great opportunity to look at gene regulation in a developing embryo. The modified BAC containing a reporter inserted just before the translational start site of the gene of interest allows for the visualization of spatio-temporal gene expression. Though this method has been used in the mouse model extensively, its utility in zebrafish studies is relatively new. This review aims to look at the utility of making BAC transgenics in zebrafish and its applications in functional genomics. We look at the various methods to modify the BAC, some limitations and what the future holds. PMID:21647532

  16. Functional genomics down under: RNAi screening in the Victorian Centre for Functional Genomics.

    PubMed

    Thomas, Daniel W; Gould, Cathryn M; Handoko, Yanny; Simpson, Kaylene J

    2014-05-01

    The Victorian Centre for Functional Genomics (VCFG) is an RNAi screening facility housed at the Peter MacCallum Cancer Centre in Melbourne, Australia. The Peter Mac is Australia's largest dedicated Cancer Research Institute, home to a team of over 520 scientists that focus on understanding the genetic risk of cancer, the molecular events regulating cancer growth and dissemination and improving detection through new diagnostic tools (www.petermac.org). Peter Mac is a well recognised technology leader and established the VCFG with a view to enabling researchers Australia and New Zealand-wide access to cutting edge functional genomics technology, infrastructure and expertise. This review documents the technology platforms operated within the VCFG and provides insight into the workflows and analysis pipelines currently in operation.

  17. A collection of plant-specific genomic data and resources at NCBI.

    PubMed

    Tatusova, Tatiana; Smith-White, Brian; Ostell, James

    2007-01-01

    The National Center for Biotechnology Information (NCBI) provides a data-rich environment in support of genomic research by collecting the biological data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains and integrating the data with analytical, search, and retrieval resources through the NCBI Web site. Entrez, an integrated search and retrieval system, enables text searches across various diverse biological databases maintained at NCBI. Map Viewer, the genome browser developed at NCBI, displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allows maps from all plant genomes available in the Map Viewer to be searched to produce a display of aligned maps from several species. Customized Plant Basic Local Alignment Search Tool (PlantBLAST) allows the user to perform sequence similarity searches in a special collection of mapped plant sequence data and to view the resulting alignments within a genomic context using Map Viewer. In addition, pre-computed sequence similarities, such as those for proteins offered by BLAST Link (BLink), enable fluid navigation from un-annotated to annotated sequences, quickening the pace of discovery. Plant Genome Central (PGC) is a Web portal that provides centralized access to all NCBI plant genome resources. Also, there are links to plant-specific Web resources external to NCBI such as organism-specific databases, genome-sequencing project Web pages, and homepages of genomic bioinformatics organizations.

  18. Community standards for genomic resources, genetic conservation, and data integration

    Treesearch

    Jill Wegrzyn; Meg Staton; Emily Grau; Richard Cronn; C. Dana. Nelson

    2017-01-01

    Genetics and genomics are increasingly important in forestry management and conservation. Next generation sequencing can increase analytical power, but still relies on building on the structure of previously acquired data. Data standards and data sharing allow the community to maximize the analytical power of high throughput genomics data. The landscape of incomplete...

  19. Exploration of Genetic and Genomic Resources for Abiotic and Biotic Stress Tolerance in Pearl Millet

    PubMed Central

    Shivhare, Radha; Lata, Charu

    2017-01-01

    Pearl millet is one of the most important small-grained C4 Panicoid crops with a large genome size (∼2352 Mb), short life cycle and outbreeding nature. It is highly resilient to areas with scanty rain and high temperature. Pearl millet is a nutritionally superior staple crop for people inhabiting hot, drought-prone arid and semi-arid regions of South Asia and Africa where it is widely grown and used for food, hay, silage, bird feed, building material, and fuel. Having excellent nutrient composition and exceptional buffering capacity against variable climatic conditions and pathogen attack makes pearl millet a wonderful model crop for stress tolerance studies. Pearl millet germplasm show a large range of genotypic and phenotypic variations including tolerance to abiotic and biotic stresses. Conventional breeding for enhancing abiotic and biotic stress resistance in pearl millet have met with considerable success, however, in last few years various novel approaches including functional genomics and molecular breeding have been attempted in this crop for augmenting yield under adverse environmental conditions, and there is still a lot of scope for further improvement using genomic tools. Discovery and use of various DNA-based markers such as EST-SSRs, DArT, CISP, and SSCP-SNP in pearl millet not only help in determining population structure and genetic diversity but also prove to be important for developing strategies for crop improvement at a faster rate and greater precision. Molecular marker-based genetic linkage maps and identification of genomic regions determining yield under abiotic stresses particularly terminal drought have paved way for marker-assisted selection and breeding of pearl millet cultivars. Reference collections and marker-assisted backcrossing have also been used to improve biotic stress resistance in pearl millet specifically to downy mildew. Whole genome sequencing of pearl millet genome will give new insights for processing of functional

  20. Exploration of Genetic and Genomic Resources for Abiotic and Biotic Stress Tolerance in Pearl Millet.

    PubMed

    Shivhare, Radha; Lata, Charu

    2016-01-01

    Pearl millet is one of the most important small-grained C4 Panicoid crops with a large genome size (∼2352 Mb), short life cycle and outbreeding nature. It is highly resilient to areas with scanty rain and high temperature. Pearl millet is a nutritionally superior staple crop for people inhabiting hot, drought-prone arid and semi-arid regions of South Asia and Africa where it is widely grown and used for food, hay, silage, bird feed, building material, and fuel. Having excellent nutrient composition and exceptional buffering capacity against variable climatic conditions and pathogen attack makes pearl millet a wonderful model crop for stress tolerance studies. Pearl millet germplasm show a large range of genotypic and phenotypic variations including tolerance to abiotic and biotic stresses. Conventional breeding for enhancing abiotic and biotic stress resistance in pearl millet have met with considerable success, however, in last few years various novel approaches including functional genomics and molecular breeding have been attempted in this crop for augmenting yield under adverse environmental conditions, and there is still a lot of scope for further improvement using genomic tools. Discovery and use of various DNA-based markers such as EST-SSRs, DArT, CISP, and SSCP-SNP in pearl millet not only help in determining population structure and genetic diversity but also prove to be important for developing strategies for crop improvement at a faster rate and greater precision. Molecular marker-based genetic linkage maps and identification of genomic regions determining yield under abiotic stresses particularly terminal drought have paved way for marker-assisted selection and breeding of pearl millet cultivars. Reference collections and marker-assisted backcrossing have also been used to improve biotic stress resistance in pearl millet specifically to downy mildew. Whole genome sequencing of pearl millet genome will give new insights for processing of functional

  1. Whole genome sequences in pulse crops: a global community resource to expedite translational genomics and knowledge-based crop improvement.

    PubMed

    Bohra, Abhishek; Singh, Narendra P

    2015-08-01

    Unprecedented developments in legume genomics over the last decade have resulted in the acquisition of a wide range of modern genomic resources to underpin genetic improvement of grain legumes. The genome enabled insights direct investigators in various ways that primarily include unearthing novel structural variations, retrieving the lost genetic diversity, introducing novel/exotic alleles from wider gene pools, finely resolving the complex quantitative traits and so forth. To this end, ready availability of cost-efficient and high-density genotyping assays allows genome wide prediction to be increasingly recognized as the key selection criterion in crop breeding. Further, the high-dimensional measurements of agronomically significant phenotypes obtained by using new-generation screening techniques will empower reference based resequencing as well as allele mining and trait mapping methods to comprehensively associate genome diversity with the phenome scale variation. Besides stimulating the forward genetic systems, accessibility to precisely delineated genomic segments reveals novel candidates for reverse genetic techniques like targeted genome editing. The shifting paradigm in plant genomics in turn necessitates optimization of crop breeding strategies to enable the most efficient integration of advanced omics knowledge and tools. We anticipate that the crop improvement schemes will be bolstered remarkably with rational deployment of these genome-guided approaches, ultimately resulting in expanded plant breeding capacities and improved crop performance.

  2. Integrating Genomic Resources with Electronic Health Records using the HL7 Infobutton Standard

    PubMed Central

    Overby, Casey Lynnette; Del Fiol, Guilherme; Rubinstein, Wendy S.; Maglott, Donna R.; Nelson, Tristan H.; Milosavljevic, Aleksandar; Martin, Christa L.; Goehringer, Scott R.; Freimuth, Robert R.; Williams, Marc S.

    2016-01-01

    Summary Background The Clinical Genome Resource (ClinGen) Electronic Health Record (EHR) Workgroup aims to integrate ClinGen resources with EHRs. A promising option to enable this integration is through the Health Level Seven (HL7) Infobutton Standard. EHR systems that are certified according to the US Meaningful Use program provide HL7-compliant infobutton capabilities, which can be leveraged to support clinical decision-making in genomics. Objectives To integrate genomic knowledge resources using the HL7 infobutton standard. Two tactics to achieve this objective were: (1) creating an HL7-compliant search interface for ClinGen, and (2) proposing guidance for genomic resources on achieving HL7 Infobutton standard accessibility and compliance. Methods We built a search interface utilizing OpenInfobutton, an open source reference implementation of the HL7 Infobutton standard. ClinGen resources were assessed for readiness towards HL7 compliance. Finally, based upon our experiences we provide recommendations for publishers seeking to achieve HL7 compliance. Results Eight genomic resources and two sub-resources were integrated with the ClinGen search engine via OpenInfobutton and the HL7 infobutton standard. Resources we assessed have varying levels of readiness towards HL7-compliance. Furthermore, we found that adoption of standard terminologies used by EHR systems is the main gap to achieve compliance. Conclusion Genomic resources can be integrated with EHR systems via the HL7 Infobutton standard using OpenInfobutton. Full compliance of genomic resources with the Infobutton standard would further enhance interoperability with EHR systems. PMID:27579472

  3. INFRAFRONTIER: a European resource for studying the functional basis of human disease.

    PubMed

    Raess, Michael; de Castro, Ana Ambrosio; Gailus-Durner, Valérie; Fessele, Sabine; Hrabě de Angelis, Martin

    2016-08-01

    Ageing research and more generally the study of the functional basis of human diseases profit enormously from the large-scale approaches and resources in mouse functional genomics: systematic targeted mutation of the mouse genome, systemic phenotyping in mouse clinics, and the archiving and distribution of the mouse resources in public repositories. INFRAFRONTIER, the European research infrastructure for the development, systemic phenotyping, archiving and distribution of mammalian models, offers access to sustainable mouse resources for biomedical research. INFRAFRONTIER promotes the global sharing of high-quality resources and data and thus contributes to data reproducibility and animal welfare. INFRAFRONTIER puts great effort into international standardisation and quality control and into technology development to improve and expand experimental protocols, reduce the use of animals in research and increase the reproducibility of results. In concert with the research community and the International Mouse Phenotyping Consortium (IMPC), INFRAFRONTIER is currently developing new pilot platforms and services for the research on ageing and age-related diseases.

  4. NCBI GEO: archive for functional genomics data sets—update

    PubMed Central

    Barrett, Tanya; Wilhite, Stephen E.; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F.; Tomashevsky, Maxim; Marshall, Kimberly A.; Phillippy, Katherine H.; Sherman, Patti M.; Holko, Michelle; Yefanov, Andrey; Lee, Hyeseung; Zhang, Naigong; Robertson, Cynthia L.; Serova, Nadezhda; Davis, Sean; Soboleva, Alexandra

    2013-01-01

    The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data. PMID:23193258

  5. NCBI GEO: archive for functional genomics data sets--update.

    PubMed

    Barrett, Tanya; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Holko, Michelle; Yefanov, Andrey; Lee, Hyeseung; Zhang, Naigong; Robertson, Cynthia L; Serova, Nadezhda; Davis, Sean; Soboleva, Alexandra

    2013-01-01

    The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.

  6. Genomic resources for valuable woody ornamental landscape plants such as hydrangea macrophylla

    USDA-ARS?s Scientific Manuscript database

    New and improved ornamental landscape plants are typically produced by conventional breeding. Unfortunately, long generation times for woody plants can significantly slow progress. Incorporating genomic and biotechnology resources provides more information when designing breeding strategies and ac...

  7. The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

    PubMed

    Lack, Justin B; Cardeno, Charis M; Crepeau, Marc W; Taylor, William; Corbett-Detig, Russell B; Stevens, Kristian A; Langley, Charles H; Pool, John E

    2015-04-01

    Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets. Copyright © 2015 by the Genetics Society of America.

  8. Development of Genomic Resources for Pacific Herring through Targeted Transcriptome Pyrosequencing

    PubMed Central

    Roberts, Steven B.; Hauser, Lorenz; Seeb, Lisa W.; Seeb, James E.

    2012-01-01

    Pacific herring (Clupea pallasii) support commercially and culturally important fisheries but have experienced significant additional pressure from a variety of anthropogenic and environmental sources. In order to provide genomic resources to facilitate organismal and population level research, high-throughput pyrosequencing (Roche 454) was carried out on transcriptome libraries from liver and testes samples taken in Prince William Sound, the Bering Sea, and the Gulf of Alaska. Over 40,000 contigs were identified with an average length of 728 bp. We describe an annotated transcriptome as well as a workflow for single nucleotide polymorphism (SNP) discovery and validation. A subset of 96 candidate SNPs chosen from 10,933 potential SNPs, were tested using a combination of Sanger sequencing and high-resolution melt-curve analysis. Five SNPs supported between-ocean-basin differentiation, while one SNP associated with immune function provided high differentiation between Prince William Sound and Kodiak Island within the Gulf of Alaska. These genomic resources provide a basis for environmental physiology studies and opportunities for marker development and subsequent population structure analysis. PMID:22383979

  9. [The place of functional genomics in oncological research].

    PubMed

    Bálint, Bálint L; Nagy, László

    2013-03-01

    The 1000 genomes project changed the way how we see the human genome. The rapid development of the deep sequencing technologies is raising several practical questions, and the way how we answer these questions will affect deeply the future of the oncological reseach in Hungary. In our manuscript we give a short overview of the results of the 1000 genomes project and we present the place of the functional genomic investigations between other genomic tools. Based on the recent development in the field we summarize the challenges that have to be addressed in the next couple of years.

  10. Resources to increase genetics and genomics capacity of oncology nurses.

    PubMed

    Aiello, Lisa B

    2015-03-01

    Since the completion of the Human Genome Project (HGP) in 2003, the understanding of genetics and its influence on disease, particularly cancer, has increased dramatically. The initial focus after the completion of HGP was on identifying single-gene disorders, such as many hereditary cancer syndromes (e.g., BRCA1, BRCA2, HNPCC). As research continues, the major impact that genetics and genomics have across the healthcare continuum is only beginning to become clear.

  11. Megx.net: integrated database resource for marine ecological genomics

    PubMed Central

    Kottmann, Renzo; Kostadinov, Ivalyo; Duhaime, Melissa Beth; Buttigieg, Pier Luigi; Yilmaz, Pelin; Hankeln, Wolfgang; Waldmann, Jost; Glöckner, Frank Oliver

    2010-01-01

    Megx.net is a database and portal that provides integrated access to georeferenced marker genes, environment data and marine genome and metagenome projects for microbial ecological genomics. All data are stored in the Microbial Ecological Genomics DataBase (MegDB), which is subdivided to hold both sequence and habitat data and global environmental data layers. The extended system provides access to several hundreds of genomes and metagenomes from prokaryotes and phages, as well as over a million small and large subunit ribosomal RNA sequences. With the refined Genes Mapserver, all data can be interactively visualized on a world map and statistics describing environmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net. PMID:19858098

  12. The role of chromosome domains in shaping the functional genome.

    PubMed

    Sexton, Tom; Cavalli, Giacomo

    2015-03-12

    The genome must be highly compacted to fit within eukaryotic nuclei but must be accessible to the transcriptional machinery to allow appropriate expression of genes in different cell types and throughout developmental pathways. A growing body of work has shown that the genome, analogously to proteins, forms an ordered, hierarchical structure that closely correlates and may even be causally linked with regulation of functions such as transcription. This review describes our current understanding of how these functional genomic "secondary and tertiary structures" form a blueprint for global nuclear architecture and the potential they hold for understanding and manipulating genomic regulation.

  13. RegPrecise 3.0--a resource for genome-scale exploration of transcriptional regulation in bacteria.

    PubMed

    Novichkov, Pavel S; Kazakov, Alexey E; Ravcheev, Dmitry A; Leyn, Semen A; Kovaleva, Galina Y; Sutormin, Roman A; Kazanov, Marat D; Riehl, William; Arkin, Adam P; Dubchak, Inna; Rodionov, Dmitry A

    2013-11-01

    Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include near 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons are available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by an conservative propagation of the reference regulons to closely related genomes. RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in

  14. Exploiting Genomics Resources to Identify Candidate Genes Underlying Antioxidants Content in Tomato Fruit

    PubMed Central

    Calafiore, Roberta; Ruggieri, Valentino; Raiola, Assunta; Rigano, Maria M.; Sacco, Adriana; Hassan, Mohamed I.; Frusciante, Luigi; Barone, Amalia

    2016-01-01

    The tomato is a model species for fleshy fruit development and ripening, as well as for genomics studies of others Solanaceae. Many genetic and genomics resources, including databases for sequencing, transcriptomics and metabolomics data, have been developed and are today available. The purpose of the present work was to uncover new genes and/or alleles that determine ascorbic acid and carotenoids accumulation, by exploiting one Solanum pennellii introgression lines (IL7-3) harboring quantitative trait loci (QTL) that increase the content of these metabolites in the fruit. The higher ascorbic acid and carotenoids content in IL7-3 was confirmed at three fruit developmental stages. The tomato genome reference sequence and the recently released S. pennellii genome sequence were investigated to identify candidate genes (CGs) that might control ascorbic acid and carotenoids accumulation. First of all, a refinement of the wild region borders in the IL7-3 was achieved by analyzing CAPS markers designed in our laboratory. Afterward, six CGs associated to ascorbic acid and one with carotenoids metabolism were identified exploring the annotation and the Gene Ontology terms of genes included in the region. Variants between the sequence of the wild and the cultivated alleles of these genes were investigated for their functional relevance and their potential effects on the protein sequences were predicted. Transcriptional levels of CGs in the introgression region were extracted from RNA-Seq data available for the entire S. pennellii introgression lines collection and verified by Real-Time qPCR. Finally, seven IL7-3 sub-lines were genotyped using 28 species-specific markers and then were evaluated for metabolites content. These analyses evidenced a significant decrease in transcript abundance for one 9-cis-epoxycarotenoid dioxygenase and one L-ascorbate oxidase homolog, whose role in the accumulation of carotenoids and ascorbic acid is discussed. Comprehensively, the reported

  15. Cancer genomics object model: an object model for multiple functional genomics data for cancer research.

    PubMed

    Park, Yu Rang; Lee, Hye Won; Cho, Sung Bum; Kim, Ju Han

    2007-01-01

    The development of functional genomics including transcriptomics, proteomics and metabolomics allow us to monitor a large number of key cellular pathways simultaneously. Several technology-specific data models have been introduced for the representation of functional genomics experimental data, including the MicroArray Gene Expression-Object Model (MAGE-OM), the Proteomics Experiment Data Repository (PEDRo), and the Tissue MicroArray-Object Model (TMA-OM). Despite the increasing number of cancer studies using multiple functional genomics technologies, there is still no integrated data model for multiple functional genomics experimental and clinical data. We propose an object-oriented data model for cancer genomics research, Cancer Genomics Object Model (CaGe-OM). We reference four data models: Functional Genomic-Object Model, MAGE-OM, TMAOM and PEDRo. The clinical and histopathological information models are created by analyzing cancer management workflow and referencing the College of American Pathology Cancer Protocols and National Cancer Institute Common Data Elements. The CaGe-OM provides a comprehensive data model for integrated storage and analysis of clinical and multiple functional genomics data.

  16. Using an online genome resource to identify myostatin variation in U.S. sheep

    USDA-ARS?s Scientific Manuscript database

    We created a public, searchable DNA sequence resource for sheep that contained approximately 14x whole genome sequence of 96 rams. The animals represent 10 popular U.S. breeds and share minimal pedigree relationships, making the resource suitable for viewing gene variants in the user-friendly Integ...

  17. Functional diversity in resource use by fungi.

    PubMed

    McGuire, Krista L; Bent, Elizabeth; Borneman, James; Majumder, Arundhati; Allison, Steven D; Tresederi, Kathleen K

    2010-08-01

    Fungi influence nutrient cycling in terrestrial ecosystems, as they are major regulators of decomposition and soil respiration. However, little is known about the substrate preferences of individual fungal species outside of laboratory culture studies. If active fungi differ in their substrate preferences in situ, then changes in fungal diversity due to global change may dramatically influence nutrient cycling in ecosystems. To test the responses of individual fungal taxa to specific substrates, we used a nucleotide-analogue procedure in the boreal forest of Alaska (USA). Specifically, we added four organic N compounds commonly found in plant litter (arginine, glutamate, lignocellulose, and tannin-protein) to litterbags filled with decomposed leaf litter (black spruce and aspen) and assessed the responses of active fungal species using qPCR (quantitative polymerase chain reaction), oligonucleotide fingerprinting of rRNA genes, and sequencing. We also compared the sequences from our experiment with a concurrent warming experiment to see if active fungi that targeted more recalcitrant compounds would respond more positively to soil warming. We found that individual fungal taxa responded differently to substrate additions and that active fungal communities were different across litter types (spruce vs. aspen). Active fungi that targeted lignocellulose also responded positively to experimental warming. Additionally, resource-use patterns in different fungal taxa were genetically correlated, suggesting that it may be possible to predict the ecological function of active fungal communities based on genetic information. Together, these results imply that fungi are functionally diverse and that reductions in fungal diversity may have consequences for ecosystem functioning.

  18. AtPID: a genome-scale resource for genotype–phenotype associations in Arabidopsis

    PubMed Central

    Lv, Qi; Lan, Yiheng; Shi, Yan; Wang, Huan; Pan, Xia; Li, Peng; Shi, Tieliu

    2017-01-01

    AtPID (Arabidopsis thaliana Protein Interactome Database, available at http://www.megabionet.org/atpid) is an integrated database resource for protein interaction network and functional annotation. In the past few years, we collected 5564 mutants with significant morphological alterations and manually curated them to 167 plant ontology (PO) morphology categories. These single/multiple-gene mutants were indexed and linked to 3919 genes. After integrated these genotype–phenotype associations with the comprehensive protein interaction network in AtPID, we developed a Naïve Bayes method and predicted 4457 novel high confidence gene-PO pairs with 1369 genes as the complement. Along with the accumulated novel data for protein interaction and functional annotation, and the updated visualization toolkits, we present a genome-scale resource for genotype–phenotype associations for Arabidopsis in AtPID 5.0. In our updated website, all the new genotype–phenotype associations from mutants, protein network, and the protein annotation information can be vividly displayed in a comprehensive network view, which will greatly enhance plant protein function and genotype–phenotype association studies in a systematical way. PMID:27899679

  19. Assigning biological functions to rice genes by genome annotation, expression analysis and mutagenesis.

    PubMed

    Jiang, Shu-Ye; Ramachandran, Srinivasan

    2010-12-01

    Rice is the first cereal genome to be completely sequenced. Since the completion of its genome sequencing, considerable progress has been made in multiple areas including the whole genome annotation, gene expression profiling, mutant collection, etc. Here, we summarize the current status of rice genome annotation and review the methodology of assigning biological functions to hundreds of thousands of rice genes as well as discuss the major limitations and the future perspective in rice functional genomics. Available data analysis shows that the rice genome encodes around 32,000 protein-coding genes. Expression analysis revealed at least 31,000 genes with expression evidence from full-length cDNA/EST collection or other transcript profiling. In addition, we have summarized various strategies to generate mutant population including natural, physical, chemical, T-DNA, transposon/retrotransposon or gene silencing based mutagenesis. Currently, more than 1 million of mutants have been generated and 27,551 of them have their flanking sequence tags. To assign biological functions to hundreds of thousands of rice genes, global co-operations are required, various genetic resources should be more easily accessible and diverse data from transcriptomics, proteomics, epigenetics, comparative genomics and bioinformatics should be integrated to better understand the functions of these genes and their regulatory mechanisms.

  20. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics

    PubMed Central

    Tzika, Athanasia C.; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C.

    2015-01-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the “Reptilian Transcriptomes Database 2.0,” which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. PMID:26133641

  1. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics.

    PubMed

    Tzika, Athanasia C; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C

    2015-07-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the "Reptilian Transcriptomes Database 2.0," which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  2. REVIEW: Zebrafish: A Renewed Model System For Functional Genomics

    NASA Astrophysics Data System (ADS)

    Wen, Xiao-Yan

    2008-01-01

    In the post genome era, a major goal in molecular biology is to determine the function of the many thousands of genes present in the vertebrate genome. The zebrafish (Danio rerio) provides an almost ideal genetic model to identify the biological roles of these novel genes, in part because their embryos are transparent and develop rapidly. The zebrafish has many advantages over mouse for genome-wide mutagenesis studies, allowing for easier, cheaper and faster functional characterization of novel genes in the vertebrate genome. Many molecular research tools such as chemical mutagenesis, transgenesis, gene trapping, gene knockdown, TILLING, gene targeting, RNAi and chemical genetic screen are now available in zebrafish. Combining all the forward, reverse, and chemical genetic tools, it is expected that zebrafish will make invaluable contribution to vertebrate functional genomics in functional annotation of the genes, modeling human diseases and drug discoveries.

  3. New resources inform study of genome size, content, and organization in nonavian reptiles

    PubMed Central

    Janes, Daniel E.; Organ, Christopher; Valenzuela, Nicole

    2008-01-01

    Genomic resources for studies of nonavian reptiles have recently improved and will reach a new level of access once the genomes of the painted turtle (Chrysemys picta) and the green anole (Anolis carolinensis) have been published. Eleven speakers gathered for a symposium on reptilian genomics and evolutionary genetics at the 2008 meeting of the Society for Integrative and Comparative Biology in San Antonio, Texas. Presentations described results of reptilian genetic studies concerning molecular evolution, chromosomal evolution, genomic architecture, population dynamics, endocrinology and endocrine disruption, and the evolution of developmental mechanisms. The presented studies took advantage of the recent generation of genetic and genomic tools and resources. Novel findings demonstrated the positive impact made by the improved availability of resources like genome annotations and bacterial artificial chromosomes (BACs). The symposium was timely and important because it provided a vehicle for the dissemination of novel findings that advance the field. Moreover, this meeting fostered the synergistic interaction of the participants as a group, which is anticipated to encourage the funding and creation of further resources such as additional BAC libraries and genomic projects. Novel data have already been collected and studies like those presented in this symposium promise to shape and improve our understanding of overall amniote evolution. Additional reptilian taxa such as the American alligator (Alligator mississippiensis), tuatara (Sphenodon punctatus), and garter snake (Thamnophis sirtalis) should be the foci of future genomic projects. We hope that the following articles in this volume will help promote these efforts by describing the conclusions and the potential that the improvement of genomic resources for nonavian reptiles can continue having in this important area of integrative and comparative biology. PMID:21669805

  4. Population genomics of resource exploitation: insights from gene expression profiles of two Daphnia ecotypes fed alternate resources.

    PubMed

    Dudycha, Jeffry L; Brandon, Christopher S; Deitz, Kevin C

    2012-02-01

    Consumer-resource interactions are a central issue in evolutionary and community ecology because they play important roles in selection and population regulation. Most consumers encounter resource variation at multiple scales, and respond through phenotypic plasticity in the short term or evolutionary divergence in the long term. The key traits for these responses may influence resource acquisition, assimilation, and/or allocation. To identify relevant candidate genes, we experimentally assayed genome-wide gene expression in pond and lake Daphnia ecotypes exposed to alternate resource environments. One was a simple, high-quality laboratory diet, Ankistrodesmus falcatus. The other was the complex natural seston from a large lake. In temporary ponds, Daphnia generally experience high-quality, abundant resources, whereas lakes provide low-quality, seasonally shifting resources that are chronically limiting. For both ecotypes, we used replicate clones drawn from a number of separate populations. Fourteen genes were differentially regulated with respect to resources, including genes involved in gut processes, resource allocation, and activities with no obvious connection to resource exploitation. Three genes were differentially regulated in both ecotypes; the others may play a role in ecological divergence. Genes clearly linked to gut processes include two peritrophic matrix proteins, a Niemann-Pick type C2 gene, and a chymotrypsin. A pancreatic lipase, an epoxide hydrolase, a neuroparsin, and an UDP-dependent glucuronyltransferase are potentially involved in resource allocation through effects on energy processing and storage or hormone pathways. We performed quantitative rt-PCR for eight genes in independent samples of three clones of each of the two ecotypes. Though these largely confirmed observed differential regulation, some genes' expression was highly variable among clones. Our results demonstrate the value of matching the level of biological replication in

  5. Population genomics of resource exploitation: insights from gene expression profiles of two Daphnia ecotypes fed alternate resources

    PubMed Central

    Dudycha, Jeffry L; Brandon, Christopher S; Deitz, Kevin C

    2012-01-01

    Consumer–resource interactions are a central issue in evolutionary and community ecology because they play important roles in selection and population regulation. Most consumers encounter resource variation at multiple scales, and respond through phenotypic plasticity in the short term or evolutionary divergence in the long term. The key traits for these responses may influence resource acquisition, assimilation, and/or allocation. To identify relevant candidate genes, we experimentally assayed genome-wide gene expression in pond and lake Daphnia ecotypes exposed to alternate resource environments. One was a simple, high-quality laboratory diet, Ankistrodesmus falcatus. The other was the complex natural seston from a large lake. In temporary ponds, Daphnia generally experience high-quality, abundant resources, whereas lakes provide low-quality, seasonally shifting resources that are chronically limiting. For both ecotypes, we used replicate clones drawn from a number of separate populations. Fourteen genes were differentially regulated with respect to resources, including genes involved in gut processes, resource allocation, and activities with no obvious connection to resource exploitation. Three genes were differentially regulated in both ecotypes; the others may play a role in ecological divergence. Genes clearly linked to gut processes include two peritrophic matrix proteins, a Niemann–Pick type C2 gene, and a chymotrypsin. A pancreatic lipase, an epoxide hydrolase, a neuroparsin, and an UDP-dependent glucuronyltransferase are potentially involved in resource allocation through effects on energy processing and storage or hormone pathways. We performed quantitative rt-PCR for eight genes in independent samples of three clones of each of the two ecotypes. Though these largely confirmed observed differential regulation, some genes’ expression was highly variable among clones. Our results demonstrate the value of matching the level of biological replication

  6. An initial strategy for the systematic identification of functional elements in the human genome by low-redundancy comparative sequencing

    PubMed Central

    Margulies, Elliott H.; Vinson, Jade P.; Miller, Webb; Jaffe, David B.; Lindblad-Toh, Kerstin; Chang, Jean L.; Green, Eric D.; Lander, Eric S.; Mullikin, James C.; Clamp, Michele

    2005-01-01

    With the recent completion of a high-quality sequence of the human genome, the challenge is now to understand the functional elements that it encodes. Comparative genomic analysis offers a powerful approach for finding such elements by identifying sequences that have been highly conserved during evolution. Here, we propose an initial strategy for detecting such regions by generating low-redundancy sequence from a collection of 16 eutherian mammals, beyond the 7 for which genome sequence data are already available. We show that such sequence can be accurately aligned to the human genome and used to identify most of the highly conserved regions. Although not a long-term substitute for generating high-quality genomic sequences from many mammalian species, this strategy represents a practical initial approach for rapidly annotating the most evolutionarily conserved sequences in the human genome, providing a key resource for the systematic study of human genome function. PMID:15778292

  7. The Protein Information Resource: an integrated public resource of functional annotation of proteins

    PubMed Central

    Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.

    2002-01-01

    The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247

  8. Characterizing genomic alterations in cancer by complementary functional associations

    PubMed Central

    Kim, J. W.; Botvinnik, O. B.; Abudayyeh, O.; Birger, C.; Rosenbluh, J.; Shrestha, Y.; Abazeed, M. E.; Hammerman, P. S.; DiCara, D.; Konieczkowski, D. J.; Johannessen, C. M.; Liberzon, A.; Alizad-Rahvar, A. R.; Alexe, G.; Aguirre, A.; Ghandi, M.; Greulich, H.; Vazquez, F.; Weir, B. A.; Van Allen, E. M.; Tsherniak, A.; Shao, D. D.; Zack, T. I.; Noble, M.; Getz, G.; Beroukhim, R.; Garraway, L. A.; Ardakani, M.; Romualdi, C.; Sales, G.; Barbie, D. A.; Boehm, J. S.; Hahn, W. C.; Mesirov, J. P.; Tamayo, P.

    2016-01-01

    Systematic efforts to sequence the cancer genome have identified large numbers of relevant mutations and copy number alterations in human cancers; however, elucidating their functional consequences, and their interactions to drive or maintain oncogenic states, is still a significant challenge. Here we introduce REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene-dependency of oncogenic pathways or the sensitivity to a drug treatment. We use REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes. PMID:27088724

  9. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  10. Neuroscience in the era of functional genomics and systems biology.

    PubMed

    Geschwind, Daniel H; Konopka, Genevieve

    2009-10-15

    Advances in genetics and genomics have fuelled a revolution in discovery-based, or hypothesis-generating, research that provides a powerful complement to the more directly hypothesis-driven molecular, cellular and systems neuroscience. Genetic and functional genomic studies have already yielded important insights into neuronal diversity and function, as well as disease. One of the most exciting and challenging frontiers in neuroscience involves harnessing the power of large-scale genetic, genomic and phenotypic data sets, and the development of tools for data integration and mining. Methods for network analysis and systems biology offer the promise of integrating these multiple levels of data, connecting molecular pathways to nervous system function.

  11. Partnering for functional genomics research conference: Abstracts of poster presentations

    SciTech Connect

    1998-06-01

    This reports contains abstracts of poster presentations presented at the Functional Genomics Research Conference held April 16--17, 1998 in Oak Ridge, Tennessee. Attention is focused on the following areas: mouse mutagenesis and genomics; phenotype screening; gene expression analysis; DNA analysis technology development; bioinformatics; comparative analyses of mouse, human, and yeast sequences; and pilot projects to evaluate methodologies.

  12. Decoding the ecological function of accessory genome

    USDA-ARS?s Scientific Manuscript database

    Shiga toxin-producing Escherichia coli O157:H7 primarily resides in cattle asymptomatically, and can be transmitted to humans through food. A study by Lupolova et al applied a machine-learning approach to complex pan-genome information and predicted that only a small subset of bovine isolates have t...

  13. From Loci to Biology: Functional Genomics of Genome-Wide Association for Coronary Disease

    PubMed Central

    Nurnberg, Sylvia T; Zhang, Hanrui; Hand, Nicholas J; Bauer, Robert C; Saleheen, Danish; Reilly, Muredach P; Rader, Daniel J

    2016-01-01

    Genome-wide association studies (GWAS) have provided a rich collection of ~58 CAD loci that suggest the existence of previously unsuspected new biology relevant to atherosclerosis. However, these studies only identify genomic loci associated with CAD and many questions remain even after a genomic locus is definitively implicated, including the nature of the causal variant(s) and the causal gene(s), as well as the directionality of effect. There are a number of tools that can be employed for investigation of the functional genomics of these loci, and progress has been made on a limited number of novel CAD loci. New biology regarding atherosclerosis and CAD will be learned through the functional genomics of these loci and the hope is that at least some of these new pathways relevant to CAD pathogenesis will yield new therapeutic targets for the prevention and treatment of CAD. PMID:26892960

  14. A genome-wide resource for the analysis of protein localisation in Drosophila

    PubMed Central

    Sarov, Mihail; Barz, Christiane; Jambor, Helena; Hein, Marco Y; Schmied, Christopher; Suchold, Dana; Stender, Bettina; Janosch, Stephan; KJ, Vinay Vikas; Krishnan, RT; Krishnamoorthy, Aishwarya; Ferreira, Irene RS; Ejsmont, Radoslaw K; Finkl, Katja; Hasse, Susanne; Kämpfer, Philipp; Plewka, Nicole; Vinis, Elisabeth; Schloissnig, Siegfried; Knust, Elisabeth; Hartenstein, Volker; Mann, Matthias; Ramaswami, Mani; VijayRaghavan, K; Tomancak, Pavel; Schnorrer, Frank

    2016-01-01

    The Drosophila genome contains >13000 protein-coding genes, the majority of which remain poorly investigated. Important reasons include the lack of antibodies or reporter constructs to visualise these proteins. Here, we present a genome-wide fosmid library of 10000 GFP-tagged clones, comprising tagged genes and most of their regulatory information. For 880 tagged proteins, we created transgenic lines, and for a total of 207 lines, we assessed protein expression and localisation in ovaries, embryos, pupae or adults by stainings and live imaging approaches. Importantly, we visualised many proteins at endogenous expression levels and found a large fraction of them localising to subcellular compartments. By applying genetic complementation tests, we estimate that about two-thirds of the tagged proteins are functional. Moreover, these tagged proteins enable interaction proteomics from developing pupae and adult flies. Taken together, this resource will boost systematic analysis of protein expression and localisation in various cellular and developmental contexts. DOI: http://dx.doi.org/10.7554/eLife.12068.001 PMID:26896675

  15. Chemical genomics for studying parasite gene function and interaction

    PubMed Central

    Li, Jian; Yuan, Jing; Chen, Chin-chien; Inglese, James; Su, Xin-zhuan

    2013-01-01

    With the development of new technologies in genome sequencing, gene expression profiling, genotyping, and high-throughput screening of chemical compound libraries, small molecules are playing increasingly important roles in studying gene expression regulation, gene-gene interaction, and gene function. Here we briefly review and discuss some recent advancements in drug target identification and phenotype characterization using combinations of high-throughput screening of small-molecule libraries and various genome-wide methods such as whole genome sequencing, genome-wide association studies, and genome-wide expressional analysis. These approaches can be used to search for new drugs against parasitic infections, to identify drug targets or drug-resistance genes, and to infer gene function. PMID:24215777

  16. Genetic Resources, Genome Mapping and Evolutionary Genomics of the Pig (Sus scrofa)

    PubMed Central

    Chen, Kefei; Baxter, Tara; Muir, William M.; Groenen, Martien A.; Schook, Lawrence B.

    2007-01-01

    The pig, a representative of the artiodactyla clade, is one of the first animals domesticated, and has become an important agriculture animal as one of the major human nutritional sources of animal based protein. The pig is also a valuable biomedical model organism for human health. The pig's importance to human health and nutrition is reflected in the decision to sequence its genome (3X). As an animal species with its wild ancestors present in the world, the pig provides a unique opportunity for tracing mammalian evolutionary history and defining signatures of selection resulting from both domestication and natural selection. Completion of the pig genome sequencing project will have significant impacts on both agriculture and human health. Following the pig whole genome sequence drafts, along with large-scale polymorphism data, it will be possible to conduct genome sweeps using association mapping, and identify signatures of selection. Here, we provide a description of the pig genome sequencing project and perspectives on utilizing genomic technologies to exploit pig genome evolution and the molecular basis for phenotypic traits for improving pig production and health. PMID:17384734

  17. Characterizing genomic alterations in cancer by complementary functional associations | Office of Cancer Genomics

    Cancer.gov

    Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment.

  18. Construction of genome-wide physical BAC contigs using mapped cDNA as probes: Toward an integrated BAC library resource for genome sequencing and analysis. Annual report, July 1995--January 1997

    SciTech Connect

    Mitchell, S.C.; Bocskai, D.; Cao, Y.

    1997-12-31

    The goal of human genome project is to characterize and sequence entire genomes of human and several model organisms, thus providing complete sets of information on the entire structure of transcribed, regulatory and other functional regions for these organisms. In the past years, a number of useful genetic and physical markers on human and mouse genomes have been made available along with the advent of BAC library resources for these organisms. The advances in technology and resource development made it feasible to efficiently construct genome-wide physical BAC contigs for human and other genomes. Currently, over 30,000 mapped STSs and 27,000 mapped Unigenes are available for human genome mapping. ESTs and cDNAs are excellent resources for building contig maps for two reasons. Firstly, they exist in two alternative forms--as both sequence information for PCR primer pairs, and cDoreen genomic libraries efficiently for large number of DNA probes by combining over 100 cDNA probes in each hybridization. Second, the linkage and order of genes are rather conserved among human, mouse and other model organisms. Therefore, gene markers have advantages over random anonymous STSs in building maps for comparative genomic studies.

  19. Phenotype-Genotype Integrator (PheGenI): synthesizing genome-wide association study (GWAS) data with existing genomic resources.

    PubMed

    Ramos, Erin M; Hoffman, Douglas; Junkins, Heather A; Maglott, Donna; Phan, Lon; Sherry, Stephen T; Feolo, Mike; Hindorff, Lucia A

    2014-01-01

    Rapidly accumulating data from genome-wide association studies (GWASs) and other large-scale studies are most useful when synthesized with existing databases. To address this opportunity, we developed the Phenotype-Genotype Integrator (PheGenI), a user-friendly web interface that integrates various National Center for Biotechnology Information (NCBI) genomic databases with association data from the National Human Genome Research Institute GWAS Catalog and supports downloads of search results. Here, we describe the rationale for and development of this resource. Integrating over 66,000 association records with extensive single nucleotide polymorphism (SNP), gene, and expression quantitative trait loci data already available from the NCBI, PheGenI enables deeper investigation and interrogation of SNPs associated with a wide range of traits, facilitating the examination of the relationships between genetic variation and human diseases.

  20. Integrative Transcriptome, Genome and Quantitative Trait Loci Resources Identify Single Nucleotide Polymorphisms in Candidate Genes for Growth Traits in Turbot.

    PubMed

    Robledo, Diego; Fernández, Carlos; Hermida, Miguel; Sciara, Andrés; Álvarez-Dios, José Antonio; Cabaleiro, Santiago; Caamaño, Rubén; Martínez, Paulino; Bouza, Carmen

    2016-02-17

    Growth traits represent a main goal in aquaculture breeding programs and may be related to adaptive variation in wild fisheries. Integrating quantitative trait loci (QTL) mapping and next generation sequencing can greatly help to identify variation in candidate genes, which can result in marker-assisted selection and better genetic structure information. Turbot is a commercially important flatfish in Europe and China, with available genomic information on QTLs and genome mapping. Muscle and liver RNA-seq from 18 individuals was carried out to obtain gene sequences and markers functionally related to growth, resulting in a total of 20,447 genes and 85,344 single nucleotide polymorphisms (SNPs). Many growth-related genes and SNPs were identified and placed in the turbot genome and genetic map to explore their co-localization with growth-QTL markers. Forty-five SNPs on growth-related genes were selected based on QTL co-localization and relevant function for growth traits. Forty-three SNPs were technically feasible and validated in a wild Atlantic population, where 91% were polymorphic. The integration of functional and structural genomic resources in turbot provides a practical approach for QTL mining in this species. Validated SNPs represent a useful set of growth-related gene markers for future association, functional and population studies in this flatfish species.

  1. Integrative Transcriptome, Genome and Quantitative Trait Loci Resources Identify Single Nucleotide Polymorphisms in Candidate Genes for Growth Traits in Turbot

    PubMed Central

    Robledo, Diego; Fernández, Carlos; Hermida, Miguel; Sciara, Andrés; Álvarez-Dios, José Antonio; Cabaleiro, Santiago; Caamaño, Rubén; Martínez, Paulino; Bouza, Carmen

    2016-01-01

    Growth traits represent a main goal in aquaculture breeding programs and may be related to adaptive variation in wild fisheries. Integrating quantitative trait loci (QTL) mapping and next generation sequencing can greatly help to identify variation in candidate genes, which can result in marker-assisted selection and better genetic structure information. Turbot is a commercially important flatfish in Europe and China, with available genomic information on QTLs and genome mapping. Muscle and liver RNA-seq from 18 individuals was carried out to obtain gene sequences and markers functionally related to growth, resulting in a total of 20,447 genes and 85,344 single nucleotide polymorphisms (SNPs). Many growth-related genes and SNPs were identified and placed in the turbot genome and genetic map to explore their co-localization with growth-QTL markers. Forty-five SNPs on growth-related genes were selected based on QTL co-localization and relevant function for growth traits. Forty-three SNPs were technically feasible and validated in a wild Atlantic population, where 91% were polymorphic. The integration of functional and structural genomic resources in turbot provides a practical approach for QTL mining in this species. Validated SNPs represent a useful set of growth-related gene markers for future association, functional and population studies in this flatfish species. PMID:26901189

  2. Genomic resources for Myzus persicae: EST sequencing, SNP identification, and microarray design

    PubMed Central

    Ramsey, John S; Wilson, Alex CC; de Vos, Martin; Sun, Qi; Tamborindeguy, Cecilia; Winfield, Agnese; Malloch, Gaynor; Smith, Dawn M; Fenton, Brian; Gray, Stewart M; Jander, Georg

    2007-01-01

    Background The green peach aphid, Myzus persicae (Sulzer), is a world-wide insect pest capable of infesting more than 40 plant families, including many crop species. However, despite the significant damage inflicted by M. persicae in agricultural systems through direct feeding damage and by its ability to transmit plant viruses, limited genomic information is available for this species. Results Sequencing of 16 M. persicae cDNA libraries generated 26,669 expressed sequence tags (ESTs). Aphids for library construction were raised on Arabidopsis thaliana, Nicotiana benthamiana, Brassica oleracea, B. napus, and Physalis floridana (with and without Potato leafroll virus infection). The M. persicae cDNA libraries include ones made from sexual and asexual whole aphids, guts, heads, and salivary glands. In silico comparison of cDNA libraries identified aphid genes with tissue-specific expression patterns, and gene expression that is induced by feeding on Nicotiana benthamiana. Furthermore, 2423 genes that are novel to science and potentially aphid-specific were identified. Comparison of cDNA data from three aphid lineages identified single nucleotide polymorphisms that can be used as genetic markers and, in some cases, may represent functional differences in the protein products. In particular, non-conservative amino acid substitutions in a highly expressed gut protease may be of adaptive significance for M. persicae feeding on different host plants. The Agilent eArray platform was used to design an M. persicae oligonucleotide microarray representing over 10,000 unique genes. Conclusion New genomic resources have been developed for M. persicae, an agriculturally important insect pest. These include previously unknown sequence data, a collection of expressed genes, molecular markers, and a DNA microarray that can be used to study aphid gene expression. These resources will help elucidate the adaptations that allow M. persicae to develop compatible interactions with its

  3. Defining functional DNA elements in the human genome.

    PubMed

    Kellis, Manolis; Wold, Barbara; Snyder, Michael P; Bernstein, Bradley E; Kundaje, Anshul; Marinov, Georgi K; Ward, Lucas D; Birney, Ewan; Crawford, Gregory E; Dekker, Job; Dunham, Ian; Elnitski, Laura L; Farnham, Peggy J; Feingold, Elise A; Gerstein, Mark; Giddings, Morgan C; Gilbert, David M; Gingeras, Thomas R; Green, Eric D; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D; Myers, Richard M; Pazin, Michael J; Ren, Bing; Stamatoyannopoulos, John A; Weng, Zhiping; White, Kevin P; Hardison, Ross C

    2014-04-29

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.

  4. Defining functional DNA elements in the human genome

    PubMed Central

    Kellis, Manolis; Wold, Barbara; Snyder, Michael P.; Bernstein, Bradley E.; Kundaje, Anshul; Marinov, Georgi K.; Ward, Lucas D.; Birney, Ewan; Crawford, Gregory E.; Dekker, Job; Dunham, Ian; Elnitski, Laura L.; Farnham, Peggy J.; Feingold, Elise A.; Gerstein, Mark; Giddings, Morgan C.; Gilbert, David M.; Gingeras, Thomas R.; Green, Eric D.; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D.; Myers, Richard M.; Pazin, Michael J.; Ren, Bing; Stamatoyannopoulos, John A.; Weng, Zhiping; White, Kevin P.; Hardison, Ross C.

    2014-01-01

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease. PMID:24753594

  5. Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence-Function Space and Genome Context to Discover Novel Functions.

    PubMed

    Gerlt, John A

    2017-08-22

    The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.

  6. The Global Invertebrate Genomics Alliance (GIGA): Developing Community Resources to Study Diverse Invertebrate Genomes

    PubMed Central

    2014-01-01

    Over 95% of all metazoan (animal) species comprise the “invertebrates,” but very few genomes from these organisms have been sequenced. We have, therefore, formed a “Global Invertebrate Genomics Alliance” (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site (http://giga.nova.edu) has been launched to facilitate this collaborative venture. PMID:24336862

  7. The Global Invertebrate Genomics Alliance (GIGA): developing community resources to study diverse invertebrate genomes.

    PubMed

    Bracken-Grissom, Heather; Collins, Allen G; Collins, Timothy; Crandall, Keith; Distel, Daniel; Dunn, Casey; Giribet, Gonzalo; Haddock, Steven; Knowlton, Nancy; Martindale, Mark; Medina, Mónica; Messing, Charles; O'Brien, Stephen J; Paulay, Gustav; Putnam, Nicolas; Ravasi, Timothy; Rouse, Greg W; Ryan, Joseph F; Schulze, Anja; Wörheide, Gert; Adamska, Maja; Bailly, Xavier; Breinholt, Jesse; Browne, William E; Diaz, M Christina; Evans, Nathaniel; Flot, Jean-François; Fogarty, Nicole; Johnston, Matthew; Kamel, Bishoy; Kawahara, Akito Y; Laberge, Tammy; Lavrov, Dennis; Michonneau, François; Moroz, Leonid L; Oakley, Todd; Osborne, Karen; Pomponi, Shirley A; Rhodes, Adelaide; Santos, Scott R; Satoh, Nori; Thacker, Robert W; Van de Peer, Yves; Voolstra, Christian R; Welch, David Mark; Winston, Judith; Zhou, Xin

    2014-01-01

    Over 95% of all metazoan (animal) species comprise the "invertebrates," but very few genomes from these organisms have been sequenced. We have, therefore, formed a "Global Invertebrate Genomics Alliance" (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site (http://giga.nova.edu) has been launched to facilitate this collaborative venture.

  8. The draft genome of the pest tephritid fruit fly Bactrocera tryoni: resources for the genomic analysis of hybridising species.

    PubMed

    Gilchrist, Anthony Stuart; Shearman, Deborah C A; Frommer, Marianne; Raphael, Kathryn A; Deshpande, Nandan P; Wilkins, Marc R; Sherwin, William B; Sved, John A

    2014-12-20

    The tephritid fruit flies include a number of economically important pests of horticulture, with a large accumulated body of research on their biology and control. Amongst the Tephritidae, the genus Bactrocera, containing over 400 species, presents various species groups of potential utility for genetic studies of speciation, behaviour or pest control. In Australia, there exists a triad of closely-related, sympatric Bactrocera species which do not mate in the wild but which, despite distinct morphologies and behaviours, can be force-mated in the laboratory to produce fertile hybrid offspring. To exploit the opportunities offered by genomics, such as the efficient identification of genetic loci central to pest behaviour and to the earliest stages of speciation, investigators require genomic resources for future investigations. We produced a draft de novo genome assembly of Australia's major tephritid pest species, Bactrocera tryoni. The male genome (650-700 Mbp) includes approximately 150 Mb of interspersed repetitive DNA sequences and 60 Mb of satellite DNA. Assessment using conserved core eukaryotic sequences indicated 98% completeness. Over 16,000 MAKER-derived gene models showed a large degree of overlap with other Dipteran reference genomes. The sequence of the ribosomal RNA transcribed unit was also determined. Unscaffolded assemblies of B. neohumeralis and B. jarvisi were then produced; comparison with B. tryoni showed that the species are more closely related than any Drosophila species pair. The similarity of the genomes was exploited to identify 4924 potentially diagnostic indels between the species, all of which occur in non-coding regions. This first draft B. tryoni genome resembles other dipteran genomes in terms of size and putative coding sequences. For all three species included in this study, we have identified a comprehensive set of non-redundant repetitive sequences, including the ribosomal RNA unit, and have quantified the major satellite DNA

  9. Common-pool resources and population genomics in Iceland, Estonia, and Tonga.

    PubMed

    Barker, Jeffrey H

    2003-01-01

    This paper addresses the application of the ethical concept of trust and the legal and political concept of public trust to population genomics projects in Iceland, Estonia, and Tonga. Focusing on trust and public trust, the paper explores analogies between the genomics projects and the treatment of other common-pool resources, making use of the notion of trust as an ethical demand, derived from the works of Emmanuel Levinas and Knud Eljer Løgstrup. The paper discusses the degree to which the ethical demands for trust and public trust have been established and maintained in the three national population genomics projects.

  10. Coordinated international action to accelerate genome-to-phenome with FAANG, The Functional Annotation of Animal Genomes project

    USDA-ARS?s Scientific Manuscript database

    We describe the organization of a nascent international effort - the "Functional Annotation of ANimal Genomes" project - whose aim is to produce comprehensive maps of functional elements in the genomes of domesticated animal species....

  11. Functional coverage of the human genome by existing structures, structural genomics targets, and homology models.

    PubMed

    Xie, Lei; Bourne, Philip E

    2005-08-01

    The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  12. Determining protein function and interaction from genome analysis

    DOEpatents

    Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.

    2004-08-03

    A computational method system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.

  13. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.

  14. Transcriptome and genome sequencing uncovers functional variation in humans.

    PubMed

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; 't Hoen, Peter A C; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk P J; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Angel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-09-26

    Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

  15. A Protocol for Generating and Exchanging (Genome-Scale) Metabolic Resource Allocation Models.

    PubMed

    Reimers, Alexandra-M; Lindhorst, Henning; Waldherr, Steffen

    2017-09-06

    In this article, we present a protocol for generating a complete (genome-scale) metabolic resource allocation model, as well as a proposal for how to represent such models in the systems biology markup language (SBML). Such models are used to investigate enzyme levels and achievable growth rates in large-scale metabolic networks. Although the idea of metabolic resource allocation studies has been present in the field of systems biology for some years, no guidelines for generating such a model have been published up to now. This paper presents step-by-step instructions for building a (dynamic) resource allocation model, starting with prerequisites such as a genome-scale metabolic reconstruction, through building protein and noncatalytic biomass synthesis reactions and assigning turnover rates for each reaction. In addition, we explain how one can use SBML level 3 in combination with the flux balance constraints and our resource allocation modeling annotation to represent such models.

  16. BambooGDB: a bamboo genome database with functional annotation and an analysis platform

    PubMed Central

    Zhao, Hansheng; Peng, Zhenhua; Fei, Benhua; Li, Lubin; Hu, Tao; Gao, Zhimin; Jiang, Zehui

    2014-01-01

    Bamboo, as one of the most important non-timber forest products and fastest-growing plants in the world, represents the only major lineage of grasses that is native to forests. Recent success on the first high-quality draft genome sequence of moso bamboo (Phyllostachys edulis) provides new insights on bamboo genetics and evolution. To further extend our understanding on bamboo genome and facilitate future studies on the basis of previous achievements, here we have developed BambooGDB, a bamboo genome database with functional annotation and analysis platform. The de novo sequencing data, together with the full-length complementary DNA and RNA-seq data of moso bamboo composed the main contents of this database. Based on these sequence data, a comprehensively functional annotation for bamboo genome was made. Besides, an analytical platform composed of comparative genomic analysis, protein–protein interactions network, pathway analysis and visualization of genomic data was also constructed. As discovery tools to understand and identify biological mechanisms of bamboo, the platform can be used as a systematic framework for helping and designing experiments for further validation. Moreover, diverse and powerful search tools and a convenient browser were incorporated to facilitate the navigation of these data. As far as we know, this is the first genome database for bamboo. Through integrating high-throughput sequencing data, a full functional annotation and several analysis modules, BambooGDB aims to provide worldwide researchers with a central genomic resource and an extensible analysis platform for bamboo genome. BambooGDB is freely available at http://www.bamboogdb.org/. Database URL: http://www.bamboogdb.org PMID:24602877

  17. BambooGDB: a bamboo genome database with functional annotation and an analysis platform.

    PubMed

    Zhao, Hansheng; Peng, Zhenhua; Fei, Benhua; Li, Lubin; Hu, Tao; Gao, Zhimin; Jiang, Zehui

    2014-01-01

    Bamboo, as one of the most important non-timber forest products and fastest-growing plants in the world, represents the only major lineage of grasses that is native to forests. Recent success on the first high-quality draft genome sequence of moso bamboo (Phyllostachys edulis) provides new insights on bamboo genetics and evolution. To further extend our understanding on bamboo genome and facilitate future studies on the basis of previous achievements, here we have developed BambooGDB, a bamboo genome database with functional annotation and analysis platform. The de novo sequencing data, together with the full-length complementary DNA and RNA-seq data of moso bamboo composed the main contents of this database. Based on these sequence data, a comprehensively functional annotation for bamboo genome was made. Besides, an analytical platform composed of comparative genomic analysis, protein-protein interactions network, pathway analysis and visualization of genomic data was also constructed. As discovery tools to understand and identify biological mechanisms of bamboo, the platform can be used as a systematic framework for helping and designing experiments for further validation. Moreover, diverse and powerful search tools and a convenient browser were incorporated to facilitate the navigation of these data. As far as we know, this is the first genome database for bamboo. Through integrating high-throughput sequencing data, a full functional annotation and several analysis modules, BambooGDB aims to provide worldwide researchers with a central genomic resource and an extensible analysis platform for bamboo genome. BambooGDB is freely available at http://www.bamboogdb.org/. Database URL: http://www.bamboogdb.org.

  18. What's that gene (or protein)? Online resources for exploring functions of genes, transcripts, and proteins

    PubMed Central

    Hutchins, James R. A.

    2014-01-01

    The genomic era has enabled research projects that use approaches including genome-scale screens, microarray analysis, next-generation sequencing, and mass spectrometry–based proteomics to discover genes and proteins involved in biological processes. Such methods generate data sets of gene, transcript, or protein hits that researchers wish to explore to understand their properties and functions and thus their possible roles in biological systems of interest. Recent years have seen a profusion of Internet-based resources to aid this process. This review takes the viewpoint of the curious biologist wishing to explore the properties of protein-coding genes and their products, identified using genome-based technologies. Ten key questions are asked about each hit, addressing functions, phenotypes, expression, evolutionary conservation, disease association, protein structure, interactors, posttranslational modifications, and inhibitors. Answers are provided by presenting the latest publicly available resources, together with methods for hit-specific and data set–wide information retrieval, suited to any genome-based analytical technique and experimental species. The utility of these resources is demonstrated for 20 factors regulating cell proliferation. Results obtained using some of these are discussed in more depth using the p53 tumor suppressor as an example. This flexible and universally applicable approach for characterizing experimental hits helps researchers to maximize the potential of their projects for biological discovery. PMID:24723265

  19. Apollo: a community resource for genome annotation editing

    PubMed Central

    Ed, Lee; Nomi, Harris; Mark, Gibson; Raymond, Chetty; Suzanna, Lewis

    2009-01-01

    Summary: Apollo is a genome annotation-editing tool with an easy to use graphical interface. It is a component of the GMOD project, with ongoing development driven by the community. Recent additions to the software include support for the generic feature format version 3 (GFF3), continuous transcriptome data, a full Chado database interface, integration with remote services for on-the-fly BLAST and Primer BLAST analyses, graphical interfaces for configuring user preferences and full undo of all edit operations. Apollo's user community continues to grow, including its use as an educational tool for college and high-school students. Availability: Apollo is a Java application distributed under a free and open source license. Installers for Windows, Linux, Unix, Solaris and Mac OS X are available at http://apollo.berkeleybop.org, and the source code is available from the SourceForge CVS repository at http://gmod.cvs.sourceforge.net/gmod/apollo. Contact: elee@berkeleybop.org PMID:19439563

  20. A High-Definition View of Functional Genetic Variation from Natural Yeast Genomes

    PubMed Central

    Bergström, Anders; Simpson, Jared T.; Salinas, Francisco; Barré, Benjamin; Parts, Leopold; Zia, Amin; Nguyen Ba, Alex N.; Moses, Alan M.; Louis, Edward J.; Mustonen, Ville; Warringer, Jonas; Durbin, Richard; Liti, Gianni

    2014-01-01

    The question of how genetic variation in a population influences phenotypic variation and evolution is of major importance in modern biology. Yet much is still unknown about the relative functional importance of different forms of genome variation and how they are shaped by evolutionary processes. Here we address these questions by population level sequencing of 42 strains from the budding yeast Saccharomyces cerevisiae and its closest relative S. paradoxus. We find that genome content variation, in the form of presence or absence as well as copy number of genetic material, is higher within S. cerevisiae than within S. paradoxus, despite genetic distances as measured in single-nucleotide polymorphisms being vastly smaller within the former species. This genome content variation, as well as loss-of-function variation in the form of premature stop codons and frameshifting indels, is heavily enriched in the subtelomeres, strongly reinforcing the relevance of these regions to functional evolution. Genes affected by these likely functional forms of variation are enriched for functions mediating interaction with the external environment (sugar transport and metabolism, flocculation, metal transport, and metabolism). Our results and analyses provide a comprehensive view of genomic diversity in budding yeast and expose surprising and pronounced differences between the variation within S. cerevisiae and that within S. paradoxus. We also believe that the sequence data and de novo assemblies will constitute a useful resource for further evolutionary and population genomics studies. PMID:24425782

  1. A high-definition view of functional genetic variation from natural yeast genomes.

    PubMed

    Bergström, Anders; Simpson, Jared T; Salinas, Francisco; Barré, Benjamin; Parts, Leopold; Zia, Amin; Nguyen Ba, Alex N; Moses, Alan M; Louis, Edward J; Mustonen, Ville; Warringer, Jonas; Durbin, Richard; Liti, Gianni

    2014-04-01

    The question of how genetic variation in a population influences phenotypic variation and evolution is of major importance in modern biology. Yet much is still unknown about the relative functional importance of different forms of genome variation and how they are shaped by evolutionary processes. Here we address these questions by population level sequencing of 42 strains from the budding yeast Saccharomyces cerevisiae and its closest relative S. paradoxus. We find that genome content variation, in the form of presence or absence as well as copy number of genetic material, is higher within S. cerevisiae than within S. paradoxus, despite genetic distances as measured in single-nucleotide polymorphisms being vastly smaller within the former species. This genome content variation, as well as loss-of-function variation in the form of premature stop codons and frameshifting indels, is heavily enriched in the subtelomeres, strongly reinforcing the relevance of these regions to functional evolution. Genes affected by these likely functional forms of variation are enriched for functions mediating interaction with the external environment (sugar transport and metabolism, flocculation, metal transport, and metabolism). Our results and analyses provide a comprehensive view of genomic diversity in budding yeast and expose surprising and pronounced differences between the variation within S. cerevisiae and that within S. paradoxus. We also believe that the sequence data and de novo assemblies will constitute a useful resource for further evolutionary and population genomics studies.

  2. Prospects and challenges for the conservation of farm animal genomic resources, 2015-2025.

    PubMed

    Bruford, Michael W; Ginja, Catarina; Hoffmann, Irene; Joost, Stéphane; Orozco-terWengel, Pablo; Alberto, Florian J; Amaral, Andreia J; Barbato, Mario; Biscarini, Filippo; Colli, Licia; Costa, Mafalda; Curik, Ino; Duruz, Solange; Ferenčaković, Maja; Fischer, Daniel; Fitak, Robert; Groeneveld, Linn F; Hall, Stephen J G; Hanotte, Olivier; Hassan, Faiz-Ul; Helsen, Philippe; Iacolina, Laura; Kantanen, Juha; Leempoel, Kevin; Lenstra, Johannes A; Ajmone-Marsan, Paolo; Masembe, Charles; Megens, Hendrik-Jan; Miele, Mara; Neuditschko, Markus; Nicolazzi, Ezequiel L; Pompanon, François; Roosen, Jutta; Sevane, Natalia; Smetko, Anamarija; Štambuk, Anamaria; Streeter, Ian; Stucki, Sylvie; Supakorn, China; Telo Da Gama, Luis; Tixier-Boichard, Michèle; Wegmann, Daniel; Zhan, Xiangjiang

    2015-01-01

    Livestock conservation practice is changing rapidly in light of policy developments, climate change and diversifying market demands. The last decade has seen a step change in technology and analytical approaches available to define, manage and conserve Farm Animal Genomic Resources (FAnGR). However, these rapid changes pose challenges for FAnGR conservation in terms of technological continuity, analytical capacity and integrative methodologies needed to fully exploit new, multidimensional data. The final conference of the ESF Genomic Resources program aimed to address these interdisciplinary problems in an attempt to contribute to the agenda for research and policy development directions during the coming decade. By 2020, according to the Convention on Biodiversity's Aichi Target 13, signatories should ensure that "…the genetic diversity of …farmed and domesticated animals and of wild relatives …is maintained, and strategies have been developed and implemented for minimizing genetic erosion and safeguarding their genetic diversity." However, the real extent of genetic erosion is very difficult to measure using current data. Therefore, this challenging target demands better coverage, understanding and utilization of genomic and environmental data, the development of optimized ways to integrate these data with social and other sciences and policy analysis to enable more flexible, evidence-based models to underpin FAnGR conservation. At the conference, we attempted to identify the most important problems for effective livestock genomic resource conservation during the next decade. Twenty priority questions were identified that could be broadly categorized into challenges related to methodology, analytical approaches, data management and conservation. It should be acknowledged here that while the focus of our meeting was predominantly around genetics, genomics and animal science, many of the practical challenges facing conservation of genomic resources are

  3. Prospects and challenges for the conservation of farm animal genomic resources, 2015-2025

    PubMed Central

    Bruford, Michael W.; Ginja, Catarina; Hoffmann, Irene; Joost, Stéphane; Orozco-terWengel, Pablo; Alberto, Florian J.; Amaral, Andreia J.; Barbato, Mario; Biscarini, Filippo; Colli, Licia; Costa, Mafalda; Curik, Ino; Duruz, Solange; Ferenčaković, Maja; Fischer, Daniel; Fitak, Robert; Groeneveld, Linn F.; Hall, Stephen J. G.; Hanotte, Olivier; Hassan, Faiz-ul; Helsen, Philippe; Iacolina, Laura; Kantanen, Juha; Leempoel, Kevin; Lenstra, Johannes A.; Ajmone-Marsan, Paolo; Masembe, Charles; Megens, Hendrik-Jan; Miele, Mara; Neuditschko, Markus; Nicolazzi, Ezequiel L.; Pompanon, François; Roosen, Jutta; Sevane, Natalia; Smetko, Anamarija; Štambuk, Anamaria; Streeter, Ian; Stucki, Sylvie; Supakorn, China; Telo Da Gama, Luis; Tixier-Boichard, Michèle; Wegmann, Daniel; Zhan, Xiangjiang

    2015-01-01

    Livestock conservation practice is changing rapidly in light of policy developments, climate change and diversifying market demands. The last decade has seen a step change in technology and analytical approaches available to define, manage and conserve Farm Animal Genomic Resources (FAnGR). However, these rapid changes pose challenges for FAnGR conservation in terms of technological continuity, analytical capacity and integrative methodologies needed to fully exploit new, multidimensional data. The final conference of the ESF Genomic Resources program aimed to address these interdisciplinary problems in an attempt to contribute to the agenda for research and policy development directions during the coming decade. By 2020, according to the Convention on Biodiversity's Aichi Target 13, signatories should ensure that “…the genetic diversity of …farmed and domesticated animals and of wild relatives …is maintained, and strategies have been developed and implemented for minimizing genetic erosion and safeguarding their genetic diversity.” However, the real extent of genetic erosion is very difficult to measure using current data. Therefore, this challenging target demands better coverage, understanding and utilization of genomic and environmental data, the development of optimized ways to integrate these data with social and other sciences and policy analysis to enable more flexible, evidence-based models to underpin FAnGR conservation. At the conference, we attempted to identify the most important problems for effective livestock genomic resource conservation during the next decade. Twenty priority questions were identified that could be broadly categorized into challenges related to methodology, analytical approaches, data management and conservation. It should be acknowledged here that while the focus of our meeting was predominantly around genetics, genomics and animal science, many of the practical challenges facing conservation of genomic resources are

  4. A whole-genome, radiation hybrid mapping resource of hexaploid wheat.

    PubMed

    Tiwari, Vijay K; Heesacker, Adam; Riera-Lizarazu, Oscar; Gunn, Hilary; Wang, Shichen; Wang, Yi; Gu, Young Q; Paux, Etienne; Koo, Dal-Hoe; Kumar, Ajay; Luo, Ming-Cheng; Lazo, Gerard; Zemetra, Robert; Akhunov, Eduard; Friebe, Bernd; Poland, Jesse; Gill, Bikram S; Kianian, Shahryar; Leonard, Jeffrey M

    2016-04-01

    Generating a contiguous, ordered reference sequence of a complex genome such as hexaploid wheat (2n = 6x = 42; approximately 17 GB) is a challenging task due to its large, highly repetitive, and allopolyploid genome. In wheat, ordering of whole-genome or hierarchical shotgun sequencing contigs is primarily based on recombination and comparative genomics-based approaches. However, comparative genomics approaches are limited to syntenic inference and recombination is suppressed within the pericentromeric regions of wheat chromosomes, thus, precise ordering of physical maps and sequenced contigs across the whole-genome using these approaches is nearly impossible. We developed a whole-genome radiation hybrid (WGRH) resource and tested it by genotyping a set of 115 randomly selected lines on a high-density single nucleotide polymorphism (SNP) array. At the whole-genome level, 26 299 SNP markers were mapped on the RH panel and provided an average mapping resolution of approximately 248 Kb/cR1500 with a total map length of 6866 cR1500 . The 7296 unique mapping bins provided a five- to eight-fold higher resolution than genetic maps used in similar studies. Most strikingly, the RH map had uniform bin resolution across the entire chromosome(s), including pericentromeric regions. Our research provides a valuable and low-cost resource for anchoring and ordering sequenced BAC and next generation sequencing (NGS) contigs. The WGRH developed for reference wheat line Chinese Spring (CS-WGRH), will be useful for anchoring and ordering sequenced BAC and NGS based contigs for assembling a high-quality, reference sequence of hexaploid wheat. Additionally, this study provides an excellent model for developing similar resources for other polyploid species. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  5. The kangaroo genome

    PubMed Central

    Wakefield, Matthew J.; Graves, Jennifer A. Marshall

    2003-01-01

    The kangaroo genome is a rich and unique resource for comparative genomics. Marsupial genetics and cytology have made significant contributions to the understanding of gene function and evolution, and increasing the availability of kangaroo DNA sequence information would provide these benefits on a genomic scale. Here we summarize the contributions from cytogenetic and genetic studies of marsupials, describe the genomic resources currently available and those being developed, and explore the benefits of a kangaroo genome project. PMID:12612602

  6. Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome

    PubMed Central

    Bergman, Casey M; Pfeiffer, Barret D; Rincón-Limas, Diego E; Hoskins, Roger A; Gnirke, Andreas; Mungall, Chris J; Wang, Adrienne M; Kronmiller, Brent; Pacleb, Joanne; Park, Soo; Stapleton, Mark; Wan, Kenneth; George, Reed A; de Jong, Pieter J; Botas, Juan; Rubin, Gerald M; Celniker, Susan E

    2002-01-01

    Background It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences. Conclusions Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone. PMID:12537575

  7. In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics

    PubMed Central

    Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

    2017-01-01

    Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563

  8. In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics.

    PubMed

    Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

    2017-01-01

    Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/.

  9. Development of a D genome specific marker resource for diploid and hexaploid wheat.

    PubMed

    Wang, Yi; Drader, Thomas; Tiwari, Vijay K; Dong, Lingli; Kumar, Ajay; Huo, Naxin; Ghavami, Farhad; Iqbal, M Javed; Lazo, Gerard R; Leonard, Jeff; Gill, Bikram S; Kianian, Shahryar F; Luo, Ming-Cheng; Gu, Yong Q

    2015-08-28

    Mapping and map-based cloning of genes that control agriculturally and economically important traits remain great challenges for plants with complex highly repetitive genomes such as those within the grass tribe, Triticeae. Mapping limitations in the Triticeae are primarily due to low frequencies of polymorphic gene markers and poor genetic recombination in certain genetic regions. Although the abundance of repetitive sequence may pose common problems in genome analysis and sequence assembly of large and complex genomes, they provide repeat junction markers with random and unbiased distribution throughout chromosomes. Hence, development of a high-throughput mapping technology that combine both gene-based and repeat junction-based markers is needed to generate maps that have better coverage of the entire genome. In this study, the available genomics resource of the diploid Aegilop tauschii, the D genome donor of bread wheat, were used to develop genome specific markers that can be applied for mapping in modern hexaploid wheat. A NimbleGen array containing both gene-based and repeat junction probe sequences derived from Ae. tauschii was developed and used to map the Chinese Spring nullisomic-tetrasomic lines and deletion bin lines of the D genome chromosomes. Based on these mapping data, we have now anchored 5,171 repeat junction probes and 10,892 gene probes, corresponding to 5,070 gene markers, to the delineated deletion bins of the D genome. The order of the gene-based markers within the deletion bins of the Chinese Spring can be inferred based on their positions on the Ae. tauschii genetic map. Analysis of the probe sequences against the Chinese Spring chromosome sequence assembly database facilitated mapping of the NimbleGen probes to the sequence contigs and allowed assignment or ordering of these sequence contigs within the deletion bins. The accumulated length of anchored sequence contigs is about 155 Mb, representing ~ 3.2 % of the D genome. A specific

  10. PSiFR: an integrated resource for prediction of protein structure and function.

    PubMed

    Pandit, Shashi B; Brylinski, Michal; Zhou, Hongyi; Gao, Mu; Arakaki, Adrian K; Skolnick, Jeffrey

    2010-03-01

    In the post-genomic era, the annotation of protein function facilitates the understanding of various biological processes. To extend the range of function annotation methods to the twilight zone of sequence identity, we have developed approaches that exploit both protein tertiary structure and/or protein sequence evolutionary relationships. To serve the scientific community, we have integrated the structure prediction tools, TASSER, TASSER-Lite and METATASSER, and the functional inference tools, FINDSITE, a structure-based algorithm for binding site prediction, Gene Ontology molecular function inference and ligand screening, EFICAz(2), a sequence-based approach to enzyme function inference and DBD-hunter, an algorithm for predicting DNA-binding proteins and associated DNA-binding residues, into a unified web resource, Protein Structure and Function prediction Resource (PSiFR). PSiFR is freely available for use on the web at http://psifr.cssb.biology.gatech.edu/

  11. Exploring the post-genomic world: differing explanatory and manipulatory functions of post-genomic sciences

    PubMed Central

    Holmes, Christina; Carlson, Siobhan M.; McDonald, Fiona; Jones, Mavis; Graham, Janice

    2016-01-01

    Richard Lewontin proposed that the ability of a scientific field to create a narrative for public understanding garners it social relevance. This article applies Lewontin's conceptual framework of the functions of science (manipulatory and explanatory) to compare and explain the current differences in perceived societal relevance of genetics/genomics and proteomics. We provide three examples to illustrate the social relevance and strong cultural narrative of genetics/genomics for which no counterpart exists for proteomics. We argue that the major difference between genetics/genomics and proteomics is that genomics has a strong explanatory function, due to the strong cultural narrative of heredity. Based on qualitative interviews and observations of proteomics conferences, we suggest that the nature of proteins, lack of public understanding, and theoretical complexity exacerbates this difference for proteomics. Lewontin's framework suggests that social scientists may find that omics sciences affect social relations in different ways than past analyses of genetics. PMID:27134568

  12. Exploring the post-genomic world: differing explanatory and manipulatory functions of post-genomic sciences.

    PubMed

    Holmes, Christina; Carlson, Siobhan M; McDonald, Fiona; Jones, Mavis; Graham, Janice

    2016-01-02

    Richard Lewontin proposed that the ability of a scientific field to create a narrative for public understanding garners it social relevance. This article applies Lewontin's conceptual framework of the functions of science (manipulatory and explanatory) to compare and explain the current differences in perceived societal relevance of genetics/genomics and proteomics. We provide three examples to illustrate the social relevance and strong cultural narrative of genetics/genomics for which no counterpart exists for proteomics. We argue that the major difference between genetics/genomics and proteomics is that genomics has a strong explanatory function, due to the strong cultural narrative of heredity. Based on qualitative interviews and observations of proteomics conferences, we suggest that the nature of proteins, lack of public understanding, and theoretical complexity exacerbates this difference for proteomics. Lewontin's framework suggests that social scientists may find that omics sciences affect social relations in different ways than past analyses of genetics.

  13. Functional genomics of lactic acid bacteria: from food to health

    PubMed Central

    2014-01-01

    Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards to the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health. PMID:25186768

  14. Functional genomics of lactic acid bacteria: from food to health.

    PubMed

    Douillard, François P; de Vos, Willem M

    2014-08-29

    Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards to the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health.

  15. The Naked Mole Rat Genome Resource: facilitating analyses of cancer and longevity-related adaptations

    PubMed Central

    Keane, Michael; Craig, Thomas; Alföldi, Jessica; Berlin, Aaron M.; Johnson, Jeremy; Seluanov, Andrei; Gorbunova, Vera; Di Palma, Federica; Lindblad-Toh, Kerstin; Church, George M.; de Magalhães, João Pedro

    2014-01-01

    Motivation: The naked mole rat (Heterocephalus glaber) is an exceptionally long-lived and cancer-resistant rodent native to East Africa. Although its genome was previously sequenced, here we report a new assembly sequenced by us with substantially higher N50 values for scaffolds and contigs. Results: We analyzed the annotation of this new improved assembly and identified candidate genomic adaptations which may have contributed to the evolution of the naked mole rat’s extraordinary traits, including in regions of p53, and the hyaluronan receptors CD44 and HMMR (RHAMM). Furthermore, we developed a freely available web portal, the Naked Mole Rat Genome Resource (http://www.naked-mole-rat.org), featuring the data and results of our analysis, to assist researchers interested in the genome and genes of the naked mole rat, and also to facilitate further studies on this fascinating species. Availability and implementation: The Naked Mole Rat Genome Resource is freely available online at http://www.naked-mole-rat.org. This resource is open source and the source code is available at https://github.com/maglab/naked-mole-rat-portal. Contact: jp@senescence.info PMID:25172923

  16. AphanoDB: a genomic resource for Aphanomyces pathogens

    PubMed Central

    Madoui, Mohammed-Amine; Gaulin, Elodie; Mathé, Catherine; San Clemente, Hélène; Couloux, Arnaud; Wincker, Patrick; Dumas, Bernard

    2007-01-01

    Background The Oomycete genus Aphanomyces comprises devastating plant and animal pathogens. However, little is known about the molecular mechanisms underlying pathogenicity of Aphanomyces species. In this study, we report on the development of a public database called AphanoDB which is dedicated to Aphanomyces genomic data. As a first step, a large collection of Expressed Sequence Tags was obtained from the legume pathogen A. euteiches, which was then processed and collected into AphanoDB. Description Two cDNA libraries of A. euteiches were created: one from mycelium growing on synthetic medium and one from mycelium grown in contact to root tissues of the model legume Medicago truncatula. From these libraries, 18,684 expressed sequence tags were obtained and assembled into 7,977 unigenes which were compared to public databases for annotation. Queries on AphanoDB allow the users to retrieve information for each unigene including similarity to known protein sequences, protein domains and Gene Ontology classification. Statistical analysis of EST frequency from the two different growth conditions was also added to the database. Conclusion AphanoDB is a public database with a user-friendly web interface. The sequence report pages are the main web interface which provides all annotation details for each unigene. These interactive sequence report pages are easily available through text, BLAST, Gene Ontology and expression profile search utilities. AphanoDB is available from URL: . PMID:18096036

  17. Functional Analysis of the Human Genome:. Study of Genetic Disease

    NASA Astrophysics Data System (ADS)

    Tsui, Lap-Chee

    2003-04-01

    I will divide my remarks into 3 parts. First, I will give a brief summary of the Human Genome Project. Second, I will describe our work on human chromosome 7 to illustrate how we could contribute to the Project and disease research. Third, I would like to bring across the argument that study of genetic disease is an integral component of the Human Genome Project. In particular, I will use cystic fibrosis as an example to elaborate why I consider disease study is a part of functional genomics.

  18. PolyCat: A Resource for Genome Categorization of Sequencing Reads From Allopolyploid Organisms

    PubMed Central

    Page, Justin T.; Gingle, Alan R.; Udall, Joshua A.

    2013-01-01

    Read mapping is a fundamental part of next-generation genomic research but is complicated by genome duplication in many plants. Categorizing DNA sequence reads into their respective genomes enables current methods to analyze polyploid genomes as if they were diploid. We present PolyCat—a pipeline for mapping and categorizing all types of next-generation sequence data produced from allopolyploid organisms. PolyCat uses GSNAP’s single-nucleotide polymorphism (SNP)-tolerant mapping to minimize the mapping efficiency bias caused by SNPs between genomes. PolyCat then uses SNPs between genomes to categorize reads according to their respective genomes. Bisulfite-treated reads have a significant reduction in nucleotide complexity because nucleotide conversion events are confounded with transition substitutions. PolyCat includes special provisions to properly handle bisulfite-treated data. We demonstrate the functionality of PolyCat on allotetraploid cotton, Gossypium hirsutum, and create a functional SNP index for efficiently mapping sequence reads to the D-genome sequence of G. raimondii. PolyCat is appropriate for all allopolyploids and all types of next-generation genome analysis, including differential expression (RNA sequencing), differential methylation (bisulfite sequencing), differential DNA-protein binding (chromatin immunoprecipitation sequencing), and population diversity. PMID:23450226

  19. Genome resources for climate-resilient cowpea, an essential crop for food security

    USDA-ARS?s Scientific Manuscript database

    Cowpea is a legume crop that is resilient in hot and drought-prone climates, and a primary source of protein in sub-Saharan Africa and other parts of the developing world. However, genome resources for cowpea have lagged behind those of most other major crop plants. Here we describe foundational g...

  20. Genome-scale resources for Thermoanaerobacterium saccharolyticum

    SciTech Connect

    Currie, Devin H.; Raman, Babu; Gowen, Christopher M.; Tschaplinski, Timothy J.; Land, Miriam L.; Brown, Steven D.; Covalla, Sean; Klingeman, Dawn Marie; Yang, Zamin Koo; Engle, Nancy L.; Johnson, Courtney M.; Rodriguez, Miguel A.; Shaw, A. Joe; Kenealy, William R.; Lynd, Lee R.; Fong, Stephen S.; Mielenz, Jonathan R.; Davison, Brian H.; Hogsett, David A.; Herring, Christopher D.

    2015-06-26

    Thermoanaerobacterium saccharolyticum is a hemicellulose-degrading thermophilic anaerobe that was previously engineered to produce ethanol at high yield. For this research, a major project was undertaken to develop this organism into an industrial biocatalyst, but the lack of genome information and resources were recognized early on as a key limitation.

  1. VectorBase: improvements to a bioinformatics resource for invertebrate vector genomics.

    PubMed

    Megy, Karine; Emrich, Scott J; Lawson, Daniel; Campbell, David; Dialynas, Emmanuel; Hughes, Daniel S T; Koscielny, Gautier; Louis, Christos; Maccallum, Robert M; Redmond, Seth N; Sheehan, Andrew; Topalis, Pantelis; Wilson, Derek

    2012-01-01

    VectorBase (http://www.vectorbase.org) is a NIAID-supported bioinformatics resource for invertebrate vectors of human pathogens. It hosts data for nine genomes: mosquitoes (three Anopheles gambiae genomes, Aedes aegypti and Culex quinquefasciatus), tick (Ixodes scapularis), body louse (Pediculus humanus), kissing bug (Rhodnius prolixus) and tsetse fly (Glossina morsitans). Hosted data range from genomic features and expression data to population genetics and ontologies. We describe improvements and integration of new data that expand our taxonomic coverage. Releases are bi-monthly and include the delivery of preliminary data for emerging genomes. Frequent updates of the genome browser provide VectorBase users with increasing options for visualizing their own high-throughput data. One major development is a new population biology resource for storing genomic variations, insecticide resistance data and their associated metadata. It takes advantage of improved ontologies and controlled vocabularies. Combined, these new features ensure timely release of multiple types of data in the public domain while helping overcome the bottlenecks of bioinformatics and annotation by engaging with our user community.

  2. The Agassiz's desert tortoise genome provides a resource for the conservation of a threatened species.

    PubMed

    Tollis, Marc; DeNardo, Dale F; Cornelius, John A; Dolby, Greer A; Edwards, Taylor; Henen, Brian T; Karl, Alice E; Murphy, Robert W; Kusumi, Kenro

    2017-01-01

    Agassiz's desert tortoise (Gopherus agassizii) is a long-lived species native to the Mojave Desert and is listed as threatened under the US Endangered Species Act. To aid conservation efforts for preserving the genetic diversity of this species, we generated a whole genome reference sequence with an annotation based on deep transcriptome sequences of adult skeletal muscle, lung, brain, and blood. The draft genome assembly for G. agassizii has a scaffold N50 length of 252 kbp and a total length of 2.4 Gbp. Genome annotation reveals 20,172 protein-coding genes in the G. agassizii assembly, and that gene structure is more similar to chicken than other turtles. We provide a series of comparative analyses demonstrating (1) that turtles are among the slowest-evolving genome-enabled reptiles, (2) amino acid changes in genes controlling desert tortoise traits such as shell development, longevity and osmoregulation, and (3) fixed variants across the Gopherus species complex in genes related to desert adaptations, including circadian rhythm and innate immune response. This G. agassizii genome reference and annotation is the first such resource for any tortoise, and will serve as a foundation for future analysis of the genetic basis of adaptations to the desert environment, allow for investigation into genomic factors affecting tortoise health, disease and longevity, and serve as a valuable resource for additional studies in this species complex.

  3. Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing

    SciTech Connect

    Nierman, William C.

    2000-02-14

    At TIGR, the human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length {approx}460 bp for a total of 141 Mb covering {approx}4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is {approx}400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers a high confidence in (1) retrieving the right clone from the BA C libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.

  4. Accurate evaluation and analysis of functional genomics data and methods

    PubMed Central

    Greene, Casey S.; Troyanskaya, Olga G.

    2016-01-01

    The development of technology capable of inexpensively performing large-scale measurements of biological systems has generated a wealth of data. Integrative analysis of these data holds the promise of uncovering gene function, regulation, and, in the longer run, understanding complex disease. However, their analysis has proved very challenging, as it is difficult to quickly and effectively assess the relevance and accuracy of these data for individual biological questions. Here, we identify biases that present challenges for the assessment of functional genomics data and methods. We then discuss evaluation methods that, taken together, begin to address these issues. We also argue that the funding of systematic data-driven experiments and of high-quality curation efforts will further improve evaluation metrics so that they more-accurately assess functional genomics data and methods. Such metrics will allow researchers in the field of functional genomics to continue to answer important biological questions in a data-driven manner. PMID:22268703

  5. A segmental genomic duplication generates a functional intron

    PubMed Central

    Hellsten, Uffe; Aspden, Julie L.; Rio, Donald C.; Rokhsar, Daniel S.

    2011-01-01

    An intron is an extended genomic feature whose function requires multiple constrained positions—donor and acceptor splice sites, a branch point, a polypyrimidine tract and suitable splicing enhancers—that may be distributed over hundreds or thousands of nucleotides. New introns are therefore unlikely to emerge by incremental accumulation of functional sub-elements. Here we demonstrate that a functional intron can be created de novo in a single step by a segmental genomic duplication. This experiment recapitulates in vivo the birth of an intron that arose in the ancestral jawed vertebrate lineage nearly half-a-billion years ago. PMID:21878908

  6. High-throughput TILLING for functional genomics.

    PubMed

    Till, Bradley J; Colbert, Trenton; Tompa, Rachel; Enns, Linda C; Codomo, Christine A; Johnson, Jessica E; Reynolds, Steven H; Henikoff, Jorja G; Greene, Elizabeth A; Steine, Michael N; Comai, Luca; Henikoff, Steven

    2003-01-01

    Targeting-induced local lesions in genomes (TILLING) is a general strategy for identifying induced point mutations that can be applied to almost any organism. Here, we describe the basic methodology for high-throughput TILLING. Gene segments are amplified using fluorescently tagged primers, and products are denatured and reannealed to form heteroduplexes between the mutated sequence and its wild-type counterpart. These heteroduplexes are substrates for cleavage by the endonuclease CEL I. Following cleavage, products are analyzed on denaturing polyacrylamide gels using the LI-COR DNA analyzer system. High-throughput TILLING has been adopted by the Arabidopsis TILLING Project (ATP) to provide allelic series of point mutations for the general Arabidopsis community.

  7. Deductive genomics: a functional approach to identify innovative drug targets in the post-genome era.

    PubMed

    Stumm, Gabriele; Russ, Andreas; Nehls, Michael

    2002-01-01

    The sequencing of the human genome has generated a drug discovery process that is based on sequence analysis and hypothesis-driven (inductive) prediction of gene function. This approach, which we term inductive genomics, is currently dominating the efforts of the pharmaceutical industry to identify new drug targets. According to recent studies, this sequence-driven discovery process is paradoxically increasing the average cost of drug development, thus falling short of the promise of the Human Genome Project to simplify the creation of much needed novel therapeutics. In the early stages of discovery, the flurry of new gene sequences makes it difficult to pick and prioritize the most promising product candidates for product development, as with existing technologies important decisions have to be based on circumstantial evidence that does not strongly predict therapeutic potential. This is because the physiological function of a potential target cannot be predicted by gene sequence analysis and in vitro technologies alone. In contrast, deductive genomics, or large-scale forward genetics, bridges the gap between sequence and function by providing a function-driven in vivo screen of a highly orthologous mammalian model genome for medically relevant physiological functions and drug targets. This approach allows drug discovery to move beyond the focus on sequence-driven identification of new members of classical drug-able protein families towards the biology-driven identification of innovative targets and biological pathways.

  8. Bluejay 1.0: genome browsing and comparison with rich customization provision and dynamic resource linking

    PubMed Central

    Soh, Jung; Gordon, Paul MK; Taschuk, Morgan L; Dong, Anguo; Ah-Seng, Andrew C; Turinsky, Andrei L; Sensen, Christoph W

    2008-01-01

    Background The Bluejay genome browser has been developed over several years to address the challenges posed by the ever increasing number of data types as well as the increasing volume of data in genome research. Beginning with a browser capable of rendering views of XML-based genomic information and providing scalable vector graphics output, we have now completed version 1.0 of the system with many additional features. Our development efforts were guided by our observation that biologists who use both gene expression profiling and comparative genomics gain functional insights above and beyond those provided by traditional per-gene analyses. Results Bluejay 1.0 is a genome viewer integrating genome annotation with: (i) gene expression information; and (ii) comparative analysis with an unlimited number of other genomes in the same view. This allows the biologist to see a gene not just in the context of its genome, but also its regulation and its evolution. Bluejay now has rich provision for personalization by users: (i) numerous display customization features; (ii) the availability of waypoints for marking multiple points of interest on a genome and subsequently utilizing them; and (iii) the ability to take user relevance feedback of annotated genes or textual items to offer personalized recommendations. Bluejay 1.0 also embeds the Seahawk browser for the Moby protocol, enabling users to seamlessly invoke hundreds of Web Services on genomic data of interest without any hard-coding. Conclusion Bluejay offers a unique set of customizable genome-browsing features, with the goal of allowing biologists to quickly focus on, analyze, compare, and retrieve related information on the parts of the genomic data they are most interested in. We expect these capabilities of Bluejay to benefit the many biologists who want to answer complex questions using the information available from completely sequenced genomes. PMID:18940007

  9. Natural selection on functional modules, a genome-wide analysis.

    PubMed

    Serra, François; Arbiza, Leonardo; Dopazo, Joaquín; Dopazo, Hernán

    2011-03-01

    Classically, the functional consequences of natural selection over genomes have been analyzed as the compound effects of individual genes. The current paradigm for large-scale analysis of adaptation is based on the observed significant deviations of rates of individual genes from neutral evolutionary expectation. This approach, which assumed independence among genes, has not been able to identify biological functions significantly enriched in positively selected genes in individual species. Alternatively, pooling related species has enhanced the search for signatures of selection. However, grouping signatures does not allow testing for adaptive differences between species. Here we introduce the Gene-Set Selection Analysis (GSSA), a new genome-wide approach to test for evidences of natural selection on functional modules. GSSA is able to detect lineage specific evolutionary rate changes in a notable number of functional modules. For example, in nine mammal and Drosophilae genomes GSSA identifies hundreds of functional modules with significant associations to high and low rates of evolution. Many of the detected functional modules with high evolutionary rates have been previously identified as biological functions under positive selection. Notably, GSSA identifies conserved functional modules with many positively selected genes, which questions whether they are exclusively selected for fitting genomes to environmental changes. Our results agree with previous studies suggesting that adaptation requires positive selection, but not every mutation under positive selection contributes to the adaptive dynamical process of the evolution of species.

  10. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python)

    PubMed Central

    Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value. PMID:27200191

  11. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python).

    PubMed

    Irizarry, Kristopher J L; Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  12. Bases and spaces: resources on the web for accessing the draft human genome - II - After publication of the draft

    PubMed Central

    Semple, Colin AM

    2001-01-01

    The volume of human genome sequence and the variety of web-based tools to access it continue to grow at an impressive rate, but a working knowledge of certain key resources can be sufficient to get the most from your genome. This article provides an update to Genome Biology 2000, 1(4):reviews2001.1-2001.5. PMID:11423014

  13. Phytozome Comparative Plant Genomics Portal

    SciTech Connect

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  14. Characterization of whole genome radiation hybrid mapping resources for non-mammalian vertebrates.

    PubMed Central

    Kwok, C; Korn, R M; Davis, M E; Burt, D W; Critcher, R; McCarthy, L; Paw, B H; Zon, L I; Goodfellow, P N; Schmitt, K

    1998-01-01

    Radiation hybrid panels are already available for genome mapping in human and mouse. In this study we have used two model organisms (chicken and zebrafish) to show that hybrid panels that contain a full complement of the donor genome can be generated by fusion to hamster cells. The quality of the resulting hybrids has been assessed using PCR and FISH. We confirmed the utility of our panels by establishing the percentage of donor DNA present in the hybrids. Our hybrid resources will allow inexpensive gene mapping and we expect that this technology can be transferred to many other species. Such successes are providing the basis for a new era of mapping tools, in the form of whole genome radiation hybrid panels, and are opening new possibilities for systematic genome analysis in the animal genetics community. PMID:9671819

  15. AphidBase: A centralized bioinformatic resource for annotation of the pea aphid genome

    PubMed Central

    Legeai, Fabrice; Shigenobu, Shuji; Gauthier, Jean-Pierre; Colbourne, John; Rispe, Claude; Collin, Olivier; Richards, Stephen; Wilson, Alex C. C.; Tagu, Denis

    2015-01-01

    AphidBase is a centralized bioinformatic resource that was developed to facilitate community annotation of the pea aphid genome by the International Aphid Genomics Consortium (IAGC). The AphidBase Information System designed to organize and distribute genomic data and annotations for a large international community was constructed using open source software tools from the Generic Model Organism Database (GMOD). The system includes Apollo and GBrowse utilities as well as a wiki, blast search capabilities and a full text search engine. AphidBase strongly supported community cooperation and coordination in the curation of gene models during community annotation of the pea aphid genome. AphidBase can be accessed at http://www.aphidbase.com. PMID:20482635

  16. Conifer genomics and adaptation: at the crossroads of genetic diversity and genome function.

    PubMed

    Prunier, Julien; Verta, Jukka-Pekka; MacKay, John J

    2016-01-01

    Conifers have been understudied at the genomic level despite their worldwide ecological and economic importance but the situation is rapidly changing with the development of next generation sequencing (NGS) technologies. With NGS, genomics research has simultaneously gained in speed, magnitude and scope. In just a few years, genomes of 20-24 gigabases have been sequenced for several conifers, with several others expected in the near future. Biological insights have resulted from recent sequencing initiatives as well as genetic mapping, gene expression profiling and gene discovery research over nearly two decades. We review the knowledge arising from conifer genomics research emphasizing genome evolution and the genomic basis of adaptation, and outline emerging questions and knowledge gaps. We discuss future directions in three areas with potential inputs from NGS technologies: the evolutionary impacts of adaptation in conifers based on the adaptation-by-speciation model; the contributions of genetic variability of gene expression in adaptation; and the development of a broader understanding of genetic diversity and its impacts on genome function. These research directions promise to sustain research aimed at addressing the emerging challenges of adaptation that face conifer trees. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  17. Advances in ecological genomics in forest trees and applications to genetic resources conservation and breeding.

    PubMed

    Holliday, Jason A; Aitken, Sally N; Cooke, Janice E K; Fady, Bruno; González-Martínez, Santiago C; Heuertz, Myriam; Jaramillo-Correa, Juan-Pablo; Lexer, Christian; Staton, Margaret; Whetten, Ross W; Plomion, Christophe

    2017-02-01

    Forest trees are an unparalleled group of organisms in their combined ecological, economic and societal importance. With widespread distributions, predominantly random mating systems and large population sizes, most tree species harbour extensive genetic variation both within and among populations. At the same time, demographic processes associated with Pleistocene climate oscillations and land-use change have affected contemporary range-wide diversity and may impinge on the potential for future adaptation. Understanding how these adaptive and neutral processes have shaped the genomes of trees species is therefore central to their management and conservation. As for many other taxa, the advent of high-throughput sequencing methods is expected to yield an understanding of the interplay between the genome and environment at a level of detail and depth not possible only a few years ago. An international conference entitled 'Genomics and Forest Tree Genetics' was held in May 2016, in Arcachon (France), and brought together forest geneticists with a wide range of research interests to disseminate recent efforts that leverage contemporary genomic tools to probe the population, quantitative and evolutionary genomics of trees. An important goal of the conference was to discuss how such data can be applied to both genome-enabled breeding and the conservation of forest genetic resources under land use and climate change. Here, we report discoveries presented at the meeting and discuss how the ecological genomic toolkit can be used to address both basic and applied questions in tree biology. © 2016 John Wiley & Sons Ltd.

  18. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics

    PubMed Central

    Patel, Ravi K.; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Chickpea is an important grain legume used as a rich source of protein in human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides the comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea are available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html. PMID:26322998

  19. Functional Genomic Analysis of C. elegans Molting

    PubMed Central

    Frand, Alison R; Russel, Sascha

    2005-01-01

    Although the molting cycle is a hallmark of insects and nematodes, neither the endocrine control of molting via size, stage, and nutritional inputs nor the enzymatic mechanism for synthesis and release of the exoskeleton is well understood. Here, we identify endocrine and enzymatic regulators of molting in C. elegans through a genome-wide RNA-interference screen. Products of the 159 genes discovered include annotated transcription factors, secreted peptides, transmembrane proteins, and extracellular matrix enzymes essential for molting. Fusions between several genes and green fluorescent protein show a pulse of expression before each molt in epithelial cells that synthesize the exoskeleton, indicating that the corresponding proteins are made in the correct time and place to regulate molting. We show further that inactivation of particular genes abrogates expression of the green fluorescent protein reporter genes, revealing regulatory networks that might couple the expression of genes essential for molting to endocrine cues. Many molting genes are conserved in parasitic nematodes responsible for human disease, and thus represent attractive targets for pesticide and pharmaceutical development. PMID:16122351

  20. Budding off: bringing functional genomics to Candida albicans

    PubMed Central

    Anderson, Matthew Z.

    2016-01-01

    Candida species are the most prevalent human fungal pathogens, with Candida albicans being the most clinically relevant species. Candida albicans resides as a commensal of the human gastrointestinal tract but is a frequent cause of opportunistic mucosal and systemic infections. Investigation of C. albicans virulence has traditionally relied on candidate gene approaches, but recent advances in functional genomics have now facilitated global, unbiased studies of gene function. Such studies include comparative genomics (both between and within Candida species), analysis of total RNA expression, and regulation and delineation of protein–DNA interactions. Additionally, large collections of mutant strains have begun to aid systematic screening of clinically relevant phenotypes. Here, we will highlight the development of functional genomics in C. albicans and discuss the use of these approaches to addressing both commensalism and pathogenesis in this species. PMID:26424829

  1. FuGE: Functional Genomics Experiment Object Model.

    PubMed

    Jones, Andrew R; Pizarro, Angel; Spellman, Paul; Miller, Michael

    2006-01-01

    This is an interim report on the Functional Genomics Experiment (FuGE) Object Model. FuGE is a framework for creating data standards for high-throughput biological experiments, developed by a consortium of researchers from academia and industry. FuGE supports rich annotation of samples, protocols, instruments, and software, as well as providing extension points for technology specific details. It has been adopted by microarray and proteomics standards bodies as a basis for forthcoming standards. It is hoped that standards developers for other omics techniques will join this collaborative effort; widespread adoption will allow uniform annotation of common parts of functional genomics workflows, reduce standard development and learning times through the sharing of consistent practice, and ease the construction of software for accessing and integrating functional genomics data.

  2. RNAi and functional genomics in plant parasitic nematodes.

    PubMed

    Rosso, M N; Jones, J T; Abad, P

    2009-01-01

    Plant nematology is currently undergoing a revolution with the availability of the first genome sequences as well as comprehensive expressed sequence tag (EST) libraries from a range of nematode species. Several strategies are being used to exploit this wealth of information. Comparative genomics is being used to explore the acquisition of novel genes associated with parasitic lifestyles. Functional analyses of nematode genes are moving toward larger scale studies including global transcriptome profiling. RNA interference (RNAi) has been shown to reduce expression of a range of plant parasitic nematode genes and is a powerful tool for functional analysis of nematode genes. RNAi-mediated suppression of genes essential for nematode development, survival, or parasitism is revealing new targets for nematode control. Plant nematology in the genomics era is now facing the challenge to develop RNAi screens adequate for high-throughput functional analyses.

  3. Budding off: bringing functional genomics to Candida albicans.

    PubMed

    Anderson, Matthew Z; Bennett, Richard J

    2016-03-01

    Candida species are the most prevalent human fungal pathogens, with Candida albicans being the most clinically relevant species. Candida albicans resides as a commensal of the human gastrointestinal tract but is a frequent cause of opportunistic mucosal and systemic infections. Investigation of C. albicans virulence has traditionally relied on candidate gene approaches, but recent advances in functional genomics have now facilitated global, unbiased studies of gene function. Such studies include comparative genomics (both between and within Candida species), analysis of total RNA expression, and regulation and delineation of protein-DNA interactions. Additionally, large collections of mutant strains have begun to aid systematic screening of clinically relevant phenotypes. Here, we will highlight the development of functional genomics in C. albicans and discuss the use of these approaches to addressing both commensalism and pathogenesis in this species. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  4. Solutions for data integration in functional genomics: a critical assessment and case study.

    PubMed

    Smedley, Damian; Swertz, Morris A; Wolstencroft, Katy; Proctor, Glenn; Zouberakis, Michael; Bard, Jonathan; Hancock, John M; Schofield, Paul

    2008-11-01

    The torrent of data emerging from the application of new technologies to functional genomics and systems biology can no longer be contained within the traditional modes of data sharing and publication with the consequence that data is being deposited in, distributed across and disseminated through an increasing number of databases. The resulting fragmentation poses serious problems for the model organism community which increasingly rely on data mining and computational approaches that require gathering of data from a range of sources. In the light of these problems, the European Commission has funded a coordination action, CASIMIR (coordination and sustainability of international mouse informatics resources), with a remit to assess the technical and social aspects of database interoperability that currently prevent the full realization of the potential of data integration in mouse functional genomics. In this article, we assess the current problems with interoperability, with particular reference to mouse functional genomics, and critically review the technologies that can be deployed to overcome them. We describe a typical use-case where an investigator wishes to gather data on variation, genomic context and metabolic pathway involvement for genes discovered in a genome-wide screen. We go on to develop an automated approach involving an in silico experimental workflow tool, Taverna, using web services, BioMart and MOLGENIS technologies for data retrieval. Finally, we focus on the current impediments to adopting such an approach in a wider context, and strategies to overcome them.

  5. Recombination between defective tombusvirus RNAs generates functional hybrid genomes

    SciTech Connect

    White, K.A.; Morris, T.J.

    1994-04-26

    The tombusviruses represent a group of small icosahedral plant viruses that contain monopartite positive-sense RNA genomes. Tombusviruses are able to generate small replicating deletion mutants of their genomes (i.e., defective interfering RNAs) during infections via RNA recombination and/or rearrangement. To further study the process of RNA recombination and to determine whether tombusviruses were capable of trans-recombination, protoplasts were coinoculated with in vitro-generated transcripts of a nonreplicating 3{prime}-truncated genomic RNA of cucumber necrosis tombusvirus and either replicative or replication-defective DI RNAs of tomato bushy stunt tombusvirus. After 48-hr incubation, two dominant replicative chimeric recombinant viral RNA populations were detected that contained various large contiguous 5{prime} segments of the cucumber necrosis tombusvirus genomic RNA fused to 3{prime}-terminal regions of the tomato bushy stunt tombusvirus defective interfering RNA. Some of the larger chimeric recombinants formed in protoplasts were able to systemically infect plants and induce wild-type symptoms. In addition, a functional chimeric genome was generated in planta after direct coinoculation of whole plants with the defective RNA components. These results indicate that (i) RNA recombination can occur relatively efficiently in single-cell infections, (ii) trans-recombination can occur with nonreplicating viral RNA components, and (iii) functional chimeric genomes can be generated via recombination. Possible mechanisms for the formation of the recombinants are proposed, and evolutionary implications are discussed.

  6. WWOX: its genomics, partners, and functions.

    PubMed

    Del Mare, Sara; Salah, Zaidoun; Aqeilan, Rami I

    2009-11-01

    The WW domain-containing oxidoreductase (WWOX) spans one of the most active common fragile sites (CFSs) involved in cancer, FRA16D. WWOX encodes a 46-kDa protein that contains two N-terminal WW domains and a central short-chain dehydrogenase/reductase (SDR) domain. Through its WW domain, Wwox interacts with its partners and modulates their functions. Our data indicate that Wwox suppresses the transactivation function of several transcription factors implied in neoplasia by sequestering them in the cytoplasm. Work from our laboratory and other research groups have demonstrated that Wwox participates in a number of cellular processes including growth, differentiation, apoptosis, and tumor suppression. Targeted deletion of the Wwox gene in mice causes increased spontaneous and chemically induced tumor incidence supporting bona fide tumor suppressor function of WWOX. Moreover, generation of the Wwox-deficient mice uncovers, at least in part, some of the physiological in vivo functions of the WWOX gene. This review focuses on recent progress that elucidates Wwox functions in biology and pathology.

  7. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells.

    PubMed

    Yang, Wanjuan; Soares, Jorge; Greninger, Patricia; Edelman, Elena J; Lightfoot, Howard; Forbes, Simon; Bindal, Nidhi; Beare, Dave; Smith, James A; Thompson, I Richard; Ramaswamy, Sridhar; Futreal, P Andrew; Haber, Daniel A; Stratton, Michael R; Benes, Cyril; McDermott, Ultan; Garnett, Mathew J

    2013-01-01

    Alterations in cancer genomes strongly influence clinical responses to treatment and in many instances are potent biomarkers for response to drugs. The Genomics of Drug Sensitivity in Cancer (GDSC) database (www.cancerRxgene.org) is the largest public resource for information on drug sensitivity in cancer cells and molecular markers of drug response. Data are freely available without restriction. GDSC currently contains drug sensitivity data for almost 75 000 experiments, describing response to 138 anticancer drugs across almost 700 cancer cell lines. To identify molecular markers of drug response, cell line drug sensitivity data are integrated with large genomic datasets obtained from the Catalogue of Somatic Mutations in Cancer database, including information on somatic mutations in cancer genes, gene amplification and deletion, tissue type and transcriptional data. Analysis of GDSC data is through a web portal focused on identifying molecular biomarkers of drug sensitivity based on queries of specific anticancer drugs or cancer genes. Graphical representations of the data are used throughout with links to related resources and all datasets are fully downloadable. GDSC provides a unique resource incorporating large drug sensitivity and genomic datasets to facilitate the discovery of new therapeutic biomarkers for cancer therapies.

  8. A new cancer genome anatomy project web resource for the community.

    PubMed

    Schaefer, C; Grouse, L; Buetow, K; Strausberg, R L

    2001-01-01

    The National Cancer Institute's Cancer Genome Anatomy Project (CGAP) is developing publicly accessible information, technology, and material resources that provide a platform for the interface of cancer research and genomics. CGAP's efforts have focused toward (1) building and annotating catalogues of genes expressed during cancer development, (2) identifying polymorphisms in those genes, and (3) developing resources for the molecular characterization of cancer-related chromosomal aberrations. To date, CGAP has produced more than 1,000,000 expressed sequence tags, approximately 3,300,000 serial analysis of gene expression tags, and identified more than 10,000 human gene-based single-nucleotide polymorphisms. To enhance access to these datasets by the research community, a new Cancer Genome Project web site (http://cgap.nci.nih.gov/) is being introduced. The web site includes genomic data for humans and mice, including transcript sequence, gene expression patterns, single-nucleotide polymorphisms, clone resources, and cytogenetic information. Descriptions of the methods and reagents used in deriving the CGAP datasets are also provided. An extensive suite of informatics tools facilitates queries and analysis of the CGAP data by the community. One of the newest features of the CGAP web site is an electronic version of the Mitelman Database of Chromosome Aberrations in Cancer.

  9. Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence–Function Space and Genome Context to Discover Novel Functions

    PubMed Central

    2017-01-01

    The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of “genomic enzymology” web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence–function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems. PMID:28826221

  10. Structure and Functional Studies on Dengue-2 Virus Genome

    DTIC Science & Technology

    1986-03-01

    AD STRUCTURE AND FUNCTIONAL STUDIES ON DENGUE -2 VIRUS GENOME FINAL Report Lfl C’) Radha Krishnan Padmanabhan, Ph.D. 0) March 1, 1986 Supported by U.S...and Functional Studies on Dengue -2 Virus Genome 12. PERSONAL AUTHOR(S) Radha Krishnan Padmanabhan 13a. TYPE OF REPORT 13b. TIME COVERED 14. DATE OF...3’-end of Dengue RNA in order to facilitate cDNA synthesis by oligo d(T) priming as proposed in the original research project. 2. We also showed that

  11. From functional genomics to systems biology: concepts and practices.

    PubMed

    Auffray, Charles; Imbeaud, Sandrine; Roux-Rouquié, Magali; Hood, Leroy

    2003-01-01

    Systems biology is the iterative and integrative study of biological systems as systems in response to perturbations. It is founded on hypotheses formalized in models built from the results of global functional genomics analyses of the complexity of the genome, transcriptome, proteome, metabolome, etc. Its implementation by cross-disciplinary teams in a standardized mode under quality assurance should allow accessing the small variations of the large number of elements determining functioning of biological systems. Galactose utilization in yeast, and sea urchin development are two examples of emerging systems biology.

  12. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  13. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  14. DFAST and DAGA: web-based integrated genome annotation tools and resources

    PubMed Central

    TANIZAWA, Yasuhiro; FUJISAWA, Takatomo; KAMINUMA, Eli; NAKAMURA, Yasukazu; ARITA, Masanori

    2016-01-01

    Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp. PMID:27867804

  15. DFAST and DAGA: web-based integrated genome annotation tools and resources.

    PubMed

    Tanizawa, Yasuhiro; Fujisawa, Takatomo; Kaminuma, Eli; Nakamura, Yasukazu; Arita, Masanori

    2016-01-01

    Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.

  16. Integrative genomics to dissect retinoid functions.

    PubMed

    Mendoza-Parra, Marco-Antonio; Gronemeyer, Hinrich

    2014-01-01

    Retinoids and rexinoids, as all other ligands of the nuclear receptor (NR) family, act as ligand-regulated trans-acting transcription factors that bind to cis-acting DNA regulatory elements in the promoter regions of target genes (for reviews see [12, 22, 23, 26, 36]). Ligand binding modulates the communication functions of the receptor with the intracellular environment, which essentially entails receptor-protein and receptor-DNA or receptor-chromatin interactions. In this communication network, the receptor simultaneously serves as both intracellular sensor and regulator of cell/organ functions. Receptors are "intelligent" mediators of the information encoded in the chemical structure of a nuclear receptor ligand, as they interpret this information in the context of cellular identity and cell-physiological status and convert it into a dynamic chain of receptor-protein and receptor-DNA interactions. To process input and output information, they are composed of a modular structure with several domains that have evolved to exert particular molecular recognition functions. As detailed in other chapters in this volume, the main functional domains are the DNA-binding (DBD) and ligand-binding (LBD) [5-7, 38, 56, 71]. The LBD serves as a dual input-output information processor. Inputs, such as ligand binding or receptor phosphorylations, induce allosteric changes in receptor surfaces that serve as docking sites for outputs, such as subunits of transcription and epigenetic machineries or enzyme complexes. The complexity of input and output signals and their interdependencies is far from being understood.

  17. Using the ENCODE Resource for Functional Annotation of Genetic Variants.

    PubMed

    Pazin, Michael J

    2015-03-11

    This article illustrates the use of the Encyclopedia of DNA Elements (ENCODE) resource to generate or refine hypotheses from genomic data on disease and other phenotypic traits. First, the goals and history of ENCODE and related epigenomics projects are reviewed. Second, the rationale for ENCODE and the major data types used by ENCODE are briefly described, as are some standard heuristics for their interpretation. Third, the use of the ENCODE resource is examined. Standard use cases for ENCODE, accessing the ENCODE resource, and accessing data from related projects are discussed. Although the focus of this article is the use of ENCODE data, some of the same approaches can be used with data from other projects.

  18. Targeted Genome-Wide Enrichment of Functional Regions

    PubMed Central

    Senapathy, Periannan; Bhasi, Ashwini; Mattox, Jeffrey; Dhandapany, Perundurai S.; Sadayappan, Sakthivel

    2010-01-01

    Only a small fraction of large genomes such as that of the human contains the functional regions such as the exons, promoters, and polyA sites. A platform technique for selective enrichment of functional genomic regions will enable several next-generation sequencing applications that include the discovery of causal mutations for disease and drug response. Here, we describe a powerful platform technique, termed “functional genomic fingerprinting” (FGF), for the multiplexed genomewide isolation and analysis of targeted regions such as the exome, promoterome, or exon splice enhancers. The technique employs a fixed part of a uniquely designed Fixed-Randomized primer, while the randomized part contains all the possible sequence permutations. The Fixed-Randomized primers bind with full sequence complementarity at multiple sites where the fixed sequence (such as the splice signals) occurs within the genome, and multiplex amplify many regions bounded by the fixed sequences (e.g., exons). Notably, validation of this technique using cardiac myosin binding protein-C (MYBPC3) gene as an example strongly supports the application and efficacy of this method. Further, assisted by genomewide computational analyses of such sequences, the FGF technique may provide a unique platform for high-throughput sample production and analysis of targeted genomic regions by the next-generation sequencing techniques, with powerful applications in discovering disease and drug response genes. PMID:20585402

  19. Resources for Genetic and Genomic Analysis of Emerging Pathogen Acinetobacter baumannii

    PubMed Central

    Ramage, Elizabeth; Weiss, Eli J.; Radey, Matthew; Hayden, Hillary S.; Held, Kiara G.; Huse, Holly K.; Zurawski, Daniel V.; Brittnacher, Mitchell J.; Manoil, Colin

    2015-01-01

    ABSTRACT Acinetobacter baumannii is a Gram-negative bacterial pathogen notorious for causing serious nosocomial infections that resist antibiotic therapy. Research to identify factors responsible for the pathogen's success has been limited by the resources available for genome-scale experimental studies. This report describes the development of several such resources for A. baumannii strain AB5075, a recently characterized wound isolate that is multidrug resistant and displays robust virulence in animal models. We report the completion and annotation of the genome sequence, the construction of a comprehensive ordered transposon mutant library, the extension of high-coverage transposon mutant pool sequencing (Tn-seq) to the strain, and the identification of the genes essential for growth on nutrient-rich agar. These resources should facilitate large-scale genetic analysis of virulence, resistance, and other clinically relevant traits that make A. baumannii a formidable public health threat. IMPORTANCE Acinetobacter baumannii is one of six bacterial pathogens primarily responsible for antibiotic-resistant infections that have become the scourge of health care facilities worldwide. Eliminating such infections requires a deeper understanding of the factors that enable the pathogen to persist in hospital environments, establish infections, and resist antibiotics. We present a set of resources that should accelerate genome-scale genetic characterization of these traits for a reference isolate of A. baumannii that is highly virulent and representative of current outbreak strains. PMID:25845845

  20. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology

    PubMed Central

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang

    2016-01-01

    Increasing evidences indicated that function annotation of human genome in molecular level and phenotype level is very important for systematic analysis of genes. In this study, we presented a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description of GeneRIFs could be annotated by a text mining tool Open Biomedical Annotator (OBA), and each Entrez gene could be mapped to Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the records about human genes of GeneRIFs, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource of human genome. High consistency of term frequency of individual gene (Pearson correlation = 0.6401, p = 2.2e − 16) and gene frequency of individual term (Pearson correlation = 0.1298, p = 3.686e − 14) in GeneRIFs and GOA shows our annotation resource is very reliable. PMID:27635398

  1. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology.

    PubMed

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang; Wang, Yadong; Jin, Shuilin; Cheng, Liang

    2016-01-01

    Increasing evidences indicated that function annotation of human genome in molecular level and phenotype level is very important for systematic analysis of genes. In this study, we presented a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description of GeneRIFs could be annotated by a text mining tool Open Biomedical Annotator (OBA), and each Entrez gene could be mapped to Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the records about human genes of GeneRIFs, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource of human genome. High consistency of term frequency of individual gene (Pearson correlation = 0.6401, p = 2.2e - 16) and gene frequency of individual term (Pearson correlation = 0.1298, p = 3.686e - 14) in GeneRIFs and GOA shows our annotation resource is very reliable.

  2. Resource base influences genome-wide DNA methylation levels in wild baboons (Papio cynocephalus).

    PubMed

    Lea, Amanda J; Altmann, Jeanne; Alberts, Susan C; Tung, Jenny

    2016-04-01

    Variation in resource availability commonly exerts strong effects on fitness-related traits in wild animals. However, we know little about the molecular mechanisms that mediate these effects, or about their persistence over time. To address these questions, we profiled genome-wide whole-blood DNA methylation levels in two sets of wild baboons: (i) 'wild-feeding' baboons that foraged naturally in a savanna environment and (ii) 'Lodge' baboons that had ready access to spatially concentrated human food scraps, resulting in high feeding efficiency and low daily travel distances. We identified 1014 sites (0.20% of sites tested) that were differentially methylated between wild-feeding and Lodge baboons, providing the first evidence that resource availability shapes the epigenome in a wild mammal. Differentially methylated sites tended to occur in contiguous stretches (i.e., in differentially methylated regions or DMRs), in promoters and enhancers, and near metabolism-related genes, supporting their functional importance in gene regulation. In agreement, reporter assay experiments confirmed that methylation at the largest identified DMR, located in the promoter of a key glycolysis-related gene, was sufficient to causally drive changes in gene expression. Intriguingly, all dispersing males carried a consistent epigenetic signature of their membership in a wild-feeding group, regardless of whether males dispersed into or out of this group as adults. Together, our findings support a role for DNA methylation in mediating ecological effects on phenotypic traits in the wild and emphasize the dynamic environmental sensitivity of DNA methylation levels across the life course.

  3. Genome resources for climate-resilient cowpea, an essential crop for food security.

    PubMed

    Muñoz-Amatriaín, María; Mirebrahim, Hamid; Xu, Pei; Wanamaker, Steve I; Luo, MingCheng; Alhakami, Hind; Alpert, Matthew; Atokple, Ibrahim; Batieno, Benoit J; Boukar, Ousmane; Bozdag, Serdar; Cisse, Ndiaga; Drabo, Issa; Ehlers, Jeffrey D; Farmer, Andrew; Fatokun, Christian; Gu, Yong Q; Guo, Yi-Ning; Huynh, Bao-Lam; Jackson, Scott A; Kusi, Francis; Lawley, Cynthia T; Lucas, Mitchell R; Ma, Yaqin; Timko, Michael P; Wu, Jiajie; You, Frank; Barkley, Noelle A; Roberts, Philip A; Lonardi, Stefano; Close, Timothy J

    2017-03-01

    Cowpea (Vigna unguiculata L. Walp.) is a legume crop that is resilient to hot and drought-prone climates, and a primary source of protein in sub-Saharan Africa and other parts of the developing world. However, genome resources for cowpea have lagged behind most other major crops. Here we describe foundational genome resources and their application to the analysis of germplasm currently in use in West African breeding programs. Resources developed from the African cultivar IT97K-499-35 include a whole-genome shotgun (WGS) assembly, a bacterial artificial chromosome (BAC) physical map, and assembled sequences from 4355 BACs. These resources and WGS sequences of an additional 36 diverse cowpea accessions supported the development of a genotyping assay for 51 128 SNPs, which was then applied to five bi-parental RIL populations to produce a consensus genetic map containing 37 372 SNPs. This genetic map enabled the anchoring of 100 Mb of WGS and 420 Mb of BAC sequences, an exploration of genetic diversity along each linkage group, and clarification of macrosynteny between cowpea and common bean. The SNP assay enabled a diversity analysis of materials from West African breeding programs. Two major subpopulations exist within those materials, one of which has significant parentage from South and East Africa and more diversity. There are genomic regions of high differentiation between subpopulations, one of which coincides with a cluster of nodulin genes. The new resources and knowledge help to define goals and accelerate the breeding of improved varieties to address food security issues related to limited-input small-holder farming and climate stress. © 2016 The Authors. The Plant Journal published by John Wiley & Sons Ltd and Society for Experimental Biology.

  4. Genomic Functionalization: The Next Revolution In Biology

    SciTech Connect

    Anderson, Peter; Schoeniger, Joseph S.; Imbro, Paula M.

    2014-07-01

    We have implemented a ligand-alignment algorithm into our developed computational pipeline for identifying specificity-determining features (SDFs) in protein-ligand complexes. Given a set of protein-ligand complex structures, the algorithm aligns the complexes by ligand rather than by the C -RMSD or standard approach, providing a single reference frame for extracting SDFs. We anticipate that this ligand-alignment capability will be highly useful for protein function prediction. We already have a database containing > 20 K ligand-protein complex crystal structures taken from the Protein Data Bank. By aligning these proteins to single reference frames using ligand alignment, we can submit the complexes to our pipeline for SDF extraction. The SDFs derived from this training procedure can be used as thumbprints that are hallmarks of individual enzyme classes. These SDF thumbprints may then serve as guides to the prediction of function of new unknown proteins.

  5. Exploring Protein Function Using the Saccharomyces Genome Database.

    PubMed

    Wong, Edith D

    2017-01-01

    Elucidating the function of individual proteins will help to create a comprehensive picture of cell biology, as well as shed light on human disease mechanisms, possible treatments, and cures. Due to its compact genome, and extensive history of experimentation and annotation, the budding yeast Saccharomyces cerevisiae is an ideal model organism in which to determine protein function. This information can then be leveraged to infer functions of human homologs. Despite the large amount of research and biological data about S. cerevisiae, many proteins' functions remain unknown. Here, we explore ways to use the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org ) to predict the function of proteins and gain insight into their roles in various cellular processes.

  6. Preterm Birth Genome Project (PGP) -- validation of resources for preterm birth genome-wide studies.

    PubMed

    Pennell, Craig E; Vadillo-Ortega, Felipe; Olson, David M; Ha, Eun-Hee; Williams, Scott; Frayling, Tim M; Dolan, Siobhan; Katz, Michael; Merialdi, Mario; Menon, Ramkumar

    2013-01-01

    We determined a series of quality control (QC) analyses to assess the usability of DNA collected and processed from different countries utilizing different DNA extraction techniques prior to genome-wide association studies (GWAS). The quality of DNA collected utilizing four different DNA extraction techniques and the impact of shipping DNA at different temperatures on array performance were evaluated. Fifteen maternal-fetal pairs were used from four countries. DNA was extracted using four approaches: whole blood, blood spots with whole genome amplification (WGA), saliva and buccal swab. Samples were sent to a genotyping facility, either on dry ice or at room temperature and genotyped using Affymetrix SNP array 6.0. QC measured included extraction techniques, effect of shipping temperatures, accuracy and Mendelian concordance. Significantly fewer (50 % ) single nucleotide polymorphisms (SNPs) passed QC metrics for buccal swab DNA (P < 0.0001) due to missing genotype data (P < 0.0001). Whole blood or saliva DNA had the highest call rates (99.2 0.4 % and 99.3 0.2 % , respectively) and Mendelian concordance. Shipment temperature had no effect. DNA from blood or saliva had the highest call rate accuracy, and buccal swabs had the lowest. DNA extracted from blood, saliva and blood spots were found suitable for GWAS in our study.

  7. Comparative Analysis of Chloroplast Genomes: Functional Annotation, Genome-Based Phylogeny, and Deduced Evolutionary Patterns

    PubMed Central

    Rivas, Javier De Las; Lozano, Juan Jose; Ortiz, Angel R.

    2002-01-01

    All protein sequences from 19 complete chloroplast genomes (cpDNA) have been studied using a new computational method able to analyze functional correlations among series of protein sequences contained in complete proteomes. First, all open reading frames (ORFs) from the cpDNAs, comprising a total of 2266 protein sequences, were compared against the 3168 proteins from Synechocystis PCC6803 complete genome to find functionally related orthologous proteins. Additionally, all cpDNA genomes were pairwise compared to find orthologous groups not present in cyanobacteria. Annotations in the cluster of othologous proteins database and CyanoBase were used as reference for the functional assignments. Following this protocol, new functional assignments were made for ORFs of unknown function and for ycfs (hypothetical chloroplast frames), which still lack a functional assignment. Using this information, a matrix of functional relationships was derived from profiles of the presence and/or absence of orthologous proteins; the matrix included 1837 proteins in 277 orthologous clusters. A factor analysis study of this matrix, followed by cluster analysis, allowed us to obtain accurate phylogenetic reconstructions and the detection of genes probably involved in speciation as phylogenetic correlates. Finally, by grouping common evolutionary patterns, we show that it is possible to determine functionally linked protein networks. This has allowed us to suggest putative associations for some unknown ORFs. PMID:11932241

  8. Genetic and genomic approaches to understanding macrophage identity and function.

    PubMed

    Glass, Christopher K

    2015-04-01

    A major goal of our laboratory is to understand the molecular mechanisms that underlie the development and functions of diverse macrophage phenotypes in health and disease. Recent studies using genetic and genomic approaches suggest a relatively simple model of collaborative and hierarchical interactions between lineage-determining and signal-dependent transcription factors that enable selection and activation of transcriptional enhancers that specify macrophage identity and function. In addition, we have found that it is possible to use natural genetic variation as a powerful tool for advancing our understanding of how the macrophage deciphers the information encoded by the genome to attain specific phenotypes in a context-dependent manner. Here, I will describe our recent efforts to extend genetic and genomic approaches to investigate the roles of distinct tissue environments in determining the phenotypes of different resident populations of macrophages.

  9. MELOGEN: an EST database for melon functional genomics

    PubMed Central

    Gonzalez-Ibeas, Daniel; Blanca, José; Roig, Cristina; González-To, Mireia; Picó, Belén; Truniger, Verónica; Gómez, Pedro; Deleu, Wim; Caño-Delgado, Ana; Arús, Pere; Nuez, Fernando; Garcia-Mas, Jordi; Puigdomènech, Pere; Aranda, Miguel A

    2007-01-01

    Background Melon (Cucumis melo L.) is one of the most important fleshy fruits for fresh consumption. Despite this, few genomic resources exist for this species. To facilitate the discovery of genes involved in essential traits, such as fruit development, fruit maturation and disease resistance, and to speed up the process of breeding new and better adapted melon varieties, we have produced a large collection of expressed sequence tags (ESTs) from eight normalized cDNA libraries from different tissues in different physiological conditions. Results We determined over 30,000 ESTs that were clustered into 16,637 non-redundant sequences or unigenes, comprising 6,023 tentative consensus sequences (contigs) and 10,614 unclustered sequences (singletons). Many potential molecular markers were identified in the melon dataset: 1,052 potential simple sequence repeats (SSRs) and 356 single nucleotide polymorphisms (SNPs) were found. Sixty-nine percent of the melon unigenes showed a significant similarity with proteins in databases. Functional classification of the unigenes was carried out following the Gene Ontology scheme. In total, 9,402 unigenes were mapped to one or more ontology. Remarkably, the distributions of melon and Arabidopsis unigenes followed similar tendencies, suggesting that the melon dataset is representative of the whole melon transcriptome. Bioinformatic analyses primarily focused on potential precursors of melon micro RNAs (miRNAs) in the melon dataset, but many other genes potentially controlling disease resistance and fruit quality traits were also identified. Patterns of transcript accumulation were characterised by Real-Time-qPCR for 20 of these genes. Conclusion The collection of ESTs characterised here represents a substantial increase on the genetic information available for melon. A database (MELOGEN) which contains all EST sequences, contig images and several tools for analysis and data mining has been created. This set of sequences constitutes

  10. Genomics Encyclopedia of Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB): a resource for microsymbiont genomes (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    SciTech Connect

    Reeve, Wayne

    2013-03-01

    Wayne Reeve of Murdoch University on "Genomics Encyclopedia of Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB): a resource for microsymbiont genomes" at the 8th Annual Genomics of Energy & Environment Meeting on March 27, 2013 in Walnut Creek, Calif.

  11. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

    PubMed

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-16

    Yersinia is a Gram-negative bacteria that includes serious pathogens such as the Yersinia pestis, which causes plague, Yersinia pseudotuberculosis, Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly-growing Yersinia genomic data and to provide analysis tools particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate the ongoing and future research of Yersinia, especially those generally considered non-pathogenic species, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed the YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include JavaScript-based genome browser (JBrowse) and Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing phylogenetic tree of Yersinia. We ran analyses based on the tools and genomic data in YersiniaBase and the

  12. Development of Highly Informative Genome-Wide Single Sequence Repeat Markers for Breeding Applications in Sesame and Construction of a Web Resource: SisatBase.

    PubMed

    Dossa, Komivi; Yu, Jingyin; Liao, Boshou; Cisse, Ndiaga; Zhang, Xiurong

    2017-01-01

    The sequencing of the full nuclear genome of sesame (Sesamum indicum L.) provides the platform for functional analyses of genome components and their application in breeding programs. Although the importance of microsatellites markers or simple sequence repeats (SSR) in crop genotyping, genetics, and breeding applications is well established, only a little information exist concerning SSRs at the whole genome level in sesame. In addition, SSRs represent a suitable marker type for sesame molecular breeding in developing countries where it is mainly grown. In this study, we identified 138,194 genome-wide SSRs of which 76.5% were physically mapped onto the 13 pseudo-chromosomes. Among these SSRs, up to three primers pairs were supplied for 101,930 SSRs and used to in silico amplify the reference genome together with two newly sequenced sesame accessions. A total of 79,957 SSRs (78%) were polymorphic between the three genomes thereby suggesting their promising use in different genomics-assisted breeding applications. From these polymorphic SSRs, 23 were selected and validated to have high polymorphic potential in 48 sesame accessions from different growing areas of Africa. Furthermore, we have developed an online user-friendly database, SisatBase (http://www.sesame-bioinfo.org/SisatBase/), which provides free access to SSRs data as well as an integrated platform for functional analyses. Altogether, the reference SSR and SisatBase would serve as useful resources for genetic assessment, genomic studies, and breeding advancement in sesame, especially in developing countries.

  13. Development of Highly Informative Genome-Wide Single Sequence Repeat Markers for Breeding Applications in Sesame and Construction of a Web Resource: SisatBase

    PubMed Central

    Dossa, Komivi; Yu, Jingyin; Liao, Boshou; Cisse, Ndiaga; Zhang, Xiurong

    2017-01-01

    The sequencing of the full nuclear genome of sesame (Sesamum indicum L.) provides the platform for functional analyses of genome components and their application in breeding programs. Although the importance of microsatellites markers or simple sequence repeats (SSR) in crop genotyping, genetics, and breeding applications is well established, only a little information exist concerning SSRs at the whole genome level in sesame. In addition, SSRs represent a suitable marker type for sesame molecular breeding in developing countries where it is mainly grown. In this study, we identified 138,194 genome-wide SSRs of which 76.5% were physically mapped onto the 13 pseudo-chromosomes. Among these SSRs, up to three primers pairs were supplied for 101,930 SSRs and used to in silico amplify the reference genome together with two newly sequenced sesame accessions. A total of 79,957 SSRs (78%) were polymorphic between the three genomes thereby suggesting their promising use in different genomics-assisted breeding applications. From these polymorphic SSRs, 23 were selected and validated to have high polymorphic potential in 48 sesame accessions from different growing areas of Africa. Furthermore, we have developed an online user-friendly database, SisatBase (http://www.sesame-bioinfo.org/SisatBase/), which provides free access to SSRs data as well as an integrated platform for functional analyses. Altogether, the reference SSR and SisatBase would serve as useful resources for genetic assessment, genomic studies, and breeding advancement in sesame, especially in developing countries. PMID:28878802

  14. Mutant power: using mutant allele collections for yeast functional genomics

    PubMed Central

    Norman, Kaitlyn L.

    2016-01-01

    The budding yeast has long served as a model eukaryote for the functional genomic analysis of highly conserved signaling pathways, cellular processes and mechanisms underlying human disease. The collection of reagents available for genomics in yeast is extensive, encompassing a growing diversity of mutant collections beyond gene deletion sets in the standard wild-type S288C genetic background. We review here three main types of mutant allele collections: transposon mutagen collections, essential gene collections and overexpression libraries. Each collection provides unique and identifiable alleles that can be utilized in genome-wide, high-throughput studies. These genomic reagents are particularly informative in identifying synthetic phenotypes and functions associated with essential genes, including those modeled most effectively in complex genetic backgrounds. Several examples of genomic studies in filamentous/pseudohyphal backgrounds are provided here to illustrate this point. Additionally, the limitations of each approach are examined. Collectively, these mutant allele collections in Saccharomyces cerevisiae and the related pathogenic yeast Candida albicans promise insights toward an advanced understanding of eukaryotic molecular and cellular biology. PMID:26453908

  15. Mutant power: using mutant allele collections for yeast functional genomics.

    PubMed

    Norman, Kaitlyn L; Kumar, Anuj

    2016-03-01

    The budding yeast has long served as a model eukaryote for the functional genomic analysis of highly conserved signaling pathways, cellular processes and mechanisms underlying human disease. The collection of reagents available for genomics in yeast is extensive, encompassing a growing diversity of mutant collections beyond gene deletion sets in the standard wild-type S288C genetic background. We review here three main types of mutant allele collections: transposon mutagen collections, essential gene collections and overexpression libraries. Each collection provides unique and identifiable alleles that can be utilized in genome-wide, high-throughput studies. These genomic reagents are particularly informative in identifying synthetic phenotypes and functions associated with essential genes, including those modeled most effectively in complex genetic backgrounds. Several examples of genomic studies in filamentous/pseudohyphal backgrounds are provided here to illustrate this point. Additionally, the limitations of each approach are examined. Collectively, these mutant allele collections in Saccharomyces cerevisiae and the related pathogenic yeast Candida albicans promise insights toward an advanced understanding of eukaryotic molecular and cellular biology. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  16. Novel approaches in function-driven single-cell genomics

    DOE PAGES

    Doud, Devin F. R.; Woyke, Tanja

    2017-06-07

    Deeper sequencing and improved bioinformatics in conjunction with single-cell and metagenomic approaches continue to illuminate undercharacterized environmental microbial communities. This has propelled the 'who is there, and what might they be doing' paradigm to the uncultivated and has already radically changed the topology of the tree of life and provided key insights into the microbial contribution to biogeochemistry. While characterization of 'who' based on marker genes can describe a large fraction of the community, answering 'what are they doing' remains the elusive pinnacle for microbiology. Function-driven single-cell genomics provides a solution by using a function-based screen to subsample complex microbialmore » communities in a targeted manner for the isolation and genome sequencing of single cells. This enables single-cell sequencing to be focused on cells with specific phenotypic or metabolic characteristics of interest. Recovered genomes are conclusively implicated for both encoding and exhibiting the feature of interest, improving downstream annotation and revealing activity levels within that environment. This emerging approach has already improved our understanding of microbial community functioning and facilitated the experimental analysis of uncharacterized gene product space. Here we provide a comprehensive review of strategies that have been applied for function-driven single-cell genomics and the future directions we envision.« less

  17. The Human Functional Genomics Project: Understanding Generation of Diversity.

    PubMed

    Pappalardo, Jenna L; Hafler, David A

    2016-11-03

    Generation of biologic diversity is a cornerstone of immunity, yet the tools to investigate the causal influence of genetic and environmental factors have been greatly limited. Studies from the Human Functional Genomics Project, presented in Cell and other Cell Press journals, integrate environmental and genetic factors with the direction and magnitude of immune responses to decipher inflammatory disease pathogenesis.

  18. Resurrection of DNA function in vivo from an extinct genome.

    PubMed

    Pask, Andrew J; Behringer, Richard R; Renfree, Marilyn B

    2008-05-21

    There is a burgeoning repository of information available from ancient DNA that can be used to understand how genomes have evolved and to determine the genetic features that defined a particular species. To assess the functional consequences of changes to a genome, a variety of methods are needed to examine extinct DNA function. We isolated a transcriptional enhancer element from the genome of an extinct marsupial, the Tasmanian tiger (Thylacinus cynocephalus or thylacine), obtained from 100 year-old ethanol-fixed tissues from museum collections. We then examined the function of the enhancer in vivo. Using a transgenic approach, it was possible to resurrect DNA function in transgenic mice. The results demonstrate that the thylacine Col2A1 enhancer directed chondrocyte-specific expression in this extinct mammalian species in the same way as its orthologue does in mice. While other studies have examined extinct coding DNA function in vitro, this is the first example of the restoration of extinct non-coding DNA and examination of its function in vivo. Our method using transgenesis can be used to explore the function of regulatory and protein-coding sequences obtained from any extinct species in an in vivo model system, providing important insights into gene evolution and diversity.

  19. Resurrection of DNA Function In Vivo from an Extinct Genome

    PubMed Central

    Pask, Andrew J.; Behringer, Richard R.; Renfree, Marilyn B.

    2008-01-01

    There is a burgeoning repository of information available from ancient DNA that can be used to understand how genomes have evolved and to determine the genetic features that defined a particular species. To assess the functional consequences of changes to a genome, a variety of methods are needed to examine extinct DNA function. We isolated a transcriptional enhancer element from the genome of an extinct marsupial, the Tasmanian tiger (Thylacinus cynocephalus or thylacine), obtained from 100 year-old ethanol-fixed tissues from museum collections. We then examined the function of the enhancer in vivo. Using a transgenic approach, it was possible to resurrect DNA function in transgenic mice. The results demonstrate that the thylacine Col2A1 enhancer directed chondrocyte-specific expression in this extinct mammalian species in the same way as its orthologue does in mice. While other studies have examined extinct coding DNA function in vitro, this is the first example of the restoration of extinct non-coding DNA and examination of its function in vivo. Our method using transgenesis can be used to explore the function of regulatory and protein-coding sequences obtained from any extinct species in an in vivo model system, providing important insights into gene evolution and diversity. PMID:18493600

  20. Functional annotation from the genome sequence of the giant panda.

    PubMed

    Huo, Tong; Zhang, Yinjie; Lin, Jianping

    2012-08-01

    The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided into two groups: 20,179 proteins whose functions can be predicted by GeneScan formed the known-function group, whereas 822 proteins whose functions cannot be predicted by GeneScan comprised the unknown-function group. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins under the assumption that proteins related to the panda-specific traits in the unknown-function group exist. After this filtering procedure, we identified 32 proteins (2 of which are membrane proteins) specific to the giant panda genome as compared against the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting protein prediction using the Database of Interacting Proteins. Nineteen proteins were predicted to be zinc-binding proteins, thus affecting the activities of nucleic acids. The 32 panda-specific proteins will be further investigated by structural and functional analysis.

  1. Performance Ratio Based Resource Allocation Decision-Making in Genomic Medicine.

    PubMed

    Fragoulakis, Vasilios; Mitropoulou, Christina; Katelidou, Daphne; van Schaik, Ron H; Maniadakis, Nikolaos; Patrinos, George P

    2017-02-01

    In modern healthcare systems, the available resources may influence the morbidity, mortality, and-consequently-the level of healthcare provided in every country. This is of particular interest in developing countries where the resources are limited and must be spent wisely to address social justice and the right for equal access in healthcare services by all the citizens in economically viable terms. In this light, the current allocation is, in practice, inefficient and rests mostly on each country's individual political and historical context and, thus, does not always incorporate decision-making enabled by economic models. In this study, we present a new economic model, specifically for resource allocation for genomic medicine, based on performance ratio, with potential applications in diverse healthcare sectors, which are particularly appealing for developing countries and low-resource environments. The model proposes a new method for resource allocation taking into account (1) the size of innovation of a new technology, (2) the relative effectiveness in comparison with social preferences, and (3) the cost of the technology, which permits the measurement of effectiveness to be determined differently in the context of a specific disease and then to be expressed in a relative form using a common performance ratio. The present work expands on previous work for innovation in economic models pertaining to genomic medicine and supports translational science.

  2. SkateBase, an elasmobranch genome project and collection of molecular resources for chondrichthyan fishes

    PubMed Central

    Wyffels, Jennifer; L. King, Benjamin; Vincent, James; Chen, Chuming; Wu, Cathy H.; Polson, Shawn W.

    2014-01-01

    Chondrichthyan fishes are a diverse class of gnathostomes that provide a valuable perspective on fundamental characteristics shared by all jawed and limbed vertebrates. Studies of phylogeny, species diversity, population structure, conservation, and physiology are accelerated by genomic, transcriptomic and protein sequence data. These data are widely available for many sarcopterygii (coelacanth, lungfish and tetrapods) and actinoptergii (ray-finned fish including teleosts) taxa, but limited for chondrichthyan fishes.  In this study, we summarize available data for chondrichthyes and describe resources for one of the largest projects to characterize one of these fish, Leucoraja erinacea, the little skate.  SkateBase ( http://skatebase.org) serves as the skate genome project portal linking data, research tools, and teaching resources. PMID:25309735

  3. Genomic resources for a model in adaptation and speciation research: characterization of the Poecilia mexicana transcriptome

    PubMed Central

    2012-01-01

    Background Elucidating the genomic basis of adaptation and speciation is a major challenge in natural systems with large quantities of environmental and phenotypic data, mostly because of the scarcity of genomic resources for non-model organisms. The Atlantic molly (Poecilia mexicana, Poeciliidae) is a small livebearing fish that has been extensively studied for evolutionary ecology research, particularly because this species has repeatedly colonized extreme environments in the form of caves and toxic hydrogen sulfide containing springs. In such extreme environments, populations show strong patterns of adaptive trait divergence and the emergence of reproductive isolation. Here, we used RNA-sequencing to assemble and annotate the first transcriptome of P. mexicana to facilitate ecological genomics studies in the future and aid the identification of genes underlying adaptation and speciation in the system. Description We provide the first annotated reference transcriptome of P. mexicana. Our transcriptome shows high congruence with other published fish transcriptomes, including that of the guppy, medaka, zebrafish, and stickleback. Transcriptome annotation uncovered the presence of candidate genes relevant in the study of adaptation to extreme environments. We describe general and oxidative stress response genes as well as genes involved in pathways induced by hypoxia or involved in sulfide metabolism. To facilitate future comparative analyses, we also conducted quantitative comparisons between P. mexicana from different river drainages. 106,524 single nucleotide polymorphisms were detected in our dataset, including potential markers that are putatively fixed across drainages. Furthermore, specimens from different drainages exhibited some consistent differences in gene regulation. Conclusions Our study provides a valuable genomic resource to study the molecular underpinnings of adaptation to extreme environments in replicated sulfide spring and cave environments. In

  4. Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus).

    PubMed

    Thanh, Nguyen Minh; Jung, Hyungtaek; Lyons, Russell E; Njaci, Isaac; Yoon, Byoung-Ha; Chand, Vincent; Tuan, Nguyen Viet; Thu, Vo Thi Minh; Mather, Peter

    2015-10-01

    Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry is facing a significant challenge however from saltwater intrusion into many low topographical coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. We obtained 12,177,770 reads after trimming and processing with an average length of 97bp. De novo assemblies were generated using CLC Genomic Workbench, Trinity and Velvet/Oases with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478bp and N50 length of 506bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. A resource for functional profiling of noncoding RNA in the yeast Saccharomyces cerevisiae.

    PubMed

    Parker, Steven; Fraczek, Marcin G; Wu, Jian; Shamsah, Sara; Manousaki, Alkisti; Dungrattanalert, Kobchai; de Almeida, Rogerio Alves; Estrada-Rivadeneyra, Diego; Omara, Walid; Delneri, Daniela; O'Keefe, Raymond T

    2017-08-01

    Eukaryotic genomes are extensively transcribed, generating many different RNAs with no known function. We have constructed 1502 molecular barcoded ncRNA gene deletion strains encompassing 443 ncRNAs in the yeast Saccharomyces cerevisiae as tools for ncRNA functional analysis. This resource includes deletions of small nuclear RNAs (snRNAs), transfer RNAs (tRNAs), small nucleolar RNAs (snoRNAs), and other annotated ncRNAs as well as the more recently identified stable unannotated transcripts (SUTs) and cryptic unstable transcripts (CUTs) whose functions are largely unknown. Specifically, deletions have been constructed for ncRNAs found in the intergenic regions, not overlapping genes or their promoters (i.e., at least 200 bp minimum distance from the closest gene start codon). The deletion strains carry molecular barcodes designed to be complementary with the protein gene deletion collection enabling parallel analysis experiments. These strains will be useful for the numerous genomic and molecular techniques that utilize deletion strains, including genome-wide phenotypic screens under different growth conditions, pooled chemogenomic screens with drugs or chemicals, synthetic genetic array analysis to uncover novel genetic interactions, and synthetic dosage lethality screens to analyze gene dosage. Overall, we created a valuable resource for the RNA community and for future ncRNA research. © 2017 Parker et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  6. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform.

    PubMed

    Zheng, Wenning; Tan, Tze King; Paterson, Ian C; Mutha, Naresh V R; Siow, Cheuk Chuen; Tan, Shi Yang; Old, Lesley A; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my.

  7. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform

    PubMed Central

    Zheng, Wenning; Paterson, Ian C.; Mutha, Naresh V. R.; Siow, Cheuk Chuen; Tan, Shi Yang; Old, Lesley A.; Jakubovics, Nicholas S.; Choo, Siew Woh

    2016-01-01

    The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my. PMID:27138013

  8. Structure and Functional Studies on Dengue-2 Virus Genome

    DTIC Science & Technology

    1986-03-01

    AD_ _ _ Lfl oSTRUCTURE AND FUNCTIONAL STUDIES ON DENGUE -2 VIRUS GENOME 0Annual Report Radha Krishnan Padmanabhan, Ph.D. March 1, 1986 Supported by...Studies on Dengue -2 Virus Genome 12 PERSONAL AUTHOR(S) Radha Krishnan Padmanabhan 13a TYPE OF REPORT 1 3b TIME COVERED 14 DATE OF REPORT (Year, Month, Day...analysis of these clones totalling 06 01 14,586 nucleotides: Deduced amino acid sequences of dengue virI 19 ABSTRACT (Continue on reverse of

  9. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

    PubMed

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    The recent advancement of the next generation sequencing technology has enabled the fast and low-cost detection of all genetic variants spreading across the entire human genome, making the application of whole-genome sequencing a tendency in the study of disease-causing genetic variants. Nevertheless, there still lacks a repository that collects predictions of functionally damaging effects of human genetic variants, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases.

  10. Resources

    MedlinePlus

    ... Gastrointestinal disorders - resources Hearing impairment - resources Hearing or speech impairment - resources Heart disease - resources Hemophilia - resources Herpes - resources Incest - resources Incontinence - ...

  11. Towards a TILLING platform for functional genomics in Piel de Sapo melons.

    PubMed

    González, Mireia; Xu, Meihong; Esteras, Cristina; Roig, Cristina; Monforte, Antonio J; Troadec, Christelle; Pujol, Marta; Nuez, Fernando; Bendahmane, Abdelhafid; Garcia-Mas, Jordi; Picó, Belén

    2011-08-11

    The availability of genetic and genomic resources for melon has increased significantly, but functional genomics resources are still limited for this crop. TILLING is a powerful reverse genetics approach that can be utilized to generate novel mutations in candidate genes. A TILLING resource is available for cantalupensis melons, but not for inodorus melons, the other main commercial group. A new ethyl methanesulfonate-mutagenized (EMS) melon population was generated for the first time in an andromonoecious non-climacteric inodorus Piel de Sapo genetic background. Diverse mutant phenotypes in seedlings, vines and fruits were observed, some of which were of possible commercial interest. The population was first screened for mutations in three target genes involved in disease resistance and fruit quality (Cm-PDS, Cm-eIF4E and Cm-eIFI(iso)4E). The same genes were also tilled in the available monoecious and climacteric cantalupensis EMS melon population. The overall mutation density in this first Piel de Sapo TILLING platform was estimated to be 1 mutation/1.5 Mb by screening four additional genes (Cm-ACO1, Cm-NOR, Cm-DET1 and Cm-DHS). Thirty-three point mutations were found for the seven gene targets, six of which were predicted to have an impact on the function of the protein. The genotype/phenotype correlation was demonstrated for a loss-of-function mutation in the Phytoene desaturase gene, which is involved in carotenoid biosynthesis. The TILLING approach was successful at providing new mutations in the genetic background of Piel de Sapo in most of the analyzed genes, even in genes for which natural variation is extremely low. This new resource will facilitate reverse genetics studies in non-climacteric melons, contributing materially to future genomic and breeding studies.

  12. Towards a TILLING platform for functional genomics in Piel de Sapo melons

    PubMed Central

    2011-01-01

    Background The availability of genetic and genomic resources for melon has increased significantly, but functional genomics resources are still limited for this crop. TILLING is a powerful reverse genetics approach that can be utilized to generate novel mutations in candidate genes. A TILLING resource is available for cantalupensis melons, but not for inodorus melons, the other main commercial group. Results A new ethyl methanesulfonate-mutagenized (EMS) melon population was generated for the first time in an andromonoecious non-climacteric inodorus Piel de Sapo genetic background. Diverse mutant phenotypes in seedlings, vines and fruits were observed, some of which were of possible commercial interest. The population was first screened for mutations in three target genes involved in disease resistance and fruit quality (Cm-PDS, Cm-eIF4E and Cm-eIFI(iso)4E). The same genes were also tilled in the available monoecious and climacteric cantalupensis EMS melon population. The overall mutation density in this first Piel de Sapo TILLING platform was estimated to be 1 mutation/1.5 Mb by screening four additional genes (Cm-ACO1, Cm-NOR, Cm-DET1 and Cm-DHS). Thirty-three point mutations were found for the seven gene targets, six of which were predicted to have an impact on the function of the protein. The genotype/phenotype correlation was demonstrated for a loss-of-function mutation in the Phytoene desaturase gene, which is involved in carotenoid biosynthesis. Conclusions The TILLING approach was successful at providing new mutations in the genetic background of Piel de Sapo in most of the analyzed genes, even in genes for which natural variation is extremely low. This new resource will facilitate reverse genetics studies in non-climacteric melons, contributing materially to future genomic and breeding studies. PMID:21834982

  13. Beyond Drosophila: RNAi in vivo and functional genomics in insects.

    PubMed

    Bellés, Xavier

    2010-01-01

    The increasing availability of insect genomes has revealed a large number of genes with unknown functions and the resulting problem of how to discover these functions. The RNA interference (RNAi) technique, which generates loss-of-function phenotypes by depletion of a chosen transcript, can help to overcome this challenge. RNAi can unveil the functions of new genes, lead to the discovery of new functions for old genes, and find the genes for old functions. Moreover, the possibility of studying the functions of homologous genes in different species can allow comparisons of the genetic networks regulating a given function in different insect groups, thereby facilitating an evolutionary insight into developmental processes. RNAi also has drawbacks and obscure points, however, such as those related to differences in species sensitivity. Disentangling these differences is one of the main challenges in the RNAi field.

  14. Genomic islands predict functional adaptation in marine actinobacteria

    SciTech Connect

    Penn, Kevin; Jenkins, Caroline; Nett, Markus; Udwary, Daniel; Gontang, Erin; McGlinchey, Ryan; Foster, Brian; Lapidus, Alla; Podell, Sheila; Allen, Eric; Moore, Bradley; Jensen, Paul

    2009-04-01

    Linking functional traits to bacterial phylogeny remains a fundamental but elusive goal of microbial ecology 1. Without this information, it becomes impossible to resolve meaningful units of diversity and the mechanisms by which bacteria interact with each other and adapt to environmental change. Ecological adaptations among bacterial populations have been linked to genomic islands, strain-specific regions of DNA that house functionally adaptive traits 2. In the case of environmental bacteria, these traits are largely inferred from bioinformatic or gene expression analyses 2, thus leaving few examples in which the functions of island genes have been experimentally characterized. Here we report the complete genome sequences of Salinispora tropica and S. arenicola, the first cultured, obligate marine Actinobacteria 3. These two species inhabit benthic marine environments and dedicate 8-10percent of their genomes to the biosynthesis of secondary metabolites. Despite a close phylogenetic relationship, 25 of 37 secondary metabolic pathways are species-specific and located within 21 genomic islands, thus providing new evidence linking secondary metabolism to ecological adaptation. Species-specific differences are also observed in CRISPR sequences, suggesting that variations in phage immunity provide fitness advantages that contribute to the cosmopolitan distribution of S. arenicola 4. The two Salinispora genomes have evolved by complex processes that include the duplication and acquisition of secondary metabolite genes, the products of which provide immediate opportunities for molecular diversification and ecological adaptation. Evidence that secondary metabolic pathways are exchanged by Horizontal Gene Transfer (HGT) yet are fixed among globally distributed populations 5 supports a functional role for their products and suggests that pathway acquisition represents a previously unrecognized force driving bacterial diversification

  15. Generation of an integrated Hieracium genomic and transcriptomic resource enables exploration of small RNA pathways during apomixis initiation.

    PubMed

    Rabiger, David S; Taylor, Jennifer M; Spriggs, Andrew; Hand, Melanie L; Henderson, Steven T; Johnson, Susan D; Oelkers, Karsten; Hrmova, Maria; Saito, Keisuke; Suzuki, Go; Mukai, Yasuhiko; Carroll, Bernard J; Koltunow, Anna M G

    2016-10-06

    Application of apomixis, or asexual seed formation, in crop breeding would allow rapid fixation of complex traits, economizing improved crop delivery. Identification of apomixis genes is confounded by the polyploid nature, high genome complexity and lack of genomic sequence integration with reproductive tissue transcriptomes in most apomicts. A genomic and transcriptomic resource was developed for Hieracium subgenus Pilosella (Asteraceae) which incorporates characterized sexual, apomictic and mutant apomict plants exhibiting reversion to sexual reproduction. Apomicts develop additional female gametogenic cells that suppress the sexual pathway in ovules. Disrupting small RNA pathways in sexual Arabidopsis also induces extra female gametogenic cells; therefore, the resource was used to examine if changes in small RNA pathways correlate with apomixis initiation. An initial characterization of small RNA pathway genes within Hieracium was undertaken, and ovary-expressed ARGONAUTE genes were identified and cloned. Comparisons of whole ovary transcriptomes from mutant apomicts, relative to the parental apomict, revealed that differentially expressed genes were enriched for processes involved in small RNA biogenesis and chromatin silencing. Small RNA profiles within mutant ovaries did not reveal large-scale alterations in composition or length distributions; however, a small number of differentially expressed, putative small RNA targets were identified. The established Hieracium resource represents a substantial contribution towards the investigation of early sexual and apomictic female gamete development, and the generation of new candidate genes and markers. Observed changes in small RNA targets and biogenesis pathways within sexual and apomictic ovaries will underlie future functional research into apomixis initiation in Hieracium.

  16. WormBase ParaSite - a comprehensive resource for helminth genomics.

    PubMed

    Howe, Kevin L; Bolt, Bruce J; Shafie, Myriam; Kersey, Paul; Berriman, Matthew

    2016-11-27

    The number of publicly available parasitic worm genome sequences has increased dramatically in the past three years, and research interest in helminth functional genomics is now quickly gathering pace in response to the foundation that has been laid by these collective efforts. A systematic approach to the organisation, curation, analysis and presentation of these data is clearly vital for maximising the utility of these data to researchers. We have developed a portal called WormBase ParaSite (http://parasite.wormbase.org) for interrogating helminth genomes on a large scale. Data from over 100 nematode and platyhelminth species are integrated, adding value by way of systematic and consistent functional annotation (e.g. protein domains and Gene Ontology terms), gene expression analysis (e.g. alignment of life-stage specific transcriptome data sets), and comparative analysis (e.g. orthologues and paralogues). We provide several ways of exploring the data, including genome browsers, genome and gene summary pages, text search, sequence search, a query wizard, bulk downloads, and programmatic interfaces. In this review, we provide an overview of the back-end infrastructure and analysis behind WormBase ParaSite, and the displays and tools available to users for interrogating helminth genomic data.

  17. Functional Requirements for Information Resource Provenance on the Web

    SciTech Connect

    McCusker, James P.; Lebo, Timothy; Graves, Alvaro; Difranzo, Dominic; Pinheiro da Silva, Paulo; McGuinness, Deborah L.

    2012-06-19

    We provide a means to formally explain the relationship between HTTP URLs and the representations returned when they are requested. According to existing World Wide Web architecture, the URL serves as an identier for a semiotic referent while the document returned via HTTP serves as a representation of the same referent. This begins with two sides of a semiotic triangle; the third side is the relationship between the URL and the representation received. We complete this description by extending the library science resource model Functional Requirements for Bibliographic Resources (FRBR) with cryptographic message and content digests to create a Functional Requirements for Information Resources (FRIR). We show how applying the FRIR model to HTTP GET and POST transactions disambiguates the many relationships between a given URL and all representations received from its request, provides fine-grained explanations that are complementary to existing explanations of web resources, and integrates easily into the emerging W3C provenance standard.

  18. Computational methods and resources for the interpretation of genomic variants in cancer

    PubMed Central

    2015-01-01

    The recent improvement of the high-throughput sequencing technologies is having a strong impact on the detection of genetic variations associated with cancer. Several institutions worldwide have been sequencing the whole exomes and or genomes of cancer patients in the thousands, thereby providing an invaluable collection of new somatic mutations in different cancer types. These initiatives promoted the development of methods and tools for the analysis of cancer genomes that are aimed at studying the relationship between genotype and phenotype in cancer. In this article we review the online resources and computational tools for the analysis of cancer genome. First, we describe the available repositories of cancer genome data. Next, we provide an overview of the methods for the detection of genetic variation and computational tools for the prioritization of cancer related genes and causative somatic variations. Finally, we discuss the future perspectives in cancer genomics focusing on the impact of computational methods and quantitative approaches for defining personalized strategies to improve the diagnosis and treatment of cancer. PMID:26111056

  19. Computational methods and resources for the interpretation of genomic variants in cancer.

    PubMed

    Tian, Rui; Basu, Malay K; Capriotti, Emidio

    2015-01-01

    The recent improvement of the high-throughput sequencing technologies is having a strong impact on the detection of genetic variations associated with cancer. Several institutions worldwide have been sequencing the whole exomes and or genomes of cancer patients in the thousands, thereby providing an invaluable collection of new somatic mutations in different cancer types. These initiatives promoted the development of methods and tools for the analysis of cancer genomes that are aimed at studying the relationship between genotype and phenotype in cancer. In this article we review the online resources and computational tools for the analysis of cancer genome. First, we describe the available repositories of cancer genome data. Next, we provide an overview of the methods for the detection of genetic variation and computational tools for the prioritization of cancer related genes and causative somatic variations. Finally, we discuss the future perspectives in cancer genomics focusing on the impact of computational methods and quantitative approaches for defining personalized strategies to improve the diagnosis and treatment of cancer.

  20. Neocaridina denticulata: A Decapod Crustacean Model for Functional Genomics.

    PubMed

    Mykles, Donald L; Hui, Jerome H L

    2015-11-01

    A decapod crustacean model is needed for understanding the molecular mechanisms underlying physiological processes, such as reproduction, sex determination, molting and growth, immunity, regeneration, and response to stress. Criteria for selection are: life-history traits, adult size, availability and ease of culture, and genomics and genetic manipulation. Three freshwater species are considered: cherry shrimp, Neocaridina denticulata; red swamp crayfish, Procambarus clarkii; and redclaw crayfish, Cherax quadricarinatus. All three are readily available, reproduce year round, and grow rapidly. The crayfish species require more space for culture than does N. denticulata. The transparent cuticle of cherry shrimp provides for direct assessment of reproductive status, stage of molt, and tissue-specific expression of reporter genes, and facilitates screening of mutations affecting phenotype. Moreover, a preliminary genome of N. denticulata is available and efforts toward complete genome sequencing and transcriptome sequencing have been initiated. Neocaridina denticulata possesses the best combination of traits that make it most suitable as a model for functional genomics. The next step is to obtain the complete genome sequence and to develop molecular technologies for the screening of mutants and for manipulating tissue-specific gene expression.

  1. Functional noncoding sequences derived from SINEs in the mammalian genome

    PubMed Central

    Nishihara, Hidenori; Smit, Arian F.A.; Okada, Norihiro

    2006-01-01

    Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the ∼1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality. PMID:16717141

  2. Functional noncoding sequences derived from SINEs in the mammalian genome.

    PubMed

    Nishihara, Hidenori; Smit, Arian F A; Okada, Norihiro

    2006-07-01

    Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the approximately 1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality.

  3. Recent advances in tomato functional genomics: utilization of VIGS.

    PubMed

    Sahu, Pranav Pankaj; Puranik, Swati; Khan, Moinuddin; Prasad, Manoj

    2012-10-01

    Tomato unquestionably occupies a significant position in world vegetable production owing to its world-wide consumption. The tomato genome sequencing efforts being recently concluded, it becomes more imperative to recognize important functional genes from this treasure of generated information for improving tomato yield. While much progress has been made in conventional tomato breeding, post-transcriptional gene silencing (PTGS) offers an alternative approach for advancement of tomato functional genomics. In particular, virus-induced gene silencing (VIGS) is increasingly being used as rapid, reliable, and lucrative screening strategy to elucidate gene function. In this review, we focus on the recent advancement made through exploiting the potential of this technique for manipulating different agronomically important traits in tomato by discussing several case studies.

  4. Functional Genomics Using the Saccharomyces cerevisiae Yeast Deletion Collections.

    PubMed

    Nislow, Corey; Wong, Lai Hong; Lee, Amy Huei-Yi; Giaever, Guri

    2016-09-01

    Constructed by a consortium of 16 laboratories, the Saccharomyces genome-wide deletion collections have, for the past decade, provided a powerful, rapid, and inexpensive approach for functional profiling of the yeast genome. Loss-of-function deletion mutants were systematically created using a polymerase chain reaction (PCR)-based gene deletion strategy to generate a start-to-stop codon replacement of each open reading frame by homologous recombination. Each strain carries two molecular barcodes that serve as unique strain identifiers, enabling their growth to be analyzed in parallel and the fitness contribution of each gene to be quantitatively assessed by hybridization to high-density oligonucleotide arrays or through the use of next-generation sequencing technologies. Functional profiling of the deletion collections, using either strain-by-strain or parallel assays, provides an unbiased approach to systematically survey the yeast genome. The Saccharomyces yeast deletion collections have proved immensely powerful in contributing to the understanding of gene function, including functional relationships between genes and genetic pathways in response to diverse genetic and environmental perturbations. © 2016 Cold Spring Harbor Laboratory Press.

  5. A physical, genetic and functional sequence assembly of the barley genome.

    PubMed

    Mayer, Klaus F X; Waugh, Robbie; Brown, John W S; Schulman, Alan; Langridge, Peter; Platzer, Matthias; Fincher, Geoffrey B; Muehlbauer, Gary J; Sato, Kazuhiro; Close, Timothy J; Wise, Roger P; Stein, Nils

    2012-11-29

    Barley (Hordeum vulgare L.) is among the world's earliest domesticated and most important crop plants. It is diploid with a large haploid genome of 5.1 gigabases (Gb). Here we present an integrated and ordered physical, genetic and functional sequence resource that describes the barley gene-space in a structured whole-genome context. We developed a physical map of 4.98 Gb, with more than 3.90 Gb anchored to a high-resolution genetic map. Projecting a deep whole-genome shotgun assembly, complementary DNA and deep RNA sequence data onto this framework supports 79,379 transcript clusters, including 26,159 'high-confidence' genes with homology support from other plant genomes. Abundant alternative splicing, premature termination codons and novel transcriptionally active regions suggest that post-transcriptional processing forms an important regulatory layer. Survey sequences from diverse accessions reveal a landscape of extensive single-nucleotide variation. Our data provide a platform for both genome-assisted research and enabling contemporary crop improvement.

  6. NeisseriaBase: a specialised Neisseria genomic resource and analysis platform.

    PubMed

    Zheng, Wenning; Mutha, Naresh V R; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah; Choo, Siew Woh

    2016-01-01

    Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my.

  7. NeisseriaBase: a specialised Neisseria genomic resource and analysis platform

    PubMed Central

    Zheng, Wenning; Mutha, Naresh V.R.; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S.; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah

    2016-01-01

    Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my. PMID:27017950

  8. Functional genomics of Lactobacillus casei establishment in the gut

    PubMed Central

    Licandro-Seraut, Hélène; Scornec, Hélène; Pédron, Thierry; Cavin, Jean-François; Sansonetti, Philippe J.

    2014-01-01

    Although the composition of the gut microbiota and its symbiotic contribution to key host physiological functions are well established, little is known as yet about the bacterial factors that account for this symbiosis. We selected Lactobacillus casei as a model microorganism to proceed to genomewide identification of the functions required for a symbiont to establish colonization in the gut. As a result of our recent development of a transposon-mutagenesis tool that overcomes the barrier that had prevented L. casei random mutagenesis, we developed a signature-tagged mutagenesis approach combining whole-genome reverse genetics using a set of tagged transposons and in vivo screening using the rabbit ligated ileal loop model. After sequencing transposon insertion sites in 9,250 random mutants, we assembled a library of 1,110 independent mutants, all disrupted in a different gene, that provides a representative view of the L. casei genome. By determining the relative quantity of each of the 1,110 mutants before and after the in vivo challenge, we identified a core of 47 L. casei genes necessary for its establishment in the gut. They are involved in housekeeping functions, metabolism (sugar, amino acids), cell wall biogenesis, and adaptation to environment. Hence we provide what is, to our knowledge, the first global functional genomics analysis of L. casei symbiosis. PMID:25024222

  9. Introducing National Center for Genome Resources (NCGR) Informatics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    SciTech Connect

    Crow, John

    2012-06-01

    John Crow from the National Center for Genome Resources discusses his organization's informatics at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  10. Introducing National Center for Genome Resources (NCGR) Informatics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Crow, John [National Center for Genome Resources

    2016-07-12

    John Crow from the National Center for Genome Resources discusses his organization's informatics at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  11. Putting the Pieces Together: Clinically Relevant Genetic and Genomic Resources for Hospitalists and Neonatologists.

    PubMed

    Miller, Rebecca; Khromykh, Alina; Babcock, Holly; Jenevein, Callie; Solomon, Benjamin D

    2017-02-01

    Genetic conditions are individually rare but are common in aggregate, and they often present in the neonatal and early pediatric periods. These conditions are often severe, can be difficult to diagnose and manage, and may heavily affect patients, families, health care systems, and society. Because of recent technological advances, the availability and uptake of genetic and genomic testing are increasing rapidly. However, there is a dearth of trained geneticists and genetic counselors to help guide and explain these conditions and relevant tests. To help hospitalists, neonatologists, and related practitioners navigate this complex and evolving field, we have compiled a list of free (mostly Web-based) resources relevant to the diagnosis and management of genetic conditions and related disorders. These resources, which we describe individually, can be useful for nongeneticist clinicians, and some also include material that can be used to explain concepts and conditions to patients or families. The resources presented are divided into the following categories (which overlap): general information, databases of genetic conditions, resources that can help generate differential diagnoses, databases of genetic testing laboratories (to help with logistics of ordering tests), information on newborn screening, and other resources. We also include a separate list of helpful textbooks and manuals. We conclude with 2 examples describing how some of these resources would be used by a pediatric hospitalist or neonatologist during the inpatient management of a child with a suspected genetic condition. Copyright © 2017 by the American Academy of Pediatrics.

  12. Rice TOGO Browser: A platform to retrieve integrated information on rice functional and applied genomics.

    PubMed

    Nagamura, Yoshiaki; Antonio, Baltazar A; Sato, Yutaka; Miyao, Akio; Namiki, Nobukazu; Yonemaru, Jun-ichi; Minami, Hiroshi; Kamatsuki, Kaori; Shimura, Kan; Shimizu, Yuji; Hirochika, Hirohiko

    2011-02-01

    The Rice TOGO Browser is an online public resource designed to facilitate integration and visualization of mapping data of bacterial artificial chromosome (BAC)/P1-derived artificial chromosome (PAC) clones, genes, restriction fragment length polymorphism (RFLP)/simple sequence repeat (SSR) markers and phenotype data represented as quantitative trait loci (QTLs) onto the genome sequence, and to provide a platform for more efficient utilization of genome information from the point of view of applied genomics as well as functional genomics. Three search options, namely keyword search, region search and trait search, generate various types of data in a user-friendly interface with three distinct viewers, a chromosome viewer, an integrated map viewer and a sequence viewer, thereby providing the opportunity to view the position of genes and/or QTLs at the chromosomal level and to retrieve any sequence information in a user-defined genome region. Furthermore, the gene list, marker list and genome sequence in a specified region delineated by RFLP/SSR markers and any sequences designed as primers can be viewed and downloaded to support forward genetics approaches. An additional feature of this database is the graphical viewer for BLAST search to reveal information not only for regions with significant sequence similarity but also for regions adjacent to those with similarity but with no hits between sequences. An easy to use and intuitive user interface can help a wide range of users in retrieving integrated mapping information including agronomically important traits on the rice genome sequence. The database can be accessed at http://agri-trait.dna.affrc.go.jp/.

  13. The HuRef Browser: a web resource for individual human genomics

    PubMed Central

    Axelrod, Nelson; Lin, Yuan; Ng, Pauline C.; Stockwell, Timothy B.; Crabtree, Jonathan; Huang, Jiaqi; Kirkness, Ewen; Strausberg, Robert L.; Frazier, Marvin E.; Venter, J. Craig; Kravitz, Saul; Levy, Samuel

    2009-01-01

    The HuRef Genome Browser is a web application for the navigation and analysis of the previously published genome of a human individual, termed HuRef. The browser provides a comparative view between the NCBI human reference sequence and the HuRef assembly, and it enables the navigation of the HuRef genome in the context of HuRef, NCBI and Ensembl annotations. Single nucleotide polymorphisms, indels, inversions, structural and copy-number variations are shown in the context of existing functional annotations on either genome in the comparative view. Demonstrated here are some potential uses of the browser to enable a better understanding of individual human genetic variation. The browser provides full access to the underlying reads with sequence and quality information, the genome assembly and the evidence supporting the identification of DNA polymorphisms. The HuRef Browser is a unique and versatile tool for browsing genome assemblies and studying individual human sequence variation in a diploid context. The browser is available online at http://huref.jcvi.org. PMID:19036787

  14. The HuRef Browser: a web resource for individual human genomics.

    PubMed

    Axelrod, Nelson; Lin, Yuan; Ng, Pauline C; Stockwell, Timothy B; Crabtree, Jonathan; Huang, Jiaqi; Kirkness, Ewen; Strausberg, Robert L; Frazier, Marvin E; Venter, J Craig; Kravitz, Saul; Levy, Samuel

    2009-01-01

    The HuRef Genome Browser is a web application for the navigation and analysis of the previously published genome of a human individual, termed HuRef. The browser provides a comparative view between the NCBI human reference sequence and the HuRef assembly, and it enables the navigation of the HuRef genome in the context of HuRef, NCBI and Ensembl annotations. Single nucleotide polymorphisms, indels, inversions, structural and copy-number variations are shown in the context of existing functional annotations on either genome in the comparative view. Demonstrated here are some potential uses of the browser to enable a better understanding of individual human genetic variation. The browser provides full access to the underlying reads with sequence and quality information, the genome assembly and the evidence supporting the identification of DNA polymorphisms. The HuRef Browser is a unique and versatile tool for browsing genome assemblies and studying individual human sequence variation in a diploid context. The browser is available online at http://huref.jcvi.org.

  15. Genome-wide functional analysis in Candida albicans.

    PubMed

    Motaung, Thabiso E; Ells, Ruan; Pohl, Carolina H; Albertyn, Jacobus; Tsilo, Toi J

    2017-02-08

    Candida albicans is an important etiological agent of superficial and life-threatening infections in individuals with compromised immune systems. To date, we know of several overlapping genetic networks that govern virulence attributes in this fungal pathogen. Classical use of deletion mutants has led to the discovery of numerous virulence factors over the years, and genome-wide functional analysis has propelled gene discovery at an even faster pace. Indeed, a number of recent studies using large-scale genetic screens followed by genome-wide functional analysis has allowed for the unbiased discovery of many new genes involved in C. albicans biology. Here we share our perspectives on the role of these studies in analyzing fundamental aspects of C. albicans virulence properties.

  16. Functional genomic analysis of the Drosophila immune response.

    PubMed

    Valanne, Susanna

    2014-01-01

    Drosophila melanogaster has been widely used as a model organism for over a century now, and also as an immunological research model for over 20 years. With the emergence of RNA interference (RNAi) in Drosophila as a robust tool to silence genes of interest, large-scale or genome-wide functional analysis has become a popular way of studying the Drosophila immune response in cell culture. Drosophila immunity is composed of cellular and humoral immunity mechanisms, and especially the systemic, humoral response pathways have been extensively dissected using the functional genomic approach. Although most components of the main immune pathways had already been found using traditional genetic screening techniques, important findings including pathway components, positive and negative regulators and modifiers have been made with RNAi screening. Additionally, RNAi screening has produced new information on host-pathogen interactions related to the pathogenesis of many microbial species. Copyright © 2013 Elsevier Ltd. All rights reserved.

  17. The ethics of functional genomics: same, same, but different?

    PubMed

    Hoeyer, Klaus; Koch, Lene

    2006-09-01

    Respect for human life--a notion of worth uniting all members of the human race--constitutes a sense of anthropocentrism that has long been the justification for the enrollment of animals in experimentation executed to develop therapies to alleviate human suffering. Currently, however, advances in functional genomics are causing a qualitative transformation of the rationale for medical research performed on animals. The notion of human distinctness is being fundamentally challenged when gene sequences similar to those found in humans are identified in different species. In this Opinion article, we would like to highlight an inherent tension brought about by the current developments in functional genomics: a tension between the scientific and the ethical status of gene sequences. Is it reasonable to argue that they are the same for all practical purposes but different in ethical status?

  18. Relaxation of yeast mitochondrial functions after whole-genome duplication

    PubMed Central

    Jiang, Huifeng; Guan, Wenjun; Pinney, David; Wang, Wen; Gu, Zhenglong

    2008-01-01

    Mitochondria are essential for cellular energy production in most eukaryotic organisms. However, when glucose is abundant, yeast species that underwent whole-genome duplication (WGD) mostly conduct fermentation even under aerobic conditions, and most can survive without a functional mitochondrial genome. In this study, we show that the rate of evolution for the nuclear-encoded mitochondrial genes was greater in post-WGD species than pre-WGD species. Furthermore, codon usage bias was relaxed for these genes in post-WGD yeast species. The codon usage pattern and the distribution of a particular transcription regulatory element suggest that the change to an efficient aerobic fermentation lifestyle in this lineage might have emerged after WGD between the divergence of Kluyveromyces polysporus and Saccharomyces castellii from their common ancestor. This new energy production strategy could have led to the relaxation of mitochondrial function in the relevant yeast species. PMID:18669479

  19. Whole-Genome Sequence of Mesorhizobium hungaricum sp. nov. Strain UASWS1009, a Potential Resource for Agricultural and Environmental Uses

    PubMed Central

    Crovadore, Julien; Cochard, Bastien; Calmin, Gautier; Chablais, Romain; Schulz, Torsten

    2016-01-01

    We report here the whole-genome shotgun sequences of the strain UASWS1009 of the species Mesorhizobium hungaricum sp. nov., which are different from any other known Mesorhizobium species. This is the first genome registered for this new species, which could be considered as a potential resource for agriculture and environmental uses. PMID:27738050

  20. Development of a high-throughput SNP resource to advance genomic, genetic and breeding research in carrot (Daucus carota L.)

    USDA-ARS?s Scientific Manuscript database

    The rapid advancement in high-throughput SNP genotyping technologies along with next generation sequencing (NGS) platforms has decreased the cost, improved the quality of large-scale genome surveys, and allowed specialty crops with limited genomic resources such as carrot (Daucus carota) to access t...

  1. Vitigene: A database for grape genomics and genetic resources delivery that benefits grape growers and scientific communities

    USDA-ARS?s Scientific Manuscript database

    A new grape genomic database was established for ‘Native American Grape Species’ as a genomic resource (http://vitigene.famu.edu:9082/eclient/ IDMLogon2.jsp). The new database hosts genetic information collected from disease tolerant/resistant grapevine endemic to North America, and is a valuable re...

  2. Automated protein function prediction--the genomic challenge.

    PubMed

    Friedberg, Iddo

    2006-09-01

    Overwhelmed with genomic data, biologists are facing the first big post-genomic question--what do all genes do? First, not only is the volume of pure sequence and structure data growing, but its diversity is growing as well, leading to a disproportionate growth in the number of uncharacterized gene products. Consequently, established methods of gene and protein annotation, such as homology-based transfer, are annotating less data and in many cases are amplifying existing erroneous annotation. Second, there is a need for a functional annotation which is standardized and machine readable so that function prediction programs could be incorporated into larger workflows. This is problematic due to the subjective and contextual definition of protein function. Third, there is a need to assess the quality of function predictors. Again, the subjectivity of the term 'function' and the various aspects of biological function make this a challenging effort. This article briefly outlines the history of automated protein function prediction and surveys the latest innovations in all three topics.

  3. Everyday functioning and successful aging: the impact of resources.

    PubMed

    Baltes, M M; Lang, F R

    1997-09-01

    The goal of this article is to examine differential aging in everyday functioning between resource-rich and resource-poor older adults. Four groups of older adults were identified on the basis of 2 distinct resource factors: a Sensorimotor-Cognitive factor and a Social-Personality factor. The resource-richest group consisted of those participants who were above the median in both factors; those falling below the median in both factors comprised the resource-poorest group; and 2 additional groups consisted of older adults who were above the median in either 1 of the 2 factors. At the level of mean differences, the 4 groups differed in the length of the waking day, the variability in activities, the frequency of intellectual-cultural and social-relational activities, and resting times. Considering age differences there are more and larger negative age effects in the resource-poorest group than in the resource-richest one. The metamodel of selective optimization with compensation is used to interpret the findings.

  4. A Functional Genomic Analysis of NF1-Associated Learning Disabilities

    DTIC Science & Technology

    2008-02-01

    family. For example, Rab1 and Rab2 are downregulated in the NF1 hippocampus, while Rab3A is upregulated; Synaptotagmin 1 is downregulated, while...syntaxin binding protein 1 DOWN 1421990_at Syt1 synaptotagmin I DOWN 1422589_at Rab3a RAB3A, member RAS oncogene family UP Vesicle recycling 1422809_at...AD_________________ Award Number: W81XWH-04- 1 -0261 TITLE: A Functional Genomic Analysis of NF1-Associated Learning Disabilities

  5. A Functional Genomic Analysis of NF1-Associated Learning Disabilities

    DTIC Science & Technology

    2007-02-01

    For example, Rab1 and Rab2 are downregulated in the NF1 hippocampus, while Rab3A is upregulated; Synaptotagmin 1 is downregulated, while...syntaxin binding protein 1 DOWN 1421990_at Syt1 synaptotagmin I DOWN 1422589_at Rab3a RAB3A, member RAS oncogene family UP Vesicle recycling 1422809_at...AD_________________ Award Number: W81XWH-04- 1 -0261 TITLE: A Functional Genomic Analysis of NF1

  6. Unlocking Holocentric Chromosomes: New Perspectives from Comparative and Functional Genomics?

    PubMed Central

    Mandrioli, Mauro; Manicardi, Gian Carlo

    2012-01-01

    The presence of chromosomes with diffuse centromeres (holocentric chromosomes) has been reported in several taxa since more than fifty years, but a full understanding of their origin is still lacking. Comparative and functional genomics are nowadays furnishing new data to better understand holocentric chromosome evolution thus opening new perspectives to analyse karyotype rearrangements in species with holocentric chromosomes in particular evidencing unusual common features, such as the uniform GC content and gene distribution along chromosomes. PMID:23372420

  7. Functional genomics of bio-energy plants and related patent activities.

    PubMed

    Jiang, Shu-Ye; Ramachandran, Srinivasan

    2013-04-01

    With dwindling fossil oil resources and increased economic growth of many developing countries due to globalization, energy driven from an alternative source such as bio-energy in a sustainable fashion is the need of the hour. However, production of energy from biological source is relatively expensive due to low starch and sugar contents of bioenergy plants leading to lower oil yield and reduced quality along with lower conversion efficiency of feedstock. In this context genetic improvement of bio-energy plants offers a viable solution. In this manuscript, we reviewed the current status of functional genomics studies and related patent activities in bio-energy plants. Currently, genomes of considerable bio-energy plants have been sequenced or are in progress and also large amount of expression sequence tags (EST) or cDNA sequences are available from them. These studies provide fundamental data for more reliable genome annotation and as a result, several genomes have been annotated in a genome-wide level. In addition to this effort, various mutagenesis tools have also been employed to develop mutant populations for characterization of genes that are involved in bioenergy quantitative traits. With the progress made on functional genomics of important bio-energy plants, more patents were filed with a significant number of them focusing on genes and DNA sequences which may involve in improvement of bio-energy traits including higher yield and quality of starch, sugar and oil. We also believe that these studies will lead to the generation of genetically altered plants with improved tolerance to various abiotic and biotic stresses.

  8. A collection of enhancer trap insertional mutants for functional genomics in tomato.

    PubMed

    Pérez-Martín, Fernando; Yuste-Lisbona, Fernando J; Pineda, Benito; Angarita-Díaz, María Pilar; García-Sogo, Begoña; Antón, Teresa; Sánchez, Sibilla; Giménez, Estela; Atarés, Alejandro; Fernández-Lozano, Antonia; Ortíz-Atienza, Ana; García-Alcázar, Manuel; Castañeda, Laura; Fonseca, Rocío; Capel, Carmen; Goergen, Geraldine; Sánchez, Jorge; Quispe, Jorge L; Capel, Juan; Angosto, Trinidad; Moreno, Vicente; Lozano, Rafael

    2017-11-01

    With the completion of genome sequencing projects, the next challenge is to close the gap between gene annotation and gene functional assignment. Genomic tools to identify gene functions are based on the analysis of phenotypic variations between a wild type and its mutant; hence, mutant collections are a valuable resource. In this sense, T-DNA collections allow for an easy and straightforward identification of the tagged gene, serving as the basis of both forward and reverse genetic strategies. This study reports on the phenotypic and molecular characterization of an enhancer trap T-DNA collection in tomato (Solanum lycopersicum L.), which has been produced by Agrobacterium-mediated transformation using a binary vector bearing a minimal promoter fused to the uidA reporter gene. Two genes have been isolated from different T-DNA mutants, one of these genes codes for a UTP-glucose-1-phosphate uridylyltransferase involved in programmed cell death and leaf development, which means a novel gene function reported in tomato. Together, our results support that enhancer trapping is a powerful tool to identify novel genes and regulatory elements in tomato and that this T-DNA mutant collection represents a highly valuable resource for functional analyses in this fleshy-fruited model species. © 2017 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  9. Nucleotide diversity analysis highlights functionally important genomic regions

    PubMed Central

    Tatarinova, Tatiana V.; Chekalin, Evgeny; Nikolsky, Yuri; Bruskin, Sergey; Chebotarov, Dmitry; McNally, Kenneth L.; Alexandrov, Nickolai

    2016-01-01

    We analyzed functionality and relative distribution of genetic variants across the complete Oryza sativa genome, using the 40 million single nucleotide polymorphisms (SNPs) dataset from the 3,000 Rice Genomes Project (http://snp-seek.irri.org), the largest and highest density SNP collection for any higher plant. We have shown that the DNA-binding transcription factors (TFs) are the most conserved group of genes, whereas kinases and membrane-localized transporters are the most variable ones. TFs may be conserved because they belong to some of the most connected regulatory hubs that modulate transcription of vast downstream gene networks, whereas signaling kinases and transporters need to adapt rapidly to changing environmental conditions. In general, the observed profound patterns of nucleotide variability reveal functionally important genomic regions. As expected, nucleotide diversity is much higher in intergenic regions than within gene bodies (regions spanning gene models), and protein-coding sequences are more conserved than untranslated gene regions. We have observed a sharp decline in nucleotide diversity that begins at about 250 nucleotides upstream of the transcription start and reaches minimal diversity exactly at the transcription start. We found the transcription termination sites to have remarkably symmetrical patterns of SNP density, implying presence of functional sites near transcription termination. Also, nucleotide diversity was significantly lower near 3′ UTRs, the area rich with regulatory regions. PMID:27774999

  10. Functional Genomics in the Study of Mind-Body Therapies

    PubMed Central

    Niles, Halsey; Mehta, Darshan H.; Corrigan, Alexandra A.; Bhasin, Manoj K.; Denninger, John W.

    2014-01-01

    Background Mind-body therapies (MBTs) are used throughout the world in treatment, disease prevention, and health promotion. However, the mechanisms by which MBTs exert their positive effects are not well understood. Investigations into MBTs using functional genomics have revolutionized the understanding of MBT mechanisms and their effects on human physiology. Methods We searched the literature for the effects of MBTs on functional genomics determinants using MEDLINE, supplemented by a manual search of additional journals and a reference list review. Results We reviewed 15 trials that measured global or targeted transcriptomic, epigenomic, or proteomic changes in peripheral blood. Sample sizes ranged from small pilot studies (n=2) to large trials (n=500). While the reliability of individual genes from trial to trial was often inconsistent, genes related to inflammatory response, particularly those involved in the nuclear factor-kappa B (NF-κB) pathway, were consistently downregulated across most studies. Conclusion In general, existing trials focusing on gene expression changes brought about by MBTs have revealed intriguing connections to the immune system through the NF-κB cascade, to telomere maintenance, and to apoptotic regulation. However, these findings are limited to a small number of trials and relatively small sample sizes. More rigorous randomized controlled trials of healthy subjects and specific disease states are warranted. Future research should investigate functional genomics areas both upstream and downstream of MBT-related gene expression changes—from epigenomics to proteomics and metabolomics. PMID:25598735

  11. Nucleotide diversity analysis highlights functionally important genomic regions.

    PubMed

    Tatarinova, Tatiana V; Chekalin, Evgeny; Nikolsky, Yuri; Bruskin, Sergey; Chebotarov, Dmitry; McNally, Kenneth L; Alexandrov, Nickolai

    2016-10-24

    We analyzed functionality and relative distribution of genetic variants across the complete Oryza sativa genome, using the 40 million single nucleotide polymorphisms (SNPs) dataset from the 3,000 Rice Genomes Project (http://snp-seek.irri.org), the largest and highest density SNP collection for any higher plant. We have shown that the DNA-binding transcription factors (TFs) are the most conserved group of genes, whereas kinases and membrane-localized transporters are the most variable ones. TFs may be conserved because they belong to some of the most connected regulatory hubs that modulate transcription of vast downstream gene networks, whereas signaling kinases and transporters need to adapt rapidly to changing environmental conditions. In general, the observed profound patterns of nucleotide variability reveal functionally important genomic regions. As expected, nucleotide diversity is much higher in intergenic regions than within gene bodies (regions spanning gene models), and protein-coding sequences are more conserved than untranslated gene regions. We have observed a sharp decline in nucleotide diversity that begins at about 250 nucleotides upstream of the transcription start and reaches minimal diversity exactly at the transcription start. We found the transcription termination sites to have remarkably symmetrical patterns of SNP density, implying presence of functional sites near transcription termination. Also, nucleotide diversity was significantly lower near 3' UTRs, the area rich with regulatory regions.

  12. Functional Profiling Using the Saccharomyces Genome Deletion Project Collections.

    PubMed

    Nislow, Corey; Wong, Lai Hong; Lee, Amy Huei-Yi; Giaever, Guri

    2016-09-01

    The ability to measure and quantify the fitness of an entire organism requires considerably more complex approaches than simply using traditional "omic" methods that examine, for example, the abundance of RNA transcripts, proteins, or metabolites. The yeast deletion collections represent the only systematic, comprehensive set of null alleles for any organism in which such fitness measurements can be assayed. Generated by the Saccharomyces Genome Deletion Project, these collections allow the systematic and parallel analysis of gene functions using any measurable phenotype. The unique 20-bp molecular barcodes engineered into the genome of each deletion strain facilitate the massively parallel analysis of individual fitness. Here, we present functional genomic protocols for use with the yeast deletion collections. We describe how to maintain, propagate, and store the deletion collections and how to perform growth fitness assays on single and parallel screening platforms. Phenotypic fitness analyses of the yeast mutants, described in brief here, provide important insights into biological functions, mechanisms of drug action, and response to environmental stresses. It is important to bear in mind that the specific assays described in this protocol represent some of the many ways in which these collections can be assayed, and in this description particular attention is paid to maximizing throughput using growth as the phenotypic measure.

  13. The SOL Genomics Network. A Comparative Resource for Solanaceae Biology and Beyond1

    PubMed Central

    Mueller, Lukas A.; Solow, Teri H.; Taylor, Nicolas; Skwarecki, Beth; Buels, Robert; Binns, John; Lin, Chenwei; Wright, Mark H.; Ahrens, Robert; Wang, Ying; Herbst, Evan V.; Keyder, Emil R.; Menda, Naama; Zamir, Dani; Tanksley, Steven D.

    2005-01-01

    The SOL Genomics Network (SGN; http://sgn.cornell.edu) is a rapidly evolving comparative resource for the plants of the Solanaceae family, which includes important crop and model plants such as potato (Solanum tuberosum), eggplant (Solanum melongena), pepper (Capsicum annuum), and tomato (Solanum lycopersicum). The aim of SGN is to relate these species to one another using a comparative genomics approach and to tie them to the other dicots through the fully sequenced genome of Arabidopsis (Arabidopsis thaliana). SGN currently houses map and marker data for Solanaceae species, a large expressed sequence tag collection with computationally derived unigene sets, an extensive database of phenotypic information for a mutagenized tomato population, and associated tools such as real-time quantitative trait loci. Recently, the International Solanaceae Project (SOL) was formed as an umbrella organization for Solanaceae research in over 30 countries to address important questions in plant biology. The first cornerstone of the SOL project is the sequencing of the entire euchromatic portion of the tomato genome. SGN is collaborating with other bioinformatics centers in building the bioinformatics infrastructure for the tomato sequencing project and implementing the bioinformatics strategy of the larger SOL project. The overarching goal of SGN is to make information available in an intuitive comparative format, thereby facilitating a systems approach to investigations into the basis of adaptation and phenotypic diversity in the Solanaceae family, other species in the Asterid clade such as coffee (Coffea arabica), Rubiaciae, and beyond. PMID:16010005

  14. HmtDB, a genomic resource for mitochondrion-based human variability studies.

    PubMed

    Rubino, Francesco; Piredda, Roberta; Calabrese, Francesco Maria; Simone, Domenico; Lang, Martin; Calabrese, Claudia; Petruzzella, Vittoria; Tommaseo-Ponzetta, Mila; Gasparre, Giuseppe; Attimonelli, Marcella

    2012-01-01

    HmtDB (http://www.hmtdb.uniba.it:8080/hmdb) is a open resource created to support population genetics and mitochondrial disease studies. The database hosts human mitochondrial genome sequences annotated with population and variability data, the latter being estimated through the application of the SiteVar software based on site-specific nucleotide and amino acid variability calculations. The annotations are manually curated thus adding value to the quality of the information provided to the end-user. Classifier tools implemented in HmtDB allow the prediction of the haplogroup for any human mitochondrial genome currently stored in HmtDB or externally submitted de novo by an end-user. Haplogroup definition is based on the Phylotree system. End-users accessing HmtDB are hence allowed to (i) browse the database through the use of a multi-criterion 'query' system; (ii) analyze their own human mitochondrial sequences via the 'classify' tool (for complete genomes) or by downloading the 'fragment-classifier' tool (for partial sequences); (iii) download multi-alignments with reference genomes as well as variability data.

  15. HmtDB, a genomic resource for mitochondrion-based human variability studies

    PubMed Central

    Rubino, Francesco; Piredda, Roberta; Calabrese, Francesco Maria; Simone, Domenico; Lang, Martin; Calabrese, Claudia; Petruzzella, Vittoria; Tommaseo-Ponzetta, Mila; Gasparre, Giuseppe; Attimonelli, Marcella

    2012-01-01

    HmtDB (http://www.hmtdb.uniba.it:8080/hmdb) is a open resource created to support population genetics and mitochondrial disease studies. The database hosts human mitochondrial genome sequences annotated with population and variability data, the latter being estimated through the application of the SiteVar software based on site-specific nucleotide and amino acid variability calculations. The annotations are manually curated thus adding value to the quality of the information provided to the end-user. Classifier tools implemented in HmtDB allow the prediction of the haplogroup for any human mitochondrial genome currently stored in HmtDB or externally submitted de novo by an end-user. Haplogroup definition is based on the Phylotree system. End-users accessing HmtDB are hence allowed to (i) browse the database through the use of a multi-criterion ‘query’ system; (ii) analyze their own human mitochondrial sequences via the ‘classify’ tool (for complete genomes) or by downloading the ‘fragment-classifier’ tool (for partial sequences); (iii) download multi-alignments with reference genomes as well as variability data. PMID:22139932

  16. Fluorescence-based resource for semiautomated genomic analyses using microsatellite markers

    SciTech Connect

    Levitt, R.C.; Kiser, M.B.; Dragwa, C.

    1994-11-15

    To facilitate the practical application of highly-efficient semiautomated methods for general application in genomic analyses, the authors have developed a fluorescence-based microsatellite marker resource. Ninety highly polymorphic microsatellite markers were combined to provide a rapid, accurate, and highly efficient initial genome-wide screening system. These markers are spaced on average every 33 cM, with a mean heterozygosity of 81% (range 65-94%), covering 22 autosomes and the X and Y chromosomes. Less than 10% of the genome lies beyond 20 cM of the nearest marker. Since this genomic analysis system is fully compatible with automated fragment analyzers using simultaneous four-color fluorescence-based detection systems, the 5 groups of 18 markers can be detected concurrently. This multiplex detection provides a throughput of 1944 genotypes daily per instrument. This system will be highly beneficial in a number of clinical and research applications including linkage, cancer genetics, forensics, and cytogenetics. 16 refs., 1 fig., 2 tabs.

  17. Beyond the genome: functional studies of phototrophic sulfur oxidation.

    PubMed

    Hanson, Thomas E; Morgan-Kiss, Rachael M; Chan, Leong-Keat; Hiras, Jennifer

    2010-01-01

    The increasing availability of complete genomic sequences for cultured phototrophic bacteria and assembled metagenomes from environments dominated by phototrophs has reinforced the need for a "post-genomic" analytical effort to test models of cellular structure and function proposed from genomic data. Comparative genomics has produced a testable model for pathways of sulfur compound oxidation in the phototrophic bacteria. In the case of sulfide, two enzymes are predicted to oxidize sulfide: sulfide:quinone oxidoreductase and flavocytochrome c sulfide dehydrogenase. However, these models do not predict which enzyme is important under what conditions. In Chlorobaculum tepidum, a model green sulfur bacterium, a combination of genetics and physiological analysis of mutant strains has led to the realization that this organism contains at least two active sulfide:quinone oxidoreductases and that there is significant interaction between sulfide oxidation and light harvesting. In the case of elemental sulfur, an organothiol intermediate of unknown structure has been proposed to activate elemental sulfur for transport into the cytoplasm where it can be oxidized or assimilated, and recent approaches using classical metabolite analysis have begun to shed light on this issue both in C. tepidum and the purple sulfur bacterium Allochromatium vinosum.

  18. Human Ageing Genomic Resources: Integrated databases and tools for the biology and genetics of ageing

    PubMed Central

    Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E.; de Magalhães, João Pedro

    2013-01-01

    The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR features now several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology. PMID:23193293

  19. Human Ageing Genomic Resources: integrated databases and tools for the biology and genetics of ageing.

    PubMed

    Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E; de Magalhães, João Pedro

    2013-01-01

    The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR features now several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology.

  20. Pearl millet genome sequence provides a resource to improve agronomic traits in arid environments.

    PubMed

    Varshney, Rajeev K; Shi, Chengcheng; Thudi, Mahendar; Mariac, Cedric; Wallace, Jason; Qi, Peng; Zhang, He; Zhao, Yusheng; Wang, Xiyin; Rathore, Abhishek; Srivastava, Rakesh K; Chitikineni, Annapurna; Fan, Guangyi; Bajaj, Prasad; Punnuri, Somashekhar; Gupta, S K; Wang, Hao; Jiang, Yong; Couderc, Marie; Katta, Mohan A V S K; Paudel, Dev R; Mungra, K D; Chen, Wenbin; Harris-Shultz, Karen R; Garg, Vanika; Desai, Neetin; Doddamani, Dadakhalandar; Kane, Ndjido Ardo; Conner, Joann A; Ghatak, Arindam; Chaturvedi, Palak; Subramaniam, Sabarinath; Yadav, Om Parkash; Berthouly-Salazar, Cécile; Hamidou, Falalou; Wang, Jianping; Liang, Xinming; Clotault, Jérémy; Upadhyaya, Hari D; Cubry, Philippe; Rhoné, Bénédicte; Gueye, Mame Codou; Sunkar, Ramanjulu; Dupuy, Christian; Sparvoli, Francesca; Cheng, Shifeng; Mahala, R S; Singh, Bharat; Yadav, Rattan S; Lyons, Eric; Datta, Swapan K; Hash, C Tom; Devos, Katrien M; Buckler, Edward; Bennetzen, Jeffrey L; Paterson, Andrew H; Ozias-Akins, Peggy; Grando, Stefania; Wang, Jun; Mohapatra, Trilochan; Weckwerth, Wolfram; Reif, Jochen C; Liu, Xin; Vigouroux, Yves; Xu, Xun

    2017-09-18

    Pearl millet [Cenchrus americanus (L.) Morrone] is a staple food for more than 90 million farmers in arid and semi-arid regions of sub-Saharan Africa, India and South Asia. We report the ∼1.79 Gb draft whole genome sequence of reference genotype Tift 23D2B1-P1-P5, which contains an estimated 38,579 genes. We highlight the substantial enrichment for wax biosynthesis genes, which may contribute to heat and drought tolerance in this crop. We resequenced and analyzed 994 pearl millet lines, enabling insights into population structure, genetic diversity and domestication. We use these resequencing data to establish marker trait associations for genomic selection, to define heterotic pools, and to predict hybrid performance. We believe that these resources should empower researchers and breeders to improve this important staple crop.

  1. Development of Genomic Resources for a thraustochytrid Pathogen and Investigation of Temperature Influences on Gene Expression

    PubMed Central

    Garcia-Vedrenne, Ana Elisa; Groner, Maya; Page-Karjian, Annie; Siegmund, Gregor-Fausto; Singhal, Sonia; Sziklay, Jamie; Roberts, Steven

    2013-01-01

    Understanding how environmental changes influence the pathogenicity and virulence of infectious agents is critical for predicting epidemiological patterns of disease. Thraustochytrids, part of the larger taxonomic class Labyrinthulomycetes, contain several highly pathogenic species, including the hard clam pathogen quahog parasite unknown (QPX). QPX has been associated with large-scale mortality events along the northeastern coast of North America. Growth and physiology of QPX is temperature-dependent, and changes in local temperature profiles influence pathogenicity. In this study we characterize the partial genome of QPX and examine the influence of temperature on gene expression. Genes involved in several biological processes are differentially expressed upon temperature change, including those associated with altered growth and metabolism and virulence. The genomic and transcriptomic resources developed in this study provide a foundation for better understanding virulence, pathogenicity and life history of thraustochytrid pathogens. PMID:24069279

  2. Draft genome sequence of chickpea (Cicer arietinum) provides a resource for trait improvement.

    PubMed

    Varshney, Rajeev K; Song, Chi; Saxena, Rachit K; Azam, Sarwar; Yu, Sheng; Sharpe, Andrew G; Cannon, Steven; Baek, Jongmin; Rosen, Benjamin D; Tar'an, Bunyamin; Millan, Teresa; Zhang, Xudong; Ramsay, Larissa D; Iwata, Aiko; Wang, Ying; Nelson, William; Farmer, Andrew D; Gaur, Pooran M; Soderlund, Carol; Penmetsa, R Varma; Xu, Chunyan; Bharti, Arvind K; He, Weiming; Winter, Peter; Zhao, Shancen; Hane, James K; Carrasquilla-Garcia, Noelia; Condie, Janet A; Upadhyaya, Hari D; Luo, Ming-Cheng; Thudi, Mahendar; Gowda, C L L; Singh, Narendra P; Lichtenzveig, Judith; Gali, Krishna K; Rubio, Josefa; Nadarajan, N; Dolezel, Jaroslav; Bansal, Kailash C; Xu, Xun; Edwards, David; Zhang, Gengyun; Kahl, Guenter; Gil, Juan; Singh, Karam B; Datta, Swapan K; Jackson, Scott A; Wang, Jun; Cook, Douglas R

    2013-03-01

    Chickpea (Cicer arietinum) is the second most widely grown legume crop after soybean, accounting for a substantial proportion of human dietary nitrogen intake and playing a crucial role in food security in developing countries. We report the ∼738-Mb draft whole genome shotgun sequence of CDC Frontier, a kabuli chickpea variety, which contains an estimated 28,269 genes. Resequencing and analysis of 90 cultivated and wild genotypes from ten countries identifies targets of both breeding-associated genetic sweeps and breeding-associated balancing selection. Candidate genes for disease resistance and agronomic traits are highlighted, including traits that distinguish the two main market classes of cultivated chickpea--desi and kabuli. These data comprise a resource for chickpea improvement through molecular breeding and provide insights into both genome diversity and domestication.

  3. Resourcing the Training and Development Function. IES Report.

    ERIC Educational Resources Information Center

    Carter, A.; Hirsh, W.; Aston, J.

    A study explored current practice in organizing and resourcing training and development (T&D) using survey responses from over 100 major private and public sector employers and case studies of T&D functions in 6 organizations. Business drivers for T&D were senior management as customers; diagnosis of training as "the…

  4. Resourcing the Training and Development Function. IES Report.

    ERIC Educational Resources Information Center

    Carter, A.; Hirsh, W.; Aston, J.

    A study explored current practice in organizing and resourcing training and development (T&D) using survey responses from over 100 major private and public sector employers and case studies of T&D functions in 6 organizations. Business drivers for T&D were senior management as customers; diagnosis of training as "the…

  5. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    PubMed

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years) has led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA

  6. Functional and evolutionary insights from the genomes of three parasitoid Nasonia species.

    PubMed

    Werren, John H; Richards, Stephen; Desjardins, Christopher A; Niehuis, Oliver; Gadau, Jürgen; Colbourne, John K; Werren, John H; Richards, Stephen; Desjardins, Christopher A; Niehuis, Oliver; Gadau, Jürgen; Colbourne, John K; Beukeboom, Leo W; Desplan, Claude; Elsik, Christine G; Grimmelikhuijzen, Cornelis J P; Kitts, Paul; Lynch, Jeremy A; Murphy, Terence; Oliveira, Deodoro C S G; Smith, Christopher D; van de Zande, Louis; Worley, Kim C; Zdobnov, Evgeny M; Aerts, Maarten; Albert, Stefan; Anaya, Victor H; Anzola, Juan M; Barchuk, Angel R; Behura, Susanta K; Bera, Agata N; Berenbaum, May R; Bertossa, Rinaldo C; Bitondi, Márcia M G; Bordenstein, Seth R; Bork, Peer; Bornberg-Bauer, Erich; Brunain, Marleen; Cazzamali, Giuseppe; Chaboub, Lesley; Chacko, Joseph; Chavez, Dean; Childers, Christopher P; Choi, Jeong-Hyeon; Clark, Michael E; Claudianos, Charles; Clinton, Rochelle A; Cree, Andrew G; Cristino, Alexandre S; Dang, Phat M; Darby, Alistair C; de Graaf, Dirk C; Devreese, Bart; Dinh, Huyen H; Edwards, Rachel; Elango, Navin; Elhaik, Eran; Ermolaeva, Olga; Evans, Jay D; Foret, Sylvain; Fowler, Gerald R; Gerlach, Daniel; Gibson, Joshua D; Gilbert, Donald G; Graur, Dan; Gründer, Stefan; Hagen, Darren E; Han, Yi; Hauser, Frank; Hultmark, Da; Hunter, Henry C; Hurst, Gregory D D; Jhangian, Shalini N; Jiang, Huaiyang; Johnson, Reed M; Jones, Andrew K; Junier, Thomas; Kadowaki, Tatsuhiko; Kamping, Albert; Kapustin, Yuri; Kechavarzi, Bobak; Kim, Jaebum; Kim, Jay; Kiryutin, Boris; Koevoets, Tosca; Kovar, Christie L; Kriventseva, Evgenia V; Kucharski, Robert; Lee, Heewook; Lee, Sandra L; Lees, Kristin; Lewis, Lora R; Loehlin, David W; Logsdon, John M; Lopez, Jacqueline A; Lozado, Ryan J; Maglott, Donna; Maleszka, Ryszard; Mayampurath, Anoop; Mazur, Danielle J; McClure, Marcella A; Moore, Andrew D; Morgan, Margaret B; Muller, Jean; Munoz-Torres, Monica C; Muzny, Donna M; Nazareth, Lynne V; Neupert, Susanne; Nguyen, Ngoc B; Nunes, Francis M F; Oakeshott, John G; Okwuonu, Geoffrey O; Pannebakker, Bart A; Pejaver, Vikas R; Peng, Zuogang; Pratt, Stephen C; Predel, Reinhard; Pu, Ling-Ling; Ranson, Hilary; Raychoudhury, Rhitoban; Rechtsteiner, Andreas; Reese, Justin T; Reid, Jeffrey G; Riddle, Megan; Robertson, Hugh M; Romero-Severson, Jeanne; Rosenberg, Miriam; Sackton, Timothy B; Sattelle, David B; Schlüns, Helge; Schmitt, Thomas; Schneider, Martina; Schüler, Andreas; Schurko, Andrew M; Shuker, David M; Simões, Zilá L P; Sinha, Saurabh; Smith, Zachary; Solovyev, Victor; Souvorov, Alexandre; Springauf, Andreas; Stafflinger, Elisabeth; Stage, Deborah E; Stanke, Mario; Tanaka, Yoshiaki; Telschow, Arndt; Trent, Carol; Vattathil, Selina; Verhulst, Eveline C; Viljakainen, Lumi; Wanner, Kevin W; Waterhouse, Robert M; Whitfield, James B; Wilkes, Timothy E; Williamson, Michael; Willis, Judith H; Wolschin, Florian; Wyder, Stefan; Yamada, Takuji; Yi, Soojin V; Zecher, Courtney N; Zhang, Lan; Gibbs, Richard A

    2010-01-15

    We report here genome sequences and comparative analyses of three closely related parasitoid wasps: Nasonia vitripennis, N. giraulti, and N. longicornis. Parasitoids are important regulators of arthropod populations, including major agricultural pests and disease vectors, and Nasonia is an emerging genetic model, particularly for evolutionary and developmental genetics. Key findings include the identification