Science.gov

Sample records for functional genomic resources

  1. Tetrahymena functional genomics database (TetraFGD): an integrated resource for Tetrahymena functional genomics.

    PubMed

    Xiong, Jie; Lu, Yuming; Feng, Jinmei; Yuan, Dongxia; Tian, Miao; Chang, Yue; Fu, Chengjie; Wang, Guangying; Zeng, Honghui; Miao, Wei

    2013-01-01

    The ciliated protozoan Tetrahymena thermophila is a useful unicellular model organism for studies of eukaryotic cellular and molecular biology. Research on T. thermophila has contributed to the discovery of a series of basic biological principles. Since the macronuclear genome was sequenced, substantial progress has been made in functional genomics research on T. thermophila, including genome-wide microarray analysis of the T. thermophila life cycle, a T. thermophila gene network analysis based on the microarray data and transcriptome analysis by deep RNA sequencing. To meet the growing demands of the Tetrahymena research community, we integrated these data to provide a publicly accessible database: the Tetrahymena functional genomics database (TetraFGD). TetraFGD contains three major resources: the RNA-Seq transcriptome, microarray data and gene networks. The RNA-Seq data define gene structures and the transcriptome, with special emphasis on exon-intron boundaries; the microarray data describe gene expression at 20 time points during three major stages of the T. thermophila life cycle; the gene network data identify potential gene-gene interactions of 15 049 genes. TetraFGD provides user-friendly search functions that assist researchers in accessing gene models, transcripts, gene expression data and gene-gene relationships. In conclusion, TetraFGD is an important functional genomic resource for researchers who focus on Tetrahymena or other ciliates. Database URL: http://tfgd.ihb.ac.cn/
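
    The abstract above does not say how the gene network was derived from the microarray time course. One common approach, assumed here purely for illustration, is to link genes whose expression profiles are strongly correlated. The sketch below uses made-up gene names, made-up profiles and an arbitrary threshold; it is not the TetraFGD method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold.

    The threshold is a hypothetical choice, not a value from the source.
    """
    genes = sorted(profiles)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(profiles[g1], profiles[g2])
            if abs(r) >= threshold:
                edges.append((g1, g2, round(r, 3)))
    return edges


# Hypothetical expression values at four time points (not real data).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(profiles)
```

    With these toy profiles only geneA and geneB are linked, because geneC's profile correlates weakly with both.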

  2. Resources for Functional Genomics Studies in Drosophila melanogaster

    PubMed Central

    Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

    2014-01-01

    Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003

  3. The Xenopus ORFeome: A resource that enables functional genomics

    PubMed Central

    Grant, Ian M.; Balcha, Dawit; Hao, Tong; Shen, Yun; Trivedi, Prasad; Patrushev, Ilya; Fortriede, Joshua D.; Karpinka, John B.; Liu, Limin; Zorn, Aaron M.; Stukenberg, P. Todd; Hill, David E.; Gilchrist, Michael J.

    2015-01-01

    Functional characterisation of proteins and large-scale, systems-level studies are enabled by extensive sets of cloned open reading frames (ORFs) in an easily-accessible format that enables many different applications. Here we report the release of the first stage of the Xenopus ORFeome, which contains 8673 ORFs from the Xenopus Gene Collection (XGC) for Xenopus laevis, cloned into a Gateway® donor vector enabling rapid in-frame transfer of the ORFs to expression vectors. This resource represents an estimated 7871 unique genes, approximately 40% of the non-redundant X. laevis gene complement, and includes 2724 genes where the human ortholog has an association with disease. Transfer into the Gateway system was validated by 5′ and 3′ end sequencing of the entire collection and protein expression of a set of test clones. In a parallel process, the underlying ORF predictions from the original XGC collection were re-analysed to verify quality and full-length status, identifying those proteins likely to exhibit truncations when translated. These data are integrated into Xenbase, the Xenopus community database, which associates genomic, expression, function and human disease model metadata to each ORF, enabling end-users to search for ORFeome clones with links to commercial distributors of the collection. When coupled with the experimental advantages of Xenopus eggs and embryos, the ORFeome collection represents a valuable resource for functional genomics and disease modelling. PMID:26391338
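
    The abstract above mentions re-analysing ORF predictions for full-length status and likely truncations. A minimal sketch of such a check follows; it assumes a plain DNA coding sequence and the standard stop codons, and the function name and status labels are hypothetical. It is not the pipeline used for the Xenopus ORFeome.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def orf_status(seq):
    """Classify a coding sequence with simple full-length checks."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        return "length not a multiple of 3"
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if not codons or codons[0] != "ATG":
        return "missing start codon"
    # A stop codon before the last position would truncate the protein.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return "internal stop codon"
    if codons[-1] not in STOP_CODONS:
        return "no terminal stop codon"
    return "full-length"


# Toy sequences, not taken from the collection.
examples = ["ATGGCCTAA", "ATGTAAGCCTGA", "ATGGCC", "ATGGC"]
statuses = [orf_status(s) for s in examples]
```

    Clones prepared for fusion to C-terminal tags are often deliberately made without a terminal stop codon, so a real check would treat that case according to the cloning design.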

  4. Wheat EST resources for functional genomics of abiotic stress

    PubMed Central

    Houde, Mario; Belcaid, Mahdi; Ouellet, François; Danyluk, Jean; Monroy, Antonio F; Dryanova, Ani; Gulick, Patrick; Bergeron, Anne; Laroche, André; Links, Matthew G; MacCarthy, Luke; Crosby, William L; Sarhan, Fathey

    2006-01-01

    Background Wheat is an excellent species to study freezing tolerance and other abiotic stresses. However, the sequence of the wheat genome has not been completely characterized due to its complexity and large size. To circumvent this obstacle and identify genes involved in cold acclimation and associated stresses, a large scale EST sequencing approach was undertaken by the Functional Genomics of Abiotic Stress (FGAS) project. Results We generated 73,521 quality-filtered ESTs from eleven cDNA libraries constructed from wheat plants exposed to various abiotic stresses and at different developmental stages. In addition, 196,041 ESTs for which tracefiles were available from the National Science Foundation wheat EST sequencing program and DuPont were also quality-filtered and used in the analysis. Clustering of the combined ESTs with d2_cluster and TGICL yielded a few large clusters containing several thousand ESTs that were refractory to routine clustering techniques. To resolve this problem, the sequence proximity and "bridges" were identified by an e-value distance graph to manually break clusters into smaller groups. Assembly of the resolved ESTs generated a 75,488 unique sequence set (31,580 contigs and 43,908 singletons/singlets). Digital expression analyses indicated that the FGAS dataset is enriched in stress-regulated genes compared to the other public datasets. Over 43% of the unique sequence set was annotated and classified into functional categories according to Gene Ontology. Conclusion We have annotated 29,556 different sequences, an almost 5-fold increase in annotated sequences compared to the available wheat public databases. Digital expression analysis combined with gene annotation helped in the identification of several pathways associated with abiotic stress. The genomic resources and knowledge developed by this project will contribute to a better understanding of the different mechanisms that govern stress tolerance in wheat and other cereals. 

  5. An Italian functional genomic resource for Medicago truncatula

    PubMed Central

    Porceddu, Andrea; Panara, Francesco; Calderini, Ornella; Molinari, Lorna; Taviani, Paola; Lanfaloni, Luisa; Scotti, Carla; Carelli, Maria; Scaramelli, Laura; Bruschi, Gianluca; Cosson, Viviane; Ratet, Pascal; de Larembergue, Henri; Duc, Gerard; Piano, Efisio; Arcioni, Sergio

    2008-01-01

    Background Medicago truncatula is a model species for legumes. Its functional genomics has been considerably boosted in recent years by initiatives based in both Europe and the US. Collections of mutants are becoming increasingly available, and this will help unravel the genetic control of important traits for many species of legumes. Findings We report on the production of three complementary mutant collections of the model species Medicago truncatula, produced in Italy within the framework of a national genomic initiative. Well-established strategies were used: Tnt1 mutagenesis, TILLING and activation tagging. Both forward and reverse genetic screens demonstrated the efficiency of the mutagenesis approaches adopted, enabling the isolation of interesting mutants that are in the course of characterization. We anticipate that the reported collections will be complementary to the recently established functional genomics tools developed for Medicago truncatula in both Europe and the United States. PMID:19077311

  6. A Populus EST resource for plant functional genomics.

    PubMed

    Sterky, Fredrik; Bhalerao, Rupali R; Unneberg, Per; Segerman, Bo; Nilsson, Peter; Brunner, Amy M; Charbonnel-Campaa, Laurence; Lindvall, Jenny Jonsson; Tandre, Karolina; Strauss, Steven H; Sundberg, Björn; Gustafsson, Petter; Uhlén, Mathias; Bhalerao, Rishikesh P; Nilsson, Ove; Sandberg, Göran; Karlsson, Jan; Lundeberg, Joakim; Jansson, Stefan

    2004-09-21

    Trees present a life form of paramount importance for terrestrial ecosystems and human societies because of their ecological structure and physiological function and provision of energy and industrial materials. The genus Populus is the internationally accepted model for molecular tree biology. We have analyzed 102,019 Populus ESTs that clustered into 11,885 clusters and 12,759 singletons. We also provide >4,000 assembled full clone sequences to serve as a basis for the upcoming annotation of the Populus genome sequence. A public web-based EST database (POPULUSDB) provides digital expression profiles for 18 tissues that comprise the majority of differentiated organs. The coding content of Populus and Arabidopsis genomes shows very high similarity, indicating that differences between these annual and perennial angiosperm life forms result primarily from differences in gene regulation. The high similarity between Populus and Arabidopsis will allow studies of Populus to directly benefit from the detailed functional genomic information generated for Arabidopsis, enabling detailed insights into tree development and adaptation. These data will also be valuable for functional genomic efforts in Arabidopsis.
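
    The abstract above reports ESTs grouped into clusters and singletons without describing the procedure. As a toy illustration only, the sketch below groups sequences that share at least one exact k-mer, using a union-find structure. Real EST clustering tools use alignment-based criteria; the function names, the value of k and the sequences are all assumptions.

```python
def kmers(seq, k):
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_by_shared_kmers(seqs, k=4):
    """Single-linkage grouping: sequences sharing any k-mer are merged."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first sequence containing it
    for idx, seq in enumerate(seqs):
        for kmer in kmers(seq, k):
            if kmer in first_seen:
                root_a, root_b = find(idx), find(first_seen[kmer])
                if root_a != root_b:
                    parent[root_a] = root_b
            else:
                first_seen[kmer] = idx

    groups = {}
    for idx in range(len(seqs)):
        groups.setdefault(find(idx), []).append(idx)
    clusters = sorted(g for g in groups.values() if len(g) > 1)
    singletons = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, singletons


# Toy sequences: the first two overlap on "CCCC"; the third shares nothing.
seqs = ["AAAACCCCGG", "CCCCGGTTTT", "GATTACAGAT"]
clusters, singletons = cluster_by_shared_kmers(seqs, k=4)
```

    Because merging is transitive, one sequence that overlaps two otherwise unrelated groups joins them into a single large cluster, which is a known weakness of single-linkage grouping.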

  7. Comprehensive Resources for Tomato Functional Genomics Based on the Miniature Model Tomato Micro-Tom

    PubMed Central

    Matsukura, C; Aoki, K; Fukuda, N; Mizoguchi, T; Asamizu, E; Saito, T; Shibata, D; Ezura, H

    2008-01-01

    Tomato (Solanum lycopersicum L., Solanaceae) is an excellent model plant for genomic research of solanaceous plants, as well as for studying the development, ripening, and metabolism of fruit. In 2003, the International Solanaceae Project (SOL, www.sgn.cornell.edu) was initiated by members from more than 30 countries, and the tomato genome-sequencing project is currently underway. The tomato genome sequence obtained by this project will provide a firm foundation for forthcoming genomic studies such as the comparative analysis of genes conserved among the Solanaceae species and the elucidation of the functions of unknown tomato genes. To exploit the wealth of genome sequence information, there is an urgent need for novel resources and analytical tools for tomato functional genomics. Here, we present an overview of the development of genetic and genomic resources of tomato in the last decade, with a special focus on the activities of Japan SOL and the National Bio-Resource Project in the development of functional genomic resources of a model cultivar, Micro-Tom. PMID:19506732

  8. Natural and artificial mutants as valuable resources for functional genomics and molecular breeding.

    PubMed

    Jiang, Shu-Ye; Ramachandran, Srinivasan

    2010-01-01

    With the completion of rice genome sequencing, the large collection of expression data and the great efforts in annotating rice genomes, the next challenge is to systematically assign functions to all predicted genes in the genome. The generation and collection of mutants at the genome-wide level form the technological platform of functional genomics. In this study, we have reviewed currently employed tools to generate such mutant populations. These tools include natural, physical, chemical, tissue culture, T-DNA, transposon or gene silencing based mutagenesis. We also reviewed how these tools were used to generate a large collection of mutants and how these mutants can be screened and detected for functional analysis of a gene. The data suggested that the current population of mutants might be large enough to tag all predicted genes. However, the collection of flanking sequence tags (FSTs) is limited due to the relatively high cost. Thus, we have proposed a new strategy to generate gene-silencing mutants at the genome-wide level. Given the large collection of insertion mutants, the next step in rice functional genomics should focus on functional characterization of tagged genes by detailed survey of the corresponding mutants. Additionally, we evaluated the utilization of these mutants as valuable resources for molecular breeding.

  9. The non-human primate reference transcriptome resource (NHPRTR) for comparative functional genomics

    PubMed Central

    Pipes, Lenore; Li, Sheng; Bozinoski, Marjan; Palermo, Robert; Peng, Xinxia; Blood, Phillip; Kelly, Sara; Weiss, Jeffrey M.; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Zumbo, Paul; Chen, Ronghua; Schroth, Gary P.; Mason, Christopher E.; Katze, Michael G.

    2013-01-01

    RNA-based next-generation sequencing (RNA-Seq) provides a tremendous amount of new information regarding gene and transcript structure, expression and regulation. This is particularly true for non-coding RNAs, where whole transcriptome analyses have revealed that much of the genome is transcribed and that many non-coding transcripts have widespread functionality. However, uniform resources for raw, cleaned and processed RNA-Seq data are sparse for most organisms and this is especially true for non-human primates (NHPs). Here, we describe a large-scale RNA-Seq data and analysis infrastructure, the NHP reference transcriptome resource (http://nhprtr.org); it presently hosts data from 12 species of primates, to be expanded to 15 species/subspecies spanning great apes, old world monkeys, new world monkeys and prosimians. Data are collected for each species using pools of RNA from comparable tissues. We provide data access in advance of its deposition at NCBI, as well as browsable tracks of alignments against the human genome using the UCSC genome browser. This resource will continue to host additional RNA-Seq data, alignments and assemblies as they are generated over the coming years and provide a key resource for the annotation of NHP genomes as well as informing primate studies on evolution, reproduction, infection, immunity and pharmacology. PMID:23203872

  10. AmoebaDB and MicrosporidiaDB: functional genomic resources for Amoebozoa and Microsporidia species

    PubMed Central

    Aurrecoechea, Cristina; Barreto, Ana; Brestelli, John; Brunk, Brian P.; Caler, Elisabet V.; Fischer, Steve; Gajria, Bindu; Gao, Xin; Gingle, Alan; Grant, Greg; Harb, Omar S.; Heiges, Mark; Iodice, John; Kissinger, Jessica C.; Kraemer, Eileen T.; Li, Wei; Nayak, Vishal; Pennington, Cary; Pinney, Deborah F.; Pitts, Brian; Roos, David S.; Srinivasamoorthy, Ganesh; Stoeckert, Christian J.; Treatman, Charles; Wang, Haiming

    2011-01-01

    AmoebaDB (http://AmoebaDB.org) and MicrosporidiaDB (http://MicrosporidiaDB.org) are new functional genomic databases serving the amoebozoa and microsporidia research communities, respectively. AmoebaDB contains the genomes of three Entamoeba species (E. dispar, E. invadens and E. histolytica) and microarray expression data for E. histolytica. MicrosporidiaDB contains the genomes of Encephalitozoon cuniculi, E. intestinalis and E. bieneusi. The databases belong to the National Institute of Allergy and Infectious Diseases (NIAID) funded EuPathDB (http://EuPathDB.org) Bioinformatics Resource Center family of integrated databases and assume the same architectural and graphical design as other EuPathDB resources such as PlasmoDB and TriTrypDB. Importantly, they utilize the graphical strategy builder that affords a database user the ability to ask complex multi-data-type questions with relative ease and versatility. Genomic scale data can be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs, protein characteristics, phylogenetic relationships and functional data such as transcript (microarray and EST evidence) and protein expression data. Search strategies can be saved within a user’s profile for future retrieval and may also be shared with other researchers using a unique strategy web address. PMID:20974635

  11. A genome-scale resource for the functional characterization of Arabidopsis transcription factors

    PubMed Central

    Pruneda-Paz, Jose L.; Breton, Ghislain; Nagel, Dawn H.; Kang, S. Earl; Bonaldi, Katia; Doherty, Colleen J.; Ravelo, Stephanie; Galli, Mary; Ecker, Joseph R.; Kay, Steve A.

    2014-01-01

    SUMMARY Extensive transcriptional networks play major roles in cellular and organismal functions. Transcript levels are in part determined by the combinatorial and overlapping functions of multiple transcription factors (TFs) bound to gene promoters. TF-promoter interactions thus provide the basic molecular wiring of transcriptional regulatory networks. In plants, discovery of the functional roles of TFs is limited by an increased complexity of network circuitry due to a significant expansion of TF families. Here, we present the construction of a comprehensive clone-collection of Arabidopsis TFs created to provide a versatile resource to uncover TF biological functions. We leveraged this collection by implementing a high-throughput DNA-binding assay and identified direct regulators of a key clock gene (CCA1) that provide molecular links between different signaling modules and the circadian clock. The resources introduced in this work will significantly contribute to a better understanding of the transcriptional regulatory landscape of plant genomes. PMID:25043187

  12. NCBI viral genomes resource.

    PubMed

    Brister, J Rodney; Ako-Adjei, Danso; Bao, Yiming; Blinkova, Olga

    2015-01-01

    Recent technological innovations have ignited an explosion in virus genome sequencing that promises to fundamentally alter our understanding of viral biology and profoundly impact public health policy. Yet, any potential benefits from the billowing cloud of next generation sequence data hinge upon well implemented reference resources that facilitate the identification of sequences, aid in the assembly of sequence reads and provide reference annotation sources. The NCBI Viral Genomes Resource is a reference resource designed to bring order to this sequence shockwave and improve usability of viral sequence data. The resource can be accessed at http://www.ncbi.nlm.nih.gov/genome/viruses/ and catalogs all publicly available virus genome sequences and curates reference genome sequences. As the number of genome sequences has grown, so too have the difficulties in annotating and maintaining reference sequences. The rapid expansion of the viral sequence universe has forced a recalibration of the data model to better provide extant sequence representation and enhanced reference sequence products to serve the needs of the various viral communities. This, in turn, has placed increased emphasis on leveraging the knowledge of individual scientific communities to identify important viral sequences and develop well annotated reference virus genome sets.

  13. Ensembl comparative genomics resources

    PubMed Central

    Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J.; Searle, Stephen M. J.; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul

    2016-01-01

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org. PMID:26896847

  15. Generation, functional annotation and comparative analysis of black spruce (Picea mariana) ESTs: an important conifer genomic resource

    PubMed Central

    2013-01-01

    Background EST (expressed sequence tag) sequences and their annotation provide a highly valuable resource for gene discovery, genome sequence annotation, and other genomics studies that can be applied in genetics, breeding and conservation programs for non-model organisms. Conifers are long-lived plants that are ecologically and economically important globally, and have a large genome size. Black spruce (Picea mariana), is a transcontinental species of the North American boreal and temperate forests. However, there are limited transcriptomic and genomic resources for this species. The primary objective of our study was to develop a black spruce transcriptomic resource to facilitate on-going functional genomics projects related to growth and adaptation to climate change. Results We conducted bidirectional sequencing of cDNA clones from a standard cDNA library constructed from black spruce needle tissues. We obtained 4,594 high quality (2,455 5' end and 2,139 3' end) sequence reads, with an average read-length of 532 bp. Clustering and assembly of ESTs resulted in 2,731 unique sequences, consisting of 2,234 singletons and 497 contigs. Approximately two-thirds (63%) of unique sequences were functionally annotated. Genes involved in 36 molecular functions and 90 biological processes were discovered, including 24 putative transcription factors and 232 genes involved in photosynthesis. Most abundantly expressed transcripts were associated with photosynthesis, growth factors, stress and disease response, and transcription factors. A total of 216 full-length genes were identified. About 18% (493) of the transcripts were novel, representing an important addition to the Genbank EST database (dbEST). Fifty-seven di-, tri-, tetra- and penta-nucleotide simple sequence repeats were identified. Conclusions We have developed the first high quality EST resource for black spruce and identified 493 novel transcripts, which may be species-specific related to life history and
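
    The counts reported in the abstract above can be cross-checked with simple arithmetic. The sketch below uses only figures quoted in the abstract; the variable and function names are hypothetical.

```python
# Figures quoted in the abstract above.
reads_5_prime = 2455
reads_3_prime = 2139
total_reads = 4594
singletons = 2234
contigs = 497
unique_sequences = 2731
novel_transcripts = 493


def percent(part, whole):
    """Percentage of whole represented by part, rounded to an integer."""
    return round(100 * part / whole)


# 5' and 3' reads should add up to the total number of reads.
reads_consistent = reads_5_prime + reads_3_prime == total_reads
# Singletons and contigs should add up to the unique sequence set.
unique_consistent = singletons + contigs == unique_sequences
# Share of unique sequences reported as novel.
novel_percent = percent(novel_transcripts, unique_sequences)
```

    Both sums match the reported totals, and the novel share rounds to the 18% stated in the abstract.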

  17. The Eukaryotic Pathogen Databases: a functional genomic resource integrating data from human and veterinary parasites.

    PubMed

    Harb, Omar S; Roos, David S

    2015-01-01

    Over the past 20 years, advances in high-throughput biological techniques and the availability of computational resources including fast Internet access have resulted in an explosion of large genome-scale data sets "big data." While such data are readily available for download and personal use and analysis from a variety of repositories, often such analysis requires access to seldom-available computational skills. As a result a number of databases have emerged to provide scientists with online tools enabling the interrogation of data without the need for sophisticated computational skills beyond basic knowledge of Internet browser utility. This chapter focuses on the Eukaryotic Pathogen Databases (EuPathDB: http://eupathdb.org) Bioinformatic Resource Center (BRC) and illustrates some of the available tools and methods. PMID:25388105

  18. The Biofuel Feedstock Genomics Resource: a web-based portal and database to enable functional genomics of plant biofuel feedstock species.

    PubMed

    Childs, Kevin L; Konganti, Kranti; Buell, C Robin

    2012-01-01

    Major feedstock sources for future biofuel production are likely to be high biomass producing plant species such as poplar, pine, switchgrass, sorghum and maize. One active area of research in these species is genome-enabled improvement of lignocellulosic biofuel feedstock quality and yield. To facilitate genomic-based investigations in these species, we developed the Biofuel Feedstock Genomic Resource (BFGR), a database and web-portal that provides high-quality, uniform and integrated functional annotation of gene and transcript assembly sequences from species of interest to lignocellulosic biofuel feedstock researchers. The BFGR includes sequence data from 54 species and permits researchers to view, analyze and obtain annotation at the gene, transcript, protein and genome level. Annotation of biochemical pathways permits the identification of key genes and transcripts central to the improvement of lignocellulosic properties in these species. The integrated nature of the BFGR in terms of annotation methods, orthologous/paralogous relationships and linkage to seven species with complete genome sequences allows comparative analyses for biofuel feedstock species with limited sequence resources. Database URL: http://bfgr.plantbiology.msu.edu.

  19. Gene Overexpression Resources in Cereals for Functional Genomics and Discovery of Useful Genes

    PubMed Central

    Abe, Kiyomi; Ichikawa, Hiroaki

    2016-01-01

    Identification and elucidation of functions of plant genes is valuable for both basic and applied research. In addition to natural variation in model plants, numerous loss-of-function resources have been produced by mutagenesis with chemicals, irradiation, or insertions of transposable elements or T-DNA. However, we may be unable to observe loss-of-function phenotypes for genes with functionally redundant homologs and for those essential for growth and development. To offset such disadvantages, gain-of-function transgenic resources have been exploited. Activation-tagged lines have been generated using obligatory overexpression of endogenous genes by random insertion of an enhancer. Recent progress in DNA sequencing technology and bioinformatics has enabled the preparation of genome-wide collections of full-length cDNAs (fl-cDNAs) in some model species. Using the fl-cDNA clones, a novel gain-of-function strategy, the Fl-cDNA OvereXpressor gene (FOX)-hunting system, has been developed. A mutant phenotype in a FOX line can be directly attributed to the overexpressed fl-cDNA. Investigating a large population of FOX lines could reveal important genes conferring favorable phenotypes for crop breeding. Alternatively, a unique loss-of-function approach, Chimeric REpressor gene Silencing Technology (CRES-T), has been developed. In CRES-T, overexpression of a chimeric repressor, composed of the coding sequence of a transcription factor (TF) and a short peptide designated as the repression domain, could interfere with the action of the endogenous TF in plants. Although plant TFs usually belong to gene families, CRES-T is effective, in principle, even for TFs with functional redundancy. In this review, we focus on the current status of the gene-overexpression strategies and resources for identifying and elucidating novel functions of cereal genes. We discuss the potential of these research tools for identifying useful genes and phenotypes for application in crop breeding.

  1. WheatGenome.info: A Resource for Wheat Genomics Resource.

    PubMed

    Lai, Kaitao

    2016-01-01

    An integrated database with a variety of Web-based systems named WheatGenome.info hosting wheat genome and genomic data has been developed to support wheat research and crop improvement. The resource includes multiple Web-based applications, which are implemented as a variety of Web-based systems. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This portal provides links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/ . PMID:26519407

  2. Shanghai RAPESEED Database: a resource for functional genomics studies of seed development and fatty acid metabolism of Brassica.

    PubMed

    Wu, Guo-Zhang; Shi, Qiu-Ming; Niu, Ya; Xing, Mei-Qing; Xue, Hong-Wei

    2008-01-01

    The Shanghai RAPESEED Database (RAPESEED, http://rapeseed.plantsignal.cn/) was created to provide the solid platform for functional genomics studies of oilseed crops with the emphasis on seed development and fatty acid metabolism. The RAPESEED includes the resource of 8462 unique ESTs, of which 3526 clones are with full length cDNA; the expression profiles of 8095 genes and the Serial Analysis of Gene Expression (SAGE, 23,895 unique tags) and tag-to-gene data during seed development. In addition, a total of approximately 14,700 M3 mutant populations were generated by ethylmethanesulfonate (EMS) mutagenesis and related seed quality information was determined using the Foss NIR System. Further, the TILLING (Targeting Induced Local Lesions IN Genomes) platform was established based on the generated EMS mutant population. The relevant information was collected in RAPESEED database, which can be searched through keywords, nucleotide or protein sequences, or seed quality parameters, and downloaded.

  3. Gramene 2013: Comparative plant genomics resources

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework fo...

  4. Genomic Resources for Cancer Epidemiology

    Cancer.gov

    This page provides links to research resources, compiled by the Epidemiology and Genomics Research Program, that may be of interest to genetic epidemiologists conducting cancer research, but is not exhaustive.

  5. A panel of induced pluripotent stem cells from chimpanzees: a resource for comparative functional genomics

    PubMed Central

    Gallego Romero, Irene; Pavlovic, Bryan J; Hernando-Herraez, Irene; Zhou, Xiang; Ward, Michelle C; Banovich, Nicholas E; Kagan, Courtney L; Burnett, Jonathan E; Huang, Constance H; Mitrano, Amy; Chavarria, Claudia I; Friedrich Ben-Nun, Inbar; Li, Yingchun; Sabatini, Karen; Leonardo, Trevor R; Parast, Mana; Marques-Bonet, Tomas; Laurent, Louise C; Loring, Jeanne F; Gilad, Yoav

    2015-01-01

    Comparative genomics studies in primates are restricted due to our limited access to samples. In order to gain better insight into the genetic processes that underlie variation in complex phenotypes in primates, we must have access to faithful model systems for a wide range of cell types. To facilitate this, we generated a panel of 7 fully characterized chimpanzee induced pluripotent stem cell (iPSC) lines derived from healthy donors. To demonstrate the utility of comparative iPSC panels, we collected RNA-sequencing and DNA methylation data from the chimpanzee iPSCs and the corresponding fibroblast lines, as well as from 7 human iPSCs and their source lines, which encompass multiple populations and cell types. We observe much less within-species variation in iPSCs than in somatic cells, indicating the reprogramming process erases many inter-individual differences. The low within-species regulatory variation in iPSCs allowed us to identify many novel inter-species regulatory differences of small magnitude. DOI: http://dx.doi.org/10.7554/eLife.07103.001 PMID:26102527

  6. High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource.

    PubMed

    Seaver, Samuel M D; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M T; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D; Henry, Christopher S

    2014-07-01

    The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed. PMID:24927599

  7. Fungal genome resources at NCBI

    PubMed Central

    Robbertse, B.; Tatusova, T.

    2011-01-01

    The National Center for Biotechnology Information (NCBI) is well known for the nucleotide sequence archive, GenBank and sequence analysis tool BLAST. However, NCBI integrates many types of biomolecular data from a variety of sources and makes it available to the scientific community as interactive web resources as well as organized releases of bulk data. These tools are available to explore and compare fungal genomes. Searching all databases with Fungi [organism] at http://www.ncbi.nlm.nih.gov/ is the quickest way to find resources of interest with fungal entries. Some tools, though, are resource-specific and can be indirectly accessed from a particular database in the Entrez system. These include graphical viewers and comparative analysis tools such as TaxPlot, TaxMap and UniGene DDD (found via UniGene Homepage). Gene and BioProject pages also serve as portals to external data such as community annotation websites, BioGrid and UniProt. There are many different ways of accessing genomic data at NCBI. Depending on the focus and goal of research projects or the level of interest, a user would select a particular route for accessing genomic databases and resources. This review article describes methods of accessing fungal genome data and provides examples that illustrate the use of analysis tools. PMID:22737589

  8. Genomics and Bioinformatics Resources for Crop Improvement

    PubMed Central

    Mochida, Keiichi; Shinozaki, Kazuo

    2010-01-01

    Recent remarkable innovations in platforms for omics-based research and application development provide crucial resources to promote research in model and applied plant species. A combinatorial approach using multiple omics platforms and integration of their outcomes is now an effective strategy for clarifying molecular systems integral to improving plant productivity. Furthermore, promotion of comparative genomics among model and applied plants allows us to grasp the biological properties of each species and to accelerate gene discovery and functional analyses of genes. Bioinformatics platforms and their associated databases are also essential for the effective design of approaches making the best use of genomic resources, including resource integration. We review recent advances in research platforms and resources in plant omics together with related databases and advances in technology. PMID:20208064

  9. Plant functional genomics.

    PubMed

    Holtorf, Hauke; Guitton, Marie-Christine; Reski, Ralf

    2002-06-01

    Functional genome analysis of plants has entered the high-throughput stage. The complete genome information from key species such as Arabidopsis thaliana and rice is now available and will further boost the application of a range of new technologies to functional plant gene analysis. To broadly assign functions to unknown genes, different fast and multiparallel approaches are currently used and developed. These new technologies are based on known methods but are adapted and improved to accommodate for comprehensive, large-scale gene analysis, i.e. such techniques are novel in the sense that their design allows researchers to analyse many genes at the same time and at an unprecedented pace. Such methods allow analysis of the different constituents of the cell that help to deduce gene function, namely the transcripts, proteins and metabolites. Similarly the phenotypic variations of entire mutant collections can now be analysed in a much faster and more efficient way than before. The different methodologies have developed to form their own fields within the functional genomics technological platform and are termed transcriptomics, proteomics, metabolomics and phenomics. Gene function, however, cannot solely be inferred by using only one such approach. Rather, it is only by bringing together all the information collected by different functional genomic tools that one will be able to unequivocally assign functions to unknown plant genes. This review focuses on current technical developments and their impact on the field of plant functional genomics. The lower plant Physcomitrella is introduced as a new model system for gene function analysis, owing to its high rate of homologous recombination.

  10. Saccharomyces Genome Database: the genomics resource of budding yeast

    PubMed Central

    Cherry, J. Michael; Hong, Eurie L.; Amundsen, Craig; Balakrishnan, Rama; Binkley, Gail; Chan, Esther T.; Christie, Karen R.; Costanzo, Maria C.; Dwight, Selina S.; Engel, Stacia R.; Fisk, Dianna G.; Hirschman, Jodi E.; Hitz, Benjamin C.; Karra, Kalpana; Krieger, Cynthia J.; Miyasato, Stuart R.; Nash, Rob S.; Park, Julie; Skrzypek, Marek S.; Simison, Matt; Weng, Shuai; Wong, Edith D.

    2012-01-01

    The Saccharomyces Genome Database (SGD, http://www.yeastgenome.org) is the community resource for the budding yeast Saccharomyces cerevisiae. The SGD project provides the highest-quality manually curated information from peer-reviewed literature. The experimental results reported in the literature are extracted and integrated within a well-developed database. These data are combined with quality high-throughput results and provided through Locus Summary pages, a powerful query engine and rich genome browser. The acquisition, integration and retrieval of these data allow SGD to facilitate experimental design and analysis by providing an encyclopedia of the yeast genome, its chromosomal features, their functions and interactions. Public access to these data is provided to researchers and educators via web pages designed for optimal ease of use. PMID:22110037

  11. Saccharomyces Genome Database: the genomics resource of budding yeast.

    PubMed

    Cherry, J Michael; Hong, Eurie L; Amundsen, Craig; Balakrishnan, Rama; Binkley, Gail; Chan, Esther T; Christie, Karen R; Costanzo, Maria C; Dwight, Selina S; Engel, Stacia R; Fisk, Dianna G; Hirschman, Jodi E; Hitz, Benjamin C; Karra, Kalpana; Krieger, Cynthia J; Miyasato, Stuart R; Nash, Rob S; Park, Julie; Skrzypek, Marek S; Simison, Matt; Weng, Shuai; Wong, Edith D

    2012-01-01

    The Saccharomyces Genome Database (SGD, http://www.yeastgenome.org) is the community resource for the budding yeast Saccharomyces cerevisiae. The SGD project provides the highest-quality manually curated information from peer-reviewed literature. The experimental results reported in the literature are extracted and integrated within a well-developed database. These data are combined with quality high-throughput results and provided through Locus Summary pages, a powerful query engine and rich genome browser. The acquisition, integration and retrieval of these data allow SGD to facilitate experimental design and analysis by providing an encyclopedia of the yeast genome, its chromosomal features, their functions and interactions. Public access to these data is provided to researchers and educators via web pages designed for optimal ease of use. PMID:22110037

  12. The Anadara trapezia transcriptome: a resource for molluscan physiological genomics.

    PubMed

    Prentis, Peter J; Pavasovic, Ana

    2014-12-01

    In this study we undertook deep sequencing of the blood cockle, Anadara trapezia, transcriptome to generate genomic resources for future functional genomics analyses. Over 27 million high quality paired end reads were assembled into 75024 contigs. Of these contigs, 29013 (38.7%) received significant BLASTx hits and gene ontology (GO) terms were assigned to 13718 of these sequences. This resource will facilitate physiological genomic studies to test the gene expression response of A. trapezia to various environmental stresses. PMID:25151889

  13. Genomic resources in fruit plants: an assessment of current status.

    PubMed

    Rai, Manoj K; Shekhawat, N S

    2015-01-01

    The availability of many genomic resources such as genome sequences, functional genomics resources including microarrays and RNA-seq, sufficient numbers of molecular markers, expressed sequence tags (ESTs) and high-density genetic maps is causing a rapid acceleration of genetics and genomic research of many fruit plants. This is leading to an increase in our knowledge of the genes that are linked to many horticultural and agronomically important traits. Recently, some progress has also been made on the identification and functional analysis of miRNAs in some fruit plants. This is one of the most active research fields in plant sciences. The last decade has witnessed development of genomic resources in many fruit plants such as apple, banana, citrus, grapes, papaya, pears, strawberry etc.; however, many of them are still not being exploited. Furthermore, owing to lack of resources, infrastructure and research facilities in many lesser-developed countries, development of genomic resources in many underutilized or less-studied fruit crops, which grow in these countries, is limited. Thus, research emphasis should be given to those fruit crops for which genomic resources are relatively scarce. The development of genomic databases of these less-studied fruit crops will enable biotechnologists to identify target genes that underlie key horticultural and agronomical traits. This review presents an overview of the current status of the development of genomic resources in fruit plants with the main emphasis being on genome sequencing, EST resources, functional genomics resources including microarray and RNA-seq, identification of quantitative trait loci and construction of genetic maps as well as efforts made on the identification and functional analysis of miRNAs in fruit plants.

  15. Enabling functional genomics with genome engineering.

    PubMed

    Hilton, Isaac B; Gersbach, Charles A

    2015-10-01

    Advances in genome engineering technologies have made the precise control over genome sequence and regulation possible across a variety of disciplines. These tools can expand our understanding of fundamental biological processes and create new opportunities for therapeutic designs. The rapid evolution of these methods has also catalyzed a new era of genomics that includes multiple approaches to functionally characterize and manipulate the regulation of genomic information. Here, we review the recent advances of the most widely adopted genome engineering platforms and their application to functional genomics. This includes engineered zinc finger proteins, TALEs/TALENs, and the CRISPR/Cas9 system as nucleases for genome editing, transcription factors for epigenome editing, and other emerging applications. We also present current and potential future applications of these tools, as well as their current limitations and areas for future advances. PMID:26430154

  16. Gene3D: a domain-based resource for comparative genomics, functional annotation and protein network analysis.

    PubMed

    Lees, Jonathan; Yeats, Corin; Perkins, James; Sillitoe, Ian; Rentzsch, Robert; Dessailly, Benoit H; Orengo, Christine

    2012-01-01

    Gene3D http://gene3d.biochem.ucl.ac.uk is a comprehensive database of protein domain assignments for sequences from the major sequence databases. Domains are directly mapped from structures in the CATH database or predicted using a library of representative profile HMMs derived from CATH superfamilies. As previously described, Gene3D integrates many other protein family and function databases. These facilitate complex associations of molecular function, structure and evolution. Gene3D now includes a domain functional family (FunFam) level below the homologous superfamily level assignments. Additions have also been made to the interaction data. More significantly, to help with the visualization and interpretation of multi-genome scale data sets, we have developed a new, revamped website. Searching has been simplified with more sophisticated filtering of results, along with new tools based on Cytoscape Web, for visualizing protein-protein interaction networks, differences in domain composition between genomes and the taxonomic distribution of individual superfamilies.

  18. Navigating yeast genome maintenance with functional genomics.

    PubMed

    Measday, Vivien; Stirling, Peter C

    2016-03-01

    Maintenance of genome integrity is a fundamental requirement of all organisms. To address this, organisms have evolved extremely faithful modes of replication, DNA repair and chromosome segregation to combat the deleterious effects of an unstable genome. Nonetheless, a small amount of genome instability is the driver of evolutionary change and adaptation, and thus a low level of instability is permitted in populations. While defects in genome maintenance almost invariably reduce fitness in the short term, they can create an environment where beneficial mutations are more likely to occur. The importance of this fact is clearest in the development of human cancer, where genome instability is a well-established enabling characteristic of carcinogenesis. This raises the crucial question: what are the cellular pathways that promote genome maintenance and what are their mechanisms? Work in model organisms, in particular the yeast Saccharomyces cerevisiae, has provided the global foundations of genome maintenance mechanisms in eukaryotes. The development of pioneering genomic tools in S. cerevisiae, such as the systematic creation of mutants in all nonessential and essential genes, has enabled whole-genome approaches to identifying genes with roles in genome maintenance. Here, we review the extensive whole-genome approaches taken in yeast, with an emphasis on functional genomic screens, to understand the genetic basis of genome instability, highlighting a range of genetic and cytological screening modalities. By revealing the biological pathways and processes regulating genome integrity, these analyses contribute to the systems-level map of the yeast cell and inform studies of human disease, especially cancer. PMID:26323482

  19. Tomato functional genomics database (TFGD): a comprehensive collection and analysis package for tomato functional genomics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Tomato Functional Genomics Database (TFGD; http://ted.bti.cornell.edu) provides a comprehensive systems biology resource to store, mine, analyze, visualize and integrate large-scale tomato functional genomics datasets. The database is expanded from the previously described Tomato Expression Database...

  20. Nongenetic functions of the genome.

    PubMed

    Bustin, Michael; Misteli, Tom

    2016-05-01

    The primary function of the genome is to store, propagate, and express the genetic information that gives rise to a cell's architectural and functional machinery. However, the genome is also a major structural component of the cell. Besides its genetic roles, the genome affects cellular functions by nongenetic means through its physical and structural properties, particularly by exerting mechanical forces and by serving as a scaffold for binding of cellular components. Major cellular processes affected by nongenetic functions of the genome include establishment of nuclear structure, signal transduction, mechanoresponses, cell migration, and vision in nocturnal animals. We discuss the concept, mechanisms, and implications of nongenetic functions of the genome.

  1. Gramene 2016: comparative plant genomics and pathway resources

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the data...

  2. High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic mode...

  3. Gramene: a growing plant comparative genomics resource

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Gramene (www.gramene.org) is a curated genetic, genomic and comparative genome analysis resource for the major crop species, such as rice, maize, wheat and many other plant (mainly grass) species. Gramene is an open-source project, with all data and software freely downloadable through the ftp site ...

  4. Resources | Office of Cancer Genomics

    Cancer.gov

    OCG provides a variety of scientific and educational resources for both cancer researchers and members of the general public. These resources are divided into the following types: OCG-Supported Resources: Tools, databases, and reagents generated by initiated and completed OCG programs for researchers, educators, and students. (Note: Databases for current OCG programs are available through program-specific data matrices)

  5. PLAZA: A Comparative Genomics Resource to Study Gene and Genome Evolution in Plants

    PubMed Central

    Proost, Sebastian; Van Bel, Michiel; Sterck, Lieven; Billiau, Kenny; Van Parys, Thomas; Van de Peer, Yves; Vandepoele, Klaas

    2009-01-01

    The number of sequenced genomes of representatives within the green lineage is rapidly increasing. Consequently, comparative sequence analysis has significantly altered our view on the complexity of genome organization, gene function, and regulatory pathways. To explore all this genome information, a centralized infrastructure is required where all data generated by different sequencing initiatives is integrated and combined with advanced methods for data mining. Here, we describe PLAZA, an online platform for plant comparative genomics (http://bioinformatics.psb.ugent.be/plaza/). This resource integrates structural and functional annotation of published plant genomes together with a large set of interactive tools to study gene function and gene and genome evolution. Precomputed data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, intraspecies whole-genome dot plots, and genomic colinearity between species. Through the integration of high confidence Gene Ontology annotations and tree-based orthology between related species, thousands of genes lacking any functional description are functionally annotated. Advanced query systems, as well as multiple interactive visualization tools, are available through a user-friendly and intuitive Web interface. In addition, detailed documentation and tutorials introduce the different tools, while the workbench provides an efficient means to analyze user-defined gene sets through PLAZA's interface. In conclusion, PLAZA provides a comprehensible and up-to-date research environment to aid researchers in the exploration of genome information within the green plant lineage. PMID:20040540

  6. PLAZA: a comparative genomics resource to study gene and genome evolution in plants.

    PubMed

    Proost, Sebastian; Van Bel, Michiel; Sterck, Lieven; Billiau, Kenny; Van Parys, Thomas; Van de Peer, Yves; Vandepoele, Klaas

    2009-12-01

    The number of sequenced genomes of representatives within the green lineage is rapidly increasing. Consequently, comparative sequence analysis has significantly altered our view on the complexity of genome organization, gene function, and regulatory pathways. To explore all this genome information, a centralized infrastructure is required where all data generated by different sequencing initiatives is integrated and combined with advanced methods for data mining. Here, we describe PLAZA, an online platform for plant comparative genomics (http://bioinformatics.psb.ugent.be/plaza/). This resource integrates structural and functional annotation of published plant genomes together with a large set of interactive tools to study gene function and gene and genome evolution. Precomputed data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, intraspecies whole-genome dot plots, and genomic colinearity between species. Through the integration of high confidence Gene Ontology annotations and tree-based orthology between related species, thousands of genes lacking any functional description are functionally annotated. Advanced query systems, as well as multiple interactive visualization tools, are available through a user-friendly and intuitive Web interface. In addition, detailed documentation and tutorials introduce the different tools, while the workbench provides an efficient means to analyze user-defined gene sets through PLAZA's interface. In conclusion, PLAZA provides a comprehensible and up-to-date research environment to aid researchers in the exploration of genome information within the green plant lineage.

  7. Genome-scale resources for Thermoanaerobacterium saccharolyticum

    DOE PAGES

    Currie, Devin H.; Raman, Babu; Gowen, Christopher M.; Tschaplinski, Timothy J.; Land, Miriam L.; Brown, Steven D.; Covalla, Sean; Klingeman, Dawn Marie; Yang, Zamin Koo; Engle, Nancy L.; et al

    2015-06-26

    Thermoanaerobacterium saccharolyticum is a hemicellulose-degrading thermophilic anaerobe that was previously engineered to produce ethanol at high yield. For this research, a major project was undertaken to develop this organism into an industrial biocatalyst, but the lack of genome information and resources was recognized early on as a key limitation.

  8. Developing genomic resources for the Apiaceae

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Apiaceae family includes carrot, celery, cilantro, dill, fennel and numerous other spice and medicinal crops. Carrot is the most economically important member of the Apiaceae with an annual value of $600 M in the United States alone. There are few genomic resources for carrot or other Apiaceae, ...

  9. Sequence resources at the Candida Genome Database.

    PubMed

    Arnaud, Martha B; Costanzo, Maria C; Skrzypek, Marek S; Shah, Prachi; Binkley, Gail; Lane, Christopher; Miyasato, Stuart R; Sherlock, Gavin

    2007-01-01

    The Candida Genome Database (CGD, http://www.candidagenome.org/) contains a curated collection of genomic information and community resources for researchers who are interested in the molecular biology of the opportunistic pathogen Candida albicans. With the recent release of a new assembly of the C. albicans genome, Assembly 20, C. albicans genomics has entered a new era. Although the C. albicans genome assembly continues to undergo refinement, multiple assemblies and gene nomenclatures will remain in widespread use by the research community. CGD has now taken on the responsibility of maintaining the most up-to-date version of the genome sequence by providing the data from this new assembly alongside the data from the previous assemblies, as well as any future corrections and refinements. In this database update, we describe the sequence information available for C. albicans, the sequence information contained in CGD, and the tools for sequence retrieval, analysis and comparison that CGD provides. CGD is freely accessible at http://www.candidagenome.org/ and CGD curators may be contacted by email at candida-curator@genome.stanford.edu.

  10. Gramene 2016: comparative plant genomics and pathway resources.

    PubMed

    Tello-Ruiz, Marcela K; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A; Huerta, Laura; Keays, Maria; Tang, Y Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials.

  11. Gramene 2016: comparative plant genomics and pathway resources

    PubMed Central

    Tello-Ruiz, Marcela K.; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M.; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A.; Huerta, Laura; Keays, Maria; Tang, Y. Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J.; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  12. Gramene 2016: comparative plant genomics and pathway resources.

    PubMed

    Tello-Ruiz, Marcela K; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A; Huerta, Laura; Keays, Maria; Tang, Y Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  13. Genomic resource development for shellfish of conservation concern.

    PubMed

    Timmins-Schiffman, Emma B; Friedman, Carolyn S; Metzger, Dave C; White, Samuel J; Roberts, Steven B

    2013-03-01

    Effective conservation of threatened species depends on the ability to assess organism physiology and population demography. To develop genomic resources to better understand the dynamics of two ecologically vulnerable species in the Pacific Northwest of the United States, larval transcriptomes were sequenced for the pinto abalone, Haliotis kamtschatkana kamtschatkana, and the Olympia oyster, Ostrea lurida. Based on comparative species analysis the Ostrea lurida transcriptome (41 136 contigs) is relatively complete. These transcriptomes represent the first significant contribution to genomic resources for both species. Genes are described based on biological function with particular attention to those associated with temperature change, oxidative stress and immune function. In addition, transcriptome-derived genetic markers are provided. Together, these resources provide valuable tools for future studies aimed at conservation of Haliotis kamtschatkana kamtschatkana, Ostrea lurida and related species.

  14. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

    PubMed

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-02-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.

  15. Functional genomics in the mouse.

    PubMed

    Perkins, Archibald S

    2002-08-01

    The mouse is the premier genetic model organism for the study of human disease and development. With the recent advances in sequencing of the human and mouse genomes, there is strong interest now in large-scale approaches to decipher the function of mouse genes using various mutagenesis technologies. This review discusses what tools are currently available for manipulating and mutagenizing the mouse genome, such as ethylnitrosourea and gene trap mutagenesis, engineered inversions and deletions using the cre-lox system, and proviral insertional mutagenesis in somatic cells, and how these are being used to uncover gene function.

  16. Reconciling resource utilization and resource selection functions

    USGS Publications Warehouse

    Hooten, Mevin B.; Hanks, Ephraim M.; Johnson, Devin S.; Alldredge, Mat W.

    2013-01-01

    Summary: 1. Analyses based on utilization distributions (UDs) have been ubiquitous in animal space use studies, largely because they are computationally straightforward and relatively easy to employ. Conventional applications of resource utilization functions (RUFs) suggest that estimates of UDs can be used as response variables in a regression involving spatial covariates of interest. 2. It has been claimed that contemporary implementations of RUFs can yield inference about resource selection, although to our knowledge, an explicit connection has not been described. 3. We explore the relationships between RUFs and resource selection functions from a heuristic and simulation perspective. We investigate several sources of potential bias in the estimation of resource selection coefficients using RUFs (e.g. the spatial covariance modelling that is often used in RUF analyses). 4. Our findings illustrate that RUFs can, in fact, serve as approximations to RSFs and are capable of providing inference about resource selection, but only with some modification and under specific circumstances. 5. Using real telemetry data as an example, we provide guidance on which methods for estimating resource selection may be more appropriate and in which situations. In general, if telemetry data are assumed to arise as a point process, then RSF methods may be preferable to RUFs; however, modified RUFs may provide less biased parameter estimates when the data are subject to location error.

  17. Reconciling resource utilization and resource selection functions.

    PubMed

    Hooten, Mevin B; Hanks, Ephraim M; Johnson, Devin S; Alldredge, Mat W

    2013-11-01

    1. Analyses based on utilization distributions (UDs) have been ubiquitous in animal space use studies, largely because they are computationally straightforward and relatively easy to employ. Conventional applications of resource utilization functions (RUFs) suggest that estimates of UDs can be used as response variables in a regression involving spatial covariates of interest. 2. It has been claimed that contemporary implementations of RUFs can yield inference about resource selection, although to our knowledge, an explicit connection has not been described. 3. We explore the relationships between RUFs and resource selection functions from a heuristic and simulation perspective. We investigate several sources of potential bias in the estimation of resource selection coefficients using RUFs (e.g. the spatial covariance modelling that is often used in RUF analyses). 4. Our findings illustrate that RUFs can, in fact, serve as approximations to RSFs and are capable of providing inference about resource selection, but only with some modification and under specific circumstances. 5. Using real telemetry data as an example, we provide guidance on which methods for estimating resource selection may be more appropriate and in which situations. In general, if telemetry data are assumed to arise as a point process, then RSF methods may be preferable to RUFs; however, modified RUFs may provide less biased parameter estimates when the data are subject to location error.

  18. Development of chloroplast genomic resources for Cynara.

    PubMed

    Curci, Pasquale L; De Paola, Domenico; Sonnante, Gabriella

    2016-03-01

    In this study, new chloroplast (cp) resources were developed for the genus Cynara, using whole cp genomes from 20 genotypes, by means of high-throughput sequencing technologies. Our target species included seven globe artichokes, two cultivated cardoons, eight wild artichokes, and three other wild Cynara species (C. baetica, C. cornigera and C. syriaca). One complete cp genome was isolated using short reads from a whole-genome sequencing project, while the others were obtained by means of long-range PCR, for which primer pairs are provided here. A de novo assembly strategy combined with a reference-based assembly allowed us to reconstruct each cp genome. Comparative analyses among the newly sequenced genotypes and two additional Cynara cp genomes ('Brindisino' artichoke and C. humilis) retrieved from public databases revealed 126 parsimony informative characters and 258 singletons in Cynara, for a total of 384 variable characters. Thirty-nine SSR loci and 34 other INDEL events were detected. After data analysis, 37 primer pairs for SSR amplification were designed, and these molecular markers were subsequently validated in our Cynara genotypes. Phylogenetic analysis based on all cp variable characters provided the best resolution when compared to what was observed using only parsimony informative characters, or only short 'variable' cp regions. The evaluation of the molecular resources obtained from this study led us to support the 'super-barcode' theory and consider the total cp sequence of Cynara as a reliable and valuable molecular marker for exploring species diversity and examining variation below the species level. PMID:26354522
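The split of variable characters into parsimony informative characters and singletons reported in the Cynara abstract can be illustrated with a minimal sketch. It assumes the common definition (a site is parsimony informative when at least two character states each occur in at least two sequences, and a singleton is any other variable site); the abstract does not say which software or exact rule the authors used. The function names and the toy alignment are hypothetical and are not data from the study.

```python
from collections import Counter


def classify_site(column, missing="-?N"):
    """Label one alignment column as 'constant', 'singleton' or 'informative'.

    Gaps and ambiguous bases (assumed here to be '-', '?' and 'N') are ignored.
    A column with fewer than two observed states is treated as constant.
    """
    states = Counter(base for base in column.upper() if base not in missing)
    if len(states) < 2:
        return "constant"
    # Count the states that are shared by at least two sequences.
    shared = sum(1 for count in states.values() if count >= 2)
    return "informative" if shared >= 2 else "singleton"


def summarize_alignment(sequences):
    """Tally site classes over equal-length aligned sequences."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    tally = Counter(classify_site("".join(col)) for col in zip(*sequences))
    informative, singleton = tally["informative"], tally["singleton"]
    return {
        "constant": tally["constant"],
        "informative": informative,
        "singleton": singleton,
        "variable": informative + singleton,
    }


# Hypothetical toy alignment (four sequences, five columns).
toy_alignment = ["ACGTA", "ACGTC", "ATGAA", "ATGTA"]
print(summarize_alignment(toy_alignment))
# {'constant': 2, 'informative': 1, 'singleton': 2, 'variable': 3}
```

Under this definition the variable total is the sum of the two classes, which matches the arithmetic in the abstract (126 + 258 = 384).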

  19. Functional genomics of pathogenic bacteria.

    PubMed Central

    Moxon, E R; Hood, D W; Saunders, N J; Schweda, E K H; Richards, J C

    2002-01-01

    Microbial diseases remain the commonest cause of global mortality and morbidity. Automated-DNA sequencing has revolutionized the investigation of pathogenic microbes by making the immense fund of information contained in their genomes available at reasonable cost. The challenge is how this information can be used to increase current understanding of the biology of commensal and virulence behaviour of pathogens with particular emphasis on in vivo function and novel approaches to prevention. One example of the application of whole-genome-sequence information is afforded by investigations of the pathogenic role of Haemophilus influenzae lipopolysaccharide and its candidacy as a vaccine. PMID:11839188

  20. 2004 Structural, Function and Evolutionary Genomics

    SciTech Connect

    Douglas L. Brutlag; Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  1. Genomics data resources: frameworks and standards.

    PubMed

    Wilkinson, Mark D

    2012-01-01

    The emergence of genomics tools for the evolutionary and comparative biology community led to a rapid explosion in the number of online resources targeted at this specialized community, including Web-based comparative genomics software, such as the Artemis Comparison Tool (WebACT); databases, such as PaleoDB, Global Biodiversity Information Facility, and TreeBase; and knowledge frameworks, such as the Evolution Ontology. Unfortunately, these providers are largely independent of one another and therefore the individual resources do not share any centralized plan for how the data or tools would or should be provided. As a result, there are a myriad of often incompatible technologies and frameworks being used by this community of providers. In this chapter, we explore approaches to online resource publication, both those already in use by the community, as well as new and emergent frameworks and standards. Exploration of the strengths and weaknesses of each approach, together with a brief exploration of the philosophy or informatics theory behind the varying approaches, will hopefully help readers as they navigate this data space. The discussion is constructed such that it lays the groundwork for exploration of a new global standard for data and knowledge representation--"The Semantic Web"--that holds promise of providing solutions to many of the complexities users face in their attempts to discover and integrate biodiversity data, and examples are provided.

  2. MycoCosm, an Integrated Fungal Genomics Resource

    SciTech Connect

    Shabalov, Igor; Grigoriev, Igor

    2012-03-16

    MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.

  3. Functional Annotation Analytics of Rhodopseudomonas palustris Genomes

    PubMed Central

    Simmons, Shaneka S.; Isokpehi, Raphael D.; Brown, Shyretha D.; McAllister, Donee L.; Hall, Charnia C.; McDuffy, Wanaki M.; Medley, Tamara L.; Udensi, Udensi K.; Rajnarayanan, Rajendram V.; Ayensu, Wellington K.; Cohly, Hari H.P.

    2011-01-01

    Rhodopseudomonas palustris, a nonsulphur purple photosynthetic bacterium, has been extensively investigated for its metabolic versatility including the ability to produce hydrogen gas from sunlight and biomass. The availability of the finished genome sequences of six R. palustris strains (BisA53, BisB18, BisB5, CGA009, HaA2 and TIE-1) combined with online bioinformatics software for integrated analysis presents new opportunities to determine the genomic basis of metabolic versatility and ecological lifestyles of the bacterial species. The purpose of this investigation was to compare the functional annotations available for multiple R. palustris genomes to identify annotations that can be further investigated for strain-specific or uniquely shared phenotypic characteristics. A total of 2,355 protein family Pfam domain annotations were clustered based on presence or absence in the six genomes. The clustering process identified groups of functional annotations including those that could be verified as strain-specific or uniquely shared phenotypes. For example, genes encoding water/glycerol transport were present in the genome sequences of strains CGA009 and BisB5, but absent in strains BisA53, BisB18, HaA2 and TIE-1. Protein structural homology modeling predicted that the two orthologous 240 aa R. palustris aquaporins have water-specific transport function. Based on observations in other microbes, the presence of aquaporin in R. palustris strains may improve freeze tolerance in natural conditions of rapid freezing such as nitrogen fixation at low temperatures where access to liquid water is a limiting factor for nitrogenase activation. In the case of adaptive loss of aquaporin genes, strains may be better adapted to survive in conditions of high-sugar content such as fermentation of biomass for biohydrogen production. Finally, web-based resources were developed to allow for interactive, user-defined selection of the relationship between protein family annotations and the R
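The presence/absence comparison described in the R. palustris abstract can be sketched as a grouping of annotations by their pattern across the six strains. This is a simplified stand-in for the clustering the authors performed, whose method is not given in the abstract. The function names are hypothetical, and the annotation labels are toy placeholders, except that the aquaporin-like entry follows the CGA009/BisB5 pattern stated above.

```python
from collections import defaultdict

# Strain names as listed in the abstract.
STRAINS = ("BisA53", "BisB18", "BisB5", "CGA009", "HaA2", "TIE-1")


def group_by_pattern(annotations):
    """Group annotation names by their presence/absence pattern.

    `annotations` maps an annotation name to the set of strains in which
    it is present. Returns {pattern: sorted names}, where a pattern is a
    tuple of booleans ordered as in STRAINS.
    """
    groups = defaultdict(list)
    for name, present_in in annotations.items():
        unknown = set(present_in) - set(STRAINS)
        if unknown:
            raise ValueError(f"unknown strains for {name}: {sorted(unknown)}")
        pattern = tuple(strain in present_in for strain in STRAINS)
        groups[pattern].append(name)
    return {pattern: sorted(names) for pattern, names in groups.items()}


def strains_for(pattern):
    """Translate a boolean pattern back into strain names."""
    return tuple(s for s, present in zip(STRAINS, pattern) if present)


# Hypothetical toy input; only the aquaporin pattern comes from the abstract.
toy_annotations = {
    "aquaporin_like": {"CGA009", "BisB5"},
    "toy_domain_1": set(STRAINS),
    "toy_domain_2": {"TIE-1"},
    "toy_domain_3": {"CGA009", "BisB5"},
}

for pattern, names in group_by_pattern(toy_annotations).items():
    print(strains_for(pattern), names)
```

Annotations that share a pattern limited to one strain would be candidates for strain-specific traits, and patterns limited to a few strains would be candidates for uniquely shared traits.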

  4. Functional genomics approaches in parasitic helminths.

    PubMed

    Hagen, J; Lee, E F; Fairlie, W D; Kalinna, B H

    2012-01-01

    As research on parasitic helminths is moving into the post-genomic era, an enormous effort is directed towards deciphering gene function and achieving gene annotation. The sequences that are available in public databases undoubtedly hold information that can be utilized for new interventions and control but the exploitation of these resources has until recently remained difficult. Only now, with the emergence of methods to genetically manipulate and transform parasitic worms will it be possible to gain a comprehensive understanding of the molecular mechanisms involved in nutrition, metabolism, developmental switches/maturation and interaction with the host immune system. This review focuses on functional genomics approaches in parasitic helminths that are currently used, to highlight potential applications of these technologies in the areas of cell biology, systems biology and immunobiology of parasitic helminths.

  5. Genomes with distinct function composition.

    PubMed

    Tamames, J; Ouzounis, C; Sander, C; Valencia, A

    1996-06-24

    The functional composition of organisms can be analysed for the first time with the appearance of complete or sizeable parts of various genomes. We have reduced the problem of protein function classification to a simple scheme with three classes of protein function: energy-, information- and communication-associated proteins. Finer classification schemes can be easily mapped to the above three classes. To deal with the vast amount of information, a system for automatic function classification using database annotations has been developed. The system is able to classify correctly about 80% of the query sequences with annotations. Using this system, we can analyse samples from the genomes of the most represented species in sequence databases and compare their genomic composition. The similarities and differences for different taxonomic groups are strikingly intuitive. Viruses have the highest proportion of proteins involved in the control and expression of genetic information. Bacteria have the highest proportion of their genes dedicated to the production of proteins associated with small molecule transformations and transport. Animals have a very large proportion of proteins associated with intra- and intercellular communication and other regulatory processes. In general, the proportion of communication-related proteins increases during evolution, indicating trends that led to the emergence of the eukaryotic cell and later the transition from unicellular to multicellular organisms. PMID:8682215
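The three-class scheme in the abstract above (energy-, information- and communication-associated proteins) can be illustrated with a minimal keyword-based sketch. The abstract does not describe the actual classification rules, so the keyword lists, function names and example annotations below are all assumptions chosen for illustration only.

```python
from collections import Counter

# Hypothetical keyword lists; the real system's rules are not given in the abstract.
KEYWORDS = {
    "energy": ("dehydrogenase", "synthase", "transporter", "permease"),
    "information": ("polymerase", "ribosomal", "helicase", "transcription"),
    "communication": ("receptor", "signal", "hormone", "adhesion"),
}


def classify(annotation):
    """Return the first class whose keyword occurs in the annotation, else None."""
    text = annotation.lower()
    for label, words in KEYWORDS.items():
        if any(word in text for word in words):
            return label
    return None


def composition(annotations):
    """Fraction of classified annotations that falls into each class."""
    labels = [label for label in map(classify, annotations) if label is not None]
    counts = Counter(labels)
    return {label: counts[label] / len(labels) for label in KEYWORDS} if labels else {}


# Hypothetical database annotations for a toy genome sample.
sample = [
    "DNA polymerase III subunit",
    "ABC transporter permease",
    "G protein-coupled receptor",
    "hypothetical protein",
    "50S ribosomal protein L2",
]
print(composition(sample))
# {'energy': 0.25, 'information': 0.5, 'communication': 0.25}
```

Comparing such fractions between samples from different taxa is the kind of composition analysis the abstract describes; unclassified entries (here, the hypothetical protein) are left out of the denominator in this sketch.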

  6. [The ENCODE project and functional genomics studies].

    PubMed

    Ding, Nan; Qu, Hongzhu; Fang, Xiangdong

    2014-03-01

    Upon the completion of the Human Genome Project, scientists have been trying to interpret the underlying genomic code for human biology. Since 2003, the National Human Genome Research Institute (NHGRI) has invested nearly $0.3 billion and gathered over 440 scientists from more than 32 institutions in the United States, China, United Kingdom, Japan, Spain and Singapore to initiate the Encyclopedia of DNA Elements (ENCODE) project, aiming to identify and analyze all regulatory elements in the human genome. Taking advantage of the development of next-generation sequencing technologies and continuous improvement of experimental methods, ENCODE has made remarkable achievements: it identified methylation and histone modification of DNA sequences and their regulatory effects on gene expression through altering chromatin structures, categorized binding sites of various transcription factors and constructed their regulatory networks, further revised and updated the database for pseudogenes and non-coding RNA, and identified SNPs in regulatory sequences associated with diseases. These findings help to comprehensively understand information embedded in gene and genome sequences, the function of regulatory elements as well as the molecular mechanism underlying the transcriptional regulation by noncoding regions, and provide an extensive data resource for life sciences, particularly for translational medicine. We reviewed the contributions of high-throughput sequencing platform development and bioinformatical technology improvement to the ENCODE project, the association between epigenetics studies and the ENCODE project, and the major achievements of the ENCODE project. We also provided our perspective on the role of the ENCODE project in promoting the development of basic and clinical medicine.

  7. Genomic resources for flatfish research and their applications.

    PubMed

    Cerdà, J; Douglas, S; Reith, M

    2010-10-01

    Flatfishes are a group of teleosts of high commercial and environmental interest, whose biology is still poorly understood. The recent rapid development of different 'omic' technologies is, however, enhancing the knowledge of the complex genetic control underlying different physiological processes of flatfishes. This review describes the different functional genomic approaches and resources currently available for flatfish research and summarizes different areas where microarray-based gene expression analysis has been applied. The increase in genome sequencing data has also allowed the construction of genetic linkage maps in different flatfish species; these maps are invaluable for investigating genome organization and identifying genetic traits of commercial interest. Despite the significant progress in this field, the genomic resources currently available for flatfish are still scarce. Further intensive research should be carried out to develop larger genomic sequence databases, high-density microarrays and more detailed, complete linkage maps, using second-generation sequencing platforms. These tools will be crucial for further expanding the knowledge of flatfish physiology, and it is predicted that they will have important implications for wild fish population management, improved fish welfare and increased productivity in aquaculture.

  8. Kiwifruit Information Resource (KIR): a comparative platform for kiwifruit genomics.

    PubMed

    Yue, Junyang; Liu, Jian; Ban, Rongjun; Tang, Wei; Deng, Lin; Fei, Zhangjun; Liu, Yongsheng

    2015-01-01

    The Kiwifruit Information Resource (KIR) is dedicated to maintaining and integrating comprehensive datasets on genomics, functional genomics and transcriptomics of kiwifruit (Actinidiaceae). KIR serves as a central access point for existing/new genomic and genetic data. KIR also provides researchers with a variety of visualization and analysis tools. Current developments include the updated genome structure of Actinidia chinensis cv. Hongyang and its newest genome annotation, putative transcripts, gene expression, physical markers of genetic traits as well as relevant publications based on the latest genome assembly. Nine thousand five hundred and forty-seven new transcripts are detected and 21 132 old transcripts are changed. At the present release, the next-generation transcriptome sequencing data has been incorporated into gene models and splice variants. Protein-protein interactions are also identified based on experimentally determined orthologous interactions. Furthermore, the experimental results reported in peer-reviewed literature are manually extracted and integrated within a well-developed query page. In total, 122 identifications are currently associated, including commonly used gene names and symbols. All KIR datasets are helpful to facilitate a broad range of kiwifruit research topics and freely available to the research community. Database URL: http://bdg.hfut.edu.cn/kir/index.html. PMID:26656885

  10. Floral gene resources from basal angiosperms for comparative genomics research

    PubMed Central

    Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

    2005-01-01

    Background: The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results: Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion: Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and

  11. MorusDB: a resource for mulberry genomics and genome biology.

    PubMed

    Li, Tian; Qi, Xiwu; Zeng, Qiwei; Xiang, Zhonghuai; He, Ningjia

    2014-01-01

    Mulberry is an important cultivated plant that has received the attention of biologists interested in sericulture and plant-insect interaction. Morus notabilis, a wild mulberry species with a minimal chromosome number, is an ideal material for whole-genome sequencing and assembly. The genome and transcriptome of M. notabilis were sequenced and analyzed. In this article, a web-based and open-access database, the Morus Genome Database (MorusDB), was developed to enable easy access and data mining. The MorusDB provides an integrated data source and easy access to mulberry large-scale genomic sequencing and assembly, predicted genes and functional annotations, expressed sequence tags (ESTs), transposable elements (TEs), Gene Ontology (GO) terms, horizontal gene transfers between mulberry and silkworm and ortholog and paralog groups. Transcriptome sequencing data for M. notabilis root, leaf, bark, winter bud and male flower can also be searched and downloaded. Furthermore, MorusDB provides an analytical workbench with some built-in tools and pipelines, such as BLAST, Search GO, Mulberry GO and Mulberry GBrowse, to facilitate genomic studies and comparative genomics. The MorusDB provides important genomic resources for scientists working with mulberry and other Moraceae species, which include many important fruit crops. Designed as a basic platform and accompanied by the SilkDB, MorusDB strives to be a comprehensive platform for the silkworm-mulberry interaction studies. Database URL: http://morus.swu.edu.cn/morusdb.

  12. Functional Insights from Structural Genomics

    SciTech Connect

    Forouhar, F.; Kuzin, A.; Seetharaman, J.; Lee, I.; Zhou, W.; Abashidze, M.; Chen, Y.; Montelione, G.; Tong, L.; et al

    2007-01-01

    Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with an important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF_0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP_1951), and a 12-stranded β-barrel with a novel fold (V. parahaemolyticus VPA1032).

  13. Gene Chips and Functional Genomics

    NASA Astrophysics Data System (ADS)

    Hamadeh, Hisham; Afshari, Cynthia

    2000-11-01

    These past few years of scientific discovery will undoubtedly be remembered as the "genomics era," the period in which biologists succeeded in enumerating the sequence of nucleotides making up all, or at least most, of human DNA. And while this achievement has been heralded as a technological feat equal to the moon landing, it is only the first of many advances in DNA technology. Scientists are now faced with the task of understanding the meaning of the DNA sequence. Specifically, they want to learn how the DNA code relates to protein function. An important tool in the study of "functional genomics" is the cDNA microarray, also known as the gene chip. Inspired by computer microchips, gene chips allow scientists to monitor the expression of hundreds, even thousands, of genes in a fraction of the time it used to take to monitor the expression of a single one. By altering the conditions under which a particular tissue expresses genes (say, by exposing it to toxins or growth factors), scientists can determine the suite of genes expressed in different situations and hence start to get a handle on the function of these genes. The authors discuss this important new technology and some of its practical applications.

  14. GEneSTATION 1.0: a synthetic resource of diverse evolutionary and functional genomic data for studying the evolution of pregnancy-associated tissues and phenotypes.

    PubMed

    Kim, Mara; Cooper, Brian A; Venkat, Rohit; Phillips, Julie B; Eidem, Haley R; Hirbo, Jibril; Nutakki, Sashank; Williams, Scott M; Muglia, Louis J; Capra, J Anthony; Petren, Kenneth; Abbot, Patrick; Rokas, Antonis; McGary, Kriston L

    2016-01-01

    Mammalian gestation and pregnancy are fast evolving processes that involve the interaction of the fetal, maternal and paternal genomes. Version 1.0 of the GEneSTATION database (http://genestation.org) integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and to accelerate the translation of discoveries from model organisms to humans. GEneSTATION is built using tools from the Generic Model Organism Database project, including the biology-aware database CHADO, new tools for rapid data integration, and algorithms that streamline synthesis and user access. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 high-quality mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g. gene age, population genetic and molecular evolutionary statistics), organismal (e.g. tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g. Gene Ontology Annotation, protein interactions), as well as links to many general (e.g. Entrez, PubMed) and pregnancy disease-specific (e.g. PTBgene, dbPTB) databases. By facilitating the synthesis of diverse functional and evolutionary data in pregnancy-associated tissues and phenotypes and enabling their quick, intuitive, accurate and customized meta-analysis, GEneSTATION provides a novel platform for comprehensive investigation of the function and evolution of mammalian pregnancy.

  16. GEneSTATION 1.0: a synthetic resource of diverse evolutionary and functional genomic data for studying the evolution of pregnancy-associated tissues and phenotypes

    PubMed Central

    Kim, Mara; Cooper, Brian A.; Venkat, Rohit; Phillips, Julie B.; Eidem, Haley R.; Hirbo, Jibril; Nutakki, Sashank; Williams, Scott M.; Muglia, Louis J.; Capra, J. Anthony; Petren, Kenneth; Abbot, Patrick; Rokas, Antonis; McGary, Kriston L.

    2016-01-01

    Mammalian gestation and pregnancy are fast evolving processes that involve the interaction of the fetal, maternal and paternal genomes. Version 1.0 of the GEneSTATION database (http://genestation.org) integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and to accelerate the translation of discoveries from model organisms to humans. GEneSTATION is built using tools from the Generic Model Organism Database project, including the biology-aware database CHADO, new tools for rapid data integration, and algorithms that streamline synthesis and user access. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 high-quality mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g. gene age, population genetic and molecular evolutionary statistics), organismal (e.g. tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g. Gene Ontology Annotation, protein interactions), as well as links to many general (e.g. Entrez, PubMed) and pregnancy disease-specific (e.g. PTBgene, dbPTB) databases. By facilitating the synthesis of diverse functional and evolutionary data in pregnancy-associated tissues and phenotypes and enabling their quick, intuitive, accurate and customized meta-analysis, GEneSTATION provides a novel platform for comprehensive investigation of the function and evolution of mammalian pregnancy. PMID:26567549

  17. Genomic resources in mungbean for future breeding programs

    PubMed Central

    Kim, Sue K.; Nair, Ramakrishnan M.; Lee, Jayern; Lee, Suk-Ha

    2015-01-01

    Within the legume family, mungbean (Vigna radiata) has become an important crop in Asia, showing a steady increase in global production. It provides a good source of protein and contains, most notably, folate and iron. Beyond the nutritional value of mungbean, its small genome size, short life cycle, self-pollinating habit, and close genetic relationship to other legumes make it a well-suited model organism among legume plants. In the past, there have been several efforts to develop molecular markers and linkage maps associated with agronomic traits for the genetic improvement of mungbean and, ultimately, breeding for cultivar development to increase the average yields of mungbean. The recent release of a reference genome of the cultivated mungbean (V. radiata var. radiata VC1973A) and an additional de novo sequencing of a wild relative mungbean (V. radiata var. sublobata) has provided a framework for mungbean genetic and genome research that can further be used for genome-wide association and functional studies to identify genes related to specific agronomic traits. Moreover, the diverse gene pool of wild mungbean comprises valuable genetic resources of beneficial genes that may be helpful in widening the genetic diversity of cultivated mungbean. This review paper covers the research progress on molecular and genomics approaches and the current status of breeding programs that have developed to move toward the ultimate goal of mungbean improvement. PMID:26322067

  18. Evolution, language and analogy in functional genomics.

    PubMed

    Benner, S A; Gaucher, E A

    2001-07-01

    Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.

  19. Evolution, language and analogy in functional genomics

    NASA Technical Reports Server (NTRS)

    Benner, S. A.; Gaucher, E. A.

    2001-01-01

    Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.

  20. The Populus Genome Integrative Explorer (PopGenIE): a new resource for exploring the Populus genome.

    PubMed

    Sjödin, Andreas; Street, Nathaniel Robert; Sandberg, Göran; Gustafsson, Petter; Jansson, Stefan

    2009-06-01

    Populus has become an important model plant system. However, utilization of the increasingly extensive collection of genetics and genomics data created by the community is currently hindered by the lack of a central resource, such as a model organism database (MOD). Such MODs offer a single entry point to the collection of resources available within a model system, typically including tools for exploring and querying those resources. As a starting point to overcoming the lack of such an MOD for Populus, we present the Populus Genome Integrative Explorer (PopGenIE), an integrated set of tools for exploring the Populus genome and transcriptome. The resource includes genome, synteny and quantitative trait locus (QTL) browsers for exploring genetic data. Expression tools include an electronic fluorescent pictograph (eFP) browser, expression profile plots, co-regulation within collated transcriptomics data sets, and identification of over-represented functional categories and genomic hotspot locations. A number of collated transcriptomics data sets are made available in the eFP browser to facilitate exploration of gene function. Additional homology and data extraction tools are provided. PopGenIE significantly increases accessibility to Populus genomics resources and allows exploration of transcriptomics data without the need to learn or understand complex statistical analysis methods. PopGenIE is available at www.popgenie.org or via www.populusgenome.info.

  1. Application of Functional Genomics for Bovine Respiratory Disease Diagnostics

    PubMed Central

    Rai, Aswathy N.; Epperson, William B.; Nanduri, Bindu

    2015-01-01

    Bovine respiratory disease (BRD) is the most common economically important disease affecting cattle. For developing accurate diagnostics that can predict disease susceptibility/resistance and stratification, it is necessary to identify the molecular mechanisms that underlie BRD. To study the complex interactions among the bovine host and the multitude of viral and bacterial pathogens, as well as the environmental factors associated with BRD etiology, genome-scale high-throughput functional genomics methods such as microarrays, RNA-seq, and proteomics are helpful. In this review, we summarize the progress made in our understanding of BRD using functional genomics approaches. We also discuss some of the available bioinformatics resources for analyzing high-throughput data, in the context of biological pathways and molecular interactions. Although resources for studying host response to infection are available, the corresponding information is lacking for the majority of BRD pathogens, impeding progress in identifying diagnostic signatures for BRD using functional genomics approaches. PMID:26526746

  2. Update on RefSeq microbial genomes resources.

    PubMed

    Tatusova, Tatiana; Ciufo, Stacy; Federhen, Scott; Fedorov, Boris; McVeigh, Richard; O'Neill, Kathleen; Tolstoy, Igor; Zaslavsky, Leonid

    2015-01-01

    The NCBI RefSeq genome collection (http://www.ncbi.nlm.nih.gov/genome) represents all three major domains of life (Eukarya, Bacteria and Archaea) as well as Viruses. Prokaryotic genome sequences are the most rapidly growing part of the collection. During 2014, more than 10,000 microbial genome assemblies were publicly released, bringing the total number of prokaryotic genomes close to 30,000. We continue to improve the quality and usability of the microbial genome resources by providing easy access to the data and the results of the pre-computed analysis, and improving analysis and visualization tools. A number of improvements have been incorporated into the Prokaryotic Genome Annotation Pipeline. Several new features have been added to the RefSeq prokaryotic genome data processing pipeline, including the calculation of genome groups (clades) and the optimization of protein cluster generation using a pan-genome approach.

  3. The Brachypodium genome sequence: a resource for oat genomics research

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Oat (Avena sativa) is an important cereal crop used as both an animal feed and for human consumption. Genetic and genomic research on oat is hindered because it is hexaploid and possesses a large (13 Gb) genome. Diploid Avena relatives have been employed for genetic and genomic studies, but only mod...

  4. Sources for Comparative Studies of Placentation. II. Genomic Resources

    PubMed Central

    Wildman, Derek E.

    2008-01-01

    The genomes of dozens of placental mammal species are now publicly available. These genome sequences have the potential to provide insight into the development and evolution of the placenta. In particular, the variable anatomy of the placenta has likely been affected by natural selection on the genomes of living and extinct mammals. In this note the current availability of mammal genome sequences is reviewed, and strengths and limitations of these data are discussed. Additionally, museums, zoos, and commercial entities are available to provide genomic resources to the placental research community. Recommendations for tissue storage conditions of placentas in genomic research are given. PMID:18155141

  5. Genomic resources and their influence on the detection of the signal of positive selection in genome scans.

    PubMed

    Manel, S; Perrier, C; Pratlong, M; Abi-Rached, L; Paganini, J; Pontarotti, P; Aurelle, D

    2016-01-01

    Genome scans represent powerful approaches to investigate the action of natural selection on the genetic variation of natural populations and to better understand local adaptation. This is very useful, for example, in the field of conservation biology and evolutionary biology. Thanks to Next Generation Sequencing, genomic resources are growing exponentially, improving genome scan analyses in non-model species. Thousands of SNPs called using Reduced Representation Sequencing are increasingly used in genome scans. Besides, genome sequences are also becoming increasingly available, allowing better processing of short-read data, offering physical localization of variants, and improving haplotype reconstruction and data imputation. Ultimately, genome sequences are also becoming the raw material for selection inferences. Here, we discuss how the increasing availability of such genomic resources, notably genome sequences, influences the detection of signals of selection. Mainly, increasing data density and having the information of physical linkage data expand genome scans by (i) improving the overall quality of the data, (ii) helping the reconstruction of demographic history for the population studied to decrease false-positive rates and (iii) improving the statistical power of methods to detect the signal of selection. Of particular importance, the availability of a high-quality reference genome can improve the detection of the signal of selection by (i) allowing matching the potential candidate loci to linked coding regions under selection, (ii) rapidly moving the investigation to the gene and function and (iii) ensuring that the highly variable regions of the genomes that include functional genes are also investigated. For all those reasons, using reference genomes in genome scan analyses is highly recommended.

  6. The Functional Genomics Initiative at Oak Ridge National Laboratory

    SciTech Connect

    Johnson, Dabney; Justice, Monica; Beattle, Ken; Buchanan, Michelle; Ramsey, Michael; Ramsey, Rose; Paulus, Michael; Ericson, Nance; Allison, David; Kress, Reid; Mural, Richard; Uberbacher, Ed; Mann, Reinhold

    1997-12-31

    The Functional Genomics Initiative at the Oak Ridge National Laboratory integrates outstanding capabilities in mouse genetics, bioinformatics, and instrumentation. The 50 year investment by the DOE in mouse genetics/mutagenesis has created a one-of-a-kind resource for generating mutations and understanding their biological consequences. It is generally accepted that, through the mouse as a surrogate for human biology, we will come to understand the function of human genes. In addition to this world class program in mammalian genetics, ORNL has also been a world leader in developing bioinformatics tools for the analysis, management and visualization of genomic data. Combining this expertise with new instrumentation technologies will provide a unique capability to understand the consequences of mutations in the mouse at both the organism and molecular levels. The goal of the Functional Genomics Initiative is to develop the technology and methodology necessary to understand gene function on a genomic scale and apply these technologies to megabase regions of the human genome. The effort is scoped so as to create an effective and powerful resource for functional genomics. ORNL is partnering with the Joint Genome Institute and other large scale sequencing centers to sequence several multimegabase regions of both human and mouse genomic DNA, to identify all the genes in these regions, and to conduct fundamental surveys to examine gene function at the molecular and organism level. The Initiative is designed to be a pilot for larger scale deployment in the post-genome era. Technologies will be applied to the examination of gene expression and regulation, metabolism, gene networks, physiology and development.

  7. Open chromatin reveals the functional maize genome

    PubMed Central

    Rodgers-Melnick, Eli; Vera, Daniel L.; Bass, Hank W.

    2016-01-01

    Cellular processes mediated through nuclear DNA must contend with chromatin. Chromatin structural assays can efficiently integrate information across diverse regulatory elements, revealing the functional noncoding genome. In this study, we use a differential nuclease sensitivity assay based on micrococcal nuclease (MNase) digestion to discover open chromatin regions in the maize genome. We find that maize MNase-hypersensitive (MNase HS) regions localize around active genes and within recombination hotspots, focusing biased gene conversion at their flanks. Although MNase HS regions map to less than 1% of the genome, they consistently explain a remarkably large amount (∼40%) of heritable phenotypic variance in diverse complex traits. MNase HS regions are therefore on par with coding sequences as annotations that demarcate the functional parts of the maize genome. These results imply that less than 3% of the maize genome (coding and MNase HS regions) may give rise to the overwhelming majority of phenotypic variation, greatly narrowing the scope of the functional genome. PMID:27185945

  8. HTS-DB: an online resource to publish and query data from functional genomics high-throughput siRNA screening projects

    PubMed Central

    Saunders, Rebecca E.; Instrell, Rachael; Rispoli, Rossella; Jiang, Ming; Howell, Michael

    2013-01-01

    High-throughput screening (HTS) uses technologies such as RNA interference to generate loss-of-function phenotypes on a genomic scale. As these technologies become more popular, many research institutes have established core facilities of expertise to deal with the challenges of large-scale HTS experiments. As the efforts of core facility screening projects come to fruition, focus has shifted towards managing the results of these experiments and making them available in a useful format that can be further mined for phenotypic discovery. The HTS-DB database provides a public view of data from screening projects undertaken by the HTS core facility at the CRUK London Research Institute. All projects and screens are described with comprehensive assay protocols, and datasets are provided with complete descriptions of analysis techniques. This format allows users to browse and search data from large-scale studies in an informative and intuitive way. It also provides a repository for additional measurements obtained from screens that were not the focus of the project, such as cell viability, and groups these data so that it can provide a gene-centric summary across several different cell lines and conditions. All datasets from our screens that can be made available can be viewed interactively and mined for further hit lists. We believe that in this format, the database provides researchers with rapid access to results of large-scale experiments that might facilitate their understanding of genes/compounds identified in their own research. Database URL: http://hts.cancerresearchuk.org/db/public PMID:24122843

  9. Genome resource banking of biomedically important laboratory animals.

    PubMed

    Agca, Yuksel

    2012-11-01

    Genome resource banking is the systematic collection, storage, and redistribution of biomaterials in an organized, logistical, and secure manner. Genome cryobanks usually contain biomaterials and associated genomic information essential for progression of biomedicine, human health, and research. In that regard, appropriate genome cryobanks could provide essential biomaterials for both current and future research projects in the form of various cell types and tissues, including sperm, oocytes, embryos, embryonic or adult stem cells, induced pluripotent stem cells, and gonadal tissues. In addition to cryobanked germplasm, cryobanking of DNA, serum, blood products, and tissues from scientifically, economically, and ecologically important species has become a common practice. For revitalization of the whole organism, cryopreserved germplasm, in conjunction with assisted reproductive technologies, offers a powerful approach for research model management, as well as assisting in animal production for agriculture, conservation, and human reproductive medicine. Recently, many developed and developing countries have allocated substantial resources to establish genome resource banks, which are responsible for safeguarding scientifically, economically, and ecologically important wild type, mutant, and transgenic plants, fish, and local livestock breeds, as well as wildlife species. This review is dedicated to the memory of Dr. John K. Critser, who has made profound contributions to the science of cryobiology and establishment of genome research and resources centers for mice, rats, and swine. Emphasis will be given to application of genome resource banks to species with substantial contributions to the advancement of biomedicine and human health. PMID:22981880

  10. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.

    PubMed

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotations and sequences, are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/.

  11. Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources

    PubMed Central

    Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

    2015-01-01

    The Brassica database (BRAD) was built initially to help users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotations and sequences, are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/ PMID:26589635

  13. The KEGG resource for deciphering the genome.

    PubMed

    Kanehisa, Minoru; Goto, Susumu; Kawashima, Shuichi; Okuno, Yasushi; Hattori, Masahiro

    2004-01-01

    A grand challenge in the post-genomic era is a complete computer representation of the cell and the organism, which will enable computational prediction of higher-level complexity of cellular processes and organism behavior from genomic information. Toward this end we have been developing a knowledge-based approach for network prediction, which is to predict, given a complete set of genes in the genome, the protein interaction networks that are responsible for various cellular processes. KEGG at http://www.genome.ad.jp/kegg/ is the reference knowledge base that integrates current knowledge on molecular interaction networks such as pathways and complexes (PATHWAY database), information about genes and proteins generated by genome projects (GENES/SSDB/KO databases) and information about biochemical compounds and reactions (COMPOUND/GLYCAN/REACTION databases). These three types of database actually represent three graph objects, called the protein network, the gene universe and the chemical universe. New efforts are being made to abstract knowledge, both computationally and manually, about ortholog clusters in the KO (KEGG Orthology) database, and to collect and analyze carbohydrate structures in the GLYCAN database. PMID:14681412

  14. Functional profiling of the Saccharomyces cerevisiae genome.

    PubMed

    Giaever, Guri; Chu, Angela M; Ni, Li; Connelly, Carla; Riles, Linda; Véronneau, Steeve; Dow, Sally; Lucau-Danila, Ankuta; Anderson, Keith; André, Bruno; Arkin, Adam P; Astromoff, Anna; El-Bakkoury, Mohamed; Bangham, Rhonda; Benito, Rocio; Brachat, Sophie; Campanaro, Stefano; Curtiss, Matt; Davis, Karen; Deutschbauer, Adam; Entian, Karl-Dieter; Flaherty, Patrick; Foury, Francoise; Garfinkel, David J; Gerstein, Mark; Gotte, Deanna; Güldener, Ulrich; Hegemann, Johannes H; Hempel, Svenja; Herman, Zelek; Jaramillo, Daniel F; Kelly, Diane E; Kelly, Steven L; Kötter, Peter; LaBonte, Darlene; Lamb, David C; Lan, Ning; Liang, Hong; Liao, Hong; Liu, Lucy; Luo, Chuanyun; Lussier, Marc; Mao, Rong; Menard, Patrice; Ooi, Siew Loon; Revuelta, Jose L; Roberts, Christopher J; Rose, Matthias; Ross-Macdonald, Petra; Scherens, Bart; Schimmack, Greg; Shafer, Brenda; Shoemaker, Daniel D; Sookhai-Mahadeo, Sharon; Storms, Reginald K; Strathern, Jeffrey N; Valle, Giorgio; Voet, Marleen; Volckaert, Guido; Wang, Ching-yun; Ward, Teresa R; Wilhelmy, Julie; Winzeler, Elizabeth A; Yang, Yonghong; Yen, Grace; Youngman, Elaine; Yu, Kexin; Bussey, Howard; Boeke, Jef D; Snyder, Michael; Philippsen, Peter; Davis, Ronald W; Johnston, Mark

    2002-07-25

    Determining the effect of gene deletion is a fundamental approach to understanding gene function. Conventional genetic screens exhibit biases, and genes contributing to a phenotype are often missed. We systematically constructed a nearly complete collection of gene-deletion mutants (96% of annotated open reading frames, or ORFs) of the yeast Saccharomyces cerevisiae. DNA sequences dubbed 'molecular bar codes' uniquely identify each strain, enabling their growth to be analysed in parallel and the fitness contribution of each gene to be quantitatively assessed by hybridization to high-density oligonucleotide arrays. We show that previously known and new genes are necessary for optimal growth under six well-studied conditions: high salt, sorbitol, galactose, pH 8, minimal medium and nystatin treatment. Less than 7% of genes that exhibit a significant increase in messenger RNA expression are also required for optimal growth in four of the tested conditions. Our results validate the yeast gene-deletion collection as a valuable resource for functional genomics.

  15. Genome Resource Banking of Biomedically Important Laboratory Animals

    PubMed Central

    Agca, Yuksel

    2014-01-01

    Genome resource banking (GRB) is the systematic collection, storage, and re-distribution of biomaterials in an organized, logistical, and secure manner. Genome cryobanks usually contain biomaterials and associated genomic information essential for progression of biomedicine, human health, and research. In that regard, appropriate genome cryobanks could provide essential biomaterials for both current and future research projects in the form of various cell types and tissues, including sperm, oocytes, embryos, embryonic or adult stem cells, induced pluripotent stem cells, and gonadal tissues. In addition to cryobanked germplasm, cryobanking of DNA, serum, blood products, and tissues from scientifically, economically and ecologically important species has become a common practice. For revitalization of the whole organism, cryopreserved germplasm in conjunction with assisted reproductive technologies (ART) offers a powerful approach for research model management, as well as assisting in animal production for agriculture, conservation, and human reproductive medicine. Recently, many developed and developing countries have allocated substantial resources to establish genome resource banks which are responsible for safeguarding scientifically, economically and ecologically important wild type, mutant and transgenic plants, fish, and local livestock breeds, as well as wildlife species. This review is dedicated to the memory of Dr. John K. Critser, who had made profound contributions to the science of cryobiology and establishment of genome research and resources centers for mice, rats and swine. Emphasis will be given to application of GRBs to species with substantial contributions to the advancement of biomedicine and human health. PMID:22981880

  16. The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

    PubMed

    Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

    2011-01-01

    The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.

  17. MBGD update 2010: toward a comprehensive resource for exploring microbial genome diversity.

    PubMed

    Uchiyama, Ikuo; Higuchi, Toshio; Kawai, Mikihiko

    2010-01-01

    The microbial genome database (MBGD) for comparative analysis is a platform for microbial comparative genomics based on automated ortholog group identification. A prominent feature of MBGD is that it allows users to create ortholog groups using a specified subgroup of organisms. The database is constantly updated and now contains almost 1000 genomes. To utilize the MBGD database as a comprehensive resource for investigating microbial genome diversity, we have developed the following advanced functionalities: (i) enhanced assignment of functional annotation, including external database links to each orthologous group, (ii) interface for choosing a set of genomes to compare based on phenotypic properties, (iii) the addition of more eukaryotic microbial genomes (fungi and protists) and some higher eukaryotes as references and (iv) enhancement of the MyMBGD mode, which allows users to add their own genomes to MBGD and now accepts raw genomic sequences without any annotation (in such a case, it runs a gene-finding procedure before identifying the orthologs). Some analysis functions, such as the function to find orthologs with similar phylogenetic patterns, have also been improved. MBGD is accessible at http://mbgd.genome.ad.jp/.

  18. Termites as functional gene resources.

    PubMed

    Matsui, Toru; Tokuda, Gaku; Shinzato, Naoya

    2009-01-01

    Termites (Dictyoptera, Isoptera) comprise a complex assemblage of diverse species, roughly divided into so-called lower and higher termites. Lower termites harbor a dense and diverse population of prokaryotes and flagellated protists (single-cell eukaryotes) in their gut. Higher termites comprise only one apical family (Termitidae) but more than three-quarters of all termite species. While they also harbor a dense and diverse array of prokaryotes, higher termites typically lack flagellated protists. Although termites are regarded as harmful because of their ability to decompose cellulosic materials such as houses made of wood, classical enrichment culture techniques and recent metagenomic approaches have shown that termites and/or their symbionts are potentially good resources of functional genes for industrial applications. Recent papers and patents have shown that termites and their symbionts have not only cellulolytic or lignin-decomposing activity but also the ability to degrade aromatic hydrocarbons. These functions would be useful for biomass utilization, environmental remediation, and fine-chemicals production. In this review, along with the current patents of termite-derived biochemical functions, future prospects for practical application based on the recent progress in metagenomic research are discussed.

  19. Resources for Biological Annotation of the Drosophila Genome

    SciTech Connect

    Gerald M. Rubin

    2005-08-08

    This project provided seed money for the development of cDNA and genetic resources to support studies of the Drosophila melanogaster genome. Key publications supported by this work that provide additional detail: (1) "The Drosophila gene collection: identification of putative full-length cDNAs for 70% of D. melanogaster genes"; and (2) "The Berkeley Drosophila Genome Project gene disruption project: Single P-element insertions mutating 25% of vital Drosophila genes".

  20. Genomics and bioinformatics resources for translational science in Rosaceae.

    PubMed

    Jung, Sook; Main, Dorrie

    2014-01-01

    Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.

  1. e-Fungi: a data resource for comparative analysis of fungal genomes

    PubMed Central

    Hedeler, Cornelia; Wong, Han Min; Cornell, Michael J; Alam, Intikhab; Soanes, Darren M; Rattray, Magnus; Hubbard, Simon J; Talbot, Nicholas J; Oliver, Stephen G; Paton, Norman W

    2007-01-01

    Background: The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tend to be distributed in various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. Description: To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information have been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. Conclusion: The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the genomes stored in the database.

  2. PhytoPath: an integrative resource for plant pathogen genomics

    PubMed Central

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D.; Staines, Daniel M.; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species. PMID:26476449

  3. PhytoPath: an integrative resource for plant pathogen genomics.

    PubMed

    Pedro, Helder; Maheswari, Uma; Urban, Martin; Irvine, Alistair George; Cuzick, Alayne; McDowall, Mark D; Staines, Daniel M; Kulesha, Eugene; Hammond-Kosack, Kim Elizabeth; Kersey, Paul Julian

    2016-01-01

    PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species, that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungi, protists (oomycetes) and bacterial plant pathogens that have genomes that have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species.

  4. Assembly: a resource for assembled genomes at NCBI

    PubMed Central

    Kitts, Paul A.; Church, Deanna M.; Thibaud-Nissen, Françoise; Choi, Jinna; Hem, Vichet; Sapojnikov, Victor; Smith, Robert G.; Tatusova, Tatiana; Xiang, Charlie; Zherikov, Andrey; DiCuccio, Michael; Murphy, Terence D.; Pruitt, Kim D.; Kimchi, Avi

    2016-01-01

    The NCBI Assembly database (www.ncbi.nlm.nih.gov/assembly/) provides stable accessioning and data tracking for genome assembly data. The model underlying the database can accommodate a range of assembly structures, including sets of unordered contig or scaffold sequences, bacterial genomes consisting of a single complete chromosome, or complex structures such as a human genome with modeled allelic variation. The database provides an assembly accession and version to unambiguously identify the set of sequences that make up a particular version of an assembly, and tracks changes to updated genome assemblies. The Assembly database reports metadata such as assembly names, simple statistical reports of the assembly (number of contigs and scaffolds, contiguity metrics such as contig N50, total sequence length and total gap length) as well as the assembly update history. The Assembly database also tracks the relationship between an assembly submitted to the International Nucleotide Sequence Database Consortium (INSDC) and the assembly represented in the NCBI RefSeq project. Users can find assemblies of interest by querying the Assembly Resource directly or by browsing available assemblies for a particular organism. Links in the Assembly Resource allow users to easily download sequence and annotations for current versions of genome assemblies from the NCBI genomes FTP site. PMID:26578580
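The contiguity statistics named in the abstract above (number of contigs, total sequence length, contig N50) can be illustrated with a minimal sketch. It uses the common definition of N50 (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total length). The function name `assembly_stats` and the toy contig lengths are hypothetical; this is not NCBI's code.

```python
def assembly_stats(contig_lengths):
    """Summarise a list of contig lengths (in bases).

    Illustrative only: the function name and return layout are
    assumptions, not part of the NCBI Assembly database.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")

    ordered = sorted(contig_lengths, reverse=True)
    total = sum(ordered)

    # Walk from the longest contig down until half the assembly is covered.
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if 2 * running >= total:
            n50 = length
            break

    return {"contigs": len(ordered), "total_length": total, "n50": n50}


# Toy example: seven contigs totalling 300 bases.
stats = assembly_stats([80, 70, 50, 40, 30, 20, 10])
print(stats)
```

In the toy example the two longest contigs (80 + 70) already cover half of the 300 bases, so the N50 is 70.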

  5. From genome to function: the Arabidopsis aquaporins

    PubMed Central

    Quigley, Francoise; Rosenberg, Joshua M; Shachar-Hill, Yair; Bohnert, Hans J

    2002-01-01

    Background: In the post-genomic era newly sequenced genomes can be used to deduce organismal functions from our knowledge of other systems. Here we apply this approach to analyzing the aquaporin gene family in Arabidopsis thaliana. The aquaporins are intrinsic membrane proteins that have been characterized as facilitators of water flux. Originally termed major intrinsic proteins (MIPs), they are now also known as water channels, glycerol facilitators and aqua-glyceroporins, yet recent data suggest that they facilitate the movement of other low-molecular-weight metabolites as well. Results: The Arabidopsis genome contains 38 sequences with homology to aquaporin in four subfamilies, termed PIP, TIP, NIP and SIP. We have analyzed aquaporin family structure and expression using the A. thaliana genome sequence, and introduce a new NMR approach for the purpose of analyzing water movement in plant roots in vivo. Conclusions: Our preliminary data indicate a strongly transcellular component for the flux of water in roots. PMID:11806824

  6. A genome-scale shRNA resource for transgenic RNAi in Drosophila.

    PubMed

    Ni, Jian-Quan; Zhou, Rui; Czech, Benjamin; Liu, Lu-Ping; Holderbaum, Laura; Yang-Zhou, Donghui; Shim, Hye-Seok; Tao, Rong; Handler, Dominik; Karpowicz, Phillip; Binari, Richard; Booker, Matthew; Brennecke, Julius; Perkins, Lizabeth A; Hannon, Gregory J; Perrimon, Norbert

    2011-05-01

    Existing transgenic RNAi resources in Drosophila melanogaster based on long double-stranded hairpin RNAs are powerful tools for functional studies, but they are ineffective in gene knockdown during oogenesis, an important model system for the study of many biological questions. We show that shRNAs, modeled on an endogenous microRNA, are extremely effective at silencing gene expression during oogenesis. We also describe our progress toward building a genome-wide shRNA resource. PMID:21460824

  7. The National Microbial Pathogen Database Resource (NMPDR): a genomics platform based on subsystem annotation.

    PubMed

    McNeil, Leslie Klis; Reich, Claudia; Aziz, Ramy K; Bartels, Daniela; Cohoon, Matthew; Disz, Terry; Edwards, Robert A; Gerdes, Svetlana; Hwang, Kaitlyn; Kubal, Michael; Margaryan, Gohar Rem; Meyer, Folker; Mihalo, William; Olsen, Gary J; Olson, Robert; Osterman, Andrei; Paarmann, Daniel; Paczian, Tobias; Parrello, Bruce; Pusch, Gordon D; Rodionov, Dmitry A; Shi, Xinghua; Vassieva, Olga; Vonstein, Veronika; Zagnitko, Olga; Xia, Fangfang; Zinner, Jenifer; Overbeek, Ross; Stevens, Rick

    2007-01-01

    The National Microbial Pathogen Data Resource (NMPDR) (http://www.nmpdr.org) is a National Institute of Allergy and Infectious Diseases (NIAID)-funded Bioinformatics Resource Center that supports research in selected Category B pathogens. NMPDR contains the complete genomes of approximately 50 strains of pathogenic bacteria that are the focus of our curators, as well as >400 other genomes that provide a broad context for comparative analysis across the three phylogenetic Domains. NMPDR integrates complete, public genomes with expertly curated biological subsystems to provide the most consistent genome annotations. Subsystems are sets of functional roles related by a biologically meaningful organizing principle, which are built over large collections of genomes; they provide researchers with consistent functional assignments in a biologically structured context. Investigators can browse subsystems and reactions to develop accurate reconstructions of the metabolic networks of any sequenced organism. NMPDR provides a comprehensive bioinformatics platform, with tools and viewers for genome analysis. Results of precomputed gene clustering analyses can be retrieved in tabular or graphic format with one-click tools. NMPDR tools include Signature Genes, which finds the set of genes in common or that differentiates two groups of organisms. Essentiality data collated from genome-wide studies have been curated. Drug target identification and high-throughput, in silico, compound screening are in development. PMID:17145713
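The abstract describes the Signature Genes tool only by its result: the genes shared by, or differentiating, two groups of organisms. One plausible reading can be sketched with plain set operations. Everything here is an assumption made for illustration (the function name `signature_genes`, the toy genomes, and the strict rule "present in every genome of one group and absent from every genome of the other"); it is not the NMPDR implementation.

```python
def signature_genes(group_a, group_b):
    """Compare two groups of genomes given as {organism: set of gene-family ids}.

    Hypothetical sketch: a gene counts as a signature of a group when it
    occurs in every genome of that group and in no genome of the other.
    """
    core_a = set.intersection(*group_a.values())  # in every genome of A
    core_b = set.intersection(*group_b.values())  # in every genome of B
    any_a = set.union(*group_a.values())          # in at least one genome of A
    any_b = set.union(*group_b.values())          # in at least one genome of B

    return {
        "common": core_a & core_b,   # shared by all genomes in both groups
        "only_a": core_a - any_b,    # differentiates A from B
        "only_b": core_b - any_a,    # differentiates B from A
    }


# Toy data with made-up gene-family identifiers.
pathogens = {"strain1": {"g1", "g2", "g3"}, "strain2": {"g1", "g2", "g4"}}
commensals = {"strain3": {"g1", "g5"}, "strain4": {"g1", "g4", "g5"}}

result = signature_genes(pathogens, commensals)
print(result)
```

With the toy data, `g1` is common to all four genomes, `g2` is found in every pathogen and no commensal, and `g5` is the reverse. A production tool would more likely work from precomputed gene clusters and tolerate a few missing calls than require strict presence or absence.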

  8. The National Microbial Pathogen Database Resource (NMPDR): a genomics platform based on subsystem annotation

    PubMed Central

    McNeil, Leslie Klis; Reich, Claudia; Aziz, Ramy K.; Bartels, Daniela; Cohoon, Matthew; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Hwang, Kaitlyn; Kubal, Michael; Margaryan, Gohar Rem; Meyer, Folker; Mihalo, William; Olsen, Gary J.; Olson, Robert; Osterman, Andrei; Paarmann, Daniel; Paczian, Tobias; Parrello, Bruce; Pusch, Gordon D.; Rodionov, Dmitry A.; Shi, Xinghua; Vassieva, Olga; Vonstein, Veronika; Zagnitko, Olga; Xia, Fangfang; Zinner, Jenifer; Overbeek, Ross; Stevens, Rick

    2007-01-01

    The National Microbial Pathogen Data Resource (NMPDR) (http://www.nmpdr.org) is a National Institute of Allergy and Infectious Diseases (NIAID)-funded Bioinformatics Resource Center that supports research in selected Category B pathogens. NMPDR contains the complete genomes of ∼50 strains of pathogenic bacteria that are the focus of our curators, as well as >400 other genomes that provide a broad context for comparative analysis across the three phylogenetic Domains. NMPDR integrates complete, public genomes with expertly curated biological subsystems to provide the most consistent genome annotations. Subsystems are sets of functional roles related by a biologically meaningful organizing principle, which are built over large collections of genomes; they provide researchers with consistent functional assignments in a biologically structured context. Investigators can browse subsystems and reactions to develop accurate reconstructions of the metabolic networks of any sequenced organism. NMPDR provides a comprehensive bioinformatics platform, with tools and viewers for genome analysis. Results of precomputed gene clustering analyses can be retrieved in tabular or graphic format with one-click tools. NMPDR tools include Signature Genes, which finds the set of genes in common or that differentiates two groups of organisms. Essentiality data collated from genome-wide studies have been curated. Drug target identification and high-throughput, in silico, compound screening are in development. PMID:17145713

  10. Functional genomics of tomato: opportunities and challenges in post-genome NGS era.

    PubMed

    Kumar, Rahul; Khurana, Ashima

    2014-12-01

    The Tomato Genome Sequencing Project represented a landmark venture in the history of sequencing projects in which both Sanger and next-generation sequencing (NGS) technologies were employed, and a highly accurate and one of the best assembled plant genomes along with a draft of the wild relative, Solanum pimpinellifolium, were released in 2012. However, the functional potential of the major portion of this newly generated resource is still undefined. The very first challenge before scientists working on tomato functional biology is to exploit this high-quality reference sequence for tapping the wealth of genetic variants for improving agronomic traits in cultivated tomatoes. The sequence data generated recently by 150 Tomato Genome Consortium would further uncover the natural alleles present in different tomato genotypes. Therefore, we found it relevant to have a fresh outlook on tomato functional genomics in the context of application of NGS technologies in its post-genome sequencing phase. Herein, we provide an overview of how NGS technologies vis-a-vis the available reference sequence have assisted each other for their mutual improvement and how their combined use could further facilitate the development of other 'omics' tools required to propel Solanaceae research. Additionally, we highlight the challenges associated with the application of these cutting-edge technologies. PMID:25431420

  11. Genome resource for the Indonesian coelacanth, Latimeria menadoensis.

    PubMed

    Danke, Joshua; Miyake, Tsutomu; Powers, Thomas; Schein, Jacqueline; Shin, Heesun; Bosdet, Ian; Erdmann, Mark; Caldwell, Roy; Amemiya, Chris T

    2004-03-01

    We have generated a BAC library from the Indonesian coelacanth, Latimeria menadoensis. This library was generated using genomic DNA of nuclei isolated from heart tissue, and has an average insert size of 171 kb. There are a total of 288 384-well microtiter dishes in the library (110,592 clones) and its genomic representation is estimated to encompass ≥7X coverage based on the amount of DNA presumably cloned in the library as well as via hybridization with probes to a small set of single copy genes. This genomic resource has been made available to the public and should prove useful to the scientific community for many applications, including comparative genomics, molecular evolution and conservation genetics.
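
    The library figures quoted above can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes every well holds one clone carrying an insert of the average size, and it back-calculates the genome size that 7X coverage would correspond to under that assumption. The abstract itself does not state a genome size, and the function names are hypothetical.

```python
# Worked check of the BAC library figures quoted in the abstract.
PLATES = 288            # 384-well microtiter dishes (from the abstract)
WELLS_PER_PLATE = 384
AVG_INSERT_KB = 171     # average insert size in kb (from the abstract)
MIN_COVERAGE = 7        # ">= 7X" coverage (from the abstract)


def total_clones(plates, wells_per_plate):
    """Number of clones, assuming one clone per well."""
    return plates * wells_per_plate


def cloned_dna_gb(clones, insert_kb):
    """Total cloned DNA in Gb (1 Gb = 1e6 kb), assuming average-size inserts."""
    return clones * insert_kb / 1e6


def implied_genome_size_gb(cloned_gb, coverage):
    """Genome size that the given coverage would imply (an assumption-laden estimate)."""
    return cloned_gb / coverage


clones = total_clones(PLATES, WELLS_PER_PLATE)
cloned = cloned_dna_gb(clones, AVG_INSERT_KB)
genome = implied_genome_size_gb(cloned, MIN_COVERAGE)

print(clones)             # 110592, matching the clone count in the abstract
print(round(cloned, 2))   # 18.91 (Gb of cloned DNA under the stated assumptions)
print(round(genome, 2))   # 2.7 (Gb)
```

    Because the abstract reports coverage as at least 7X, the roughly 2.7 Gb figure is best read as an upper bound implied by this simplified calculation, not as a measured genome size.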

  12. StaphyloBase: a specialized genomic resource for the staphylococcal research community.

    PubMed

    Heydari, Hamed; Mutha, Naresh V R; Mahmud, Mahafizul Imran; Siow, Cheuk Chuen; Wee, Wei Yee; Wong, Guat Jah; Yazdi, Amir Hessam; Ang, Mia Yang; Choo, Siew Woh

    2014-01-01

    With the advent of high-throughput sequencing technologies, many staphylococcal genomes have been sequenced. Comparative analysis of these strains will provide better understanding of their biology, phylogeny, virulence and taxonomy, which may contribute to better management of diseases caused by staphylococcal pathogens. We developed StaphyloBase with the goal of having a one-stop genomic resource platform for the scientific community to access, retrieve, download, browse, search, visualize and analyse the staphylococcal genomic data and annotations. We anticipate this resource platform will facilitate the analysis of staphylococcal genomic data, particularly in comparative analyses. StaphyloBase currently has a collection of 754 032 protein-coding sequences (CDSs), 19 258 rRNAs and 15 965 tRNAs from 292 genomes of different staphylococcal species. Information about these features is also included, such as putative functions, subcellular localizations and gene/protein sequences. Our web implementation supports diverse query types and the exploration of CDS- and RNA-type information in detail using an AJAX-based real-time search system. JBrowse has also been incorporated to allow rapid and seamless browsing of staphylococcal genomes. The Pairwise Genome Comparison tool is designed for comparative genomic analysis, for example, to reveal the relationships between two user-defined staphylococcal genomes. A newly designed Pathogenomics Profiling Tool (PathoProT) is also included in this platform to facilitate comparative pathogenomics analysis of staphylococcal strains. In conclusion, StaphyloBase offers access to a range of staphylococcal genomic resources as well as analysis tools for comparative analyses. Database URL: http://staphylococcus.um.edu.my/.

  13. Quantitative prediction of genome-wide resource allocation in bacteria.

    PubMed

    Goelzer, Anne; Muntel, Jan; Chubukov, Victor; Jules, Matthieu; Prestel, Eric; Nölker, Rolf; Mariadassou, Mahendra; Aymerich, Stéphane; Hecker, Michael; Noirot, Philippe; Becher, Dörte; Fromion, Vincent

    2015-11-01

    Predicting resource allocation between cell processes is the primary step towards decoding the evolutionary constraints governing bacterial growth under various conditions. Quantitative prediction at genome-scale remains a computational challenge as current methods are limited by the tractability of the problem or by simplifying hypotheses. Here, we show that the constraint-based modeling method Resource Balance Analysis (RBA), calibrated using genome-wide absolute protein quantification data, accurately predicts resource allocation in the model bacterium Bacillus subtilis for a wide range of growth conditions. The regulation of most cellular processes is consistent with the objective of growth rate maximization except for a few suboptimal processes which likely integrate more complex objectives such as coping with stressful conditions and survival. As a proof of principle by using simulations, we illustrated how calibrated RBA could aid rational design of strains for maximizing protein production, offering new opportunities to investigate design principles in prokaryotes and to exploit them for biotechnological applications.

  14. An evolutionary classification of genomic function.

    PubMed

    Graur, Dan; Zheng, Yichen; Azevedo, Ricardo B R

    2015-03-01

    The pronouncements of the ENCODE Project Consortium regarding "junk DNA" exposed the need for an evolutionary classification of genomic elements according to their selected-effect function. In the classification scheme presented here, we divide the genome into "functional DNA," that is, DNA sequences that have a selected-effect function, and "rubbish DNA," that is, sequences that do not. Functional DNA is further subdivided into "literal DNA" and "indifferent DNA." In literal DNA, the order of nucleotides is under selection; in indifferent DNA, only the presence or absence of the sequence is under selection. Rubbish DNA is further subdivided into "junk DNA" and "garbage DNA." Junk DNA neither contributes to nor detracts from the fitness of the organism and, hence, evolves under selective neutrality. Garbage DNA, on the other hand, decreases the fitness of its carriers. Garbage DNA exists in the genome only because natural selection is neither omnipotent nor instantaneous. Each of these four functional categories can be 1) transcribed and translated, 2) transcribed but not translated, or 3) not transcribed. The affiliation of a DNA segment to a particular functional category may change during evolution: Functional DNA may become junk DNA, junk DNA may become garbage DNA, rubbish DNA may become functional DNA, and so on; however, determining the functionality or nonfunctionality of a genomic sequence must be based on its present status rather than on its potential to change (or not to change) in the future. Changes in functional affiliation are divided into pseudogenes, Lazarus DNA, zombie DNA, and Jekyll-to-Hyde DNA. PMID:25635041
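
    The four-way classification described above can be read as a small decision procedure. The following is a minimal sketch of that reading, not code from the paper: the enum, the function names and the boolean inputs are hypothetical, and deciding in practice whether a sequence has a selected-effect function is the hard part that this sketch takes as given.

```python
from enum import Enum


class Category(Enum):
    LITERAL = "literal DNA"          # functional; nucleotide order under selection
    INDIFFERENT = "indifferent DNA"  # functional; only presence/absence under selection
    JUNK = "junk DNA"                # rubbish; selectively neutral
    GARBAGE = "garbage DNA"          # rubbish; decreases fitness


def classify(has_selected_effect, order_under_selection=False, decreases_fitness=False):
    """Map the abstract's criteria onto one of the four categories."""
    if has_selected_effect:
        # Functional DNA: split by what is under selection.
        return Category.LITERAL if order_under_selection else Category.INDIFFERENT
    # Rubbish DNA: split by effect on the fitness of its carriers.
    return Category.GARBAGE if decreases_fitness else Category.JUNK


def is_functional(category):
    """True for 'functional DNA', False for 'rubbish DNA'."""
    return category in (Category.LITERAL, Category.INDIFFERENT)


print(classify(True, order_under_selection=True).value)   # literal DNA
print(classify(False, decreases_fitness=True).value)      # garbage DNA
```

    The transcription status mentioned in the abstract (transcribed and translated, transcribed but not translated, not transcribed) is independent of these categories and could be carried as a separate field.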

  15. Characterization of the Newly Developed Soybean Cultivar DT2008 in Relation to the Model Variety W82 Reveals a New Genetic Resource for Comparative and Functional Genomics for Improved Drought Tolerance

    PubMed Central

    Ha, Chien Van; Le, Dung Tien; Nishiyama, Rie; Watanabe, Yasuko; Tran, Uyen Thi; Dong, Nguyen Van; Tran, Lam-Son Phan

    2013-01-01

    Soybean (Glycine max) productivity is adversely affected by drought stress worldwide, including in Vietnam. In the last few years, we have made a great effort to develop drought-tolerant soybean cultivars by breeding and/or radiation-induced mutagenesis. One of the newly developed cultivars, DT2008, showed enhanced drought tolerance and stable yield under field conditions. The purpose of this study was to compare the drought-tolerant phenotype of DT2008 with that of Williams 82 (W82) by assessing their water loss and growth rate under dehydration and/or drought stress conditions, as a means to provide genetic resources for further comparative and functional genomics. We found that DT2008 had reduced water loss under both dehydration and drought stresses in comparison with W82. Examination of the root and shoot growth of DT2008 and W82 under both normal and drought conditions indicated that DT2008 maintains better shoot and root growth rates than W82 under both growth conditions. Together, these results suggest that DT2008 has a higher degree of drought tolerance than W82. Our results open the way for further comparison of DT2008 and W82 at the molecular level by advanced omic approaches to identify the mutation(s) involved in the enhanced drought tolerance of DT2008, contributing to our understanding of drought tolerance mechanisms in soybean. The mutation(s) identified are potential candidates for genetic engineering of elite soybean varieties to improve drought tolerance and biomass. PMID:23509774

  16. Open chromatin reveals the functional maize genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Every cellular process mediated through nuclear DNA must contend with chromatin. As results from ENCODE show, open chromatin assays can efficiently integrate across diverse regulatory elements, revealing functional non-coding genome. In this study, we use a MNase hypersensitivity assay to discover o...

  17. CoryneBase: Corynebacterium genomic resources and analysis tools at your fingertips.

    PubMed

    Heydari, Hamed; Siow, Cheuk Chuen; Tan, Mui Fern; Jakubovics, Nick S; Wee, Wei Yee; Mutha, Naresh V R; Wong, Guat Jah; Ang, Mia Yang; Yazdi, Amir Hessam; Choo, Siew Woh

    2014-01-01

    Corynebacteria are used for a wide variety of industrial purposes, but some species are associated with human diseases. With an increasing number of corynebacterial genomes having been sequenced, comparative analysis of these strains may provide a better understanding of their biology, phylogeny, virulence and taxonomy, which may lead to the discovery of beneficial industrial strains or contribute to better management of diseases. To facilitate ongoing research on corynebacteria, a specialized central repository and analysis platform for the corynebacterial research community is needed to host the fast-growing amount of genomic data and facilitate the analysis of these data. Here we present CoryneBase, a genomic database for Corynebacterium with diverse functionality for the analysis of genomes, which aims to provide: (1) annotated genome sequences of Corynebacterium, where 165,918 coding sequences and 4,180 RNAs can be found in 27 species; (2) access to comprehensive Corynebacterium data through the use of advanced web technologies for interactive web interfaces; and (3) advanced bioinformatic analysis tools consisting of standard BLAST for homology search, VFDB BLAST for sequence homology search against the Virulence Factor Database (VFDB), the Pairwise Genome Comparison (PGC) tool for comparative genomic analysis, and a newly designed Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomic analysis. CoryneBase offers access to a range of Corynebacterium genomic resources as well as analysis tools for comparative genomics and pathogenomics. It is publicly available at http://corynebacterium.um.edu.my/.

  18. A public resource facilitating clinical use of genomes

    PubMed Central

    Ball, Madeleine P.; Thakuria, Joseph V.; Zaranek, Alexander Wait; Clegg, Tom; Rosenbaum, Abraham M.; Wu, Xiaodi; Angrist, Misha; Bhak, Jong; Bobe, Jason; Callow, Matthew J.; Cano, Carlos; Chou, Michael F.; Chung, Wendy K.; Douglas, Shawn M.; Estep, Preston W.; Gore, Athurva; Hulick, Peter; Labarga, Alberto; Lee, Je-Hyuk; Lunshof, Jeantine E.; Kim, Byung Chul; Kim, Jong-Il; Li, Zhe; Murray, Michael F.; Nilsen, Geoffrey B.; Peters, Brock A.; Raman, Anugraha M.; Rienhoff, Hugh Y.; Robasky, Kimberly; Wheeler, Matthew T.; Vandewege, Ward; Vorhaus, Daniel B.; Yang, Joyce L.; Yang, Luhan; Aach, John; Ashley, Euan A.; Drmanac, Radoje; Kim, Seong-Jin; Li, Jin Billy; Peshkin, Leonid; Seidman, Christine E.; Seo, Jeong-Sun; Zhang, Kun; Rehm, Heidi L.; Church, George M.

    2012-01-01

    Rapid advances in DNA sequencing promise to enable new diagnostics and individualized therapies. Achieving personalized medicine, however, will require extensive research on highly reidentifiable, integrated datasets of genomic and health information. To assist with this, participants in the Personal Genome Project choose to forgo privacy via our institutional review board-approved “open consent” process. The contribution of public data and samples facilitates both scientific discovery and standardization of methods. We present our findings after enrollment of more than 1,800 participants, including whole-genome sequencing of 10 pilot participant genomes (the PGP-10). We introduce the Genome-Environment-Trait Evidence (GET-Evidence) system. This tool automatically processes genomes and prioritizes both published and novel variants for interpretation. In the process of reviewing the presumed healthy PGP-10 genomes, we find numerous literature references implying serious disease. Although it is sometimes impossible to rule out a late-onset effect, stringent evidence requirements can address the high rate of incidental findings. To that end we develop a peer production system for recording and organizing variant evaluations according to standard evidence guidelines, creating a public forum for reaching consensus on interpretation of clinically relevant variants. Genome analysis becomes a two-step process: using a prioritized list to record variant evaluations, then automatically sorting reviewed variants using these annotations. Genome data, health and trait information, participant samples, and variant interpretations are all shared in the public domain; we invite others to review our results using our participant samples and contribute to our interpretations. We offer our public resource and methods to further personalized medical research. PMID:22797899
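
    The two-step process described above (record evaluations against a prioritized list, then sort the reviewed variants automatically) might be sketched as follows. This is not the GET-Evidence implementation: the `Variant` fields, the numeric priority score, the evaluation labels and their ranking are all hypothetical, chosen only to show the shape of the workflow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    name: str
    priority: float                    # hypothetical prioritization score
    evaluation: Optional[str] = None   # label recorded by a reviewer, if any


# Hypothetical ordering of reviewer labels, most serious first.
RANK = {"pathogenic": 0, "uncertain": 1, "benign": 2}


def review_queue(variants):
    """Step 1: unreviewed variants, highest priority first."""
    pending = [v for v in variants if v.evaluation is None]
    return sorted(pending, key=lambda v: -v.priority)


def sorted_reviewed(variants):
    """Step 2: reviewed variants, ordered by recorded label, then by priority."""
    reviewed = [v for v in variants if v.evaluation is not None]
    return sorted(reviewed, key=lambda v: (RANK[v.evaluation], -v.priority))


variants = [
    Variant("variant_A", 0.2),
    Variant("variant_B", 0.9),
    Variant("variant_C", 0.5, evaluation="benign"),
    Variant("variant_D", 0.4, evaluation="pathogenic"),
]

print([v.name for v in review_queue(variants)])     # ['variant_B', 'variant_A']
print([v.name for v in sorted_reviewed(variants)])  # ['variant_D', 'variant_C']
```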

  19. PlantGDB: a resource for comparative plant genomics

    PubMed Central

    Duvick, Jon; Fu, Ann; Muppirala, Usha; Sabharwal, Mukul; Wilkerson, Matthew D.; Lawrence, Carolyn J.; Lushbough, Carol; Brendel, Volker

    2008-01-01

    PlantGDB (http://www.plantgdb.org/) is a genomics database encompassing sequence data for green plants (Viridiplantae). PlantGDB provides annotated transcript assemblies for >100 plant species, with transcripts mapped to their cognate genomic context where available, integrated with a variety of sequence analysis tools and web services. For 14 plant species with emerging or complete genome sequence, PlantGDB's genome browsers (xGDB) serve as a graphical interface for viewing, evaluating and annotating transcript and protein alignments to chromosome or bacterial artificial chromosome (BAC)-based genome assemblies. Annotation is facilitated by the integrated yrGATE module for community curation of gene models. Novel web services at PlantGDB include Tracembler, an iterative alignment tool that generates contigs from GenBank trace file data and BioExtract Server, a web-based server for executing custom sequence analysis workflows. PlantGDB also hosts a plant genomics research outreach portal (PGROP) that facilitates access to a large number of resources for research and training. PMID:18063570

  20. Non-coding genome functions in diabetes.

    PubMed

    Cebola, Inês; Pasquali, Lorenzo

    2016-01-01

    Most of the genetic variation associated with diabetes, through genome-wide association studies, does not reside in protein-coding regions, making the identification of functional variants and their eventual translation to the clinic challenging. In recent years, high-throughput sequencing-based methods have enabled genome-scale high-resolution epigenomic profiling in a variety of human tissues, allowing the exploration of the human genome outside of the well-studied coding regions. These experiments unmasked tens of thousands of regulatory elements across several cell types, including diabetes-relevant tissues, providing new insights into their mechanisms of gene regulation. Regulatory landscapes are highly dynamic and cell-type specific and, being sensitive to DNA sequence variation, can vary with individual genomes. The scientific community is now in a position to exploit the regulatory maps of tissues central to diabetes etiology, such as pancreatic progenitors and adult islets. This giant leap forward in the understanding of pancreatic gene regulation is revolutionizing our capacity to discriminate between functional and non-functional non-coding variants, opening opportunities to uncover regulatory links between sequence variation and diabetes susceptibility. In this review, we focus on the non-coding regulatory landscape of the pancreatic endocrine cells and provide an overview of the recent developments in this field. PMID:26438568

  1. Functional Genomics of Drought Tolerance in Bioenergy Crops

    SciTech Connect

    Yin, Hengfu; Chen, Rick; Yang, Jun; Weston, David; Chen, Jay; Muchero, Wellington; Ye, Ning; Tschaplinski, Timothy J; Wullschleger, Stan D; Cheng, Zong-Ming; Tuskan, Gerald A; Yang, Xiaohan

    2014-01-01

    With the predicted trends in climate change, drought will increasingly pose a major challenge to biomass production. Most bioenergy crops have some degree of drought susceptibility and low water-use efficiency (WUE). It is imperative to improve drought tolerance and WUE in bioenergy crops for sustainable biomass production in arid and semi-arid regions with minimal water input. Genetics and functional genomics can play a critical role in generating knowledge to inform and aid genetic improvement of drought tolerance in bioenergy crops. The molecular aspect of drought response has been extensively investigated in model plants like Arabidopsis, yet our understanding of the molecular mechanisms underlying drought tolerance in bioenergy crops is limited. Crops exhibit various responses to drought stress depending on species and genotype. A rational strategy for studying drought tolerance in bioenergy crops is to translate the knowledge from model plants and pinpoint the unique features associated with individual species and genotypes. In this review, we summarize the general knowledge about drought responsive pathways in plants, with a focus on the identification of commonality and specialty in drought responsive mechanisms among different species and/or genotypes. We describe the genomic resources developed for bioenergy crops and discuss genetic and epigenetic regulation of drought responses. We also examine comparative and evolutionary genomics to leverage the ever-increasing genomics resources and provide new insights beyond what has been known from studies on individual species. Finally, we outline future exploration of drought tolerance using the emerging new technologies.

  2. Ascribing Functions to Genes: Journey Towards Genetic Improvement of Rice Via Functional Genomics.

    PubMed

    Mustafiz, Ananda; Kumari, Sumita; Karan, Ratna

    2016-06-01

    Rice, one of the most important cereal crops for mankind, feeds more than half the world's population. Rice has been heralded as a model cereal owing to its small genome size, amenability to transformation, high synteny with other cereal crops and the availability of a complete genome sequence. Moreover, the sequence wealth in rice is becoming more refined and precise owing to resequencing efforts. This enormous resource of sequence data has confronted the research community with a herculean challenge, as well as an excellent opportunity, to functionally validate the expressed as well as the regulatory portions of the genome. This will not only help us understand the genetic basis of plant architecture and physiology but will also steer us towards developing improved cultivars. No single technique can achieve such a mammoth task. Functional genomics, through its diverse tools, namely loss- and gain-of-function mutants and multifarious omics strategies such as transcriptomics, proteomics, metabolomics and phenomics, provides us with the necessary handle. A paradigm shift in functional genomics technologies has been instrumental in generating a considerable amount of information on the functionality of the rice genome. We now have several databases and online resources for functionally validated genes, yet we are far from reaching the desired milestone of functionally characterizing each and every rice gene. There is an urgent need for a common platform for the information already available in rice, for concerted collaborative efforts between researchers, and for healthy public-private partnerships, in order to genetically improve the rice crop so that it can better handle the pressures of climate change and an exponentially increasing population. PMID:27252584

  3. Ascribing Functions to Genes: Journey Towards Genetic Improvement of Rice Via Functional Genomics.

    PubMed

    Mustafiz, Ananda; Kumari, Sumita; Karan, Ratna

    2016-06-01

    Rice, one of the most important cereal crops for mankind, feeds more than half the world's population. Rice has been heralded as a model cereal owing to its small genome size, amenability to transformation, high synteny with other cereal crops and the availability of a complete genome sequence. Moreover, the sequence wealth in rice is becoming more refined and precise owing to resequencing efforts. This enormous resource of sequence data has confronted the research community with a herculean challenge, as well as an excellent opportunity, to functionally validate the expressed as well as the regulatory portions of the genome. This will not only help us understand the genetic basis of plant architecture and physiology but will also steer us towards developing improved cultivars. No single technique can achieve such a mammoth task. Functional genomics, through its diverse tools, namely loss- and gain-of-function mutants and multifarious omics strategies such as transcriptomics, proteomics, metabolomics and phenomics, provides us with the necessary handle. A paradigm shift in functional genomics technologies has been instrumental in generating a considerable amount of information on the functionality of the rice genome. We now have several databases and online resources for functionally validated genes, yet we are far from reaching the desired milestone of functionally characterizing each and every rice gene. There is an urgent need for a common platform for the information already available in rice, for concerted collaborative efforts between researchers, and for healthy public-private partnerships, in order to genetically improve the rice crop so that it can better handle the pressures of climate change and an exponentially increasing population.

  4. Impact of transgenic technologies on functional genomics.

    PubMed

    Shashikant, Cooduvalli S; Ruddle, Frank H

    2003-07-01

    Gene transfer technologies in mammals are the focus of renewed interest owing to the recent emphasis on analyzing gene function in the postgenomic era. Three important developments in this area include transgenics, gene targeting and nuclear transfer or animal cloning. These technological innovations have enhanced our ability to analyze gene function at the level of the whole organism and have provided the means to modify gene expression. This review discusses the origins and current status of transgenic technologies. Various applications and technologies including chromosome engineering, stem cells, gene traps and modification of livestock are presented. The impact of mouse technologies and genomics on functional analyses is also discussed.

  5. Effects of retroviruses on host genome function.

    PubMed

    Jern, Patric; Coffin, John M

    2008-01-01

    For millions of years, retroviral infections have challenged vertebrates, occasionally leading to germline integration and inheritance as ERVs, genetic parasites whose remnants today constitute some 7% to 8% of the human genome. Although they have had significant evolutionary side effects, it is useful to view ERVs as fossil representatives of retroviruses extant at the time of their insertion into the germline and not as direct players in the evolutionary process itself. Expression of particular ERVs is associated with several positive physiological functions as well as certain diseases, although their roles in human disease as etiological agents, possible contributing factors, or disease markers (well demonstrated in animal models) remain to be established. Here we discuss ERV contributions to host genome structure and function, including their ability to mediate recombination, and physiological effects on the host transcriptome resulting from their integration, expression, and other events.

  6. A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy.

    PubMed

    Brenton, Zachary W; Cooper, Elizabeth A; Myers, Mathew T; Boyles, Richard E; Shakoor, Nadia; Zielinski, Kelsey J; Rauh, Bradley L; Bridges, William C; Morris, Geoffrey P; Kresovich, Stephen

    2016-09-01

    With high productivity and stress tolerance, numerous grass genera of the Andropogoneae have emerged as candidates for bioenergy production. To optimize these candidates, research examining the genetic architecture of yield, carbon partitioning, and composition is required to advance breeding objectives. Significant progress has been made developing genetic and genomic resources for Andropogoneae, and advances in comparative and computational genomics have enabled research examining the genetic basis of photosynthesis, carbon partitioning, composition, and sink strength. To provide a pivotal resource aimed at developing a comparative understanding of key bioenergy traits in the Andropogoneae, we have established and characterized an association panel of 390 racially, geographically, and phenotypically diverse Sorghum bicolor accessions with 232,303 genetic markers. Sorghum bicolor was selected because of its genomic simplicity, phenotypic diversity, significant genomic tools, and its agricultural productivity and resilience. We have demonstrated the value of sorghum as a functional model for candidate gene discovery for bioenergy Andropogoneae by performing genome-wide association analysis for two contrasting phenotypes representing key components of structural and non-structural carbohydrates. We identified potential genes, including a cellulase enzyme and a vacuolar transporter, associated with increased non-structural carbohydrates that could lead to bioenergy sorghum improvement. Although our analysis identified genes with potentially clear functions, other candidates did not have assigned functions, suggesting novel molecular mechanisms for carbon partitioning traits. These results, combined with our characterization of phenotypic and genetic diversity and the public accessibility of each accession and genomic data, demonstrate the value of this resource and provide a foundation for future improvement of sorghum and related grasses for bioenergy production.

  7. A Genomic Resource for the Development, Improvement, and Exploitation of Sorghum for Bioenergy

    PubMed Central

    Brenton, Zachary W.; Cooper, Elizabeth A.; Myers, Mathew T.; Boyles, Richard E.; Shakoor, Nadia; Zielinski, Kelsey J.; Rauh, Bradley L.; Bridges, William C.; Morris, Geoffrey P.; Kresovich, Stephen

    2016-01-01

    With high productivity and stress tolerance, numerous grass genera of the Andropogoneae have emerged as candidates for bioenergy production. To optimize these candidates, research examining the genetic architecture of yield, carbon partitioning, and composition is required to advance breeding objectives. Significant progress has been made developing genetic and genomic resources for Andropogoneae, and advances in comparative and computational genomics have enabled research examining the genetic basis of photosynthesis, carbon partitioning, composition, and sink strength. To provide a pivotal resource aimed at developing a comparative understanding of key bioenergy traits in the Andropogoneae, we have established and characterized an association panel of 390 racially, geographically, and phenotypically diverse Sorghum bicolor accessions with 232,303 genetic markers. Sorghum bicolor was selected because of its genomic simplicity, phenotypic diversity, significant genomic tools, and its agricultural productivity and resilience. We have demonstrated the value of sorghum as a functional model for candidate gene discovery for bioenergy Andropogoneae by performing genome-wide association analysis for two contrasting phenotypes representing key components of structural and non-structural carbohydrates. We identified potential genes, including a cellulase enzyme and a vacuolar transporter, associated with increased non-structural carbohydrates that could lead to bioenergy sorghum improvement. Although our analysis identified genes with potentially clear functions, other candidates did not have assigned functions, suggesting novel molecular mechanisms for carbon partitioning traits. These results, combined with our characterization of phenotypic and genetic diversity and the public accessibility of each accession and genomic data, demonstrate the value of this resource and provide a foundation for future improvement of sorghum and related grasses for bioenergy production.

  8. Orchidstra: An Integrated Orchid Functional Genomics Database

    PubMed Central

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-01-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species. PMID:23324169

  9. Functional genomics for food fermentation processes.

    PubMed

    Smid, E J; Hugenholtz, J

    2010-01-01

    This review describes recent scientific and technological drivers of food fermentation research. In addition, a number of practical implications of the results of this development will be highlighted. The first part of the manuscript elaborates on the message that genome sequence information gives us an unprecedented view on the biodiversity of microbes in food fermentation. This information can be made applicable for tailoring relevant characteristics of food products through fermentation. The second part deals with the integration of genome sequence data into metabolic models and the use of these models for a number of topics that are relevant for food fermentation processes. The final part will be about metagenomics approaches to reveal the complexity and understand the functionality of undefined complex microbial consortia used in a diverse range of food fermentation processes.

  10. Whole genome sequencing of elite rice cultivars as a comprehensive information resource for marker assisted selection.

    PubMed

    Duitama, Jorge; Silva, Alexander; Sanabria, Yamid; Cruz, Daniel Felipe; Quintero, Constanza; Ballen, Carolina; Lorieux, Mathias; Scheffler, Brian; Farmer, Andrew; Torres, Edgar; Oard, James; Tohme, Joe

    2015-01-01

    Current advances in sequencing technologies and bioinformatics revealed the genomic background of rice, a staple food for the poor people, and provided the basis to develop large genomic variation databases for thousands of cultivars. Proper analysis of this massive resource is expected to give novel insights into the structure, function, and evolution of the rice genome, and to aid the development of rice varieties through marker assisted selection or genomic selection. In this work we present sequencing and bioinformatics analyses of 104 rice varieties belonging to the major subspecies of Oryza sativa. We identified repetitive elements and recurrent copy number variation covering about 200 Mbp of the rice genome. Genotyping of over 18 million polymorphic locations within O. sativa allowed us to reconstruct the individual haplotype patterns shaping the genomic background of elite varieties used by farmers throughout the Americas. Based on a reconstruction of the alleles for the gene GBSSI, we could identify novel genetic markers for selection of varieties with high amylose content. We expect that both the analysis methods and the genomic information described here would be of great use for the rice research community and for other groups carrying on similar sequencing efforts in other crops. PMID:25923345

  11. Whole Genome Sequencing of Elite Rice Cultivars as a Comprehensive Information Resource for Marker Assisted Selection

    PubMed Central

    Duitama, Jorge; Silva, Alexander; Sanabria, Yamid; Cruz, Daniel Felipe; Quintero, Constanza; Ballen, Carolina; Lorieux, Mathias; Scheffler, Brian; Farmer, Andrew; Torres, Edgar; Oard, James; Tohme, Joe

    2015-01-01

    Current advances in sequencing technologies and bioinformatics revealed the genomic background of rice, a staple food for the poor people, and provided the basis to develop large genomic variation databases for thousands of cultivars. Proper analysis of this massive resource is expected to give novel insights into the structure, function, and evolution of the rice genome, and to aid the development of rice varieties through marker assisted selection or genomic selection. In this work we present sequencing and bioinformatics analyses of 104 rice varieties belonging to the major subspecies of Oryza sativa. We identified repetitive elements and recurrent copy number variation covering about 200 Mbp of the rice genome. Genotyping of over 18 million polymorphic locations within O. sativa allowed us to reconstruct the individual haplotype patterns shaping the genomic background of elite varieties used by farmers throughout the Americas. Based on a reconstruction of the alleles for the gene GBSSI, we could identify novel genetic markers for selection of varieties with high amylose content. We expect that both the analysis methods and the genomic information described here would be of great use for the rice research community and for other groups carrying on similar sequencing efforts in other crops. PMID:25923345

  12. Whole genome sequencing of elite rice cultivars as a comprehensive information resource for marker assisted selection.

    PubMed

    Duitama, Jorge; Silva, Alexander; Sanabria, Yamid; Cruz, Daniel Felipe; Quintero, Constanza; Ballen, Carolina; Lorieux, Mathias; Scheffler, Brian; Farmer, Andrew; Torres, Edgar; Oard, James; Tohme, Joe

    2015-01-01

    Current advances in sequencing technologies and bioinformatics revealed the genomic background of rice, a staple food for the poor people, and provided the basis to develop large genomic variation databases for thousands of cultivars. Proper analysis of this massive resource is expected to give novel insights into the structure, function, and evolution of the rice genome, and to aid the development of rice varieties through marker assisted selection or genomic selection. In this work we present sequencing and bioinformatics analyses of 104 rice varieties belonging to the major subspecies of Oryza sativa. We identified repetitive elements and recurrent copy number variation covering about 200 Mbp of the rice genome. Genotyping of over 18 million polymorphic locations within O. sativa allowed us to reconstruct the individual haplotype patterns shaping the genomic background of elite varieties used by farmers throughout the Americas. Based on a reconstruction of the alleles for the gene GBSSI, we could identify novel genetic markers for selection of varieties with high amylose content. We expect that both the analysis methods and the genomic information described here would be of great use for the rice research community and for other groups carrying on similar sequencing efforts in other crops.

  13. InsectBase: a resource for insect genomes and transcriptomes

    PubMed Central

    Yin, Chuanlin; Shen, Gengyu; Guo, Dianhao; Wang, Shuping; Ma, Xingzhou; Xiao, Huamei; Liu, Jinding; Zhang, Zan; Liu, Ying; Zhang, Yiqun; Yu, Kaixiang; Huang, Shuiqing; Li, Fei

    2016-01-01

    The genomes and transcriptomes of hundreds of insects have been sequenced. However, the insect community lacks an integrated, up-to-date collection of insect gene data. Here, we introduce the first release of InsectBase, available online at http://www.insect-genome.com. The database encompasses 138 insect genomes, 116 insect transcriptomes, 61 insect gene sets, 36 gene families of 60 insects, 7544 miRNAs of 69 insects, 96 925 piRNAs of Drosophila melanogaster and Chilo suppressalis, 2439 lncRNAs of Nilaparvata lugens, 22 536 pathways of 78 insects, 678 881 untranslated regions (UTR) of 84 insects and 160 905 coding sequences (CDS) of 70 insects. This release contains over 12 million sequences and provides search functionality, a BLAST server, GBrowse, insect pathway construction, a Facebook-like network for the insect community (iFacebook), and phylogenetic analysis of selected genes. PMID:26578584

  14. CyanoClust: comparative genome resources of cyanobacteria and plastids.

    PubMed

    Sasaki, Naobumi V; Sato, Naoki

    2010-01-01

    Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases, such as Clusters of Orthologous Proteins (COG). CyanoClust is a database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. We have developed a web-server system for the protein homology database featuring cyanobacteria and plastids. Database URL: http://cyanoclust.c.u-tokyo.ac.jp/.

  15. Gene3D: comprehensive structural and functional annotation of genomes.

    PubMed

    Yeats, Corin; Lees, Jonathan; Reid, Adam; Kellam, Paul; Martin, Nigel; Liu, Xinhui; Orengo, Christine

    2008-01-01

    Gene3D provides comprehensive structural and functional annotation of most available protein sequences, including the UniProt, RefSeq and Integr8 resources. The main structural annotation is generated through scanning these sequences against the CATH structural domain database profile-HMM library. CATH is a database of manually derived PDB-based structural domains, placed within a hierarchy reflecting topology, homology and conservation and is able to infer more ancient and divergent homology relationships than sequence-based approaches. This data is supplemented with Pfam-A, other non-domain structural predictions (i.e. coiled coils) and experimental data from UniProt. In order to enhance the investigations possible with this data, we have also incorporated a variety of protein annotation resources, including protein-protein interaction data, GO functional assignments, KEGG pathways, FUNCAT functional descriptions and links to microarray expression data. All of this data can be accessed through a newly re-designed website that has a focus on flexibility and clarity, with searches that can be restricted to a single genome or across the entire sequence database. Currently Gene3D contains over 3.5 million domain assignments for nearly 5 million proteins including 527 completed genomes. This is available at: http://gene3d.biochem.ucl.ac.uk/ PMID:18032434

  16. SGR: an online genomic resource for the woodland strawberry

    PubMed Central

    2013-01-01

    Background: Fragaria vesca, a diploid strawberry species commonly known as the alpine or woodland strawberry, is a versatile experimental plant system and an emerging model for the Rosaceae family. An ancestral F. vesca genome contributed to the genome of the octoploid dessert strawberry (F. ×ananassa), and the extant genome exhibits synteny with other commercially important members of the Rosaceae family such as apple and peach. To provide a molecular description of floral organ and fruit development at the resolution of specific tissues and cell types, RNAs from flowers and early developmental stage fruit tissues of the inbred F. vesca line YW5AF7 were extracted and the resulting cDNA libraries sequenced using an Illumina HiSeq2000. To enable easy access as well as mining of this two-dimensional (stage and tissue) transcriptome dataset, a web-based database, the Strawberry Genomic Resource (SGR), was developed. Description: SGR is a web-accessible database that contains sample descriptions, sample statistics, gene annotation, and gene expression analysis. This information can be accessed publicly from a web-based interface at http://bioinformatics.towson.edu/strawberry/Default.aspx. The SGR website provides user-friendly search and browse capabilities for all the data stored in the database. Users are able to search for genes using a gene ID or description or obtain differentially expressed genes by entering different comparison parameters. Search results can be downloaded in a tabular format compatible with Microsoft Excel. Reads aligned to individual genes and exon/intron structures are displayed using the genome browser, facilitating gene re-annotation by individual users. Conclusions: The SGR database was developed to facilitate dissemination and data mining of extensive floral and fruit transcriptome data in the woodland strawberry. It enables users to mine the data in different ways to study different pathways or biological processes during

  17. Genomic Resources Notes accepted 1 February 2015 - 31 March 2015.

    PubMed

    Arthofer, Wolfgang; Bertini, Laura; Caruso, Carla; Cicconardi, Francesco; Delph, Lynda F; Fields, Peter D; Ikeda, Minoru; Minegishi, Yuki; Proietti, Silvia; Ritthammer, Heike; Schlick-Steiner, Birgit C; Steiner, Florian M; Wachter, Gregor A; Wagner, Herbert C; Weingartner, Laura A

    2015-07-01

    This article documents the public availability of (i) raw transcriptome sequence data, assembled contigs and BLAST hits of the Antarctic plant Colobanthus quitensis grown in two different climatic conditions, (ii) the draft genome sequence data (raw reads, assembled contigs and unassembled reads) and RAD-tag read data of the marbled flounder Pseudopleuronectes yokohamae, (iii) transcriptome resources from four white campion (Silene latifolia) individuals from two morphologically divergent populations and (iv) nuclear DNA markers from 454 sequencing of reduced representation libraries (RRL) based on amplified fragment length polymorphism (AFLP) PCR products of four species of ants in the genus Tetramorium. PMID:26095006

  18. Learning about the Human Genome. Part 2: Resources for Science Educators. ERIC Digest.

    ERIC Educational Resources Information Center

    Haury, David L.

    This ERIC Digest identifies how the human genome project fits into the "National Science Education Standards" and lists Human Genome Project Web sites found on the World Wide Web. It is a resource companion to "Learning about the Human Genome. Part 1: Challenge to Science Educators" (Haury 2001). The Web resources and instructional materials can…

  19. Valuation of Ecological Resources and Functions

    PubMed

    Scott; Bilyard; Link; Ulibarri; Westerdahl; Ricci; Seely

    1998-01-01

    Ecological resources are natural resources that provide certain necessary but overlooked system maintenance functions within ecosystems. Environmental economics is in search of an appropriate analysis framework to determine economic values of such resources. This paper presents a framework that estimates and compiles the components of value for a natural ecosystem. The framework begins with the ecological processes involved, which provide functions within the ecosystem and services valued by humans. We discuss the additive or competitive nature of these values, and estimate these values through conventional and unconventional techniques. We apply the framework to ecological resources in a shrub-steppe dryland habitat being displaced by development. We first determine which functions and services are mutually exclusive (e.g., farming vs soil stabilization) and which are complementary or products of joint production (e.g., soil stabilization and maintenance of species). We then apply benefit transfer principles with contingent valuation methodology (CVM), travel cost methodology (TCM), and hedonic damage pricing (HDP). Finally, we derive upper-limit values for more difficult-to-value functions through the use of human analogs, which we argue are the most appropriate method of valuation under some circumstances. The highest values of natural shrub-steppe habitat appear to be derived from soil stabilization. KEY WORDS: Natural resource economics; Ecological economics; Ecological resources; Shrub-steppe; Environmental valuation; Cost; Benefit; Value PMID:9419284

  20. More genomic resources for less-studied crops.

    PubMed

    Varshney, Rajeev K; Glaszmann, Jean-Christophe; Leung, Hei; Ribaut, Jean-Marcel

    2010-09-01

    Many crop species that are considered minor on a global scale, yet are important locally for food security in the developing world, have remained less studied. Recent years have witnessed the development of large-scale genomic and genetic resources, including simple sequence repeat, single nucleotide polymorphism and diversity array technology markers, expressed sequence tags or transcript reads, bacterial artificial chromosome libraries, genetic and physical maps, and genetic stocks with rich genetic diversity, such as core reference sets and introgression lines in these crops. These resources have the potential to accelerate gene discovery and initiate molecular breeding in these crops, thereby enhancing crop productivity to ensure food security in developing countries. PMID:20692061

  1. Development of peanut expressed sequence tag-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  2. Development of peanut EST (expressed sequence tag)-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  3. Retroelements and their impact on genome evolution and functioning.

    PubMed

    Gogvadze, Elena; Buzdin, Anton

    2009-12-01

    Retroelements comprise a considerable fraction of eukaryotic genomes. Since their initial discovery by Barbara McClintock in maize DNA, retroelements have been found in genomes of almost all organisms. First considered "junk DNA" or genomic parasites, they were shown to influence genome functioning and to promote genetic innovations. For this reason, they were suggested as an important creative force in genome evolution and in the adaptation of an organism to altered environmental conditions. In this review, we summarize the up-to-date knowledge of different ways of retroelement involvement in structural and functional evolution of genes and genomes, as well as the mechanisms generated by cells to control their retrotransposition.

  4. Selfish drive can trump function when animal mitochondrial genomes compete.

    PubMed

    Ma, Hansong; O'Farrell, Patrick H

    2016-07-01

    Mitochondrial genomes compete for transmission from mother to progeny. We explored this competition by introducing a second genome into Drosophila melanogaster to follow transmission. Competitions between closely related genomes favored those functional in electron transport, resulting in a host-beneficial purifying selection. In contrast, matchups between distantly related genomes often favored those with negligible, negative or lethal consequences, indicating selfish selection. Exhibiting powerful selfish selection, a genome carrying a detrimental mutation displaced a complementing genome, leading to population death after several generations. In a different pairing, opposing selfish and purifying selection counterbalanced to give stable transmission of two genomes. Sequencing of recombinant mitochondrial genomes showed that the noncoding region, containing origins of replication, governs selfish transmission. Uniparental inheritance prevents encounters between distantly related genomes. Nonetheless, in each maternal lineage, constant competition among sibling genomes selects for super-replicators. We suggest that this relentless competition drives positive selection, promoting change in the sequences influencing transmission. PMID:27270106

  5. Selfish drive can trump function when animal mitochondrial genomes compete.

    PubMed

    Ma, Hansong; O'Farrell, Patrick H

    2016-07-01

    Mitochondrial genomes compete for transmission from mother to progeny. We explored this competition by introducing a second genome into Drosophila melanogaster to follow transmission. Competitions between closely related genomes favored those functional in electron transport, resulting in a host-beneficial purifying selection. In contrast, matchups between distantly related genomes often favored those with negligible, negative or lethal consequences, indicating selfish selection. Exhibiting powerful selfish selection, a genome carrying a detrimental mutation displaced a complementing genome, leading to population death after several generations. In a different pairing, opposing selfish and purifying selection counterbalanced to give stable transmission of two genomes. Sequencing of recombinant mitochondrial genomes showed that the noncoding region, containing origins of replication, governs selfish transmission. Uniparental inheritance prevents encounters between distantly related genomes. Nonetheless, in each maternal lineage, constant competition among sibling genomes selects for super-replicators. We suggest that this relentless competition drives positive selection, promoting change in the sequences influencing transmission.

  6. Comparative Plant Genomics Resources at PlantGDB

    PubMed Central

    Dong, Qunfeng; Lawrence, Carolyn J.; Schlueter, Shannon D.; Wilkerson, Matthew D.; Kurtz, Stefan; Lushbough, Carol; Brendel, Volker

    2005-01-01

    PlantGDB (http://www.plantgdb.org/) is a database of plant molecular sequences. Expressed sequence tag (EST) sequences are assembled into contigs that represent tentative unique genes. EST contigs are functionally annotated with information derived from known protein sequences that are highly similar to the putative translation products. Tentative Gene Ontology terms are assigned to match those of the similar sequences identified. Genome survey sequences are assembled similarly. The resulting genome survey sequence contigs are matched to ESTs and conserved protein homologs to identify putative full-length open reading frame-containing genes, which are subsequently provisionally classified according to established gene family designations. For Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa), the exon-intron boundaries for gene structures are annotated by spliced alignment of ESTs and full-length cDNAs to their respective complete genome sequences. Unique genome browsers have been developed to present all available EST and cDNA evidence for current transcript models (for Arabidopsis, see the AtGDB site at http://www.plantgdb.org/AtGDB/; for rice, see the OsGDB site at http://www.plantgdb.org/OsGDB/). In addition, a number of bioinformatic tools have been integrated at PlantGDB that enable researchers to carry out sequence analyses on-site using both their own data and data residing within the database. PMID:16219921

  7. Utilizing Genomics to Study Entomopathogenicity in the Fungal Phylum Entomophthoromycota: A Review of Current Genetic Resources.

    PubMed

    De Fine Licht, H H; Hajek, A E; Eilenberg, J; Jensen, A B

    2016-01-01

    The order Entomophthorales, which formerly contained c.280 species, has recently been recognized as a separate phylum, Entomophthoromycota, consisting of three recognized classes and six families. Many genera in this group contain obligate insect-pathogenic species with narrow host ranges, capable of producing epizootics in natural insect populations. Available sequence information from the phylum Entomophthoromycota can be classified into three main categories: first, partial gene regions (exons+introns) used for phylogenetic inference; second, protein coding gene regions obtained using degenerate primers, expressed sequence tag methodology or de novo transcriptome sequencing with molecular function inferred by homology analysis; and third, primarily forthcoming whole-genome sequencing data sets. Here we summarize the current genetic resources for Entomophthoromycota and identify research areas that are likely to be significantly advanced from the availability of new whole-genome resources.

  8. The function of genomes in bioenergetic organelles.

    PubMed Central

    Allen, John F

    2003-01-01

    Mitochondria and chloroplasts are energy-transducing organelles of the cytoplasm of eukaryotic cells. They originated as bacterial symbionts whose host cells acquired respiration from the precursor of the mitochondrion, and oxygenic photosynthesis from the precursor of the chloroplast. The host cells also acquired genetic information from their symbionts, eventually incorporating much of it into their own genomes. Genes of the eukaryotic cell nucleus now encode most mitochondrial and chloroplast proteins. Genes are copied and moved between cellular compartments with relative ease, and there is no obvious obstacle to successful import of any protein precursor from the cytosol. So why are any genes at all retained in cytoplasmic organelles? One proposal is that these small but functional genomes provide a location for genes that is close to, and in the same compartment as, their gene products. This co-location facilitates rapid and direct regulatory coupling. Redox control of synthesis de novo is put forward as the common property of those proteins that must be encoded and synthesized within mitochondria and chloroplasts. This testable hypothesis is termed CORR, for co-location for redox regulation. Principles, predictions and consequences of CORR are examined in the context of competing hypotheses and current evidence. PMID:12594916

  9. Valuation of ecological resources and functions

    SciTech Connect

    Scott, M.J.; Bilyard, G.R.; Link, S.O.

    1998-01-01

    Ecological resources are natural resources that provide certain necessary but overlooked system maintenance functions within ecosystems. Environmental economics is in search of an appropriate analysis framework to determine economic values of such resources. This paper presents a framework that estimates and compiles the components of value for a natural ecosystem. The framework begins with the ecological processes involved, which provide functions within the ecosystem and services valued by humans. We discuss the additive or competitive nature of these values, and estimate these values through conventional and unconventional techniques. We apply the framework to ecological resources in a shrub-steppe dryland habitat being displaced by development. We first determine which functions and services are mutually exclusive (e.g., farming vs soil stabilization) and which are complementary or products of joint production (e.g., soil stabilization and maintenance of species). We then apply benefit transfer principles with contingent valuation methodology (CVM), travel cost methodology (TCM), and hedonic damage pricing (HDP). Finally, we derive upper-limit values for more difficult-to-value functions through the use of human analogs, which we argue are the most appropriate method of valuation under some circumstances. The highest values of natural shrub-steppe habitat appear to be derived from soil stabilization. 59 refs., 6 figs., 5 tabs.

  10. RSIADB, a collective resource for genome and transcriptome analyses in Rhizoctonia solani AG1 IA.

    PubMed

    Chen, Lei; Ai, Peng; Zhang, Jinfeng; Deng, Qiming; Wang, Shiquan; Li, Shuangcheng; Zhu, Jun; Li, Ping; Zheng, Aiping

    2016-01-01

    Rice [Oryza sativa (L.)] feeds more than half of the world's population. Rhizoctonia solani is a major fungal pathogen of rice causing extreme crop losses in all rice-growing regions of the world. R. solani AG1 IA is a major cause of sheath blight in rice. In this study, we constructed a comprehensive and user-friendly web-based database, RSIADB, to analyse its draft genome and transcriptome. The database was built using the genome sequence (10,489 genes) and annotation information for R. solani AG1 IA. A total of six RNAseq samples of R. solani AG1 IA were also analysed, corresponding to 10, 18, 24, 32, 48 and 72 h after infection of rice leaves. The RSIADB database enables users to search, browse, and download gene sequences for R. solani AG1 IA, and mine the data using BLAST, Sequence Extractor, Browse and Construction Diagram tools that were integrated into the database. RSIADB is an important genomic resource for scientists working with R. solani AG1 IA and will assist researchers in analysing the annotated genome and transcriptome of this pathogen. This resource will facilitate studies on gene function, pathogenesis factors and secreted proteins, as well as provide an avenue for comparative analyses of genes expressed during different stages of infection. Database URL: http://genedenovoweb.ticp.net:81/rsia/index.php.

  11. RSIADB, a collective resource for genome and transcriptome analyses in Rhizoctonia solani AG1 IA

    PubMed Central

    Ai, Peng; Zhang, Jinfeng; Deng, Qiming; Wang, Shiquan; Li, Shuangcheng; Zhu, Jun; Li, Ping; Zheng, Aiping

    2016-01-01

    Rice [Oryza sativa (L.)] feeds more than half of the world’s population. Rhizoctonia solani is a major fungal pathogen of rice causing extreme crop losses in all rice-growing regions of the world. R. solani AG1 IA is a major cause of sheath blight in rice. In this study, we constructed a comprehensive and user-friendly web-based database, RSIADB, to analyse its draft genome and transcriptome. The database was built using the genome sequence (10 489 genes) and annotation information for R. solani AG1 IA. A total of six RNAseq samples of R. solani AG1 IA were also analysed, corresponding to 10, 18, 24, 32, 48 and 72 h after infection of rice leaves. The RSIADB database enables users to search, browse, and download gene sequences for R. solani AG1 IA, and mine the data using BLAST, Sequence Extractor, Browse and Construction Diagram tools that were integrated into the database. RSIADB is an important genomic resource for scientists working with R. solani AG1 IA and will assist researchers in analysing the annotated genome and transcriptome of this pathogen. This resource will facilitate studies on gene function, pathogenesis factors and secreted proteins, as well as provide an avenue for comparative analyses of genes expressed during different stages of infection. Database URL: http://genedenovoweb.ticp.net:81/rsia/index.php PMID:27022158

  12. RSIADB, a collective resource for genome and transcriptome analyses in Rhizoctonia solani AG1 IA.

    PubMed

    Chen, Lei; Ai, Peng; Zhang, Jinfeng; Deng, Qiming; Wang, Shiquan; Li, Shuangcheng; Zhu, Jun; Li, Ping; Zheng, Aiping

    2016-01-01

    Rice [Oryza sativa (L.)] feeds more than half of the world's population. Rhizoctonia solani is a major fungal pathogen of rice causing extreme crop losses in all rice-growing regions of the world. R. solani AG1 IA is a major cause of sheath blight in rice. In this study, we constructed a comprehensive and user-friendly web-based database, RSIADB, to analyse its draft genome and transcriptome. The database was built using the genome sequence (10,489 genes) and annotation information for R. solani AG1 IA. A total of six RNAseq samples of R. solani AG1 IA were also analysed, corresponding to 10, 18, 24, 32, 48 and 72 h after infection of rice leaves. The RSIADB database enables users to search, browse, and download gene sequences for R. solani AG1 IA, and mine the data using BLAST, Sequence Extractor, Browse and Construction Diagram tools that were integrated into the database. RSIADB is an important genomic resource for scientists working with R. solani AG1 IA and will assist researchers in analysing the annotated genome and transcriptome of this pathogen. This resource will facilitate studies on gene function, pathogenesis factors and secreted proteins, as well as provide an avenue for comparative analyses of genes expressed during different stages of infection. Database URL: http://genedenovoweb.ticp.net:81/rsia/index.php. PMID:27022158

  13. The personal genome browser: visualizing functions of genetic variants.

    PubMed

    Juan, Liran; Teng, Mingxiang; Zang, Tianyi; Hao, Yafeng; Wang, Zhenxing; Yan, Chengwu; Liu, Yongzhuang; Li, Jie; Zhang, Tianjiao; Wang, Yadong

    2014-07-01

    Advances in high-throughput sequencing technologies have brought us into the individual genome era. Projects such as the 1000 Genomes Project have made individual genome sequencing increasingly popular. How to visualize, analyse and annotate individual genomes with knowledge bases to support genome studies and personalized healthcare remains a major challenge. The Personal Genome Browser (PGB) was developed to provide comprehensive functional annotation and visualization for individual genomes based on the genetic-molecular-phenotypic model. Investigators can easily view individual genetic variants, such as single nucleotide variants (SNVs), INDELs and structural variations (SVs), as well as genomic features and phenotypes associated with the individual genetic variants. The PGB especially highlights potential functional variants using the PGB built-in method or SIFT/PolyPhen2 scores. Moreover, the functional risks of genes can be evaluated by scanning individual genetic variants on the whole genome, a chromosome, or a cytoband based on functional implications of the variants. Investigators can then navigate to high-risk genes on the scanned individual genome. The PGB accepts Variant Call Format (VCF) and Genetic Variation Format (GVF) files as input. The functional annotation of input individual genome variants can be visualized in real time by well-defined symbols and shapes. The PGB is available at http://www.pgbrowser.org/. PMID:24799434

  14. openSNP–A Crowdsourced Web Resource for Personal Genomics

    PubMed Central

    Greshake, Bastian; Bayer, Philipp E.; Rausch, Helge; Reda, Julia

    2014-01-01

    Genome-Wide Association Studies are widely used to correlate phenotypic traits with genetic variants. These studies usually compare the genetic variation between two groups to single out certain Single Nucleotide Polymorphisms (SNPs) that are linked to a phenotypic variation in one of the groups. However, it is necessary to have a large enough sample size to find statistically significant correlations. Direct-To-Consumer (DTC) genetic testing can supply additional data: DTC companies offer the analysis of a large number of SNPs for an individual at low cost without the need to consult a physician or geneticist. Over 100,000 people have already been genotyped through Direct-To-Consumer genetic testing companies. However, these data are not public for a variety of reasons and thus cannot be used in research. It seems reasonable to create a central open data repository for such data. Here we present the web platform openSNP, an open database which allows participants of Direct-To-Consumer genetic testing to publish their genetic data at no cost along with phenotypic information. Through this crowdsourced effort of collecting genetic and phenotypic information, openSNP has become a resource for a wide range of studies, including Genome-Wide Association Studies. openSNP is hosted at http://www.opensnp.org, and the code is released under the MIT license at http://github.com/gedankenstuecke/snpr. PMID:24647222
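
    The two-group comparison described at the start of this abstract can be sketched for a single SNP as a chi-square test on a 2x2 table of allele counts. This is a generic textbook statistic used here for illustration, not code from openSNP; the function name and the counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for the table
    [[a, b], [c, d]], e.g. rows = two groups, columns = two alleles."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so no association can be tested
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical allele counts: group 1 carries 30 A / 10 G, group 2 carries 15 A / 25 G
stat = chi_square_2x2(30, 10, 15, 25)
```

    A larger statistic indicates a stronger difference in allele frequency between the groups; a genome-wide study would repeat this for every SNP and correct for multiple testing, which is why the sample size matters.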

  15. Functional annotations in bacterial genomes based on small RNA signatures.

    PubMed

    Sridhar, Jayavel; Rafi, Ziauddin Ahamed

    2008-04-04

    One of the key challenges in computational genomics is annotating coding genes and identifying regulatory RNAs in complete genomes. This study uses the regulatory RNA locations and their conserved flanking genes, identified within the genomic backbone of a template genome, to search for similar RNA locations in query genomes. The search is based on the recently reported coexistence of small RNAs and their conserved flanking genes in related genomes. Based on our study, 54 additional sRNA locations and functions of 96 uncharacterized genes are predicted in two draft genomes, viz. Serratia marcescens Db1 and Yersinia enterocolitica 8081. Although most of the identified additional small RNA regions and their corresponding flanking genes are homologous in nature, the proposed anchoring technique also successfully identified four non-homologous small RNA regions in the Y. enterocolitica genome. The KEGG Orthology (KO) based automated functional predictions confirm the predicted functions of 65 flanking genes having defined KO numbers, out of the total 96 predictions made by this method. This coexistence-based method shows more sensitivity than controlled vocabularies in locating orthologous gene pairs, even in the absence of defined Orthology numbers. All functional predictions made by this study in Y. enterocolitica 8081 were confirmed by the recently published complete genome sequence and annotations. This study also reports the possible regions of gene rearrangements in these two genomes; further characterization of such RNA regions could shed more light on their possible role in genome evolution.

  16. Functional annotations in bacterial genomes based on small RNA signatures

    PubMed Central

    Sridhar, Jayavel; Rafi, Ziauddin Ahamed

    2008-01-01

    One of the key challenges in computational genomics is annotating coding genes and identifying regulatory RNAs in complete genomes. This study uses the regulatory RNA locations and their conserved flanking genes, identified within the genomic backbone of a template genome, to search for similar RNA locations in query genomes. The search is based on the recently reported coexistence of small RNAs and their conserved flanking genes in related genomes. Based on our study, 54 additional sRNA locations and functions of 96 uncharacterized genes are predicted in two draft genomes, viz. Serratia marcescens Db1 and Yersinia enterocolitica 8081. Although most of the identified additional small RNA regions and their corresponding flanking genes are homologous in nature, the proposed anchoring technique also successfully identified four non-homologous small RNA regions in the Y. enterocolitica genome. The KEGG Orthology (KO) based automated functional predictions confirm the predicted functions of 65 flanking genes having defined KO numbers, out of the total 96 predictions made by this method. This coexistence-based method shows more sensitivity than controlled vocabularies in locating orthologous gene pairs, even in the absence of defined Orthology numbers. All functional predictions made by this study in Y. enterocolitica 8081 were confirmed by the recently published complete genome sequence and annotations. This study also reports the possible regions of gene rearrangements in these two genomes; further characterization of such RNA regions could shed more light on their possible role in genome evolution. PMID:18478081
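
    The flanking-gene anchoring idea in this abstract can be sketched in Python under strong simplifying assumptions: each template sRNA is recorded with its left and right flanking genes, the query genome is an ordered list of gene labels already mapped to template orthologs, and a site is predicted wherever both flanks lie within a small distance of each other. The data layout, the `max_gap` parameter and all names are hypothetical; the abstract does not specify the authors' actual procedure.

```python
def predict_srna_sites(template_srnas, query_order, max_gap=1):
    """Return (sRNA, left flank, right flank) for every template sRNA whose
    two flanking genes are at most `max_gap` positions apart in the query."""
    position = {gene: i for i, gene in enumerate(query_order)}
    hits = []
    for srna, (left, right) in template_srnas.items():
        if left in position and right in position:
            if abs(position[left] - position[right]) <= max_gap:
                hits.append((srna, left, right))
    return hits


# Hypothetical template annotation: sRNA -> (left flanking gene, right flanking gene)
template_srnas = {
    "sRNA1": ("geneA", "geneB"),
    "sRNA2": ("geneC", "geneD"),
    "sRNA3": ("geneE", "geneX"),   # geneX has no ortholog in the query
}
# Hypothetical gene order in the query genome
query_order = ["geneA", "geneB", "geneC", "geneF", "geneG", "geneD", "geneE"]

sites = predict_srna_sites(template_srnas, query_order)
```

    In this toy data only sRNA1 is predicted: the flanks of sRNA2 are separated by intervening genes, and one flank of sRNA3 is missing from the query.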

  17. Soybean Knowledge Base (SoyKB): a Web Resource for Soybean Translational Genomics

    SciTech Connect

    Joshi, Trupti; Patil, Kapil; Fitzpatrick, Michael R.; Franklin, Levi D.; Yao, Qiuming; Cook, Jeffrey R.; Wang, Zhem; Libault, Marc; Brechenmacher, Laurent; Valliyodan, Babu; Wu, Xiaolei; Cheng, Jianlin; Stacey, Gary; Nguyen, Henry T.; Xu, Dong

    2012-01-17

    Background: Soybean Knowledge Base (SoyKB) is a comprehensive all-inclusive web resource for soybean translational genomics. SoyKB is designed to handle the management and integration of soybean genomics, transcriptomics, proteomics and metabolomics data along with annotation of gene function and biological pathway. It contains information on four entities, namely genes, microRNAs, metabolites and single nucleotide polymorphisms (SNPs). Methods: SoyKB has many useful tools such as Affymetrix probe ID search, gene family search, multiple gene/metabolite search supporting co-expression analysis, and protein 3D structure viewer as well as download and upload capacity for experimental data and annotations. It has four tiers of registration, which control different levels of access to public and private data. It allows users of certain levels to share their expertise by adding comments to the data. It has a user-friendly web interface together with a genome browser and pathway viewer, which display data in an intuitive manner to soybean researchers, producers and consumers. Conclusions: SoyKB addresses the increasing need of the soybean research community to have a one-stop-shop functional and translational omics web resource for information retrieval and analysis in a user-friendly way. SoyKB can be publicly accessed at http://soykb.org/.

  18. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

    PubMed

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

    2016-06-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources.

  19. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

    PubMed

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

    2016-06-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources. PMID:26919060

  20. Functional genomics: Probing plant gene function and expression with transposons

    PubMed Central

    Martienssen, Robert A.

    1998-01-01

    Transposable elements provide a convenient and flexible means to disrupt plant genes, so allowing their function to be assessed. By engineering transposons to carry reporter genes and regulatory signals, the expression of target genes can be monitored and to some extent manipulated. Two strategies for using transposons to assess gene function are outlined here: First, the PCR can be used to identify plants that carry insertions into specific genes from among pools of heavily mutagenized individuals (site-selected transposon mutagenesis). This method requires that high copy transposons be used and that a relatively large number of reactions be performed to identify insertions into genes of interest. Second, a large library of plants, each carrying a unique insertion, can be generated. Each insertion site then can be amplified and sequenced systematically. These two methods have been demonstrated in maize, Arabidopsis, and other plant species, and the relative merits of each are discussed in the context of plant genome research. PMID:9482828

  1. Current and future resources for functional metagenomics.

    PubMed

    Lam, Kathy N; Cheng, Jiujun; Engel, Katja; Neufeld, Josh D; Charles, Trevor C

    2015-01-01

    Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries, that is, physical libraries that contain DNA cloned from environmental metagenomes. The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research. PMID:26579102

  2. Current and future resources for functional metagenomics

    PubMed Central

    Lam, Kathy N.; Cheng, Jiujun; Engel, Katja; Neufeld, Josh D.; Charles, Trevor C.

    2015-01-01

    Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries—physical libraries that contain DNA cloned from environmental metagenomes. The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research. PMID:26579102

  3. The Cancer Genome Anatomy Project: new resources for reading the molecular signatures of cancer.

    PubMed

    Strausberg, R L

    2001-09-01

    The Cancer Genome Anatomy Project (CGAP) has built informational, technological, and physical resources to interface genomics with basic and clinical cancer research. The CGAP web site (http://cgap.nci.nih.gov) provides informatics tools for in silico analysis of the CGAP datasets as well as information for accessing each of the CGAP resources. Published in 2001 by John Wiley & Sons, Ltd.

  4. NCBI GEO: archive for functional genomics data sets--update.

    PubMed

    Barrett, Tanya; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Holko, Michelle; Yefanov, Andrey; Lee, Hyeseung; Zhang, Naigong; Robertson, Cynthia L; Serova, Nadezhda; Davis, Sean; Soboleva, Alexandra

    2013-01-01

    The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.

  5. flyDIVaS: A Comparative Genomics Resource for Drosophila Divergence and Selection

    PubMed Central

    Stanley, Craig E.; Kulathinal, Rob J.

    2016-01-01

    With arguably the best finished and expertly annotated genome assembly, Drosophila melanogaster is a formidable genetics model to study all aspects of biology. Nearly a decade ago, the 12 Drosophila genomes project expanded D. melanogaster’s breadth as a comparative model through the community-development of an unprecedented genus- and genome-wide comparative resource. However, since its inception, these datasets for evolutionary inference and biological discovery have become increasingly outdated, outmoded, and inaccessible. Here, we provide an updated and upgradable comparative genomics resource of Drosophila divergence and selection, flyDIVaS, based on the latest genomic assemblies, curated FlyBase annotations, and recent OrthoDB orthology calls. flyDIVaS is an online database containing D. melanogaster-centric orthologous gene sets, CDS and protein alignments, divergence statistics (% gaps, dN, dS, dN/dS), and codon-based tests of positive Darwinian selection. Out of 13,920 protein-coding D. melanogaster genes, ∼80% have one aligned ortholog in the closely related species, D. simulans, and ∼50% have 1–1 12-way alignments in the original 12 sequenced species that span over 80 million yr of divergence. Genes and their orthologs can be chosen from four different taxonomic datasets differing in phylogenetic depth and coverage density, and visualized via interactive alignments and phylogenetic trees. Users can also batch download entire comparative datasets. A functional survey finds conserved mitotic and neural genes, highly diverged immune and reproduction-related genes, more conspicuous signals of divergence across tissue-specific genes, and an enrichment of positive selection among highly diverged genes. flyDIVaS will be regularly updated and can be freely accessed at www.flydivas.info. We encourage researchers to regularly use this resource as a tool for biological inference and discovery, and in their classrooms to help train the next generation of

  6. The Aspergillus Genome Database, a curated comparative genomics resource for gene, protein and sequence information for the Aspergillus research community.

    PubMed

    Arnaud, Martha B; Chibucos, Marcus C; Costanzo, Maria C; Crabtree, Jonathan; Inglis, Diane O; Lotia, Adil; Orvis, Joshua; Shah, Prachi; Skrzypek, Marek S; Binkley, Gail; Miyasato, Stuart R; Wortman, Jennifer R; Sherlock, Gavin

    2010-01-01

    The Aspergillus Genome Database (AspGD) is an online genomics resource for researchers studying the genetics and molecular biology of the Aspergilli. AspGD combines high-quality manual curation of the experimental scientific literature examining the genetics and molecular biology of Aspergilli, cutting-edge comparative genomics approaches to iteratively refine and improve structural gene annotations across multiple Aspergillus species, and web-based research tools for accessing and exploring the data. All of these data are freely available at http://www.aspgd.org. We welcome feedback from users and the research community at aspergillus-curator@genome.stanford.edu.

  7. The Culture Collection of Algae and Protozoa (CCAP): a biological resource for protistan genomics.

    PubMed

    Gachon, Claire M M; Day, John G; Campbell, Christine N; Pröschold, Thomas; Saxon, Rachel J; Küpper, Frithjof C

    2007-12-30

    CCAP, the largest European protistan culture collection, is based at the Scottish Association for Marine Science near Oban, Scotland (http://www.ccap.ac.uk). The Collection comprises more than 2700 strains in the public domain, of which 1050 are marine algae, 1300 freshwater algae, and 350 protozoa. The primary mission of CCAP is to maintain and distribute defined cultures and their associated information to its customers. It also has a support and advisory function on all aspects of protistan science. In addition, it is involved in the training of students and researchers in algal identification and culture techniques. In light of the increasing number of fully sequenced protists, the CCAP is striving to provide targeted services and support to workers involved in all aspects of genomic research. At present, the Collection holds several hundred strains of genomic model taxa including: Acanthamoeba, Cafeteria, Cercomonas, Chlamydomonas, Chlorella, Cyanophora, Dictyostelium, Dunaliella, Ectocarpus, Emiliania, Euglena, Micromonas, Naegleria, Nephroselmis, Paramecium, Pavlova, Phaeodactylum, Porphyra, Pseudendoclonium, Pylaiella, Rhodomonas, Scenedesmus, Staurastrum, Tetrahymena, Thalassiosira, Volvox and Zygnema. These strains provide a defined representation of natural variation within model organisms, an increasingly useful resource for post-genomics approaches. Our aim over the next 2-5 years is to add value to the Collection by increasing the number of genome model species, and by offering an integrated, up-to-date, easy-to-use resource that would provide curated information on our strain holdings. In collaboration with other major Biological Resource Centres worldwide, we intend to build a hub providing access to both protistan cultures and their associated bioinformatics data.

  8. The Aspergillus Genome Database (AspGD): recent developments in comprehensive multispecies curation, comparative genomics and community resources.

    PubMed

    Arnaud, Martha B; Cerqueira, Gustavo C; Inglis, Diane O; Skrzypek, Marek S; Binkley, Jonathan; Chibucos, Marcus C; Crabtree, Jonathan; Howarth, Clinton; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Binkley, Gail; Miyasato, Stuart R; Simison, Matt; Sherlock, Gavin; Wortman, Jennifer R

    2012-01-01

    The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available, web-based resource for researchers studying fungi of the genus Aspergillus, which includes organisms of clinical, agricultural and industrial importance. AspGD curators have now completed comprehensive review of the entire published literature about Aspergillus nidulans and Aspergillus fumigatus, and this annotation is provided with streamlined, ortholog-based navigation of the multispecies information. AspGD facilitates comparative genomics by providing a full-featured genomics viewer, as well as matched and standardized sets of genomic information for the sequenced aspergilli. AspGD also provides resources to foster interaction and dissemination of community information and resources. We welcome and encourage feedback at aspergillus-curator@lists.stanford.edu.

  9. STINGRAY: system for integrated genomic resources and analysis

    PubMed Central

    2014-01-01

    Background: The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. Findings: STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interface that makes the system intuitive, facilitating the tasks of data analysis and annotation. Conclusion: STINGRAY proved to be an easy-to-use and complete system for analyzing sequencing data. While both Sanger and NGS platforms are supported, the system may run faster with Sanger data, since large NGS datasets could slow down MySQL database usage. STINGRAY is available at http://stingray.biowebdb.org and the open source code at http://sourceforge.net/projects/stingray-biowebdb/. PMID:24606808

  10. REVIEW: Zebrafish: A Renewed Model System For Functional Genomics

    NASA Astrophysics Data System (ADS)

    Wen, Xiao-Yan

    2008-01-01

    In the post-genome era, a major goal in molecular biology is to determine the function of the many thousands of genes present in the vertebrate genome. The zebrafish (Danio rerio) provides an almost ideal genetic model to identify the biological roles of these novel genes, in part because its embryos are transparent and develop rapidly. The zebrafish has many advantages over the mouse for genome-wide mutagenesis studies, allowing for easier, cheaper and faster functional characterization of novel genes in the vertebrate genome. Many molecular research tools such as chemical mutagenesis, transgenesis, gene trapping, gene knockdown, TILLING, gene targeting, RNAi and chemical genetic screens are now available in zebrafish. By combining forward, reverse, and chemical genetic tools, zebrafish are expected to make an invaluable contribution to vertebrate functional genomics in the functional annotation of genes, the modeling of human diseases and drug discovery.

  11. CucCAP - Developing genomic resources for the cucurbit community

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The U.S. cucurbit community has initiated a USDA-SCRI funded cucurbit genomics project, CucCAP: Leveraging applied genomics to increase disease resistance in cucurbit crops. Our primary objectives are: develop genomic and bioinformatic breeding tool kits for accelerated crop improvement across the...

  12. Retroelements and their impact on genome evolution and functioning.

    PubMed

    Gogvadze, Elena; Buzdin, Anton

    2009-12-01

    Retroelements comprise a considerable fraction of eukaryotic genomes. Since their initial discovery by Barbara McClintock in maize DNA, retroelements have been found in genomes of almost all organisms. First considered "junk DNA" or genomic parasites, they were later shown to influence genome functioning and to promote genetic innovations. For this reason, they were suggested as an important creative force in genome evolution and in the adaptation of an organism to altered environmental conditions. In this review, we summarize the up-to-date knowledge of the different ways retroelements are involved in the structural and functional evolution of genes and genomes, as well as the mechanisms cells have developed to control their retrotransposition. PMID:19649766

  13. A Functional Simulator of Spacecraft Resources

    NASA Technical Reports Server (NTRS)

    Liceaga, Carlos A.; Troutman, Patrick A.

    1997-01-01

    The SPAcecraft SIMulator (SPASIM) simulates the functions and resources of a spacecraft to quickly perform Phase A trade-off analyses and uncover any operational bottlenecks during any part of the mission. Failure modes and operational contingencies can be evaluated, allowing optimization for a range of mission scenarios. The payloads and subsystems are simulated, using a hierarchy of graphical models, in terms of how their functions affect resources such as propellant, power, and data. Any of the inputs and outputs of the payloads and subsystems can be plotted during the simulation. Most trade-off analyses, including those that compare current versus advanced technology, can be performed by changing values in the parameter menus. However, when a component is replaced by one with a different functional architecture, its graphical model can also be modified or replaced by drawing from a component library. SPASIM has been validated using several spacecraft designs which were at least at the Critical Design Review level. The user and programmer guide, including figures, is available online as a hypertext document. SPASIM is an easy-to-use and expandable tool based on MATLAB and SIMULINK. It runs on SGI workstations and PCs under Windows 95 or NT.

  14. Characterizing genomic alterations in cancer by complementary functional associations

    PubMed Central

    Kim, J. W.; Botvinnik, O. B.; Abudayyeh, O.; Birger, C.; Rosenbluh, J.; Shrestha, Y.; Abazeed, M. E.; Hammerman, P. S.; DiCara, D.; Konieczkowski, D. J.; Johannessen, C. M.; Liberzon, A.; Alizad-Rahvar, A. R.; Alexe, G.; Aguirre, A.; Ghandi, M.; Greulich, H.; Vazquez, F.; Weir, B. A.; Van Allen, E. M.; Tsherniak, A.; Shao, D. D.; Zack, T. I.; Noble, M.; Getz, G.; Beroukhim, R.; Garraway, L. A.; Ardakani, M.; Romualdi, C.; Sales, G.; Barbie, D. A.; Boehm, J. S.; Hahn, W. C.; Mesirov, J. P.; Tamayo, P.

    2016-01-01

    Systematic efforts to sequence the cancer genome have identified large numbers of relevant mutations and copy number alterations in human cancers; however, elucidating their functional consequences, and their interactions to drive or maintain oncogenic states, is still a significant challenge. Here we introduce REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene-dependency of oncogenic pathways or the sensitivity to a drug treatment. We use REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes. PMID:27088724

  15. Partnering for functional genomics research conference: Abstracts of poster presentations

    SciTech Connect

    1998-06-01

    This report contains abstracts of poster presentations presented at the Functional Genomics Research Conference held April 16-17, 1998 in Oak Ridge, Tennessee. Attention is focused on the following areas: mouse mutagenesis and genomics; phenotype screening; gene expression analysis; DNA analysis technology development; bioinformatics; comparative analyses of mouse, human, and yeast sequences; and pilot projects to evaluate methodologies.

  16. Whole genome sequences in pulse crops: a global community resource to expedite translational genomics and knowledge-based crop improvement.

    PubMed

    Bohra, Abhishek; Singh, Narendra P

    2015-08-01

    Unprecedented developments in legume genomics over the last decade have resulted in the acquisition of a wide range of modern genomic resources to underpin genetic improvement of grain legumes. The genome-enabled insights direct investigators in various ways that primarily include unearthing novel structural variations, retrieving lost genetic diversity, introducing novel/exotic alleles from wider gene pools, finely resolving complex quantitative traits and so forth. To this end, ready availability of cost-efficient and high-density genotyping assays allows genome-wide prediction to be increasingly recognized as the key selection criterion in crop breeding. Further, the high-dimensional measurements of agronomically significant phenotypes obtained by using new-generation screening techniques will empower reference-based resequencing as well as allele mining and trait mapping methods to comprehensively associate genome diversity with phenome-scale variation. Besides stimulating forward genetic systems, access to precisely delineated genomic segments reveals novel candidates for reverse genetic techniques like targeted genome editing. The shifting paradigm in plant genomics in turn necessitates optimization of crop breeding strategies to enable the most efficient integration of advanced omics knowledge and tools. We anticipate that crop improvement schemes will be bolstered remarkably by rational deployment of these genome-guided approaches, ultimately resulting in expanded plant breeding capacities and improved crop performance.

  17. Phydbac (phylogenomic display of bacterial genes): An interactive resource for the annotation of bacterial genomes.

    PubMed

    Enault, François; Suhre, Karsten; Poirot, Olivier; Abergel, Chantal; Claverie, Jean-Michel

    2003-07-01

    Phydbac is an interactive web resource based on phylogenomic profiling, designed to help microbiologists annotate bacterial proteins. Phylogenomic annotation is based on the assumption that functionally linked protein-coding genes must evolve in a coordinated manner. The detection of subsets of co-evolving genes within a given genome involves the computation of protein sequence conservation profiles across a spectrum of microbial species, followed by the identification of significant pairwise correlations between them. Many ongoing studies are devoted to the problem of computing the most biologically significant phylogenomic profiles and how best to identify clusters of 'functionally interacting' genes. Here we introduce a web tool, Phydbac, allowing the dynamic construction of phylogenomic profiles of protein sequences of interest and their interactive display. In addition, Phydbac can identify Escherichia coli proteins exhibiting the evolution pattern most similar to arbitrary query protein sequences, hence providing functional hints for open reading frames (ORFs) of hypothetical or unknown function. The phylogenomic profiles of all E. coli K-12 protein-coding genes are pre-computed, allowing queries about E. coli genes to be answered instantaneously. The profiles and phylogenomic neighborhoods are computed using an original method shown to perform better than previous ones. An extension of Phydbac, including precomputed profiles for all available bacterial genomes (including major pathogens), will soon be available. Phydbac can be accessed at: http://igs-server.cnrs-mrs.fr/phydbac/.
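
    The profile-correlation step described in this abstract can be illustrated with a minimal sketch. It is not Phydbac's own method, which the abstract describes only as "original"; it assumes each gene has a conservation score per reference species and uses a plain Pearson correlation with an arbitrary threshold. The gene names, scores, and the function names `pearson` and `correlated_pairs` are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair whose profiles correlate at or above threshold."""
    pairs = []
    for (g1, p1), (g2, p2) in combinations(sorted(profiles.items()), 2):
        r = pearson(p1, p2)
        if r >= threshold:
            pairs.append((g1, g2, r))
    return pairs


# Hypothetical conservation scores of three genes across five reference species.
profiles = {
    "geneA": [0.9, 0.8, 0.1, 0.0, 0.7],
    "geneB": [0.8, 0.9, 0.0, 0.1, 0.8],
    "geneC": [0.1, 0.0, 0.9, 0.8, 0.2],
}

for g1, g2, r in correlated_pairs(profiles):
    print(f"{g1} ~ {g2}: r = {r:.3f}")
```

    In this toy data only geneA and geneB share a conservation pattern, so they are the only pair reported as candidate functional partners.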

  18. From Loci to Biology: Functional Genomics of Genome-Wide Association for Coronary Disease.

    PubMed

    Nurnberg, Sylvia T; Zhang, Hanrui; Hand, Nicholas J; Bauer, Robert C; Saleheen, Danish; Reilly, Muredach P; Rader, Daniel J

    2016-02-19

    Genome-wide association studies have provided a rich collection of ≈ 58 coronary artery disease (CAD) loci that suggest the existence of previously unsuspected new biology relevant to atherosclerosis. However, these studies only identify genomic loci associated with CAD, and many questions remain even after a genomic locus is definitively implicated, including the nature of the causal variant(s) and the causal gene(s), as well as the directionality of effect. There are several tools that can be used for investigation of the functional genomics of these loci, and progress has been made on a limited number of novel CAD loci. New biology regarding atherosclerosis and CAD will be learned through the functional genomics of these loci, and the hope is that at least some of these new pathways relevant to CAD pathogenesis will yield new therapeutic targets for the prevention and treatment of CAD.

  19. Nuclear pore proteins and the control of genome functions

    PubMed Central

    Ibarra, Arkaitz

    2015-01-01

    Nuclear pore complexes (NPCs) are composed of several copies of ∼30 different proteins called nucleoporins (Nups). NPCs penetrate the nuclear envelope (NE) and regulate the nucleocytoplasmic trafficking of macromolecules. Beyond this vital role, NPC components influence genome functions in a transport-independent manner. Nups play an evolutionarily conserved role in gene expression regulation that, in metazoans, extends into the nuclear interior. Additionally, in proliferative cells, Nups play a crucial role in genome integrity maintenance and mitotic progression. Here we discuss genome-related functions of Nups and their impact on essential DNA metabolism processes such as transcription, chromosome duplication, and segregation. PMID:25691464

  20. Characterizing genomic alterations in cancer by complementary functional associations | Office of Cancer Genomics

    Cancer.gov

    Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment.

  1. Integrating Genomic Resources with Electronic Health Records using the HL7 Infobutton Standard

    PubMed Central

    Overby, Casey Lynnette; Del Fiol, Guilherme; Rubinstein, Wendy S.; Maglott, Donna R.; Nelson, Tristan H.; Milosavljevic, Aleksandar; Martin, Christa L.; Goehringer, Scott R.; Freimuth, Robert R.; Williams, Marc S.

    2016-01-01

    Summary. Background: The Clinical Genome Resource (ClinGen) Electronic Health Record (EHR) Workgroup aims to integrate ClinGen resources with EHRs. A promising option to enable this integration is through the Health Level Seven (HL7) Infobutton Standard. EHR systems that are certified according to the US Meaningful Use program provide HL7-compliant infobutton capabilities, which can be leveraged to support clinical decision-making in genomics. Objectives: To integrate genomic knowledge resources using the HL7 infobutton standard. Two tactics to achieve this objective were: (1) creating an HL7-compliant search interface for ClinGen, and (2) proposing guidance for genomic resources on achieving HL7 Infobutton standard accessibility and compliance. Methods: We built a search interface utilizing OpenInfobutton, an open source reference implementation of the HL7 Infobutton standard. ClinGen resources were assessed for readiness towards HL7 compliance. Finally, based upon our experiences we provide recommendations for publishers seeking to achieve HL7 compliance. Results: Eight genomic resources and two sub-resources were integrated with the ClinGen search engine via OpenInfobutton and the HL7 infobutton standard. Resources we assessed have varying levels of readiness towards HL7-compliance. Furthermore, we found that adoption of standard terminologies used by EHR systems is the main gap to achieve compliance. Conclusion: Genomic resources can be integrated with EHR systems via the HL7 Infobutton standard using OpenInfobutton. Full compliance of genomic resources with the Infobutton standard would further enhance interoperability with EHR systems. PMID:27579472

  2. Bacteriophage functional genomics and its role in bacterial pathogen detection.

    PubMed

    Klumpp, Jochen; Fouts, Derrick E; Sozhamannan, Shanmuga

    2013-07-01

    Emerging and reemerging bacterial infectious diseases are a major public health concern worldwide. The role of bacteriophages in the emergence of novel bacterial pathogens by horizontal gene transfer was highlighted by the May 2011 Escherichia coli O104:H4 outbreaks that originated in Germany and spread to other European countries. This outbreak also highlighted the pivotal role played by recent advances in functional genomics in rapidly deciphering the virulence mechanism elicited by this novel pathogen and developing rapid diagnostics and therapeutics. However, despite a steady increase in the number of phage sequences in the public databases, boosted by the next-generation sequencing technologies, few functional genomics studies of bacteriophages have been conducted. Our definition of 'functional genomics' encompasses a range of aspects: phage genome sequencing, annotation and ascribing functions to phage genes, prophage identification in bacterial sequences, elucidating the events in various stages of phage life cycle using genomic, transcriptomic and proteomic approaches, defining the mechanisms of host takeover including specific bacterial-phage protein interactions and identifying virulence and other adaptive features encoded by phages and finally, using prophage genomic information for bacterial detection/diagnostics. Given the breadth and depth of this definition and the fact that some of these aspects (especially phage-encoded virulence/adaptive features) have been treated extensively in other reviews, we restrict our focus only on certain aspects. These include phage genome sequencing and annotation, identification of prophages in bacterial sequences and genetic characterization of phages, functional genomics of the infection process and finally, bacterial identification using genomic information.

  3. Bacteriophage functional genomics and its role in bacterial pathogen detection.

    PubMed

    Klumpp, Jochen; Fouts, Derrick E; Sozhamannan, Shanmuga

    2013-07-01

    Emerging and reemerging bacterial infectious diseases are a major public health concern worldwide. The role of bacteriophages in the emergence of novel bacterial pathogens by horizontal gene transfer was highlighted by the May 2011 Escherichia coli O104:H4 outbreaks that originated in Germany and spread to other European countries. This outbreak also highlighted the pivotal role played by recent advances in functional genomics in rapidly deciphering the virulence mechanism elicited by this novel pathogen and developing rapid diagnostics and therapeutics. However, despite a steady increase in the number of phage sequences in the public databases, boosted by the next-generation sequencing technologies, few functional genomics studies of bacteriophages have been conducted. Our definition of 'functional genomics' encompasses a range of aspects: phage genome sequencing, annotation and ascribing functions to phage genes, prophage identification in bacterial sequences, elucidating the events in various stages of phage life cycle using genomic, transcriptomic and proteomic approaches, defining the mechanisms of host takeover including specific bacterial-phage protein interactions and identifying virulence and other adaptive features encoded by phages and finally, using prophage genomic information for bacterial detection/diagnostics. Given the breadth and depth of this definition and the fact that some of these aspects (especially phage-encoded virulence/adaptive features) have been treated extensively in other reviews, we restrict our focus only on certain aspects. These include phage genome sequencing and annotation, identification of prophages in bacterial sequences and genetic characterization of phages, functional genomics of the infection process and finally, bacterial identification using genomic information. PMID:23520178

  4. Development of Genomic Resources for Pacific Herring through Targeted Transcriptome Pyrosequencing

    PubMed Central

    Roberts, Steven B.; Hauser, Lorenz; Seeb, Lisa W.; Seeb, James E.

    2012-01-01

    Pacific herring (Clupea pallasii) support commercially and culturally important fisheries but have experienced significant additional pressure from a variety of anthropogenic and environmental sources. In order to provide genomic resources to facilitate organismal and population level research, high-throughput pyrosequencing (Roche 454) was carried out on transcriptome libraries from liver and testes samples taken in Prince William Sound, the Bering Sea, and the Gulf of Alaska. Over 40,000 contigs were identified with an average length of 728 bp. We describe an annotated transcriptome as well as a workflow for single nucleotide polymorphism (SNP) discovery and validation. A subset of 96 candidate SNPs, chosen from 10,933 potential SNPs, was tested using a combination of Sanger sequencing and high-resolution melt-curve analysis. Five SNPs supported between-ocean-basin differentiation, while one SNP associated with immune function provided high differentiation between Prince William Sound and Kodiak Island within the Gulf of Alaska. These genomic resources provide a basis for environmental physiology studies and opportunities for marker development and subsequent population structure analysis. PMID:22383979

  5. Genome wide functional genetics in haploid cells.

    PubMed

    Elling, Ulrich; Penninger, Josef M

    2014-08-01

    Some organisms such as yeast or males of social insects are haploid, i.e. they carry a single set of chromosomes, while haploidy in mammals is exclusively restricted to mature germ cells. A single copy of the genome provides the basis for genetic analyses where any recessive mutation of essential genes will show a clear phenotype due to the absence of a second gene copy. Most prominently, haploidy in yeast has been utilized for recessive genetic screens that have markedly contributed to our understanding of development, basic physiology, and disease. Somatic mammalian cells carry two copies of chromosomes (diploidy) that obscure genetic analysis. Near haploid human leukemic cells however have been developed as a high throughput screening tool. Although deemed impossible, we and others have generated mammalian haploid embryonic stem cells from parthenogenetic mouse embryos. Haploid stem cells open the possibility of combining the power of a haploid genome with pluripotency of embryonic stem cells to uncover fundamental biological processes in defined cell types at a genomic scale. Haploid genetics has thus become a powerful alternative to RNAi or CRISPR based screens. PMID:24950427

  6. Determining protein function and interaction from genome analysis

    DOEpatents

    Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.

    2004-08-03

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
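
    The Rosetta Stone idea in this abstract can be sketched in a few lines. This is an illustrative toy, not the patented procedure: it assumes homology hits have already been computed by some external search and are given as (query protein, candidate fused protein, start, end) tuples, with coordinates on the candidate fused protein. Two different query proteins that align to non-overlapping regions of the same protein are reported as a candidate functional link. The function name `rosetta_stone_links`, the `max_overlap` parameter, and all protein names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def rosetta_stone_links(hits, max_overlap=0):
    """Find pairs of query proteins that hit non-overlapping regions of one subject protein.

    hits: iterable of (query, subject, start, end), coordinates on the subject, inclusive.
    Returns a sorted list of (query1, query2, subject) candidate links.
    """
    by_subject = defaultdict(list)
    for query, subject, start, end in hits:
        by_subject[subject].append((query, start, end))

    links = set()
    for subject, regions in by_subject.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(regions, 2):
            if q1 == q2:
                continue
            # Number of shared residues; zero or negative means the regions are disjoint.
            overlap = min(e1, e2) - max(s1, s2) + 1
            if overlap <= max_overlap:
                links.add((min(q1, q2), max(q1, q2), subject))
    return sorted(links)


# Hypothetical precomputed homology hits.
hits = [
    ("A_prime", "AB_fused", 1, 200),
    ("B_prime", "AB_fused", 210, 450),
    ("C_prime", "AB_fused", 150, 300),  # overlaps both regions above, so it links to neither
    ("D_prime", "D_only", 1, 180),
]

for q1, q2, subject in rosetta_stone_links(hits):
    print(f"{q1} <-> {q2} (linked through {subject})")
```

    Requiring disjoint regions is one simple way to avoid linking two proteins that merely share a single common domain; a real pipeline would need additional filtering for promiscuous domains.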

  7. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.

  8. Coordinated international action to accelerate genome-to-phenome with FAANG, The Functional Annotation of Animal Genomes project

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We describe the organization of a nascent international effort - the "Functional Annotation of ANimal Genomes" project - whose aim is to produce comprehensive maps of functional elements in the genomes of domesticated animal species....

  9. RegPrecise 3.0 – A resource for genome-scale exploration of transcriptional regulation in bacteria

    PubMed Central

    2013-01-01

    Background: Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. Comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). Description: RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons is available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes. Conclusions: RegPrecise 3.0 gives access to the

  10. SchistoDB: a Schistosoma mansoni genome resource

    PubMed Central

    Zerlotini, Adhemar; Heiges, Mark; Wang, Haiming; Moraes, Romulo L. V.; Dominitini, Anderson J.; Ruiz, Jerônimo C.; Kissinger, Jessica C.; Oliveira, Guilherme

    2009-01-01

    SchistoDB (http://schistoDB.net/) is a genomic database for the parasitic organism Schistosoma mansoni, one of the major causative agents of schistosomiasis worldwide. It currently incorporates sequences and annotation for S. mansoni in a single user-friendly database. Several genomic scale analyses are available as well as ESTs, oligonucleotides, metabolic pathways and drugs. In this article, we describe the data sets and their analyses, how to query the database and the tools available on the website. PMID:18842636

  11. BambooGDB: a bamboo genome database with functional annotation and an analysis platform

    PubMed Central

    Zhao, Hansheng; Peng, Zhenhua; Fei, Benhua; Li, Lubin; Hu, Tao; Gao, Zhimin; Jiang, Zehui

    2014-01-01

    Bamboo, one of the most important non-timber forest products and fastest-growing plants in the world, represents the only major lineage of grasses that is native to forests. The recent completion of the first high-quality draft genome sequence of moso bamboo (Phyllostachys edulis) provides new insights into bamboo genetics and evolution. To extend our understanding of the bamboo genome and facilitate future studies that build on previous achievements, we have developed BambooGDB, a bamboo genome database with functional annotation and an analysis platform. The de novo sequencing data, together with the full-length complementary DNA and RNA-seq data of moso bamboo, make up the main contents of this database. Based on these sequence data, a comprehensive functional annotation of the bamboo genome was made. In addition, an analytical platform comprising comparative genomic analysis, a protein–protein interaction network, pathway analysis and visualization of genomic data was constructed. As a set of discovery tools for understanding and identifying biological mechanisms of bamboo, the platform can be used as a systematic framework for designing experiments for further validation. Moreover, diverse and powerful search tools and a convenient browser were incorporated to facilitate navigation of these data. As far as we know, this is the first genome database for bamboo. By integrating high-throughput sequencing data, a full functional annotation and several analysis modules, BambooGDB aims to provide researchers worldwide with a central genomic resource and an extensible analysis platform for the bamboo genome. BambooGDB is freely available at http://www.bamboogdb.org/. Database URL: http://www.bamboogdb.org PMID:24602877

  12. Megx.net: integrated database resource for marine ecological genomics.

    PubMed

    Kottmann, Renzo; Kostadinov, Ivalyo; Duhaime, Melissa Beth; Buttigieg, Pier Luigi; Yilmaz, Pelin; Hankeln, Wolfgang; Waldmann, Jost; Glöckner, Frank Oliver

    2010-01-01

    Megx.net is a database and portal that provides integrated access to georeferenced marker genes, environment data and marine genome and metagenome projects for microbial ecological genomics. All data are stored in the Microbial Ecological Genomics DataBase (MegDB), which is subdivided to hold both sequence and habitat data and global environmental data layers. The extended system provides access to several hundreds of genomes and metagenomes from prokaryotes and phages, as well as over a million small and large subunit ribosomal RNA sequences. With the refined Genes Mapserver, all data can be interactively visualized on a world map and statistics describing environmental parameters can be calculated. Sequence entries have been curated to comply with the proposed minimal standards for genomes and metagenomes (MIGS/MIMS) of the Genomic Standards Consortium. Access to data is facilitated by Web Services. The updated megx.net portal offers microbial ecologists greatly enhanced database content, and new features and tools for data analysis, all of which are freely accessible from our webpage http://www.megx.net.

  13. A High-Definition View of Functional Genetic Variation from Natural Yeast Genomes

    PubMed Central

    Bergström, Anders; Simpson, Jared T.; Salinas, Francisco; Barré, Benjamin; Parts, Leopold; Zia, Amin; Nguyen Ba, Alex N.; Moses, Alan M.; Louis, Edward J.; Mustonen, Ville; Warringer, Jonas; Durbin, Richard; Liti, Gianni

    2014-01-01

    The question of how genetic variation in a population influences phenotypic variation and evolution is of major importance in modern biology. Yet much is still unknown about the relative functional importance of different forms of genome variation and how they are shaped by evolutionary processes. Here we address these questions by population level sequencing of 42 strains from the budding yeast Saccharomyces cerevisiae and its closest relative S. paradoxus. We find that genome content variation, in the form of presence or absence as well as copy number of genetic material, is higher within S. cerevisiae than within S. paradoxus, despite genetic distances as measured in single-nucleotide polymorphisms being vastly smaller within the former species. This genome content variation, as well as loss-of-function variation in the form of premature stop codons and frameshifting indels, is heavily enriched in the subtelomeres, strongly reinforcing the relevance of these regions to functional evolution. Genes affected by these likely functional forms of variation are enriched for functions mediating interaction with the external environment (sugar transport and metabolism, flocculation, metal transport, and metabolism). Our results and analyses provide a comprehensive view of genomic diversity in budding yeast and expose surprising and pronounced differences between the variation within S. cerevisiae and that within S. paradoxus. We also believe that the sequence data and de novo assemblies will constitute a useful resource for further evolutionary and population genomics studies. PMID:24425782

  14. Mapping genomic features to functional traits through microbial whole genome sequences.

    PubMed

    Zhang, Wei; Zeng, Erliang; Liu, Dan; Jones, Stuart E; Emrich, Scott

    2014-01-01

    Recently, the utility of trait-based approaches for microbial communities has been identified. The increasing availability of whole genome sequences provides the opportunity to explore the genetic foundations of a variety of functional traits. We proposed a machine learning framework to quantitatively link genomic features with functional traits. Genes from bacterial genomes belonging to different functional traits were grouped into Clusters of Orthologous Groups (COGs), which were used as features. Then, the TF-IDF technique from the text mining domain was applied to transform the data to account for the abundance and importance of each COG. After TF-IDF processing, COGs were ranked using feature selection methods to identify their relevance to the functional trait of interest. Extensive experimental results demonstrated that functional-trait-related genes can be detected using our method. Further, the method has the potential to provide novel biological insights.
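
    The TF-IDF transformation mentioned in this abstract can be sketched as follows. The abstract does not say which TF-IDF variant the authors used, so this sketch assumes the textbook form: term frequency is a COG's share of all COG assignments in a genome, and inverse document frequency is the natural log of (number of genomes / number of genomes containing the COG). The genome names, COG counts, and the function name `tf_idf` are hypothetical, and the subsequent feature-selection ranking is not shown.

```python
import math


def tf_idf(counts):
    """Convert {genome: {cog: count}} into {genome: {cog: tf-idf weight}}."""
    n_genomes = len(counts)

    # Document frequency: in how many genomes does each COG occur at least once?
    doc_freq = {}
    for cog_counts in counts.values():
        for cog, count in cog_counts.items():
            if count > 0:
                doc_freq[cog] = doc_freq.get(cog, 0) + 1

    weights = {}
    for genome, cog_counts in counts.items():
        total = sum(cog_counts.values())
        weights[genome] = {
            cog: (count / total) * math.log(n_genomes / doc_freq[cog])
            for cog, count in cog_counts.items()
            if count > 0
        }
    return weights


# Hypothetical COG counts for three genomes.
counts = {
    "genome1": {"COG0001": 2, "COG0002": 2},
    "genome2": {"COG0001": 1, "COG0003": 3},
    "genome3": {"COG0001": 4},
}

for genome, cog_weights in tf_idf(counts).items():
    print(genome, {cog: round(w, 3) for cog, w in cog_weights.items()})
```

    A COG present in every genome (COG0001 here) receives a weight of zero, while COGs restricted to a single genome are up-weighted, which is the property that makes the transformation useful before ranking features against a trait.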

  15. Contribution of genomics to the understanding of physiological functions.

    PubMed

    Hocquette, J F; Cassar-Malek, I; Scalbert, A; Guillou, F

    2009-10-01

    Genomics has brought with it a true biological revolution and can be applied to all areas of life sciences. The advent of genomics is thus linked to the development of high-throughput techniques which allow the genome of organisms as a whole to be studied. The first high-throughput techniques to be developed were sequencing methods. These advances will allow new approaches to a variety of problems in biology. For instance, the emerging fields of genomic medicine in humans and genomic selection in livestock are promising. After the sequencing of genomes, genomics has shifted to the study of gene expression and function. This is called the "post-genomic area" by some authors or "functional genomics" by others. The most recent "omics" to be developed are associated with the study of the metabolism (e.g. metabolomics). Integrative "omics" approaches (e.g. nutrigenomics) are based on the association of the omics tools at different levels (DNA, RNA, proteins, metabolites) for a specific objective (here nutrition). In terms of perspectives, it is likely that methods for collecting data will outstrip our capacity to adequately analyse these data. So scientists must develop bioinformatic tools and methods to overcome this difficulty. In addition, high-throughput techniques need to be developed in physiology in order to match the increasing amount of genomic information with true biological data. Finally, there is no doubt that all these new approaches will allow important new genes and novel biological mechanisms to be discovered. Physiological models with invalidated or over-expressed genes will be precious tools to check these new biological discoveries.

  16. Functional diversity in resource use by fungi.

    PubMed

    McGuire, Krista L; Bent, Elizabeth; Borneman, James; Majumder, Arundhati; Allison, Steven D; Tresederi, Kathleen K

    2010-08-01

    Fungi influence nutrient cycling in terrestrial ecosystems, as they are major regulators of decomposition and soil respiration. However, little is known about the substrate preferences of individual fungal species outside of laboratory culture studies. If active fungi differ in their substrate preferences in situ, then changes in fungal diversity due to global change may dramatically influence nutrient cycling in ecosystems. To test the responses of individual fungal taxa to specific substrates, we used a nucleotide-analogue procedure in the boreal forest of Alaska (USA). Specifically, we added four organic N compounds commonly found in plant litter (arginine, glutamate, lignocellulose, and tannin-protein) to litterbags filled with decomposed leaf litter (black spruce and aspen) and assessed the responses of active fungal species using qPCR (quantitative polymerase chain reaction), oligonucleotide fingerprinting of rRNA genes, and sequencing. We also compared the sequences from our experiment with a concurrent warming experiment to see if active fungi that targeted more recalcitrant compounds would respond more positively to soil warming. We found that individual fungal taxa responded differently to substrate additions and that active fungal communities were different across litter types (spruce vs. aspen). Active fungi that targeted lignocellulose also responded positively to experimental warming. Additionally, resource-use patterns in different fungal taxa were genetically correlated, suggesting that it may be possible to predict the ecological function of active fungal communities based on genetic information. Together, these results imply that fungi are functionally diverse and that reductions in fungal diversity may have consequences for ecosystem functioning.

  17. Exploiting Genomics Resources to Identify Candidate Genes Underlying Antioxidants Content in Tomato Fruit

    PubMed Central

    Calafiore, Roberta; Ruggieri, Valentino; Raiola, Assunta; Rigano, Maria M.; Sacco, Adriana; Hassan, Mohamed I.; Frusciante, Luigi; Barone, Amalia

    2016-01-01

    The tomato is a model species for fleshy fruit development and ripening, as well as for genomics studies of other Solanaceae. Many genetic and genomics resources, including databases for sequencing, transcriptomics and metabolomics data, have been developed and are available today. The purpose of the present work was to uncover new genes and/or alleles that determine ascorbic acid and carotenoids accumulation, by exploiting one Solanum pennellii introgression line (IL7-3) harboring quantitative trait loci (QTL) that increase the content of these metabolites in the fruit. The higher ascorbic acid and carotenoids content in IL7-3 was confirmed at three fruit developmental stages. The tomato genome reference sequence and the recently released S. pennellii genome sequence were investigated to identify candidate genes (CGs) that might control ascorbic acid and carotenoids accumulation. First, the borders of the wild region in IL7-3 were refined by analyzing CAPS markers designed in our laboratory. Afterward, six CGs associated with ascorbic acid and one with carotenoids metabolism were identified by exploring the annotation and the Gene Ontology terms of genes included in the region. Variants between the sequences of the wild and the cultivated alleles of these genes were investigated for their functional relevance, and their potential effects on the protein sequences were predicted. Transcriptional levels of CGs in the introgression region were extracted from RNA-Seq data available for the entire S. pennellii introgression lines collection and verified by Real-Time qPCR. Finally, seven IL7-3 sub-lines were genotyped using 28 species-specific markers and then were evaluated for metabolites content. These analyses revealed a significant decrease in transcript abundance for one 9-cis-epoxycarotenoid dioxygenase and one L-ascorbate oxidase homolog, whose role in the accumulation of carotenoids and ascorbic acid is discussed. Comprehensively, the reported

  18. The Human Resources Function and the Growing Company.

    ERIC Educational Resources Information Center

    Arthur, Diane

    1987-01-01

    Discusses factors in a changing society that will make it necessary for companies to revamp their human resources function. Topics include technology, demographics, emerging career categories, human resources planning, benefits, legal trends, and training and development. Tells how to revamp the human resources function. (CH)

  19. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics.

    PubMed

    Tzika, Athanasia C; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C

    2015-06-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the "Reptilian Transcriptomes Database 2.0," which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. PMID:26133641

  20. Reptilian Transcriptomes v2.0: An Extensive Resource for Sauropsida Genomics and Transcriptomics

    PubMed Central

    Tzika, Athanasia C.; Ullate-Agote, Asier; Grbic, Djordje; Milinkovitch, Michel C.

    2015-01-01

    Despite the availability of deep-sequencing techniques, genomic and transcriptomic data remain unevenly distributed across phylogenetic groups. For example, reptiles are poorly represented in sequence databases, hindering functional evolutionary and developmental studies in these lineages substantially more diverse than mammals. In addition, different studies use different assembly and annotation protocols, inhibiting meaningful comparisons. Here, we present the “Reptilian Transcriptomes Database 2.0,” which provides extensive annotation of transcriptomes and genomes from species covering the major reptilian lineages. To this end, we sequenced normalized complementary DNA libraries of multiple adult tissues and various embryonic stages of the leopard gecko and the corn snake and gathered published reptilian sequence data sets from representatives of the four extant orders of reptiles: Squamata (snakes and lizards), the tuatara, crocodiles, and turtles. The LANE runner 2.0 software was implemented to annotate all assemblies within a single integrated pipeline. We show that this approach increases the annotation completeness of the assembled transcriptomes/genomes. We then built large concatenated protein alignments of single-copy genes and inferred phylogenetic trees that support the positions of turtles and the tuatara as sister groups of Archosauria and Squamata, respectively. The Reptilian Transcriptomes Database 2.0 resource will be updated to include selected new data sets as they become available, thus making it a reference for differential expression studies, comparative genomics and transcriptomics, linkage mapping, molecular ecology, and phylogenomic analyses involving reptiles. The database is available at www.reptilian-transcriptomes.org and can be enquired using a wwwblast server installed at the University of Geneva. PMID:26133641

  1. Functional genomics of lactic acid bacteria: from food to health.

    PubMed

    Douillard, François P; de Vos, Willem M

    2014-08-29

    Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health.

  2. Functional genomics of lactic acid bacteria: from food to health

    PubMed Central

    2014-01-01

    Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health. PMID:25186768

  3. Functional Analysis of the Human Genome: Study of Genetic Disease

    NASA Astrophysics Data System (ADS)

    Tsui, Lap-Chee

    2003-04-01

    I will divide my remarks into 3 parts. First, I will give a brief summary of the Human Genome Project. Second, I will describe our work on human chromosome 7 to illustrate how we could contribute to the Project and disease research. Third, I would like to bring across the argument that study of genetic disease is an integral component of the Human Genome Project. In particular, I will use cystic fibrosis as an example to elaborate why I consider disease study is a part of functional genomics.

  4. Exploring the post-genomic world: differing explanatory and manipulatory functions of post-genomic sciences

    PubMed Central

    Holmes, Christina; Carlson, Siobhan M.; McDonald, Fiona; Jones, Mavis; Graham, Janice

    2016-01-01

    Richard Lewontin proposed that the ability of a scientific field to create a narrative for public understanding garners it social relevance. This article applies Lewontin's conceptual framework of the functions of science (manipulatory and explanatory) to compare and explain the current differences in perceived societal relevance of genetics/genomics and proteomics. We provide three examples to illustrate the social relevance and strong cultural narrative of genetics/genomics for which no counterpart exists for proteomics. We argue that the major difference between genetics/genomics and proteomics is that genomics has a strong explanatory function, due to the strong cultural narrative of heredity. Based on qualitative interviews and observations of proteomics conferences, we suggest that the nature of proteins, lack of public understanding, and theoretical complexity exacerbates this difference for proteomics. Lewontin's framework suggests that social scientists may find that omics sciences affect social relations in different ways than past analyses of genetics. PMID:27134568

  5. Genome Snapshot: a new resource at the Saccharomyces Genome Database (SGD) presenting an overview of the Saccharomyces cerevisiae genome.

    PubMed

    Hirschman, Jodi E; Balakrishnan, Rama; Christie, Karen R; Costanzo, Maria C; Dwight, Selina S; Engel, Stacia R; Fisk, Dianna G; Hong, Eurie L; Livstone, Michael S; Nash, Robert; Park, Julie; Oughtred, Rose; Skrzypek, Marek; Starr, Barry; Theesfeld, Chandra L; Williams, Jennifer; Andrada, Rey; Binkley, Gail; Dong, Qing; Lane, Christopher; Miyasato, Stuart; Sethuraman, Anand; Schroeder, Mark; Thanawala, Mayank K; Weng, Shuai; Dolinski, Kara; Botstein, David; Cherry, J Michael

    2006-01-01

    Sequencing and annotation of the entire Saccharomyces cerevisiae genome has made it possible to gain a genome-wide perspective on yeast genes and gene products. To make this information available on an ongoing basis, the Saccharomyces Genome Database (SGD) (http://www.yeastgenome.org/) has created the Genome Snapshot (http://db.yeastgenome.org/cgi-bin/genomeSnapShot.pl). The Genome Snapshot summarizes the current state of knowledge about the genes and chromosomal features of S. cerevisiae. The information is organized into two categories: (i) number of each type of chromosomal feature annotated in the genome and (ii) number and distribution of genes annotated to Gene Ontology terms. Detailed lists are accessible through SGD's Advanced Search tool (http://db.yeastgenome.org/cgi-bin/search/featureSearch), and all the data presented on this page are available from the SGD ftp site (ftp://ftp.yeastgenome.org/yeast/).

  6. GiardiaDB and TrichDB: integrated genomic resources for the eukaryotic protist pathogens Giardia lamblia and Trichomonas vaginalis

    PubMed Central

    Aurrecoechea, Cristina; Brestelli, John; Carlton, Jane M.; Dommer, Jennifer; Fischer, Steve; Gajria, Bindu; Gao, Xin; Gingle, Alan; Harb, Omar S.; Heiges, Mark; Innamorato, Frank; Iodice, John; Kissinger, Jessica C.; Kraemer, Eileen; Li, Wei; Miller, John A.; Morrison, Hilary G.; Nayak, Vishal; Pennington, Cary; Pinney, Deborah F.; Roos, David S.; Ross, Chris; Sullivan, Steven; Treatman, Charles; Wang, Haiming

    2009-01-01

    GiardiaDB (http://GiardiaDB.org) and TrichDB (http://TrichDB.org) house the genome databases for Giardia lamblia and Trichomonas vaginalis, respectively, and represent the latest additions to the EuPathDB (http://EuPathDB.org) family of functional genomic databases. GiardiaDB and TrichDB employ the same framework as other EuPathDB sites (CryptoDB, PlasmoDB and ToxoDB), supporting fully integrated and searchable databases. Genomic-scale data available via these resources may be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs and other protein characteristics. Functional queries may also be formulated, based on transcript and protein expression data from a variety of platforms. Phylogenetic relationships may also be interrogated. The ability to combine the results from independent queries, and to store queries and query results for future use facilitates complex, genome-wide mining of functional genomic data. PMID:18824479

  7. GiardiaDB and TrichDB: integrated genomic resources for the eukaryotic protist pathogens Giardia lamblia and Trichomonas vaginalis.

    PubMed

    Aurrecoechea, Cristina; Brestelli, John; Brunk, Brian P; Carlton, Jane M; Dommer, Jennifer; Fischer, Steve; Gajria, Bindu; Gao, Xin; Gingle, Alan; Grant, Greg; Harb, Omar S; Heiges, Mark; Innamorato, Frank; Iodice, John; Kissinger, Jessica C; Kraemer, Eileen; Li, Wei; Miller, John A; Morrison, Hilary G; Nayak, Vishal; Pennington, Cary; Pinney, Deborah F; Roos, David S; Ross, Chris; Stoeckert, Christian J; Sullivan, Steven; Treatman, Charles; Wang, Haiming

    2009-01-01

    GiardiaDB (http://GiardiaDB.org) and TrichDB (http://TrichDB.org) house the genome databases for Giardia lamblia and Trichomonas vaginalis, respectively, and represent the latest additions to the EuPathDB (http://EuPathDB.org) family of functional genomic databases. GiardiaDB and TrichDB employ the same framework as other EuPathDB sites (CryptoDB, PlasmoDB and ToxoDB), supporting fully integrated and searchable databases. Genomic-scale data available via these resources may be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs and other protein characteristics. Functional queries may also be formulated, based on transcript and protein expression data from a variety of platforms. Phylogenetic relationships may also be interrogated. The ability to combine the results from independent queries, and to store queries and query results for future use facilitates complex, genome-wide mining of functional genomic data. PMID:18824479

  8. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    SciTech Connect

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  9. The Protein Information Resource: an integrated public resource of functional annotation of proteins

    PubMed Central

    Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.

    2002-01-01

    The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247

  10. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python).

    PubMed

    Irizarry, Kristopher J L; Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value. PMID:27200191

  12. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python)

    PubMed Central

    Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value. PMID:27200191

  13. Foster Family Resources, Psychosocial Functioning, and Retention.

    ERIC Educational Resources Information Center

    Rhodes, Kathryn W.; Orme, John G.; Cox, Mary Ellen; Buehler, Cheryl

    2003-01-01

    This longitudinal study examined the effect of family resources and psychosocial problems on retention for foster families. Almost 50 percent of families who started preservice training did not complete it. Families with more psychosocial problems and fewer resources were more likely to express uncertainty about continuing. These results have…

  14. Conifer genomics and adaptation: at the crossroads of genetic diversity and genome function.

    PubMed

    Prunier, Julien; Verta, Jukka-Pekka; MacKay, John J

    2016-01-01

    Conifers have been understudied at the genomic level despite their worldwide ecological and economic importance but the situation is rapidly changing with the development of next generation sequencing (NGS) technologies. With NGS, genomics research has simultaneously gained in speed, magnitude and scope. In just a few years, genomes of 20-24 gigabases have been sequenced for several conifers, with several others expected in the near future. Biological insights have resulted from recent sequencing initiatives as well as genetic mapping, gene expression profiling and gene discovery research over nearly two decades. We review the knowledge arising from conifer genomics research emphasizing genome evolution and the genomic basis of adaptation, and outline emerging questions and knowledge gaps. We discuss future directions in three areas with potential inputs from NGS technologies: the evolutionary impacts of adaptation in conifers based on the adaptation-by-speciation model; the contributions of genetic variability of gene expression in adaptation; and the development of a broader understanding of genetic diversity and its impacts on genome function. These research directions promise to sustain research aimed at addressing the emerging challenges of adaptation that face conifer trees.

  15. Functional genomics identifies drivers of medulloblastoma dissemination.

    PubMed

    Mumert, Michael; Dubuc, Adrian; Wu, Xiaochong; Northcott, Paul A; Chin, Steven S; Pedone, Carolyn A; Taylor, Michael D; Fults, Daniel W

    2012-10-01

    Medulloblastomas are malignant brain tumors that arise in the cerebellum in children and disseminate via the cerebrospinal fluid to the leptomeningeal spaces of the brain and spinal cord. Challenged by the poor prognosis for patients with metastatic dissemination, pediatric oncologists have developed aggressive treatment protocols, combining surgery, craniospinal radiation, and high-dose chemotherapy, that often cause disabling neurotoxic effects in long-term survivors. Insights into the genetic control of medulloblastoma dissemination have come from transposon insertion mutagenesis studies. Mobilizing the Sleeping Beauty transposon in cerebellar neural progenitor cells caused widespread dissemination of typically nonmetastatic medulloblastomas in Patched(+/-) mice, in which Shh signaling is hyperactive. Candidate metastasis genes were identified by sequencing the insertion sites and then mapping these sequences back to the mouse genome. To determine whether genes located at transposon insertion sites directly caused medulloblastomas to disseminate, we overexpressed candidate genes in Nestin(+) neural progenitors in the cerebella of mice by retroviral transfer in combination with Shh. We show here that ectopic expression of Eras, Lhx1, Ccrk, and Akt shifted the in vivo growth characteristics of Shh-induced medulloblastomas from a localized pattern to a disseminated pattern in which tumor cells seeded the leptomeningeal spaces of the brain and spinal cord. PMID:22875024

  16. Recombination between defective tombusvirus RNAs generates functional hybrid genomes

    SciTech Connect

    White, K.A.; Morris, T.J.

    1994-04-26

    The tombusviruses represent a group of small icosahedral plant viruses that contain monopartite positive-sense RNA genomes. Tombusviruses are able to generate small replicating deletion mutants of their genomes (i.e., defective interfering RNAs) during infections via RNA recombination and/or rearrangement. To further study the process of RNA recombination and to determine whether tombusviruses were capable of trans-recombination, protoplasts were coinoculated with in vitro-generated transcripts of a nonreplicating 3′-truncated genomic RNA of cucumber necrosis tombusvirus and either replicative or replication-defective DI RNAs of tomato bushy stunt tombusvirus. After 48-hr incubation, two dominant replicative chimeric recombinant viral RNA populations were detected that contained various large contiguous 5′ segments of the cucumber necrosis tombusvirus genomic RNA fused to 3′-terminal regions of the tomato bushy stunt tombusvirus defective interfering RNA. Some of the larger chimeric recombinants formed in protoplasts were able to systemically infect plants and induce wild-type symptoms. In addition, a functional chimeric genome was generated in planta after direct coinoculation of whole plants with the defective RNA components. These results indicate that (i) RNA recombination can occur relatively efficiently in single-cell infections, (ii) trans-recombination can occur with nonreplicating viral RNA components, and (iii) functional chimeric genomes can be generated via recombination. Possible mechanisms for the formation of the recombinants are proposed, and evolutionary implications are discussed.

  17. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics.

    PubMed

    Verma, Mohit; Kumar, Vinay; Patel, Ravi K; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Chickpea is an important grain legume used as a rich source of protein in the human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides a comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea is available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. Utilities such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that the integrated information content of this database will accelerate functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html. PMID:26322998

  19. First TILLING Platform in Cucurbita pepo: A New Mutant Resource for Gene Function and Crop Improvement

    PubMed Central

    Vicente-Dólera, Nelly; Troadec, Christelle; Moya, Manuel; del Río-Celestino, Mercedes; Pomares-Viciana, Teresa; Bendahmane, Abdelhafid; Picó, Belén; Román, Belén; Gómez, Pedro

    2014-01-01

    Although the availability of genetic and genomic resources for Cucurbita pepo has increased significantly, functional genomic resources are still limited for this crop. In this direction, we have developed a high throughput reverse genetic tool: the first TILLING (Targeting Induced Local Lesions IN Genomes) resource for this species. Additionally, we have used this resource to demonstrate that the previous EMS mutant population we developed has the highest mutation density compared with other cucurbit mutant populations. The overall mutation density in this first C. pepo TILLING platform was estimated to be 1/133 Kb by screening five additional genes. In total, 58 mutations confirmed by sequencing were identified in the five targeted genes, thirteen of which were predicted to have an impact on the function of the protein. The genotype/phenotype correlation was studied in a peroxidase gene, revealing that the phenotype of seedlings homozygous for one of the isolated mutant alleles was albino. These results indicate that the TILLING approach in this species was successful at providing new mutations and can address the major challenge of linking sequence information to biological function and also the identification of novel variation for crop breeding. PMID:25386735

  20. Genomic Resources Notes Accepted 1 August 2015 - 31 September 2015.

    PubMed

    Kohler, Annegret; Kremer, Antoine; Le Provost, Grégoire; Lesur, Isabelle; Lin, Gonhua; Martin, Francis; Plomion, Christophe; Wu, Aiguo; Zhao, Fang

    2016-01-01

    This article documents the public availability of transcriptomic resources for (i) the Hazelnut tree (Corylus avellana L.) and (ii) the oriental rat flea and primary plague vector, Xenopsylla cheopis. PMID:26768197

  1. Integrative Transcriptome, Genome and Quantitative Trait Loci Resources Identify Single Nucleotide Polymorphisms in Candidate Genes for Growth Traits in Turbot.

    PubMed

    Robledo, Diego; Fernández, Carlos; Hermida, Miguel; Sciara, Andrés; Álvarez-Dios, José Antonio; Cabaleiro, Santiago; Caamaño, Rubén; Martínez, Paulino; Bouza, Carmen

    2016-01-01

    Growth traits represent a main goal in aquaculture breeding programs and may be related to adaptive variation in wild fisheries. Integrating quantitative trait loci (QTL) mapping and next generation sequencing can greatly help to identify variation in candidate genes, which can result in marker-assisted selection and better genetic structure information. Turbot is a commercially important flatfish in Europe and China, with available genomic information on QTLs and genome mapping. Muscle and liver RNA-seq from 18 individuals was carried out to obtain gene sequences and markers functionally related to growth, resulting in a total of 20,447 genes and 85,344 single nucleotide polymorphisms (SNPs). Many growth-related genes and SNPs were identified and placed in the turbot genome and genetic map to explore their co-localization with growth-QTL markers. Forty-five SNPs on growth-related genes were selected based on QTL co-localization and relevant function for growth traits. Forty-three SNPs were technically feasible and validated in a wild Atlantic population, where 91% were polymorphic. The integration of functional and structural genomic resources in turbot provides a practical approach for QTL mining in this species. Validated SNPs represent a useful set of growth-related gene markers for future association, functional and population studies in this flatfish species. PMID:26901189

  2. Integrative Transcriptome, Genome and Quantitative Trait Loci Resources Identify Single Nucleotide Polymorphisms in Candidate Genes for Growth Traits in Turbot

    PubMed Central

    Robledo, Diego; Fernández, Carlos; Hermida, Miguel; Sciara, Andrés; Álvarez-Dios, José Antonio; Cabaleiro, Santiago; Caamaño, Rubén; Martínez, Paulino; Bouza, Carmen

    2016-01-01

    Growth traits represent a main goal in aquaculture breeding programs and may be related to adaptive variation in wild fisheries. Integrating quantitative trait loci (QTL) mapping and next generation sequencing can greatly help to identify variation in candidate genes, which can result in marker-assisted selection and better genetic structure information. Turbot is a commercially important flatfish in Europe and China, with available genomic information on QTLs and genome mapping. Muscle and liver RNA-seq from 18 individuals was carried out to obtain gene sequences and markers functionally related to growth, resulting in a total of 20,447 genes and 85,344 single nucleotide polymorphisms (SNPs). Many growth-related genes and SNPs were identified and placed in the turbot genome and genetic map to explore their co-localization with growth-QTL markers. Forty-five SNPs on growth-related genes were selected based on QTL co-localization and relevant function for growth traits. Forty-three SNPs were technically feasible and validated in a wild Atlantic population, where 91% were polymorphic. The integration of functional and structural genomic resources in turbot provides a practical approach for QTL mining in this species. Validated SNPs represent a useful set of growth-related gene markers for future association, functional and population studies in this flatfish species. PMID:26901189

  3. Genetic Resources, Genome Mapping and Evolutionary Genomics of the Pig (Sus scrofa)

    PubMed Central

    Chen, Kefei; Baxter, Tara; Muir, William M.; Groenen, Martien A.; Schook, Lawrence B.

    2007-01-01

    The pig, a representative of the Artiodactyla clade, is one of the first animals domesticated, and has become an important agricultural animal as one of the major human nutritional sources of animal-based protein. The pig is also a valuable biomedical model organism for human health. The pig's importance to human health and nutrition is reflected in the decision to sequence its genome (3X). As an animal species with its wild ancestors present in the world, the pig provides a unique opportunity for tracing mammalian evolutionary history and defining signatures of selection resulting from both domestication and natural selection. Completion of the pig genome sequencing project will have significant impacts on both agriculture and human health. Following the pig whole genome sequence drafts, along with large-scale polymorphism data, it will be possible to conduct genome sweeps using association mapping, and identify signatures of selection. Here, we provide a description of the pig genome sequencing project and perspectives on utilizing genomic technologies to exploit pig genome evolution and the molecular basis for phenotypic traits for improving pig production and health. PMID:17384734

  4. Finishing genomes with limited resources: lessons from an ensemble of microbial genomes

    PubMed Central

    2010-01-01

    While new sequencing technologies have ushered in an era where microbial genomes can be easily sequenced, the goal of routinely producing high-quality draft and finished genomes in a cost-effective fashion has remained elusive. Due to shorter read lengths and limitations in library construction protocols, shotgun sequencing and assembly based on these technologies often result in fragmented assemblies. Correspondingly, while draft assemblies can be obtained in days, finishing can take many months, and hence the time and effort can only be justified for high-priority genomes and in large sequencing centers. In this work, we revisit this issue in light of our own experience in producing finished and nearly-finished genomes for a range of microbial species in a small-lab setting. These genomes were finished with surprisingly little investment in terms of time, computational effort and lab work, suggesting that the increased access to sequencing might also eventually lead to a greater proportion of finished genomes from small labs and genomics cores. PMID:20398345

  5. A genome-wide resource for the analysis of protein localisation in Drosophila

    PubMed Central

    Sarov, Mihail; Barz, Christiane; Jambor, Helena; Hein, Marco Y; Schmied, Christopher; Suchold, Dana; Stender, Bettina; Janosch, Stephan; KJ, Vinay Vikas; Krishnan, RT; Krishnamoorthy, Aishwarya; Ferreira, Irene RS; Ejsmont, Radoslaw K; Finkl, Katja; Hasse, Susanne; Kämpfer, Philipp; Plewka, Nicole; Vinis, Elisabeth; Schloissnig, Siegfried; Knust, Elisabeth; Hartenstein, Volker; Mann, Matthias; Ramaswami, Mani; VijayRaghavan, K; Tomancak, Pavel; Schnorrer, Frank

    2016-01-01

    The Drosophila genome contains >13000 protein-coding genes, the majority of which remain poorly investigated. Important reasons include the lack of antibodies or reporter constructs to visualise these proteins. Here, we present a genome-wide fosmid library of 10000 GFP-tagged clones, comprising tagged genes and most of their regulatory information. For 880 tagged proteins, we created transgenic lines, and for a total of 207 lines, we assessed protein expression and localisation in ovaries, embryos, pupae or adults by stainings and live imaging approaches. Importantly, we visualised many proteins at endogenous expression levels and found a large fraction of them localising to subcellular compartments. By applying genetic complementation tests, we estimate that about two-thirds of the tagged proteins are functional. Moreover, these tagged proteins enable interaction proteomics from developing pupae and adult flies. Taken together, this resource will boost systematic analysis of protein expression and localisation in various cellular and developmental contexts. DOI: http://dx.doi.org/10.7554/eLife.12068.001 PMID:26896675

  6. Phytozome Comparative Plant Genomics Portal

    SciTech Connect

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: (1) understand and accelerate the improvement (domestication) of bioenergy crops; (2) characterize and moderate plant response to climate change; (3) use comparative genomics to identify constrained elements and infer gene function; (4) build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work; and (5) expand functional genomic resources for Plant Flagship genomes.

  7. What's that gene (or protein)? Online resources for exploring functions of genes, transcripts, and proteins

    PubMed Central

    Hutchins, James R. A.

    2014-01-01

    The genomic era has enabled research projects that use approaches including genome-scale screens, microarray analysis, next-generation sequencing, and mass spectrometry–based proteomics to discover genes and proteins involved in biological processes. Such methods generate data sets of gene, transcript, or protein hits that researchers wish to explore to understand their properties and functions and thus their possible roles in biological systems of interest. Recent years have seen a profusion of Internet-based resources to aid this process. This review takes the viewpoint of the curious biologist wishing to explore the properties of protein-coding genes and their products, identified using genome-based technologies. Ten key questions are asked about each hit, addressing functions, phenotypes, expression, evolutionary conservation, disease association, protein structure, interactors, posttranslational modifications, and inhibitors. Answers are provided by presenting the latest publicly available resources, together with methods for hit-specific and data set–wide information retrieval, suited to any genome-based analytical technique and experimental species. The utility of these resources is demonstrated for 20 factors regulating cell proliferation. Results obtained using some of these are discussed in more depth using the p53 tumor suppressor as an example. This flexible and universally applicable approach for characterizing experimental hits helps researchers to maximize the potential of their projects for biological discovery. PMID:24723265

  8. Genomic Functionalization: The Next Revolution In Biology

    SciTech Connect

    Anderson, Peter; Schoeniger, Joseph S.; Imbro, Paula M.

    2014-07-01

    We have implemented a ligand-alignment algorithm into our developed computational pipeline for identifying specificity-determining features (SDFs) in protein-ligand complexes. Given a set of protein-ligand complex structures, the algorithm aligns the complexes by ligand rather than by the Cα-RMSD or standard approach, providing a single reference frame for extracting SDFs. We anticipate that this ligand-alignment capability will be highly useful for protein function prediction. We already have a database containing > 20 K ligand-protein complex crystal structures taken from the Protein Data Bank. By aligning these proteins to single reference frames using ligand alignment, we can submit the complexes to our pipeline for SDF extraction. The SDFs derived from this training procedure can be used as thumbprints that are hallmarks of individual enzyme classes. These SDF thumbprints may then serve as guides to the prediction of function of new unknown proteins.

  9. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir; Joachimiak, Marcin; Price, Morgan; Bates, John; Baumohl, Jason; Chivian, Dylan; Friedland, Greg; Huang, Kathleen; Keller, Keith; Novichkov, Pavel; Dubchak, Inna; Alm, Eric; Arkin, Adam

    2011-07-14

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  10. MicrobesOnline: an integrated portal for comparative and functional genomics

    SciTech Connect

    Dehal, Paramvir S.; Joachimiak, Marcin P.; Price, Morgan N.; Bates, John T.; Baumohl, Jason K.; Chivian, Dylan; Friedland, Greg D.; Huang, Katherine H.; Keller, Keith; Novichkov, Pavel S.; Dubchak, Inna L.; Alm, Eric J.; Arkin, Adam P.

    2009-09-17

    Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.

  11. The Mouse Genome Database (MGD): from genes to mice--a community resource for mouse biology.

    PubMed

    Eppig, Janan T; Bult, Carol J; Kadin, James A; Richardson, Joel E; Blake, Judith A; Anagnostopoulos, A; Baldarelli, R M; Baya, M; Beal, J S; Bello, S M; Boddy, W J; Bradt, D W; Burkart, D L; Butler, N E; Campbell, J; Cassell, M A; Corbani, L E; Cousins, S L; Dahmen, D J; Dene, H; Diehl, A D; Drabkin, H J; Frazer, K S; Frost, P; Glass, L H; Goldsmith, C W; Grant, P L; Lennon-Pierce, M; Lewis, J; Lu, I; Maltais, L J; McAndrews-Hill, M; McClellan, L; Miers, D B; Miller, L A; Ni, L; Ormsby, J E; Qi, D; Reddy, T B K; Reed, D J; Richards-Smith, B; Shaw, D R; Sinclair, R; Smith, C L; Szauter, P; Walker, M B; Walton, D O; Washburn, L L; Witham, I T; Zhu, Y

    2005-01-01

    The Mouse Genome Database (MGD) forms the core of the Mouse Genome Informatics (MGI) system (http://www.informatics.jax.org), a model organism database resource for the laboratory mouse. MGD provides essential integration of experimental knowledge for the mouse system with information annotated from both literature and online sources. MGD curates and presents consensus and experimental data representations of genotype (sequence) through phenotype information, including highly detailed reports about genes and gene products. Primary foci of integration are through representations of relationships among genes, sequences and phenotypes. MGD collaborates with other bioinformatics groups to curate a definitive set of information about the laboratory mouse and to build and implement the data and semantic standards that are essential for comparative genome analysis. Recent improvements in MGD discussed here include the enhancement of phenotype resources, the re-development of the International Mouse Strain Resource, IMSR, the update of mammalian orthology datasets and the electronic publication of classic books in mouse genetics.

  12. Genomic resources for Myzus persicae: EST sequencing, SNP identification, and microarray design

    PubMed Central

    Ramsey, John S; Wilson, Alex CC; de Vos, Martin; Sun, Qi; Tamborindeguy, Cecilia; Winfield, Agnese; Malloch, Gaynor; Smith, Dawn M; Fenton, Brian; Gray, Stewart M; Jander, Georg

    2007-01-01

    Background: The green peach aphid, Myzus persicae (Sulzer), is a world-wide insect pest capable of infesting more than 40 plant families, including many crop species. However, despite the significant damage inflicted by M. persicae in agricultural systems through direct feeding damage and by its ability to transmit plant viruses, limited genomic information is available for this species. Results: Sequencing of 16 M. persicae cDNA libraries generated 26,669 expressed sequence tags (ESTs). Aphids for library construction were raised on Arabidopsis thaliana, Nicotiana benthamiana, Brassica oleracea, B. napus, and Physalis floridana (with and without Potato leafroll virus infection). The M. persicae cDNA libraries include ones made from sexual and asexual whole aphids, guts, heads, and salivary glands. In silico comparison of cDNA libraries identified aphid genes with tissue-specific expression patterns, and gene expression that is induced by feeding on Nicotiana benthamiana. Furthermore, 2423 genes that are novel to science and potentially aphid-specific were identified. Comparison of cDNA data from three aphid lineages identified single nucleotide polymorphisms that can be used as genetic markers and, in some cases, may represent functional differences in the protein products. In particular, non-conservative amino acid substitutions in a highly expressed gut protease may be of adaptive significance for M. persicae feeding on different host plants. The Agilent eArray platform was used to design an M. persicae oligonucleotide microarray representing over 10,000 unique genes. Conclusion: New genomic resources have been developed for M. persicae, an agriculturally important insect pest. These include previously unknown sequence data, a collection of expressed genes, molecular markers, and a DNA microarray that can be used to study aphid gene expression. These resources will help elucidate the adaptations that allow M. persicae to develop compatible interactions with its

  13. Functional genomics of Buchnera and the ecology of aphid hosts.

    PubMed

    Moran, Nancy A; Degnan, Patrick H

    2006-04-01

    In many animal groups, mutualistic bacterial symbionts play a central role in host ecology, by provisioning rare nutrients and thus enabling specialization on restricted diets. Among such symbionts, genomic studies are most advanced for Buchnera, the obligate symbiont of aphids, which feed on phloem sap. The contents of the highly reduced Buchnera genomes have verified its role in aphid nutrition. Comparisons of Buchnera gene sets indicate ongoing, irreversible gene losses that are expected to affect aphid nutritional needs. Furthermore, almost all regulatory genes have been eliminated, raising the question of whether and how gene expression responds to environmental change. Microarray studies on genome-wide expression indicate that Buchnera has evolved some constitutive changes in gene expression: homologues of heat stress genes have elevated transcript levels in Buchnera (relative to other bacteria) even in the absence of stress. Additionally, the microarray results indicate that responses to heat stress and to amino acid availability are both few and modest. Observed responses are consistent with control by the few ancestral regulators retained in the genome. Initial studies on the role of host genes in mediating the symbiosis reveal distinctive expression patterns in host cells harbouring Buchnera. In the near future, a complete genome of pea aphid will accelerate progress in understanding the functional integration of aphid and Buchnera genomes. Although information for other insect symbioses is relatively limited, studies on symbionts of carpenter ants and tsetse flies indicate many similarities to Buchnera. PMID:16626452

  14. The Global Invertebrate Genomics Alliance (GIGA): developing community resources to study diverse invertebrate genomes.

    PubMed

    Bracken-Grissom, Heather; Collins, Allen G; Collins, Timothy; Crandall, Keith; Distel, Daniel; Dunn, Casey; Giribet, Gonzalo; Haddock, Steven; Knowlton, Nancy; Martindale, Mark; Medina, Mónica; Messing, Charles; O'Brien, Stephen J; Paulay, Gustav; Putnam, Nicolas; Ravasi, Timothy; Rouse, Greg W; Ryan, Joseph F; Schulze, Anja; Wörheide, Gert; Adamska, Maja; Bailly, Xavier; Breinholt, Jesse; Browne, William E; Diaz, M Christina; Evans, Nathaniel; Flot, Jean-François; Fogarty, Nicole; Johnston, Matthew; Kamel, Bishoy; Kawahara, Akito Y; Laberge, Tammy; Lavrov, Dennis; Michonneau, François; Moroz, Leonid L; Oakley, Todd; Osborne, Karen; Pomponi, Shirley A; Rhodes, Adelaide; Santos, Scott R; Satoh, Nori; Thacker, Robert W; Van de Peer, Yves; Voolstra, Christian R; Welch, David Mark; Winston, Judith; Zhou, Xin

    2014-01-01

    Over 95% of all metazoan (animal) species comprise the "invertebrates," but very few genomes from these organisms have been sequenced. We have, therefore, formed a "Global Invertebrate Genomics Alliance" (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site (http://giga.nova.edu) has been launched to facilitate this collaborative venture.

  15. The Global Invertebrate Genomics Alliance (GIGA): Developing Community Resources to Study Diverse Invertebrate Genomes

    PubMed Central

    2014-01-01

    Over 95% of all metazoan (animal) species comprise the “invertebrates,” but very few genomes from these organisms have been sequenced. We have, therefore, formed a “Global Invertebrate Genomics Alliance” (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site (http://giga.nova.edu) has been launched to facilitate this collaborative venture. PMID:24336862

  17. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology.

    PubMed

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang; Wang, Yadong; Jin, Shuilin; Cheng, Liang

    2016-01-01

    Increasing evidence indicates that functional annotation of the human genome at the molecular and phenotype levels is very important for the systematic analysis of genes. In this study, we present a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description in GeneRIFs is annotated by the text mining tool Open Biomedical Annotator (OBA), and each Entrez gene is mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the GeneRIF records for human genes, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource for the human genome. The high consistency of the term frequency of individual genes (Pearson correlation = 0.6401, p = 2.2e-16) and the gene frequency of individual terms (Pearson correlation = 0.1298, p = 3.686e-14) between GeneRIFs and GOA shows that our annotation resource is very reliable. PMID:27635398
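
    The consistency check described above compares, gene by gene, how many terms each resource assigns and summarizes the agreement with a Pearson correlation. A minimal sketch of that comparison follows. The gene-term pairs are invented toy data, the helper names are hypothetical, and nothing here reproduces the Gene2Function pipeline or its reported values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)


def term_frequency(pairs):
    """Count distinct terms per gene from (gene, term) pairs."""
    counts = {}
    for gene, _term in set(pairs):  # set() drops repeated pairs
        counts[gene] = counts.get(gene, 0) + 1
    return counts


# Toy annotation sets (hypothetical, for illustration only).
resource_a = [("TP53", "apoptosis"), ("TP53", "DNA repair"),
              ("TP53", "cell cycle"), ("BRCA1", "DNA repair"),
              ("BRCA1", "cell cycle"), ("EGFR", "signaling")]
resource_b = [("TP53", "apoptosis"), ("TP53", "DNA repair"),
              ("TP53", "cell cycle"), ("TP53", "transcription"),
              ("BRCA1", "DNA repair"), ("BRCA1", "transcription"),
              ("EGFR", "signaling")]

freq_a = term_frequency(resource_a)
freq_b = term_frequency(resource_b)
shared_genes = sorted(set(freq_a) & set(freq_b))  # compare shared genes only
r = pearson([freq_a[g] for g in shared_genes],
            [freq_b[g] for g in shared_genes])
print(f"Pearson r over {len(shared_genes)} shared genes: {r:.3f}")
```

    The same function applies to the second comparison in the abstract by swapping roles, that is, counting genes per term instead of terms per gene.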

  18. Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology

    PubMed Central

    Hu, Yang; Zhou, Wenyang; Ren, Jun; Dong, Lixiang

    2016-01-01

    Increasing evidence indicates that functional annotation of the human genome at the molecular and phenotype levels is very important for the systematic analysis of genes. In this study, we present a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description in GeneRIFs is annotated by the text mining tool Open Biomedical Annotator (OBA), and each Entrez gene is mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the GeneRIF records for human genes, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource for the human genome. The high consistency of the term frequency of individual genes (Pearson correlation = 0.6401, p = 2.2e-16) and the gene frequency of individual terms (Pearson correlation = 0.1298, p = 3.686e-14) between GeneRIFs and GOA shows that our annotation resource is very reliable. PMID:27635398

  1. Update on Genomic Databases and Resources at the National Center for Biotechnology Information.

    PubMed

    Tatusova, Tatiana

    2016-01-01

    The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expression, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website. A text-based search and retrieval system provides a fast and easy way to navigate across diverse biological databases. Comparative genome analysis tools lead to further understanding of evolutionary processes, quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring order to this genome sequence shockwave and improve the usability of associated data. PMID:27115625

  2. MELOGEN: an EST database for melon functional genomics

    PubMed Central

    Gonzalez-Ibeas, Daniel; Blanca, José; Roig, Cristina; González-To, Mireia; Picó, Belén; Truniger, Verónica; Gómez, Pedro; Deleu, Wim; Caño-Delgado, Ana; Arús, Pere; Nuez, Fernando; Garcia-Mas, Jordi; Puigdomènech, Pere; Aranda, Miguel A

    2007-01-01

    Background: Melon (Cucumis melo L.) is one of the most important fleshy fruits for fresh consumption. Despite this, few genomic resources exist for this species. To facilitate the discovery of genes involved in essential traits, such as fruit development, fruit maturation and disease resistance, and to speed up the process of breeding new and better adapted melon varieties, we have produced a large collection of expressed sequence tags (ESTs) from eight normalized cDNA libraries from different tissues in different physiological conditions. Results: We determined over 30,000 ESTs that were clustered into 16,637 non-redundant sequences or unigenes, comprising 6,023 tentative consensus sequences (contigs) and 10,614 unclustered sequences (singletons). Many potential molecular markers were identified in the melon dataset: 1,052 potential simple sequence repeats (SSRs) and 356 single nucleotide polymorphisms (SNPs) were found. Sixty-nine percent of the melon unigenes showed a significant similarity with proteins in databases. Functional classification of the unigenes was carried out following the Gene Ontology scheme. In total, 9,402 unigenes were mapped to one or more ontology. Remarkably, the distributions of melon and Arabidopsis unigenes followed similar tendencies, suggesting that the melon dataset is representative of the whole melon transcriptome. Bioinformatic analyses primarily focused on potential precursors of melon micro RNAs (miRNAs) in the melon dataset, but many other genes potentially controlling disease resistance and fruit quality traits were also identified. Patterns of transcript accumulation were characterised by Real-Time-qPCR for 20 of these genes. Conclusion: The collection of ESTs characterised here represents a substantial increase on the genetic information available for melon. A database (MELOGEN) which contains all EST sequences, contig images and several tools for analysis and data mining has been created. This set of sequences constitutes

  3. Functional annotation from the genome sequence of the giant panda.

    PubMed

    Huo, Tong; Zhang, Yinjie; Lin, Jianping

    2012-08-01

    The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided into two groups: 20,179 proteins whose functions can be predicted by GeneScan formed the known-function group, whereas 822 proteins whose functions cannot be predicted by GeneScan comprised the unknown-function group. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins under the assumption that proteins related to the panda-specific traits in the unknown-function group exist. After this filtering procedure, we identified 32 proteins (2 of which are membrane proteins) specific to the giant panda genome as compared against the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting protein prediction using the Database of Interacting Proteins. Nineteen proteins were predicted to be zinc-binding proteins, thus affecting the activities of nucleic acids. The 32 panda-specific proteins will be further investigated by structural and functional analysis.

  4. Functional genomics and its implications for molecular medicine.

    PubMed

    Broeckel, Ulrich; Maresso, Karen; Kugathasan, Subra

    2006-10-01

    Technological development in genetics and genomics provides unprecedented possibilities to identify the underlying molecular basis of many common diseases. With the availability of the human genome sequence and growing information on the most frequent DNA variations, combined with molecular analysis at the RNA expression and protein levels, diseases might be characterized in the future at the molecular level. Describing gene function and the specific role of DNA, RNA, and proteins in the disease process provides novel diagnostic tools and treatments. Ultimately, how the unique genetic signature of an individual influences the risk and prognosis of disease will be the basis for individualized medicine in the years to come.

  5. Prospects and challenges for the conservation of farm animal genomic resources, 2015-2025.

    PubMed

    Bruford, Michael W; Ginja, Catarina; Hoffmann, Irene; Joost, Stéphane; Orozco-terWengel, Pablo; Alberto, Florian J; Amaral, Andreia J; Barbato, Mario; Biscarini, Filippo; Colli, Licia; Costa, Mafalda; Curik, Ino; Duruz, Solange; Ferenčaković, Maja; Fischer, Daniel; Fitak, Robert; Groeneveld, Linn F; Hall, Stephen J G; Hanotte, Olivier; Hassan, Faiz-Ul; Helsen, Philippe; Iacolina, Laura; Kantanen, Juha; Leempoel, Kevin; Lenstra, Johannes A; Ajmone-Marsan, Paolo; Masembe, Charles; Megens, Hendrik-Jan; Miele, Mara; Neuditschko, Markus; Nicolazzi, Ezequiel L; Pompanon, François; Roosen, Jutta; Sevane, Natalia; Smetko, Anamarija; Štambuk, Anamaria; Streeter, Ian; Stucki, Sylvie; Supakorn, China; Telo Da Gama, Luis; Tixier-Boichard, Michèle; Wegmann, Daniel; Zhan, Xiangjiang

    2015-01-01

    Livestock conservation practice is changing rapidly in light of policy developments, climate change and diversifying market demands. The last decade has seen a step change in technology and analytical approaches available to define, manage and conserve Farm Animal Genomic Resources (FAnGR). However, these rapid changes pose challenges for FAnGR conservation in terms of technological continuity, analytical capacity and integrative methodologies needed to fully exploit new, multidimensional data. The final conference of the ESF Genomic Resources program aimed to address these interdisciplinary problems in an attempt to contribute to the agenda for research and policy development directions during the coming decade. By 2020, according to the Convention on Biodiversity's Aichi Target 13, signatories should ensure that "…the genetic diversity of …farmed and domesticated animals and of wild relatives …is maintained, and strategies have been developed and implemented for minimizing genetic erosion and safeguarding their genetic diversity." However, the real extent of genetic erosion is very difficult to measure using current data. Therefore, this challenging target demands better coverage, understanding and utilization of genomic and environmental data, the development of optimized ways to integrate these data with social and other sciences and policy analysis to enable more flexible, evidence-based models to underpin FAnGR conservation. At the conference, we attempted to identify the most important problems for effective livestock genomic resource conservation during the next decade. Twenty priority questions were identified that could be broadly categorized into challenges related to methodology, analytical approaches, data management and conservation. It should be acknowledged here that while the focus of our meeting was predominantly around genetics, genomics and animal science, many of the practical challenges facing conservation of genomic resources are

  7. Prospects and challenges for the conservation of farm animal genomic resources, 2015-2025

    PubMed Central

    Bruford, Michael W.; Ginja, Catarina; Hoffmann, Irene; Joost, Stéphane; Orozco-terWengel, Pablo; Alberto, Florian J.; Amaral, Andreia J.; Barbato, Mario; Biscarini, Filippo; Colli, Licia; Costa, Mafalda; Curik, Ino; Duruz, Solange; Ferenčaković, Maja; Fischer, Daniel; Fitak, Robert; Groeneveld, Linn F.; Hall, Stephen J. G.; Hanotte, Olivier; Hassan, Faiz-ul; Helsen, Philippe; Iacolina, Laura; Kantanen, Juha; Leempoel, Kevin; Lenstra, Johannes A.; Ajmone-Marsan, Paolo; Masembe, Charles; Megens, Hendrik-Jan; Miele, Mara; Neuditschko, Markus; Nicolazzi, Ezequiel L.; Pompanon, François; Roosen, Jutta; Sevane, Natalia; Smetko, Anamarija; Štambuk, Anamaria; Streeter, Ian; Stucki, Sylvie; Supakorn, China; Telo Da Gama, Luis; Tixier-Boichard, Michèle; Wegmann, Daniel; Zhan, Xiangjiang

    2015-01-01

    Livestock conservation practice is changing rapidly in light of policy developments, climate change and diversifying market demands. The last decade has seen a step change in technology and analytical approaches available to define, manage and conserve Farm Animal Genomic Resources (FAnGR). However, these rapid changes pose challenges for FAnGR conservation in terms of technological continuity, analytical capacity and integrative methodologies needed to fully exploit new, multidimensional data. The final conference of the ESF Genomic Resources program aimed to address these interdisciplinary problems in an attempt to contribute to the agenda for research and policy development directions during the coming decade. By 2020, according to the Convention on Biodiversity's Aichi Target 13, signatories should ensure that “…the genetic diversity of …farmed and domesticated animals and of wild relatives …is maintained, and strategies have been developed and implemented for minimizing genetic erosion and safeguarding their genetic diversity.” However, the real extent of genetic erosion is very difficult to measure using current data. Therefore, this challenging target demands better coverage, understanding and utilization of genomic and environmental data, the development of optimized ways to integrate these data with social and other sciences and policy analysis to enable more flexible, evidence-based models to underpin FAnGR conservation. At the conference, we attempted to identify the most important problems for effective livestock genomic resource conservation during the next decade. Twenty priority questions were identified that could be broadly categorized into challenges related to methodology, analytical approaches, data management and conservation. It should be acknowledged here that while the focus of our meeting was predominantly around genetics, genomics and animal science, many of the practical challenges facing conservation of genomic resources are

  8. The Awesome Power of Yeast Evolutionary Genetics: New Genome Sequences and Strain Resources for the Saccharomyces sensu stricto Genus

    PubMed Central

    Scannell, Devin R.; Zill, Oliver A.; Rokas, Antonis; Payen, Celia; Dunham, Maitreya J.; Eisen, Michael B.; Rine, Jasper; Johnston, Mark; Hittinger, Chris Todd

    2011-01-01

    High-quality, well-annotated genome sequences and standardized laboratory strains fuel experimental and evolutionary research. We present improved genome sequences of three species of Saccharomyces sensu stricto yeasts: S. bayanus var. uvarum (CBS 7001), S. kudriavzevii (IFO 1802T and ZP 591), and S. mikatae (IFO 1815T), and describe their comparison to the genomes of S. cerevisiae and S. paradoxus. The new sequences, derived by assembling millions of short DNA sequence reads together with previously published Sanger shotgun reads, have vastly greater long-range continuity and far fewer gaps than the previously available genome sequences. New gene predictions defined a set of 5261 protein-coding orthologs across the five most commonly studied Saccharomyces yeasts, enabling a re-examination of the tempo and mode of yeast gene evolution and improved inferences of species-specific gains and losses. To facilitate experimental investigations, we generated genetically marked, stable haploid strains for all three of these Saccharomyces species. These nearly complete genome sequences and the collection of genetically marked strains provide a valuable toolset for comparative studies of gene function, metabolism, and evolution, and render Saccharomyces sensu stricto the most experimentally tractable model genus. These resources are freely available and accessible through www.SaccharomycesSensuStricto.org. PMID:22384314

  9. Transcriptome and genome sequencing uncovers functional variation in humans

    PubMed Central

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-01-01

    Summary: Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. PMID:24037378

  10. A whole-genome, radiation hybrid mapping resource of hexaploid wheat.

    PubMed

    Tiwari, Vijay K; Heesacker, Adam; Riera-Lizarazu, Oscar; Gunn, Hilary; Wang, Shichen; Wang, Yi; Gu, Young Q; Paux, Etienne; Koo, Dal-Hoe; Kumar, Ajay; Luo, Ming-Cheng; Lazo, Gerard; Zemetra, Robert; Akhunov, Eduard; Friebe, Bernd; Poland, Jesse; Gill, Bikram S; Kianian, Shahryar; Leonard, Jeffrey M

    2016-04-01

    Generating a contiguous, ordered reference sequence of a complex genome such as hexaploid wheat (2n = 6x = 42; approximately 17 Gb) is a challenging task due to its large, highly repetitive, and allopolyploid genome. In wheat, ordering of whole-genome or hierarchical shotgun sequencing contigs is primarily based on recombination and comparative genomics-based approaches. However, comparative genomics approaches are limited to syntenic inference and recombination is suppressed within the pericentromeric regions of wheat chromosomes; thus, precise ordering of physical maps and sequenced contigs across the whole genome using these approaches is nearly impossible. We developed a whole-genome radiation hybrid (WGRH) resource and tested it by genotyping a set of 115 randomly selected lines on a high-density single nucleotide polymorphism (SNP) array. At the whole-genome level, 26 299 SNP markers were mapped on the RH panel and provided an average mapping resolution of approximately 248 Kb/cR1500 with a total map length of 6866 cR1500. The 7296 unique mapping bins provided a five- to eight-fold higher resolution than genetic maps used in similar studies. Most strikingly, the RH map had uniform bin resolution across the entire chromosome(s), including pericentromeric regions. Our research provides a valuable and low-cost resource for anchoring and ordering sequenced BAC and next generation sequencing (NGS) contigs. The WGRH developed for reference wheat line Chinese Spring (CS-WGRH) will be useful for anchoring and ordering sequenced BAC and NGS based contigs for assembling a high-quality reference sequence of hexaploid wheat. Additionally, this study provides an excellent model for developing similar resources for other polyploid species.

  11. Beyond Drosophila: RNAi in vivo and functional genomics in insects.

    PubMed

    Bellés, Xavier

    2010-01-01

    The increasing availability of insect genomes has revealed a large number of genes with unknown functions and the resulting problem of how to discover these functions. The RNA interference (RNAi) technique, which generates loss-of-function phenotypes by depletion of a chosen transcript, can help to overcome this challenge. RNAi can unveil the functions of new genes, lead to the discovery of new functions for old genes, and find the genes for old functions. Moreover, the possibility of studying the functions of homologous genes in different species can allow comparisons of the genetic networks regulating a given function in different insect groups, thereby facilitating an evolutionary insight into developmental processes. RNAi also has drawbacks and obscure points, however, such as those related to differences in species sensitivity. Disentangling these differences is one of the main challenges in the RNAi field.

  12. Genomic islands predict functional adaptation in marine actinobacteria

    SciTech Connect

    Penn, Kevin; Jenkins, Caroline; Nett, Markus; Udwary, Daniel; Gontang, Erin; McGlinchey, Ryan; Foster, Brian; Lapidus, Alla; Podell, Sheila; Allen, Eric; Moore, Bradley; Jensen, Paul

    2009-04-01

    Linking functional traits to bacterial phylogeny remains a fundamental but elusive goal of microbial ecology [1]. Without this information, it becomes impossible to resolve meaningful units of diversity and the mechanisms by which bacteria interact with each other and adapt to environmental change. Ecological adaptations among bacterial populations have been linked to genomic islands, strain-specific regions of DNA that house functionally adaptive traits [2]. In the case of environmental bacteria, these traits are largely inferred from bioinformatic or gene expression analyses [2], thus leaving few examples in which the functions of island genes have been experimentally characterized. Here we report the complete genome sequences of Salinispora tropica and S. arenicola, the first cultured, obligate marine Actinobacteria [3]. These two species inhabit benthic marine environments and dedicate 8-10% of their genomes to the biosynthesis of secondary metabolites. Despite a close phylogenetic relationship, 25 of 37 secondary metabolic pathways are species-specific and located within 21 genomic islands, thus providing new evidence linking secondary metabolism to ecological adaptation. Species-specific differences are also observed in CRISPR sequences, suggesting that variations in phage immunity provide fitness advantages that contribute to the cosmopolitan distribution of S. arenicola [4]. The two Salinispora genomes have evolved by complex processes that include the duplication and acquisition of secondary metabolite genes, the products of which provide immediate opportunities for molecular diversification and ecological adaptation. Evidence that secondary metabolic pathways are exchanged by Horizontal Gene Transfer (HGT) yet are fixed among globally distributed populations [5] supports a functional role for their products and suggests that pathway acquisition represents a previously unrecognized force driving bacterial diversification

  13. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions

    PubMed Central

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    Recent advances in next generation sequencing technology have enabled the fast and low-cost detection of genetic variants across the entire human genome, making whole-genome sequencing a growing trend in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of functionally damaging effects of human genetic variants is still lacking, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. PMID:26989155

  14. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

    PubMed

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    Recent advances in next generation sequencing technology have enabled the fast and low-cost detection of genetic variants across the entire human genome, making whole-genome sequencing a growing trend in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of functionally damaging effects of human genetic variants is still lacking, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. PMID:26989155

  15. Functional Genomics Using the Saccharomyces cerevisiae Yeast Deletion Collections.

    PubMed

    Nislow, Corey; Wong, Lai Hong; Lee, Amy Huei-Yi; Giaever, Guri

    2016-01-01

    Constructed by a consortium of 16 laboratories, the Saccharomyces genome-wide deletion collections have, for the past decade, provided a powerful, rapid, and inexpensive approach for functional profiling of the yeast genome. Loss-of-function deletion mutants were systematically created using a polymerase chain reaction (PCR)-based gene deletion strategy to generate a start-to-stop codon replacement of each open reading frame by homologous recombination. Each strain carries two molecular barcodes that serve as unique strain identifiers, enabling their growth to be analyzed in parallel and the fitness contribution of each gene to be quantitatively assessed by hybridization to high-density oligonucleotide arrays or through the use of next-generation sequencing technologies. Functional profiling of the deletion collections, using either strain-by-strain or parallel assays, provides an unbiased approach to systematically survey the yeast genome. The Saccharomyces yeast deletion collections have proved immensely powerful in contributing to the understanding of gene function, including functional relationships between genes and genetic pathways in response to diverse genetic and environmental perturbations. PMID:27587784

  16. Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing

    SciTech Connect

    Nierman, William C.

    2000-02-14

    At TIGR, human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were performed with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp, for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends, representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb of contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.
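
    The sequence totals quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only illustrative: the function names are hypothetical, and the human genome size of roughly 3.0 Gb is an assumption that the report does not state.

```python
# Back-of-the-envelope check of the BAC end-sequence figures quoted above.
# ASSUMPTION: a human genome size of ~3.0 Gb; the abstract does not give one.
HUMAN_GENOME_BP = 3.0e9


def total_bases(n_reads: int, mean_read_len_bp: float) -> float:
    """Total sequenced bases for n_reads reads of a given mean length."""
    return n_reads * mean_read_len_bp


def genome_fraction(bases: float, genome_bp: float = HUMAN_GENOME_BP) -> float:
    """Fraction of the genome represented by the given number of bases."""
    return bases / genome_bp


# The abstract reports ">300,000" reads, so this product is a lower bound.
lower_bound_bp = total_bases(300_000, 460)
reported_bp = 141e6  # "a total of 141 Mb" as quoted in the abstract

print(f"lower bound from read count: {lower_bound_bp / 1e6:.0f} Mb")
print(f"reported total as genome fraction: {genome_fraction(reported_bp):.1%}")
```

    Under that assumed genome size, the reported 141 Mb corresponds to about 4.7% of the genome, and the 138 Mb lower bound from 300,000 reads of 460 bp is consistent with the reported total.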

  17. Using The ENCODE Resource For Functional Annotation Of Genetic Variants

    PubMed Central

    Pazin, Michael J.

    2015-01-01

    Summary: This article illustrates the use of the Encyclopedia of DNA Elements (ENCODE) resource to generate or refine hypotheses from genomic data on disease and other phenotypic traits. First, the goals and history of ENCODE and related epigenomics projects are reviewed. Second, the rationale for ENCODE and the major data types used by ENCODE are briefly described, as are some standard heuristics for their interpretation. Third, the use of the ENCODE resource is examined. Standard use cases for ENCODE, accessing the ENCODE resource, and accessing data from related projects are discussed. Finally, access to resources from ENCODE and related epigenomics projects is reviewed. (Although the focus of this article is the use of ENCODE data, some of the same approaches can be used with the data from other projects.) While this article is focused on the case of interpreting genetic variation data, essentially the same approaches can be used with the ENCODE resource, or with data from other projects, to interpret epigenomic and gene regulation data, with appropriate modification (Rakyan et al. 2011; Ng et al. 2012). Such approaches could allow investigators to use genomic methods to study environmental and stochastic processes, in addition to genetic processes. PMID:25762420

  18. Genomic resources for multiple species in the Drosophila ananassae species group.

    PubMed

    Signor, Sarah; Seher, Thaddeus; Kopp, Artyom

    2013-01-01

    The development of genomic resources in non-model taxa is essential for understanding the genetic basis of biological diversity. Although the genomes of many Drosophila species have been sequenced, most of the phenotypic diversity in this genus remains to be explored. To facilitate the genetic analysis of interspecific and intraspecific variation, we have generated new genomic resources for seven species and subspecies in the D. ananassae species subgroup. We have generated large amounts of transcriptome sequence data for D. ercepeae, D. merina, D. bipectinata, D. malerkotliana malerkotliana, D. m. pallens, D. pseudoananassae pseudoananassae, and D. p. nigrens. De novo assembly resulted in contigs covering more than half of the predicted transcriptome and matching an average of 59% of annotated genes in the complete genome of D. ananassae. Most contigs, corresponding to an average of 49% of D. ananassae genes, contain sequence polymorphisms that can be used as genetic markers. Subsets of these markers were validated by genotyping the progeny of inter- and intraspecific crosses. The ananassae subgroup is an excellent model system for examining the molecular basis of speciation and phenotypic evolution. The new genomic resources will facilitate the genetic analysis of inter- and intraspecific differences in this lineage. Transcriptome sequencing provides a simple and cost-effective way to identify molecular markers at nearly single-gene density, and is equally applicable to any non-model taxon. PMID:23639891

  19. Genomic resources for multiple species in the Drosophila ananassae species group

    PubMed Central

    Signor, Sarah; Seher, Thaddeus; Kopp, Artyom

    2013-01-01

    The development of genomic resources in non-model taxa is essential for understanding the genetic basis of biological diversity. Although the genomes of many Drosophila species have been sequenced, most of the phenotypic diversity in this genus remains to be explored. To facilitate the genetic analysis of interspecific and intraspecific variation, we have generated new genomic resources for seven species and subspecies in the D. ananassae species subgroup. We have generated large amounts of transcriptome sequence data for D. ercepeae, D. merina, D. bipectinata, D. malerkotliana malerkotliana, D. m. pallens, D. pseudoananassae pseudoananassae, and D. p. nigrens. De novo assembly resulted in contigs covering more than half of the predicted transcriptome and matching an average of 59% of annotated genes in the complete genome of D. ananassae. Most contigs, corresponding to an average of 49% of D. ananassae genes, contain sequence polymorphisms that can be used as genetic markers. Subsets of these markers were validated by genotyping the progeny of inter- and intraspecific crosses. The ananassae subgroup is an excellent model system for examining the molecular basis of speciation and phenotypic evolution. The new genomic resources will facilitate the genetic analysis of inter- and intraspecific differences in this lineage. Transcriptome sequencing provides a simple and cost-effective way to identify molecular markers at nearly single-gene density, and is equally applicable to any non-model taxon. PMID:23639891

  20. VectorBase: improvements to a bioinformatics resource for invertebrate vector genomics.

    PubMed

    Megy, Karine; Emrich, Scott J; Lawson, Daniel; Campbell, David; Dialynas, Emmanuel; Hughes, Daniel S T; Koscielny, Gautier; Louis, Christos; Maccallum, Robert M; Redmond, Seth N; Sheehan, Andrew; Topalis, Pantelis; Wilson, Derek

    2012-01-01

    VectorBase (http://www.vectorbase.org) is a NIAID-supported bioinformatics resource for invertebrate vectors of human pathogens. It hosts data for nine genomes: mosquitoes (three Anopheles gambiae genomes, Aedes aegypti and Culex quinquefasciatus), tick (Ixodes scapularis), body louse (Pediculus humanus), kissing bug (Rhodnius prolixus) and tsetse fly (Glossina morsitans). Hosted data range from genomic features and expression data to population genetics and ontologies. We describe improvements and integration of new data that expand our taxonomic coverage. Releases are bi-monthly and include the delivery of preliminary data for emerging genomes. Frequent updates of the genome browser provide VectorBase users with increasing options for visualizing their own high-throughput data. One major development is a new population biology resource for storing genomic variations, insecticide resistance data and their associated metadata. It takes advantage of improved ontologies and controlled vocabularies. Combined, these new features ensure timely release of multiple types of data in the public domain while helping overcome the bottlenecks of bioinformatics and annotation by engaging with our user community. PMID:22135296

  1. Towards a TILLING platform for functional genomics in Piel de Sapo melons

    PubMed Central

    2011-01-01

    Background: The availability of genetic and genomic resources for melon has increased significantly, but functional genomics resources are still limited for this crop. TILLING is a powerful reverse genetics approach that can be utilized to generate novel mutations in candidate genes. A TILLING resource is available for cantalupensis melons, but not for inodorus melons, the other main commercial group. Results: A new ethyl methanesulfonate-mutagenized (EMS) melon population was generated for the first time in an andromonoecious non-climacteric inodorus Piel de Sapo genetic background. Diverse mutant phenotypes in seedlings, vines and fruits were observed, some of which were of possible commercial interest. The population was first screened for mutations in three target genes involved in disease resistance and fruit quality (Cm-PDS, Cm-eIF4E and Cm-eIF(iso)4E). The same genes were also tilled in the available monoecious and climacteric cantalupensis EMS melon population. The overall mutation density in this first Piel de Sapo TILLING platform was estimated to be 1 mutation/1.5 Mb by screening four additional genes (Cm-ACO1, Cm-NOR, Cm-DET1 and Cm-DHS). Thirty-three point mutations were found for the seven gene targets, six of which were predicted to have an impact on the function of the protein. The genotype/phenotype correlation was demonstrated for a loss-of-function mutation in the Phytoene desaturase gene, which is involved in carotenoid biosynthesis. Conclusions: The TILLING approach was successful at providing new mutations in the genetic background of Piel de Sapo in most of the analyzed genes, even in genes for which natural variation is extremely low. This new resource will facilitate reverse genetics studies in non-climacteric melons, contributing materially to future genomic and breeding studies. PMID:21834982

  2. Genome-scale resources for Thermoanaerobacterium saccharolyticum

    SciTech Connect

    Currie, Devin H.; Raman, Babu; Gowen, Christopher M.; Tschaplinski, Timothy J.; Land, Miriam L.; Brown, Steven D.; Covalla, Sean; Klingeman, Dawn Marie; Yang, Zamin Koo; Engle, Nancy L.; Johnson, Courtney M.; Rodriguez, Miguel A.; Shaw, A. Joe; Kenealy, William R.; Lynd, Lee R.; Fong, Stephen S.; Mielenz, Jonathan R.; Davison, Brian H.; Hogsett, David A.; Herring, Christopher D.

    2015-06-26

    Thermoanaerobacterium saccharolyticum is a hemicellulose-degrading thermophilic anaerobe that was previously engineered to produce ethanol at high yield. For this research, a major project was undertaken to develop this organism into an industrial biocatalyst, but the lack of genome information and resources was recognized early on as a key limitation.

  3. Functional genomics of Lactobacillus casei establishment in the gut

    PubMed Central

    Licandro-Seraut, Hélène; Scornec, Hélène; Pédron, Thierry; Cavin, Jean-François; Sansonetti, Philippe J.

    2014-01-01

    Although the composition of the gut microbiota and its symbiotic contribution to key host physiological functions are well established, little is known as yet about the bacterial factors that account for this symbiosis. We selected Lactobacillus casei as a model microorganism to proceed to genomewide identification of the functions required for a symbiont to establish colonization in the gut. As a result of our recent development of a transposon-mutagenesis tool that overcomes the barrier that had prevented L. casei random mutagenesis, we developed a signature-tagged mutagenesis approach combining whole-genome reverse genetics using a set of tagged transposons and in vivo screening using the rabbit ligated ileal loop model. After sequencing transposon insertion sites in 9,250 random mutants, we assembled a library of 1,110 independent mutants, all disrupted in a different gene, that provides a representative view of the L. casei genome. By determining the relative quantity of each of the 1,110 mutants before and after the in vivo challenge, we identified a core of 47 L. casei genes necessary for its establishment in the gut. They are involved in housekeeping functions, metabolism (sugar, amino acids), cell wall biogenesis, and adaptation to environment. Hence we provide what is, to our knowledge, the first global functional genomics analysis of L. casei symbiosis. PMID:25024222

  4. Functional genomic analysis of the Drosophila immune response.

    PubMed

    Valanne, Susanna

    2014-01-01

    Drosophila melanogaster has been widely used as a model organism for over a century now, and also as an immunological research model for over 20 years. With the emergence of RNA interference (RNAi) in Drosophila as a robust tool to silence genes of interest, large-scale or genome-wide functional analysis has become a popular way of studying the Drosophila immune response in cell culture. Drosophila immunity is composed of cellular and humoral immunity mechanisms, and especially the systemic, humoral response pathways have been extensively dissected using the functional genomic approach. Although most components of the main immune pathways had already been found using traditional genetic screening techniques, important findings including pathway components, positive and negative regulators and modifiers have been made with RNAi screening. Additionally, RNAi screening has produced new information on host-pathogen interactions related to the pathogenesis of many microbial species. PMID:23707784

  5. Functional annotation of introns in mitochondrial genome--a brief review.

    PubMed

    Anandakumar, Shanmugam; Ravindran, Suda Parimala; Shanmughavel, Piramanayagam

    2016-01-01

    The present study aims to decipher the non-coding regions present in mitochondrial genomes that cause diseases in humans and to predict their functional roles through a comparative genomics approach followed by functional annotation of these segments.

  6. Nucleotide diversity analysis highlights functionally important genomic regions

    PubMed Central

    Tatarinova, Tatiana V.; Chekalin, Evgeny; Nikolsky, Yuri; Bruskin, Sergey; Chebotarov, Dmitry; McNally, Kenneth L.; Alexandrov, Nickolai

    2016-01-01

    We analyzed functionality and relative distribution of genetic variants across the complete Oryza sativa genome, using the 40 million single nucleotide polymorphisms (SNPs) dataset from the 3,000 Rice Genomes Project (http://snp-seek.irri.org), the largest and highest density SNP collection for any higher plant. We have shown that the DNA-binding transcription factors (TFs) are the most conserved group of genes, whereas kinases and membrane-localized transporters are the most variable ones. TFs may be conserved because they belong to some of the most connected regulatory hubs that modulate transcription of vast downstream gene networks, whereas signaling kinases and transporters need to adapt rapidly to changing environmental conditions. In general, the observed profound patterns of nucleotide variability reveal functionally important genomic regions. As expected, nucleotide diversity is much higher in intergenic regions than within gene bodies (regions spanning gene models), and protein-coding sequences are more conserved than untranslated gene regions. We have observed a sharp decline in nucleotide diversity that begins at about 250 nucleotides upstream of the transcription start and reaches minimal diversity exactly at the transcription start. We found the transcription termination sites to have remarkably symmetrical patterns of SNP density, implying presence of functional sites near transcription termination. Also, nucleotide diversity was significantly lower near 3′ UTRs, the area rich with regulatory regions. PMID:27774999
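
    Nucleotide diversity, the quantity compared across genomic regions in this abstract, can be illustrated with a short sketch. This is a minimal example of the standard pairwise-difference estimator for biallelic SNPs and is not the authors' pipeline; the function names and the toy input are hypothetical.

```python
from typing import Iterable, Tuple


def site_diversity(alt_count: int, n_chromosomes: int) -> float:
    """Average pairwise difference at one biallelic site.

    With k alternate alleles among n sampled chromosomes, k * (n - k) of the
    n * (n - 1) / 2 possible pairs differ at this site.
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    if not 0 <= alt_count <= n_chromosomes:
        raise ValueError("alt_count must lie between 0 and n_chromosomes")
    k, n = alt_count, n_chromosomes
    return 2.0 * k * (n - k) / (n * (n - 1))


def window_diversity(snps: Iterable[Tuple[int, int]],
                     window_length_bp: int) -> float:
    """Per-base nucleotide diversity for a window.

    snps holds (alt_count, n_chromosomes) pairs for the variable sites in the
    window; invariant sites contribute zero and only enter via the length.
    """
    if window_length_bp <= 0:
        raise ValueError("window length must be positive")
    return sum(site_diversity(k, n) for k, n in snps) / window_length_bp


if __name__ == "__main__":
    # Toy window: two SNPs in 100 bp (hypothetical counts).
    print(window_diversity([(1, 2), (2, 4)], 100))
```

    Computing this value in sliding windows relative to an annotated feature, such as the transcription start site, is one way to produce the kind of positional diversity profile the abstract describes.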

  7. Functional Profiling Using the Saccharomyces Genome Deletion Project Collections.

    PubMed

    Nislow, Corey; Wong, Lai Hong; Lee, Amy Huei-Yi; Giaever, Guri

    2016-01-01

    The ability to measure and quantify the fitness of an entire organism requires considerably more complex approaches than simply using traditional "omic" methods that examine, for example, the abundance of RNA transcripts, proteins, or metabolites. The yeast deletion collections represent the only systematic, comprehensive set of null alleles for any organism in which such fitness measurements can be assayed. Generated by the Saccharomyces Genome Deletion Project, these collections allow the systematic and parallel analysis of gene functions using any measurable phenotype. The unique 20-bp molecular barcodes engineered into the genome of each deletion strain facilitate the massively parallel analysis of individual fitness. Here, we present functional genomic protocols for use with the yeast deletion collections. We describe how to maintain, propagate, and store the deletion collections and how to perform growth fitness assays on single and parallel screening platforms. Phenotypic fitness analyses of the yeast mutants, described in brief here, provide important insights into biological functions, mechanisms of drug action, and response to environmental stresses. It is important to bear in mind that the specific assays described in this protocol represent some of the many ways in which these collections can be assayed, and in this description particular attention is paid to maximizing throughput using growth as the phenotypic measure. PMID:27587776
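
    The barcode-based pooled fitness assay can be sketched as a comparison of barcode abundances between a control and a treated pool. The abstract does not specify the analysis, so the normalization, the pseudocount and the strain names below are assumptions chosen for illustration.

```python
import math
from typing import Dict


def normalize(counts: Dict[str, int], pseudocount: float = 1.0) -> Dict[str, float]:
    """Convert raw barcode counts into relative abundances.

    The pseudocount (an ASSUMED smoothing choice) avoids taking log2 of zero
    for strains that drop out of a pool.
    """
    total = sum(counts.values()) + pseudocount * len(counts)
    return {strain: (c + pseudocount) / total for strain, c in counts.items()}


def log2_fitness_ratios(control: Dict[str, int],
                        treatment: Dict[str, int],
                        pseudocount: float = 1.0) -> Dict[str, float]:
    """log2(treatment abundance / control abundance) per strain.

    Negative values indicate strains depleted under treatment, i.e. deletions
    that reduce fitness in that condition.
    """
    ctrl = normalize(control, pseudocount)
    trt = normalize(treatment, pseudocount)
    return {s: math.log2(trt[s] / ctrl[s]) for s in ctrl if s in trt}


if __name__ == "__main__":
    # Hypothetical barcode counts for three deletion strains.
    control = {"strainA": 1000, "strainB": 1000, "strainC": 1000}
    treated = {"strainA": 1100, "strainB": 950, "strainC": 40}
    for strain, score in sorted(log2_fitness_ratios(control, treated).items()):
        print(f"{strain}\t{score:+.2f}")
```

    In this toy input, the third strain is strongly depleted after treatment, which is the kind of signal used to flag a gene as important for growth under the tested condition.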

  8. Genomic leftovers: identifying novel microsatellites, over-represented motifs and functional elements in the human genome.

    PubMed

    Fonville, Natalie C; Velmurugan, Karthik Raja; Tae, Hongseok; Vaksman, Zalman; McIver, Lauren J; Garner, Harold R

    2016-01-01

    The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate its utility in identifying contigs from reads that do not map to the genome reference. The analysis of these samples yielded 790 novel extra-referential concordant contigs that are observed in more than one sample. We searched for evidence of functional elements in the concordant contigs in two ways: (1) BLAST-ing each contig against normal RNA-Seq samples and (2) checking for predicted functional elements using GlimmerHMM. Of the 790 concordant contigs, 37 had an exact match to at least one RNA-Seq read; 15 aligned to more than 100 RNA-Seq reads. Of the 249 concordant contigs predicted by GlimmerHMM to have functional elements, 6 had at least one exact RNA-Seq match. BLAST-ing these novel contigs against all publicly available sequences confirmed that they were found in human and chimpanzee BAC and FOSMID clones sequenced as part of the original human genome project. These extra-referential contigs predominantly contained pentameric repeats, especially two motifs: AATGG and GTGGA. PMID:27278669

  9. Genomic leftovers: identifying novel microsatellites, over-represented motifs and functional elements in the human genome

    PubMed Central

    Fonville, Natalie C.; Velmurugan, Karthik Raja; Tae, Hongseok; Vaksman, Zalman; McIver, Lauren J.; Garner, Harold R.

    2016-01-01

    The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate its utility in identifying contigs from reads that do not map to the genome reference. The analysis of these samples yielded 790 novel extra-referential concordant contigs that are observed in more than one sample. We searched for evidence of functional elements in the concordant contigs in two ways: (1) BLAST-ing each contig against normal RNA-Seq samples and (2) checking for predicted functional elements using GlimmerHMM. Of the 790 concordant contigs, 37 had an exact match to at least one RNA-Seq read; 15 aligned to more than 100 RNA-Seq reads. Of the 249 concordant contigs predicted by GlimmerHMM to have functional elements, 6 had at least one exact RNA-Seq match. BLAST-ing these novel contigs against all publicly available sequences confirmed that they were found in human and chimpanzee BAC and FOSMID clones sequenced as part of the original human genome project. These extra-referential contigs predominantly contained pentameric repeats, especially two motifs: AATGG and GTGGA. PMID:27278669

  10. Genomic leftovers: identifying novel microsatellites, over-represented motifs and functional elements in the human genome.

    PubMed

    Fonville, Natalie C; Velmurugan, Karthik Raja; Tae, Hongseok; Vaksman, Zalman; McIver, Lauren J; Garner, Harold R

    2016-06-09

    The human genome is 99% complete. This study contributes to filling the 1% gap by enriching previously unknown repeat regions called microsatellites (MST). We devised a Global MST Enrichment (GME) kit to enrich and next-generation sequence 2 colorectal cell lines and 16 normal human samples to illustrate its utility in identifying contigs from reads that do not map to the genome reference. The analysis of these samples yielded 790 novel extra-referential concordant contigs that are observed in more than one sample. We searched for evidence of functional elements in the concordant contigs in two ways: (1) BLAST-ing each contig against normal RNA-Seq samples and (2) checking for predicted functional elements using GlimmerHMM. Of the 790 concordant contigs, 37 had an exact match to at least one RNA-Seq read; 15 aligned to more than 100 RNA-Seq reads. Of the 249 concordant contigs predicted by GlimmerHMM to have functional elements, 6 had at least one exact RNA-Seq match. BLAST-ing these novel contigs against all publicly available sequences confirmed that they were found in human and chimpanzee BAC and FOSMID clones sequenced as part of the original human genome project. These extra-referential contigs predominantly contained pentameric repeats, especially two motifs: AATGG and GTGGA.
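
    The pentameric motifs named in this abstract (AATGG and GTGGA) are easy to look for in a contig sequence. The sketch below is a simple illustration with hypothetical function names and a made-up toy sequence; it does not reproduce the study's enrichment or assembly workflow, and it scans one strand only.

```python
import re


def count_motif(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    seq, motif = seq.upper(), motif.upper()
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def longest_tandem_run(seq: str, motif: str) -> int:
    """Number of repeat units in the longest uninterrupted tandem run of motif."""
    motif = motif.upper()
    runs = re.findall(f"(?:{re.escape(motif)})+", seq.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


if __name__ == "__main__":
    # Made-up toy contig, not data from the study.
    contig = "AATGGAATGGAATGGCCGTGGA"
    for motif in ("AATGG", "GTGGA"):
        print(motif, count_motif(contig, motif), longest_tandem_run(contig, motif))
```

    A fuller analysis would also scan the reverse complement of each motif, since a repeat can be reported on either strand.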

  11. Genomics Encyclopedia of Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB): a resource for microsymbiont genomes (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    SciTech Connect

    Reeve, Wayne

    2013-03-01

    Wayne Reeve of Murdoch University on "Genomics Encyclopedia of Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB): a resource for microsymbiont genomes" at the 8th Annual Genomics of Energy & Environment Meeting on March 27, 2013 in Walnut Creek, Calif.

  12. Insect genome content phylogeny and functional annotation of core insect genomes.

    PubMed

    Rosenfeld, Jeffrey A; Foox, Jonathan; DeSalle, Rob

    2016-04-01

    Twenty-one fully sequenced and well-annotated insect genomes were examined for genome content in a phylogenetic context. Gene presence/absence matrices and phylogenetic trees were constructed using several phylogenetic criteria. The role of e-value on phylogenetic analysis and genome content characterization is examined using scaled e-value cutoffs and a single linkage clustering approach to orthology determination. Previous studies have focused on the role of gene loss in terminals in the insect tree of life. The present study examines several common ancestral nodes in the insect tree. We suggest that the common ancestors of major insect groups like Diptera, Hymenoptera, Hemiptera and Holometabola experienced more gene gain than gene loss. This suggests that as major insect groups arose, their genomic repertoire expanded through gene duplication (segmental duplications), followed by contraction by gene loss in specific terminal lineages. In addition, we examine the functional significance of the loss and gain of genes in the divergence of some of the major insect groups. PMID:26549428
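
    Single linkage clustering under an e-value cutoff, followed by construction of a gene presence/absence matrix, can be sketched as follows. This is a generic illustration rather than the authors' code: the gene identifiers, the hit list and the cutoffs are hypothetical, and every gene named in a hit is assumed to be listed in genes.

```python
from typing import Dict, Iterable, List, Set, Tuple

Hit = Tuple[str, str, float]  # (gene_a, gene_b, e_value) from a similarity search


def single_linkage_clusters(genes: Iterable[str], hits: Iterable[Hit],
                            evalue_cutoff: float) -> List[Set[str]]:
    """Group genes connected by any chain of hits with e-value <= cutoff."""
    parent: Dict[str, str] = {g: g for g in genes}

    def find(g: str) -> str:
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b, e_value in hits:
        if e_value <= evalue_cutoff:
            parent[find(a)] = find(b)

    clusters: Dict[str, Set[str]] = {}
    for g in parent:
        clusters.setdefault(find(g), set()).add(g)
    # Sort for a deterministic cluster order.
    return sorted(clusters.values(), key=lambda c: sorted(c))


def presence_absence(clusters: List[Set[str]], species_of: Dict[str, str],
                     species: List[str]) -> List[List[int]]:
    """One row per cluster, one column per species; 1 if the species has a member."""
    return [[int(any(species_of[g] == s for g in c)) for s in species]
            for c in clusters]


if __name__ == "__main__":
    # Hypothetical genes and hits.
    species_of = {"fly|g1": "fly", "bee|g1": "bee", "bug|g1": "bug", "fly|g2": "fly"}
    hits = [("fly|g1", "bee|g1", 1e-50), ("bee|g1", "bug|g1", 1e-20),
            ("fly|g2", "bug|g1", 1e-3)]
    for cutoff in (1e-10, 1e-2):
        clusters = single_linkage_clusters(species_of, hits, cutoff)
        print(cutoff, presence_absence(clusters, species_of, ["fly", "bee", "bug"]))
```

    Loosening the cutoff merges clusters, which changes the presence/absence matrix and therefore any tree built from it; this is the sensitivity to e-value that the abstract sets out to examine.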

  13. Functional genomic studies of aldo-keto reductases.

    PubMed

    Petrash, J M; Murthy, B S; Young, M; Morris, K; Rikimaru, L; Griest, T A; Harter, T

    2001-01-30

    Aldose reductase (AR) is considered a potential mediator of diabetic complications and is a drug target for inhibitors of diabetic retinopathy and neuropathy in clinical trials. However, the physiological role of this enzyme still has not been established. Since effective inhibition of diabetic complications will require early intervention, it is important to delineate whether AR fulfills a physiological role that cannot be compensated by an alternate aldo-keto reductase. Functional genomics provides a variety of powerful new tools to probe the physiological roles of individual genes, especially those comprising gene families. Several eucaryotic genomes have been sequenced and annotated, including yeast, nematode and fly. To probe the function of AR, we have chosen to utilize the budding yeast Saccharomyces cerevisiae as a potential model system. Unlike Caenorhabditis elegans and D. melanogaster, yeast provides a more desirable system for our studies because its genome is manipulated more readily and is able to sustain multiple gene deletions in the presence of either drug or auxotrophic selectable markers. Using BLAST searches against the human AR gene sequence, we identified six genes in the complete S. cerevisiae genome with strong homology to AR. In all cases, amino acids thought to play important catalytic roles in human AR are conserved in the yeast AR-like genes. All six yeast AR-like open reading frames (ORFs) have been cloned into plasmid expression vectors. Substrate and AR inhibitor specificities have been surveyed on four of the enzyme forms to identify which are the most functionally similar to human AR. Our data reveal that two of the enzymes (YDR368Wp and YHR104Wp) are notable for their similarity to human AR in terms of activity with aldoses and substituted aromatic aldehydes. Ongoing studies are aimed at characterizing the phenotypes of yeast strains containing single and multiple knockouts of the AR-like genes. PMID:11306085

  14. Genome Engineering Using Targeted Oligonucleotide Libraries and Functional Selection

    PubMed Central

    Diner, Elie J.; Garza-Sánchez, Fernando; Hayes, Christopher S.

    2011-01-01

    The λ phage Red proteins greatly enhance homologous recombination in Escherichia coli. Red-mediated recombination or “recombineering” can be used to construct targeted gene deletions as well as to introduce point mutations into the genome. Here, we describe our method for scanning mutagenesis using recombineered oligonucleotide libraries. This approach entails randomization of specific codons within a target gene, followed by functional selection to isolate mutants. Oligonucleotide library mutagenesis has generated hundreds of novel antibiotic resistance mutations in genes encoding ribosomal proteins, and should be applicable to other systems for which functional selections exist. PMID:21815087

  15. Functional annotations for the Saccharomyces cerevisiae genome: the knowns and the known unknowns

    PubMed Central

    Christie, Karen R.; Hong, Eurie L.; Cherry, J. Michael

    2011-01-01

    The quest to characterize each of the genes of the yeast Saccharomyces cerevisiae has propelled the development and application of novel high-throughput (HTP) experimental techniques. To handle the enormous amount of information generated by these techniques, new bioinformatics tools and resources are needed. Gene Ontology (GO) annotations curated by the Saccharomyces Genome Database (SGD) have facilitated the development of algorithms that analyze HTP data and help predict functions for poorly characterized genes in S. cerevisiae and other organisms. Here, we describe how published results are incorporated into GO annotations at SGD and why researchers can benefit from using these resources wisely to analyze their HTP data and predict gene functions. PMID:19577472

  16. Preterm Birth Genome Project (PGP) -- validation of resources for preterm birth genome-wide studies.

    PubMed

    Pennell, Craig E; Vadillo-Ortega, Felipe; Olson, David M; Ha, Eun-Hee; Williams, Scott; Frayling, Tim M; Dolan, Siobhan; Katz, Michael; Merialdi, Mario; Menon, Ramkumar

    2013-01-01

    We performed a series of quality control (QC) analyses to assess the usability of DNA collected and processed from different countries utilizing different DNA extraction techniques prior to genome-wide association studies (GWAS). The quality of DNA collected utilizing four different DNA extraction techniques and the impact of shipping DNA at different temperatures on array performance were evaluated. Fifteen maternal-fetal pairs were used from four countries. DNA was extracted using four approaches: whole blood, blood spots with whole genome amplification (WGA), saliva and buccal swab. Samples were sent to a genotyping facility, either on dry ice or at room temperature, and genotyped using Affymetrix SNP array 6.0. QC measures included extraction techniques, effect of shipping temperatures, accuracy and Mendelian concordance. Significantly fewer (50%) single nucleotide polymorphisms (SNPs) passed QC metrics for buccal swab DNA (P < 0.0001) due to missing genotype data (P < 0.0001). Whole blood or saliva DNA had the highest call rates (99.2 ± 0.4% and 99.3 ± 0.2%, respectively) and Mendelian concordance. Shipment temperature had no effect. DNA from blood or saliva had the highest call rate accuracy, and buccal swabs had the lowest. DNA extracted from blood, saliva and blood spots was found suitable for GWAS in our study.
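
    Two of the QC metrics named in this abstract, call rate and Mendelian concordance in maternal-fetal pairs, can be illustrated with a small sketch. The genotype coding (count of alternate alleles, None for a missing call) and the function names are assumptions made for the example, not the study's software.

```python
from typing import Optional, Sequence

# ASSUMED coding: 0, 1 or 2 copies of the alternate allele; None = no call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of SNPs with a non-missing genotype call."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    return sum(g is not None for g in genotypes) / len(genotypes)


def duo_mendelian_concordance(mother: Sequence[Genotype],
                              child: Sequence[Genotype]) -> float:
    """Fraction of jointly called SNPs consistent with maternal transmission.

    A mother-child pair is inconsistent only when the two are opposite
    homozygotes (0 vs 2), because the child must carry one maternal allele.
    """
    if len(mother) != len(child):
        raise ValueError("genotype vectors must have the same length")
    compared = consistent = 0
    for m, c in zip(mother, child):
        if m is None or c is None:
            continue  # skip SNPs missing in either sample
        compared += 1
        if abs(m - c) < 2:
            consistent += 1
    if compared == 0:
        raise ValueError("no SNPs called in both samples")
    return consistent / compared


if __name__ == "__main__":
    # Hypothetical genotypes at five SNPs.
    mother = [0, 1, 2, 2, None]
    child = [0, 2, 0, 1, 1]
    print(call_rate(mother), duo_mendelian_concordance(mother, child))
```

    With only one parent genotyped, this check detects a subset of genotyping errors; a full trio would allow stricter Mendelian tests.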

  17. Resource base influences genome-wide DNA methylation levels in wild baboons (Papio cynocephalus).

    PubMed

    Lea, Amanda J; Altmann, Jeanne; Alberts, Susan C; Tung, Jenny

    2016-04-01

    Variation in resource availability commonly exerts strong effects on fitness-related traits in wild animals. However, we know little about the molecular mechanisms that mediate these effects, or about their persistence over time. To address these questions, we profiled genome-wide whole-blood DNA methylation levels in two sets of wild baboons: (i) 'wild-feeding' baboons that foraged naturally in a savanna environment and (ii) 'Lodge' baboons that had ready access to spatially concentrated human food scraps, resulting in high feeding efficiency and low daily travel distances. We identified 1014 sites (0.20% of sites tested) that were differentially methylated between wild-feeding and Lodge baboons, providing the first evidence that resource availability shapes the epigenome in a wild mammal. Differentially methylated sites tended to occur in contiguous stretches (i.e., in differentially methylated regions or DMRs), in promoters and enhancers, and near metabolism-related genes, supporting their functional importance in gene regulation. In agreement, reporter assay experiments confirmed that methylation at the largest identified DMR, located in the promoter of a key glycolysis-related gene, was sufficient to causally drive changes in gene expression. Intriguingly, all dispersing males carried a consistent epigenetic signature of their membership in a wild-feeding group, regardless of whether males dispersed into or out of this group as adults. Together, our findings support a role for DNA methylation in mediating ecological effects on phenotypic traits in the wild and emphasize the dynamic environmental sensitivity of DNA methylation levels across the life course.

  18. Event-based text mining for biology and functional genomics.

    PubMed

    Ananiadou, Sophia; Thompson, Paul; Nawaz, Raheel; McNaught, John; Kell, Douglas B

    2015-05-01

    The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of 'events', i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research.
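
    The notion of an "event" as a typed, structured relationship between entities, and of structured queries over extracted events, can be made concrete with a small data-structure sketch. The class names, role labels and toy events below are hypothetical and do not reproduce the schema of any particular corpus or extraction system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entity:
    text: str   # surface form found in the sentence
    kind: str   # e.g. "Protein" (hypothetical label set)


@dataclass
class Event:
    event_type: str                      # category of the relationship
    trigger: str                         # word that signals the event
    arguments: Dict[str, Entity] = field(default_factory=dict)  # role -> entity


def find_events(events: List[Event], event_type: Optional[str] = None,
                role: Optional[str] = None,
                entity_text: Optional[str] = None) -> List[Event]:
    """Return events matching every filter that is not None."""
    hits = []
    for ev in events:
        if event_type is not None and ev.event_type != event_type:
            continue
        # Restrict to one role if given, otherwise consider all arguments.
        candidates = ([ev.arguments[role]] if role in ev.arguments else []) \
            if role is not None else list(ev.arguments.values())
        if role is not None and not candidates:
            continue
        if entity_text is not None and not any(e.text == entity_text for e in candidates):
            continue
        hits.append(ev)
    return hits


if __name__ == "__main__":
    # Toy events, as might be extracted from "MEK phosphorylates ERK" and
    # "ERK binds MEK" (illustrative sentences only).
    events = [
        Event("Phosphorylation", "phosphorylates",
              {"Cause": Entity("MEK", "Protein"), "Theme": Entity("ERK", "Protein")}),
        Event("Binding", "binds",
              {"Theme": Entity("ERK", "Protein"), "Theme2": Entity("MEK", "Protein")}),
    ]
    for ev in find_events(events, event_type="Phosphorylation",
                          role="Theme", entity_text="ERK"):
        print(ev.event_type, ev.trigger, {r: e.text for r, e in ev.arguments.items()})
```

    Keeping the role of each participant explicit is what lets a query distinguish "phosphorylation of ERK" from "phosphorylation by ERK", which a plain keyword search cannot do.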

  19. A functional isopenicillin N synthase in an animal genome.

    PubMed

    Roelofs, Dick; Timmermans, Martijn J T N; Hensbergen, Paul; van Leeuwen, Hans; Koopman, Jessica; Faddeeva, Anna; Suring, Wouter; de Boer, Tjalf E; Mariën, Janine; Boer, Remon; Bovenberg, Roel; van Straalen, Nico M

    2013-03-01

    Horizontal transfer of genes is widespread among prokaryotes, but is less common between microorganisms and animals. Here, we present evidence for the presence of a gene encoding functional isopenicillin N synthase, an enzyme in the β-lactam antibiotics biosynthesis pathway, in the genome of the soil-living collembolan species, Folsomia candida (FcIPNS). At present, this gene is only known from bacteria and fungi, as is the capacity to produce β-lactam antibiotics. The FcIPNS gene was located on two genomic contigs, was physically linked to a predicted insect ATP-binding cassette transporter gene, and contained three introns each flanked by eukaryotic splicing recognition sites (GT/AG). Homology searches revealed no similarity between these introns and the FcIPNS regions of bacteria or fungi. All amino acids conserved across bacteria and fungi were also conserved in F. candida. Recombinant FcIPNS was able to convert its substrate amino δ-(l-α-aminoadipyl)-l-cysteinyl-d-valine into isopenicillin N, providing strong evidence that FcIPNS is functional. Phylogenetic analysis clustered FcIPNS outside the bacterial IPNS clade, and also outside the fungal IPNS clade, suggesting an ancient gene transfer followed by divergence in the F. candida genome. In conclusion, the data suggest that the soil-living collembolan F. candida has assimilated the capacity for antibacterial activity by horizontal gene transfer, which may be an important adaptive trait in the microbe-dominated soil ecosystem. PMID:23204388

  20. Whole-genome sequence-based analysis of thyroid function.

    PubMed

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J; Traglia, Michela; Brown, Suzanne J; Mullin, Benjamin H; Shihab, Hashem A; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R; Beilby, John P; Charoen, Pimphen; Danecek, Petr; Dudbridge, Frank; Forgetta, Vincenzo; Greenwood, Celia; Grundberg, Elin; Johnson, Andrew D; Hui, Jennie; Lim, Ee M; McCarthy, Shane; Muddyman, Dawn; Panicker, Vijay; Perry, John R B; Bell, Jordana T; Yuan, Wei; Relton, Caroline; Gaunt, Tom; Schlessinger, David; Abecasis, Goncalo; Cucca, Francesco; Surdulescu, Gabriela L; Woltersdorf, Wolfram; Zeggini, Eleftheria; Zheng, Hou-Feng; Toniolo, Daniela; Dayan, Colin M; Naitza, Silvia; Walsh, John P; Spector, Tim; Davey Smith, George; Durbin, Richard; Richards, J Brent; Sanna, Serena; Soranzo, Nicole; Timpson, Nicholas J; Wilson, Scott G

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10⁻⁹) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10⁻¹⁴). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10⁻⁹) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10⁻¹¹). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function. PMID:25743335
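
    The abstract above splits variants into common (MAF≥1%) and rare (MAF<1%) classes. As a minimal sketch of that bookkeeping, assuming a biallelic variant summarised by genotype counts, the minor allele frequency and its class could be computed as below. The function names and the count-based input are illustrative assumptions, not part of the study's pipeline.

```python
def minor_allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Minor allele frequency of a biallelic variant from genotype counts.

    Hypothetical helper: each diploid sample contributes two alleles.
    """
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped samples")
    alt_freq = (2 * n_hom_alt + n_het) / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def frequency_class(maf, threshold=0.01):
    """Label a variant 'common' (MAF >= threshold) or 'rare' (MAF < threshold)."""
    return "common" if maf >= threshold else "rare"


if __name__ == "__main__":
    maf = minor_allele_frequency(n_hom_ref=81, n_het=18, n_hom_alt=1)
    print(maf, frequency_class(maf))
```

    The 1% threshold is taken from the abstract; everything else in the sketch is a simplification (real pipelines also handle missing calls and multiallelic sites).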

  1. Whole-genome sequence-based analysis of thyroid function.

    PubMed

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J; Traglia, Michela; Brown, Suzanne J; Mullin, Benjamin H; Shihab, Hashem A; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R; Beilby, John P; Charoen, Pimphen; Danecek, Petr; Dudbridge, Frank; Forgetta, Vincenzo; Greenwood, Celia; Grundberg, Elin; Johnson, Andrew D; Hui, Jennie; Lim, Ee M; McCarthy, Shane; Muddyman, Dawn; Panicker, Vijay; Perry, John R B; Bell, Jordana T; Yuan, Wei; Relton, Caroline; Gaunt, Tom; Schlessinger, David; Abecasis, Goncalo; Cucca, Francesco; Surdulescu, Gabriela L; Woltersdorf, Wolfram; Zeggini, Eleftheria; Zheng, Hou-Feng; Toniolo, Daniela; Dayan, Colin M; Naitza, Silvia; Walsh, John P; Spector, Tim; Davey Smith, George; Durbin, Richard; Richards, J Brent; Sanna, Serena; Soranzo, Nicole; Timpson, Nicholas J; Wilson, Scott G

    2015-03-06

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10⁻⁹) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10⁻¹⁴). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10⁻⁹) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10⁻¹¹). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function.

  2. Genomic resources for invertebrate vectors of human pathogens, and the role of VectorBase.

    PubMed

    Megy, K; Hammond, M; Lawson, D; Bruggner, R V; Birney, E; Collins, F H

    2009-05-01

    High-throughput genome sequencing techniques have now reached vector biology, with an emphasis on those species that are vectors of human pathogens. The first mosquito to be sequenced was Anopheles gambiae, the vector for Plasmodium parasites that cause malaria. Further mosquitoes have followed: Aedes aegypti (yellow fever and dengue fever vector) and Culex pipiens (lymphatic filariasis and West Nile fever vector). Species that are currently being sequenced include the body louse Pediculus humanus (typhus vector), the triatomine Rhodnius prolixus (Chagas disease vector) and the tick Ixodes scapularis (Lyme disease vector). The motivations for sequencing vector genomes are to further understand vector biology, with an eye on developing new control strategies (for example novel chemical attractants or repellents) or understanding the limitations of current strategies (for example the mechanism of insecticide resistance); to analyse the mechanisms driving their evolution; and to perform an exhaustive analysis of the gene repertory. The proliferation of genomic data creates the need for efficient and accessible storage. We present VectorBase, a genomic resource centre that is both involved in the annotation of vector genomes and acts as a portal for access to the genomic information (http://www.vectorbase.org).

  3. AgBase: a unified resource for functional analysis in agriculture.

    PubMed

    McCarthy, Fiona M; Bridges, Susan M; Wang, Nan; Magee, G Bryce; Williams, W Paul; Luthe, Dawn S; Burgess, Shane C

    2007-01-01

    Analysis of functional genomics (transcriptomics and proteomics) datasets is hindered in agricultural species because agricultural genome sequences have relatively poor structural and functional annotation. To facilitate systems biology in these species we have established the curated, web-accessible, public resource 'AgBase' (www.agbase.msstate.edu). We have improved the structural annotation of agriculturally important genomes by experimentally confirming the in vivo expression of electronically predicted proteins and by proteogenomic mapping. Proteogenomic data are available from the AgBase proteogenomics link. We contribute Gene Ontology (GO) annotations and we provide a two-tier system of GO annotations for users. The 'GO Consortium' gene association file contains the most rigorous GO annotations based solely on experimental data. The 'Community' gene association file contains GO annotations based on expert community knowledge (annotations drawn directly from author statements and annotations submitted by the community) and annotations for predicted proteins. We have developed two tools for proteomics analysis and these are freely available on request. A suite of tools for analyzing functional genomics datasets using the GO is available online at the AgBase site. We encourage and publicly acknowledge GO annotations from researchers and provide an online mechanism for agricultural researchers to submit requests for GO annotations. PMID:17135208

  4. Bordetella pertussis evolution in the (functional) genomics era.

    PubMed

    Belcher, Thomas; Preston, Andrew

    2015-11-01

    The incidence of whooping cough caused by Bordetella pertussis in many developed countries has risen dramatically in recent years. This has been linked to the use of an acellular pertussis vaccine. In addition, it is thought that B. pertussis is adapting under acellular vaccine-mediated immune selection pressure, towards vaccine escape. Genomics-based approaches have revolutionized the ability to resolve the fine structure of the global B. pertussis population and its evolution during the era of vaccination. Here, we discuss the current picture of B. pertussis evolution and diversity in the light of the current resurgence, highlight important questions raised by recent studies in this area and discuss the role that functional genomics can play in addressing current knowledge gaps.

  5. Adapting CRISPR/Cas9 for functional genomics screens.

    PubMed

    Malina, Abba; Katigbak, Alexandra; Cencic, Regina; Maïga, Rayelle Itoua; Robert, Francis; Miura, Hisashi; Pelletier, Jerry

    2014-01-01

    The use of CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein) for targeted genome editing has been widely adopted and is considered a "game changing" technology. The ease and rapidity by which this approach can be used to modify endogenous loci in a wide spectrum of cell types and organisms makes it a powerful tool for customizable genetic modifications as well as for large-scale functional genomics. The development of retrovirus-based expression platforms to simultaneously deliver the Cas9 nuclease and single guide (sg) RNAs provides unique opportunities by which to ensure stable and reproducible expression of the editing tools and a broad cell targeting spectrum, while remaining compatible with in vivo genetic screens. Here, we describe methods and highlight considerations for designing and generating sgRNA libraries in all-in-one retroviral vectors for such applications.

  6. Functional and Evolutionary Insights from the Genomes of Three Parasitoid Nasonia Species

    PubMed Central

    2010-01-01

    We report here genome sequences and comparative analyses of three closely related parasitoid wasps: Nasonia vitripennis, N. giraulti, and N. longicornis. Parasitoids are important regulators of arthropod populations, including major agricultural pests and disease vectors, and Nasonia is an emerging genetic model, particularly for evolutionary and developmental genetics. Key findings include the identification of a functional DNA methylation tool kit; hymenopteran-specific genes including diverse venoms; lateral gene transfers among Pox viruses, Wolbachia, and Nasonia; and the rapid evolution of genes involved in nuclear-mitochondrial interactions that are implicated in speciation. Newly developed genome resources advance Nasonia for genetic research, accelerate mapping and cloning of quantitative trait loci, and will ultimately provide tools and knowledge for further increasing the utility of parasitoids as pest insect-control agents. PMID:20075255

  7. Integrative pathway genomics of lung function and airflow obstruction.

    PubMed

    Gharib, Sina A; Loth, Daan W; Soler Artigas, María; Birkland, Timothy P; Wilk, Jemma B; Wain, Louise V; Brody, Jennifer A; Obeidat, Ma'en; Hancock, Dana B; Tang, Wenbo; Rawal, Rajesh; Boezen, H Marike; Imboden, Medea; Huffman, Jennifer E; Lahousse, Lies; Alves, Alexessander C; Manichaikul, Ani; Hui, Jennie; Morrison, Alanna C; Ramasamy, Adaikalavan; Smith, Albert Vernon; Gudnason, Vilmundur; Surakka, Ida; Vitart, Veronique; Evans, David M; Strachan, David P; Deary, Ian J; Hofman, Albert; Gläser, Sven; Wilson, James F; North, Kari E; Zhao, Jing Hua; Heckbert, Susan R; Jarvis, Deborah L; Probst-Hensch, Nicole; Schulz, Holger; Barr, R Graham; Jarvelin, Marjo-Riitta; O'Connor, George T; Kähönen, Mika; Cassano, Patricia A; Hysi, Pirro G; Dupuis, Josée; Hayward, Caroline; Psaty, Bruce M; Hall, Ian P; Parks, William C; Tobin, Martin D; London, Stephanie J

    2015-12-01

    Chronic respiratory disorders are important contributors to the global burden of disease. Genome-wide association studies (GWASs) of lung function measures have identified several trait-associated loci, but explain only a modest portion of the phenotypic variability. We postulated that integrating pathway-based methods with GWASs of pulmonary function and airflow obstruction would identify a broader repertoire of genes and processes influencing these traits. We performed two independent GWASs of lung function and applied gene set enrichment analysis to one of the studies and validated the results using the second GWAS. We identified 131 significantly enriched gene sets associated with lung function and clustered them into larger biological modules involved in diverse processes including development, immunity, cell signaling, proliferation and arachidonic acid. We found that enrichment of gene sets was not driven by GWAS-significant variants or loci, but instead by those with less stringent association P-values. Next, we applied pathway enrichment analysis to a meta-analyzed GWAS of airflow obstruction. We identified several biologic modules that functionally overlapped with those associated with pulmonary function. However, differences were also noted, including enrichment of extracellular matrix (ECM) processes specifically in the airflow obstruction study. Network analysis of the ECM module implicated a candidate gene, matrix metalloproteinase 10 (MMP10), as a putative disease target. We used a knockout mouse model to functionally validate MMP10's role in influencing lung's susceptibility to cigarette smoke-induced emphysema. By integrating pathway analysis with population-based genomics, we unraveled biologic processes underlying pulmonary function traits and identified a candidate gene for obstructive lung disease. PMID:26395457
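
    The abstract above relies on gene set enrichment analysis applied to GWAS results. As a rough illustration of what "enrichment" means, the sketch below uses a simpler, different technique: a one-sided hypergeometric over-representation test. It is not the method used in the study, and the function name and arguments are assumptions made for this example only.

```python
from math import comb


def overrepresentation_p(universe, set_size, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe: number of genes tested overall
    set_size: number of those genes that belong to the gene set
    selected: number of genes picked as trait-associated
    overlap:  number of picked genes that fall in the gene set
    """
    if not (0 <= set_size <= universe and 0 <= selected <= universe):
        raise ValueError("set_size and selected must lie between 0 and universe")
    upper = min(set_size, selected)
    total = comb(universe, selected)
    # Sum the probability of observing `overlap` or more set members by chance.
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers chosen only to exercise the function.
    print(overrepresentation_p(universe=10, set_size=5, selected=5, overlap=5))
```

    A small p-value suggests the gene set contains more trait-associated genes than expected by chance; rank-based methods such as the enrichment analysis named in the abstract avoid the hard cut-off this test needs.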

  9. Optimizing de novo transcriptome assembly and extending genomic resources for striped catfish (Pangasianodon hypophthalmus).

    PubMed

    Thanh, Nguyen Minh; Jung, Hyungtaek; Lyons, Russell E; Njaci, Isaac; Yoon, Byoung-Ha; Chand, Vincent; Tuan, Nguyen Viet; Thu, Vo Thi Minh; Mather, Peter

    2015-10-01

    Striped catfish (Pangasianodon hypophthalmus) is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The culture industry, however, is facing a significant challenge from saltwater intrusion into many low-lying coastal provinces across the Mekong Delta as a result of predicted climate change impacts. Developing genomic resources for this species can facilitate the production of improved culture lines that can withstand raised salinity conditions, and so we have applied high-throughput Ion Torrent sequencing of transcriptome libraries from six target osmoregulatory organs from striped catfish as a genomic resource for use in future selection strategies. We obtained 12,177,770 reads after trimming and processing, with an average length of 97 bp. De novo assemblies were generated using CLC Genomic Workbench, Trinity and Velvet/Oases, with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 66,451 contigs with an average length of 478 bp and an N50 length of 506 bp. A total of 37,969 contigs (57%) possessed significant similarity with proteins in the non-redundant database. Comparative analyses revealed that a significant number of contigs matched sequences reported in other teleost fishes, ranging in similarity from 45.2% with Atlantic cod to 52% with zebrafish. In addition, 28,879 simple sequence repeats (SSRs) and 55,721 single nucleotide polymorphisms (SNPs) were detected in the striped catfish transcriptome. The sequence collection generated in the current study represents the most comprehensive genomic resource for P. hypophthalmus available to date. Our results illustrate the utility of next-generation sequencing as an efficient tool for constructing a large genomic database for marker development in non-model species. PMID:25979246
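
    The abstract above reports an N50 length alongside the average contig length. For readers unfamiliar with the statistic, the sketch below shows the usual definition: the length of the contig at which the running total of contigs, sorted longest first, reaches half of the assembly size. The function name and the toy lengths are assumptions for illustration and are unrelated to the catfish assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running total to half the assembly
        # size defines the N50.
        if running >= half_total:
            return length


if __name__ == "__main__":
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    print("mean:", sum(toy_lengths) / len(toy_lengths), "N50:", n50(toy_lengths))
```

    Because N50 weights long contigs more heavily than the mean does, the two values reported for an assembly can differ, as they do in the abstract.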

  10. Genome-wide functional annotation of Phomopsis longicolla isolate MSPL 10-6.

    PubMed

    Darwish, Omar; Li, Shuxian; Matthews, Benjamin; Alkharouf, Nadim

    2016-06-01

    Phomopsis seed decay of soybean is caused primarily by the seed-borne fungal pathogen Phomopsis longicolla (syn. Diaporthe longicolla). This disease severely decreases soybean seed quality, reduces seedling vigor and stand establishment, and suppresses yield. It is one of the most economically important soybean diseases. In this study we annotated the entire genome of P. longicolla isolate MSPL 10-6, which was isolated from field-grown soybean seed in Mississippi, USA. This study represents the first reported genome-wide functional annotation of a seed-borne fungal pathogen in the Diaporthe-Phomopsis complex. The P. longicolla genome annotation will enable research into the genetic basis of fungal infection of soybean seed and provide information for the study of soybean-fungal interactions. The genome annotation will also be a valuable resource for the research and agricultural communities. It will aid in the development of new control strategies for this pathogen. The annotations are available at http://bioinformatics.towson.edu/phomopsis_longicolla/download.html. The NCBI accession number is AYRD00000000. PMID:27222801

  11. Biodegradation of DDT by Stenotrophomonas sp. DDT-1: Characterization and genome functional analysis

    NASA Astrophysics Data System (ADS)

    Pan, Xiong; Lin, Dunli; Zheng, Yuan; Zhang, Qian; Yin, Yuanming; Cai, Lin; Fang, Hua; Yu, Yunlong

    2016-02-01

    A novel bacterium capable of utilizing 1,1,1-trichloro-2,2-bis(p-chlorophenyl)ethane (DDT) as the sole carbon and energy source was isolated from a contaminated soil and identified as Stenotrophomonas sp. DDT-1 based on morphological characteristics, BIOLOG GN2 microplate profile, and 16S rDNA phylogeny. Genome sequencing and functional annotation of the isolate DDT-1 showed a 4,514,569 bp genome size, 66.92% GC content, 4,033 protein-coding genes, and 76 RNA genes including 8 rRNA genes. In total, 2,807 protein-coding genes were assigned to Clusters of Orthologous Groups (COGs), and 1,601 protein-coding genes were mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The degradation half-lives of DDT increased with substrate concentration from 0.1 to 10.0 mg/l, whereas they decreased with temperature from 15 °C to 35 °C. Neutral conditions were the most favorable for DDT biodegradation. Based on genome annotation of DDT degradation genes and the metabolites detected by GC-MS, a mineralization pathway was proposed for DDT biodegradation in which DDT was sequentially converted into DDE/DDD, DDMU, DDOH, and DDA via dechlorination, hydroxylation, and carboxylation, and ultimately mineralized to carbon dioxide. The results indicate that the isolate DDT-1 is a promising bacterial resource for the removal or detoxification of DDT residues in the environment.
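
    The abstract above quotes a genome-wide GC content. As a minimal sketch of how such a figure is derived from sequence, assuming the genome is available as a plain string of bases, GC content is the share of G and C among the unambiguous bases. The function name and the decision to ignore ambiguous symbols such as N are illustrative assumptions.

```python
def gc_content(sequence):
    """Percentage of G and C among the A/C/G/T bases of a DNA sequence."""
    seq = sequence.upper()
    # Count only unambiguous bases so that N and other symbols do not dilute the value.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


if __name__ == "__main__":
    print(round(gc_content("GGCCAT"), 2))
```

    For a multi-contig genome the same calculation would be run over the concatenated contigs, or over summed per-contig counts.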

  12. Physiological and genomic basis of mechanical-functional trade-off in plant vasculature

    PubMed Central

    Sengupta, Sonali; Majumder, Arun Lahiri

    2014-01-01

    Some areas in plant abiotic stress research are not frequently addressed by genomic and molecular tools. One such area is the interaction of gravitational force with the upward capillary pull of water and the mechanical-functional trade-off in plant vasculature. Although frost, drought and flooding stress greatly impact these physiological processes and consequently plant performance, the genomic and molecular basis of this trade-off is only sporadically addressed, and so is its adaptive value. Embolism resistance is an important trait for withstanding multiple stresses and offers scope for critical insight into, and modification of, the input of living cells in the process; biotechnological intervention in this trait may be of great importance. Vascular plants employ different physiological strategies to cope with embolism, and variation is observed across the kingdom. The genomic resources in this area have started to emerge and open up possibilities of synthesis, validation and utilization of the new knowledge base. This review article assesses the research to date on this issue, discusses new possibilities for bridging the physiology and genomics of a plant, and foresees its implementation in crop science. PMID:24904619

  13. Development and perspectives of scientific services offered by genomic biological resource centres.

    PubMed

    Wagner, Florian; Heidtke, Karsten R; Drescher, Bernd; Radelof, Uwe

    2007-09-01

    A number of fundamental technical developments, such as the emergence of oligonucleotide microarrays, new sequencing technologies and gene synthesis, have considerably changed the character of genomic biological resource centres in recent years. While genomic biological resource centres traditionally served mainly as providers of sparsely characterized cDNA clones and clone sets, there is nowadays a clear tendency towards well-characterized, high-quality clones. In addition, major new service units such as microarray services have developed, which are completely independent of clone collections, reflecting the co-evolution of data generation and technology development. The new technologies require an increasingly high degree of specialization, data integration and quality standards. Altogether, these developments result in spin-offs of highly specialized biotech companies, some of which will take a prominent position in translational medicine.

  14. SkateBase, an elasmobranch genome project and collection of molecular resources for chondrichthyan fishes

    PubMed Central

    Wyffels, Jennifer; L. King, Benjamin; Vincent, James; Chen, Chuming; Wu, Cathy H.; Polson, Shawn W.

    2014-01-01

    Chondrichthyan fishes are a diverse class of gnathostomes that provide a valuable perspective on fundamental characteristics shared by all jawed and limbed vertebrates. Studies of phylogeny, species diversity, population structure, conservation, and physiology are accelerated by genomic, transcriptomic and protein sequence data. These data are widely available for many sarcopterygii (coelacanth, lungfish and tetrapods) and actinopterygii (ray-finned fish including teleosts) taxa, but limited for chondrichthyan fishes. In this study, we summarize available data for chondrichthyes and describe resources for one of the largest projects to characterize one of these fish, Leucoraja erinacea, the little skate. SkateBase (http://skatebase.org) serves as the skate genome project portal linking data, research tools, and teaching resources. PMID:25309735

  15. Evaluating sustainability of watershed resources management through wetland functional analysis

    SciTech Connect

    Zalidis, G.C.; Gerakis, A. (Lab. of Applied Soil Science)

    1999-08-01

    Unsustainable agricultural policies and water and soil resource schemes have drained two thirds of Mediterranean wetlands since 1920. An outstanding example is Karla in Greece, a former internationally important wetland that was drained in 1962, causing environmental, social, and water and soil problems. The objective of this study was to assess the functions and values of Karla at three periods of its history, and to relate them to major events in the management of the water and soil resources of its watershed. Information on wetland and watershed features was collected from historical records and field visits. The results showed that the wetland in its pristine state had performed five functions to a high degree, one (groundwater recharge) to a moderate degree, and one (flood storage) to a low degree. Flood-control works, uncontrolled pumping, etc., in 1936-1961 degraded all functions except microclimate modification, while the bird support function was moderately altered. Drainage works in 1962 left a very small artificially flooded wetland with only four functions performed to an insignificant degree. Value degradation followed function degradation. It was concluded that past resource management has been nonintegrated. No consideration was given to the multiple functions and values of Karla. Previous restoration proposals involved the reinstatement of one or two functions only. The appropriate restoration scheme for Karla must be multiobjective and based on the integrated resource management of its own and the neighboring watersheds.

  16. Building a Strategic Human Resource Function. Workforce Brief.

    ERIC Educational Resources Information Center

    Weinberg, Ray; Mitchell, Stephen

    A human resource (HR) function should be strategically integrated with a company's business plan to help the company achieve its business objectives. Four elements ensure that the HR function carries out this role: plan strategically, align activities, value employees, and be proactive. The major components of a strategic HR program--obtaining,…

  17. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform.

    PubMed

    Zheng, Wenning; Tan, Tze King; Paterson, Ian C; Mutha, Naresh V R; Siow, Cheuk Chuen; Tan, Shi Yang; Old, Lesley A; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my.

  19. StreptoBase: An Oral Streptococcus mitis Group Genomic Resource and Analysis Platform

    PubMed Central

    Zheng, Wenning; Paterson, Ian C.; Mutha, Naresh V. R.; Siow, Cheuk Chuen; Tan, Shi Yang; Old, Lesley A.; Jakubovics, Nicholas S.; Choo, Siew Woh

    2016-01-01

    The oral streptococci are spherical Gram-positive bacteria categorized under the phylum Firmicutes which are among the most common causative agents of bacterial infective endocarditis (IE) and are also important agents in septicaemia in neutropenic patients. The Streptococcus mitis group is comprised of 13 species including some of the most common human oral colonizers such as S. mitis, S. oralis, S. sanguinis and S. gordonii as well as species such as S. tigurinus, S. oligofermentans and S. australis that have only recently been classified and are poorly understood at present. We present StreptoBase, which provides a specialized free resource focusing on the genomic analyses of oral species from the mitis group. It currently hosts 104 S. mitis group genomes including 27 novel mitis group strains that we sequenced using the high throughput Illumina HiSeq technology platform, and provides a comprehensive set of genome sequences for analyses, particularly comparative analyses and visualization of both cross-species and cross-strain characteristics of S. mitis group bacteria. StreptoBase incorporates sophisticated in-house designed bioinformatics web tools such as Pairwise Genome Comparison (PGC) tool and Pathogenomic Profiling Tool (PathoProT), which facilitate comparative pathogenomics analysis of Streptococcus strains. Examples are provided to demonstrate how StreptoBase can be employed to compare genome structure of different S. mitis group bacteria and putative virulence genes profile across multiple streptococcal strains. In conclusion, StreptoBase offers access to a range of streptococci genomic resources as well as analysis tools and will be an invaluable platform to accelerate research in streptococci. Database URL: http://streptococcus.um.edu.my. PMID:27138013

  20. From RNA-seq to large-scale genotyping - genomics resources for rye (Secale cereale L.)

    PubMed Central

    2011-01-01

    Background: The improvement of agricultural crops with regard to yield, resistance and environmental adaptation is a perpetual challenge for both breeding and research. Exploration of the genetic potential and implementation of genome-based breeding strategies for efficient rye (Secale cereale L.) cultivar improvement have been hampered by the lack of genome sequence information. To overcome this limitation we sequenced the transcriptomes of five winter rye inbred lines using Roche/454 GS FLX technology. Results: More than 2.5 million reads were assembled into 115,400 contigs representing a comprehensive rye expressed sequence tag (EST) resource. From sequence comparisons 5,234 single nucleotide polymorphisms (SNPs) were identified to develop the Rye5K high-throughput SNP genotyping array. Performance of the Rye5K SNP array was investigated by genotyping 59 rye inbred lines including the five lines used for sequencing, and five barley, three wheat, and two triticale accessions. A balanced distribution of allele frequencies ranging from 0.1 to 0.9 was observed. Residual heterozygosity of the rye inbred lines varied from 4.0 to 20.4% with higher average heterozygosity in the pollen compared to the seed parent pool. Conclusions: The established sequence and molecular marker resources will improve and promote genetic and genomic research as well as genome-based breeding in rye. PMID:21951788
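
    The two summary statistics reported above (allele frequency per SNP and residual heterozygosity per inbred line) can be illustrated with a minimal sketch. The genotype encoding used here ("AA", "AB", "BB", with "NC" for a failed call) is an assumption made for illustration; the abstract does not describe the array's output format, and the function names are hypothetical.

```python
from typing import Sequence

# Assumed genotype codes (not taken from the abstract): "AA"/"BB" are
# homozygous calls, "AB" is heterozygous, anything else is a failed call.
CALLED = {"AA", "AB", "BB"}


def residual_heterozygosity(genotypes: Sequence[str]) -> float:
    """Percentage of successfully called SNPs that are heterozygous in one line."""
    called = [g for g in genotypes if g in CALLED]
    if not called:
        raise ValueError("no called genotypes")
    het = sum(1 for g in called if g == "AB")
    return 100.0 * het / len(called)


def b_allele_frequency(genotypes: Sequence[str]) -> float:
    """Frequency of the B allele at one SNP across all genotyped lines."""
    called = [g for g in genotypes if g in CALLED]
    if not called:
        raise ValueError("no called genotypes")
    b_count = sum(g.count("B") for g in called)
    return b_count / (2 * len(called))
```

    Failed calls are excluded from both denominators, which is one common convention; other pipelines may treat them differently.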

  1. arrayMap: A Reference Resource for Genomic Copy Number Imbalances in Human Malignancies

    PubMed Central

    Baudis, Michael

    2012-01-01

    Background: The delineation of genomic copy number abnormalities (CNAs) from cancer samples has been instrumental for identification of tumor suppressor genes and oncogenes and proven useful for clinical marker detection. An increasing number of projects have mapped CNAs using high-resolution microarray based techniques. So far, no single resource provides a global collection of readily accessible oncogenomic array data. Methodology/Principal Findings: We here present arrayMap, a curated reference database and bioinformatics resource targeting copy number profiling data in human cancer. The arrayMap database provides a platform for meta-analysis and systems level data integration of high-resolution oncogenomic CNA data. To date, the resource incorporates more than 40,000 arrays in 224 cancer types extracted from several resources, including the NCBI’s Gene Expression Omnibus (GEO), EBI’s ArrayExpress (AE), The Cancer Genome Atlas (TCGA), publication supplements and direct submissions. For the majority of the included datasets, probe level and integrated visualization facilitate gene level and genome wide data review. Results from multi-case selections can be connected to downstream data analysis and visualization tools. Conclusions/Significance: To our knowledge, currently no data source provides an extensive collection of high resolution oncogenomic CNA data which could readily be used for genomic feature mining across a representative range of cancer entities. arrayMap represents our effort to provide a long-term platform for oncogenomic CNA data independent of specific platform considerations or specific project dependence. The online database can be accessed at http://www.arraymap.org. PMID:22629346

  2. Interactions of photosynthesis with genome size and function

    PubMed Central

    Raven, John A.; Beardall, John; Larkum, Anthony W. D.; Sánchez-Baracaldo, Patricia

    2013-01-01

    Photolithotrophs are divided between those that use water as their electron donor (Cyanobacteria and the photosynthetic eukaryotes) and those that use a different electron donor (the anoxygenic photolithotrophs, all of them Bacteria). Photolithotrophs with the most reduced genomes have more genes than do the corresponding chemoorganotrophs, and the fastest-growing photolithotrophs have significantly lower specific growth rates than the fastest-growing chemoorganotrophs. Slower growth results from diversion of resources into the photosynthetic apparatus, which accounts for about half of the cell protein. There are inherent dangers in (especially oxygenic) photosynthesis, including the formation of reactive oxygen species (ROS) and blue light sensitivity of the water-splitting apparatus. The extent to which photolithotrophs incur greater DNA damage and repair, and faster protein turnover with increased rRNA requirement, needs further investigation. A related source of environmental damage is ultraviolet B (UVB) radiation (280–320 nm), whose flux at the Earth's surface decreased as oxygen (and ozone) increased in the atmosphere. This oxygenation led to the requirements of defence against ROS, and decreasing availability to organisms of combined (non-dinitrogen) nitrogen and ferrous iron, and (indirectly) phosphorus, in the oxygenated biosphere. Differential codon usage in the genome and, especially, the proteome can lead to economies in the use of potentially growth-limiting elements. PMID:23754816

  3. Interactions of photosynthesis with genome size and function.

    PubMed

    Raven, John A; Beardall, John; Larkum, Anthony W D; Sánchez-Baracaldo, Patricia

    2013-07-19

    Photolithotrophs are divided between those that use water as their electron donor (Cyanobacteria and the photosynthetic eukaryotes) and those that use a different electron donor (the anoxygenic photolithotrophs, all of them Bacteria). Photolithotrophs with the most reduced genomes have more genes than do the corresponding chemoorganotrophs, and the fastest-growing photolithotrophs have significantly lower specific growth rates than the fastest-growing chemoorganotrophs. Slower growth results from diversion of resources into the photosynthetic apparatus, which accounts for about half of the cell protein. There are inherent dangers in (especially oxygenic) photosynthesis, including the formation of reactive oxygen species (ROS) and blue light sensitivity of the water-splitting apparatus. The extent to which photolithotrophs incur greater DNA damage and repair, and faster protein turnover with increased rRNA requirement, needs further investigation. A related source of environmental damage is ultraviolet B (UVB) radiation (280-320 nm), whose flux at the Earth's surface decreased as oxygen (and ozone) increased in the atmosphere. This oxygenation led to the requirements of defence against ROS, and decreasing availability to organisms of combined (non-dinitrogen) nitrogen and ferrous iron, and (indirectly) phosphorus, in the oxygenated biosphere. Differential codon usage in the genome and, especially, the proteome can lead to economies in the use of potentially growth-limiting elements.

  4. Introns: The Functional Benefits of Introns in Genomes.

    PubMed

    Jo, Bong-Seok; Choi, Sun Shim

    2015-12-01

    Since its discovery, the intron has been a major biological mystery in several respects. First, all of the completely sequenced eukaryotes harbor introns in their genomic structure, whereas no prokaryotes identified so far carry introns. Second, the total amount of introns varies among species. Third, the length and number of introns vary among genes, even within the genome of a single species. Fourth, all introns are copied into RNA by transcription and into DNA by replication, but intron sequences do not contribute to protein-coding sequences. The existence of introns in the genome should be a burden to cells, because cells must consume a great deal of energy to copy introns and excise them at exactly the correct positions with the help of complicated spliceosomal machinery. Their persistence throughout a long evolutionary history can be explained only if carrying introns confers selective advantages that outweigh these costs. In that regard, we summarize previous research on the functional roles or benefits of introns. Additionally, several other studies strongly suggesting that introns are not junk are introduced. PMID:26865841

  5. Primate genome architecture influences structural variation mechanisms and functional consequences.

    PubMed

    Gokcumen, Omer; Tischler, Verena; Tica, Jelena; Zhu, Qihui; Iskow, Rebecca C; Lee, Eunjung; Fritz, Markus Hsi-Yang; Langdon, Amy; Stütz, Adrian M; Pavlidis, Pavlos; Benes, Vladimir; Mills, Ryan E; Park, Peter J; Lee, Charles; Korbel, Jan O

    2013-09-24

    Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages.

  6. Primate genome architecture influences structural variation mechanisms and functional consequences

    PubMed Central

    Gokcumen, Omer; Tischler, Verena; Tica, Jelena; Zhu, Qihui; Iskow, Rebecca C.; Lee, Eunjung; Fritz, Markus Hsi-Yang; Langdon, Amy; Stütz, Adrian M.; Pavlidis, Pavlos; Benes, Vladimir; Mills, Ryan E.; Park, Peter J.; Lee, Charles; Korbel, Jan O.

    2013-01-01

    Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which are comprised of thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaque. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages. PMID:24014587

  7. Multifaceted Genomic Risk for Brain Function in Schizophrenia

    PubMed Central

    Chen, Jiayu; Calhoun, Vince D.; Pearlson, Godfrey D.; Ehrlich, Stefan; Turner, Jessica A.; Ho, Beng-Choon; Wassink, Thomas H.; Michael, Andrew M.; Liu, Jingyu

    2012-01-01

    Recently, deriving candidate endophenotypes from brain imaging data has become a valuable approach to study genetic influences on schizophrenia (SZ), whose pathophysiology remains unclear. In this work we utilized a multivariate approach, parallel independent component analysis, to identify genomic risk components associated with brain function abnormalities in SZ. 5157 candidate single nucleotide polymorphisms (SNPs) were derived from genome-wide array based on their possible connections with SZ and further investigated for their associations with brain activations captured with functional magnetic resonance imaging (fMRI) during a sensorimotor task. Using data from 92 SZ patients and 116 healthy controls, we detected a significant correlation (r = 0.29; p = 2.41 × 10⁻⁵) between one fMRI component and one SNP component, both of which significantly differentiated patients from controls. The fMRI component mainly consisted of precentral and postcentral gyri, the major activated regions in the motor task. On average, higher activation in these regions was observed in participants with higher loadings of the linked SNP component, predominantly contributed to by 253 SNPs. 138 identified SNPs were from known coding regions of 100 unique genes. 31 identified SNPs did not differ between groups, but moderately correlated with some other group-discriminating SNPs, indicating interactions among alleles contributing towards elevated SZ susceptibility. The genes associated with the identified SNPs participated in four neurotransmitter pathways: GABA receptor signaling, dopamine receptor signaling, neuregulin signaling and glutamate receptor signaling. In summary, our work provides further evidence for the complexity of genomic risk to the functional brain abnormality in SZ and suggests a pathological role of interactions between SNPs, genes and multiple neurotransmitter pathways. PMID:22440650

  8. High-resolution functional profiling of the norovirus genome.

    PubMed

    Thorne, Lucy; Bailey, Dalan; Goodfellow, Ian

    2012-11-01

    Human noroviruses (HuNoV) are a major cause of nonbacterial gastroenteritis worldwide, yet details of the life cycle and replication of HuNoV are relatively unknown due to the lack of an efficient cell culture system. Studies with murine norovirus (MNV), which can be propagated in permissive cells, have begun to probe different aspects of the norovirus life cycle; however, our understanding of the specific functions of the viral proteins lags far behind that of other RNA viruses. Genome-wide functional profiling by insertional mutagenesis can reveal protein domains essential for replication and can lead to generation of tagged viruses, which has not yet been achieved for noroviruses. Here, transposon-mediated insertional mutagenesis was used to create 5 libraries of mutagenized MNV infectious clones, each containing a 15-nucleotide sequence randomly inserted within a defined region of the genome. Infectious virus was recovered from each library and was subsequently passaged in cell culture to determine the effect of each insertion by insertion-specific fluorescent PCR profiling. Genome-wide profiling of over 2,000 insertions revealed essential protein domains and confirmed known functional motifs. As validation, several insertion sites were introduced into a wild-type clone, successfully allowing the recovery of infectious virus. Screening of a number of reporter proteins and epitope tags led to the generation of the first infectious epitope-tagged noroviruses carrying the FLAG epitope tag in either NS4 or VP2. Subsequent work confirmed that epitope-tagged fully infectious noroviruses may be of use in the dissection of the molecular interactions that occur within the viral replication complex. PMID:22915807

  9. Genome Evolution in the Cold: Antarctic Icefish Muscle Transcriptome Reveals Selective Duplications Increasing Mitochondrial Function

    PubMed Central

    Coppe, Alessandro; Agostini, Cecilia; Marino, Ilaria A.M.; Zane, Lorenzo; Bargelloni, Luca; Bortoluzzi, Stefania; Patarnello, Tomaso

    2013-01-01

    Antarctic notothenioids radiated over millions of years in subzero waters, evolving peculiar features, such as antifreeze glycoproteins and absence of heat shock response. Icefish, family Channichthyidae, also lack oxygen-binding proteins and display extreme modifications, including high mitochondrial densities in aerobic tissues. A genomic expansion accompanying the evolution of these fish was reported, but paucity of genomic information limits the understanding of notothenioid cold adaptation. We reconstructed and annotated the first skeletal muscle transcriptome of the icefish Chionodraco hamatus providing a new resource for icefish genomics (http://compgen.bio.unipd.it/chamatusbase/, last accessed December 12, 2012). We exploited deep sequencing of this energy-dependent tissue to test the hypothesis of selective duplication of genes involved in mitochondrial function. We developed a bioinformatic approach to univocally assign C. hamatus transcripts to orthology groups extracted from phylogenetic trees of five model species. Chionodraco hamatus duplicates were recorded for each orthology group allowing the identification of duplicated genes specific to the icefish lineage. Significantly more duplicates were found in the icefish when transcriptome data were compared with whole-genome data of model species. Indeed, duplicated genes were significantly enriched in proteins with mitochondrial localization, involved in mitochondrial function and biogenesis. In cold conditions and without oxygen-carrying proteins, energy production is challenging. The combination of high mitochondrial densities and the maintenance of duplicated genes involved in mitochondrial biogenesis and aerobic respiration might confer a selective advantage by improving oxygen diffusion and energy supply to aerobic tissues. Our results provide new insights into the genomic basis of icefish cold adaptation. PMID:23196969

  10. Functional Requirements for Information Resource Provenance on the Web

    SciTech Connect

    McCusker, James P.; Lebo, Timothy; Graves, Alvaro; Difranzo, Dominic; Pinheiro da Silva, Paulo; McGuinness, Deborah L.

    2012-06-19

    We provide a means to formally explain the relationship between HTTP URLs and the representations returned when they are requested. According to existing World Wide Web architecture, the URL serves as an identifier for a semiotic referent while the document returned via HTTP serves as a representation of the same referent. This begins with two sides of a semiotic triangle; the third side is the relationship between the URL and the representation received. We complete this description by extending the library science resource model Functional Requirements for Bibliographic Resources (FRBR) with cryptographic message and content digests to create a Functional Requirements for Information Resources (FRIR). We show how applying the FRIR model to HTTP GET and POST transactions disambiguates the many relationships between a given URL and all representations received from its request, provides fine-grained explanations that are complementary to existing explanations of web resources, and integrates easily into the emerging W3C provenance standard.
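
    As a rough illustration of the idea of tying a URL to the specific representations received from it, the sketch below records a content digest for each response body. It is not the FRIR model itself: the abstract does not name a digest algorithm, so SHA-256 is an assumption, and the class and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One representation received for a URL (hypothetical record type)."""
    url: str
    content_digest: str  # hex SHA-256 of the body bytes (assumed algorithm)


def describe(url: str, body: bytes) -> Representation:
    """Record the URL together with a digest of the bytes that were returned."""
    return Representation(url, hashlib.sha256(body).hexdigest())


def same_content(a: Representation, b: Representation) -> bool:
    """Two representations carry identical bytes if their digests match."""
    return a.content_digest == b.content_digest
```

    With records like these, two requests to the same URL that return different bytes yield distinct digests, while identical content served from different URLs can still be recognised as the same.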

  11. Computational methods and resources for the interpretation of genomic variants in cancer

    PubMed Central

    2015-01-01

    The recent improvement of the high-throughput sequencing technologies is having a strong impact on the detection of genetic variations associated with cancer. Several institutions worldwide have been sequencing the whole exomes and/or genomes of cancer patients in the thousands, thereby providing an invaluable collection of new somatic mutations in different cancer types. These initiatives promoted the development of methods and tools for the analysis of cancer genomes that are aimed at studying the relationship between genotype and phenotype in cancer. In this article we review the online resources and computational tools for the analysis of cancer genomes. First, we describe the available repositories of cancer genome data. Next, we provide an overview of the methods for the detection of genetic variation and computational tools for the prioritization of cancer related genes and causative somatic variations. Finally, we discuss the future perspectives in cancer genomics focusing on the impact of computational methods and quantitative approaches for defining personalized strategies to improve the diagnosis and treatment of cancer. PMID:26111056

  12. Computational methods and resources for the interpretation of genomic variants in cancer.

    PubMed

    Tian, Rui; Basu, Malay K; Capriotti, Emidio

    2015-01-01

    The recent improvement of the high-throughput sequencing technologies is having a strong impact on the detection of genetic variations associated with cancer. Several institutions worldwide have been sequencing the whole exomes and/or genomes of cancer patients in the thousands, thereby providing an invaluable collection of new somatic mutations in different cancer types. These initiatives promoted the development of methods and tools for the analysis of cancer genomes that are aimed at studying the relationship between genotype and phenotype in cancer. In this article we review the online resources and computational tools for the analysis of cancer genomes. First, we describe the available repositories of cancer genome data. Next, we provide an overview of the methods for the detection of genetic variation and computational tools for the prioritization of cancer related genes and causative somatic variations. Finally, we discuss the future perspectives in cancer genomics focusing on the impact of computational methods and quantitative approaches for defining personalized strategies to improve the diagnosis and treatment of cancer.

  13. Integration of molecular functions at the ecosystemic level: breakthroughs and future goals of environmental genomics and post-genomics

    PubMed Central

    Vandenkoornhuyse, Philippe; Dufresne, Alexis; Quaiser, Achim; Gouesbet, Gwenola; Binet, Françoise; Francez, André-Jean; Mahé, Stéphane; Bormans, Myriam; Lagadeuc, Yvan; Couée, Ivan

    2010-01-01

    Environmental genomics and genome-wide expression approaches deal with large-scale sequence-based information obtained from environmental samples, at organismal, population or community levels. To date, environmental genomics, transcriptomics and proteomics are arguably the most powerful approaches to discover completely novel ecological functions and to link organismal capabilities, organism–environment interactions, functional diversity, ecosystem processes, evolution and Earth history. Thus, environmental genomics is not merely a toolbox of new technologies but also a source of novel ecological concepts and hypotheses. By removing previous dichotomies between ecophysiology, population ecology, community ecology and ecosystem functioning, environmental genomics enables the integration of sequence-based information into higher ecological and evolutionary levels. However, environmental genomics, along with transcriptomics and proteomics, must involve pluridisciplinary research, such as new developments in bioinformatics, in order to integrate high-throughput molecular biology techniques into ecology. In this review, the validity of environmental genomics and post-genomics for studying ecosystem functioning is discussed in terms of major advances and expectations, as well as in terms of potential hurdles and limitations. Novel avenues for improving the use of these approaches to test theory-driven ecological hypotheses are also explored. PMID:20426792

  14. NeisseriaBase: a specialised Neisseria genomic resource and analysis platform

    PubMed Central

    Zheng, Wenning; Mutha, Naresh V.R.; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S.; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah

    2016-01-01

    Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my. PMID:27017950

  15. NeisseriaBase: a specialised Neisseria genomic resource and analysis platform.

    PubMed

    Zheng, Wenning; Mutha, Naresh V R; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah; Choo, Siew Woh

    2016-01-01

    Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my. PMID:27017950

  17. Algal Functional Annotation Tool from the DOE-UCLA Institute for Genomics and Proteomics

    DOE Data Explorer

    Lopez, David

    The Algal Functional Annotation Tool is a bioinformatics resource to visualize pathway maps, identify enriched biological terms, or convert gene identifiers to elucidate biological function in silico. These types of analysis are tailored to lists of gene identifiers, such as those coming from transcriptome gene expression analysis. By analyzing the functional annotation of an interesting set of genes, common biological motifs may be elucidated and a first-pass analysis can point further research in the right direction. Currently, the following databases have been parsed, processed, and added to the tool: 1) Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathways Database, 2) MetaCyc Encyclopedia of Metabolic Pathways, 3) Panther Pathways Database, 4) Reactome Pathways Database, 5) Gene Ontology, 6) MapMan Ontology, 7) KOG (Eukaryotic Clusters of Orthologous Groups), 8) Pfam, 9) InterPro.

  18. Generalizing complexity: a fruitful partnership of functional genomics and systems biology

    PubMed Central

    2012-01-01

    A report on the meeting 'Functional Genomics and Systems Biology 2011', Wellcome Trust Conference Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK, 29 November to 1 December, 2011. PMID:22348308

  19. Blast2GO goes grid: developing a grid-enabled prototype for functional genomics analysis.

    PubMed

    Aparicio, G; Götz, S; Conesa, A; Segrelles, D; Blanquer, I; García, J M; Hernandez, V; Robles, M; Talon, M

    2006-01-01

    The vast amount and complexity of data generated in genomic research implies that new dedicated and powerful computational tools need to be developed to meet their analysis requirements. Blast2GO (B2G) is a bioinformatics tool for Gene Ontology-based DNA or protein sequence annotation and function-based data mining. The application has been developed with the aim of offering an easy-to-use tool for functional genomics research. Typical B2G users are medium-sized genomics labs carrying out sequencing, EST and microarray projects, handling datasets of up to several thousand sequences. In the current version of B2G, the power and analytical potential of both annotation and function data mining is somewhat restricted by the computational power behind each particular installation. To offer enhanced computational capacity within this bioinformatics application, a Grid component is being developed. A prototype has been conceived for the particular problem of speeding up Blast searches to obtain fast results for large datasets. Many efforts have been reported in the literature concerning the speeding up of Blast searches, but few of them deal with the use of large heterogeneous production Grid infrastructures. These are the infrastructures that could reach the largest number of resources and the best load balancing for data access. The Grid Service under development will analyse requests based on the number of sequences, splitting them according to the available resources. Lower-level computation will be performed through MPIBLAST. The software architecture is based on the WSRF standard.
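
    The step of splitting a request according to the available resources can be sketched as follows. This is only an illustration of even partitioning; the abstract does not describe the actual scheduling policy of the Grid Service, and the function name and the notion of a worker count are assumptions.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_evenly(sequences: Sequence[T], n_workers: int) -> List[List[T]]:
    """Partition query sequences into contiguous chunks, one per worker.

    Chunk sizes differ by at most one. If there are fewer sequences than
    workers, only as many chunks as sequences are produced.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    n_chunks = min(n_workers, len(sequences)) or 1
    base, extra = divmod(len(sequences), n_chunks)
    chunks: List[List[T]] = []
    start = 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        chunks.append(list(sequences[start:start + size]))
        start += size
    return chunks
```

    Each chunk could then be submitted as an independent search job. A real scheduler would likely also weigh sequence lengths and node capacity, which this sketch ignores.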

  20. Functional Genome Screening to Elucidate the Colistin Resistance Mechanism

    PubMed Central

    Kumar, Mohit; Gupta, Ashutosh; Sahoo, Rajesh Kumar; Jena, Jayanti; Debata, Nagen Kumar; Subudhi, Enketeswara

    2016-01-01

    The antibiogram profile of 1590 clinical bacterial isolates, based on thirteen different antimicrobial compounds, showed that 1.6% of the bacterial isolates are multidrug resistant. The distribution pattern based on 16S rRNA sequence analysis showed that Pseudomonas aeruginosa constituted the largest group (83.6%), followed by Burkholderia pseudomallei sp. A191 (5.17%) and Staphylococcus sp. A261 (3.45%). Among the various antibiotics used, colistin appeared to be the most effective against the Gram-negative bacteria. Burkholderia pseudomallei sp. A191 and Pseudomonas aeruginosa sp. A111 showed resistance to 1500 μg/ml and 750 μg/ml of colistin, respectively, which constitutes 7.7% of the bacterial population. A functional genomics strategy was employed to discover the molecular support for colistin resistance in Burkholderia pseudomallei sp. A191. A pUC plasmid-based genomic expression library was constructed with an estimated library size of 2.1 × 10⁷ bp. Five colistin-resistant clones were obtained after functional screening of the library. Analysis of the DNA sequences of the five colistin-resistant clones showed homology to two-component regulatory systems (TCRS) encoding a histidine kinase (mrgS) and its regulatory component (mrgR). A cross-complementation assay showed that mutations in mrgS were sufficient to confer a colistin-resistant phenotype in a sensitive strain. PMID:26988670

  1. Functional Genome Screening to Elucidate the Colistin Resistance Mechanism.

    PubMed

    Kumar, Mohit; Gupta, Ashutosh; Sahoo, Rajesh Kumar; Jena, Jayanti; Debata, Nagen Kumar; Subudhi, Enketeswara

    2016-01-01

    The antibiogram profile of 1590 clinical bacterial isolates, based on thirteen different antimicrobial compounds, showed that 1.6% of the bacterial isolates are multidrug resistant. The distribution pattern based on 16S rRNA sequence analysis showed that Pseudomonas aeruginosa constituted the largest group (83.6%), followed by Burkholderia pseudomallei sp. A191 (5.17%) and Staphylococcus sp. A261 (3.45%). Among the various antibiotics used, colistin appeared to be the most effective against the Gram-negative bacteria. Burkholderia pseudomallei sp. A191 and Pseudomonas aeruginosa sp. A111 showed resistance to 1500 μg/ml and 750 μg/ml of colistin, respectively, which constitutes 7.7% of the bacterial population. A functional genomics strategy was employed to discover the molecular support for colistin resistance in Burkholderia pseudomallei sp. A191. A pUC plasmid-based genomic expression library was constructed with an estimated library size of 2.1 × 10⁷ bp. Five colistin-resistant clones were obtained after functional screening of the library. Analysis of the DNA sequences of the five colistin-resistant clones showed homology to two-component regulatory systems (TCRS) encoding a histidine kinase (mrgS) and its regulatory component (mrgR). A cross-complementation assay showed that mutations in mrgS were sufficient to confer a colistin-resistant phenotype in a sensitive strain. PMID:26988670

  2. Cubozoan genome illuminates functional diversification of opsins and photoreceptor evolution.

    PubMed

    Liegertová, Michaela; Pergner, Jiří; Kozmiková, Iryna; Fabian, Peter; Pombinho, Antonio R; Strnad, Hynek; Pačes, Jan; Vlček, Čestmír; Bartůněk, Petr; Kozmik, Zbyněk

    2015-07-08

    Animals sense light primarily by an opsin-based photopigment present in a photoreceptor cell. Cnidaria are arguably the most basal phylum containing a well-developed visual system. The evolutionary history of opsins in the animal kingdom has not yet been resolved. Here, we study the evolution of animal opsins by genome-wide analysis of the cubozoan jellyfish Tripedalia cystophora, a cnidarian possessing complex lens-containing eyes and minor photoreceptors. A large number of opsin genes with distinct tissue- and stage-specific expression were identified. Our phylogenetic analysis unequivocally classifies cubozoan opsins as a sister group to c-opsins and documents lineage-specific expansion of the opsin gene repertoire in the cubozoan genome. Functional analyses provided evidence for the use of the Gs-cAMP signaling pathway in a small set of cubozoan opsins, indicating the possibility that the majority of other cubozoan opsins signal via distinct pathways. Additionally, these tests uncovered subtle differences among individual opsins, suggesting possible fine-tuning for specific photoreceptor tasks. Based on phylogenetic, expression and biochemical analysis we propose that rapid lineage- and species-specific duplications of the intron-less opsin genes and their subsequent functional diversification promoted evolution of a large repertoire of both visual and extraocular photoreceptors in cubozoans.

  3. [Research progress on banana functional genomics involved in fruit quality].

    PubMed

    Liu, Ju-Hua; Xu, Bi-Yu; Zhang, Jing; Wang, Jia-Shui; Jia, Cai-Hong; Zhang, Jian-Bin; Jin, Zhi-Qiang

    2012-04-01

    Banana is one of the most important tropical fruits and a main economic resource for people in the tropics. Banana fruit quality has long been a focus of interest. Here, we review recent research progress on the isolation and identification of banana genes involved in fruit quality traits such as ripening, softening, glycometabolism, and scent, which will help to explore their functions and facilitate banana quality improvement. PMID:22522158

  4. Comparative Genomic and Functional Analysis of 100 Lactobacillus rhamnosus Strains and Their Comparison with Strain GG

    PubMed Central

    Pietilä, Taija E.; Järvinen, Hanna M.; Messing, Marcel; Randazzo, Cinzia L.; Paulin, Lars; Laine, Pia; Ritari, Jarmo; Caggia, Cinzia; Lähteinen, Tanja; Brouns, Stan J. J.; Satokari, Reetta; von Ossowski, Ingemar; Reunanen, Justus; Palva, Airi; de Vos, Willem M.

    2013-01-01

    Lactobacillus rhamnosus is a lactic acid bacterium that is found in a large variety of ecological habitats, including artisanal and industrial dairy products, the oral cavity, intestinal tract or vagina. To gain insights into the genetic complexity and ecological versatility of the species L. rhamnosus, we examined the genomes and phenotypes of 100 L. rhamnosus strains isolated from diverse sources. The genomes of 100 L. rhamnosus strains were mapped onto the L. rhamnosus GG reference genome. These strains were phenotypically characterized for a wide range of metabolic, antagonistic, signalling and functional properties. Phylogenomic analysis showed multiple groupings of the species that could partly be associated with their ecological niches. We identified 17 highly variable regions that encode functions related to lifestyle, i.e. carbohydrate transport and metabolism, production of mucus-binding pili, bile salt resistance, prophages and CRISPR adaptive immunity. Integration of the phenotypic and genomic data revealed that some L. rhamnosus strains possibly resided in multiple niches, illustrating the dynamics of bacterial habitats. The present study showed two distinctive geno-phenotypes in the L. rhamnosus species. The geno-phenotype A suggests an adaptation to stable nutrient-rich niches, i.e. milk-derivative products, reflected by the alteration or loss of biological functions associated with antimicrobial activity spectrum, stress resistance, adaptability and fitness to a distinctive range of habitats. In contrast, the geno-phenotype B displays adequate traits to a variable environment, such as the intestinal tract, in terms of nutrient resources, bacterial population density and host effects. PMID:23966868

  5. Introducing National Center for Genome Resources (NCGR) Informatics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Crow, John [National Center for Genome Resources]

    2016-07-12

    John Crow from the National Center for Genome Resources discusses his organization's informatics at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  6. Introducing National Center for Genome Resources (NCGR) Informatics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    SciTech Connect

    Crow, John

    2012-06-01

    John Crow from the National Center for Genome Resources discusses his organization's informatics at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  7. Functional genomics of PCOS: from GWAS to molecular mechanisms.

    PubMed

    McAllister, Jan M; Legro, Richard S; Modi, Bhavi P; Strauss, Jerome F

    2015-03-01

    Polycystic ovary syndrome (PCOS) is a common endocrinopathy characterized by increased ovarian androgen biosynthesis, anovulation, and infertility. PCOS has a strong heritable component based on familial clustering and twin studies. Genome-wide association studies (GWAS) identified several PCOS candidate loci including LHCGR, FSHR, ZNF217, YAP1, INSR, RAB5B, and C9orf3. We review the functional roles of strong PCOS candidate loci focusing on FSHR, LHCGR, INSR, and DENND1A. We propose that these candidates comprise a hierarchical signaling network by which DENND1A, LHCGR, INSR, RAB5B, adapter proteins, and associated downstream signaling cascades converge to regulate theca cell androgen biosynthesis. Future elucidation of the functional gene networks predicted by the PCOS GWAS will result in new diagnostic and therapeutic approaches for women with PCOS.

  8. Whole-Genome Sequence of Mesorhizobium hungaricum sp. nov. Strain UASWS1009, a Potential Resource for Agricultural and Environmental Uses

    PubMed Central

    Crovadore, Julien; Cochard, Bastien; Calmin, Gautier; Chablais, Romain; Schulz, Torsten

    2016-01-01

    We report here the whole-genome shotgun sequences of the strain UASWS1009 of the species Mesorhizobium hungaricum sp. nov., which are different from any other known Mesorhizobium species. This is the first genome registered for this new species, which could be considered as a potential resource for agriculture and environmental uses. PMID:27738050

  9. Development of a high-throughput SNP resource to advance genomic, genetic and breeding research in carrot (Daucus carota L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The rapid advancement in high-throughput SNP genotyping technologies along with next generation sequencing (NGS) platforms has decreased the cost, improved the quality of large-scale genome surveys, and allowed specialty crops with limited genomic resources such as carrot (Daucus carota) to access t...

  10. Genome-wide functional annotation and structural verification of metabolic ORFeome of Chlamydomonas reinhardtii

    PubMed Central

    2011-01-01

    Background Recent advances in the field of metabolic engineering have been expedited by the availability of genome sequences and metabolic modelling approaches. The complete sequencing of the C. reinhardtii genome has made this unicellular alga a good candidate for metabolic engineering studies; however, the annotation of the relevant genes has not been validated and the much-needed metabolic ORFeome is currently unavailable. We describe our efforts on the functional annotation of the ORF models released by the Joint Genome Institute (JGI), prediction of their subcellular localizations, and experimental verification of their structural annotation at the genome scale. Results We assigned enzymatic functions to the translated JGI ORF models of C. reinhardtii by reciprocal BLAST searches of the putative proteome against the UniProt and AraCyc enzyme databases. The best match for each translated ORF was identified and the EC numbers were transferred onto the ORF models. Enzymatic functional assignment was extended to the paralogs of the ORFs by clustering ORFs using BLASTCLUST. In total, we assigned 911 enzymatic functions, including 886 EC numbers, to 1,427 transcripts. We further annotated the enzymatic ORFs by prediction of their subcellular localization. The majority of the ORFs are predicted to be compartmentalized in the cytosol and chloroplast. We verified the structure of the metabolism-related ORF models by reverse transcription-PCR of the functionally annotated ORFs. Following amplification and cloning, we carried out 454FLX and Sanger sequencing of the ORFs. Based on alignment of the 454FLX reads to the ORF predicted sequences, we obtained more than 90% coverage for more than 80% of the ORFs. In total, 1,087 ORF models were verified by 454 and Sanger sequencing methods. We obtained expression evidence for 98% of the metabolic ORFs in the algal cells grown under constant light in the presence of acetate. 
    Conclusions We functionally annotated approximately 1
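
    The record above assigns enzymatic functions by reciprocal BLAST searches and transfers EC numbers from the best match. As a minimal sketch of one common reading of that idea (reciprocal best hits), the Python below works on small in-memory hit tables. The function names, the table layout, and the example IDs and scores are all hypothetical and are not taken from the paper, which does not state its exact matching criteria.

```python
from typing import Dict, List, Tuple

# Hypothetical hit table: query ID -> list of (subject ID, bit score).
Hits = Dict[str, List[Tuple[str, float]]]


def best_hit(hits: Hits) -> Dict[str, str]:
    """Keep the highest-scoring subject for every query that has hits."""
    return {
        query: max(subjects, key=lambda s: s[1])[0]
        for query, subjects in hits.items()
        if subjects
    }


def reciprocal_best_hits(forward: Hits, reverse: Hits) -> Dict[str, str]:
    """Return ORF -> enzyme pairs that are each other's best hit."""
    fwd = best_hit(forward)   # ORF searched against the enzyme database
    rev = best_hit(reverse)   # enzyme searched against the ORF set
    return {orf: enz for orf, enz in fwd.items() if rev.get(enz) == orf}


def transfer_ec(pairs: Dict[str, str], ec_of: Dict[str, str]) -> Dict[str, str]:
    """Copy the EC number of the matched enzyme onto the ORF."""
    return {orf: ec_of[enz] for orf, enz in pairs.items() if enz in ec_of}


# Made-up example data, for illustration only.
forward = {"orf1": [("enzA", 250.0), ("enzB", 90.0)], "orf2": [("enzA", 120.0)]}
reverse = {"enzA": [("orf1", 248.0), ("orf2", 118.0)], "enzB": [("orf1", 88.0)]}
ec_of = {"enzA": "1.1.1.1"}

annotations = transfer_ec(reciprocal_best_hits(forward, reverse), ec_of)
print(annotations)
```

    In this toy input only orf1 receives an EC number: orf2 also hits enzA, but enzA's best hit is orf1, so the orf2 pairing is not reciprocal. Extending annotations to paralogs by clustering, as the abstract describes, would be a separate step.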

  11. Comparative population genomics: power and principles for the inference of functionality.

    PubMed

    Lawrie, David S; Petrov, Dmitri A

    2014-04-01

    The availability of sequenced genomes from multiple related organisms allows the detection and localization of functional genomic elements based on the idea that such elements evolve more slowly than neutral sequences. Although such comparative genomics methods have proven useful in discovering functional elements and ascertaining levels of functional constraint in the genome as a whole, here we outline limitations intrinsic to this approach that cannot be overcome by sequencing more species. We argue that it is essential to supplement comparative genomics with ultra-deep sampling of populations from closely related species to enable substantially more powerful genomic scans for functional elements. The convergence of sequencing technology and population genetics theory has made such projects feasible and has exciting implications for functional genomics.

  12. Resources

    MedlinePlus

    ... Breastfeeding - resources Bulimia - resources Burns - resources Cancer - resources Cerebral palsy - resources Celiac disease - resources Child abuse - resources Chronic fatigue syndrome - resources Chronic pain - ...

  13. Functional conservation of Rel binding sites in drosophilid genomes.

    PubMed

    Copley, Richard R; Totrov, Maxim; Linnell, Jane; Field, Simon; Ragoussis, Jiannis; Udalova, Irina A

    2007-09-01

    Evolutionary constraints on gene regulatory elements are poorly understood: Little is known about how the strength of transcription factor binding correlates with DNA sequence conservation, and whether transcription factor binding sites can evolve rapidly while retaining their function. Here we use the model of the NFKB/Rel-dependent gene regulation in divergent Drosophila species to examine the hypothesis that the functional properties of authentic transcription factor binding sites are under stronger evolutionary constraints than the genomic background. Using molecular modeling we compare tertiary structures of the Drosophila Rel family proteins Dorsal, Dif, and Relish and demonstrate that their DNA-binding and protein dimerization domains undergo distinct rates of evolution. The accumulated amino acid changes, however, are unlikely to affect DNA sequence recognition and affinity. We employ our recently developed microarray-based experimental platform and principal coordinates statistical analysis to quantitatively and systematically profile DNA binding affinities of three Drosophila Rel proteins to 10,368 variants of the NFKB recognition sequences. We then correlate the evolutionary divergence of gene regulatory regions with differences in DNA binding affinities. Genome-wide analyses reveal a significant increase in the number of conserved Rel binding sites in promoters of developmental and immune genes. Significantly, the affinity of Rel proteins to these sites was higher than to less conserved sites and was maintained by the conservation of the DNA binding site sequence (static conservation) or in some cases despite significantly diverged sequences (dynamic conservation). We discuss how two types of conservation may contribute to the stabilization and optimization of a functional gene regulatory code in evolution.

  14. Functional conservation of Rel binding sites in drosophilid genomes

    PubMed Central

    Copley, Richard R.; Totrov, Maxim; Linnell, Jane; Field, Simon; Ragoussis, Jiannis; Udalova, Irina A.

    2007-01-01

    Evolutionary constraints on gene regulatory elements are poorly understood: Little is known about how the strength of transcription factor binding correlates with DNA sequence conservation, and whether transcription factor binding sites can evolve rapidly while retaining their function. Here we use the model of the NFKB/Rel-dependent gene regulation in divergent Drosophila species to examine the hypothesis that the functional properties of authentic transcription factor binding sites are under stronger evolutionary constraints than the genomic background. Using molecular modeling we compare tertiary structures of the Drosophila Rel family proteins Dorsal, Dif, and Relish and demonstrate that their DNA-binding and protein dimerization domains undergo distinct rates of evolution. The accumulated amino acid changes, however, are unlikely to affect DNA sequence recognition and affinity. We employ our recently developed microarray-based experimental platform and principal coordinates statistical analysis to quantitatively and systematically profile DNA binding affinities of three Drosophila Rel proteins to 10,368 variants of the NFKB recognition sequences. We then correlate the evolutionary divergence of gene regulatory regions with differences in DNA binding affinities. Genome-wide analyses reveal a significant increase in the number of conserved Rel binding sites in promoters of developmental and immune genes. Significantly, the affinity of Rel proteins to these sites was higher than to less conserved sites and was maintained by the conservation of the DNA binding site sequence (static conservation) or in some cases despite significantly diverged sequences (dynamic conservation). We discuss how two types of conservation may contribute to the stabilization and optimization of a functional gene regulatory code in evolution. PMID:17785540

  15. NGSmethDB: an updated genome resource for high quality, single-cytosine resolution methylomes

    PubMed Central

    Geisen, Stefanie; Barturen, Guillermo; Alganza, Ángel M.; Hackenberg, Michael; Oliver, José L.

    2014-01-01

    The updated release of ‘NGSmethDB’ (http://bioinfo2.ugr.es/NGSmethDB) is a repository for single-base whole-genome methylome maps for the best-assembled eukaryotic genomes. Short-read data sets from NGS bisulfite-sequencing projects of cell lines, fresh and pathological tissues are first pre-processed and aligned to the corresponding reference genome, and then the cytosine methylation levels are profiled. One major improvement is the application of a unique bioinformatics protocol to all data sets, thereby assuring the comparability of all values with each other. We implemented stringent quality controls to minimize important error sources, such as sequencing errors, bisulfite failures, clonal reads or single nucleotide variants (SNVs). This leads to reliable and high-quality methylomes, all obtained under uniform settings. Another significant improvement is the detection in parallel of SNVs, which might be crucial for many downstream analyses (e.g. SNVs and differential-methylation relationships). A next-generation methylation browser allows fast and smooth scrolling and zooming, thus speeding data download/upload, at the same time requiring fewer server resources. Several data mining tools allow the comparison/retrieval of methylation levels in different tissues or genome regions. NGSmethDB methylomes are also available as native tracks through a UCSC hub, which allows comparison with a wide range of third-party annotations, in particular phenotype or disease annotations. PMID:24271385

  16. The SOL Genomics Network: a comparative resource for Solanaceae biology and beyond.

    PubMed

    Mueller, Lukas A; Solow, Teri H; Taylor, Nicolas; Skwarecki, Beth; Buels, Robert; Binns, John; Lin, Chenwei; Wright, Mark H; Ahrens, Robert; Wang, Ying; Herbst, Evan V; Keyder, Emil R; Menda, Naama; Zamir, Dani; Tanksley, Steven D

    2005-07-01

    The SOL Genomics Network (SGN; http://sgn.cornell.edu) is a rapidly evolving comparative resource for the plants of the Solanaceae family, which includes important crop and model plants such as potato (Solanum tuberosum), eggplant (Solanum melongena), pepper (Capsicum annuum), and tomato (Solanum lycopersicum). The aim of SGN is to relate these species to one another using a comparative genomics approach and to tie them to the other dicots through the fully sequenced genome of Arabidopsis (Arabidopsis thaliana). SGN currently houses map and marker data for Solanaceae species, a large expressed sequence tag collection with computationally derived unigene sets, an extensive database of phenotypic information for a mutagenized tomato population, and associated tools such as real-time quantitative trait loci. Recently, the International Solanaceae Project (SOL) was formed as an umbrella organization for Solanaceae research in over 30 countries to address important questions in plant biology. The first cornerstone of the SOL project is the sequencing of the entire euchromatic portion of the tomato genome. SGN is collaborating with other bioinformatics centers in building the bioinformatics infrastructure for the tomato sequencing project and implementing the bioinformatics strategy of the larger SOL project. The overarching goal of SGN is to make information available in an intuitive comparative format, thereby facilitating a systems approach to investigations into the basis of adaptation and phenotypic diversity in the Solanaceae family, other species in the Asterid clade such as coffee (Coffea arabica), Rubiaceae, and beyond.

  17. The SOL Genomics Network. A Comparative Resource for Solanaceae Biology and Beyond

    PubMed Central

    Mueller, Lukas A.; Solow, Teri H.; Taylor, Nicolas; Skwarecki, Beth; Buels, Robert; Binns, John; Lin, Chenwei; Wright, Mark H.; Ahrens, Robert; Wang, Ying; Herbst, Evan V.; Keyder, Emil R.; Menda, Naama; Zamir, Dani; Tanksley, Steven D.

    2005-01-01

    The SOL Genomics Network (SGN; http://sgn.cornell.edu) is a rapidly evolving comparative resource for the plants of the Solanaceae family, which includes important crop and model plants such as potato (Solanum tuberosum), eggplant (Solanum melongena), pepper (Capsicum annuum), and tomato (Solanum lycopersicum). The aim of SGN is to relate these species to one another using a comparative genomics approach and to tie them to the other dicots through the fully sequenced genome of Arabidopsis (Arabidopsis thaliana). SGN currently houses map and marker data for Solanaceae species, a large expressed sequence tag collection with computationally derived unigene sets, an extensive database of phenotypic information for a mutagenized tomato population, and associated tools such as real-time quantitative trait loci. Recently, the International Solanaceae Project (SOL) was formed as an umbrella organization for Solanaceae research in over 30 countries to address important questions in plant biology. The first cornerstone of the SOL project is the sequencing of the entire euchromatic portion of the tomato genome. SGN is collaborating with other bioinformatics centers in building the bioinformatics infrastructure for the tomato sequencing project and implementing the bioinformatics strategy of the larger SOL project. The overarching goal of SGN is to make information available in an intuitive comparative format, thereby facilitating a systems approach to investigations into the basis of adaptation and phenotypic diversity in the Solanaceae family, other species in the Asterid clade such as coffee (Coffea arabica), Rubiaceae, and beyond. PMID:16010005

  18. Resourcing the Training and Development Function. IES Report.

    ERIC Educational Resources Information Center

    Carter, A.; Hirsh, W.; Aston, J.

    A study explored current practice in organizing and resourcing training and development (T&D) using survey responses from over 100 major private and public sector employers and case studies of T&D functions in 6 organizations. Business drivers for T&D were senior management as customers; diagnosis of training as "the solution;" individual…

  19. NCBI GEO: archive for high-throughput functional genomic data.

    PubMed

    Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Rudnev, Dmitry; Evangelista, Carlos; Kim, Irene F; Soboleva, Alexandra; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Muertter, Rolf N; Edgar, Ron

    2009-01-01

    The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest public repository for high-throughput gene expression data. Additionally, GEO hosts other categories of high-throughput functional genomic data, including those that examine genome copy number variations, chromatin structure, methylation status and transcription factor binding. These data are generated by the research community using high-throughput technologies like microarrays and, more recently, next-generation sequencing. The database has a flexible infrastructure that can capture fully annotated raw and processed data, enabling compliance with major community-derived scientific reporting standards such as 'Minimum Information About a Microarray Experiment' (MIAME). In addition to serving as a centralized data storage hub, GEO offers many tools and features that allow users to effectively explore, analyze and download expression data from both gene-centric and experiment-centric perspectives. This article summarizes the GEO repository structure, content and operating procedures, as well as recently introduced data mining features. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.

  20. Sugarcane Functional Genomics: Gene Discovery for Agronomic Trait Development

    PubMed Central

    Menossi, M.; Silva-Filho, M. C.; Vincentz, M.; Van-Sluys, M.-A.; Souza, G. M.

    2008-01-01

    Sugarcane is a highly productive crop used for centuries as the main source of sugar and recently to produce ethanol, a renewable bio-fuel energy source. There is increased interest in this crop due to the impending need to decrease fossil fuel usage. Sugarcane has a highly polyploid genome. Expressed sequence tag (EST) sequencing has significantly contributed to gene discovery and expression studies used to associate function with sugarcane genes. A significant amount of data exists on regulatory events controlling responses to herbivory, drought, and phosphate deficiency, which cause important constraints on yield and on endophytic bacteria, which are highly beneficial. The means to reduce drought, phosphate deficiency, and herbivory by the sugarcane borer have a negative impact on the environment. Improved tolerance for these constraints is being sought. Sugarcane's ability to accumulate sucrose up to 16% of its culm dry weight is a challenge for genetic manipulation. Genome-based technology such as cDNA microarray data indicates genes associated with sugar content that may be used to develop new varieties improved for sucrose content or for traits that restrict the expansion of the cultivated land. The genes can also be used as molecular markers of agronomic traits in traditional breeding programs. PMID:18273390

  1. Genome, Functional Gene Annotation, and Nuclear Transformation of the Heterokont Oleaginous Alga Nannochloropsis oceanica CCMP1779

    PubMed Central

    Tsai, Chia-Hong; Bullard, Blair; Cornish, Adam J.; Harvey, Christopher; Reca, Ida-Barbara; Thornburg, Chelsea; Achawanantakun, Rujira; Buehl, Christopher J.; Campbell, Michael S.; Cavalier, David; Childs, Kevin L.; Clark, Teresa J.; Deshpande, Rahul; Erickson, Erika; Armenia Ferguson, Ann; Handee, Witawas; Kong, Que; Li, Xiaobo; Liu, Bensheng; Lundback, Steven; Peng, Cheng; Roston, Rebecca L.; Sanjaya; Simpson, Jeffrey P.; TerBush, Allan; Warakanont, Jaruswan; Zäuner, Simone; Farre, Eva M.; Hegg, Eric L.; Jiang, Ning; Kuo, Min-Hao; Lu, Yan; Niyogi, Krishna K.; Ohlrogge, John; Osteryoung, Katherine W.; Shachar-Hill, Yair; Sears, Barbara B.; Sun, Yanni; Takahashi, Hideki; Yandell, Mark; Shiu, Shin-Han; Benning, Christoph

    2012-01-01

    Unicellular marine algae have promise for providing sustainable and scalable biofuel feedstocks, although no single species has emerged as a preferred organism. Moreover, adequate molecular and genetic resources prerequisite for the rational engineering of marine algal feedstocks are lacking for most candidate species. Heterokonts of the genus Nannochloropsis naturally have high cellular oil content and are already in use for industrial production of high-value lipid products. First success in applying reverse genetics by targeted gene replacement makes Nannochloropsis oceanica an attractive model to investigate the cell and molecular biology and biochemistry of this fascinating organism group. Here we present the assembly of the 28.7 Mb genome of N. oceanica CCMP1779. RNA sequencing data from nitrogen-replete and nitrogen-depleted growth conditions support a total of 11,973 genes, of which in addition to automatic annotation some were manually inspected to predict the biochemical repertoire for this organism. Among others, more than 100 genes putatively related to lipid metabolism, 114 predicted transcription factors, and 109 transcriptional regulators were annotated. Comparison of the N. oceanica CCMP1779 gene repertoire with the recently published N. gaditana genome identified 2,649 genes likely specific to N. oceanica CCMP1779. Many of these N. oceanica–specific genes have putative orthologs in other species or are supported by transcriptional evidence. However, because similarity-based annotations are limited, functions of most of these species-specific genes remain unknown. Aside from the genome sequence and its analysis, protocols for the transformation of N. oceanica CCMP1779 are provided. The availability of genomic and transcriptomic data for Nannochloropsis oceanica CCMP1779, along with efficient transformation protocols, provides a blueprint for future detailed gene functional analysis and genetic engineering of Nannochloropsis species by a growing

  2. Draft genome sequence of chickpea (Cicer arietinum) provides a resource for trait improvement.

    PubMed

    Varshney, Rajeev K; Song, Chi; Saxena, Rachit K; Azam, Sarwar; Yu, Sheng; Sharpe, Andrew G; Cannon, Steven; Baek, Jongmin; Rosen, Benjamin D; Tar'an, Bunyamin; Millan, Teresa; Zhang, Xudong; Ramsay, Larissa D; Iwata, Aiko; Wang, Ying; Nelson, William; Farmer, Andrew D; Gaur, Pooran M; Soderlund, Carol; Penmetsa, R Varma; Xu, Chunyan; Bharti, Arvind K; He, Weiming; Winter, Peter; Zhao, Shancen; Hane, James K; Carrasquilla-Garcia, Noelia; Condie, Janet A; Upadhyaya, Hari D; Luo, Ming-Cheng; Thudi, Mahendar; Gowda, C L L; Singh, Narendra P; Lichtenzveig, Judith; Gali, Krishna K; Rubio, Josefa; Nadarajan, N; Dolezel, Jaroslav; Bansal, Kailash C; Xu, Xun; Edwards, David; Zhang, Gengyun; Kahl, Guenter; Gil, Juan; Singh, Karam B; Datta, Swapan K; Jackson, Scott A; Wang, Jun; Cook, Douglas R

    2013-03-01

    Chickpea (Cicer arietinum) is the second most widely grown legume crop after soybean, accounting for a substantial proportion of human dietary nitrogen intake and playing a crucial role in food security in developing countries. We report the ∼738-Mb draft whole genome shotgun sequence of CDC Frontier, a kabuli chickpea variety, which contains an estimated 28,269 genes. Resequencing and analysis of 90 cultivated and wild genotypes from ten countries identifies targets of both breeding-associated genetic sweeps and breeding-associated balancing selection. Candidate genes for disease resistance and agronomic traits are highlighted, including traits that distinguish the two main market classes of cultivated chickpea: desi and kabuli. These data comprise a resource for chickpea improvement through molecular breeding and provide insights into both genome diversity and domestication.

  3. Development of Genomic Resources for a thraustochytrid Pathogen and Investigation of Temperature Influences on Gene Expression

    PubMed Central

    Garcia-Vedrenne, Ana Elisa; Groner, Maya; Page-Karjian, Annie; Siegmund, Gregor-Fausto; Singhal, Sonia; Sziklay, Jamie; Roberts, Steven

    2013-01-01

    Understanding how environmental changes influence the pathogenicity and virulence of infectious agents is critical for predicting epidemiological patterns of disease. Thraustochytrids, part of the larger taxonomic class Labyrinthulomycetes, contain several highly pathogenic species, including the hard clam pathogen quahog parasite unknown (QPX). QPX has been associated with large-scale mortality events along the northeastern coast of North America. Growth and physiology of QPX is temperature-dependent, and changes in local temperature profiles influence pathogenicity. In this study we characterize the partial genome of QPX and examine the influence of temperature on gene expression. Genes involved in several biological processes are differentially expressed upon temperature change, including those associated with altered growth and metabolism and virulence. The genomic and transcriptomic resources developed in this study provide a foundation for better understanding virulence, pathogenicity and life history of thraustochytrid pathogens. PMID:24069279

  4. GénoPlante-Info (GPI): a collection of databases and bioinformatics resources for plant genomics

    PubMed Central

    Samson, Delphine; Legeai, Fabrice; Karsenty, Emmanuelle; Reboux, Sébastien; Veyrieras, Jean-Baptiste; Just, Jeremy; Barillot, Emmanuel

    2003-01-01

    Génoplante is a partnership program between public French institutes (INRA, CIRAD, IRD and CNRS) and private companies (Biogemma, Bayer CropScience and Bioplante) that aims at developing genome analysis programs for crop species (corn, wheat, rapeseed, sunflower and pea) and model plants (Arabidopsis and rice). The outputs of these programs form a wealth of information (genomic sequence, transcriptome, proteome, allelic variability, mapping and synteny, and mutation data) and tools (databases, interfaces, analysis software), that are being integrated and made public at the public bioinformatics resource centre of Génoplante: GénoPlante-Info (GPI). This continuous flood of data and tools is regularly updated and will grow continuously during the coming two years. Access to the GPI databases and tools is available at http://genoplante-info.infobiogen.fr/. PMID:12519976

  5. Genome sequence and functional genomic analysis of the oil-degrading bacterium Oleispira antarctica

    PubMed Central

    Kube, Michael; Chernikova, Tatyana N.; Al-Ramahi, Yamal; Beloqui, Ana; Lopez-Cortez, Nieves; Guazzaroni, María-Eugenia; Heipieper, Hermann J.; Klages, Sven; Kotsyurbenko, Oleg R.; Langer, Ines; Nechitaylo, Taras Y.; Lünsdorf, Heinrich; Fernández, Marisol; Juárez, Silvia; Ciordia, Sergio; Singer, Alexander; Kagan, Olga; Egorova, Olga; Alain Petit, Pierre; Stogios, Peter; Kim, Youngchang; Tchigvintsev, Anatoli; Flick, Robert; Denaro, Renata; Genovese, Maria; Albar, Juan P.; Reva, Oleg N.; Martínez-Gomariz, Montserrat; Tran, Hai; Ferrer, Manuel; Savchenko, Alexei; Yakunin, Alexander F.; Yakimov, Michail M.; Golyshina, Olga V.; Reinhardt, Richard; Golyshin, Peter N.

    2013-01-01

    Ubiquitous bacteria from the genus Oleispira drive oil degradation in the largest environment on Earth, the cold and deep sea. Here we report the genome sequence of Oleispira antarctica and show that compared with Alcanivorax borkumensis—the paradigm of mesophilic hydrocarbonoclastic bacteria—O. antarctica has a larger genome that has witnessed massive gene-transfer events. We identify an array of alkane monooxygenases, osmoprotectants, siderophores and micronutrient-scavenging pathways. We also show that at low temperatures, the main protein-folding machine Cpn60 functions as a single heptameric barrel that uses larger proteins as substrates compared with the classical double-barrel structure observed at higher temperatures. With 11 protein crystal structures, we further report the largest set of structures from one psychrotolerant organism. The most common structural feature is an increased content of surface-exposed negatively charged residues compared to their mesophilic counterparts. Our findings are relevant in the context of microbial cold-adaptation mechanisms and the development of strategies for oil-spill mitigation in cold environments. PMID:23877221

  6. Human Ageing Genomic Resources: Integrated databases and tools for the biology and genetics of ageing

    PubMed Central

    Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E.; de Magalhães, João Pedro

    2013-01-01

    The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR now features several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology. PMID:23193293

  7. Human Ageing Genomic Resources: integrated databases and tools for the biology and genetics of ageing.

    PubMed

    Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E; de Magalhães, João Pedro

    2013-01-01

    The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR now features several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology.

  8. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    SciTech Connect

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  9. A System for Dosage-Based Functional Genomics in Poplar

    PubMed Central

    2015-01-01

    Altering gene dosage through variation in gene copy number is a powerful approach to addressing questions regarding gene regulation, quantitative trait loci, and heterosis, but one that is not easily applied to sexually transmitted species. Elite poplar (Populus spp) varieties are created through interspecific hybridization, followed by clonal propagation. Altered gene dosage relationships are believed to contribute to hybrid performance. Clonal propagation allows for replication and maintenance of meiotically unstable ploidy or structural variants and provides an alternative approach to investigating gene dosage effects not possible in sexually propagated species. Here, we built a genome-wide structural variation system for dosage-based functional genomics and breeding of poplar. We pollinated Populus deltoides with gamma-irradiated Populus nigra pollen to produce >500 F1 seedlings containing dosage lesions in the form of deletions and insertions of chromosomal segments (indel mutations). Using high-precision dosage analysis, we detected indel mutations in ∼55% of the progeny. These indels varied in length, position, and number per individual, cumulatively tiling >99% of the genome, with an average of 10 indels per gene. Combined with future phenotype and transcriptome data, this population will provide an excellent resource for creating and characterizing dosage-based variation in poplar, including the contribution of dosage to quantitative traits and heterosis. PMID:26320226

  10. Identification of novel biomass-degrading enzymes from genomic dark matter: Populating genomic sequence space with functional annotation.

    PubMed

    Piao, Hailan; Froula, Jeff; Du, Changbin; Kim, Tae-Wan; Hawley, Erik R; Bauer, Stefan; Wang, Zhong; Ivanova, Nathalia; Clark, Douglas S; Klenk, Hans-Peter; Hess, Matthias

    2014-08-01

    Although recent nucleotide sequencing technologies have significantly enhanced our understanding of microbial genomes, the function of ∼35% of genes identified in a genome currently remains unknown. To improve the understanding of microbial genomes and consequently of microbial processes it will be crucial to assign a function to this "genomic dark matter." Due to the urgent need for additional carbohydrate-active enzymes for improved production of transportation fuels from lignocellulosic biomass, we screened the genomes of more than 5,500 microorganisms for hypothetical proteins that are located in the proximity of already known cellulases. We identified, synthesized and expressed a total of 17 putative cellulase genes with insufficient sequence similarity to currently known cellulases to be identified as such using traditional sequence annotation techniques that rely on significant sequence similarity. The recombinant proteins of the newly identified putative cellulases were subjected to enzymatic activity assays to verify their hydrolytic activity towards cellulose and lignocellulosic biomass. Eleven (65%) of the tested enzymes had significant activity towards at least one of the substrates. This high success rate highlights that a gene context-based approach can be used to assign function to genes that are otherwise categorized as "genomic dark matter" and to identify biomass-degrading enzymes that have little sequence similarity to already known cellulases. The ability to assign function to genes that have no related sequence representatives with functional annotation will be important to enhance our understanding of microbial processes and to identify microbial proteins for a wide range of applications.
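
    The gene context-based screen described above can be illustrated with a minimal sketch. It assumes each genome is available as an ordered list of (gene identifier, annotation) pairs per contig; the function name `candidates_near_known`, the keyword matching and the `window` of two neighbouring genes are illustrative assumptions, not the authors' actual pipeline or parameters.

```python
from typing import List, Tuple

Gene = Tuple[str, str]  # (gene identifier, annotation text), in genomic order


def candidates_near_known(
    genes: List[Gene],
    known_keyword: str = "cellulase",  # assumption: simple keyword match on the annotation
    window: int = 2,                   # assumption: "proximity" = within 2 genes on the contig
) -> List[str]:
    """Return hypothetical proteins located within `window` genes of a known enzyme."""
    known_positions = [
        i for i, (_, annotation) in enumerate(genes)
        if known_keyword in annotation.lower()
    ]
    found = []
    for i, (gene_id, annotation) in enumerate(genes):
        if "hypothetical protein" not in annotation.lower():
            continue
        if any(abs(i - k) <= window for k in known_positions):
            found.append(gene_id)
    return found


# Made-up toy contig, for illustration only.
contig = [
    ("g1", "hypothetical protein"),
    ("g2", "ABC transporter"),
    ("g3", "hypothetical protein"),
    ("g4", "cellulase"),
    ("g5", "hypothetical protein"),
    ("g6", "ribosomal protein L2"),
    ("g7", "hypothetical protein"),
]

print(candidates_near_known(contig))  # ['g3', 'g5']

# Success rate reported in the abstract: 11 active enzymes out of 17 tested.
print(round(100 * 11 / 17))  # 65
```

    Only g3 and g5 are returned because they lie within two positions of the annotated cellulase g4; g1 and g7 are three genes away and fall outside the assumed window.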

  11. Biodiversity and functional genomics in the human microbiome.

    PubMed

    Morgan, Xochitl C; Segata, Nicola; Huttenhower, Curtis

    2013-01-01

    Over the course of our lives, humans are colonized by a tremendous diversity of commensal microbes, which comprise the human microbiome. The collective genetic potential (metagenome) of the human microbiome is orders of magnitude more than the human genome, and it profoundly affects human health and disease in ways we are only beginning to understand. Advances in computing and high-throughput sequencing have enabled population-level surveys such as MetaHIT and the recently released Human Microbiome Project, detailed investigations of the microbiome in human disease, and mechanistic studies employing gnotobiotic model organisms. The resulting knowledge of human microbiome composition, function, and range of variation across multiple body sites has begun to assemble a rich picture of commensal host-microbe and microbe-microbe interactions as well as their roles in human health and disease and their potential as diagnostic and therapeutic tools.

  12. Comparative genomics of pectinacetylesterases: Insight on function and biology

    PubMed Central

    de Souza, Amancio José; Pauly, Markus

    2015-01-01

    Pectin acetylation influences the gelling ability of this important plant polysaccharide for the food industry. Plant apoplastic pectinacetylesterases (PAEs) play a key role in regulating the degree of pectin acetylation, and modifying their expression thus represents one way to engineer plant polysaccharides for food applications. Identifying the major active enzymes within the PAE gene family will aid in our understanding of this biological phenomenon as well as provide the tools for direct trait manipulation. Using comparative genomics we propose that there is a minimal set of 4 distinct PAEs in plants. Possible functional diversification of the PAE family in the grasses is also explored with the identification of 3 groups of PAE genes specific to grasses. PMID:26237162

  13. Biodiversity and Functional Genomics in the Human Microbiome

    PubMed Central

    Morgan, Xochitl C.; Segata, Nicola; Huttenhower, Curtis

    2012-01-01

    Over the course of our lives, humans are colonized by a tremendous diversity of commensal microbes, which comprise the human microbiome. The collective genetic potential (metagenome) of the human microbiome is orders of magnitude more than the human genome, and it profoundly affects human health and disease in ways we are only beginning to understand. Advances in computing and high-throughput sequencing have enabled population-level surveys such as MetaHIT and the recently-released Human Microbiome Project, detailed investigations of the microbiome in human disease, and mechanistic studies employing gnotobiotic model organisms. The resulting knowledge of human microbiome composition, function, and range of variation across multiple body sites has begun to assemble a rich picture of commensal host-microbe and microbe-microbe interactions as well as their roles in human health and disease and their potential as diagnostic and therapeutic tools. PMID:23140990

  14. Functional genomics of tomato in a post-genome-sequencing phase

    PubMed Central

    Aoki, Koh; Ogata, Yoshiyuki; Igarashi, Kaori; Yano, Kentaro; Nagasaki, Hideki; Kaminuma, Eli; Toyoda, Atsushi

    2013-01-01

    Completion of the tomato genome sequencing project has broad impacts on genetic and genomic studies of tomato and Solanaceae plants. The reference genome sequence derived from Solanum lycopersicum cv ‘Heinz 1706’ serves as the firm basis for sequencing-based approaches to tomato genomics. In this article, we first present a brief summary of the genome sequencing project and a summary of the reference genome sequence. We then focus on recent progress in transcriptome sequencing and small RNA sequencing and show how the reference genome sequence makes these analyses more comprehensive than before. We discuss the potential of in-depth analysis that is based on DNA methylome sequencing and transcription start-site detection. Finally, we describe the current status of efforts to resequence S. lycopersicum cultivars to demonstrate how resequencing can allow the use of intraspecific genomic diversity for detailed phenotyping and breeding. PMID:23641177

  15. Functional genomics of the horn fly, Haematobia irritans (Linnaeus, 1758)

    PubMed Central

    2011-01-01

    Background: The horn fly, Haematobia irritans (Linnaeus, 1758) (Diptera: Muscidae) is one of the most important ectoparasites of pastured cattle. Horn fly infestations reduce cattle weight gain and milk production. Additionally, horn flies are mechanical vectors of different pathogens that cause disease in cattle. The aim of this study was to conduct a functional genomics study in female horn flies using Expressed Sequence Tags (EST) analysis and RNA interference (RNAi). Results: A cDNA library was made from whole abdominal tissues collected from partially fed adult female horn flies. High quality horn fly ESTs (2,160) were sequenced and assembled into 992 unigenes (178 contigs and 814 singlets) representing molecular functions such as serine proteases, cell metabolism, mitochondrial function, transcription and translation, transport, chromatin structure, vitellogenesis, cytoskeleton, DNA replication, cell response to stress and infection, cell proliferation and cell-cell interactions, intracellular trafficking and secretion, and development. Functional analyses were conducted using RNAi for the first time in horn flies. Gene knockdown by RNAi resulted in higher horn fly mortality (protease inhibitor functional group), reduced oviposition (vitellogenin, ferritin and vATPase groups) or both (immune response and 5'-NUC groups) when compared to controls. Silencing of ubiquitination ESTs did not affect horn fly mortality and oviposition while gene knockdown in the ferritin and vATPase functional groups reduced mortality when compared to controls. Conclusions: These results advanced the molecular characterization of this important ectoparasite and suggested candidate protective antigens for the development of vaccines for the control of horn fly infestations. PMID:21310032

  16. Functional cis-regulatory genomics for systems biology

    PubMed Central

    Nam, Jongmin; Dong, Ping; Tarpine, Ryan; Istrail, Sorin; Davidson, Eric H.

    2010-01-01

    Gene expression is controlled by interactions between trans-regulatory factors and cis-regulatory DNA sequences, and these interactions constitute the essential functional linkages of gene regulatory networks (GRNs). Validation of GRN models requires experimental cis-regulatory tests of predicted linkages to authenticate their identities and proposed functions. However, cis-regulatory analysis is, at present, at a severe bottleneck in genomic system biology because of the demanding experimental methodologies currently in use for discovering cis-regulatory modules (CRMs), in the genome, and for measuring their activities. Here we demonstrate a high-throughput approach to both discovery and quantitative characterization of CRMs. The unique aspect is use of DNA sequence tags to “barcode” CRM expression constructs, which can then be mixed, injected together into sea urchin eggs, and subsequently deconvolved. This method has increased the rate of cis-regulatory analysis by >100-fold compared with conventional one-by-one reporter assays. The utility of the DNA-tag reporters was demonstrated by the rapid discovery of 81 active CRMs from 37 previously unexplored sea urchin genes. We then obtained simultaneous high-resolution temporal characterization of the regulatory activities of more than 80 CRMs. On average 2–3 CRMs were discovered per gene. Comparison of endogenous gene expression profiles with those of the CRMs recovered from each gene showed that, for most cases, at least one CRM is active in each phase of endogenous expression, suggesting that CRM recovery was comprehensive. This approach will qualitatively alter the practice of GRN construction as well as validation, and will impact many additional areas of regulatory system biology. PMID:20142491
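
    The deconvolution step of the barcoded reporter assay described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: each sequencing read has already been reduced to its DNA tag, the tag-to-CRM table is known, and activity is normalized by the amount of injected construct DNA. The normalization rule and all names (`deconvolve`, `normalized_activity`, the tag sequences) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, Iterable


def deconvolve(tags: Iterable[str], tag_to_crm: Dict[str, str]) -> Counter:
    """Count reads per CRM construct; tags not in the table are ignored."""
    counts: Counter = Counter()
    for tag in tags:
        crm = tag_to_crm.get(tag)
        if crm is not None:
            counts[crm] += 1
    return counts


def normalized_activity(rna_counts: Dict[str, int],
                        dna_counts: Dict[str, int]) -> Dict[str, float]:
    """Assumed normalization: expressed-tag count divided by incorporated-DNA count."""
    return {
        crm: rna_counts.get(crm, 0) / dna
        for crm, dna in dna_counts.items()
        if dna > 0
    }


# Made-up tags and counts, for illustration only.
tag_to_crm = {"AAC": "crmA", "GGT": "crmB"}
rna = deconvolve(["AAC", "GGT", "AAC", "TTT"], tag_to_crm)
dna = {"crmA": 4, "crmB": 1}

print(dict(rna))                      # {'crmA': 2, 'crmB': 1}
print(normalized_activity(rna, dna))  # {'crmA': 0.5, 'crmB': 1.0}
```

    Because every construct carries its own tag, many constructs can be mixed in a single injection and separated afterwards by counting, which is the property the abstract credits for the increase in throughput.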

  17. Combined use of genetic and genomics resources to understand virus resistance and fruit quality traits in melon.

    PubMed

    Argyris, Jason M; Pujol, Marta; Martín-Hernández, Ana Montserrat; Garcia-Mas, Jordi

    2015-09-01

    The availability of the genome sequence of many crop species during the past few years has opened a new era in plant biology, allowing for the performance of massive genomic studies in plant species other than the classical models Arabidopsis and rice. One of these crop species is melon (Cucumis melo), a cucurbit of high economic value that has become an interesting model for the study of biological processes such as fruit ripening, sex determination and phloem transport. The recent availability of the melon genome sequence, together with a number of genetic and genomic resources, provides powerful tools that can be used to assist in the main melon breeding targets, namely disease resistance and fruit quality. In this review, we will describe recent data obtained combining the use of a melon near isogenic line (NIL) population and genomic resources to gain insight into agronomically important traits such as fruit ripening, resistance to Cucumber Mosaic virus (CMV) and the accumulation of sugars in fruits.

  18. Functional Analysis of Shewanella, a cross genome comparison.

    SciTech Connect

    Serres, Margrethe H.

    2009-05-15

    The bacterial genus Shewanella includes a group of highly versatile organisms that have successfully adapted to life in many environments ranging from aquatic (fresh and marine) to sedimentary (lake and marine sediments, subsurface sediments, sea vent). A unique respiratory capability of the Shewanellas, initially observed for Shewanella oneidensis MR-1, is the ability to use metals and metalloids, including radioactive compounds, as electron acceptors. Members of the Shewanella genus have also been shown to degrade environmental pollutants i.e. halogenated compounds, making this group highly applicable for the DOE mission. S. oneidensis MR-1 has in addition been found to utilize a diverse set of nutrients and to have a large set of genes dedicated to regulation and to sensing of the environment. The sequencing of the S. oneidensis MR-1 genome facilitated experimental and bioinformatics analyses by a group of collaborating researchers, the Shewanella Federation. Through the joint effort and with support from Department of Energy S. oneidensis MR-1 has become a model organism of study. Our work has been a functional analysis of S. oneidensis MR-1, both by itself and as part of a comparative study. We have improved the annotation of gene products, assigned metabolic functions, and analyzed protein families present in S. oneidensis MR-1. The data has been applied to analysis of experimental data (i.e. gene expression, proteome) generated for S. oneidensis MR-1. Further, this work has formed the basis for a comparative study of over 20 members of the Shewanella genus. The species and strains selected for genome sequencing represented an evolutionary gradient of DNA relatedness, ranging from close to intermediate, and to distant. The organisms selected have also adapted to a variety of ecological niches. Through our work we have been able to detect and interpret genome similarities and differences between members of the genus. We have in this way contributed to the

  19. Plant Ion Channels: Gene Families, Physiology, and Functional Genomics Analyses

    PubMed Central

    Ward, John M.; Mäser, Pascal; Schroeder, Julian I.

    2016-01-01

    Distinct potassium, anion, and calcium channels in the plasma membrane and vacuolar membrane of plant cells have been identified and characterized by patch clamping. Primarily owing to advances in Arabidopsis genetics and genomics, and yeast functional complementation, many of the corresponding genes have been identified. Recent advances in our understanding of ion channel genes that mediate signal transduction and ion transport are discussed here. Some plant ion channels, for example, ALMT and SLAC anion channel subunits, are unique. The majority of plant ion channel families exhibit homology to animal genes; such families include both hyperpolarization- and depolarization-activated Shaker-type potassium channels, CLC chloride transporters/channels, cyclic nucleotide–gated channels, and ionotropic glutamate receptor homologs. These plant ion channels offer unique opportunities to analyze the structural mechanisms and functions of ion channels. Here we review gene families of selected plant ion channel classes and discuss unique structure-function aspects and their physiological roles in plant cell signaling and transport. PMID:18842100

  20. GIANT API: an application programming interface for functional genomics

    PubMed Central

    Roberts, Andrew M.; Wong, Aaron K.; Fisk, Ian; Troyanskaya, Olga G.

    2016-01-01

    GIANT API provides biomedical researchers programmatic access to tissue-specific and global networks in humans and model organisms, and associated tools, which includes functional re-prioritization of existing genome-wide association study (GWAS) data. Using tissue-specific interaction networks, researchers are able to predict relationships between genes specific to a tissue or cell lineage, identify the changing roles of genes across tissues and uncover disease-gene associations. Additionally, GIANT API enables computational tools like NetWAS, which leverages tissue-specific networks for re-prioritization of GWAS results. The web services covered by the API include 144 tissue-specific functional gene networks in human, global functional networks for human and six common model organisms and the NetWAS method. GIANT API conforms to the REST architecture, which makes it stateless, cacheable and highly scalable. It can be used by a diverse range of clients including web browsers, command terminals, programming languages and standalone apps for data analysis and visualization. The API is freely available for use at http://giant-api.princeton.edu. PMID:27098035
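
    As a rough illustration of what a stateless REST interface implies for a client, the sketch below builds a self-contained request URL and parses a JSON reply. Only the base URL comes from the abstract. The resource path `networks/kidney`, the query parameters `gene` and `limit`, and the JSON layout with an `edges` list are hypothetical placeholders and should be replaced with the routes and schema given in the actual API documentation.

```python
import json
from typing import List, Tuple
from urllib.parse import urlencode, urljoin

BASE_URL = "http://giant-api.princeton.edu/"  # base URL as given in the abstract


def build_request_url(resource_path: str, **params) -> str:
    """Build a URL that carries every parameter of the request (stateless)."""
    url = urljoin(BASE_URL, resource_path)
    query = urlencode(sorted(params.items()))  # sorted for a stable, cacheable URL
    return f"{url}?{query}" if query else url


def parse_edges(body: str) -> List[Tuple[str, str, float]]:
    """Parse a reply in the ASSUMED schema {"edges": [{"source", "target", "weight"}]}."""
    payload = json.loads(body)
    return [(e["source"], e["target"], e["weight"]) for e in payload["edges"]]


# Hypothetical route and parameters, for illustration only.
print(build_request_url("networks/kidney", gene="GENE_A", limit=10))
# http://giant-api.princeton.edu/networks/kidney?gene=GENE_A&limit=10

# Made-up reply body in the assumed schema.
sample = '{"edges": [{"source": "GENE_A", "target": "GENE_B", "weight": 0.5}]}'
print(parse_edges(sample))  # [('GENE_A', 'GENE_B', 0.5)]
```

    Because the full request is encoded in the URL, the same call can be issued from a browser, a command terminal or any programming language, and identical URLs can be served from a cache.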

  1. GIANT API: an application programming interface for functional genomics.

    PubMed

    Roberts, Andrew M; Wong, Aaron K; Fisk, Ian; Troyanskaya, Olga G

    2016-07-01

    GIANT API provides biomedical researchers programmatic access to tissue-specific and global networks in humans and model organisms, and associated tools, which includes functional re-prioritization of existing genome-wide association study (GWAS) data. Using tissue-specific interaction networks, researchers are able to predict relationships between genes specific to a tissue or cell lineage, identify the changing roles of genes across tissues and uncover disease-gene associations. Additionally, GIANT API enables computational tools like NetWAS, which leverages tissue-specific networks for re-prioritization of GWAS results. The web services covered by the API include 144 tissue-specific functional gene networks in human, global functional networks for human and six common model organisms and the NetWAS method. GIANT API conforms to the REST architecture, which makes it stateless, cacheable and highly scalable. It can be used by a diverse range of clients including web browsers, command terminals, programming languages and standalone apps for data analysis and visualization. The API is freely available for use at http://giant-api.princeton.edu. PMID:27098035

  2. GIANT API: an application programming interface for functional genomics.

    PubMed

    Roberts, Andrew M; Wong, Aaron K; Fisk, Ian; Troyanskaya, Olga G

    2016-07-01

    GIANT API provides biomedical researchers programmatic access to tissue-specific and global networks in humans and model organisms, and associated tools, which includes functional re-prioritization of existing genome-wide association study (GWAS) data. Using tissue-specific interaction networks, researchers are able to predict relationships between genes specific to a tissue or cell lineage, identify the changing roles of genes across tissues and uncover disease-gene associations. Additionally, GIANT API enables computational tools like NetWAS, which leverages tissue-specific networks for re-prioritization of GWAS results. The web services covered by the API include 144 tissue-specific functional gene networks in human, global functional networks for human and six common model organisms and the NetWAS method. GIANT API conforms to the REST architecture, which makes it stateless, cacheable and highly scalable. It can be used by a diverse range of clients including web browsers, command terminals, programming languages and standalone apps for data analysis and visualization. The API is freely available for use at http://giant-api.princeton.edu.

  3. OrtholugeDB: a bacterial and archaeal orthology resource for improved comparative genomic analysis.

    PubMed

    Whiteside, Matthew D; Winsor, Geoffrey L; Laird, Matthew R; Brinkman, Fiona S L

    2013-01-01

    Prediction of orthologs (homologous genes that diverged because of speciation) is an integral component of many comparative genomics methods. Although orthologs are more likely to have similar function versus paralogs (genes that diverged because of duplication), recent studies have shown that their degree of functional conservation is variable. Also, there are inherent problems with several large-scale ortholog prediction approaches. To address these issues, we previously developed Ortholuge, which uses phylogenetic distance ratios to provide more precise ortholog assessments for a set of predicted orthologs. However, the original version of Ortholuge required manual intervention and was not easily accessible; therefore, we now report the development of OrtholugeDB, available online at http://www.pathogenomics.sfu.ca/ortholugedb. OrtholugeDB provides ortholog predictions for completely sequenced bacterial and archaeal genomes from NCBI based on reciprocal best Basic Local Alignment Search Tool hits, supplemented with further evaluation by the more precise Ortholuge method. The OrtholugeDB web interface facilitates user-friendly and flexible ortholog analysis, from single genes to genomes, plus flexible data download options. We compare Ortholuge with similar methods, showing how it may more consistently identify orthologs with conserved features across a wide range of taxonomic distances. OrtholugeDB facilitates rapid, and more accurate, bacterial and archaeal comparative genomic analysis and large-scale ortholog predictions.
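
    The reciprocal-best-hit step mentioned above can be sketched in a few lines. This is a minimal illustration, assuming BLAST results have already been reduced to (query, subject, bit score) triples for each search direction; the function names and the toy scores are made up, ties are resolved by keeping the first hit seen, and the subsequent Ortholuge distance-ratio evaluation is not shown.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query gene, subject gene, bit score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query gene to its single highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        # Strict '>' keeps the first hit seen when scores tie.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Made-up scores for two toy genomes A and B.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0),
          ("a2", "b2", 180.0), ("a3", "b1", 90.0)]
b_vs_a = [("b1", "a1", 195.0), ("b2", "a2", 175.0), ("b2", "a1", 140.0)]

print(reciprocal_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1'), ('a2', 'b2')]
```

    Gene a3 is dropped because its best hit, b1, points back to a1 rather than to a3. A one-directional best hit of this kind is a typical source of the false ortholog calls that a more precise follow-up evaluation is meant to catch.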

  4. Comparative genomics of Geobacter chemotaxis genes reveals diverse signaling function

    PubMed Central

    Tran, Hoa T; Krushkal, Julia; Antommattei, Frances M; Lovley, Derek R; Weis, Robert M

    2008-01-01

    Background: Geobacter species are δ-Proteobacteria and are often the predominant species in a variety of sedimentary environments where Fe(III) reduction is important. Their ability to remediate contaminated environments and produce electricity makes them attractive for further study. Cell motility, biofilm formation, and type IV pili all appear important for the growth of Geobacter in changing environments and for electricity production. Recent studies in other bacteria have demonstrated that signaling pathways homologous to the paradigm established for Escherichia coli chemotaxis can regulate type IV pili-dependent motility, the synthesis of flagella and type IV pili, the production of extracellular matrix material, and biofilm formation. The classification of these pathways by comparative genomics improves the ability to understand how Geobacter thrives in natural environments and better their use in microbial fuel cells. Results: The genomes of G. sulfurreducens, G. metallireducens, and G. uraniireducens contain multiple (~70) homologs of chemotaxis genes arranged in several major clusters (six, seven, and seven, respectively). Unlike the single gene cluster of E. coli, the Geobacter clusters are not all located near the flagellar genes. The probable functions of some Geobacter clusters are assignable by homology to known pathways; others appear to be unique to the Geobacter sp. and contain genes of unknown function. We identified large numbers of methyl-accepting chemotaxis protein (MCP) homologs that have diverse sensing domain architectures and generate a potential for sensing a great variety of environmental signals. We discuss mechanisms for class-specific segregation of the MCPs in the cell membrane, which serve to maintain pathway specificity and diminish crosstalk. Finally, the regulation of gene expression in Geobacter differs from E. coli. The sequences of predicted promoter elements suggest that the alternative sigma factors σ28 and σ54 play a role

  5. Genomic Resources for Water Yam (Dioscorea alata L.): Analyses of EST-Sequences, De Novo Sequencing and GBS Libraries

    PubMed Central

    Saski, Christopher A.; Bhattacharjee, Ranjana; Scheffler, Brian E.; Asiedu, Robert

    2015-01-01

    The reducing cost and rapid progress in next-generation sequencing techniques coupled with high performance computational approaches have resulted in large-scale discovery of advanced genomic resources in several model and non-model plant species. Yam (Dioscorea spp.) is a major food and cash crop in many countries but research efforts have been limited to understand the genetics and generate genomic information for the crop. The availability of a large number of genomic resources including genome-wide molecular markers will accelerate the breeding efforts and application of genomic selection in yams. In the present study, several methods including expressed sequence tags (EST)-sequencing, de novo sequencing, and genotyping-by-sequencing (GBS) profiles on two yam (Dioscorea alata L.) genotypes (TDa 95/00328 and TDa 95-310) were performed to generate genomic resources for use in its improvement programs. This includes a comprehensive set of EST-SSRs, genomic SSRs, whole genome SNPs, and reduced representation SNPs. A total of 1,152 EST-SSRs were developed from >40,000 EST-sequences generated from the two genotypes. A set of 388 EST-SSRs were validated as polymorphic showing a polymorphism rate of 34% when tested on two diverse parents targeted for anthracnose disease. In addition, approximately 40X de novo whole genome sequence coverage was generated for each of the two genotypes, and a total of 18,584 and 15,952 genomic SSRs were identified for TDa 95/00328 and TDa 95-310, respectively. A custom-made pipeline resulted in the selection of 573 genomic SSRs common across the two genotypes, of which only eight failed, 478 being polymorphic and 62 monomorphic indicating a polymorphic rate of 83.5%. Additionally, 288,505 high quality SNPs were also identified between these two genotypes. Genotyping by sequencing reads on these two genotypes also revealed 36,790 overlapping SNP positions that are distributed throughout the genome. Our efforts in using different approaches

  6. Genomic Resources for Water Yam (Dioscorea alata L.): Analyses of EST-Sequences, De Novo Sequencing and GBS Libraries.

    PubMed

    Saski, Christopher A; Bhattacharjee, Ranjana; Scheffler, Brian E; Asiedu, Robert

    2015-01-01

    The falling cost and rapid progress of next-generation sequencing techniques, coupled with high-performance computational approaches, have resulted in large-scale discovery of advanced genomic resources in several model and non-model plant species. Yam (Dioscorea spp.) is a major food and cash crop in many countries, but research efforts to understand its genetics and generate genomic information for the crop have been limited. The availability of a large number of genomic resources, including genome-wide molecular markers, will accelerate breeding efforts and the application of genomic selection in yams. In the present study, several methods, including expressed sequence tag (EST) sequencing, de novo sequencing, and genotyping-by-sequencing (GBS) profiling, were applied to two yam (Dioscorea alata L.) genotypes (TDa 95/00328 and TDa 95-310) to generate genomic resources for use in yam improvement programs. These resources include a comprehensive set of EST-SSRs, genomic SSRs, whole-genome SNPs, and reduced-representation SNPs. A total of 1,152 EST-SSRs were developed from >40,000 EST sequences generated from the two genotypes. A set of 388 EST-SSRs was validated as polymorphic, a polymorphism rate of 34%, when tested on two diverse parents targeted for anthracnose disease. In addition, approximately 40X de novo whole-genome sequence coverage was generated for each of the two genotypes, and a total of 18,584 and 15,952 genomic SSRs were identified for TDa 95/00328 and TDa 95-310, respectively. A custom-made pipeline selected 573 genomic SSRs common to the two genotypes, of which only eight failed, 478 being polymorphic and 62 monomorphic, indicating a polymorphic rate of 83.5%. Additionally, 288,505 high-quality SNPs were identified between these two genotypes. Genotyping-by-sequencing reads from these two genotypes also revealed 36,790 overlapping SNP positions distributed throughout the genome. Our efforts in using different approaches

  8. The SOFG Anatomy Entry List (SAEL): An Annotation Tool for Functional Genomics Data

    PubMed Central

    Parkinson, Helen; Aitken, Stuart; Baldock, Richard A.; Bard, Jonathan B. L.; Burger, Albert; Hayamizu, Terry F.; Rector, Alan; Ringwald, Martin; Rogers, Jeremy; Rosse, Cornelius; Stoeckert, Christian J.

    2004-01-01

    A great deal of data in functional genomics studies needs to be annotated with low-resolution anatomical terms. For example, gene expression assays based on manually dissected samples (microarray, SAGE, etc.) need high-level anatomical terms to describe sample origin. First-pass annotation in high-throughput assays (e.g. large-scale in situ gene expression screens or phenotype screens) and bibliographic applications, such as selection of keywords, would also benefit from a minimum set of standard anatomical terms. Although only simple terms are required, the researcher faces serious practical problems of inconsistency and confusion, given the different aims and the range of complexity of existing anatomy ontologies. A Standards and Ontologies for Functional Genomics (SOFG) group therefore initiated discussions between several of the major anatomical ontologies for higher vertebrates. As we report here, one result of these discussions is a simple, accessible, controlled vocabulary of gross anatomical terms, the SOFG Anatomy Entry List (SAEL). The SAEL is available from http://www.sofg.org and is intended as a resource for biologists, curators, bioinformaticians and developers of software supporting functional genomics. It can be used directly for annotation in the contexts described above. Importantly, each term is linked to the corresponding term in each of the major anatomy ontologies. Where the simple list does not provide enough detail or sophistication, therefore, the researcher can use the SAEL to choose the appropriate ontology and move directly to the relevant term as an entry point. The SAEL links will also be used to support computational access to the respective ontologies. PMID:18629134

  9. Biodegradation of DDT by Stenotrophomonas sp. DDT-1: Characterization and genome functional analysis

    PubMed Central

    Pan, Xiong; Lin, Dunli; Zheng, Yuan; Zhang, Qian; Yin, Yuanming; Cai, Lin; Fang, Hua; Yu, Yunlong

    2016-01-01

    A novel bacterium capable of utilizing 1,1,1-trichloro-2,2-bis(p-chlorophenyl)ethane (DDT) as its sole carbon and energy source was isolated from contaminated soil and identified as Stenotrophomonas sp. DDT-1 based on morphological characteristics, BIOLOG GN2 microplate profile, and 16S rDNA phylogeny. Genome sequencing and functional annotation of isolate DDT-1 showed a genome size of 4,514,569 bp, 66.92% GC content, 4,033 protein-coding genes, and 76 RNA genes, including 8 rRNA genes. In total, 2,807 protein-coding genes were assigned to Clusters of Orthologous Groups (COGs), and 1,601 protein-coding genes were mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The degradation half-life of DDT increased with substrate concentration from 0.1 to 10.0 mg/l and decreased with temperature from 15 °C to 35 °C. Neutral conditions were the most favorable for DDT biodegradation. Based on genome annotation of DDT degradation genes and the metabolites detected by GC-MS, a mineralization pathway was proposed for DDT biodegradation in which DDT is sequentially converted into DDE/DDD, DDMU, DDOH, and DDA via dechlorination, hydroxylation, and carboxylation, and ultimately mineralized to carbon dioxide. The results indicate that isolate DDT-1 is a promising bacterial resource for the removal or detoxification of DDT residues in the environment. PMID:26888254

  10. Limited resources of genome sequencing in developing countries: Challenges and solutions.

    PubMed

    Helmy, Mohamed; Awad, Mohamed; Mosa, Kareem A

    2016-06-01

    Differences between countries in national income, growth, human development and many other factors are used to classify countries as developed or developing. There are several classification systems that use different sets of measures and criteria. The most common are the United Nations (UN) and World Bank (WB) systems. The UN system uses the UN Human Development Index (HDI), an indicator that combines statistics on life expectancy, education, and income per capita. The WB system, by contrast, uses gross national income (GNI) per capita, calculated using the World Bank Atlas method. According to the UN and WB classification systems, there are 151 and 134 developing countries, respectively, with 89% overlap between the two systems. Developing countries have limited human development and limited expenditure on education and research, among several other limitations. The biggest challenge facing genomic researchers and clinicians is limited resources. As a result, genomic tools, specifically genome sequencing technologies, which are rapidly becoming indispensable, are not widely available. In this report, we explore the current status of sequencing technologies in developing countries, describe the associated challenges and emphasize potential solutions. PMID:27354935

  12. ERP (enterprise resource planning) systems can streamline healthcare business functions.

    PubMed

    Jenkins, E K; Christenson, E

    2001-05-01

    Enterprise resource planning (ERP) software applications are designed to facilitate the systemwide integration of complex processes and functions across a large enterprise consisting of many internal and external constituents. Although most currently available ERP applications generally are tailored to the needs of the manufacturing industry, many large healthcare systems are investigating these applications. Due to the significant differences between manufacturing and patient care, ERP-based systems do not easily translate to the healthcare setting. In particular, the lack of clinical standardization impedes the use of ERP systems for clinical integration. Nonetheless, an ERP-based system can help a healthcare organization integrate many functions, including patient scheduling, human resources management, workload forecasting, and management of workflow, that are not directly dependent on clinical decision making. PMID:11351810

  13. Metabolomic Functional Analysis of Bacterial Genomes: Final Report

    SciTech Connect

    Arp, Daniel J; Sayavedra-Soto, Luis A

    2008-01-01

    The availability of the complete DNA sequence of the bacterial genome of Nitrosomonas europaea offered the opportunity for unprecedented and detailed investigations of function. We studied the function of genes involved in carbohydrate and Fe metabolism. N. europaea has genes for the synthesis and degradation of glycogen and sucrose but cannot grow on substrates other than ammonia and CO2. Granules of glycogen were detected in whole cells by electron microscopy and quantified in cell-free extracts by enzymatic methods. The cellular glycogen and sucrose content varied depending on the composition of the growth medium and the cellular growth stage. N. europaea also depends heavily on iron for the metabolism of ammonia. It is particularly interesting because it lacks genes for siderophore production and has genes with only low similarity to known iron reductases, yet grows relatively well in medium containing low Fe. By comparing the transcriptomes of cells grown in iron-replete medium versus iron-limited medium, 247 genes were identified as differentially expressed. Mutant strains deficient in genes for sucrose, glycogen and iron metabolism were created and are being used to further our understanding of ammonia-oxidizing bacteria.

  14. Bovine Genome Database: new tools for gleaning function from the Bos taurus genome.

    PubMed

    Elsik, Christine G; Unni, Deepak R; Diesh, Colin M; Tayal, Aditi; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-01

    We report an update of the Bovine Genome Database (BGD) (http://BovineGenome.org). The goal of BGD is to support bovine genomics research by providing genome annotation and data mining tools. We have developed new genome and annotation browsers using JBrowse and WebApollo for two Bos taurus genome assemblies, the reference genome assembly (UMD3.1.1) and the alternate genome assembly (Btau_4.6.1). Annotation tools have been customized to highlight priority genes for annotation, and to aid annotators in selecting gene evidence tracks from 91 tissue specific RNAseq datasets. We have also developed BovineMine, based on the InterMine data warehousing system, to integrate the bovine genome, annotation, QTL, SNP and expression data with external sources of orthology, gene ontology, gene interaction and pathway information. BovineMine provides powerful query building tools, as well as customized query templates, and allows users to analyze and download genome-wide datasets. With BovineMine, bovine researchers can use orthology to leverage the curated gene pathways of model organisms, such as human, mouse and rat. BovineMine will be especially useful for gene ontology and pathway analyses in conjunction with GWAS and QTL studies.

  15. Beyond the DNA: a prototype for functional genomics

    SciTech Connect

    Albala, J

    2000-03-02

    A prototype oligonucleotide "functional chip" has been developed to screen novel DNA repair proteins for their ability to bind or alter different forms of DNA. This chip has been developed as a functional genomics screen for analysis of protein-DNA interactions for novel proteins identified from the Human Genome Project. The process of novel gene identification that has ensued as a consequence of available sequence information is remarkable. The challenge now lies in determining the function of newly identified gene products in a time- and cost-effective, high-throughput manner. The functional chip is generated by the robotic application of DNA spotted in a microarray format onto a glass slide. Individual proteins are then analyzed against the different forms of DNA bound to the slide. Several prototype functional chips were designed to contain various DNA fragments tethered to a glass slide for analysis of protein-DNA binding or enzymatic activity of known proteins. The technology has been developed to screen novel, putative DNA repair proteins for their ability to bind various types of DNA alone and in concert with protein partners. An additional scheme has been devised to screen putative repair enzymes for their ability to process different types of DNA molecules. Current methods to analyze gene expression primarily utilize either of two technologies. The oligonucleotide chip, pioneered by Fodor and co-workers and Affymetrix, Inc., consists of greater than 64,000 oligonucleotides attached in situ to a glass support. The oligonucleotide chip has been used primarily to identify specific mutations in a given gene by hybridization against a fluorescently-labeled substrate. The second method is the microarray, whereby DNA targets are systematically arranged on a glass slide and then hybridized with fluorescently-labeled complex targets for gene expression analysis (Jordan, 1998). By this technique, a large amount of information can be obtained examining global

  16. GeNemo: a search engine for web-based functional genomic data.

    PubMed

    Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

    2016-07-01

    A set of new data types has emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo bases its searches on pattern matching of functional genomic regions, which distinguishes it from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from a hundred bases to a hundred thousand bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, users can visually compare their data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org. PMID:27098038

  18. GeNemo: a search engine for web-based functional genomic data

    PubMed Central

    Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

    2016-01-01

    A set of new data types has emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo bases its searches on pattern matching of functional genomic regions, which distinguishes it from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from a hundred bases to a hundred thousand bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, users can visually compare their data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org. PMID:27098038
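
    The peak/BED region format mentioned in the abstract can be illustrated with a minimal sketch. This is not GeNemo's Markov Chain Monte Carlo search; it is a much simpler, swapped-in illustration that counts the bases shared by two sets of genomic regions. The function names (parse_bed, overlap_bp) and the sample coordinates are hypothetical, and the sketch assumes that regions within each set do not overlap one another.

```python
from typing import List, Tuple

# A region is (chromosome, start, end); BED coordinates are 0-based, half-open.
Region = Tuple[str, int, int]


def parse_bed(text: str) -> List[Region]:
    """Read the first three BED columns, skipping blank, comment and header lines."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def overlap_bp(a: List[Region], b: List[Region]) -> int:
    """Total bases shared between two region sets (simple pairwise scan)."""
    total = 0
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a == chrom_b:
                # Length of the intersection, or 0 when the regions are disjoint.
                total += max(0, min(end_a, end_b) - max(start_a, start_b))
    return total


# Hypothetical example data: two small peak sets.
query = parse_bed("chr1\t100\t200\nchr2\t50\t80\n")
reference = parse_bed("chr1\t150\t300\nchr2\t10\t40\n")
print(overlap_bp(query, reference))  # prints 50
```

    Only the chr1 regions intersect here (positions 150 to 200), so the shared length is 50 bases. A pattern-matching search such as the one the abstract describes would compare the shape of signals across regions rather than raw overlap alone.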

  19. ZikaVR: An Integrated Zika Virus Resource for Genomics, Proteomics, Phylogenetic and Therapeutic Analysis.

    PubMed

    Gupta, Amit Kumar; Kaur, Karambir; Rajput, Akanksha; Dhanda, Sandeep Kumar; Sehgal, Manika; Khan, Md Shoaib; Monga, Isha; Dar, Showkat Ahmad; Singh, Sandeep; Nagpal, Gandharva; Usmani, Salman Sadullah; Thakur, Anamika; Kaur, Gazaldeep; Sharma, Shivangi; Bhardwaj, Aman; Qureshi, Abid; Raghava, Gajendra Pal Singh; Kumar, Manoj

    2016-01-01

    Current Zika virus (ZIKV) outbreaks, which have spread across several areas of Africa, Southeast Asia, and the Pacific islands, have been declared a global health emergency by the World Health Organization (WHO). ZIKV causes Zika fever and illness ranging from severe autoimmune to neurological complications in humans. To facilitate research on this virus, we have developed an integrative multi-omics platform, ZikaVR (http://bioinfo.imtech.res.in/manojk/zikavr/), dedicated to ZIKV genomic, proteomic and therapeutic knowledge. It comprises whole genome sequences and their respective functional information regarding proteins, genes, and structural content. Additionally, it delivers sophisticated analyses such as whole-genome alignments, conservation and variation, CpG islands, codon context, usage bias and phylogenetic inferences at the whole-genome and proteome levels in a user-friendly visual environment. Further, glycosylation sites and molecular diagnostic primers were also analyzed. Most importantly, we also proposed potential therapeutically important constituents, namely vaccine epitopes, siRNAs, miRNAs, sgRNAs and repurposing drug candidates. PMID:27633273

  20. The MiST2 database: a comprehensive genomics resource on microbial signal transduction

    PubMed Central

    Ulrich, Luke E.; Zhulin, Igor B.

    2010-01-01

    The MiST2 database (http://mistdb.com) identifies and catalogs the repertoire of signal transduction proteins in microbial genomes. Signal transduction systems regulate the majority of cellular activities including the metabolism, development, host-recognition, biofilm production, virulence, and antibiotic resistance of human pathogens. Thus, knowledge of the proteins and interactions that comprise these communication networks is an essential component to furthering biomedical discovery. These are identified by searching protein sequences for specific domain profiles that implicate a protein in signal transduction. Compared to the previous version of the database, MiST2 contains a host of new features and improvements including the following: draft genomes; extracytoplasmic function (ECF) sigma factor protein identification; enhanced classification of signaling proteins; novel, high-quality domain models for identifying histidine kinases and response regulators; neighboring two-component genes; gene cart; better search capabilities; enhanced taxonomy browser; advanced genome browser; and a modern, biologist-friendly web interface. MiST2 currently contains 966 complete and 157 draft bacterial and archaeal genomes, which collectively contain more than 245 000 signal transduction proteins. The majority (66%) of these are one-component systems, followed by two-component proteins (26%), chemotaxis (6%), and finally ECF factors (2%). PMID:19900966

  1. ZikaVR: An Integrated Zika Virus Resource for Genomics, Proteomics, Phylogenetic and Therapeutic Analysis

    PubMed Central

    Gupta, Amit Kumar; Kaur, Karambir; Rajput, Akanksha; Dhanda, Sandeep Kumar; Sehgal, Manika; Khan, Md. Shoaib; Monga, Isha; Dar, Showkat Ahmad; Singh, Sandeep; Nagpal, Gandharva; Usmani, Salman Sadullah; Thakur, Anamika; Kaur, Gazaldeep; Sharma, Shivangi; Bhardwaj, Aman; Qureshi, Abid; Raghava, Gajendra Pal Singh; Kumar, Manoj

    2016-01-01

    Current Zika virus (ZIKV) outbreaks, which have spread across several areas of Africa, Southeast Asia, and the Pacific islands, have been declared a global health emergency by the World Health Organization (WHO). ZIKV causes Zika fever and illness ranging from severe autoimmune to neurological complications in humans. To facilitate research on this virus, we have developed an integrative multi-omics platform, ZikaVR (http://bioinfo.imtech.res.in/manojk/zikavr/), dedicated to ZIKV genomic, proteomic and therapeutic knowledge. It comprises whole genome sequences and their respective functional information regarding proteins, genes, and structural content. Additionally, it delivers sophisticated analyses such as whole-genome alignments, conservation and variation, CpG islands, codon context, usage bias and phylogenetic inferences at the whole-genome and proteome levels in a user-friendly visual environment. Further, glycosylation sites and molecular diagnostic primers were also analyzed. Most importantly, we also proposed potential therapeutically important constituents, namely vaccine epitopes, siRNAs, miRNAs, sgRNAs and repurposing drug candidates. PMID:27633273

  3. The non-photosynthetic, pathogenic green alga Helicosporidium sp. has retained a modified, functional plastid genome.

    PubMed

    Tartar, Aurélien; Boucias, Drion G

    2004-04-01

    A fragment of the Helicosporidium sp. (Chlorophyta: Trebouxiophyceae) plastid genome has been sequenced. The genome architecture was compared to that of both a non-photosynthetic relative (Prototheca wickerhamii) and a photosynthetic relative (Chlorella vulgaris). Comparative genomic analysis indicated that Helicosporidium and Prototheca are closely related genera. The analyses also revealed that the Helicosporidium sp. plastid genome has been rearranged. In particular, two ribosomal protein-encoding genes (rpl19 and rps23) appeared to have been transposed, or lost from the Helicosporidium sp. plastid genome. RT-PCR reactions demonstrated that the retained plastid genes were transcribed, suggesting that, despite rearrangement(s), the Helicosporidium sp. plastid genome has remained functional. The modified plastid genome architecture is a novel apomorphy that indicates that the Helicosporidia are highly derived green algae, more so than Prototheca spp. As such, they represent a promising model to study organellar genome reorganizations in parasitic protists.

  4. Functional genomics of seed dormancy in wheat: advances and prospects

    PubMed Central

    Gao, Feng; Ayele, Belay T.

    2014-01-01

    Seed dormancy is a mechanism underlying the inability of viable seeds to germinate under optimal environmental conditions. To achieve rapid and uniform germination, wheat and other cereal crops have been selected against dormancy. As a result, most modern commercial cultivars have a low level of seed dormancy and are susceptible to preharvest sprouting when wet and moist conditions occur prior to harvest. Because it causes substantial loss in grain yield and quality, preharvest sprouting is an ever-present major constraint on wheat production. The significance of the problem emphasizes the need to incorporate an intermediate level of dormancy into elite wheat cultivars, and this requires detailed dissection of the mechanisms underlying the regulation of seed dormancy and preharvest sprouting. Seed dormancy research in wheat often involves after-ripening, a period of dry storage during which seeds lose dormancy, or comparative analysis of seeds derived from dormant and non-dormant cultivars. The increasing development of wheat genomic resources, along with the application of transcriptomics, proteomics, and metabolomics approaches to studying wheat seed dormancy, has extended our knowledge of the mechanisms acting at transcriptional and post-transcriptional levels. Recent progress indicates that some of the molecular mechanisms are associated with hormonal pathways, epigenetic regulation, targeted oxidative modifications of seed mRNAs and proteins, redox regulation of seed protein thiols, and modulation of translational activities. Given that preharvest sprouting is closely associated with seed dormancy, these findings will contribute significantly to the design of efficient strategies for breeding preharvest sprouting-tolerant wheat. PMID:25309557

  5. Genome-wide association study of porcine hematological parameters in a Large White × Minzhu F2 resource population.

    PubMed

    Luo, Weizhen; Chen, Shaokang; Cheng, Duxue; Wang, Ligang; Li, Yong; Ma, Xiaojun; Song, Xin; Liu, Xin; Li, Wen; Liang, Jing; Yan, Hua; Zhao, Kebin; Wang, Chuduan; Wang, Lixian; Zhang, Longchao

    2012-01-01

    Hematological traits, which are important indicators of immune function in animals, have been commonly examined as biomarkers of disease and disease severity in humans and animals. Genome-wide significant quantitative trait loci (QTLs) provide important information for use in breeding programs of animals such as pigs. QTLs for hematological parameters (hematological traits) have been detected in pig chromosomes, although these are often mapped by linkage analysis to large intervals making identification of the underlying mutation problematic. Single nucleotide polymorphisms (SNPs) are the common form of genetic variation among individuals and are thought to account for the majority of inherited traits. In this study, a genome-wide association study (GWAS) was performed to detect regions of association with hematological traits in a three-generation resource population produced by intercrossing Large White boars and Minzhu sows during the period from 2007 to 2011. Illumina PorcineSNP60 BeadChip technology was used to genotype each animal and seven hematological parameters were measured (hematocrit (HCT), hemoglobin (HGB), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), mean corpuscular volume (MCV), red blood cell count (RBC) and red blood cell volume distribution width (RDW)). Data were analyzed in a three step Genome-wide Rapid Association using the Mixed Model and Regression-Genomic Control (GRAMMAR-GC) method. A total of 62 genome-wide significant and three chromosome-wide significant SNPs associated with hematological parameters were detected in this GWAS. Seven and five SNPs were associated with HCT and HGB, respectively. These SNPs were all located within the region of 34.6-36.5 Mb on SSC7. Four SNPs within the region of 43.7-47.0 Mb and fifty-five SNPs within the region of 42.2-73.8 Mb on SSC8 showed significant association with MCH and MCV, respectively. At chromosome-wide significant level, one SNP at 29.2 Mb on SSC1

  6. Mammalian-specific genomic functions: Newly acquired traits generated by genomic imprinting and LTR retrotransposon-derived genes in mammals

    PubMed Central

    KANEKO-ISHINO, Tomoko; ISHINO, Fumitoshi

    2015-01-01

    Mammals, including human beings, have evolved a unique viviparous reproductive system and a highly developed central nervous system. How did these unique characteristics emerge in mammalian evolution, and what kinds of changes occurred in the mammalian genomes as evolution proceeded? A key conceptual term in approaching these issues is “mammalian-specific genomic functions”, a concept covering both mammalian-specific epigenetics and genetics. Genomic imprinting and LTR retrotransposon-derived genes are reviewed as the representative, mammalian-specific genomic functions that are essential not only for the current mammalian developmental system, but also for mammalian evolution itself. First, the essential roles of genomic imprinting in mammalian development, especially related to viviparous reproduction via placental function, as well as the emergence of genomic imprinting in mammalian evolution, are discussed. Second, we introduce the novel concept of “mammalian-specific traits generated by mammalian-specific genes from LTR retrotransposons”, based on the finding that LTR retrotransposons served as a critical driving force in mammalian evolution by generating mammalian-specific genes. PMID:26666304

  7. Epiviz: interactive visual analytics for functional genomics data.

    PubMed

    Chelaru, Florin; Smith, Llewellyn; Goldstein, Naomi; Bravo, Héctor Corrada

    2014-09-01

    Visualization is an integral aspect of genomics data analysis. Algorithmic-statistical analysis and interactive visualization are most effective when used iteratively. Epiviz (http://epiviz.cbcb.umd.edu/), a web-based genome browser, and the Epivizr Bioconductor package allow interactive, extensible and reproducible visualization within a state-of-the-art data-analysis platform.

  8. Generation of RCAS vectors useful for functional genomic analyses.

    PubMed

    Loftus, S K; Larson, D M; Watkins-Chow, D; Church, D M; Pavan, W J

    2001-10-31

    Avian leukosis type A virus-derived retroviral vectors have been used to introduce genes into cells expressing the corresponding avian receptor tv-a. This includes the use of Replication-Competent Avian sarcoma-leukosis virus (ASLV) long terminal repeat (LTR) with Splice acceptor (RCAS) vectors in the analysis of avian development, human and murine cell cultures, murine cell lineage studies and cancer biology. Previously, cloning of genes into this virus was difficult due to the large size of the vector and sparse cloning sites. To overcome some of the disadvantages of traditional cloning using the RCASBP-Y vector, we have modified the RCASBP-Y to incorporate "Gateway" site-specific recombination cloning of genes into the construct, either with or without HA epitope tags. We have found the repetitive "att" sequences, which are the targets for site-specific recombination, do not impair the production of infectious viral particles or the expression of the gene of interest. This is the first instance of site-specific recombination being used to generate retroviral gene constructs. These viral constructs will allow for the efficient transfer and expression of cDNAs needed for functional genomic analyses. PMID:11759842

  9. Bioadhesion in ascidians: a developmental and functional genomics perspective

    PubMed Central

    Pennati, Roberta; Rothbächer, Ute

    2015-01-01

    The development of bioadhesives inspired by marine animals is a promising approach to generate new tissue-compatible medical components. A number of marine species, through their adhesive properties, also represent significant foulers that become increasingly problematic to aquaculture, shipping or local biodiversity. In order to develop more sophisticated man-made glues and/or efficient fouling resistant surfaces, it is important to understand the mechanical, structural and molecular properties of adhesive organs in selected species. Ascidians are marine invertebrates with larvae that opportunistically attach to almost any type of submerged surface to undergo metamorphosis into permanently sessile adults. Not only do they represent a globally important fouling organism, but they are becoming increasingly popular as model organisms for developmental biology. The latter is due to their phylogenetic position as the sister group to the vertebrates and their cellular and molecular accessibility for experimentation. In this paper, we review the mechanisms of larval adhesion in ascidians and draw conclusions from comparative analyses of selected species. We further discuss how knowledge from a developmental and functional genomics point of view can advance our understanding of cellular and molecular signatures and their hierarchical usage in animal adhesive organs. PMID:25657840

  10. Genomic and functional overlap between somatic and germline chromosomal rearrangements.

    PubMed

    van Heesch, Sebastiaan; Simonis, Marieke; van Roosmalen, Markus J; Pillalamarri, Vamsee; Brand, Harrison; Kuijk, Ewart W; de Luca, Kim L; Lansu, Nico; Braat, A Koen; Menelaou, Androniki; Hao, Wensi; Korving, Jeroen; Snijder, Simone; van der Veken, Lars T; Hochstenbach, Ron; Knegt, Alida C; Duran, Karen; Renkens, Ivo; Alekozai, Najla; Jager, Myrthe; Vergult, Sarah; Menten, Björn; de Bruijn, Ewart; Boymans, Sander; Ippel, Elly; van Binsbergen, Ellen; Talkowski, Michael E; Lichtenbelt, Klaske; Cuppen, Edwin; Kloosterman, Wigard P

    2014-12-24

    Genomic rearrangements are a common cause of human congenital abnormalities. However, their origin and consequences are poorly understood. We performed molecular analysis of two patients with congenital disease who carried de novo genomic rearrangements. We found that the rearrangements in both patients hit genes that are recurrently rearranged in cancer (ETV1, FOXP1, and microRNA cluster C19MC) and drive formation of fusion genes similar to those described in cancer. Subsequent analysis of a large set of 552 de novo germline genomic rearrangements underlying congenital disorders revealed enrichment for genes rearranged in cancer and overlap with somatic cancer breakpoints. Breakpoints of common (inherited) germline structural variations also overlap with cancer breakpoints but are depleted for cancer genes. We propose that the same genomic positions are prone to genomic rearrangements in germline and soma but that timing and context of breakage determine whether developmental defects or cancer are promoted. PMID:25497101

  11. Structure, Function, and Evolution of the Thiomonas spp. Genome

    PubMed Central

    Arsène-Ploetze, Florence; Koechler, Sandrine; Marchal, Marie; Coppée, Jean-Yves; Chandler, Michael; Bonnefoy, Violaine; Brochier-Armanet, Céline; Barakat, Mohamed; Barbe, Valérie; Battaglia-Brunet, Fabienne; Bruneel, Odile; Bryan, Christopher G.; Cleiss-Arnold, Jessica; Cruveiller, Stéphane; Erhardt, Mathieu; Heinrich-Salmeron, Audrey; Hommais, Florence; Joulian, Catherine; Krin, Evelyne; Lieutaud, Aurélie; Lièvremont, Didier; Michel, Caroline; Muller, Daniel; Ortet, Philippe; Proux, Caroline; Siguier, Patricia; Roche, David; Rouy, Zoé; Salvignol, Grégory; Slyemi, Djamila; Talla, Emmanuel; Weiss, Stéphanie; Weissenbach, Jean; Médigue, Claudine; Bertin, Philippe N.

    2010-01-01

    Bacteria of the Thiomonas genus are ubiquitous in extreme environments, such as arsenic-rich acid mine drainage (AMD). The genome of one of these strains, Thiomonas sp. 3As, was sequenced, annotated, and examined, revealing specific adaptations allowing this bacterium to survive and grow in its highly toxic environment. In order to explore genomic diversity as well as genetic evolution in Thiomonas spp., a comparative genomic hybridization (CGH) approach was used on eight different strains of the Thiomonas genus, including five strains of the same species. Our results suggest that the Thiomonas genome has evolved through the gain or loss of genomic islands and that this evolution is influenced by the specific environmental conditions in which the strains live. PMID:20195515

  12. The power of EST sequence data: Relation to Acyrthosiphon pisum genome annotation and functional genomics initiatives

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genes important to aphid biology, survival and reproduction were successfully identified by use of a genomics approach. We created and described the sequencing, compilation, and annotation of the approximately 525 Mb nuclear genome of the pea aphid, Acyrthosiphon pisum, which represents an important ...

  13. Measuring error rates in genomic perturbation screens: gold standards for human functional genomics.

    PubMed

    Hart, Traver; Brown, Kevin R; Sircoulomb, Fabrice; Rottapel, Robert; Moffat, Jason

    2014-01-01

    Technological advancement has opened the door to systematic genetics in mammalian cells. Genome-scale loss-of-function screens can assay fitness defects induced by partial gene knockdown, using RNA interference, or complete gene knockout, using new CRISPR techniques. These screens can reveal the basic blueprint required for cellular proliferation. Moreover, comparing healthy to cancerous tissue can uncover genes that are essential only in the tumor; these genes are targets for the development of specific anticancer therapies. Unfortunately, progress in this field has been hampered by off-target effects of perturbation reagents and poorly quantified error rates in large-scale screens. To improve the quality of information derived from these screens, and to provide a framework for understanding the capabilities and limitations of CRISPR technology, we derive gold-standard reference sets of essential and nonessential genes, and provide a Bayesian classifier of gene essentiality that outperforms current methods on both RNAi and CRISPR screens. Our results indicate that CRISPR technology is more sensitive than RNAi and that both techniques have nontrivial false discovery rates that can be mitigated by rigorous analytical methods.

  14. From Functional Genomics to Functional Immunomics: New Challenges, Old Problems, Big Rewards

    PubMed Central

    2006-01-01

    The development of DNA microarray technology a decade ago led to the establishment of functional genomics as one of the most active and successful scientific disciplines today. With the ongoing development of immunomic microarray technology—a spatially addressable, large-scale technology for measurement of specific immunological response—the new challenge of functional immunomics is emerging, which bears similarities to but is also significantly different from functional genomics. Immunomic data has been successfully used to identify biological markers involved in autoimmune diseases, allergies, viral infections such as human immunodeficiency virus (HIV), influenza, diabetes, and responses to cancer vaccines. This review intends to provide a coherent vision of this nascent scientific field, and speculate on future research directions. We discuss at some length issues such as epitope prediction, immunomic microarray technology and its applications, and computational and statistical challenges related to functional immunomics. Based on the recent discovery of regulation mechanisms in T cell responses, we envision the use of immunomic microarrays as a tool for advances in systems biology of cellular immune responses, by means of immunomic regulatory network models. PMID:16863395

  15. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    PubMed

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefasciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41, which has been used as a reference genome for most previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar to strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  16. Functional Genomics Reveals Linkers Critical for Influenza Virus Polymerase

    PubMed Central

    Wang, Lulan; Wu, Aiping; Wang, Yao E.; Quanquin, Natalie; Li, Chunfeng; Wang, Jingfeng; Chen, Hsiang-Wen; Liu, Suyang; Liu, Ping; Zhang, Hong; Qin, F. Xiao-Feng

    2015-01-01

    ABSTRACT Influenza virus mRNA synthesis by the RNA-dependent RNA polymerase involves binding and cleavage of capped cellular mRNA by the PB2 and PA subunits, respectively, and extension of viral mRNA by PB1. However, the mechanism for such a dynamic process is unclear. Using high-throughput mutagenesis and sequencing analysis, we have not only generated a comprehensive functional map for the microdomains of individual subunits but also have revealed the PA linker to be critical for polymerase activity. This PA linker binds to PB1 and also forms ionic interactions with the PA C-terminal channel. Nearly all mutants with five-amino-acid insertions in the linker were nonviable. Our model further suggests that the PA linker plays an important role in the conformational changes that occur between stages that favor capped mRNA binding and cleavage and those associated with viral mRNA synthesis. IMPORTANCE The RNA-dependent RNA polymerase of influenza virus consists of the PB1, PB2, and PA subunits. By combining genome-wide mutagenesis analysis with the recently discovered crystal structure of the influenza polymerase heterotrimer, we generated a comprehensive functional map of the entire influenza polymerase complex. We identified the microdomains of individual subunits, including the catalytic domains, the interaction interfaces between subunits, and nine linkers interconnecting different domains. Interestingly, we found that mutants with five-amino-acid insertions in individual linkers were nonviable, suggesting the critical roles these linkers play in coordinating spatial relationships between the subunits. We further identified an extended PA linker that binds to PB1 and also forms ionic interactions with the PA C-terminal channel. PMID:26719244

  17. Genomic and functional adaptation in surface ocean planktonic prokaryotes.

    PubMed

    Yooseph, Shibu; Nealson, Kenneth H; Rusch, Douglas B; McCrow, John P; Dupont, Christopher L; Kim, Maria; Johnson, Justin; Montgomery, Robert; Ferriera, Steve; Beeson, Karen; Williamson, Shannon J; Tovchigrechko, Andrey; Allen, Andrew E; Zeigler, Lisa A; Sutton, Granger; Eisenstadt, Eric; Rogers, Yu-Hui; Friedman, Robert; Frazier, Marvin; Venter, J Craig

    2010-11-01

    The understanding of marine microbial ecology and metabolism has been hampered by the paucity of sequenced reference genomes. To this end, we report the sequencing of 137 diverse marine isolates collected from around the world. We analysed these sequences, along with previously published marine prokaryotic genomes, in the context of marine metagenomic data, to gain insights into the ecology of the surface ocean prokaryotic picoplankton (0.1-3.0 μm size range). The results suggest that the sequenced genomes define two microbial groups: one composed of only a few taxa that are nearly always abundant in picoplanktonic communities, and the other consisting of many microbial taxa that are rarely abundant. The genomic content of the second group suggests that these microbes are capable of slow growth and survival in energy-limited environments, and rapid growth in energy-rich environments. By contrast, the abundant and cosmopolitan picoplanktonic prokaryotes for which there is genomic representation have smaller genomes, are probably capable of only slow growth and seem to be relatively unable to sense or rapidly acclimate to energy-rich conditions. Their genomic features also lead us to propose that one method used to avoid predation by viruses and/or bacterivores is by means of slow growth and the maintenance of low biomass.

  18. Evolution and function of genomic imprinting in plants

    PubMed Central

    Rodrigues, Jessica A.; Zilberman, Daniel

    2015-01-01

    Genomic imprinting, an inherently epigenetic phenomenon defined by parent of origin-dependent gene expression, is observed in mammals and flowering plants. Genome-scale surveys of imprinted expression and the underlying differential epigenetic marks have led to the discovery of hundreds of imprinted plant genes and confirmed DNA and histone methylation as key regulators of plant imprinting. However, the biological roles of the vast majority of imprinted plant genes are unknown, and the evolutionary forces shaping plant imprinting remain rather opaque. Here, we review the mechanisms of plant genomic imprinting and discuss theories of imprinting evolution and biological significance in light of recent findings. PMID:26680300

  19. Functional Genomics of Eukaryotic Photosynthesis Using Insertional Mutagenesis of Chlamydomonas reinhardtii

    PubMed Central

    Dent, Rachel M.; Haglund, Cat M.; Chin, Brian L.; Kobayashi, Marilyn C.; Niyogi, Krishna K.

    2005-01-01

    The unicellular green alga Chlamydomonas reinhardtii is a widely used model organism for studies of oxygenic photosynthesis in eukaryotes. Here we describe the development of a resource for functional genomics of photosynthesis using insertional mutagenesis of the Chlamydomonas nuclear genome. Chlamydomonas cells were transformed with either of two plasmids conferring zeocin resistance, and insertional mutants were selected in the dark on acetate-containing medium to recover light-sensitive and nonphotosynthetic mutants. The population of insertional mutants was subjected to a battery of primary and secondary phenotypic screens to identify photosynthesis-related mutants that were pigment deficient, light sensitive, nonphotosynthetic, or hypersensitive to reactive oxygen species. Approximately 9% of the insertional mutants exhibited 1 or more of these phenotypes. Molecular analysis showed that each mutant line contains an average of 1.4 insertions, and genetic analysis indicated that approximately 50% of the mutations are tagged by the transforming DNA. Flanking DNA was isolated from the mutants, and sequence data for the insertion sites in 50 mutants are presented and discussed. PMID:15653810

  20. [The application of genome editing in identification of plant gene function and crop breeding].

    PubMed

    Xiangchun, Zhou; Yongzhong, Xing

    2016-03-01

    Plant genomes can be modified with current biotechnology with high specificity and excellent efficiency. Zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN) and the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9) system are the key engineered nucleases used in genome editing. Genome editing techniques enable targeted gene mutagenesis, gene knock-out, and gene insertion or replacement at target sites during endogenous DNA repair, including non-homologous end joining (NHEJ) and homologous recombination (HR), triggered by the induction of a DNA double-strand break (DSB). Genome editing has been successfully applied in the genome modification of diverse plant species, such as Arabidopsis thaliana, Oryza sativa, and Nicotiana tabacum. In this review, we summarize the application of genome editing in the identification of plant gene function and in crop breeding. We also discuss aspects of genome editing that need improvement for precise genetic improvement of crops, as directions for further study.

  1. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping

    PubMed Central

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-01-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should include also positions of all non-coding regulatory elements. Unfortunately, experimental data on genomic positions of a multitude of regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of whole-genome map of hundreds of thousands of regulatory elements in several hundreds of tissue/cell types is presently far beyond our capabilities. A possible alternative for the whole genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we reviewed experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  2. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping.

    PubMed

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-04-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should include also positions of all non-coding regulatory elements. Unfortunately, experimental data on genomic positions of a multitude of regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins are very limited, especially at the whole genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of whole-genome map of hundreds of thousands of regulatory elements in several hundreds of tissue/cell types is presently far beyond our capabilities. A possible alternative for the whole genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we reviewed experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  3. Open access resources for genome-wide association mapping in rice.

    PubMed

    McCouch, Susan R; Wright, Mark H; Tung, Chih-Wei; Maron, Lyza G; McNally, Kenneth L; Fitzgerald, Melissa; Singh, Namrata; DeClerck, Genevieve; Agosto-Perez, Francisco; Korniliev, Pavel; Greenberg, Anthony J; Naredo, Ma Elizabeth B; Mercado, Sheila Mae Q; Harrington, Sandra E; Shi, Yuxin; Branchini, Darcy A; Kuser-Falcão, Paula R; Leung, Hei; Ebana, Kowaru; Yano, Masahiro; Eizenga, Georgia; McClung, Anna; Mezey, Jason

    2016-01-01

    Increasing food production is essential to meet the demands of a growing human population, with its rising income levels and nutritional expectations. To address the demand, plant breeders seek new sources of genetic variation to enhance the productivity, sustainability and resilience of crop varieties. Here we launch a high-resolution, open-access research platform to facilitate genome-wide association mapping in rice, a staple food crop. The platform provides an immortal collection of diverse germplasm, a high-density single-nucleotide polymorphism data set tailored for gene discovery, well-documented analytical strategies, and a suite of bioinformatics resources to facilitate biological interpretation. Using grain length, we demonstrate the power and resolution of our new high-density rice array, the accompanying genotypic data set, and an expanded diversity panel for detecting major and minor effect QTLs and subpopulation-specific alleles, with immediate implications for rice improvement. PMID:26842267

  4. Open access resources for genome-wide association mapping in rice

    PubMed Central

    McCouch, Susan R.; Wright, Mark H.; Tung, Chih-Wei; Maron, Lyza G.; McNally, Kenneth L.; Fitzgerald, Melissa; Singh, Namrata; DeClerck, Genevieve; Agosto-Perez, Francisco; Korniliev, Pavel; Greenberg, Anthony J.; Naredo, Ma. Elizabeth B.; Mercado, Sheila Mae Q.; Harrington, Sandra E.; Shi, Yuxin; Branchini, Darcy A.; Kuser-Falcão, Paula R.; Leung, Hei; Ebana, Kowaru; Yano, Masahiro; Eizenga, Georgia; McClung, Anna; Mezey, Jason

    2016-01-01

    Increasing food production is essential to meet the demands of a growing human population, with its rising income levels and nutritional expectations. To address the demand, plant breeders seek new sources of genetic variation to enhance the productivity, sustainability and resilience of crop varieties. Here we launch a high-resolution, open-access research platform to facilitate genome-wide association mapping in rice, a staple food crop. The platform provides an immortal collection of diverse germplasm, a high-density single-nucleotide polymorphism data set tailored for gene discovery, well-documented analytical strategies, and a suite of bioinformatics resources to facilitate biological interpretation. Using grain length, we demonstrate the power and resolution of our new high-density rice array, the accompanying genotypic data set, and an expanded diversity panel for detecting major and minor effect QTLs and subpopulation-specific alleles, with immediate implications for rice improvement. PMID:26842267

  6. A Functional Genomic Approach to Chlorinated Ethenes Bioremediation

    NASA Astrophysics Data System (ADS)

    Lee, P. K.; Brodie, E. L.; MacBeth, T. W.; Deeb, R. A.; Sorenson, K. S.; Andersen, G. L.; Alvarez-Cohen, L.

    2007-12-01

    With the recent advances in genomic sciences, a knowledge-based approach can now be taken to optimize the bioremediation of trichloroethene (TCE). During the bioremediation of a heterogeneous subsurface, it is vital to identify and quantify the functionally important microorganisms present, characterize the microbial community and measure their physiological activity. In our field experiments, quantitative PCR (qPCR) was coupled with reverse-transcription (RT) to analyze both copy numbers and transcripts expressed by the 16S rRNA gene and three reductive dehalogenase (RDase) genes as biomarkers of Dehalococcoides spp. in the groundwater of a TCE-DNAPL site at Ft. Lewis (WA) that was serially subjected to biostimulation and bioaugmentation. Genes in the Dehalococcoides genus were targeted as they are the only known organisms that can completely dechlorinate TCE to the innocuous product ethene. Biomarker quantification revealed an overall increase of more than three orders of magnitude in the total Dehalococcoides population and quantification of the more labile and stringently regulated mRNAs confirmed that Dehalococcoides spp. were active. Parallel with our field experiments, laboratory studies were conducted to explore the physiology of Dehalococcoides isolates in order to develop relevant biomarkers that are indicative of the metabolic state of cells. Recently, we verified the function of the nitrogenase operon in Dehalococcoides sp. strain 195 and nitrogenase-encoding genes are ideal biomarker targets to assess cellular nitrogen requirement. To characterize the microbial community, we applied a high-density phylogenetic microarray (16S PhyloChip) that simultaneously monitors over 8,700 unique taxa to track the bacterial and archaeal populations through different phases of treatment. As a measure of species richness, 1,300 to 1,520 taxa were detected in groundwater samples extracted during different stages of treatment as well as in the bioaugmentation culture.
We

  7. PlasmoView: a web-based resource to visualise global Plasmodium falciparum genomic variation.

    PubMed

    Preston, Mark D; Assefa, Samuel A; Ocholla, Harold; Sutherland, Colin J; Borrmann, Steffen; Nzila, Alexis; Michon, Pascal; Hien, Tran Tinh; Bousema, Teun; Drakeley, Christopher J; Zongo, Issaka; Ouédraogo, Jean-Bosco; Djimde, Abdoulaye A; Doumbo, Ogobara K; Nosten, Francois; Fairhurst, Rick M; Conway, David J; Roper, Cally; Clark, Taane G

    2014-06-01

    Malaria is a global public health challenge, with drug resistance a major barrier to disease control and elimination. To meet the urgent need for better treatments and vaccines, a deeper knowledge of Plasmodium biology and malaria epidemiology is required. An improved understanding of the genomic variation of malaria parasites, especially the most virulent Plasmodium falciparum (Pf) species, has the potential to yield new insights in these areas. High-throughput sequencing and genotyping is generating large amounts of genomic data across multiple parasite populations. The resulting ability to identify informative variants, particularly single-nucleotide polymorphisms (SNPs), will lead to the discovery of intra- and inter-population differences and thus enable the development of genetic barcodes for diagnostic assays and clinical studies. Knowledge of genetic variability underlying drug resistance and other differential phenotypes will also facilitate the identification of novel mutations and contribute to surveillance and stratified medicine applications. The PlasmoView interactive web-browsing tool enables the research community to visualise genomic variation and annotation (eg, biological function) in a geographic setting. The first release contains over 600,000 high-quality SNPs in 631 Pf isolates from laboratory strains and four malaria-endemic regions (West Africa, East Africa, Southeast Asia and Oceania). PMID:24338354

  8. PlasmoView: A Web-based Resource to Visualise Global Plasmodium falciparum Genomic Variation

    PubMed Central

    Preston, Mark D.; Assefa, Samuel A.; Ocholla, Harold; Sutherland, Colin J.; Borrmann, Steffen; Nzila, Alexis; Michon, Pascal; Hien, Tran Tinh; Bousema, Teun; Drakeley, Christopher J.; Zongo, Issaka; Ouédraogo, Jean-Bosco; Djimde, Abdoulaye A.; Doumbo, Ogobara K.; Nosten, Francois; Fairhurst, Rick M.; Conway, David J.; Roper, Cally; Clark, Taane G.

    2014-01-01

    Malaria is a global public health challenge, with drug resistance a major barrier to disease control and elimination. To meet the urgent need for better treatments and vaccines, a deeper knowledge of Plasmodium biology and malaria epidemiology is required. An improved understanding of the genomic variation of malaria parasites, especially the most virulent Plasmodium falciparum (Pf) species, has the potential to yield new insights in these areas. High-throughput sequencing and genotyping is generating large amounts of genomic data across multiple parasite populations. The resulting ability to identify informative variants, particularly single-nucleotide polymorphisms (SNPs), will lead to the discovery of intra- and inter-population differences and thus enable the development of genetic barcodes for diagnostic assays and clinical studies. Knowledge of genetic variability underlying drug resistance and other differential phenotypes will also facilitate the identification of novel mutations and contribute to surveillance and stratified medicine applications. The PlasmoView interactive web-browsing tool enables the research community to visualise genomic variation and annotation (eg, biological function) in a geographic setting. The first release contains over 600 000 high-quality SNPs in 631 Pf isolates from laboratory strains and four malaria-endemic regions (West Africa, East Africa, Southeast Asia and Oceania). PMID:24338354

  9. The yeast deletion collection: a decade of functional genomics.

    PubMed

    Giaever, Guri; Nislow, Corey

    2014-06-01

    The yeast deletion collections comprise >21,000 mutant strains that carry precise start-to-stop deletions of ∼6000 open reading frames. This collection includes heterozygous and homozygous diploids, and haploids of both MATa and MATα mating types. The yeast deletion collection, or yeast knockout (YKO) set, represents the first and only complete, systematically constructed deletion collection available for any organism. Conceived during the Saccharomyces cerevisiae sequencing project, work on the project began in 1998 and was completed in 2002. The YKO strains have been used in numerous laboratories in >1000 genome-wide screens. This landmark genome project has inspired development of numerous genome-wide technologies in organisms from yeast to man. Notable spinoff technologies include synthetic genetic array and HIPHOP chemogenomics. In this retrospective, we briefly describe the yeast deletion project and some of its most noteworthy biological contributions and the impact that these collections have had on the yeast research community and on genomics in general.

  10. Functional profiling of cyanobacterial genomes and its role in ecological adaptations.

    PubMed

    Prabha, Ratna; Singh, Dhananjaya P; Somvanshi, Pallavi; Rai, Anil

    2016-09-01

    With the availability of complete genome sequences of many cyanobacterial species, it is becoming feasible to study the broad perspective of environmental adaptation and the overall changes at the transcriptional and translational level in these organisms. Over the course of evolution, niche-specific competitive forces have resulted in specific features of the cyanobacterial genomes. In this study, the functional composition of 84 different cyanobacterial genomes and their adaptations to different environments were examined by identifying the genomic composition for specific cellular processes, which reflects their genomic functional profile and ecological adaptation. Among cyanobacterial genomes, metabolic genes were found to have the major share over other categories, and differentiation of the genomic functional profile was observed for species inhabiting different habitats. The cyanobacteria of freshwater and other habitats accumulate a large number of poorly characterized genes. Strain-specific functions were also reported in many cyanobacterial members, of which an important feature was the occurrence of phage-related sequences. From this study, it can be speculated that habitat is one of the major factors shaping the functional composition of cyanobacterial genomes towards their ecological adaptations. PMID:27408818

  11. Novel genomic resources for a climate change sensitive mammal: characterization of the American pika transcriptome

    PubMed Central

    2013-01-01

    Background When faced with climate change, species must either shift their home range or adapt in situ in order to maintain optimal physiological balance with their environment. The American pika (Ochotona princeps) is a small alpine mammal with limited dispersal capacity and low tolerance for thermal stress. As a result, pikas have become an important system for examining biotic responses to changing climatic conditions. Previous research using amplified fragment length polymorphisms (AFLPs) has revealed evidence for environmental-mediated selection in O. princeps populations distributed along elevation gradients, yet the anonymity of AFLP loci and lack of available genomic resources precluded the identification of associated gene regions. Here, we harnessed next-generation sequencing technology in order to characterize the American pika transcriptome and identify a large suite of single nucleotide polymorphisms (SNPs), which can be used to elucidate elevation- and site-specific patterns of sequence variation. Results We constructed pooled cDNA libraries of O. princeps from high (1400 m) and low (300 m) elevation sites along a previously established transect in British Columbia. Transcriptome sequencing using the Roche 454 GS FLX Titanium platform generated 780 million base pairs of data, which were assembled into 7,325 high coverage contigs. These contigs were used to identify 24,261 novel SNP loci. Using high resolution melt analysis, we developed 17 of these SNPs into genotyping assays, which were validated with independent DNA samples from British Columbia, Canada and Oregon State, USA. In addition, we detected haplotypes in the NADH dehydrogenase subunit 5 of the mitochondrial genome that were fixed and different among elevations, suggesting that this may be an informative target gene for studying the role of cellular respiration in local adaptation. We also identified contigs that were unique to each elevation, including a high elevation-specific contig that was

  12. High levels of variation in Salix lignocellulose genes revealed using poplar genomic resources

    PubMed Central

    2013-01-01

    Background Little is known about the levels of variation in lignin or other wood related genes in Salix, a genus that is being increasingly used for biomass and biofuel production. The lignin biosynthesis pathway is well characterized in a number of species, including the model tree Populus. We aimed to transfer the genomic resources already available in Populus to its sister genus Salix to assess levels of variation within genes involved in wood formation. Results Amplification trials for 27 gene regions were undertaken in 40 Salix taxa. Twelve of these regions were sequenced. Alignment searches of the resulting sequences against reference databases, combined with phylogenetic analyses, showed the close similarity of these Salix sequences to Populus, confirming homology of the primer regions and indicating a high level of conservation within the wood formation genes. However, all sequences were found to vary considerably among Salix species, mainly as SNPs with a smaller number of insertions-deletions. Between 25 and 176 SNPs per kbp per gene region (in predicted exons) were discovered within Salix. Conclusions The variation found is sizeable but not unexpected as it is based on interspecific and not intraspecific comparison; it is comparable to interspecific variation in Populus. The characterisation of genetic variation is a key process in pre-breeding and for the conservation and exploitation of genetic resources in Salix. This study characterises the variation in several lignocellulose gene markers for such purposes. PMID:23924375

  13. Genome-Wide Association and Functional Follow-Up Reveals New Loci for Kidney Function

    PubMed Central

    Fuchsberger, Christian; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C.; O'Seaghdha, Conall M.; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V.; O'Connell, Jeffrey R.; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D.; Gierman, Hinco J.; Feitosa, Mary; Hwang, Shih-Jen; Atkinson, Elizabeth J.; Lohman, Kurt; Cornelis, Marilyn C.; Johansson, Åsa; Tönjes, Anke; Dehghan, Abbas; Chouraki, Vincent; Holliday, Elizabeth G.; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y.; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B.; Launer, Lenore J.; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D.; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank B.; Demirkan, Ayse; Oostra, Ben A.; de Andrade, Mariza; Turner, Stephen T.; Ding, Jingzhong; Andrews, Jeanette S.; Freedman, Barry I.; Koenig, Wolfgang; Illig, Thomas; Döring, Angela; Wichmann, H.-Erich; Kolcic, Ivana; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E.; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H.; Wright, Alan F.; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Endlich, Karlhans; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K.; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G.; Rivadeneira, Fernando; Aulchenko, Yurii S.; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Giulianini, Franco; Krämer, Bernhard K.; Portas, Laura; Ford, Ian; Buckley, Brendan M.; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Metzger, Marie; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K.; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J. Wouter; Probst-Hensch, Nicole M.; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R.; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S.; van Duijn, Cornelia M.; Borecki, Ingrid; Kardia, Sharon L. R.; Liu, Yongmei; Curhan, Gary C.; Rudan, Igor; Gyllensten, Ulf; Wilson, James F.; Franke, Andre; Pramstaller, Peter P.; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline C. M.; Hayward, Caroline; Ridker, Paul; Parsa, Afshin; Bochud, Murielle; Heid, Iris M.; Goessling, Wolfram; Chasman, Daniel I.; Kao, W. H. Linda; Fox, Caroline S.

    2012-01-01

    Chronic kidney disease (CKD) is an important public health problem with a genetic component. We performed genome-wide association studies in up to 130,600 European ancestry participants overall, and stratified for key CKD risk factors. We uncovered 6 new loci in association with estimated glomerular filtration rate (eGFR), the primary clinical measure of CKD, in or near MPPED2, DDX1, SLC47A1, CDK12, CASP9, and INO80. Morpholino knockdown of mpped2 and casp9 in zebrafish embryos revealed podocyte and tubular abnormalities with altered dextran clearance, suggesting a role for these genes in renal function. By providing new insights into genes that regulate renal function, these results could further our understanding of the pathogenesis of CKD. PMID:22479191

  14. Functional Genomics of Physiological Plasticity and Local Adaptation in Killifish

    PubMed Central

    Galvez, Fernando; Zhang, Shujun; Williams, Larissa M.; Oleksiak, Marjorie F.

    2011-01-01

    Evolutionary solutions to the physiological challenges of life in highly variable habitats can span the continuum from evolution of a cosmopolitan plastic phenotype to the evolution of locally adapted phenotypes. Killifish (Fundulus sp.) have evolved both highly plastic and locally adapted phenotypes within different selective contexts, providing a comparative system in which to explore the genomic underpinnings of physiological plasticity and adaptive variation. Importantly, extensive variation exists among populations and species for tolerance to a variety of stressors, and we exploit this variation in comparative studies to yield insights into the genomic basis of evolved phenotypic variation. Notably, species of Fundulus occupy the continuum of osmotic habitats from freshwater to marine and populations within Fundulus heteroclitus span far greater variation in pollution tolerance than across all species of fish. Here, we explore how transcriptome regulation underpins extreme physiological plasticity on osmotic shock and how genomic and transcriptomic variation is associated with locally evolved pollution tolerance. We show that F. heteroclitus quickly acclimate to extreme osmotic shock by mounting a dramatic rapid transcriptomic response including an early crisis control phase followed by a tissue remodeling phase involving many regulatory pathways. We also show that convergent evolution of locally adapted pollution tolerance involves complex patterns of gene expression and genome sequence variation, which is confounded with body-weight dependence for some genes. Similarly, exploiting the natural phenotypic variation associated with other established and emerging model organisms is likely to greatly accelerate the pace of discovery of the genomic basis of phenotypic variation. PMID:20581107

  15. New bioinformatic tool for quick identification of functionally relevant endogenous retroviral inserts in human genome.

    PubMed

    Garazha, Andrew; Ivanova, Alena; Suntsova, Maria; Malakhova, Galina; Roumiantsev, Sergey; Zhavoronkov, Alex; Buzdin, Anton

    2015-01-01

    Endogenous retroviruses (ERVs) and LTR retrotransposons (LRs) occupy ∼8% of human genome. Deep sequencing technologies provide clues to understanding of functional relevance of individual ERVs/LRs by enabling direct identification of transcription factor binding sites (TFBS) and other landmarks of functional genomic elements. Here, we performed the genome-wide identification of human ERVs/LRs containing TFBS according to the ENCODE project. We created the first interactive ERV/LRs database that groups the individual inserts according to their familial nomenclature, number of mapped TFBS and divergence from their consensus sequence. Information on any particular element can be easily extracted by the user. We also created a genome browser tool, which enables quick mapping of any ERV/LR insert according to genomic coordinates, known human genes and TFBS. These tools can be used to easily explore functionally relevant individual ERV/LRs, and for studying their impact on the regulation of human genes. Overall, we identified ∼110,000 ERV/LR genomic elements having TFBS. We propose a hypothesis of "domestication" of ERV/LR TFBS by the genome milieu including subsequent stages of initial epigenetic repression, partial functional release, and further mutation-driven reshaping of TFBS in tight coevolution with the enclosing genomic loci.

  16. New bioinformatic tool for quick identification of functionally relevant endogenous retroviral inserts in human genome

    PubMed Central

    Garazha, Andrew; Ivanova, Alena; Suntsova, Maria; Malakhova, Galina; Roumiantsev, Sergey; Zhavoronkov, Alex; Buzdin, Anton

    2015-01-01

    Endogenous retroviruses (ERVs) and LTR retrotransposons (LRs) occupy ∼8% of human genome. Deep sequencing technologies provide clues to understanding of functional relevance of individual ERVs/LRs by enabling direct identification of transcription factor binding sites (TFBS) and other landmarks of functional genomic elements. Here, we performed the genome-wide identification of human ERVs/LRs containing TFBS according to the ENCODE project. We created the first interactive ERV/LRs database that groups the individual inserts according to their familial nomenclature, number of mapped TFBS and divergence from their consensus sequence. Information on any particular element can be easily extracted by the user. We also created a genome browser tool, which enables quick mapping of any ERV/LR insert according to genomic coordinates, known human genes and TFBS. These tools can be used to easily explore functionally relevant individual ERV/LRs, and for studying their impact on the regulation of human genes. Overall, we identified ∼110,000 ERV/LR genomic elements having TFBS. We propose a hypothesis of “domestication” of ERV/LR TFBS by the genome milieu including subsequent stages of initial epigenetic repression, partial functional release, and further mutation-driven reshaping of TFBS in tight coevolution with the enclosing genomic loci. PMID:25853282

  17. GO-FAANG meeting: A gathering on functional annotation of animal genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The FAANG (Functional Annotation of Animal Genomes) Consortium recently held a Gathering On FAANG (GO-FAANG) Workshop in Washington, DC on October 7-8, 2015. This consortium is a grass-roots organization formed to advance the annotation of newly assembled genomes of non-model organisms (www.faang.or...

  18. Draft Genome Sequence of Lactobacillus reuteri Strain CRL 1098, an Interesting Candidate for Functional Food Development

    PubMed Central

    Torres, Andrea C.; Suárez, Nadia E.; Font, Graciela; Saavedra, Lucila

    2016-01-01

    We report here the draft genome sequence of Lactobacillus reuteri strain CRL 1098. This strain represents an interesting candidate for functional food development because of its proven probiotic properties. The draft genome sequence is composed of 1,969,471 bp assembled into 45 contigs and an average G+C content of 38.8%. PMID:27563038

  19. Functional Classification of Natural Resources for Valuing Natural Resources in Korea

    NASA Astrophysics Data System (ADS)

    Choi, H.; Lee, W.; Kwak, H.

    2013-12-01

    The ecosystem services concept emphasizes not only regulating services, but also supporting, provisioning, and cultural/social services according to the Millennium Ecosystem Assessment (MA). Although the spatial mapping and quantification of ecosystem services are increasingly recognized as important for natural resources conservation, ecosystem services quantification is rarely considered in the Republic of Korea (ROK) because of methodological challenges. This study matches appropriate indicators, data and mapping for describing respective states, quantification and ecosystem valuation. The results were analyzed with statistical and GIS-based techniques. We classified the ecosystem services function based on reference to the literature, interviews and a modified approach compared to the MA and The Economics of Ecosystems and Biodiversity (TEEB). For quantifying values, we subdivided land cover types using ecological features and normalized numerical information of provisioning services, regulating services and cultural services. Resulting hotspots of ecosystem services are related to landscape features and land cover types in ROK. The mapping results show hotspots where high levels of ecosystem services are concentrated, around the Baekdudaegan protected area (Gangwon, Gyeongbuk Province, Chungbuk, Jeonam Province). In addition, the results of our study show that ecosystem services functions, especially fostering water resources, erosion control, air quality and pollution control in terrestrial ecosystems, can contribute to planning management policy for ecosystem-based management at the regional scale.

  20. Synergies between assisted reproduction technologies and functional genomics.

    PubMed

    Loi, Pasqualino; Toschi, Paola; Zacchini, Federica; Ptak, Grazyna; Scapolo, Pier A; Capra, Emanuele; Stella, Alessandra; Marsan, Paolo Ajmone; Williams, John L

    2016-01-01

    This review is a synopsis of advanced reproductive technologies in farm animals, including a discussion of their limiting factors as revealed by the study of offspring derived from embryos produced in vitro and through cloning. These studies show that the problems of epigenetic mis-programming, which were reported in the initial stages of assisted reproduction, still persist. The importance of whole-genome analyses, including the methylome and transcriptome, in improving embryo biotechnologies in farm animals is discussed. Genome editing approaches for the improvement of economically relevant traits in farm animals are also described. Efficient farm animal embryo biotechnologies, including cloning and the most recent technologies such as genome editing, will effectively complement the latest strategies to accelerate genetic improvement of farm animals. PMID:27481215

  1. Comparative functional genomics of plant pathogenic Fusarium species

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Fusarium species are among the most economically important group of plant pathogenic fungi. Comparison of the four currently available Fusarium genome sequences allows an unsurpassed and unprecedented ability to predict genes, determine synteny and define regulatory sequences for genes in phytopatho...

  2. The yeast deletion collection: a decade of functional genomics.

    PubMed

    Giaever, Guri; Nislow, Corey

    2014-06-01

    The yeast deletion collections comprise >21,000 mutant strains that carry precise start-to-stop deletions of ∼6000 open reading frames. This collection includes heterozygous and homozygous diploids, and haploids of both MATa and MATα mating types. The yeast deletion collection, or yeast knockout (YKO) set, represents the first and only complete, systematically constructed deletion collection available for any organism. Conceived during the Saccharomyces cerevisiae sequencing project, work on the project began in 1998 and was completed in 2002. The YKO strains have been used in numerous laboratories in >1000 genome-wide screens. This landmark genome project has inspired development of numerous genome-wide technologies in organisms from yeast to man. Notable spinoff technologies include synthetic genetic array and HIPHOP chemogenomics. In this retrospective, we briefly describe the yeast deletion project and some of its most noteworthy biological contributions and the impact that these collections have had on the yeast research community and on genomics in general. PMID:24939991

  3. The Yeast Deletion Collection: A Decade of Functional Genomics

    PubMed Central

    Giaever, Guri; Nislow, Corey

    2014-01-01

    The yeast deletion collections comprise >21,000 mutant strains that carry precise start-to-stop deletions of ∼6000 open reading frames. This collection includes heterozygous and homozygous diploids, and haploids of both MATa and MATα mating types. The yeast deletion collection, or yeast knockout (YKO) set, represents the first and only complete, systematically constructed deletion collection available for any organism. Conceived during the Saccharomyces cerevisiae sequencing project, work on the project began in 1998 and was completed in 2002. The YKO strains have been used in numerous laboratories in >1000 genome-wide screens. This landmark genome project has inspired development of numerous genome-wide technologies in organisms from yeast to man. Notable spinoff technologies include synthetic genetic array and HIPHOP chemogenomics. In this retrospective, we briefly describe the yeast deletion project and some of its most noteworthy biological contributions and the impact that these collections have had on the yeast research community and on genomics in general. PMID:24939991

  4. Functional Genomics Tools for Haemonchus contortus and Lessons From Other Helminths.

    PubMed

    Britton, C; Roberts, B; Marks, N D

    2016-01-01

    The availability of genome and transcriptome data for parasitic nematodes, including Haemonchus contortus, has highlighted the need to develop functional genomics tools. Comparative genomic analysis, particularly using data from the free-living nematode Caenorhabditis elegans, can help predict gene function. Reliable approaches to study function directly in parasitic nematodes are currently lacking. However, gene knockdown by RNA interference (RNAi) is being successfully used in schistosome and planarian species to define gene functions. Lessons from these systems may be applied to improve RNAi in H. contortus. Previous studies in H. contortus and related nematodes demonstrated reliable RNAi-mediated silencing of some genes, but not others. Current data suggest that susceptibility to RNAi in these nematodes is limited to genes expressed in sites accessible to the environment, such as the gut, amphids and excretory cell. Therefore, RNAi is functional in H. contortus, but improvements are needed to develop this system as a functional genomics platform. Here, we summarize RNAi studies on H. contortus and discuss the optimization of RNA delivery and improvements to culture methods to enhance larval development, protein turnover and the induction of phenotypic effects in vitro. The transgenic delivery of RNA or dominant-negative gene constructs and the recently developed CRISPR/Cas genome-editing technique are considered as potential alternative approaches for gene knockout. This is a key time to devote greater effort in progressing from genome to function, to improve our understanding of the biology of Haemonchus and identify novel targets for parasite control. PMID:27238014

  5. UFO: a web server for ultra-fast functional profiling of whole genome protein sequences

    PubMed Central

    Meinicke, Peter

    2009-01-01

    Background Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Description Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. Conclusion For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address. PMID:19725959
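
    As a rough illustration of the profile comparison described in this record, the sketch below builds a functional profile as relative frequencies of domain-family labels and scores the dissimilarity of two profiles. It is not the UFO implementation: the abstract does not state which dissimilarity measure the server uses, so half the L1 distance is an assumption chosen for simplicity, and the function names and the toy family assignments are hypothetical.

```python
from collections import Counter


def functional_profile(assignments):
    """Convert per-protein family labels into relative frequencies."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {family: n / total for family, n in counts.items()}


def profile_dissimilarity(p, q):
    """Half the L1 distance between two profiles.

    Assumed measure (not taken from the abstract):
    0.0 means identical profiles, 1.0 means no shared families.
    """
    families = set(p) | set(q)
    return 0.5 * sum(abs(p.get(f, 0.0) - q.get(f, 0.0)) for f in families)


# Hypothetical family assignments for two small genomes.
genome_a = ["PF00005", "PF00005", "PF00072", "PF00528"]
genome_b = ["PF00005", "PF00072", "PF00072", "PF07690"]

profile_a = functional_profile(genome_a)
profile_b = functional_profile(genome_b)
score = profile_dissimilarity(profile_a, profile_b)
print(f"dissimilarity: {score:.2f}")
```

    Comparing a new profile against each of a set of precomputed reference profiles with such a function would give one score per reference, which is the kind of ranking the abstract refers to as a first guess about functionally related organisms.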

  6. EXPLOITING GENOME DATA TO UNDERSTAND THE FUNCTION, REGULATION AND EVOLUTIONARY ORIGINS OF TOXICOLOGICALLY RELEVANT GENES

    EPA Science Inventory

    The wealth of new information coming from the many genome sequencing projects is providing unprecedented opportunities for major advances in all areas of biology, including the environmental health sciences. To facilitate this discovery process, experts in the fields of function...

  7. Three-dimensional eukaryotic genomic organization is strongly correlated with codon usage expression and function.

    PubMed

    Diament, Alon; Pinter, Ron Y; Tuller, Tamir

    2014-01-01

    It has been shown that the distribution of genes in eukaryotic genomes is not random; however, formerly reported relations between gene function and genomic organization were relatively weak. Previous studies have demonstrated that codon usage bias is related to all stages of gene expression and to protein function. Here we apply a novel tool for assessing functional relatedness, codon usage frequency similarity (CUFS), which measures similarity between genes in terms of codon and amino acid usage. By analyzing chromosome conformation capture data, describing the three-dimensional (3D) conformation of the DNA, we show that the functional similarity between genes captured by CUFS is directly and very strongly correlated with their 3D distance in Saccharomyces cerevisiae, Schizosaccharomyces pombe, Arabidopsis thaliana, mouse and human. This emphasizes the importance of three-dimensional genomic localization in eukaryotes and indicates that codon usage is tightly linked to genome architecture. PMID:25510862
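
    To make the idea of comparing genes by codon usage concrete, the sketch below computes a codon frequency vector for each coding sequence and a distance between two such vectors. It is not the published CUFS definition, which the abstract does not spell out: the use of the Jensen-Shannon distance is an assumption, and the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


def jensen_shannon_distance(p, q):
    """Square root of the base-2 Jensen-Shannon divergence (range 0 to 1)."""
    divergence = 0.0
    for codon in set(p) | set(q):
        pk, qk = p.get(codon, 0.0), q.get(codon, 0.0)
        mk = 0.5 * (pk + qk)
        if pk > 0:
            divergence += 0.5 * pk * math.log2(pk / mk)
        if qk > 0:
            divergence += 0.5 * qk * math.log2(qk / mk)
    return math.sqrt(max(divergence, 0.0))


# Hypothetical toy sequences, for illustration only.
gene_a = "ATGGCTGCTAAA"
gene_b = "ATGGCCGCCAAG"

distance = jensen_shannon_distance(codon_frequencies(gene_a),
                                   codon_frequencies(gene_b))
print(f"codon usage distance: {distance:.3f}")
```

    A pairwise matrix of such distances could then be correlated with 3D distances from chromosome conformation capture data, which is the type of comparison the abstract reports.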

  8. Functional genomics in chickens: development of integrated-systems microarrays for transcriptional profiling and discovery of regulatory pathways.

    PubMed

    Cogburn, L A; Wang, X; Carre, W; Rejto, L; Aggrey, S E; Duclos, M J; Simon, J; Porter, T E

    2004-01-01

    The genetic networks that govern the differentiation and growth of major tissues of economic importance in the chicken are largely unknown. Under a functional genomics project, our consortium has generated 30 609 expressed sequence tags (ESTs) and developed several chicken DNA microarrays, which represent the Chicken Metabolic/Somatic (10 K) and Neuroendocrine/Reproductive (8 K) Systems (http://udgenome.ags.udel.edu/cogburn/). One of the major challenges facing functional genomics is the development of mathematical models to reconstruct functional gene networks and regulatory pathways from vast volumes of microarray data. In initial studies with liver-specific microarrays (3.1 K), we have examined gene expression profiles in liver during the peri-hatch transition and during a strong metabolic perturbation (fasting and re-feeding) in divergently selected broiler chickens (fast vs. slow-growth lines). The expression of many genes controlling metabolic pathways is dramatically altered by these perturbations. Our analysis has revealed a large number of clusters of functionally related genes (mainly metabolic enzymes and transcription factors) that control major metabolic pathways. Currently, we are conducting transcriptional profiling studies of multiple tissues during development of two sets of divergently selected broiler chickens (fast vs. slow growing and fat vs. lean lines). Transcriptional profiling across multiple tissues should permit construction of a detailed genetic blueprint that illustrates the developmental events and hierarchy of genes that govern growth and development of chickens. This review will briefly describe the recent acquisition of chicken genomic resources (ESTs and microarrays) and our consortium's efforts to help launch the new era of functional genomics in the chicken.

  9. Functional genomics in chickens: development of integrated-systems microarrays for transcriptional profiling and discovery of regulatory pathways.

    PubMed

    Cogburn, L A; Wang, X; Carre, W; Rejto, L; Aggrey, S E; Duclos, M J; Simon, J; Porter, T E

    2004-01-01

    The genetic networks that govern the differentiation and growth of major tissues of economic importance in the chicken are largely unknown. Under a functional genomics project, our consortium has generated 30 609 expressed sequence tags (ESTs) and developed several chicken DNA microarrays, which represent the Chicken Metabolic/Somatic (10 K) and Neuroendocrine/Reproductive (8 K) Systems (http://udgenome.ags.udel.edu/cogburn/). One of the major challenges facing functional genomics is the development of mathematical models to reconstruct functional gene networks and regulatory pathways from vast volumes of microarray data. In initial studies with liver-specific microarrays (3.1 K), we have examined gene expression profiles in liver during the peri-hatch transition and during a strong metabolic perturbation (fasting and re-feeding) in divergently selected broiler chickens (fast vs. slow-growth lines). The expression of many genes controlling metabolic pathways is dramatically altered by these perturbations. Our analysis has revealed a large number of clusters of functionally related genes (mainly metabolic enzymes and transcription factors) that control major metabolic pathways. Currently, we are conducting transcriptional profiling studies of multiple tissues during development of two sets of divergently selected broiler chickens (fast vs. slow growing and fat vs. lean lines). Transcriptional profiling across multiple tissues should permit construction of a detailed genetic blueprint that illustrates the developmental events and hierarchy of genes that govern growth and development of chickens. This review will briefly describe the recent acquisition of chicken genomic resources (ESTs and microarrays) and our consortium's efforts to help launch the new era of functional genomics in the chicken. PMID:18629153

  10. Nuclease-mediated genome editing: At the front-line of functional genomics technology.

    PubMed

    Sakuma, Tetsushi; Woltjen, Knut

    2014-01-01

    Genome editing with engineered endonucleases is rapidly becoming a staple method in developmental biology studies. Engineered nucleases permit random or designed genomic modification at precise loci through the stimulation of endogenous double-strand break repair. Homology-directed repair following targeted DNA damage is mediated by co-introduction of a custom repair template, allowing the derivation of knock-out and knock-in alleles in animal models previously refractory to classic gene targeting procedures. Currently there are three main types of customizable site-specific nucleases delineated by the source mechanism of DNA binding that guides nuclease activity to a genomic target: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR). Among these genome engineering tools, characteristics such as the ease of design and construction, mechanism of inducing DNA damage, and DNA sequence specificity all differ, making their application complementary. By understanding the advantages and disadvantages of each method, one may make the best choice for their particular purpose.

  11. A conifer genomics resource of 200,000 spruce (Picea spp.) ESTs and 6,464 high-quality, sequence-finished full-length cDNAs for Sitka spruce (Picea sitchensis)

    PubMed Central

    Ralph, Steven G; Chun, Hye Jung E; Kolosova, Natalia; Cooper, Dawn; Oddy, Claire; Ritland, Carol E; Kirkpatrick, Robert; Moore, Richard; Barber, Sarah; Holt, Robert A; Jones, Steven JM; Marra, Marco A; Douglas, Carl J; Ritland, Kermit; Bohlmann, Jörg

    2008-01-01

    Background Members of the pine family (Pinaceae), especially species of spruce (Picea spp.) and pine (Pinus spp.), dominate many of the world's temperate and boreal forests. These conifer forests are of critical importance for global ecosystem stability and biodiversity. They also provide the majority of the world's wood and fiber supply and serve as a renewable resource for other industrial biomaterials. In contrast to angiosperms, functional and comparative genomics research on conifers, or other gymnosperms, is limited by the lack of a relevant reference genome sequence. Sequence-finished full-length (FL)cDNAs and large collections of expressed sequence tags (ESTs) are essential for gene discovery, functional genomics, and for future efforts of conifer genome annotation. Results As part of a conifer genomics program to characterize defense against insects and adaptation to local environments, and to discover genes for the production of biomaterials, we developed 20 standard, normalized or full-length enriched cDNA libraries from Sitka spruce (P. sitchensis), white spruce (P. glauca), and interior spruce (P. glauca-engelmannii complex). We sequenced and analyzed 206,875 3'- or 5'-end ESTs from these libraries, and developed a resource of 6,464 high-quality sequence-finished FLcDNAs from Sitka spruce. Clustering and assembly of 147,146 3'-end ESTs resulted in 19,941 contigs and 26,804 singletons, representing 46,745 putative unique transcripts (PUTs). The 6,464 FLcDNAs were all obtained from a single Sitka spruce genotype and represent 5,718 PUTs. Conclusion This paper provides detailed annotation and quality assessment of a large EST and FLcDNA resource for spruce. The 6,464 Sitka spruce FLcDNAs represent the third largest sequence-verified FLcDNA resource for any plant species, behind only rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana), and the only substantial FLcDNA resource for a gymnosperm. Our emphasis on capturing FLcDNAs and ESTs from c

  12. New Developments of RNAi in Paracoccidioides brasiliensis: Prospects for High-Throughput, Genome-Wide, Functional Genomics

    PubMed Central

    Goes, Tercio; Bailão, Elisa Flavia L. C.; Correa, Cristiane R.; Bozzi, Adriana; Santos, Luara I.; Gomes, Dawidson A.; Soares, Celia M. A.; Goes, Alfredo M.

    2014-01-01

    Background The Fungal Genome Initiative of the Broad Institute, in partnership with the Paracoccidioides research community, has recently sequenced the genome of representative isolates of this human-pathogen dimorphic fungus: Pb18 (S1), Pb03 (PS2) and Pb01. The accomplishment of future high-throughput, genome-wide, functional genomics will rely upon appropriate molecular tools and straightforward techniques to streamline the generation of stable loss-of-function phenotypes. In the past decades, RNAi has emerged as the most robust genetic technique to modulate or to suppress gene expression in diverse eukaryotes, including fungi. These molecular tools and techniques, adapted for RNAi, were up until now unavailable for P. brasiliensis. Methodology/Principal Findings In this paper, we report Agrobacterium tumefaciens-mediated transformation of yeast cells for high-throughput applications with which higher transformation frequencies of 150±24 yeast cell transformants per 1×10^6 viable yeast cells were obtained. Our approach is based on a bifunctional selective marker fusion protein consisting of the Streptoalloteichus hindustanus bleomycin-resistance gene (Shble) and the intrinsically fluorescent monomeric protein mCherry which was codon-optimized for heterologous expression in P. brasiliensis. We also report successful GP43 gene knock-down through the expression of intron-containing hairpin RNA (ihpRNA) from a Gateway-adapted cassette (cALf) which was purpose-built for gene silencing in a high-throughput manner. Gp43 transcript levels were reduced by 73.1±22.9% with this approach. Conclusions/Significance We have a firm conviction that the genetic transformation technique and the molecular tools herein described will have a relevant contribution in future Paracoccidioides spp. functional genomics research. PMID:25275433

  13. The development of PIPA: an integrated and automated pipeline for genome-wide protein function annotation

    PubMed Central

    Yu, Chenggang; Zavaljevski, Nela; Desai, Valmik; Johnson, Seth; Stevens, Fred J; Reifman, Jaques

    2008-01-01

    low recall (33.0%). Our consensus algorithm for GO annotation is based on the computation and propagation of likelihood scores associated with GO terms. The test results suggest that, for a given recall, the application of the consensus algorithm yields higher precision than when consensus is not used. Conclusion The algorithms implemented in PIPA provide automated genome-wide protein function annotation based on reconciled predictions from multiple resources. PMID:18221520

  14. Functional genomics bridges the gap between quantitative genetics and molecular biology

    PubMed Central

    Lappalainen, Tuuli

    2015-01-01

    Deep characterization of molecular function of genetic variants in the human genome is becoming increasingly important for understanding genetic associations to disease and for learning to read the regulatory code of the genome. In this paper, I discuss how recent advances in both quantitative genetics and molecular biology have contributed to understanding functional effects of genetic variants, lessons learned from eQTL studies, and future challenges in this field. PMID:26430152

  15. Functional genomics bridges the gap between quantitative genetics and molecular biology.

    PubMed

    Lappalainen, Tuuli

    2015-10-01

    Deep characterization of molecular function of genetic variants in the human genome is becoming increasingly important for understanding genetic associations to disease and for learning to read the regulatory code of the genome. In this paper, I discuss how recent advances in both quantitative genetics and molecular biology have contributed to understanding functional effects of genetic variants, lessons learned from eQTL studies, and future challenges in this field.

  16. MitoZoa 2.0: a database resource and search tools for comparative and evolutionary analyses of mitochondrial genomes in Metazoa

    PubMed Central

    D'Onorio de Meo, Paolo; D'Antonio, Mattia; Griggio, Francesca; Lupi, Renato; Borsani, Massimiliano; Pavesi, Giulio; Castrignanò, Tiziana; Pesole, Graziano; Gissi, Carmela

    2012-01-01

    The MITOchondrial genome database of metaZOAns (MitoZoa) is a public resource for comparative analyses of metazoan mitochondrial genomes (mtDNA) at both the sequence and genomic organizational levels. The main characteristics of the MitoZoa database are the careful revision of mtDNA entry annotations and the possibility of retrieving gene order and non-coding region (NCR) data in appropriate formats. The MitoZoa retrieval system enables basic and complex queries at various taxonomic levels using different search menus. MitoZoa 2.0 has been enhanced in several aspects, including: a re-annotation pipeline to check the correctness of protein-coding gene predictions; a standardized annotation of introns and of precursor ORFs whose functionality is post-transcriptionally recovered by RNA editing or programmed translational frameshifting; updates of taxon-related fields and a BLAST sequence similarity search tool. Database novelties and the definition of standard mtDNA annotation rules, together with the user-friendly retrieval system and the BLAST service, make MitoZoa a valuable resource for comparative and evolutionary analyses as well as a reference database to assist in the annotation of novel mtDNA sequences. MitoZoa is freely accessible at http://www.caspur.it/mitozoa. PMID:22123747

  17. MitoZoa 2.0: a database resource and search tools for comparative and evolutionary analyses of mitochondrial genomes in Metazoa.

    PubMed

    D'Onorio de Meo, Paolo; D'Antonio, Mattia; Griggio, Francesca; Lupi, Renato; Borsani, Massimiliano; Pavesi, Giulio; Castrignanò, Tiziana; Pesole, Graziano; Gissi, Carmela

    2012-01-01

    The MITOchondrial genome database of metaZOAns (MitoZoa) is a public resource for comparative analyses of metazoan mitochondrial genomes (mtDNA) at both the sequence and genomic organizational levels. The main characteristics of the MitoZoa database are the careful revision of mtDNA entry annotations and the possibility of retrieving gene order and non-coding region (NCR) data in appropriate formats. The MitoZoa retrieval system enables basic and complex queries at various taxonomic levels using different search menus. MitoZoa 2.0 has been enhanced in several aspects, including: a re-annotation pipeline to check the correctness of protein-coding gene predictions; a standardized annotation of introns and of precursor ORFs whose functionality is post-transcriptionally recovered by RNA editing or programmed translational frameshifting; updates of taxon-related fields and a BLAST sequence similarity search tool. Database novelties and the definition of standard mtDNA annotation rules, together with the user-friendly retrieval system and the BLAST service, make MitoZoa a valuable resource for comparative and evolutionary analyses as well as a reference database to assist in the annotation of novel mtDNA sequences. MitoZoa is freely accessible at http://www.caspur.it/mitozoa.

  18. GenomicusPlants: A Web Resource to Study Genome Evolution in Flowering Plants

    PubMed Central

    Louis, Alexandra; Murat, Florent; Salse, Jérôme; Roest Crollius, Hugues

    2015-01-01

    Comparative genomics combined with phylogenetic reconstructions are powerful approaches to study the evolution of genes and genomes. However, the current rapid expansion of the volume of genomic information makes it increasingly difficult to interrogate, integrate and synthesize comparative genome data while taking into account the maximum breadth of information available. GenomicusPlants (http://www.genomicus.biologie.ens.fr/genomicus-plants) is an extension of the Genomicus webserver that addresses this issue by allowing users to explore flowering plant genomes in an intuitive way, across the broadest evolutionary scales. Extant genomes of 26 flowering plants can be analyzed, as well as 23 ancestral reconstructed genomes. Ancestral gene order provides a long-term chronological view of gene order evolution, greatly facilitating comparative genomics and evolutionary studies. Four main interfaces (‘views’) are available where: (i) PhyloView combines phylogenetic trees with comparisons of genomic loci across any number of genomes; (ii) AlignView projects loci of interest against all other genomes to visualize its topological conservation; (iii) MatrixView compares two genomes in a classical dotplot representation; and (iv) Karyoview visualizes chromosome karyotypes ‘painted’ with colours of another genome of interest. All four views are interconnected and benefit from many customizable features. PMID:25432975

  19. GenomicusPlants: a web resource to study genome evolution in flowering plants.

    PubMed

    Louis, Alexandra; Murat, Florent; Salse, Jérôme; Crollius, Hugues Roest

    2015-01-01

    Comparative genomics combined with phylogenetic reconstructions are powerful approaches to study the evolution of genes and genomes. However, the current rapid expansion of the volume of genomic information makes it increasingly difficult to interrogate, integrate and synthesize comparative genome data while taking into account the maximum breadth of information available. GenomicusPlants (http://www.genomicus.biologie.ens.fr/genomicus-plants) is an extension of the Genomicus webserver that addresses this issue by allowing users to explore flowering plant genomes in an intuitive way, across the broadest evolutionary scales. Extant genomes of 26 flowering plants can be analyzed, as well as 23 ancestral reconstructed genomes. Ancestral gene order provides a long-term chronological view of gene order evolution, greatly facilitating comparative genomics and evolutionary studies. Four main interfaces ('views') are available where: (i) PhyloView combines phylogenetic trees with comparisons of genomic loci across any number of genomes; (ii) AlignView projects loci of interest against all other genomes to visualize its topological conservation; (iii) MatrixView compares two genomes in a classical dotplot representation; and (iv) Karyoview visualizes chromosome karyotypes 'painted' with colours of another genome of interest. All four views are interconnected and benefit from many customizable features.

  20. Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam

    DOE PAGES

    Chai, Juanjuan; Kora, Guruprasad; Ahn, Tae-Hyuk; Hyatt, Doug; Pan, Chongle

    2014-10-09

    To supply some background, phylogenetic studies have provided detailed knowledge on the evolutionary mechanisms of genes and species in Bacteria and Archaea. However, the evolution of cellular functions, represented by metabolic pathways and biological processes, has not been systematically characterized. Many clades in the prokaryotic tree of life have now been covered by sequenced genomes in GenBank. This enables a large-scale functional phylogenomics study of many computationally inferred cellular functions across all sequenced prokaryotes. Our results show a total of 14,727 GenBank prokaryotic genomes were re-annotated using a new protein family database, UniFam, to obtain consistent functional annotations for accurate comparison. The functional profile of a genome was represented by the biological process Gene Ontology (GO) terms in its annotation. The GO term enrichment analysis differentiated the functional profiles between selected archaeal taxa. 706 prokaryotic metabolic pathways were inferred from these genomes using Pathway Tools and MetaCyc. The consistency between the distribution of metabolic pathways in the genomes and the phylogenetic tree of the genomes was measured using parsimony scores and retention indices. The ancestral functional profiles at the internal nodes of the phylogenetic tree were reconstructed to track the gains and losses of metabolic pathways in evolutionary history. In conclusion, our functional phylogenomics analysis shows divergent functional profiles of taxa and clades. Such function-phylogeny correlation stems from a set of clade-specific cellular functions with low parsimony scores. On the other hand, many cellular functions are sparsely dispersed across many clades with high parsimony scores. These different types of cellular functions have distinct evolutionary patterns reconstructed from the prokaryotic tree.

  1. Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam

    SciTech Connect

    Chai, Juanjuan; Kora, Guruprasad; Ahn, Tae-Hyuk; Hyatt, Doug; Pan, Chongle

    2014-10-09

    To supply some background, phylogenetic studies have provided detailed knowledge on the evolutionary mechanisms of genes and species in Bacteria and Archaea. However, the evolution of cellular functions, represented by metabolic pathways and biological processes, has not been systematically characterized. Many clades in the prokaryotic tree of life have now been covered by sequenced genomes in GenBank. This enables a large-scale functional phylogenomics study of many computationally inferred cellular functions across all sequenced prokaryotes. Our results show a total of 14,727 GenBank prokaryotic genomes were re-annotated using a new protein family database, UniFam, to obtain consistent functional annotations for accurate comparison. The functional profile of a genome was represented by the biological process Gene Ontology (GO) terms in its annotation. The GO term enrichment analysis differentiated the functional profiles between selected archaeal taxa. 706 prokaryotic metabolic pathways were inferred from these genomes using Pathway Tools and MetaCyc. The consistency between the distribution of metabolic pathways in the genomes and the phylogenetic tree of the genomes was measured using parsimony scores and retention indices. The ancestral functional profiles at the internal nodes of the phylogenetic tree were reconstructed to track the gains and losses of metabolic pathways in evolutionary history. In conclusion, our functional phylogenomics analysis shows divergent functional profiles of taxa and clades. Such function-phylogeny correlation stems from a set of clade-specific cellular functions with low parsimony scores. On the other hand, many cellular functions are sparsely dispersed across many clades with high parsimony scores. These different types of cellular functions have distinct evolutionary patterns reconstructed from the prokaryotic tree.
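
    The parsimony score and retention index mentioned in the abstract above can be sketched for a single pathway treated as a presence/absence character. The Python example below assumes Fitch's small-parsimony algorithm on a rooted binary tree and the standard retention index formula RI = (g - s) / (g - m); the abstract does not state which software or algorithm the authors used, and the tree, taxon names and pathway distributions are hypothetical.

```python
def fitch_score(tree, states):
    """Fitch small parsimony on a rooted binary tree.

    `tree` is a leaf name (str) or a pair (left, right); `states` maps each
    leaf name to 1 (pathway present) or 0 (absent).
    Returns (possible states at this node, minimum number of changes below it).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_score(left, states)
    right_set, right_cost = fitch_score(right, states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def retention_index(tree, states):
    """RI = (g - s) / (g - m) for one binary character.

    s: observed parsimony score on the tree
    g: maximum possible score, min(#absent, #present) for a binary character
    m: minimum possible score, 1 if the character varies among the leaves
    Returns None when the character is uninformative (g == m).
    Assumes `states` contains exactly the leaves of `tree`.
    """
    _, s = fitch_score(tree, states)
    present = sum(states.values())
    g = min(present, len(states) - present)
    m = 1 if g > 0 else 0
    if g == m:
        return None
    return (g - s) / (g - m)


# Hypothetical six-taxon tree and two invented pathway distributions.
tree = ((("A", "B"), ("C", "D")), ("E", "F"))
clade_specific = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0, "F": 0}
dispersed = {"A": 1, "B": 0, "C": 1, "D": 0, "E": 0, "F": 0}

clade_steps = fitch_score(tree, clade_specific)[1]
dispersed_steps = fitch_score(tree, dispersed)[1]
clade_ri = retention_index(tree, clade_specific)
dispersed_ri = retention_index(tree, dispersed)
```

    In this toy tree the clade-specific pathway needs a single change and has a retention index of 1, while the dispersed pathway needs two changes and has a retention index of 0, matching the contrast between low and high parsimony scores drawn in the abstract.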

  2. RNA Interference for Functional Genomics and Improvement of Cotton (Gossypium species)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    RNA interference (RNAi) is a powerful new technology in the discovery of genetic sequence functions, and has become a valuable tool for functional genomics of cotton (Gossypium spp.). The rapid adoption of RNAi has replaced previous antisense technology. RNAi has aided in the discovery of function ...

  3. Discovery of biological networks from diverse functional genomic data

    PubMed Central

    Myers, Chad L; Robson, Drew; Wible, Adam; Hibbs, Matthew A; Chiriac, Camelia; Theesfeld, Chandra L; Dolinski, Kara; Troyanskaya, Olga G

    2005-01-01

    We have developed a general probabilistic system for query-based discovery of pathway-specific networks through integration of diverse genome-wide data. This framework was validated by accurately recovering known networks for 31 biological processes in Saccharomyces cerevisiae and experimentally verifying predictions for the process of chromosomal segregation. Our system, bioPIXIE, a public, comprehensive system for integration, analysis, and visualization of biological network predictions for S. cerevisiae, is freely accessible over the worldwide web. PMID:16420673

  4. Genome-wide high-throughput SNP discovery and genotyping for understanding natural (functional) allelic diversity and domestication patterns in wild chickpea

    PubMed Central

    Bajaj, Deepak; Das, Shouvik; Badoni, Saurabh; Kumar, Vinod; Singh, Mohar; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    We identified 82489 high-quality genome-wide SNPs from 93 wild and cultivated Cicer accessions through integrated reference genome- and de novo-based GBS assays. High intra- and inter-specific polymorphic potential (66–85%) and broader natural allelic diversity (6–64%) detected by genome-wide SNPs among accessions signify their efficacy for monitoring introgression and transferring target trait-regulating genomic (gene) regions/allelic variants from wild to cultivated Cicer gene pools for genetic improvement. The population-specific assignment of wild Cicer accessions pertaining to the primary gene pool are more influenced by geographical origin/phenotypic characteristics than species/gene-pools of origination. The functional significance of allelic variants (non-synonymous and regulatory SNPs) scanned from transcription factors and stress-responsive genes in differentiating wild accessions (with potential known sources of yield-contributing and stress tolerance traits) from cultivated desi and kabuli accessions, fine-mapping/map-based cloning of QTLs and determination of LD patterns across wild and cultivated gene-pools are suitably elucidated. The correlation between phenotypic (agromorphological traits) and molecular diversity-based admixed domestication patterns within six structured populations of wild and cultivated accessions via genome-wide SNPs was apparent. This suggests utility of whole genome SNPs as a potential resource for identifying naturally selected trait-regulating genomic targets/functional allelic variants adaptive to diverse agroclimatic regions for genetic enhancement of cultivated gene-pools. PMID:26208313

  5. Genome-Wide Distribution, Organisation and Functional Characterization of Disease Resistance and Defence Response Genes across Rice Species.

    PubMed

    Singh, Sangeeta; Chand, Suresh; Singh, N K; Sharma, Tilak Raj

    2015-01-01

    The resistance (R) genes and defense response (DR) genes have become very important resources for the development of disease resistant cultivars. In the present investigation, genome-wide identification, expression, phylogenetic and synteny analysis was done for R and DR-genes across three species of rice viz: Oryza sativa ssp indica cv 93-11, Oryza sativa ssp japonica and wild rice species, Oryza brachyantha. We used the in silico approach to identify and map 786 R-genes and 167 DR-genes, 672 R-genes and 142 DR-genes, 251 R-genes and 86 DR-genes in the japonica, indica and O. brachyantha genomes, respectively. Our analysis showed that 60.5% and 55.6% of the R-genes are tandemly repeated within clusters and distributed over all the rice chromosomes in indica and japonica genomes, respectively. The phylogenetic analysis along with motif distribution shows high degree of conservation of R- and DR-genes in clusters. In silico expression analysis of R-genes and DR-genes showed more than 85% were expressed genes showing corresponding EST matches in the databases. This study gave special emphasis on mechanisms of gene evolution and duplication for R and DR genes across species. Analysis of paralogs across rice species indicated 17% and 4.38% R-genes, 29% and 11.63% DR-genes duplication in indica and Oryza brachyantha, as compared to 20% and 26% duplication of R-genes and DR-genes in japonica respectively. We found that during the course of duplication only 9.5% of R- and DR-genes changed their function and rest of the genes have maintained their identity. Syntenic relationship across three genomes inferred that more orthology is shared between indica and japonica genomes as compared to brachyantha genome. Genome wide identification of R-genes and DR-genes in the rice genome will help in allele mining and functional validation of these genes, and to understand molecular mechanism of disease resistance and their evolution in rice and related species.

  6. Comparative mitochondrial genomics of snakes: extraordinary substitution rate dynamics and functionality of the duplicate control region

    PubMed Central

    Jiang, Zhi J; Castoe, Todd A; Austin, Christopher C; Burbrink, Frank T; Herron, Matthew D; McGuire, Jimmy A; Parkinson, Christopher L; Pollock, David D

    2007-01-01

    Background The mitochondrial genomes of snakes are characterized by an overall evolutionary rate that appears to be one of the most accelerated among vertebrates. They also possess other unusual features, including short tRNAs and other genes, and a duplicated control region that has been stably maintained since it originated more than 70 million years ago. Here, we provide a detailed analysis of evolutionary dynamics in snake mitochondrial genomes to better understand the basis of these extreme characteristics, and to explore the relationship between mitochondrial genome molecular evolution, genome architecture, and molecular function. We sequenced complete mitochondrial genomes from Slowinski's corn snake (Pantherophis slowinskii) and two cottonmouths (Agkistrodon piscivorus) to complement previously existing mitochondrial genomes, and to provide an improved comparative view of how genome architecture affects molecular evolution at contrasting levels of divergence. Results We present a Bayesian genetic approach that suggests that the duplicated control region can function as an additional origin of heavy strand replication. The two control regions also appear to have different intra-specific versus inter-specific evolutionary dynamics that may be associated with complex modes of concerted evolution. We find that different genomic regions have experienced substantial accelerated evolution along early branches in snakes, with different genes having experienced dramatic accelerations along specific branches. Some of these accelerations appear to coincide with, or subsequent to, the shortening of various mitochondrial genes and the duplication of the control region and flanking tRNAs. Conclusion Fluctuations in the strength and pattern of selection during snake evolution have had widely varying gene-specific effects on substitution rates, and these rate accelerations may have been functionally related to unusual changes in genomic architecture. The among-lineage and

  7. Function-selective domain architecture plasticity potentials in eukaryotic genome evolution.

    PubMed

    Linkeviciute, Viktorija; Rackham, Owen J L; Gough, Julian; Oates, Matt E; Fang, Hai

    2015-12-01

    To help evaluate how protein function impacts on genome evolution, we introduce a new concept of 'architecture plasticity potential' - the capacity to form distinct domain architectures - both for an individual domain, or more generally for a set of domains grouped by shared function. We devise a scoring metric to measure the plasticity potential for these domain sets, and evaluate how function has changed over time for different species. Applying this metric to a phylogenetic tree of eukaryotic genomes, we find that the involvement of each function is not random but highly selective. For certain lineages there is strong bias for evolution to involve domains related to certain functions. In general eukaryotic genomes, particularly animals, expand complex functional activities such as signalling and regulation, but at the cost of reducing metabolic processes. We also observe differential evolution of transcriptional regulation and a unique evolutionary role of channel regulators; crucially this is only observable in terms of the architecture plasticity potential. Our findings provide a new layer of information to understand the significance of function in eukaryotic genome evolution. A web search tool, available at http://supfam.org/Pevo, offers a wide spectrum of options for exploring functional importance in eukaryotic genome evolution.

  8. Function-selective domain architecture plasticity potentials in eukaryotic genome evolution

    PubMed Central

    Linkeviciute, Viktorija; Rackham, Owen J.L.; Gough, Julian; Oates, Matt E.; Fang, Hai

    2015-01-01

    To help evaluate how protein function impacts on genome evolution, we introduce a new concept of ‘architecture plasticity potential’ – the capacity to form distinct domain architectures – both for an individual domain, or more generally for a set of domains grouped by shared function. We devise a scoring metric to measure the plasticity potential for these domain sets, and evaluate how function has changed over time for different species. Applying this metric to a phylogenetic tree of eukaryotic genomes, we find that the involvement of each function is not random but highly selective. For certain lineages there is strong bias for evolution to involve domains related to certain functions. In general eukaryotic genomes, particularly animals, expand complex functional activities such as signalling and regulation, but at the cost of reducing metabolic processes. We also observe differential evolution of transcriptional regulation and a unique evolutionary role of channel regulators; crucially this is only observable in terms of the architecture plasticity potential. Our findings provide a new layer of information to understand the significance of function in eukaryotic genome evolution. A web search tool, available at http://supfam.org/Pevo, offers a wide spectrum of options for exploring functional importance in eukaryotic genome evolution. PMID:25980317

  9. Expanding the roles of chromatin insulators in nuclear architecture, chromatin organization and genome function.

    PubMed

    Schoborg, Todd; Labrador, Mariano

    2014-11-01

    Of the numerous classes of elements involved in modulating eukaryotic chromosome structure and function, chromatin insulators arguably remain the most poorly understood in their contribution to these processes in vivo. Indeed, our view of chromatin insulators has evolved dramatically since their chromatin boundary and enhancer blocking properties were elucidated roughly a quarter of a century ago as a result of recent genome-wide, high-throughput methods better suited to probing the role of these elements in their native genomic contexts. The overall theme that has emerged from these studies is that chromatin insulators function as general facilitators of higher-order chromatin loop structures that exert both physical and functional constraints on the genome. In this review, we summarize the result of recent work that supports this idea as well as a number of other studies linking these elements to a diverse array of nuclear processes, suggesting that chromatin insulators exert master control over genome organization and behavior.

  10. Rosetta stone method for detecting protein function and protein-protein interactions from genome sequences

    DOEpatents

    Eisenberg, David; Marcotte, Edward M.; Pellegrini, Matteo; Thompson, Michael J.; Yeates, Todd O.

    2002-10-15

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
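
    The two ideas in this abstract can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the patented procedure: homolog detection is taken as already done, proteins are linked only when their presence/absence profiles are identical, and all protein and genome names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_profiles(presence, genomes):
    """Turn {protein: set of genomes with a homolog} into 0/1 profile tuples."""
    return {
        protein: tuple(int(g in found) for g in genomes)
        for protein, found in presence.items()
    }


def linked_by_profile(profiles):
    """Group proteins that share an identical phylogenetic profile."""
    groups = defaultdict(list)
    for protein, profile in profiles.items():
        groups[profile].append(protein)
    return sorted(sorted(members) for members in groups.values() if len(members) > 1)


def rosetta_pairs(fusion_hits):
    """Propose links between proteins whose homologs are fused in one chain.

    fusion_hits: {fused protein: list of separate proteins it is homologous to}.
    """
    pairs = set()
    for components in fusion_hits.values():
        for a, b in combinations(sorted(set(components)), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical toy data: which genomes contain a homolog of each protein.
genomes = ["g1", "g2", "g3"]
presence = {
    "P1": {"g1", "g3"},
    "P2": {"g1", "g3"},
    "P3": {"g2"},
    "P4": {"g1", "g2", "g3"},
}
profiles = build_profiles(presence, genomes)
profile_links = linked_by_profile(profiles)
fusion_links = rosetta_pairs({"AB": ["B'", "A'"]})
```

    Requiring identical profiles is the strictest possible choice; a distance threshold between profiles would be a natural relaxation, but the abstract does not say which criterion is used.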

  11. De novo transcriptome sequencing facilitates genomic resource generation in Tinospora cordifolia.

    PubMed

    Singh, Rakesh; Kumar, Rajesh; Mahato, Ajay Kumar; Paliwal, Ritu; Singh, Amit Kumar; Kumar, Sundeep; Marla, Soma S; Kumar, Ashok; Singh, Nagendra K

    2016-09-01

    Tinospora cordifolia is known for its medicinal properties, owing to the presence of useful secondary metabolites such as terpenes, glycosides, steroids, alkaloids, and flavonoids. However, little information is available on the critical genomic elements (ESTs, molecular markers) necessary for judicious exploitation of its germplasm. We employed 454 GS-FLX pyrosequencing of entire transcripts, and altogether ∼25 K assembled transcripts or expressed sequence tags (ESTs) were identified. As the interest in T. cordifolia is primarily due to its secondary metabolite constituents, the ESTs pertaining to the terpenoid biosynthetic pathway were identified in the present study. Additionally, several ESTs were assigned to different transcription factor families. To validate our transcript dataset, novel EST-SSR markers were generated to assess the genetic diversity among germplasm of T. cordifolia. These EST-SSR markers were found to be polymorphic, and the dendrogram based on the Dice similarity index revealed three distinct clusters of accessions. The present study demonstrates the effectiveness of using both NEWBLER and MIRA sequence read assembler software for enriching the transcript dataset and thus enables better exploitation of EST resources for mining candidate genes and designing molecular markers. PMID:27465295
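
    For readers unfamiliar with the Dice index mentioned above, the following is a minimal sketch of how it can be computed from marker presence/absence data. The accession names and marker alleles are hypothetical, and the abstract does not state how the study scored its EST-SSR bands.

```python
def dice_similarity(a, b):
    """Dice coefficient between two sets of present alleles: 2|A & B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # two empty marker sets are treated as identical (an assumption)
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical accessions, each described by the set of marker alleles it carries.
accessions = {
    "acc1": {"m1", "m2", "m3"},
    "acc2": {"m2", "m3", "m4"},
    "acc3": {"m5"},
}

# Pairwise similarity matrix as a dict keyed by accession pairs.
similarity = {
    (x, y): dice_similarity(accessions[x], accessions[y])
    for x in accessions
    for y in accessions
}
```

    A matrix like `similarity` (or one minus it, as a distance) is the usual input to the hierarchical clustering that produces a dendrogram.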

  12. The exomes of the NCI-60 panel: a genomic resource for cancer biology and systems pharmacology.

    PubMed

    Abaan, Ogan D; Polley, Eric C; Davis, Sean R; Zhu, Yuelin J; Bilke, Sven; Walker, Robert L; Pineda, Marbin; Gindin, Yevgeniy; Jiang, Yuan; Reinhold, William C; Holbeck, Susan L; Simon, Richard M; Doroshow, James H; Pommier, Yves; Meltzer, Paul S

    2013-07-15

    The NCI-60 cell lines are the most frequently studied human tumor cell lines in cancer research. This panel has generated the most extensive cancer pharmacology database worldwide. In addition, these cell lines have been intensely investigated, providing a unique platform for hypothesis-driven research focused on enhancing our understanding of tumor biology. Here, we report a comprehensive analysis of coding variants in the NCI-60 panel of cell lines identified by whole exome sequencing, providing a list of possible cancer specific variants for the community. Furthermore, we identify pharmacogenomic correlations between specific variants in genes such as TP53, BRAF, ERBBs, and ATAD5 and anticancer agents such as nutlin, vemurafenib, erlotinib, and bleomycin showing one of many ways the data could be used to validate and generate novel hypotheses for further investigation. As new cancer genes are identified through large-scale sequencing studies, the data presented here for the NCI-60 will be an invaluable resource for identifying cell lines with mutations in such genes for hypothesis-driven research. To enhance the utility of the data for the greater research community, the genomic variants are freely available in different formats and from multiple sources including the CellMiner and Ingenuity websites. PMID:23856246

  13. Databases of genomic variation and phenotypes: existing resources and future needs

    PubMed Central

    Johnston, Jennifer J.; Biesecker, Leslie G.

    2013-01-01

    Massively parallel sequencing (MPS) has become an important tool for identifying medically significant variants in both research and the clinic. Accurate variation and genotype–phenotype databases are critical in our ability to make sense of the vast amount of information that MPS generates. The purpose of this review is to summarize the state of the art of variation and genotype–phenotype databases, how they can be used, and opportunities to improve these resources. Our working assumption is that the objective of the clinical genomicist is to identify highly penetrant variants that could explain existing disease or predict disease risk for individual patients or research participants. We have detailed how current databases contribute to this goal providing frequency data, literature reviews and predictions of causation for individual variants. For variant annotation, databases vary greatly in their ease of use, the use of standard mutation nomenclature, the comprehensiveness of the variant cataloging and the degree of expert opinion. Ultimately, we need a dynamic and comprehensive reference database of medically important variants that is easily cross referenced to exome and genome sequence data and allows for an accumulation of expert opinion. PMID:23962721

  14. The infectious BAC genomic DNA expression library: a high capacity vector system for functional genomics

    PubMed Central

    Lufino, Michele M. P.; Edser, Pauline A. H.; Quail, Michael A.; Rice, Stephen; Adams, David J.; Wade-Martins, Richard

    2016-01-01

    Gene dosage plays a critical role in a range of cellular phenotypes, yet most cellular expression systems use heterologous cDNA-based vectors which express proteins well above physiological levels. In contrast, genomic DNA expression vectors generate physiologically-relevant levels of gene expression by carrying the whole genomic DNA locus of a gene including its regulatory elements. Here we describe the first genomic DNA expression library generated using the high-capacity herpes simplex virus-1 amplicon technology to deliver bacterial artificial chromosomes (BACs) into cells by viral transduction. The infectious BAC (iBAC) library contains 184,320 clones with an average insert size of 134.5 kb. We show in a Chinese hamster ovary (CHO) disease model cell line and mouse embryonic stem (ES) cells that this library can be used for genetic rescue studies in a range of contexts including the physiological restoration of Ldlr deficiency, and viral receptor expression. The iBAC library represents an important new genetic analysis tool openly available to the research community. PMID:27353647

  15. Functional genomic and high-content screening for target discovery and deconvolution

    PubMed Central

    Heynen-Genel, Susanne; Pache, Lars; Chanda, Sumit K

    2014-01-01

    Introduction: Functional genomic screens apply knowledge gained from the sequencing of the human genome toward rapid methods of identifying genes involved in cellular function based on a specific phenotype. This approach has been made possible through the use of advances in both molecular biology and automation. The utility of this approach has been further enhanced through the application of image-based high content screening, an automated microscopy and quantitative image analysis platform. These approaches can significantly enhance acquisition of novel targets for drug discovery. Areas covered: Both the utility and potential issues associated with functional genomic screening approaches are discussed along with examples that illustrate both. The considerations for high content screening applied to functional genomics are also presented. Expert opinion: Functional genomic and high content screening are extremely useful in the identification of new drug targets. However, the technical, experimental, and computational parameters have an enormous influence on the results. Thus, although new targets are identified, caution should be applied toward interpretation of screening data in isolation. Genomic screens should be viewed as an integral component of a target identification campaign that requires both the acquisition of orthogonal data, as well as a rigorous validation strategy. PMID:22860749

  16. ScreenBEAM: a novel meta-analysis algorithm for functional genomics screens via Bayesian hierarchical modeling | Office of Cancer Genomics

    Cancer.gov

    Functional genomics (FG) screens, using RNAi or CRISPR technology, have become a standard tool for systematic, genome-wide loss-of-function studies for therapeutic target discovery. As in many large-scale assays, however, off-target effects, variable reagent potency and experimental noise must be accounted for to appropriately control for false positives.

  17. The Arabidopsis Information Resource: Making and Mining the ‘Gold Standard’ Annotated Reference Plant Genome

    PubMed Central

    Berardini, Tanya Z.; Reiser, Leonore; Li, Donghui; Mezheritsky, Yarik; Muller, Robert; Strait, Emily; Huala, Eva

    2015-01-01

    The Arabidopsis Information Resource (TAIR) is a continuously updated, online database of genetic and molecular biology data for the model plant Arabidopsis thaliana that provides a global research community with centralized access to data for over 30,000 Arabidopsis genes. TAIR’s biocurators systematically extract, organize, and interconnect experimental data from the literature along with computational predictions, community submissions, and high throughput datasets to present a high quality and comprehensive picture of Arabidopsis gene function. TAIR provides tools for data visualization and analysis, and enables ordering of seed and DNA stocks, protein chips and other experimental resources. TAIR actively engages with its users who contribute expertise and data that augments the work of the curatorial staff. TAIR’s focus in an extensive and evolving ecosystem of online resources for plant biology is on the critically important role of extracting experimentally-based research findings from the literature and making that information computationally accessible. In response to the loss of government grant funding, the TAIR team founded a nonprofit entity, Phoenix Bioinformatics, with the aim of developing sustainable funding models for biological databases, using TAIR as a test case. Phoenix has successfully transitioned TAIR to subscription-based funding while still keeping its data relatively open and accessible. PMID:26201819

  18. Data for constructing insect genome content matrices for phylogenetic analysis and functional annotation

    PubMed Central

    Rosenfeld, Jeffrey; Foox, Jonathan; DeSalle, Rob

    2015-01-01

    Twenty-one fully sequenced and well annotated insect genomes were used to construct genome content matrices for phylogenetic analysis and functional annotation of insect genomes. To examine the role of e-value cutoff in ortholog determination we used scaled e-value cutoffs and a single linkage clustering approach. The present communication includes (1) a list of the genomes used to construct the genome content phylogenetic matrices, (2) a nexus file with the data matrices used in phylogenetic analysis, (3) a nexus file with the Newick trees generated by phylogenetic analysis, (4) an Excel file listing the Core (CORE) genes and Unique (UNI) genes found in five insect groups, and (5) a figure showing a plot of consistency index (CI) versus percent of unannotated genes that are apomorphies in the data set for gene losses and gains and bar plots of gains and losses for four consistency index (CI) cutoffs. PMID:26862572
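
    The single linkage step mentioned above can be illustrated with a short union-find sketch. It assumes pairwise similarity hits are already available as (gene, gene, e-value) triples and uses one fixed cutoff; the abstract does not describe how its cutoffs were scaled, so that part is not modelled, and the gene names are hypothetical.

```python
def single_linkage_clusters(hits, cutoff):
    """Cluster genes by single linkage: join two genes if any hit has e-value <= cutoff.

    hits: iterable of (gene_a, gene_b, evalue) triples.
    Returns a sorted list of sorted gene lists, one per cluster.
    """
    parent = {}

    def find(x):
        # Register unseen genes as their own root, then walk up with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, evalue in hits:
        root_a, root_b = find(a), find(b)  # registers both genes even if not joined
        if evalue <= cutoff and root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for gene in list(parent):
        clusters.setdefault(find(gene), set()).add(gene)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise hits between four genes.
toy_hits = [("a", "b", 1e-50), ("b", "c", 1e-20), ("c", "d", 1e-3)]

strict = single_linkage_clusters(toy_hits, cutoff=1e-30)
moderate = single_linkage_clusters(toy_hits, cutoff=1e-10)
loose = single_linkage_clusters(toy_hits, cutoff=1e-2)
```

    Comparing `strict`, `moderate` and `loose` shows why the cutoff matters: because single linkage chains through intermediate genes, relaxing the cutoff can merge whole families into one putative ortholog group.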

  19. Data for constructing insect genome content matrices for phylogenetic analysis and functional annotation.

    PubMed

    Rosenfeld, Jeffrey; Foox, Jonathan; DeSalle, Rob

    2016-03-01

    Twenty-one fully sequenced and well annotated insect genomes were used to construct genome content matrices for phylogenetic analysis and functional annotation of insect genomes. To examine the role of e-value cutoff in ortholog determination we used scaled e-value cutoffs and a single linkage clustering approach. The present communication includes (1) a list of the genomes used to construct the genome content phylogenetic matrices, (2) a nexus file with the data matrices used in phylogenetic analysis, (3) a nexus file with the Newick trees generated by phylogenetic analysis, (4) an Excel file listing the Core (CORE) genes and Unique (UNI) genes found in five insect groups, and (5) a figure showing a plot of consistency index (CI) versus percent of unannotated genes that are apomorphies in the data set for gene losses and gains and bar plots of gains and losses for four consistency index (CI) cutoffs. PMID:26862572

  20. Next-Generation High-Throughput Functional Annotation of Microbial Genomes

    PubMed Central

    Baric, Ralph S.; Damania, Blossom; Miller, Samuel I.; Rubin, Eric J.

    2016-01-01

    Host infection by microbial pathogens cues global changes in microbial and host cell biology that facilitate microbial replication and disease. The complete maps of thousands of bacterial and viral genomes have recently been defined; however, the rate at which physiological or biochemical functions have been assigned to genes has greatly lagged. The National Institute of Allergy and Infectious Diseases (NIAID) addressed this gap by creating functional genomics centers dedicated to developing high-throughput approaches to assign gene function. These centers require broad-based and collaborative research programs to generate and integrate diverse data to achieve a comprehensive understanding of microbial pathogenesis. High-throughput functional genomics can lead to new therapeutics and better understanding of the next generation of emerging pathogens by rapidly defining new general mechanisms by which organisms cause disease and replicate in host tissues and by facilitating the rate at which functional data reach the scientific community. PMID:27703071

  1. Modelling Human Regulatory Variation in Mouse: Finding the Function in Genome-Wide Association Studies and Whole-Genome Sequencing

    PubMed Central

    Schmouth, Jean-François; Bonaguro, Russell J.; Corso-Diaz, Ximena; Simpson, Elizabeth M.

    2012-01-01

    An increasing body of literature from genome-wide association studies and human whole-genome sequencing highlights the identification of large numbers of candidate regulatory variants of potential therapeutic interest in numerous diseases. Our relatively poor understanding of the functions of non-coding genomic sequence, and the slow and laborious process of experimental validation of the functional significance of human regulatory variants, limits our ability to fully benefit from this information in our efforts to comprehend human disease. Humanized mouse models (HuMMs), in which human genes are introduced into the mouse, suggest an approach to this problem. In the past, HuMMs have been used successfully to study human disease variants; e.g., the complex genetic condition arising from Down syndrome, common monogenic disorders such as Huntington disease and β-thalassemia, and cancer susceptibility genes such as BRCA1. In this commentary, we highlight a novel method for high-throughput single-copy site-specific generation of HuMMs entitled High-throughput Human Genes on the X Chromosome (HuGX). This method can be applied to most human genes for which a bacterial artificial chromosome (BAC) construct can be derived and a mouse-null allele exists. This strategy comprises (1) the use of recombineering technology to create a human variant–harbouring BAC, (2) knock-in of this BAC into the mouse genome using Hprt docking technology, and (3) allele comparison by interspecies complementation. We demonstrate the throughput of the HuGX method by generating a series of seven different alleles for the human NR2E1 gene at Hprt. In future challenges, we consider the current limitations of experimental approaches and call for a concerted effort by the genetics community, for both human and mouse, to solve the challenge of the functional analysis of human regulatory variation. PMID:22396661

  2. High Throughput Technologies for Functional Analysis of Archael Genomics

    SciTech Connect

    El-Sayed, Najib M. A.

    1998-09-25

    The specific aims of this project were as follows: (1) to design primers to each predicted open reading frame (ORF) in M. jannaschii and M. thermoautotrophicum to allow the amplification of a unique target sequence that will represent the corresponding coding region on a complete genome chip; (2) to amplify each target sequence from M. jannaschii and M. thermoautotrophicum and verify that these PCR products are the expected DNA fragments; and (3) to establish a relational database that will track the production of target DNAs and the nucleotide sequence used to represent each ORF.

  3. Genomic islands link secondary metabolism to functional adaptation in marine Actinobacteria.

    PubMed

    Penn, Kevin; Jenkins, Caroline; Nett, Markus; Udwary, Daniel W; Gontang, Erin A; McGlinchey, Ryan P; Foster, Brian; Lapidus, Alla; Podell, Sheila; Allen, Eric E; Moore, Bradley S; Jensen, Paul R

    2009-10-01

    Genomic islands have been shown to harbor functional traits that differentiate ecologically distinct populations of environmental bacteria. A comparative analysis of the complete genome sequences of the marine Actinobacteria Salinispora tropica and Salinispora arenicola reveals that 75% of the species-specific genes are located in 21 genomic islands. These islands are enriched in genes associated with secondary metabolite biosynthesis providing evidence that secondary metabolism is linked to functional adaptation. Secondary metabolism accounts for 8.8% and 10.9% of the genes in the S. tropica and S. arenicola genomes, respectively, and represents the major functional category of annotated genes that differentiates the two species. Genomic islands harbor all 25 of the species-specific biosynthetic pathways, the majority of which occur in S. arenicola and may contribute to the cosmopolitan distribution of this species. Genome evolution is dominated by gene duplication and acquisition, which in the case of secondary metabolism provide immediate opportunities for the production of new bioactive products. Evidence that secondary metabolic pathways are exchanged horizontally, coupled with earlier evidence for fixation among globally distributed populations, supports a functional role and suggests that the acquisition of natural product biosynthetic gene clusters represents a previously unrecognized force driving bacterial diversification. Species-specific differences observed in clustered regularly interspaced short palindromic repeat sequences suggest that S. arenicola may possess a higher level of phage immunity, whereas a highly duplicated family of polymorphic membrane proteins provides evidence for a new mechanism of marine adaptation in Gram-positive bacteria.

  4. When parasitic wasps hijacked viruses: genomic and functional evolution of polydnaviruses.

    PubMed

    Herniou, Elisabeth A; Huguet, Elisabeth; Thézé, Julien; Bézier, Annie; Periquet, Georges; Drezen, Jean-Michel

    2013-09-19

    The Polydnaviridae (PDV), including the Bracovirus (BV) and Ichnovirus genera, originated from the integration of unrelated viruses in the genomes of two parasitoid wasp lineages, in a remarkable example of convergent evolution. Functionally active PDVs represent the most compelling evolutionary success among endogenous viral elements (EVEs). BV evolved from the domestication by braconid wasps of a nudivirus 100 Ma. The nudivirus genome has become an EVE involved in BV particle production but is not encapsidated. Instead, BV genomes have co-opted virulence genes, used by the wasps to control the immunity and development of their hosts. Gene transfers and duplications have shaped BV genomes, now encoding hundreds of genes. Phylogenomic studies suggest that BVs contribute largely to wasp diversification and adaptation to their hosts. A genome evolution model explains how multidirectional wasp adaptation to different host species could have fostered PDV genome extension. Integrative studies linking ecological data on the wasp to genomic analyses should provide new insights into the adaptive role of particular BV genes. Forthcoming genomic advances should also indicate if the associations between endoparasitoid wasps and symbiotic viruses evolved because of their particularly intimate interactions with their hosts, or if similar domesticated EVEs could be uncovered in other parasites.

  5. When parasitic wasps hijacked viruses: genomic and functional evolution of polydnaviruses.

    PubMed

    Herniou, Elisabeth A; Huguet, Elisabeth; Thézé, Julien; Bézier, Annie; Periquet, Georges; Drezen, Jean-Michel

    2013-09-19

    The Polydnaviridae (PDV), including the Bracovirus (BV) and Ichnovirus genera, originated from the integration of unrelated viruses in the genomes of two parasitoid wasp lineages, in a remarkable example of convergent evolution. Functionally active PDVs represent the most compelling evolutionary success among endogenous viral elements (EVEs). BV evolved from the domestication by braconid wasps of a nudivirus 100 Ma. The nudivirus genome has become an EVE involved in BV particle production but is not encapsidated. Instead, BV genomes have co-opted virulence genes, used by the wasps to control the immunity and development of their hosts. Gene transfers and duplications have shaped BV genomes, now encoding hundreds of genes. Phylogenomic studies suggest that BVs contribute largely to wasp diversification and adaptation to their hosts. A genome evolution model explains how multidirectional wasp adaptation to different host species could have fostered PDV genome extension. Integrative studies linking ecological data on the wasp to genomic analyses should provide new insights into the adaptive role of particular BV genes. Forthcoming genomic advances should also indicate if the associations between endoparasitoid wasps and symbiotic viruses evolved because of their particularly intimate interactions with their hosts, or if similar domesticated EVEs could be uncovered in other parasites. PMID:23938758

  6. Colibri: a functional data base for the Escherichia coli genome.

    PubMed Central

    Médigue, C; Viari, A; Hénaut, A; Danchin, A

    1993-01-01

    Several data libraries have been created to organize all the data obtained worldwide about the Escherichia coli genome. Because the known data now amount to more than 40% of the whole genome sequence, it has become necessary to organize the data in such a way that appropriate procedures can associate knowledge produced by experiments about each gene to its position on the chromosome and its relation to other relevant genes, for example. In addition, global properties of genes, affected by the introduction of new entries, should be present as appropriate description fields. A data base, implemented on Macintosh by using the data base management system 4th Dimension, is described. It is constructed around a core constituted by known contigs of E. coli sequences and links data collected in general libraries (unmodified) to data associated with evolving knowledge (with modifiable fields). Biologically significant results obtained through the coupling of appropriate procedures (learning or statistical data analysis) are presented. The data base is available through a 4th Dimension runtime and through FTP on Internet. It has been regularly updated and will be systematically linked to other E. coli data bases (M. Kroger, R. Wahl, G. Schachtel, and P. Rice, Nucleic Acids Res. 20(Suppl.):2119-2144, 1992; K. E. Rudd, W. Miller, C. Werner, J. Ostell, C. Tolstoshev, and S. G. Satterfield, Nucleic Acids Res. 19:637-647, 1991) in the near future. PMID:8246843

  7. Genetic screens and functional genomics using CRISPR/Cas9 technology.

    PubMed

    Hartenian, Ella; Doench, John G

    2015-04-01

    Functional genomics attempts to understand the genome by perturbing the flow of information from DNA to RNA to protein, in order to learn how gene dysfunction leads to disease. CRISPR/Cas9 technology is the newest tool in the geneticist's toolbox, allowing researchers to edit DNA with unprecedented ease, speed and accuracy, and representing a novel means to perform genome-wide genetic screens to discover gene function. In this review, we first summarize the discovery and characterization of CRISPR/Cas9, and then compare it to other genome engineering technologies. We discuss its initial use in screening applications, with a focus on optimizing on-target activity and minimizing off-target effects. Finally, we comment on future challenges and opportunities afforded by this technology.

  8. Advances in mRNA Silencing and Transgene Expression: a Gateway to Functional Genomics in Schistosomes

    PubMed Central

    Tchoubrieva, Elissaveta B.; Kalinna, Bernd H.

    2013-01-01

    Although the genome is not yet fully annotated, the completion of the WHO Schistosoma Genome Project in 2008 provides a golden opportunity to actively pursue fundamental research on the parasite's genome. This analysis will aid identification of targets for drugs, vaccines and markers for diagnostic tools as well as for studying the biological basis of drug resistance, infectivity and pathology. For the validation of drug and vaccine targets, the genomic sequence data are only of use if functional analyses can be conducted (in the parasite itself). Until recently, gene manipulation approaches had not been seriously addressed. This situation is now changing and rapid advances have been made in gene silencing and transgenesis of schistosomes. PMID:21415884

  9. Paired-end genomic signature tags: a method for the functional analysis of genomes and epigenomes.

    PubMed

    Dunn, John J; McCorkle, Sean R; Everett, Logan; Anderson, Carl W

    2007-01-01

    Because paired-end genomic signature tags are sequence-based, they have the potential to become an alternative to tiled microarray hybridization as a method for genome-wide localization of transcription factors and other sequence-specific DNA binding proteins. As outlined here, the method can also be used for global analysis of DNA methylation. One advantage of this approach is the ability to easily switch between different genome types without having to fabricate a new microarray for each and every DNA type. However, the method does have some disadvantages. Among the most rate-limiting steps of our PE-GST protocol are the need to concatemerize the diTAGs, size fractionate them and then clone them prior to sequencing. This is usually followed by additional steps to amplify and size select for long (≥ 500) concatemer inserts prior to sequencing. These time-consuming steps are important for standard DNA sequencing as they increase efficiency approximately 20- to 30-fold, since each amplified concatemer can now provide information on multiple tags; the limitation on data acquisition is read length during sequencing. However, the development of new sequencing methods such as 454 Life Sciences' new nanotechnology-based sequencing instrument (41) could increase tag sequencing efficiency by several orders of magnitude (≥ 100,000 diTAG reads/run), which is sufficient to provide in-depth global analysis of all ChIP PE-GSTs in a single run. This is because the lengths of our paired-end diTAGs (approximately 60 bp) fall well within the region of high accuracy for read lengths on this instrument. In principle, sequence analysis of diTAGs could begin as soon as they are generated, thereby completely bypassing the need for the concatemerization, sizing, downstream cloning steps and sequencing template purification.
In addition, our protocol places any one of several unique four-base long nucleotide sequences, such as GATC, between each and every diTAG pair, which could