Science.gov

Sample records for functional genomics tools

  1. Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence-Function Space and Genome Context to Discover Novel Functions.

    PubMed

    Gerlt, John A

    2017-08-22

    The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.
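
    As a rough illustration of what a sequence similarity network is, the sketch below links two proteins whenever their pairwise score meets a threshold and then reads off the connected components as candidate clusters. It is not the EFI-EST implementation: the protein identifiers, the scores, the threshold values, and the function names `build_ssn` and `clusters` are all hypothetical and chosen only for the example.

```python
def build_ssn(pairwise_scores, threshold):
    """Build an undirected network: nodes are proteins, edges join pairs
    whose similarity score is at least `threshold` (assumed scoring scheme)."""
    graph = {}
    for (a, b), score in pairwise_scores.items():
        # Register both proteins even if they end up with no edges.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clusters(graph):
    """Return the connected components of the network as a list of sets."""
    seen = set()
    result = []
    for start in graph:
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result


# Hypothetical pairwise scores between five made-up proteins.
scores = {
    ("P1", "P2"): 120,
    ("P2", "P3"): 95,
    ("P1", "P3"): 40,
    ("P3", "P4"): 30,
    ("P4", "P5"): 110,
}

ssn = build_ssn(scores, threshold=90)
groups = sorted(sorted(c) for c in clusters(ssn))
print(groups)
```

    Raising the threshold in this toy example splits clusters apart, which is the usual way such a network is explored: a stricter cutoff separates groups that a looser cutoff merges.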

  2. Genetic resources offer efficient tools for rice functional genomics research.

    PubMed

    Lo, Shuen-Fang; Fan, Ming-Jen; Hsing, Yue-Ie; Chen, Liang-Jwu; Chen, Shu; Wen, Ien-Chie; Liu, Yi-Lun; Chen, Ku-Ting; Jiang, Mirng-Jier; Lin, Ming-Kuang; Rao, Meng-Yen; Yu, Lin-Chih; Ho, Tuan-Hua David; Yu, Su-May

    2016-05-01

    Rice is an important crop and major model plant for monocot functional genomics studies. With the establishment of various genetic resources for rice genomics, the next challenge is to systematically assign functions to predicted genes in the rice genome. Compared with the robustness of genome sequencing and bioinformatics techniques, progress in understanding the function of rice genes has lagged, hampering the utilization of rice genes for cereal crop improvement. The use of transfer DNA (T-DNA) insertional mutagenesis offers the advantage of uniform distribution throughout the rice genome, but preferentially in gene-rich regions, resulting in direct gene knockout or activation of genes within 20-30 kb up- and downstream of the T-DNA insertion site and high gene tagging efficiency. Here, we summarize the recent progress in functional genomics using the T-DNA-tagged rice mutant population. We also discuss important features of T-DNA activation- and knockout-tagging and promoter-trapping of the rice genome in relation to mutant and candidate gene characterizations and how to more efficiently utilize rice mutant populations and datasets for high-throughput functional genomics and phenomics studies by forward and reverse genetics approaches. These studies may facilitate the translation of rice functional genomics research to improvements of rice and other cereal crops. © 2015 John Wiley & Sons Ltd.

  3. Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence–Function Space and Genome Context to Discover Novel Functions

    PubMed Central

    2017-01-01

    The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of “genomic enzymology” web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence–function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems. PMID:28826221

  4. Bovine Genome Database: new tools for gleaning function from the Bos taurus genome.

    PubMed

    Elsik, Christine G; Unni, Deepak R; Diesh, Colin M; Tayal, Aditi; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-04

    We report an update of the Bovine Genome Database (BGD) (http://BovineGenome.org). The goal of BGD is to support bovine genomics research by providing genome annotation and data mining tools. We have developed new genome and annotation browsers using JBrowse and WebApollo for two Bos taurus genome assemblies, the reference genome assembly (UMD3.1.1) and the alternate genome assembly (Btau_4.6.1). Annotation tools have been customized to highlight priority genes for annotation, and to aid annotators in selecting gene evidence tracks from 91 tissue specific RNAseq datasets. We have also developed BovineMine, based on the InterMine data warehousing system, to integrate the bovine genome, annotation, QTL, SNP and expression data with external sources of orthology, gene ontology, gene interaction and pathway information. BovineMine provides powerful query building tools, as well as customized query templates, and allows users to analyze and download genome-wide datasets. With BovineMine, bovine researchers can use orthology to leverage the curated gene pathways of model organisms, such as human, mouse and rat. BovineMine will be especially useful for gene ontology and pathway analyses in conjunction with GWAS and QTL studies. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Excisable cassettes: new tools for functional analysis of Streptomyces genomes.

    PubMed

    Raynal, Alain; Karray, Fatma; Tuphile, Karine; Darbon-Rongère, Emmanuelle; Pernodet, Jean-Luc

    2006-07-01

    The functional analysis of microbial genomes often requires gene inactivation. We constructed a set of cassettes consisting of single antibiotic resistance genes flanked by the attL and attR sites resulting from site-specific integration of the Streptomyces pSAM2 element. These cassettes can easily be used to inactivate genes by in-frame deletion in Streptomyces by a three-step strategy. In the first step, in Escherichia coli, the cassette is inserted into a cloned copy of the gene to be inactivated. In the second step, the gene is replaced by homologous recombination in Streptomyces, allowing substitution of the wild-type target gene with its inactivated counterpart. In the third step, the cassette can be removed by expression of the pSAM2 genes xis and int. The resulting strains are marker-free and contain an "attB-like" sequence of 33, 34, or 35 bp with no stop codon if the cassette is correctly chosen. Thus, a gene can be disrupted by creating an in-frame deletion, avoiding polar effects if downstream genes are cotranscribed with the target gene. A set of cassettes was constructed to contain a hygromycin or gentamicin resistance gene flanked by the attL and attR sites. The initial constructions carrying convenient cloning sites allow the insertion of any other marker gene. We tested insertion and excision by inserting a cassette into orf3, the third gene of an operon involved in spiramycin biosynthesis. We verified that the cassette exerted a polar effect on the transcription of downstream genes but that, after excision, complementation with orf3 alone restored spiramycin production.
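
    The reading-frame arithmetic behind "if the cassette is correctly chosen" can be sketched as follows. This is one interpretation of the abstract, not the authors' procedure: it assumes the frame is preserved when the net length change (scar length minus deleted length) is a multiple of three. The function name `in_frame_scars` and the example deletion sizes are hypothetical, and the stop-codon check is left out because the scar sequences are not given here.

```python
def in_frame_scars(deleted_bp, scar_options=(33, 34, 35)):
    """Return the scar lengths that keep the downstream reading frame,
    assuming the frame is kept when (scar - deleted) is divisible by 3."""
    return [s for s in scar_options if (s - deleted_bp) % 3 == 0]


# Hypothetical deletion sizes, in base pairs.
for deleted in (300, 301, 302):
    print(deleted, in_frame_scars(deleted))
```

    Because 33, 34, and 35 are three consecutive integers, exactly one of them satisfies the condition for any deletion size, which is consistent with the abstract offering three scar lengths.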

  6. Excisable Cassettes: New Tools for Functional Analysis of Streptomyces Genomes

    PubMed Central

    Raynal, Alain; Karray, Fatma; Tuphile, Karine; Darbon-Rongère, Emmanuelle; Pernodet, Jean-Luc

    2006-01-01

    The functional analysis of microbial genomes often requires gene inactivation. We constructed a set of cassettes consisting of single antibiotic resistance genes flanked by the attL and attR sites resulting from site-specific integration of the Streptomyces pSAM2 element. These cassettes can easily be used to inactivate genes by in-frame deletion in Streptomyces by a three-step strategy. In the first step, in Escherichia coli, the cassette is inserted into a cloned copy of the gene to be inactivated. In the second step, the gene is replaced by homologous recombination in Streptomyces, allowing substitution of the wild-type target gene with its inactivated counterpart. In the third step, the cassette can be removed by expression of the pSAM2 genes xis and int. The resulting strains are marker-free and contain an “attB-like” sequence of 33, 34, or 35 bp with no stop codon if the cassette is correctly chosen. Thus, a gene can be disrupted by creating an in-frame deletion, avoiding polar effects if downstream genes are cotranscribed with the target gene. A set of cassettes was constructed to contain a hygromycin or gentamicin resistance gene flanked by the attL and attR sites. The initial constructions carrying convenient cloning sites allow the insertion of any other marker gene. We tested insertion and excision by inserting a cassette into orf3, the third gene of an operon involved in spiramycin biosynthesis. We verified that the cassette exerted a polar effect on the transcription of downstream genes but that, after excision, complementation with orf3 alone restored spiramycin production. PMID:16820478

  7. New bioinformatic tool for quick identification of functionally relevant endogenous retroviral inserts in human genome.

    PubMed

    Garazha, Andrew; Ivanova, Alena; Suntsova, Maria; Malakhova, Galina; Roumiantsev, Sergey; Zhavoronkov, Alex; Buzdin, Anton

    2015-01-01

    Endogenous retroviruses (ERVs) and LTR retrotransposons (LRs) occupy ∼8% of the human genome. Deep sequencing technologies provide clues to understanding the functional relevance of individual ERVs/LRs by enabling direct identification of transcription factor binding sites (TFBS) and other landmarks of functional genomic elements. Here, we performed the genome-wide identification of human ERVs/LRs containing TFBS according to the ENCODE project. We created the first interactive ERV/LR database that groups the individual inserts according to their familial nomenclature, number of mapped TFBS and divergence from their consensus sequence. Information on any particular element can be easily extracted by the user. We also created a genome browser tool, which enables quick mapping of any ERV/LR insert according to genomic coordinates, known human genes and TFBS. These tools can be used to easily explore functionally relevant individual ERV/LRs, and for studying their impact on the regulation of human genes. Overall, we identified ∼110,000 ERV/LR genomic elements having TFBS. We propose a hypothesis of "domestication" of ERV/LR TFBS by the genome milieu, including subsequent stages of initial epigenetic repression, partial functional release, and further mutation-driven reshaping of TFBS in tight coevolution with the enclosing genomic loci.

  8. Technicolour transgenics: imaging tools for functional genomics in the mouse.

    PubMed

    Hadjantonakis, Anna-Katerina; Dickinson, Mary E; Fraser, Scott E; Papaioannou, Virginia E

    2003-08-01

    Over the past decade, a battery of powerful tools that encompass forward and reverse genetic approaches have been developed to dissect the molecular and cellular processes that regulate development and disease. The advent of genetically-encoded fluorescent proteins that are expressed in wild type and mutant mice, together with advances in imaging technology, make it possible to study these biological processes in many dimensions. Importantly, these technologies allow direct visual access to complex events as they happen in their native environment, which provides greater insights into mammalian biology than ever before.

  9. Molecular tools for functional genomics in filamentous fungi: recent advances and new strategies.

    PubMed

    Jiang, Dewei; Zhu, Wei; Wang, Yunchuan; Sun, Chang; Zhang, Ke-Qin; Yang, Jinkui

    2013-12-01

    Advances in genetic transformation techniques have made important contributions to molecular genetics. Various molecular tools and strategies have been developed for functional genomic analysis of filamentous fungi since the first DNA transformation was successfully achieved in Neurospora crassa in 1973. Increasing amounts of genomic data regarding filamentous fungi are continuously reported and large-scale functional studies have become common in a wide range of fungal species. In this review, various molecular tools used in filamentous fungi are compared and discussed, including methods for genetic transformation (e.g., protoplast transformation, electroporation, and microinjection), the construction of random mutant libraries (e.g., restriction enzyme mediated integration, transposon arrayed gene knockout, and Agrobacterium tumefaciens mediated transformation), and the analysis of gene function (e.g., RNA interference and transcription activator-like effector nucleases). We also focused on practical strategies that could enhance the efficiency of genetic manipulation in filamentous fungi, such as choosing a proper screening system and marker genes, assembling target-cassettes or vectors effectively, and transforming into strains that are deficient in the nonhomologous end joining pathway. In summary, we present an up-to-date review on the different molecular tools and latest strategies that have been successfully used in functional genomics in filamentous fungi.

  10. Functional Genomics Tools for Haemonchus contortus and Lessons From Other Helminths.

    PubMed

    Britton, C; Roberts, B; Marks, N D

    2016-01-01

    The availability of genome and transcriptome data for parasitic nematodes, including Haemonchus contortus, has highlighted the need to develop functional genomics tools. Comparative genomic analysis, particularly using data from the free-living nematode Caenorhabditis elegans, can help predict gene function. Reliable approaches to study function directly in parasitic nematodes are currently lacking. However, gene knockdown by RNA interference (RNAi) is being successfully used in schistosome and planarian species to define gene functions. Lessons from these systems may be applied to improve RNAi in H. contortus. Previous studies in H. contortus and related nematodes demonstrated reliable RNAi-mediated silencing of some genes, but not others. Current data suggest that susceptibility to RNAi in these nematodes is limited to genes expressed in sites accessible to the environment, such as the gut, amphids and excretory cell. Therefore, RNAi is functional in H. contortus, but improvements are needed to develop this system as a functional genomics platform. Here, we summarize RNAi studies on H. contortus and discuss the optimization of RNA delivery and improvements to culture methods to enhance larval development, protein turnover and the induction of phenotypic effects in vitro. The transgenic delivery of RNA or dominant-negative gene constructs and the recently developed CRISPR/Cas genome-editing technique are considered as potential alternative approaches for gene knockout. This is a key time to devote greater effort in progressing from genome to function, to improve our understanding of the biology of Haemonchus and identify novel targets for parasite control. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Algal Functional Annotation Tool from the DOE-UCLA Institute for Genomics and Proteomics

    DOE Data Explorer

    Lopez, David

    The Algal Functional Annotation Tool is a bioinformatics resource to visualize pathway maps, identify enriched biological terms, or convert gene identifiers to elucidate biological function in silico. These types of analysis are designed to work with lists of gene identifiers, such as those coming from transcriptome gene expression analysis. By analyzing the functional annotation of an interesting set of genes, common biological motifs may be elucidated and a first-pass analysis can point further research in the right direction. Currently, the following databases have been parsed, processed, and added to the tool: 1) Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathways Database, 2) MetaCyc Encyclopedia of Metabolic Pathways, 3) Panther Pathways Database, 4) Reactome Pathways Database, 5) Gene Ontology, 6) MapMan Ontology, 7) KOG (Eukaryotic Clusters of Orthologous Groups), 8) Pfam, 9) InterPro.
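
    To make "identify enriched biological terms" concrete, the sketch below scores one term against one gene list with a one-sided hypergeometric test. The abstract does not say which statistic the tool uses, so the choice of test is an assumption, and the function name `enrichment_p_value` and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(background, annotated, list_size, overlap):
    """One-sided hypergeometric p-value: the probability of seeing at least
    `overlap` annotated genes in a random list of `list_size` genes drawn
    from `background` genes, of which `annotated` carry the term."""
    upper = min(annotated, list_size)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background, list_size)


# Hypothetical counts: 10 genes in total, 3 carry the term,
# and all 3 genes in the list of interest carry it.
p = enrichment_p_value(background=10, annotated=3, list_size=3, overlap=3)
print(p)
```

    In practice a tool would repeat this for every term and correct for multiple testing; that step is omitted here.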

  12. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research.

    PubMed

    Conesa, Ana; Götz, Stefan; García-Gómez, Juan Miguel; Terol, Javier; Talón, Manuel; Robles, Montserrat

    2005-09-15

    We present here Blast2GO (B2G), a research tool designed with the main purpose of enabling Gene Ontology (GO) based data mining on sequence data for which no GO annotation is yet available. B2G joins in one application GO annotation based on similarity searches with statistical analysis and highlighted visualization on directed acyclic graphs. This tool offers a suitable platform for functional genomics research in non-model species. B2G is an intuitive and interactive desktop application that allows monitoring and comprehension of the whole annotation and analysis process. Blast2GO is freely available via Java Web Start at http://www.blast2go.de. http://www.blast2go.de -> Evaluation.

  13. Agrobacterium rhizogenes-induced cotton hairy root culture as an alternative tool for cotton functional genomics

    USDA-ARS's Scientific Manuscript database

    Although well-accepted as the ultimate method for cotton functional genomics, Agrobacterium tumefaciens-mediated cotton transformation is not widely used for functional analyses of cotton genes and their promoters since regeneration of cotton in tissue culture is lengthy and labor intensive. In cer...

  14. Modulated Modularity Clustering as an Exploratory Tool for Functional Genomic Inference

    PubMed Central

    Stone, Eric A.; Ayroles, Julien F.

    2009-01-01

    In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and to reduce scale and improve inference, clustering genes is common. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation. PMID:19424432
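
    The modularity objective mentioned above can be illustrated with a short sketch. It evaluates the standard weighted modularity of a given partition, Q = sum over communities of (internal weight / m) minus (community degree / 2m) squared, where m is the total edge weight. It does not reproduce MMC itself, which also modulates the edge weights and searches for the best partition. The function name `modularity`, the toy graph, and the assumption of an undirected graph without self-loops are all choices made for this example.

```python
def modularity(weights, communities):
    """Weighted modularity of a partition.

    weights: dict mapping an undirected edge (u, v), listed once, to its weight.
    communities: dict mapping each node to a community label.
    Assumes no self-loops.
    """
    total = sum(weights.values())  # m, the total edge weight
    inside = {}      # weight of edges with both ends in the same community
    degree_sum = {}  # summed weighted degree of each community
    for (u, v), w in weights.items():
        cu, cv = communities[u], communities[v]
        degree_sum[cu] = degree_sum.get(cu, 0.0) + w
        degree_sum[cv] = degree_sum.get(cv, 0.0) + w
        if cu == cv:
            inside[cu] = inside.get(cu, 0.0) + w
    return sum(
        inside.get(c, 0.0) / total - (d / (2.0 * total)) ** 2
        for c, d in degree_sum.items()
    )


# Toy graph: two triangles joined by a single edge, all weights 1.
edges = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
    ("c", "d"): 1.0,
}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(q_split, q_merged)
```

    Splitting the toy graph into its two triangles scores higher than lumping every node together, which is the behaviour a modularity-maximizing method exploits.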

  15. An Analysis of Adenovirus Genomes Using Whole Genome Software Tools

    PubMed Central

    Mahadevan, Padmanabhan

    2016-01-01

    The evolution of sequencing technology has led to an enormous increase in the number of genomes that have been sequenced. This is especially true in the field of virus genomics. In order to extract meaningful biological information from these genomes, whole genome data mining software tools must be utilized. Hundreds of tools have been developed to analyze biological sequence data. However, only some of these tools are user-friendly to biologists. Several of these tools that have been successfully used to analyze adenovirus genomes are described here. These include Artemis, EMBOSS, pDRAW, zPicture, CoreGenes, GeneOrder, and PipMaker. These tools provide functionalities such as visualization, restriction enzyme analysis, alignment, and proteome comparisons that are extremely useful in the bioinformatics analysis of adenovirus genomes. PMID:28293072

  16. Enabling functional genomics with genome engineering.

    PubMed

    Hilton, Isaac B; Gersbach, Charles A

    2015-10-01

    Advances in genome engineering technologies have made the precise control over genome sequence and regulation possible across a variety of disciplines. These tools can expand our understanding of fundamental biological processes and create new opportunities for therapeutic designs. The rapid evolution of these methods has also catalyzed a new era of genomics that includes multiple approaches to functionally characterize and manipulate the regulation of genomic information. Here, we review the recent advances of the most widely adopted genome engineering platforms and their application to functional genomics. This includes engineered zinc finger proteins, TALEs/TALENs, and the CRISPR/Cas9 system as nucleases for genome editing, transcription factors for epigenome editing, and other emerging applications. We also present current and potential future applications of these tools, as well as their current limitations and areas for future advances. © 2015 Hilton and Gersbach; Published by Cold Spring Harbor Laboratory Press.

  17. Enabling functional genomics with genome engineering

    PubMed Central

    Hilton, Isaac B.; Gersbach, Charles A.

    2015-01-01

    Advances in genome engineering technologies have made the precise control over genome sequence and regulation possible across a variety of disciplines. These tools can expand our understanding of fundamental biological processes and create new opportunities for therapeutic designs. The rapid evolution of these methods has also catalyzed a new era of genomics that includes multiple approaches to functionally characterize and manipulate the regulation of genomic information. Here, we review the recent advances of the most widely adopted genome engineering platforms and their application to functional genomics. This includes engineered zinc finger proteins, TALEs/TALENs, and the CRISPR/Cas9 system as nucleases for genome editing, transcription factors for epigenome editing, and other emerging applications. We also present current and potential future applications of these tools, as well as their current limitations and areas for future advances. PMID:26430154

  18. Algal functional annotation tool

    SciTech Connect

    2012-07-12

    Abstract BACKGROUND: Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, they are usually limited and integrate functional terms from a limited number of databases. Another challenge is the use of annotations to interpret large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, several annotation databases, such as DAVID, integrate data from multiple functional databases and reveal underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Due to renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendiums of algal genomic data. DESCRIPTION: The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG

  19. Geochip: A high throughput genomic tool for linking community structure to functions

    SciTech Connect

    Van Nostrand, Joy D.; Liang, Yuting; He, Zhili; Li, Guanghe; Zhou, Jizhong

    2009-01-30

    GeoChip is a comprehensive functional gene array that targets key functional genes involved in the geochemical cycling of N, C, and P, sulfate reduction, metal resistance and reduction, and contaminant degradation. Studies have shown the GeoChip to be a sensitive, specific, and high-throughput tool for microbial community analysis that has the power to link geochemical processes with microbial community structure. However, several challenges remain regarding the development and applications of microarrays for microbial community analysis.

  20. Mutagenesis as a Tool in Plant Genetics, Functional Genomics, and Breeding

    PubMed Central

    Sikora, Per; Chawade, Aakash; Larsson, Mikael; Olsson, Johanna; Olsson, Olof

    2011-01-01

    Plant mutagenesis is rapidly coming of age in the aftermath of recent developments in high-resolution molecular and biochemical techniques. By combining the high variation of mutagenised populations with novel screening methods, traits that are almost impossible to identify by conventional breeding are now being developed and characterised at the molecular level. This paper provides a comprehensive overview of the various techniques and workflows available to researchers today in the field of molecular breeding, and how these tools complement the ones already used in traditional breeding. Both genetic (Targeting Induced Local Lesions in Genomes; TILLING) and phenotypic screens are evaluated. Finally, different ways of bridging the gap between genotype and phenotype are discussed. PMID:22315587

  1. The SOFG Anatomy Entry List (SAEL): An Annotation Tool for Functional Genomics Data

    PubMed Central

    Parkinson, Helen; Aitken, Stuart; Baldock, Richard A.; Bard, Jonathan B. L.; Burger, Albert; Hayamizu, Terry F.; Rector, Alan; Ringwald, Martin; Rogers, Jeremy; Rosse, Cornelius; Stoeckert, Christian J.

    2004-01-01

    A great deal of data in functional genomics studies needs to be annotated with low-resolution anatomical terms. For example, gene expression assays based on manually dissected samples (microarray, SAGE, etc.) need high-level anatomical terms to describe sample origin. First-pass annotation in high-throughput assays (e.g. large-scale in situ gene expression screens or phenotype screens) and bibliographic applications, such as selection of keywords, would also benefit from a minimum set of standard anatomical terms. Although only simple terms are required, the researcher faces serious practical problems of inconsistency and confusion, given the different aims and the range of complexity of existing anatomy ontologies. A Standards and Ontologies for Functional Genomics (SOFG) group therefore initiated discussions between several of the major anatomical ontologies for higher vertebrates. As we report here, one result of these discussions is a simple, accessible, controlled vocabulary of gross anatomical terms, the SOFG Anatomy Entry List (SAEL). The SAEL is available from http://www.sofg.org and is intended as a resource for biologists, curators, bioinformaticians and developers of software supporting functional genomics. It can be used directly for annotation in the contexts described above. Importantly, each term is linked to the corresponding term in each of the major anatomy ontologies. Where the simple list does not provide enough detail or sophistication, therefore, the researcher can use the SAEL to choose the appropriate ontology and move directly to the relevant term as an entry point. The SAEL links will also be used to support computational access to the respective ontologies. PMID:18629134

  2. Bacterial Artificial Chromosomes: A Functional Genomics Tool for the Study of Positive-strand RNA Viruses.

    PubMed

    Yun, Sang-Im; Song, Byung-Hak; Kim, Jin-Kyoung; Lee, Young-Min

    2015-12-29

    Reverse genetics, an approach to rescue infectious virus entirely from a cloned cDNA, has revolutionized the field of positive-strand RNA viruses, whose genomes have the same polarity as cellular mRNA. The cDNA-based reverse genetics system is a seminal method that enables direct manipulation of the viral genomic RNA, thereby generating recombinant viruses for molecular and genetic studies of both viral RNA elements and gene products in viral replication and pathogenesis. It also provides a valuable platform that allows the development of genetically defined vaccines and viral vectors for the delivery of foreign genes. For many positive-strand RNA viruses such as Japanese encephalitis virus (JEV), however, the cloned cDNAs are unstable, posing a major obstacle to the construction and propagation of the functional cDNA. Here, the present report describes the strategic considerations in creating and amplifying a genetically stable full-length infectious JEV cDNA as a bacterial artificial chromosome (BAC) using the following general experimental procedures: viral RNA isolation, cDNA synthesis, cDNA subcloning and modification, assembly of a full-length cDNA, cDNA linearization, in vitro RNA synthesis, and virus recovery. This protocol provides a general methodology applicable to cloning full-length cDNA for a range of positive-strand RNA viruses, particularly those with a genome of >10 kb in length, into a BAC vector, from which infectious RNAs can be transcribed in vitro with a bacteriophage RNA polymerase.

  3. Bacterial Artificial Chromosomes: A Functional Genomics Tool for the Study of Positive-strand RNA Viruses

    PubMed Central

    Yun, Sang-Im; Song, Byung-Hak; Kim, Jin-Kyoung; Lee, Young-Min

    2015-01-01

    Reverse genetics, an approach to rescue infectious virus entirely from a cloned cDNA, has revolutionized the field of positive-strand RNA viruses, whose genomes have the same polarity as cellular mRNA. The cDNA-based reverse genetics system is a seminal method that enables direct manipulation of the viral genomic RNA, thereby generating recombinant viruses for molecular and genetic studies of both viral RNA elements and gene products in viral replication and pathogenesis. It also provides a valuable platform that allows the development of genetically defined vaccines and viral vectors for the delivery of foreign genes. For many positive-strand RNA viruses such as Japanese encephalitis virus (JEV), however, the cloned cDNAs are unstable, posing a major obstacle to the construction and propagation of the functional cDNA. The present report describes the strategic considerations in creating and amplifying a genetically stable full-length infectious JEV cDNA as a bacterial artificial chromosome (BAC) using the following general experimental procedures: viral RNA isolation, cDNA synthesis, cDNA subcloning and modification, assembly of a full-length cDNA, cDNA linearization, in vitro RNA synthesis, and virus recovery. This protocol provides a general methodology applicable to cloning full-length cDNA for a range of positive-strand RNA viruses, particularly those with a genome of >10 kb in length, into a BAC vector, from which infectious RNAs can be transcribed in vitro with a bacteriophage RNA polymerase. PMID:26780115

  4. The use of retroviruses as pharmaceutical tools for target discovery and validation in the field of functional genomics.

    PubMed

    Lorens, J B; Sousa, C; Bennett, M K; Molineaux, S M; Payan, D G

    2001-12-01

    Retrovirally mediated functional genomics enables identification of physiologically relevant cellular therapeutic targets. Unique properties of retroviruses make them ideal tools for the introduction of large and diverse libraries of potential genetic effectors to a variety of cell types. The identification and recovery of intracellular library elements responsible for altered disease responses establishes a direct basis for pharmaceutical development. Recent innovations in retroviral infection efficiency and expression control have broadened application of the methodology to include libraries of mutagenized cDNAs, peptides and ribozyme genetic effectors.

  5. Bacterial thermostable beta-glucanases as a tool for plant functional genomics.

    PubMed

    Abdeev, Rustam M; Abdeeva, Inna A; Bruskin, Sergey S; Musiychuk, Konstantin A; Goldenkova-Pavlova, Irina V; Piruzian, Eleonora S

    2009-05-01

    A new strategy for creating experimental models for functional genomics has been proposed. It is based on the expression in transgenic plants of genes from thermophilic bacteria encoding functional analogues of plant proteins with high specific activity and thermal stability. We have validated this strategy by comparing physiological, biochemical and molecular properties of control tobacco plants and transgenic plants expressing genes of beta-glucanases with different substrate specificity. We demonstrate that the expression of bacterial beta-1,3-1,4-glucanase gene exerts no significant influence on tobacco plant metabolism, while the expression of bacterial beta-1,3-glucanase affects plant metabolism only at early stages of growth and development. By contrast, the expression of bacterial beta-1,4-glucanase has a significant effect on transgenic tobacco plant metabolism, namely, it affects plant morphology, the thickness of the primary cell wall, phytohormonal status, and the relative sugar content. We propose a hypothesis of beta-glucanase action as an important factor of genetic regulation of metabolic processes in plants.

  6. A generalized transducing phage (phiIF3) for the genomically sequenced Serratia marcescens strain Db11: a tool for functional genomics of an opportunistic human pathogen.

    PubMed

    Petty, Nicola K; Foulds, Ian J; Pradel, Elizabeth; Ewbank, Jonathan J; Salmond, George P C

    2006-06-01

    A bacteriophage (phiIF3) capable of mediating generalized transduction in Serratia marcescens strain Db11 has been isolated and characterized. The genome of this Serratia strain has recently been sequenced, and the strain is likely to become the reference strain for S. marcescens researchers. phiIF3 is most likely a virulent phage, which can transduce markers at frequencies of 10^-6 transductants per p.f.u. It has a lipopolysaccharide receptor and was determined to have a latent period of 50 min and a burst size of approximately 100 phages. The phage DNA was resistant to digestion with restriction enzymes. Electron microscopy showed phiIF3 to be a member of the family Myoviridae. This is the first report of a generalized transducing phage able to infect Db11 and this phage will be a valuable tool for functional genomic analysis of the pathogen host.

  7. Genome Exploitation and Bioinformatics Tools

    NASA Astrophysics Data System (ADS)

    de Jong, Anne; van Heel, Auke J.; Kuipers, Oscar P.

    Bioinformatic tools can greatly improve the efficiency of bacteriocin screening efforts by limiting the number of strains that must be screened. Different classes of bacteriocins can be detected in genomes by looking at different features. Finding small bacteriocins can be especially challenging due to low homology and because small open reading frames (ORFs) are often omitted from annotations. In this chapter, several bioinformatic tools/strategies to identify bacteriocins in genomes are discussed.

  8. Plant functional genomics

    NASA Astrophysics Data System (ADS)

    Holtorf, Hauke; Guitton, Marie-Christine; Reski, Ralf

    2002-04-01

    Functional genome analysis of plants has entered the high-throughput stage. The complete genome information from key species such as Arabidopsis thaliana and rice is now available and will further boost the application of a range of new technologies to functional plant gene analysis. To broadly assign functions to unknown genes, different fast and multiparallel approaches are currently used and developed. These new technologies are based on known methods but are adapted and improved to accommodate for comprehensive, large-scale gene analysis, i.e. such techniques are novel in the sense that their design allows researchers to analyse many genes at the same time and at an unprecedented pace. Such methods allow analysis of the different constituents of the cell that help to deduce gene function, namely the transcripts, proteins and metabolites. Similarly the phenotypic variations of entire mutant collections can now be analysed in a much faster and more efficient way than before. The different methodologies have developed to form their own fields within the functional genomics technological platform and are termed transcriptomics, proteomics, metabolomics and phenomics. Gene function, however, cannot solely be inferred by using only one such approach. Rather, it is only by bringing together all the information collected by different functional genomic tools that one will be able to unequivocally assign functions to unknown plant genes. This review focuses on current technical developments and their impact on the field of plant functional genomics. The lower plant Physcomitrella is introduced as a new model system for gene function analysis, owing to its high rate of homologous recombination.

  9. Bean pod mottle virus: a new powerful tool for functional genomics studies in Pisum sativum.

    PubMed

    Meziadi, Chouaib; Blanchet, Sophie; Richard, Manon M S; Pilet-Nayel, Marie-Laure; Geffroy, Valérie; Pflieger, Stéphanie

    2016-08-01

    Pea (Pisum sativum L.) is an important legume worldwide. The importance of pea in arable rotations and its nutritional value for both human and animal consumption have fostered sustained production and different studies to improve agronomic traits of interest. Moreover, complete sequencing of the pea genome is currently underway and will lead to the identification of a large number of genes potentially associated with important agronomic traits. Because stable genetic transformation is laborious for pea, virus-induced gene silencing (VIGS) appears to be a powerful alternative technology for determining the function of unknown genes. In this work, we present a rapid and efficient viral inoculation method using infectious DNA plasmids of a Bean pod mottle virus (BPMV)-derived VIGS vector. Six pea genotypes with important genes controlling biotic and/or abiotic stresses were found susceptible to BPMV carrying a GFP reporter gene and showed fluorescence in both shoots and roots. In a second step, we investigated 37 additional pea genotypes and found that 30 were susceptible to BPMV and only 7 were resistant. The capacity of BPMV to induce silencing of endogenes was investigated in the most susceptible genotype using two visual reporter genes: PsPDS and PsKORRIGAN1 (PsKOR1) encoding PHYTOENE DESATURASE and a 1,4-β-D-glucanase, respectively. The features of the 'one-step' BPMV-derived VIGS vector include (i) the ease of rub-inoculation, without any need for biolistic or agro-inoculation procedures, (ii) a simple, cost-effective procedure and (iii) noninterference of viral symptoms with silencing. These features make BPMV the best-suited VIGS vector for low- to high-throughput VIGS studies in pea. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  10. Algal functional annotation tool

    SciTech Connect

    Lopez, D.; Casero, D.; Cokus, S. J.; Merchant, S. S.; Pellegrini, M.

    2012-07-01

    The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG pathway maps and batch gene identifier conversion.
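    The enrichment of a functional term in a gene list, as mentioned in the abstract above, is often assessed with a one-sided hypergeometric test. The abstract does not state which statistic the tool uses, so the following is only a minimal sketch under that assumption; the function name and parameters are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background (e.g. all annotated genes in the genome)
    K: background genes annotated with the term of interest
    n: genes in the user-supplied list
    k: genes in the list annotated with the term

    Assumption: this is a generic enrichment statistic, not necessarily
    the one implemented by the Algal Functional Annotation Tool.
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("expected 0 <= k <= n <= N and 0 <= K <= N")
    upper = min(n, K)
    # Sum exact integer counts first, then divide once to limit rounding error.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)
```

    A small p-value suggests that the term occurs in the list more often than expected by chance; in practice the p-values for many terms would also need a multiple-testing correction.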

  11. Navigating yeast genome maintenance with functional genomics.

    PubMed

    Measday, Vivien; Stirling, Peter C

    2016-03-01

    Maintenance of genome integrity is a fundamental requirement of all organisms. To address this, organisms have evolved extremely faithful modes of replication, DNA repair and chromosome segregation to combat the deleterious effects of an unstable genome. Nonetheless, a small amount of genome instability is the driver of evolutionary change and adaptation, and thus a low level of instability is permitted in populations. While defects in genome maintenance almost invariably reduce fitness in the short term, they can create an environment where beneficial mutations are more likely to occur. The importance of this fact is clearest in the development of human cancer, where genome instability is a well-established enabling characteristic of carcinogenesis. This raises the crucial question: what are the cellular pathways that promote genome maintenance and what are their mechanisms? Work in model organisms, in particular the yeast Saccharomyces cerevisiae, has provided the global foundations of genome maintenance mechanisms in eukaryotes. The development of pioneering genomic tools in S. cerevisiae, such as the systematic creation of mutants in all nonessential and essential genes, has enabled whole-genome approaches to identifying genes with roles in genome maintenance. Here, we review the extensive whole-genome approaches taken in yeast, with an emphasis on functional genomic screens, to understand the genetic basis of genome instability, highlighting a range of genetic and cytological screening modalities. By revealing the biological pathways and processes regulating genome integrity, these analyses contribute to the systems-level map of the yeast cell and inform studies of human disease, especially cancer. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  12. RNA interference-based gene silencing as an efficient tool for functional genomics in hexaploid bread wheat.

    PubMed

    Travella, Silvia; Klimm, Theres E; Keller, Beat

    2006-09-01

    Insertional mutagenesis and gene silencing are efficient tools for the determination of gene function. In contrast to gain- or loss-of-function approaches, RNA interference (RNAi)-induced gene silencing can possibly silence multigene families and homoeologous genes in polyploids. This is of great importance for functional studies in hexaploid wheat (Triticum aestivum), where most of the genes are present in at least three homoeologous copies and conventional insertional mutagenesis is not effective. We have introduced into bread wheat double-stranded RNA-expressing constructs containing fragments of genes encoding Phytoene Desaturase (PDS) or the signal transducer of ethylene, Ethylene Insensitive 2 (EIN2). Transformed plants showed phenotypic changes that were stably inherited over at least two generations. These changes were very similar to mutant phenotypes of the two genes in diploid model plants. Quantitative real-time polymerase chain reaction revealed a good correlation between decreasing mRNA levels and increasingly severe phenotypes. RNAi silencing had the same quantitative effect on all three homoeologous genes. The most severe phenotypes were observed in homozygous plants that showed the strongest mRNA reduction and, interestingly, produced around 2-fold the amount of small RNAs compared to heterozygous plants. This suggests that the effect of RNAi in hexaploid wheat is gene-dosage dependent. Wheat seedlings with low mRNA levels for EIN2 were ethylene insensitive. Thus, EIN2 is a positive regulator of the ethylene-signaling pathway in wheat, very similar to its homologs in Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa). Our data show that RNAi results in stably inherited phenotypes and therefore represents an efficient tool for functional genomic studies in polyploid wheat.

  13. Validation of Synthetic CRISPR Reagents as a Tool for Arrayed Functional Genomic Screening

    PubMed Central

    Tan, Jenille; Martin, Scott E.

    2016-01-01

    To date, lentiviral-based CRISPR-Cas9 screens have largely been conducted in pooled format. However, numerous assays are not amenable to pooled approaches, and lentiviral screening in arrayed format presents many challenges. We sought to examine synthetic CRISPR reagents in the context of arrayed screening. Experiments were performed using aberrant DNA replication as an assay. Using synthetic CRISPR RNAs targeting the known control gene GMNN in HCT-116 cells stably expressing Cas9, we observed a statistically significant phenotype among the majority of transfected cells within 72 hours. Additional studies revealed near-complete loss of GMNN protein and editing of GMNN DNA. We next conducted a screen of synthetic CRISPR RNAs directed against 640 ubiquitin-related genes. Screening identified known and novel DNA replication regulators that were also supported by siRNA gene knockdown. Notably, CRISPR screening identified more statistically significant hits than corresponding siRNA screens run in parallel. These results highlight the possibility of using synthetic CRISPR reagents as an arrayed screening tool. PMID:28030641

  14. Vectors for fluorescent protein tagging in Phytophthora: tools for functional genomics and cell biology.

    PubMed

    Ah-Fong, Audrey M V; Judelson, Howard S

    2011-09-01

    Fluorescent tagging has become the strategy of choice for examining the subcellular localisation of proteins. To develop a versatile community resource for this method in oomycetes, plasmids were constructed that allow the expression of either of four spectrally distinct proteins [cyan fluorescent protein (CFP), green fluorescent protein (GFP), yellow fluorescent protein (YFP), and mCherry], alone or fused at their N- or C-termini, to sequences of interest. Equivalent sets of plasmids were made using neomycin or hygromycin phosphotransferases (nptII, hpt) as selectable markers, to facilitate double-labelling and aid work in diverse species. The fluorescent proteins and drug-resistance markers were fused to transcriptional regulatory sequences from the oomycete Bremia lactucae, which are known to function in diverse oomycetes, although the promoter in the fluorescence cassette (ham34) can be replaced easily by a promoter of interest. The function of each plasmid was confirmed in Phytophthora infestans. Moreover, fusion proteins were generated using targeting sequences for the endoplasmic reticulum, Golgi, mitochondria, nuclei, and peroxisomes. Studies of the distribution of the fusions in mycelia and sporangia provided insight into cellular organisation at different stages of development. This toolbox of vectors should advance studies of gene function and cell biology in Phytophthora and other oomycetes. Copyright © 2011 British Mycological Society. Published by Elsevier Ltd. All rights reserved.

  15. Multilevel functional genomics data integration as a tool for understanding physiology: a network biology perspective.

    PubMed

    Davidsen, Peter K; Turan, Nil; Egginton, Stuart; Falciani, Francesco

    2016-02-01

    The overall aim of physiological research is to understand how living systems function in an integrative manner. Consequently, the discipline of physiology has since its infancy attempted to link multiple levels of biological organization. Increasingly this has involved mathematical and computational approaches, typically to model a small number of components spanning several levels of biological organization. With the advent of "omics" technologies, which can characterize the molecular state of a cell or tissue (intended as the level of expression and/or activity of its molecular components), the number of molecular components we can quantify has increased exponentially. Paradoxically, the unprecedented amount of experimental data has made it more difficult to derive conceptual models underlying essential mechanisms regulating mammalian physiology. We present an overview of state-of-the-art methods currently used to identify biological networks underlying genomewide responses. These are based on a data-driven approach that relies on advanced computational methods designed to "learn" biology from observational data. In this review, we illustrate an application of these computational methodologies using a case study integrating an in vivo model representing the transcriptional state of hypoxic skeletal muscle with a clinical study representing muscle wasting in chronic obstructive pulmonary disease patients. The broader application of these approaches to modeling multiple levels of biological data in the context of modern physiology is discussed. Copyright © 2016 the American Physiological Society.

  16. The GLOBE 3D Genome Platform - towards a novel system-biological paper tool to integrate the huge complexity of genome organization and function.

    PubMed

    Knoch, Tobias A; Lesnussa, Michael; Kepper, Nick; Eussen, Hubert B; Grosveld, Frank G

    2009-01-01

    Genomes are tremendous co-evolutionary holistic systems for molecular storage, processing and fabrication of information. Their system-biological complexity remains, however, still largely mysterious, despite immense sequencing achievements and huge advances in the understanding of the general sequential, three-dimensional and regulatory organization. Here, we present the GLOBE 3D Genome Platform, a completely novel grid-based virtual "paper" tool and in fact the first system-biological genome browser integrating the holistic complexity of genomes in a single, easily comprehensible platform: based on a detailed study of biophysical and IT requirements, every architectural level from sequence to morphology of one or several genomes can be approached in a real and in a symbolic representation simultaneously and navigated by continuous scale-free zooming within a unique three-dimensional OpenGL and grid-driven environment. In principle, an unlimited number of multi-dimensional data sets can be visualized, customized in terms of arrangement, shape, colour, texture, etc., as well as accessed and annotated individually or in groups using internal or external databases/facilities. Any information can be searched and correlated by importing or calculating simple relations in real time using grid resources. A general correlation and application platform for more complex correlative analysis and a front-end for system-biological simulations, both again using the huge capabilities of grid infrastructures, are currently under development. Hence, the GLOBE 3D Genome Platform is an example of a grid-based approach towards a virtual desktop for genomic work combining the three fundamental distributed resources: i) visual data representation, ii) data access and management, and iii) data analysis and creation. Thus, the GLOBE 3D Genome Platform is the novel system-biology oriented information system urgently needed to access, present, annotate, and to simulate the holistic genome

  17. VISTA - computational tools for comparative genomics

    SciTech Connect

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin, Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate the capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes the kinesin family member 3A (KIF3A) protein.

  18. Skittle: a 2-dimensional genome visualization tool.

    PubMed

    Seaman, Josiah D; Sanford, John C

    2009-12-30

    It is increasingly evident that there are multiple and overlapping patterns within the genome, and that these patterns contain different types of information--regarding both genome function and genome history. In order to discover additional genomic patterns which may have biological significance, novel strategies are required. To partially address this need, we introduce a new data visualization tool entitled Skittle. This program first creates a 2-dimensional nucleotide display by assigning four colors to the four nucleotides, and then text-wraps to a user adjustable width. This nucleotide display is accompanied by a "repeat map" which comprehensively displays all local repeating units, based upon analysis of all possible local alignments. Skittle includes a smooth-zooming interface which allows the user to analyze genomic patterns at any scale. Skittle is especially useful in identifying and analyzing tandem repeats, including repeats not normally detectable by other methods. However, Skittle is also more generally useful for analysis of any genomic data, allowing users to correlate published annotations and observable visual patterns, and allowing for sequence and construct quality control. Preliminary observations using Skittle reveal intriguing genomic patterns not otherwise obvious, including structured variations inside tandem repeats. The striking visual patterns revealed by Skittle appear to be useful for hypothesis development, and have already led the authors to theorize that imperfect tandem repeats could act as information carriers, and may form tertiary structures within the interphase nucleus.
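    The 2-dimensional display described above (one color per nucleotide, text-wrapped to an adjustable width) can be sketched roughly as follows. This is an illustrative sketch only, not Skittle's code: the palette, the function names and the `column_uniformity` helper are assumptions introduced here to show why a tandem repeat lines up in vertical stripes when the wrap width matches the repeat period.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]

# Hypothetical palette; the colors Skittle actually uses are not given in the abstract.
PALETTE = {"A": (0, 0, 0), "C": (255, 0, 0), "G": (0, 255, 0), "T": (0, 0, 255)}
UNKNOWN: Color = (128, 128, 128)  # used for N or any other symbol


def wrap_sequence(seq: str, width: int) -> List[str]:
    """Text-wrap a sequence into rows of at most `width` bases."""
    if width <= 0:
        raise ValueError("width must be positive")
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def color_grid(seq: str, width: int) -> List[List[Color]]:
    """Map each base to a color and arrange the colors as wrapped rows."""
    return [[PALETTE.get(base, UNKNOWN) for base in row]
            for row in wrap_sequence(seq.upper(), width)]


def column_uniformity(seq: str, width: int) -> float:
    """Fraction of columns holding the same base in every full-length row.

    A value near 1.0 means vertical stripes, i.e. the wrap width matches
    the period of a tandem repeat (or a multiple of it).
    """
    rows = [r for r in wrap_sequence(seq.upper(), width) if len(r) == width]
    if not rows:
        return 0.0
    uniform = sum(1 for column in zip(*rows) if len(set(column)) == 1)
    return uniform / width
```

    For a perfect repeat such as `"ACGTT" * 6`, wrapping at width 5 puts identical bases in every column, whereas a width of 6 shifts each row by one base and the stripes become diagonals.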

  19. Strategies and tools for whole genome alignments

    SciTech Connect

    Couronne, Olivier; Poliakov, Alexander; Bray, Nicolas; Ishkhanov, Tigran; Ryaboy, Dmitriy; Rubin, Edward; Pachter, Lior; Dubchak, Inna

    2002-11-25

    The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for different quality assemblies. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90 percent of known coding exons in the human genome. We have obtained such coverage while preserving specificity. With a view towards the end user, we have developed a suite of tools and websites for automatically aligning, and subsequently browsing and working with whole genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.

  20. Comparative genomics tools applied to bioterrorism defence.

    PubMed

    Slezak, Tom; Kuczmarski, Tom; Ott, Linda; Torres, Clinton; Medeiros, Dan; Smith, Jason; Truitt, Brian; Mulakken, Nisha; Lam, Marisa; Vitalis, Elizabeth; Zemla, Adam; Zhou, Carol Ecale; Gardner, Shea

    2003-06-01

    comparative genomics algorithms may help spur algorithm developers to tackle some of the many remaining problems that need to be addressed. Solutions to these problems will advance a wide range of biological disciplines, only one of which is pathogen detection. For example, exploration in evolution and phylogenetics, annotating gene coding regions, predicting and understanding gene function and regulation, and untangling gene networks all rely on tools for aligning multiple sequences, detecting gene rearrangements and duplications, and visualising genomic data. Two key problems currently needing improved solutions are: (1) aligning incomplete, fragmentary sequence (eg draft genome contigs or arbitrary genome regions) with both complete genomes and other fragmentary sequences; and (2) ordering, aligning and visualising non-colinear gene rearrangements and inversions in addition to the colinear alignments handled by current tools.

  1. Translational and functional oncogenomics. From cancer-oriented genomic screenings to new diagnostic tools and improved cancer treatment.

    PubMed

    Medico, Enzo

    2008-01-01

    We present here an experimental pipeline for the systematic identification and functional characterization of genes with high potential diagnostic and therapeutic value in human cancer. Complementary competences and resources have been brought together in the TRANSFOG Consortium to reach the following integrated research objectives: 1) execution of cancer-oriented genomic screenings on tumor tissues and experimental models and merging of the results to generate a prioritized panel of candidate genes involved in cancer progression and metastasis; 2) setup of systems for high-throughput delivery of full-length cDNAs, for gain-of-function analysis of the prioritized candidate genes; 3) collection of vectors and oligonucleotides for systematic, RNA interference-mediated down-regulation of the candidate genes; 4) adaptation of existing cell-based and model organism assays to a systematic analysis of gain and loss of function of the candidate genes, for identification and preliminary validation of novel potential therapeutic targets; 5) proteomic analysis of signal transduction and protein-protein interaction for better dissection of aberrant cancer signaling pathways; 6) validation of the diagnostic potential of the identified cancer genes towards the clinical use of diagnostic molecular signatures; 7) generation of a shared informatics platform for data handling and gene functional annotation. The results of the first three years of activity of the TRANSFOG Consortium are also briefly presented and discussed.

  2. Cytoscape: the network visualization tool for GenomeSpace workflows

    PubMed Central

    Demchak, Barry; Hull, Tim; Reich, Michael; Liefeld, Ted; Smoot, Michael; Ideker, Trey; Mesirov, Jill P.

    2014-01-01

    Modern genomic analysis often requires workflows incorporating multiple best-of-breed tools. GenomeSpace is a web-based visual workbench that combines a selection of these tools with mechanisms that create data flows between them. One such tool is Cytoscape 3, a popular application that enables analysis and visualization of graph-oriented genomic networks. As Cytoscape runs on the desktop, and not in a web browser, integrating it into GenomeSpace required special care in creating a seamless user experience and enabling appropriate data flows. In this paper, we present the design and operation of the Cytoscape GenomeSpace app, which accomplishes this integration, thereby providing critical analysis and visualization functionality for GenomeSpace users. It has been downloaded over 850 times since the release of its first version in September, 2013. PMID:25165537

  3. DCODE.ORG Anthology of Comparative Genomic Tools

    SciTech Connect

    Loots, G G; Ovcharenko, I

    2005-01-11

    Comparative genomics provides the means to demarcate functional regions in anonymous DNA sequences. The successful application of this method to identifying novel genes is currently shifting to deciphering the noncoding encryption of gene regulation across genomes. To facilitate the use of comparative genomics to practical applications in genetics and genomics we have developed several analytical and visualization tools for the analysis of arbitrary sequences and whole genomes. These tools include two alignment tools: zPicture and Mulan; a phylogenetic shadowing tool: eShadow for identifying lineage- and species-specific functional elements; two evolutionary conserved transcription factor analysis tools: rVista and multiTF; a tool for extracting cis-regulatory modules governing the expression of co-regulated genes, CREME; and a dynamic portal to multiple vertebrate and invertebrate genome alignments, the ECR Browser. Here we briefly describe each one of these tools and provide specific examples on their practical applications. All the tools are publicly available at the http://www.dcode.org/ web site.

  4. The Viral MetaGenome Annotation Pipeline (VMGAP): an automated tool for the functional annotation of viral Metagenomic shotgun sequencing data

    PubMed Central

    Lorenzi, Hernan A.; Hoover, Jeff; Inman, Jason; Safford, Todd; Murphy, Sean; Kagan, Leonid; Williamson, Shannon J.

    2011-01-01

    In the past few years, the field of metagenomics has been growing at an accelerated pace, particularly in response to advancements in new sequencing technologies. The large volume of sequence data from novel organisms generated by metagenomic projects has triggered the development of specialized databases and tools focused on particular groups of organisms or data types. Here we describe a pipeline for the functional annotation of viral metagenomic sequence data. The Viral MetaGenome Annotation Pipeline (VMGAP) takes advantage of a number of specialized databases, such as collections of mobile genetic elements and environmental metagenomes, to improve the classification and functional prediction of viral gene products. The pipeline assigns a functional term to each predicted protein sequence following a suite of comprehensive analyses whose results are ranked according to a priority rules hierarchy. Additional annotation is provided in the form of enzyme commission (EC) numbers, GO/MeGO terms and Hidden Markov Models together with supporting evidence. PMID:21886867
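    The idea of ranking analysis results by a priority rules hierarchy and taking the functional term from the top-ranked result can be illustrated as below. The abstract does not list VMGAP's actual rules, so the evidence types, their order and all names in this sketch are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical evidence types, most trusted first. These are NOT VMGAP's real rules.
PRIORITY = ["curated_hmm", "blast_high_identity", "domain_hit", "blast_low_identity"]
RANK = {source: position for position, source in enumerate(PRIORITY)}


@dataclass(frozen=True)
class Evidence:
    source: str   # which analysis produced the hit
    term: str     # functional term proposed by that analysis
    score: float  # analysis-specific score, higher is better


def assign_term(evidence: Iterable[Evidence],
                default: str = "hypothetical protein") -> str:
    """Return the term from the highest-priority evidence type.

    Ties within one evidence type are broken by the higher score.
    Evidence from unrecognized sources is ignored; with no usable
    evidence the default term is returned.
    """
    usable = [e for e in evidence if e.source in RANK]
    if not usable:
        return default
    best = min(usable, key=lambda e: (RANK[e.source], -e.score))
    return best.term
```

    With this scheme a weak hit from a trusted source outranks a strong hit from a less trusted one, which is the usual motivation for a rule hierarchy over a single global score.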

  5. Computational Tools for Genomic Studies in Plants.

    PubMed

    Martinez, Manuel

    2016-12-01

    In recent years, the genomic sequences of numerous plant species, including the main crop species, have been determined. Computational tools have been developed to deal with the issue of which plants have been sequenced and where the sequences are hosted. In this mini-review, the databases for genome projects, the databases created to host species/clade projects and the databases developed to perform plant comparative genomics are reviewed. Because of their importance in modern research, an in-depth analysis of the plant comparative genomics databases has been performed. This comparative analysis focuses on the common and specific computational tools developed to achieve the particular objectives of each database. In addition, emerging high-performance bioinformatics tools specific to plant research are discussed. Finally, the kinds of computational approaches that should be implemented in the coming years to analyze plant genomes efficiently are considered.

  6. Metabolomics as a Hypothesis-Generating Functional Genomics Tool for the Annotation of Arabidopsis thaliana Genes of “Unknown Function”

    PubMed Central

    Quanbeck, Stephanie M.; Brachova, Libuse; Campbell, Alexis A.; Guan, Xin; Perera, Ann; He, Kun; Rhee, Seung Y.; Bais, Preeti; Dickerson, Julie A.; Dixon, Philip; Wohlgemuth, Gert; Fiehn, Oliver; Barkan, Lenore; Lange, Iris; Lange, B. Markus; Lee, Insuk; Cortes, Diego; Salazar, Carolina; Shuman, Joel; Shulaev, Vladimir; Huhman, David V.; Sumner, Lloyd W.; Roth, Mary R.; Welti, Ruth; Ilarslan, Hilal; Wurtele, Eve S.; Nikolau, Basil J.

    2012-01-01

    Metabolomics is the methodology that identifies and measures global pools of small molecules (of less than about 1,000 Da) of a biological sample, which are collectively called the metabolome. Metabolomics can therefore reveal the metabolic outcome of a genetic or environmental perturbation of a metabolic regulatory network, and thus provide insights into the structure and regulation of that network. Because of the chemical complexity of the metabolome and limitations associated with individual analytical platforms for determining the metabolome, it is currently difficult to capture the complete metabolome of an organism or tissue, which is in contrast to genomics and transcriptomics. This paper describes the analysis of Arabidopsis metabolomics data sets acquired by a consortium that includes five analytical laboratories, bioinformaticists, and biostatisticians, which aims to develop and validate metabolomics as a hypothesis-generating functional genomics tool. The consortium is determining the metabolomes of Arabidopsis T-DNA mutant stocks, grown in a standardized controlled environment optimized to minimize environmental impacts on the metabolomes. Metabolomics data were generated with seven analytical platforms, and the combined data are being provided to the research community to formulate initial hypotheses about genes of unknown function (GUFs). A public database (www.PlantMetabolomics.org) has been developed to provide the scientific community with access to the data along with tools to allow for their interactive analysis. Exemplary datasets are discussed to validate the approach; they illustrate how initial hypotheses can be generated from the consortium-produced metabolomics data and integrated with prior knowledge to provide a testable hypothesis concerning the functionality of GUFs. PMID:22645570

  7. ParameciumDB in 2011: new tools and new data for functional and comparative genomics of the model ciliate Paramecium tetraurelia.

    PubMed

    Arnaiz, Olivier; Sperling, Linda

    2011-01-01

    ParameciumDB is a community model organism database built with the GMOD toolkit to integrate the genome and biology of the ciliate Paramecium tetraurelia. Over the last four years, post-genomic data from proteome and transcriptome studies has been incorporated along with predicted orthologs in 33 species, annotations from the community and publications from the scientific literature. Available tools include BioMart for complex queries, GBrowse2 for genome browsing, the Apollo genome editor for expert curation of gene models, a Blast server, a motif finder, and a wiki for protocols, nomenclature guidelines and other documentation. In-house tools have been developed for ontology browsing and evaluation of off-target RNAi matches. Now ready for next-generation deep sequencing data and the genomes of other Paramecium species, this open-access resource is available at http://paramecium.cgm.cnrs-gif.fr.

  8. Nongenetic functions of the genome.

    PubMed

    Bustin, Michael; Misteli, Tom

    2016-05-06

    The primary function of the genome is to store, propagate, and express the genetic information that gives rise to a cell's architectural and functional machinery. However, the genome is also a major structural component of the cell. Besides its genetic roles, the genome affects cellular functions by nongenetic means through its physical and structural properties, particularly by exerting mechanical forces and by serving as a scaffold for binding of cellular components. Major cellular processes affected by nongenetic functions of the genome include establishment of nuclear structure, signal transduction, mechanoresponses, cell migration, and vision in nocturnal animals. We discuss the concept, mechanisms, and implications of nongenetic functions of the genome.

  9. Application of Genomic Tools in Plant Breeding

    PubMed Central

    Pérez-de-Castro, A.M.; Vilanova, S.; Cañizares, J.; Pascual, L.; Blanca, J.M.; Díez, M.J.; Prohens, J.; Picó, B.

    2012-01-01

    Plant breeding has been very successful in developing improved varieties using conventional tools and methodologies. Nowadays, the availability of genomic tools and resources is leading to a new revolution in plant breeding, as they facilitate the study of the genotype and its relationship with the phenotype, in particular for complex traits. Next Generation Sequencing (NGS) technologies are allowing the mass sequencing of genomes and transcriptomes, which is producing a vast array of genomic information. The analysis of NGS data by means of bioinformatics developments allows the discovery of new genes and regulatory sequences and their positions, and makes available large collections of molecular markers. Genome-wide expression studies provide breeders with an understanding of the molecular basis of complex traits. Genomic approaches include TILLING and EcoTILLING, which make it possible to screen mutant and germplasm collections for allelic variants in target genes. Re-sequencing of genomes is very useful for the genome-wide discovery of markers amenable to high-throughput genotyping platforms, like SSRs and SNPs, or the construction of high density genetic maps. All these tools and resources facilitate the study of genetic diversity, which is important for germplasm management, enhancement and use. Also, they allow the identification of markers linked to genes and QTLs, using a diversity of techniques like bulked segregant analysis (BSA), fine genetic mapping, or association mapping. These new markers are used for marker assisted selection, including marker assisted backcross selection, ‘breeding by design’, or new strategies, like genomic selection. In conclusion, advances in genomics are providing breeders with new tools and methodologies that allow a great leap forward in plant breeding, including the ‘superdomestication’ of crops and the genetic dissection and breeding for complex traits. PMID:23115520

  10. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    PubMed

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and the development of new ones create the need to (a) compare different annotations for a single genome, and (b) generate an annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. To illustrate BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show with these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .
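
    The idea of an extended annotation built from several individual ones can be sketched as follows. This is not BEACON's algorithm, which the abstract does not describe in detail; it is a minimal illustration that assumes each annotation method yields a mapping from gene identifier to a function string, and that an empty string or "hypothetical protein" means no function was assigned. The names UNKNOWN, extend_annotation and count_annotated are hypothetical.

```python
# Labels treated as "no function assigned" (an assumption for this sketch).
UNKNOWN = {"", "hypothetical protein"}


def extend_annotation(annotations):
    """Combine annotations given as a list of dicts (gene_id -> function).

    The list is in order of preference: for each gene, the first
    informative function found is kept.
    """
    genes = set().union(*annotations)
    combined = {}
    for gene in sorted(genes):
        combined[gene] = "hypothetical protein"
        for ann in annotations:
            func = ann.get(gene, "")
            if func.lower() not in UNKNOWN:
                combined[gene] = func
                break
    return combined


def count_annotated(annotation):
    """Number of genes carrying an informative function."""
    return sum(1 for f in annotation.values() if f.lower() not in UNKNOWN)
```

    Comparing count_annotated for each input with the value for the combined result gives the kind of gain in annotated genes that the abstract reports, although the matching of genes between annotation methods is assumed here to be already solved by shared identifiers.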

  11. Galaxy tools to study genome diversity

    PubMed Central

    2013-01-01

    Background Intra-species genetic variation can be used to investigate population structure, selection, and gene flow in non-model vertebrates; and due to the plummeting costs for genome sequencing, it is now possible for small labs to obtain full-genome variation data from their species of interest. However, those labs may not have easy access to, and familiarity with, computational tools to analyze those data. Results We have created a suite of tools for the Galaxy web server aimed at handling nucleotide and amino-acid polymorphisms discovered by full-genome sequencing of several individuals of the same species, or using a SNP genotyping microarray. In addition to providing user-friendly tools, a main goal is to make published analyses reproducible. While most of the examples discussed in this paper deal with nuclear-genome diversity in non-human vertebrates, we also illustrate the application of the tools to fungal genomes, human biomedical data, and mitochondrial sequences. Conclusions This project illustrates that a small group can design, implement, test, document, and distribute a Galaxy tool collection to meet the needs of a particular community of biologists. PMID:24377391

  12. Tools for sea urchin genomic analysis.

    PubMed

    Cameron, R Andrew

    2014-01-01

    The Sea Urchin Genome Project Web site, SpBase ( http://SpBase.org ), in association with a suite of publicly available sequence comparison tools, provides a platform from which to analyze genes and genomic sequences of sea urchin. This information system is specifically designed to support laboratory bench studies in cell and molecular biology. In particular, these tools and datasets have supported the description of the gene regulatory networks of the purple sea urchin S. purpuratus. This chapter details methods for the first steps in finding genes and noncoding regulatory sequences for further analysis.

  13. PanGeT: Pan-genomics tool.

    PubMed

    Yuvaraj, Iyyappan; Sridhar, Jayavel; Michael, Daliah; Sekar, Kanagaraj

    2017-02-05

    A decade after the concept of the pan-genome was first introduced, research in this field has spread to areas such as pathogenesis of diseases, bacterial evolutionary studies and drug resistance. Gene content-based differentiation of virulent and avirulent strains of bacteria and identification of pathogen-specific genes are imperative to understand their physiology and gain insights into the mechanism of genome evolution. Subsequently, this will aid in identifying diagnostic targets and in developing and selecting vaccines. The root of pan-genomic studies, however, is to identify the core genes, dispensable genes and strain-specific genes across the genomes belonging to a clade. To this end, we have developed a tool, "PanGeT - Pan-genomics Tool", to compute the 'pan-genome' based on comparisons at the genome as well as the proteome levels. This automated tool is implemented using LaTeX libraries for effective visualization of the overall pan-genome through graphical plots. Links to retrieve sequence information and functional annotations have also been provided. PanGeT can be downloaded from http://pranag.physics.iisc.ernet.in/PanGeT/ or https://github.com/PanGeTv1/PanGeT.
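
    The three gene categories named above can be made concrete with a short sketch. It assumes that homologous genes have already been grouped into gene families, so each strain is represented simply as a set of family identifiers; how PanGeT performs that grouping is not covered here, and the function name partition_pan_genome is hypothetical.

```python
from collections import Counter


def partition_pan_genome(strains):
    """Split gene families into core, dispensable and strain-specific sets.

    strains: dict mapping strain name -> set of gene family identifiers.
    core:            present in every strain
    strain-specific: present in exactly one strain (when more than one
                     strain is compared)
    dispensable:     everything in between
    """
    n = len(strains)
    counts = Counter(g for genes in strains.values() for g in genes)
    core = {g for g, c in counts.items() if c == n}
    specific = {g for g, c in counts.items() if c == 1 and n > 1}
    dispensable = set(counts) - core - specific
    return core, dispensable, specific
```

    The union of the three returned sets is the pan-genome of the compared strains under this simplified presence/absence model.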

  14. RNA Interference-Based Gene Silencing as an Efficient Tool for Functional Genomics in Hexaploid Bread Wheat

    PubMed Central

    Travella, Silvia; Klimm, Theres E.; Keller, Beat

    2006-01-01

    Insertional mutagenesis and gene silencing are efficient tools for the determination of gene function. In contrast to gain- or loss-of-function approaches, RNA interference (RNAi)-induced gene silencing can possibly silence multigene families and homoeologous genes in polyploids. This is of great importance for functional studies in hexaploid wheat (Triticum aestivum), where most of the genes are present in at least three homoeologous copies and conventional insertional mutagenesis is not effective. We have introduced into bread wheat double-stranded RNA-expressing constructs containing fragments of genes encoding Phytoene Desaturase (PDS) or the signal transducer of ethylene, Ethylene Insensitive 2 (EIN2). Transformed plants showed phenotypic changes that were stably inherited over at least two generations. These changes were very similar to mutant phenotypes of the two genes in diploid model plants. Quantitative real-time polymerase chain reaction revealed a good correlation between decreasing mRNA levels and increasingly severe phenotypes. RNAi silencing had the same quantitative effect on all three homoeologous genes. The most severe phenotypes were observed in homozygous plants that showed the strongest mRNA reduction and, interestingly, produced around 2-fold the amount of small RNAs compared to heterozygous plants. This suggests that the effect of RNAi in hexaploid wheat is gene-dosage dependent. Wheat seedlings with low mRNA levels for EIN2 were ethylene insensitive. Thus, EIN2 is a positive regulator of the ethylene-signaling pathway in wheat, very similar to its homologs in Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa). Our data show that RNAi results in stably inherited phenotypes and therefore represents an efficient tool for functional genomic studies in polyploid wheat. PMID:16861570

  15. DHPC: a new tool to express genome structural features.

    PubMed

    Deng, Xuegong; Deng, Xuemei; Rayner, Simon; Liu, Xiangdong; Zhang, Qingling; Yang, Yupu; Li, Ning

    2008-05-01

    The DHPC (DNA Hilbert-Peano curve) is a new tool for visualizing large-scale genome sequences by mapping sequences into a two-dimensional square. It utilizes the space-filling function of Hilbert-Peano mapping. By applying a Gauss smoothing technique and a user-defined color function, a large-scale genome sequence can be mapped into a two-dimensional color image. In the calculated DHPCs, many genome characteristics are revealed. In this article we introduce the method and show how DHPCs may be used to identify regions of different base composition. The power of the method is demonstrated by presenting multiple examples such as repeating sequences, degree of base bias, regions of homogeneity and their boundaries, and mark of annotated segments. We also present several genome curves generated by DHPC to demonstrate how DHPC can be used to find previously unidentified sequence features in these genomes.
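
    The mapping of a linear sequence onto a square can be sketched with the standard Hilbert curve index-to-coordinate conversion. This is an illustration under assumptions: the exact curve variant, the Gauss smoothing and the color function used by DHPC are not reproduced, and the names d2xy and hilbert_grid are hypothetical. The sketch only places each base in the grid cell visited by the curve at that position.

```python
def d2xy(order, d):
    """Convert distance d along a Hilbert curve to (x, y).

    The curve fills a square grid with side 2**order.
    """
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the quadrant before rotating it.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_grid(seq):
    """Lay a sequence out on the smallest Hilbert grid that can hold it.

    Returns a list of rows; cells beyond the end of the sequence are None.
    """
    order = 0
    while 4 ** order < len(seq):
        order += 1
    side = 1 << order
    grid = [[None] * side for _ in range(side)]
    for i, base in enumerate(seq):
        x, y = d2xy(order, i)
        grid[y][x] = base
    return grid
```

    Because consecutive positions along the curve always land in neighbouring cells, stretches of similar base composition appear as compact patches in the image, which is what makes regional features visible.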

  16. Gene Chips and Functional Genomics

    NASA Astrophysics Data System (ADS)

    Hamadeh, Hisham; Afshari, Cynthia

    2000-11-01

    These past few years of scientific discovery will undoubtedly be remembered as the "genomics era," the period in which biologists succeeded in enumerating the sequence of nucleotides making up all, or at least most, of human DNA. And while this achievement has been heralded as a technological feat equal to the moon landing, it is only the first of many advances in DNA technology. Scientists are now faced with the task of understanding the meaning of the DNA sequence. Specifically, they want to learn how the DNA code relates to protein function. An important tool in the study of "functional genomics" is the cDNA microarray, also known as the gene chip. Inspired by computer microchips, gene chips allow scientists to monitor the expression of hundreds, even thousands, of genes in a fraction of the time it used to take to monitor the expression of a single one. By altering the conditions under which a particular tissue expresses genes (say, by exposing it to toxins or growth factors), scientists can determine the suite of genes expressed in different situations and hence start to get a handle on the function of these genes. The authors discuss this important new technology and some of its practical applications.

  17. Functional genomics tools applied to plant metabolism: a survey on plant respiration, its connections and the annotation of complex gene functions

    PubMed Central

    Araújo, Wagner L.; Nunes-Nesi, Adriano; Williams, Thomas C. R.

    2012-01-01

    The application of post-genomic techniques in plant respiration studies has greatly improved our ability to assign functions to gene products. It has also revealed previously unappreciated interactions between distal elements of metabolism. Such results have reinforced the need to consider plant respiratory metabolism as part of a complex network, and making sense of such interactions will ultimately require the construction of predictive and mechanistic models. Transcriptomics, proteomics, metabolomics, and the quantification of metabolic flux will be of great value in creating such models, both by facilitating the annotation of complex gene function and determining their structure, and by furnishing the quantitative data required to test them. In this review, we highlight how these experimental approaches have contributed to our current understanding of plant respiratory metabolism and its interplay with associated processes (e.g., photosynthesis, photorespiration, and nitrogen metabolism). We also discuss how data from these techniques may be integrated, with the ultimate aim of identifying mechanisms that control and regulate plant respiration and discovering novel gene functions with potential biotechnological implications. PMID:22973288

  18. GREAT: a web portal for Genome Regulatory Architecture Tools.

    PubMed

    Bouyioukos, Costas; Bucchini, François; Elati, Mohamed; Képès, François

    2016-07-08

    GREAT (Genome REgulatory Architecture Tools) is a novel web portal for tools designed to generate user-friendly and biologically useful analysis of genome architecture and regulation. The online tools of GREAT are freely accessible and compatible with essentially any operating system which runs a modern browser. GREAT is based on the analysis of genome layout (defined as the respective positioning of co-functional genes) and its relation with chromosome architecture and gene expression. GREAT tools allow users to systematically detect regular patterns along co-functional genomic features in an automatic way consisting of three individual steps and respective interactive visualizations. In addition to the complete analysis of regularities, GREAT tools enable the use of periodicity and position information for improving the prediction of transcription factor binding sites using a multi-view machine learning approach. The outcome of this integrative approach features a multivariate analysis of the interplay between the location of a gene and its regulatory sequence. GREAT results are plotted in web interactive graphs and are available for download either as individual plots, self-contained interactive pages or as machine readable tables for downstream analysis. The GREAT portal can be reached at the following URL https://absynth.issb.genopole.fr/GREAT and each individual GREAT tool is available for downloading. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  20. PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

    PubMed Central

    Fong, Christine; Rohmer, Laurence; Radey, Matthew; Wasnick, Michael; Brittnacher, Mitchell J

    2008-01-01

    Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client

  1. Fragaria vesca, a useful tool for Rosaceae Genomics

    USDA-ARS's Scientific Manuscript database

    The diploid woodland strawberry, Fragaria vesca subsp. vesca forma semperflorens, has many characteristics that make it an ideal plant for strawberry genomics research. A brief review of some of the tools available for using this plant as a reference for studying gene function in strawberry and oth...

  2. The UCSC genome browser and associated tools.

    PubMed

    Kuhn, Robert M; Haussler, David; Kent, W James

    2013-03-01

    The UCSC Genome Browser (http://genome.ucsc.edu) is a graphical viewer for genomic data now in its 13th year. Since the early days of the Human Genome Project, it has presented an integrated view of genomic data of many kinds. Now home to assemblies for 58 organisms, the Browser presents visualization of annotations mapped to genomic coordinates. The ability to juxtapose annotations of many types facilitates inquiry-driven data mining. Gene predictions, mRNA alignments, epigenomic data from the ENCODE project, conservation scores from vertebrate whole-genome alignments and variation data may be viewed at any scale from a single base to an entire chromosome. The Browser also includes many other widely used tools, including BLAT, which is useful for alignments from high-throughput sequencing experiments. Private data uploaded as Custom Tracks and Data Hubs in many formats may be displayed alongside the rich compendium of precomputed data in the UCSC database. The Table Browser is a full-featured graphical interface, which allows querying, filtering and intersection of data tables. The Saved Session feature allows users to store and share customized views, enhancing the utility of the system for organizing multiple trains of thought. Binary Alignment/Map (BAM), Variant Call Format and the Personal Genome Single Nucleotide Polymorphisms (SNPs) data formats are useful for visualizing a large sequencing experiment (whole-genome or whole-exome), where the differences between the data set and the reference assembly may be displayed graphically. Support for high-throughput sequencing extends to compact, indexed data formats, such as BAM, bigBed and bigWig, allowing rapid visualization of large datasets from RNA-seq and ChIP-seq experiments via local hosting.

  3. Bioinformatics tools for analysing viral genomic data.

    PubMed

    Orton, R J; Gu, Q; Hughes, J; Maabar, M; Modha, S; Vattipally, S B; Wilkie, G S; Davison, A J

    2016-04-01

    The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.

  4. Analysis tools for the interplay between genome layout and regulation.

    PubMed

    Bouyioukos, Costas; Elati, Mohamed; Képès, François

    2016-06-06

    Genome layout and gene regulation appear to be interdependent. Understanding this interdependence is key to exploring the dynamic nature of chromosome conformation and to engineering functional genomes. Evidence for non-random genome layout, defined as the relative positioning of either co-functional or co-regulated genes, stems from two main approaches. Firstly, the analysis of contiguous genome segments across species has highlighted the conservation of gene arrangement (synteny) along chromosomal regions. Secondly, the study of long-range interactions along a chromosome has emphasised regularities in the positioning of microbial genes that are co-regulated, co-expressed or evolutionarily correlated. While one-dimensional pattern analysis is a mature field, it is often powerless on biological datasets, which tend to be incomplete and partly incorrect. Moreover, there is a lack of comprehensive, user-friendly tools to systematically analyse, visualise, integrate and exploit regularities along genomes. Here we present the Genome REgulatory and Architecture Tools SCAN (GREAT:SCAN) software for the systematic study of the interplay between genome layout and gene expression regulation. SCAN is a collection of related and interconnected applications currently able to perform systematic analyses of genome regularities as well as to improve transcription factor binding sites (TFBS) and gene regulatory network predictions based on gene positional information. We demonstrate the capabilities of these tools by studying, on the one hand, the regular patterns of genome layout in the major regulons of the bacterium Escherichia coli and, on the other hand, their ability to improve TFBS prediction in microbes. Finally, we highlight, by visualisation of multivariate techniques, the interplay between position and sequence information for effective transcription regulation.
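
    One simple way to quantify a regularity in gene positions is sketched below. It is not the method implemented in GREAT:SCAN, which the abstract does not specify; it swaps in a basic circular-statistics measure (the mean resultant length) to score how well a set of positions lines up with a candidate period. The function names periodicity_score and best_period are hypothetical.

```python
import cmath
import math


def periodicity_score(positions, period):
    """Mean resultant length of positions wrapped onto a circle.

    Each position is converted to a phase within the candidate period.
    The score is 1.0 when all positions share the same phase and falls
    towards 0.0 when phases are spread evenly.
    """
    if not positions:
        return 0.0
    total = sum(cmath.exp(2j * math.pi * p / period) for p in positions)
    return abs(total) / len(positions)


def best_period(positions, candidates):
    """Return the candidate period with the highest score."""
    return max(candidates, key=lambda p: periodicity_score(positions, p))
```

    A caveat of this measure is that integer fractions of a true period score equally well, so the range of candidate periods needs to be chosen with care, and small gene sets can reach high scores by chance.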

  5. Bovine Genome Database: integrated tools for genome annotation and discovery.

    PubMed

    Childers, Christopher P; Reese, Justin T; Sundaram, Jaideep P; Vile, Donald C; Dickens, C Michael; Childs, Kevin L; Salih, Hanni; Bennett, Anna K; Hagen, Darren E; Adelson, David L; Elsik, Christine G

    2011-01-01

    The Bovine Genome Database (BGD; http://BovineGenome.org) strives to improve annotation of the bovine genome and to integrate the genome sequence with other genomics data. BGD includes GBrowse genome browsers, the Apollo Annotation Editor, a quantitative trait loci (QTL) viewer, BLAST databases and gene pages. Genome browsers, available for both scaffold and chromosome coordinate systems, display the bovine Official Gene Set (OGS), RefSeq and Ensembl gene models, non-coding RNA, repeats, pseudogenes, single-nucleotide polymorphism, markers, QTL and alignments to complementary DNAs, ESTs and protein homologs. The Bovine QTL viewer is connected to the BGD Chromosome GBrowse, allowing for the identification of candidate genes underlying QTL. The Apollo Annotation Editor connects directly to the BGD Chado database to provide researchers with remote access to gene evidence in a graphical interface that allows editing and creating new gene models. Researchers may upload their annotations to the BGD server for review and integration into the subsequent release of the OGS. Gene pages display information for individual OGS gene models, including gene structure, transcript variants, functional descriptions, gene symbols, Gene Ontology terms, annotator comments and links to National Center for Biotechnology Information and Ensembl. Each gene page is linked to a wiki page to allow input from the research community.

  6. Genome Calligrapher: A Web Tool for Refactoring Bacterial Genome Sequences for de Novo DNA Synthesis.

    PubMed

    Christen, Matthias; Deutsch, Samuel; Christen, Beat

    2015-08-21

    Recent advances in synthetic biology have resulted in an increasing demand for the de novo synthesis of large-scale DNA constructs. Any process improvement that enables fast and cost-effective streamlining of digitized genetic information into fabricable DNA sequences holds great promise to study, mine, and engineer genomes. Here, we present Genome Calligrapher, a computer-aided design web tool intended for whole genome refactoring of bacterial chromosomes for de novo DNA synthesis. By applying a neutral recoding algorithm, Genome Calligrapher optimizes GC content and removes obstructive DNA features known to interfere with the synthesis of double-stranded DNA and the higher order assembly into large DNA constructs. Subsequent bioinformatics analysis revealed that synthesis constraints are prevalent among bacterial genomes. However, a low level of codon replacement is sufficient for refactoring bacterial genomes into easy-to-synthesize DNA sequences. To test the algorithm, 168 kb of synthetic DNA comprising approximately 20 percent of the synthetic essential genome of the cell-cycle bacterium Caulobacter crescentus was streamlined and then ordered from a commercial supplier of low-cost de novo DNA synthesis. The successful assembly into eight 20 kb segments indicates that the Genome Calligrapher algorithm can be efficiently used to refactor difficult-to-synthesize DNA. Genome Calligrapher is broadly applicable to recode biosynthetic pathways, DNA sequences, and whole bacterial genomes, thus offering new opportunities to use synthetic biology tools to explore the functionality of microbial diversity. The Genome Calligrapher web tool can be accessed at https://christenlab.ethz.ch/GenomeCalligrapher.
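
    Neutral recoding, meaning the replacement of codons by synonymous ones so that the encoded protein is unchanged, can be sketched as follows. This is not the Genome Calligrapher algorithm. It is a toy example that assumes an upper-case coding sequence, covers only three amino acids of the standard genetic code (alanine, glycine and proline), and always picks the synonym with the fewest G and C bases; the direction of the GC adjustment and the names SYNONYMS, gc_content and lower_gc are assumptions made for illustration.

```python
# Partial synonymous-codon table from the standard genetic code.
# Only three amino acids are listed to keep the sketch short.
SYNONYMS = {
    "A": ["GCT", "GCC", "GCA", "GCG"],  # alanine
    "G": ["GGT", "GGC", "GGA", "GGG"],  # glycine
    "P": ["CCT", "CCC", "CCA", "CCG"],  # proline
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}


def gc_content(seq):
    """Fraction of G and C bases in an upper-case DNA string."""
    if not seq:
        return 0.0
    return sum(seq.count(base) for base in "GC") / len(seq)


def lower_gc(cds):
    """Replace each listed codon by its lowest-GC synonym.

    Codons not in the table, and any trailing bases that do not form a
    full codon, are copied unchanged, so the sequence length is preserved.
    """
    full = len(cds) - len(cds) % 3
    out = []
    for i in range(0, full, 3):
        codon = cds[i:i + 3]
        aa = CODON_TO_AA.get(codon)
        out.append(codon if aa is None else min(SYNONYMS[aa], key=gc_content))
    out.append(cds[full:])
    return "".join(out)
```

    A practical tool would recode only where a synthesis constraint is violated, for example within a sliding window of extreme GC content or across a repeat, which is consistent with the abstract's observation that a low level of codon replacement is sufficient.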

  7. Next generation tools for genomic data generation, distribution, and visualization.

    PubMed

    Nix, David A; Di Sera, Tonya L; Dalley, Brian K; Milash, Brett A; Cundick, Robert M; Quinn, Kevin S; Courdy, Samir J

    2010-09-09

    With the rapidly falling cost and availability of high throughput sequencing and microarray technologies, the bottleneck for effectively using genomic analysis in the laboratory and clinic is shifting to one of effectively managing, analyzing, and sharing genomic data. Here we present three open-source, platform independent, software tools for generating, analyzing, distributing, and visualizing genomic data. These include a next generation sequencing/microarray LIMS and analysis project center (GNomEx); an application for annotating and programmatically distributing genomic data using the community vetted DAS/2 data exchange protocol (GenoPub); and a standalone Java Swing application (GWrap) that makes cutting edge command line analysis tools available to those who prefer graphical user interfaces. Both GNomEx and GenoPub use the rich client Flex/Flash web browser interface to interact with Java classes and a relational database on a remote server. Both employ a public-private user-group security model enabling controlled distribution of patient and unpublished data alongside public resources. As such, they function as genomic data repositories that can be accessed manually or programmatically through DAS/2-enabled client applications such as the Integrated Genome Browser. These tools have gained wide use in our core facilities, research laboratories and clinics and are freely available for non-profit use. See http://sourceforge.net/projects/gnomex/, http://sourceforge.net/projects/genoviz/, and http://sourceforge.net/projects/useq.

  8. Functional genomics of intracellular bacteria.

    PubMed

    de Barsy, Marie; Greub, Gilbert

    2013-07-01

    During the genomic era, a large number of whole-genome sequences accumulated, which identified many hypothetical proteins of unknown function. Functional genomics, the research domain that assigns a function to a given gene product, has thus developed rapidly. Functional genomics of intracellular pathogenic bacteria exhibits specific peculiarities due to the fastidious growth of most of these intracellular micro-organisms, the close interaction with the host cell, the risk of contamination of experiments with host cell proteins and, for some strict intracellular bacteria such as Chlamydia, the absence of a simple genetic system to manipulate the bacterial genome. To identify virulence factors of intracellular pathogenic bacteria, functional genomics often relies on bioinformatic comparisons with model organisms such as Escherichia coli and Bacillus subtilis. The use of heterologous expression is another common approach. Given the intracellular lifestyle and the many effectors that are used by the intracellular bacteria to corrupt host cell functions, functional genomics also often targets the identification of new effectors such as those of the T4SS of Brucella and Legionella.

  9. Comparing thousands of circular genomes using the CGView Comparison Tool

    PubMed Central

    2012-01-01

    compare a reference sequence to thousands of existing genomes or sequence collections (next-generation sequencing reads for example) on a standard desktop computer. It provides analysis and visualization functionality not available in any existing circular genome visualization tool. By visually presenting sequence conservation information along with functional classifications and sequence composition characteristics, CCT can be a useful tool for identifying rapidly evolving or novel sequences, horizontally transferred sequences, or unusual functional properties in newly sequenced genomes. CCT is freely available for download at http://stothard.afns.ualberta.ca/downloads/CCT/. PMID:22621371

  10. From data to function: functional modeling of poultry genomics data.

    PubMed

    McCarthy, F M; Lyons, E

    2013-09-01

    One of the challenges of functional genomics is to create a better understanding of the biological system being studied so that the data produced are leveraged to provide gains for agriculture, human health, and the environment. Functional modeling enables researchers to make sense of these data as it reframes a long list of genes or gene products (mRNA, ncRNA, and proteins) by grouping based upon function, be it individual molecular functions or interactions between these molecules or broader biological processes, including metabolic and signaling pathways. However, poultry researchers have been hampered by a lack of functional annotation data, tools, and training to use these data and tools. Moreover, this lack is becoming more critical as new sequencing technologies enable us to generate data not only for an increasingly diverse range of species but also individual genomes and populations of individuals. We discuss the impact of these new sequencing technologies on poultry research, with a specific focus on what functional modeling resources are available for poultry researchers. We also describe key strategies for researchers who wish to functionally model their own data, providing background information about functional modeling approaches, the data and tools to support these approaches, and the strengths and limitations of each. Specifically, we describe methods for functional analysis using Gene Ontology (GO) functional summaries, functional enrichment analysis, and pathways and network modeling. As annotation efforts begin to provide the fundamental data that underpin poultry functional modeling (such as improved gene identification, standardized gene nomenclature, temporal and spatial expression data and gene product function), tool developers are incorporating these data into new and existing tools that are used for functional modeling, and cyberinfrastructure is being developed to provide the necessary extendibility and scalability for storing and
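
    The functional enrichment analysis mentioned in this abstract is commonly framed as an over-representation test. The sketch below is one generic way to compute it, not a method taken from the article: it assumes a one-sided hypergeometric test, and the function name and the toy numbers are hypothetical. It asks how likely it is to see at least `hits` genes carrying a given GO term in a gene list of size `sample`, when `annotated` of the `population` background genes carry that term.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value: P(X >= hits) when drawing `sample`
    genes without replacement from `population` genes, of which `annotated`
    carry the term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy example (made-up numbers): 20 background genes, 5 annotated with the term,
# a list of 5 genes of interest, 3 of which carry the term.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=5, hits=3)
print(round(p_value, 4))
```

    In practice many terms are tested at once, so the resulting p-values would normally be corrected for multiple testing before interpretation.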

  11. MaGnET: Malaria Genome Exploration Tool

    PubMed Central

    Sharman, Joanna L.; Gerloff, Dietlind L.

    2013-01-01

    Summary: The Malaria Genome Exploration Tool (MaGnET) is a software tool enabling intuitive ‘exploration-style’ visualization of functional genomics data relating to the malaria parasite, Plasmodium falciparum. MaGnET provides innovative integrated graphic displays for different datasets, including genomic location of genes, mRNA expression data, protein–protein interactions and more. Any selection of genes to explore made by the user is easily carried over between the different viewers for different datasets, and can be changed interactively at any point (without returning to a search). Availability and Implementation: Free online use (Java Web Start) or download (Java application archive and MySQL database; requires local MySQL installation) at http://malariagenomeexplorer.org Contact: joanna.sharman@ed.ac.uk or dgerloff@ffame.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23894142

  12. MaGnET: Malaria Genome Exploration Tool.

    PubMed

    Sharman, Joanna L; Gerloff, Dietlind L

    2013-09-15

    The Malaria Genome Exploration Tool (MaGnET) is a software tool enabling intuitive 'exploration-style' visualization of functional genomics data relating to the malaria parasite, Plasmodium falciparum. MaGnET provides innovative integrated graphic displays for different datasets, including genomic location of genes, mRNA expression data, protein-protein interactions and more. Any selection of genes to explore made by the user is easily carried over between the different viewers for different datasets, and can be changed interactively at any point (without returning to a search). Free online use (Java Web Start) or download (Java application archive and MySQL database; requires local MySQL installation) at http://malariagenomeexplorer.org joanna.sharman@ed.ac.uk or dgerloff@ffame.org Supplementary data are available at Bioinformatics online.

  13. GLANET: genomic loci annotation and enrichment tool.

    PubMed

    Otlu, Burçak; Firtina, Can; Keles, Sündüz; Tastan, Oznur

    2017-09-15

    Genomic studies identify genomic loci representing genetic variations, transcription factor (TF) occupancy, or histone modification through next generation sequencing (NGS) technologies. Interpreting these loci requires evaluating them with known genomic and epigenomic annotations. We present GLANET as a comprehensive annotation and enrichment analysis tool which implements a sampling-based enrichment test that accounts for GC content and/or mappability biases, jointly or separately. GLANET annotates and performs enrichment analysis on these loci with a rich library. We introduce and perform novel data-driven computational experiments for assessing the power and Type-I error of its enrichment procedure which show that GLANET has attained high statistical power and well-controlled Type-I error rate. As a key feature, users can easily extend its library with new gene sets and genomic intervals. Other key features include assessment of impact of single nucleotide variants (SNPs) on TF binding sites and regulation based pathway enrichment analysis. GLANET can be run using its GUI or on command line. GLANET's source code is available at https://github.com/burcakotlu/GLANET. Tutorials are provided at https://glanet.readthedocs.org. Contact: burcak@ceng.metu.edu.tr or oznur.tastan@cs.bilkent.edu.tr. Supplementary data are available at Bioinformatics online.
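
    A sampling-based enrichment test of the kind this abstract describes can be sketched in a few lines. The code below is not GLANET's implementation or API; it is a simplified illustration under stated assumptions: a single linear toy genome held as a string, half-open `(start, end)` intervals, GC matching by simple rejection sampling within a tolerance, and no mappability correction. All function names and parameters are hypothetical.

```python
import random


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def overlaps(a, b):
    """True if two half-open intervals (start, end) share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def count_hits(loci, annotations):
    """Number of loci that overlap at least one annotated interval."""
    return sum(any(overlaps(locus, ann) for ann in annotations) for locus in loci)


def sample_matched(genome, locus, rng, tol=0.1, max_tries=1000):
    """Draw a random interval of the same length whose GC fraction is within
    `tol` of the original locus; fall back to the locus itself if none is found."""
    length = locus[1] - locus[0]
    target = gc_fraction(genome[locus[0]:locus[1]])
    for _ in range(max_tries):
        start = rng.randrange(len(genome) - length + 1)
        if abs(gc_fraction(genome[start:start + length]) - target) <= tol:
            return (start, start + length)
    return locus


def enrichment_p(genome, loci, annotations, n_samples=200, seed=0):
    """Empirical p-value: share of GC-matched random samplings whose overlap
    count is at least as large as the observed one (with add-one smoothing)."""
    rng = random.Random(seed)
    observed = count_hits(loci, annotations)
    at_least = 0
    for _ in range(n_samples):
        null_loci = [sample_matched(genome, locus, rng) for locus in loci]
        if count_hits(null_loci, annotations) >= observed:
            at_least += 1
    return (1 + at_least) / (1 + n_samples)


# Toy data: a random 2,000-base genome with three loci, each inside an annotation.
toy_genome = "".join(random.Random(1).choice("ACGT") for _ in range(2000))
toy_loci = [(100, 120), (500, 520), (900, 920)]
toy_annotations = [(90, 130), (495, 525), (890, 930)]
print(enrichment_p(toy_genome, toy_loci, toy_annotations))
```

    Matching the null intervals on GC content is what keeps composition bias from inflating the apparent enrichment; the add-one smoothing ensures the empirical p-value is never reported as exactly zero.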

  14. Functional genomics in reproductive medicine.

    PubMed

    Barratt, Christopher L R; Hughes, David C; Afnan, Masoud; Brewis, Ian A

    2002-02-01

    The British Fertility Society organised a workshop on Functional Genomics in Reproductive Medicine at the University of Birmingham on 13-14 September 2001. The primary aim was to inform delegates about the power of the technology that has been made available after completion of the sequencing of the human genome, and to stimulate debate about using functional genomics to address both clinical and scientific questions in reproductive medicine. Three specific areas were addressed: proteomics, gene expression and bioinformatics. Although the sophistication and plethora of techniques available were obvious, major limitations in the technology were also discussed. The future promises to be very challenging indeed.

  15. ECR Browser: A Tool For Visualizing And Accessing Data From Comparisons Of Multiple Vertebrate Genomes

    SciTech Connect

    Loots, G G; Ovcharenko, I; Stubbs, L; Nobrega, M A

    2004-01-06

    The increasing number of vertebrate genomes being sequenced in draft or finished form provide a unique opportunity to study and decode the language of DNA sequence through comparative genome alignments. However, novel tools and strategies are required to accommodate this increasing volume of genomic information and to facilitate experimental annotation of genome function. Here we present the ECR Browser, a tool that provides an easy and dynamic access to whole genome alignments of human, mouse, rat and fish sequences. This web-based tool (http://ecrbrowser.dcode.org) provides the starting point for discovery of novel genes, identification of distant gene regulatory elements and prediction of transcription factor binding sites. The genome alignment portal of the ECR Browser also permits fast and automated alignment of any user-submitted sequence to the genome of choice. The interconnection of the ECR browser with other DNA sequence analysis tools creates a unique portal for studying and exploring vertebrate genomes.

  16. RNAi for functional genomics in plants.

    PubMed

    McGinnis, Karen M

    2010-03-01

    RNAi refers to several different types of gene silencing mediated by small, dsRNA molecules. Over the course of 20 years, the scientific understanding of RNAi has developed from the initial observation of unexpected expression patterns to a sophisticated understanding of a multi-faceted, evolutionarily conserved network of mechanisms that regulate gene expression in many organisms. It has also been developed as a genetic tool that can be exploited in a wide range of species. Because transgene-induced RNAi has been effective at silencing one or more genes in a wide range of plants, this technology also bears potential as a powerful functional genomics tool across the plant kingdom. Transgene-induced RNAi has indeed been shown to be an effective mechanism for silencing many genes in many organisms, but the results from multiple projects which attempted to exploit RNAi on a genome-wide scale suggest that there is a great deal of variation in the silencing efficacy between transgenic events, silencing targets and silencing-induced phenotype. The results from these projects indicate several important variables that should be considered in experimental design prior to the initiation of functional genomics efforts based on RNAi silencing. In recent years, alternative strategies have been developed for targeted gene silencing, and a combination of approaches may also enhance the use of targeted gene silencing for functional genomics.

  17. From plant genomes to protein families: computational tools

    PubMed Central

    Martinez, Manuel

    2013-01-01

    The development of new high-throughput sequencing technologies has dramatically increased the number of successful genomic projects. Thus, draft genomic sequences of more than 60 plant species are currently available. Suitable bioinformatics tools are being developed to assemble, annotate and analyze the enormous number of sequences produced. In this context, specific plant comparative genomic databases are becoming powerful tools for gene family annotation in plant clades. In this mini-review, the current state of the art of genomic projects is summarized. In addition, the computational tools developed to compare genomic data are compiled. PMID:24688740

  18. Endonucleases: new tools to edit the mouse genome.

    PubMed

    Wijshake, Tobias; Baker, Darren J; van de Sluis, Bart

    2014-10-01

    Mouse transgenesis has been instrumental in determining the function of genes in the pathophysiology of human diseases and modification of genes by homologous recombination in mouse embryonic stem cells remains a widely used technology. However, this approach harbors a number of disadvantages, as it is time-consuming and quite laborious. Over the last decade a number of new genome editing technologies have been developed, including zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats/CRISPR-associated (CRISPR/Cas). These systems are characterized by a designed DNA binding protein or RNA sequence fused or co-expressed with a non-specific endonuclease, respectively. The engineered DNA binding protein or RNA sequence guides the nuclease to a specific target sequence in the genome to induce a double strand break. The subsequent activation of the DNA repair machinery then enables the introduction of gene modifications at the target site, such as gene disruption, correction or insertion. Nuclease-mediated genome editing has numerous advantages over conventional gene targeting, including increased efficiency in gene editing, reduced generation time of mutant mice, and the ability to mutagenize multiple genes simultaneously. Although nuclease-driven modifications in the genome are a powerful tool to generate mutant mice, there are concerns about off-target cleavage, especially when using the CRISPR/Cas system. Here, we describe the basic principles of these new strategies in mouse genome manipulation, their inherent advantages, and their potential disadvantages compared to current technologies used to study gene function in mouse models. This article is part of a Special Issue entitled: From Genome to Function.

  19. Functional genomics of pathogenic bacteria.

    PubMed Central

    Moxon, E R; Hood, D W; Saunders, N J; Schweda, E K H; Richards, J C

    2002-01-01

    Microbial diseases remain the commonest cause of global mortality and morbidity. Automated-DNA sequencing has revolutionized the investigation of pathogenic microbes by making the immense fund of information contained in their genomes available at reasonable cost. The challenge is how this information can be used to increase current understanding of the biology of commensal and virulence behaviour of pathogens with particular emphasis on in vivo function and novel approaches to prevention. One example of the application of whole-genome-sequence information is afforded by investigations of the pathogenic role of Haemophilus influenzae lipopolysaccharide and its candidacy as a vaccine. PMID:11839188

  20. Plant Comparative and Functional Genomics

    DOE PAGES

    Yang, Xiaohan; Leebens-Mack, Jim; Chen, Feng; ...

    2015-01-01

    Plants form the foundation for our global ecosystem and are essential for environmental and human health. With an increasing number of available plant genomes and tractable experimental systems, comparative and functional plant genomics research is greatly expanding our knowledge of the molecular basis of economically and nutritionally important traits in crop plants. Inferences drawn from comparative genomics are motivating experimental investigations of gene function and gene interactions. This special issue aims to highlight recent advances made in comparative and functional genomics research in plants. Nine original research articles in this special issue cover five important topics: (1) transcription factor gene families relevant to abiotic stress tolerance; (2) plant secondary metabolism; (3) transcriptome-based markers for quantitative trait locus; (4) epigenetic modifications in plant-microbe interactions; and (5) computational prediction of protein-protein interactions. The plant species studied in these articles include model species as well as nonmodel plant species of economic importance (e.g., food crops and medicinal plants).

  1. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions

    SciTech Connect

    Markowitz, Victor M.; Szeto, Ernest; Palaniappan, Krishna; Grechkin, Yuri; Chu, Ken; Chen, I-Min A.; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2007-08-01

    The Integrated Microbial Genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.

  2. 2004 Structural, Function and Evolutionary Genomics

    SciTech Connect

    Brutlag, Douglas L.; Gray, Nancy Ryan

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  3. REVIEW: Zebrafish: A Renewed Model System For Functional Genomics

    NASA Astrophysics Data System (ADS)

    Wen, Xiao-Yan

    2008-01-01

    In the post-genome era, a major goal in molecular biology is to determine the function of the many thousands of genes present in the vertebrate genome. The zebrafish (Danio rerio) provides an almost ideal genetic model to identify the biological roles of these novel genes, in part because its embryos are transparent and develop rapidly. The zebrafish has many advantages over the mouse for genome-wide mutagenesis studies, allowing for easier, cheaper and faster functional characterization of novel genes in the vertebrate genome. Many molecular research tools such as chemical mutagenesis, transgenesis, gene trapping, gene knockdown, TILLING, gene targeting, RNAi and chemical genetic screens are now available in zebrafish. By combining all the forward, reverse, and chemical genetic tools, zebrafish is expected to make an invaluable contribution to vertebrate functional genomics in functional annotation of genes, modeling of human diseases and drug discovery.

  4. [The place of functional genomics in oncological research].

    PubMed

    Bálint, Bálint L; Nagy, László

    2013-03-01

    The 1000 Genomes Project changed the way we see the human genome. The rapid development of deep sequencing technologies is raising several practical questions, and the way we answer these questions will deeply affect the future of oncological research in Hungary. In our manuscript we give a short overview of the results of the 1000 Genomes Project and we present the place of functional genomic investigations among other genomic tools. Based on recent developments in the field, we summarize the challenges that have to be addressed in the next couple of years.

  5. Functional genomic screening to enhance oncolytic virotherapy.

    PubMed

    Mahoney, D J; Stojdl, D F

    2013-02-05

    Functional genomic screening has emerged as a powerful approach for understanding complex biological phenomena. Of the available tools, genome-wide RNA interference (RNAi) technology is unquestionably the most incisive, as it directly probes gene function. Recent applications of RNAi screening have been impressive. Notable amongst these is its use in elucidating mechanism(s) for signal transduction, various aspects of cell biology, tumourigenesis and metastasis, resistance to cancer therapeutics, and the host's response to a pathogen. Herein we discuss how recent RNAi screening efforts have helped turn our attention to the targetability of non-oncogene support pathways for cancer treatment, with a particular focus on a recent study that identified a non-oncogene addiction to the ER stress response as a synergistic target for oncolytic virus therapy (OVT). Moreover, we give our thoughts on the future of RNAi screening as a tool to enhance OVT and describe recent technical improvements that are poised to make genome-scale RNAi experiments more sensitive, less noisy, more applicable in vivo, and more easily validated in clinically relevant animal models.

  6. Bioinformatics tools for small genomes, such as hepatitis B virus.

    PubMed

    Bell, Trevor G; Kramvis, Anna

    2015-02-16

    DNA sequence analysis is undertaken in many biological research laboratories. The workflow consists of several steps involving the bioinformatic processing of biological data. We have developed a suite of web-based online bioinformatic tools to assist with processing, analysis and curation of DNA sequence data. Most of these tools are genome-agnostic, with two tools specifically designed for hepatitis B virus sequence data. Tools in the suite are able to process sequence data from Sanger sequencing, ultra-deep amplicon resequencing (pyrosequencing) and chromatograph (trace files), as appropriate. The tools are available online at no cost and are aimed at researchers without specialist technical computer knowledge. The tools can be accessed at http://hvdr.bioinf.wits.ac.za/SmallGenomeTools, and the source code is available online at https://github.com/DrTrevorBell/SmallGenomeTools.

  7. Bioinformatics Tools for Small Genomes, Such as Hepatitis B Virus

    PubMed Central

    Bell, Trevor G.; Kramvis, Anna

    2015-01-01

    DNA sequence analysis is undertaken in many biological research laboratories. The workflow consists of several steps involving the bioinformatic processing of biological data. We have developed a suite of web-based online bioinformatic tools to assist with processing, analysis and curation of DNA sequence data. Most of these tools are genome-agnostic, with two tools specifically designed for hepatitis B virus sequence data. Tools in the suite are able to process sequence data from Sanger sequencing, ultra-deep amplicon resequencing (pyrosequencing) and chromatograph (trace files), as appropriate. The tools are available online at no cost and are aimed at researchers without specialist technical computer knowledge. The tools can be accessed at http://hvdr.bioinf.wits.ac.za/SmallGenomeTools, and the source code is available online at https://github.com/DrTrevorBell/SmallGenomeTools. PMID:25690798

  8. Data management tools for genomic applications: A progress report

    SciTech Connect

    Markowitz, V.M.; Chen, I-Min A.

    1993-09-01

    We report in this paper on the development of data management tools that allow scientists to construct and manipulate genomic databases in terms of application-specific objects and protocols. We are developing tools for specifying genomic database structures, as well as for entering, changing, maintaining, browsing and querying data in genomic databases. These tools are based on the Object-Protocol Model (OPM) developed by us and target commercial relational database management systems, which are widely used in molecular biology laboratories. OPM allows scientists to interact with genomic databases in terms of their own frame of reference, namely genomic objects and protocols. Databases developed using the data management tools are easier to use, manage, and adapt.

  9. Integrative data-mining tools to link gene and function.

    PubMed

    El Yacoubi, Basma; de Crécy-Lagard, Valérie

    2014-01-01

    Information derived from genomic and post-genomic data can be efficiently used to link gene and function. Several web-based platforms have been developed to mine these types of data by integrating different tools. This method paper is designed to allow the user to navigate these platforms in order to make functional predictions. The main focus is on phylogenetic distribution and physical clustering tools, but other tools such as pathway reconstruction, gene fusions, and analysis of high-throughput experimental data are also surveyed.

  10. FDA Bioinformatics Tool for Microbial Genomics Research on Molecular Characterization of Bacterial Foodborne Pathogens Using Microarrays

    USDA-ARS's Scientific Manuscript database

    Background: Advances in microbial genomics and bioinformatics are offering greater insights into the emergence and spread of foodborne pathogens in outbreak scenarios. The Food and Drug Administration (FDA) has developed the genomics tool ArrayTrack™, which provides extensive functionalities to man...

  11. Neuroscience in the era of functional genomics and systems biology.

    PubMed

    Geschwind, Daniel H; Konopka, Genevieve

    2009-10-15

    Advances in genetics and genomics have fuelled a revolution in discovery-based, or hypothesis-generating, research that provides a powerful complement to the more directly hypothesis-driven molecular, cellular and systems neuroscience. Genetic and functional genomic studies have already yielded important insights into neuronal diversity and function, as well as disease. One of the most exciting and challenging frontiers in neuroscience involves harnessing the power of large-scale genetic, genomic and phenotypic data sets, and the development of tools for data integration and mining. Methods for network analysis and systems biology offer the promise of integrating these multiple levels of data, connecting molecular pathways to nervous system function.

  12. Technology developments in biological tools for targeted genome surgery.

    PubMed

    Teimourian, Shahram; Abdollahzadeh, Rasoul

    2015-01-01

    Different biological tools for targeted genome engineering have recently appeared, including meganucleases, zinc-finger nucleases and newer technologies such as TALENs and CRISPR/Cas systems. Transcription activator-like effector nucleases (TALENs) have greatly improved genome editing efficiency by making site-specific DNA double-strand breaks. Several studies have shown the prominence of TALENs in comparison to the meganucleases and zinc-finger nucleases. The most important feature of TALENs that makes them suitable tools for targeted genome editing is the modularity of central repeat domains, meaning that they can be designed to recognize any desired DNA sequence. In this review, we present a comprehensive and concise description of TALEN technology developments for targeted genome surgery, with a to-the-point description and comparison of other tools.

  13. Evolution, language and analogy in functional genomics

    NASA Technical Reports Server (NTRS)

    Benner, S. A.; Gaucher, E. A.

    2001-01-01

    Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.

  14. Evolution, language and analogy in functional genomics

    NASA Technical Reports Server (NTRS)

    Benner, S. A.; Gaucher, E. A.

    2001-01-01

    Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.

  15. Evolution, language and analogy in functional genomics.

    PubMed

    Benner, S A; Gaucher, E A

    2001-07-01

    Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.

  16. Genomics and public health: development of Web-based training tools for increasing genomic awareness.

    PubMed

    Bodzin, Jennifer; Kardia, Sharon L R; Goldenberg, Aaron; Raup, Sarah F; Bach, Janice V; Citrin, Toby

    2005-04-01

    In 2001, the Centers for Disease Control and Prevention funded three Centers for Genomics and Public Health to develop training tools for increasing genomic awareness. Over the past three years, the centers, working together with the Centers for Disease Control and Prevention's Office of Genomics and Disease Prevention, have developed tools to increase awareness of the impact genomics will have on public health practice, to provide a foundation for understanding basic genomic advances, and to translate the relevance of that information to public health practitioners' own work. These training tools serve to communicate genomic advances and their potential for integration into public health practice. This paper highlights two of these training tools: 1) Genomics for Public Health Practitioners: The Practical Application of Genomics in Public Health Practice, a Web-based introduction to genomics, and 2) Six Weeks to Genomic Awareness, an in-depth training module on public health genomics. This paper focuses on the processes and collaborative efforts by which these live presentations were developed and delivered as Web-based training sessions.

  17. CoryneBase: Corynebacterium Genomic Resources and Analysis Tools at Your Fingertips

    PubMed Central

    Tan, Mui Fern; Jakubovics, Nick S.; Wee, Wei Yee; Mutha, Naresh V. R.; Wong, Guat Jah; Ang, Mia Yang; Yazdi, Amir Hessam; Choo, Siew Woh

    2014-01-01

    Corynebacteria are used for a wide variety of industrial purposes but some species are associated with human diseases. With an increasing number of corynebacterial genomes sequenced, comparative analysis of these strains may provide better understanding of their biology, phylogeny, virulence and taxonomy that may lead to the discovery of beneficial industrial strains or contribute to better management of diseases. To facilitate the ongoing research of corynebacteria, a specialized central repository and analysis platform for the corynebacterial research community is needed to host the fast-growing amount of genomic data and facilitate the analysis of these data. Here we present CoryneBase, a genomic database for Corynebacterium with diverse functionality for the analysis of genomes aimed to provide: (1) annotated genome sequences of Corynebacterium where 165,918 coding sequences and 4,180 RNAs can be found in 27 species; (2) access to comprehensive Corynebacterium data through the use of advanced web technologies for interactive web interfaces; and (3) advanced bioinformatic analysis tools consisting of standard BLAST for homology search, VFDB BLAST for sequence homology search against the Virulence Factor Database (VFDB), Pairwise Genome Comparison (PGC) tool for comparative genomic analysis, and a newly designed Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomic analysis. CoryneBase offers access to a range of Corynebacterium genomic resources as well as analysis tools for comparative genomics and pathogenomics. It is publicly available at http://corynebacterium.um.edu.my/. PMID:24466021

  18. GenomicTools: a computational platform for developing high-throughput analytics in genomics.

    PubMed

    Tsirigos, Aristotelis; Haiminen, Niina; Bilal, Erhan; Utro, Filippo

    2012-01-15

    Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time, memory requirements as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks ranging from pre-processing and quality control to meta-analyses. Additionally, the GenomicTools platform is designed to analyze large datasets of any size by minimizing memory requirements. In practical applications, where comparable, GenomicTools outperforms existing tools in terms of both time and memory usage. The GenomicTools platform (version 2.0.0) was implemented in C++. The source code, documentation, user manual, example datasets and scripts are available online at http://code.google.com/p/ibm-cbc-genomic-tools.
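
    The "operations between sets of genomic regions" mentioned in this abstract can be illustrated with a minimal Python sketch of one such operation, interval intersection. This is not the GenomicTools C++ API or its command-line interface; the function name, the (chrom, start, end) tuple layout, and the half-open coordinate convention are assumptions made for illustration, and the simple nested loop makes no attempt at the memory or time efficiency the abstract describes.

```python
def intersect_regions(regions_a, regions_b):
    """Return the overlapping parts of two region sets.

    Each region is a hypothetical (chrom, start, end) tuple with
    half-open coordinates: start is included, end is excluded.
    """
    # Group the second set by chromosome so only same-chromosome
    # regions are compared.
    by_chrom = {}
    for chrom, start, end in regions_b:
        by_chrom.setdefault(chrom, []).append((start, end))

    overlaps = []
    for chrom, start, end in regions_a:
        for b_start, b_end in by_chrom.get(chrom, ()):
            lo, hi = max(start, b_start), min(end, b_end)
            if lo < hi:  # a non-empty overlap exists
                overlaps.append((chrom, lo, hi))
    return sorted(overlaps)


peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 80)]
genes = [("chr1", 150, 350), ("chr2", 80, 120)]
print(intersect_regions(peaks, genes))
```

    Under the half-open convention assumed here, regions that merely touch (such as the two chr2 regions above) do not count as overlapping.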

  19. Functional genomics down under: RNAi screening in the Victorian Centre for Functional Genomics.

    PubMed

    Thomas, Daniel W; Gould, Cathryn M; Handoko, Yanny; Simpson, Kaylene J

    2014-05-01

    The Victorian Centre for Functional Genomics (VCFG) is an RNAi screening facility housed at the Peter MacCallum Cancer Centre in Melbourne, Australia. The Peter Mac is Australia's largest dedicated cancer research institute, home to a team of over 520 scientists who focus on understanding the genetic risk of cancer, the molecular events regulating cancer growth and dissemination, and improving detection through new diagnostic tools (www.petermac.org). Peter Mac is a well-recognised technology leader and established the VCFG to give researchers across Australia and New Zealand access to cutting-edge functional genomics technology, infrastructure and expertise. This review documents the technology platforms operated within the VCFG and provides insight into the workflows and analysis pipelines currently in operation.

  20. Patenting genome research tools and the law.

    PubMed

    Eisenberg, Rebecca

    2003-01-01

    Patenting genes encoding therapeutic proteins was relatively uncontroversial in the early days of biotechnology. Controversy arose in the era of high-throughput DNA sequencing, when gene patents started to look less like patents on drugs and more like patents on scientific information. Evolving scientific and business strategies for exploiting genomic information raised concerns that patents might slow subsequent research. The trend towards stricter enforcement of the utility and disclosure requirements by the patent offices should help clarify the current confusion.

  1. Can orangutans (Pongo abelii) infer tool functionality?

    PubMed

    Mulcahy, Nicholas J; Schubiger, Michèle N

    2014-05-01

    It is debatable whether apes can reason about the unobservable properties of tools. We tested orangutans for this ability with a range of tool tasks that they could solve by using observational cues to infer tool functionality. In experiment 1, subjects successfully chose an unbroken tool over a broken one when each tool's middle section was hidden. This prevented seeing which tool was functional but it could be inferred by noting the tools' visible ends that were either disjointed (broken tool) or aligned (unbroken tool). We investigated whether success in experiment 1 was best explained by inferential reasoning or by having a preference per se for a hidden tool with an aligned configuration. We conducted a similar task to experiment 1 and included a functional bent tool that could be arranged to have the same disjointed configuration as the broken tool. The results suggested that subjects had a preference per se for the aligned tool by choosing it regardless of whether it was paired with the broken tool or the functional bent tool. However, further experiments with the bent tool task suggested this preference was a result of additional demands of having to attend to and remember the properties of the tools from the beginning of the task. In our last experiment, we removed these task demands and found evidence that subjects could infer the functionality of a broken tool and an unbroken tool that both looked identical at the time of choice.

  2. From Loci to Biology: Functional Genomics of Genome-Wide Association for Coronary Disease

    PubMed Central

    Nurnberg, Sylvia T; Zhang, Hanrui; Hand, Nicholas J; Bauer, Robert C; Saleheen, Danish; Reilly, Muredach P; Rader, Daniel J

    2016-01-01

    Genome-wide association studies (GWAS) have provided a rich collection of ~58 CAD loci that suggest the existence of previously unsuspected new biology relevant to atherosclerosis. However, these studies only identify genomic loci associated with CAD and many questions remain even after a genomic locus is definitively implicated, including the nature of the causal variant(s) and the causal gene(s), as well as the directionality of effect. There are a number of tools that can be employed for investigation of the functional genomics of these loci, and progress has been made on a limited number of novel CAD loci. New biology regarding atherosclerosis and CAD will be learned through the functional genomics of these loci and the hope is that at least some of these new pathways relevant to CAD pathogenesis will yield new therapeutic targets for the prevention and treatment of CAD. PMID:26892960

  3. Genomic Tools in Groundnut Breeding Program: Status and Perspectives.

    PubMed

    Janila, P; Variath, Murali T; Pandey, Manish K; Desmae, Haile; Motagi, Babu N; Okori, Patrick; Manohar, Surendra S; Rathnakumar, A L; Radhakrishnan, T; Liao, Boshou; Varshney, Rajeev K

    2016-01-01

    Groundnut, a nutrient-rich food legume, is cultivated the world over. It is valued for its good quality cooking oil, energy and protein rich food, and nutrient-rich fodder. Globally, groundnut improvement programs have developed varieties to meet the preferences of farmers, traders, processors, and consumers. Enhanced yield, tolerance to biotic and abiotic stresses and quality parameters have been the target traits. A spurt in genetic information on groundnut was facilitated by development of molecular markers, genetic and physical maps, generation of expressed sequence tags (EST), discovery of genes, and identification of quantitative trait loci (QTL) for some important biotic and abiotic stresses and quality traits. The first groundnut variety developed using marker assisted breeding (MAB) was registered in 2003. Since then, USA, China, Japan, and India have begun to use genomic tools in routine groundnut improvement programs. Introgression lines that combine foliar fungal disease resistance and early maturity were developed using MAB. Establishment of marker-trait associations (MTA) paved the way to integrate genomic tools in groundnut breeding for accelerated genetic gain. Genomic Selection (GS) tools are employed to improve drought tolerance and pod yield, governed by several minor effect QTLs. Draft genome sequence and low cost genotyping tools such as genotyping by sequencing (GBS) are expected to accelerate use of genomic tools to enhance genetic gains for target traits in groundnut.

  4. Genomic Tools in Groundnut Breeding Program: Status and Perspectives

    PubMed Central

    Janila, P.; Variath, Murali T.; Pandey, Manish K.; Desmae, Haile; Motagi, Babu N.; Okori, Patrick; Manohar, Surendra S.; Rathnakumar, A. L.; Radhakrishnan, T.; Liao, Boshou; Varshney, Rajeev K.

    2016-01-01

    Groundnut, a nutrient-rich food legume, is cultivated the world over. It is valued for its good quality cooking oil, energy and protein rich food, and nutrient-rich fodder. Globally, groundnut improvement programs have developed varieties to meet the preferences of farmers, traders, processors, and consumers. Enhanced yield, tolerance to biotic and abiotic stresses and quality parameters have been the target traits. A spurt in genetic information on groundnut was facilitated by development of molecular markers, genetic and physical maps, generation of expressed sequence tags (EST), discovery of genes, and identification of quantitative trait loci (QTL) for some important biotic and abiotic stresses and quality traits. The first groundnut variety developed using marker assisted breeding (MAB) was registered in 2003. Since then, USA, China, Japan, and India have begun to use genomic tools in routine groundnut improvement programs. Introgression lines that combine foliar fungal disease resistance and early maturity were developed using MAB. Establishment of marker-trait associations (MTA) paved the way to integrate genomic tools in groundnut breeding for accelerated genetic gain. Genomic Selection (GS) tools are employed to improve drought tolerance and pod yield, governed by several minor effect QTLs. Draft genome sequence and low cost genotyping tools such as genotyping by sequencing (GBS) are expected to accelerate use of genomic tools to enhance genetic gains for target traits in groundnut. PMID:27014312

  5. The capsicum transcriptome DB: a "hot" tool for genomic research.

    PubMed

    Góngora-Castillo, Elsa; Fajardo-Jaime, Rubén; Fernández-Cortes, Araceli; Jofre-Garfias, Alba E; Lozoya-Gloria, Edmundo; Martínez, Octavio; Ochoa-Alejo, Neftalí; Rivera-Bustamante, Rafael

    2012-01-01

    Chili pepper (Capsicum annuum) is an economically important crop with no available public genome sequence. We describe a genomic resource to facilitate Capsicum annuum research. A collection of Expressed Sequence Tags (ESTs) derived from five C. annuum organs (root, stem, leaf, flower and fruit) was sequenced using the Sanger method and multiple leaf transcriptomes were deeply sampled using GS-pyrosequencing. A hybrid assembly of 1,324,516 raw reads yielded 32,314 high quality contigs as validated by coverage and identity analysis with existing pepper sequences. Overall, 75.5% of the contigs had significant sequence similarity to entries in nucleic acid and protein databases; 23% of the sequences have not been previously reported for C. annuum and expand sequence resources for this species. A MySQL database and a user-friendly Web interface were constructed with search tools that permit queries of the ESTs including sequence, functional annotation, Gene Ontology classification, metabolic pathways, and assembly information. The Capsicum Transcriptome DB is freely available from http://www.bioingenios.ira.cinvestav.mx:81/Joomla/

  6. TALEN and CRISPR/Cas Genome Editing Systems: Tools of Discovery

    PubMed Central

    Nemudryi, A. A.; Valetdinova, K. R.; Medvedev, S. P.; Zakian, S. M.

    2014-01-01

    Precise studies of plant, animal and human genomes open remarkable opportunities for applying the resulting data in biotechnology and medicine. However, knowing nucleotide sequences is not enough to understand the functional relationships of particular genomic elements and their role in phenotype formation and disease pathogenesis. In the post-genomic era, methods for manipulating genomic DNA sequences and for visualizing and regulating gene expression are evolving rapidly. Nevertheless, few methods meet high standards of efficiency, safety and accessibility for a wide range of researchers. In 2011 and 2013, novel genome editing methods appeared: the TALEN (Transcription Activator-Like Effector Nucleases) and CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas9 systems. Although TALEN and CRISPR/Cas9 appeared recently, these systems have proved to be effective and reliable tools for genome engineering. Here we review the application of these systems to genome editing in conventional model organisms of current biology, functional genome screening, cell-based modeling of human hereditary diseases, epigenome studies and visualization of cellular processes. Additionally, we review general strategies for designing TALEN and CRISPR/Cas9 systems and analyzing their activity. We also discuss some obstacles that researchers can face when using these genome editing tools. PMID:25349712

  7. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    PubMed Central

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  8. Open chromatin reveals the functional maize genome

    PubMed Central

    Rodgers-Melnick, Eli; Vera, Daniel L.; Bass, Hank W.

    2016-01-01

    Cellular processes mediated through nuclear DNA must contend with chromatin. Chromatin structural assays can efficiently integrate information across diverse regulatory elements, revealing the functional noncoding genome. In this study, we use a differential nuclease sensitivity assay based on micrococcal nuclease (MNase) digestion to discover open chromatin regions in the maize genome. We find that maize MNase-hypersensitive (MNase HS) regions localize around active genes and within recombination hotspots, focusing biased gene conversion at their flanks. Although MNase HS regions map to less than 1% of the genome, they consistently explain a remarkably large amount (∼40%) of heritable phenotypic variance in diverse complex traits. MNase HS regions are therefore on par with coding sequences as annotations that demarcate the functional parts of the maize genome. These results imply that less than 3% of the maize genome (coding and MNase HS regions) may give rise to the overwhelming majority of phenotypic variation, greatly narrowing the scope of the functional genome. PMID:27185945

  9. CGUG: in silico proteome and genome parsing tool for the determination of "core" and unique genes in the analysis of genomes up to ca. 1.9 Mb

    PubMed Central

    Mahadevan, Padmanabhan; King, John F; Seto, Donald

    2009-01-01

    Background: Viruses and small-genome bacteria (~2 megabases and smaller) comprise a considerable population in the biosphere and are of interest to many researchers. These genomes are now sequenced at an unprecedented rate and require complementary computational tools to analyze. "CoreGenesUniqueGenes" (CGUG) is an in silico genome data mining tool that determines a "core" set of genes from two to five organisms with genomes in this size range. Core and unique genes may reflect similar niches and needs, and may be used in classifying organisms. Findings: CGUG is available as a web-based on-the-fly tool that performs iterative BLASTP analyses using a reference genome and up to four query genomes to provide a table of genes common to these genomes. The result is an in silico display of genomes and their proteomes, allowing for further analysis. CGUG can be used for "genome annotation by homology", as demonstrated with Chlamydophila and Francisella genomes. Conclusion: CGUG is used to reanalyze the ICTV-based classifications of bacteriophages, to reconfirm long-standing relationships and to explore new classifications. These genomes have been problematic in the past, due largely to horizontal gene transfers. CGUG is validated as a tool for reannotating small genome bacteria using more up-to-date annotations by similarity or homology. These serve as an entry point for wet-bench experiments to confirm the functions of these "hypothetical" and "unknown" proteins. PMID:19706165
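
    The idea of deriving "core" and unique genes from one reference genome and several query genomes can be sketched with plain set operations in Python. This is not CGUG's implementation: it assumes the BLASTP comparisons have already been run and reduced to, for each query genome, the set of reference gene identifiers with a significant hit. The function name, the gene identifiers, and the reading of "unique" as "reference genes matched in no query genome" are all assumptions made for illustration.

```python
def core_and_unique(reference_genes, hits_by_query):
    """Split reference genes into core and unique sets.

    reference_genes: iterable of gene IDs in the reference genome.
    hits_by_query: hypothetical dict mapping each query genome name to
        the reference gene IDs that had a significant hit in that genome.
    """
    reference = set(reference_genes)
    core = set(reference)   # shrinks as each query genome is applied
    matched = set()         # reference genes seen in any query genome
    for hits in hits_by_query.values():
        hits = set(hits) & reference
        core &= hits
        matched |= hits
    unique = reference - matched
    return sorted(core), sorted(unique)


reference = ["geneA", "geneB", "geneC", "geneD"]
hits = {
    "query1": {"geneA", "geneB"},
    "query2": {"geneA", "geneC"},
}
core, unique = core_and_unique(reference, hits)
print("core:", core)
print("unique to reference:", unique)
```

    Narrowing the core set one query genome at a time loosely mirrors the iterative comparison against a reference described in the abstract, although the real tool works on protein sequences and not on precomputed identifiers.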

  10. Hawkeye: an interactive visual analytics tool for genome assemblies

    PubMed Central

    2007-01-01

    Genome sequencing remains an inexact science, and genome sequences can contain significant errors if they are not carefully examined. Hawkeye is our new visual analytics tool for genome assemblies, designed to aid in identifying and correcting assembly errors. Users can analyze all levels of an assembly along with summary statistics and assembly metrics, and are guided by a ranking component towards likely mis-assemblies. Hawkeye is freely available and released as part of the open source AMOS project http://amos.sourceforge.net/hawkeye. PMID:17349036

  11. The Sleipnir library for computational functional genomics.

    PubMed

    Huttenhower, Curtis; Schroeder, Mark; Chikina, Maria D; Troyanskaya, Olga G

    2008-07-01

    Biological data generation has accelerated to the point where hundreds or thousands of whole-genome datasets of various types are available for many model organisms. This wealth of data can lead to valuable biological insights when analyzed in an integrated manner, but the computational challenge of managing such large data collections is substantial. In order to mine these data efficiently, it is necessary to develop methods that use storage, memory and processing resources carefully. The Sleipnir C++ library implements a variety of machine learning and data manipulation algorithms with a focus on heterogeneous data integration and efficiency for very large biological data collections. Sleipnir allows microarray processing, functional ontology mining, clustering, Bayesian learning and inference and support vector machine tasks to be performed for heterogeneous data on scales not previously practical. In addition to the library, which can easily be integrated into new computational systems, prebuilt tools are provided to perform a variety of common tasks. Many tools are multithreaded for parallelization in desktop or high-throughput computing environments, and most tasks can be performed in minutes for hundreds of datasets using a standard personal computer. Source code (C++) and documentation are available at http://function.princeton.edu/sleipnir and compiled binaries are available from the authors on request.

  12. The Human Functional Genomics Project: Understanding Generation of Diversity.

    PubMed

    Pappalardo, Jenna L; Hafler, David A

    2016-11-03

    Generation of biologic diversity is a cornerstone of immunity, yet the tools to investigate the causal influence of genetic and environmental factors have been greatly limited. Studies from the Human Functional Genomics Project, presented in Cell and other Cell Press journals, integrate environmental and genetic factors with the direction and magnitude of immune responses to decipher inflammatory disease pathogenesis.

  13. GenomePeek—an online tool for prokaryotic genome and metagenome analysis

    DOE PAGES

    McNair, Katelyn; Edwards, Robert A.

    2015-06-16

    As increases in prokaryotic sequencing take place, a method to quickly and accurately analyze this data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach where reads to a set of conserved genes are extracted, assembled and then aligned against the highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.

  14. GenomePeek—an online tool for prokaryotic genome and metagenome analysis

    SciTech Connect

    McNair, Katelyn; Edwards, Robert A.

    2015-06-16

    As increases in prokaryotic sequencing take place, a method to quickly and accurately analyze this data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach where reads to a set of conserved genes are extracted, assembled and then aligned against the highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.

  15. The Functional Genomics Initiative at Oak Ridge National Laboratory

    SciTech Connect

    Johnson, Dabney; Justice, Monica; Beattle, Ken; Buchanan, Michelle; Ramsey, Michael; Ramsey, Rose; Paulus, Michael; Ericson, Nance; Allison, David; Kress, Reid; Mural, Richard; Uberbacher, Ed; Mann, Reinhold

    1997-12-31

    The Functional Genomics Initiative at the Oak Ridge National Laboratory integrates outstanding capabilities in mouse genetics, bioinformatics, and instrumentation. The 50 year investment by the DOE in mouse genetics/mutagenesis has created a one-of-a-kind resource for generating mutations and understanding their biological consequences. It is generally accepted that, through the mouse as a surrogate for human biology, we will come to understand the function of human genes. In addition to this world class program in mammalian genetics, ORNL has also been a world leader in developing bioinformatics tools for the analysis, management and visualization of genomic data. Combining this expertise with new instrumentation technologies will provide a unique capability to understand the consequences of mutations in the mouse at both the organism and molecular levels. The goal of the Functional Genomics Initiative is to develop the technology and methodology necessary to understand gene function on a genomic scale and apply these technologies to megabase regions of the human genome. The effort is scoped so as to create an effective and powerful resource for functional genomics. ORNL is partnering with the Joint Genome Institute and other large scale sequencing centers to sequence several multimegabase regions of both human and mouse genomic DNA, to identify all the genes in these regions, and to conduct fundamental surveys to examine gene function at the molecular and organism level. The Initiative is designed to be a pilot for larger scale deployment in the post-genome era. Technologies will be applied to the examination of gene expression and regulation, metabolism, gene networks, physiology and development.

  16. Applications of genomic tools to colorectal cancer therapeutics.

    PubMed

    Auman, J Todd; McLeod, Howard L

    2008-12-01

    Clinically and histopathologically similar colorectal cancers exhibit considerable variability in their responses to chemotherapeutics. The advent of genomic technologies has enabled the unbiased determination of changes in DNA and RNA, alterations that may be responsible for, or predictive of the variability in response to chemotherapy. This review highlights several advances made in applying genomic tools toward colorectal cancer therapeutics. Progress has been made using gene expression profiling to identify which colorectal cancer patients would benefit most from adjuvant chemotherapy. In addition, advances have been made in colorectal cancer pharmacogenomics by identifying gene expression patterns associated with sensitivity to specific chemotherapeutic agents. Lastly, the use of genome-wide mutation screening of individual tumor samples to identify the profiles of mutated genes is explored. Future research toward integrating genomic information with clinical and histopathological data is expected to lead to improved therapeutic management of colorectal cancer.

  17. Complete Genome Sequence of Flavobacterium psychrophilum Strain OSU THCO2-90, Used for Functional Genetic Analysis.

    PubMed

    Rochat, Tatiana; Barbier, Paul; Nicolas, Pierre; Loux, Valentin; Pérez-Pascual, David; Guijarro, José A; Bernardet, Jean-François; Duchaud, Eric

    2017-02-23

    We report here the complete annotated genome sequence of Flavobacterium psychrophilum OSU THCO2-90, isolated from Coho salmon (Oncorhynchus kisutch) in Oregon. The genome consists of a circular chromosome with 2,343 predicted open reading frames. This strain has proved to be a valuable tool for functional genomics.

  18. BEDTools: the Swiss-army tool for genome feature analysis

    PubMed Central

    Quinlan, Aaron R.

    2014-01-01

    Technological advances have enabled the use of DNA sequencing as a flexible tool to characterize genetic variation and to measure the activity of diverse cellular phenomena such as gene isoform expression and transcription factor binding. Extracting biological insight from the experiments enabled by these advances demands the analysis of large, multi-dimensional datasets. This unit describes the use of the BEDTools toolkit for the exploration of high-throughput genomics datasets. I present several protocols for common genomic analyses and demonstrate how simple BEDTools operations may be combined to create bespoke pipelines addressing complex questions. PMID:25199790
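
    The point that simple interval operations can be combined into a small pipeline can be illustrated in pure Python. The sketch below does not call BEDTools and is not taken from the unit's protocols; it only imitates the idea of a merge step followed by a summary step. The function names, the (chrom, start, end) tuples with half-open coordinates, and the choice to merge intervals that merely touch are assumptions made for illustration.

```python
def merge_intervals(regions):
    """Collapse overlapping or touching intervals on each chromosome."""
    merged = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval when this one overlaps or touches it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def covered_bases(regions):
    """Second pipeline step: count bases covered after merging."""
    return sum(end - start for _, start, end in merge_intervals(regions))


reads = [("chr1", 10, 20), ("chr1", 15, 30), ("chr1", 30, 40), ("chr2", 5, 10)]
print(merge_intervals(reads))
print(covered_bases(reads))
```

    Merging before counting keeps bases in overlapping intervals from being counted twice, which is the kind of small composition of operations the abstract alludes to.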

  19. Virome genomics: a tool for defining the human virome.

    PubMed

    Wylie, Kristine M; Weinstock, George M; Storch, Gregory A

    2013-08-01

    High throughput, deep sequencing assays are powerful tools for gaining insights into virus-host interactions. Sequencing assays can discover novel viruses and describe the genomes of novel and known viruses. Genomic information can predict viral proteins that can be characterized, describe important genes in the host that control infections, and evaluate gene expression of viruses and hosts during infection. Sequencing can also describe variation and evolution of viruses during replication and transmission. This review recounts some of the major advances in the studies of virus-host interactions from the last two years, and discusses the uses of sequencing technologies relating to these studies.

  20. Databases and Web Tools for Cancer Genomics Study

    PubMed Central

    Yang, Yadong; Dong, Xunong; Xie, Bingbing; Ding, Nan; Chen, Juan; Li, Yongjun; Zhang, Qian; Qu, Hongzhu; Fang, Xiangdong

    2015-01-01

    Publicly-accessible resources have promoted the advance of scientific discovery. The era of genomics and big data has brought the need for collaboration and data sharing in order to make effective use of this new knowledge. Here, we describe the web resources for cancer genomics research and rate them on the basis of the diversity of cancer types, sample size, omics data comprehensiveness, and user experience. The resources reviewed include data repository and analysis tools; and we hope such introduction will promote the awareness and facilitate the usage of these resources in the cancer research community. PMID:25707591

  1. Databases and web tools for cancer genomics study.

    PubMed

    Yang, Yadong; Dong, Xunong; Xie, Bingbing; Ding, Nan; Chen, Juan; Li, Yongjun; Zhang, Qian; Qu, Hongzhu; Fang, Xiangdong

    2015-02-01

    Publicly-accessible resources have promoted the advance of scientific discovery. The era of genomics and big data has brought the need for collaboration and data sharing in order to make effective use of this new knowledge. Here, we describe the web resources for cancer genomics research and rate them on the basis of the diversity of cancer types, sample size, omics data comprehensiveness, and user experience. The resources reviewed include data repository and analysis tools; and we hope such introduction will promote the awareness and facilitate the usage of these resources in the cancer research community.

  2. Probing the impact of chromatin conformation on genome editing tools

    PubMed Central

    Chen, Xiaoyu; Rinsma, Marrit; Janssen, Josephine M.; Liu, Jin; Maggio, Ignazio; Gonçalves, Manuel A.F.V.

    2016-01-01

    Transcription activator-like effector nucleases (TALENs) and RNA-guided nucleases derived from clustered, regularly interspaced, short palindromic repeats (CRISPR)-Cas9 systems have become ubiquitous genome editing tools. Despite this, the impact that distinct high-order chromatin conformations have on these sequence-specific designer nucleases is, presently, ill-defined. The same applies to the relative performance of TALENs and CRISPR/Cas9 nucleases at isogenic target sequences subjected to different epigenetic modifications. Here, to address these gaps in our knowledge, we have implemented quantitative cellular systems based on genetic reporters in which the euchromatic and heterochromatic statuses of designer nuclease target sites are stringently controlled by small-molecule drug availability. By using these systems, we demonstrate that TALENs and CRISPR/Cas9 nucleases are both significantly affected by the high-order epigenetic context of their target sequences. In addition, this outcome could also be ascertained for S. pyogenes CRISPR/Cas9 complexes harbouring Cas9 variants whose DNA cleaving specificities are superior to that of the wild-type Cas9 protein. Thus, the herein investigated cellular models will serve as valuable functional readouts for screening and assessing the role of chromatin on designer nucleases based on different platforms or with different architectures or compositions. PMID:27280977

  3. Genome editing strategies: potential tools for eradicating HIV-1/AIDS

    PubMed Central

    Khalili, Kamel; Gordon, Jennifer; Cosentino, Laura; Hu, Wenhui

    2015-01-01

    Current therapy for controlling HIV-1 infection and preventing AIDS progression has profoundly decreased viral replication in cells susceptible to HIV-1 infection, but it does not eliminate the low level of viral replication in latently infected cells which contain integrated copies of HIV-1 proviral DNA. There is an urgent need for the development of HIV-1 genome eradication strategies that will lead to a permanent or “sterile” cure of HIV-1/AIDS. In the past few years, novel nuclease-initiated genome editing tools have been developing rapidly, including ZFNs, TALENs, and the CRISPR/Cas9 system. These surgical knives, which can excise any genome, provide a great opportunity to eradicate the HIV-1 genome by targeting highly conserved regions of the HIV-1 long terminal repeats or essential viral genes. Given the time consuming and costly engineering of target-specific ZFNs and TALENs, the RNA-guided endonuclease Cas9 technology has emerged as a simpler and more versatile technology to allow permanent removal of integrated HIV-1 proviral DNA in eukaryotic cells, and hopefully animal models or human patients. The major unmet challenges of this approach at present include inefficient nuclease gene delivery, potential off-target cleavage, and cell-specific genome targeting. Nanoparticle or lentivirus-mediated delivery of next generation Cas9 technologies including nickase or RNA-guided FokI nuclease (RFN) will further improve the potential for genome editing to become a promising approach for curing HIV-1/AIDS. PMID:25716921

  4. Biological Database of Images and Genomes: tools for community annotations linking image and genomic information

    PubMed Central

    Oberlin, Andrew T; Jurkovic, Dominika A; Balish, Mitchell F; Friedberg, Iddo

    2013-01-01

    Genomic data and biomedical imaging data are undergoing exponential growth. However, our understanding of the phenotype–genotype connection linking the two types of data is lagging behind. While there are many types of software that enable the manipulation and analysis of image data and genomic data as separate entities, there is no framework established for linking the two. We present a generic set of software tools, BioDIG, that allows linking of image data to genomic data. BioDIG tools can be applied to a wide range of research problems that require linking images to genomes. BioDIG features the following: rapid construction of web-based workbenches, community-based annotation, user management and web services. By using BioDIG to create websites, researchers and curators can rapidly annotate a large number of images with genomic information. Here we present the BioDIG software tools that include an image module, a genome module and a user management module. We also introduce a BioDIG-based website, MyDIG, which is being used to annotate images of mycoplasmas. Database URL: BioDIG website: http://biodig.org BioDIG source code repository: http://github.com/FriedbergLab/BioDIG The MyDIG database: http://mydig.biodig.org/ PMID:23550062

  5. RNAi and functional genomics in plant parasitic nematodes.

    PubMed

    Rosso, M N; Jones, J T; Abad, P

    2009-01-01

    Plant nematology is currently undergoing a revolution with the availability of the first genome sequences as well as comprehensive expressed sequence tag (EST) libraries from a range of nematode species. Several strategies are being used to exploit this wealth of information. Comparative genomics is being used to explore the acquisition of novel genes associated with parasitic lifestyles. Functional analyses of nematode genes are moving toward larger scale studies including global transcriptome profiling. RNA interference (RNAi) has been shown to reduce expression of a range of plant parasitic nematode genes and is a powerful tool for functional analysis of nematode genes. RNAi-mediated suppression of genes essential for nematode development, survival, or parasitism is revealing new targets for nematode control. Plant nematology in the genomics era is now facing the challenge to develop RNAi screens adequate for high-throughput functional analyses.

  6. Rolling circle amplification, a powerful tool for genetic and functional studies of complete hepatitis B virus genomes from low-level infections and for directly probing covalently closed circular DNA.

    PubMed

    Margeridon, Séverine; Carrouée-Durantel, Sandra; Chemin, Isabelle; Barraud, Luc; Zoulim, Fabien; Trépo, Christian; Kay, Alan

    2008-09-01

    Complete characterization of the biological properties of hepatitis B virus (HBV) variants requires the generation of full-length genomes. The aim of this study was to develop new tools for the efficient full-length genome amplification of virus from samples with low viral loads. Rolling circle amplification (RCA) was used to amplify full-length HBV genomes from both sera and liver biopsy samples from chronic HBV carriers. Serum-derived relaxed circular HBV DNA could be amplified only after completion and ligation of plus-strand DNA. Covalently closed circular DNA (cccDNA) from liver biopsies could be amplified directly from as few as 13 copies, using RCA, followed by a full-length HBV PCR. Three serial liver biopsy samples were obtained from a lamivudine-resistant patient who cleared detectable serum HBV after adefovir dipivoxil was added to the lamivudine therapy and then seroconverted to anti-HBs. Only the genomes from the last biopsy specimen obtained after the emergence of lamivudine resistance contained the lamivudine resistance-associated mutations rtL180M and rtM204V ("rt" indicates reverse transcriptase domain). Defective genomes were also found in this biopsy sample. Genomes cloned from the liver biopsy specimens were transfected into HuH7 cells to study their replication competence and their susceptibility to lamivudine. RCA is a powerful tool for amplifying full-length HBV genomes and will be especially useful for the study of occult or inactive HBV infections and patients undergoing antiviral treatment. It can also be used to probe HBV cccDNA, the crucial intermediate in viral persistence and the archive of resistance mutations.

  7. A factor analysis model for functional genomics

    PubMed Central

    Kustra, Rafal; Shioda, Romy; Zhu, Mu

    2006-01-01

    Background: Expression array data are used to predict biological functions of uncharacterized genes by comparing their expression profiles to those of characterized genes. While biologically plausible, this is both statistically and computationally challenging. Typical approaches are computationally expensive and ignore correlations among expression profiles and functional categories. Results: We propose a factor analysis model (FAM) for functional genomics and give a two-step algorithm, using genome-wide expression data for yeast and a subset of Gene-Ontology Biological Process functional annotations. We show that the predictive performance of our method is comparable to the current best approach while our total computation time was faster by a factor of 4000. We discuss the unique challenges in performance evaluation of algorithms used for genome-wide functional genomics. Finally, we discuss extensions to our method that can incorporate the inherent correlation structure of the functional categories to further improve predictive performance. Conclusion: Our factor analysis model is a computationally efficient technique for functional genomics and provides a clear and unified statistical framework with potential for incorporating important gene ontology information to improve predictions. PMID:16630343

  8. DoriTool: A Bioinformatics Integrative Tool for Post-Association Functional Annotation.

    PubMed

    Martín-Antoniano, Isabel; Alonso, Lola; Madrid, Miguel; López de Maturana, Evangelina; Malats, Núria

    2017-07-13

    The emergence of high-throughput data in biology has increased the need for functional in silico analysis and prompted the development of integrative bioinformatics tools to facilitate the obtainment of biologically meaningful data. In this paper, we present DoriTool, a comprehensive, easy, and friendly pipeline integrating biological data from different functional tools. The tool was designed with the aim to maximize reproducibility and reduce the working time of the researchers, especially of those with limited bioinformatics skills, and to help them with the interpretation of the results. DoriTool is based upon an integrative strategy implemented following a modular design pattern. Using scripts written in Bash, Perl and R, it performs a functional in silico analysis annotation at mutation/variant level, gene level, pathway level and network level by combining up-to-date functional and genomic data and integrating also third-party bioinformatics tools in a pipeline. DoriTool uses GRCh37 human assembly and online mode. DoriTool provides nice visual reports including variant annotation, linkage disequilibrium proxies, gene annotation, gene ontology analysis, expression quantitative trait loci results from Genotype-Tissue Expression (GTEx) and coloured pathways. Here, we also show DoriTool functionalities by applying it to a dataset of 13 variants associated with prostate cancer. Project development, released code libraries, GitHub repository (https://github.com/doritool) and documentation are hosted at https://doritool.github.io/. DoriTool is, to our knowledge, the most complete bioinformatics tool offering functional in silico annotation of variants previously associated with a trait of interest, shedding light on the underlying biology and helping the researchers in the interpretation and discussion of the results. © 2017 S. Karger AG, Basel.

  9. Tetrahymena as a Unicellular Model Eukaryote: Genetic and Genomic Tools.

    PubMed

    Ruehle, Marisa D; Orias, Eduardo; Pearson, Chad G

    2016-06-01

    Tetrahymena thermophila is a ciliate model organism whose study has led to important discoveries and insights into both conserved and divergent biological processes. In this review, we describe the tools for the use of Tetrahymena as a model eukaryote, including an overview of its life cycle, orientation to its evolutionary roots, and methodological approaches to forward and reverse genetics. Recent genomic tools have expanded Tetrahymena's utility as a genetic model system. With the unique advantages that Tetrahymena provide, we argue that it will continue to be a model organism of choice. Copyright © 2016 by the Genetics Society of America.

  10. Tetrahymena as a Unicellular Model Eukaryote: Genetic and Genomic Tools

    PubMed Central

    Ruehle, Marisa D.; Orias, Eduardo; Pearson, Chad G.

    2016-01-01

    Tetrahymena thermophila is a ciliate model organism whose study has led to important discoveries and insights into both conserved and divergent biological processes. In this review, we describe the tools for the use of Tetrahymena as a model eukaryote, including an overview of its life cycle, orientation to its evolutionary roots, and methodological approaches to forward and reverse genetics. Recent genomic tools have expanded Tetrahymena’s utility as a genetic model system. With the unique advantages that Tetrahymena provide, we argue that it will continue to be a model organism of choice. PMID:27270699

  11. Cascade: an RNA-seq visualization tool for cancer genomics.

    PubMed

    Shifman, Aaron R; Johnson, Radia M; Wilhelm, Brian T

    2016-01-25

    Cancer genomics projects are producing ever-increasing amounts of rich and diverse data from patient samples. The ability to easily visualize this data in an integrated and intuitive way is currently limited by the available software. As a result, users typically must use several different tools to view the different data types for their cohort, making it difficult to have a simple unified view of their data. Here we present Cascade, a novel web based tool for the intuitive 3D visualization of RNA-seq data from cancer genomics experiments. The Cascade viewer allows multiple data types (e.g. mutation, gene expression, alternative splicing frequency) to be simultaneously displayed, allowing a simplified view of the data in a way that is tuneable based on user specified parameters. The main webpage of Cascade provides a primary view of user data which is overlaid onto known biological pathways that are either predefined or added by users. A space-saving menu for data selection and parameter adjustment allows users to access an underlying MySQL database and customize the features presented in the main view. There is currently a pressing need for new software tools to allow researchers to easily explore large cancer genomics datasets and generate hypotheses. Cascade represents a simple yet intuitive interface for data visualization that is both scalable and customizable.

  12. The CRISPR/Cas Genome-Editing Tool: Application in Improvement of Crops

    PubMed Central

    Khatodia, Surender; Bhatotia, Kirti; Passricha, Nishat; Khurana, S. M. P.; Tuteja, Narendra

    2016-01-01

    The Clustered Regularly Interspaced Short Palindromic Repeats associated Cas9/sgRNA system is a novel targeted genome-editing technique derived from the bacterial immune system. It is an inexpensive, easy, user-friendly and rapidly adopted genome editing tool that is becoming a revolutionary paradigm. This technique enables precise genomic modifications in many different organisms and tissues. Cas9 protein is an RNA guided endonuclease utilized for creating targeted double-stranded breaks with only a short RNA sequence to confer recognition of the target in animals and plants. Development of genetically edited (GE) crops similar to those developed by conventional or mutation breeding using this potential technique makes it a promising and extremely versatile tool for providing sustainable productive agriculture for better feeding of a rapidly growing population in a changing climate. The emerging areas of research for genome editing in plants include interrogating gene function, rewiring the regulatory signaling networks and sgRNA libraries for high-throughput loss-of-function screening. In this review, we have described the broad applicability of Cas9 nuclease mediated targeted plant genome editing for development of designer crops. The regulatory uncertainty and social acceptance of plant breeding by Cas9 genome editing have also been described. With this powerful and innovative technique, designer GE non-GM plants could further advance climate resilient and sustainable agriculture in the future and maximize yield by combating abiotic and biotic stresses. PMID:27148329

  13. Recent advances in developing molecular tools for targeted genome engineering of mammalian cells.

    PubMed

    Lim, Kwang-il

    2015-01-01

    Various biological molecules naturally existing in diversified species including fungi, bacteria, and bacteriophage have functionalities for DNA binding and processing. These biological molecules have recently been actively engineered for use in customized genome editing of mammalian cells as the molecule-encoding DNA sequence information and the underlying mechanisms by which the molecules work have been unveiled. Excitingly, multiple novel methods based on the newly constructed artificial molecular tools have enabled modifications of specific endogenous genetic elements in the genome context at efficiencies that are much higher than those of the conventional homologous recombination based methods. This minireview introduces the most recently spotlighted molecular genome engineering tools with their key features and ongoing modifications for better performance. Such ongoing efforts have mainly focused on the removal of the inherent DNA sequence recognition rigidity from the original molecular platforms, the addition of newly tailored targeting functions into the engineered molecules, and the enhancement of their targeting specificity. Effective targeted genome engineering of mammalian cells will enable not only sophisticated genetic studies in the context of the genome, but also widely-applicable universal therapeutics based on the pinpointing and correction of the disease-causing genetic elements within the genome in the near future.

  14. Spatial Genome Organization and Its Emerging Role as a Potential Diagnosis Tool

    PubMed Central

    Meaburn, Karen J.

    2016-01-01

    In eukaryotic cells the genome is highly spatially organized. Functional relevance of higher order genome organization is implied by the fact that specific genes, and even whole chromosomes, alter spatial position in concert with functional changes within the nucleus, for example with modifications to chromatin or transcription. The exact molecular pathways that regulate spatial genome organization and the full implication to the cell of such an organization remain to be determined. However, there is a growing realization that the spatial organization of the genome can be used as a marker of disease. While global genome organization patterns remain largely conserved in disease, some genes and chromosomes occupy distinct nuclear positions in diseased cells compared to their normal counterparts, with the patterns of reorganization differing between diseases. Importantly, mapping the spatial positioning patterns of specific genomic loci can distinguish cancerous tissue from benign with high accuracy. Genome positioning is an attractive novel biomarker since additional quantitative biomarkers are urgently required in many cancer types. Current diagnostic techniques are often subjective and generally lack the ability to identify aggressive cancer from indolent, which can lead to over- or under-treatment of patients. Proof-of-principle for the use of genome positioning as a diagnostic tool has been provided based on small scale retrospective studies. Future large-scale studies are required to assess the feasibility of bringing spatial genome organization-based diagnostics to the clinical setting and to determine if the positioning patterns of specific loci can be useful biomarkers for cancer prognosis. Since spatial reorganization of the genome has been identified in multiple human diseases, it is likely that spatial genome positioning patterns as a diagnostic biomarker may be applied to many diseases. PMID:27507988

  15. Development of peanut expressed sequence tag-based genomic resources and tools

    USDA-ARS's Scientific Manuscript database

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  16. GAPIT Version 2: An Enhanced Integrated Tool for Genomic Association and Prediction.

    PubMed

    Tang, You; Liu, Xiaolei; Wang, Jiabo; Li, Meng; Wang, Qishan; Tian, Feng; Su, Zhongbin; Pan, Yuchun; Liu, Di; Lipka, Alexander E; Buckler, Edward S; Zhang, Zhiwu

    2016-07-01

    Most human diseases and agriculturally important traits are complex. Dissecting their genetic architecture requires continued development of innovative and powerful statistical methods. Corresponding advances in computing tools are critical to efficiently use these statistical innovations and to enhance and accelerate biomedical and agricultural research and applications. The genome association and prediction integrated tool (GAPIT) was first released in 2012 and became widely used for genome-wide association studies (GWAS) and genomic prediction. The GAPIT implemented computationally efficient statistical methods, including the compressed mixed linear model (CMLM) and genomic prediction by using genomic best linear unbiased prediction (gBLUP). New state-of-the-art statistical methods have now been implemented in a new, enhanced version of GAPIT. These methods include factored spectrally transformed linear mixed models (FaST-LMM), enriched CMLM (ECMLM), FaST-LMM-Select, and settlement of mixed linear models under progressively exclusive relationship (SUPER). The genomic prediction methods implemented in this new release of the GAPIT include gBLUP based on CMLM, ECMLM, and SUPER. Additionally, the GAPIT was updated to improve its existing output display features and to add new data display and evaluation functions, including new graphing options and capabilities, phenotype simulation, power analysis, and cross-validation. These enhancements make the GAPIT a valuable resource for determining appropriate experimental designs and performing GWAS and genomic prediction. The enhanced R-based GAPIT software package uses state-of-the-art methods to conduct GWAS and genomic prediction. The GAPIT also provides new functions for developing experimental designs and creating publication-ready tabular summaries and graphs to improve the efficiency and application of genomic research.

  17. Genetic and genomic approaches to understanding macrophage identity and function.

    PubMed

    Glass, Christopher K

    2015-04-01

    A major goal of our laboratory is to understand the molecular mechanisms that underlie the development and functions of diverse macrophage phenotypes in health and disease. Recent studies using genetic and genomic approaches suggest a relatively simple model of collaborative and hierarchical interactions between lineage-determining and signal-dependent transcription factors that enable selection and activation of transcriptional enhancers that specify macrophage identity and function. In addition, we have found that it is possible to use natural genetic variation as a powerful tool for advancing our understanding of how the macrophage deciphers the information encoded by the genome to attain specific phenotypes in a context-dependent manner. Here, I will describe our recent efforts to extend genetic and genomic approaches to investigate the roles of distinct tissue environments in determining the phenotypes of different resident populations of macrophages.

  18. An Italian functional genomic resource for Medicago truncatula

    PubMed Central

    Porceddu, Andrea; Panara, Francesco; Calderini, Ornella; Molinari, Lorna; Taviani, Paola; Lanfaloni, Luisa; Scotti, Carla; Carelli, Maria; Scaramelli, Laura; Bruschi, Gianluca; Cosson, Viviane; Ratet, Pascal; de Larembergue, Henri; Duc, Gerard; Piano, Efisio; Arcioni, Sergio

    2008-01-01

    Background: Medicago truncatula is a model species for legumes. Its functional genomics have been considerably boosted in recent years due to initiatives based both in Europe and the US. Collections of mutants are becoming increasingly available and this will help unravel the genetic control of important traits for many species of legumes. Findings: Our report is on the production of three complementary mutant collections of the model species Medicago truncatula produced in Italy in the frame of a national genomic initiative. Well established strategies were used: Tnt1 mutagenesis, TILLING and activation tagging. Both forward and reverse genetics screenings proved the efficiency of the mutagenesis approaches adopted, enabling the isolation of interesting mutants which are in the course of characterization. We anticipate that the reported collections will be complementary to the recently established functional genomics tools developed for Medicago truncatula both in Europe and in the United States. PMID:19077311

  19. Unraveling the 3D genome: genomics tools for multi-scale exploration

    PubMed Central

    Risca, Viviana I.; Greenleaf, William J.

    2015-01-01

    A decade of rapid method development has begun to yield exciting insights into the three-dimensional architecture of the metazoan genome and the roles it may play in regulating transcription. We review here core methods and new tools in the modern genomicist’s toolbox at three length scales, ranging from single base pair to megabase scale chromosomal domains, and discuss the emerging picture of the 3D genome that these tools have revealed. Blind spots remain, especially at intermediate length scales spanning a few nucleosomes, but thanks in part to new technologies that permit targeted alteration of chromatin states and time-resolved studies, the next decade holds great promise for hypothesis-driven research into the mechanisms that drive genome architecture and transcriptional regulation. PMID:25887733

  20. Tomato functional genomics database (TFGD): a comprehensive collection and analysis package for tomato functional genomics

    USDA-ARS's Scientific Manuscript database

    Tomato Functional Genomics Database (TFGD; http://ted.bti.cornell.edu) provides a comprehensive systems biology resource to store, mine, analyze, visualize and integrate large-scale tomato functional genomics datasets. The database is expanded from the previously described Tomato Expression Database...

  1. A comparison of tools for the simulation of genomic next-generation sequencing data.

    PubMed

    Escalona, Merly; Rocha, Sara; Posada, David

    2016-08-01

    Computer simulation of genomic data has become increasingly popular for assessing and validating biological models or for gaining an understanding of specific data sets. Several computational tools for the simulation of next-generation sequencing (NGS) data have been developed in recent years, which could be used to compare existing and new NGS analytical pipelines. Here we review 23 of these tools, highlighting their distinct functionality, requirements and potential applications. We also provide a decision tree for the informed selection of an appropriate NGS simulation tool for the specific question at hand.

  2. A comparison of tools for the simulation of genomic next-generation sequencing data

    PubMed Central

    Escalona, Merly; Rocha, Sara; Posada, David

    2017-01-01

    Computer simulation of genomic data has become increasingly popular for assessing and validating biological models or to gain understanding about specific datasets. Multiple computational tools for the simulation of next-generation sequencing (NGS) data have been developed in recent years, which could be used to compare existing and new NGS analytical pipelines. Here we review 23 of these tools, highlighting their distinct functionality, requirements and potential applications. We also provide a decision tree for the informed selection of an appropriate NGS simulation tool for the specific question at hand. PMID:27320129

  3. Genomic-based-breeding tools for tropical maize improvement.

    PubMed

    Chakradhar, Thammineni; Hindu, Vemuri; Reddy, Palakolanu Sudhakar

    2017-09-05

    Maize has traditionally been the main staple diet in Southern Asia and Sub-Saharan Africa and is widely grown by millions of resource-poor small-scale farmers. Approximately 35.4 million hectares are sown to tropical maize, constituting around 59% of the developing worlds. Tropical maize encounters tremendous challenges besides poor agro-climatic situations, with average yields recorded at <3 tonnes/hectare, far less than the average of developed countries. In contrast to the poor yields, the demand for maize as food, feed, and fuel is continuously increasing in these regions. Heterosis breeding introduced in the early 90s improved maize yields significantly, but genetic gains are still a mirage, particularly for crops growing under marginal environments. Application of molecular markers has accelerated the pace of maize breeding to some extent. The availability of an array of sequencing and genotyping technologies offers unrivalled service to improve precision in maize-breeding programs through modern approaches such as genomic selection, genome-wide association studies, bulk segregant analysis-based sequencing approaches, etc. Superior alleles underlying complex traits can easily be identified and introgressed efficiently using these sequence-based approaches. Integration of genomic tools and techniques with advanced genetic resources such as nested association mapping and backcross nested association mapping could certainly address the genetic issues in maize improvement programs in developing countries. Huge diversity in tropical maize and its inherent capacity for doubled haploid technology offer an advantage in applying next generation genomic tools for accelerating production in marginal environments of the tropical and subtropical world. Precision in phenotyping is the key to success of any molecular-breeding approach. This article reviews genomic technologies and their application to improve agronomic traits in tropical maize breeding has been reviewed in

  4. Decoding the genome with an integrative analysis tool: combinatorial CRM Decoder.

    PubMed

    Kang, Keunsoo; Kim, Joomyeong; Chung, Jae Hoon; Lee, Daeyoup

    2011-09-01

    The identification of genome-wide cis-regulatory modules (CRMs) and characterization of their associated epigenetic features are fundamental steps toward the understanding of gene regulatory networks. Although integrative analysis of available genome-wide information can provide new biological insights, the lack of novel methodologies has become a major bottleneck. Here, we present a comprehensive analysis tool called combinatorial CRM decoder (CCD), which utilizes the publicly available information to identify and characterize genome-wide CRMs in a species of interest. CCD first defines a set of the epigenetic features which is significantly associated with a set of known CRMs as a code called 'trace code', and subsequently uses the trace code to pinpoint putative CRMs throughout the genome. Using 61 genome-wide data sets obtained from 17 independent mouse studies, CCD successfully catalogued ∼12 600 CRMs (five distinct classes) including polycomb repressive complex 2 target sites as well as imprinting control regions. Interestingly, we discovered that ∼4% of the identified CRMs belong to at least two different classes named 'multi-functional CRM', suggesting their functional importance for regulating spatiotemporal gene expression. From these examples, we show that CCD can be applied to any potential genome-wide datasets and therefore will shed light on unveiling genome-wide CRMs in various species.

  5. Functional Profiling of Human Fungal Pathogen Genomes

    PubMed Central

    Goranov, Alexi I.; Madhani, Hiten D.

    2015-01-01

    Fungal infections are challenging to diagnose and often difficult to treat, with only a handful of drug classes existing. Understanding the molecular mechanisms by which pathogenic fungi cause human disease is imperative. Here, we discuss how the development and use of genome-scale genetic resources, such as whole-genome knockout collections, can address this unmet need. Using work in Saccharomyces cerevisiae as a guide, studies of Cryptococcus neoformans and Candida albicans have shown how the challenges of large-scale gene deletion can be overcome, and how such collections can be effectively used to obtain insights into mechanisms of pathogenesis. We conclude that, with concerted efforts, full genome-wide functional analysis of human fungal pathogen genomes is within reach. PMID:25377143

  6. Genome editing strategies: potential tools for eradicating HIV-1/AIDS.

    PubMed

    Khalili, Kamel; Kaminski, Rafal; Gordon, Jennifer; Cosentino, Laura; Hu, Wenhui

    2015-06-01

    Current therapy for controlling human immunodeficiency virus (HIV-1) infection and preventing acquired immunodeficiency syndrome (AIDS) progression has profoundly decreased viral replication in cells susceptible to HIV-1 infection, but it does not eliminate the low level of viral replication in latently infected cells, which contain integrated copies of HIV-1 proviral DNA. There is an urgent need for the development of HIV-1 genome eradication strategies that will lead to a permanent or "sterile" cure of HIV-1/AIDS. In the past few years, novel nuclease-initiated genome editing tools have been developing rapidly, including zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the CRISPR/Cas9 system. These surgical knives, which can excise any genome, provide a great opportunity to eradicate the HIV-1 genome by targeting highly conserved regions of the HIV-1 long terminal repeats or essential viral genes. Given the time consuming and costly engineering of target-specific ZFNs and TALENs, the RNA-guided endonuclease Cas9 technology has emerged as a simpler and more versatile technology to allow permanent removal of integrated HIV-1 proviral DNA in eukaryotic cells, and hopefully animal models or human patients. The major unmet challenges of this approach at present include inefficient nuclease gene delivery, potential off-target cleavage, and cell-specific genome targeting. Nanoparticle or lentivirus-mediated delivery of next generation Cas9 technologies including nickase or RNA-guided FokI nuclease (RFN) will further improve the potential for genome editing to become a promising approach for curing HIV-1/AIDS.

  7. Molecular Tools for Exploring Polyploid Genomes in Plants

    PubMed Central

    Aversano, Riccardo; Ercolano, Maria Raffaella; Caruso, Immacolata; Fasano, Carlo; Rosellini, Daniele; Carputo, Domenico

    2012-01-01

    Polyploidy is a very common phenomenon in the plant kingdom, where even diploid species are often described as paleopolyploids. The polyploid condition may bring about several advantages compared to the diploid state. Polyploids often show phenotypes that are not present in their diploid progenitors or exceed the range of the contributing species. Some of these traits may play a role in heterosis or could favor adaptation to new ecological niches. Advances in genomics and sequencing technology may create unprecedented opportunities for discovering and monitoring the molecular effects of polyploidization. Through this review, we provide an overview of technologies and strategies that may allow an in-depth analysis of polyploid genomes. After introducing some basic aspects on the origin and genetics of polyploids, we highlight the main tools available for genome and gene expression analysis and summarize major findings. In the last part of this review, the implications of next generation sequencing are briefly discussed. The accumulation of knowledge on polyploid formation, maintenance, and divergence at whole-genome and subgenome levels will not only help plant biologists to understand how plants have evolved and diversified, but also assist plant breeders in designing new strategies for crop improvement. PMID:22949863

  8. [Genome Editing Tools and their Application in Experimental Ophthalmology].

    PubMed

    Yanik, M; Wende, W; Stieger, K

    2017-01-23

    New genome editing tools in molecular biology are revolutionising precise genome surgery and have greatly influenced experimental ophthalmology too. Aside from the commonly used nuclease-based platforms, such as the zinc-finger nucleases (ZFN) and transcription activator-like effector nucleases (TALEN), CRISPR/Cas systems, clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) genes, perform very efficiently in site-specific DNA cleavage within living cells. DNA double strand breaks (DSB) are repaired through two different conserved repair pathways: NHEJ (non-homologous end joining) and HDR (homology directed repair). By using the correct DNA templates, these repair pathways can be used to knock out defective genes or to repair mutations. Genome editing technology lays the ground for new strategies in basic science, biotechnology, and biomedical science, as well as clinical studies with genome editing. Therapeutic gene editing strategies are now concentrating on diseases in the retina, due to the comparatively easy accessibility of the eye and with local application in vivo.

  9. Genomic Tools in Pea Breeding Programs: Status and Perspectives

    PubMed Central

    Tayeh, Nadim; Aubert, Grégoire; Pilet-Nayel, Marie-Laure; Lejeune-Hénaut, Isabelle; Warkentin, Thomas D.; Burstin, Judith

    2015-01-01

    Pea (Pisum sativum L.) is an annual cool-season legume and one of the oldest domesticated crops. Dry pea seeds contain 22–25% protein, complex starch and fiber constituents, and a rich array of vitamins, minerals, and phytochemicals which make them a valuable source for human consumption and livestock feed. Dry pea ranks third to common bean and chickpea as the most widely grown pulse in the world with more than 11 million tons produced in 2013. Pea breeding has achieved great success since the time of Mendel's experiments in the mid-1800s. However, several traits still require significant improvement for better yield stability in a larger growing area. Key breeding objectives in pea include improving biotic and abiotic stress resistance and enhancing yield components and seed quality. Taking advantage of the diversity present in the pea genepool, many mapping populations have been constructed in the last decades and efforts have been deployed to identify loci involved in the control of target traits and further introgress them into elite breeding materials. Pea now benefits from next-generation sequencing and high-throughput genotyping technologies that are paving the way for genome-wide association studies and genomic selection approaches. This review covers the significant development and deployment of genomic tools for pea breeding in recent years. Future prospects are discussed especially in light of current progress toward deciphering the pea genome. PMID:26640470

  10. ThermoAlign: a genome-aware primer design tool for tiled amplicon resequencing

    PubMed Central

    Francis, Felix; Dumas, Michael D.; Wisser, Randall J.

    2017-01-01

    Isolating and sequencing specific regions in a genome is a cornerstone of molecular biology. This has been facilitated by computationally encoding the thermodynamics of DNA hybridization for automated design of hybridization and priming oligonucleotides. However, the repetitive composition of genomes challenges the identification of target-specific oligonucleotides, which limits genetics and genomics research on many species. Here, a tool called ThermoAlign was developed that ensures the design of target-specific primer pairs for DNA amplification. This is achieved by evaluating the thermodynamics of hybridization for full-length oligonucleotide-template alignments — thermoalignments — across the genome to identify primers predicted to bind specifically to the target site. For amplification-based resequencing of regions that cannot be amplified by a single primer pair, a directed graph analysis method is used to identify minimum amplicon tiling paths. Laboratory validation by standard and long-range polymerase chain reaction and amplicon resequencing with maize, one of the most repetitive genomes sequenced to date (≈85% repeat content), demonstrated the specificity-by-design functionality of ThermoAlign. ThermoAlign is released under an open source license and bundled in a dependency-free container for wide distribution. It is anticipated that this tool will facilitate multiple applications in genetics and genomics and be useful in the workflow of high-throughput targeted resequencing studies. PMID:28300202
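    The abstract above mentions a directed graph analysis for finding minimum amplicon tiling paths. The following is a minimal sketch of that general idea, not ThermoAlign's actual implementation: amplicons are treated as (start, end) intervals, an edge links two amplicons when the second extends the first and overlaps it by at least a hypothetical `min_overlap`, and a breadth-first search returns a path with the fewest amplicons. The function name, parameters, and coordinates are all assumptions made for illustration.

```python
from collections import deque


def min_tiling_path(amplicons, target_start, target_end, min_overlap=0):
    """Return the fewest (start, end) amplicons tiling the target, or None.

    Illustrative sketch only: real primer design also has to check
    primer specificity, which is ignored here.
    """
    amplicons = sorted(set(amplicons))
    # Source nodes: amplicons that cover the first base of the target.
    sources = [i for i, (s, e) in enumerate(amplicons)
               if s <= target_start < e]
    previous = {i: None for i in sources}  # also serves as the visited set
    queue = deque(sources)
    while queue:
        i = queue.popleft()
        start, end = amplicons[i]
        if end >= target_end:
            # Reached the end of the target: walk back to rebuild the path.
            path = []
            while i is not None:
                path.append(amplicons[i])
                i = previous[i]
            return path[::-1]
        for j, (s2, e2) in enumerate(amplicons):
            # Directed edge i -> j: j extends i and overlaps it sufficiently.
            if j not in previous and e2 > end and s2 <= end - min_overlap:
                previous[j] = i
                queue.append(j)
    return None


# Hypothetical candidate amplicons on a 1600 bp target region.
candidates = [(0, 500), (400, 900), (450, 1200), (850, 1300), (1100, 1600)]
path = min_tiling_path(candidates, 0, 1600, min_overlap=50)
```

    Breadth-first search is enough here because every edge has the same cost (one more amplicon); a weighted variant would need a shortest-path algorithm such as Dijkstra's.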

  11. A survey of tools for variant analysis of next-generation genome sequencing data

    PubMed Central

    Pabinger, Stephan; Dander, Andreas; Fischer, Maria; Snajder, Rene; Sperk, Michael; Efremova, Mirjana; Krabichler, Birgit; Speicher, Michael R.; Zschocke, Johannes

    2014-01-01

    Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers. PMID:23341494

  12. Genomic tools to profile antibiotic mode of action.

    PubMed

    Cardona, Silvia T; Selin, Carrie; Gislason, April S

    2015-01-01

    The increasing emergence of antimicrobial multiresistant bacteria is of great concern to public health. While these bacteria are becoming an ever more prominent cause of nosocomial and community-acquired infections worldwide, the antibiotic discovery pipeline has been stalled in the last few years with very few efforts in the research and development of novel antibacterial therapies. Some of the root causes that have hampered current antibiotic drug development are the lack of understanding of the mode of action (MOA) of novel antibiotic molecules and the poor characterization of the bacterial physiological response to antibiotics that ultimately causes resistance. Here, we review how bacterial genetic tools can be applied at the genomic level with the goal of profiling resistance to antibiotics and elucidating antibiotic MOAs. Specifically, we highlight how chemical genomic detection of the MOA of novel antibiotic molecules and antibiotic profiling by next-generation sequencing are leveraging basic antibiotic research to unprecedented levels with great opportunities for knowledge translation.

  13. An evolutionary classification of genomic function.

    PubMed

    Graur, Dan; Zheng, Yichen; Azevedo, Ricardo B R

    2015-01-28

    The pronouncements of the ENCODE Project Consortium regarding "junk DNA" exposed the need for an evolutionary classification of genomic elements according to their selected-effect function. In the classification scheme presented here, we divide the genome into "functional DNA," that is, DNA sequences that have a selected-effect function, and "rubbish DNA," that is, sequences that do not. Functional DNA is further subdivided into "literal DNA" and "indifferent DNA." In literal DNA, the order of nucleotides is under selection; in indifferent DNA, only the presence or absence of the sequence is under selection. Rubbish DNA is further subdivided into "junk DNA" and "garbage DNA." Junk DNA neither contributes to nor detracts from the fitness of the organism and, hence, evolves under selective neutrality. Garbage DNA, on the other hand, decreases the fitness of its carriers. Garbage DNA exists in the genome only because natural selection is neither omnipotent nor instantaneous. Each of these four functional categories can be 1) transcribed and translated, 2) transcribed but not translated, or 3) not transcribed. The affiliation of a DNA segment to a particular functional category may change during evolution: Functional DNA may become junk DNA, junk DNA may become garbage DNA, rubbish DNA may become functional DNA, and so on; however, determining the functionality or nonfunctionality of a genomic sequence must be based on its present status rather than on its potential to change (or not to change) in the future. Changes in functional affiliation are divided into pseudogenes, Lazarus DNA, zombie DNA, and Jekyll-to-Hyde DNA.
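    The four-way classification described in the abstract above can be written as a small decision procedure. This is a minimal sketch that assumes the three yes/no inputs are already known for a sequence; the function and parameter names are hypothetical and do not come from the paper.

```python
from enum import Enum


class Category(Enum):
    LITERAL = "literal DNA"          # functional: nucleotide order is under selection
    INDIFFERENT = "indifferent DNA"  # functional: only presence/absence is under selection
    JUNK = "junk DNA"                # rubbish: selectively neutral
    GARBAGE = "garbage DNA"          # rubbish: decreases the fitness of its carriers


def classify(order_selected: bool, presence_selected: bool,
             decreases_fitness: bool) -> Category:
    """Assign a sequence to one of the four categories by its present status."""
    if order_selected:
        return Category.LITERAL
    if presence_selected:
        return Category.INDIFFERENT
    if decreases_fitness:
        return Category.GARBAGE
    return Category.JUNK


def is_functional(category: Category) -> bool:
    """Functional DNA has a selected-effect function; rubbish DNA does not."""
    return category in (Category.LITERAL, Category.INDIFFERENT)
```

    For example, `classify(False, False, False)` yields `Category.JUNK`, which `is_functional` reports as non-functional.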

  14. An Evolutionary Classification of Genomic Function

    PubMed Central

    Graur, Dan; Zheng, Yichen; Azevedo, Ricardo B.R.

    2015-01-01

    The pronouncements of the ENCODE Project Consortium regarding “junk DNA” exposed the need for an evolutionary classification of genomic elements according to their selected-effect function. In the classification scheme presented here, we divide the genome into “functional DNA,” that is, DNA sequences that have a selected-effect function, and “rubbish DNA,” that is, sequences that do not. Functional DNA is further subdivided into “literal DNA” and “indifferent DNA.” In literal DNA, the order of nucleotides is under selection; in indifferent DNA, only the presence or absence of the sequence is under selection. Rubbish DNA is further subdivided into “junk DNA” and “garbage DNA.” Junk DNA neither contributes to nor detracts from the fitness of the organism and, hence, evolves under selective neutrality. Garbage DNA, on the other hand, decreases the fitness of its carriers. Garbage DNA exists in the genome only because natural selection is neither omnipotent nor instantaneous. Each of these four functional categories can be 1) transcribed and translated, 2) transcribed but not translated, or 3) not transcribed. The affiliation of a DNA segment to a particular functional category may change during evolution: Functional DNA may become junk DNA, junk DNA may become garbage DNA, rubbish DNA may become functional DNA, and so on; however, determining the functionality or nonfunctionality of a genomic sequence must be based on its present status rather than on its potential to change (or not to change) in the future. Changes in functional affiliation are divided into pseudogenes, Lazarus DNA, zombie DNA, and Jekyll-to-Hyde DNA. PMID:25635041

  15. Open chromatin reveals the functional maize genome

    USDA-ARS's Scientific Manuscript database

    Every cellular process mediated through nuclear DNA must contend with chromatin. As results from ENCODE show, open chromatin assays can efficiently integrate across diverse regulatory elements, revealing functional non-coding genome. In this study, we use a MNase hypersensitivity assay to discover o...

  16. Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum

    PubMed Central

    Grativol, Clícia; Regulski, Michael; Bertalan, Marcelo; McCombie, W. Richard; da Silva, Felipe Rodrigues; Neto, Adhemar Zerlotini; Vicentini, Renato; Farinelli, Laurent; Hemerly, Adriana Silva; Martienssen, Robert A.; Ferreira, Paulo Cavalcanti Gomes

    2015-01-01

    SUMMARY Many economically important crops have large and complex genomes, which hampers sequencing of their genome by standard methods such as WGS. Large tracts of methylated repeats occur at plant genomes interspersed by hypomethylated gene-rich regions. Gene enrichment strategies based on methylation profile offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration (MF) with McrBC digestion to enrich for euchromatic regions of sugarcane genome. To verify the efficiency of MF and the assembly quality of sequences submitted to gene-enrichment strategy, we have compared assemblies using MF and unfiltered (UF) libraries. The MF allowed the achievement of a better assembly by filtering out 35% of the sugarcane genome and by producing 1.5 times more scaffolds and 1.7 times more assembled Mb compared to unfiltered scaffolds. The coverage of sorghum CDS by MF scaffolds was at least 36% higher than by UF scaffolds. Using MF technology, we increased by 134X the coverage of genic regions of the monoploid sugarcane genome. The MF reads assembled into scaffolds covering all genes at sugarcane BACs, 97.2% of sugarcane ESTs, 92.7% of sugarcane RNA-seq reads and 98.4% of sorghum protein sequences. Analysis of MF scaffolds encoding enzymes of the sucrose/starch pathway discovered 291 SNPs in the wild sugarcane species, S. spontaneum and S. officinarum. A large number of microRNA genes were also identified in the MF scaffolds. The information achieved by the MF dataset provides a valuable tool for genomic research in the genus Saccharum and improvement of sugarcane as a biofuel crop. PMID:24773339

  17. Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum.

    PubMed

    Grativol, Clícia; Regulski, Michael; Bertalan, Marcelo; McCombie, W Richard; da Silva, Felipe Rodrigues; Zerlotini Neto, Adhemar; Vicentini, Renato; Farinelli, Laurent; Hemerly, Adriana Silva; Martienssen, Robert A; Ferreira, Paulo Cavalcanti Gomes

    2014-07-01

    Many economically important crops have large and complex genomes that hamper their sequencing by standard methods such as whole genome shotgun (WGS). Large tracts of methylated repeats occur in plant genomes that are interspersed by hypomethylated gene-rich regions. Gene-enrichment strategies based on methylation profiles offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration with McrBC endonuclease digestion to enrich for euchromatic regions in the sugarcane genome. To verify the efficiency of methylation filtration and the assembly quality of sequences submitted to gene-enrichment strategy, we have compared assemblies using methyl-filtered (MF) and unfiltered (UF) libraries. The use of methyl filtration allowed a better assembly by filtering out 35% of the sugarcane genome and by producing 1.5× more scaffolds and 1.7× more assembled Mb in length compared with unfiltered dataset. The coverage of sorghum coding sequences (CDS) by MF scaffolds was at least 36% higher than by the use of UF scaffolds. Using MF technology, we increased by 134× the coverage of gene regions of the monoploid sugarcane genome. The MF reads assembled into scaffolds that covered all genes of the sugarcane bacterial artificial chromosomes (BACs), 97.2% of sugarcane expressed sequence tags (ESTs), 92.7% of sugarcane RNA-seq reads and 98.4% of sorghum protein sequences. Analysis of MF scaffolds from encoded enzymes of the sucrose/starch pathway discovered 291 single-nucleotide polymorphisms (SNPs) in the wild sugarcane species, S. spontaneum and S. officinarum. A large number of microRNA genes was also identified in the MF scaffolds. The information achieved by the MF dataset provides a valuable tool for genomic research in the genus Saccharum and for improvement of sugarcane as a biofuel crop. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.

  18. Adapting CRISPR/Cas9 for functional genomics screens.

    PubMed

    Malina, Abba; Katigbak, Alexandra; Cencic, Regina; Maïga, Rayelle Itoua; Robert, Francis; Miura, Hisashi; Pelletier, Jerry

    2014-01-01

    The use of CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein) for targeted genome editing has been widely adopted and is considered a "game changing" technology. The ease and rapidity by which this approach can be used to modify endogenous loci in a wide spectrum of cell types and organisms makes it a powerful tool for customizable genetic modifications as well as for large-scale functional genomics. The development of retrovirus-based expression platforms to simultaneously deliver the Cas9 nuclease and single guide (sg) RNAs provides unique opportunities by which to ensure stable and reproducible expression of the editing tools and a broad cell targeting spectrum, while remaining compatible with in vivo genetic screens. Here, we describe methods and highlight considerations for designing and generating sgRNA libraries in all-in-one retroviral vectors for such applications.

  19. Microbial ecology in the age of genomics and metagenomics: concepts, tools, and recent advances.

    PubMed

    Xu, Jianping

    2006-06-01

    Microbial ecology examines the diversity and activity of micro-organisms in Earth's biosphere. In the last 20 years, the application of genomics tools has revolutionized microbial ecological studies and drastically expanded our view on the previously underappreciated microbial world. This review first introduces the basic concepts in microbial ecology and the main genomics methods that have been used to examine natural microbial populations and communities. In the ensuing three specific sections, the applications of genomics in microbial ecological research are highlighted. The first describes the widespread application of multilocus sequence typing and representational difference analysis in studying genetic variation within microbial species. Such investigations have identified that migration, horizontal gene transfer and recombination are common in natural microbial populations and that microbial strains can be highly variable in genome size and gene content. The second section highlights and summarizes the use of four specific genomics methods (phylogenetic analysis of ribosomal RNA, DNA-DNA re-association kinetics, metagenomics, and micro-arrays) in analysing the diversity and potential activity of microbial populations and communities from a variety of terrestrial and aquatic environments. Such analyses have identified many unexpected phylogenetic lineages in viruses, bacteria, archaea, and microbial eukaryotes. Functional analyses of environmental DNA also revealed highly prevalent, but previously unknown, metabolic processes in natural microbial communities. In the third section, the ecological implications of sequenced microbial genomes are briefly discussed. Comparative analyses of prokaryotic genomic sequences suggest the importance of ecology in determining microbial genome size and gene content. The significant variability in genome size and gene content among strains and species of prokaryotes indicate the highly fluid nature of prokaryotic

  20. Functional Annotation Analytics of Rhodopseudomonas palustris Genomes.

    PubMed

    Simmons, Shaneka S; Isokpehi, Raphael D; Brown, Shyretha D; McAllister, Donee L; Hall, Charnia C; McDuffy, Wanaki M; Medley, Tamara L; Udensi, Udensi K; Rajnarayanan, Rajendram V; Ayensu, Wellington K; Cohly, Hari H P

    2011-01-01

    Rhodopseudomonas palustris, a nonsulphur purple photosynthetic bacteria, has been extensively investigated for its metabolic versatility including ability to produce hydrogen gas from sunlight and biomass. The availability of the finished genome sequences of six R. palustris strains (BisA53, BisB18, BisB5, CGA009, HaA2 and TIE-1) combined with online bioinformatics software for integrated analysis presents new opportunities to determine the genomic basis of metabolic versatility and ecological lifestyles of the bacteria species. The purpose of this investigation was to compare the functional annotations available for multiple R. palustris genomes to identify annotations that can be further investigated for strain-specific or uniquely shared phenotypic characteristics. A total of 2,355 protein family Pfam domain annotations were clustered based on presence or absence in the six genomes. The clustering process identified groups of functional annotations including those that could be verified as strain-specific or uniquely shared phenotypes. For example, genes encoding water/glycerol transport were present in the genome sequences of strains CGA009 and BisB5, but absent in strains BisA53, BisB18, HaA2 and TIE-1. Protein structural homology modeling predicted that the two orthologous 240 aa R. palustris aquaporins have water-specific transport function. Based on observations in other microbes, the presence of aquaporin in R. palustris strains may improve freeze tolerance in natural conditions of rapid freezing such as nitrogen fixation at low temperatures where access to liquid water is a limiting factor for nitrogenase activation. In the case of adaptive loss of aquaporin genes, strains may be better adapted to survive in conditions of high-sugar content such as fermentation of biomass for biohydrogen production. 
    Finally, web-based resources were developed to allow for interactive, user-defined selection of the relationship between protein family annotations and the R
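    Clustering annotations by presence or absence across the six genomes, as described in the abstract above, can be illustrated by grouping annotations that share the same presence pattern. This is only a sketch under stated assumptions: the domain names below are hypothetical placeholders rather than real Pfam identifiers, and the study's own clustering method is not specified here.

```python
from collections import defaultdict

# The six R. palustris strains named in the abstract.
STRAINS = ("BisA53", "BisB18", "BisB5", "CGA009", "HaA2", "TIE-1")


def group_by_presence(annotations):
    """Group annotation IDs by their presence/absence pattern across STRAINS.

    `annotations` maps an annotation ID to the set of strains carrying it.
    Returns a dict from a tuple of booleans (one per strain) to a list of IDs.
    """
    groups = defaultdict(list)
    for annotation_id, strains_with in annotations.items():
        pattern = tuple(strain in strains_with for strain in STRAINS)
        groups[pattern].append(annotation_id)
    return dict(groups)


def strain_specific(groups):
    """Return {strain: [IDs]} for annotations found in exactly one strain."""
    result = {}
    for pattern, ids in groups.items():
        if sum(pattern) == 1:
            result[STRAINS[pattern.index(True)]] = sorted(ids)
    return result


# Hypothetical input; DOMAIN_X mimics the CGA009/BisB5-only pattern
# reported for the water/glycerol transport genes.
example = {
    "DOMAIN_X": {"CGA009", "BisB5"},
    "DOMAIN_Y": set(STRAINS),
    "DOMAIN_Z": {"TIE-1"},
}
groups = group_by_presence(example)
unique = strain_specific(groups)
```

    Annotations whose pattern has exactly two or more `True` entries, but not all six, correspond to the "uniquely shared" case mentioned in the abstract.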

  1. Functional Annotation Analytics of Rhodopseudomonas palustris Genomes

    PubMed Central

    Simmons, Shaneka S.; Isokpehi, Raphael D.; Brown, Shyretha D.; McAllister, Donee L.; Hall, Charnia C.; McDuffy, Wanaki M.; Medley, Tamara L.; Udensi, Udensi K.; Rajnarayanan, Rajendram V.; Ayensu, Wellington K.; Cohly, Hari H.P.

    2011-01-01

    Rhodopseudomonas palustris, a nonsulphur purple photosynthetic bacteria, has been extensively investigated for its metabolic versatility including ability to produce hydrogen gas from sunlight and biomass. The availability of the finished genome sequences of six R. palustris strains (BisA53, BisB18, BisB5, CGA009, HaA2 and TIE-1) combined with online bioinformatics software for integrated analysis presents new opportunities to determine the genomic basis of metabolic versatility and ecological lifestyles of the bacteria species. The purpose of this investigation was to compare the functional annotations available for multiple R. palustris genomes to identify annotations that can be further investigated for strain-specific or uniquely shared phenotypic characteristics. A total of 2,355 protein family Pfam domain annotations were clustered based on presence or absence in the six genomes. The clustering process identified groups of functional annotations including those that could be verified as strain-specific or uniquely shared phenotypes. For example, genes encoding water/glycerol transport were present in the genome sequences of strains CGA009 and BisB5, but absent in strains BisA53, BisB18, HaA2 and TIE-1. Protein structural homology modeling predicted that the two orthologous 240 aa R. palustris aquaporins have water-specific transport function. Based on observations in other microbes, the presence of aquaporin in R. palustris strains may improve freeze tolerance in natural conditions of rapid freezing such as nitrogen fixation at low temperatures where access to liquid water is a limiting factor for nitrogenase activation. In the case of adaptive loss of aquaporin genes, strains may be better adapted to survive in conditions of high-sugar content such as fermentation of biomass for biohydrogen production. 
    Finally, web-based resources were developed to allow for interactive, user-defined selection of the relationship between protein family annotations and the R

  2. Mos1 transposition as a tool to engineer the Caenorhabditis elegans genome by homologous recombination.

    PubMed

    Robert, Valérie J P; Katic, Iskra; Bessereau, Jean-Louis

    2009-11-01

    Gene knockouts and knock-ins have emerged as powerful tools to study gene function in model organisms. The construction of such engineered alleles requires that homologous recombination between a transgenic fragment carrying the modifications desired in the genome and the locus to engineer occurs at high frequencies. Homologous recombination frequency is significantly increased in the vicinity of a DNA double-strand break. Based on this observation, a new generation of transgene-instructed genome engineering protocols was developed. Here, we present MosTIC (for "Mos1 excision-induced transgene-instructed gene conversion"), a new technique that provides a means to engineer the Caenorhabditis elegans genome. MosTIC is initiated by the mobilization of Mos1, a Drosophila transposon experimentally introduced in C. elegans. During MosTIC, a Mos1 insertion localized in the genomic region to engineer is mobilized after germline expression of the Mos transposase. Mos1 excision generates a DNA double-strand break, which is repaired by homologous recombination using a transgenic repair template. This results in the transfer of information from the transgene into the genome. Depending on the method used to trigger Mos1 excision, two alternative MosTIC protocols are available, which are presented here in detail. This technique can be used for a wide range of applications, such as structure-function analysis, protein localization and purification, genetic screens or generation of single copy transgenes at a defined locus in the genome.

  3. Resources for Functional Genomics Studies in Drosophila melanogaster

    PubMed Central

    Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

    2014-01-01

    Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003

  4. CrossMap: a versatile tool for coordinate conversion between genome assemblies

    PubMed Central

    Zhao, Hao; Sun, Zhifu; Wang, Jing; Huang, Haojie; Kocher, Jean-Pierre; Wang, Liguo

    2014-01-01

    Motivation: Reference genome assemblies are subject to change and refinement from time to time. Generally, researchers need to convert the results that have been analyzed according to old assemblies to newer versions, or vice versa, to facilitate meta-analysis, direct comparison, data integration and visualization. Several useful conversion tools can convert genome interval files in browser extensible data or general feature format, but none have the functionality to convert files in sequence alignment map or BigWig format. This is a significant gap in computational genomics tools, as these formats are the ones most widely used for representing high-throughput sequencing data, such as RNA-seq, chromatin immunoprecipitation sequencing, DNA-seq, etc. Results: Here we developed CrossMap, a versatile and efficient tool for converting genome coordinates between assemblies. CrossMap supports most of the commonly used file formats, including BAM, sequence alignment map, Wiggle, BigWig, browser extensible data, general feature format, gene transfer format and variant call format. Availability and implementation: CrossMap is written in Python and C. Source code and a comprehensive user’s manual are freely available at: http://crossmap.sourceforge.net/. Contact: Kocher.JeanPierre@mayo.edu or wang.liguo@mayo.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24351709
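    The core idea of coordinate conversion between assemblies can be illustrated with a simplified lookup over aligned blocks. This sketch does not use CrossMap's code or command-line interface, and it does not parse a real chain file; the block tuples, the function name, and the same-strand, single-chromosome assumption are all hypothetical simplifications.

```python
from bisect import bisect_right


def lift_position(blocks, pos):
    """Map a 0-based position from the old assembly to the new one.

    `blocks` is a list of (old_start, old_end, new_start) tuples, sorted by
    old_start, half-open, non-overlapping, and on the same strand.
    Returns None when the position falls in a gap with no aligned block.
    """
    starts = [block[0] for block in blocks]
    # Find the last block whose old_start is <= pos.
    i = bisect_right(starts, pos) - 1
    if i < 0:
        return None
    old_start, old_end, new_start = blocks[i]
    if pos >= old_end:
        return None  # position lies in an unaligned gap
    return new_start + (pos - old_start)


# Hypothetical alignment blocks between two assemblies.
blocks = [(0, 100, 0), (100, 200, 150), (250, 300, 400)]
lifted = lift_position(blocks, 120)
```

    Interval-based formats need both ends of a feature lifted, and a feature that spans a gap may have to be split or dropped, which is one reason dedicated conversion tools are used in practice.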

  5. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    SciTech Connect

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  6. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center.

    PubMed

    Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-01

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  7. Functional genomic analysis of the Drosophila immune response.

    PubMed

    Valanne, Susanna

    2014-01-01

    Drosophila melanogaster has been widely used as a model organism for over a century now, and also as an immunological research model for over 20 years. With the emergence of RNA interference (RNAi) in Drosophila as a robust tool to silence genes of interest, large-scale or genome-wide functional analysis has become a popular way of studying the Drosophila immune response in cell culture. Drosophila immunity is composed of cellular and humoral immunity mechanisms, and especially the systemic, humoral response pathways have been extensively dissected using the functional genomic approach. Although most components of the main immune pathways had already been found using traditional genetic screening techniques, important findings including pathway components, positive and negative regulators and modifiers have been made with RNAi screening. Additionally, RNAi screening has produced new information on host-pathogen interactions related to the pathogenesis of many microbial species. Copyright © 2013 Elsevier Ltd. All rights reserved.

  8. Inducible CRISPR genome-editing tool: classifications and future trends.

    PubMed

    Dai, Xiaofeng; Chen, Xiao; Fang, Qiuwu; Li, Jia; Bai, Zhonghu

    2017-09-22

    The discovery of the CRISPR-Cas9/dCas9 system has reinforced our ability and revolutionized our history in genome engineering. While Cas9 and dCas9 are programmed to modulate gene expression by introducing DNA breaks, blocking transcription factor recruitment or dragging functional groups towards the targeted sites, sgRNAs determine the genomic loci where the modulation occurs. The off-target problem, due to limited sgRNA specificity and genome complexity of many species, has posed concerns for the wide application of this revolutionary technique. To solve this problem and, more importantly, gain power over gene functionality and cell fate control, inducible strategies have been continuously evolved to offer tailored solutions to address specific biological questions. By reviewing recent advances in inducible CRISPR system design and critical elements potentially adding values to such systems, we classify current approaches in this domain into four mechanically distinct categories, namely, "split system", "allosteric system", "combinatorial system", and "transient delivery system", discuss the pros and cons of each system, and point out the under-explored areas and future directions, with the aim of enriching our toolbox of delicate life engineering.

  9. Coral Reef Genomics: Developing tools for functional genomics of coral symbiosis

    SciTech Connect

    Schwarz, Jodi; Brokstein, Peter; Manohar, Chitra; Coffroth, MaryAlice; Szmant, Alina; Medina, Monica

    2005-03-01

    Symbioses between cnidarians and dinoflagellates in the genus Symbiodinium are widespread in the marine environment. The importance of this symbiosis to reef-building corals and reef nutrient and carbon cycles is well documented, but little is known about the mechanisms by which the partners establish and regulate the symbiosis. Because the dinoflagellate symbionts live inside the cells of their host coral, the interactions between the partners occur on cellular and molecular levels, as each partner alters the expression of genes and proteins to facilitate the partnership. These interactions can be examined using high-throughput techniques that allow thousands of genes to be examined simultaneously. We are developing the groundwork so that we can use DNA microarray profiling to identify genes involved in the Montastraea faveolata and Acropora palmata symbioses. Here we report results from the initial steps in this microarray initiative, that is, the construction of cDNA libraries from 4 of 16 target stages, sequencing of 3450 cDNA clones to generate Expressed Sequence Tags (ESTs), and annotation of the ESTs to identify candidate genes to include in the microarrays. An understanding of how the coral-dinoflagellate symbiosis is regulated will have implications for atmospheric and ocean sciences, conservation biology, the study and diagnosis of coral bleaching and disease, and comparative studies of animal-protist interactions.

  10. Emerging Imaging and Genomic Tools for Developmental Systems Biology.

    PubMed

    Liu, Zhe; Keller, Philipp J

    2016-03-21

    Animal development is a complex and dynamic process orchestrated by exquisitely timed cell lineage commitment, divisions, migration, and morphological changes at the single-cell level. In the past decade, extensive genetic, stem cell, and genomic studies provided crucial insights into molecular underpinnings and the functional importance of genetic pathways governing various cellular differentiation processes. However, it is still largely unknown how the precise coordination of these pathways is achieved at the whole-organism level and how the highly regulated spatiotemporal choreography of development is established in turn. Here, we discuss the latest technological advances in imaging and single-cell genomics that hold great promise for advancing our understanding of this intricate process. We propose an integrated approach that combines such methods to quantitatively decipher in vivo cellular dynamic behaviors and their underlying molecular mechanisms at the systems level with single-cell, single-molecule resolution.

  11. Groundnut improvement: use of genetic and genomic tools

    PubMed Central

    Janila, Pasupuleti; Nigam, S. N.; Pandey, Manish K.; Nagesh, P.; Varshney, Rajeev K.

    2013-01-01

    Groundnut (Arachis hypogaea L.), a self-pollinated legume, is an important crop cultivated on 24 million ha worldwide for extraction of edible oil and food uses. The kernels are rich in oil (48–50%) and protein (25–28%), and are a source of several vitamins, minerals, antioxidants, biologically active polyphenols, flavonoids, and isoflavones. Improved varieties of groundnut with high yield potential were developed and released for cultivation worldwide. The improved varieties belong to different maturity durations and possess resistance to diseases, tolerance to drought, enhanced oil content, and improved quality traits for food uses. Conventional breeding procedures along with the tools for phenotyping were largely used in groundnut improvement programs. Mutations were used to induce variability and wide hybridization was attempted to tap variability from wild species. Low genetic variability has been a bottleneck for groundnut improvement. The vast potential of wild species, a reservoir of new alleles, remains under-utilized. Development of linkage maps of groundnut during the last decade was followed by identification of markers and quantitative trait loci for the target traits. Consequently, the last decade has witnessed the deployment of molecular breeding approaches to complement the ongoing groundnut improvement programs in the USA, China, India, and Japan. The other potential advantages of molecular breeding are the feasibility to target multiple traits for improvement and provide tools to tap new alleles from wild species. The first groundnut variety developed through marker-assisted back-crossing is a root-knot nematode-resistant variety, NemaTAM, in the USA. The uptake of molecular breeding approaches in groundnut improvement programs by NARS partners in India and many African countries is slow or needs to be initiated, in part due to inadequate infrastructure, high genotyping costs, and human capacities. Availability of draft genome sequence for diploid (AA and

  12. Functional genomics approaches in parasitic helminths.

    PubMed

    Hagen, J; Lee, E F; Fairlie, W D; Kalinna, B H

    2012-01-01

    As research on parasitic helminths is moving into the post-genomic era, an enormous effort is directed towards deciphering gene function and to achieve gene annotation. The sequences that are available in public databases undoubtedly hold information that can be utilized for new interventions and control but the exploitation of these resources has until recently remained difficult. Only now, with the emergence of methods to genetically manipulate and transform parasitic worms will it be possible to gain a comprehensive understanding of the molecular mechanisms involved in nutrition, metabolism, developmental switches/maturation and interaction with the host immune system. This review focuses on functional genomics approaches in parasitic helminths that are currently used, to highlight potential applications of these technologies in the areas of cell biology, systems biology and immunobiology of parasitic helminths. © 2011 Blackwell Publishing Ltd.

  13. A New System for Comparative Functional Genomics of Saccharomyces Yeasts

    PubMed Central

    Caudy, Amy A.; Guan, Yuanfang; Jia, Yue; Hansen, Christina; DeSevo, Chris; Hayes, Alicia P.; Agee, Joy; Alvarez-Dominguez, Juan R.; Arellano, Hugo; Barrett, Daniel; Bauerle, Cynthia; Bisaria, Namita; Bradley, Patrick H.; Breunig, J. Scott; Bush, Erin; Cappel, David; Capra, Emily; Chen, Walter; Clore, John; Combs, Peter A.; Doucette, Christopher; Demuren, Olukunle; Fellowes, Peter; Freeman, Sam; Frenkel, Evgeni; Gadala-Maria, Daniel; Gawande, Richa; Glass, David; Grossberg, Samuel; Gupta, Anita; Hammonds-Odie, Latanya; Hoisos, Aaron; Hsi, Jenny; Hsu, Yu-Han Huang; Inukai, Sachi; Karczewski, Konrad J.; Ke, Xiaobo; Kojima, Mina; Leachman, Samuel; Lieber, Danny; Liebowitz, Anna; Liu, Julia; Liu, Yufei; Martin, Trevor; Mena, Jose; Mendoza, Rosa; Myhrvold, Cameron; Millian, Christian; Pfau, Sarah; Raj, Sandeep; Rich, Matt; Rokicki, Joe; Rounds, William; Salazar, Michael; Salesi, Matthew; Sharma, Rajani; Silverman, Sanford; Singer, Cara; Sinha, Sandhya; Staller, Max; Stern, Philip; Tang, Hanlin; Weeks, Sharon; Weidmann, Maxwell; Wolf, Ashley; Young, Carmen; Yuan, Jie; Crutchfield, Christopher; McClean, Megan; Murphy, Coleen T.; Llinás, Manuel; Botstein, David; Troyanskaya, Olga G.; Dunham, Maitreya J.

    2013-01-01

    Whole-genome sequencing, particularly in fungi, has progressed at a tremendous rate. More difficult, however, is experimental testing of the inferences about gene function that can be drawn from comparative sequence analysis alone. We present a genome-wide functional characterization of a sequenced but experimentally understudied budding yeast, Saccharomyces bayanus var. uvarum (henceforth referred to as S. bayanus), allowing us to map changes over the 20 million years that separate this organism from S. cerevisiae. We first created a suite of genetic tools to facilitate work in S. bayanus. Next, we measured the gene-expression response of S. bayanus to a diverse set of perturbations optimized using a computational approach to cover a diverse array of functionally relevant biological responses. The resulting data set reveals that gene-expression patterns are largely conserved, but significant changes may exist in regulatory networks such as carbohydrate utilization and meiosis. In addition to regulatory changes, our approach identified gene functions that have diverged. The functions of genes in core pathways are highly conserved, but we observed many changes in which genes are involved in osmotic stress, peroxisome biogenesis, and autophagy. A surprising number of genes specific to S. bayanus respond to oxidative stress, suggesting the organism may have evolved under different selection pressures than S. cerevisiae. This work expands the scope of genome-scale evolutionary studies from sequence-based analysis to rapid experimental characterization and could be adopted for functional mapping in any lineage of interest. Furthermore, our detailed characterization of S. bayanus provides a valuable resource for comparative functional genomics studies in yeast. PMID:23852385

  14. A new system for comparative functional genomics of Saccharomyces yeasts.

    PubMed

    Caudy, Amy A; Guan, Yuanfang; Jia, Yue; Hansen, Christina; DeSevo, Chris; Hayes, Alicia P; Agee, Joy; Alvarez-Dominguez, Juan R; Arellano, Hugo; Barrett, Daniel; Bauerle, Cynthia; Bisaria, Namita; Bradley, Patrick H; Breunig, J Scott; Bush, Erin; Cappel, David; Capra, Emily; Chen, Walter; Clore, John; Combs, Peter A; Doucette, Christopher; Demuren, Olukunle; Fellowes, Peter; Freeman, Sam; Frenkel, Evgeni; Gadala-Maria, Daniel; Gawande, Richa; Glass, David; Grossberg, Samuel; Gupta, Anita; Hammonds-Odie, Latanya; Hoisos, Aaron; Hsi, Jenny; Hsu, Yu-Han Huang; Inukai, Sachi; Karczewski, Konrad J; Ke, Xiaobo; Kojima, Mina; Leachman, Samuel; Lieber, Danny; Liebowitz, Anna; Liu, Julia; Liu, Yufei; Martin, Trevor; Mena, Jose; Mendoza, Rosa; Myhrvold, Cameron; Millian, Christian; Pfau, Sarah; Raj, Sandeep; Rich, Matt; Rokicki, Joe; Rounds, William; Salazar, Michael; Salesi, Matthew; Sharma, Rajani; Silverman, Sanford; Singer, Cara; Sinha, Sandhya; Staller, Max; Stern, Philip; Tang, Hanlin; Weeks, Sharon; Weidmann, Maxwell; Wolf, Ashley; Young, Carmen; Yuan, Jie; Crutchfield, Christopher; McClean, Megan; Murphy, Coleen T; Llinás, Manuel; Botstein, David; Troyanskaya, Olga G; Dunham, Maitreya J

    2013-09-01

    Whole-genome sequencing, particularly in fungi, has progressed at a tremendous rate. More difficult, however, is experimental testing of the inferences about gene function that can be drawn from comparative sequence analysis alone. We present a genome-wide functional characterization of a sequenced but experimentally understudied budding yeast, Saccharomyces bayanus var. uvarum (henceforth referred to as S. bayanus), allowing us to map changes over the 20 million years that separate this organism from S. cerevisiae. We first created a suite of genetic tools to facilitate work in S. bayanus. Next, we measured the gene-expression response of S. bayanus to a diverse set of perturbations optimized using a computational approach to cover a diverse array of functionally relevant biological responses. The resulting data set reveals that gene-expression patterns are largely conserved, but significant changes may exist in regulatory networks such as carbohydrate utilization and meiosis. In addition to regulatory changes, our approach identified gene functions that have diverged. The functions of genes in core pathways are highly conserved, but we observed many changes in which genes are involved in osmotic stress, peroxisome biogenesis, and autophagy. A surprising number of genes specific to S. bayanus respond to oxidative stress, suggesting the organism may have evolved under different selection pressures than S. cerevisiae. This work expands the scope of genome-scale evolutionary studies from sequence-based analysis to rapid experimental characterization and could be adopted for functional mapping in any lineage of interest. Furthermore, our detailed characterization of S. bayanus provides a valuable resource for comparative functional genomics studies in yeast.

  15. ScreenBEAM: a novel meta-analysis algorithm for functional genomics screens via Bayesian hierarchical modeling | Office of Cancer Genomics

    Cancer.gov

    Functional genomics (FG) screens, using RNAi or CRISPR technology, have become a standard tool for systematic, genome-wide loss-of-function studies for therapeutic target discovery. As in many large-scale assays, however, off-target effects, variable reagent potency and experimental noise must be accounted for to appropriately control for false positives.

  16. Applying functional genomics research to the study of pig reproduction.

    PubMed

    Pomp, D; Caetano, A R; Bertani, G R; Gladney, C D; Johnson, R K

    2001-01-01

    Functional genomics is an experimental approach that incorporates genome-wide or system-wide experimentation, expanding the scope of biological investigation from studying single genes to studying potentially all genes at once in a systematic manner. This technology is highly appealing because of its high throughput and relatively low cost. Furthermore, analysis of gene expression using microarrays is likely to be more biologically relevant than the conventional paradigm of reductionism, because it has the potential to uncover new biological connections between genes and biochemical pathways. However, functional genomics is still in its infancy, especially with regard to the study of pig reproduction. Currently, efforts are centred on developing the necessary resources to enable high throughput evaluation and comparison of gene expression. However, it is clear that in the near future functional genomics will be applied on a large scale to study the biology and physiology of reproduction in pigs, and to understand better the complex nature of genetic control over polygenic characteristics, such as ovulation rate and litter size. We can look forward to generating a significant amount of new data on differences in gene expression between genotypes, treatments, or at various temporal and spatial coordinates within a variety of reproductively relevant systems. Along with this capability will be the challenge of collating, analysing and interpreting datasets that are orders of magnitude more extensive and complex than those currently used. Furthermore, integration of functional genomics with traditional genetic approaches and with detailed analysis of the proteome and relevant whole animal phenotypes will be required to make full use of this powerful new experimental paradigm as a beneficial research tool.

  17. OCPAT: an online codon-preserved alignment tool for evolutionary genomic analysis of protein coding sequences

    PubMed Central

    Liu, Guozhen; Uddin, Monica; Islam, Munirul; Goodman, Morris; Grossman, Lawrence I; Romero, Roberto; Wildman, Derek E

    2007-01-01

    Background: Rapidly accumulating genome sequence data from multiple species offer powerful opportunities for the detection of DNA sequence evolution. Phylogenetic tree construction and codon-based tests for natural selection are the prevailing tools used to detect functionally important evolutionary change in protein coding sequences. These analyses often require multiple DNA sequence alignments that maintain the correct reading frame for each collection of putative orthologous sequences. Since this feature is not available in most alignment tools, codon reading frames often must be checked manually before evolutionary analyses can commence. Results: Here we report an online codon-preserved alignment tool (OCPAT) that generates multiple sequence alignments automatically from the coding sequences of any list of human gene IDs and their putative orthologs from genomes of other vertebrate tetrapods. OCPAT is programmed to extract putative orthologous genes from genomes and to align the orthologs with the reading frame maintained in all species. OCPAT also optimizes the alignment by trimming the most variable alignment regions at the 5' and 3' ends of each gene. The resulting output of alignments is returned in several formats, which facilitates further molecular evolutionary analyses by appropriate available software. Alignments are generally robust and reliable, retaining the correct reading frame. The tool can serve as the first step for comparative genomic analyses of protein-coding gene sequences including phylogenetic tree reconstruction and detection of natural selection. We aligned 20,658 human RefSeq mRNAs using OCPAT. Most alignments are missing sequence(s) from at least one species; however, functional annotation clustering of the ~1700 transcripts that were alignable to all species shows that genes involved in multi-subunit protein complexes are highly conserved. Conclusion: The OCPAT program facilitates large-scale evolutionary and phylogenetic analyses of
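
    One common way to keep the reading frame intact in a DNA alignment is to align the translated proteins first and then thread each coding sequence back onto its aligned protein, so that every gap is exactly three bases wide. The sketch below illustrates that general technique; it is an assumption for illustration and not necessarily how OCPAT works internally, and `codon_align` is a hypothetical function name.

```python
def codon_align(aligned_protein: str, cds: str) -> str:
    """Thread an unaligned coding sequence onto its gapped protein alignment.

    Each residue consumes one codon from `cds`; each protein gap ('-')
    becomes a three-base gap ('---'), so the reading frame is preserved.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) != 3 * n_residues:
        raise ValueError("CDS length must be three times the number of residues")

    codons = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            codons.append("---")
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)
```

    Because gaps are only ever inserted in multiples of three, the output can be passed to codon-based selection tests without a manual frame check.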

  18. Functional cell-based uHTS in chemical genomic drug discovery.

    PubMed

    Croston, Glenn E

    2002-03-01

    The availability of genomic information significantly increases the number of potential targets available for drug discovery, although the function of many targets and their relationship to disease is unknown. In a chemical genomic research approach, ultra-high throughput screening (uHTS) of genomic targets takes place early in the drug discovery process, before target validation. Target-selective modulators then provide drug leads and pharmacological research tools to validate target function. Effective implementation of a chemical genomic strategy requires assays that can perform uHTS for large numbers of genomic targets. Cell-based functional assays are capable of the uHTS throughput required for chemical genomic research, and their functional nature provides distinct advantages over ligand-binding assays in the identification of target-selective modulators.

  19. AutoFACT: An Automatic Functional Annotation and Classification Tool

    PubMed Central

    Koski, Liisa B; Gray, Michael W; Lang, B Franz; Burger, Gertraud

    2005-01-01

    Background: Assignment of function to new molecular sequence data is an essential step in genomics projects. The usual process involves similarity searches of a given sequence against one or more databases, an arduous process for large datasets. Results: We present AutoFACT, a fully automated and customizable annotation tool that assigns biologically informative functions to a sequence. Key features of this tool are that it (1) analyzes nucleotide and protein sequence data; (2) determines the most informative functional description by combining multiple BLAST reports from several user-selected databases; (3) assigns putative metabolic pathways, functional classes, enzyme classes, GeneOntology terms and locus names; and (4) generates output in HTML, text and GFF formats for the user's convenience. We have compared AutoFACT to four well-established annotation pipelines. The error rate of functional annotation is estimated to be only between 1–2%. Comparison of AutoFACT to the traditional top-BLAST-hit annotation method shows that our procedure increases the number of functionally informative annotations by approximately 50%. Conclusion: AutoFACT will serve as a useful annotation tool for smaller sequencing groups lacking dedicated bioinformatics staff. It is implemented in PERL and runs on LINUX/UNIX platforms. AutoFACT is available at . PMID:15960857
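
    The contrast with top-BLAST-hit annotation can be illustrated with a small sketch that walks through hit descriptions from several databases in a user-chosen order and keeps the first one that is not a placeholder. The abstract does not give AutoFACT's actual rules, so the term list, the database names in the usage example, and the function names here are all assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical list of phrases that mark a description as uninformative.
UNINFORMATIVE_TERMS = ("hypothetical", "unknown", "unnamed", "predicted protein")


def is_informative(description: str) -> bool:
    """Return True when the description contains none of the placeholder terms."""
    text = description.lower()
    return not any(term in text for term in UNINFORMATIVE_TERMS)


def pick_annotation(
    hits_by_db: Dict[str, List[str]], db_priority: List[str]
) -> Optional[Tuple[str, str]]:
    """Return (database, description) for the first informative hit.

    Databases are visited in `db_priority` order, and hits within each
    database in their given (for example, best-score-first) order.
    """
    for db in db_priority:
        for description in hits_by_db.get(db, []):
            if is_informative(description):
                return db, description
    return None
```

    With hits `{"dbA": ["hypothetical protein"], "dbB": ["ATP synthase subunit alpha"]}` and priority `["dbA", "dbB"]`, a top-hit approach would report the placeholder from `dbA`, whereas this sketch falls through to the informative `dbB` description.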

  20. Laser microdissection: A powerful tool for genomics at cell level.

    PubMed

    Bevilacqua, Claudia; Ducos, Bertrand

    2017-09-16

    Laser microdissection (LM) has become considerably democratized over the last fifteen years. Instruments have evolved to offer more powerful and efficient lasers as well as new options for sample collection and preparation. Technological evolutions have also focused on the post-microdissection analysis capabilities, opening up investigations in all disciplines of experimental and clinical biology, thanks to the advent of new high-throughput methods of genome analysis, including RNAseq and proteomics, now globally known as microgenomics, i.e. analysis of biomolecules at the cell level. In spite of the advances these rapidly developing methods have allowed, the workflow for sampling and collection by LM remains a critical step in ensuring sample integrity in terms of histology (accurate cell identification) and biochemistry (reliable analyses of biomolecules). In this review, we describe the sample processing as well as the strengths and limiting factors of LM applied to the specific selection of one or more cells of interest from a heterogeneous tissue. We will see how the latest developments in protocols and methods have made LM a powerful and sometimes essential tool for genomic and proteomic analyses of tiny amounts of biomolecules extracted from few cells isolated from a complex tissue, in their physiological context, thus offering new opportunities for understanding fundamental physiological and/or patho-physiological processes. Copyright © 2017. Published by Elsevier Ltd.

  1. The Plant Ontology: A Tool for Plant Genomics.

    PubMed

    Cooper, Laurel; Jaiswal, Pankaj

    2016-01-01

    The use of controlled, structured vocabularies (ontologies) has become a critical tool for scientists in the post-genomic era of massive datasets. Adoption and integration of common vocabularies and annotation practices enables cross-species comparative analyses and increases data sharing and reusability. The Plant Ontology (PO; http://www.plantontology.org/) describes plant anatomy, morphology, and the stages of plant development, and offers a database of plant genomics annotations associated with the PO terms. The scope of the PO has grown from its original design covering only rice, maize, and Arabidopsis, and now includes terms to describe all green plants from angiosperms to green algae. This chapter introduces how the PO and other related ontologies are constructed and organized, including languages and software used for ontology development, and provides an overview of the key features. Detailed instructions illustrate how to search and browse the PO database and access the associated annotation data. Users are encouraged to provide input on the ontology through the online term request form and contribute datasets for integration in the PO database.

  2. Orchidstra: an integrated orchid functional genomics database.

    PubMed

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-02-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species.

  3. Functional genomics of Lactobacillus casei establishment in the gut

    PubMed Central

    Licandro-Seraut, Hélène; Scornec, Hélène; Pédron, Thierry; Cavin, Jean-François; Sansonetti, Philippe J.

    2014-01-01

    Although the composition of the gut microbiota and its symbiotic contribution to key host physiological functions are well established, little is known as yet about the bacterial factors that account for this symbiosis. We selected Lactobacillus casei as a model microorganism to proceed to genomewide identification of the functions required for a symbiont to establish colonization in the gut. As a result of our recent development of a transposon-mutagenesis tool that overcomes the barrier that had prevented L. casei random mutagenesis, we developed a signature-tagged mutagenesis approach combining whole-genome reverse genetics using a set of tagged transposons and in vivo screening using the rabbit ligated ileal loop model. After sequencing transposon insertion sites in 9,250 random mutants, we assembled a library of 1,110 independent mutants, all disrupted in a different gene, that provides a representative view of the L. casei genome. By determining the relative quantity of each of the 1,110 mutants before and after the in vivo challenge, we identified a core of 47 L. casei genes necessary for its establishment in the gut. They are involved in housekeeping functions, metabolism (sugar, amino acids), cell wall biogenesis, and adaptation to environment. Hence we provide what is, to our knowledge, the first global functional genomics analysis of L. casei symbiosis. PMID:25024222
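
    Comparing the relative quantity of each mutant before and after the in vivo challenge can be sketched as a log2 ratio of normalized abundances. The abstract does not specify the scoring used, so the pseudocount, the normalization, and the function name below are assumptions made purely for illustration.

```python
import math
from typing import Dict


def log2_fold_changes(
    before: Dict[str, float], after: Dict[str, float], pseudocount: float = 1.0
) -> Dict[str, float]:
    """Return log2(after fraction / before fraction) for every mutant in `before`.

    Counts are converted to fractions of the pool so that pools of different
    total size are comparable; the pseudocount avoids division by zero for
    mutants that disappear after the challenge.
    """
    n_mutants = len(before)
    total_before = sum(before.values()) + pseudocount * n_mutants
    total_after = sum(after.get(m, 0.0) for m in before) + pseudocount * n_mutants

    changes = {}
    for mutant, count_before in before.items():
        frac_before = (count_before + pseudocount) / total_before
        frac_after = (after.get(mutant, 0.0) + pseudocount) / total_after
        changes[mutant] = math.log2(frac_after / frac_before)
    return changes
```

    Mutants with strongly negative values are depleted after the challenge, which is the pattern expected when the disrupted gene is needed for establishment in the gut.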

  4. Chemogenetic tools to interrogate brain functions.

    PubMed

    Sternson, Scott M; Roth, Bryan L

    2014-01-01

    Elucidating the roles of neuronal cell types for physiology and behavior is essential for understanding brain functions. Perturbation of neuron electrical activity can be used to probe the causal relationship between neuronal cell types and behavior. New genetically encoded neuron perturbation tools have been developed for remotely controlling neuron function using small molecules that activate engineered receptors that can be targeted to cell types using genetic methods. Here we describe recent progress for approaches using genetically engineered receptors that selectively interact with small molecules. Called "chemogenetics," receptors with diverse cellular functions have been developed that facilitate the selective pharmacological control over a diverse range of cell-signaling processes, including electrical activity, for molecularly defined cell types. These tools have revealed remarkably specific behavioral physiological influences for molecularly defined cell types that are often intermingled with populations having different or even opposite functions.

  5. AnnTools: a comprehensive and versatile annotation toolkit for genomic variants

    PubMed Central

    Makarov, Vladimir; O'Grady, Tina; Cai, Guiqing; Lihm, Jayon; Buxbaum, Joseph D.; Yoon, Seungtai

    2012-01-01

    Summary: AnnTools is a versatile bioinformatics application designed for comprehensive annotation of a full spectrum of human genome variation: novel and known single-nucleotide substitutions (SNP/SNV), short insertions/deletions (INDEL) and structural variants/copy number variation (SV/CNV). The variants are interpreted by interrogating data compiled from 15 constantly updated sources. In addition to detailed functional characterization of the coding variants, AnnTools searches for overlaps with regulatory elements, disease/trait associated loci, known segmental duplications and artifact prone regions, thereby offering an integrated and comprehensive analysis of genomic data. The tool conveniently accepts user-provided tracks for custom annotation and offers flexibility in input data formats. The output is generated in the universal Variant Call Format. High annotation speed makes AnnTools suitable for high-throughput sequencing facilities, while a low-memory footprint and modest CPU requirements allow it to operate on a personal computer. The application is freely available for public use; the package includes installation scripts and a set of helper tools. Availability: http://anntools.sourceforge.net/ Contact: vladimir.makarov@mssm.edu; chris.yoon@mssm.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22257670

  6. FilooT: a visualization tool for exploring genomic data

    NASA Astrophysics Data System (ADS)

    Zeinaly, Mahshid; Soltangheis, Mina; Shaw, Chris D.

    2013-12-01

    In order to enhance analysis of synthetic health data of the IEEE VAST Challenge 2010, we introduce an interactive Visual Analytics tool called FilooT designed as a part of the Interactive Multi-genomic Analysis System (IMAS) project. In this paper, we describe different interactive views of FilooT: the Tabular View for exploring and comparing genetic sequences, the Matrix View for sorting sequences according to the values of different characteristics, the P-value View for finding the most important mutations across a family of sequences, the Graph View for finding related sequences and the Group View to group them for further investigation. We followed the Nested Process Model framework throughout the design process and the evaluation. To understand the tool's design capabilities for target domain analysts, we conducted a User Experience scenario-based study followed by an informal interview. The findings indicated how analysts employ each of the visualization and interaction designs in their Bioinformatics task-analysis process. The critical analysis of the results inspired design informing suggestions.

  7. Complete Genome Sequence of Flavobacterium psychrophilum Strain OSU THCO2-90, Used for Functional Genetic Analysis

    PubMed Central

    Rochat, Tatiana; Barbier, Paul; Nicolas, Pierre; Loux, Valentin; Pérez-Pascual, David; Guijarro, José A.; Bernardet, Jean-François

    2017-01-01

    ABSTRACT We report here the complete annotated genome sequence of Flavobacterium psychrophilum OSU THCO2-90, isolated from Coho salmon (Oncorhynchus kisutch) in Oregon. The genome consists of a circular chromosome with 2,343 predicted open reading frames. This strain has proved to be a valuable tool for functional genomics. PMID:28232446

  8. Novel R tools for analysis of genome-wide population genetic data with emphasis on clonality

    PubMed Central

    Kamvar, Zhian N.; Brooks, Jonah C.; Grünwald, Niklaus J.

    2015-01-01

    To gain a detailed understanding of how plant microbes evolve and adapt to hosts, pesticides, and other factors, knowledge of the population dynamics and evolutionary history of populations is crucial. Plant pathogen populations are often clonal or partially clonal which requires different analytical tools. With the advent of high throughput sequencing technologies, obtaining genome-wide population genetic data has become easier than ever before. We previously contributed the R package poppr specifically addressing issues with analysis of clonal populations. In this paper we provide several significant extensions to poppr with a focus on large, genome-wide SNP data. Specifically, we provide several new functionalities including the new function mlg.filter to define clone boundaries allowing for inspection and definition of what is a clonal lineage, minimum spanning networks with reticulation, a sliding-window analysis of the index of association, modular bootstrapping of any genetic distance, and analyses across any level of hierarchies. PMID:26113860
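
The idea of defining clone boundaries by collapsing near-identical multilocus genotypes can be sketched as follows. This is an illustrative Python sketch, not the poppr implementation of mlg.filter: the function names, the use of Hamming distance, the nearest-neighbor (single-linkage) merging rule and the "at or below threshold" cutoff are all assumptions.

```python
def hamming(a, b):
    """Count the positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def collapse_genotypes(genotypes, threshold):
    """Assign a lineage label to each genotype.

    Genotypes whose pairwise distance is at or below `threshold` are merged
    transitively (single linkage). The label of a lineage is the smallest
    index among its members.
    """
    n = len(genotypes)
    parent = list(range(n))  # union-find forest, one tree per lineage

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if hamming(genotypes[i], genotypes[j]) <= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    # Attach the larger root to the smaller one so the
                    # lineage label stays the smallest member index.
                    parent[max(root_i, root_j)] = min(root_i, root_j)

    return [find(i) for i in range(n)]


# Hypothetical SNP genotypes, one string per sample.
samples = ["AAAA", "AAAT", "AATT", "GGGG"]
print(collapse_genotypes(samples, threshold=1))
print(collapse_genotypes(samples, threshold=0))
```

Raising the threshold merges more samples into the same lineage, which is why inspecting the result across a range of thresholds is useful before fixing a definition of a clone.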

  9. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomic Data.

    PubMed

    Bolser, Dan M; Staines, Daniel M; Perry, Emily; Kersey, Paul J

    2017-01-01

    Ensembl Plants ( http://plants.ensembl.org ) is an integrative resource presenting genome-scale information for 39 sequenced plant species. Available data includes genome sequence, gene models, functional annotation, and polymorphic loci; for the latter, additional information including population structure, individual genotypes, linkage, and phenotype data is available for some species. Comparative data is also available, including genomic alignments and "gene trees," which show the inferred evolutionary history of each gene family represented in the resource. Access to the data is provided through a genome browser, which incorporates many specialist interfaces for different data types, through a variety of programmatic interfaces, and via a specialist data mining tool supporting rapid filtering and retrieval of bulk data. Genomic data from many non-plant species, including those of plant pathogens, pests, and pollinators, is also available via the same interfaces through other divisions of Ensembl. Ensembl Plants is updated 4-6 times a year and is developed in collaboration with our international partners in the Gramene ( http://www.gramene.org ) and transPLANT projects ( http://www.transplantdb.eu ).

  10. Tetrahymena functional genomics database (TetraFGD): an integrated resource for Tetrahymena functional genomics.

    PubMed

    Xiong, Jie; Lu, Yuming; Feng, Jinmei; Yuan, Dongxia; Tian, Miao; Chang, Yue; Fu, Chengjie; Wang, Guangying; Zeng, Honghui; Miao, Wei

    2013-01-01

    The ciliated protozoan Tetrahymena thermophila is a useful unicellular model organism for studies of eukaryotic cellular and molecular biology. Research on T. thermophila has contributed to a series of remarkable basic biological principles. Since the macronuclear genome was sequenced, substantial progress has been made in functional genomics research on T. thermophila, including genome-wide microarray analysis of the T. thermophila life cycle, a T. thermophila gene network analysis based on the microarray data and transcriptome analysis by deep RNA sequencing. To meet the growing demands of the Tetrahymena research community, we integrated these data to provide a public access database: Tetrahymena functional genomics database (TetraFGD). TetraFGD contains three major resources, including the RNA-Seq transcriptome, microarray and gene networks. The RNA-Seq data define gene structures and transcriptome, with special emphasis on exon-intron boundaries; the microarray data describe gene expression of 20 time points during three major stages of the T. thermophila life cycle; the gene network data identify potential gene-gene interactions of 15 049 genes. The TetraFGD provides user-friendly search functions that assist researchers in accessing gene models, transcripts, gene expression data and gene-gene relationships. In conclusion, the TetraFGD is an important functional genomic resource for researchers who focus on the Tetrahymena or other ciliates. Database URL: http://tfgd.ihb.ac.cn/

  11. RNA interference for functional genomics and improvement of cotton (Gossypium species)

    USDA-ARS's Scientific Manuscript database

    RNA interference (RNAi) is a powerful new technology for discovering genetic sequence functions and has become a valuable tool for functional genomics of cotton (Gossypium spp.). The rapid adoption of RNAi has replaced previous antisense technology. RNAi has aided in the discovery of function ...

  12. Human-mouse comparative genomics: successes and failures to reveal functional regions of the human genome

    SciTech Connect

    Pennacchio, Len A.; Baroukh, Nadine; Rubin, Edward M.

    2003-05-15

    Deciphering the genetic code embedded within the human genome remains a significant challenge despite the human genome consortium's recent success at defining its linear sequence (Lander et al. 2001; Venter et al. 2001). While useful strategies exist to identify a large percentage of protein encoding regions, efforts to accurately define functional sequences in the remaining approximately 97 percent of the genome lag. Our primary interest has been to utilize the evolutionary relationship and the universal nature of genomic sequence information in vertebrates to reveal functional elements in the human genome. This has been achieved through the combined use of vertebrate comparative genomics to pinpoint highly conserved sequences as candidates for biological activity and transgenic mouse studies to address the functionality of defined human DNA fragments. Accordingly, we describe strategies and insights into functional sequences in the human genome through the use of comparative genomics coupled with functional studies in the mouse.

  13. Ascribing Functions to Genes: Journey Towards Genetic Improvement of Rice Via Functional Genomics

    PubMed Central

    Mustafiz, Ananda; Kumari, Sumita; Karan, Ratna

    2016-01-01

    Rice, one of the most important cereal crops for mankind, feeds more than half the world population. Rice has been heralded as a model cereal owing to its small genome size, amenability to easy transformation, high synteny to other cereal crops and availability of complete genome sequence. Moreover, sequence wealth in rice is getting more refined and precise due to resequencing efforts. This humungous resource of sequence data has confronted research fraternity with a herculean challenge as well as an excellent opportunity to functionally validate expressed as well as regulatory portions of the genome. This will not only help us in understanding the genetic basis of plant architecture and physiology but would also steer us towards developing improved cultivars. No single technique can achieve such a mammoth task. Functional genomics through its diverse tools viz. loss and gain of function mutants, multifarious omics strategies like transcriptomics, proteomics, metabolomics and phenomics provide us with the necessary handle. A paradigm shift in technological advances in functional genomics strategies has been instrumental in generating considerable amount of information w.r.t functionality of rice genome. We now have several databases and online resources for functionally validated genes but despite that we are far from reaching the desired milestone of functionally characterizing each and every rice gene. There is an urgent need for a common platform, for information already available in rice, and collaborative efforts between researchers in a concerted manner as well as healthy public-private partnership, for genetic improvement of rice crop better able to handle the pressures of climate change and exponentially increasing population. PMID:27252584

  14. Ascribing Functions to Genes: Journey Towards Genetic Improvement of Rice Via Functional Genomics.

    PubMed

    Mustafiz, Ananda; Kumari, Sumita; Karan, Ratna

    2016-06-01

    Rice, one of the most important cereal crops for mankind, feeds more than half the world population. Rice has been heralded as a model cereal owing to its small genome size, amenability to easy transformation, high synteny to other cereal crops and availability of complete genome sequence. Moreover, sequence wealth in rice is getting more refined and precise due to resequencing efforts. This humungous resource of sequence data has confronted research fraternity with a herculean challenge as well as an excellent opportunity to functionally validate expressed as well as regulatory portions of the genome. This will not only help us in understanding the genetic basis of plant architecture and physiology but would also steer us towards developing improved cultivars. No single technique can achieve such a mammoth task. Functional genomics through its diverse tools viz. loss and gain of function mutants, multifarious omics strategies like transcriptomics, proteomics, metabolomics and phenomics provide us with the necessary handle. A paradigm shift in technological advances in functional genomics strategies has been instrumental in generating considerable amount of information w.r.t functionality of rice genome. We now have several databases and online resources for functionally validated genes but despite that we are far from reaching the desired milestone of functionally characterizing each and every rice gene. There is an urgent need for a common platform, for information already available in rice, and collaborative efforts between researchers in a concerted manner as well as healthy public-private partnership, for genetic improvement of rice crop better able to handle the pressures of climate change and exponentially increasing population.

  15. Sequencing Single Cell Microbial Genomes with Microfluidic Amplifications Tools (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Quake, Steve [University of Stanford]

    2016-07-12

    Stanford University's Steve Quake on "Sequencing Single Cell Microbial Genomes with Microfluidic Amplification Tools" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  16. Sequencing Single Cell Microbial Genomes with Microfluidic Amplifications Tools (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    SciTech Connect

    Quake, Steve

    2011-10-12

    Stanford University's Steve Quake on "Sequencing Single Cell Microbial Genomes with Microfluidic Amplification Tools" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  17. Exploring the human genome with functional maps

    PubMed Central

    Huttenhower, Curtis; Haley, Erin M.; Hibbs, Matthew A.; Dumeaux, Vanessa; Barrett, Daniel R.; Coller, Hilary A.; Troyanskaya, Olga G.

    2009-01-01

    Human genomic data of many types are readily available, but the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it from a systems level, and apply it to the study of specific pathways or genetic disorders. An investigator could best explore a particular protein, pathway, or disease if given a functional map summarizing the data and interactions most relevant to his or her area of interest. Using a regularized Bayesian integration system, we provide maps of functional activity and interaction networks in over 200 areas of human cellular biology, each including information from ∼30,000 genome-scale experiments pertaining to ∼25,000 human genes. Key to these analyses is the ability to efficiently summarize this large data collection from a variety of biologically informative perspectives: prediction of protein function and functional modules, cross-talk among biological processes, and association of novel genes and pathways with known genetic disorders. In addition to providing maps of each of these areas, we also identify biological processes active in each data set. Experimental investigation of five specific genes, AP3B1, ATP6AP1, BLOC1S1, LAMP2, and RAB11A, has confirmed novel roles for these proteins in the proper initiation of macroautophagy in amino acid-starved human fibroblasts. Our functional maps can be explored using HEFalMp (Human Experimental/Functional Mapper), a web interface allowing interactive visualization and investigation of this large body of information. PMID:19246570

  18. A web-based genomic sequence database for the Streptomycetaceae: a tool for systematics and genome mining

    USDA-ARS's Scientific Manuscript database

    The ARS Microbial Genome Sequence Database (http://199.133.98.43), a web-based database server, was established utilizing the BIGSdb (Bacterial Isolate Genomics Sequence Database) software package, developed at Oxford University, as a tool to manage multi-locus sequence data for the family Streptomy...

  19. Genomic Tools in Cowpea Breeding Programs: Status and Perspectives.

    PubMed

    Boukar, Ousmane; Fatokun, Christian A; Huynh, Bao-Lam; Roberts, Philip A; Close, Timothy J

    2016-01-01

    Cowpea is one of the most important grain legumes in sub-Saharan Africa (SSA). It provides strong support to the livelihood of small-scale farmers through its contributions to their nutritional security, income generation and soil fertility enhancement. Worldwide about 6.5 million metric tons of cowpea are produced annually on about 14.5 million hectares. The low productivity of cowpea is attributable to numerous abiotic and biotic constraints. The abiotic stress factors comprise drought, low soil fertility, and heat while biotic constraints include insects, diseases, parasitic weeds, and nematodes. Cowpea farmers also have limited access to quality seeds of improved varieties for planting. Some progress has been made through conventional breeding at international and national research institutions in the last three decades. Cowpea improvement could also benefit from modern breeding methods based on molecular genetic tools. A number of advances in cowpea genetic linkage maps, and quantitative trait loci associated with some desirable traits such as resistance to Striga, Macrophomina, Fusarium wilt, bacterial blight, root-knot nematodes, aphids, and foliar thrips have been reported. An improved consensus genetic linkage map has been developed and used to identify QTLs of additional traits. In order to take advantage of these developments single nucleotide polymorphism (SNP) genotyping is being streamlined to establish an efficient workflow supported by genotyping support service (GSS)-client interactions. About 1100 SNPs mapped on the cowpea genome were converted by LGC Genomics to KASP assays. Several cowpea breeding programs have been exploiting these resources to implement molecular breeding, especially for MARS and MABC, to accelerate cowpea variety improvement. The combination of conventional breeding and molecular breeding strategies, with workflow managed through the CGIAR breeding management system (BMS), promises an increase in the number of improved

  20. Genomic Tools in Cowpea Breeding Programs: Status and Perspectives

    PubMed Central

    Boukar, Ousmane; Fatokun, Christian A.; Huynh, Bao-Lam; Roberts, Philip A.; Close, Timothy J.

    2016-01-01

    Cowpea is one of the most important grain legumes in sub-Saharan Africa (SSA). It provides strong support to the livelihood of small-scale farmers through its contributions to their nutritional security, income generation and soil fertility enhancement. Worldwide about 6.5 million metric tons of cowpea are produced annually on about 14.5 million hectares. The low productivity of cowpea is attributable to numerous abiotic and biotic constraints. The abiotic stress factors comprise drought, low soil fertility, and heat while biotic constraints include insects, diseases, parasitic weeds, and nematodes. Cowpea farmers also have limited access to quality seeds of improved varieties for planting. Some progress has been made through conventional breeding at international and national research institutions in the last three decades. Cowpea improvement could also benefit from modern breeding methods based on molecular genetic tools. A number of advances in cowpea genetic linkage maps, and quantitative trait loci associated with some desirable traits such as resistance to Striga, Macrophomina, Fusarium wilt, bacterial blight, root-knot nematodes, aphids, and foliar thrips have been reported. An improved consensus genetic linkage map has been developed and used to identify QTLs of additional traits. In order to take advantage of these developments single nucleotide polymorphism (SNP) genotyping is being streamlined to establish an efficient workflow supported by genotyping support service (GSS)-client interactions. About 1100 SNPs mapped on the cowpea genome were converted by LGC Genomics to KASP assays. Several cowpea breeding programs have been exploiting these resources to implement molecular breeding, especially for MARS and MABC, to accelerate cowpea variety improvement. The combination of conventional breeding and molecular breeding strategies, with workflow managed through the CGIAR breeding management system (BMS), promises an increase in the number of improved

  1. Reverse Genetics and High Throughput Sequencing Methodologies for Plant Functional Genomics

    PubMed Central

    Ben-Amar, Anis; Daldoul, Samia; Reustle, Götz M.; Krczal, Gabriele; Mliki, Ahmed

    2016-01-01

    In the post-genomic era, increasingly sophisticated genetic tools are being developed with the long-term goal of understanding how the coordinated activity of genes gives rise to a complex organism. With the advent of next-generation sequencing associated with effective computational approaches, a wide variety of plant species have been fully sequenced, giving a wealth of sequence information on the structure and organization of plant genomes. Since thousands of gene sequences are already known, recently developed functional genomics approaches provide powerful tools to analyze plant gene functions through various gene manipulation technologies. Integration of different omics platforms along with gene annotation and computational analysis may provide a complete view at the systems biology level. Extensive investigations on reverse genetics methodologies were deployed for assigning biological function to a specific gene or gene product. We provide here an updated overview of these high-throughput strategies, highlighting recent advances in the knowledge of functional genomics in plants. PMID:28217003

  2. ANAT: a tool for constructing and analyzing functional protein networks.

    PubMed

    Yosef, Nir; Zalckvar, Einat; Rubinstein, Assaf D; Homilius, Max; Atias, Nir; Vardi, Liram; Berman, Igor; Zur, Hadas; Kimchi, Adi; Ruppin, Eytan; Sharan, Roded

    2011-10-25

    Genome-scale screening studies are gradually accumulating a wealth of data on the putative involvement of hundreds of genes in various cellular responses or functions. A fundamental challenge is to chart the molecular pathways that underlie these systems. ANAT is an interactive software tool, implemented as a Cytoscape plug-in, for elucidating functional networks of proteins. It encompasses a number of network inference algorithms and provides access to networks of physical associations in several organisms. In contrast to existing software tools, ANAT can be used to infer subnetworks that connect hundreds of proteins to each other or to a given set of "anchor" proteins, a fundamental step in reconstructing cellular subnetworks. The interactive component of ANAT provides an array of tools for evaluating and exploring the resulting subnetwork models and for iteratively refining them. We demonstrate the utility of ANAT by studying the crosstalk between the autophagic and apoptotic cell death modules in humans, using a network of physical interactions. Relative to published software tools, ANAT is more accurate and provides more features for comprehensive network analysis. The latest version of the software is available at http://www.cs.tau.ac.il/~bnet/ANAT_SI.

  3. From bacterial genome to functionality; case bifidobacteria.

    PubMed

    Ventura, Marco; O'Connell-Motherway, Mary; Leahy, Sinead; Moreno-Munoz, Jose Antonio; Fitzgerald, Gerald F; van Sinderen, Douwe

    2007-11-30

    The availability of complete bacterial genome sequences has significantly furthered our understanding of the genetics, physiology and biochemistry of the microorganisms in question, particularly those that have commercially important applications. Bifidobacteria are among such microorganisms, as they constitute mammalian commensals of biotechnological significance due to their perceived role in maintaining a balanced gastrointestinal (GIT) microflora. Bifidobacteria are therefore frequently used as health-promoting or probiotic components in functional food products. A fundamental understanding of the metabolic activities employed by these commensal bacteria, in particular their capability to utilize a wide range of complex oligosaccharides, can reveal ways to provide in vivo growth advantages relative to other competing gut bacteria or pathogens. Furthermore, an in depth analysis of adaptive responses to nutritional or environmental stresses may provide methodologies to retain viability and improve functionality during commercial preparation, storage and delivery of the probiotic organism.

  4. GAViT: Genome Assembly Visualization Tool for Short Read Data

    SciTech Connect

    Syed, Aijazuddin; Shapiro, Harris; Tu, Hank; Pangilinan, Jasmyn; Trong, Stephan

    2008-03-14

    It is a challenging job for genome analysts to accurately debug, troubleshoot, and validate genome assembly results. Genome analysts rely on visualization tools to help validate and troubleshoot assembly results, including such problems as mis-assemblies, low-quality regions, and repeats. Short read data adds further complexity and makes it extremely challenging for the visualization tools to scale and to view all needed assembly information. As a result, there is a need for a visualization tool that can scale to display assembly data from the new sequencing technologies. We present Genome Assembly Visualization Tool (GAViT), a highly scalable and interactive assembly visualization tool developed at the DOE Joint Genome Institute (JGI).

  5. Selfish drive can trump function when animal mitochondrial genomes compete

    PubMed Central

    Ma, Hansong; O’Farrell, Patrick H.

    2016-01-01

    Mitochondrial genomes compete for transmission from mother to progeny. We explored this competition by introducing a second genome into Drosophila melanogaster to follow transmission. Competitions between closely related genomes favored those functional in electron transport, resulting in a host-beneficial purifying selection. Contrastingly, matchups between distant genomes often favored those with negligible, negative or lethal consequences, indicating selfish selection. Exhibiting powerful selfish selection, a genome carrying a detrimental mutation displaced a complementing genome leading to population death after several generations. In a different pairing, opposing selfish and purifying selection counterbalanced to give stable transmission of two genomes. Sequencing of recombinant mitochondrial genomes revealed that the non-coding region, containing origins of replication, governs selfish transmission. Uniparental inheritance prevents encounters between distantly related genomes. Nonetheless, within each maternal lineage, constant competition among sibling genomes selects for super-replicators. We suggest that this relentless competition drives positive selection promoting change in the sequences influencing transmission. PMID:27270106

  6. Selfish drive can trump function when animal mitochondrial genomes compete.

    PubMed

    Ma, Hansong; O'Farrell, Patrick H

    2016-07-01

    Mitochondrial genomes compete for transmission from mother to progeny. We explored this competition by introducing a second genome into Drosophila melanogaster to follow transmission. Competitions between closely related genomes favored those functional in electron transport, resulting in a host-beneficial purifying selection. In contrast, matchups between distantly related genomes often favored those with negligible, negative or lethal consequences, indicating selfish selection. Exhibiting powerful selfish selection, a genome carrying a detrimental mutation displaced a complementing genome, leading to population death after several generations. In a different pairing, opposing selfish and purifying selection counterbalanced to give stable transmission of two genomes. Sequencing of recombinant mitochondrial genomes showed that the noncoding region, containing origins of replication, governs selfish transmission. Uniparental inheritance prevents encounters between distantly related genomes. Nonetheless, in each maternal lineage, constant competition among sibling genomes selects for super-replicators. We suggest that this relentless competition drives positive selection, promoting change in the sequences influencing transmission.

  7. Genome ARTIST: a robust, high-accuracy aligner tool for mapping transposon insertions and self-insertions.

    PubMed

    Ecovoiu, Alexandru Al; Ghionoiu, Iulian Constantin; Ciuca, Andrei Mihai; Ratiu, Attila Cristian

    2016-01-01

    A critical topic of insertional mutagenesis experiments performed on model organisms is mapping the hits of artificial transposons (ATs) at nucleotide level accuracy. Mapping errors may occur when sequencing artifacts or mutations such as single nucleotide polymorphisms (SNPs) and small indels are present very close to the junction between a genomic sequence and a transposon inverted repeat (TIR). Another particular item of insertional mutagenesis is mapping of the transposon self-insertions and, to the best of our knowledge, there is no publicly available mapping tool designed to analyze such molecular events. We developed Genome ARTIST, a pairwise gapped aligner tool which works out both issues by means of an original, robust mapping strategy. Genome ARTIST is not designed to use next-generation sequencing (NGS) data but to analyze AT insertions obtained in small to medium-scale mutagenesis experiments. Genome ARTIST employs a heuristic approach to find DNA sequence similarities and harnesses a multi-step implementation of a Smith-Waterman adapted algorithm to compute the mapping alignments. The experience is enhanced by easily customizable parameters and a user-friendly interface that describes the genomic landscape surrounding the insertion. Genome ARTIST is functional with many genomes of bacteria and eukaryotes available in Ensembl and GenBank repositories. Our tool specifically harnesses the sequence annotation data provided by FlyBase for Drosophila melanogaster (the fruit fly), which enables mapping of insertions relative to various genomic features such as natural transposons. Genome ARTIST was tested against other alignment tools using relevant query sequences derived from the D. melanogaster and Mus musculus (mouse) genomes. Real and simulated query sequences were also comparatively inquired, revealing that Genome ARTIST is a very robust solution for mapping transposon insertions. Genome ARTIST is a stand-alone user-friendly application, designed for high
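
For readers unfamiliar with the Smith-Waterman algorithm named in this abstract, a textbook version of local alignment is sketched below in Python. It is not the adapted, multi-step implementation used by Genome ARTIST, which the abstract does not detail; the function name, the linear gap penalty and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def smith_waterman(query, reference, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned query, aligned reference, reference start).

    Textbook local alignment with a linear gap penalty. The reference start
    is the 0-based offset at which the local alignment begins.
    """
    rows, cols = len(query) + 1, len(reference) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; flooring each cell at zero is what makes the
    # alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if query[i - 1] == reference[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in the reference
                score[i][j - 1] + gap,      # gap in the query
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    aligned_q, aligned_r = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if query[i - 1] == reference[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            aligned_q.append(query[i - 1])
            aligned_r.append(reference[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_r.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_r.append(reference[j - 1])
            j -= 1

    return best, "".join(reversed(aligned_q)), "".join(reversed(aligned_r)), j


# Hypothetical junction read aligned against a short reference fragment.
print(smith_waterman("ACGT", "TTACGTTT"))
```

The returned reference offset is the part relevant to insertion mapping: it locates where the genomic portion of a junction read begins in the reference.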

  8. proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes

    PubMed Central

    Mende, Daniel R.; Letunic, Ivica; Huerta-Cepas, Jaime; Li, Simone S.; Forslund, Kristoffer; Sunagawa, Shinichi; Bork, Peer

    2017-01-01

    The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de. PMID:28053165

  9. proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes.

    PubMed

    Mende, Daniel R; Letunic, Ivica; Huerta-Cepas, Jaime; Li, Simone S; Forslund, Kristoffer; Sunagawa, Shinichi; Bork, Peer

    2017-01-04

    The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de.

  10. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects.

    PubMed

    Holt, Carson; Yandell, Mark

    2011-12-22

    Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.

  11. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects

    PubMed Central

    2011-01-01

    Background: Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. Results: We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. Conclusions: MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets. PMID:22192575

  12. TreeQ-VISTA: an interactive tree visualization tool with functional annotation query capabilities.

    PubMed

    Gu, Shengyin; Anderson, Iain; Kunin, Victor; Cipriano, Michael; Minovitsky, Simon; Weber, Gunther; Amenta, Nina; Hamann, Bernd; Dubchak, Inna

    2007-03-15

    We describe a general multiplatform exploratory tool called TreeQ-Vista, designed for presenting functional annotations in a phylogenetic context. Traits, such as phenotypic and genomic properties, are interactively queried from a user-provided relational database with a user-friendly interface which provides a set of tools for users with or without SQL knowledge. The query results are projected onto a phylogenetic tree and can be displayed in multiple color groups. A rich set of browsing, grouping and query tools is provided to facilitate trait exploration, comparison and analysis. The program, detailed tutorial and examples are available online (http://genome.lbl.gov/vista/TreeQVista).

  13. E-TALEN: a web tool to design TALENs for genome engineering.

    PubMed

    Heigwer, Florian; Kerr, Grainne; Walther, Nike; Glaeser, Kathrin; Pelz, Oliver; Breinig, Marco; Boutros, Michael

    2013-11-01

    Use of transcription activator-like effector nucleases (TALENs) is a promising new technique in the field of targeted genome engineering, editing and reverse genetics. Its applications span from introducing knockout mutations to endogenous tagging of proteins and targeted excision repair. Owing to this wide range of possible applications, there is a need for fast and user-friendly TALEN design tools. We developed E-TALEN (http://www.e-talen.org), a web-based tool to design TALENs for experiments of varying scale. E-TALEN enables the design of TALENs against a single target or a large number of target genes. We significantly extended previously published design concepts to consider genomic context and different applications. E-TALEN guides the user through an end-to-end design process of de novo TALEN pairs, which are specific to a certain sequence or genomic locus. Furthermore, E-TALEN offers a functionality to predict targeting and specificity for existing TALENs. Owing to the computational complexity of many of the steps in the design of TALENs, particular emphasis has been put on the implementation of fast yet accurate algorithms. We implemented a user-friendly interface, from the input parameters to the presentation of results. An additional feature of E-TALEN is the in-built sequence and annotation database available for many organisms, including human, mouse, zebrafish, Drosophila and Arabidopsis, which can be extended in the future.

  14. Plant Aquaporins: Genome-Wide Identification, Transcriptomics, Proteomics, and Advanced Analytical Tools

    PubMed Central

    Deshmukh, Rupesh K.; Sonah, Humira; Bélanger, Richard R.

    2016-01-01

    Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. Finally, a detailed catalog of computational tools and analytical pipelines is

  15. Engineering plastid genomes: methods, tools, and applications in basic research and biotechnology.

    PubMed

    Bock, Ralph

    2015-01-01

    The small bacterial-type genome of the plastid (chloroplast) can be engineered by genetic transformation, generating cells and plants with transgenic plastid genomes, also referred to as transplastomic plants. The transformation process relies on homologous recombination, thereby facilitating the site-specific alteration of endogenous plastid genes as well as the precisely targeted insertion of foreign genes into the plastid DNA. The technology has been used extensively to analyze chloroplast gene functions and study plastid gene expression at all levels in vivo. Over the years, a large toolbox has been assembled that is now nearly comparable to the techniques available for plant nuclear transformation and that has enabled new applications of transplastomic technology in basic and applied research. This review describes the state of the art in engineering the plastid genomes of algae and land plants (Embryophyta). It provides an overview of the existing tools for plastid genome engineering, discusses current technological limitations, and highlights selected applications that demonstrate the immense potential of chloroplast transformation in several key areas of plant biotechnology.

  16. The function of genomes in bioenergetic organelles.

    PubMed Central

    Allen, John F

    2003-01-01

    Mitochondria and chloroplasts are energy-transducing organelles of the cytoplasm of eukaryotic cells. They originated as bacterial symbionts whose host cells acquired respiration from the precursor of the mitochondrion, and oxygenic photosynthesis from the precursor of the chloroplast. The host cells also acquired genetic information from their symbionts, eventually incorporating much of it into their own genomes. Genes of the eukaryotic cell nucleus now encode most mitochondrial and chloroplast proteins. Genes are copied and moved between cellular compartments with relative ease, and there is no obvious obstacle to successful import of any protein precursor from the cytosol. So why are any genes at all retained in cytoplasmic organelles? One proposal is that these small but functional genomes provide a location for genes that is close to, and in the same compartment as, their gene products. This co-location facilitates rapid and direct regulatory coupling. Redox control of synthesis de novo is put forward as the common property of those proteins that must be encoded and synthesized within mitochondria and chloroplasts. This testable hypothesis is termed CORR, for co-location for redox regulation. Principles, predictions and consequences of CORR are examined in the context of competing hypotheses and current evidence. PMID:12594916

  17. Population perspectives on functional genomic variation in yeast.

    PubMed

    Skelly, Daniel A; Magwene, Paul M

    2016-03-01

    Advances in high-throughput sequencing have facilitated large-scale surveys of genomic variation in the budding yeast, Saccharomyces cerevisiae. These surveys have revealed extensive sequence variation between yeast strains. However, much less is known about how such variation influences the amount and nature of variation for functional genomic traits within and between yeast lineages. We review population-level studies of functional genomic variation, with a particular focus on how population functional genomic approaches can provide insights into both genome function and the evolutionary process. Although variation in functional genomics phenotypes is pervasive, our understanding of the consequences of this variation, either in physiological or evolutionary terms, is still rudimentary and thus motivates increased attention to appropriate null models. To date, much of the focus of population functional genomic studies has been on gene expression variation, but other functional genomic data types are just as likely to reveal important insights at the population level, suggesting a pressing need for more studies that go beyond transcription. Finally, we discuss how a population functional genomic perspective can be a powerful approach for developing a mechanistic understanding of the processes that link genomic variation to organismal phenotypes through gene networks.

  18. CRISPR/Cas9: an advanced tool for editing plant genomes.

    PubMed

    Samanta, Milan Kumar; Dey, Avishek; Gayen, Srimonta

    2016-10-01

    To meet current challenges in agriculture, genome editing using sequence-specific nucleases (SSNs) is a powerful tool for basic and applied plant biology research. Here, we describe the principle and application of available genome editing tools, including zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeat associated CRISPR/Cas9 system. Among these SSNs, CRISPR/Cas9 is the most recently characterized and rapidly developing genome editing technology, and has been successfully utilized in a wide variety of organisms. This review specifically illustrates the power of CRISPR/Cas9 as a tool for plant genome engineering, and describes the strengths and weaknesses of the CRISPR/Cas9 technology compared to two well-established genome editing tools, ZFNs and TALENs.

  19. Application of resequencing to rice genomics, functional genomics and evolutionary analysis

    PubMed Central

    2014-01-01

    Rice is a model system used for crop genomics studies. The completion of the rice genome draft sequences in 2002 not only accelerated functional genome studies, but also initiated a new era of resequencing rice genomes. Based on the reference genome in rice, next-generation sequencing (NGS) using the high-throughput sequencing system can efficiently accomplish whole genome resequencing of various genetic populations and diverse germplasm resources. Resequencing technology has been effectively utilized in evolutionary analysis, rice genomics and functional genomics studies. This technique is beneficial for both bridging the knowledge gap between genotype and phenotype and facilitating molecular breeding via gene design in rice. Here, we also discuss the limitation, application and future prospects of rice resequencing. PMID:25006357

  20. Three-Dimensional Genome Organization and Function in Drosophila

    PubMed Central

    Schwartz, Yuri B.; Cavalli, Giacomo

    2017-01-01

    Understanding how the metazoan genome is used during development and cell differentiation is one of the major challenges in the postgenomic era. Early studies in Drosophila suggested that three-dimensional (3D) chromosome organization plays important regulatory roles in this process and recent technological advances started to reveal connections at the molecular level. Here we will consider general features of the architectural organization of the Drosophila genome, providing historical perspective and insights from recent work. We will compare the linear and spatial segmentation of the fly genome and focus on the two key regulators of genome architecture: insulator components and Polycomb group proteins. With its unique set of genetic tools and a compact, well annotated genome, Drosophila is poised to remain a model system of choice for rapid progress in understanding principles of genome organization and to serve as a proving ground for development of 3D genome-engineering techniques. PMID:28049701

  1. Three-Dimensional Genome Organization and Function in Drosophila.

    PubMed

    Schwartz, Yuri B; Cavalli, Giacomo

    2017-01-01

    Understanding how the metazoan genome is used during development and cell differentiation is one of the major challenges in the postgenomic era. Early studies in Drosophila suggested that three-dimensional (3D) chromosome organization plays important regulatory roles in this process and recent technological advances started to reveal connections at the molecular level. Here we will consider general features of the architectural organization of the Drosophila genome, providing historical perspective and insights from recent work. We will compare the linear and spatial segmentation of the fly genome and focus on the two key regulators of genome architecture: insulator components and Polycomb group proteins. With its unique set of genetic tools and a compact, well annotated genome, Drosophila is poised to remain a model system of choice for rapid progress in understanding principles of genome organization and to serve as a proving ground for development of 3D genome-engineering techniques. Copyright © 2017 Schwartz and Cavalli.

  2. CAGO: a software tool for dynamic visual comparison and correlation measurement of genome organization.

    PubMed

    Chang, Yi-Feng; Chang, Chuan-Hsiung

    2011-01-01

    CAGO (Comparative Analysis of Genome Organization) is developed to address two critical shortcomings of conventional genome atlas plotters: lack of dynamic exploratory functions and absence of signal analysis for genomic properties. With dynamic exploratory functions, users can directly manipulate chromosome tracks of a genome atlas and intuitively identify distinct genomic signals by visual comparison. Signal analysis of genomic properties can further detect inconspicuous patterns from noisy genomic properties and calculate correlations between genomic properties across various genomes. To implement dynamic exploratory functions, CAGO presents each genome atlas in Scalable Vector Graphics (SVG) format and allows users to interact with it using a SVG viewer through JavaScript. Signal analysis functions are implemented using R statistical software and a discrete wavelet transformation package waveslim. CAGO is not only a plotter for generating complex genome atlases, but also a platform for exploring genome atlases with dynamic exploratory functions for visual comparison and with signal analysis for comparing genomic properties across multiple organisms. The web-based application of CAGO, its source code, user guides, video demos, and live examples are publicly available and can be accessed at http://cbs.ym.edu.tw/cago.
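
    The signal-analysis idea in this abstract (smoothing a noisy genomic property with a discrete wavelet transform, then correlating properties across genomes) can be sketched roughly as follows. This is an illustrative Python sketch only: CAGO itself is reported to use R and the waveslim package, whereas the single-level Haar transform, the function names, the threshold, and the GC-content values below are all assumptions chosen for the example.

```python
import math


def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficients; the input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar transform."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def smooth(signal, threshold):
    """Denoise by zeroing detail coefficients at or below the threshold."""
    approx, detail = haar_dwt(signal)
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_idwt(approx, detail)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Hypothetical windowed GC-content tracks for two genomes (made-up values).
gc_genome_a = [0.40, 0.42, 0.60, 0.58, 0.45, 0.47, 0.39, 0.41]
gc_genome_b = [0.38, 0.41, 0.63, 0.60, 0.44, 0.49, 0.37, 0.40]

# Correlate the smoothed tracks rather than the raw, noisy ones.
r = pearson(smooth(gc_genome_a, 0.05), smooth(gc_genome_b, 0.05))
```

    A multi-level transform or a different wavelet family would follow the same pattern: transform, suppress small detail coefficients, invert, then compare the smoothed tracks.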

  3. Fishing for function: zebrafish BAC transgenics for functional genomics.

    PubMed

    Chatterjee, Sumantra; Lufkin, Thomas

    2011-08-01

    Transgenics using bacterial artificial chromosomes (BACs) offers a great opportunity to look at gene regulation in a developing embryo. The modified BAC containing a reporter inserted just before the translational start site of the gene of interest allows for the visualization of spatio-temporal gene expression. Though this method has been used in the mouse model extensively, its utility in zebrafish studies is relatively new. This review aims to look at the utility of making BAC transgenics in zebrafish and its applications in functional genomics. We look at the various methods to modify the BAC, some limitations and what the future holds.

  4. Fishing for function: zebrafish BAC transgenics for functional genomics

    PubMed Central

    Chatterjee, Sumantra; Lufkin, Thomas

    2012-01-01

    Transgenics using bacterial artificial chromosomes (BACs) offers a great opportunity to look at gene regulation in a developing embryo. The modified BAC containing a reporter inserted just before the translational start site of the gene of interest allows for the visualization of spatio-temporal gene expression. Though this method has been used in the mouse model extensively, its utility in zebrafish studies is relatively new. This review aims to look at the utility of making BAC transgenics in zebrafish and its applications in functional genomics. We look at the various methods to modify the BAC, some limitations and what the future holds. PMID:21647532

  5. Optimized CRISPR/Cas tools for efficient germline and somatic genome engineering in Drosophila.

    PubMed

    Port, Fillip; Chen, Hui-Min; Lee, Tzumin; Bullock, Simon L

    2014-07-22

    The type II clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) system has emerged recently as a powerful method to manipulate the genomes of various organisms. Here, we report a toolbox for high-efficiency genome engineering of Drosophila melanogaster consisting of transgenic Cas9 lines and versatile guide RNA (gRNA) expression plasmids. Systematic evaluation reveals Cas9 lines with ubiquitous or germ-line-restricted patterns of activity. We also demonstrate differential activity of the same gRNA expressed from different U6 snRNA promoters, with the previously untested U6:3 promoter giving the most potent effect. An appropriate combination of Cas9 and gRNA allows targeting of essential and nonessential genes with transmission rates ranging from 25-100%. We also demonstrate that our optimized CRISPR/Cas tools can be used for offset nicking-based mutagenesis. Furthermore, in combination with oligonucleotide or long double-stranded donor templates, our reagents allow precise genome editing by homology-directed repair with rates that make selection markers unnecessary. Last, we demonstrate a novel application of CRISPR/Cas-mediated technology in revealing loss-of-function phenotypes in somatic cells following efficient biallelic targeting by Cas9 expressed in a ubiquitous or tissue-restricted manner. Our CRISPR/Cas tools will facilitate the rapid evaluation of mutant phenotypes of specific genes and the precise modification of the genome with single-nucleotide precision. Our results also pave the way for high-throughput genetic screening with CRISPR/Cas.

  6. Genetic and genomic tools for the marine annelid Platynereis dumerilii.

    PubMed

    Zantke, Juliane; Bannister, Stephanie; Rajan, Vinoth Babu Veedin; Raible, Florian; Tessmar-Raible, Kristin

    2014-05-01

    The bristle worm Platynereis dumerilii displays many interesting biological characteristics. These include its reproductive timing, which is synchronized to the moon phase, its regenerative capacity that is hormonally controlled, and a slow rate of evolution, which permits analyses of ancestral genes and cell types. As a marine annelid, Platynereis is also representative of the marine ecosystem, as well as one of the three large animal subphyla, the Lophotrochozoa. Here, we provide an overview of the molecular resources, functional techniques, and behavioral assays that have recently been established for the bristle worm. This combination of tools now places Platynereis in an excellent position to advance research at the frontiers of neurobiology, chronobiology, evo-devo, and marine biology.

  7. Genetic and Genomic Tools for the Marine Annelid Platynereis dumerilii

    PubMed Central

    Zantke, Juliane; Bannister, Stephanie; Rajan, Vinoth Babu Veedin; Raible, Florian; Tessmar-Raible, Kristin

    2014-01-01

    The bristle worm Platynereis dumerilii displays many interesting biological characteristics. These include its reproductive timing, which is synchronized to the moon phase, its regenerative capacity that is hormonally controlled, and a slow rate of evolution, which permits analyses of ancestral genes and cell types. As a marine annelid, Platynereis is also representative of the marine ecosystem, as well as one of the three large animal subphyla, the Lophotrochozoa. Here, we provide an overview of the molecular resources, functional techniques, and behavioral assays that have recently been established for the bristle worm. This combination of tools now places Platynereis in an excellent position to advance research at the frontiers of neurobiology, chronobiology, evo-devo, and marine biology. PMID:24807110

  8. Blast2GO: A comprehensive suite for functional analysis in plant genomics.

    PubMed

    Conesa, Ana; Götz, Stefan

    2008-01-01

    Functional annotation of novel sequence data is a primary requirement for the utilization of functional genomics approaches in plant research. In this paper, we describe the Blast2GO suite as a comprehensive bioinformatics tool for functional annotation of sequences and data mining on the resulting annotations, primarily based on the gene ontology (GO) vocabulary. Blast2GO optimizes function transfer from homologous sequences through an elaborate algorithm that considers similarity, the extension of the homology, the database of choice, the GO hierarchy, and the quality of the original annotations. The tool includes numerous functions for the visualization, management, and statistical analysis of annotation results, including gene set enrichment analysis. The application supports InterPro, enzyme codes, KEGG pathways, GO directed acyclic graphs (DAGs), and GOSlim. Blast2GO is a suitable tool for plant genomics research because of its versatility, easy installation, and friendly use.

  9. Blast2GO: A Comprehensive Suite for Functional Analysis in Plant Genomics

    PubMed Central

    Conesa, Ana; Götz, Stefan

    2008-01-01

    Functional annotation of novel sequence data is a primary requirement for the utilization of functional genomics approaches in plant research. In this paper, we describe the Blast2GO suite as a comprehensive bioinformatics tool for functional annotation of sequences and data mining on the resulting annotations, primarily based on the gene ontology (GO) vocabulary. Blast2GO optimizes function transfer from homologous sequences through an elaborate algorithm that considers similarity, the extension of the homology, the database of choice, the GO hierarchy, and the quality of the original annotations. The tool includes numerous functions for the visualization, management, and statistical analysis of annotation results, including gene set enrichment analysis. The application supports InterPro, enzyme codes, KEGG pathways, GO directed acyclic graphs (DAGs), and GOSlim. Blast2GO is a suitable tool for plant genomics research because of its versatility, easy installation, and friendly use. PMID:18483572

  10. GHOSTX: A Fast Sequence Homology Search Tool for Functional Annotation of Metagenomic Data.

    PubMed

    Suzuki, Shuji; Ishida, Takashi; Ohue, Masahito; Kakuta, Masanori; Akiyama, Yutaka

    2017-01-01

    Metagenomic analysis based on whole genome shotgun sequencing data requires fast protein sequence homology searches for predicting the function of proteins coded on metagenome short reads. However, huge amounts of sequence data cause even general homology search analyses using BLASTX to become difficult in terms of computational cost. GHOSTX is a sequence homology search tool specifically developed for functional annotation of metagenome sequences. The tool is more than 160 times faster than BLASTX and has sufficient search sensitivity for metagenomic analysis. Using this tool, users can perform functional annotation of metagenomic data within a short time and infer metabolic pathways within an environment.

  11. Functional profiling of the Saccharomyces cerevisiae genome.

    PubMed

    Giaever, Guri; Chu, Angela M; Ni, Li; Connelly, Carla; Riles, Linda; Véronneau, Steeve; Dow, Sally; Lucau-Danila, Ankuta; Anderson, Keith; André, Bruno; Arkin, Adam P; Astromoff, Anna; El-Bakkoury, Mohamed; Bangham, Rhonda; Benito, Rocio; Brachat, Sophie; Campanaro, Stefano; Curtiss, Matt; Davis, Karen; Deutschbauer, Adam; Entian, Karl-Dieter; Flaherty, Patrick; Foury, Francoise; Garfinkel, David J; Gerstein, Mark; Gotte, Deanna; Güldener, Ulrich; Hegemann, Johannes H; Hempel, Svenja; Herman, Zelek; Jaramillo, Daniel F; Kelly, Diane E; Kelly, Steven L; Kötter, Peter; LaBonte, Darlene; Lamb, David C; Lan, Ning; Liang, Hong; Liao, Hong; Liu, Lucy; Luo, Chuanyun; Lussier, Marc; Mao, Rong; Menard, Patrice; Ooi, Siew Loon; Revuelta, Jose L; Roberts, Christopher J; Rose, Matthias; Ross-Macdonald, Petra; Scherens, Bart; Schimmack, Greg; Shafer, Brenda; Shoemaker, Daniel D; Sookhai-Mahadeo, Sharon; Storms, Reginald K; Strathern, Jeffrey N; Valle, Giorgio; Voet, Marleen; Volckaert, Guido; Wang, Ching-yun; Ward, Teresa R; Wilhelmy, Julie; Winzeler, Elizabeth A; Yang, Yonghong; Yen, Grace; Youngman, Elaine; Yu, Kexin; Bussey, Howard; Boeke, Jef D; Snyder, Michael; Philippsen, Peter; Davis, Ronald W; Johnston, Mark

    2002-07-25

    Determining the effect of gene deletion is a fundamental approach to understanding gene function. Conventional genetic screens exhibit biases, and genes contributing to a phenotype are often missed. We systematically constructed a nearly complete collection of gene-deletion mutants (96% of annotated open reading frames, or ORFs) of the yeast Saccharomyces cerevisiae. DNA sequences dubbed 'molecular bar codes' uniquely identify each strain, enabling their growth to be analysed in parallel and the fitness contribution of each gene to be quantitatively assessed by hybridization to high-density oligonucleotide arrays. We show that previously known and new genes are necessary for optimal growth under six well-studied conditions: high salt, sorbitol, galactose, pH 8, minimal medium and nystatin treatment. Less than 7% of genes that exhibit a significant increase in messenger RNA expression are also required for optimal growth in four of the tested conditions. Our results validate the yeast gene-deletion collection as a valuable resource for functional genomics.
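
    The pooled-barcode readout described in this abstract can be illustrated with a small calculation. The abstract does not give the scoring formula, so the sketch below assumes one simple possibility: a log2 ratio of each strain's barcode signal after versus before growth, centred on the pool median. The function name, the pseudocount, the strain labels, and the signal values are all hypothetical.

```python
import math
from statistics import median


def barcode_fitness(before, after, pseudo=1.0):
    """Relative fitness per deletion strain from barcode signal intensities.

    before/after map strain name -> barcode signal (e.g. array intensity).
    Returns log2(after/before) per strain, centred on the pool median so
    that a typical strain scores about 0 and depleted strains score below 0.
    """
    raw = {
        strain: math.log2((after[strain] + pseudo) / (before[strain] + pseudo))
        for strain in before
    }
    centre = median(raw.values())
    return {strain: value - centre for strain, value in raw.items()}


# Hypothetical signals for three deletion strains grown under one condition.
signal_before = {"del_A": 999.0, "del_B": 999.0, "del_C": 999.0}
signal_after = {"del_A": 999.0, "del_B": 999.0, "del_C": 249.0}

scores = barcode_fitness(signal_before, signal_after)
# del_C drops to a quarter of its starting signal, so it scores -2.0;
# the other two strains are unchanged and score 0.0.
```

    A strongly negative score flags a gene whose deletion impairs growth under the tested condition; real analyses would also need replicate handling and significance testing, which this sketch omits.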

  12. An FDA bioinformatics tool for microbial genomics research on molecular characterization of bacterial foodborne pathogens using microarrays.

    PubMed

    Fang, Hong; Xu, Joshua; Ding, Don; Jackson, Scott A; Patel, Isha R; Frye, Jonathan G; Zou, Wen; Nayak, Rajesh; Foley, Steven; Chen, James; Su, Zhenqiang; Ye, Yanbin; Turner, Steve; Harris, Steve; Zhou, Guangxu; Cerniglia, Carl; Tong, Weida

    2010-10-07

    Advances in microbial genomics and bioinformatics are offering greater insights into the emergence and spread of foodborne pathogens in outbreak scenarios. The Food and Drug Administration (FDA) has developed a genomics tool, ArrayTrack™, which provides extensive functionalities to manage, analyze, and interpret genomic data for mammalian species. ArrayTrack™ has been widely adopted by the research community and used for pharmacogenomics data review in the FDA's Voluntary Genomics Data Submission program. ArrayTrack™ has been extended to manage and analyze genomics data from bacterial pathogens of human, animal, and food origin. It was populated with bioinformatics data from public databases such as NCBI, Swiss-Prot, KEGG Pathway, and Gene Ontology to facilitate pathogen detection and characterization. ArrayTrack™'s data processing and visualization tools were enhanced with analysis capabilities designed specifically for microbial genomics including flag-based hierarchical clustering analysis (HCA), flag concordance heat maps, and mixed scatter plots. These specific functionalities were evaluated on data generated from a custom Affymetrix array (FDA-ECSG) previously developed within the FDA. The FDA-ECSG array represents 32 complete genomes of Escherichia coli and Shigella. The new functions were also used to analyze microarray data focusing on antimicrobial resistance genes from Salmonella isolates in a poultry production environment using a universal antimicrobial resistance microarray developed by the United States Department of Agriculture (USDA). The application of ArrayTrack™ to different microarray platforms demonstrates its utility in microbial genomics research, and thus will improve the capabilities of the FDA to rapidly identify foodborne bacteria and their genetic traits (e.g., antimicrobial resistance, virulence, etc.) during outbreak investigations. ArrayTrack™ is free to use and available to public, private, and academic researchers at

  13. BABELOMICS: a systems biology perspective in the functional annotation of genome-scale experiments

    PubMed Central

    Al-Shahrour, Fátima; Minguez, Pablo; Tárraga, Joaquín; Montaner, David; Alloza, Eva; Vaquerizas, Juan M.; Conde, Lucía; Blaschke, Christian; Vera, Javier; Dopazo, Joaquín

    2006-01-01

    We present a new version of Babelomics, a complete suite of web tools for functional analysis of genome-scale experiments, with new and improved tools. New functionally relevant terms have been included such as CisRed motifs or bioentities obtained by text-mining procedures. An improved indexing has considerably speeded up several of the modules. An improved version of the FatiScan method for studying the coordinate behaviour of groups of functionally related genes is presented, along with a similar tool, the Gene Set Enrichment Analysis. Babelomics is now more oriented to test systems biology inspired hypotheses. Babelomics can be found at . PMID:16845052

  14. GIANT API: an application programming interface for functional genomics.

    PubMed

    Roberts, Andrew M; Wong, Aaron K; Fisk, Ian; Troyanskaya, Olga G

    2016-07-08

    GIANT API provides biomedical researchers programmatic access to tissue-specific and global networks in humans and model organisms, and associated tools, which include functional re-prioritization of existing genome-wide association study (GWAS) data. Using tissue-specific interaction networks, researchers are able to predict relationships between genes specific to a tissue or cell lineage, identify the changing roles of genes across tissues and uncover disease-gene associations. Additionally, GIANT API enables computational tools like NetWAS, which leverages tissue-specific networks for re-prioritization of GWAS results. The web services covered by the API include 144 tissue-specific functional gene networks in human, global functional networks for human and six common model organisms and the NetWAS method. GIANT API conforms to the REST architecture, which makes it stateless, cacheable and highly scalable. It can be used by a diverse range of clients including web browsers, command terminals, programming languages and standalone apps for data analysis and visualization. The API is freely available for use at http://giant-api.princeton.edu. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. GIANT API: an application programming interface for functional genomics

    PubMed Central

    Roberts, Andrew M.; Wong, Aaron K.; Fisk, Ian; Troyanskaya, Olga G.

    2016-01-01

    GIANT API provides biomedical researchers programmatic access to tissue-specific and global networks in humans and model organisms, and associated tools, which include functional re-prioritization of existing genome-wide association study (GWAS) data. Using tissue-specific interaction networks, researchers are able to predict relationships between genes specific to a tissue or cell lineage, identify the changing roles of genes across tissues and uncover disease-gene associations. Additionally, GIANT API enables computational tools like NetWAS, which leverages tissue-specific networks for re-prioritization of GWAS results. The web services covered by the API include 144 tissue-specific functional gene networks in human, global functional networks for human and six common model organisms and the NetWAS method. GIANT API conforms to the REST architecture, which makes it stateless, cacheable and highly scalable. It can be used by a diverse range of clients including web browsers, command terminals, programming languages and standalone apps for data analysis and visualization. The API is freely available for use at http://giant-api.princeton.edu. PMID:27098035
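
    The statelessness mentioned in the abstract means that each request carries everything the server needs to answer it. The minimal Python sketch below illustrates that idea only. The base URL is the one given in the abstract, but the path `networks` and the parameter names `tissue` and `genes` are hypothetical placeholders, not documented GIANT API endpoints, and the sketch builds the URL without sending it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://giant-api.princeton.edu"  # base URL quoted in the abstract


def build_request_url(path, **params):
    """Return a self-contained GET URL; no session or server-side state is assumed."""
    # Sorting the parameters gives a stable URL, which also helps HTTP caching.
    query = urlencode(sorted(params.items()))
    return f"{BASE_URL}/{path.strip('/')}?{query}"


# Hypothetical path and parameter names, used for illustration only.
url = build_request_url("networks", tissue="brain", genes="BRCA1,TP53")
print(url)

# The query string alone is enough to recover the full request.
print(parse_qs(urlsplit(url).query))
```

    Because the whole query lives in the URL, any of the clients listed in the abstract (a browser, a command terminal, or a script) could issue the same request once the real endpoint names are taken from the API documentation.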

  16. Functional viral metagenomics and the next generation of molecular tools.

    PubMed

    Schoenfeld, Thomas; Liles, Mark; Wommack, K Eric; Polson, Shawn W; Godiska, Ronald; Mead, David

    2010-01-01

    The enzymes of bacteriophages and other viruses have been essential research tools since the first days of molecular biology. However, the current repertoire of viral enzymes only hints at their overall potential. The most commonly used enzymes are derived from a surprisingly small number of cultivated viruses, which is remarkable considering the extreme abundance and diversity of viruses revealed over the past decade by metagenomic analysis. To access the treasure trove of enzymes hidden in the global virosphere and develop them for research, therapeutic and diagnostic uses, improvements are needed in our ability to rapidly and efficiently discover, express and characterize viral genes to produce useful proteins. In this paper, we discuss improvements to sampling and cloning methods, functional and genomics-based screens, and expression systems, which should accelerate discovery of new enzymes and other viral proteins for use in research and medicine.

  17. Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology.

    PubMed

    Karp, Peter D; Latendresse, Mario; Paley, Suzanne M; Krummenacker, Markus; Ong, Quang D; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M; Caspi, Ron

    2016-09-01

    Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  18. Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology

    PubMed Central

    Latendresse, Mario; Paley, Suzanne M.; Krummenacker, Markus; Ong, Quang D.; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M.; Caspi, Ron

    2016-01-01

    Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. PMID:26454094

  19. Phytozome: a Tool for Green Plant Comparative Genomics

    DOE Data Explorer

    Phytozome is a joint project of the Department of Energy's Joint Genome Institute and the Center for Integrative Genomics to facilitate comparative genomic studies amongst green plants. Clusters of orthologous and paralogous genes that represent the modern descendants of ancestral gene sets are constructed at key phylogenetic nodes. These clusters allow easy access to clade-specific orthology/paralogy relationships as well as clade-specific genes and gene expansions. As of release v4.0, Phytozome provides access to nine sequenced and annotated green plant genomes, eight of which have been clustered into gene families at six evolutionarily significant nodes. Where possible, each gene has been annotated with PFAM, KOG, KEGG, and PANTHER assignments, and publicly available annotations from RefSeq, UniProt, TAIR, and JGI are hyperlinked and searchable. [Copied from the Overview at http://www.phytozome.net/Phytozome_info.php]

  20. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison.

    PubMed

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-07-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features.

  1. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

    PubMed

    Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

    2015-05-27

    Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. The ability of predicting functional regions as well as its generalizable statistical framework makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.

  2. Comparative genomics of pectinacetylesterases: Insight on function and biology

    PubMed Central

    de Souza, Amancio José; Pauly, Markus

    2015-01-01

    Pectin acetylation influences the gelling ability of this important plant polysaccharide for the food industry. Plant apoplastic pectinacetylesterases (PAEs) play a key role in regulating the degree of pectin acetylation, and modifying their expression thus represents one way to engineer plant polysaccharides for food applications. Identifying the major active enzymes within the PAE gene family will aid in our understanding of this biological phenomenon as well as provide the tools for direct trait manipulation. Using comparative genomics we propose that there is a minimal set of 4 distinct PAEs in plants. Possible functional diversification of the PAE family in the grasses is also explored with the identification of 3 groups of PAE genes specific to grasses. PMID:26237162

  3. NCBI GEO: archive for functional genomics data sets—update

    PubMed Central

    Barrett, Tanya; Wilhite, Stephen E.; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F.; Tomashevsky, Maxim; Marshall, Kimberly A.; Phillippy, Katherine H.; Sherman, Patti M.; Holko, Michelle; Yefanov, Andrey; Lee, Hyeseung; Zhang, Naigong; Robertson, Cynthia L.; Serova, Nadezhda; Davis, Sean; Soboleva, Alexandra

    2013-01-01

    The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data. PMID:23193258

  4. NCBI GEO: archive for functional genomics data sets--update.

    PubMed

    Barrett, Tanya; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Holko, Michelle; Yefanov, Andrey; Lee, Hyeseung; Zhang, Naigong; Robertson, Cynthia L; Serova, Nadezhda; Davis, Sean; Soboleva, Alexandra

    2013-01-01

    The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.

  5. HGT-Finder: A New Tool for Horizontal Gene Transfer Finding and Application to Aspergillus genomes.

    PubMed

    Nguyen, Marcus; Ekstrom, Alex; Li, Xueqiong; Yin, Yanbin

    2015-10-09

    Horizontal gene transfer (HGT) is a fast-track mechanism that allows genetically unrelated organisms to exchange genes for rapid environmental adaptation. We developed a new phyletic distribution-based software, HGT-Finder, which implements a novel bioinformatics algorithm to calculate a horizontal transfer index and a probability value for each query gene. Applying this new tool to the Aspergillus fumigatus, Aspergillus flavus, and Aspergillus nidulans genomes, we found 273, 542, and 715 transferred genes (HTGs), respectively. HTGs have shorter length, higher guanine-cytosine (GC) content, and relaxed selection pressure. Metabolic process and secondary metabolism functions are significantly enriched in HTGs. Gene clustering analysis showed that 61%, 41% and 74% of HTGs in the three genomes form physically linked gene clusters (HTGCs). Overlapping manually curated, secondary metabolite gene clusters (SMGCs) with HTGCs found that 9 of the 33 A. fumigatus SMGCs and 31 of the 65 A. nidulans SMGCs share genes with HTGCs, and that HTGs are significantly enriched in SMGCs. Our genome-wide analysis thus presented very strong evidence to support the hypothesis that HGT has played a very critical role in the evolution of SMGCs. The program is freely available at http://cys.bios.niu.edu/HGTFinder/HGTFinder.tar.gz.
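
    As a quick reading aid, the SMGC overlap counts quoted in the abstract can be turned into fractions. The short Python sketch below uses only those published counts. The helper name `overlap_fraction` is introduced here for illustration and is not part of HGT-Finder, and the calculation does not reproduce the enrichment test reported by the authors.

```python
# Counts from the abstract: (SMGCs sharing genes with HTGCs, total curated SMGCs).
smgc_overlap = {
    "A. fumigatus": (9, 33),
    "A. nidulans": (31, 65),
}


def overlap_fraction(shared, total):
    """Fraction of curated SMGCs that share at least one gene with an HTGC."""
    return shared / total


for species, (shared, total) in smgc_overlap.items():
    print(f"{species}: {shared}/{total} = {overlap_fraction(shared, total):.1%}")
# A. fumigatus: 9/33 = 27.3%
# A. nidulans: 31/65 = 47.7%
```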

  6. New Developments of RNAi in Paracoccidioides brasiliensis: Prospects for High-Throughput, Genome-Wide, Functional Genomics

    PubMed Central

    Goes, Tercio; Bailão, Elisa Flavia L. C.; Correa, Cristiane R.; Bozzi, Adriana; Santos, Luara I.; Gomes, Dawidson A.; Soares, Celia M. A.; Goes, Alfredo M.

    2014-01-01

    Background: The Fungal Genome Initiative of the Broad Institute, in partnership with the Paracoccidioides research community, has recently sequenced the genome of representative isolates of this human-pathogen dimorphic fungus: Pb18 (S1), Pb03 (PS2) and Pb01. The accomplishment of future high-throughput, genome-wide, functional genomics will rely upon appropriate molecular tools and straightforward techniques to streamline the generation of stable loss-of-function phenotypes. In the past decades, RNAi has emerged as the most robust genetic technique to modulate or to suppress gene expression in diverse eukaryotes, including fungi. These molecular tools and techniques, adapted for RNAi, were up until now unavailable for P. brasiliensis. Methodology/Principal Findings: In this paper, we report Agrobacterium tumefaciens-mediated transformation of yeast cells for high-throughput applications with which higher transformation frequencies of 150±24 yeast cell transformants per 1×10^6 viable yeast cells were obtained. Our approach is based on a bifunctional selective marker fusion protein consisting of the Streptoalloteichus hindustanus bleomycin-resistance gene (Shble) and the intrinsically fluorescent monomeric protein mCherry, which was codon-optimized for heterologous expression in P. brasiliensis. We also report successful GP43 gene knock-down through the expression of intron-containing hairpin RNA (ihpRNA) from a Gateway-adapted cassette (cALf) which was purpose-built for gene silencing in a high-throughput manner. Gp43 transcript levels were reduced by 73.1±22.9% with this approach. Conclusions/Significance: We have a firm conviction that the genetic transformation technique and the molecular tools herein described will have a relevant contribution in future Paracoccidioides spp. functional genomics research. PMID:25275433

  7. New developments of RNAi in Paracoccidioides brasiliensis: prospects for high-throughput, genome-wide, functional genomics.

    PubMed

    Goes, Tercio; Bailão, Elisa Flavia L C; Correa, Cristiane R; Bozzi, Adriana; Santos, Luara I; Gomes, Dawidson A; Soares, Celia M A; Goes, Alfredo M

    2014-10-01

    The Fungal Genome Initiative of the Broad Institute, in partnership with the Paracoccidioides research community, has recently sequenced the genome of representative isolates of this human-pathogen dimorphic fungus: Pb18 (S1), Pb03 (PS2) and Pb01. The accomplishment of future high-throughput, genome-wide, functional genomics will rely upon appropriate molecular tools and straightforward techniques to streamline the generation of stable loss-of-function phenotypes. In the past decades, RNAi has emerged as the most robust genetic technique to modulate or to suppress gene expression in diverse eukaryotes, including fungi. These molecular tools and techniques, adapted for RNAi, were up until now unavailable for P. brasiliensis. In this paper, we report Agrobacterium tumefaciens-mediated transformation of yeast cells for high-throughput applications with which higher transformation frequencies of 150±24 yeast cell transformants per 1×10^6 viable yeast cells were obtained. Our approach is based on a bifunctional selective marker fusion protein consisting of the Streptoalloteichus hindustanus bleomycin-resistance gene (Shble) and the intrinsically fluorescent monomeric protein mCherry, which was codon-optimized for heterologous expression in P. brasiliensis. We also report successful GP43 gene knock-down through the expression of intron-containing hairpin RNA (ihpRNA) from a Gateway-adapted cassette (cALf) which was purpose-built for gene silencing in a high-throughput manner. Gp43 transcript levels were reduced by 73.1±22.9% with this approach. We have a firm conviction that the genetic transformation technique and the molecular tools herein described will have a relevant contribution in future Paracoccidioides spp. functional genomics research.

  8. Advances in mRNA Silencing and Transgene Expression: a Gateway to Functional Genomics in Schistosomes

    PubMed Central

    Tchoubrieva, Elissaveta B.; Kalinna, Bernd H.

    2013-01-01

    The completion of the WHO Schistosoma Genome Project in 2008, although not fully annotated, provides a golden opportunity to actively pursue fundamental research on the parasite's genome. This analysis will aid identification of targets for drugs, vaccines and markers for diagnostic tools as well as for studying the biological basis of drug resistance, infectivity and pathology. For the validation of drug and vaccine targets, the genomic sequence data is only of use if functional analyses can be conducted (in the parasite itself). Until recently, gene manipulation approaches had not been seriously addressed. This situation is now changing and rapid advances have been made in gene silencing and transgenesis of schistosomes. PMID:21415884

  9. A functional genomics method for assaying gene function in phytopathogenic fungi through host-induced gene silencing mediated by agroinfiltration.

    PubMed

    Panwar, Vinay; McCallum, Brent; Bakkeren, Guus

    2015-01-01

    With the rapid growth of genomic information, there is an increasing demand for efficient analysis tools to study the function of predicted genes coded in genomes. Agroinfiltration, the delivery of gene constructs into plant cells by Agrobacterium tumefaciens infiltrated into leaves, is one such versatile, simple, and rapid technique that is increasingly used for transient gene expression assay in plants. In this chapter, we focus on the use of agroinfiltration as a functional genomics research tool in molecular plant pathology. Specifically, we describe in detail its use in expressing phytopathogenic fungal gene sequences in a host plant to induce RNA silencing of corresponding genes inside the pathogen, a method which has been termed host-induced gene silencing (HIGS). We target the fungal pathogen Puccinia triticina which causes leaf rust on its wheat host, but the method is applicable to a variety of pathosystems.

  10. The role of chromosome domains in shaping the functional genome.

    PubMed

    Sexton, Tom; Cavalli, Giacomo

    2015-03-12

    The genome must be highly compacted to fit within eukaryotic nuclei but must be accessible to the transcriptional machinery to allow appropriate expression of genes in different cell types and throughout developmental pathways. A growing body of work has shown that the genome, analogously to proteins, forms an ordered, hierarchical structure that closely correlates and may even be causally linked with regulation of functions such as transcription. This review describes our current understanding of how these functional genomic "secondary and tertiary structures" form a blueprint for global nuclear architecture and the potential they hold for understanding and manipulating genomic regulation.

  11. Genome-editing tools for stem cell biology

    PubMed Central

    Vasileva, E A; Shuvalov, O U; Garabadgiu, A V; Melino, G; Barlev, N A

    2015-01-01

    Human pluripotent stem cells provide a versatile platform for regenerative studies, drug testing and disease modeling. That the expression of only four transcription factors, Oct4, Klf4, Sox2 and c-Myc (OKSM), is sufficient for generation of induced pluripotent stem cells (iPSCs) from differentiated somatic cells has revolutionized the field and also highlighted the importance of OKSM as targets for genome editing. A number of novel genome-editing systems have been developed recently. In this review, we focus on successful applications of several such systems for generation of iPSCs. In particular, we discuss genome-editing systems based on zinc-finger fusion proteins (ZFs), transcription activator-like effectors (TALEs) and an RNA-guided DNA-specific nuclease, Cas9, derived from the bacterial defense system against viruses that utilizes clustered regularly interspaced short palindromic repeats (CRISPR). PMID:26203860

  12. Genome-editing tools for stem cell biology.

    PubMed

    Vasileva, E A; Shuvalov, O U; Garabadgiu, A V; Melino, G; Barlev, N A

    2015-07-23

    Human pluripotent stem cells provide a versatile platform for regenerative studies, drug testing and disease modeling. That the expression of only four transcription factors, Oct4, Klf4, Sox2 and c-Myc (OKSM), is sufficient for generation of induced pluripotent stem cells (iPSCs) from differentiated somatic cells has revolutionized the field and also highlighted the importance of OKSM as targets for genome editing. A number of novel genome-editing systems have been developed recently. In this review, we focus on successful applications of several such systems for generation of iPSCs. In particular, we discuss genome-editing systems based on zinc-finger fusion proteins (ZFs), transcription activator-like effectors (TALEs) and an RNA-guided DNA-specific nuclease, Cas9, derived from the bacterial defense system against viruses that utilizes clustered regularly interspaced short palindromic repeats (CRISPR).

  13. BACs as tools for the study of genomic imprinting.

    PubMed

    Tunster, S J; Van De Pette, M; John, R M

    2011-01-01

    Genomic imprinting in mammals results in the expression of genes from only one parental allele. Imprinting occurs as a consequence of epigenetic marks set down either in the father's or the mother's germ line and affects a very specific category of mammalian gene. A greater understanding of this distinctive phenomenon can be gained from studies using large genomic clones, called bacterial artificial chromosomes (BACs). Here, we review the important applications of BACs to imprinting research, covering physical mapping studies and the use of BACs as transgenes in mice to study gene expression patterns, to identify imprinting centres, and to isolate the consequences of altered gene dosage. We also highlight the significant and unique advantages that rapid BAC engineering brings to genomic imprinting research.

  14. Comprehensive Resources for Tomato Functional Genomics Based on the Miniature Model Tomato Micro-Tom

    PubMed Central

    Matsukura, C; Aoki, K; Fukuda, N; Mizoguchi, T; Asamizu, E; Saito, T; Shibata, D; Ezura, H

    2008-01-01

    Tomato (Solanum lycopersicum L., Solanaceae) is an excellent model plant for genomic research of solanaceous plants, as well as for studying the development, ripening, and metabolism of fruit. In 2003, the International Solanaceae Project (SOL, www.sgn.cornell.edu) was initiated by members from more than 30 countries, and the tomato genome-sequencing project is currently underway. The genome sequence of tomato obtained by this project will provide a firm foundation for forthcoming genomic studies such as the comparative analysis of genes conserved among the Solanaceae species and the elucidation of the functions of unknown tomato genes. To exploit the wealth of the genome sequence information, there is an urgent need for novel resources and analytical tools for tomato functional genomics. Here, we present an overview of the development of genetic and genomic resources of tomato in the last decade, with a special focus on the activities of Japan SOL and the National Bio-Resource Project in the development of functional genomic resources of a model cultivar, Micro-Tom. PMID:19506732

  15. Development and characterization of rice mutants for functional genomic studies and breeding

    USDA-ARS's Scientific Manuscript database

    Mutagenesis is a powerful tool for creating genetic materials for studying functional genomics, breeding, and understanding the molecular basis of disease resistance. Approximately 100,000 putative mutants of rice (Oryza sativa L.) have been generated with mutagens. Numerous mutant genes involved in...

  16. Using functional genomics to identify molecular markers for fire blight resistance (Erwinia amylovora) in apple (Malus)

    USDA-ARS's Scientific Manuscript database

    Fire blight, caused by Erwinia amylovora (Ea), is a destructive disease of apple (Malus), pear (Pyrus) and some woody ornamentals in the rose family (Rosaceae). The goal of this project is to use a functional genomics approach to develop tools to breed fire blight resistant apples. Six hundred fifty...

  17. Cancer genomics object model: an object model for multiple functional genomics data for cancer research.

    PubMed

    Park, Yu Rang; Lee, Hye Won; Cho, Sung Bum; Kim, Ju Han

    2007-01-01

    The development of functional genomics including transcriptomics, proteomics and metabolomics allows us to monitor a large number of key cellular pathways simultaneously. Several technology-specific data models have been introduced for the representation of functional genomics experimental data, including the MicroArray Gene Expression-Object Model (MAGE-OM), the Proteomics Experiment Data Repository (PEDRo), and the Tissue MicroArray-Object Model (TMA-OM). Despite the increasing number of cancer studies using multiple functional genomics technologies, there is still no integrated data model for multiple functional genomics experimental and clinical data. We propose an object-oriented data model for cancer genomics research, Cancer Genomics Object Model (CaGe-OM). We reference four data models: Functional Genomic-Object Model, MAGE-OM, TMA-OM and PEDRo. The clinical and histopathological information models are created by analyzing cancer management workflow and referencing the College of American Pathology Cancer Protocols and National Cancer Institute Common Data Elements. The CaGe-OM provides a comprehensive data model for integrated storage and analysis of clinical and multiple functional genomics data.

  18. NCBI GEO: archive for high-throughput functional genomic data.

    PubMed

    Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Rudnev, Dmitry; Evangelista, Carlos; Kim, Irene F; Soboleva, Alexandra; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Muertter, Rolf N; Edgar, Ron

    2009-01-01

    The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest public repository for high-throughput gene expression data. Additionally, GEO hosts other categories of high-throughput functional genomic data, including those that examine genome copy number variations, chromatin structure, methylation status and transcription factor binding. These data are generated by the research community using high-throughput technologies like microarrays and, more recently, next-generation sequencing. The database has a flexible infrastructure that can capture fully annotated raw and processed data, enabling compliance with major community-derived scientific reporting standards such as 'Minimum Information About a Microarray Experiment' (MIAME). In addition to serving as a centralized data storage hub, GEO offers many tools and features that allow users to effectively explore, analyze and download expression data from both gene-centric and experiment-centric perspectives. This article summarizes the GEO repository structure, content and operating procedures, as well as recently introduced data mining features. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.

  19. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals.

    PubMed

    Nikolaichik, Yevgeny; Damienikan, Aliaksandr U

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a 'gene by gene' approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn't fit with regulatory information allowed us to correct product and gene names for over 300 loci.

  20. SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals

    PubMed Central

    Damienikan, Aliaksandr U.

    2016-01-01

    The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci. PMID:27257541

  1. Whole genome mapping as a fast-track tool to assess genomic stability of sequenced Staphylococcus aureus strains.

    PubMed

    Sabirova, Julia S; Xavier, Basil Britto; Ieven, Margareta; Goossens, Herman; Malhotra-Kumar, Surbhi

    2014-10-08

    Whole genome (optical) mapping (WGM), a state-of-the-art mapping technology based on the generation of high resolution restriction maps, has so far been used for typing clinical outbreak strains and for mapping de novo sequence contigs in genome sequencing projects. We employed WGM to assess the genomic stability of previously sequenced Staphylococcus aureus strains that are commonly used in laboratories as reference standards. S. aureus strains (n = 12) were mapped on the Argus™ Optical Mapping System (Opgen Inc, Gaithersburg, USA). Assembly of NcoI-restricted DNA molecules, visualization, and editing of whole genome maps were performed employing MapManager and MapSolver software (Opgen Inc). In silico whole genome NcoI-restricted maps were also generated from available sequence data, and compared to the laboratory-generated maps. Strains showing differences between the two maps were resequenced using Nextera XT DNA Sample Preparation Kit and Miseq Reagent Kit V2 (MiSeq, Illumina) and de novo assembled into sequence contigs using the Velvet assembly tool. Sequence data were correlated with corresponding whole genome maps to perform contig mapping and genome assembly using MapSolver. Of the twelve strains tested, one (USA300_FPR3757) showed a 19-kbp deletion on WGM compared to its in silico generated map and reference sequence data. Resequencing of USA300_FPR3757 identified the deleted fragment to be a 13 kbp-long integrative conjugative element ICE6013. Frequent subculturing and inter-laboratory transfers can induce genomic, and therefore phenotypic, changes that could compromise the utility of standard reference strains. WGM can thus be used as a rapid genome screening method to identify genomic rearrangements whose size and type can be confirmed by sequencing.
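
    The in silico NcoI-restricted map mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the MapSolver implementation: the function name, the toy sequence, and the simplification that a recognition site spanning the start/end junction of a circular sequence is ignored are all assumptions made for illustration. NcoI recognizes CCATGG and cuts after the first C.

```python
def in_silico_digest(sequence, site="CCATGG", cut_offset=1, circular=True):
    """Return the ordered fragment lengths produced by cutting at every `site`.

    cut_offset=1 models NcoI, which cuts C^CATGG (after the first base).
    Simplification: a site that spans the start/end junction of a circular
    sequence is not detected.
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    if not cuts:
        return [len(seq)]  # uncut molecule
    # Fragments between consecutive cut positions.
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    if circular:
        # The last cut wraps around to the first cut.
        fragments.append(len(seq) - cuts[-1] + cuts[0])
    else:
        fragments = [cuts[0]] + fragments + [len(seq) - cuts[-1]]
    return fragments


# Toy example (hypothetical sequence, not S. aureus data).
toy = "AAACCATGGTTTTTCCATGGAA"
print(in_silico_digest(toy, circular=False))
print(in_silico_digest(toy, circular=True))
```

    Comparing such a predicted fragment-length list with a laboratory-generated map is the step at which a deletion would appear as a missing or shortened fragment.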

  2. Improved criteria and comparative genomics tool provide new insights into grass paleogenomics.

    PubMed

    Salse, Jerome; Abrouk, Michael; Murat, Florent; Quraishi, Umar Masood; Feuillet, Catherine

    2009-11-01

    In the past decade, a number of bioinformatics tools have been developed to perform comparative genomics studies in plants and animals. However, most of the publicly available and user-friendly tools lack common standards for the identification of robust orthologous relationships between genomes, often leading non-specialists to overinterpret the results of large-scale comparative sequence analyses. Recently, we have established a number of improved parameters and tools to define significant relationships between genomes as a basis to develop paleogenomics studies in grasses. Here, we describe our approaches and propose the development of community-based standards that can be used in comparative genomic studies to (i) identify robust sets of orthologous gene pairs, (ii) derive complete sets of chromosome to chromosome relationships within and between genomes and (iii) model common paleo-ancestor genome structures. The rice and sorghum genome sequences are used to exemplify step-by-step a methodology that should allow users to perform accurate comparative genome analyses in their favourite species. Finally, we describe two applications for accurate gene annotation and synteny-based cloning of agronomically important traits.

  3. Population genomics as a new tool for wildlife management.

    PubMed

    Gompert, Zachariah

    2012-04-01

    Admixture and introgression have varied effects on population viability and fitness. Admixture might be an important source of new alleles, particularly for small, geographically isolated populations. However, admixture might also cause outbreeding depression if populations are adapted to different ecological or climatic conditions. Because of the emerging use of translocation and admixture as a conservation and wildlife management strategy to reduce genetic load (termed genetic rescue), the possible effects of admixture have practical consequences (Bouzat et al. 2009; Hedrick & Fredrickson 2010). Importantly, genetic load and local adaptation are properties of individual loci and epistatic interactions among loci rather than properties of genomes. Likewise, the outcome and consequences of genetic rescue depend on the fitness effects of individual introduced alleles. In this issue of Molecular Ecology, Miller et al. (2012) use model-based, population genomic analyses to document locus-specific effects of a recent genetic rescue in the bighorn sheep population within the National Bison Range wildlife refuge (NBR; Montana, USA). They find a subset of introduced alleles associated with increased fitness in NBR bighorn sheep, some of which experienced accelerated introgression following their introduction. These loci mark regions of the genome that could constitute the genetic basis of the successful NBR bighorn sheep genetic rescue. Although population genomic analyses are frequently used to study local adaptation and selection (e.g. Hohenlohe et al. 2010; Lawniczak et al. 2010), this study constitutes a novel application of this analytical framework for wildlife management. Moreover, the detailed demographic data available for the NBR bighorn sheep population provide a rare and powerful source of information and allow more robust population genomic inference than is often possible.

  4. Tool for rapid annotation of microbial SNPs (TRAMS): a simple program for rapid annotation of genomic variation in prokaryotes.

    PubMed

    Reumerman, Richard A; Tucker, Nicholas P; Herron, Paul R; Hoskisson, Paul A; Sangal, Vartul

    2013-09-01

    Next generation sequencing (NGS) has been widely used to study genomic variation in a variety of prokaryotes. Single nucleotide polymorphisms (SNPs) resulting from genomic comparisons need to be annotated for their functional impact on the coding sequences. We have developed a program, TRAMS, for functional annotation of genomic SNPs, which is available to download as a single-file executable for Windows users with limited computational experience and as a Python script for Mac OS and Linux users. TRAMS needs a tab-delimited text file containing SNP locations, reference nucleotide and SNPs in variant strains along with a reference genome sequence in GenBank or EMBL format. SNPs are annotated as synonymous, nonsynonymous or nonsense. Nonsynonymous SNPs in start and stop codons are separated as non-start and non-stop SNPs, respectively. SNPs in multiple overlapping features are annotated separately for each feature and multiple nucleotide polymorphisms within a codon are combined before annotation. We have also developed a workflow for Galaxy, a highly used tool for analysing NGS data, to map short reads to a reference genome and extract and annotate the SNPs. TRAMS is a simple program for rapid and accurate annotation of SNPs that will be very useful for microbiologists in analysing genomic diversity in microbial populations.
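
    The annotation categories listed in this abstract (synonymous, nonsynonymous, nonsense, non-start, non-stop) can be illustrated with a short Python sketch. This is not the TRAMS code: the function name `classify_snp`, its parameters, and the use of the standard genetic code on forward-strand codons only are assumptions for illustration. Alternative bacterial start codons and reverse-strand features are deliberately left out.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(ref_codon, pos_in_codon, alt_base, is_start_codon=False):
    """Classify a single-base change inside one forward-strand codon.

    pos_in_codon is 0, 1 or 2. is_start_codon is a hypothetical flag the
    caller sets when the codon is the first codon of a coding feature.
    """
    alt_codon = ref_codon[:pos_in_codon] + alt_base + ref_codon[pos_in_codon + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if is_start_codon:
        return "non-start"   # amino-acid change in the start codon
    if ref_aa == "*":
        return "non-stop"    # stop codon replaced by a sense codon
    if alt_aa == "*":
        return "nonsense"    # sense codon replaced by a stop codon
    return "nonsynonymous"


print(classify_snp("GAA", 2, "G"))  # GAA -> GAG, both glutamate
print(classify_snp("GAA", 0, "T"))  # GAA -> TAA, a stop codon
```

    Combining several changes within the same codon before classification, as the abstract describes, would amount to building `alt_codon` from all changed positions before the table lookup.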

  5. Public data and open source tools for multi-assay genomic investigation of disease

    PubMed Central

    Kannan, Lavanya; Ramos, Marcel; Re, Angela; El-Hachem, Nehme; Safikhani, Zhaleh; Gendoo, Deena M.A.; Davis, Sean; Gomez-Cabrero, David; Castelo, Robert; Hansen, Kasper D.; Carey, Vincent J.; Morgan, Martin; Culhane, Aedín C.; Haibe-Kains, Benjamin

    2016-01-01

    Molecular interrogation of a biological sample through DNA sequencing, RNA and microRNA profiling, proteomics and other assays, has the potential to provide a systems level approach to predicting treatment response and disease progression, and to developing precision therapies. Large publicly funded projects have generated extensive and freely available multi-assay data resources; however, bioinformatic and statistical methods for the analysis of such experiments are still nascent. We review multi-assay genomic data resources in the areas of clinical oncology, pharmacogenomics and other perturbation experiments, population genomics and regulatory genomics and other areas, and tools for data acquisition. Finally, we review bioinformatic tools that are explicitly geared toward integrative genomic data visualization and analysis. This review provides starting points for accessing publicly available data and tools to support development of needed integrative methods. PMID:26463000

  6. A Simulation Tool for the Study of Symmetric Inversions in Bacterial Genomes

    NASA Astrophysics Data System (ADS)

    Dias, Ulisses; Dias, Zanoni; Setubal, João C.

    We present the tool SIB that simulates genomic inversions in bacterial chromosomes. The tool simulates symmetric inversions but allows the appearance of nonsymmetric inversions by simulating small syntenic blocks frequently observed on bacterial genome comparisons. We evaluate SIB by comparing its results to real genome alignments. We develop measures that allow quantitative comparisons between real pairwise alignments (in terms of dotplots) and simulated ones. These measures allow an evaluation of SIB in terms of dendrograms. We evaluate SIB by comparing its results to whole chromosome alignments and maximum likelihood trees for three bacterial groups (the Pseudomonadaceae family and the Xanthomonas and Shewanella genera). We demonstrate an application of SIB by using it to evaluate the ancestral genome reconstruction tool MGR.

  7. GReEn: a tool for efficient compression of genome resequencing data.

    PubMed

    Pinho, Armando J; Pratas, Diogo; Garcia, Sara P

    2012-02-01

    Research in the genomic sciences is confronted with the volume of sequencing and resequencing data increasing at a higher pace than that of data storage and communication resources, shifting a significant part of research budgets from the sequencing component of a project to the computational one. Hence, being able to efficiently store sequencing and resequencing data is a problem of paramount importance. In this article, we describe GReEn (Genome Resequencing Encoding), a tool for compressing genome resequencing data using a reference genome sequence. It overcomes some drawbacks of the recently proposed tool GRS, namely, the possibility of compressing sequences that cannot be handled by GRS, faster running times and compression gains of over 100-fold for some sequences. This tool is freely available for non-commercial use at ftp://ftp.ieeta.pt/~ap/codecs/GReEn1.tar.gz.
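
    The general idea of compressing a resequenced genome against a reference can be illustrated with a deliberately simplified Python sketch. It is not the GReEn algorithm, whose internals this abstract does not describe: the sketch stores only substitutions between two equal-length sequences, and the function names are hypothetical.

```python
def encode_against_reference(reference, target):
    """Return (position, base) pairs where target differs from reference.

    Simplification: handles substitutions only, so both sequences must have
    the same length; insertions and deletions are out of scope here.
    """
    if len(reference) != len(target):
        raise ValueError("this sketch only supports equal-length sequences")
    return [(i, t) for i, (r, t) in enumerate(zip(reference, target)) if r != t]


def decode_with_reference(reference, diffs):
    """Rebuild the target sequence from the reference and the stored diffs."""
    seq = list(reference)
    for position, base in diffs:
        seq[position] = base
    return "".join(seq)


# Toy example (hypothetical sequences).
ref = "ACGTACGT"
tgt = "ACGAACGC"
diffs = encode_against_reference(ref, tgt)
print(diffs)
print(decode_with_reference(ref, diffs) == tgt)
```

    When two genomes of the same species differ at only a small fraction of positions, the list of differences is far shorter than the sequence itself, which is why a shared reference makes large compression gains possible.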

  8. High-resolution interrogation of functional elements in the noncoding genome

    PubMed Central

    Sanjana, Neville E.; Wright, Jason; Zheng, Kaijie; Shalem, Ophir; Fontanillas, Pierre; Joung, Julia; Cheng, Christine; Regev, Aviv; Zhang, Feng

    2016-01-01

    The noncoding genome affects gene regulation and disease, yet we lack tools for rapid identification and manipulation of noncoding elements. We develop a CRISPR screen employing ~18,000 sgRNAs targeting >700 kb surrounding the genes NF1, NF2, and CUL3, which are involved in BRAF inhibitor resistance in melanoma. We find that noncoding locations that modulate drug resistance also harbor predictive hallmarks of noncoding function. With a subset of regions at the CUL3 locus, we demonstrate that engineered mutations alter transcription factor occupancy and long-range and local epigenetic environments, implicating these sites in gene regulation and chemotherapeutic resistance. Through our expansion of the potential of pooled CRISPR screens, we provide tools for genomic discovery and for elucidating biologically relevant mechanisms of gene regulation. Pooled CRISPR mutagenesis identifies functional elements in the noncoding genome. PMID:27708104

  9. Characterizing genomic alterations in cancer by complementary functional associations

    PubMed Central

    Kim, J. W.; Botvinnik, O. B.; Abudayyeh, O.; Birger, C.; Rosenbluh, J.; Shrestha, Y.; Abazeed, M. E.; Hammerman, P. S.; DiCara, D.; Konieczkowski, D. J.; Johannessen, C. M.; Liberzon, A.; Alizad-Rahvar, A. R.; Alexe, G.; Aguirre, A.; Ghandi, M.; Greulich, H.; Vazquez, F.; Weir, B. A.; Van Allen, E. M.; Tsherniak, A.; Shao, D. D.; Zack, T. I.; Noble, M.; Getz, G.; Beroukhim, R.; Garraway, L. A.; Ardakani, M.; Romualdi, C.; Sales, G.; Barbie, D. A.; Boehm, J. S.; Hahn, W. C.; Mesirov, J. P.; Tamayo, P.

    2016-01-01

    Systematic efforts to sequence the cancer genome have identified large numbers of relevant mutations and copy number alterations in human cancers; however, elucidating their functional consequences, and their interactions to drive or maintain oncogenic states, is still a significant challenge. Here we introduce REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene-dependency of oncogenic pathways or the sensitivity to a drug treatment. We use REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes. PMID:27088724

  10. BambooGDB: a bamboo genome database with functional annotation and an analysis platform

    PubMed Central

    Zhao, Hansheng; Peng, Zhenhua; Fei, Benhua; Li, Lubin; Hu, Tao; Gao, Zhimin; Jiang, Zehui

    2014-01-01

    Bamboo, as one of the most important non-timber forest products and fastest-growing plants in the world, represents the only major lineage of grasses that is native to forests. Recent success in producing the first high-quality draft genome sequence of moso bamboo (Phyllostachys edulis) provides new insights into bamboo genetics and evolution. To further extend our understanding of the bamboo genome and facilitate future studies on the basis of previous achievements, here we have developed BambooGDB, a bamboo genome database with functional annotation and an analysis platform. The de novo sequencing data, together with the full-length complementary DNA and RNA-seq data of moso bamboo, make up the main contents of this database. Based on these sequence data, a comprehensive functional annotation of the bamboo genome was made. In addition, an analytical platform composed of comparative genomic analysis, protein–protein interaction networks, pathway analysis and visualization of genomic data was also constructed. As a set of discovery tools to understand and identify biological mechanisms of bamboo, the platform can be used as a systematic framework for helping to design experiments for further validation. Moreover, diverse and powerful search tools and a convenient browser were incorporated to facilitate the navigation of these data. As far as we know, this is the first genome database for bamboo. By integrating high-throughput sequencing data, a full functional annotation and several analysis modules, BambooGDB aims to provide researchers worldwide with a central genomic resource and an extensible analysis platform for the bamboo genome. BambooGDB is freely available at http://www.bamboogdb.org/. PMID:24602877

  11. BambooGDB: a bamboo genome database with functional annotation and an analysis platform.

    PubMed

    Zhao, Hansheng; Peng, Zhenhua; Fei, Benhua; Li, Lubin; Hu, Tao; Gao, Zhimin; Jiang, Zehui

    2014-01-01

    Bamboo, as one of the most important non-timber forest products and fastest-growing plants in the world, represents the only major lineage of grasses that is native to forests. Recent success in producing the first high-quality draft genome sequence of moso bamboo (Phyllostachys edulis) provides new insights into bamboo genetics and evolution. To further extend our understanding of the bamboo genome and facilitate future studies on the basis of previous achievements, here we have developed BambooGDB, a bamboo genome database with functional annotation and an analysis platform. The de novo sequencing data, together with the full-length complementary DNA and RNA-seq data of moso bamboo, make up the main contents of this database. Based on these sequence data, a comprehensive functional annotation of the bamboo genome was made. In addition, an analytical platform composed of comparative genomic analysis, protein-protein interaction networks, pathway analysis and visualization of genomic data was also constructed. As a set of discovery tools to understand and identify biological mechanisms of bamboo, the platform can be used as a systematic framework for helping to design experiments for further validation. Moreover, diverse and powerful search tools and a convenient browser were incorporated to facilitate the navigation of these data. As far as we know, this is the first genome database for bamboo. By integrating high-throughput sequencing data, a full functional annotation and several analysis modules, BambooGDB aims to provide researchers worldwide with a central genomic resource and an extensible analysis platform for the bamboo genome. BambooGDB is freely available at http://www.bamboogdb.org/.

  12. Novel tools for an old lineage: Population genomics for cycads.

    PubMed

    Cibrián-Jaramillo, Angelica; Marler, Thomas E

    2011-07-01

    With a ca. 300-million-year-old evolutionary history, cycads are often perceived as "living fossils," relicts of their previously widespread dominance. Patterns of genetic variation for a member of the most basal cycad genus, Cycas micronesica, support the notion that cycads are a dynamic group with ongoing diversification. Herein we hypothesize that cycads' hefty genomes enable rapid adaptive change and facilitate specific beneficial interactions with varying assemblages of symbionts. Characterizing population-level genomic patterns of cycads and their symbionts, pollinators in particular, will improve our understanding of these mechanisms and of the adaptive variation that underlies cycad evolution. In light of rapid climate and landscape change, cycads are a beacon for understanding the ecological processes that ultimately enable species' long-term survival.

  13. Genomics tools for the unraveling of chromosome architecture

    PubMed Central

    van Steensel, Bas; Dekker, Job

    2010-01-01

    The spatial organization of chromosomes inside the cell nucleus is still poorly understood. This organization is guided by intra- and interchromosomal contacts and by interactions of specific chromosomal loci with relatively fixed nuclear “landmarks” such as the nuclear envelope and the nucleolus. New molecular genome-wide mapping techniques have begun to uncover both types of molecular interactions, providing insights into the fundamental principles of interphase chromosome folding. PMID:20944601

  14. Partnering for functional genomics research conference: Abstracts of poster presentations

    SciTech Connect

    1998-06-01

    This report contains abstracts of poster presentations presented at the Functional Genomics Research Conference held April 16-17, 1998 in Oak Ridge, Tennessee. Attention is focused on the following areas: mouse mutagenesis and genomics; phenotype screening; gene expression analysis; DNA analysis technology development; bioinformatics; comparative analyses of mouse, human, and yeast sequences; and pilot projects to evaluate methodologies.

  15. Genetic screens and functional genomics using CRISPR/Cas9 technology.

    PubMed

    Hartenian, Ella; Doench, John G

    2015-04-01

    Functional genomics attempts to understand the genome by perturbing the flow of information from DNA to RNA to protein, in order to learn how gene dysfunction leads to disease. CRISPR/Cas9 technology is the newest tool in the geneticist's toolbox, allowing researchers to edit DNA with unprecedented ease, speed and accuracy, and representing a novel means to perform genome-wide genetic screens to discover gene function. In this review, we first summarize the discovery and characterization of CRISPR/Cas9, and then compare it to other genome engineering technologies. We discuss its initial use in screening applications, with a focus on optimizing on-target activity and minimizing off-target effects. Finally, we comment on future challenges and opportunities afforded by this technology.

  16. Whole genome scanning as a cytogenetic tool in hematologic malignancies

    PubMed Central

    Mufti, Ghulam J.

    2008-01-01

    Over the years, methods of cytogenetic analysis evolved and became part of routine laboratory testing, providing valuable diagnostic and prognostic information in hematologic disorders. Karyotypic aberrations contribute to the understanding of the molecular pathogenesis of disease and thereby to rational application of therapeutic modalities. Most of the progress in this field stems from the application of metaphase cytogenetics (MC), but recently, novel molecular technologies have been introduced that complement MC and overcome many of the limitations of traditional cytogenetics, including a need for cell culture. Whole genome scanning using comparative genomic hybridization and single nucleotide polymorphism arrays (CGH-A; SNP-A) can be used for analysis of somatic or clonal unbalanced chromosomal defects. In SNP-A, the combination of copy number detection and genotyping enables diagnosis of copy-neutral loss of heterozygosity, a lesion that cannot be detected using MC but may have important pathogenetic implications. Overall, whole genome scanning arrays, despite the drawback of an inability to detect balanced translocations, allow for discovery of chromosomal defects in a higher proportion of patients with hematologic malignancies. Newly detected chromosomal aberrations, including somatic uniparental disomy, may lead to more precise prognostic schemes in many diseases. PMID:18505780

  17. Electric fish genomics: Progress, prospects, and new tools for neuroethology.

    PubMed

    Pitchers, William R; Constantinou, Savvas J; Losilla, Mauricio; Gallant, Jason R

    2016-10-01

    Electric fish have served as a model system in biology since the 18th century, providing deep insight into the nature of bioelectrogenesis, the molecular structure of the synapse, and brain circuitry underlying complex behavior. Neuroethologists have collected extensive phenotypic data that span biological levels of analysis from molecules to ecosystems. These phenotypic data, together with genomic resources obtained over the past decades, have motivated new and exciting hypotheses that position the weakly electric fish model to address fundamental 21st-century biological questions. This review article considers the molecular data collected for weakly electric fish over the past three decades, and the insights that data of this nature have motivated. For readers relatively new to molecular genetics techniques, we also provide a table of terminology aimed at clarifying the numerous acronyms and techniques that accompany this field. Next, we pose a research agenda for expanding genomic resources for electric fish research over the next 10 years. We conclude by considering some of the exciting research prospects for neuroethology that electric fish genomics may offer over the coming decades, if the electric fish community is successful in these endeavors.

  18. Function-selective domain architecture plasticity potentials in eukaryotic genome evolution.

    PubMed

    Linkeviciute, Viktorija; Rackham, Owen J L; Gough, Julian; Oates, Matt E; Fang, Hai

    2015-12-01

    To help evaluate how protein function impacts genome evolution, we introduce a new concept of 'architecture plasticity potential' (the capacity to form distinct domain architectures), either for an individual domain or, more generally, for a set of domains grouped by shared function. We devise a scoring metric to measure the plasticity potential for these domain sets, and evaluate how function has changed over time for different species. Applying this metric to a phylogenetic tree of eukaryotic genomes, we find that the involvement of each function is not random but highly selective. For certain lineages there is strong bias for evolution to involve domains related to certain functions. In general, eukaryotic genomes, particularly those of animals, expand complex functional activities such as signalling and regulation, but at the cost of reducing metabolic processes. We also observe differential evolution of transcriptional regulation and a unique evolutionary role of channel regulators; crucially, this is only observable in terms of the architecture plasticity potential. Our findings provide a new layer of information to understand the significance of function in eukaryotic genome evolution. A web search tool, available at http://supfam.org/Pevo, offers a wide spectrum of options for exploring functional importance in eukaryotic genome evolution.

  19. Modeling Chromosomes in Mouse to Explore the Function of Genes, Genomic Disorders, and Chromosomal Organization

    PubMed Central

    Brault, Véronique; Pereira, Patricia; Duchon, Arnaud; Hérault, Yann

    2006-01-01

    One of the challenges of genomic research after the completion of the human genome project is to assign a function to all the genes and to understand their interactions and organizations. Among the various techniques, the emergence of chromosome engineering tools with the aim to manipulate large genomic regions in the mouse model offers a powerful way to accelerate the discovery of gene functions and provides more mouse models to study normal and pathological developmental processes associated with aneuploidy. The combination of gene targeting in ES cells, recombinase technology, and other techniques makes it possible to generate new chromosomes carrying specific and defined deletions, duplications, inversions, and translocations that are accelerating functional analysis. This review presents the current status of chromosome engineering techniques and discusses the different applications as well as the implication of these new techniques in future research to better understand the function of chromosomal organization and structures. PMID:16839184

  20. Cost-effective cloud computing: a case study using the comparative genomics tool, roundup.

    PubMed

    Kudtarkar, Parul; Deluca, Todd F; Fusaro, Vincent A; Tonellato, Peter J; Wall, Dennis P

    2010-12-22

    Comparative genomics resources, such as ortholog detection tools and repositories, are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource, Roundup, using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon's Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon's computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure.
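
    The idea of ordering jobs by estimated runtime before submitting them to a cluster can be illustrated with a small Python sketch. It does not reproduce the authors' runtime model or submission strategy, which this abstract does not specify: it uses a common greedy heuristic (longest estimated job first, each job assigned to the currently least-loaded node), and the function name, job identifiers and runtimes are hypothetical.

```python
import heapq


def schedule_longest_first(jobs, n_nodes):
    """Assign jobs to nodes greedily, longest estimated runtime first.

    jobs: list of (job_id, estimated_runtime) pairs.
    Returns (assignment, makespan), where assignment maps a node index to
    its list of job ids and makespan is the largest total load on any node.
    """
    # Min-heap of (current_load, node_index): the least-loaded node pops first.
    heap = [(0.0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    assignment = {node: [] for node in range(n_nodes)}
    for job_id, runtime in sorted(jobs, key=lambda job: job[1], reverse=True):
        load, node = heapq.heappop(heap)
        assignment[node].append(job_id)
        heapq.heappush(heap, (load + runtime, node))
    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Toy example: five genome-pair comparisons with made-up runtime estimates.
jobs = [("a", 5), ("b", 3), ("c", 3), ("d", 2), ("e", 2)]
assignment, makespan = schedule_longest_first(jobs, n_nodes=2)
print(assignment, makespan)
```

    Submitting long jobs first tends to prevent a single long comparison from starting late and leaving the other nodes idle, which is one way that ordering by predicted runtime can reduce billed cluster time.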

  1. Cost-Effective Cloud Computing: A Case Study Using the Comparative Genomics Tool, Roundup

    PubMed Central

    Kudtarkar, Parul; DeLuca, Todd F.; Fusaro, Vincent A.; Tonellato, Peter J.; Wall, Dennis P.

    2010-01-01

    Background: Comparative genomics resources, such as ortholog detection tools and repositories, are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource, Roundup, using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Methods: Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon’s Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. Results: We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon’s computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure.

  2. Kvik: three-tier data exploration tools for flexible analysis of genomic data in epidemiological studies

    PubMed Central

    Fjukstad, Bjørn; Standahl Olsen, Karina; Jareid, Mie; Lund, Eiliv; Bongo, Lars Ailo

    2015-01-01

    Kvik is an open-source framework that we developed for explorative analysis of functional genomics data from large epidemiological studies. Creating such studies requires a significant amount of time and resources. It is therefore usual to reuse the data from one study for several research projects. Often each project requires implementing new analysis code, integration with specific knowledge bases, and specific visualizations. Although existing data exploration tools are available for single study data exploration, no tool provides all the required functionality for multistudy data exploration. We have therefore used the Kvik framework to develop Kvik Pathways, an application for exploring gene expression data in the context of biological pathways. We have used Kvik Pathways to explore data from both a cross-sectional study design and a case-control study within the Norwegian Women and Cancer (NOWAC) cohort. Kvik Pathways follows the three-tier architecture in web applications using a powerful back-end for statistical analyses and retrieval of metadata. In this note, we describe how we used the Kvik framework to develop the Kvik Pathways application. Kvik Pathways was used by our team of epidemiologists to explore gene expression data from healthy women with high and low plasma ratios of essential fatty acids. PMID:26425340

  3. Kvik: three-tier data exploration tools for flexible analysis of genomic data in epidemiological studies.

    PubMed

    Fjukstad, Bjørn; Standahl Olsen, Karina; Jareid, Mie; Lund, Eiliv; Bongo, Lars Ailo

    2015-01-01

    Kvik is an open-source framework that we developed for explorative analysis of functional genomics data from large epidemiological studies. Creating such studies requires a significant amount of time and resources. It is therefore usual to reuse the data from one study for several research projects. Often each project requires implementing new analysis code, integration with specific knowledge bases, and specific visualizations. Although existing data exploration tools are available for single study data exploration, no tool provides all the required functionality for multistudy data exploration. We have therefore used the Kvik framework to develop Kvik Pathways, an application for exploring gene expression data in the context of biological pathways. We have used Kvik Pathways to explore data from both a cross-sectional study design and a case-control study within the Norwegian Women and Cancer (NOWAC) cohort. Kvik Pathways follows the three-tier architecture in web applications using a powerful back-end for statistical analyses and retrieval of metadata. In this note, we describe how we used the Kvik framework to develop the Kvik Pathways application. Kvik Pathways was used by our team of epidemiologists to explore gene expression data from healthy women with high and low plasma ratios of essential fatty acids.

  4. Decoding the ecological function of accessory genome

    USDA-ARS's Scientific Manuscript database

    Shiga toxin-producing Escherichia coli O157:H7 primarily resides in cattle asymptomatically, and can be transmitted to humans through food. A study by Lupolova et al applied a machine-learning approach to complex pan-genome information and predicted that only a small subset of bovine isolates have t...

  5. Emerging Genomic Tools for Legume Breeding: Current Status and Future Prospects.

    PubMed

    Pandey, Manish K; Roorkiwal, Manish; Singh, Vikas K; Ramalingam, Abirami; Kudapa, Himabindu; Thudi, Mahendar; Chitikineni, Anu; Rathore, Abhishek; Varshney, Rajeev K

    2016-01-01

    Legumes play a vital role in ensuring global nutritional food security and improving soil quality through nitrogen fixation. Accelerated, higher genetic gains are required to meet the demand of an ever-increasing global population. In recent years, speedy developments have been witnessed in legume genomics due to advancements in next-generation sequencing (NGS) and high-throughput genotyping technologies. Reference genome sequences for many legume crops have been reported in the last 5 years. The availability of the draft genome sequences and re-sequencing of elite genotypes for several important legume crops have made it possible to identify structural variations at large scale. Availability of large-scale genomic resources and low-cost and high-throughput genotyping technologies are enhancing the efficiency and resolution of genetic mapping and marker-trait association studies. Most importantly, deployment of molecular breeding approaches has resulted in development of improved lines in some legume crops such as chickpea and groundnut. In order to support genomics-driven crop improvement at a fast pace, the deployment of breeder-friendly genomics and decision support tools appears to be critical in breeding programs in developing countries. This review provides an overview of emerging genomics and informatics tools/approaches that will be the key driving force for accelerating genomics-assisted breeding and ultimately ensuring nutritional and food security in developing countries.

  6. Emerging Genomic Tools for Legume Breeding: Current Status and Future Prospects

    PubMed Central

    Pandey, Manish K.; Roorkiwal, Manish; Singh, Vikas K.; Ramalingam, Abirami; Kudapa, Himabindu; Thudi, Mahendar; Chitikineni, Anu; Rathore, Abhishek; Varshney, Rajeev K.

    2016-01-01

    Legumes play a vital role in ensuring global nutritional food security and improving soil quality through nitrogen fixation. Accelerated, higher genetic gains are required to meet the demand of an ever-increasing global population. In recent years, speedy developments have been witnessed in legume genomics due to advancements in next-generation sequencing (NGS) and high-throughput genotyping technologies. Reference genome sequences for many legume crops have been reported in the last 5 years. The availability of the draft genome sequences and re-sequencing of elite genotypes for several important legume crops have made it possible to identify structural variations at large scale. Availability of large-scale genomic resources and low-cost and high-throughput genotyping technologies are enhancing the efficiency and resolution of genetic mapping and marker-trait association studies. Most importantly, deployment of molecular breeding approaches has resulted in development of improved lines in some legume crops such as chickpea and groundnut. In order to support genomics-driven crop improvement at a fast pace, the deployment of breeder-friendly genomics and decision support tools appears to be critical in breeding programs in developing countries. This review provides an overview of emerging genomics and informatics tools/approaches that will be the key driving force for accelerating genomics-assisted breeding and ultimately ensuring nutritional and food security in developing countries. PMID:27199998

  7. Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam

    SciTech Connect

    Chai, Juanjuan; Kora, Guruprasad; Ahn, Tae-Hyuk; Hyatt, Doug; Pan, Chongle

    2014-01-01

    To supply some background, phylogenetic studies have provided detailed knowledge on the evolutionary mechanisms of genes and species in Bacteria and Archaea. However, the evolution of cellular functions, represented by metabolic pathways and biological processes, has not been systematically characterized. Many clades in the prokaryotic tree of life have now been covered by sequenced genomes in GenBank. This enables a large-scale functional phylogenomics study of many computationally inferred cellular functions across all sequenced prokaryotes. Our results show a total of 14,727 GenBank prokaryotic genomes were re-annotated using a new protein family database, UniFam, to obtain consistent functional annotations for accurate comparison. The functional profile of a genome was represented by the biological process Gene Ontology (GO) terms in its annotation. The GO term enrichment analysis differentiated the functional profiles between selected archaeal taxa. 706 prokaryotic metabolic pathways were inferred from these genomes using Pathway Tools and MetaCyc. The consistency between the distribution of metabolic pathways in the genomes and the phylogenetic tree of the genomes was measured using parsimony scores and retention indices. The ancestral functional profiles at the internal nodes of the phylogenetic tree were reconstructed to track the gains and losses of metabolic pathways in evolutionary history. In conclusion, our functional phylogenomics analysis shows divergent functional profiles of taxa and clades. Such function-phylogeny correlation stems from a set of clade-specific cellular functions with low parsimony scores. On the other hand, many cellular functions are sparsely dispersed across many clades with high parsimony scores. These different types of cellular functions have distinct evolutionary patterns reconstructed from the prokaryotic tree.
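
    The parsimony score and retention index mentioned above can be illustrated for a single presence/absence character. The sketch below uses Fitch's small-parsimony algorithm on a binary tree written as nested 2-tuples of leaf names; the tree, the leaf names, and the function names are hypothetical, and the abstract does not say which parsimony implementation the authors used.

```python
def fitch_steps(tree, states):
    """Return (candidate state set, minimum changes) for a rooted binary tree."""
    if isinstance(tree, str):          # leaf: its observed state, zero changes
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:                         # children agree: no extra change needed
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def retention_index(tree, states):
    """RI = (g - s) / (g - m) for one unordered character."""
    _, observed = fitch_steps(tree, states)              # s: steps on this tree
    counts = {}
    for value in states.values():
        counts[value] = counts.get(value, 0) + 1
    min_steps = len(counts) - 1                          # m: fewest possible changes
    max_steps = len(states) - max(counts.values())       # g: changes on a star tree
    if max_steps == min_steps:
        return None                                      # character is uninformative
    return (max_steps - observed) / (max_steps - min_steps)


# Hypothetical four-genome tree and two pathway presence/absence patterns.
TREE = (("A", "B"), ("C", "D"))
CLADE_SPECIFIC = {"A": 1, "B": 1, "C": 0, "D": 0}
SCATTERED = {"A": 1, "B": 0, "C": 1, "D": 0}
```

    On this toy tree the clade-specific pattern needs one change (retention index 1.0), while the scattered pattern needs two (retention index 0.0), which matches the contrast the abstract draws between low and high parsimony scores.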

  8. Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam

    DOE PAGES

    Chai, Juanjuan; Kora, Guruprasad; Ahn, Tae-Hyuk; ...

    2014-01-01

    To supply some background, phylogenetic studies have provided detailed knowledge on the evolutionary mechanisms of genes and species in Bacteria and Archaea. However, the evolution of cellular functions, represented by metabolic pathways and biological processes, has not been systematically characterized. Many clades in the prokaryotic tree of life have now been covered by sequenced genomes in GenBank. This enables a large-scale functional phylogenomics study of many computationally inferred cellular functions across all sequenced prokaryotes. Our results show a total of 14,727 GenBank prokaryotic genomes were re-annotated using a new protein family database, UniFam, to obtain consistent functional annotations for accurate comparison. The functional profile of a genome was represented by the biological process Gene Ontology (GO) terms in its annotation. The GO term enrichment analysis differentiated the functional profiles between selected archaeal taxa. 706 prokaryotic metabolic pathways were inferred from these genomes using Pathway Tools and MetaCyc. The consistency between the distribution of metabolic pathways in the genomes and the phylogenetic tree of the genomes was measured using parsimony scores and retention indices. The ancestral functional profiles at the internal nodes of the phylogenetic tree were reconstructed to track the gains and losses of metabolic pathways in evolutionary history. In conclusion, our functional phylogenomics analysis shows divergent functional profiles of taxa and clades. Such function-phylogeny correlation stems from a set of clade-specific cellular functions with low parsimony scores. On the other hand, many cellular functions are sparsely dispersed across many clades with high parsimony scores. These different types of cellular functions have distinct evolutionary patterns reconstructed from the prokaryotic tree.

  9. Human genome project: New tools for tomorrow's health research

    SciTech Connect

    Not Available

    1990-01-01

    The Human Genome Project is discussed within the context of the benefits that can be derived from human biological and biomedical research in the 21st century. Explanations are given about what chromosome mapping is and the different kinds that exist. Next, model organisms that were developed to better understand and interpret genetic information are discussed, as well as the database that the project will develop and the training opportunities that such a database will afford to emerging technologies. Finally, a review is given of the personal and social implications stemming from greater genetic knowledge.

  10. Chemical genomics for studying parasite gene function and interaction

    PubMed Central

    Li, Jian; Yuan, Jing; Chen, Chin-chien; Inglese, James; Su, Xin-zhuan

    2013-01-01

    With the development of new technologies in genome sequencing, gene expression profiling, genotyping, and high-throughput screening of chemical compound libraries, small molecules are playing increasingly important roles in studying gene expression regulation, gene-gene interaction, and gene function. Here we briefly review and discuss some recent advancements in drug target identification and phenotype characterization using combinations of high-throughput screening of small-molecule libraries and various genome-wide methods such as whole genome sequencing, genome-wide association studies, and genome-wide expressional analysis. These approaches can be used to search for new drugs against parasitic infections, to identify drug targets or drug-resistance genes, and to infer gene function. PMID:24215777

  11. Characterizing genomic alterations in cancer by complementary functional associations | Office of Cancer Genomics

    Cancer.gov

    Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment.

  12. The capsicum transcriptome DB: a “hot” tool for genomic research

    PubMed Central

    Góngora-Castillo, Elsa; Fajardo-Jaime, Rubén; Fernández-Cortes, Araceli; Jofre-Garfias, Alba E; Lozoya-Gloria, Edmundo; Martínez, Octavio; Ochoa-Alejo, Neftalí; Rivera-Bustamante, Rafael

    2012-01-01

    Chili pepper (Capsicum annuum) is an economically important crop with no available public genome sequence. We describe a genomic resource to facilitate Capsicum annuum research. A collection of Expressed Sequence Tags (ESTs) derived from five C. annuum organs (root, stem, leaf, flower and fruit) was sequenced using the Sanger method, and multiple leaf transcriptomes were deeply sampled using GS-pyrosequencing. A hybrid assembly of 1,324,516 raw reads yielded 32,314 high quality contigs as validated by coverage and identity analysis with existing pepper sequences. Overall, 75.5% of the contigs had significant sequence similarity to entries in nucleic acid and protein databases; 23% of the sequences have not been previously reported for C. annuum and expand sequence resources for this species. A MySQL database and a user-friendly Web interface were constructed with search tools that permit queries of the ESTs including sequence, functional annotation, Gene Ontology classification, metabolic pathways, and assembly information. The Capsicum Transcriptome DB is freely available from http://www.bioingenios.ira.cinvestav.mx:81/Joomla/ PMID:22359434

  13. Genomics: Tool to predict and prevent male infertility.

    PubMed

    Halder, Ashutosh; Kumar, Prashant; Jain, Manish; Kalsi, Amanpreet Kaur

    2017-06-01

    A large number of human diseases arise as a result of genetic abnormalities. With the advent of improved molecular biology techniques, the proportion of male infertility attributed to genetic causes is increasing. The common genetic factors responsible for male infertility are chromosomal abnormalities, Yq microdeletion and cystic fibrosis. These are responsible for approximately 30 percent of cases of male infertility. About 40 percent of cases of male infertility are categorized as idiopathic. These cases may be associated with genetic and genomic abnormalities. During the last few years, more and more genes have been implicated in male infertility, leading to a decline in the prevalence of idiopathic etiology. In this review, we summarize up-to-date published work on genetic etiologies of male infertility, including our own. We also briefly describe reproductive technologies used to overcome male infertility, the dangers of transmitting genetic disorders to offspring, and ways to prevent transmission of genetic disorders during assisted reproduction. Finally, we provide our views on how genomic information can be utilized for prediction and prevention of male infertility in the coming years.

  14. Iterative ACORN as a high throughput tool in structural genomics.

    PubMed

    Selvanayagam, S; Velmurugan, D; Yamane, T

    2006-08-01

    High-throughput macromolecular structure determination is essential in structural genomics, as the amount of available sequence information far exceeds the number of available 3D structures. ACORN, a freely available resource in the CCP4 suite of programs, is a comprehensive and efficient program for phasing in the determination of protein structures when atomic resolution data are available. ACORN, together with the automatic model-building program ARP/wARP and the refinement program REFMAC, is a suitable combination for high-throughput structural genomics. ACORN can also be run with secondary structural elements such as helices and sheets as inputs with high-resolution data. In situations where ACORN phasing is not sufficient for building the protein model, the fragments (incomplete model/dummy atoms) can again be used as a starting input. Iterative ACORN is shown to work efficiently in the subsequent model-building stages for congerin (PDB-ID: 1is3) and catalase (PDB-ID: 1gwe), for which models are available.

  15. Defining functional DNA elements in the human genome.

    PubMed

    Kellis, Manolis; Wold, Barbara; Snyder, Michael P; Bernstein, Bradley E; Kundaje, Anshul; Marinov, Georgi K; Ward, Lucas D; Birney, Ewan; Crawford, Gregory E; Dekker, Job; Dunham, Ian; Elnitski, Laura L; Farnham, Peggy J; Feingold, Elise A; Gerstein, Mark; Giddings, Morgan C; Gilbert, David M; Gingeras, Thomas R; Green, Eric D; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D; Myers, Richard M; Pazin, Michael J; Ren, Bing; Stamatoyannopoulos, John A; Weng, Zhiping; White, Kevin P; Hardison, Ross C

    2014-04-29

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.

  16. Defining functional DNA elements in the human genome

    PubMed Central

    Kellis, Manolis; Wold, Barbara; Snyder, Michael P.; Bernstein, Bradley E.; Kundaje, Anshul; Marinov, Georgi K.; Ward, Lucas D.; Birney, Ewan; Crawford, Gregory E.; Dekker, Job; Dunham, Ian; Elnitski, Laura L.; Farnham, Peggy J.; Feingold, Elise A.; Gerstein, Mark; Giddings, Morgan C.; Gilbert, David M.; Gingeras, Thomas R.; Green, Eric D.; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D.; Myers, Richard M.; Pazin, Michael J.; Ren, Bing; Stamatoyannopoulos, John A.; Weng, Zhiping; White, Kevin P.; Hardison, Ross C.

    2014-01-01

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease. PMID:24753594

  17. DivStat: A User-Friendly Tool for Single Nucleotide Polymorphism Analysis of Genomic Diversity

    PubMed Central

    Soares, Inês; Moleirinho, Ana; Oliveira, Gonçalo N. P.; Amorim, António

    2015-01-01

    Recent developments have led to an enormous increase of publicly available large genomic data, including complete genomes. The 1000 Genomes Project was a major contributor, releasing the results of sequencing a large number of individual genomes, and allowing for a myriad of large scale studies on human genetic variation. However, the tools currently available are insufficient when the goal concerns some analyses of data sets encompassing more than hundreds of base pairs and when considering haplotype sequences of single nucleotide polymorphisms (SNPs). Here, we present a new and potent tool to deal with large data sets allowing the computation of a variety of summary statistics of population genetic data, increasing the speed of data analysis. PMID:25756185
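
    As a rough illustration of what a summary statistic over SNP haplotypes looks like, the sketch below computes the number of segregating sites and the per-site nucleotide diversity from aligned haplotype strings. The abstract does not list which statistics DivStat computes, so these two are assumed examples, and the function names and sample haplotypes are hypothetical.

```python
from itertools import combinations


def segregating_sites(haplotypes):
    # A site is segregating if more than one allele appears in its column.
    return sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)


def nucleotide_diversity(haplotypes):
    # Mean number of pairwise differences, divided by the sequence length.
    pairs = list(combinations(haplotypes, 2))
    total_diffs = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total_diffs / len(pairs) / len(haplotypes[0])


# Hypothetical aligned haplotypes of equal length.
SAMPLE = ["ACGT", "ACGA", "TCGA"]
```

    For the three sample haplotypes, two of the four sites are segregating and the pairwise differences are 1, 2 and 1, giving a per-site diversity of one third.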

  18. Bioinformatics tools and databases for whole genome sequence analysis of Mycobacterium tuberculosis.

    PubMed

    Faksri, Kiatichai; Tan, Jun Hao; Chaiprasert, Angkana; Teo, Yik-Ying; Ong, Rick Twee-Hee

    2016-11-01

    Tuberculosis (TB) is an infectious disease of global public health importance caused by Mycobacterium tuberculosis complex (MTC) in which M. tuberculosis (Mtb) is the major causative agent. Recent advancements in genomic technologies such as next generation sequencing have enabled high throughput cost-effective generation of whole genome sequence information from Mtb clinical isolates, providing new insights into the evolution, genomic diversity and transmission of the Mtb bacteria, including molecular mechanisms of antibiotic resistance. The large volume of sequencing data generated however necessitated effective and efficient management, storage, analysis and visualization of the data and results through development of novel and customized bioinformatics software tools and databases. In this review, we aim to provide a comprehensive survey of the current freely available bioinformatics software tools and publicly accessible databases for genomic analysis of Mtb for identifying disease transmission in molecular epidemiology and in rapid determination of the antibiotic profiles of clinical isolates for prompt and optimal patient treatment.

  19. Predictive computation of genomic logic processing functions in embryonic development

    PubMed Central

    Peter, Isabelle S.; Faure, Emmanuel; Davidson, Eric H.

    2012-01-01

    Gene regulatory networks (GRNs) control the dynamic spatial patterns of regulatory gene expression in development. Thus, in principle, GRN models may provide system-level, causal explanations of developmental process. To test this assertion, we have transformed a relatively well-established GRN model into a predictive, dynamic Boolean computational model. This Boolean model computes spatial and temporal gene expression according to the regulatory logic and gene interactions specified in a GRN model for embryonic development in the sea urchin. Additional information input into the model included the progressive embryonic geometry and gene expression kinetics. The resulting model predicted gene expression patterns for a large number of individual regulatory genes each hour up to gastrulation (30 h) in four different spatial domains of the embryo. Direct comparison with experimental observations showed that the model predictively computed these patterns with remarkable spatial and temporal accuracy. In addition, we used this model to carry out in silico perturbations of regulatory functions and of embryonic spatial organization. The model computationally reproduced the altered developmental functions observed experimentally. Two major conclusions are that the starting GRN model contains sufficiently complete regulatory information to permit explanation of a complex developmental process of gene expression solely in terms of genomic regulatory code, and that the Boolean model provides a tool with which to test in silico regulatory circuitry and developmental perturbations. PMID:22927416
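
    The idea of a dynamic Boolean model with in silico perturbation can be sketched on a toy circuit. The three genes, their logic rules, and the synchronous update scheme below are hypothetical and are not taken from the sea urchin GRN; the published model also tracks spatial domains and expression kinetics, which this sketch omits.

```python
def step(state, rules):
    # Synchronous update: every gene reads the state of the previous time point.
    return {gene: bool(rule(state)) for gene, rule in rules.items()}


def simulate(initial, rules, n_steps):
    trajectory = [dict(initial)]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1], rules))
    return trajectory


# Hypothetical circuit: a constant input activates a gene that is shut off
# again by a repressor it induces.
RULES = {
    "input": lambda s: s["input"],
    "activator": lambda s: s["input"] and not s["repressor"],
    "repressor": lambda s: s["activator"],
}

# In silico perturbation: same circuit with the repressor knocked out.
KNOCKOUT = dict(RULES, repressor=lambda s: False)

START = {"input": True, "activator": False, "repressor": False}
```

    Running `simulate(START, RULES, 3)` turns the activator on at step 1 and off again by step 3 once the repressor is expressed, whereas `simulate(START, KNOCKOUT, 3)` leaves the activator on, which is the kind of altered outcome a perturbation experiment would be compared against.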

  20. Gnome View: A tool for visual representation of human genome data

    SciTech Connect

    Pelkey, J.E.; Thomas, G.S.; Thurman, D.A.; Lortz, V.B.; Douthart, R.J.

    1993-02-01

    GnomeView is a tool for exploring data generated by the Human Genome Project. GnomeView provides both graphical and textual styles of data presentation, employs an intuitive window-based graphical query interface, and integrates its underlying genome databases in such a way that the user can navigate smoothly across databases and between different levels of data. This paper describes GnomeView and discusses how it addresses various genome informatics issues.

  2. Conservation and functional element discovery in 20 angiosperm plant genomes.

    PubMed

    Hupalo, Daniel; Kern, Andrew D

    2013-07-01

    Here, we describe the construction of a phylogenetically deep, whole-genome alignment of 20 flowering plants, along with an analysis of plant genome conservation. Each included angiosperm genome was aligned to a reference genome, Arabidopsis thaliana, using the LASTZ/MULTIZ paradigm and tools from the University of California-Santa Cruz Genome Browser source code. In addition to the multiple alignment, we created a local genome browser displaying multiple tracks of newly generated genome annotation, as well as annotation sourced from published data of other research groups. An investigation into A. thaliana gene features present in the aligned A. lyrata genome revealed better conservation of start codons, stop codons, and splice sites within our alignments (51% of features from A. thaliana conserved without interruption in A. lyrata) when compared with previous publicly available plant pairwise alignments (34% of features conserved). The detailed view of conservation across angiosperms revealed not only high coding-sequence conservation but also a large set of previously uncharacterized intergenic conservation. From this, we annotated the collection of conserved features, revealing dozens of putative noncoding RNAs, including some with recorded small RNA expression. Comparing conservation between kingdoms revealed a faster decay of vertebrate genome features when compared with angiosperm genomes. Finally, conserved sequences were searched for folding RNA features, including but not limited to noncoding RNA (ncRNA) genes. Among these, we highlight a double hairpin in the 5'-untranslated region (5'-UTR) of the PRIN2 gene and a putative ncRNA with homology targeting the LAF3 protein.

  3. Coordinated international action to accelerate genome-to-phenome with FAANG, The Functional Annotation of Animal Genomes project

    USDA-ARS's Scientific Manuscript database

    We describe the organization of a nascent international effort - the "Functional Annotation of ANimal Genomes" project - whose aim is to produce comprehensive maps of functional elements in the genomes of domesticated animal species....

  4. Functional coverage of the human genome by existing structures, structural genomics targets, and homology models.

    PubMed

    Xie, Lei; Bourne, Philip E

    2005-08-01

    The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  5. Genomic Tools in Pearl Millet Breeding for Drought Tolerance: Status and Prospects.

    PubMed

    Serba, Desalegn D; Yadav, Rattan S

    2016-01-01

    Pearl millet [Pennisetum glaucum (L.) R. Br.] is a hardy cereal crop grown in the arid and semiarid tropics, where other cereals are likely to fail to produce economic yields due to drought and heat stresses. Adaptive evolution, a form of natural selection, shaped the crop to grow and yield satisfactorily with limited moisture supply or under periodic water deficits in the soil. Drought tolerance is a complex polygenic trait in which various morphological and physiological responses are controlled by hundreds of genes and significantly influenced by the environment. The development of genomic tools will have enormous potential to improve the efficiency and precision of conventional breeding. The apparently independent domestication events, highly outcrossing nature, and traditional cultivation in stressful environments have maintained a tremendous amount of polymorphism in pearl millet. This high polymorphism has been revealed by genome mapping, which in turn stimulated the mapping and tagging of genomic regions controlling important traits such as drought tolerance. Mapping of a major QTL for terminal drought tolerance in independent populations raised the prospect of developing molecular breeding in pearl millet. However, novel approaches targeted at accelerating genetic gains for drought tolerance, such as establishment of marker-trait associations, genomic selection tools, genome sequence, and genotyping-by-sequencing, are still limited. Development and application of high-throughput genomic tools need to be intensified to improve the breeding efficiency of pearl millet and to minimize the impact of climate change on its production.

  6. Genomic Tools in Pearl Millet Breeding for Drought Tolerance: Status and Prospects

    PubMed Central

    Serba, Desalegn D.; Yadav, Rattan S.

    2016-01-01

    Pearl millet [Pennisetum glaucum (L.) R. Br.] is a hardy cereal crop grown in the arid and semiarid tropics, where other cereals are likely to fail to produce economic yields due to drought and heat stresses. Adaptive evolution, a form of natural selection, shaped the crop to grow and yield satisfactorily with limited moisture supply or under periodic water deficits in the soil. Drought tolerance is a complex polygenic trait in which various morphological and physiological responses are controlled by hundreds of genes and significantly influenced by the environment. The development of genomic tools will have enormous potential to improve the efficiency and precision of conventional breeding. The apparently independent domestication events, highly outcrossing nature, and traditional cultivation in stressful environments have maintained a tremendous amount of polymorphism in pearl millet. This high polymorphism has been revealed by genome mapping, which in turn stimulated the mapping and tagging of genomic regions controlling important traits such as drought tolerance. Mapping of a major QTL for terminal drought tolerance in independent populations raised the prospect of developing molecular breeding in pearl millet. However, novel approaches targeted at accelerating genetic gains for drought tolerance, such as establishment of marker-trait associations, genomic selection tools, genome sequence, and genotyping-by-sequencing, are still limited. Development and application of high-throughput genomic tools need to be intensified to improve the breeding efficiency of pearl millet and to minimize the impact of climate change on its production. PMID:27920783

  7. 3D genome structure modeling by Lorentzian objective function.

    PubMed

    Trieu, Tuan; Cheng, Jianlin

    2017-02-17

    The 3D structure of the genome plays a vital role in biological processes such as gene interaction, gene regulation, DNA replication and genome methylation. Advanced chromosomal conformation capture techniques, such as Hi-C and tethered conformation capture, can generate chromosomal contact data that can be used to computationally reconstruct 3D structures of the genome. We developed a novel restraint-based method that is capable of reconstructing 3D genome structures utilizing both intra- and inter-chromosomal contact data. Our method was robust to noise and performed well in comparison with a panel of existing methods on a controlled simulated data set. On a real Hi-C data set of the human genome, our method produced chromosome and genome structures that are consistent with 3D FISH data and existing knowledge about the human chromosome and genome, such as chromosome territories and the cluster of small chromosomes in the nucleus center, with the exception of chromosome 18. The tool and experimental data are available at https://missouri.box.com/v/LorDG.

  8. 3D genome structure modeling by Lorentzian objective function.

    PubMed

    Trieu, Tuan; Cheng, Jianlin

    2016-11-29

    The 3D structure of the genome plays a vital role in biological processes such as gene interaction, gene regulation, DNA replication and genome methylation. Advanced chromosomal conformation capture techniques, such as Hi-C and tethered conformation capture, can generate chromosomal contact data that can be used to computationally reconstruct 3D structures of the genome. We developed a novel restraint-based method that is capable of reconstructing 3D genome structures utilizing both intra- and inter-chromosomal contact data. Our method was robust to noise and performed well in comparison with a panel of existing methods on a controlled simulated data set. On a real Hi-C data set of the human genome, our method produced chromosome and genome structures that are consistent with 3D FISH data and existing knowledge about the human chromosome and genome, such as chromosome territories and the cluster of small chromosomes in the nucleus center, with the exception of chromosome 18. The tool and experimental data are available at https://missouri.box.com/v/LorDG.

  9. Determining protein function and interaction from genome analysis

    DOEpatents

    Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.

    2004-08-03

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
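
    The phylogenetic profile idea in this abstract can be illustrated with a minimal sketch. It assumes a toy presence/absence table; the genome names, protein names, and the `max_distance` cutoff are hypothetical and are not taken from the patent. Proteins whose profiles agree across genomes are reported as candidate functional links.

```python
from itertools import combinations


def phylogenetic_profile(genomes_with_protein, genomes):
    """Return a tuple of 0/1 flags: is the protein present in each genome?"""
    return tuple(1 if g in genomes_with_protein else 0 for g in genomes)


def hamming(a, b):
    """Count the positions at which two equal-length profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def linked_pairs(presence, genomes, max_distance=0):
    """List protein pairs whose profiles differ in at most max_distance genomes."""
    profiles = {p: phylogenetic_profile(gs, genomes) for p, gs in presence.items()}
    pairs = []
    for p, q in combinations(sorted(profiles), 2):
        if hamming(profiles[p], profiles[q]) <= max_distance:
            pairs.append((p, q))
    return pairs


# Hypothetical toy data: which genomes contain a homolog of each protein.
genomes = ["g1", "g2", "g3", "g4"]
presence = {
    "P1": {"g1", "g3"},
    "P2": {"g1", "g3"},
    "P3": {"g2", "g4"},
    "P4": {"g1", "g2", "g3", "g4"},
}

print(linked_pairs(presence, genomes))
```

    In this toy table only P1 and P2 share an identical profile, so they are the only pair reported at `max_distance=0`. A practical implementation would presumably also discard uninformative profiles, such as proteins present in every genome.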

  10. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    DOEpatents

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.

  11. Application of Functional Genomics for Bovine Respiratory Disease Diagnostics.

    PubMed

    Rai, Aswathy N; Epperson, William B; Nanduri, Bindu

    2015-01-01

    Bovine respiratory disease (BRD) is the most common economically important disease affecting cattle. For developing accurate diagnostics that can predict disease susceptibility/resistance and stratification, it is necessary to identify the molecular mechanisms that underlie BRD. To study the complex interactions among the bovine host and the multitude of viral and bacterial pathogens, as well as the environmental factors associated with BRD etiology, genome-scale high-throughput functional genomics methods such as microarrays, RNA-seq, and proteomics are helpful. In this review, we summarize the progress made in our understanding of BRD using functional genomics approaches. We also discuss some of the available bioinformatics resources for analyzing high-throughput data, in the context of biological pathways and molecular interactions. Although resources for studying host response to infection are available, the corresponding information is lacking for the majority of BRD pathogens, impeding progress in identifying diagnostic signatures for BRD using functional genomics approaches.

  12. Transcriptome and genome sequencing uncovers functional variation in humans.

    PubMed

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; 't Hoen, Peter A C; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk P J; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Angel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-09-26

    Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project--the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

  13. Application of Functional Genomics for Bovine Respiratory Disease Diagnostics

    PubMed Central

    Rai, Aswathy N.; Epperson, William B.; Nanduri, Bindu

    2015-01-01

    Bovine respiratory disease (BRD) is the most common economically important disease affecting cattle. For developing accurate diagnostics that can predict disease susceptibility/resistance and stratification, it is necessary to identify the molecular mechanisms that underlie BRD. To study the complex interactions among the bovine host and the multitude of viral and bacterial pathogens, as well as the environmental factors associated with BRD etiology, genome-scale high-throughput functional genomics methods such as microarrays, RNA-seq, and proteomics are helpful. In this review, we summarize the progress made in our understanding of BRD using functional genomics approaches. We also discuss some of the available bioinformatics resources for analyzing high-throughput data, in the context of biological pathways and molecular interactions. Although resources for studying host response to infection are available, the corresponding information is lacking for the majority of BRD pathogens, impeding progress in identifying diagnostic signatures for BRD using functional genomics approaches. PMID:26526746

  14. Cognitive Tools and User-Centered Learning Environments: Rethinking Tools, Functions, and Applications.

    ERIC Educational Resources Information Center

    Iiyoshi, Toru; Hannafin, Michael J.

    This paper introduces and analyzes problems and issues in the design and use of cognitive tools in open, user-centered learning environments. It introduces a classification scheme for tool functions, and showcases several tools in a current educational hypermedia research and development effort. Information-seeking, information-presentation,…

  15. A two-plasmid inducible CRISPR/Cas9 genome editing tool for Clostridium acetobutylicum.

    PubMed

    Wasels, François; Jean-Marie, Jennifer; Collas, Florent; López-Contreras, Ana M; Lopes Ferreira, Nicolas

    2017-09-01

    CRISPR/Cas-based genetic engineering has revolutionised molecular biology in both eukaryotes and prokaryotes. Several tools dedicated to the genomic transformation of the Clostridium genus of Gram-positive bacteria have been described in the literature; however, the integration of large DNA fragments still remains relatively limited. In this study, a CRISPR/Cas9 genome editing tool using a two-plasmid strategy was developed for the solventogenic strain Clostridium acetobutylicum ATCC 824. Codon-optimised cas9 from Streptococcus pyogenes was placed under the control of an anhydrotetracycline-inducible promoter on one plasmid, while the gRNA expression cassettes and editing templates were located on a second plasmid. Through the sequential introduction of these vectors into the cell, we achieved highly accurate genome modifications, including nucleotide substitution, gene deletion and cassette insertion up to 3.6kb. To demonstrate its potential, this genome editing tool was used to generate a marker-free mutant of ATCC 824 that produced an isopropanol-butanol-ethanol mixture. Whole-genome sequencing confirmed that no off-target modifications were present in the mutants. Such a tool is a prerequisite for efficient metabolic engineering in this solventogenic strain and provides an alternative editing strategy that might be applicable to other Clostridium strains. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  16. BioViews: Java-Based Tools for Genomic Data Visualization

    PubMed Central

    Helt, Gregg A.; Lewis, Suzanna; Loraine, Ann E.; Rubin, Gerald M.

    1998-01-01

    Visualization tools for bioinformatics ideally should provide universal access to the most current data in an interactive and intuitive graphical user interface. Since the introduction of Java, a language designed for distributed programming over the Web, the technology now exists to build a genomic data visualization tool that meets these requirements. Using Java we have developed a prototype genome browser applet (BioViews) that incorporates a three-level graphical view of genomic data: a physical map, an annotated sequence map, and a DNA sequence display. Annotated biological features are displayed on the physical and sequence-based maps, and the different views are interconnected. The applet is linked to several databases and can retrieve features and display hyperlinked textual data on selected features. In addition to browsing genomic data, different types of analyses can be performed interactively and the results of these analyses visualized alongside prior annotations. Our genome browser is built on top of extensible, reusable graphic components specifically designed for bioinformatics. Other groups can (and do) reuse this work in various ways. Genome centers can reuse large parts of the genome browser with minor modifications, bioinformatics groups working on sequence analysis can reuse components to build front ends for analysis programs, and biology laboratories can reuse components to publish results as dynamic Web documents. PMID:9521932

  17. FEATnotator: A tool for integrated annotation of sequence features and variation, facilitating interpretation in genomics experiments.

    PubMed

    Podicheti, Ram; Mockaitis, Keithanne

    2015-06-01

    As approaches are sought for more efficient and democratized uses of non-model and expanded model genomics references, ease of integration of genomic feature datasets is especially desirable in multidisciplinary research communities. Valuable conclusions are often missed or slowed when researchers refer experimental results to a single reference sequence that lacks integrated pan-genomic and multi-experiment data in accessible formats. Association of genomic positional information, such as results from an expansive variety of next-generation sequencing experiments, with annotated reference features such as genes or predicted protein binding sites, provides the context essential for conclusions and ongoing research. When the experimental system includes polymorphic genomic inputs, rapid calculation of gene structural and protein translational effects of sequence variation from the reference can be invaluable. Here we present FEATnotator, a lightweight, fast and easy to use open source software program that integrates and reports overlap and proximity in genomic information from any user-defined datasets including those from next generation sequencing applications. We illustrate use of the tool by summarizing whole genome sequence variation of a widely used natural isolate of Arabidopsis thaliana in the context of gene models of the reference accession. Previous discovery of a protein coding deletion influencing root development is replicated rapidly. Appropriate even in investigations of a single gene or genic regions such as QTL, comprehensive reports provided by FEATnotator better prepare researchers for interpretation of their experimental results. The tool is available for download at http://featnotator.sourceforge.net.

  18. Accessing the SEED genome databases via Web services API: tools for programmers.

    PubMed

    Disz, Terry; Akhter, Sajia; Cuevas, Daniel; Olson, Robert; Overbeek, Ross; Vonstein, Veronika; Stevens, Rick; Edwards, Robert A

    2010-06-14

    The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web services-based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups. The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform-independent service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java. We present a novel approach to access the SEED database. Using Web services, a robust API for access to genomics data is provided, without requiring large volume downloads all at once. The API ensures timely access to the most current datasets available, including the new genomes as soon as they come online.

  19. Accessing the SEED Genome Databases via Web Services API: Tools for Programmers

    PubMed Central

    2010-01-01

    Background: The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web services-based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups. Results: The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform-independent service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java. Conclusions: We present a novel approach to access the SEED database. Using Web services, a robust API for access to genomics data is provided, without requiring large volume downloads all at once. The API ensures timely access to the most current datasets available, including the new genomes as soon as they come online. PMID:20546611

  20. Biotechnology and vaccines: application of functional genomics to Neisseria meningitidis and other bacterial pathogens.

    PubMed

    Serruto, Davide; Adu-Bobie, Jeannette; Capecchi, Barbara; Rappuoli, Rino; Pizza, Mariagrazia; Masignani, Vega

    2004-09-30

    Since its introduction, vaccinology has been very effective in preventing infectious diseases. However, in several cases, the conventional approach to identify protective antigens, based on biochemical, immunological and microbiological methods, has failed to deliver successful vaccine candidates against major bacterial pathogens. The recent development of powerful biotechnological tools applied to genome-based approaches has revolutionized vaccine development, biological research and clinical diagnostics. The availability of a genome provides an inclusive virtual catalogue of all the potential antigens from which it is possible to select the molecules that are likely to be more effective. Here, we describe the use of "reverse vaccinology", which has been successful in the identification of potential vaccine candidates against Neisseria meningitidis serogroup B, and review the use of functional genomics approaches such as DNA microarrays, proteomics and comparative genome analysis for the identification of virulence factors and novel vaccine candidates. In addition, we describe the potential of these powerful technologies in understanding the pathogenesis of various bacteria.

  1. GEnomes Management Application (GEM.app): a new software tool for large-scale collaborative genome analysis.

    PubMed

    Gonzalez, Michael A; Lebrigio, Rafael F Acosta; Van Booven, Derek; Ulloa, Rick H; Powell, Eric; Speziani, Fiorella; Tekin, Mustafa; Schüle, Rebecca; Züchner, Stephan

    2013-06-01

    Novel genes are now identified at a rapid pace for many Mendelian disorders, and increasingly, for genetically complex phenotypes. However, new challenges have also become evident: (1) effectively managing larger exome and/or genome datasets, especially for smaller labs; (2) direct hands-on analysis and contextual interpretation of variant data in large genomic datasets; and (3) many small and medium-sized clinical and research-based investigative teams around the world are generating data that, if combined and shared, will significantly increase the opportunities for the entire community to identify new genes. To address these challenges, we have developed GEnomes Management Application (GEM.app), a software tool to annotate, manage, visualize, and analyze large genomic datasets (https://genomics.med.miami.edu/). GEM.app currently contains ∼1,600 whole exomes from 50 different phenotypes studied by 40 principal investigators from 15 different countries. The focus of GEM.app is on user-friendly analysis for nonbioinformaticians to make next-generation sequencing data directly accessible. Yet, GEM.app provides powerful and flexible filter options, including single family filtering, across family/phenotype queries, nested filtering, and evaluation of segregation in families. In addition, the system is fast, obtaining results within 4 sec across ∼1,200 exomes. We believe that this system will further enhance identification of genetic causes of human disease. © 2013 Wiley Periodicals, Inc.

  2. GEnomes Management Application (GEM.app): A new software tool for large-scale collaborative genome analysis

    PubMed Central

    Gonzalez, Michael A.; Acosta Lebrigio, Rafael F.; Van Booven, Derek; Ulloa, Rick H.; Powell, Eric; Speziani, Fiorella; Tekin, Mustafa; Schule, Rebecca; Zuchner, Stephan

    2015-01-01

    Novel genes are now identified at a rapid pace for many Mendelian disorders, and increasingly, for genetically complex phenotypes. However, new challenges have also become evident: (1) effectively managing larger exome and/or genome datasets, especially for smaller labs; (2) direct hands-on analysis and contextual interpretation of variant data in large genomic datasets; and (3) many small and medium-sized clinical and research-based investigative teams around the world are generating data that, if combined and shared, will significantly increase the opportunities for the entire community to identify new genes. To address these challenges we have developed GEnomes Management Application (GEM.app), a software tool to annotate, manage, visualize, and analyze large genomic datasets (https://genomics.med.miami.edu/). GEM.app currently contains ~1,600 whole exomes from 50 different phenotypes studied by 40 principal investigators from 15 different countries. The focus of GEM.app is on user-friendly analysis for non-bioinformaticians to make NGS data directly accessible. Yet, GEM.app provides powerful and flexible filter options, including single family filtering, across family/phenotype queries, nested filtering, and evaluation of segregation in families. In addition, the system is fast, obtaining results within 4 seconds across ~1,200 exomes. We believe that this system will further enhance identification of genetic causes of human disease. PMID:23463597

  3. RNAiDB and PhenoBlast: web tools for genome-wide phenotypic mapping projects

    PubMed Central

    Gunsalus, Kristin C.; Yueh, Wan-Chen; MacMenamin, Philip; Piano, Fabio

    2004-01-01

    RNA interference (RNAi) is being used in large-scale genomic studies as a rapid way to obtain in vivo functional information associated with specific genes. How best to archive and mine the complex data derived from these studies provides a series of challenges associated with both the methods used to elicit the RNAi response and the functional data gathered. RNAiDB (RNAi Database; http://www.rnai.org) has been created for the archival, distribution and analysis of phenotypic data from large-scale RNAi analyses in Caenorhabditis elegans. The database contains a compendium of publicly available data and provides information on experimental methods and phenotypic results, including raw data in the form of images and streaming time-lapse movies. Phenotypic summaries together with graphical displays of RNAi to gene mappings allow quick intuitive comparison of results from different RNAi assays and visualization of the gene product(s) potentially inhibited by each RNAi experiment based on multiple sequence analysis methods. RNAiDB can be searched using combinatorial queries and using the novel tool PhenoBlast, which ranks genes according to their overall phenotypic similarity. RNAiDB could serve as a model database for distributing and navigating in vivo functional information from large-scale systematic phenotypic analyses in different organisms. PMID:14681444
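
    The idea of ranking genes by overall phenotypic similarity can be sketched as follows. This is an illustration only: the abstract does not describe how PhenoBlast scores similarity, so the Jaccard index used here is an assumed stand-in, and the gene names and phenotype labels are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets of phenotype labels (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_phenotype(query, phenotypes):
    """Rank all other genes by similarity of their phenotype sets to the query gene."""
    target = phenotypes[query]
    scored = [
        (gene, jaccard(target, observed))
        for gene, observed in phenotypes.items()
        if gene != query
    ]
    # Highest similarity first; ties are broken alphabetically by gene name.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Hypothetical RNAi phenotype annotations.
phenotypes = {
    "gene-1": {"embryonic lethal", "spindle defect"},
    "gene-2": {"embryonic lethal", "spindle defect", "slow growth"},
    "gene-3": {"sterile"},
    "gene-4": {"embryonic lethal"},
}

for gene, score in rank_by_phenotype("gene-1", phenotypes):
    print(gene, round(score, 2))
```

    With these toy annotations, gene-2 ranks first because it shares both of the query's phenotypes, gene-4 shares one, and gene-3 shares none.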

  4. Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST): A web tool for generating protein sequence similarity networks

    PubMed Central

    Gerlt, John A.; Bouvier, Jason T.; Davidson, Daniel B.; Imker, Heidi J.; Sadkhin, Boris; Slater, David R.; Whalen, Katie L.

    2015-01-01

    The Enzyme Function Initiative, an NIH/NIGMS-supported Large-Scale Collaborative Project (EFI; U54GM093342; http://enzymefunction.org/), is focused on devising and disseminating bioinformatics and computational tools as well as experimental strategies for the prediction and assignment of functions (in vitro activities and in vivo physiological/metabolic roles) to uncharacterized enzymes discovered in genome projects. Protein sequence similarity networks (SSNs) are visually powerful tools for analyzing sequence relationships in protein families (H.J. Atkinson, J.H. Morris, T.E. Ferrin, and P.C. Babbitt, PLoS One 2009, 4, e4345). However, the members of the biological/biomedical community have not had access to the capability to generate SSNs for their “favorite” protein families. In this article we announce the EFI-EST (Enzyme Function Initiative-Enzyme Similarity Tool) web tool (http://efi.igb.illinois.edu/efi-est/) that is available without cost for the automated generation of SSNs by the community. The tool can create SSNs for the “closest neighbors” of a user-supplied protein sequence from the UniProt database (Option A) or of members of any user-supplied Pfam and/or InterPro family (Option B). We provide an introduction to SSNs, a description of EFI-EST, and a demonstration of the use of EFI-EST to explore sequence-function space in the OMP decarboxylase superfamily (PF00215). This article is designed as a tutorial that will allow members of the community to use the EFI-EST web tool for exploring sequence/function space in protein families. PMID:25900361
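
    A sequence similarity network can be illustrated with a minimal sketch: sequences are nodes, and an edge joins two nodes when their pairwise similarity meets a chosen threshold. The scoring below (fraction of identical positions between equal-length toy sequences) is a simplified stand-in chosen for this illustration and is not the scoring used by EFI-EST; the sequence names, sequences, and thresholds are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length sequences (toy score)."""
    if len(a) != len(b):
        raise ValueError("this toy score requires equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def build_ssn(seqs, threshold):
    """Return an adjacency map with an edge wherever identity >= threshold."""
    names = sorted(seqs)
    edges = {name: set() for name in names}
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            if identity(seqs[p], seqs[q]) >= threshold:
                edges[p].add(q)
                edges[q].add(p)
    return edges


def clusters(edges):
    """Return the connected components of the network as sorted lists of names."""
    seen = set()
    components = []
    for start in sorted(edges):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical toy sequences.
seqs = {
    "a": "MKTAYIAK",
    "b": "MKTAYIAR",
    "c": "MKTGYIAR",
    "d": "GGSPLQWE",
}

print(clusters(build_ssn(seqs, threshold=0.8)))
print(clusters(build_ssn(seqs, threshold=0.9)))
```

    Raising the threshold removes edges and splits the network into smaller clusters, which is how a stricter cutoff separates a family into finer groups.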

  5. Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome

    PubMed Central

    Bergman, Casey M; Pfeiffer, Barret D; Rincón-Limas, Diego E; Hoskins, Roger A; Gnirke, Andreas; Mungall, Chris J; Wang, Adrienne M; Kronmiller, Brent; Pacleb, Joanne; Park, Soo; Stapleton, Mark; Wan, Kenneth; George, Reed A; de Jong, Pieter J; Botas, Juan; Rubin, Gerald M; Celniker, Susan E

    2002-01-01

    Background: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences. Conclusions: Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone. PMID:12537575

  6. The Dockstore: enabling modular, community-focused sharing of Docker-based genomics tools and workflows

    PubMed Central

    O'Connor, Brian D.; Yuen, Denis; Chung, Vincent; Duncan, Andrew G.; Liu, Xiang Kun; Patricia, Janice; Paten, Benedict; Stein, Lincoln; Ferretti, Vincent

    2017-01-01

    As genomic datasets continue to grow, the feasibility of downloading data to a local organization and running analysis on a traditional compute environment is becoming increasingly problematic. Current large-scale projects, such as the ICGC PanCancer Analysis of Whole Genomes (PCAWG), the Data Platform for the U.S. Precision Medicine Initiative, and the NIH Big Data to Knowledge Center for Translational Genomics, are using cloud-based infrastructure to both host and perform analysis across large data sets. In PCAWG, over 5,800 whole human genomes were aligned and variant called across 14 cloud and HPC environments; the processed data was then made available on the cloud for further analysis and sharing. If run locally, an operation at this scale would have monopolized a typical academic data centre for many months, and would have presented major challenges for data storage and distribution. However, this scale is increasingly typical for genomics projects and necessitates a rethink of how analytical tools are packaged and moved to the data. For PCAWG, we embraced the use of highly portable Docker images for encapsulating and sharing complex alignment and variant calling workflows across highly variable environments. While successful, this endeavor revealed a limitation in Docker containers, namely the lack of a standardized way to describe and execute the tools encapsulated inside the container. As a result, we created the Dockstore (https://dockstore.org), a project that brings together Docker images with standardized, machine-readable ways of describing and running the tools contained within. This service greatly improves the sharing and reuse of genomics tools and promotes interoperability with similar projects through emerging web service standards developed by the Global Alliance for Genomics and Health (GA4GH). PMID:28344774

  7. The Dockstore: enabling modular, community-focused sharing of Docker-based genomics tools and workflows.

    PubMed

    O'Connor, Brian D; Yuen, Denis; Chung, Vincent; Duncan, Andrew G; Liu, Xiang Kun; Patricia, Janice; Paten, Benedict; Stein, Lincoln; Ferretti, Vincent

    2017-01-01

    As genomic datasets continue to grow, the feasibility of downloading data to a local organization and running analysis on a traditional compute environment is becoming increasingly problematic. Current large-scale projects, such as the ICGC PanCancer Analysis of Whole Genomes (PCAWG), the Data Platform for the U.S. Precision Medicine Initiative, and the NIH Big Data to Knowledge Center for Translational Genomics, are using cloud-based infrastructure to both host and perform analysis across large data sets. In PCAWG, over 5,800 whole human genomes were aligned and variant called across 14 cloud and HPC environments; the processed data was then made available on the cloud for further analysis and sharing. If run locally, an operation at this scale would have monopolized a typical academic data centre for many months, and would have presented major challenges for data storage and distribution. However, this scale is increasingly typical for genomics projects and necessitates a rethink of how analytical tools are packaged and moved to the data. For PCAWG, we embraced the use of highly portable Docker images for encapsulating and sharing complex alignment and variant calling workflows across highly variable environments. While successful, this endeavor revealed a limitation in Docker containers, namely the lack of a standardized way to describe and execute the tools encapsulated inside the container. As a result, we created the Dockstore (https://dockstore.org), a project that brings together Docker images with standardized, machine-readable ways of describing and running the tools contained within. This service greatly improves the sharing and reuse of genomics tools and promotes interoperability with similar projects through emerging web service standards developed by the Global Alliance for Genomics and Health (GA4GH).

  8. LEGER: knowledge database and visualization tool for comparative genomics of pathogenic and non-pathogenic Listeria species

    PubMed Central

    Dieterich, Guido; Kärst, Uwe; Fischer, Elmar; Wehland, Jürgen; Jänsch, Lothar

    2006-01-01

    Listeria species are ubiquitous in the environment and often contaminate foods because they grow under conditions used for food preservation. Listeria monocytogenes, the human and animal pathogen, causes listeriosis, an infection with a high mortality rate in risk groups such as immune-compromised individuals. Furthermore, L. monocytogenes is a model organism for the study of intracellular bacterial pathogens. The publication of its genome sequence and that of the non-pathogenic species Listeria innocua initiated numerous comparative studies and efforts to sequence all species comprising the genus. The proteome database LEGER was developed to support functional genome analyses by combining information obtained by applying bioinformatics methods and from public databases to improve the original annotations. LEGER offers three unique key features: (i) it is the first comprehensive information system focusing on the functional assignment of genes and proteins; (ii) integrated visualization tools, KEGG pathway and Genome Viewer, ease the functional exploration of complex data; and (iii) LEGER presents results of systematic post-genome studies, thus facilitating analyses combining computational and experimental results. Moreover, LEGER provides an unpublished membrane proteome analysis of L. innocua and in total visualizes experimentally validated information about the subcellular localizations of 789 different listerial proteins. PMID:16381897

  9. DNA-binding protein prediction using plant specific support vector machines: validation and application of a new genome annotation tool.

    PubMed

    Motion, Graham B; Howden, Andrew J M; Huitema, Edgar; Jones, Susan

    2015-12-15

    There are currently 151 plants with draft genomes available, but levels of functional annotation for putative protein products are low. Therefore, accurate computational predictions are essential to annotate genomes in the first instance, and to provide focus for the more costly and time-consuming functional assays that follow. DNA-binding proteins are an important class of proteins that require annotation, but current computational methods are not applicable for genome-wide predictions in plant species. Here, we explore the use of species- and lineage-specific models for the prediction of DNA-binding proteins in plants. We show that a species-specific support vector machine model based on Arabidopsis sequence data is more accurate (accuracy 81%) than a generic model (74%), and based on this we develop a plant-specific model for predicting DNA-binding proteins. We apply this model to the tomato proteome and demonstrate its ability to perform accurate high-throughput prediction of DNA-binding proteins. In doing so, we have annotated 36 currently uncharacterised proteins by assigning a putative DNA-binding function. Our model is publicly available and we propose it be used in combination with existing tools to help increase annotation levels of DNA-binding proteins encoded in plant genomes.

  10. In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics

    PubMed Central

    Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

    2017-01-01

    Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563
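
    Two of the counts in this abstract can be reproduced with simple arithmetic and set operations. The sketch below first checks the 16S rRNA total (four copies in each of eight complete genomes plus one in each of 11 drafts), then shows how a core protein set can be viewed as the intersection of per-genome protein family sets. The strain and family names in the second part are hypothetical placeholders, and this is not the authors' detection pipeline.

```python
# 16S rRNA gene copies reported in the abstract.
complete_genomes, copies_per_complete = 8, 4
draft_genomes, copies_per_draft = 11, 1
total_16s = complete_genomes * copies_per_complete + draft_genomes * copies_per_draft
print(total_16s)


def core_families(genomes):
    """Return the protein families present in every genome (the 'core' set)."""
    family_sets = list(genomes.values())
    return set.intersection(*family_sets)


# Hypothetical strains and protein family identifiers (illustrative only).
genomes = {
    "strain1": {"famA", "famB", "famC"},
    "strain2": {"famA", "famB", "famD"},
    "strain3": {"famA", "famB", "famC", "famE"},
}

core = core_families(genomes)
# Families missing from at least one strain, grouped by the strain that carries them.
accessory = {name: fams - core for name, fams in genomes.items()}
print(sorted(core))
```

    In practice the size of the core set depends heavily on how strictly two proteins are judged to belong to the same family, which is why the abstract reports its figure as a lower bound under the most stringent parameters.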

  11. In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics.

    PubMed

    Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

    2017-01-01

    Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/.

  12. Mapping by sequencing the Pneumocystis genome using the ordering DNA sequences V3 tool.

    PubMed Central

    Xu, Zheng; Lance, Britton; Vargas, Claudia; Arpinar, Budak; Bhandarkar, Suchendra; Kraemer, Eileen; Kochut, Krys J; Miller, John A; Wagner, Jeff R; Weise, Michael J; Wunderlich, John K; Stringer, James; Smulian, George; Cushion, Melanie T; Arnold, Jonathan

    2003-01-01

    A bioinformatics tool called ODS3 has been created for mapping by sequencing. The tool allows the creation of integrated genomic maps from genetic, physical mapping, and sequencing data and permits an integrated genome map to be stored, retrieved, viewed, and queried in a stand-alone capacity, in a client/server relationship with the Fungal Genome Database (FGDB), and as a web-browsing tool for the FGDB. Because ODS3 is programmed in Java, the tool promotes platform independence and supports export of integrated genome-mapping data in the extensible markup language (XML) for data interchange with other genome information systems. ODS3 is used to create an initial integrated genome map of the AIDS-related fungal pathogen Pneumocystis carinii. Contig dynamics indicate that this physical map is approximately 50% complete with approximately 200 contigs. A total of 10 putative multigene families were found. Two of these putative families were previously characterized in P. carinii, namely the major surface glycoproteins (MSGs) and HSP70 proteins; three of these putative families (not previously characterized in P. carinii) were found to be similar to families encoding the HSP60 in Schizosaccharomyces pombe, the heat-shock psi protein in S. pombe, and the RNA synthetase family (i.e., MES1) in Saccharomyces cerevisiae. Physical mapping data are consistent with the 16S, 5.8S, and 26S rDNA genes being single copy in P. carinii. No other fungus outside this genus is known to have the rDNA genes in single copy. PMID:12702676

  13. Exploring the post-genomic world: differing explanatory and manipulatory functions of post-genomic sciences

    PubMed Central

    Holmes, Christina; Carlson, Siobhan M.; McDonald, Fiona; Jones, Mavis; Graham, Janice

    2016-01-01

    Richard Lewontin proposed that the ability of a scientific field to create a narrative for public understanding garners it social relevance. This article applies Lewontin's conceptual framework of the functions of science (manipulatory and explanatory) to compare and explain the current differences in perceived societal relevance of genetics/genomics and proteomics. We provide three examples to illustrate the social relevance and strong cultural narrative of genetics/genomics for which no counterpart exists for proteomics. We argue that the major difference between genetics/genomics and proteomics is that genomics has a strong explanatory function, due to the strong cultural narrative of heredity. Based on qualitative interviews and observations of proteomics conferences, we suggest that the nature of proteins, lack of public understanding, and theoretical complexity exacerbate this difference for proteomics. Lewontin's framework suggests that social scientists may find that omics sciences affect social relations in different ways than past analyses of genetics. PMID:27134568

  14. Exploring the post-genomic world: differing explanatory and manipulatory functions of post-genomic sciences.

    PubMed

    Holmes, Christina; Carlson, Siobhan M; McDonald, Fiona; Jones, Mavis; Graham, Janice

    2016-01-02

    Richard Lewontin proposed that the ability of a scientific field to create a narrative for public understanding garners it social relevance. This article applies Lewontin's conceptual framework of the functions of science (manipulatory and explanatory) to compare and explain the current differences in perceived societal relevance of genetics/genomics and proteomics. We provide three examples to illustrate the social relevance and strong cultural narrative of genetics/genomics for which no counterpart exists for proteomics. We argue that the major difference between genetics/genomics and proteomics is that genomics has a strong explanatory function, due to the strong cultural narrative of heredity. Based on qualitative interviews and observations of proteomics conferences, we suggest that the nature of proteins, lack of public understanding, and theoretical complexity exacerbate this difference for proteomics. Lewontin's framework suggests that social scientists may find that omics sciences affect social relations in different ways than past analyses of genetics.

  15. Functional genomics of lactic acid bacteria: from food to health

    PubMed Central

    2014-01-01

    Genome analysis using next-generation sequencing technologies has revolutionized the characterization of lactic acid bacteria, and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massively parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health. PMID:25186768

  16. Functional genomics of lactic acid bacteria: from food to health.

    PubMed

    Douillard, François P; de Vos, Willem M

    2014-08-29

    Genome analysis using next-generation sequencing technologies has revolutionized the characterization of lactic acid bacteria, and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massively parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health.

  17. Accelerating Genome Editing in CHO Cells Using CRISPR Cas9 and CRISPy, a Web-Based Target Finding Tool

    PubMed Central

    Ronda, Carlotta; Pedersen, Lasse Ebdrup; Hansen, Henning Gram; Kallehauge, Thomas Beuchert; Betenbaugh, Michael J; Nielsen, Alex Toftgaard; Kildegaard, Helene Faustrup

    2014-01-01

    Chinese hamster ovary (CHO) cells are widely used in the biopharmaceutical industry as a host for the production of complex pharmaceutical proteins. Thus genome engineering of CHO cells for improved product quality and yield is of great interest. Here, we demonstrate for the first time the efficacy of the CRISPR Cas9 technology in CHO cells by generating site-specific gene disruptions in COSMC and FUT8, both of which encode proteins involved in glycosylation. The tested single guide RNAs (sgRNAs) created an indel frequency up to 47.3% in COSMC, while an indel frequency up to 99.7% in FUT8 was achieved by applying lectin selection. All eight sgRNAs examined in this study resulted in relatively high indel frequencies, demonstrating that the Cas9 system is a robust and efficient genome-editing methodology in CHO cells. Deep sequencing revealed that 85% of the indels created by Cas9 resulted in frameshift mutations at the target sites, with a strong preference for single base indels. Finally, we have developed a user-friendly bioinformatics tool, named “CRISPy”, for rapid identification of sgRNA target sequences in the CHO-K1 genome. The CRISPy tool identified 1,970,449 CRISPR targets divided into 27,553 genes and lists the number of off-target sites in the genome. In conclusion, the proven functionality of Cas9 to edit CHO genomes combined with our CRISPy database has the potential to accelerate genome editing and synthetic biology efforts in CHO cells. Biotechnol. Bioeng. 2014; 111: 1604–1616. © 2014 The Authors. Biotechnology and Bioengineering Published by Wiley Periodicals, Inc. PMID:24827782
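
    The abstract does not describe how CRISPy locates targets. As a generic illustration of sgRNA target finding, the sketch below scans both strands of a DNA string for 20-nt protospacers immediately followed by an NGG PAM, the pattern commonly associated with Streptococcus pyogenes Cas9. A small helper also shows why single-base indels cause frameshifts. The function names and the test sequence are hypothetical, and off-target counting is not attempted.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_targets(seq, protospacer_len=20):
    """Return (strand, start, protospacer, pam) for every protospacer followed by an NGG PAM.

    For the '-' strand, 'start' is an index into the reverse-complemented sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        # The last valid start leaves room for the protospacer plus a 3-nt PAM.
        for i in range(len(s) - protospacer_len - 2):
            pam = s[i + protospacer_len : i + protospacer_len + 3]
            if pam[1:] == "GG":
                hits.append((strand, i, s[i : i + protospacer_len], pam))
    return hits


def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of three."""
    return abs(indel_length) % 3 != 0


# Hypothetical 24-nt sequence: a 20-nt protospacer, a TGG PAM, and one trailing base.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "A"
targets = find_targets(example)
print(targets)
print(is_frameshift(1), is_frameshift(3))
```

    A genome-scale tool would additionally need to index the whole genome and score near-matches elsewhere to report off-target sites, which this sketch leaves out.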

  18. Accelerating genome editing in CHO cells using CRISPR Cas9 and CRISPy, a web-based target finding tool.

    PubMed

    Ronda, Carlotta; Pedersen, Lasse Ebdrup; Hansen, Henning Gram; Kallehauge, Thomas Beuchert; Betenbaugh, Michael J; Nielsen, Alex Toftgaard; Kildegaard, Helene Faustrup

    2014-08-01

    Chinese hamster ovary (CHO) cells are widely used in the biopharmaceutical industry as a host for the production of complex pharmaceutical proteins. Thus genome engineering of CHO cells for improved product quality and yield is of great interest. Here, we demonstrate for the first time the efficacy of the CRISPR Cas9 technology in CHO cells by generating site-specific gene disruptions in COSMC and FUT8, both of which encode proteins involved in glycosylation. The tested single guide RNAs (sgRNAs) created an indel frequency up to 47.3% in COSMC, while an indel frequency up to 99.7% in FUT8 was achieved by applying lectin selection. All eight sgRNAs examined in this study resulted in relatively high indel frequencies, demonstrating that the Cas9 system is a robust and efficient genome-editing methodology in CHO cells. Deep sequencing revealed that 85% of the indels created by Cas9 resulted in frameshift mutations at the target sites, with a strong preference for single base indels. Finally, we have developed a user-friendly bioinformatics tool, named "CRISPy", for rapid identification of sgRNA target sequences in the CHO-K1 genome. The CRISPy tool identified 1,970,449 CRISPR targets divided into 27,553 genes and lists the number of off-target sites in the genome. In conclusion, the proven functionality of Cas9 to edit CHO genomes combined with our CRISPy database has the potential to accelerate genome editing and synthetic biology efforts in CHO cells.

  19. Functional Analysis of the Human Genome: Study of Genetic Disease

    NASA Astrophysics Data System (ADS)

    Tsui, Lap-Chee

    2003-04-01

    I will divide my remarks into 3 parts. First, I will give a brief summary of the Human Genome Project. Second, I will describe our work on human chromosome 7 to illustrate how we could contribute to the Project and disease research. Third, I would like to bring across the argument that study of genetic disease is an integral component of the Human Genome Project. In particular, I will use cystic fibrosis as an example to elaborate why I consider disease study is a part of functional genomics.

  20. DFAST and DAGA: web-based integrated genome annotation tools and resources

    PubMed Central

    Tanizawa, Yasuhiro; Fujisawa, Takatomo; Kaminuma, Eli; Nakamura, Yasukazu; Arita, Masanori

    2016-01-01

    Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp. PMID:27867804
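
    The 95% ANI threshold mentioned above lends itself to a small worked example. The sketch below assumes ANI values against reference genomes have already been computed elsewhere, and flags genomes whose submitted species label disagrees with the reference they match at or above the threshold. The record layout, genome identifiers, and ANI values are hypothetical; the abstract does not describe DFAST's actual procedure at this level of detail.

```python
ANI_SPECIES_THRESHOLD = 95.0  # percent; the commonly used species boundary cited in the abstract


def same_species(ani_percent, threshold=ANI_SPECIES_THRESHOLD):
    """Treat two genomes as the same species when their ANI meets the threshold."""
    return ani_percent >= threshold


def flag_mislabeled(records):
    """Return IDs of genomes that match a reference species other than their label.

    Each record is (genome_id, labeled_species, best_reference_species, ani_percent).
    """
    flagged = []
    for genome_id, label, reference, ani in records:
        if same_species(ani) and label != reference:
            flagged.append(genome_id)
    return flagged


# Hypothetical records (illustrative only).
records = [
    ("G001", "Lactobacillus gasseri", "Lactobacillus gasseri", 98.7),
    ("G002", "Lactobacillus gasseri", "Lactobacillus jensenii", 97.2),
    ("G003", "Lactobacillus jensenii", "Lactobacillus gasseri", 82.4),
]

flagged = flag_mislabeled(records)
print(flagged)
```

    Genome G003 is not flagged because its best match falls below the threshold, so the comparison gives no evidence about its species either way.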

  1. DFAST and DAGA: web-based integrated genome annotation tools and resources.

    PubMed

    Tanizawa, Yasuhiro; Fujisawa, Takatomo; Kaminuma, Eli; Nakamura, Yasukazu; Arita, Masanori

    2016-01-01

    Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.

  2. Functional genomics of life history variation in a butterfly metapopulation.

    PubMed

    Wheat, Christopher W; Fescemyer, Howard W; Kvist, J; Tas, Eva; Vera, J Cristobal; Frilander, Mikko J; Hanski, Ilkka; Marden, James H

    2011-05-01

    In fragmented landscapes, small populations frequently go extinct and new ones are established with poorly understood consequences for genetic diversity and evolution of life history traits. Here, we apply functional genomic tools to an ecological model system, the well-studied metapopulation of the Glanville fritillary butterfly. We investigate how dispersal and colonization select upon existing genetic variation affecting life history traits by comparing common-garden reared 2-day adult females from new populations with those from established older populations. New-population females had higher expression of abdomen genes involved in egg provisioning and thorax genes involved in the maintenance of flight muscle proteins. Physiological studies confirmed that new-population butterflies have accelerated egg maturation, apparently regulated by higher juvenile hormone titer and angiotensin converting enzyme mRNA, as well as enhanced flight metabolism. Gene expression varied between allelic forms of two metabolic genes (Pgi and Sdhd), which themselves were associated with differences in flight metabolic rate, population age and population growth rate. These results identify likely molecular mechanisms underpinning life history variation that is maintained by extinction-colonization dynamics in metapopulations. © 2011 Blackwell Publishing Ltd.

  3. Accurate evaluation and analysis of functional genomics data and methods

    PubMed Central

    Greene, Casey S.; Troyanskaya, Olga G.

    2016-01-01

    The development of technology capable of inexpensively performing large-scale measurements of biological systems has generated a wealth of data. Integrative analysis of these data holds the promise of uncovering gene function, regulation, and, in the longer run, understanding complex disease. However, their analysis has proved very challenging, as it is difficult to quickly and effectively assess the relevance and accuracy of these data for individual biological questions. Here, we identify biases that present challenges for the assessment of functional genomics data and methods. We then discuss evaluation methods that, taken together, begin to address these issues. We also argue that the funding of systematic data-driven experiments and of high-quality curation efforts will further improve evaluation metrics so that they more-accurately assess functional genomics data and methods. Such metrics will allow researchers in the field of functional genomics to continue to answer important biological questions in a data-driven manner. PMID:22268703

  4. JCoast – A biologist-centric software tool for data mining and comparison of prokaryotic (meta)genomes

    PubMed Central

    Richter, Michael; Lombardot, Thierry; Kostadinov, Ivaylo; Kottmann, Renzo; Duhaime, Melissa Beth; Peplies, Jörg; Glöckner, Frank Oliver

    2008-01-01

    Background: Current sequencing technologies give access to sequence information for genomes and metagenomes at a tremendous speed. Subsequent data processing is mainly performed by automatic pipelines provided by the sequencing centers. Although standardised workflows are desirable and useful in many respects, rational data mining, comparative genomics, and especially the interpretation of the sequence information in its biological context demand intuitive, flexible, and extendable solutions. Results: The JCoast software tool was primarily designed to analyse and compare (meta)genome sequences of prokaryotes. Based on a pre-computed GenDB database project, JCoast offers a flexible graphical user interface (GUI), as well as an application programming interface (API) that facilitates back-end data access. JCoast offers individual, cross-genome, and metagenome analysis, and assists the biologist in exploration of large and complex datasets. Conclusion: JCoast combines all functions required for the mining, annotation, and interpretation of (meta)genomic data. The lightweight software solution allows the user to easily take advantage of advanced back-end database structures by providing a programming and graphical user interface to answer biological questions. JCoast is available at the project homepage. PMID:18380896

  5. JCoast - a biologist-centric software tool for data mining and comparison of prokaryotic (meta)genomes.

    PubMed

    Richter, Michael; Lombardot, Thierry; Kostadinov, Ivaylo; Kottmann, Renzo; Duhaime, Melissa Beth; Peplies, Jörg; Glöckner, Frank Oliver

    2008-04-01

    Current sequencing technologies give access to sequence information for genomes and metagenomes at a tremendous speed. Subsequent data processing is mainly performed by automatic pipelines provided by the sequencing centers. Although standardised workflows are desirable and useful in many respects, rational data mining, comparative genomics, and especially the interpretation of the sequence information in its biological context demand intuitive, flexible, and extendable solutions. The JCoast software tool was primarily designed to analyse and compare (meta)genome sequences of prokaryotes. Based on a pre-computed GenDB database project, JCoast offers a flexible graphical user interface (GUI), as well as an application programming interface (API) that facilitates back-end data access. JCoast offers individual, cross-genome, and metagenome analysis, and assists the biologist in exploration of large and complex datasets. JCoast combines all functions required for the mining, annotation, and interpretation of (meta)genomic data. The lightweight software solution allows the user to easily take advantage of advanced back-end database structures by providing a programming and graphical user interface to answer biological questions. JCoast is available at the project homepage.

  6. A segmental genomic duplication generates a functional intron

    PubMed Central

    Hellsten, Uffe; Aspden, Julie L.; Rio, Donald C.; Rokhsar, Daniel S.

    2011-01-01

    An intron is an extended genomic feature whose function requires multiple constrained positions—donor and acceptor splice sites, a branch point, a polypyrimidine tract and suitable splicing enhancers—that may be distributed over hundreds or thousands of nucleotides. New introns are therefore unlikely to emerge by incremental accumulation of functional sub-elements. Here we demonstrate that a functional intron can be created de novo in a single step by a segmental genomic duplication. This experiment recapitulates in vivo the birth of an intron that arose in the ancestral jawed vertebrate lineage nearly half-a-billion years ago. PMID:21878908

  7. High-throughput TILLING for functional genomics.

    PubMed

    Till, Bradley J; Colbert, Trenton; Tompa, Rachel; Enns, Linda C; Codomo, Christine A; Johnson, Jessica E; Reynolds, Steven H; Henikoff, Jorja G; Greene, Elizabeth A; Steine, Michael N; Comai, Luca; Henikoff, Steven

    2003-01-01

    Targeting-induced local lesions in genomes (TILLING) is a general strategy for identifying induced point mutations that can be applied to almost any organism. Here, we describe the basic methodology for high-throughput TILLING. Gene segments are amplified using fluorescently tagged primers, and products are denatured and reannealed to form heteroduplexes between the mutated sequence and its wild-type counterpart. These heteroduplexes are substrates for cleavage by the endonuclease CEL I. Following cleavage, products are analyzed on denaturing polyacrylamide gels using the LI-COR DNA analyzer system. High-throughput TILLING has been adopted by the Arabidopsis TILLING Project (ATP) to provide allelic series of point mutations for the general Arabidopsis community.

  8. Animal QTLdb: an improved database tool for livestock animal QTL/association data dissemination in the post-genome era.

    PubMed

    Hu, Zhi-Liang; Park, Carissa A; Wu, Xiao-Lin; Reecy, James M

    2013-01-01

    The Animal QTL database (QTLdb; http://www.animalgenome.org/QTLdb) is designed to house all publicly available QTL and single-nucleotide polymorphism/gene association data on livestock animal species. An earlier version was published in the Nucleic Acids Research Database issue in 2007. Since then, we have continued our efforts to develop new and improved database tools to allow more data types, parameters and functions. Our efforts have transformed the Animal QTLdb into a tool that actively serves the research community as a quality data repository and more importantly, a provider of easily accessible tools and functions to disseminate QTL and gene association information. The QTLdb has been heavily used by the livestock genomics community since its first public release in 2004. To date, there are 5920 cattle, 3442 chicken, 7451 pigs, 753 sheep and 88 rainbow trout data points in the database, and at least 290 publications that cite use of the database. The rapid advancement in genomic studies of cattle, chicken, pigs, sheep and other livestock animals has presented us with challenges, as well as opportunities for the QTLdb to meet the evolving needs of the research community. Here, we report our progress over the recent years and highlight new functions and services available to the general public.

  9. gSearch: a fast and flexible general search tool for whole-genome sequencing.

    PubMed

    Song, Taemin; Hwang, Kyu-Baek; Hsing, Michael; Lee, Kyungjoon; Bohn, Justin; Kong, Sek Won

    2012-08-15

    Various processes such as annotation and filtering of variants or comparison of variants in different genomes are required in whole-genome or exome analysis pipelines. However, processing different databases and searching among millions of genomic loci is not trivial. gSearch compares sequence variants in the Genome Variation Format (GVF) or Variant Call Format (VCF) with a pre-compiled annotation or with variants in other genomes. Its search algorithms are subsequently optimized and implemented in a multi-threaded manner. The proposed method is not a stand-alone annotation tool with its own reference databases. Rather, it is a search utility that readily accepts public or user-prepared reference files in various formats including GVF, Generic Feature Format version 3 (GFF3), Gene Transfer Format (GTF), VCF and Browser Extensible Data (BED) format. Compared to existing tools such as ANNOVAR, gSearch runs more than 10 times faster. For example, it is capable of annotating 52.8 million variants with allele frequencies in 6 min. gSearch is available at http://ml.ssu.ac.kr/gSearch. It can be used as an independent search tool or can easily be integrated to existing pipelines through various programming environments such as Perl, Ruby and Python.
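
    The lookup idea described in this record (comparing variants from a VCF file against a pre-compiled annotation) can be illustrated with a minimal Python sketch. This is not gSearch's own algorithm or interface, which the abstract does not specify; the function names, the in-memory dictionary index, and the allele-frequency payload are all assumptions made for illustration. Only the standard VCF column order (CHROM, POS, ID, REF, ALT) is relied upon.

```python
def parse_vcf_variants(lines):
    """Yield (chrom, pos, ref, alt) keys from VCF text lines.

    Header lines (starting with '#') and blank lines are skipped.
    Multi-allelic ALT fields ("G,T") are split into one key per allele.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            yield (chrom, int(pos), ref, allele)


def build_index(reference_records):
    """Hypothetical pre-compiled annotation: map variant key -> allele frequency."""
    return {(chrom, pos, ref, alt): af for chrom, pos, ref, alt, af in reference_records}


def annotate(variants, index):
    """Pair each variant with its annotation, or None when it is absent from the index."""
    return [(variant, index.get(variant)) for variant in variants]
```

    A dictionary gives constant-time lookups per variant, which is one simple way to search among many loci; a tool handling tens of millions of variants would likely need a more memory-conscious structure than this sketch.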

  10. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets.

    PubMed

    Khan, Aziz; Mathelier, Anthony

    2017-05-31

    A common task for scientists relies on comparing lists of genes or genomic regions derived from high-throughput sequencing experiments. While several tools exist to intersect and visualize sets of genes, similar tools dedicated to the visualization of genomic region sets are currently limited. To address this gap, we have developed the Intervene tool, which provides an easy and automated interface for the effective intersection and visualization of genomic region or list sets, thus facilitating their analysis and interpretation. Intervene contains three modules: venn to generate Venn diagrams of up to six sets, upset to generate UpSet plots of multiple sets, and pairwise to compute and visualize intersections of multiple sets as clustered heat maps. Intervene, and its interactive web ShinyApp companion, generate publication-quality figures for the interpretation of genomic region and list sets. Intervene and its web application companion provide an easy command line and an interactive web interface to compute intersections of multiple genomic and list sets. They have the capacity to plot intersections using easy-to-interpret visual approaches. Intervene is developed and designed to meet the needs of both computer scientists and biologists. The source code is freely available at https://bitbucket.org/CBGR/intervene , with the web application available at https://asntech.shinyapps.io/intervene .
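
    The two kinds of intersection summarised in this record, pairwise overlaps (as shown in a heat map) and exact membership combinations (as shown in an UpSet plot), can be sketched for plain gene lists as follows. This is an illustrative Python sketch, not Intervene's implementation: the function names are hypothetical, and it handles only named lists, not genomic regions, which require interval arithmetic.

```python
from collections import Counter
from itertools import combinations


def pairwise_counts(named_sets):
    """Return {(name_a, name_b): number of shared items} for every unordered pair."""
    return {
        (a, b): len(named_sets[a] & named_sets[b])
        for a, b in combinations(sorted(named_sets), 2)
    }


def exclusive_counts(named_sets):
    """Count items by the exact combination of sets that contain them.

    Each item is counted once, under the sorted tuple of set names it
    belongs to; this is the quantity an UpSet plot draws as bars.
    """
    membership = Counter()
    for item in set().union(*named_sets.values()):
        owners = tuple(sorted(name for name, s in named_sets.items() if item in s))
        membership[owners] += 1
    return dict(membership)
```

    The pairwise counts overlap with one another (an item shared by three sets appears in three pairs), whereas the exclusive counts partition the union, so they always sum to the number of distinct items.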

  11. Deductive genomics: a functional approach to identify innovative drug targets in the post-genome era.

    PubMed

    Stumm, Gabriele; Russ, Andreas; Nehls, Michael

    2002-01-01

    The sequencing of the human genome has generated a drug discovery process that is based on sequence analysis and hypothesis-driven (inductive) prediction of gene function. This approach, which we term inductive genomics, is currently dominating the efforts of the pharmaceutical industry to identify new drug targets. According to recent studies, this sequence-driven discovery process is paradoxically increasing the average cost of drug development, thus falling short of the promise of the Human Genome Project to simplify the creation of much needed novel therapeutics. In the early stages of discovery, the flurry of new gene sequences makes it difficult to pick and prioritize the most promising product candidates for product development, as with existing technologies important decisions have to be based on circumstantial evidence that does not strongly predict therapeutic potential. This is because the physiological function of a potential target cannot be predicted by gene sequence analysis and in vitro technologies alone. In contrast, deductive genomics, or large-scale forward genetics, bridges the gap between sequence and function by providing a function-driven in vivo screen of a highly orthologous mammalian model genome for medically relevant physiological functions and drug targets. This approach allows drug discovery to move beyond the focus on sequence-driven identification of new members of classical drug-able protein families towards the biology-driven identification of innovative targets and biological pathways.

  12. Primer: genomic and proteomic tools for the molecular dissection of disease.

    PubMed

    Walker, Erin J; Siminovitch, Katherine A

    2007-10-01

    Completion of the Human Genome Project has been rapidly followed by the emergence of high-throughput technologies that combine automation, miniaturization, and many other strategies and tools to enable systematic surveys of genome composition and gene expression. Of particular relevance to the prevention and management of disease are technologies such as high-throughput DNA genotyping, microarray-based gene-expression profiling, and mass spectrometry-facilitated protein profiling--platforms that collectively support the comprehensive analysis of DNA sequence variants across the genome and the global gene and protein expression changes that distinguish health from disease. Now used extensively in all facets of biomedical investigation, genomic and proteomic tools are already beginning to pinpoint molecular variants that influence risk and outcome in common diseases, and to thereby inform and direct development of novel molecular biomarkers and drug targets. As evidenced by recent advances in DNA sequencing methods, continued improvements in the scope, power, and cost efficiency of genomic and proteomic technologies should ensure their capacity to provide the scale and depth of knowledge required for translating genome sequence information into major medical impact.

  13. Natural selection on functional modules, a genome-wide analysis.

    PubMed

    Serra, François; Arbiza, Leonardo; Dopazo, Joaquín; Dopazo, Hernán

    2011-03-01

    Classically, the functional consequences of natural selection over genomes have been analyzed as the compound effects of individual genes. The current paradigm for large-scale analysis of adaptation is based on the observed significant deviations of rates of individual genes from neutral evolutionary expectation. This approach, which assumed independence among genes, has not been able to identify biological functions significantly enriched in positively selected genes in individual species. Alternatively, pooling related species has enhanced the search for signatures of selection. However, grouping signatures does not allow testing for adaptive differences between species. Here we introduce the Gene-Set Selection Analysis (GSSA), a new genome-wide approach to test for evidences of natural selection on functional modules. GSSA is able to detect lineage specific evolutionary rate changes in a notable number of functional modules. For example, in nine mammal and Drosophilae genomes GSSA identifies hundreds of functional modules with significant associations to high and low rates of evolution. Many of the detected functional modules with high evolutionary rates have been previously identified as biological functions under positive selection. Notably, GSSA identifies conserved functional modules with many positively selected genes, which questions whether they are exclusively selected for fitting genomes to environmental changes. Our results agree with previous studies suggesting that adaptation requires positive selection, but not every mutation under positive selection contributes to the adaptive dynamical process of the evolution of species.

  14. Genomic tools to improve progress and preserve variation for future generations

    USDA-ARS's Scientific Manuscript database

    Use of genomic tools has greatly decreased generation intervals and increased genetic progress in dairy cattle, but faster selection cycles can also increase rates of inbreeding per unit of time. Average pedigree inbreeding of Holstein cows increased from 4.6% in 2000 to 5.6% in 2009 to 6.6% in 201...

  15. Applying Genomic and Genetic Tools to Understand and Mitigate Damage from Exposure to Toxins

    DTIC Science & Technology

    2011-10-01

    Understand and Mitigate Damage from Exposure to Toxins PRINCIPAL INVESTIGATOR: Richard Myers, Ph.D...2010 – 22 September 2011 4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER Applying Genomic and Genetic Tools to Understand and Mitigate Damage from...dose-response to PB in larval and adult zebrafish at the biochemical and transcriptomic levels, validated a methods of treatment for inducing stress

  16. Genomic tools for developing markers for postharvest disease resistance in Rosaceae fruit crops

    USDA-ARS's Scientific Manuscript database

    A wealth of new plant genomic information and molecular tools have been developed over the past ten years and now the challenge is to learn how to apply this information to address critical production problems, such as disease resistance and abiotic stress tolerance. Malus sieversii, an apple speci...

  17. Global nutrient profiling by Phenotype MicroArrays: a tool complementing genomic and proteomic studies in conidial fungi*

    PubMed Central

    Atanasova, Lea; Druzhinina, Irina S.

    2010-01-01

    Conidial fungi or molds and mildews are widely used in modern biotechnology as producers of antibiotics and other secondary metabolites, industrially important enzymes, chemicals and food. They are also important pathogens of animals including humans and agricultural crops. These various applications and extremely versatile natural phenotypes have led to the constantly growing list of complete genomes which are now available. Functional genomics and proteomics widely exploit the genomic information to study the cell-wide impact of altered genes on the phenotype of an organism and its function. This allows for global analysis of the information flow from DNA to RNA to protein, but it is usually not sufficient for the description of the global phenotype of an organism. More recently, Phenotype MicroArray (PM) technology has been introduced as a tool to characterize the metabolism of a (wild) fungal strain or a mutant. In this article, we review the background of PM applications for fungi and the methodic requirements to obtain reliable results. We also report examples of the versatility of this tool. PMID:20205302

  18. CRISPR-Cas: From the Bacterial Adaptive Immune System to a Versatile Tool for Genome Engineering.

    PubMed

    Kirchner, Marion; Schneider, Sabine

    2015-11-09

    The field of biology has been revolutionized by the recent advancement of an adaptive bacterial immune system as a universal genome engineering tool. Bacteria and archaea use repetitive genomic elements termed clustered regularly interspaced short palindromic repeats (CRISPR) in combination with an RNA-guided nuclease (CRISPR-associated nuclease: Cas) to target and destroy invading DNA. By choosing the appropriate sequence of the guide RNA, this two-component system can be used to efficiently modify, target, and edit genomic loci of interest in plants, insects, fungi, mammalian cells, and whole organisms. This has opened up new frontiers in genome engineering, including the potential to treat or cure human genetic disorders. Now the potential risks as well as the ethical, social, and legal implications of this powerful new technique move into the limelight.

  19. Conifer genomics and adaptation: at the crossroads of genetic diversity and genome function.

    PubMed

    Prunier, Julien; Verta, Jukka-Pekka; MacKay, John J

    2016-01-01

    Conifers have been understudied at the genomic level despite their worldwide ecological and economic importance but the situation is rapidly changing with the development of next generation sequencing (NGS) technologies. With NGS, genomics research has simultaneously gained in speed, magnitude and scope. In just a few years, genomes of 20-24 gigabases have been sequenced for several conifers, with several others expected in the near future. Biological insights have resulted from recent sequencing initiatives as well as genetic mapping, gene expression profiling and gene discovery research over nearly two decades. We review the knowledge arising from conifer genomics research emphasizing genome evolution and the genomic basis of adaptation, and outline emerging questions and knowledge gaps. We discuss future directions in three areas with potential inputs from NGS technologies: the evolutionary impacts of adaptation in conifers based on the adaptation-by-speciation model; the contributions of genetic variability of gene expression in adaptation; and the development of a broader understanding of genetic diversity and its impacts on genome function. These research directions promise to sustain research aimed at addressing the emerging challenges of adaptation that face conifer trees. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  20. Bacterial genome mining of enzymatic tools for alkyne biosynthesis

    PubMed Central

    Zhu, Xuejun; Su, Michael; Manickam, Kadhirvel; Zhang, Wenjun

    2015-01-01

    The alkyne is an important functionality widely used in material science, pharmaceutical science, and chemical biology, but the importance of this functionality is contrasted by the very limited number of enzymes known to be involved in alkyne biosynthesis. We recently reported the first known carrier protein-dependent pathway for terminal alkyne formation, and in silico analysis suggested that this mechanism could be widespread in bacteria. In this paper, we screened additional homologous gene cassettes presumed to be involved in alkyne biosynthesis using both in vitro biochemical study and an E. coli-polyketide synthase (PKS) reporting system for in vivo analysis. We discovered and characterized a new terminal alkyne biosynthetic pathway comprised of TtuA, B, and C from Teredinibacter turnerae T7901. While the acyl-CoA ligase homolog (TtuA) demonstrated promiscuity in the activation and loading of medium-chain fatty acids onto the carrier protein (TtuC), the desaturase homolog (TtuB) showed stringent substrate specificity towards C10 fatty acyl moieties. In addition, TtuB was demonstrated to be a bifunctional desaturase/acetylenase that efficiently catalyzed two sequential O2-dependent dehydrogenation reactions. A novel terminal-alkyne bearing polyketide was further produced upon co-expression of ttuABC and a PKS gene in E. coli. The discovery and characterization of TtuA, B, and C provides us with a new bifunctional desaturase/acetylenase for mechanistic and structural study and expands the scarce enzyme inventory for the biosynthesis of the alkyne functionality, which has important applications in synthetic and chemical biology. PMID:26441143

  1. OmniMapFree: A unified tool to visualise and explore sequenced genomes

    PubMed Central

    2011-01-01

    Background: Acquiring and exploring whole genome sequence information for a species under investigation is now a routine experimental approach. On most genome browsers, typically, only the DNA sequence, EST support, motif search results, and GO annotations are displayed. However, for many species, a growing volume of additional experimental information is available but this is rarely searchable within the landscape of the entire genome. Results: We have developed generic software that permits users to view a single genome in its entirety, in either its chromosome or supercontig context, within a single window. This software permits the genome to be displayed at any scale and with any features. Different data types and data sets, acquired from other types of studies including classical genetics, forward and reverse genetics, transcriptomics, proteomics and improved annotation from alternative sources, are displayed onto the genome. In each display, different types of information can be overlapped, then retrieved in the desired combinations and scales and used in follow-up analyses. The displays generated are of publication quality. Conclusions: OmniMapFree provides a unified, versatile and easy-to-use software tool for studying a single genome in association with all the other datasets and data types available for the organism. PMID:22085540

  2. CRISPR-Cas9: Tool for Qualitative and Quantitative Plant Genome Editing

    PubMed Central

    Noman, Ali; Aqeel, Muhammad; He, Shuilin

    2016-01-01

    Recent developments in genome editing techniques have aroused substantial excitement among agricultural scientists. These techniques offer new opportunities for developing improved plant lines through the addition of important traits or the removal of undesirable ones. Adoption of genome editing has been accelerated by the rapid development of clustered regularly interspaced short palindromic repeats (CRISPR) technology, which is emerging as a driving force for innovation in diverse branches of plant biology. CRISPR-Cas9-mediated genome editing is being used for rapid, easy and efficient alteration of genes in diverse plant species. With the conceptual work on CRISPR-Cas9 approaching completion, plant scientists are applying this genome editing tool to enhance crop attributes. The capability of this system to perform targeted and efficient modifications of genome sequence as well as gene expression will certainly spur novel developments not only in model plants but also in crop and ornamental plants. Additionally, because no foreign DNA is involved, this technique may help alleviate regulatory issues associated with genetically modified plants. We expect that prevailing challenges in plant science, such as genomic region manipulation and crop-specific vectors, will be addressed as this genome editing tool continues to develop. In this review, recent progress of CRISPR-Cas9 technology in plants is summarized and discussed. We review the significance of CRISPR-Cas9 for specific and non-traditional aspects of plant life. The review also covers the strengths of this technique in comparison with other genome editing techniques, e.g., zinc finger nucleases and transcription activator-like effector nucleases, and describes potential challenges in the coming decades. PMID:27917188

  3. CRISPR-Cas9: Tool for Qualitative and Quantitative Plant Genome Editing.

    PubMed

    Noman, Ali; Aqeel, Muhammad; He, Shuilin

    2016-01-01

    Recent developments in genome editing techniques have aroused substantial excitement among agricultural scientists. These techniques offer new opportunities for developing improved plant lines through the addition of important traits or the removal of undesirable ones. Adoption of genome editing has been accelerated by the rapid development of clustered regularly interspaced short palindromic repeats (CRISPR) technology, which is emerging as a driving force for innovation in diverse branches of plant biology. CRISPR-Cas9-mediated genome editing is being used for rapid, easy and efficient alteration of genes in diverse plant species. With the conceptual work on CRISPR-Cas9 approaching completion, plant scientists are applying this genome editing tool to enhance crop attributes. The capability of this system to perform targeted and efficient modifications of genome sequence as well as gene expression will certainly spur novel developments not only in model plants but also in crop and ornamental plants. Additionally, because no foreign DNA is involved, this technique may help alleviate regulatory issues associated with genetically modified plants. We expect that prevailing challenges in plant science, such as genomic region manipulation and crop-specific vectors, will be addressed as this genome editing tool continues to develop. In this review, recent progress of CRISPR-Cas9 technology in plants is summarized and discussed. We review the significance of CRISPR-Cas9 for specific and non-traditional aspects of plant life. The review also covers the strengths of this technique in comparison with other genome editing techniques, e.g., zinc finger nucleases and transcription activator-like effector nucleases, and describes potential challenges in the coming decades.

  4. Functional Genomic Analysis of C. elegans Molting

    PubMed Central

    Frand, Alison R; Russel, Sascha

    2005-01-01

    Although the molting cycle is a hallmark of insects and nematodes, neither the endocrine control of molting via size, stage, and nutritional inputs nor the enzymatic mechanism for synthesis and release of the exoskeleton is well understood. Here, we identify endocrine and enzymatic regulators of molting in C. elegans through a genome-wide RNA-interference screen. Products of the 159 genes discovered include annotated transcription factors, secreted peptides, transmembrane proteins, and extracellular matrix enzymes essential for molting. Fusions between several genes and green fluorescent protein show a pulse of expression before each molt in epithelial cells that synthesize the exoskeleton, indicating that the corresponding proteins are made in the correct time and place to regulate molting. We show further that inactivation of particular genes abrogates expression of the green fluorescent protein reporter genes, revealing regulatory networks that might couple the expression of genes essential for molting to endocrine cues. Many molting genes are conserved in parasitic nematodes responsible for human disease, and thus represent attractive targets for pesticide and pharmaceutical development. PMID:16122351

  5. Budding off: bringing functional genomics to Candida albicans

    PubMed Central

    Anderson, Matthew Z.

    2016-01-01

    Candida species are the most prevalent human fungal pathogens, with Candida albicans being the most clinically relevant species. Candida albicans resides as a commensal of the human gastrointestinal tract but is a frequent cause of opportunistic mucosal and systemic infections. Investigation of C. albicans virulence has traditionally relied on candidate gene approaches, but recent advances in functional genomics have now facilitated global, unbiased studies of gene function. Such studies include comparative genomics (both between and within Candida species), analysis of total RNA expression, and regulation and delineation of protein–DNA interactions. Additionally, large collections of mutant strains have begun to aid systematic screening of clinically relevant phenotypes. Here, we will highlight the development of functional genomics in C. albicans and discuss the use of these approaches to addressing both commensalism and pathogenesis in this species. PMID:26424829

  6. FuGE: Functional Genomics Experiment Object Model.

    PubMed

    Jones, Andrew R; Pizarro, Angel; Spellman, Paul; Miller, Michael

    2006-01-01

    This is an interim report on the Functional Genomics Experiment (FuGE) Object Model. FuGE is a framework for creating data standards for high-throughput biological experiments, developed by a consortium of researchers from academia and industry. FuGE supports rich annotation of samples, protocols, instruments, and software, as well as providing extension points for technology specific details. It has been adopted by microarray and proteomics standards bodies as a basis for forthcoming standards. It is hoped that standards developers for other omics techniques will join this collaborative effort; widespread adoption will allow uniform annotation of common parts of functional genomics workflows, reduce standard development and learning times through the sharing of consistent practice, and ease the construction of software for accessing and integrating functional genomics data.

  7. Budding off: bringing functional genomics to Candida albicans.

    PubMed

    Anderson, Matthew Z; Bennett, Richard J

    2016-03-01

    Candida species are the most prevalent human fungal pathogens, with Candida albicans being the most clinically relevant species. Candida albicans resides as a commensal of the human gastrointestinal tract but is a frequent cause of opportunistic mucosal and systemic infections. Investigation of C. albicans virulence has traditionally relied on candidate gene approaches, but recent advances in functional genomics have now facilitated global, unbiased studies of gene function. Such studies include comparative genomics (both between and within Candida species), analysis of total RNA expression, and regulation and delineation of protein-DNA interactions. Additionally, large collections of mutant strains have begun to aid systematic screening of clinically relevant phenotypes. Here, we will highlight the development of functional genomics in C. albicans and discuss the use of these approaches to addressing both commensalism and pathogenesis in this species. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  8. Functional innovations of three chronological mesohexaploid Brassica rapa genomes.

    PubMed

    Kim, Jungeun; Lee, Jeongyeo; Choi, Jae-Pil; Park, Inkyu; Yang, Kyungbong; Kim, Min Keun; Lee, Young Han; Nou, Ill-Sup; Kim, Dae-Soo; Min, Sung Ran; Park, Sang Un; Kim, HyeRan

    2014-07-18

    The Brassicaceae family is an exemplary model for studying plant polyploidy. The Brassicaceae knowledge-base includes the well-annotated Arabidopsis thaliana reference sequence; well-established evidence for three rounds of whole genome duplication (WGD); and the conservation of genomic structure, with 24 conserved genomic blocks (GBs). The recently released Brassica rapa draft genome provides an ideal opportunity to update our knowledge of the conserved genomic structures in Brassica, and to study evolutionary innovations of the mesohexaploid plant, B. rapa. Three chronological B. rapa genomes (recent, young, and old) were reconstructed with sequence divergences, revealing a trace of recursive WGD events. A total of 636 fast-evolving genes were unevenly distributed throughout the recent and young genomes. The representative Gene Ontology (GO) terms for these genes were 'stress response' and 'development', both through a change in protein modification or signaling, rather than by enhancing signal recognition. In the retention pattern analysis, 98% of B. rapa genes were retained as collinear gene pairs; 77% of those were singly retained in recent or young genomes resulting from death of the ancestral copies, while others were multi-retained as long retention genes. GO enrichments indicated that single retention genes mainly function in the interpretation of genetic information, whereas multi-retention genes were biased toward signal response, especially regarding development and defense. In the recent genome, 13,302, 5,790, and 20 gene pairs were multi-retained following Brassica whole genome triplication (WGT) events with 2, 3, and 4 homoeologous copies, respectively. Enriched GO-slim terms from B. rapa homoeologues imply that a major effect of the B. rapa WGT may have been to acquire environmental adaptability or to change the course of development.
These homoeologues seem to more frequently undergo subfunctionalization with spatial expression patterns compared with

  9. Recombination between defective tombusvirus RNAs generates functional hybrid genomes

    SciTech Connect

    White, K.A.; Morris, T.J.

    1994-04-26

    The tombusviruses represent a group of small icosahedral plant viruses that contain monopartite positive-sense RNA genomes. Tombusviruses are able to generate small replicating deletion mutants of their genomes (i.e., defective interfering RNAs) during infections via RNA recombination and/or rearrangement. To further study the process of RNA recombination and to determine whether tombusviruses were capable of trans-recombination, protoplasts were coinoculated with in vitro-generated transcripts of a nonreplicating 3′-truncated genomic RNA of cucumber necrosis tombusvirus and either replicative or replication-defective DI RNAs of tomato bushy stunt tombusvirus. After 48-hr incubation, two dominant replicative chimeric recombinant viral RNA populations were detected that contained various large contiguous 5′ segments of the cucumber necrosis tombusvirus genomic RNA fused to 3′-terminal regions of the tomato bushy stunt tombusvirus defective interfering RNA. Some of the larger chimeric recombinants formed in protoplasts were able to systemically infect plants and induce wild-type symptoms. In addition, a functional chimeric genome was generated in planta after direct coinoculation of whole plants with the defective RNA components. These results indicate that (i) RNA recombination can occur relatively efficiently in single-cell infections, (ii) trans-recombination can occur with nonreplicating viral RNA components, and (iii) functional chimeric genomes can be generated via recombination. Possible mechanisms for the formation of the recombinants are proposed, and evolutionary implications are discussed.

  10. Analysis of Multiple Genomic Sequence Alignments: A Web Resource, Online Tools, and Lessons Learned From Analysis of Mammalian SCL Loci

    PubMed Central

    Chapman, Michael A.; Donaldson, Ian J.; Gilbert, James; Grafham, Darren; Rogers, Jane; Green, Anthony R.; Göttgens, Berthold

    2004-01-01

    Comparative analysis of genomic sequences is becoming a standard technique for studying gene regulation. However, only a limited number of tools are currently available for the analysis of multiple genomic sequences. An extensive data set for the testing and training of such tools is provided by the SCL gene locus. Here we have expanded the data set to eight vertebrate species by sequencing the dog SCL locus and by annotating the dog and rat SCL loci. To provide a resource for the bioinformatics community, all SCL sequences and functional annotations, comprising a collation of the extensive experimental evidence pertaining to SCL regulation, have been made available via a Web server. A Web interface to new tools specifically designed for the display and analysis of multiple sequence alignments was also implemented. The unique SCL data set and new sequence comparison tools allowed us to perform a rigorous examination of the true benefits of multiple sequence comparisons. We demonstrate that multiple sequence alignments are, overall, superior to pairwise alignments for identification of mammalian regulatory regions. In the search for individual transcription factor binding sites, multiple alignments markedly increase the signal-to-noise ratio compared to pairwise alignments. PMID:14718377

  11. Functional Measurement: An Incredibly Flexible Tool

    ERIC Educational Resources Information Center

    Mullet, Etienne; Morales Martinez, Guadalupe Elizabeth; Makris, Ioannis; Roge, Bernadette; Munoz Sastre, Maria Teresa

    2012-01-01

    Functional Measurement (FM) has been applied to a variety of settings that can be considered as "extreme" settings; that is, settings involving participants with severe cognitive disabilities or involving unusual stimulus material. FM has, for instance, been successfully applied for analyzing (a) numerosity judgments among children as…

  12. Physical function assessment tools in pediatric rheumatology

    PubMed Central

    Moorthy, Lakshmi Nandini; Peterson, Margaret GE; Harrison, Melanie J; Onel, Karen B; Lehman, Thomas JA

    2008-01-01

    Pediatric rheumatic diseases with predominant musculoskeletal involvement such as juvenile idiopathic arthritis (JIA) and juvenile dermatomyositis (JDM) can cause considerable physical functional impairment and significantly affect the children's quality of life (QOL). Physical function, QOL, health-related QOL (HRQOL) and health status are personal constructs used as outcomes to estimate the impact of these diseases and often used as proxies for each other. The chronic, fluctuating nature of these diseases differs within and between patients, and complicates the measurement of these outcomes. In children, their growing needs and expectations, limited use of age-specific questionnaires, and the use of proxy respondents further influence this evaluation. This article will briefly review the different constructs inclusive of and related to physical function, and the scales used for measuring them. An understanding of these instruments will enable assessment of functional outcome in clinical studies of children with rheumatic diseases, measure the impact of the disease and treatments on their lives, and guide us in formulating appropriate interventions. PMID:18533038

  13. Transcriptome sequencing and microarray development for the Manila clam, Ruditapes philippinarum: genomic tools for environmental monitoring.

    PubMed

    Milan, Massimo; Coppe, Alessandro; Reinhardt, Richard; Cancela, Leonor M; Leite, Ricardo B; Saavedra, Carlos; Ciofi, Claudio; Chelazzi, Guido; Patarnello, Tomaso; Bortoluzzi, Stefania; Bargelloni, Luca

    2011-05-12

    The Manila clam, Ruditapes philippinarum, is one of the major aquaculture species in the world and a potential sentinel organism for monitoring the status of marine ecosystems. However, genomic resources for R. philippinarum are still extremely limited. Global analysis of gene expression profiles is increasingly used to evaluate the biological effects of various environmental stressors on aquatic animals under either artificial conditions or in the wild. Here, we report on the development of a transcriptomic platform for global gene expression profiling in the Manila clam. A normalized cDNA library representing a mixture of adult tissues was sequenced using an ultra-high-throughput sequencing technology (Roche 454). A database consisting of 32,606 unique transcripts was constructed, 9,747 (30%) of which could be annotated by similarity. An oligo-DNA microarray platform was designed and applied to profile gene expression of digestive gland and gills. Functional annotation of differentially expressed genes between different tissues was performed by enrichment analysis. Analysis of Natural Antisense Transcript (NAT) expression was also performed, and bi-directional transcription appears to be a common phenomenon in the R. philippinarum transcriptome. A preliminary study on clam samples collected in a highly polluted area of the Venice Lagoon demonstrated the applicability of genomic tools to environmental monitoring. The transcriptomic platform developed for the Manila clam confirmed the high level of reproducibility of current microarray technology. Next-generation sequencing provided a good representation of the clam transcriptome. Despite the known limitations in transcript annotation and sequence coverage for non-model species, sufficient information was obtained to identify a large set of genes potentially involved in cellular response to environmental stress.

  14. Transcriptome sequencing and microarray development for the Manila clam, Ruditapes philippinarum: genomic tools for environmental monitoring

    PubMed Central

    2011-01-01

    Background: The Manila clam, Ruditapes philippinarum, is one of the major aquaculture species in the world and a potential sentinel organism for monitoring the status of marine ecosystems. However, genomic resources for R. philippinarum are still extremely limited. Global analysis of gene expression profiles is increasingly used to evaluate the biological effects of various environmental stressors on aquatic animals under either artificial conditions or in the wild. Here, we report on the development of a transcriptomic platform for global gene expression profiling in the Manila clam. Results: A normalized cDNA library representing a mixture of adult tissues was sequenced using an ultra-high-throughput sequencing technology (Roche 454). A database consisting of 32,606 unique transcripts was constructed, 9,747 (30%) of which could be annotated by similarity. An oligo-DNA microarray platform was designed and applied to profile gene expression of digestive gland and gills. Functional annotation of differentially expressed genes between different tissues was performed by enrichment analysis. Analysis of Natural Antisense Transcript (NAT) expression was also performed, and bi-directional transcription appears to be a common phenomenon in the R. philippinarum transcriptome. A preliminary study on clam samples collected in a highly polluted area of the Venice Lagoon demonstrated the applicability of genomic tools to environmental monitoring. Conclusions: The transcriptomic platform developed for the Manila clam confirmed the high level of reproducibility of current microarray technology. Next-generation sequencing provided a good representation of the clam transcriptome. Despite the known limitations in transcript annotation and sequence coverage for non-model species, sufficient information was obtained to identify a large set of genes potentially involved in cellular response to environmental stress. PMID:21569398

  15. University of Texas Southwestern Medical Center: Functional Signature Ontology Tool: Triplicate Measurements of Reporter Gene Expression in Response to Individual Genetic and Chemical Perturbations in HCT116 Cells | Office of Cancer Genomics

    Cancer.gov

    The goal of this project is to use an eight-gene expression profile to define functional signatures for small molecules and natural products with heretofore undefined mechanism of action. Two genes in the eight-gene set are used as internal controls and do not vary across gene expression array data collected from the public domain. The remaining six genes are found to vary independently across a large collection of publicly available gene expression array datasets.

  16. A close relationship between primary nucleotides sequence structure and the composition of functional genes in the genome of prokaryotes.

    PubMed

    Garcia, Juan A L; Fernández-Guerra, Antoni; Casamayor, Emilio O

    2011-12-01

    Comparative genomics is an essential tool to unravel how genomes change over evolutionary time and to gain clues on the links between functional genomics and evolution. In prokaryotes, the large, good-quality genome sequences available in public databases and the recently developed large-scale computational methods offer an unprecedented view on the ecology and evolution of microorganisms through comparative genomics. In this work, we examined the links among genome structure (i.e., the sequential distribution of nucleotides itself by detrended fluctuation analysis, DFA) and genomic diversity (i.e., gene functionality by Clusters of Orthologous Genes, COGs) in 828 fully sequenced prokaryotic genomes from 548 different bacteria and archaea species. DFA scaling exponent α indicated persistent long-range correlations (fractality) in each genome analyzed. Higher resolution power was found when considering the sequential succession of purine (AG) vs. pyrimidine (CT) bases than either keto (GT) vs. amino (AC) forms or strongly (GC) vs. weakly (AT) bonded nucleotides. Interestingly, the phyla Aquificae, Fusobacteria, Dictyoglomi, Nitrospirae, and Thermotogae were closer to archaea than to their bacterial counterparts. A strong significant correlation was found between scaling exponent α and COGs distribution, and we consistently observed that the larger α, the more heterogeneous the gene distribution within each functional category, suggesting a close relationship between primary nucleotides sequence structure and functional genes composition.
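
    As an illustration of the DFA step mentioned in this abstract, the following is a minimal sketch and not the authors' pipeline. It assumes first-order (linear) detrending, a purine (+1) vs. pyrimidine (-1) encoding, and an arbitrary set of window sizes; the function names (purine_walk, fluctuation, dfa_alpha, random_sequence) are hypothetical.

```python
import math
import random

PURINES = {"A", "G"}


def purine_walk(seq):
    """Map bases to +1 (purine) or -1 (pyrimidine); other symbols are skipped."""
    return [1.0 if b in PURINES else -1.0 for b in seq.upper() if b in "ACGT"]


def _linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fluctuation(profile, n):
    """Root-mean-square residual after linear detrending in windows of size n."""
    xs = list(range(n))
    squared = 0.0
    points = 0
    for start in range(0, len(profile) - n + 1, n):
        window = profile[start:start + n]
        slope, intercept = _linear_fit(xs, window)
        squared += sum((y - (slope * x + intercept)) ** 2
                       for x, y in zip(xs, window))
        points += n
    return math.sqrt(squared / points)


def dfa_alpha(seq, window_sizes=(8, 16, 32, 64, 128)):
    """Estimate the scaling exponent alpha as the slope of log F(n) vs. log n.

    Assumes the sequence is much longer than the largest window and is not
    perfectly uniform (a zero fluctuation would make the logarithm undefined).
    """
    steps = purine_walk(seq)
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for s in steps:
        total += s - mean          # integrated, mean-centred walk
        profile.append(total)
    log_n = [math.log(n) for n in window_sizes]
    log_f = [math.log(fluctuation(profile, n)) for n in window_sizes]
    alpha, _ = _linear_fit(log_n, log_f)
    return alpha


def random_sequence(length, seed=0):
    """Uncorrelated test sequence (hypothetical helper, not real genome data)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))
```

    For an uncorrelated sequence such as random_sequence(20000), the estimate should fall near 0.5; values clearly above 0.5 would correspond to the persistent long-range correlations described in the abstract.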

  17. Bootstrap, Bayesian probability and maximum likelihood mapping: exploring new tools for comparative genome analyses

    PubMed Central

    Zhaxybayeva, Olga; Gogarten, J Peter

    2002-01-01

    Background: Horizontal gene transfer (HGT) played an important role in shaping microbial genomes. In addition to genes under sporadic selection, HGT also affects housekeeping genes and those involved in information processing, even ribosomal RNA encoding genes. Here we describe tools that provide an assessment and graphic illustration of the mosaic nature of microbial genomes. Results: We adapted the Maximum Likelihood (ML) mapping to the analyses of all detected quartets of orthologous genes found in four genomes. We have automated the assembly and analyses of these quartets of orthologs given the selection of four genomes. We compared the ML-mapping approach to more rigorous Bayesian probability and Bootstrap mapping techniques. The latter two approaches appear to be more conservative than the ML-mapping approach, but qualitatively all three approaches give equivalent results. All three tools were tested on mitochondrial genomes, which presumably were inherited as a single linkage group. Conclusions: In some instances of interphylum relationships we find nearly equal numbers of quartets strongly supporting the three possible topologies. In contrast, our analyses of genome quartets containing the cyanobacterium Synechocystis sp. indicate that a large part of the cyanobacterial genome is related to that of low GC Gram positives. Other groups that had been suggested as sister groups to the cyanobacteria contain many fewer genes that group with the Synechocystis orthologs. Interdomain comparisons of genome quartets containing the archaeon Halobacterium sp. revealed that Halobacterium sp. shares more genes with Bacteria that live in the same environment than with Bacteria that are more closely related based on rRNA phylogeny. Many of these genes encode proteins involved in substrate transport and metabolism and in information storage and processing.
The performed analyses demonstrate that relationships among prokaryotes cannot be accurately depicted by or inferred from
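
    The quartet tallying described in this record can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' tool: it takes the log-likelihoods of the three possible topologies of a four-taxon tree as given (computing them requires a phylogenetics package), converts them to posterior weights as in likelihood mapping, and counts a quartet as strongly supporting a topology when its weight reaches a threshold. The 0.9 threshold and all function names are assumptions.

```python
import math


def posterior_weights(log_likelihoods):
    """Convert three log-likelihoods into weights p_i = L_i / (L_1 + L_2 + L_3)."""
    top = max(log_likelihoods)
    # Subtract the maximum before exponentiating to avoid underflow.
    relative = [math.exp(value - top) for value in log_likelihoods]
    total = sum(relative)
    return [r / total for r in relative]


def classify_quartet(log_likelihoods, threshold=0.9):
    """Return the index (0, 1, 2) of a strongly supported topology, else None."""
    weights = posterior_weights(log_likelihoods)
    best = max(range(3), key=lambda i: weights[i])
    return best if weights[best] >= threshold else None


def tally_support(quartets, threshold=0.9):
    """Count how many quartets strongly support each topology."""
    counts = {0: 0, 1: 0, 2: 0, "unresolved": 0}
    for log_likelihoods in quartets:
        best = classify_quartet(log_likelihoods, threshold)
        counts["unresolved" if best is None else best] += 1
    return counts
```

    Nearly equal counts for the three topologies across many ortholog quartets would correspond to the mosaic signal the abstract reports for some interphylum comparisons.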

  18. WWOX: its genomics, partners, and functions.

    PubMed

    Del Mare, Sara; Salah, Zaidoun; Aqeilan, Rami I

    2009-11-01

    The WW domain-containing oxidoreductase (WWOX) spans one of the most active common fragile sites (CFSs) involved in cancer, FRA16D. WWOX encodes a 46-kDa protein that contains two N-terminal WW domains and a central short-chain dehydrogenase/reductase (SDR) domain. Through its WW domain, Wwox interacts with its partners and modulates their functions. Our data indicate that Wwox suppresses the transactivation function of several transcription factors implicated in neoplasia by sequestering them in the cytoplasm. Work from our laboratory and other research groups has demonstrated that Wwox participates in a number of cellular processes including growth, differentiation, apoptosis, and tumor suppression. Targeted deletion of the Wwox gene in mice causes increased spontaneous and chemically induced tumor incidence, supporting a bona fide tumor suppressor function of WWOX. Moreover, generation of the Wwox-deficient mice uncovers, at least in part, some of the physiological in vivo functions of the WWOX gene. This review focuses on recent progress that elucidates Wwox functions in biology and pathology.

  19. Structure and Functional Studies on Dengue-2 Virus Genome

    DTIC Science & Technology

    1986-03-01

    STRUCTURE AND FUNCTIONAL STUDIES ON DENGUE-2 VIRUS GENOME. Final Report. Radha Krishnan Padmanabhan, Ph.D. March 1, 1986. Supported by U.S...and Functional Studies on Dengue-2 Virus Genome 12. PERSONAL AUTHOR(S) Radha Krishnan Padmanabhan 13a. TYPE OF REPORT 13b. TIME COVERED 14. DATE OF...3′-end of Dengue RNA in order to facilitate cDNA synthesis by oligo d(T) priming as proposed in the original research project. 2. We also showed that

  20. From functional genomics to systems biology: concepts and practices.

    PubMed

    Auffray, Charles; Imbeaud, Sandrine; Roux-Rouquié, Magali; Hood, Leroy

    2003-01-01

    Systems biology is the iterative and integrative study of biological systems as systems in response to perturbations. It is founded on hypotheses formalized in models built from the results of global functional genomics analyses of the complexity of the genome, transcriptome, proteome, metabolome, etc. Its implementation by cross-disciplinary teams in a standardized mode under quality assurance should allow accessing the small variations of the large number of elements determining functioning of biological systems. Galactose utilization in yeast, and sea urchin development are two examples of emerging systems biology.

  1. TreeQ-VISTA: An Interactive Tree Visualization Tool with Functional Annotation Query Capabilities

    SciTech Connect

    Gu, Shengyin; Anderson, Iain; Kunin, Victor; Cipriano, Michael; Minovitsky, Simon; Weber, Gunther; Amenta, Nina; Hamann, Bernd; Dubchak, Inna

    2007-05-07

    Summary: We describe a general multiplatform exploratory tool called TreeQ-Vista, designed for presenting functional annotations in a phylogenetic context. Traits, such as phenotypic and genomic properties, are interactively queried from a relational database with a user-friendly interface which provides a set of tools for users with or without SQL knowledge. The query results are projected onto a phylogenetic tree and can be displayed in multiple color groups. A rich set of browsing, grouping and query tools are provided to facilitate trait exploration, comparison and analysis. Availability: The program, detailed tutorial and examples are available online at http://genome-test.lbl.gov/vista/TreeQVista.

  2. Solutions for data integration in functional genomics: a critical assessment and case study.

    PubMed

    Smedley, Damian; Swertz, Morris A; Wolstencroft, Katy; Proctor, Glenn; Zouberakis, Michael; Bard, Jonathan; Hancock, John M; Schofield, Paul

    2008-11-01

    The torrent of data emerging from the application of new technologies to functional genomics and systems biology can no longer be contained within the traditional modes of data sharing and publication, with the consequence that data is being deposited in, distributed across and disseminated through an increasing number of databases. The resulting fragmentation poses serious problems for the model organism community, which increasingly relies on data mining and computational approaches that require gathering of data from a range of sources. In the light of these problems, the European Commission has funded a coordination action, CASIMIR (coordination and sustainability of international mouse informatics resources), with a remit to assess the technical and social aspects of database interoperability that currently prevent the full realization of the potential of data integration in mouse functional genomics. In this article, we assess the current problems with interoperability, with particular reference to mouse functional genomics, and critically review the technologies that can be deployed to overcome them. We describe a typical use-case where an investigator wishes to gather data on variation, genomic context and metabolic pathway involvement for genes discovered in a genome-wide screen. We go on to develop an automated approach involving an in silico experimental workflow tool, Taverna, using web services, BioMart and MOLGENIS technologies for data retrieval. Finally, we focus on the current impediments to adopting such an approach in a wider context, and strategies to overcome them.

  3. Family history in public health practice: a genomic tool for disease prevention and health promotion.

    PubMed

    Valdez, Rodolfo; Yoon, Paula W; Qureshi, Nadeem; Green, Ridgely Fisk; Khoury, Muin J

    2010-01-01

    Family history is a risk factor for many chronic diseases, including cancer, cardiovascular disease, and diabetes. Professional guidelines usually include family history to assess health risk, initiate interventions, and motivate behavioral changes. The advantages of family history over other genomic tools include a lower cost, greater acceptability, and a reflection of shared genetic and environmental factors. However, the utility of family history in public health has been poorly explored. To establish family history as a public health tool, it needs to be evaluated within the ACCE framework (analytical validity; clinical validity; clinical utility; and ethical, legal, and social issues). Currently, private and public organizations are developing tools to collect standardized family histories of many diseases. Their goal is to create family history tools that have decision support capabilities and are compatible with electronic health records. These advances will help realize the potential of family history as a public health tool.

  4. Integrative genomics to dissect retinoid functions.

    PubMed

    Mendoza-Parra, Marco-Antonio; Gronemeyer, Hinrich

    2014-01-01

    Retinoids and rexinoids, like all other ligands of the nuclear receptor (NR) family, act as ligand-regulated trans-acting transcription factors that bind to cis-acting DNA regulatory elements in the promoter regions of target genes (for reviews see [12, 22, 23, 26, 36]). Ligand binding modulates the communication functions of the receptor with the intracellular environment, which essentially entails receptor-protein and receptor-DNA or receptor-chromatin interactions. In this communication network, the receptor simultaneously serves as both intracellular sensor and regulator of cell/organ functions. Receptors are "intelligent" mediators of the information encoded in the chemical structure of a nuclear receptor ligand, as they interpret this information in the context of cellular identity and cell-physiological status and convert it into a dynamic chain of receptor-protein and receptor-DNA interactions. To process input and output information, they are composed of a modular structure with several domains that have evolved to exert particular molecular recognition functions. As detailed in other chapters in this volume, the main functional domains are the DNA-binding (DBD) and ligand-binding (LBD) domains [5-7, 38, 56, 71]. The LBD serves as a dual input-output information processor. Inputs, such as ligand binding or receptor phosphorylations, induce allosteric changes in receptor surfaces that serve as docking sites for outputs, such as subunits of transcription and epigenetic machineries or enzyme complexes. The complexity of input and output signals and their interdependencies is far from being understood.

  5. Targeted Genome-Wide Enrichment of Functional Regions

    PubMed Central

    Senapathy, Periannan; Bhasi, Ashwini; Mattox, Jeffrey; Dhandapany, Perundurai S.; Sadayappan, Sakthivel

    2010-01-01

    Only a small fraction of a large genome such as the human genome consists of functional regions such as exons, promoters, and polyA sites. A platform technique for selective enrichment of functional genomic regions will enable several next-generation sequencing applications that include the discovery of causal mutations for disease and drug response. Here, we describe a powerful platform technique, termed “functional genomic fingerprinting” (FGF), for the multiplexed genomewide isolation and analysis of targeted regions such as the exome, promoterome, or exon splice enhancers. The technique employs a fixed part of a uniquely designed Fixed-Randomized primer, while the randomized part contains all the possible sequence permutations. The Fixed-Randomized primers bind with full sequence complementarity at multiple sites where the fixed sequence (such as the splice signals) occurs within the genome, and multiplex amplify many regions bounded by the fixed sequences (e.g., exons). Notably, validation of this technique using the cardiac myosin binding protein-C (MYBPC3) gene as an example strongly supports the application and efficacy of this method. Further, assisted by genomewide computational analyses of such sequences, the FGF technique may provide a unique platform for high-throughput sample production and analysis of targeted genomic regions by next-generation sequencing techniques, with powerful applications in discovering disease and drug response genes. PMID:20585402

  6. Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

    PubMed

    Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

    2015-01-01

    Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease and speed at a substantially reduced cost per nucleotide, hence cost per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, it remains challenging to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses.
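
    The flagging idea in this abstract can be sketched in a few lines. AutoCurE itself is an Excel-based tool; the following is only an illustrative Python outline under stated assumptions. The field names, the three flag labels, and the case- and whitespace-insensitive comparison are all hypothetical choices, and the example records in the usage note are made up.

```python
# Hypothetical field names; the abstract mentions nine metadata fields
# without listing their exact labels.
FIELDS = ("genome_name", "refseq_accession", "bioproject_uid")


def normalize(value):
    """Collapse whitespace and case so trivial differences are not flagged."""
    if value is None:
        return ""
    return " ".join(str(value).split()).lower()


def flag_record(downloaded, report, fields=FIELDS):
    """Compare one downloaded record against its genome-report entry.

    Returns a dict mapping each problematic field to a flag label;
    an empty dict means no inconsistency was found.
    """
    flags = {}
    for field in fields:
        local = normalize(downloaded.get(field))
        reference = normalize(report.get(field))
        if not local:
            flags[field] = "missing"
        elif not reference:
            flags[field] = "absent from report"
        elif local != reference:
            flags[field] = "mismatch"
    return flags
```

    For example, a downloaded record with genome_name "Example bacterium A" and an empty bioproject_uid, checked against a report listing "Example bacterium strain A", would be flagged as a mismatch on the name and as missing on the UID.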

  7. Suite of tools for statistical N-gram language modeling for pattern mining in whole genome sequences.

    PubMed

    Ganapathiraju, Madhavi K; Mitchell, Asia D; Thahir, Mohamed; Motwani, Kamiya; Ananthasubramanian, Seshan

    2012-12-01

    Genome sequences contain a number of patterns that have biomedical significance. Repetitive sequences of various kinds are a primary component of most of the genomic sequence patterns. We extended the suffix-array based Biological Language Modeling Toolkit to compute n-gram frequencies as well as n-gram language-model based perplexity in windows over the whole genome sequence to find biologically relevant patterns. We present the suite of tools and their application for analysis on whole human genome sequence.
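
    A windowed n-gram perplexity scan of the kind described here can be sketched as follows. This is a simplified illustration, not the Biological Language Modeling Toolkit: it counts n-grams with a plain dictionary rather than a suffix array, uses add-one smoothing, and assumes a four-letter DNA alphabet and n of at least 2. All function names and default parameters are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(seq, n):
    """Count every overlapping n-gram in the sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def window_perplexity(window, n, grams, contexts, alphabet_size=4):
    """Perplexity of one window under an add-one-smoothed n-gram model.

    grams holds n-gram counts and contexts holds (n-1)-gram counts,
    both taken over the whole sequence. Requires len(window) >= n >= 2.
    """
    events = len(window) - n + 1
    log_prob = 0.0
    for i in range(events):
        gram = window[i:i + n]
        # P(last base | preceding n-1 bases) with add-one smoothing.
        p = (grams[gram] + 1) / (contexts[gram[:-1]] + alphabet_size)
        log_prob += math.log2(p)
    return 2.0 ** (-log_prob / events)


def scan(seq, n=2, window=100, step=100):
    """Return (start, perplexity) for each window along the sequence."""
    grams = ngram_counts(seq, n)
    contexts = ngram_counts(seq, n - 1)
    return [(start, window_perplexity(seq[start:start + window], n, grams, contexts))
            for start in range(0, len(seq) - window + 1, step)]
```

    Windows covering repetitive stretches are more predictable and should therefore show lower perplexity than windows over more complex sequence, which is one way such a scan could highlight candidate patterns.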

  8. Genomic Functionalization: The Next Revolution In Biology

    SciTech Connect

    Anderson, Peter; Schoeniger, Joseph S.; Imbro, Paula M.

    2014-07-01

    We have implemented a ligand-alignment algorithm into our developed computational pipeline for identifying specificity-determining features (SDFs) in protein-ligand complexes. Given a set of protein-ligand complex structures, the algorithm aligns the complexes by ligand rather than by the Cα-RMSD or standard approach, providing a single reference frame for extracting SDFs. We anticipate that this ligand-alignment capability will be highly useful for protein function prediction. We already have a database containing > 20 K ligand-protein complex crystal structures taken from the Protein Data Bank. By aligning these proteins to single reference frames using ligand alignment, we can submit the complexes to our pipeline for SDF extraction. The SDFs derived from this training procedure can be used as thumbprints that are hallmarks of individual enzyme classes. These SDF thumbprints may then serve as guides to the prediction of function of new unknown proteins.

  9. Exploring Protein Function Using the Saccharomyces Genome Database.

    PubMed

    Wong, Edith D

    2017-01-01

    Elucidating the function of individual proteins will help to create a comprehensive picture of cell biology, as well as shed light on human disease mechanisms, possible treatments, and cures. Due to its compact genome, and extensive history of experimentation and annotation, the budding yeast Saccharomyces cerevisiae is an ideal model organism in which to determine protein function. This information can then be leveraged to infer functions of human homologs. Despite the large amount of research and biological data about S. cerevisiae, many proteins' functions remain unknown. Here, we explore ways to use the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) to predict the function of proteins and gain insight into their roles in various cellular processes.

  10. Comparative Analysis of Chloroplast Genomes: Functional Annotation, Genome-Based Phylogeny, and Deduced Evolutionary Patterns

    PubMed Central

    Rivas, Javier De Las; Lozano, Juan Jose; Ortiz, Angel R.

    2002-01-01

    All protein sequences from 19 complete chloroplast genomes (cpDNA) have been studied using a new computational method able to analyze functional correlations among series of protein sequences contained in complete proteomes. First, all open reading frames (ORFs) from the cpDNAs, comprising a total of 2266 protein sequences, were compared against the 3168 proteins from the Synechocystis PCC6803 complete genome to find functionally related orthologous proteins. Additionally, all cpDNA genomes were pairwise compared to find orthologous groups not present in cyanobacteria. Annotations in the cluster of orthologous proteins database and CyanoBase were used as reference for the functional assignments. Following this protocol, new functional assignments were made for ORFs of unknown function and for ycfs (hypothetical chloroplast frames), which still lack a functional assignment. Using this information, a matrix of functional relationships was derived from profiles of the presence and/or absence of orthologous proteins; the matrix included 1837 proteins in 277 orthologous clusters. A factor analysis study of this matrix, followed by cluster analysis, allowed us to obtain accurate phylogenetic reconstructions and the detection of genes probably involved in speciation as phylogenetic correlates. Finally, by grouping common evolutionary patterns, we show that it is possible to determine functionally linked protein networks. This has allowed us to suggest putative associations for some unknown ORFs. PMID:11932241
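
    The presence/absence matrix described in this abstract can be illustrated with a small sketch. It is not the authors' method: the factor analysis and cluster analysis are replaced here by two much simpler stand-ins, grouping clusters that share an identical profile (as candidates for functional linkage) and a Jaccard distance between genomes. The genome and cluster identifiers in the usage note are made-up placeholders, and the function names are hypothetical.

```python
from collections import defaultdict


def build_profiles(genome_to_clusters):
    """Build a presence (1) / absence (0) profile for every orthologous cluster.

    genome_to_clusters maps a genome name to the set of cluster IDs it contains.
    Returns the sorted genome order and a dict cluster -> tuple of 0/1 values.
    """
    genomes = sorted(genome_to_clusters)
    clusters = sorted(set().union(*genome_to_clusters.values()))
    profiles = {
        cluster: tuple(int(cluster in genome_to_clusters[g]) for g in genomes)
        for cluster in clusters
    }
    return genomes, profiles


def group_by_profile(profiles):
    """Group clusters with identical profiles; keep groups of two or more."""
    groups = defaultdict(list)
    for cluster, profile in profiles.items():
        groups[profile].append(cluster)
    return [sorted(members) for members in groups.values() if len(members) > 1]


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of cluster IDs (0.0 if both empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)
```

    With toy input {"g1": {"c1", "c2", "c3"}, "g2": {"c1", "c2"}, "g3": {"c3"}}, clusters c1 and c2 share the profile (1, 1, 0) and are grouped together, while g2 and g3 share no clusters and have a distance of 1.0.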

  11. Getting trichy: tools and approaches to interrogating Trichomonas vaginalis in a post-genome world

    PubMed Central

    Conrad, Melissa D.; Bradic, Martina; Warring, Sally D.; Gorman, Andrew W.; Carlton, Jane M.

    2012-01-01

    Trichomonas vaginalis is a parasite of the urogenital tract in men and women, with a worldwide presence and significant implications for global public health. T. vaginalis research entered the age of genomics with the publication of the first genome sequence in 2007, yet subsequent utilization of other ‘omics’ technologies and methods has been slow. Here, we review some of the tools and approaches available to interrogate T. vaginalis biology, with an emphasis on recent advances and current limitations, and draw attention to areas where further efforts are needed to effectively examine the complex and intriguing biology of the parasite. PMID:23219217

  12. Transcription activator-like effector nucleases (TALENs): a highly efficient and versatile tool for genome editing.

    PubMed

    Sun, Ning; Zhao, Huimin

    2013-07-01

    Transcription activator-like effector (TALE) nucleases (TALENs) have recently emerged as a revolutionary genome editing tool in many different organisms and cell types. The site-specific chromosomal double-strand breaks introduced by TALENs significantly increase the efficiency of genomic modification. The modular nature of the TALE central repeat domains enables researchers to tailor DNA recognition specificity with ease and target essentially any desired DNA sequence. Here, we comprehensively review the development of TALEN technology in terms of scaffold optimization, DNA recognition, and repeat array assembly. In addition, we provide some perspectives on the future development of this technology.

  13. Cas-analyzer: an online tool for assessing genome editing results using NGS data.

    PubMed

    Park, Jeongbin; Lim, Kayeong; Kim, Jin-Soo; Bae, Sangsu

    2017-01-15

    Genome editing with programmable nucleases has been widely adopted in research and medicine. Next generation sequencing (NGS) platforms are now widely used for measuring the frequencies of mutations induced by CRISPR-Cas9 and other programmable nucleases. Here, we present an online tool, Cas-Analyzer, a JavaScript-based implementation for NGS data analysis. Because Cas-Analyzer runs entirely in the client-side web browser, on the fly, there is no need to upload very large NGS datasets to a server, a time-consuming step in genome editing analysis. Currently, Cas-Analyzer supports various programmable nucleases, including single nucleases and paired nucleases.

  14. PerPlot & PerScan: tools for analysis of DNA curvature-related periodicity in genomic nucleotide sequences

    PubMed Central

    2011-01-01

    Background Periodic spacing of short adenine or thymine runs phased with DNA helical period of ~10.5 bp is associated with intrinsic DNA curvature and deformability, which play important roles in DNA-protein interactions and in the organization of chromosomes in both eukaryotes and prokaryotes. Local differences in DNA sequence periodicity have been linked to differences in gene expression in some organisms. Despite the significance of these periodic patterns, there are virtually no publicly accessible tools for their analysis. Results We present novel tools suitable for assessments of DNA curvature-related sequence periodicity in nucleotide sequences at the genome scale. Utility of the present software is demonstrated on a comparison of sequence periodicities in the genomes of Haemophilus influenzae, Methanocaldococcus jannaschii, Saccharomyces cerevisiae, and Arabidopsis thaliana. The software can be accessed through a web interface and the programs are also available for download. Conclusions The present software is suitable for comparing DNA curvature-related sequence periodicity among different genomes as well as for analysis of intrachromosomal heterogeneity of the sequence periodicity. It provides a quick and convenient way to detect anomalous regions of chromosomes that could have unusual structural and functional properties and/or distinct evolutionary history. PMID:22587738
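
    The idea of periodically spaced A or T runs can be illustrated with a small sketch that histograms the distances between run starts. This is not the PerPlot or PerScan algorithm, which the abstract does not describe; the function names, the minimum run length of 2, and the synthetic sequence are all assumptions made for illustration.

```python
import re
from collections import Counter

def run_starts(seq, min_len=2):
    """Start positions of maximal runs of A or of T with length >= min_len."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [m.start() for m in pattern.finditer(seq.upper())]

def spacing_histogram(starts, max_lag=30):
    """Count distances between every pair of run starts, up to max_lag."""
    hist = Counter()
    for i, a in enumerate(starts):
        for b in starts[i + 1:]:
            d = b - a
            if d > max_lag:
                break  # starts are sorted, so later distances only grow
            hist[d] += 1
    return hist

# Synthetic sequence (assumption): one "AAA" run every 10 bp.
seq = "AAAGCGCGCG" * 5
hist = spacing_histogram(run_starts(seq))
```

    In a real genome the helical period of ~10.5 bp is not an integer, so a peak would be spread over neighbouring lags such as 10 and 11 rather than falling on a single value.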

  15. FunctionAnnotator, a versatile and efficient web tool for non-model organism annotation.

    PubMed

    Chen, Ting-Wen; Gan, Ruei-Chi; Fang, Yi-Kai; Chien, Kun-Yi; Liao, Wei-Chao; Chen, Chia-Chun; Wu, Timothy H; Chang, Ian Yi-Feng; Yang, Chi; Huang, Po-Jung; Yeh, Yuan-Ming; Chiu, Cheng-Hsun; Huang, Tzu-Wen; Tang, Petrus

    2017-09-05

    Along with the constant improvement in high-throughput sequencing technology, an increasing number of transcriptome sequencing projects are carried out in organisms without decoded genome information and even on environmental biological samples. To study the biological functions of novel transcripts, the very first task is to identify their potential functions. We present a web-based annotation tool, FunctionAnnotator, which offers comprehensive annotations, including GO term assignment, enzyme annotation, domain/motif identification and predictions for subcellular localization. To accelerate the annotation process, we have optimized the computation processes and used parallel computing for all annotation steps. Moreover, FunctionAnnotator is designed to be versatile, and it generates a variety of useful outputs for facilitating other analyses. Here, we demonstrate how FunctionAnnotator can be helpful in annotating non-model organisms. We further illustrate that FunctionAnnotator can estimate the taxonomic composition of environmental samples and assist in the identification of novel proteins by combining RNA-Seq data with proteomics technology. In summary, FunctionAnnotator can efficiently annotate transcriptomes and greatly benefits studies focusing on non-model organisms or metatranscriptomes. FunctionAnnotator, a comprehensive annotation web-service tool, is freely available online at: http://fa.cgu.edu.tw/ . This new web-based annotator will shed light on field studies involving organisms without a reference genome.

  16. Novel Genome-Editing Tools to Model and Correct Primary Immunodeficiencies

    PubMed Central

    Ott de Bruin, Lisa M.; Volpi, Stefano; Musunuru, Kiran

    2015-01-01

    Severe combined immunodeficiency (SCID) and other severe non-SCID primary immunodeficiencies (non-SCID PID) can be treated by allogeneic hematopoietic stem cell (HSC) transplantation, but when histocompatibility leukocyte antigen-matched donors are lacking, this can be a high-risk procedure. Correcting the patient’s own HSCs with gene therapy offers an attractive alternative. Gene therapies currently being used in clinical settings insert a functional copy of the entire gene by means of a viral vector. With this treatment, severe complications may result due to integration within oncogenes. A promising alternative is the use of endonucleases such as ZFNs, TALENs, and CRISPR/Cas9 to introduce a double-stranded break in the DNA and thus induce homology-directed repair. With these genome-editing tools a correct copy can be inserted in a precisely targeted “safe harbor.” They can also be used to correct pathogenic mutations in situ and to develop cellular or animal models needed to study the pathogenic effects of specific genetic defects found in immunodeficient patients. This review discusses the advantages and disadvantages of these endonucleases in gene correction and modeling with an emphasis on CRISPR/Cas9, which offers the most promise due to its efficacy and versatility. PMID:26052330

  17. From the genome to the phenome: tools to understand the basic biology of Plasmodium falciparum.

    PubMed

    Webster, Wesley A J; McFadden, Geoffrey I

    2014-01-01

    Malaria plagues one out of every 30 humans and contributes to almost a million deaths, and the problem could worsen. Our current therapeutic options are compromised by emerging resistance by the parasite to our front line drugs. It is thus imperative to better understand the basic biology of the parasite and develop novel drugs to stem this disease. The most facile approach to analyse a gene's function is to remove it from the genome or inhibit its activity. Although genetic manipulation of the human malaria parasite Plasmodium falciparum is a relatively standard procedure, there is no optimal method to perturb genes essential to the intraerythrocytic development cycle--the part of the life cycle that produces the clinical manifestation of malaria. This is a severe impediment to progress because the phenotype we wish to study is exactly the one that is so elusive. In the absence of any utilitarian way to conditionally delete essential genes, we are prevented from investigating the parasite's most vulnerable points. This review aims to focus on the development of tools identifying essential genes of P. falciparum and our ability to elicit phenotypic mutation. © 2014 The Author(s) Journal of Eukaryotic Microbiology © 2014 International Society of Protistologists.

  18. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

    PubMed Central

    Huang, Da Wei; Sherman, Brad T.; Lempicki, Richard A.

    2009-01-01

    Functional analysis of large gene lists, derived in most cases from emerging high-throughput genomic, proteomic and bioinformatics scanning approaches, is still a challenging and daunting task. The gene-annotation enrichment analysis is a promising high-throughput strategy that increases the likelihood for investigators to identify biological processes most pertinent to their study. Approximately 68 bioinformatics enrichment tools that are currently available in the community are collected in this survey. Tools are uniquely categorized into three major classes, according to their underlying enrichment algorithms. The comprehensive collections, unique tool classifications and associated questions/issues will provide a more comprehensive and up-to-date view regarding the advantages, pitfalls and recent trends in a simpler tool-class level rather than by a tool-by-tool approach. Thus, the survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests. PMID:19033363
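
    As a concrete illustration of gene-annotation enrichment, the sketch below computes a one-sided hypergeometric p-value for over-representation of an annotation term in a gene list. The abstract does not name the algorithms behind its three tool classes, so this test is offered only as one widely used example; the function name and parameter names are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background set
    K: background genes annotated with the term
    n: genes in the submitted list
    k: listed genes annotated with the term
    """
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Toy numbers (assumptions): 10 background genes, 5 annotated, list of 5, all 5 annotated.
p_value = hypergeom_enrichment_p(N=10, K=5, n=5, k=5)
```

    In practice a tool would repeat this test for every annotation term and then correct for multiple testing; that step is omitted here.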

  19. Mutant power: using mutant allele collections for yeast functional genomics

    PubMed Central

    Norman, Kaitlyn L.

    2016-01-01

    The budding yeast has long served as a model eukaryote for the functional genomic analysis of highly conserved signaling pathways, cellular processes and mechanisms underlying human disease. The collection of reagents available for genomics in yeast is extensive, encompassing a growing diversity of mutant collections beyond gene deletion sets in the standard wild-type S288C genetic background. We review here three main types of mutant allele collections: transposon mutagen collections, essential gene collections and overexpression libraries. Each collection provides unique and identifiable alleles that can be utilized in genome-wide, high-throughput studies. These genomic reagents are particularly informative in identifying synthetic phenotypes and functions associated with essential genes, including those modeled most effectively in complex genetic backgrounds. Several examples of genomic studies in filamentous/pseudohyphal backgrounds are provided here to illustrate this point. Additionally, the limitations of each approach are examined. Collectively, these mutant allele collections in Saccharomyces cerevisiae and the related pathogenic yeast Candida albicans promise insights toward an advanced understanding of eukaryotic molecular and cellular biology. PMID:26453908

  20. Mutant power: using mutant allele collections for yeast functional genomics.

    PubMed

    Norman, Kaitlyn L; Kumar, Anuj

    2016-03-01

    The budding yeast has long served as a model eukaryote for the functional genomic analysis of highly conserved signaling pathways, cellular processes and mechanisms underlying human disease. The collection of reagents available for genomics in yeast is extensive, encompassing a growing diversity of mutant collections beyond gene deletion sets in the standard wild-type S288C genetic background. We review here three main types of mutant allele collections: transposon mutagen collections, essential gene collections and overexpression libraries. Each collection provides unique and identifiable alleles that can be utilized in genome-wide, high-throughput studies. These genomic reagents are particularly informative in identifying synthetic phenotypes and functions associated with essential genes, including those modeled most effectively in complex genetic backgrounds. Several examples of genomic studies in filamentous/pseudohyphal backgrounds are provided here to illustrate this point. Additionally, the limitations of each approach are examined. Collectively, these mutant allele collections in Saccharomyces cerevisiae and the related pathogenic yeast Candida albicans promise insights toward an advanced understanding of eukaryotic molecular and cellular biology. © The Author 2015. Published by Oxford University Press. All rights reserved.

  1. Novel approaches in function-driven single-cell genomics

    DOE PAGES

    Doud, Devin F. R.; Woyke, Tanja

    2017-06-07

    Deeper sequencing and improved bioinformatics in conjunction with single-cell and metagenomic approaches continue to illuminate undercharacterized environmental microbial communities. This has propelled the 'who is there, and what might they be doing' paradigm to the uncultivated and has already radically changed the topology of the tree of life and provided key insights into the microbial contribution to biogeochemistry. While characterization of 'who' based on marker genes can describe a large fraction of the community, answering 'what are they doing' remains the elusive pinnacle for microbiology. Function-driven single-cell genomics provides a solution by using a function-based screen to subsample complex microbial communities in a targeted manner for the isolation and genome sequencing of single cells. This enables single-cell sequencing to be focused on cells with specific phenotypic or metabolic characteristics of interest. Recovered genomes are conclusively implicated for both encoding and exhibiting the feature of interest, improving downstream annotation and revealing activity levels within that environment. This emerging approach has already improved our understanding of microbial community functioning and facilitated the experimental analysis of uncharacterized gene product space. Here we provide a comprehensive review of strategies that have been applied for function-driven single-cell genomics and the future directions we envision.

  2. Advancing our understanding of functional genome organisation through studies in the fission yeast.

    PubMed

    Olsson, Ida; Bjerling, Pernilla

    2011-02-01

    Significant progress has been made in understanding the functional organisation of the cell nucleus. Still, many questions remain to be answered about the relationship between the spatial organisation of the nucleus and the regulation of genome function. There are many conflicting data in the field, making it very difficult to merge published results on mammalian cells into one model of subnuclear chromatin organisation. Over the last decades the fission yeast, Schizosaccharomyces pombe, has emerged as a valuable model organism for understanding basic biological mechanisms, especially the cell cycle and chromosome biology. In this review we describe and compare the nuclear organisation in mammalian and fission yeast cells. We believe that fission yeast is a good tool to resolve at least some of the contradictions and unanswered questions concerning functional nuclear architecture, since S. pombe has chromosomes structurally similar to those of humans. S. pombe also has the advantage over higher eukaryotes that the genome can easily be manipulated via homologous recombination, making it possible to integrate the tools needed for visualisation of chromosomes using live-cell microscopy. Classical genetic experiments can be used to elucidate what factors are involved in a certain mechanism. The knowledge we have gained during the last few years indicates similarities between the genome organisation in fission yeast and mammalian cells. We therefore propose the use of fission yeast for further advancement of our understanding of functional nuclear organisation.

  3. Resurrection of DNA function in vivo from an extinct genome.

    PubMed

    Pask, Andrew J; Behringer, Richard R; Renfree, Marilyn B

    2008-05-21

    There is a burgeoning repository of information available from ancient DNA that can be used to understand how genomes have evolved and to determine the genetic features that defined a particular species. To assess the functional consequences of changes to a genome, a variety of methods are needed to examine extinct DNA function. We isolated a transcriptional enhancer element from the genome of an extinct marsupial, the Tasmanian tiger (Thylacinus cynocephalus or thylacine), obtained from 100 year-old ethanol-fixed tissues from museum collections. We then examined the function of the enhancer in vivo. Using a transgenic approach, it was possible to resurrect DNA function in transgenic mice. The results demonstrate that the thylacine Col2A1 enhancer directed chondrocyte-specific expression in this extinct mammalian species in the same way as its orthologue does in mice. While other studies have examined extinct coding DNA function in vitro, this is the first example of the restoration of extinct non-coding DNA and examination of its function in vivo. Our method using transgenesis can be used to explore the function of regulatory and protein-coding sequences obtained from any extinct species in an in vivo model system, providing important insights into gene evolution and diversity.

  4. Resurrection of DNA Function In Vivo from an Extinct Genome

    PubMed Central

    Pask, Andrew J.; Behringer, Richard R.; Renfree, Marilyn B.

    2008-01-01

    There is a burgeoning repository of information available from ancient DNA that can be used to understand how genomes have evolved and to determine the genetic features that defined a particular species. To assess the functional consequences of changes to a genome, a variety of methods are needed to examine extinct DNA function. We isolated a transcriptional enhancer element from the genome of an extinct marsupial, the Tasmanian tiger (Thylacinus cynocephalus or thylacine), obtained from 100 year-old ethanol-fixed tissues from museum collections. We then examined the function of the enhancer in vivo. Using a transgenic approach, it was possible to resurrect DNA function in transgenic mice. The results demonstrate that the thylacine Col2A1 enhancer directed chondrocyte-specific expression in this extinct mammalian species in the same way as its orthologue does in mice. While other studies have examined extinct coding DNA function in vitro, this is the first example of the restoration of extinct non-coding DNA and examination of its function in vivo. Our method using transgenesis can be used to explore the function of regulatory and protein-coding sequences obtained from any extinct species in an in vivo model system, providing important insights into gene evolution and diversity. PMID:18493600

  5. MELOGEN: an EST database for melon functional genomics

    PubMed Central

    Gonzalez-Ibeas, Daniel; Blanca, José; Roig, Cristina; González-To, Mireia; Picó, Belén; Truniger, Verónica; Gómez, Pedro; Deleu, Wim; Caño-Delgado, Ana; Arús, Pere; Nuez, Fernando; Garcia-Mas, Jordi; Puigdomènech, Pere; Aranda, Miguel A

    2007-01-01

    Background Melon (Cucumis melo L.) is one of the most important fleshy fruits for fresh consumption. Despite this, few genomic resources exist for this species. To facilitate the discovery of genes involved in essential traits, such as fruit development, fruit maturation and disease resistance, and to speed up the process of breeding new and better adapted melon varieties, we have produced a large collection of expressed sequence tags (ESTs) from eight normalized cDNA libraries from different tissues in different physiological conditions. Results We determined over 30,000 ESTs that were clustered into 16,637 non-redundant sequences or unigenes, comprising 6,023 tentative consensus sequences (contigs) and 10,614 unclustered sequences (singletons). Many potential molecular markers were identified in the melon dataset: 1,052 potential simple sequence repeats (SSRs) and 356 single nucleotide polymorphisms (SNPs) were found. Sixty-nine percent of the melon unigenes showed a significant similarity with proteins in databases. Functional classification of the unigenes was carried out following the Gene Ontology scheme. In total, 9,402 unigenes were mapped to one or more ontology. Remarkably, the distributions of melon and Arabidopsis unigenes followed similar tendencies, suggesting that the melon dataset is representative of the whole melon transcriptome. Bioinformatic analyses primarily focused on potential precursors of melon micro RNAs (miRNAs) in the melon dataset, but many other genes potentially controlling disease resistance and fruit quality traits were also identified. Patterns of transcript accumulation were characterised by Real-Time-qPCR for 20 of these genes. Conclusion The collection of ESTs characterised here represents a substantial increase on the genetic information available for melon. A database (MELOGEN) which contains all EST sequences, contig images and several tools for analysis and data mining has been created. This set of sequences constitutes

  6. Functional and evolutionary insights from the genomes of three parasitoid Nasonia species.

    PubMed

    Werren, John H; Richards, Stephen; Desjardins, Christopher A; Niehuis, Oliver; Gadau, Jürgen; Colbourne, John K; Beukeboom, Leo W; Desplan, Claude; Elsik, Christine G; Grimmelikhuijzen, Cornelis J P; Kitts, Paul; Lynch, Jeremy A; Murphy, Terence; Oliveira, Deodoro C S G; Smith, Christopher D; van de Zande, Louis; Worley, Kim C; Zdobnov, Evgeny M; Aerts, Maarten; Albert, Stefan; Anaya, Victor H; Anzola, Juan M; Barchuk, Angel R; Behura, Susanta K; Bera, Agata N; Berenbaum, May R; Bertossa, Rinaldo C; Bitondi, Márcia M G; Bordenstein, Seth R; Bork, Peer; Bornberg-Bauer, Erich; Brunain, Marleen; Cazzamali, Giuseppe; Chaboub, Lesley; Chacko, Joseph; Chavez, Dean; Childers, Christopher P; Choi, Jeong-Hyeon; Clark, Michael E; Claudianos, Charles; Clinton, Rochelle A; Cree, Andrew G; Cristino, Alexandre S; Dang, Phat M; Darby, Alistair C; de Graaf, Dirk C; Devreese, Bart; Dinh, Huyen H; Edwards, Rachel; Elango, Navin; Elhaik, Eran; Ermolaeva, Olga; Evans, Jay D; Foret, Sylvain; Fowler, Gerald R; Gerlach, Daniel; Gibson, Joshua D; Gilbert, Donald G; Graur, Dan; Gründer, Stefan; Hagen, Darren E; Han, Yi; Hauser, Frank; Hultmark, Da; Hunter, Henry C; Hurst, Gregory D D; Jhangian, Shalini N; Jiang, Huaiyang; Johnson, Reed M; Jones, Andrew K; Junier, Thomas; Kadowaki, Tatsuhiko; Kamping, Albert; Kapustin, Yuri; Kechavarzi, Bobak; Kim, Jaebum; Kim, Jay; Kiryutin, Boris; Koevoets, Tosca; Kovar, Christie L; Kriventseva, Evgenia V; Kucharski, Robert; Lee, Heewook; Lee, Sandra L; Lees, Kristin; Lewis, Lora R; Loehlin, David W; Logsdon, John M; Lopez, Jacqueline A; Lozado, Ryan J; Maglott, Donna; Maleszka, Ryszard; Mayampurath, Anoop; Mazur, Danielle J; McClure, Marcella A; Moore, Andrew D; Morgan, Margaret B; Muller, Jean; Munoz-Torres, Monica C; Muzny, Donna M; Nazareth, Lynne V; Neupert, Susanne; Nguyen, Ngoc B; Nunes, Francis M F; Oakeshott, John G; Okwuonu, Geoffrey O; Pannebakker, Bart A; Pejaver, Vikas R; Peng, Zuogang; Pratt, Stephen C; Predel, Reinhard; Pu, Ling-Ling; Ranson, Hilary; Raychoudhury, Rhitoban; Rechtsteiner, Andreas; Reese, Justin T; Reid, Jeffrey G; Riddle, Megan; Robertson, Hugh M; Romero-Severson, Jeanne; Rosenberg, Miriam; Sackton, Timothy B; Sattelle, David B; Schlüns, Helge; Schmitt, Thomas; Schneider, Martina; Schüler, Andreas; Schurko, Andrew M; Shuker, David M; Simões, Zilá L P; Sinha, Saurabh; Smith, Zachary; Solovyev, Victor; Souvorov, Alexandre; Springauf, Andreas; Stafflinger, Elisabeth; Stage, Deborah E; Stanke, Mario; Tanaka, Yoshiaki; Telschow, Arndt; Trent, Carol; Vattathil, Selina; Verhulst, Eveline C; Viljakainen, Lumi; Wanner, Kevin W; Waterhouse, Robert M; Whitfield, James B; Wilkes, Timothy E; Williamson, Michael; Willis, Judith H; Wolschin, Florian; Wyder, Stefan; Yamada, Takuji; Yi, Soojin V; Zecher, Courtney N; Zhang, Lan; Gibbs, Richard A

    2010-01-15

    We report here genome sequences and comparative analyses of three closely related parasitoid wasps: Nasonia vitripennis, N. giraulti, and N. longicornis. Parasitoids are important regulators of arthropod populations, including major agricultural pests and disease vectors, and Nasonia is an emerging genetic model, particularly for evolutionary and developmental genetics. Key findings include the identification of a functional DNA methylation tool kit; hymenopteran-specific genes including diverse venoms; lateral gene transfers among Pox viruses, Wolbachia, and Nasonia; and the rapid evolution of genes involved in nuclear-mitochondrial interactions that are implicated in speciation. Newly developed genome resources advance Nasonia for genetic research, accelerate mapping and cloning of quantitative trait loci, and will ultimately provide tools and knowledge for further increasing the utility of parasitoids as pest insect-control agents.

  7. Functional and Evolutionary Insights from the Genomes of Three Parasitoid Nasonia Species

    PubMed Central

    2010-01-01

    We report here genome sequences and comparative analyses of three closely related parasitoid wasps: Nasonia vitripennis, N. giraulti, and N. longicornis. Parasitoids are important regulators of arthropod populations, including major agricultural pests and disease vectors, and Nasonia is an emerging genetic model, particularly for evolutionary and developmental genetics. Key findings include the identification of a functional DNA methylation tool kit; hymenopteran-specific genes including diverse venoms; lateral gene transfers among Pox viruses, Wolbachia, and Nasonia; and the rapid evolution of genes involved in nuclear-mitochondrial interactions that are implicated in speciation. Newly developed genome resources advance Nasonia for genetic research, accelerate mapping and cloning of quantitative trait loci, and will ultimately provide tools and knowledge for further increasing the utility of parasitoids as pest insect-control agents. PMID:20075255

  8. Strategies to explore functional genomics data sets in NCBI's GEO database.

    PubMed

    Wilhite, Stephen E; Barrett, Tanya

    2012-01-01

    The Gene Expression Omnibus (GEO) database is a major repository that stores high-throughput functional genomics data sets that are generated using both microarray-based and sequence-based technologies. Data sets are submitted to GEO primarily by researchers who are publishing their results in journals that require original data to be made freely available for review and analysis. In addition to serving as a public archive for these data, GEO has a suite of tools that allow users to identify, analyze, and visualize data relevant to their specific interests. These tools include sample comparison applications, gene expression profile charts, data set clusters, genome browser tracks, and a powerful search engine that enables users to construct complex queries.

  9. High-resolution interrogation of functional elements in the noncoding genome.

    PubMed

    Sanjana, Neville E; Wright, Jason; Zheng, Kaijie; Shalem, Ophir; Fontanillas, Pierre; Joung, Julia; Cheng, Christine; Regev, Aviv; Zhang, Feng

    2016-09-30

    The noncoding genome affects gene regulation and disease, yet we lack tools for rapid identification and manipulation of noncoding elements. We developed a CRISPR screen using ~18,000 single guide RNAs targeting >700 kilobases surrounding the genes NF1, NF2, and CUL3, which are involved in BRAF inhibitor resistance in melanoma. We find that noncoding locations that modulate drug resistance also harbor predictive hallmarks of noncoding function. With a subset of regions at the CUL3 locus, we demonstrate that engineered mutations alter transcription factor occupancy and long-range and local epigenetic environments, implicating these sites in gene regulation and chemotherapeutic resistance. Through our expansion of the potential of pooled CRISPR screens, we provide tools for genomic discovery and for elucidating biologically relevant mechanisms of gene regulation. Copyright © 2016, American Association for the Advancement of Science.

  10. Functional annotation from the genome sequence of the giant panda.

    PubMed

    Huo, Tong; Zhang, Yinjie; Lin, Jianping

    2012-08-01

    The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided into two groups: 20,179 proteins whose functions can be predicted by GeneScan formed the known-function group, whereas 822 proteins whose functions cannot be predicted by GeneScan comprised the unknown-function group. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins under the assumption that proteins related to the panda-specific traits in the unknown-function group exist. After this filtering procedure, we identified 32 proteins (2 of which are membrane proteins) specific to the giant panda genome as compared against the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting protein prediction using the Database of Interacting Proteins. Nineteen proteins were predicted to be zinc-binding proteins, thus affecting the activities of nucleic acids. The 32 panda-specific proteins will be further investigated by structural and functional analysis.

  11. The rules of tool incorporation: Tool morpho-functional & sensori-motor constraints.

    PubMed

    Cardinali, L; Brozzoli, C; Finos, L; Roy, A C; Farnè, A

    2016-04-01

    Previous studies showed that using tools modifies the agent's body and space representation. However, it is still not clear which rules govern those remapping processes. Here, we studied the differential role played by the morpho-functional characteristics of a tool and the sensori-motor constraints that a tool imposes on the hand. To do so, we asked a group of participants to reach and grasp an object using, in different conditions, two different tools: Pliers, to be acted upon by the index and thumb fingertips, and Sticks, taped to the same two digits. The two tools were equivalent in terms of morpho-functional characteristics, providing index finger and thumb with the same amount of elongation. Crucially, however, they imposed different sensori-motor constraints on the acting fingers. We measured and compared the kinematic profile of free-hand movements performed before and after the use of both devices. As predicted on the basis of their equivalent morpho-functional characteristics, both tools induced similar changes in the kinematics of the fingers (but not the arm), compatible with the hand being represented as bigger. Furthermore, the different sensori-motor constraints imposed by Pliers and Sticks over the hand induced differential updates of the hand representation. In particular, the Sticks selectively affected the kinematics of the two fingers they were taped on, whereas Pliers had a more global effect, affecting the kinematics of hand movements not performed during the use of the tool. These results suggest that tool-use induces a rapid update of the hand representation in the brain, not only on the basis of the morpho-functional characteristics of the tool, but also depending on the specific sensori-motor constraints imposed by the tool.

  12. The hidden perils of read mapping as a quality assessment tool in genome sequencing

    PubMed Central

    Lehri, B.; Seddon, A. M.; Karlyshev, A. V.

    2017-01-01

    This article provides a comparative analysis of the various methods of genome sequencing focusing on verification of the assembly quality. The results of a comparative assessment of various de novo assembly tools, as well as sequencing technologies, are presented using a recently completed sequence of the genome of Lactobacillus fermentum 3872. In particular, quality of assemblies is assessed by using CLC Genomics Workbench read mapping and Optical mapping developed by OpGen. Over-extension of contigs without prior knowledge of contig location can lead to misassembled contigs, even when commonly used quality indicators such as read mapping suggest that a contig is well assembled. Precautions must also be taken when using long read sequencing technology, which may also lead to misassembled contigs. PMID:28225089

  13. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities

    DOE PAGES

    Kang, Dongwan D.; Froula, Jeff; Egan, Rob; ...

    2015-01-01

    Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Because of the complex nature of these communities, existing metagenome binning methods often miss a large number of microbial species. In addition, most of the tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. Lastly, it automatically forms hundreds of high quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node. MetaBAT is open source software and available at https://bitbucket.org/berkeleylab/metabat.

  14. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities

    SciTech Connect

    Kang, Dongwan D.; Froula, Jeff; Egan, Rob; Wang, Zhong

    2015-01-01

    Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Because of the complex nature of these communities, existing metagenome binning methods often miss a large number of microbial species. In addition, most of the tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. Lastly, it automatically forms hundreds of high quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node. MetaBAT is open source software and available at https://bitbucket.org/berkeleylab/metabat.
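
    The tetranucleotide-frequency signal mentioned in the abstract can be illustrated with a small sketch. This is not MetaBAT's implementation: the function names are hypothetical, and plain Euclidean distance stands in for the empirical probabilistic distance that the abstract describes but does not specify.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides over the DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return the relative frequency of each tetranucleotide in seq.

    Windows containing characters other than A, C, G, T (for example N)
    are ignored. Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in TETRAMERS)
    if total == 0:
        return {k: 0.0 for k in TETRAMERS}
    return {k: counts[k] / total for k in TETRAMERS}


def euclidean_distance(f1, f2):
    """Plain Euclidean distance between two frequency profiles.

    Assumption: used here only as a simple stand-in; MetaBAT itself
    uses an empirical probabilistic distance.
    """
    return sum((f1[k] - f2[k]) ** 2 for k in TETRAMERS) ** 0.5


if __name__ == "__main__":
    a = tetranucleotide_frequencies("ACGTACGTACGTTTGA")
    b = tetranucleotide_frequencies("GGGGCCCCGGGGCCCC")
    print(euclidean_distance(a, b))
```

    Contigs from the same genome tend to have similar profiles, so a small distance is weak evidence that two contigs belong in the same bin; a real binner would combine this with abundance information, as the abstract notes.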

  15. Nuclease-mediated genome editing: At the front-line of functional genomics technology.

    PubMed

    Sakuma, Tetsushi; Woltjen, Knut

    2014-01-01

    Genome editing with engineered endonucleases is rapidly becoming a staple method in developmental biology studies. Engineered nucleases permit random or designed genomic modification at precise loci through the stimulation of endogenous double-strand break repair. Homology-directed repair following targeted DNA damage is mediated by co-introduction of a custom repair template, allowing the derivation of knock-out and knock-in alleles in animal models previously refractory to classic gene targeting procedures. Currently there are three main types of customizable site-specific nucleases delineated by the source mechanism of DNA binding that guides nuclease activity to a genomic target: zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeats (CRISPR). Among these genome engineering tools, characteristics such as the ease of design and construction, mechanism of inducing DNA damage, and DNA sequence specificity all differ, making their application complementary. By understanding the advantages and disadvantages of each method, one may make the best choice for their particular purpose.

  16. Knocking down the obstacles to functional genomics data sharing.

    PubMed

    Simpson, Kaylene J; Smith, Jennifer A

    2017-03-01

    This week, Scientific Data published a collection of eight papers that describe datasets from high-throughput functional genomics screens, primarily utilizing RNA interference (RNAi). The publications explore host-pathogen dependencies, innate immune response, disease pathways, and cell morphology and motility at the genome-level. All data, including raw images from the high content screens, are publicly available in PubChem BioAssay, figshare, Harvard Dataverse or the Image Data Resource (IDR). Detailed data descriptors enable use of these data for analysis algorithm design, machine learning, data comparisons, as well as generating new scientific hypotheses.

  17. Structure and Functional Studies on Dengue-2 Virus Genome

    DTIC Science & Technology

    1986-03-01

    Structure and Functional Studies on Dengue-2 Virus Genome. Annual Report. Radha Krishnan Padmanabhan, Ph.D. March 1, 1986. ...analysis of these clones totalling 14,586 nucleotides: deduced amino acid sequences of dengue vir...

  18. Knocking down the obstacles to functional genomics data sharing

    PubMed Central

    Simpson, Kaylene J.; Smith, Jennifer A.

    2017-01-01

    This week, Scientific Data published a collection of eight papers that describe datasets from high-throughput functional genomics screens, primarily utilizing RNA interference (RNAi). The publications explore host-pathogen dependencies, innate immune response, disease pathways, and cell morphology and motility at the genome-level. All data, including raw images from the high content screens, are publicly available in PubChem BioAssay, figshare, Harvard Dataverse or the Image Data Resource (IDR). Detailed data descriptors enable use of these data for analysis algorithm design, machine learning, data comparisons, as well as generating new scientific hypotheses. PMID:28248922

  19. Immersive virtual environment technology: a promising tool for future social and behavioral genomics research and practice.

    PubMed

    Persky, Susan; McBride, Colleen M

    2009-12-01

    Social and behavioral research needs to get started now if scientists are to direct genomic discoveries to address pressing public health problems. Advancing social and behavioral science will require innovative and rigorous communication methodologies that move researchers beyond reliance on traditional tools and their inherent limitations. One such emerging research tool is immersive virtual environment technology (virtual reality), a methodology that gives researchers the ability to maintain high experimental control and mundane realism of scenarios; portray and manipulate complex, abstract objects and concepts; and implement innovative implicit behavioral measurement. This report suggests the role that immersive virtual environment technology can play in furthering future research in genomics-related education, decision making, test intentions, behavior change, and health-care provider behaviors. Practical implementation and challenges are also discussed.

  20. Bioinformatics tools for genome mining of polyketide and non-ribosomal peptides.

    PubMed

    Boddy, Christopher N

    2014-02-01

    Microbial natural products have played a key role in the development of clinical agents in nearly all therapeutic areas. Recent advances in genome sequencing have revealed that there is an incredible wealth of new polyketide and non-ribosomal peptide natural product diversity to be mined from genetic data. The diversity and complexity of polyketide and non-ribosomal peptide biosynthesis has required the development of unique bioinformatics tools to identify, annotate, and predict the structures of these natural products from their biosynthetic gene clusters. This review highlights and evaluates web-based bioinformatics tools currently available to the natural product community for genome mining to discover new polyketides and non-ribosomal peptides.

  1. Cpf1 Is A Versatile Tool for CRISPR Genome Editing Across Diverse Species of Cyanobacteria

    PubMed Central

    Ungerer, Justin; Pakrasi, Himadri B.

    2016-01-01

    Cyanobacteria are the ideal organisms for the production of a wide range of bioproducts as they can convert CO2 directly into the desired end product using solar energy. Unfortunately, the engineering of cyanobacteria to create efficient cell factories has been impaired by the cumbersome genetic tools that are currently available for these organisms; especially when trying to accumulate multiple modifications. We sought to construct an efficient and precise tool for generating numerous markerless modifications in cyanobacteria using CRISPR technology and the alternative nuclease, Cpf1. In this study we demonstrate rapid engineering of markerless knock-ins, knock-outs and point mutations in each of three model cyanobacteria; Synechococcus, Synechocystis and Anabaena. The markerless nature of cpf1 genome editing will allow for complex genome modification that was not possible with previously existing technology while facilitating the development of cyanobacteria as highly modified biofactories. PMID:28000776

  2. Cpf1 Is A Versatile Tool for CRISPR Genome Editing Across Diverse Species of Cyanobacteria.

    PubMed

    Ungerer, Justin; Pakrasi, Himadri B

    2016-12-21

    Cyanobacteria are the ideal organisms for the production of a wide range of bioproducts as they can convert CO2 directly into the desired end product using solar energy. Unfortunately, the engineering of cyanobacteria to create efficient cell factories has been impaired by the cumbersome genetic tools that are currently available for these organisms; especially when trying to accumulate multiple modifications. We sought to construct an efficient and precise tool for generating numerous markerless modifications in cyanobacteria using CRISPR technology and the alternative nuclease, Cpf1. In this study we demonstrate rapid engineering of markerless knock-ins, knock-outs and point mutations in each of three model cyanobacteria; Synechococcus, Synechocystis and Anabaena. The markerless nature of cpf1 genome editing will allow for complex genome modification that was not possible with previously existing technology while facilitating the development of cyanobacteria as highly modified biofactories.

  3. Immersive Virtual Environment Technology: A Promising Tool for Future Social and Behavioral Genomics Research and Practice

    PubMed Central

    Persky, Susan; McBride, Colleen M.

    2009-01-01

    Social and behavioral research needs to get started now if we are to direct genomic discoveries to address pressing public health problems. Advancing social and behavioral science will require innovative and rigorous communication methodologies that move us beyond reliance on traditional tools and their inherent limitations. One such emerging research tool is immersive virtual environment technology (also known as virtual reality), a methodology that gives researchers the ability to maintain high experimental control and mundane realism of scenarios, portray and manipulate complex, abstract objects and concepts, and implement innovative implicit behavioral measurement. This report suggests the role that immersive virtual environment technology can play in furthering future research in genomics-related education, decision-making, test intentions, behavior change, and healthcare provider behaviors. Practical implementation and challenges are also discussed. PMID:20183376

  4. geneSurv: An interactive web-based tool for survival analysis in genomics research.

    PubMed

    Korkmaz, Selcuk; Goksuluk, Dincer; Zararsiz, Gokmen; Karahan, Sevilay

    2017-09-05

    Survival analysis methods are often used in cancer studies. It has been shown that the combination of clinical data with genomics increases the predictive performance of survival analysis methods. But this leads to a high-dimensional data problem. Fortunately, new methods have been developed in the last decade to overcome this problem. However, there is a strong need for an easily accessible, user-friendly and interactive tool to perform survival analysis in the presence of genomics data. We developed an open-source and freely available web-based tool for survival analysis methods that can deal with high-dimensional data. This tool includes classical methods, such as Kaplan-Meier and Cox proportional hazards regression, and advanced methods, such as penalized Cox regression and Random Survival Forests. It also offers an optimal cutoff determination method based on maximizing several test statistics. The tool has a simple and interactive interface, and it can handle high-dimensional data through feature selection and ensemble methods. To dichotomize gene expressions, geneSurv can identify optimal cutoff points. Users can upload their microarray, RNA-Seq, ChIP-Seq, proteomics, metabolomics or clinical data as an n × p dimensional data matrix, where n refers to samples and p refers to genes. This tool is available free at www.biosoft.hacettepe.edu.tr/geneSurv. All source code is available at https://github.com/selcukorkmaz/geneSurv under the GPL-3 license. Copyright © 2017 Elsevier Ltd. All rights reserved.
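
    As a rough illustration of the classical Kaplan-Meier method named in the abstract, the sketch below computes the product-limit survival estimate from follow-up times and event indicators. It is a minimal stand-alone example, not geneSurv's code; the function name and input layout are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    Hypothetical helper for illustration only.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t leave the risk set.
        n_at_risk -= removed
    return curve


if __name__ == "__main__":
    for t, s in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
        print(t, s)
```

    Dichotomizing a gene's expression at some cutoff and computing one such curve per group is the usual starting point for the cutoff-based comparisons the abstract mentions.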

  5. A Populus EST resource for plant functional genomics

    PubMed Central

    Sterky, Fredrik; Bhalerao, Rupali R.; Unneberg, Per; Segerman, Bo; Nilsson, Peter; Brunner, Amy M.; Charbonnel-Campaa, Laurence; Lindvall, Jenny Jonsson; Tandre, Karolina; Strauss, Steven H.; Sundberg, Björn; Gustafsson, Petter; Uhlén, Mathias; Bhalerao, Rishikesh P.; Nilsson, Ove; Sandberg, Göran; Karlsson, Jan; Lundeberg, Joakim; Jansson, Stefan

    2004-01-01

    Trees present a life form of paramount importance for terrestrial ecosystems and human societies because of their ecological structure and physiological function and provision of energy and industrial materials. The genus Populus is the internationally accepted model for molecular tree biology. We have analyzed 102,019 Populus ESTs that clustered into 11,885 clusters and 12,759 singletons. We also provide >4,000 assembled full clone sequences to serve as a basis for the upcoming annotation of the Populus genome sequence. A public web-based EST database (populusdb) provides digital expression profiles for 18 tissues that comprise the majority of differentiated organs. The coding content of Populus and Arabidopsis genomes shows very high similarity, indicating that differences between these annual and perennial angiosperm life forms result primarily from differences in gene regulation. The high similarity between Populus and Arabidopsis will allow studies of Populus to directly benefit from the detailed functional genomic information generated for Arabidopsis, enabling detailed insights into tree development and adaptation. These data will also be valuable for functional genomic efforts in Arabidopsis. PMID:15353603

  6. Complete genome sequence of cyanobacterium Fischerella sp. NIES-3754, providing thermoresistant optogenetic tools.

    PubMed

    Hirose, Yuu; Fujisawa, Takatomo; Ohtsubo, Yoshiyuki; Katayama, Mitsunori; Misawa, Naomi; Wakazuki, Sachiko; Shimura, Yohei; Nakamura, Yasukazu; Kawachi, Masanobu; Yoshikawa, Hirofumi; Eki, Toshihiko; Kanesaki, Yu

    2016-02-20

    Cyanobacterial phytochrome-class photosensors are recently emerging optogenetic tools. We isolated Fischerella sp. strain NIES-3754 from a hot spring at Suwa-shrine, Suwa, Nagano, Japan. We determined the complete genome sequence of the NIES-3754 strain, which is composed of one chromosome and two putative replicons (total 5,826,863 bp containing no gaps). We identified photosensor genes of 5 phytochromes and 9 cyanobacteriochromes, which will facilitate optogenetics of thermophiles.

  7. Integrative genomics--a basic and essential tool for the development of molecular medicine.

    PubMed

    Ostrowski, Jerzy

    2008-01-01

    Understanding the molecular mechanisms of disease requires the introduction of molecular diagnostics into medical practice. Current medicine employs only elements of molecular diagnostics, and usually on the scale of single genes. Medicine in the post-genomic era will utilize thousands of molecular markers associated with disease that are provided by high-throughput sequencing and functional genomic, proteomic and metabolomic studies. Such a spectrum of techniques will link clinical medicine based on molecularly oriented diagnostics with the prediction and prevention of disease. To achieve this task, large-scale and genome-wide biological and medical data must be combined with biostatistical analyses and bioinformatic modeling of biological systems. The collecting, cataloging and comparison of data from molecular studies and the subsequent development of conclusions create the fundamentals of systems biology. This highly complex analytical process reflects a new scientific paradigm called integrative genomics.

  8. Algal Functional Annotation Tool: a web-based analysis suite to functionally interpret large gene lists using integrated annotation and expression data

    PubMed Central

    2011-01-01

    Background: Progress in genome sequencing is proceeding at an exponential pace, and several new algal genomes are becoming available every year. One of the challenges facing the community is the association of protein sequences encoded in the genomes with biological function. While most genome assembly projects generate annotations for predicted protein sequences, they are usually limited and integrate functional terms from a limited number of databases. Another challenge is the use of annotations to interpret large lists of 'interesting' genes generated by genome-scale datasets. Previously, these gene lists had to be analyzed across several independent biological databases, often on a gene-by-gene basis. In contrast, several annotation databases, such as DAVID, integrate data from multiple functional databases and reveal underlying biological themes of large gene lists. While several such databases have been constructed for animals, none is currently available for the study of algae. Due to renewed interest in algae as potential sources of biofuels and the emergence of multiple algal genome sequences, a significant need has arisen for such a database to process the growing compendiums of algal genomic data. Description: The Algal Functional Annotation Tool is a web-based comprehensive analysis suite integrating annotation data from several pathway, ontology, and protein family databases. The current version provides annotation for the model alga Chlamydomonas reinhardtii, and in the future will include additional genomes. The site allows users to interpret large gene lists by identifying associated functional terms, and their enrichment. Additionally, expression data for several experimental conditions were compiled and analyzed to provide an expression-based enrichment search. A tool to search for functionally-related genes based on gene expression across these conditions is also provided. Other features include dynamic visualization of genes on KEGG pathway maps
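
    Term enrichment of a gene list, as described in the record above, is commonly scored with a one-sided hypergeometric test. The abstract does not say which statistic the Algal Functional Annotation Tool uses, so the sketch below is an assumption for illustration, with a hypothetical function name and made-up example numbers.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population : number of genes in the genome (background)
    annotated  : genes in the background carrying the functional term
    drawn      : size of the user's gene list
    hits       : genes in the list carrying the term
    Assumption: this is a generic enrichment score, not necessarily
    the statistic used by the tool described in the abstract.
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Made-up example: 40 of 200 listed genes carry a term that
    # annotates 500 of 15000 background genes.
    print(hypergeom_enrichment_p(15000, 500, 200, 40))
```

    A small p-value suggests the term is over-represented in the list relative to the background; in practice the p-values for many terms would also need multiple-testing correction.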

  9. Beyond Drosophila: RNAi in vivo and functional genomics in insects.

    PubMed

    Bellés, Xavier

    2010-01-01

    The increasing availability of insect genomes has revealed a large number of genes with unknown functions and the resulting problem of how to discover these functions. The RNA interference (RNAi) technique, which generates loss-of-function phenotypes by depletion of a chosen transcript, can help to overcome this challenge. RNAi can unveil the functions of new genes, lead to the discovery of new functions for old genes, and find the genes for old functions. Moreover, the possibility of studying the functions of homologous genes in different species can allow comparisons of the genetic networks regulating a given function in different insect groups, thereby facilitating an evolutionary insight into developmental processes. RNAi also has drawbacks and obscure points, however, such as those related to differences in species sensitivity. Disentangling these differences is one of the main challenges in the RNAi field.

  10. Genomic islands predict functional adaptation in marine actinobacteria

    SciTech Connect

    Penn, Kevin; Jenkins, Caroline; Nett, Markus; Udwary, Daniel; Gontang, Erin; McGlinchey, Ryan; Foster, Brian; Lapidus, Alla; Podell, Sheila; Allen, Eric; Moore, Bradley; Jensen, Paul

    2009-04-01

    Linking functional traits to bacterial phylogeny remains a fundamental but elusive goal of microbial ecology 1. Without this information, it becomes impossible to resolve meaningful units of diversity and the mechanisms by which bacteria interact with each other and adapt to environmental change. Ecological adaptations among bacterial populations have been linked to genomic islands, strain-specific regions of DNA that house functionally adaptive traits 2. In the case of environmental bacteria, these traits are largely inferred from bioinformatic or gene expression analyses 2, thus leaving few examples in which the functions of island genes have been experimentally characterized. Here we report the complete genome sequences of Salinispora tropica and S. arenicola, the first cultured, obligate marine Actinobacteria 3. These two species inhabit benthic marine environments and dedicate 8-10 percent of their genomes to the biosynthesis of secondary metabolites. Despite a close phylogenetic relationship, 25 of 37 secondary metabolic pathways are species-specific and located within 21 genomic islands, thus providing new evidence linking secondary metabolism to ecological adaptation. Species-specific differences are also observed in CRISPR sequences, suggesting that variations in phage immunity provide fitness advantages that contribute to the cosmopolitan distribution of S. arenicola 4. The two Salinispora genomes have evolved by complex processes that include the duplication and acquisition of secondary metabolite genes, the products of which provide immediate opportunities for molecular diversification and ecological adaptation. Evidence that secondary metabolic pathways are exchanged by Horizontal Gene Transfer (HGT) yet are fixed among globally distributed populations 5 supports a functional role for their products and suggests that pathway acquisition represents a previously unrecognized force driving bacterial diversification

  11. Dissecting diabetes/metabolic disease mechanisms using pluripotent stem cells and genome editing tools

    PubMed Central

    Teo, Adrian Kee Keong; Gupta, Manoj K.; Doria, Alessandro; Kulkarni, Rohit N.

    2015-01-01

    Background: Diabetes and metabolic syndromes are chronic, devastating diseases with increasing prevalence. Human pluripotent stem cells are gaining popularity in their usage for human in vitro disease modeling. With recent rapid advances in genome editing tools, these cells can now be genetically manipulated with relative ease to study how genes and gene variants contribute to diabetes and metabolic syndromes. Scope of review: We highlight the diabetes and metabolic genes and gene variants, which could potentially be studied, using two powerful technologies – human pluripotent stem cells (hPSCs) and genome editing tools – to aid the elucidation of yet elusive mechanisms underlying these complex diseases. Major conclusions: hPSCs and the advancing genome editing tools appear to be a timely and potent combination for probing molecular mechanism(s) underlying diseases such as diabetes and metabolic syndromes. The knowledge gained from these hiPSC-based disease modeling studies can potentially be translated into the clinics by guiding clinicians on the appropriate type of medication to use for each condition based on the mechanism of action of the disease. PMID:26413465

  12. Neocaridina denticulata: A Decapod Crustacean Model for Functional Genomics.

    PubMed

    Mykles, Donald L; Hui, Jerome H L

    2015-11-01

    A decapod crustacean model is needed for understanding the molecular mechanisms underlying physiological processes, such as reproduction, sex determination, molting and growth, immunity, regeneration, and response to stress. Criteria for selection are: life-history traits, adult size, availability and ease of culture, and genomics and genetic manipulation. Three freshwater species are considered: cherry shrimp, Neocaridina denticulata; red swamp crayfish, Procambarus clarkii; and redclaw crayfish, Cherax quadricarinatus. All three are readily available, reproduce year round, and grow rapidly. The crayfish species require more space for culture than does N. denticulata. The transparent cuticle of cherry shrimp provides for direct assessment of reproductive status, stage of molt, and tissue-specific expression of reporter genes, and facilitates screening of mutations affecting phenotype. Moreover, a preliminary genome of N. denticulata is available and efforts toward complete genome sequencing and transcriptome sequencing have been initiated. Neocaridina denticulata possesses the best combination of traits that make it most suitable as a model for functional genomics. The next step is to obtain the complete genome sequence and to develop molecular technologies for the screening of mutants and for manipulating tissue-specific gene expression.

  13. Functional noncoding sequences derived from SINEs in the mammalian genome

    PubMed Central

    Nishihara, Hidenori; Smit, Arian F.A.; Okada, Norihiro

    2006-01-01

    Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the ∼1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality. PMID:16717141

  14. Functional noncoding sequences derived from SINEs in the mammalian genome.

    PubMed

    Nishihara, Hidenori; Smit, Arian F A; Okada, Norihiro

    2006-07-01

    Recent comparative analyses of mammalian sequences have revealed that a large number of nonprotein-coding genomic regions are under strong selective constraint. Here, we report that some of these loci have been derived from a newly defined family of ancient SINEs (short interspersed repetitive elements). This is a surprising result, as SINEs and other transposable elements are commonly thought to be genomic parasites. We named the ancient SINE family AmnSINE1, for Amniota SINE1, because we found it to be present in mammals as well as in birds, and some copies predate the mammalian-bird split 310 million years ago (Mya). AmnSINE1 has a chimeric structure of a 5S rRNA and a tRNA-derived SINE, and is related to five tRNA-derived SINE families that we characterized here in the coelacanth, dogfish shark, hagfish, and amphioxus genomes. All of the newly described SINE families have a common central domain that is also shared by zebrafish SINE3, and we collectively name them the DeuSINE (Deuterostomia SINE) superfamily. Notably, of the approximately 1000 still identifiable copies of AmnSINE1 in the human genome, 105 correspond to loci phylogenetically highly conserved among mammalian orthologs. The conservation is strongest over the central domain. Thus, AmnSINE1 appears to be the best example of a transposable element of which a significant fraction of the copies have acquired genomic functionality.

  15. Recent advances in tomato functional genomics: utilization of VIGS.

    PubMed

    Sahu, Pranav Pankaj; Puranik, Swati; Khan, Moinuddin; Prasad, Manoj

    2012-10-01

    Tomato unquestionably occupies a significant position in world vegetable production owing to its world-wide consumption. With the tomato genome sequencing efforts recently concluded, it becomes more imperative to recognize important functional genes from this treasure of generated information for improving tomato yield. While much progress has been made in conventional tomato breeding, post-transcriptional gene silencing (PTGS) offers an alternative approach for advancement of tomato functional genomics. In particular, virus-induced gene silencing (VIGS) is increasingly being used as a rapid, reliable, and lucrative screening strategy to elucidate gene function. In this review, we focus on the recent advancement made through exploiting the potential of this technique for manipulating different agronomically important traits in tomato by discussing several case studies.

  16. Proteogenomic Tools and Approaches to Explore Protein Coding Landscapes of Eukaryotic Genomes.

    PubMed

    Kumar, Dhirendra; Dash, Debasis

    2016-01-01

    Proteogenomic strategies aim to refine genome-wide annotations of protein coding features by using actual protein level observations. Most of the currently applied proteogenomic approaches include integrative analysis of multiple types of high-throughput omics data, e.g., genomics, transcriptomics, proteomics, etc. Recent efforts towards creating a human proteome map were primarily targeted to experimentally detect at least one protein product for each gene in the genome and extensively utilized proteogenomic approaches. The 14 year long wait to get a draft human proteome map, after completion of similar efforts to sequence the genome, explains the huge complexity and technical hurdles of such efforts. Further, the integrative analysis of large-scale multi-omics datasets inherent to these studies becomes a major bottleneck to their success. However, recent developments of various analysis tools and pipelines dedicated to proteogenomics reduce both the time and complexity of such analysis. Here, we summarize notable approaches, studies, software developments and their potential applications towards eukaryotic genome annotation and clinical proteogenomics.

  17. Functional Genomics Using the Saccharomyces cerevisiae Yeast Deletion Collections.

    PubMed

    Nislow, Corey; Wong, Lai Hong; Lee, Amy Huei-Yi; Giaever, Guri

    2016-09-01

    Constructed by a consortium of 16 laboratories, the Saccharomyces genome-wide deletion collections have, for the past decade, provided a powerful, rapid, and inexpensive approach for functional profiling of the yeast genome. Loss-of-function deletion mutants were systematically created using a polymerase chain reaction (PCR)-based gene deletion strategy to generate a start-to-stop codon replacement of each open reading frame by homologous recombination. Each strain carries two molecular barcodes that serve as unique strain identifiers, enabling their growth to be analyzed in parallel and the fitness contribution of each gene to be quantitatively assessed by hybridization to high-density oligonucleotide arrays or through the use of next-generation sequencing technologies. Functional profiling of the deletion collections, using either strain-by-strain or parallel assays, provides an unbiased approach to systematically survey the yeast genome. The Saccharomyces yeast deletion collections have proved immensely powerful in contributing to the understanding of gene function, including functional relationships between genes and genetic pathways in response to diverse genetic and environmental perturbations. © 2016 Cold Spring Harbor Laboratory Press.
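A minimal sketch of how pooled barcode data of this kind might be summarised, assuming per-strain barcode read counts are already available for a control and a treated pool. The log2-ratio summary, the pseudocount, and the strain names are illustrative assumptions and are not taken from the record above.

```python
import math


def barcode_log2_ratios(control_counts, treatment_counts, pseudocount=1.0):
    """Return per-strain log2(treatment / control) of normalised barcode frequencies.

    Both arguments map a strain identifier to its barcode read count.
    Only strains present in the control pool are scored (an assumption).
    """
    strains = list(control_counts)
    n = len(strains)
    control_total = sum(control_counts[s] for s in strains)
    treatment_total = sum(treatment_counts.get(s, 0) for s in strains)

    ratios = {}
    for strain in strains:
        # The pseudocount keeps the ratio finite when a barcode drops out entirely.
        control_freq = (control_counts[strain] + pseudocount) / (control_total + pseudocount * n)
        treatment_freq = (treatment_counts.get(strain, 0) + pseudocount) / (treatment_total + pseudocount * n)
        ratios[strain] = math.log2(treatment_freq / control_freq)
    return ratios


# Hypothetical counts: strainB is depleted under treatment.
control = {"strainA": 100, "strainB": 100}
treated = {"strainA": 100, "strainB": 25}
ratios = barcode_log2_ratios(control, treated)
```

A negative value marks a strain that became rarer in the treated pool, which is one simple way to read a fitness defect from parallel growth data.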

  18. Approaching Functions: Cabri Tools as Instruments of Semiotic Mediation

    ERIC Educational Resources Information Center

    Falcade, Rossana; Laborde, Colette; Mariotti, Maria Alessandra

    2007-01-01

    Assuming that dynamic features of Dynamic Geometry Software may provide a basic representation of both variation and functional dependency, and taking the Vygotskian perspective of semiotic mediation, a teaching experiment was designed with the aim of introducing students to the idea of function. This paper focuses on the use of the Trace tool and…

  20. A Semi-Automated Functional Test Data Analysis Tool

    SciTech Connect

    Xu, Peng; Haves, Philip; Kim, Moosung

    2005-05-01

    The growing interest in commissioning is creating a demand that will increasingly be met by mechanical contractors and less experienced commissioning agents. They will need tools to help them perform commissioning effectively and efficiently. The widespread availability of standardized procedures, accessible in the field, will allow commissioning to be specified with greater certainty as to what will be delivered, enhancing the acceptance and credibility of commissioning. In response, a functional test data analysis tool is being developed to analyze the data collected during functional tests for air-handling units. The functional test data analysis tool is designed to analyze test data, assess performance of the unit under test and identify the likely causes of the failure. The tool has a convenient user interface to facilitate manual entry of measurements made during a test. A graphical display shows the measured performance versus the expected performance, highlighting significant differences that indicate the unit is not able to pass the test. The tool is described as semiautomated because the measured data need to be entered manually, instead of being passed from the building control system automatically. However, the data analysis and visualization are fully automated. The tool is designed to be used by commissioning providers conducting functional tests as part of either new building commissioning or retro-commissioning, as well as building owners and operators interested in conducting routine tests periodically to check the performance of their HVAC systems.

  1. Comparative genomics and functional analysis of niche-specific adaptation in Pseudomonas putida

    PubMed Central

    Wu, Xiao; Monchy, Sébastien; Taghavi, Safiyh; Zhu, Wei; Ramos, Juan; van der Lelie, Daniel

    2011-01-01

    Pseudomonas putida is a gram-negative rod-shaped gammaproteobacterium that is found throughout various environments. Members of the species P. putida show a diverse spectrum of metabolic activities, which is indicative of their adaptation to various niches, which includes the ability to live in soils and sediments contaminated with high concentrations of heavy metals and organic contaminants. Pseudomonas putida strains are also found as plant growth-promoting rhizospheric and endophytic bacteria. The genome sequences of several P. putida species have become available and provide a unique tool to study the specific niche adaptation of the various P. putida strains. In this review, we compare the genomes of four P. putida strains: the rhizospheric strain KT2440, the endophytic strain W619, the aromatic hydrocarbon-degrading strain F1 and the manganese-oxidizing strain GB-1. Comparative genomics provided a powerful tool to gain new insights into the adaptation of P. putida to specific lifestyles and environmental niches, and clearly demonstrated that horizontal gene transfer played a key role in this adaptation process, as many of the niche-specific functions were found to be encoded on clearly defined genomic islands. PMID:20796030

  2. Comparative genomics and functional analysis of niche-specific adaptation in Pseudomonas putida

    SciTech Connect

    Wu X.; van der Lelie D.; Monchy, S.; Taghavi, S.; Zhu, W.; Ramos, J.

    2011-03-01

    Pseudomonas putida is a gram-negative rod-shaped gammaproteobacterium that is found throughout various environments. Members of the species P. putida show a diverse spectrum of metabolic activities, which is indicative of their adaptation to various niches, which includes the ability to live in soils and sediments contaminated with high concentrations of heavy metals and organic contaminants. Pseudomonas putida strains are also found as plant growth-promoting rhizospheric and endophytic bacteria. The genome sequences of several P. putida species have become available and provide a unique tool to study the specific niche adaptation of the various P. putida strains. In this review, we compare the genomes of four P. putida strains: the rhizospheric strain KT2440, the endophytic strain W619, the aromatic hydrocarbon-degrading strain F1 and the manganese-oxidizing strain GB-1. Comparative genomics provided a powerful tool to gain new insights into the adaptation of P. putida to specific lifestyles and environmental niches, and clearly demonstrated that horizontal gene transfer played a key role in this adaptation process, as many of the niche-specific functions were found to be encoded on clearly defined genomic islands.

  3. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation.

    PubMed

    Aubrey, Wayne; Riley, Michael C; Young, Michael; King, Ross D; Oliver, Stephen G; Clare, Amanda

    2015-01-01

    Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method's primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc. in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from the S. pombe genome, and 85% of the ORFs from the L. lactis genome.

  4. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation

    PubMed Central

    Aubrey, Wayne; Riley, Michael C.; Young, Michael; King, Ross D.; Oliver, Stephen G.; Clare, Amanda

    2015-01-01

    Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method’s primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc. in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from the S. pombe genome, and 85% of the ORFs from the L. lactis genome. PMID:26630677

  5. Infants learn enduring functions of novel tools from action demonstrations

    PubMed Central

    Hernik, Mikołaj; Csibra, Gergely

    2015-01-01

    According to recent theoretical proposals, one function of infant goal attribution is to support early social learning of artifact functions from instrumental actions, and one function of infant sensitivity to communication is to support early acquisition of generic knowledge about enduring, kind-relevant properties of the referents. The present study tested two hypotheses, derived from these proposals, about the conditions that facilitate the acquisition of enduring functions for novel tools in human infancy. Using a violation-of-expectation paradigm, we show that 13.5-month-old infants encode arbitrary end-states of action-sequences in relation to the novel tools employed to bring them about. These mappings are not formed if the same end states of action sequences cannot be interpreted as action goals. Moreover, the tool-goal mappings acquired from infant-directed communicative demonstrations are more resilient to counter-evidence than those acquired from non-infant-directed presentations, and thus show similarities to generic rather than episodic representations. These findings suggest that the acquisition of tool functions in infancy is guided by both teleological action interpretation mechanisms and the expectation that communicative demonstrations reveal enduring dispositional properties of tools. PMID:25462040

  6. Parcellation of left parietal tool representations by functional connectivity

    PubMed Central

    Garcea, Frank E.; Mahon, Bradford Z.

    2014-01-01

    Manipulating a tool according to its function requires the integration of visual, conceptual, and motor information, a process subserved in part by left parietal cortex. How these different types of information are integrated and how their integration is reflected in neural responses in the parietal lobule remains an open question. Here, participants viewed images of tools and animals during functional magnetic resonance imaging (fMRI). K-means clustering over time series data was used to parcellate left parietal cortex into subregions based on functional connectivity to a whole brain network of regions involved in tool processing. One cluster, in the inferior parietal cortex, expressed privileged functional connectivity to the left ventral premotor cortex. A second cluster, in the vicinity of the anterior intraparietal sulcus, expressed privileged functional connectivity with the left medial fusiform gyrus. A third cluster in the superior parietal lobe expressed privileged functional connectivity with dorsal occipital cortex. Control analyses using Monte Carlo style permutation tests demonstrated that the clustering solutions were outside the range of what would be observed based on chance ‘lumpiness’ in random data, or mere anatomical proximity. Finally, hierarchical clustering analyses were used to formally relate the resulting parcellation scheme of left parietal tool representations to previous work that has parcellated the left parietal lobule on purely anatomical grounds. These findings demonstrate significant heterogeneity in the functional organization of manipulable object representations in left parietal cortex, and outline a framework that generates novel predictions about the causes of some forms of upper limb apraxia. PMID:24892224

  7. Recent advances in functional perturbation and genome editing techniques in studying sea urchin development.

    PubMed

    Cui, Miao; Lin, Che-Yi; Su, Yi-Hsien

    2017-06-12

    Studies on the gene regulatory networks (GRNs) of sea urchin embryos have provided a basic understanding of the molecular mechanisms controlling animal development. The causal links in GRNs have been verified experimentally through perturbation of gene functions. Microinjection of antisense morpholino oligonucleotides (MOs) into the egg is the most widely used approach for gene knockdown in sea urchin embryos. The modification of MOs into a membrane-permeable form (vivo-MOs) has allowed gene knockdown at later developmental stages. Recent advances in genome editing tools, such as zinc-finger nucleases, transcription activator-like effector-based nucleases and the clustered regularly interspaced short palindromic repeat/clustered regularly interspaced short palindromic repeat-associated protein 9 (CRISPR/Cas9) system, have provided methods for gene knockout in sea urchins. Here, we review the use of vivo-MOs and genome editing tools in sea urchin studies since the publication of its genome in 2006. Various applications of the CRISPR/Cas9 system and their potential in studying sea urchin development are also discussed. These new tools will provide more sophisticated experimental methods for studying sea urchin development. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  8. EnzymeDetector: an integrated enzyme function prediction tool and database

    PubMed Central

    2011-01-01

    Background: The ability to accurately predict enzymatic functions is an essential prerequisite for the interpretation of cellular functions, and the reconstruction and analysis of metabolic models. Several biological databases exist that provide such information. However, in many cases these databases provide partly different and inconsistent genome annotations. Description: We analysed nine prokaryotic genomes and found about 70% inconsistencies in the enzyme predictions of the main annotation resources. Therefore, we implemented the annotation pipeline EnzymeDetector. This tool automatically compares and evaluates the assigned enzyme functions from the main annotation databases and supplements them with its own function prediction. This is based on a sequence similarity analysis, on manually created organism-specific enzyme information from BRENDA (Braunschweig Enzyme Database), and on sequence pattern searches. Conclusions: EnzymeDetector provides a fast and comprehensive overview of the available enzyme function annotations for a genome of interest. The web interface allows the user to work with customisable weighting schemes and cut-offs for the different prediction methods. These customised quality criteria can easily be applied, and the resulting annotation can be downloaded. The summarised view of all used annotation sources provides up-to-date information. Annotation errors that occur in only one of the databases can be recognised (because of their low relevance score). The results are stored in a database and can be accessed at http://enzymedetector.tu-bs.de. PMID:21943292
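The idea of weighting annotation sources and applying a cut-off can be illustrated with a small sketch. The record above does not describe EnzymeDetector's actual scoring, so the additive scheme below is an assumption, and the source names, weights, and cut-off are hypothetical.

```python
def score_annotations(predictions, weights, cutoff):
    """Combine enzyme predictions from several sources into weighted scores.

    predictions: {source name: set of EC numbers predicted for one gene}
    weights:     {source name: user-chosen weight}; unknown sources count as 0
    cutoff:      minimum summed weight an EC number needs to be kept
    """
    scores = {}
    for source, ec_numbers in predictions.items():
        weight = weights.get(source, 0.0)
        for ec in ec_numbers:
            # Each supporting source adds its weight to the EC number's score.
            scores[ec] = scores.get(ec, 0.0) + weight
    return {ec: score for ec, score in scores.items() if score >= cutoff}


# Hypothetical sources and weights, for illustration only.
predictions = {
    "source_a": {"1.1.1.1", "2.7.1.1"},
    "source_b": {"1.1.1.1"},
    "source_c": {"1.1.1.1"},
}
weights = {"source_a": 1.0, "source_b": 2.0, "source_c": 1.5}
accepted = score_annotations(predictions, weights, cutoff=2.0)
```

Under this scheme an EC number supported by a single low-weight source falls below the cut-off, which mirrors the remark that errors occurring in only one database receive a low relevance score.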

  9. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

    PubMed

    Bolser, Dan; Staines, Daniel M; Pritchard, Emily; Kersey, Paul

    2016-01-01

    Ensembl Plants ( http://plants.ensembl.org ) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided includes genome sequence, gene models, functional annotation, and polymorphic loci. Additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene ( http://www.gramene.org ) and transPLANT projects ( http://www.transplantdb.org ).

  10. A profile-based method for identifying functional divergence of orthologous genes in bacterial genomes.

    PubMed

    Wheeler, Nicole E; Barquist, Lars; Kingsley, Robert A; Gardner, Paul P

    2016-12-01

    Next generation sequencing technologies have provided us with a wealth of information on genetic variation, but predicting the functional significance of this variation is a difficult task. While many comparative genomics studies have focused on gene flux and large scale changes, relatively little attention has been paid to quantifying the effects of single nucleotide polymorphisms and indels on protein function, particularly in bacterial genomics. We present a hidden Markov model based approach we call delta-bitscore (DBS) for identifying orthologous proteins that have diverged at the amino acid sequence level in a way that is likely to impact biological function. We benchmark this approach with several widely used datasets and apply it to a proof-of-concept study of orthologous proteomes in an investigation of host adaptation in Salmonella enterica. We highlight the value of the method in identifying functional divergence of genes, and suggest that this tool may be a better approach than the commonly used dN/dS metric for identifying functionally significant genetic changes occurring in recently diverged organisms. A program implementing DBS for pairwise genome comparisons is freely available at: https://github.com/UCanCompBio/deltaBS. Contact: nicole.wheeler@pg.canterbury.ac.nz or lars.barquist@uni-wuerzburg.de. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  11. CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics

    PubMed Central

    Patel, Ravi K.; Garg, Rohini; Jain, Mukesh

    2015-01-01

    Chickpea is an important grain legume used as a rich source of protein in human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides the comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea are available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html. PMID:26322998

  12. A profile-based method for identifying functional divergence of orthologous genes in bacterial genomes

    PubMed Central

    Wheeler, Nicole E.; Barquist, Lars; Kingsley, Robert A.; Gardner, Paul P.

    2016-01-01

    Motivation: Next generation sequencing technologies have provided us with a wealth of information on genetic variation, but predicting the functional significance of this variation is a difficult task. While many comparative genomics studies have focused on gene flux and large scale changes, relatively little attention has been paid to quantifying the effects of single nucleotide polymorphisms and indels on protein function, particularly in bacterial genomics. Results: We present a hidden Markov model based approach we call delta-bitscore (DBS) for identifying orthologous proteins that have diverged at the amino acid sequence level in a way that is likely to impact biological function. We benchmark this approach with several widely used datasets and apply it to a proof-of-concept study of orthologous proteomes in an investigation of host adaptation in Salmonella enterica. We highlight the value of the method in identifying functional divergence of genes, and suggest that this tool may be a better approach than the commonly used dN/dS metric for identifying functionally significant genetic changes occurring in recently diverged organisms. Availability and Implementation: A program implementing DBS for pairwise genome comparisons is freely available at: https://github.com/UCanCompBio/deltaBS. Contact: nicole.wheeler@pg.canterbury.ac.nz or lars.barquist@uni-wuerzburg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27503221
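As a rough illustration of the delta-bitscore idea, the sketch below assumes each orthologous protein has already been scored against the same profile hidden Markov model and that the resulting bit scores are at hand. The sign convention (reference minus variant), the gene names, and the ranking by absolute value are assumptions made for this example and are not taken from the deltaBS program.

```python
def delta_bitscores(reference_scores, variant_scores):
    """Return {gene: reference bit score - variant bit score} for shared genes.

    Both arguments map a gene identifier to the bit score of its protein
    against one common profile model. Genes missing from either genome are skipped.
    """
    shared = reference_scores.keys() & variant_scores.keys()
    return {gene: reference_scores[gene] - variant_scores[gene] for gene in sorted(shared)}


def most_divergent(dbs, top_n=1):
    """Return the top_n genes ranked by absolute delta-bitscore, largest first."""
    return sorted(dbs, key=lambda gene: abs(dbs[gene]), reverse=True)[:top_n]


# Hypothetical bit scores for two genomes.
reference = {"geneA": 250.0, "geneB": 310.5, "geneC": 120.0}
variant = {"geneA": 250.0, "geneB": 280.5}
dbs = delta_bitscores(reference, variant)
top = most_divergent(dbs)
```

A value near zero suggests the two orthologs fit the profile equally well, while a large positive value flags a variant protein that fits the model markedly worse than its counterpart.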

  13. Use of Genomic Tools to Improve Cattle Health in the Context of Infectious Diseases

    PubMed Central

    Raszek, Mikolaj M.; Guan, Le L.; Plastow, Graham S.

    2016-01-01

    Although infectious diseases impose a heavy economic burden on the cattle industry, the etiology of many disorders that affect livestock is not fully elucidated, and effective countermeasures are often lacking. The main tools available until now have been vaccines, antibiotics and antiparasitic drugs. Although these have been very successful in some cases, the appearance of parasite and microbial resistance to these treatments is a cause of concern. Next-generation sequencing provides important opportunities to tackle problems associated with pathogenic illnesses. This review describes the rapid gains achieved to track disease progression, identify the pathogens involved, and map pathogen interactions with the host. Use of novel genomic tools subsequently aids in treatment development, as well as successful creation of breeding programs aimed toward less susceptible livestock. These may be important tools for mitigating the long term effects of combating infection and helping reduce the reliance on antibiotic treatment. PMID:27014337

  14. Genome-wide functional analysis in Candida albicans.

    PubMed

    Motaung, Thabiso E; Ells, Ruan; Pohl, Carolina H; Albertyn, Jacobus; Tsilo, Toi J

    2017-02-08

    Candida albicans is an important etiological agent of superficial and life-threatening infections in individuals with compromised immune systems. To date, we know of several overlapping genetic networks that govern virulence attributes in this fungal pathogen. Classical use of deletion mutants has led to the discovery of numerous virulence factors over the years, and genome-wide functional analysis has propelled gene discovery at an even faster pace. Indeed, a number of recent studies using large-scale genetic screens followed by genome-wide functional analysis have allowed for the unbiased discovery of many new genes involved in C. albicans biology. Here we share our perspectives on the role of these studies in analyzing fundamental aspects of C. albicans virulence properties.

  15. The ethics of functional genomics: same, same, but different?

    PubMed

    Hoeyer, Klaus; Koch, Lene

    2006-09-01

    Respect for human life--a notion of worth uniting all members of the human race--constitutes a sense of anthropocentrism that has long been the justification for the enrollment of animals in experimentation executed to develop therapies to alleviate human suffering. Currently, however, advances in functional genomics are causing a qualitative transformation of the rationale for medical research performed on animals. The notion of human distinctness is being fundamentally challenged when gene sequences similar to those found in humans are identified in different species. In this Opinion article, we would like to highlight an inherent tension brought about by the current developments in functional genomics: a tension between the scientific and the ethical status of gene sequences. Is it reasonable to argue that they are the same for all practical purposes but different in ethical status?

  16. Relaxation of yeast mitochondrial functions after whole-genome duplication

    PubMed Central

    Jiang, Huifeng; Guan, Wenjun; Pinney, David; Wang, Wen; Gu, Zhenglong

    2008-01-01

    Mitochondria are essential for cellular energy production in most eukaryotic organisms. However, when glucose is abundant, yeast species that underwent whole-genome duplication (WGD) mostly conduct fermentation even under aerobic conditions, and most can survive without a functional mitochondrial genome. In this study, we show that the rate of evolution for the nuclear-encoded mitochondrial genes was greater in post-WGD species than pre-WGD species. Furthermore, codon usage bias was relaxed for these genes in post-WGD yeast species. The codon usage pattern and the distribution of a particular transcription regulatory element suggest that the change to an efficient aerobic fermentation lifestyle in this lineage might have emerged after WGD between the divergence of Kluyveromyces polysporus and Saccharomyces castellii from their common ancestor. This new energy production strategy could have led to the relaxation of mitochondrial function in the relevant yeast species. PMID:18669479

  17. Automated protein function prediction--the genomic challenge.

    PubMed

    Friedberg, Iddo

    2006-09-01

    Overwhelmed with genomic data, biologists are facing the first big post-genomic question--what do all genes do? First, not only is the volume of pure sequence and structure data growing, but its diversity is growing as well, leading to a disproportionate growth in the number of uncharacterized gene products. Consequently, established methods of gene and protein annotation, such as homology-based transfer, are annotating less data and in many cases are amplifying existing erroneous annotation. Second, there is a need for a functional annotation which is standardized and machine readable so that function prediction programs could be incorporated into larger workflows. This is problematic due to the subjective and contextual definition of protein function. Third, there is a need to assess the quality of function predictors. Again, the subjectivity of the term 'function' and the various aspects of biological function make this a challenging effort. This article briefly outlines the history of automated protein function prediction and surveys the latest innovations in all three topics.

  18. The infectious BAC genomic DNA expression library: a high capacity vector system for functional genomics

    PubMed Central

    Lufino, Michele M. P.; Edser, Pauline A. H.; Quail, Michael A.; Rice, Stephen; Adams, David J.; Wade-Martins, Richard

    2016-01-01

    Gene dosage plays a critical role in a range of cellular phenotypes, yet most cellular expression systems use heterologous cDNA-based vectors which express proteins well above physiological levels. In contrast, genomic DNA expression vectors generate physiologically-relevant levels of gene expression by carrying the whole genomic DNA locus of a gene including its regulatory elements. Here we describe the first genomic DNA expression library generated using the high-capacity herpes simplex virus-1 amplicon technology to deliver bacterial artificial chromosomes (BACs) into cells by viral transduction. The infectious BAC (iBAC) library contains 184,320 clones with an average insert size of 134.5 kb. We show in a Chinese hamster ovary (CHO) disease model cell line and mouse embryonic stem (ES) cells that this library can be used for genetic rescue studies in a range of contexts including the physiological restoration of Ldlr deficiency, and viral receptor expression. The iBAC library represents an important new genetic analysis tool openly available to the research community. PMID:27353647

  19. Progress in functional genomics approaches to antifungal drug target discovery.

    PubMed

    De Backer, Marianne D; Van Dijck, Patrick

    2003-10-01

    Antifungal drug discovery is starting to benefit from the enormous advances in the genomics field, which have occurred in the past decade. As traditional drug screening on existing targets is not delivering the long-awaited potent antifungals, efforts to use novel genetics and genomics-based strategies to aid in the discovery of novel drug targets are gaining increased importance. The current paradigm in antifungal drug target discovery focuses on basically two main classes of targets to evaluate: genes essential for viability and virulence or pathogenicity factors. Here we report on recent advances in genetics and genomics-based technologies that will allow us not only to identify and validate novel fungal drug targets, but hopefully in the longer run also to discover potent novel therapeutic agents. Fungal pathogens have typically presented significant obstacles when subjected to genetics, but the creativity of scientists in the anti-infectives field and the cross-talk with scientists in other areas is now yielding exciting new tools and technologies to tackle the problem of finding potent, specific and non-toxic antifungal therapeutics.

  20. Genomic Tools for Customized Recovery and Detection of Foodborne Shiga Toxigenic Escherichia coli.

    PubMed

    Knowles, Michael; Stinson, Sara; Lambert, Dominic; Carrillo, Catherine; Koziol, Adam; Gauthier, Martine; Blais, Burton

    2016-12-01

    Genomic antimicrobial resistance (AMR) prediction tools have the potential to support foodborne illness outbreak investigations through their application in the analysis of bacterial genomes from causative strains. The AMR marker profile of a strain of interest, initially identified in outbreak-associated clinical samples, may serve as the basis for customization of selective enrichment media, facilitating its recovery from samples in a food safety investigation. Different possibilities for AMR analyses include the use of comprehensive AMR gene databases such as the Comprehensive Antibiotic Resistance Database, which can be mined with in-house bioinformatics alignment tools (e.g., Antimicrobial Resistance Marker Identifier), or publicly available tools based on clinically relevant acquired AMR gene databases (e.g., ResFinder). In combination with a previously reported pipeline (SigSeekr) designed to identify specific DNA sequences associated with a particular strain for its rapid identification by PCR, it should be possible to deploy custom recovery and identification tools for the efficient detection of priority pathogens such as Shiga toxigenic Escherichia coli (STEC) outbreak strains within the time frame of an active investigation. Using a laboratory STEC strain as a model, trimethoprim resistance identified by both Antimicrobial Resistance Marker Identifier and ResFinder was used as the basis for its selective recovery against a background of commensal E. coli bacteria in ground beef samples. Enrichment in modified tryptic soy broth containing trimethoprim greatly enhanced the recovery of low numbers of model strain cells inoculated in ground beef samples, as verified by the enumeration of colonies on plating media using a strain-specific PCR method to determine the recovery efficiency for the target strain. 
    We discuss the relative merits of different AMR marker prediction tools for this purpose and describe how such tools can be utilized to good effect in a

  1. A Functional Genomic Analysis of NF1-Associated Learning Disabilities

    DTIC Science & Technology

    2008-02-01

    ...family. For example, Rab1 and Rab2 are downregulated in the NF1 hippocampus, while Rab3A is upregulated; Synaptotagmin 1 is downregulated, while... ...syntaxin binding protein 1: DOWN; 1421990_at, Syt1 (synaptotagmin I): DOWN; 1422589_at, Rab3a (RAB3A, member RAS oncogene family): UP; Vesicle recycling 1422809_at... Award Number: W81XWH-04-1-0261. Title: A Functional Genomic Analysis of NF1-Associated Learning Disabilities

  2. A Functional Genomic Analysis of NF1-Associated Learning Disabilities

    DTIC Science & Technology

    2007-02-01

    For example, Rab1 and Rab2 are downregulated in the NF1 hippocampus, while Rab3A is upregulated; Synaptotagmin 1 is downregulated, while... ...syntaxin binding protein 1: DOWN; 1421990_at, Syt1 (synaptotagmin I): DOWN; 1422589_at, Rab3a (RAB3A, member RAS oncogene family): UP; Vesicle recycling 1422809_at... Award Number: W81XWH-04-1-0261. Title: A Functional Genomic Analysis of NF1

  3. Unlocking Holocentric Chromosomes: New Perspectives from Comparative and Functional Genomics?

    PubMed Central

    Mandrioli, Mauro; Manicardi, Gian Carlo

    2012-01-01

    The presence of chromosomes with diffuse centromeres (holocentric chromosomes) has been reported in several taxa for more than fifty years, but a full understanding of their origin is still lacking. Comparative and functional genomics are now furnishing new data to better understand holocentric chromosome evolution, opening new perspectives for analysing karyotype rearrangements in species with holocentric chromosomes and, in particular, revealing unusual common features such as uniform GC content and gene distribution along chromosomes. PMID:23372420

  4. Patient-derived cell models as preclinical tools for genome-directed targeted therapy

    PubMed Central

    Jang, Jiryeon; Park, Kyunghee; Yi, Jun Ho; Hong, Mineui; Ahn, Taejin; Rath, Oliver; Schueler, Julia; Kim, Seung Tae; Do, In-Gu; Lee, Sujin; Park, Se Hoon; Ji, Yong Ick; Kim, Dukwhan; Park, Joon Oh; Park, Young Suk; Kang, Won Ki; Kim, Kyoung-Mee; Park, Woong-Yang; Lim, Ho Yeong; Lee, Jeeyun

    2015-01-01

    Background: In this study, we established patient-derived tumor cell (PDC) models using tissues collected from patients with metastatic cancer and assessed whether these models could be used as a tool for genome-based cancer treatment. Methods: PDCs were isolated and cultured from malignant effusions including ascites and pleural fluid. Pathological examination, immunohistochemical analysis, and genomic profiling were performed to compare the histological and genomic features of primary tumors and PDCs. An exploratory gene expression profiling assay was performed to further characterize PDCs. Results: From January 2012 to May 2013, 176 samples from patients with metastatic cancer were collected. PDC models were successfully established in 130 (73.6%) samples. The median time from specimen collection to passage 1 (P1) was 3 weeks (range, 0.5–4 weeks), while that from P1 to P2 was 2.5 weeks (range, 0.5–5 weeks). Genomic alterations in sixteen paired samples were highly concordant between each primary tumor and its progeny PDCs, with an average variant allele frequency (VAF) correlation of 0.878. We compared genomic profiles of the primary tumor (P0), P1 cells, P2 cells, and patient-derived xenografts (PDXs) derived from P2 cells and found that three samples (P0, P1, and P2 cells) were highly correlated (0.99–1.00). Moreover, PDXs showed more than 100 variants, with correlations of only 0.6–0.8 with the other samples. Drug responses of PDCs were reflective of the clinical response to targeted agents in selected patient PDC lines. Conclusion(s): Our results provided evidence that our PDC model was a promising model for preclinical experiments and closely resembled the patient tumor genome and clinical response. PMID:26296973
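
    The VAF concordance reported in the record above can be illustrated with a small sketch. The abstract does not say which correlation statistic was used, so the Pearson coefficient below is an assumption, and the function name and input layout (two equal-length lists of VAFs for the same variants in a primary tumor and its PDC line) are hypothetical.

```python
import math
from typing import Sequence


def vaf_correlation(primary: Sequence[float], pdc: Sequence[float]) -> float:
    """Pearson correlation between matched VAF lists (assumed statistic).

    primary[i] and pdc[i] must refer to the same variant.
    """
    if len(primary) != len(pdc) or len(primary) < 2:
        raise ValueError("need two equal-length lists with at least 2 variants")
    mean_p = sum(primary) / len(primary)
    mean_c = sum(pdc) / len(pdc)
    # Covariance and variances, left unnormalised because the factors cancel.
    cov = sum((p - mean_p) * (c - mean_c) for p, c in zip(primary, pdc))
    var_p = sum((p - mean_p) ** 2 for p in primary)
    var_c = sum((c - mean_c) ** 2 for c in pdc)
    if var_p == 0 or var_c == 0:
        raise ValueError("correlation is undefined when all VAFs are identical")
    return cov / math.sqrt(var_p * var_c)
```

    An average over paired samples, as in the abstract, would then be the mean of this value across tumor/PDC pairs.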

  5. Nucleotide diversity analysis highlights functionally important genomic regions

    PubMed Central

    Tatarinova, Tatiana V.; Chekalin, Evgeny; Nikolsky, Yuri; Bruskin, Sergey; Chebotarov, Dmitry; McNally, Kenneth L.; Alexandrov, Nickolai

    2016-01-01

    We analyzed functionality and relative distribution of genetic variants across the complete Oryza sativa genome, using the 40 million single nucleotide polymorphisms (SNPs) dataset from the 3,000 Rice Genomes Project (http://snp-seek.irri.org), the largest and highest density SNP collection for any higher plant. We have shown that the DNA-binding transcription factors (TFs) are the most conserved group of genes, whereas kinases and membrane-localized transporters are the most variable ones. TFs may be conserved because they belong to some of the most connected regulatory hubs that modulate transcription of vast downstream gene networks, whereas signaling kinases and transporters need to adapt rapidly to changing environmental conditions. In general, the observed profound patterns of nucleotide variability reveal functionally important genomic regions. As expected, nucleotide diversity is much higher in intergenic regions than within gene bodies (regions spanning gene models), and protein-coding sequences are more conserved than untranslated gene regions. We have observed a sharp decline in nucleotide diversity that begins at about 250 nucleotides upstream of the transcription start and reaches minimal diversity exactly at the transcription start. We found the transcription termination sites to have remarkably symmetrical patterns of SNP density, implying presence of functional sites near transcription termination. Also, nucleotide diversity was significantly lower near 3′ UTRs, the area rich with regulatory regions. PMID:27774999
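
    As an illustration of the quantity discussed in the record above, the sketch below computes nucleotide diversity for a window of biallelic SNPs. The abstract does not state the exact estimator used, so the standard per-site pairwise-difference form is assumed here, and the function names and inputs are hypothetical.

```python
from typing import Iterable


def site_diversity(alt_count: int, n_chromosomes: int) -> float:
    """Probability that two distinct sampled chromosomes differ at a biallelic site."""
    if n_chromosomes < 2:
        raise ValueError("need at least 2 sampled chromosomes")
    if not 0 <= alt_count <= n_chromosomes:
        raise ValueError("alt_count must lie between 0 and n_chromosomes")
    ref_count = n_chromosomes - alt_count
    return 2.0 * alt_count * ref_count / (n_chromosomes * (n_chromosomes - 1))


def window_diversity(alt_counts: Iterable[int], n_chromosomes: int, window_length: int) -> float:
    """Per-base diversity of a window: summed site diversity over window length.

    Sites without a SNP contribute zero, so only SNP alt counts are passed in.
    """
    if window_length <= 0:
        raise ValueError("window_length must be positive")
    return sum(site_diversity(k, n_chromosomes) for k in alt_counts) / window_length
```

    Sliding such a window along positions relative to the transcription start would give a profile of the kind the abstract describes; the window size is a free choice here.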

  6. Functional Genomics in the Study of Mind-Body Therapies

    PubMed Central

    Niles, Halsey; Mehta, Darshan H.; Corrigan, Alexandra A.; Bhasin, Manoj K.; Denninger, John W.

    2014-01-01

    Background Mind-body therapies (MBTs) are used throughout the world in treatment, disease prevention, and health promotion. However, the mechanisms by which MBTs exert their positive effects are not well understood. Investigations into MBTs using functional genomics have revolutionized the understanding of MBT mechanisms and their effects on human physiology. Methods We searched the literature for the effects of MBTs on functional genomics determinants using MEDLINE, supplemented by a manual search of additional journals and a reference list review. Results We reviewed 15 trials that measured global or targeted transcriptomic, epigenomic, or proteomic changes in peripheral blood. Sample sizes ranged from small pilot studies (n=2) to large trials (n=500). While the reliability of individual genes from trial to trial was often inconsistent, genes related to inflammatory response, particularly those involved in the nuclear factor-kappa B (NF-κB) pathway, were consistently downregulated across most studies. Conclusion In general, existing trials focusing on gene expression changes brought about by MBTs have revealed intriguing connections to the immune system through the NF-κB cascade, to telomere maintenance, and to apoptotic regulation. However, these findings are limited to a small number of trials and relatively small sample sizes. More rigorous randomized controlled trials of healthy subjects and specific disease states are warranted. Future research should investigate functional genomics areas both upstream and downstream of MBT-related gene expression changes—from epigenomics to proteomics and metabolomics. PMID:25598735

  7. Nucleotide diversity analysis highlights functionally important genomic regions.

    PubMed

    Tatarinova, Tatiana V; Chekalin, Evgeny; Nikolsky, Yuri; Bruskin, Sergey; Chebotarov, Dmitry; McNally, Kenneth L; Alexandrov, Nickolai

    2016-10-24

    We analyzed functionality and relative distribution of genetic variants across the complete Oryza sativa genome, using the 40 million single nucleotide polymorphisms (SNPs) dataset from the 3,000 Rice Genomes Project (http://snp-seek.irri.org), the largest and highest density SNP collection for any higher plant. We have shown that the DNA-binding transcription factors (TFs) are the most conserved group of genes, whereas kinases and membrane-localized transporters are the most variable ones. TFs may be conserved because they belong to some of the most connected regulatory hubs that modulate transcription of vast downstream gene networks, whereas signaling kinases and transporters need to adapt rapidly to changing environmental conditions. In general, the observed profound patterns of nucleotide variability reveal functionally important genomic regions. As expected, nucleotide diversity is much higher in intergenic regions than within gene bodies (regions spanning gene models), and protein-coding sequences are more conserved than untranslated gene regions. We have observed a sharp decline in nucleotide diversity that begins at about 250 nucleotides upstream of the transcription start and reaches minimal diversity exactly at the transcription start. We found the transcription termination sites to have remarkably symmetrical patterns of SNP density, implying presence of functional sites near transcription termination. Also, nucleotide diversity was significantly lower near 3' UTRs, the area rich with regulatory regions.

  8. Functional Profiling Using the Saccharomyces Genome Deletion Project Collections.

    PubMed

    Nislow, Corey; Wong, Lai Hong; Lee, Amy Huei-Yi; Giaever, Guri

    2016-09-01

    The ability to measure and quantify the fitness of an entire organism requires considerably more complex approaches than simply using traditional "omic" methods that examine, for example, the abundance of RNA transcripts, proteins, or metabolites. The yeast deletion collections represent the only systematic, comprehensive set of null alleles for any organism in which such fitness measurements can be assayed. Generated by the Saccharomyces Genome Deletion Project, these collections allow the systematic and parallel analysis of gene functions using any measurable phenotype. The unique 20-bp molecular barcodes engineered into the genome of each deletion strain facilitate the massively parallel analysis of individual fitness. Here, we present functional genomic protocols for use with the yeast deletion collections. We describe how to maintain, propagate, and store the deletion collections and how to perform growth fitness assays on single and parallel screening platforms. Phenotypic fitness analyses of the yeast mutants, described in brief here, provide important insights into biological functions, mechanisms of drug action, and response to environmental stresses. It is important to bear in mind that the specific assays described in this protocol represent some of the many ways in which these collections can be assayed, and in this description particular attention is paid to maximizing throughput using growth as the phenotypic measure.
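
    The barcode-based fitness measurement described in the record above can be sketched as follows. This is a minimal illustration, not the protocol's own scoring method: it assumes fitness is summarised as the log2 ratio of each strain's relative barcode abundance in a treated pool versus a control pool, and the function names, pseudocount, and strain labels are hypothetical.

```python
import math
from typing import Dict


def relative_abundance(counts: Dict[str, int], pseudocount: float = 1.0) -> Dict[str, float]:
    """Convert raw barcode counts to fractions of the pool, with a pseudocount."""
    total = sum(counts.values()) + pseudocount * len(counts)
    return {strain: (c + pseudocount) / total for strain, c in counts.items()}


def log2_fitness(control: Dict[str, int], treated: Dict[str, int],
                 pseudocount: float = 1.0) -> Dict[str, float]:
    """Log2 ratio of treated to control abundance for strains present in both pools.

    Negative values indicate strains depleted under treatment.
    """
    ctrl = relative_abundance(control, pseudocount)
    trt = relative_abundance(treated, pseudocount)
    return {s: math.log2(trt[s] / ctrl[s]) for s in ctrl if s in trt}


if __name__ == "__main__":
    # Hypothetical counts for two deletion strains.
    scores = log2_fitness({"strain_A": 99, "strain_B": 99},
                          {"strain_A": 149, "strain_B": 49})
    for strain, score in sorted(scores.items()):
        print(strain, round(score, 3))
```

    The pseudocount keeps strains with zero reads from producing an undefined logarithm; its value is a design choice rather than something the abstract specifies.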

  9. Tools for Accurate and Efficient Analysis of Complex Evolutionary Mechanisms in Microbial Genomes. Final Report

    SciTech Connect

    Nakhleh, Luay

    2014-03-12

    I proposed to develop computationally efficient tools for accurate detection and reconstruction of microbes' complex evolutionary mechanisms, thus enabling rapid and accurate annotation, analysis and understanding of their genomes. To achieve this goal, I proposed to address three aspects. (1) Mathematical modeling. A major challenge facing the accurate detection of HGT is that of distinguishing between these two events on the one hand and other events that have similar "effects." I proposed to develop a novel mathematical approach for distinguishing among these events. Further, I proposed to develop a set of novel optimization criteria for the evolutionary analysis of microbial genomes in the presence of these complex evolutionary events. (2) Algorithm design. In this aspect of the project, I proposed to develop an array of efficient and accurate algorithms for analyzing microbial genomes based on the formulated optimization criteria. Further, I proposed to test the viability of the criteria and the accuracy of the algorithms in an experimental setting using both synthetic as well as biological data. (3) Software development. I proposed the final outcome to be a suite of software tools which implements the mathematical models as well as the algorithms developed.

  10. The Princeton Protein Orthology Database (P-POD): A Comparative Genomics Analysis Tool for Biologists

    PubMed Central

    Kang, Fan; Angiuoli, Samuel V.; White, Owen; Botstein, David; Dolinski, Kara

    2007-01-01

    Many biological databases that provide comparative genomics information and tools are now available on the internet. While certainly quite useful, to our knowledge none of the existing databases combine results from multiple comparative genomics methods with manually curated information from the literature. Here we describe the Princeton Protein Orthology Database (P-POD, http://ortholog.princeton.edu), a user-friendly database system that allows users to find and visualize the phylogenetic relationships among predicted orthologs (based on the OrthoMCL method) to a query gene from any of eight eukaryotic organisms, and to see the orthologs in a wider evolutionary context (based on the Jaccard clustering method). In addition to the phylogenetic information, the database contains experimental results manually collected from the literature that can be compared to the computational analyses, as well as links to relevant human disease and gene information via the OMIM, model organism, and sequence databases. Our aim is for the P-POD resource to be extremely useful to typical experimental biologists wanting to learn more about the evolutionary context of their favorite genes. P-POD is based on the commonly used Generic Model Organism Database (GMOD) schema and can be downloaded in its entirety for installation on one's own system. Thus, bioinformaticians and software developers may also find P-POD useful because they can use the P-POD database infrastructure when developing their own comparative genomics resources and database tools. PMID:17712414

  11. Beyond the genome: functional studies of phototrophic sulfur oxidation.

    PubMed

    Hanson, Thomas E; Morgan-Kiss, Rachael M; Chan, Leong-Keat; Hiras, Jennifer

    2010-01-01

    The increasing availability of complete genomic sequences for cultured phototrophic bacteria and assembled metagenomes from environments dominated by phototrophs has reinforced the need for a "post-genomic" analytical effort to test models of cellular structure and function proposed from genomic data. Comparative genomics has produced a testable model for pathways of sulfur compound oxidation in the phototrophic bacteria. In the case of sulfide, two enzymes are predicted to oxidize sulfide: sulfide:quinone oxidoreductase and flavocytochrome c sulfide dehydrogenase. However, these models do not predict which enzyme is important under what conditions. In Chlorobaculum tepidum, a model green sulfur bacterium, a combination of genetics and physiological analysis of mutant strains has led to the realization that this organism contains at least two active sulfide:quinone oxidoreductases and that there is significant interaction between sulfide oxidation and light harvesting. In the case of elemental sulfur, an organothiol intermediate of unknown structure has been proposed to activate elemental sulfur for transport into the cytoplasm where it can be oxidized or assimilated, and recent approaches using classical metabolite analysis have begun to shed light on this issue both in C. tepidum and the purple sulfur bacterium Allochromatium vinosum.

  12. Functional Genomics of Drought Tolerance in Bioenergy Crops

    SciTech Connect

    Yin, Hengfu; Chen, Rick; Yang, Jun; Weston, David; Chen, Jay; Muchero, Wellington; Ye, Ning; Tschaplinski, Timothy J; Wullschleger, Stan D; Cheng, Zong-Ming; Tuskan, Gerald A; Yang, Xiaohan

    2014-01-01

    With the predicted trends in climate change, drought will increasingly impose a grand challenge to biomass production. Most of the bioenergy crops have some degree of drought susceptibility with low water-use efficiency (WUE). It is imperative to improve drought tolerance and WUE in bioenergy crops for sustainable biomass production in arid and semi-arid regions with minimal water input. Genetics and functional genomics can play a critical role in generating knowledge to inform and aid genetic improvement of drought tolerance in bioenergy crops. The molecular aspect of drought response has been extensively investigated in model plants like Arabidopsis, yet our understanding of the molecular mechanisms underlying drought tolerance in bioenergy crops is limited. Crops exhibit various responses to drought stress depending on species and genotype. A rational strategy for studying drought tolerance in bioenergy crops is to translate the knowledge fro