functional genomics strategy: Topics by Science.gov

Sample records for functional genomics strategy

Robust one-Tube Ω-PCR Strategy Accelerates Precise Sequence Modification of Plasmids for Functional Genomics

PubMed Central

Chen, Letian; Wang, Fengpin; Wang, Xiaoyu; Liu, Yao-Guang

2013-01-01

Functional genomics requires vector construction for protein expression and functional characterization of target genes; therefore, a simple, flexible and low-cost molecular manipulation strategy will be highly advantageous for genomics approaches. Here, we describe an Ω-PCR strategy that enables multiple types of sequence modification, including precise insertion, deletion and substitution, in any position of a circular plasmid. Ω-PCR is based on an overlap extension site-directed mutagenesis technique, and is named for its characteristic Ω-shaped secondary structure during PCR. Ω-PCR can be performed either in two steps, or in one tube in combination with exonuclease I treatment. These strategies have wide applications for protein engineering, gene function analysis and in vitro gene splicing. PMID:23335613
Computational Prediction of the Global Functional Genomic Landscape: Applications, Methods and Challenges

PubMed Central

Zhou, Weiqiang; Sherwood, Ben; Ji, Hongkai

2017-01-01

Technological advances have led to an explosive growth of high-throughput functional genomic data. Exploiting the correlation among different data types, it is possible to predict one functional genomic data type from other data types. Prediction tools are valuable in understanding the relationship among different functional genomic signals. They also provide a cost-efficient solution to inferring the unknown functional genomic profiles when experimental data are unavailable due to resource or technological constraints. The predicted data may be used for generating hypotheses, prioritizing targets, interpreting disease variants, facilitating data integration, quality control, and many other purposes. This article reviews various applications of prediction methods in functional genomics, discusses analytical challenges, and highlights some common and effective strategies used to develop prediction methods for functional genomic data. PMID:28076869
A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis

PubMed Central

Down, Thomas A.; Rakyan, Vardhman K.; Turner, Daniel J.; Flicek, Paul; Li, Heng; Kulesha, Eugene; Gräf, Stefan; Johnson, Nathan; Herrero, Javier; Tomazou, Eleni M.; Thorne, Natalie P.; Bäckdahl, Liselotte; Herberth, Marlis; Howe, Kevin L.; Jackson, David K.; Miretti, Marcos M.; Marioni, John C.; Birney, Ewan; Hubbard, Tim J. P.; Durbin, Richard; Tavaré, Simon; Beck, Stephan

2009-01-01

DNA methylation is an indispensable epigenetic modification of mammalian genomes. Consequently, there is great interest in strategies for genome-wide/whole-genome DNA methylation analysis, and immunoprecipitation-based methods have proven to be a powerful option. Such methods are rapidly shifting the bottleneck from data generation to data analysis, necessitating the development of better analytical tools. Until now, a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling has been the inability to estimate absolute methylation levels. Here we report the development of a novel cross-platform algorithm – Bayesian Tool for Methylation Analysis (Batman) – for analyzing Methylated DNA Immunoprecipitation (MeDIP) profiles generated using arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). The latter is an approach we have developed to elucidate the first high-resolution whole-genome DNA methylation profile (DNA methylome) of any mammalian genome. MeDIP-seq/MeDIP-chip combined with Batman represent robust, quantitative, and cost-effective functional genomic strategies for elucidating the function of DNA methylation. PMID:18612301
Enhancer scanning to locate regulatory regions in genomic loci

PubMed Central

Buckley, Melissa; Gjyshi, Anxhela; Mendoza-Fandiño, Gustavo; Baskin, Rebekah; Carvalho, Renato S.; Carvalho, Marcelo A.; Woods, Nicholas T.; Monteiro, Alvaro N.A.

2016-01-01

The present protocol provides a rapid, streamlined and scalable strategy to systematically scan genomic regions for the presence of transcriptional regulatory regions active in a specific cell type. It creates genomic tiles spanning a region of interest that are subsequently cloned by recombination into a luciferase reporter vector containing the Simian Virus 40 promoter. Tiling clones are transfected into specific cell types to test for the presence of transcriptional regulatory regions. The protocol includes testing of different SNP (single nucleotide polymorphism) alleles to determine their effect on regulatory activity. This procedure provides a systematic framework to identify candidate functional SNPs within a locus during functional analysis of genome-wide association studies. This protocol adapts and combines previous well-established molecular biology methods to provide a streamlined strategy, based on automated primer design and recombinational cloning to rapidly go from a genomic locus to a set of candidate functional SNPs in eight weeks. PMID:26658467
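The tiling step described in this record can be illustrated with a minimal Python sketch. It only shows how a region of interest might be cut into overlapping tiles and how a SNP position maps back to the tiles that contain it. The tile length, overlap, coordinates, and function names are hypothetical choices for illustration and are not taken from the protocol, whose automated primer design is not reproduced here.

```python
def make_tiles(start, end, tile_len=2000, overlap=200):
    """Split the half-open interval [start, end) into overlapping tiles.

    tile_len and overlap are illustrative defaults, not protocol values.
    """
    if tile_len <= overlap:
        raise ValueError("tile_len must be larger than overlap")
    step = tile_len - overlap
    tiles = []
    pos = start
    while pos < end:
        # The last tile is truncated at the end of the region.
        tiles.append((pos, min(pos + tile_len, end)))
        if pos + tile_len >= end:
            break
        pos += step
    return tiles


def tiles_containing(tiles, snp_pos):
    """Return the indices of all tiles whose interval covers snp_pos."""
    return [i for i, (s, e) in enumerate(tiles) if s <= snp_pos < e]


if __name__ == "__main__":
    region_tiles = make_tiles(0, 5000)
    print(region_tiles)
    print(tiles_containing(region_tiles, 1900))
```

A SNP that falls in an overlap is covered by two tiles, so either clone could be used to compare its alleles in the reporter assay.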
Ascribing Functions to Genes: Journey Towards Genetic Improvement of Rice Via Functional Genomics

PubMed Central

Mustafiz, Ananda; Kumari, Sumita; Karan, Ratna

2016-01-01

Rice, one of the most important cereal crops for mankind, feeds more than half the world's population. Rice has been heralded as a model cereal owing to its small genome size, ease of transformation, high synteny with other cereal crops and the availability of a complete genome sequence. Moreover, the sequence data available for rice are becoming more refined and precise thanks to resequencing efforts. This enormous resource of sequence data has confronted the research community with a herculean challenge as well as an excellent opportunity to functionally validate the expressed as well as the regulatory portions of the genome. This will not only help us understand the genetic basis of plant architecture and physiology but will also steer us towards developing improved cultivars. No single technique can achieve such a mammoth task. Functional genomics, through its diverse tools, namely loss- and gain-of-function mutants and omics strategies such as transcriptomics, proteomics, metabolomics and phenomics, provides us with the necessary handle. A paradigm shift in functional genomics technologies has been instrumental in generating a considerable amount of information on the functionality of the rice genome. We now have several databases and online resources for functionally validated genes, but despite that we are far from reaching the desired milestone of functionally characterizing each and every rice gene. There is an urgent need for a common platform for the information already available in rice, for concerted collaborative efforts between researchers, and for healthy public-private partnerships, in order to genetically improve the rice crop so that it is better able to handle the pressures of climate change and an exponentially increasing population. PMID:27252584
Development of a targeted transgenesis strategy in highly differentiated cells: a powerful tool for functional genomic analysis.

PubMed

Puttini, Stefania; Ouvrard-Pascaud, Antoine; Palais, Gael; Beggah, Ahmed T; Gascard, Philippe; Cohen-Tannoudji, Michel; Babinet, Charles; Blot-Chabaud, Marcel; Jaisser, Frederic

2005-03-16

Functional genomic analysis is a challenging step in the so-called post-genomic field. Identification of potential targets using large-scale gene expression analysis requires functional validation to identify those that are physiologically relevant. Genetically modified cell models are often used for this purpose allowing up- or down-expression of selected targets in a well-defined and if possible highly differentiated cell type. However, the generation of such models remains time-consuming and expensive. In order to alleviate this step, we developed a strategy aimed at the rapid and efficient generation of genetically modified cell lines with conditional, inducible expression of various target genes. Efficient knock-in of various constructs, called targeted transgenesis, in a locus selected for its permissibility to the tet inducible system, was obtained through the stimulation of site-specific homologous recombination by the meganuclease I-SceI. Our results demonstrate that targeted transgenesis in a reference inducible locus greatly facilitated the functional analysis of the selected recombinant cells. The efficient screening strategy we have designed makes possible automation of the transfection and selection steps. Furthermore, this strategy could be applied to a variety of highly differentiated cells.
Genome Editing for the Study of Cardiovascular Diseases.

PubMed

Chadwick, Alexandra C; Musunuru, Kiran

2017-03-01

The opportunities afforded through the recent advent of genome-editing technologies have allowed investigators to more easily study a number of diseases. The advantages and limitations of the most prominent genome-editing technologies are described in this review, along with potential applications specifically focused on cardiovascular diseases. The recent genome-editing tools using programmable nucleases, such as zinc-finger nucleases, transcription activator-like effector nucleases, and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9), have rapidly been adapted to manipulate genes in a variety of cellular and animal models. A number of recent cardiovascular disease-related publications report cases in which specific mutations are introduced into disease models for functional characterization and for testing of therapeutic strategies. Recent advances in genome-editing technologies offer new approaches to understand and treat diseases. Here, we discuss genome editing strategies to easily characterize naturally occurring mutations and offer strategies with potential clinical relevance.
The Paris-Sud yeast structural genomics pilot-project: from structure to function.

PubMed

Quevillon-Cheruel, Sophie; Liger, Dominique; Leulliot, Nicolas; Graille, Marc; Poupon, Anne; Li de La Sierra-Gallay, Inès; Zhou, Cong-Zhao; Collinet, Bruno; Janin, Joël; Van Tilbeurgh, Herman

2004-01-01

We present here the outlines and results from our yeast structural genomics (YSG) pilot-project. A lab-scale platform for the systematic production and structure determination is presented. In order to validate this approach, 250 non-membrane proteins of unknown structure were targeted. Strategies and final statistics are evaluated. We finally discuss the opportunity of structural genomics programs to contribute to functional biochemical annotation.
The Enzyme Function Initiative

PubMed Central

Gerlt, John A.; Allen, Karen N.; Almo, Steven C.; Armstrong, Richard N.; Babbitt, Patricia C.; Cronan, John E.; Dunaway-Mariano, Debra; Imker, Heidi J.; Jacobson, Matthew P.; Minor, Wladek; Poulter, C. Dale; Raushel, Frank M.; Sali, Andrej; Shoichet, Brian K.; Sweedler, Jonathan V.

2011-01-01

The Enzyme Function Initiative (EFI) was recently established to address the challenge of assigning reliable functions to enzymes discovered in bacterial genome projects; in this Current Topic we review the structure and operations of the EFI. The EFI includes the Superfamily/Genome, Protein, Structure, Computation, and Data/Dissemination Cores that provide the infrastructure for reliably predicting the in vitro functions of unknown enzymes. The initial targets for functional assignment are selected from five functionally diverse superfamilies (amidohydrolase, enolase, glutathione transferase, haloalkanoic acid dehalogenase, and isoprenoid synthase), with five superfamily-specific Bridging Projects experimentally testing the predicted in vitro enzymatic activities. The EFI also includes the Microbiology Core that evaluates the in vivo context of in vitro enzymatic functions and confirms the functional predictions of the EFI. The deliverables of the EFI to the scientific community include: 1) development of a large-scale, multidisciplinary sequence/structure-based strategy for functional assignment of unknown enzymes discovered in genome projects (target selection, protein production, structure determination, computation, experimental enzymology, microbiology, and structure-based annotation); 2) dissemination of the strategy to the community via publications, collaborations, workshops, and symposia; 3) computational and bioinformatic tools for using the strategy; 4) provision of experimental protocols and/or reagents for enzyme production and characterization; and 5) dissemination of data via the EFI’s website, enzymefunction.org. The realization of multidisciplinary strategies for functional assignment will begin to define the full metabolic diversity that exists in nature and will impact basic biochemical and evolutionary understanding, as well as a wide range of applications of central importance to industrial, medicinal and pharmaceutical efforts. PMID:21999478
The Enzyme Function Initiative.

PubMed

Gerlt, John A; Allen, Karen N; Almo, Steven C; Armstrong, Richard N; Babbitt, Patricia C; Cronan, John E; Dunaway-Mariano, Debra; Imker, Heidi J; Jacobson, Matthew P; Minor, Wladek; Poulter, C Dale; Raushel, Frank M; Sali, Andrej; Shoichet, Brian K; Sweedler, Jonathan V

2011-11-22

The Enzyme Function Initiative (EFI) was recently established to address the challenge of assigning reliable functions to enzymes discovered in bacterial genome projects; in this Current Topic, we review the structure and operations of the EFI. The EFI includes the Superfamily/Genome, Protein, Structure, Computation, and Data/Dissemination Cores that provide the infrastructure for reliably predicting the in vitro functions of unknown enzymes. The initial targets for functional assignment are selected from five functionally diverse superfamilies (amidohydrolase, enolase, glutathione transferase, haloalkanoic acid dehalogenase, and isoprenoid synthase), with five superfamily specific Bridging Projects experimentally testing the predicted in vitro enzymatic activities. The EFI also includes the Microbiology Core that evaluates the in vivo context of in vitro enzymatic functions and confirms the functional predictions of the EFI. The deliverables of the EFI to the scientific community include (1) development of a large-scale, multidisciplinary sequence/structure-based strategy for functional assignment of unknown enzymes discovered in genome projects (target selection, protein production, structure determination, computation, experimental enzymology, microbiology, and structure-based annotation), (2) dissemination of the strategy to the community via publications, collaborations, workshops, and symposia, (3) computational and bioinformatic tools for using the strategy, (4) provision of experimental protocols and/or reagents for enzyme production and characterization, and (5) dissemination of data via the EFI's Website, http://enzymefunction.org. The realization of multidisciplinary strategies for functional assignment will begin to define the full metabolic diversity that exists in nature and will impact basic biochemical and evolutionary understanding, as well as a wide range of applications of central importance to industrial, medicinal, and pharmaceutical efforts. 
© 2011 American Chemical Society
Strategies used for genetically modifying bacterial genome: site-directed mutagenesis, gene inactivation, and gene over-expression

PubMed Central

Xu, Jian-zhong; Zhang, Wei-guo

2016-01-01

With the availability of the whole genome sequences of Escherichia coli and Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineered bacteria according to requirements. DNA manipulation involves modifying autologous genes and expressing heterologous genes. Two alternative approaches, using electroporation of linear DNA or a recombinant suicide plasmid, allow a wide variety of DNA manipulations. However, over-expression of the desired gene is generally plasmid-mediated. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integration into the genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010
Application of selection index calculations to determine selection strategies in genomic breeding programs.

PubMed

König, S; Swalve, H H

2009-10-01

The availability of genomic estimated breeding values (GEBV) allows for possible modifications to existing dairy cattle breeding programs. Selection index calculations including genomic and phenotypic observations as index sources were used to determine the optimal number of offspring per genotyped sire with a focus on functional traits and the design of cooperator herds, and to evaluate the importance of a central station test for genotyped bull dams. Evaluation criteria to compare different breeding strategies were correlations between index and aggregate genotype (r(TI)), and the relative selection response percentage (RSR) of an index without single nucleotide polymorphism information in relation to a single nucleotide polymorphism-based index. The number of required daughter records per sire to achieve a predefined r(TI) strongly depends on the accuracy of GEBV (r(mg)) and the heritability of the trait. For a desired r(TI) of 0.8, h² = 0.10, and r(mg) = 0.5, at least 57 additional daughters have to be included in the genetic evaluation. Daughter records of genotyped sires are not necessary for optimal scenarios where r(mg) is greater than or equal to r(TI). There still is a substantial need for phenotypic daughter records, especially for low-heritability functional traits and r(mg) < 0.7. Phenotypic records from genotyped potential bull dams have no relevance for increasing r(TI), even with a low value for r(mg) of 0.5. Hence, genomic breeding programs should focus on recording functional traits within progeny groups, preferably in cooperator herds. For low-heritability traits and with r(mg) > 0.7, the RSR of conventional breeding programs was only 10% of RSR from genomic breeding strategies.
As shown in scenarios including 2 traits in the index as well as in the aggregate genotype, the availability of highly accurate GEBV for production traits and low-accuracy GEBV for functional traits increased the risk of widening the gap between selection responses in production and functionality. Counteractions are possible, such as via higher economic weights for low-heritability functional traits. Finally, an alternative selection strategy considering only 2 pathways of selection for genotyped male calves and for cow dams was evaluated. This strategy is competitive with a 4-pathway genomic breeding program if the fraction of selected male calves for the artificial insemination program is below 1% and if selection is focused on functionality, thus pointing to substantial insufficiencies caused by low reliabilities of breeding values for cows for such traits in conventional bull dam selection schemes.
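The dependence of the required number of daughter records on GEBV accuracy and heritability can be sketched in a few lines of Python. This is a simplified approximation and not necessarily the authors' selection index calculation. It assumes that (a) the accuracy of a sire's progeny mean from n daughters is sqrt(n / (n + k)) with k = (4 - h²) / h², and (b) the GEBV and the daughter records are independent sources whose contributions add on the r² / (1 - r²) scale. The function name is hypothetical.

```python
import math


def daughters_needed(r_ti_target, h2, r_mg):
    """Approximate number of daughters needed to lift index accuracy
    from r_mg (GEBV alone) to r_ti_target.

    Assumptions (not from the source): independent information sources
    that add on the r^2 / (1 - r^2) scale, and a half-sib progeny test
    with k = (4 - h2) / h2.
    """
    if not (0 < h2 <= 1) or not (0 <= r_mg < 1) or not (0 < r_ti_target < 1):
        raise ValueError("h2 must be in (0, 1], accuracies in [0, 1)")
    if r_mg >= r_ti_target:
        # GEBV alone already reaches the target accuracy.
        return 0

    def info(r):
        # Information content of a source with accuracy r.
        return r * r / (1.0 - r * r)

    k = (4.0 - h2) / h2
    # For a progeny mean, info = n / k, so n = k * (missing information).
    return math.ceil(k * (info(r_ti_target) - info(r_mg)))


if __name__ == "__main__":
    print(daughters_needed(0.8, 0.10, 0.5))  # prints 57
```

Under these assumptions the sketch returns 57 daughters for a target accuracy of 0.8, heritability 0.10 and GEBV accuracy 0.5, which agrees with the figure quoted in the abstract, and it returns 0 whenever the GEBV accuracy already meets or exceeds the target.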
Genome Editing for the Study of Cardiovascular Diseases

PubMed Central

Chadwick, Alexandra C.

2018-01-01

Purpose of Review: The opportunities afforded through the recent advent of genome-editing technologies have allowed investigators to more easily study a number of diseases. The advantages and limitations of the most prominent genome-editing technologies are described in this review, along with potential applications specifically focused on cardiovascular diseases. Recent Findings: The recent genome-editing tools using programmable nucleases, such as zinc-finger nucleases, transcription activator-like effector nucleases, and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated 9 (Cas9), have rapidly been adapted to manipulate genes in a variety of cellular and animal models. A number of recent cardiovascular disease-related publications report cases in which specific mutations are introduced into disease models for functional characterization and for testing of therapeutic strategies. Summary: Recent advances in genome-editing technologies offer new approaches to understand and treat diseases. Here, we discuss genome editing strategies to easily characterize naturally occurring mutations and offer strategies with potential clinical relevance. PMID:28220462
Aligning the unalignable: bacteriophage whole genome alignments.

PubMed

Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M

2016-01-13

In recent years, many studies have focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plot diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains.
A Python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on Bitbucket (https://bitbucket.org/thekswenson/alpha).
Biotechnological application of functional genomics towards plant-parasitic nematode control.

PubMed

Li, Jiarui; Todd, Timothy C; Lee, Junghoon; Trick, Harold N

2011-12-01

Plant-parasitic nematodes are primary biotic factors limiting crop production. Current nematode control strategies include nematicides, crop rotation and resistant cultivars, but each has serious limitations. RNA interference (RNAi) represents a major breakthrough in the application of functional genomics for plant-parasitic nematode control. RNAi-induced suppression of numerous genes essential for nematode development, reproduction or parasitism has been demonstrated, highlighting the considerable potential for using this strategy to control damaging pest populations. In an effort to find more suitable and effective gene targets for silencing, researchers are employing functional genomics methodologies, including genome sequencing and transcriptome profiling. Microarrays have been used for studying the interactions between nematodes and plant roots and to measure both plant and nematode transcripts. Furthermore, laser capture microdissection has been applied for the precise dissection of nematode feeding sites (syncytia) to allow the study of gene expression specifically in syncytia. In the near future, small RNA sequencing techniques will provide more direct information for elucidating small RNA regulatory mechanisms in plants, and specific gene silencing using artificial microRNAs should further improve the potential of targeted gene silencing as a strategy for nematode management. © 2011 The Authors. Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Outlier analysis of functional genomic profiles enriches for oncology targets and enables precision medicine.

PubMed

Zhu, Zhou; Ihle, Nathan T; Rejto, Paul A; Zarrinkar, Patrick P

2016-06-13

Genome-scale functional genomic screens across large cell line panels provide a rich resource for discovering tumor vulnerabilities that can lead to the next generation of targeted therapies. Their data analysis typically has focused on identifying genes whose knockdown enhances response in various pre-defined genetic contexts, which are limited by biological complexities as well as the incompleteness of our knowledge. We thus introduce a complementary data mining strategy to identify genes with exceptional sensitivity in subsets, or outlier groups, of cell lines, allowing an unbiased analysis without any a priori assumption about the underlying biology of dependency. Genes with outlier features are strongly and specifically enriched with those known to be associated with cancer and relevant biological processes, despite no a priori knowledge being used to drive the analysis. Identification of exceptional responders (outliers) may lead not only to new candidates for therapeutic intervention, but also to tumor indications and response biomarkers for companion precision medicine strategies. Several tumor suppressors have an outlier sensitivity pattern, supporting and generalizing the notion that tumor suppressors can play context-dependent oncogenic roles. The novel application of outlier analysis described here demonstrates a systematic and data-driven analytical strategy to decipher large-scale functional genomic data for oncology target and precision medicine discoveries.
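The idea of flagging cell lines with exceptional sensitivity to knockdown of a given gene can be sketched as follows. The abstract does not state which outlier statistic the authors used, so this Python example substitutes a generic robust rule (the modified z-score based on the median and the median absolute deviation). The threshold, the cell line names, and the scores are hypothetical.

```python
from statistics import median


def outlier_lines(scores, threshold=3.5):
    """Return cell lines that are exceptionally sensitive for one gene.

    scores maps cell line name -> knockdown score, where more negative
    means more sensitive. A line is flagged when its modified z-score
    (0.6745 * (x - median) / MAD) falls below -threshold. This rule is
    an assumption for illustration, not the published method.
    """
    values = list(scores.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread to measure against, so nothing can be called an outlier.
        return []
    return sorted(
        name
        for name, v in scores.items()
        if 0.6745 * (v - med) / mad < -threshold
    )


if __name__ == "__main__":
    # Hypothetical scores for a single gene across six cell lines.
    gene_scores = {
        "line_A": -0.1, "line_B": 0.0, "line_C": 0.1,
        "line_D": -0.2, "line_E": -3.0, "line_F": 0.2,
    }
    print(outlier_lines(gene_scores))
```

Because the rule uses only the distribution of scores across the panel, it needs no prior grouping of cell lines by genotype, which mirrors the unbiased framing in the abstract.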
A scoring strategy combining statistics and functional genomics supports a possible role for common polygenic variation in autism

PubMed Central

Carayol, Jérôme; Schellenberg, Gerard D.; Dombroski, Beth; Amiet, Claire; Génin, Bérengère; Fontaine, Karine; Rousseau, Francis; Vazart, Céline; Cohen, David; Frazier, Thomas W.; Hardan, Antonio Y.; Dawson, Geraldine; Rio Frio, Thomas

2014-01-01

Autism spectrum disorders (ASD) are highly heritable complex neurodevelopmental disorders with a 4:1 male:female ratio. Common genetic variation could explain 40–60% of the variance in liability to autism. Because of their small effect, genome-wide association studies (GWASs) have only identified a small number of individual single-nucleotide polymorphisms (SNPs). To increase the power of GWASs in complex disorders, methods like convergent functional genomics (CFG) have emerged to extract true association signals from noise and to identify and prioritize genes from SNPs using a scoring strategy combining statistics and functional genomics. We adapted and applied this approach to analyze data from a GWAS performed on families with multiple children affected with autism from Autism Speaks Autism Genetic Resource Exchange (AGRE). We identified a set of 133 candidate markers that were localized in or close to genes with functional relevance in ASD from a discovery population (545 multiplex families); a gender-specific genetic score (GS) based on these common variants explained 1% (P = 0.01 in males) and 5% (P = 8.7 × 10⁻⁷ in females) of genetic variance in an independent sample of multiplex families. Overall, our work demonstrates that prioritization of GWAS data based on functional genomics identified common variants associated with autism and provided additional support for a common polygenic background in autism. PMID:24600472
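A genetic score built from a set of prioritized markers is commonly computed as a weighted count of risk alleles. The Python sketch below shows that generic form only. The abstract does not give the authors' exact scoring formula, so the weights, marker identifiers, and function name are hypothetical, and a sex-specific score is modelled simply by passing a different weight table.

```python
def genetic_score(genotypes, weights):
    """Weighted risk-allele count for one individual.

    genotypes maps marker id -> number of risk alleles carried (0, 1 or 2).
    weights maps marker id -> per-allele weight. Markers missing from
    either table are skipped. This is a generic polygenic score, assumed
    here for illustration.
    """
    score = 0.0
    for marker, allele_count in genotypes.items():
        if allele_count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {marker}")
        if marker in weights:
            score += weights[marker] * allele_count
    return score


if __name__ == "__main__":
    # Hypothetical markers and weights.
    person = {"marker_1": 2, "marker_2": 0, "marker_3": 1}
    male_weights = {"marker_1": 0.5, "marker_2": 1.0, "marker_3": 0.25}
    print(genetic_score(person, male_weights))
```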
Combining functional genomics and chemical biology to identify targets of bioactive compounds.

PubMed

Ho, Cheuk Hei; Piotrowski, Jeff; Dixon, Scott J; Baryshnikova, Anastasia; Costanzo, Michael; Boone, Charles

2011-02-01

Genome sequencing projects have revealed thousands of suspected genes, challenging researchers to develop efficient large-scale functional analysis methodologies. Determining the function of a gene product generally requires a means to alter its function. Genetically tractable model organisms have been widely exploited for the isolation and characterization of activating and inactivating mutations in genes encoding proteins of interest. Chemical genetics represents a complementary approach involving the use of small molecules capable of either inactivating or activating their targets. Saccharomyces cerevisiae has been an important test bed for the development and application of chemical genomic assays aimed at identifying targets and modes of action of known and uncharacterized compounds. Here we review yeast chemical genomic assay strategies for drug target identification. Copyright © 2010 Elsevier Ltd. All rights reserved.
The FUN of identifying gene function in bacterial pathogens; insights from Salmonella functional genomics.

PubMed

Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D

2013-10-01

The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.
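The intersection of the two datasets described in this record amounts to a simple set operation. The Python sketch below shows that step under the assumption that each dataset has already been reduced to a list of gene identifiers. The identifiers and the function name are hypothetical.

```python
def candidate_fun_genes(fun_genes, differentially_expressed, required_for_infection):
    """Return FUN genes supported by both functional genomic approaches.

    fun_genes: genes of unknown function in the genome.
    differentially_expressed: genes flagged by transcriptomics during infection.
    required_for_infection: genes flagged by global mutagenesis.
    """
    return sorted(
        set(fun_genes)
        & set(differentially_expressed)
        & set(required_for_infection)
    )


if __name__ == "__main__":
    # Hypothetical gene identifiers.
    fun = ["g001", "g002", "g003", "g004"]
    expressed = ["g002", "g003", "g010"]
    required = ["g003", "g004", "g010"]
    print(candidate_fun_genes(fun, expressed, required))
```

Genes such as "g010" in the example are dropped even though both experiments flag them, because they already have a known function and are therefore not in the FUN set.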
Functional interrogation of non-coding DNA through CRISPR genome editing

PubMed Central

Canver, Matthew C.; Bauer, Daniel E.; Orkin, Stuart H.

2017-01-01

Methodologies to interrogate non-coding regions have lagged behind those for coding regions, even though non-coding regions comprise the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. PMID:28288828

Schistosoma comparative genomics: integrating genome structure, parasite biology and anthelmintic discovery

PubMed Central

Swain, Martin T.; Larkin, Denis M.; Caffrey, Conor R.; Davies, Stephen J.; Loukas, Alex; Skelly, Patrick J.; Hoffmann, Karl F.

2011-01-01

Schistosoma genomes provide a comprehensive resource for identifying the molecular processes that shape parasite evolution and for discovering novel chemotherapeutic or immunoprophylactic targets. Here, we demonstrate how intra- and intergenus comparative genomics can be used to drive these investigations forward, illustrate the advantages and limitations of these approaches and review how post genomic technologies offer complementary strategies for genome characterisation. While sequencing and functional characterisation of other schistosome/platyhelminth genomes continues to expedite anthelmintic discovery, we contend that future priorities should equally focus on improving assembly quality, and chromosomal assignment, of existing schistosome/platyhelminth genomes. PMID:22024648
Reverse Genetics and High Throughput Sequencing Methodologies for Plant Functional Genomics

PubMed Central

Ben-Amar, Anis; Daldoul, Samia; Reustle, Götz M.; Krczal, Gabriele; Mliki, Ahmed

2016-01-01

In the post-genomic era, increasingly sophisticated genetic tools are being developed with the long-term goal of understanding how the coordinated activity of genes gives rise to a complex organism. With the advent of next-generation sequencing combined with effective computational approaches, a wide variety of plant species have been fully sequenced, yielding a wealth of sequence data on the structure and organization of plant genomes. Since thousands of gene sequences are already known, recently developed functional genomics approaches provide powerful tools to analyze plant gene functions through various gene manipulation technologies. Integration of different omics platforms along with gene annotation and computational analysis may provide a complete view at the systems biology level. Reverse genetics methodologies have been extensively used to assign biological function to a specific gene or gene product. We provide here an updated overview of these high-throughput strategies, highlighting recent advances in the knowledge of functional genomics in plants. PMID:28217003
Genome-health nutrigenomics and nutrigenetics: nutritional requirements or 'nutriomes' for chromosomal stability and telomere maintenance at the individual level.

PubMed

Bull, Caroline; Fenech, Michael

2008-05-01

It is becoming increasingly evident that (a) risk for developmental and degenerative disease increases with more DNA damage, which in turn is dependent on nutritional status, and (b) the optimal concentration of micronutrients for prevention of genome damage is also dependent on genetic polymorphisms that alter the function of genes involved directly or indirectly in the uptake and metabolism of micronutrients required for DNA repair and DNA replication. The development of dietary patterns, functional foods and supplements that are designed to improve genome-health maintenance in individuals with specific genetic backgrounds may provide an important contribution to an optimum health strategy based on the diagnosis and individualised nutritional prevention of genome damage, i.e. genome health clinics. The present review summarises some of the recent knowledge relating to micronutrients that are associated with chromosomal stability and provides some initial insights into the likely nutritional factors that may be expected to have an impact on the maintenance of telomeres. It is evident that developing effective strategies for defining nutrient doses and combinations or 'nutriomes' for genome-health maintenance at the individual level is essential for further progress in this research field.
Functional interactions of archaea, bacteria and viruses in a hypersaline endolithic community.

PubMed

Crits-Christoph, Alexander; Gelsinger, Diego R; Ma, Bing; Wierzchos, Jacek; Ravel, Jacques; Davila, Alfonso; Casero, M Cristina; DiRuggiero, Jocelyne

2016-06-01

Halite endoliths in the Atacama Desert represent one of the most extreme ecosystems on Earth. Cultivation-independent methods were used to examine the functional adaptations of the microbial consortia inhabiting halite nodules. The community was dominated by haloarchaea and functional analysis attributed most of the autotrophic CO2 fixation to one unique cyanobacterium. The assembled 1.1 Mbp genome of a novel nanohaloarchaeon, Candidatus Nanopetramus SG9, revealed a photoheterotrophic life style and a low median isoelectric point (pI) for all predicted proteins, suggesting a 'salt-in' strategy for osmotic balance. Predicted proteins of the algae identified in the community also had pI distributions similar to 'salt-in' strategists. The Nanopetramus genome contained a unique CRISPR/Cas system with a spacer that matched a partial viral genome from the metagenome. A combination of reference-independent methods identified over 30 complete or near complete viral or proviral genomes with diverse genome structure, genome size, gene content and hosts. Putative hosts included Halobacteriaceae, Nanohaloarchaea and Cyanobacteria. Despite the dependence of the halite community on deliquescence for liquid water availability, this study exposed an ecosystem spanning three phylogenetic domains, containing a large diversity of viruses and predominance of a 'salt-in' strategy to balance the high osmotic pressure of the environment. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.
Functional precision cancer medicine: moving beyond pure genomics.

PubMed

Letai, Anthony

2017-09-08

The essential job of precision medicine is to match the right drugs to the right patients. In cancer, precision medicine has been nearly synonymous with genomics. However, sobering recent studies have generally shown that most patients with cancer who receive genomic testing do not benefit from a genomic precision medicine strategy. Although some call the entire project of precision cancer medicine into question, I suggest instead that the tools employed must be broadened. Instead of relying exclusively on big data measurements of initial conditions, we should also acquire highly actionable functional information by perturbing viable primary tumor cells from patients with cancer, for example with cancer therapies.
regioneR: an R/Bioconductor package for the association analysis of genomic regions based on permutation tests.

PubMed

Gel, Bernat; Díez-Villanueva, Anna; Serra, Eduard; Buschbeck, Marcus; Peinado, Miguel A; Malinverni, Roberto

2016-01-15

Statistically assessing the relation between a set of genomic regions and other genomic features is a common and challenging task in genomic and epigenomic analyses. Randomization-based approaches implicitly take into account the complexity of the genome without the need to assume an underlying statistical model. regioneR is an R package that implements a permutation test framework specifically designed to work with genomic regions. In addition to the predefined randomization and evaluation strategies, regioneR is fully customizable, allowing the use of custom strategies to adapt it to specific questions. Finally, it also implements a novel function to evaluate the local specificity of the detected association. regioneR is an R package released under the Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/regioneR). Contact: rmalinverni@carrerasresearch.org. © The Author 2015. Published by Oxford University Press.
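The general idea of a permutation test on genomic regions can be sketched in a few lines of Python. This is not regioneR's API or implementation; it is a simplified illustration that assumes a single linear chromosome, half-open intervals, uniform random placement as the randomization strategy, and the number of overlapping regions as the evaluation function. All names and coordinates are hypothetical.

```python
import random


def count_overlaps(regions, features):
    """Count regions that overlap at least one feature (half-open intervals)."""
    return sum(
        any(r_start < f_end and f_start < r_end for f_start, f_end in features)
        for r_start, r_end in regions
    )


def randomize_regions(regions, genome_length, rng):
    """Place each region at a uniformly random position, keeping its width."""
    randomized = []
    for start, end in regions:
        width = end - start
        new_start = rng.randint(0, genome_length - width)
        randomized.append((new_start, new_start + width))
    return randomized


def permutation_test(regions, features, genome_length, n_perm=1000, seed=0):
    """Return the observed overlap count and an empirical one-sided p-value."""
    rng = random.Random(seed)
    observed = count_overlaps(regions, features)
    null_counts = [
        count_overlaps(randomize_regions(regions, genome_length, rng), features)
        for _ in range(n_perm)
    ]
    at_least_as_extreme = sum(1 for value in null_counts if value >= observed)
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical coordinates on a 100 kb chromosome.
regions = [(100, 200), (300, 400), (900, 1000)]
features = [(150, 160), (350, 360), (950, 960)]
observed, p_value = permutation_test(regions, features, genome_length=100_000)
print(observed, p_value)
```

Swapping in a different `randomize_regions` or `count_overlaps` function changes the null model or the statistic without touching the test itself, which mirrors the customizable randomization and evaluation strategies the abstract mentions.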
Functional interrogation of non-coding DNA through CRISPR genome editing.

PubMed

Canver, Matthew C; Bauer, Daniel E; Orkin, Stuart H

2017-05-15

Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. Copyright © 2017 Elsevier Inc. All rights reserved.
Synthetic biology: Novel approaches for microbiology.

PubMed

Padilla-Vaca, Felipe; Anaya-Velázquez, Fernando; Franco, Bernardo

2015-06-01

In the past twenty years, molecular genetics has created powerful tools for genetic manipulation of living organisms. Whole genome sequencing has provided the information necessary to assess knowledge on gene function and protein networks. In addition, new tools make it possible to modify organisms to perform desired tasks. Gene function analysis is sped up by novel approaches that couple high-throughput data generation with data mining. Synthetic biology is an emerging field that uses tools for generating novel gene networks, whole genome synthesis and engineering. New applications in biotechnological, pharmaceutical and biomedical research are envisioned for synthetic biology. In recent years these new strategies have opened up possibilities for gene and genome editing and for the creation of novel tools for functional studies in viruses, parasites and pathogenic bacteria. There is also the possibility to re-design organisms to generate vaccine subunits or produce new pharmaceuticals to combat multi-drug resistant pathogens. In this review we provide our opinion on the applicability of synthetic biology strategies for functional studies of pathogenic organisms and some applications such as genome editing and gene network studies to further comprehend virulence factors and determinants in pathogenic organisms. We also discuss what we consider important ethical issues for this field of molecular biology, especially the potential misuse of the new technologies. Copyright © by the Spanish Society for Microbiology and Institute for Catalan Studies.
Genome health nutrigenomics and nutrigenetics--diagnosis and nutritional treatment of genome damage on an individual basis.

PubMed

Fenech, Michael

2008-04-01

The term nutrigenomics refers to the effect of diet on gene expression. The term nutrigenetics refers to the impact of inherited traits on the response to a specific dietary pattern, functional food or supplement on a specific health outcome. The specific fields of genome health nutrigenomics and genome health nutrigenetics are emerging as important new research areas because it is becoming increasingly evident that (a) risk for developmental and degenerative disease increases with DNA damage which in turn is dependent on nutritional status and (b) optimal concentration of micronutrients for prevention of genome damage is also dependent on genetic polymorphisms that alter function of genes involved directly or indirectly in uptake and metabolism of micronutrients required for DNA repair and DNA replication. Development of dietary patterns, functional foods and supplements that are designed to improve genome health maintenance in humans with specific genetic backgrounds may provide an important contribution to a new optimum health strategy based on the diagnosis and individualised nutritional treatment of genome instability i.e. Genome Health Clinics.
Targeted activation of diverse CRISPR-Cas systems for mammalian genome editing via proximal CRISPR targeting.

PubMed

Chen, Fuqiang; Ding, Xiao; Feng, Yongmei; Seebeck, Timothy; Jiang, Yanfang; Davis, Gregory D

2017-04-07

Bacterial CRISPR-Cas systems comprise diverse effector endonucleases with different targeting ranges, specificities and enzymatic properties, but many of them are inactive in mammalian cells and are thus precluded from genome-editing applications. Here we show that the type II-B FnCas9 from Francisella novicida possesses novel properties, but its nuclease function is frequently inhibited at many genomic loci in living human cells. Moreover, we develop a proximal CRISPR (termed proxy-CRISPR) targeting method that restores FnCas9 nuclease activity in a target-specific manner. We further demonstrate that this proxy-CRISPR strategy is applicable to diverse CRISPR-Cas systems, including type II-C Cas9 and type V Cpf1 systems, and can facilitate precise gene editing even between identical genomic sites within the same genome. Our findings provide a novel strategy to enable use of diverse otherwise inactive CRISPR-Cas systems for genome-editing applications and a potential path to modulate the impact of chromatin microenvironments on genome modification.
Targeted activation of diverse CRISPR-Cas systems for mammalian genome editing via proximal CRISPR targeting

PubMed Central

Chen, Fuqiang; Ding, Xiao; Feng, Yongmei; Seebeck, Timothy; Jiang, Yanfang; Davis, Gregory D.

2017-01-01

Bacterial CRISPR–Cas systems comprise diverse effector endonucleases with different targeting ranges, specificities and enzymatic properties, but many of them are inactive in mammalian cells and are thus precluded from genome-editing applications. Here we show that the type II-B FnCas9 from Francisella novicida possesses novel properties, but its nuclease function is frequently inhibited at many genomic loci in living human cells. Moreover, we develop a proximal CRISPR (termed proxy-CRISPR) targeting method that restores FnCas9 nuclease activity in a target-specific manner. We further demonstrate that this proxy-CRISPR strategy is applicable to diverse CRISPR–Cas systems, including type II-C Cas9 and type V Cpf1 systems, and can facilitate precise gene editing even between identical genomic sites within the same genome. Our findings provide a novel strategy to enable use of diverse otherwise inactive CRISPR–Cas systems for genome-editing applications and a potential path to modulate the impact of chromatin microenvironments on genome modification. PMID:28387220
Functional Annotations of Paralogs: A Blessing and a Curse

PubMed Central

Zallot, Rémi; Harrison, Katherine J.; Kolaczkowski, Bryan; de Crécy-Lagard, Valérie

2016-01-01

Gene duplication followed by mutation is a classic mechanism of neofunctionalization, producing gene families with functional diversity. In some cases, a single point mutation is sufficient to change the substrate specificity and/or the chemistry performed by an enzyme, making it difficult to accurately separate enzymes with identical functions from homologs with different functions. Because sequence similarity is often used as a basis for assigning functional annotations to genes, non-isofunctional gene families pose a great challenge for genome annotation pipelines. Here we describe how integrating evolutionary and functional information such as genome context, phylogeny, metabolic reconstruction and signature motifs may be required to correctly annotate multifunctional families. These integrative analyses can also lead to the discovery of novel gene functions, as hints from specific subgroups can guide the functional characterization of other members of the family. We demonstrate how careful manual curation processes using comparative genomics can disambiguate subgroups within large multifunctional families and discover their functions. We present the COG0720 protein family as a case study. We also discuss strategies to automate this process to improve the accuracy of genome functional annotation pipelines. PMID:27618105
A Parvovirus B19 synthetic genome: sequence features and functional competence.

PubMed

Manaresi, Elisabetta; Conti, Ilaria; Bua, Gloria; Bonvicini, Francesca; Gallinella, Giorgio

2017-08-01

Central to genetic studies for Parvovirus B19 (B19V) is the availability of genomic clones that may possess functional competence and ability to generate infectious virus. In our study, we established a new model genetic system for Parvovirus B19. A synthetic approach was followed, by design of a reference genome sequence, by generation of a corresponding artificial construct and its molecular cloning in a complete and functional form, and by setup of an efficient strategy to generate infectious virus, via transfection in UT7/EpoS1 cells and amplification in erythroid progenitor cells. The synthetic genome was able to generate virus with biological properties paralleling those of native virus, its infectious activity being dependent on the preservation of self-complementarity and sequence heterogeneity within the terminal regions. A virus of defined genome sequence, obtained from controlled cell culture conditions, can constitute a reference tool for investigation of the structural and functional characteristics of the virus. Copyright © 2017 Elsevier Inc. All rights reserved.
Genome sequencing and comparative genomics reveal a repertoire of putative pathogenicity genes in chilli anthracnose fungus Colletotrichum truncatum.

PubMed

Rao, Soumya; Nandineni, Madhusudan R

2017-01-01

Colletotrichum truncatum, a major fungal phytopathogen, causes the anthracnose disease on an economically important spice crop chilli (Capsicum annuum), resulting in huge economic losses in tropical and sub-tropical countries. It follows a subcuticular intramural infection strategy on chilli with a short, asymptomatic, endophytic phase, which contrasts with the intracellular hemibiotrophic lifestyle adopted by most of the Colletotrichum species. However, little is known about the molecular determinants and the mechanism of pathogenicity in this fungus. A high quality whole genome sequence and gene annotation based on transcriptome data of an Indian isolate of C. truncatum from chilli has been obtained. Analysis of the genome sequence revealed a rich repertoire of pathogenicity genes in C. truncatum encoding secreted proteins, effectors, plant cell wall degrading enzymes, secondary metabolism associated proteins, with potential roles in the host-specific infection strategy, placing it next only to the Fusarium species. The size of genome assembly, number of predicted genes and some of the functional categories were similar to other sequenced Colletotrichum species. The comparative genomic analyses with other species and related fungi identified some unique genes and certain highly expanded gene families of CAZymes, proteases and secondary metabolism associated genes in the genome of C. truncatum. The draft genome assembly and functional annotation of potential pathogenicity genes of C. truncatum provide an important genomic resource for understanding the biology and lifestyle of this important phytopathogen and will pave the way for designing efficient disease control regimens.
Genome sequencing and comparative genomics reveal a repertoire of putative pathogenicity genes in chilli anthracnose fungus Colletotrichum truncatum

PubMed Central

Rao, Soumya

2017-01-01

Colletotrichum truncatum, a major fungal phytopathogen, causes the anthracnose disease on an economically important spice crop chilli (Capsicum annuum), resulting in huge economic losses in tropical and sub-tropical countries. It follows a subcuticular intramural infection strategy on chilli with a short, asymptomatic, endophytic phase, which contrasts with the intracellular hemibiotrophic lifestyle adopted by most of the Colletotrichum species. However, little is known about the molecular determinants and the mechanism of pathogenicity in this fungus. A high quality whole genome sequence and gene annotation based on transcriptome data of an Indian isolate of C. truncatum from chilli has been obtained. Analysis of the genome sequence revealed a rich repertoire of pathogenicity genes in C. truncatum encoding secreted proteins, effectors, plant cell wall degrading enzymes, secondary metabolism associated proteins, with potential roles in the host-specific infection strategy, placing it next only to the Fusarium species. The size of genome assembly, number of predicted genes and some of the functional categories were similar to other sequenced Colletotrichum species. The comparative genomic analyses with other species and related fungi identified some unique genes and certain highly expanded gene families of CAZymes, proteases and secondary metabolism associated genes in the genome of C. truncatum. The draft genome assembly and functional annotation of potential pathogenicity genes of C. truncatum provide an important genomic resource for understanding the biology and lifestyle of this important phytopathogen and will pave the way for designing efficient disease control regimens. PMID:28846714
Genetic and Proteomic Interrogation of Lower Confidence Candidate Genes Reveals Signaling Networks in beta-Catenin-Active Cancers | Office of Cancer Genomics

Cancer.gov

Genome-scale expression studies and comprehensive loss-of-function genetic screens have focused almost exclusively on the highest confidence candidate genes. Here, we describe a strategy for characterizing the lower confidence candidates identified by such approaches.
Dissecting genome-wide association signals for loss-of-function phenotypes in sorghum flavonoid pigmentation traits

USDA-ARS's Scientific Manuscript database

Genome-wide association studies (GWAS) are a powerful method to dissect the genetic basis of traits, though in practice the effects of complex genetic architecture and population structure remain poorly understood. To compare mapping strategies we dissect the genetic control of flavonoid pigmentatio...
Genetic recombination pathways and their application for genome modification of human embryonic stem cells.

PubMed

Nieminen, Mikko; Tuuri, Timo; Savilahti, Harri

2010-10-01

Human embryonic stem cells are pluripotent cells derived from the early human embryo and retain the potential to differentiate into all adult cell types. They provide vast opportunities in cell replacement therapies and are expected to become significant tools in drug discovery as well as in studies of the cellular and developmental functions of human genes. Progress in applying different types of DNA recombination reactions for genome modification in a variety of eukaryotic cell types has provided the means to use recombination-based strategies in human embryonic stem cells as well. Homologous recombination-based methods, particularly those utilizing extended homologous regions and those employing zinc finger nucleases to boost genomic integration, have shown their usefulness in efficient genome modification. Site-specific recombination systems are potent genome modifiers, and they can be used to integrate DNA into loci that contain an appropriate recombination signal sequence, either naturally occurring or suitably pre-engineered. Non-homologous recombination can be used to generate random integrations in genomes relatively effortlessly, albeit with moderate efficiency and precision. DNA transposition-based strategies offer substantially more efficient random integration and provide the means to generate single-copy insertions, thus enabling the generation of genome-wide insertion libraries applicable in genetic screens. 2010 Elsevier Inc. All rights reserved.
Neoclassic drug discovery: the case for lead generation using phenotypic and functional approaches.

PubMed

Lee, Jonathan A; Berg, Ellen L

2013-12-01

Innovation and new molecular entity production by the pharmaceutical industry has been below expectations. Surprisingly, more first-in-class small-molecule drugs approved by the U.S. Food and Drug Administration (FDA) between 1999 and 2008 were identified by functional phenotypic lead generation strategies reminiscent of pre-genomics pharmacology than contemporary molecular targeted strategies that encompass the vast majority of lead generation efforts. This observation, in conjunction with the difficulty in validating molecular targets for drug discovery, has diminished the impact of the "genomics revolution" and has led to a growing grassroots movement and now broader trend in pharma to reconsider the use of modern physiology-based or phenotypic drug discovery (PDD) strategies. This "From the Guest Editors" column provides an introduction and overview of the two-part special issues of Journal of Biomolecular Screening on PDD. Terminology and the business case for use of PDD are defined. Key issues such as assay performance, chemical optimization, target identification, and challenges to the organization and implementation of PDD are discussed. Possible solutions for these challenges and a new neoclassic vision for PDD that combines phenotypic and functional approaches with technology innovations resulting from the genomics-driven era of target-based drug discovery (TDD) are also described. Finally, an overview of the manuscripts in this special edition is provided.
Multi-allelic haplotype model based on genetic partition for genomic prediction and variance component estimation using SNP markers.

PubMed

Da, Yang

2015-12-18

The amount of functional genomic information has been growing rapidly but remains largely unused in genomic selection. Genomic prediction and estimation using haplotypes in genome regions with functional elements such as all genes of the genome can be an approach to integrate functional and structural genomic information for genomic selection. Towards this goal, this article develops a new haplotype approach for genomic prediction and estimation. A multi-allelic haplotype model treating each haplotype as an 'allele' was developed for genomic prediction and estimation based on the partition of a multi-allelic genotypic value into additive and dominance values. Each additive value is expressed as a function of h - 1 additive effects, where h = number of alleles or haplotypes, and each dominance value is expressed as a function of h(h - 1)/2 dominance effects. For a sample of q individuals, the limit number of effects is 2q - 1 for additive effects and is the number of heterozygous genotypes for dominance effects. Additive values are factorized as a product between the additive model matrix and the h - 1 additive effects, and dominance values are factorized as a product between the dominance model matrix and the h(h - 1)/2 dominance effects. Genomic additive relationship matrix is defined as a function of the haplotype model matrix for additive effects, and genomic dominance relationship matrix is defined as a function of the haplotype model matrix for dominance effects. Based on these results, a mixed model implementation for genomic prediction and variance component estimation that jointly use haplotypes and single markers is established, including two computing strategies for genomic prediction and variance component estimation with identical results. 
The multi-allelic genetic partition fills a theoretical gap in genetic partition by providing general formulations for partitioning multi-allelic genotypic values and provides a haplotype method based on the quantitative genetics model towards the utilization of functional and structural genomic information for genomic prediction and estimation.
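The effect counts stated in this abstract can be checked with a short Python sketch. The function names are hypothetical; the formulas are taken directly from the abstract (h - 1 additive effects, h(h - 1)/2 dominance effects, and at most 2q - 1 additive effects for q individuals), and the sketch assumes diploid individuals so that q individuals carry at most 2q distinct haplotypes.

```python
def effect_counts(h):
    """Return (additive, dominance) effect counts for h alleles or haplotypes."""
    if h < 1:
        raise ValueError("h must be at least 1")
    additive = h - 1
    dominance = h * (h - 1) // 2
    return additive, dominance


def max_additive_effects(q):
    """Upper limit on additive effects for q diploid individuals (2q - 1)."""
    if q < 1:
        raise ValueError("q must be at least 1")
    return 2 * q - 1


# Example: a genome region with 4 distinct haplotypes in a sample of 10 individuals.
additive, dominance = effect_counts(4)
limit = max_additive_effects(10)
print(additive, dominance, limit)
```

With h = 4 the model has 3 additive and 6 dominance effects, and a sample of 10 individuals can support at most 19 additive effects.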

Genome-scale deletion screening of human long non-coding RNAs using a paired-guide RNA CRISPR library

PubMed Central

Zhu, Shiyou; Li, Wei; Liu, Jingze; Chen, Chen-Hao; Liao, Qi; Xu, Ping; Xu, Han; Xiao, Tengfei; Cao, Zhongzheng; Peng, Jingyu; Yuan, Pengfei; Brown, Myles; Liu, Xiaole Shirley; Wei, Wensheng

2017-01-01

CRISPR/Cas9 screens have been widely adopted to analyse coding gene functions, but high throughput screening of non-coding elements using this method is more challenging, because indels caused by a single cut in non-coding regions are unlikely to produce a functional knockout. A high-throughput method to produce deletions of non-coding DNA is needed. Herein, we report a high throughput genomic deletion strategy to screen for functional long non-coding RNAs (lncRNAs) that is based on a lentiviral paired-guide RNA (pgRNA) library. Applying our screening method, we identified 51 lncRNAs that can positively or negatively regulate human cancer cell growth. We individually validated 9 lncRNAs using CRISPR/Cas9-mediated genomic deletion and functional rescue, CRISPR activation or inhibition, and gene expression profiling. Our high-throughput pgRNA genome deletion method should enable rapid identification of functional mammalian non-coding elements. PMID:27798563
Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits.

PubMed

Larsson, John; Nylander, Johan Aa; Bergman, Birgitta

2011-06-30

Cyanobacteria belong to an ancient group of photosynthetic prokaryotes with pronounced variations in their cellular differentiation strategies, physiological capacities and choice of habitat. Sequencing efforts have shown that genomes within this phylum are equally diverse in terms of size and protein-coding capacity. To increase our understanding of genomic changes in the lineage, the genomes of 58 contemporary cyanobacteria were analysed for shared and unique orthologs. A total of 404 protein families, present in all cyanobacterial genomes, were identified. Two of these are unique to the phylum, corresponding to an AbrB family transcriptional regulator and a gene that escapes functional annotation although its genomic neighbourhood is conserved among the organisms examined. The evolution of cyanobacterial genome sizes involves a mix of gains and losses in the clade encompassing complex cyanobacteria, while a single event of reduction is evident in a clade dominated by unicellular cyanobacteria. Genome sizes and gene family copy numbers evolve at a higher rate in the former clade, and multi-copy genes were predominant in large genomes. Orthologs unique to cyanobacteria exhibiting specific characteristics, such as filament formation, heterocyst differentiation, diazotrophy and symbiotic competence, were also identified. An ancestral character reconstruction suggests that the most recent common ancestor of cyanobacteria had a genome size of approx. 4.5 Mbp and 1678 to 3291 protein-coding genes, 4%-6% of which are unique to cyanobacteria today. The different rates of genome-size evolution and multi-copy gene abundance suggest two routes of genome development in the history of cyanobacteria. The expansion strategy is driven by gene-family enlargement and generates a broad adaptive potential, while the genome streamlining strategy imposes adaptations to highly specific niches, also reflected in their different functional capacities.
A few genomes display extreme proliferation of non-coding nucleotides which is likely to be the result of initial expansion of genomes/gene copy number to gain adaptive potential, followed by a shift to a life-style in a highly specific niche (e.g. symbiosis). This transition results in redundancy of genes and gene families, leading to an increase in junk DNA and eventually to gene loss. A few orthologs can be correlated with specific phenotypes in cyanobacteria, such as filament formation and symbiotic competence; these constitute exciting exploratory targets.
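The notion of protein families shared by all genomes, and of shared families absent outside the phylum, can be illustrated with a small Python sketch. The genome names, family identifiers and function names are hypothetical; the sketch assumes that ortholog clustering has already been done and that each genome is represented by the set of families it contains. It is not the authors' pipeline.

```python
def core_families(families_by_genome):
    """Return the families present in every genome of the collection."""
    genome_sets = [set(families) for families in families_by_genome.values()]
    if not genome_sets:
        return set()
    return set.intersection(*genome_sets)


def phylum_specific(core, outgroup_families):
    """Return core families that are not found in any outgroup genome."""
    return set(core) - set(outgroup_families)


# Hypothetical family content of three genomes and of an outgroup.
families_by_genome = {
    "genome_A": {"fam1", "fam2", "fam3", "fam4"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam2", "fam4", "fam5"},
}
outgroup_families = {"fam1", "fam3", "fam5"}

core = core_families(families_by_genome)
unique_to_phylum = phylum_specific(core, outgroup_families)
print(sorted(core), sorted(unique_to_phylum))
```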
Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits

PubMed Central

2011-01-01

Background: Cyanobacteria belong to an ancient group of photosynthetic prokaryotes with pronounced variations in their cellular differentiation strategies, physiological capacities and choice of habitat. Sequencing efforts have shown that genomes within this phylum are equally diverse in terms of size and protein-coding capacity. To increase our understanding of genomic changes in the lineage, the genomes of 58 contemporary cyanobacteria were analysed for shared and unique orthologs. Results: A total of 404 protein families, present in all cyanobacterial genomes, were identified. Two of these are unique to the phylum, corresponding to an AbrB family transcriptional regulator and a gene that escapes functional annotation although its genomic neighbourhood is conserved among the organisms examined. The evolution of cyanobacterial genome sizes involves a mix of gains and losses in the clade encompassing complex cyanobacteria, while a single event of reduction is evident in a clade dominated by unicellular cyanobacteria. Genome sizes and gene family copy numbers evolve at a higher rate in the former clade, and multi-copy genes were predominant in large genomes. Orthologs unique to cyanobacteria exhibiting specific characteristics, such as filament formation, heterocyst differentiation, diazotrophy and symbiotic competence, were also identified. An ancestral character reconstruction suggests that the most recent common ancestor of cyanobacteria had a genome size of approx. 4.5 Mbp and 1678 to 3291 protein-coding genes, 4%-6% of which are unique to cyanobacteria today. Conclusions: The different rates of genome-size evolution and multi-copy gene abundance suggest two routes of genome development in the history of cyanobacteria. The expansion strategy is driven by gene-family enlargement and generates a broad adaptive potential, while the genome streamlining strategy imposes adaptations to highly specific niches, also reflected in their different functional capacities.
A few genomes display extreme proliferation of non-coding nucleotides which is likely to be the result of initial expansion of genomes/gene copy number to gain adaptive potential, followed by a shift to a life-style in a highly specific niche (e.g. symbiosis). This transition results in redundancy of genes and gene families, leading to an increase in junk DNA and eventually to gene loss. A few orthologs can be correlated with specific phenotypes in cyanobacteria, such as filament formation and symbiotic competence; these constitute exciting exploratory targets. PMID:21718514
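The shared-versus-unique ortholog comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the genome labels, the family identifiers and the outgroup set are all hypothetical, and ortholog families are assumed to have been assigned already by some upstream clustering step.

```python
def core_and_phylum_specific(families_by_genome, outgroup_families):
    """Return (core, phylum_specific) sets of ortholog family IDs.

    core            = families present in every genome of the phylum
    phylum_specific = core families absent from the outgroup
    Both inputs are hypothetical: family assignment is assumed done upstream.
    """
    core = set.intersection(*(set(f) for f in families_by_genome.values()))
    phylum_specific = core - set(outgroup_families)
    return core, phylum_specific


# Toy data standing in for per-genome ortholog family assignments.
genomes = {
    "genome_A": {"fam1", "fam2", "fam3", "fam4"},
    "genome_B": {"fam1", "fam2", "fam4", "fam5"},
    "genome_C": {"fam1", "fam2", "fam4"},
}
outgroup = {"fam1", "fam5"}  # families also seen outside the phylum

core, specific = core_and_phylum_specific(genomes, outgroup)
print(sorted(core))      # ['fam1', 'fam2', 'fam4']
print(sorted(specific))  # ['fam2', 'fam4']
```

In the study itself the analogous numbers are 404 core families across 58 genomes, two of which are unique to the phylum.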
ABCdb: an online resource for ABC transporter repertories from sequenced archaeal and bacterial genomes.

PubMed

Fichant, Gwennaele; Basse, Marie-Jeanne; Quentin, Yves

2006-03-01

The ATP-binding cassette (ABC) transporters are one of the major classes of active transporters. They are widespread in archaea, bacteria, and eukaryota, indicating that they arose early in evolution. They are involved in many essential physiological processes, but the majority import or export a wide variety of compounds across cellular membranes. These systems share a common architecture composed of four (exporters) or five (importers) domains. To identify and reconstruct functional ABC transporters encoded by archaeal and bacterial genomes, we have developed a bioinformatic strategy. Cross-reference to the transport classification system is used to predict the type of compound transported. A high quality of annotation is achieved by manual verification of the predictions. However, in order to face the rapid increase in the number of published genomes, we also include analyses of genomes issuing directly from the automated strategy. Querying the database (http://www-abcdb.biotoul.fr) allows easy retrieval of ABC transporter repertories and related data. Additional query tools have been developed for the analysis of the ABC family from both functional and evolutionary perspectives.
Methods of epigenome editing for probing the function of genomic imprinting.

PubMed

Rienecker, Kira DA; Hill, Matthew J; Isles, Anthony R

2016-10-01

The curious patterns of imprinted gene expression draw interest from several scientific disciplines to the functional consequences of genomic imprinting. Methods of probing the function of imprinting itself have largely been indirect and correlational, relying heavily on conventional transgenics. Recently, the burgeoning field of epigenome editing has provided new tools and suggested strategies for asking causal questions with site specificity. This perspective article aims to outline how these new methods may be applied to questions of functional imprinting and, with this aim in mind, to suggest new dimensions for the expansion of these epigenome-editing tools.
Metagenomic analysis and functional characterization of the biogas microbiome using high throughput shotgun sequencing and a novel binning strategy.

PubMed

Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis G; De Francisci, Davide; Valle, Giorgio; Angelidaki, Irini

2016-01-01

Biogas production is an economically attractive technology that has gained momentum worldwide over the past years. Biogas is produced by a biologically mediated process, widely known as "anaerobic digestion." This process is performed by a specialized and complex microbial community, in which different members have distinct roles in the establishment of a collective organization. Deciphering the complex microbial community engaged in this process is interesting both for unraveling the network of bacterial interactions and for the potential applicability of the derived knowledge. In this study, we dissect the microbiome involved in anaerobic digestion by means of high throughput Illumina sequencing (~51 gigabases of sequence data), disclosing nearly one million genes and extracting 106 microbial genomes by a novel strategy combining two binning processes. Microbial phylogeny and putative taxonomy performed using >400 proteins revealed that the biogas community is a trove of new species. A new approach based on functional properties as per network representation was developed to assign roles to the microbial species. The organization of the anaerobic digestion microbiome resembles a funnel, in which the microbial consortium presents a progressive functional specialization while reaching the final step of the process (i.e., methanogenesis). Key microbial genomes encoding enzymes involved in specific metabolic pathways, such as carbohydrate utilization, fatty acid degradation, amino acid fermentation, and syntrophic acetate oxidation, were identified. Additionally, the analysis identified a new uncultured archaeon that was putatively related to Methanomassiliicoccales but surprisingly had a methylotrophic methanogenic pathway. This study is pioneering research on the phylogenetic and functional characterization of the microbial community populating biogas reactors. 
By applying for the first time high-throughput sequencing and a novel binning strategy, the identified genes were anchored to single genomes providing a clear understanding of their metabolic pathways and highlighting their involvement in anaerobic digestion. The overall research established a reference catalog of biogas microbial genomes that will greatly simplify future genomic studies.
Experimental Induction of Genome Chaos.

PubMed

Ye, Christine J; Liu, Guo; Heng, Henry H

2018-01-01

Genome chaos, or karyotype chaos, represents a powerful survival strategy for somatic cells under high levels of stress/selection. Since the genome context, not the gene content, encodes the genomic blueprint of the cell, stress-induced rapid and massive reorganization of genome topology functions as a very important mechanism for genome (karyotype) evolution. In recent years, the phenomenon of genome chaos has been confirmed by various sequencing efforts, and many different terms have been coined to describe different subtypes of the chaotic genome including "chromothripsis," "chromoplexy," and "structural mutations." To advance this exciting field, we need an effective experimental system to induce and characterize the karyotype reorganization process. In this chapter, an experimental protocol to induce chaotic genomes is described, following a brief discussion of the mechanism and implication of genome chaos in cancer evolution.
Functional genomic analysis of drug sensitivity pathways to guide adjuvant strategies in breast cancer

PubMed Central

Swanton, Charles; Szallasi, Zoltan; Brenton, James D; Downward, Julian

2008-01-01

The widespread introduction of high throughput RNA interference screening technology has revealed tumour drug sensitivity pathways to common cytotoxics such as paclitaxel, doxorubicin and 5-fluorouracil, targeted agents such as trastuzumab and inhibitors of AKT and Poly(ADP-ribose) polymerase (PARP) as well as endocrine therapies such as tamoxifen. Given the limited power of microarray signatures to predict therapeutic response in associative studies of small clinical trial cohorts, the use of functional genomic data combined with expression or sequence analysis of genes and microRNAs implicated in drug response in human tumours may provide a more robust method to guide adjuvant treatment strategies in breast cancer that are transferable across different expression platforms and patient cohorts. PMID:18986507
Functional Assays to Screen and Dissect Genomic Hits: Doubling Down on the National Investment in Genomic Research.

PubMed

Musunuru, Kiran; Bernstein, Daniel; Cole, F Sessions; Khokha, Mustafa K; Lee, Frank S; Lin, Shin; McDonald, Thomas V; Moskowitz, Ivan P; Quertermous, Thomas; Sankaran, Vijay G; Schwartz, David A; Silverman, Edwin K; Zhou, Xiaobo; Hasan, Ahmed A K; Luo, Xiao-Zhong James

2018-04-01

The National Institutes of Health have made substantial investments in genomic studies and technologies to identify DNA sequence variants associated with human disease phenotypes. The National Heart, Lung, and Blood Institute has been at the forefront of these commitments to ascertain genetic variation associated with heart, lung, blood, and sleep diseases and related clinical traits. Genome-wide association studies, exome- and genome-sequencing studies, and exome-genotyping studies of the National Heart, Lung, and Blood Institute-funded epidemiological and clinical case-control studies are identifying large numbers of genetic variants associated with heart, lung, blood, and sleep phenotypes. However, investigators face challenges in identifying the genomic variants that are functionally disruptive among the myriad computationally implicated variants. Studies to define mechanisms of genetic disruption encoded by computationally identified genomic variants require reproducible, adaptable, and inexpensive methods to screen candidate variant and gene function. High-throughput strategies will permit a tiered variant discovery and genetic mechanism approach that begins with rapid functional screening of a large number of computationally implicated variants and genes for discovery of those that merit mechanistic investigation. As such, improved variant-to-gene and gene-to-function screens, and adequate support for such studies, are critical to accelerating the translation of genomic findings. In this White Paper, we outline the variety of novel technologies, assays, and model systems that are making such screens faster, cheaper, and more accurate, referencing published work and ongoing work supported by the National Heart, Lung, and Blood Institute's R21/R33 Functional Assays to Screen Genomic Hits program. We discuss priorities that can accelerate the impressive but incomplete progress represented by big data genomic research. © 2018 American Heart Association, Inc.
Functional Genomics Using the Saccharomyces cerevisiae Yeast Deletion Collections.

PubMed

Nislow, Corey; Wong, Lai Hong; Lee, Amy Huei-Yi; Giaever, Guri

2016-09-01

Constructed by a consortium of 16 laboratories, the Saccharomyces genome-wide deletion collections have, for the past decade, provided a powerful, rapid, and inexpensive approach for functional profiling of the yeast genome. Loss-of-function deletion mutants were systematically created using a polymerase chain reaction (PCR)-based gene deletion strategy to generate a start-to-stop codon replacement of each open reading frame by homologous recombination. Each strain carries two molecular barcodes that serve as unique strain identifiers, enabling their growth to be analyzed in parallel and the fitness contribution of each gene to be quantitatively assessed by hybridization to high-density oligonucleotide arrays or through the use of next-generation sequencing technologies. Functional profiling of the deletion collections, using either strain-by-strain or parallel assays, provides an unbiased approach to systematically survey the yeast genome. The Saccharomyces yeast deletion collections have proved immensely powerful in contributing to the understanding of gene function, including functional relationships between genes and genetic pathways in response to diverse genetic and environmental perturbations. © 2016 Cold Spring Harbor Laboratory Press.
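As an illustration of how barcode sequencing counts from a pooled deletion collection could be turned into a per-strain fitness score, here is a minimal Python sketch. The log2 ratio of normalized barcode abundance (treatment versus control) with a pseudocount is an assumed, commonly used scoring scheme, not one stated in the abstract; the function name, strain labels and counts are hypothetical.

```python
import math


def barcode_log2_ratios(control_counts, treatment_counts, pseudocount=1.0):
    """Per-strain log2(treatment fraction / control fraction).

    Counts are raw barcode read counts keyed by strain. A pseudocount
    (an assumption of this sketch) avoids division by zero for strains
    that drop out of the pool. Negative scores suggest reduced fitness
    under the treatment condition.
    """
    strains = list(control_counts)
    ctrl = {s: control_counts[s] + pseudocount for s in strains}
    trt = {s: treatment_counts.get(s, 0) + pseudocount for s in strains}
    ctrl_total = sum(ctrl.values())
    trt_total = sum(trt.values())
    return {
        s: math.log2((trt[s] / trt_total) / (ctrl[s] / ctrl_total))
        for s in strains
    }


# Toy pool of two hypothetical deletion strains.
control = {"strain_a": 99, "strain_b": 99}
treatment = {"strain_a": 49, "strain_b": 149}

scores = barcode_log2_ratios(control, treatment)
print(round(scores["strain_a"], 3))  # -1.0
print(round(scores["strain_b"], 3))  # 0.585
```

Here strain_a falls to half its control abundance under treatment, which gives a score of -1.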
Integrating functional genomics to accelerate mechanistic personalized medicine.

PubMed

Tyner, Jeffrey W

2017-03-01

The advent of deep sequencing technologies has resulted in the deciphering of tremendous amounts of genetic information. These data have led to major discoveries, and many anecdotes now exist of individual patients whose clinical outcomes have benefited from novel, genetically guided therapeutic strategies. However, the majority of genetic events in cancer are currently undrugged, leading to a biological gap between understanding of tumor genetic etiology and translation to improved clinical approaches. Functional screening has made tremendous strides in recent years with the development of new experimental approaches to studying ex vivo and in vivo drug sensitivity. Numerous discoveries and anecdotes also exist for translation of functional screening into novel clinical strategies; however, the current clinical application of functional screening remains largely confined to small clinical trials at specific academic centers. The intersection between genomic and functional approaches represents an ideal modality to accelerate our understanding of drug sensitivities as they relate to specific genetic events and further understand the full mechanisms underlying drug sensitivity patterns.
Bipyrimidine Signatures as a Photoprotective Genome Strategy in G + C-rich Halophilic Archaea.

PubMed

Jones, Daniel L; Baxter, Bonnie K

2016-09-02

Halophilic archaea experience high levels of ultraviolet (UV) light in their environments and demonstrate resistance to UV irradiation. DNA repair systems and carotenoids provide UV protection but do not account for the high resistance observed. Herein, we consider genomic signatures as an additional photoprotective strategy. The predominant forms of UV-induced DNA damage are cyclobutane pyrimidine dimers, most notoriously thymine dimers (T^Ts), which form at adjacent Ts. We tested whether the high G + C content seen in halophilic archaea serves a photoprotective function through limiting T nucleotides, and thus T^T lesions. However, this speculation overlooks the other bipyrimidine sequences, all of which are capable of forming photolesions to varying degrees. Therefore, we designed a program to determine the frequencies of the four bipyrimidine pairs (5' to 3': TT, TC, CT, and CC) within genomes of halophilic archaea and four other randomized sample groups for comparison. The outputs for each sampled genome were weighted by the intrinsic photoreactivities of each dinucleotide pair. Statistical methods were employed to investigate intergroup differences. Our findings indicate that the UV-resistance seen in halophilic archaea can be attributed in part to a genomic strategy: high G + C content and the resulting bipyrimidine signature reduce the genomic photoreactivity.
Bipyrimidine Signatures as a Photoprotective Genome Strategy in G + C-rich Halophilic Archaea

PubMed Central

Jones, Daniel L.; Baxter, Bonnie K.

2016-01-01

Halophilic archaea experience high levels of ultraviolet (UV) light in their environments and demonstrate resistance to UV irradiation. DNA repair systems and carotenoids provide UV protection but do not account for the high resistance observed. Herein, we consider genomic signatures as an additional photoprotective strategy. The predominant forms of UV-induced DNA damage are cyclobutane pyrimidine dimers, most notoriously thymine dimers (T^Ts), which form at adjacent Ts. We tested whether the high G + C content seen in halophilic archaea serves a photoprotective function through limiting T nucleotides, and thus T^T lesions. However, this speculation overlooks the other bipyrimidine sequences, all of which are capable of forming photolesions to varying degrees. Therefore, we designed a program to determine the frequencies of the four bipyrimidine pairs (5’ to 3’: TT, TC, CT, and CC) within genomes of halophilic archaea and four other randomized sample groups for comparison. The outputs for each sampled genome were weighted by the intrinsic photoreactivities of each dinucleotide pair. Statistical methods were employed to investigate intergroup differences. Our findings indicate that the UV-resistance seen in halophilic archaea can be attributed in part to a genomic strategy: high G + C content and the resulting bipyrimidine signature reduce the genomic photoreactivity. PMID:27598206
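The bipyrimidine counting and weighting described in this abstract might look roughly like the Python sketch below. It is a sketch under stated assumptions, not the authors' program: the function names are hypothetical, only the given strand is scanned, and the weights are placeholder values chosen for illustration, not the intrinsic photoreactivities used in the study.

```python
from collections import Counter

BIPYRIMIDINES = ("TT", "TC", "CT", "CC")  # read 5' to 3'


def bipyrimidine_frequencies(seq):
    """Frequency of each bipyrimidine among all overlapping dinucleotides.

    Only the given strand is scanned (an assumption of this sketch).
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts.values())
    if total == 0:
        return {d: 0.0 for d in BIPYRIMIDINES}
    return {d: counts[d] / total for d in BIPYRIMIDINES}


def weighted_photoreactivity(freqs, weights):
    """Sum of bipyrimidine frequencies weighted by relative photoreactivity."""
    return sum(freqs[d] * weights[d] for d in BIPYRIMIDINES)


# Placeholder weights for illustration only; NOT measured photoreactivities.
EXAMPLE_WEIGHTS = {"TT": 1.0, "TC": 0.5, "CT": 0.25, "CC": 0.1}

freqs = bipyrimidine_frequencies("TTCCGGAT")
score = weighted_photoreactivity(freqs, EXAMPLE_WEIGHTS)
print(round(score, 4))  # 0.2286
```

The toy sequence contains seven overlapping dinucleotides, of which TT, TC and CC each occur once, so the score is (1.0 + 0.5 + 0.1) / 7.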
Functional genomics approaches to neurodegenerative diseases.

PubMed

Rubinsztein, David C

2008-09-01

Many of the neurodegenerative diseases that afflict humans are characterised by protein aggregation in neurons. These include complex diseases like Alzheimer's disease and Parkinson's disease, and Mendelian diseases caused by polyglutamine expansion mutations [like Huntington's disease (HD) and various spinocerebellar ataxias (SCAs), like SCA3]. A range of functional genomic strategies have been used to try to elucidate pathways involved in these diseases. In this minireview, I focus on how modifier screens in organisms from yeast to mice may be of value in helping to elucidate pathogenic pathways.
Active Transposition in Genomes

PubMed Central

Huang, Cheng Ran Lisa; Burns, Kathleen H.; Boeke, Jef D.

2013-01-01

Transposons are DNA sequences capable of moving in genomes. Early evidence showed their accumulation in many species and suggested their continued activity in at least isolated organisms. In the past decade, with the development of various genomic technologies, it has become abundantly clear that ongoing activity is the rule rather than the exception. Active transposons of various classes are observed throughout plants and animals, including humans. They continue to create new insertions, have an enormous variety of structural and functional impact on genes and genomes, and play important roles in genome evolution. Transposon activities have been identified and measured by employing various strategies. Here, we summarize evidence of current transposon activity in various plant and animal genomes. PMID:23145912
Methods for Optimizing CRISPR-Cas9 Genome Editing Specificity

PubMed Central

Tycko, Josh; Myer, Vic E.; Hsu, Patrick D.

2016-01-01

Summary Advances in the development of delivery, repair, and specificity strategies for the CRISPR-Cas9 genome engineering toolbox are helping researchers understand gene function with unprecedented precision and sensitivity. CRISPR-Cas9 also holds enormous therapeutic potential for the treatment of genetic disorders by directly correcting disease-causing mutations. Although the Cas9 protein has been shown to bind and cleave DNA at off-target sites, the field of Cas9 specificity is rapidly progressing with marked improvements in guide RNA selection, protein and guide engineering, novel enzymes, and off-target detection methods. We review important challenges and breakthroughs in the field as a comprehensive practical guide to interested users of genome editing technologies, highlighting key tools and strategies for optimizing specificity. The genome editing community should now strive to standardize such methods for measuring and reporting off-target activity, while keeping in mind that the goal for specificity should be continued improvement and vigilance. PMID:27494557
Comparative Genomic Analysis Reveals Organization, Function and Evolution of ars Genes in Pantoea spp.

PubMed

Wang, Liying; Wang, Jin; Jing, Chuanyong

2017-01-01

Numerous genes are involved in various strategies to resist toxic arsenic (As). However, the As resistance strategy in genus Pantoea is poorly understood. In this study, a comparative genome analysis of 23 Pantoea genomes was conducted. Two vertically inherited arsC-like genes without any contribution to As resistance were found to exist in the 23 Pantoea strains. Besides the two arsC-like genes, As resistance gene clusters arsRBC or arsRBCH were found in 15 Pantoea genomes. These ars clusters were found to be acquired by horizontal gene transfer (HGT) from sources related to Franconibacter helveticus, Serratia marcescens, and Citrobacter freundii. During the history of evolution, the ars clusters were acquired more than once in some species, and were lost in some strains, producing strains without As resistance capability. This study revealed the organization, distribution and the complex evolutionary history of As resistance genes in Pantoea spp. The insights gained in this study improved our understanding of the As resistance strategy of Pantoea spp. and its roles in the biogeochemical cycling of As.
Comparative Genomic Analysis Reveals Organization, Function and Evolution of ars Genes in Pantoea spp.

PubMed Central

Wang, Liying; Wang, Jin; Jing, Chuanyong

2017-01-01

Numerous genes are involved in various strategies to resist toxic arsenic (As). However, the As resistance strategy in genus Pantoea is poorly understood. In this study, a comparative genome analysis of 23 Pantoea genomes was conducted. Two vertically inherited arsC-like genes without any contribution to As resistance were found to exist in the 23 Pantoea strains. Besides the two arsC-like genes, As resistance gene clusters arsRBC or arsRBCH were found in 15 Pantoea genomes. These ars clusters were found to be acquired by horizontal gene transfer (HGT) from sources related to Franconibacter helveticus, Serratia marcescens, and Citrobacter freundii. During the history of evolution, the ars clusters were acquired more than once in some species, and were lost in some strains, producing strains without As resistance capability. This study revealed the organization, distribution and the complex evolutionary history of As resistance genes in Pantoea spp. The insights gained in this study improved our understanding of the As resistance strategy of Pantoea spp. and its roles in the biogeochemical cycling of As. PMID:28377759
Genomic control of patterning

PubMed Central

Peter, Isabelle S.; Davidson, Eric H.

2014-01-01

The development of multicellular organisms involves the partitioning of the organism into territories of cells of specific structure and function. The information for spatial patterning processes is directly encoded in the genome. The genome determines its own usage depending on stage and position, by means of interactions that constitute gene regulatory networks (GRNs). The GRN driving endomesoderm development in sea urchin embryos illustrates different regulatory strategies by which developmental programs are initiated, orchestrated, stabilized or excluded to define the pattern of specified territories in the developing embryo. PMID:19378258
Functional genomics of corrinoid starvation in the organohalide-respiring bacterium Dehalobacter restrictus strain PER-K23

PubMed Central

Rupakula, Aamani; Lu, Yue; Kruse, Thomas; Boeren, Sjef; Holliger, Christof; Smidt, Hauke; Maillard, Julien

2015-01-01

De novo corrinoid biosynthesis represents one of the most complicated metabolic pathways in nature. Organohalide-respiring bacteria (OHRB) have developed different strategies to deal with their need for corrinoid, as it is an essential cofactor of reductive dehalogenases, the key enzymes in OHR metabolism. In contrast to Dehalococcoides mccartyi, the genome of Dehalobacter restrictus strain PER-K23 contains a complete set of corrinoid biosynthetic genes, of which cbiH appears to be truncated and therefore non-functional, possibly explaining the corrinoid auxotrophy of this obligate OHRB. Comparative genomics within Dehalobacter spp. revealed that one (operon-2) of the five distinct corrinoid biosynthesis associated operons present in the genome of D. restrictus appeared to be present only in that particular strain; this operon encodes multiple corrinoid transporters and salvaging enzymes. Operon-2 was highly up-regulated upon corrinoid starvation both at the transcriptional (346-fold) and proteomic level (46-fold on average), in line with the presence of an upstream cobalamin riboswitch. Together, these data highlight the importance of this operon in corrinoid homeostasis in D. restrictus and the augmented salvaging strategy this bacterium adopted to cope with the need for this essential cofactor. PMID:25610435

An Efficient Strategy Combining SSR Markers- and Advanced QTL-seq-driven QTL Mapping Unravels Candidate Genes Regulating Grain Weight in Rice

PubMed Central

Daware, Anurag; Das, Sweta; Srivastava, Rishi; Badoni, Saurabh; Singh, Ashok K.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.

2016-01-01

Development and use of genome-wide informative simple sequence repeat (SSR) markers and novel integrated genomic strategies are vital to drive genomics-assisted breeding applications and for efficient dissection of quantitative trait loci (QTLs) underlying complex traits in rice. The present study developed 6244 genome-wide informative SSR markers exhibiting in silico fragment length polymorphism based on repeat-unit variations among genomic sequences of 11 indica, japonica, aus, and wild rice accessions. These markers were mapped on diverse coding and non-coding sequence components of known cloned/candidate genes annotated from 12 chromosomes and revealed a much higher amplification (97%) and polymorphic potential (88%) along with wider genetic/functional diversity level (16–74% with a mean 53%) especially among accessions belonging to indica cultivar group, suggesting their utility in large-scale genomics-assisted breeding applications in rice. A high-density 3791 SSR markers-anchored genetic linkage map (IR 64 × Sonasal) spanning 2060 cM total map-length with an average inter-marker distance of 0.54 cM was generated. This reference genetic map identified six major genomic regions harboring robust QTLs (31% combined phenotypic variation explained with a 5.7–8.7 LOD) governing grain weight on six rice chromosomes. One strong grain weight major QTL region (OsqGW5.1) was narrowed-down by integrating traditional QTL mapping with high-resolution QTL region-specific integrated SSR and single nucleotide polymorphism markers-based QTL-seq analysis and differential expression profiling. This led us to delineate two natural allelic variants in two known cis-regulatory elements (RAV1AAT and CARGCW8GAT) of glycosyl hydrolase and serine carboxypeptidase genes exhibiting pronounced seed-specific differential regulation in low (Sonasal) and high (IR 64) grain weight mapping parental accessions. 
Our genome-wide SSR marker resource (polymorphic within/between diverse cultivar groups) and integrated genomic strategy can efficiently scan functionally relevant potential molecular tags (markers, candidate genes and alleles) regulating complex agronomic traits (grain weight) and expedite marker-assisted genetic enhancement in rice. PMID:27833617
Rapid CRISPR/Cas9-Mediated Cloning of Full-Length Epstein-Barr Virus Genomes from Latently Infected Cells.

PubMed

Yajima, Misako; Ikuta, Kazufumi; Kanda, Teru

2018-04-03

Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques of Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy compared with previous EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine the complete EBV genome sequence, including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically.
Rapid CRISPR/Cas9-Mediated Cloning of Full-Length Epstein-Barr Virus Genomes from Latently Infected Cells

PubMed Central

Ikuta, Kazufumi; Kanda, Teru

2018-01-01

Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques of Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy compared with previous EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine the complete EBV genome sequence, including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically. PMID:29614006
Discovering Functions of Unannotated Genes from a Transcriptome Survey of Wild Fungal Isolates

PubMed Central

Ellison, Christopher E.; Kowbel, David; Glass, N. Louise; Taylor, John W.

2014-01-01

ABSTRACT Most fungal genomes are poorly annotated, and many fungal traits of industrial and biomedical relevance are not well suited to classical genetic screens. Assigning genes to phenotypes on a genomic scale thus remains an urgent need in the field. We developed an approach to infer gene function from expression profiles of wild fungal isolates, and we applied our strategy to the filamentous fungus Neurospora crassa. Using transcriptome measurements in 70 strains from two well-defined clades of this microbe, we first identified 2,247 cases in which the expression of an unannotated gene rose and fell across N. crassa strains in parallel with the expression of well-characterized genes. We then used image analysis of hyphal morphologies, quantitative growth assays, and expression profiling to test the functions of four genes predicted from our population analyses. The results revealed two factors that influenced regulation of metabolism of nonpreferred carbon and nitrogen sources, a gene that governed hyphal architecture, and a gene that mediated amino acid starvation resistance. These findings validate the power of our population-transcriptomic approach for inference of novel gene function, and we suggest that this strategy will be of broad utility for genome-scale annotation in many fungal systems. PMID:24692637
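The idea of inferring function from genes whose expression rises and falls in parallel across strains can be sketched in Python as a simple co-expression search. This is only an illustration: Pearson correlation and the 0.9 threshold are assumptions of the sketch, not the statistics reported by the authors, and all gene names and expression values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpressed_partners(target_profile, annotated_profiles, threshold=0.9):
    """Annotated genes whose profile across strains tracks the target gene.

    Returns (gene, r) pairs with r >= threshold, strongest first.
    The threshold is an arbitrary illustrative choice.
    """
    hits = [
        (gene, pearson(target_profile, profile))
        for gene, profile in annotated_profiles.items()
    ]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])


# Hypothetical expression of one unannotated gene across four strains.
unannotated = [1.0, 2.0, 3.0, 4.0]
annotated = {
    "gene_x": [2.0, 4.0, 6.0, 8.0],  # rises in parallel
    "gene_y": [4.0, 3.0, 2.0, 1.0],  # anti-correlated
    "gene_z": [1.0, 1.0, 2.0, 1.0],  # weakly related
}

partners = coexpressed_partners(unannotated, annotated)
print([gene for gene, _ in partners])  # ['gene_x']
```

A candidate function for the unannotated gene would then be borrowed from the annotations of its partners and tested experimentally, as the abstract describes for four predicted genes.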
A New Model Army: Emerging fish models to study the genomics of vertebrate Evo-Devo

PubMed Central

Braasch, Ingo; Peterson, Samuel M.; Desvignes, Thomas; McCluskey, Braedan M.; Batzel, Peter; Postlethwait, John H.

2014-01-01

Many fields of biology – including vertebrate Evo-Devo research – are facing an explosion of genomic and transcriptomic sequence information and a multitude of fish species are now swimming in this ‘genomic tsunami’. Here, we first give an overview of recent developments in sequencing fish genomes and transcriptomes that identify properties of fish genomes requiring particular attention and propose strategies to overcome common challenges in fish genomics. We suggest that the generation of chromosome-level genome assemblies - for which we introduce the term ‘chromonome’ – should be a key component of genomic investigations in fish because they enable large-scale conserved synteny analyses that inform orthology detection, a process critical for connectivity of genomes. Orthology calls in vertebrates, especially in teleost fish, are complicated by divergent evolution of gene repertoires and functions following two rounds of genome duplication in the ancestor of vertebrates and a third round at the base of teleost fish. Second, using examples of spotted gar, basal teleosts, zebrafish-related cyprinids, cavefish, livebearers, icefish, and lobefin fish, we illustrate how next generation sequencing technologies liberate emerging fish systems from genomic ignorance and transform them into a new model army to answer longstanding questions on the genomic and developmental basis of their biodiversity. Finally, we discuss recent progress in the genetic toolbox for the major fish models for functional analysis, zebrafish and medaka, that can be transferred to many other fish species to study in vivo the functional effect of evolutionary genomic change as Evo-Devo research enters the postgenomic era. PMID:25111899
Tissue-aware data integration approach for the inference of pathway interactions in metazoan organisms

PubMed Central

Park, Christopher Y.; Krishnan, Arjun; Zhu, Qian; Wong, Aaron K.; Lee, Young-Suk; Troyanskaya, Olga G.

2015-01-01

Motivation: Leveraging the large compendium of genomic data to predict biomedical pathways and specific mechanisms of protein interactions genome-wide in metazoan organisms has been challenging. In contrast to unicellular organisms, biological and technical variation originating from diverse tissues and cell-lineages is often the largest source of variation in metazoan data compendia. Therefore, a new computational strategy accounting for the tissue heterogeneity in the functional genomic data is needed to accurately translate the vast amount of human genomic data into specific interaction-level hypotheses. Results: We developed an integrated, scalable strategy for inferring multiple human gene interaction types that takes advantage of data from diverse tissue and cell-lineage origins. Our approach specifically predicts both the presence of a functional association and the most likely interaction type among human genes or their protein products on a whole-genome scale. We demonstrate that directly incorporating tissue contextual information improves the accuracy of our predictions, and further, that such genome-wide results can be used to significantly refine regulatory interactions from primary experimental datasets (e.g. ChIP-Seq, mass spectrometry). Availability and implementation: An interactive website hosting all of our interaction predictions is publicly available at http://pathwaynet.princeton.edu. Software was implemented using the open-source Sleipnir library, which is available for download at https://bitbucket.org/libsleipnir/libsleipnir.bitbucket.org. Contact: ogt@cs.princeton.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25431329
Whole-genome comparative analysis of three phytopathogenic Xylella fastidiosa strains.

PubMed

Bhattacharyya, Anamitra; Stilwagen, Stephanie; Ivanova, Natalia; D'Souza, Mark; Bernal, Axel; Lykidis, Athanasios; Kapatral, Vinayak; Anderson, Iain; Larsen, Niels; Los, Tamara; Reznik, Gary; Selkov, Eugene; Walunas, Theresa L; Feil, Helene; Feil, William S; Purcell, Alexander; Lassez, Jean-Louis; Hawkins, Trevor L; Haselkorn, Robert; Overbeek, Ross; Predki, Paul F; Kyrpides, Nikos C

2002-09-17

Xylella fastidiosa (Xf) causes wilt disease in plants and is responsible for major economic and crop losses globally. Owing to the public importance of this phytopathogen we embarked on a comparative analysis of the complete genome of Xf pv citrus and the partial genomes of two recently sequenced strains of this species: Xf pv almond and Xf pv oleander, which cause leaf scorch in almond and oleander plants, respectively. We report a reanalysis of the previously sequenced Xf 9a5c (CVC, citrus) strain and the two "gapped" Xf genomes revealing ORFs encoding critical functions in pathogenicity and conjugative transfer. Second, a detailed whole-genome functional comparison was based on the three sequenced Xf strains, identifying the unique genes present in each strain, in addition to those shared between strains. Third, an "in silico" cellular reconstruction of these organisms was made, based on a comparison of their core functional subsystems that led to a characterization of their conjugative transfer machinery, identification of potential differences in their adhesion mechanisms, and highlighting of the absence of a classical quorum-sensing mechanism. This study demonstrates the effectiveness of comparative analysis strategies in the interpretation of genomes that are closely related.
The future is now: single-cell genomics of bacteria and archaea

PubMed Central

Blainey, Paul C.

2013-01-01

Interest in the expanding catalog of uncultivated microorganisms, increasing recognition of heterogeneity among seemingly similar cells, and technological advances in whole-genome amplification and single-cell manipulation are driving considerable progress in single-cell genomics. Here, the spectrum of applications for single-cell genomics, key advances in the development of the field, and emerging methodology for single-cell genome sequencing are reviewed by example with attention to the diversity of approaches and their unique characteristics. Experimental strategies transcending specific methodologies are identified and organized as a road map for future studies in single-cell genomics of environmental microorganisms. Over the next decade, increasingly powerful tools for single-cell genome sequencing and analysis will play key roles in accessing the genomes of uncultivated organisms, determining the basis of microbial community functions, and elucidating fundamental aspects of microbial population biology. PMID:23298390
Novel approaches in function-driven single-cell genomics.

PubMed

Doud, Devin F R; Woyke, Tanja

2017-07-01

Deeper sequencing and improved bioinformatics in conjunction with single-cell and metagenomic approaches continue to illuminate undercharacterized environmental microbial communities. This has propelled the 'who is there, and what might they be doing' paradigm to the uncultivated and has already radically changed the topology of the tree of life and provided key insights into the microbial contribution to biogeochemistry. While characterization of 'who' based on marker genes can describe a large fraction of the community, answering 'what are they doing' remains the elusive pinnacle for microbiology. Function-driven single-cell genomics provides a solution by using a function-based screen to subsample complex microbial communities in a targeted manner for the isolation and genome sequencing of single cells. This enables single-cell sequencing to be focused on cells with specific phenotypic or metabolic characteristics of interest. Recovered genomes are conclusively implicated for both encoding and exhibiting the feature of interest, improving downstream annotation and revealing activity levels within that environment. This emerging approach has already improved our understanding of microbial community functioning and facilitated the experimental analysis of uncharacterized gene product space. Here we provide a comprehensive review of strategies that have been applied for function-driven single-cell genomics and the future directions we envision. © FEMS 2017.
Novel approaches in function-driven single-cell genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Doud, Devin F. R.; Woyke, Tanja

Deeper sequencing and improved bioinformatics in conjunction with single-cell and metagenomic approaches continue to illuminate undercharacterized environmental microbial communities. This has propelled the 'who is there, and what might they be doing' paradigm to the uncultivated and has already radically changed the topology of the tree of life and provided key insights into the microbial contribution to biogeochemistry. While characterization of 'who' based on marker genes can describe a large fraction of the community, answering 'what are they doing' remains the elusive pinnacle for microbiology. Function-driven single-cell genomics provides a solution by using a function-based screen to subsample complex microbial communities in a targeted manner for the isolation and genome sequencing of single cells. This enables single-cell sequencing to be focused on cells with specific phenotypic or metabolic characteristics of interest. Recovered genomes are conclusively implicated for both encoding and exhibiting the feature of interest, improving downstream annotation and revealing activity levels within that environment. This emerging approach has already improved our understanding of microbial community functioning and facilitated the experimental analysis of uncharacterized gene product space. Here we provide a comprehensive review of strategies that have been applied for function-driven single-cell genomics and the future directions we envision.
Novel approaches in function-driven single-cell genomics

DOE PAGES

Doud, Devin F. R.; Woyke, Tanja

2017-06-07

Deeper sequencing and improved bioinformatics in conjunction with single-cell and metagenomic approaches continue to illuminate undercharacterized environmental microbial communities. This has propelled the 'who is there, and what might they be doing' paradigm to the uncultivated and has already radically changed the topology of the tree of life and provided key insights into the microbial contribution to biogeochemistry. While characterization of 'who' based on marker genes can describe a large fraction of the community, answering 'what are they doing' remains the elusive pinnacle for microbiology. Function-driven single-cell genomics provides a solution by using a function-based screen to subsample complex microbial communities in a targeted manner for the isolation and genome sequencing of single cells. This enables single-cell sequencing to be focused on cells with specific phenotypic or metabolic characteristics of interest. Recovered genomes are conclusively implicated for both encoding and exhibiting the feature of interest, improving downstream annotation and revealing activity levels within that environment. This emerging approach has already improved our understanding of microbial community functioning and facilitated the experimental analysis of uncharacterized gene product space. Here we provide a comprehensive review of strategies that have been applied for function-driven single-cell genomics and the future directions we envision.
Network-assisted investigation of virulence and antibiotic-resistance systems in Pseudomonas aeruginosa

NASA Astrophysics Data System (ADS)

Hwang, Sohyun; Kim, Chan Yeong; Ji, Sun-Gou; Go, Junhyeok; Kim, Hanhae; Yang, Sunmo; Kim, Hye Jin; Cho, Ara; Yoon, Sang Sun; Lee, Insuk

2016-05-01

Pseudomonas aeruginosa is a Gram-negative bacterium of clinical significance. Although the genome of PAO1, a prototype strain of P. aeruginosa, has been extensively studied, approximately one-third of the functional genome remains unknown. With the emergence of antibiotic-resistant strains of P. aeruginosa, there is an urgent need to develop novel antibiotic and anti-virulence strategies, which may be facilitated by an approach that explores P. aeruginosa gene function in systems-level models. Here, we present a genome-wide functional network of P. aeruginosa genes, PseudomonasNet, which covers 98% of the coding genome, and a companion web server to generate functional hypotheses using various network-search algorithms. We demonstrate that PseudomonasNet-assisted predictions can effectively identify novel genes involved in virulence and antibiotic resistance. Moreover, an antibiotic-resistance network based on PseudomonasNet reveals that P. aeruginosa has common modular genetic organisations that confer increased or decreased resistance to diverse antibiotics, which accounts for the pervasiveness of cross-resistance across multiple drugs. The same network also suggests that P. aeruginosa has developed a mechanism of trade-off in resistance across drugs by altering genetic interactions. Taken together, these results clearly demonstrate the usefulness of a genome-scale functional network to investigate pathogenic systems in P. aeruginosa.
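The idea of generating functional hypotheses from a gene network, as described in the record above, can be illustrated with a minimal sketch. The abstract does not specify which network-search algorithms the web server uses, so the code below is only an assumed, generic "guilt-by-association" scheme: each unlabeled gene is scored by the summed weight of its direct links to known genes. All gene names, edge weights and the function `rank_candidates` are hypothetical.

```python
from collections import defaultdict

# Hypothetical weighted functional links (gene, gene, confidence score).
# These are illustrative values, not PseudomonasNet data.
edges = [
    ("geneA", "geneB", 0.9),
    ("geneA", "geneC", 0.4),
    ("geneB", "geneD", 0.7),
    ("geneC", "geneD", 0.2),
    ("geneE", "geneC", 0.5),
]

# Hypothetical seed set, e.g. genes already annotated as virulence factors.
seeds = {"geneA", "geneB"}


def rank_candidates(edges, seeds):
    """Score each non-seed gene by the summed weight of its links to seeds."""
    scores = defaultdict(float)
    for u, v, weight in edges:
        if u in seeds and v not in seeds:
            scores[v] += weight
        elif v in seeds and u not in seeds:
            scores[u] += weight
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_candidates(edges, seeds)
for gene, score in ranking:
    print(f"{gene}\t{score:.2f}")
```

Genes with no direct link to a seed (here `geneE`) receive no score at all, which is one reason real tools often add diffusion-style methods on top of direct-neighbor scoring.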
Focusing on function to mine cancer genome data | Center for Cancer Research

Cancer.gov

CCR scientists have devised a strategy to sift through the tens of thousands of mutations in cancer genome data to find mutations that actually drive the disease. They have used the method to discover that the JNK signaling pathway, which in different contexts can either spur cancerous growth or rein it in, acts as a tumor suppressor in gastric cancers.
Decoding genes with coexpression networks and metabolomics - 'majority report by precogs'.

PubMed

Saito, Kazuki; Hirai, Masami Y; Yonekura-Sakakibara, Keiko

2008-01-01

Following the sequencing of whole genomes of model plants, high-throughput decoding of gene function is a major challenge in modern plant biology. In view of remarkable technical advances in transcriptomics and metabolomics, integrated analysis of these 'omics' by data-mining informatics is an excellent tool for prediction and identification of gene function, particularly for genes involved in complicated metabolic pathways. The availability of Arabidopsis public transcriptome datasets containing data of >1000 microarrays reinforces the potential for prediction of gene function by transcriptome coexpression analysis. Here, we review the strategy of combining transcriptome and metabolome as a powerful technology for studying the functional genomics of model plants and also crop and medicinal plants.
Technological advances and genomics in metazoan parasites.

PubMed

Knox, D P

2004-02-01

Molecular biology has provided the means to identify parasite proteins, to define their function and patterns of expression, and to produce them in quantity for subsequent functional analyses. Whole genome and expressed sequence tag programmes, and the parallel development of powerful bioinformatics tools, allow the execution of genome-wide between-stage or between-species comparisons and meaningful gene-expression profiling. The latter can be undertaken with several new technologies such as DNA microarray and serial analysis of gene expression. Proteome analysis has come to the fore in recent years providing a crucial link between the gene and its protein product. RNA interference and ballistic gene transfer are exciting developments which can provide the means to precisely define the function of individual genes and, of importance in devising novel parasite control strategies, the effect that gene knockdown will have on parasite survival.
Functional Information Stored in the Conserved Structural RNA Domains of Flavivirus Genomes

PubMed Central

Fernández-Sanlés, Alba; Ríos-Marco, Pablo; Romero-López, Cristina; Berzal-Herranz, Alfredo

2017-01-01

The genus Flavivirus comprises a large number of small, positive-sense single-stranded, RNA viruses able to replicate in the cytoplasm of certain arthropod and/or vertebrate host cells. The genus, which has some 70 member species, includes a number of emerging and re-emerging pathogens responsible for outbreaks of human disease around the world, such as the West Nile, dengue, Zika, yellow fever, Japanese encephalitis, St. Louis encephalitis, and tick-borne encephalitis viruses. Like other RNA viruses, flaviviruses have a compact RNA genome that efficiently stores all the information required for the completion of the infectious cycle. The efficiency of this storage system is attributable to supracoding elements, i.e., discrete, structural units with essential functions. This information storage system overlaps and complements the protein coding sequence and is highly conserved across the genus. It therefore offers interesting potential targets for novel therapeutic strategies. This review summarizes our knowledge of the features of flavivirus genome functional RNA domains. It also provides a brief overview of the main achievements reported in the design of antiviral nucleic acid-based drugs targeting functional genomic RNA elements. PMID:28421048
Genomic Aspects of Research Involving Polyploid Plants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Xiaohan; Ye, Chuyu; Tschaplinski, Timothy J

2011-01-01

Almost all extant plant species have spontaneously doubled their genomes at least once in their evolutionary histories, resulting in polyploidy which provided a rich genomic resource for evolutionary processes. Moreover, superior polyploid clones have been created during the process of crop domestication. Polyploid plants generated by evolutionary processes and/or crop domestication have been the intentional or serendipitous focus of research dealing with the dynamics and consequences of genome evolution. One of the new trends in genomics research is to create synthetic polyploid plants which provide materials for studying the initial genomic changes/responses immediately after polyploid formation. Polyploid plants are also used in functional genomics research to study gene expression in a complex genomic background. In this review, we summarize the recent progress in genomics research involving ancient, young, and synthetic polyploid plants, with a focus on genome size evolution, genomics diversity, genomic rearrangement, genetic and epigenetic changes in duplicated genes, gene discovery, and comparative genomics. Implications on plant sciences including evolution, functional genomics, and plant breeding are presented. It is anticipated that polyploids will be a regular subject of genomics research in the foreseeable future as the rapid advances in DNA sequencing technology create unprecedented opportunities for discovering and monitoring genomic and transcriptomic changes in polyploid plants. The fast accumulation of knowledge on polyploid formation, maintenance, and divergence at whole-genome and subgenome levels will not only help plant biologists understand how plants have evolved and diversified, but also assist plant breeders in designing new strategies for crop improvement.
Integration of genome-wide association studies with biological knowledge identifies six novel genes related to kidney function.

PubMed

Chasman, Daniel I; Fuchsberger, Christian; Pattaro, Cristian; Teumer, Alexander; Böger, Carsten A; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; O'Seaghdha, Conall M; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D; Gierman, Hinco J; Feitosa, Mary F; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Lambert, Jean-Charles; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Coassin, Stefan; Pistis, Giorgio; Harris, Tamara B; Launer, Lenore J; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank; Demirkan, Ayse; Oostra, Ben A; de Andrade, Mariza; Turner, Stephen T; Ding, Jingzhong; Andrews, Jeanette S; Freedman, Barry I; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Meisinger, Christa; Gieger, Christian; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H; Wright, Alan F; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G; Rivadeneira, Fernando; Aulchenko, Yurii S; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Krämer, 
Bernhard K; Portas, Laura; Ford, Ian; Buckley, Brendan M; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J Wouter; Probst-Hensch, Nicole M; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S; van Duijn, Cornelia M; Borecki, Ingrid B; Kardia, Sharon L R; Liu, Yongmei; Curhan, Gary C; Rudan, Igor; Gyllensten, Ulf; Wilson, James F; Franke, Andre; Pramstaller, Peter P; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Parsa, Afshin; Bochud, Murielle; Heid, Iris M; Kao, W H Linda; Fox, Caroline S; Köttgen, Anna

2012-12-15

In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10^-9) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10^-4 to 2.2 × 10^-7. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.
Integration of genome-wide association studies with biological knowledge identifies six novel genes related to kidney function

PubMed Central

Chasman, Daniel I.; Fuchsberger, Christian; Pattaro, Cristian; Teumer, Alexander; Böger, Carsten A.; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C.; O'Seaghdha, Conall M.; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V.; O'Connell, Jeffrey R.; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D.; Gierman, Hinco J.; Feitosa, Mary F.; Hwang, Shih-Jen; Atkinson, Elizabeth J.; Lohman, Kurt; Cornelis, Marilyn C.; Johansson, Åsa; Tönjes, Anke; Dehghan, Abbas; Lambert, Jean-Charles; Holliday, Elizabeth G.; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y.; Murgia, Federico; Trompet, Stella; Imboden, Medea; Coassin, Stefan; Pistis, Giorgio; Harris, Tamara B.; Launer, Lenore J.; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D.; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank; Demirkan, Ayse; Oostra, Ben A.; de Andrade, Mariza; Turner, Stephen T.; Ding, Jingzhong; Andrews, Jeanette S.; Freedman, Barry I.; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Meisinger, Christa; Gieger, Christian; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E.; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H.; Wright, Alan F.; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K.; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G.; Rivadeneira, Fernando; Aulchenko, Yurii S.; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; 
Robino, Antonietta; Krämer, Bernhard K.; Portas, Laura; Ford, Ian; Buckley, Brendan M.; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K.; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J. Wouter; Probst-Hensch, Nicole M.; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R.; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S.; van Duijn, Cornelia M.; Borecki, Ingrid B.; Kardia, Sharon L.R.; Liu, Yongmei; Curhan, Gary C.; Rudan, Igor; Gyllensten, Ulf; Wilson, James F.; Franke, Andre; Pramstaller, Peter P.; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Parsa, Afshin; Bochud, Murielle; Heid, Iris M.; Kao, W.H. Linda; Fox, Caroline S.; Köttgen, Anna

2012-01-01

In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10^-9) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10^-4 to 2.2 × 10^-7. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general. PMID:22962313
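The phrase "association thresholds consistent with multiple testing" in the record above can be made concrete with a small sketch. The abstract does not say which correction the authors applied, so the Bonferroni rule used here is an assumption, as is the number of candidate SNPs (`n_candidate_snps`). The value 5 × 10^-8 is the conventional genome-wide significance threshold, roughly 0.05 divided by one million independent tests; only the P-value for FBXL20 is taken from the abstract.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Conventional genome-wide threshold (about 0.05 / 1,000,000 tests).
genome_wide_threshold = bonferroni_threshold(0.05, 1_000_000)

# P-value reported in the abstract for the SNP in FBXL20.
fbxl20_p = 5.6e-9
fbxl20_is_genome_wide = fbxl20_p < genome_wide_threshold

# Restricting the search to a prior-knowledge candidate set means fewer
# tests and therefore a less stringent per-test threshold.
n_candidate_snps = 1000  # hypothetical count, not from the study
candidate_threshold = bonferroni_threshold(0.05, n_candidate_snps)

print(f"genome-wide threshold: {genome_wide_threshold:.1e}")
print(f"candidate-set threshold: {candidate_threshold:.1e}")
print(f"FBXL20 genome-wide significant: {fbxl20_is_genome_wide}")
```

The sketch shows why a knowledge-driven candidate set can surface associations that fall short of genome-wide significance: with far fewer tests, the corrected threshold is several orders of magnitude less strict.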
Discovering functions of unannotated genes from a transcriptome survey of wild fungal isolates.

PubMed

Ellison, Christopher E; Kowbel, David; Glass, N Louise; Taylor, John W; Brem, Rachel B

2014-04-01

Most fungal genomes are poorly annotated, and many fungal traits of industrial and biomedical relevance are not well suited to classical genetic screens. Assigning genes to phenotypes on a genomic scale thus remains an urgent need in the field. We developed an approach to infer gene function from expression profiles of wild fungal isolates, and we applied our strategy to the filamentous fungus Neurospora crassa. Using transcriptome measurements in 70 strains from two well-defined clades of this microbe, we first identified 2,247 cases in which the expression of an unannotated gene rose and fell across N. crassa strains in parallel with the expression of well-characterized genes. We then used image analysis of hyphal morphologies, quantitative growth assays, and expression profiling to test the functions of four genes predicted from our population analyses. The results revealed two factors that influenced regulation of metabolism of nonpreferred carbon and nitrogen sources, a gene that governed hyphal architecture, and a gene that mediated amino acid starvation resistance. These findings validate the power of our population-transcriptomic approach for inference of novel gene function, and we suggest that this strategy will be of broad utility for genome-scale annotation in many fungal systems. IMPORTANCE Some fungal species cause deadly infections in humans or crop plants, and other fungi are workhorses of industrial chemistry, including the production of biofuels. Advances in medical and industrial mycology require an understanding of the genes that control fungal traits. We developed a method to infer functions of uncharacterized genes by observing correlated expression of their mRNAs with those of known genes across wild fungal isolates. We applied this strategy to a filamentous fungus and predicted functions for thousands of unknown genes. 
In four cases, we experimentally validated the predictions from our method, discovering novel genes involved in the metabolism of nutrient sources relevant for biofuel production, as well as colony morphology and starvation resistance. Our strategy is straightforward, inexpensive, and applicable for predicting gene function in many fungal species.
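The core inference step in the record above, finding unannotated genes whose expression rises and falls across strains in parallel with well-characterized genes, can be sketched as follows. This is a minimal illustration only: the abstract does not state the correlation statistic or cutoff the authors used, so Pearson correlation and the 0.9 cutoff are assumptions, and all gene names and expression values are made up.

```python
import math

# Hypothetical expression values, one number per strain (five strains).
expression = {
    "annotated_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "annotated_B": [5.0, 3.0, 4.0, 1.0, 2.0],
    "unannotated_X": [2.1, 3.9, 6.2, 8.0, 9.8],
}
annotated_genes = ["annotated_A", "annotated_B"]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def coexpressed_partners(target, candidates, data, cutoff=0.9):
    """Return (gene, r) pairs for candidates correlated with target at r >= cutoff."""
    hits = []
    for gene in candidates:
        r = pearson(data[target], data[gene])
        if r >= cutoff:
            hits.append((gene, r))
    return hits


partners = coexpressed_partners("unannotated_X", annotated_genes, expression)
for gene, r in partners:
    print(f"unannotated_X is co-expressed with {gene} (r = {r:.3f})")
```

In a real analysis with thousands of genes and 70 strains, the cutoff would need to account for multiple testing and for population structure between clades; the sketch ignores both.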

Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae

PubMed Central

Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim; Krogsgaard, Steen; Nielsen, Jens

2008-01-01

Background: Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released but the number of hypothetical proteins accounted for more than 50% of the annotated genes. Considering the industrial importance of this fungus, it is therefore valuable to improve the annotation and further integrate genomic information with biochemical and physiological information available for this microorganism and other related fungi. Here we performed gene prediction by constructing, sequencing and assembling an A. oryzae Expressed Sequence Tag (EST) library. We enhanced function assignment with an annotation strategy that we developed. The resulting improved annotation was used to reconstruct the metabolic network leading to a genome-scale metabolic model of A. oryzae. Results: From our assembled EST sequences we identified 1,046 newly predicted genes in the A. oryzae genome. Furthermore, it was possible to assign putative protein functions to 398 of the newly predicted genes. Notably, our annotation strategy resulted in assignment of new putative functions to 1,469 hypothetical proteins already present in the A. oryzae genome database. Using the substantially improved annotated genome we reconstructed the metabolic network of A. oryzae. This network contains 729 enzymes, 1,314 enzyme-encoding genes, 1,073 metabolites and 1,846 (1,053 unique) biochemical reactions. The metabolic reactions are compartmentalized into the cytosol, the mitochondria, the peroxisome and the extracellular space. Transport steps between the compartments and the extracellular space represent 281 reactions, of which 161 are unique. The metabolic model was validated and shown to correctly describe the phenotypic behavior of A. oryzae grown on different carbon sources. 
Conclusion: A much enhanced annotation of the A. oryzae genome was performed and a genome-scale metabolic model of A. oryzae was reconstructed. The model accurately predicted the growth and biomass yield on different carbon sources. The model serves as an important resource for gaining further insight into our understanding of A. oryzae physiology. PMID:18500999
PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms.

PubMed

Gan, Ruei-Chi; Chen, Ting-Wen; Wu, Timothy H; Huang, Po-Jung; Lee, Chi-Ching; Yeh, Yuan-Ming; Chiu, Cheng-Hsun; Huang, Hsien-Da; Tang, Petrus

2016-12-22

Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interest. However, only a few organisms have reference genomic sequences, and even fewer have well-defined or curated annotations. For transcriptome studies focusing on organisms lacking proper reference genomes, the common strategy is de novo assembly followed by functional annotation. However, things become even more complicated when multiple transcriptomes are compared. Here, we propose a new analysis strategy and quantification methods for expression levels that not only generate a virtual reference from sequencing data, but also provide comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled together for de novo assembly. The assembled contigs are searched against NCBI NR databases to find potential homolog sequences. Based on the search results, a set of virtual transcripts is generated and serves as a reference transcriptome. By using the same reference, normalized quantification values including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM) can be obtained that are comparable across transcriptome datasets. In order to demonstrate the feasibility of our strategy, we implement it in the web service PARRoT. PARRoT stands for Pipeline for Analyzing RNA Reads of Transcriptomes. It analyzes gene expression profiles for two transcriptome sequencing datasets. For better understanding of the biological meaning of the comparison among transcriptomes, PARRoT further links these virtual transcripts to their potential functions by showing best hits in SwissProt and the NR database and by assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within just three hours.
In this study, we proposed and implemented a strategy to analyze transcriptomes from non-reference organisms which offers the opportunity to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy effectively avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on the field of comparative genomics for non-model organisms. We have implemented PARRoT as a web service which is freely available at http://parrot.cgu.edu.tw.
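The normalization idea in this record can be illustrated with the standard RPKM and TPM definitions. The sketch below is a minimal example under stated assumptions: it uses the textbook formulas rather than PARRoT's own eRPKM and eTPM estimators (which the abstract does not specify), and the transcript names, lengths and read counts are hypothetical toy values. Its only point is that two samples quantified against the same reference yield values that can be compared transcript by transcript.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads (standard definition)."""
    total_reads = sum(counts.values())
    return {t: counts[t] * 1e9 / (lengths_bp[t] * total_reads) for t in counts}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale so the sample sums to 1e6."""
    rate = {t: counts[t] / lengths_bp[t] for t in counts}
    scale = sum(rate.values())
    return {t: rate[t] * 1e6 / scale for t in counts}


# Hypothetical shared reference: virtual transcript lengths in base pairs.
lengths = {"vt1": 1000, "vt2": 2000, "vt3": 500}

# Hypothetical read counts for two samples mapped to that same reference.
sample_a = {"vt1": 100, "vt2": 400, "vt3": 50}
sample_b = {"vt1": 300, "vt2": 300, "vt3": 0}

tpm_a = tpm(sample_a, lengths)
tpm_b = tpm(sample_b, lengths)

for transcript in lengths:
    print(transcript, round(tpm_a[transcript], 1), round(tpm_b[transcript], 1))
```

Because TPM values sum to one million within every sample, they are often easier to compare across samples than RPKM values, whose per-sample totals can differ.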
A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response. | Office of Cancer Genomics

Cancer.gov

Functional genomics efforts face tradeoffs between number of perturbations examined and complexity of phenotypes measured. We bridge this gap with Perturb-seq, which combines droplet-based single-cell RNA-seq with a strategy for barcoding CRISPR-mediated perturbations, allowing many perturbations to be profiled in pooled format. We applied Perturb-seq to dissect the mammalian unfolded protein response (UPR) using single and combinatorial CRISPR perturbations. Two genome-scale CRISPR interference (CRISPRi) screens identified genes whose repression perturbs ER homeostasis.
Strategies to explore functional genomics data sets in NCBI's GEO database.

PubMed

Wilhite, Stephen E; Barrett, Tanya

2012-01-01

The Gene Expression Omnibus (GEO) database is a major repository that stores high-throughput functional genomics data sets that are generated using both microarray-based and sequence-based technologies. Data sets are submitted to GEO primarily by researchers who are publishing their results in journals that require original data to be made freely available for review and analysis. In addition to serving as a public archive for these data, GEO has a suite of tools that allow users to identify, analyze, and visualize data relevant to their specific interests. These tools include sample comparison applications, gene expression profile charts, data set clusters, genome browser tracks, and a powerful search engine that enables users to construct complex queries.
Strategies to Explore Functional Genomics Data Sets in NCBI’s GEO Database

PubMed Central

Wilhite, Stephen E.; Barrett, Tanya

2012-01-01

The Gene Expression Omnibus (GEO) database is a major repository that stores high-throughput functional genomics data sets that are generated using both microarray-based and sequence-based technologies. Data sets are submitted to GEO primarily by researchers who are publishing their results in journals that require original data to be made freely available for review and analysis. In addition to serving as a public archive for these data, GEO has a suite of tools that allow users to identify, analyze and visualize data relevant to their specific interests. These tools include sample comparison applications, gene expression profile charts, data set clusters, genome browser tracks, and a powerful search engine that enables users to construct complex queries. PMID:22130872
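The complex queries mentioned above can also be composed programmatically. The sketch below is a minimal example under stated assumptions: it only builds a query URL (no network request is made) for the NCBI E-utilities esearch endpoint with db=gds, and it assumes the [Organism] and [Entry Type] field qualifiers used by GEO DataSets searches. The helper name build_geo_query and the search terms are hypothetical and do not come from the abstract.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint; "gds" is the Entrez database name for GEO DataSets.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_query(terms, organism=None, entry_type=None, retmax=20):
    """Combine free-text terms and optional field qualifiers into one esearch URL."""
    clauses = [f"({term})" for term in terms]
    if organism:
        clauses.append(f'"{organism}"[Organism]')
    if entry_type:
        clauses.append(f"{entry_type}[Entry Type]")
    params = {
        "db": "gds",
        "term": " AND ".join(clauses),
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Hypothetical query: series records on the unfolded protein response in human.
url = build_geo_query(
    ["unfolded protein response"],
    organism="Homo sapiens",
    entry_type="gse",
    retmax=5,
)
print(url)
```

Keeping URL construction separate from the request makes the query easy to inspect and reuse before any data are fetched.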
Global Organization of a Positive-strand RNA Virus Genome

PubMed Central

Wu, Baodong; Grigull, Jörg; Ore, Moriam O.; Morin, Sylvie; White, K. Andrew

2013-01-01

The genomes of plus-strand RNA viruses contain many regulatory sequences and structures that direct different viral processes. The traditional view of these RNA elements is as local structures present in non-coding regions. However, this view is changing due to the discovery of regulatory elements in coding regions and functional long-range intra-genomic base pairing interactions. The ∼4.8 kb long RNA genome of the tombusvirus tomato bushy stunt virus (TBSV) contains these types of structural features, including six different functional long-distance interactions. We hypothesized that to achieve these multiple interactions this viral genome must utilize a large-scale organizational strategy and, accordingly, we sought to assess the global conformation of the entire TBSV genome. Atomic force micrographs of the genome indicated a mostly condensed structure composed of interconnected protrusions extending from a central hub. This configuration was consistent with the genomic secondary structure model generated using high-throughput selective 2′-hydroxyl acylation analysed by primer extension (i.e. SHAPE), which predicted different sized RNA domains originating from a central region. Known RNA elements were identified in both domain and inter-domain regions, and novel structural features were predicted and functionally confirmed. Interestingly, only two of the six long-range interactions known to form were present in the structural model. However, for those interactions that did not form, complementary partner sequences were positioned relatively close to each other in the structure, suggesting that the secondary structure level of viral genome structure could provide a basic scaffold for the formation of different long-range interactions. The higher-order structural model for the TBSV RNA genome provides a snapshot of the complex framework that allows multiple functional components to operate in concert within a confined context. PMID:23717202
The strategies WDK: a graphical search interface and web development kit for functional genomics databases

PubMed Central

Fischer, Steve; Aurrecoechea, Cristina; Brunk, Brian P.; Gao, Xin; Harb, Omar S.; Kraemer, Eileen T.; Pennington, Cary; Treatman, Charles; Kissinger, Jessica C.; Roos, David S.; Stoeckert, Christian J.

2011-01-01

Web sites associated with the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) have recently introduced a graphical user interface, the Strategies WDK, intended to make advanced searching and set and interval operations easy and accessible to all users. With a design guided by usability studies, the system helps motivate researchers to perform dynamic computational experiments and explore relationships across data sets. For example, PlasmoDB users seeking novel therapeutic targets may wish to locate putative enzymes that distinguish pathogens from their hosts, and that are expressed during appropriate developmental stages. When a researcher runs one of the approximately 100 searches available on the site, the search is presented as a first step in a strategy. The strategy is extended by running additional searches, which are combined with set operators (union, intersect or minus), or genomic interval operators (overlap, contains). A graphical display uses Venn diagrams to make the strategy’s flow obvious. The interface facilitates interactive adjustment of the component searches with changes propagating forward through the strategy. Users may save their strategies, creating protocols that can be shared with colleagues. The strategy system has now been deployed on all EuPathDB databases, and successfully deployed by other projects. The Strategies WDK uses a configurable MVC architecture that is compatible with most genomics and biological warehouse databases, and is available for download at code.google.com/p/strategies-wdk. Database URL: www.eupathdb.org PMID:21705364
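The way a strategy chains searches with set and interval operators can be sketched in a few lines. This is a minimal illustration and not the Strategies WDK implementation: the Feature type, the function names, and all gene identifiers and coordinates are hypothetical.

```python
import operator
from typing import NamedTuple


class Feature(NamedTuple):
    """A genomic feature with inclusive coordinates (hypothetical structure)."""
    gene_id: str
    chrom: str
    start: int
    end: int


# Set operators named in the abstract, mapped onto Python set operations.
SET_OPERATORS = {
    "union": operator.or_,
    "intersect": operator.and_,
    "minus": operator.sub,
}


def combine_sets(left, right, op):
    """Combine two search results (sets of IDs) with a named set operator."""
    return SET_OPERATORS[op](left, right)


def overlaps(a, b):
    """True if the two features share at least one position on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def contains(a, b):
    """True if feature a fully contains feature b."""
    return a.chrom == b.chrom and a.start <= b.start and b.end <= a.end


def combine_intervals(left, right, relation):
    """Keep features from left that satisfy the relation with any feature in right."""
    return [a for a in left if any(relation(a, b) for b in right)]


# Toy strategy: putative enzymes, expressed at the right stage, without a host ortholog.
putative_enzymes = {"g1", "g2", "g3"}
expressed_in_stage = {"g2", "g3", "g4"}
has_host_ortholog = {"g3"}

step2 = combine_sets(putative_enzymes, expressed_in_stage, "intersect")
candidates = combine_sets(step2, has_host_ortholog, "minus")
print(sorted(candidates))

genes = [Feature("g1", "chr1", 100, 500), Feature("g2", "chr1", 1000, 1500)]
peaks = [Feature("peak1", "chr1", 450, 600)]
print([f.gene_id for f in combine_intervals(genes, peaks, overlaps)])
```

Each step takes the previous result as input, so changing an early search and re-running the later steps mirrors the forward propagation of changes described in the abstract.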
Genome Cyclization as Strategy for Flavivirus RNA Replication

PubMed Central

Villordo, Sergio M.; Gamarnik, Andrea V.

2017-01-01

Long-range and local RNA-RNA contacts in viral RNA genomes result in tertiary structures that modulate the function of enhancers, promoters, and silencers during translation, RNA replication, and encapsidation. In the case of flaviviruses, the presence of inverted complementary sequences at the 5′ and 3′ ends of the genome mediates long-range RNA interactions and RNA cyclization. The circular conformation of flavivirus genomes was demonstrated to be essential for RNA amplification. New ideas about the mechanisms by which circular genomes participate in flavivirus replication have emerged in the last few years. Here, we will describe the latest information about cis-acting elements involved in flavivirus genome cyclization, RNA promoter elements required for viral polymerase recognition, and how these elements together coordinate viral RNA synthesis. PMID:18703097
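The notion of inverted complementary sequences at the two genome ends can be made concrete with a small search for the longest stretch near the 5′ end whose reverse complement occurs near the 3′ end. The sketch below is a minimal example under stated assumptions: it considers only Watson-Crick pairs (G-U wobble pairs are ignored), the function names are hypothetical, and the two terminal sequences are toy constructs written for illustration rather than sequences taken from the review.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string using Watson-Crick pairs only."""
    return rna.translate(COMPLEMENT)[::-1]


def longest_cyclization_pair(five_prime, three_prime, min_len=6):
    """Longest segment of five_prime whose reverse complement occurs in three_prime."""
    best = ""
    n = len(five_prime)
    for i in range(n):
        # Only try segments that are at least min_len long and longer than the current best.
        for j in range(i + max(min_len, len(best) + 1), n + 1):
            segment = five_prime[i:j]
            if reverse_complement(segment) in three_prime:
                best = segment
            else:
                # If this segment cannot pair, no extension of it can pair either.
                break
    return best


# Toy terminal regions (hypothetical), each written 5' to 3'.
five_prime_end = "GGAACUCAAUAUGCUGAAACG"
three_prime_end = "CCUUACAGCAUAUUGACGCUG"

pair = longest_cyclization_pair(five_prime_end, three_prime_end)
print(pair, reverse_complement(pair))
```

A real analysis would also score wobble pairs and thermodynamic stability; this exact-match search only shows what "inverted complementary" means at the sequence level.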
Complete genome sequence and comparative analysis of Acetobacter pasteurianus 386B, a strain well-adapted to the cocoa bean fermentation ecosystem.

PubMed

Illeghems, Koen; De Vuyst, Luc; Weckx, Stefan

2013-08-01

Acetobacter pasteurianus 386B, an acetic acid bacterium originating from a spontaneous cocoa bean heap fermentation, proved to be an ideal functional starter culture for cocoa bean fermentations. It is able to dominate the fermentation process, thereby resisting high acetic acid concentrations and temperatures. However, the molecular mechanisms underlying its metabolic capabilities and niche adaptations are unknown. In this study, whole-genome sequencing and comparative genome analysis were used to investigate this strain's mechanisms to dominate the cocoa bean fermentation process. The genome sequence of A. pasteurianus 386B is composed of a 2.8-Mb chromosome and seven plasmids. The annotation of 2875 protein-coding sequences revealed important characteristics, including several metabolic pathways, the occurrence of strain-specific genes such as an endopolygalacturonase, and the presence of mechanisms involved in tolerance towards various stress conditions. Furthermore, the low number of transposases in the genome and the absence of complete phage genomes indicate that this strain might be more genetically stable compared with other A. pasteurianus strains, which is an important advantage for the use of this strain as a functional starter culture. Comparative genome analysis with other members of the Acetobacteraceae confirmed the functional properties of A. pasteurianus 386B, such as its thermotolerant nature and unique genetic composition. Genome analysis of A. pasteurianus 386B provided detailed insights into the underlying mechanisms of its metabolic features, niche adaptations, and tolerance towards stress conditions. Combination of these data with previous experimental knowledge enabled an integrated, global overview of the functional characteristics of this strain. This knowledge will enable improved fermentation strategies and selection of appropriate acetic acid bacteria strains as functional starter culture for cocoa bean fermentation processes.
Proteomic strategy for the identification of critical actors in reorganization of the post-meiotic male genome.

PubMed

Govin, Jerome; Gaucher, Jonathan; Ferro, Myriam; Debernardi, Alexandra; Garin, Jerome; Khochbin, Saadi; Rousseaux, Sophie

2012-01-01

After meiosis, during the final stages of spermatogenesis, the haploid male genome undergoes major structural changes, resulting in a shift from a nucleosome-based genome organization to the sperm-specific, highly compacted nucleoprotamine structure. Recent data support the idea that region-specific programming of the haploid male genome is of high importance for the post-fertilization events and for successful embryo development. Although these events constitute a unique and essential step in reproduction, the mechanisms by which they occur have remained completely obscure and the factors involved have mostly remained uncharacterized. Here, we sought a strategy to significantly increase our understanding of proteins controlling the haploid male genome reprogramming, based on the identification of proteins in two specific pools: those with the potential to bind nucleic acids (basic proteins) and proteins capable of binding basic proteins (acidic proteins). For the identification of acidic proteins, we developed an approach involving a transition-protein (TP)-based chromatography, which has the advantage of retaining not only acidic proteins due to the charge interactions, but also potential TP-interacting factors. A second strategy, based on an in-depth bioinformatic analysis of the identified proteins, was then applied to pinpoint, within the lists obtained, factors expressed in male germ cells that are relevant to the post-meiotic genome organization. This approach reveals a functional network of DNA-packaging proteins and their putative chaperones and sheds new light on the way the critical transitions in genome organization could take place. This work also points to a new area of research in male infertility and sperm quality assessments.
Precise and heritable genome editing in evolutionarily diverse nematodes using TALENs and CRISPR/Cas9 to engineer insertions and deletions.

PubMed

Lo, Te-Wen; Pickle, Catherine S; Lin, Steven; Ralston, Edward J; Gurling, Mark; Schartner, Caitlin M; Bian, Qian; Doudna, Jennifer A; Meyer, Barbara J

2013-10-01

Exploitation of custom-designed nucleases to induce DNA double-strand breaks (DSBs) at genomic locations of choice has transformed our ability to edit genomes, regardless of their complexity. DSBs can trigger either error-prone repair pathways that induce random mutations at the break sites or precise homology-directed repair pathways that generate specific insertions or deletions guided by exogenously supplied DNA. Prior editing strategies using site-specific nucleases to modify the Caenorhabditis elegans genome achieved only the heritable disruption of endogenous loci through random mutagenesis by error-prone repair. Here we report highly effective strategies using TALE nucleases and RNA-guided CRISPR/Cas9 nucleases to induce error-prone repair and homology-directed repair to create heritable, precise insertion, deletion, or substitution of specific DNA sequences at targeted endogenous loci. Our robust strategies are effective across nematode species diverged by 300 million years, including necromenic nematodes (Pristionchus pacificus), male/female species (Caenorhabditis species 9), and hermaphroditic species (C. elegans). Thus, genome-editing tools now exist to transform nonmodel nematode species into genetically tractable model organisms. We demonstrate the utility of our broadly applicable genome-editing strategies by creating reagents generally useful to the nematode community and reagents specifically designed to explore the mechanism and evolution of X chromosome dosage compensation. By developing an efficient pipeline involving germline injection of nuclease mRNAs and single-stranded DNA templates, we engineered precise, heritable nucleotide changes both close to and far from DSBs to gain or lose genetic function, to tag proteins made from endogenous genes, and to excise entire loci through targeted FLP-FRT recombination.
Extensive complementarity between gene function prediction methods.

PubMed

Vidulin, Vedrana; Šmuc, Tomislav; Supek, Fran

2016-12-01

The number of sequenced genomes rises steadily, but we still lack knowledge about the biological roles of many genes. Automated function prediction (AFP) is thus a necessity. We hypothesized that AFP approaches that draw on distinct genome features may be useful for predicting different types of gene functions, motivating a systematic analysis of the benefits gained by obtaining and integrating such predictions. Our pipeline amalgamates 5 133 543 genes from 2071 genomes in a single massive analysis that evaluates five established genomic AFP methodologies. While 1227 Gene Ontology (GO) terms yielded reliable predictions, the majority of these functions were accessible to only one or two of the methods. Moreover, different methods tend to assign a GO term to non-overlapping sets of genes. Thus, inferences made by diverse genomic AFP methods display a striking complementarity, both gene-wise and function-wise. Because of this, a viable integration strategy is to rely on a single most-confident prediction per gene/function, rather than enforcing agreement across multiple AFP methods. Using an information-theoretic approach, we estimate that current databases contain 29.2 bits/gene of known Escherichia coli gene functions. This can be increased by up to 5.5 bits/gene using individual AFP methods or by 11 additional bits/gene upon integration, thereby providing a highly ranking predictor on the Critical Assessment of Function Annotation 2 community benchmark. Availability of more sequenced genomes boosts the predictive accuracy of AFP approaches and also the benefit from integrating them. The individual and integrated GO predictions for the complete set of genes are available from http://gorbi.irb.hr/. Contact: fran.supek@irb.hr. Supplementary information: Supplementary materials are available at Bioinformatics online.
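The integration rule described in this abstract, keeping the single most confident prediction per gene/function pair instead of requiring agreement between methods, can be sketched as follows. This is a minimal illustration under stated assumptions: the method names, gene identifiers, GO labels and confidence scores are hypothetical toy values, and confidence scores are assumed to be directly comparable across methods, which in practice would need calibration.

```python
from collections import defaultdict


def integrate_max_confidence(predictions):
    """Keep, for each (gene, GO term) pair, the prediction with the highest confidence."""
    best = {}
    for method, gene, go_term, confidence in predictions:
        key = (gene, go_term)
        if key not in best or confidence > best[key][1]:
            best[key] = (method, confidence)
    return best


def integrate_by_agreement(predictions, min_methods=2):
    """Alternative rule: keep only pairs predicted by at least min_methods methods."""
    support = defaultdict(set)
    for method, gene, go_term, _ in predictions:
        support[(gene, go_term)].add(method)
    return {key for key, methods in support.items() if len(methods) >= min_methods}


# Hypothetical predictions: (method, gene, GO term, confidence).
toy_predictions = [
    ("method_A", "gene1", "GO:0006412", 0.92),
    ("method_B", "gene1", "GO:0006412", 0.61),
    ("method_A", "gene2", "GO:0006810", 0.88),
    ("method_C", "gene3", "GO:0006355", 0.79),
]

merged = integrate_max_confidence(toy_predictions)
agreed = integrate_by_agreement(toy_predictions)

print(len(merged), "pairs kept by the most-confident rule")
print(len(agreed), "pairs kept by the agreement rule")
```

When methods mostly cover non-overlapping genes, as the abstract reports, the agreement rule discards most predictions, while the most-confident rule keeps one call per gene/function pair.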
RNAi for functional genomics in plants.

PubMed

McGinnis, Karen M

2010-03-01

RNAi refers to several different types of gene silencing mediated by small, dsRNA molecules. Over the course of 20 years, the scientific understanding of RNAi has developed from the initial observation of unexpected expression patterns to a sophisticated understanding of a multi-faceted, evolutionarily conserved network of mechanisms that regulate gene expression in many organisms. It has also been developed as a genetic tool that can be exploited in a wide range of species. Because transgene-induced RNAi has been effective at silencing one or more genes in a wide range of plants, this technology also bears potential as a powerful functional genomics tool across the plant kingdom. Transgene-induced RNAi has indeed been shown to be an effective mechanism for silencing many genes in many organisms, but the results from multiple projects which attempted to exploit RNAi on a genome-wide scale suggest that there is a great deal of variation in the silencing efficacy between transgenic events, silencing targets and silencing-induced phenotype. The results from these projects indicate several important variables that should be considered in experimental design prior to the initiation of functional genomics efforts based on RNAi silencing. In recent years, alternative strategies have been developed for targeted gene silencing, and a combination of approaches may also enhance the use of targeted gene silencing for functional genomics.
04-ERD-052-Final Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Loots, G G; Ovcharenko, I; Collette, N

2007-02-26

Generating the sequence of the human genome represents a colossal achievement for science and mankind. The technical use of the human genome project information holds great promise to cure disease, prevent bioterror threats, and learn about human origins. Yet converting the sequence data into biologically meaningful information has not been immediately obvious, and we are still in the preliminary stages of understanding how the genome is organized, what the functional building blocks are and how these sequences mediate complex biological processes. The overarching goal of this program was to develop novel methods and high throughput strategies for determining the functions of "anonymous" human genes that are evolutionarily deeply conserved in other vertebrates. We coupled analytical tool development and computational predictions regarding gene function with novel high throughput experimental strategies and tested biological predictions in the laboratory. The tools required for comparative genomic data-mining are fundamentally the same whether they are applied to scientific studies of related microbes or the search for functions of novel human genes. For this reason the tools, conceptual framework and the coupled informatics-experimental biology paradigm we developed in this LDRD have many potential scientific applications relevant to LLNL multidisciplinary research in bio-defense, bioengineering, bionanosciences and microbial and environmental genomics.
Systematic characterization of deubiquitylating enzymes for roles in maintaining genome integrity.

PubMed

Nishi, Ryotaro; Wijnhoven, Paul; le Sage, Carlos; Tjeertes, Jorrit; Galanty, Yaron; Forment, Josep V; Clague, Michael J; Urbé, Sylvie; Jackson, Stephen P

2014-10-01

DNA double-strand breaks (DSBs) are perhaps the most toxic of all DNA lesions, with defects in the DNA-damage response to DSBs being associated with various human diseases. Although it is known that DSB repair pathways are tightly regulated by ubiquitylation, we do not yet have a comprehensive understanding of how deubiquitylating enzymes (DUBs) function in DSB responses. Here, by carrying out a multidimensional screening strategy for human DUBs, we identify several with hitherto unknown links to DSB repair, the G2/M DNA-damage checkpoint and genome-integrity maintenance. Phylogenetic analyses reveal functional clustering within certain DUB subgroups, suggesting evolutionarily conserved functions and/or related modes of action. Furthermore, we establish that the DUB UCHL5 regulates DSB resection and repair by homologous recombination through protecting its interactor, NFRKB, from degradation. Collectively, our findings extend the list of DUBs promoting the maintenance of genome integrity, and highlight their potential as therapeutic targets for cancer.
Functional correction of dystrophin actin binding domain mutations by genome editing

PubMed Central

Kyrychenko, Viktoriia; Kyrychenko, Sergii; Tiburcy, Malte; Shelton, John M.; Long, Chengzu; Schneider, Jay W.; Zimmermann, Wolfram-Hubertus; Bassel-Duby, Rhonda

2017-01-01

Dystrophin maintains the integrity of striated muscles by linking the actin cytoskeleton with the cell membrane. Duchenne muscular dystrophy (DMD) is caused by mutations in the dystrophin gene (DMD) that result in progressive, debilitating muscle weakness, cardiomyopathy, and a shortened lifespan. Mutations of dystrophin that disrupt the amino-terminal actin-binding domain 1 (ABD-1), encoded by exons 2–8, represent the second-most common cause of DMD. In the present study, we compared three different strategies for CRISPR/Cas9 genome editing to correct mutations in the ABD-1 region of the DMD gene by deleting exons 3–9, 6–9, or 7–11 in human induced pluripotent stem cells (iPSCs) and by assessing the function of iPSC-derived cardiomyocytes. All three exon deletion strategies enabled the expression of truncated dystrophin protein and restoration of cardiomyocyte contractility and calcium transients to varying degrees. We show that deletion of exons 3–9 by genomic editing provides an especially effective means of correcting disease-causing ABD-1 mutations. These findings represent an important step toward eventual correction of common DMD mutations and provide a means of rapidly assessing the expression and function of internally truncated forms of dystrophin lacking portions of ABD-1. PMID:28931764
Genome-Wide Comparative Gene Family Classification

PubMed Central

Frech, Christian; Chen, Nansheng

2010-01-01

Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221
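The comparative strategy, tuning a clustering parameter on the curated families of a reference species and then reusing it for a related genome, can be sketched as below. This is a minimal illustration under stated assumptions: a simple single-linkage clustering with a similarity threshold stands in for TRIBE-MCL and its parameters, agreement is scored with a pairwise F1 measure chosen for this example, and all gene names and similarity values are hypothetical.

```python
from itertools import combinations


def cluster(genes, similarity, threshold):
    """Single-linkage clustering: link genes whose similarity is at least the threshold."""
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for g in genes:
        groups.setdefault(find(g), set()).add(g)
    return {frozenset(members) for members in groups.values()}


def pair_f1(predicted, curated):
    """F1 score over gene pairs placed in the same family."""
    def pairs(families):
        return {frozenset(p) for fam in families for p in combinations(sorted(fam), 2)}

    p, c = pairs(predicted), pairs(curated)
    true_pos = len(p & c)
    if true_pos == 0:
        return 0.0
    precision, recall = true_pos / len(p), true_pos / len(c)
    return 2 * precision * recall / (precision + recall)


def best_threshold(genes, similarity, curated, candidates):
    """Pick the candidate parameter that best reproduces the curated families."""
    return max(candidates, key=lambda t: pair_f1(cluster(genes, similarity, t), curated))


# Reference species with curated families (hypothetical).
ref_genes = ["r1", "r2", "r3", "r4", "r5"]
ref_similarity = {("r1", "r2"): 0.9, ("r2", "r3"): 0.8, ("r4", "r5"): 0.85, ("r3", "r4"): 0.4}
curated_families = {frozenset({"r1", "r2", "r3"}), frozenset({"r4", "r5"})}

chosen = best_threshold(ref_genes, ref_similarity, curated_families, [0.3, 0.6, 0.95])

# Related, newly sequenced genome: reuse the parameter learned on the reference.
new_genes = ["n1", "n2", "n3", "n4"]
new_similarity = {("n1", "n2"): 0.7, ("n2", "n3"): 0.45, ("n3", "n4"): 0.9}
new_families = cluster(new_genes, new_similarity, chosen)
print(chosen, sorted(sorted(fam) for fam in new_families))
```

The design choice worth noting is that the parameter search never looks at the new genome; it relies on the assumption, stated in the abstract, that related species need similar settings.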
Functional genomics to discover antibiotic resistance genes: The paradigm of resistance to colistin mediated by ethanolamine phosphotransferase in Shewanella algae MARS 14.

PubMed

Telke, Amar A; Rolain, Jean-Marc

2015-12-01

Shewanella algae MARS 14 is a colistin-resistant clinical isolate retrieved from bronchoalveolar lavage of a hospitalised patient. A functional genomics strategy was employed to discover the molecular support for colistin resistance in S. algae MARS 14. A pZE21 MCS-1 plasmid-based genomic expression library was constructed in Escherichia coli TOP10. The estimated library size was 1.30×10^8 bp. Functional screening of colistin-resistant clones was carried out on Luria-Bertani agar containing 8 mg/L colistin. Five colistin-resistant clones were obtained after complete screening of the genomic expression library. Analysis of DNA sequencing results found a unique gene in all selected clones. Amino acid sequence analysis of this unique gene using the Integrated Microbial Genomes (IMG) and KEGG databases revealed that this gene encodes ethanolamine phosphotransferase (EptA, or so-called PmrC). Reverse transcription PCR analysis indicated that resistance to colistin in S. algae MARS 14 was associated with overexpression of EptA (27-fold increase), which plays a crucial role in the arrangement of outer membrane lipopolysaccharide.
Teaching strategies to incorporate genomics education into academic nursing curricula.

PubMed

Quevedo Garcia, Sylvia P; Greco, Karen E; Loescher, Lois J

2011-11-01

The translation of genomic science into health care has expanded our ability to understand the effects of genomics on human health and disease. As genomic advances continue, nurses are expected to have the knowledge and skills to translate genomic information into improved patient care. This integrative review describes strategies used to teach genomics in academic nursing programs and their facilitators and barriers to inclusion in nursing curricula. The Learning Engagement Model and the Diffusion of Innovations Theory guided the interpretation of findings. CINAHL, Medline, and Web of Science were resources for articles published during the past decade that included strategies for teaching genomics in academic nursing programs. Of 135 articles, 13 met criteria for review. Examples of effective genomics teaching strategies included clinical application through case studies, storytelling, online genomics resources, student self-assessment, guest lecturers, and a genetics focus group. Most strategies were not evaluated for effectiveness.
Identification of Genetic Loci Jointly Influencing Schizophrenia Risk and the Cognitive Traits of Verbal-Numerical Reasoning, Reaction Time, and General Cognitive Function.

PubMed

Smeland, Olav B; Frei, Oleksandr; Kauppi, Karolina; Hill, W David; Li, Wen; Wang, Yunpeng; Krull, Florian; Bettella, Francesco; Eriksen, Jon A; Witoelar, Aree; Davies, Gail; Fan, Chun C; Thompson, Wesley K; Lam, Max; Lencz, Todd; Chen, Chi-Hua; Ueland, Torill; Jönsson, Erik G; Djurovic, Srdjan; Deary, Ian J; Dale, Anders M; Andreassen, Ole A

2017-10-01

Schizophrenia is associated with widespread cognitive impairments. Although cognitive deficits are one of the factors most strongly associated with functional outcome in schizophrenia, current treatment strategies largely fail to ameliorate these impairments. To develop more efficient treatment strategies in patients with schizophrenia, a better understanding of the pathogenesis of these cognitive deficits is needed. Accumulating evidence indicates that genetic risk of schizophrenia may contribute to cognitive dysfunction. The objective was to identify genomic regions jointly influencing schizophrenia and the cognitive domains of reaction time and verbal-numerical reasoning, as well as general cognitive function, a phenotype that captures the shared variation in performance across cognitive domains. Combining data from genome-wide association studies from multiple phenotypes using conditional false discovery rate analysis provides increased power to discover genetic variants and could elucidate shared molecular genetic mechanisms. Data from the following genome-wide association studies, published from July 24, 2014, to January 17, 2017, were combined: schizophrenia in the Psychiatric Genomics Consortium cohort (n = 79 757 [cases, 34 486; controls, 45 271]); verbal-numerical reasoning (n = 36 035) and reaction time (n = 111 483) in the UK Biobank cohort; and general cognitive function in CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) (n = 53 949) and COGENT (Cognitive Genomics Consortium) (n = 27 888). The main outcome was genetic loci identified by conditional false discovery rate analysis. Brain messenger RNA expression and brain expression quantitative trait locus functionality were determined.
Among the participants in the genome-wide association studies, 21 loci jointly influencing schizophrenia and cognitive traits were identified: 2 loci shared between schizophrenia and verbal-numerical reasoning, 6 loci shared between schizophrenia and reaction time, and 14 loci shared between schizophrenia and general cognitive function. One locus was shared between schizophrenia and 2 cognitive traits and represented the strongest shared signal detected (nearest gene TCF20; chromosome 22q13.2); it was shared among schizophrenia (z score, 5.01; P = 5.53 × 10^-7), general cognitive function (z score, -4.43; P = 9.42 × 10^-6), and verbal-numerical reasoning (z score, -5.43; P = 5.64 × 10^-8). For 18 loci, schizophrenia risk alleles were associated with poorer cognitive performance. The implicated genes are expressed in the developmental and adult human brain. Replicable expression quantitative trait locus functionality was identified for 4 loci in the adult human brain. The discovered loci improve the understanding of the common genetic basis underlying schizophrenia and cognitive function, suggesting novel molecular genetic mechanisms.
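The z scores and P values reported for the TCF20 locus can be related with a short calculation. The sketch below is a minimal worked example under one stated assumption: that the reported P values are two-sided tail probabilities of a standard normal distribution, which the abstract does not state explicitly. The function name is hypothetical, and small differences from the reported values are expected because the published z scores are rounded.

```python
import math


def two_sided_p(z):
    """Two-sided P value for a standard normal z score: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2))


# z scores and P values as reported in the abstract for the TCF20 locus.
reported = {
    "schizophrenia": (5.01, 5.53e-7),
    "general cognitive function": (-4.43, 9.42e-6),
    "verbal-numerical reasoning": (-5.43, 5.64e-8),
}

for trait, (z, p_reported) in reported.items():
    print(f"{trait}: computed {two_sided_p(z):.3g}, reported {p_reported:.3g}")
```

The opposite signs of the z scores carry the direction of effect: the schizophrenia risk allele goes with lower scores on the cognitive traits, consistent with the statement about poorer cognitive performance.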

Functional assessment of human enhancer activities using whole-genome STARR-sequencing.

PubMed

Liu, Yuwen; Yu, Shan; Dhiman, Vineet K; Brunetti, Tonya; Eckart, Heather; White, Kevin P

2017-11-20

Genome-wide quantification of enhancer activity in the human genome has proven to be a challenging problem. Recent efforts have led to the development of powerful tools for enhancer quantification. However, because of genome size and complexity, these tools have yet to be applied to the whole human genome. In the current study, we use a human prostate cancer cell line, LNCaP, as a model to perform whole human genome STARR-seq (WHG-STARR-seq) to reliably obtain an assessment of enhancer activity. This approach builds upon previously developed STARR-seq in the fly genome and CapSTARR-seq techniques in targeted human genomic regions. With an improved library preparation strategy, our approach greatly increases the library complexity per unit of starting material, which makes it feasible and cost-effective to explore the landscape of regulatory activity in the much larger human genome. In addition to our ability to identify active, accessible enhancers located in open chromatin regions, we can also detect sequences with the potential for enhancer activity that are located in inaccessible, closed chromatin regions. When cells are treated with the histone deacetylase inhibitor Trichostatin A, genes near this latter class of enhancers are up-regulated, demonstrating the potential for endogenous functionality of these regulatory elements. WHG-STARR-seq provides an improved approach to current pipelines for analysis of high complexity genomes to gain a better understanding of the intricacies of transcriptional regulation.
Uprobe: a genome-wide universal probe resource for comparative physical mapping in vertebrates.

PubMed

Kellner, Wendy A; Sullivan, Robert T; Carlson, Brian H; Thomas, James W

2005-01-01

Interspecies comparisons are important for deciphering the functional content and evolution of genomes. The expansive array of >70 public vertebrate genomic bacterial artificial chromosome (BAC) libraries can provide a means of comparative mapping, sequencing, and functional analysis of targeted chromosomal segments that is independent and complementary to whole-genome sequencing. However, at the present time, no complementary resource exists for the efficient targeted physical mapping of the majority of these BAC libraries. Universal overgo-hybridization probes, designed from regions of sequenced genomes that are highly conserved between species, have been demonstrated to be an effective resource for the isolation of orthologous regions from multiple BAC libraries in parallel. Here we report the application of the universal probe design principle across entire genomes, and the subsequent creation of a complementary probe resource, Uprobe, for screening vertebrate BAC libraries. Uprobe currently consists of whole-genome sets of universal overgo-hybridization probes designed for screening mammalian or avian/reptilian libraries. Retrospective analysis, experimental validation of the probe design process on a panel of representative BAC libraries, and estimates of probe coverage across the genome indicate that the majority of all eutherian and avian/reptilian genes or regions of interest can be isolated using Uprobe. Future implementation of the universal probe design strategy will be used to create an expanded number of whole-genome probe sets that will encompass all vertebrate genomes.
The contribution of the DNA microarray technology to gene expression profiling in Leishmania spp.: a retrospective.

PubMed

Alonso, Ana; Larraga, Vicente; Alcolea, Pedro J

2018-05-07

The first genome project of any living organism excluding viruses, that of the gammaproteobacterium Haemophilus influenzae, was completed in 1995. Until the last decade, genome sequencing was very tedious because genome survey sequences (GSS) and/or expressed sequence tags (ESTs) belonging to plasmid, cosmid and artificial chromosome genome libraries had to be sequenced and assembled in silico. Even today, no genome is truly completely assembled, because gaps and unassembled contigs always remain. However, most assemblies represent the whole genome of the organism of origin from a practical point of view. The first genome sequencing projects of trypanosomatid parasites, those of Leishmania major, Trypanosoma cruzi and T. brucei, were completed in 2005 following those strategies. The functional genomics era rapidly developed on the basis of the microarray technology and has been evolving. In the case of the genus Leishmania, substantial biological information about differentiation in the digenetic life cycle of the parasite has been obtained. Later on, next generation sequencing revolutionized genome sequencing and functional genomics, leading to more sensitive and accurate results while using far fewer resources. This new technology is more advantageous, but does not invalidate microarray results. In fact, promising vaccine candidates and drug targets have been found on the basis of microarray-based screening and preliminary proof-of-concept tests. Copyright © 2018. Published by Elsevier B.V.
Genomic landscape of gastric cancer: molecular classification and potential targets.

PubMed

Guo, Jiawei; Yu, Weiwei; Su, Hui; Pang, Xiufeng

2017-02-01

Gastric cancer imposes a considerable health burden worldwide, and its mortality ranks as the second highest for all types of cancers. The limited knowledge of the molecular mechanisms underlying gastric cancer tumorigenesis hinders the development of therapeutic strategies. However, ongoing collaborative sequencing efforts facilitate molecular classification and unveil the genomic landscape of gastric cancer. Several new drivers and tumorigenic pathways in gastric cancer, including chromatin remodeling genes, RhoA-related pathways, TP53 dysregulation, activation of receptor tyrosine kinases, stem cell pathways and abnormal DNA methylation, have been revealed. These newly identified genomic alterations await translation into clinical diagnosis and targeted therapies. Considering that loss-of-function mutations are intractable, synthetic lethality could be employed when discussing feasible therapeutic strategies. Although many challenges remain to be tackled, we are optimistic regarding improvements in the prognosis and treatment of gastric cancer in the near future.
Integration, warehousing, and analysis strategies of Omics data.

PubMed

Gedela, Srinubabu

2011-01-01

"-Omics" is a current suffix for numerous types of large-scale biological data generation procedures, which naturally demand the development of novel algorithms for data storage and analysis. With next generation genome sequencing burgeoning, it is pivotal to decipher coding sites on the genome, gene function, and information on transcripts, beyond the mere availability of sequence information. To explore a genome and downstream molecular processes, we need numerous results at the various levels of cellular organization, obtained with different experimental designs, data analysis strategies and methodologies. This creates a need for controlled vocabularies and data integration to annotate, store, and update the flow of experimental data. This chapter explores key methodologies to merge Omics data by semantic data carriers, discusses controlled vocabularies as eXtensible Markup Languages (XML), and provides practical guidance, databases, and software links supporting the integration of Omics data.
Evolutionary Genomics of Defense Systems in Archaea and Bacteria*

PubMed Central

Koonin, Eugene V.; Makarova, Kira S.; Wolf, Yuri I.

2018-01-01

Evolution of bacteria and archaea involves an incessant arms race against an enormous diversity of genetic parasites. Accordingly, a substantial fraction of the genes in most bacteria and archaea are dedicated to antiparasite defense. The functions of these defense systems follow several distinct strategies, including innate immunity; adaptive immunity; and dormancy induction, or programmed cell death. Recent comparative genomic studies taking advantage of the expanding database of microbial genomes and metagenomes, combined with direct experiments, resulted in the discovery of several previously unknown defense systems, including innate immunity centered on Argonaute proteins, bacteriophage exclusion, and new types of CRISPR-Cas systems of adaptive immunity. Some general principles of function and evolution of defense systems are starting to crystallize, in particular, extensive gain and loss of defense genes during the evolution of prokaryotes; formation of genomic defense islands; evolutionary connections between mobile genetic elements and defense, whereby genes of mobile elements are repeatedly recruited for defense functions; the partially selfish and addictive behavior of the defense systems; and coupling between immunity and dormancy induction/programmed cell death. PMID:28657885
Long noncoding RNAs and tumorigenesis: genetic associations, molecular mechanisms, and therapeutic strategies.

PubMed

Zhang, Fan; Zhang, Liang; Zhang, Caiguo

2016-01-01

The human genome contains a large number of nonprotein-coding sequences. Recently, new discoveries in the functions of nonprotein-coding sequences have demonstrated that the "Dark Genome" significantly contributes to human diseases, especially with regard to cancer. Of particular interest in this review are long noncoding RNAs (lncRNAs), which comprise a class of nonprotein-coding transcripts that are longer than 200 nucleotides. Accumulating evidence indicates that a large number of lncRNAs exhibit genetic associations with tumorigenesis, tumor progression, and metastasis. Our current understanding of the molecular bases of these cancer-associated lncRNAs indicates that they play critical roles in gene transcription, translation, and chromatin modification. Therapeutic strategies based on the targeting of lncRNAs to disrupt their expression or their functions are being developed. In this review, we briefly summarize and discuss the genetic associations and the aberrant expression of lncRNAs in cancer, with a particular focus on studies that have revealed the molecular mechanisms of lncRNAs in tumorigenesis. In addition, we also discuss different therapeutic strategies that involve the targeting of lncRNAs.
Emerging Strategies and Integrated Systems Microbiology Technologies for Biodiscovery of Marine Bioactive Compounds

PubMed Central

Rocha-Martin, Javier; Harrington, Catriona; Dobson, Alan D.W.; O’Gara, Fergal

2014-01-01

Marine microorganisms continue to be a source of structurally and biologically novel compounds with potential use in the biotechnology industry. The marine environment has unique physicochemical properties (such as pH, pressure, temperature, osmolarity), and uncommon functional groups (such as isonitrile, dichloroimine, isocyanate, and halogenated functional groups) are frequently found in marine metabolites. These facts have resulted in the production of bioactive substances with properties different from those found in terrestrial habitats. In fact, the marine environment contains a relatively untapped reservoir of bioactivity. Recent advances in genomics, metagenomics, proteomics, combinatorial biosynthesis, synthetic biology, screening methods, expression systems, bioinformatics, and the ever increasing availability of sequenced genomes provide us with more opportunities than ever in the discovery of novel bioactive compounds and biocatalysts. The combination of these advanced techniques with traditional techniques, together with the use of dereplication strategies to eliminate known compounds, provides a powerful tool in the discovery of novel marine bioactive compounds. This review outlines and discusses the emerging strategies for the biodiscovery of these bioactive compounds. PMID:24918453
Species Choice for Comparative Genomics: Being Greedy Works

PubMed Central

Pardi, Fabio; Goldman, Nick

2005-01-01

Several projects investigating genetic function and evolution through sequencing and comparison of multiple genomes are now underway. These projects consume many resources, and appropriate planning should be devoted to choosing which species to sequence, potentially involving cooperation among different sequencing centres. A widely discussed criterion for species choice is the maximisation of evolutionary divergence. Our mathematical formalization of this problem surprisingly shows that the best long-term cooperative strategy coincides with the seemingly short-term “greedy” strategy of always choosing the next best single species. Other criteria influencing species choice, such as medical relevance or sequencing costs, can also be accommodated in our approach, suggesting our results' broad relevance in scientific policy decisions. PMID:16327885
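The "greedy" strategy described in this abstract can be sketched in a few lines. The example below assumes a rooted tree and measures divergence as the total branch length connecting the chosen species to the root, which is a simplified variant of the divergence criterion discussed in the paper. The tree, its branch lengths, and all function names are hypothetical.

```python
from __future__ import annotations


def path_to_root(leaf: str, parent: dict[str, tuple[str, float]]) -> list[str]:
    """Return the edges from a leaf up to the root; an edge is named by its child node."""
    edges = []
    node = leaf
    while node in parent:          # the root has no entry in `parent`
        edges.append(node)
        node = parent[node][0]
    return edges


def diversity(parent: dict[str, tuple[str, float]], species: list[str]) -> float:
    """Total branch length spanned by the given species and the root."""
    covered = set()
    for leaf in species:
        covered.update(path_to_root(leaf, parent))
    return sum(parent[edge][1] for edge in covered)


def greedy_species_choice(parent: dict[str, tuple[str, float]],
                          leaves: list[str], k: int) -> tuple[list[str], float]:
    """Pick k species, each time adding the one that contributes most new branch length."""
    covered: set[str] = set()
    chosen: list[str] = []
    total = 0.0
    for _ in range(min(k, len(leaves))):
        best_leaf, best_gain = None, -1.0
        for leaf in leaves:
            if leaf in chosen:
                continue
            # Only edges not already covered by earlier choices add divergence.
            gain = sum(parent[e][1] for e in path_to_root(leaf, parent)
                       if e not in covered)
            if gain > best_gain:
                best_leaf, best_gain = leaf, gain
        chosen.append(best_leaf)
        covered.update(path_to_root(best_leaf, parent))
        total += best_gain
    return chosen, total


if __name__ == "__main__":
    # child -> (parent, branch length); lengths are invented for illustration.
    tree = {
        "mammals": ("root", 1.0),
        "human": ("mammals", 0.5),
        "mouse": ("mammals", 2.0),
        "chicken": ("root", 4.0),
        "zebrafish": ("root", 6.0),
    }
    print(greedy_species_choice(tree, ["human", "mouse", "chicken", "zebrafish"], 3))
```

On this toy tree the greedy picks match the best three-species subset found by exhaustive comparison, in line with the result the abstract reports for its own formalization.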
Tol2 transposon-mediated transgenesis in Xenopus tropicalis.

PubMed

Hamlet, Michelle R Johnson; Yergeau, Donald A; Kuliyev, Emin; Takeda, Masatoshi; Taira, Masanori; Kawakami, Koichi; Mead, Paul E

2006-09-01

The diploid frog Xenopus tropicalis is becoming a powerful developmental genetic model system. Sequencing of the X. tropicalis genome is nearing completion and several labs are embarking on mutagenesis screens. We are interested in developing insertional mutagenesis strategies in X. tropicalis. Transposon-mediated insertional mutagenesis, once used exclusively in plants and invertebrate systems, is now more widely applicable to vertebrates. The first step in developing transposons as tools for mutagenesis is to demonstrate that these mobile elements function efficiently in the target organism. Here, we show that the Medaka fish transposon, Tol2, is able to stably integrate into the X. tropicalis genome and will serve as a powerful tool for insertional mutagenesis strategies in the frog.
Comparative Pan-Genome Analysis of Piscirickettsia salmonis Reveals Genomic Divergences within Genogroups.

PubMed

Nourdin-Galindo, Guillermo; Sánchez, Patricio; Molina, Cristian F; Espinoza-Rojas, Daniela A; Oliver, Cristian; Ruiz, Pamela; Vargas-Chacoff, Luis; Cárcamo, Juan G; Figueroa, Jaime E; Mancilla, Marcos; Maracaja-Coutinho, Vinicius; Yañez, Alejandro J

2017-01-01

Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis, functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remain lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established. 
The obtained data suggest that these genes could be directly associated with inter-genogroup differences in pathogenesis and host-pathogen interactions, information that could be useful in designing novel strategies for diagnosing and controlling P. salmonis infection.
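The pan-genome, core-genome and genogroup-exclusive gene counts reported above are, at their simplest, set operations over per-strain gene content. The sketch below assumes that genes have already been clustered into gene families, a step that real pan-genome pipelines perform first. The strain names, gene-family identifiers and function names are hypothetical and are not taken from the study.

```python
from __future__ import annotations


def pan_and_core(genomes: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Pan-genome = union of all gene families; core-genome = families shared by all strains."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


def group_exclusive(genomes: dict[str, set[str]], group: list[str]) -> set[str]:
    """Gene families found in at least one strain of `group` and in no strain outside it."""
    inside, _ = pan_and_core({s: g for s, g in genomes.items() if s in group})
    outside, _ = pan_and_core({s: g for s, g in genomes.items() if s not in group})
    return inside - outside


if __name__ == "__main__":
    # Toy gene-family content per strain (invented identifiers).
    strains = {
        "LF_strain_1": {"g1", "g2", "g3", "g4"},
        "LF_strain_2": {"g1", "g2", "g3", "g5"},
        "EM_strain_1": {"g1", "g2", "g6"},
        "EM_strain_2": {"g1", "g2", "g6", "g7"},
    }
    pan, core = pan_and_core(strains)
    print(len(pan), len(core))
    print(sorted(group_exclusive(strains, ["LF_strain_1", "LF_strain_2"])))
```

Restricting `pan_and_core` to the strains of one genogroup gives that group's own pan- and core-genome, which is how a per-genogroup core can be larger than the species-wide core.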
Comparative Pan-Genome Analysis of Piscirickettsia salmonis Reveals Genomic Divergences within Genogroups

PubMed Central

Nourdin-Galindo, Guillermo; Sánchez, Patricio; Molina, Cristian F.; Espinoza-Rojas, Daniela A.; Oliver, Cristian; Ruiz, Pamela; Vargas-Chacoff, Luis; Cárcamo, Juan G.; Figueroa, Jaime E.; Mancilla, Marcos; Maracaja-Coutinho, Vinicius; Yañez, Alejandro J.

2017-01-01

Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis, functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remain lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established. 
The obtained data suggest that these genes could be directly associated with inter-genogroup differences in pathogenesis and host-pathogen interactions, information that could be useful in designing novel strategies for diagnosing and controlling P. salmonis infection. PMID:29164068
A novel strategy for the determination of a rhabdovirus genome and its application to sequencing of Eggplant mottled dwarf virus.

PubMed

Pappi, Polyxeni G; Dovas, Chrysostomos I; Efthimiou, Konstantinos E; Maliogka, Varvara I; Katis, Nikolaos I

2013-08-01

A novel strategy employing the rhabdovirus untranslated conserved intergenic regions was developed and applied successfully for the determination of the complete nucleotide sequence of Eggplant mottled dwarf virus (EMDV). The EMDV genome contains seven open reading frames with the same organization as Potato yellow dwarf virus (PYDV), the type species of the genus Nucleorhabdovirus. These two species encode five core genes [nucleocapsid (N), phosphoprotein (P), matrix (M), glycoprotein (G), and the polymerase (L)] like other viruses of the genus and an additional one (X), located between N and P, giving rise to a protein with currently unknown function. Furthermore, both EMDV and PYDV contain a gene (Y), inserted between P and M, which probably encodes the virus movement protein, in concordance with the rest of the plant-infecting rhabdoviruses. Phylogenetic analysis of the polymerase gene confirmed the classification of EMDV within the genus Nucleorhabdovirus and showed a close evolutionary relationship to PYDV. The novel sequencing strategy developed is a useful tool for the genome determination of yet uncharacterized rhabdoviruses.
Rapid one-step construction of a Middle East Respiratory Syndrome (MERS-CoV) infectious clone system by homologous recombination.

PubMed

Nikiforuk, Aidan M; Leung, Anders; Cook, Bradley W M; Court, Deborah A; Kobasa, Darwyn; Theriault, Steven S

2016-10-01

Viral infectious clone systems serve as robust platforms to study viral gene or replicative function by reverse genetics, formulate vaccines, and adapt a wild-type virus to an animal host. Since the development of the first viral infectious clone system for the poliovirus, novel strategies of viral genome construction have allowed for the assembly of viral genomes across the identified viral families. However, the molecular profiles of some viruses make their genome more difficult to construct than others. Two factors that affect the difficulty of infectious clone construction are genome length and genome complexity. This work examines the available strategies for overcoming the obstacles of assembling the long and complex RNA genomes of coronaviruses and reports one-step construction of an infectious clone system for the Middle East Respiratory Syndrome coronavirus (MERS-CoV) by homologous recombination in S. cerevisiae. Future use of this methodology will shorten the time between emergence of a novel viral pathogen and construction of an infectious clone system. Completion of a viral infectious clone system facilitates further study of a virus's biology, improvement of diagnostic tests, vaccine production and the screening of antiviral compounds. Crown Copyright © 2016. Published by Elsevier B.V. All rights reserved.
An integrative strategy to identify the entire protein coding potential of prokaryotic genomes by proteogenomics.

PubMed

Omasits, Ulrich; Varadarajan, Adithi R; Schmid, Michael; Goetze, Sandra; Melidis, Damianos; Bourqui, Marc; Nikolayeva, Olga; Québatte, Maxime; Patrignani, Andrea; Dehio, Christoph; Frey, Juerg E; Robinson, Mark D; Wollscheid, Bernd; Ahrens, Christian H

2017-12-01

Accurate annotation of all protein-coding sequences (CDSs) is an essential prerequisite to fully exploit the rapidly growing repertoire of completely sequenced prokaryotic genomes. However, large discrepancies among the number of CDSs annotated by different resources, missed functional short open reading frames (sORFs), and overprediction of spurious ORFs represent serious limitations. Our strategy toward accurate and complete genome annotation consolidates CDSs from multiple reference annotation resources, ab initio gene prediction algorithms and in silico ORFs (a modified six-frame translation considering alternative start codons) in an integrated proteogenomics database (iPtgxDB) that covers the entire protein-coding potential of a prokaryotic genome. By extending the PeptideClassifier concept of unambiguous peptides for prokaryotes, close to 95% of the identifiable peptides imply one distinct protein, largely simplifying downstream analysis. Searching a comprehensive Bartonella henselae proteomics data set against such an iPtgxDB allowed us to unambiguously identify novel ORFs uniquely predicted by each resource, including lipoproteins, differentially expressed and membrane-localized proteins, novel start sites and wrongly annotated pseudogenes. Most novelties were confirmed by targeted, parallel reaction monitoring mass spectrometry, including unique ORFs and single amino acid variations (SAAVs) identified in a re-sequenced laboratory strain that are not present in its reference genome. We demonstrate the general applicability of our strategy for genomes with varying GC content and distinct taxonomic origin. We release iPtgxDBs for B. henselae, Bradyrhizobium diazoefficiens and Escherichia coli and the software to generate both proteogenomics search databases and integrated annotation files that can be viewed in a genome browser for any prokaryote. © 2017 Omasits et al.; Published by Cold Spring Harbor Laboratory Press.
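The "in silico ORFs" step, a six-frame translation that also considers alternative start codons, can be illustrated with a minimal sketch. It assumes the standard codon table and treats ATG, GTG and TTG as start codons, a common choice for bacteria. The abstract does not specify the authors' exact modifications, so the start-codon set, the minimum length and all names below are assumptions rather than the iPtgxDB implementation.

```python
from __future__ import annotations

import itertools

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}
START_CODONS = {"ATG", "GTG", "TTG"}   # assumed set of alternative starts
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def six_frame_orfs(dna: str, min_aa: int = 2) -> list[tuple[str, int, str]]:
    """Return (strand, frame, protein) for every start-to-stop ORF in all six frames.

    The frame on the "-" strand is counted on the reverse-complemented sequence.
    ORFs that run off the end of the sequence without a stop codon are dropped.
    """
    dna = dna.upper()
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = None                      # None while outside an ORF
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                aa = CODON_TABLE.get(codon, "X")  # X for ambiguous bases
                if protein is None:
                    if codon in START_CODONS:
                        protein = ["M"]         # any initiator is read as Met
                elif aa == "*":
                    if len(protein) >= min_aa:
                        orfs.append((strand, frame, "".join(protein)))
                    protein = None
                else:
                    protein.append(aa)
    return orfs


if __name__ == "__main__":
    print(six_frame_orfs("GTGAAACCCTAA"))
```

Because each stop-delimited stretch is reported from its first start codon, this sketch yields the longest candidate per stretch. Reporting every internal start as well would be a small change to the loop.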
Functional annotation from the genome sequence of the giant panda.

PubMed

Huo, Tong; Zhang, Yinjie; Lin, Jianping

2012-08-01

The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided into two groups: 20,179 proteins whose functions can be predicted by GeneScan formed the known-function group, whereas 822 proteins whose functions cannot be predicted by GeneScan comprised the unknown-function group. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins under the assumption that proteins related to the panda-specific traits in the unknown-function group exist. After this filtering procedure, we identified 32 proteins (2 of which are membrane proteins) specific to the giant panda genome as compared against the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting protein prediction using the Database of Interacting Proteins. Nineteen proteins were predicted to be zinc-binding proteins, thus affecting the activities of nucleic acids. The 32 panda-specific proteins will be further investigated by structural and functional analysis.
HelmCoP: An Online Resource for Helminth Functional Genomics and Drug and Vaccine Targets Prioritization

PubMed Central

Taylor, Christina M.; Mitreva, Makedonka

2011-01-01

A vast majority of the burden from neglected tropical diseases results from helminth infections (nematodes and platyhelminthes). Parasitic helminths infect over 2 billion people, exerting a high collective burden that rivals high-mortality conditions such as AIDS or malaria, and cause devastation to crops and livestock. The challenges to improve control of parasitic helminth infections are multi-fold and no single category of approaches will meet them all. New information such as helminth genomics, functional genomics and proteomics coupled with innovative bioinformatic approaches provides fundamental molecular information about these parasites, accelerating both basic research and the development of effective diagnostics, vaccines and new drugs. To facilitate such studies we have developed an online resource, HelmCoP (Helminth Control and Prevention), built by integrating functional, structural and comparative genomic data from plant, animal and human helminths, to enable researchers to develop strategies for drug, vaccine and pesticide prioritization, while also providing a useful comparative genomics platform. HelmCoP encompasses genomic data from several hosts, including model organisms, along with a comprehensive suite of structural and functional annotations, to assist in comparative analyses and to study host-parasite interactions. The HelmCoP interface, with a sophisticated query engine as a backbone, allows users to search for multi-factorial combinations of properties and serves readily accessible information that will assist in the identification of various genes of interest. HelmCoP is publicly available at: http://www.nematode.net/helmcop.html. PMID:21760913
Solutions for data integration in functional genomics: a critical assessment and case study.

PubMed

Smedley, Damian; Swertz, Morris A; Wolstencroft, Katy; Proctor, Glenn; Zouberakis, Michael; Bard, Jonathan; Hancock, John M; Schofield, Paul

2008-11-01

The torrent of data emerging from the application of new technologies to functional genomics and systems biology can no longer be contained within the traditional modes of data sharing and publication, with the consequence that data are being deposited in, distributed across and disseminated through an increasing number of databases. The resulting fragmentation poses serious problems for the model organism community, which increasingly relies on data mining and computational approaches that require gathering of data from a range of sources. In the light of these problems, the European Commission has funded a coordination action, CASIMIR (coordination and sustainability of international mouse informatics resources), with a remit to assess the technical and social aspects of database interoperability that currently prevent the full realization of the potential of data integration in mouse functional genomics. In this article, we assess the current problems with interoperability, with particular reference to mouse functional genomics, and critically review the technologies that can be deployed to overcome them. We describe a typical use-case where an investigator wishes to gather data on variation, genomic context and metabolic pathway involvement for genes discovered in a genome-wide screen. We go on to develop an automated approach involving an in silico experimental workflow tool, Taverna, using web services, BioMart and MOLGENIS technologies for data retrieval. Finally, we focus on the current impediments to adopting such an approach in a wider context, and strategies to overcome them.
Epstein-Barr virus (EBV) recombinants: use of positive selection markers to rescue mutants in EBV-negative B-lymphoma cells.

PubMed Central

Wang, F; Marchini, A; Kieff, E

1991-01-01

The objective of these experiments was to develop strategies for creation and identification of recombinant mutant Epstein-Barr viruses (EBV). EBV recombinant molecular genetics has been limited to mutations within a short DNA segment deleted from a nontransforming EBV and an underlying strategy which relies on growth transformation of primary B lymphocytes for identification of recombinants. Thus, mutations outside the deletion or mutations which affect transformation cannot be easily recovered. In these experiments we investigated whether a toxic drug resistance gene, guanine phosphoribosyltransferase or hygromycin phosphotransferase, driven by the simian virus 40 promoter can be recombined into the EBV genome and can function to identify B-lymphoma cells infected with recombinant virus. Two different strategies were used to recombine the drug resistance marker into the EBV genome. Both utilized transfection of partially permissive, EBV-infected B95-8 cells and positive selection for cells which had incorporated a functional drug resistance gene. In the first series of experiments, B95-8 clones were screened for transfected DNA that had recombined into the EBV genome. In the second series of experiments, the transfected drug resistance marker was linked to the plasmid and lytic EBV origins so that it was maintained as an episome and could recombine with the B95-8 EBV genome during virus replication. The recombinant EBV from either experiment could be recovered by infection and toxic drug selection of EBV-negative B-lymphoma cells. The EBV genome in these B-lymphoma cells is frequently an episome. Virus genes associated with latent infection of primary B lymphocytes are expressed. Expression of Epstein-Barr virus nuclear antigen 2 (EBNA-2) and the EBNA-3 genes is variable relative to that of EBNA-1, as is characteristic of some naturally infected Burkitt tumor cells. 
Moreover, the EBV-infected B-lymphoma cells are often partially permissive for early replicative cycle gene expression and virus replication can be induced, in contrast to previously reported in vitro infected B-lymphoma cells. These studies demonstrate that dominant selectable markers can be inserted into the EBV genome, are active in the context of the EBV genome, and can be used to recover recombinant EBV in B-lymphoma cells. This system should be particularly useful for recovering EBV genomes with mutations in essential transforming genes. PMID:1848303
Decomposing Oncogenic Transcriptional Signatures to Generate Maps of Divergent Cellular States

Cancer.gov

The systematic sequencing of the cancer genome has led to the identification of numerous genetic alterations in cancer. However, a deeper understanding of the functional consequences of these alterations is necessary to guide appropriate therapeutic strategies. Here, we describe Onco-GPS (OncoGenic Positioning System), a data-driven analysis framework to organize individual tumor samples with shared oncogenic alterations onto a reference map defined by their underlying cellular states.
PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.

PubMed

Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X

2016-01-01

PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.
Complete Genome Sequence of Treponema paraluiscuniculi, Strain Cuniculi A: The Loss of Infectivity to Humans Is Associated with Genome Decay

PubMed Central

Šmajs, David; Zobaníková, Marie; Strouhal, Michal; Čejková, Darina; Dugan-Rocha, Shannon; Pospíšilová, Petra; Norris, Steven J.; Albert, Tom; Qin, Xiang; Hallsworth-Pepin, Kym; Buhay, Christian; Muzny, Donna M.; Chen, Lei; Gibbs, Richard A.; Weinstock, George M.

2011-01-01

Treponema paraluiscuniculi is the causative agent of rabbit venereal spirochetosis. It is not infectious to humans, although its genome structure is very closely related to other pathogenic Treponema species including Treponema pallidum subspecies pallidum, the etiological agent of syphilis. In this study, the genome sequence of Treponema paraluiscuniculi, strain Cuniculi A, was determined by a combination of several high-throughput sequencing strategies. Whereas the overall size (1,133,390 bp), arrangement, and gene content of the Cuniculi A genome closely resembled those of the T. pallidum genome, the T. paraluiscuniculi genome contained a markedly higher number of pseudogenes and gene fragments (51). In addition to pseudogenes, 33 divergent genes were also found in the T. paraluiscuniculi genome. A set of 32 (out of 84) affected genes encoded proteins of known or predicted function in the Nichols genome. These proteins included virulence factors, gene regulators and components of DNA repair and recombination. The majority (52 or 61.9%) of the Cuniculi A pseudogenes and divergent genes were of unknown function. Our results indicate that T. paraluiscuniculi has evolved from a T. pallidum-like ancestor and adapted to a specialized host-associated niche (rabbits) during loss of infectivity to humans. The genes that are inactivated or altered in T. paraluiscuniculi are candidates for virulence factors important in the infectivity and pathogenesis of T. pallidum subspecies. PMID:21655244
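The gene counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 84 "affected genes" are the sum of the 51 pseudogenes and gene fragments and the 33 divergent genes; the abstract implies this but does not state it outright. The variable names are illustrative only.

```python
# Counts quoted in the abstract above.
pseudogenes_and_fragments = 51
divergent_genes = 33

# Assumption: "affected genes" = pseudogenes/fragments + divergent genes.
affected_total = pseudogenes_and_fragments + divergent_genes

known_function = 32  # "32 (out of 84) affected genes"
unknown_function = affected_total - known_function

# Share of affected genes with unknown function, as a percentage.
unknown_share = round(100 * unknown_function / affected_total, 1)

print(affected_total, unknown_function, unknown_share)
```

Under that assumption the totals agree with the abstract: 84 affected genes, of which 52 (61.9%) are of unknown function.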
NCBI GEO: archive for functional genomics data sets--update.

PubMed

Barrett, Tanya; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Holko, Michelle; Yefanov, Andrey; Lee, Hyeseung; Zhang, Naigong; Robertson, Cynthia L; Serova, Nadezhda; Davis, Sean; Soboleva, Alexandra

2013-01-01

The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data.
Identifying Bacterial Immune Evasion Proteins Using Phage Display.

PubMed

Fevre, Cindy; Scheepmaker, Lisette; Haas, Pieter-Jan

2017-01-01

Methods aimed at identification of immune evasion proteins mainly rely on in silico prediction of sequence or structural homology to known evasion proteins, or use a proteomics-driven approach. Although proven successful, these methods are limited by low efficiency and/or a lack of functional identification. Here we describe a high-throughput genomic strategy to functionally identify bacterial immune evasion proteins using phage display technology. Genomic bacterial DNA is randomly fragmented and ligated into a phage display vector that is used to create a phage display library expressing bacterial secreted and membrane-bound proteins. This library is used to select displayed bacterial secretome proteins that interact with host immune components.
Genomics England's implementation of its public engagement strategy: Blurred boundaries between engagement for the United Kingdom's 100,000 Genomes project and the need for public support.

PubMed

Samuel, Gabrielle Natalie; Farsides, Bobbie

2018-04-01

The United Kingdom's 100,000 Genomes Project has the aim of sequencing 100,000 genomes from National Health Service patients such that whole genome sequencing becomes routine clinical practice. It also has a research-focused goal to provide data for scientific discovery. Genomics England is the limited company established by the Department of Health to deliver the project. As an innovative scientific/clinical venture, it is interesting to consider how Genomics England positions itself in relation to public engagement activities. We set out to explore how individuals working at, or associated with, Genomics England enacted public engagement in practice. Our findings show that individuals offered a narrative in which public engagement performed more than one function. On one side, public engagement was seen as 'good practice'. On the other, public engagement was presented as core to the project's success - needed to encourage involvement and ultimately recruitment. We discuss the implications of this in this article.
Diving into marine genomics with CRISPR/Cas9 systems.

PubMed

Momose, Tsuyoshi; Concordet, Jean-Paul

2016-12-01

More and more genomes are being sequenced, and a great range of biological questions can be examined at the genomic level in a growing number of organisms. Testing the function of genome features, from gene networks, genome organization and conserved non-coding sequences to microRNAs, and, more generally, experimentally addressing the genotype-phenotype relationship is now possible owing to the clustered, regularly interspaced, short palindromic repeats (CRISPR)-Cas9 revolution of genome editing. In the present review, we give a brief overview of the CRISPR/Cas9 toolbox and the different strategies for genome editing currently available. We list the first examples of applications to marine organisms and also draw from studies in more common laboratory models to suggest guidelines for the design of genome editing experiments and to discuss challenges specific to marine organisms. In addition, we discuss future perspectives, including applications of CRISPR/Cas9 to base editing and targeted reprogramming of gene transcription.
Massive gene loss in mistletoe (Viscum, Viscaceae) mitochondria

PubMed Central

Petersen, G.; Cuenca, A.; Møller, I. M.; Seberg, O.

2015-01-01

Parasitism is a successful survival strategy across all kingdoms and has evolved repeatedly in angiosperms. Parasitic plants obtain nutrients from other plants and some are agricultural pests. Obligate parasites, which cannot complete their lifecycle without a host, may lack functional photosystems (holoparasites), or have retained photosynthesis (hemiparasites). Plastid genomes are often reduced in parasites, but complete mitochondrial genomes have not been sequenced and their mitochondrial respiratory capacities are largely unknown. The hemiparasitic European mistletoe (Viscum album), known from folklore and postulated therapeutic properties, is a pest in plantations and forestry. We compare the mitochondrial genomes of three Viscum species based on the complete mitochondrial genome of V. album, the first from a parasitic plant. We show that mitochondrial genes encoding proteins of all respiratory complexes are lacking or pseudogenized, raising several questions relevant to all parasitic plants: Are any mitochondrial gene functions essential? Do any genes need to be located in the mitochondrial genome or can they all be transferred to the nucleus? Can parasitic plants survive without oxidative phosphorylation by using alternative respiratory pathways? More generally, our study is a step towards understanding how host- and self-perception, host integration and nucleic acid transfer have modified ancestral mitochondrial genomes. PMID:26625950
Functional alleles of the flowering time regulator FRIGIDA in the Brassica oleracea genome

PubMed Central

2012-01-01

Background: Plants adopt different reproductive strategies as an adaptation to growth in a range of climates. In Arabidopsis thaliana FRIGIDA (FRI) confers a vernalization requirement and thus winter annual habit by increasing the expression of the MADS box transcriptional repressor FLOWERING LOCUS C (FLC). Variation at FRI plays a major role in A. thaliana life history strategy, as independent loss-of-function alleles that result in a rapid-cycling habit in different accessions appear to have evolved many times. The aim of this study was to identify and characterize orthologues of FRI in Brassica oleracea. Results: We describe the characterization of FRI from Brassica oleracea and identify the two B. oleracea FRI orthologues (BolC.FRI.a and BolC.FRI.b). These show extensive amino acid conservation in the central and C-terminal regions to FRI from other Brassicaceae, including A. thaliana, but have a diverged N-terminus. The genes map to two of the three regions of B. oleracea chromosomes syntenic to part of A. thaliana chromosome 5, suggesting that one of the FRI copies has been lost since the ancient triplication event that formed the B. oleracea genome. This genomic position is not syntenic with FRI in A. thaliana and comparative analysis revealed a recombination event within the A. thaliana FRI promoter. This relocated A. thaliana FRI to chromosome 4, very close to the nucleolar organizer region, leaving a fragment of FRI in the syntenic location on A. thaliana chromosome 5. Our data show this rearrangement occurred after the divergence from A. lyrata. We explored the allelic variation at BolC.FRI.a within cultivated B. oleracea germplasm and identified two major alleles, which appear equally functional both to each other and A. thaliana FRI, when expressed as fusions in A. thaliana. Conclusions: We identify the two Brassica oleracea FRI genes, one of which we show through A. thaliana complementation experiments is functional, and show their genomic location is not syntenic with A. thaliana FRI due to an ancient recombination event. This has complicated previous association analyses of FRI with variation in life history strategy in the Brassica genus. PMID:22333192
Passage relevance models for genomics search.

PubMed

Urbain, Jay; Frieder, Ophir; Goharian, Nazli

2009-03-19

We present a passage relevance model for integrating syntactic and semantic evidence of biomedical concepts and topics using a probabilistic graphical model. Component models of topics, concepts, terms, and document are represented as potential functions within a Markov Random Field. The probability of a passage being relevant to a biologist's information need is represented as the joint distribution across all potential functions. Relevance model feedback of top ranked passages is used to improve distributional estimates of query concepts and topics in context, and a dimensional indexing strategy is used for efficient aggregation of concept and term statistics. By integrating multiple sources of evidence including dependencies between topics, concepts, and terms, we seek to improve genomics literature passage retrieval precision. Using this model, we are able to demonstrate statistically significant improvements in retrieval precision using a large genomics literature corpus.
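The joint distribution described in this abstract follows the general factorization of a Markov Random Field. A minimal sketch is given below, assuming one potential function per component model; the symbols $R$ (passage relevance), $Q$ (the query or information need), $D$ (the passage and its document), $\psi_c$ and $Z$ are notation chosen here for illustration and are not taken from the paper.

```latex
P(R \mid Q, D) \;=\; \frac{1}{Z} \prod_{c \,\in\, \{\text{topic},\ \text{concept},\ \text{term},\ \text{document}\}} \psi_c(R, Q, D),
\qquad
Z \;=\; \sum_{R'} \prod_{c} \psi_c(R', Q, D)
```

Here $Z$ is the usual normalizing constant. How the authors actually parameterize each potential is not stated in the abstract.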
CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription.

PubMed

Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T; Wilczynski, Grzegorz M; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

2015-12-17

Spatial genome organization and its effect on transcription remain a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights into the topological mechanism of human variations and diseases.
Emerging applications of genome-editing technology to examine functionality of GWAS-associated variants for complex traits.

PubMed

Smith, Andrew J P; Deloukas, Panos; Munroe, Patricia B

2018-04-13

Over the last decade, genome-wide association studies (GWAS) have propelled the discovery of thousands of loci associated with complex diseases. The focus is now turning towards the function of these association signals, determining the causal variant(s) amongst those in strong linkage disequilibrium, and identifying their underlying mechanisms, such as long-range gene regulation. Genome-editing techniques utilising zinc-finger nucleases (ZFN), transcription activator-like effector nucleases (TALENs) and clustered regularly-interspaced short palindromic repeats with Cas9 nuclease (CRISPR-Cas9) are becoming the tools of choice to establish functionality for these variants, due to the ability to assess effects of single variants in vivo. This review will discuss examples of how these technologies have begun to aid functional analysis of GWAS loci for complex traits such as cardiovascular disease, type 2 diabetes, cancer, obesity and autoimmune disease. We focus on analysis of variants occurring within non-coding genomic regions, as these comprise the majority of GWAS variants, providing the greatest challenges to determining functionality, and compare editing strategies that provide different levels of evidence for variant functionality. The review describes molecular insights into some of these potentially causal variants and how these may relate to the pathology of the trait, and looks towards future directions for these technologies in post-GWAS analysis, such as base-editing.
Gene Fusion: A Genome Wide Survey

NASA Technical Reports Server (NTRS)

Liang, Ping; Riley, Monica

2001-01-01

Organisms are known to form larger and more complex multimodular (composite or chimeric), mostly multi-functional proteins through fusion of two or more individual genes that have independent evolutionary histories and functions. We call each of these components a module. The existence of multimodular proteins may improve the efficiency of gene regulation and of cellular functions, and thus may give the host organism advantages in adapting to its environment. Analysis of all gene fusions in present-day organisms should allow us to examine the patterns of gene fusion in the context of cellular functions, to trace back the evolutionary processes from the ancient smaller and uni-functional proteins to the present-day larger and complex multi-functional proteins, and to estimate the minimal number of ancestor proteins that existed in the last common ancestor of all life on earth. Although many multimodular proteins are known experimentally, systematic identification of gene fusion events at genome scale was not possible until recently, when large numbers of completed genome sequences became available. In addition, technical difficulties for such analysis also exist due to the complexity of this biological and evolutionary process. We report a new strategy to computationally identify multimodular proteins using completed genome sequences, together with the results of a survey of 22 organisms; data from over 40 organisms will be presented during the meeting. Additional information is contained in the original extended abstract.
Genome-wide SNP analysis reveals a genetic basis for sea-age variation in a wild population of Atlantic salmon (Salmo salar).

PubMed

Johnston, Susan E; Orell, Panu; Pritchard, Victoria L; Kent, Matthew P; Lien, Sigbjørn; Niemelä, Eero; Erkinaro, Jaakko; Primmer, Craig R

2014-07-01

Delaying sexual maturation can lead to larger body size and higher reproductive success, but carries an increased risk of death before reproducing. Classical life history theory predicts that trade-offs between reproductive success and survival should lead to the evolution of an optimal strategy in a given population. However, variation in mating strategies generally persists, and there remains a poor understanding of the genetic and physiological mechanisms underlying this variation. One extreme case of this is in the Atlantic salmon (Salmo salar), which can show variation in the age at which they return from their marine migration to spawn (i.e. their 'sea age'). This results in large size differences between strategies, with direct implications for individual fitness. Here, we used an Illumina Infinium SNP array to identify regions of the genome associated with variation in sea age in a large population of Atlantic salmon in Northern Europe, implementing individual-based genome-wide association studies (GWAS) and population-based FST outlier analyses. We identified several regions of the genome which vary in association with phenotype and/or selection between sea ages, with nearby genes having functions related to muscle development, metabolism, immune response and mate choice. In addition, we found that individuals of different sea ages belong to different, yet sympatric populations in this system, indicating that reproductive isolation may be driven by divergence between stable strategies. Overall, this study demonstrates how genome-wide methodologies can be integrated with samples collected from wild, structured populations to understand their ecology and evolution in a natural context.
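To make the idea of a population-based FST outlier scan concrete, here is a minimal sketch. It uses the textbook heterozygosity-based definition FST = (HT - HS) / HT for a biallelic SNP in two equally weighted groups; the abstract does not say which FST estimator or outlier test the authors used, so this is not their method. The SNP names, allele frequencies and fixed cutoff are hypothetical values chosen for illustration.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """FST for one biallelic SNP from allele frequencies in two groups.

    Uses FST = (HT - HS) / HT with equal group weights, where HS is the
    mean within-group expected heterozygosity and HT is the expected
    heterozygosity at the pooled allele frequency.
    """
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        # SNP is fixed for the same allele in both groups: no differentiation.
        return 0.0
    return (ht - hs) / ht


# Hypothetical allele frequencies in two sea-age groups (not from the study).
snp_freqs = {
    "snp_a": (0.50, 0.50),
    "snp_b": (0.20, 0.80),
    "snp_c": (0.45, 0.55),
}

fst_by_snp = {name: fst_biallelic(p1, p2) for name, (p1, p2) in snp_freqs.items()}

# Arbitrary illustrative cutoff; real scans derive thresholds from the
# genome-wide FST distribution or from a neutral model.
THRESHOLD = 0.25
outliers = sorted(name for name, fst in fst_by_snp.items() if fst > THRESHOLD)

print(outliers)
```

With these made-up frequencies only `snp_b` exceeds the cutoff. A real analysis would also correct for sample size and for multiple testing.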
Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes

PubMed Central

Thybert, David; Roller, Maša; Navarro, Fábio C.P.; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C.; Laukaitis, Christina M.; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A.; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J.; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M.; Odom, Duncan T.; Flicek, Paul

2018-01-01

Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. PMID:29563166
Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes Fungi

PubMed Central

Ohm, Robin A.; Feau, Nicolas; Henrissat, Bernard; Schoch, Conrad L.; Horwitz, Benjamin A.; Barry, Kerrie W.; Condon, Bradford J.; Copeland, Alex C.; Dhillon, Braham; Glaser, Fabian; Hesse, Cedar N.; Kosti, Idit; LaButti, Kurt; Lindquist, Erika A.; Lucas, Susan; Salamov, Asaf A.; Bradshaw, Rosie E.; Ciuffetti, Lynda; Hamelin, Richard C.; Kema, Gert H. J.; Lawrence, Christopher; Scott, James A.; Spatafora, Joseph W.; Turgeon, B. Gillian; de Wit, Pierre J. G. M.; Zhong, Shaobin; Goodwin, Stephen B.; Grigoriev, Igor V.

2012-01-01

The class Dothideomycetes is one of the largest groups of fungi with a high level of ecological diversity including many plant pathogens infecting a broad range of hosts. Here, we compare genome features of 18 members of this class, including 6 necrotrophs, 9 (hemi)biotrophs and 3 saprotrophs, to analyze genome structure, evolution, and the diverse strategies of pathogenesis. The Dothideomycetes most likely evolved from a common ancestor more than 280 million years ago. The 18 genome sequences differ dramatically in size due to variation in repetitive content, but show much less variation in number of (core) genes. Gene order appears to have been rearranged mostly within chromosomal boundaries by multiple inversions, in extant genomes frequently demarcated by adjacent simple repeats. Several Dothideomycetes contain one or more gene-poor, transposable element (TE)-rich putatively dispensable chromosomes of unknown function. The 18 Dothideomycetes offer an extensive catalogue of genes involved in cellulose degradation, proteolysis, secondary metabolism, and cysteine-rich small secreted proteins. Ancestors of the two major orders of plant pathogens in the Dothideomycetes, the Capnodiales and Pleosporales, may have had different modes of pathogenesis, with the former having fewer of these genes than the latter. Many of these genes are enriched in proximity to transposable elements, suggesting faster evolution because of the effects of repeat induced point (RIP) mutations. A syntenic block of genes, including oxidoreductases, is conserved in most Dothideomycetes and upregulated during infection in L. maculans, suggesting a possible function in response to oxidative stress. PMID:23236275
Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes Fungi

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ohm, Robin A.; Feau, Nicolas; Henrissat, Bernard

The class Dothideomycetes is one of the largest groups of fungi with a high level of ecological diversity including many plant pathogens infecting a broad range of hosts. Here, we compare genome features of 18 members of this class, including 6 necrotrophs, 9 (hemi)biotrophs and 3 saprotrophs, to analyze genome structure, evolution, and the diverse strategies of pathogenesis. The Dothideomycetes most likely evolved from a common ancestor more than 280 million years ago. The 18 genome sequences differ dramatically in size due to variation in repetitive content, but show much less variation in number of (core) genes. Gene order appears to have been rearranged mostly within chromosomal boundaries by multiple inversions, in extant genomes frequently demarcated by adjacent simple repeats. Several Dothideomycetes contain one or more gene-poor, transposable element (TE)-rich putatively dispensable chromosomes of unknown function. The 18 Dothideomycetes offer an extensive catalogue of genes involved in cellulose degradation, proteolysis, secondary metabolism, and cysteine-rich small secreted proteins. Ancestors of the two major orders of plant pathogens in the Dothideomycetes, the Capnodiales and Pleosporales, may have had different modes of pathogenesis, with the former having fewer of these genes than the latter. Many of these genes are enriched in proximity to transposable elements, suggesting faster evolution because of the effects of repeat induced point (RIP) mutations. A syntenic block of genes, including oxidoreductases, is conserved in most Dothideomycetes and upregulated during infection in L. maculans, suggesting a possible function in response to oxidative stress.
Molecular mimicry: an important virulence strategy employed by Legionella pneumophila to subvert host functions.

PubMed

Nora, Tamara; Lomma, Mariella; Gomez-Valero, Laura; Buchrieser, Carmen

2009-08-01

It is 32 years since Legionella pneumophila was identified and recognized as a human pathogen, causing the severe form of pneumonia termed Legionnaires' disease, or legionellosis. This bacterium is found in freshwater reservoirs where it replicates in aquatic protozoa and can invade man-made water distribution systems. Although the disease can be treated by antibiotherapy and prevented through surveillance and control measures, reported cases of Legionnaires' disease continue to rise across Europe and outbreaks of major public health significance still occur. Genome sequencing and analyses led to a giant step forward by suggesting new ways by which this intracellular bacterium might subvert host functions. One particular feature revealed was the presence of many eukaryotic-like proteins, possibly mimicking host proteins to allow intracellular replication of Legionella. Here, we describe the identification and analysis of these proteins and report on recent advances detailing the mechanisms by which these proteins function. Finally, comparative and evolutionary genomic aspects regarding the eukaryotic-like proteins are presented. Collectively, these data have shed new light on the virulence strategies of L. pneumophila, a major aspect of which is molecular mimicry.
PvTFDB: a Phaseolus vulgaris transcription factors database for expediting functional genomics in legumes

PubMed Central

Bhawna; Bonthala, V.S.; Gajula, MNV Prasad

2016-01-01

The common bean [Phaseolus vulgaris (L.)] is one of the essential proteinaceous vegetables grown in developing countries. However, its production is challenged by low yields caused by numerous biotic and abiotic stress conditions. Regulatory transcription factors (TFs) are a key component of the genome and are the most significant targets for producing stress-tolerant crops; hence, functional genomic studies of these TFs are important. Therefore, we have constructed a web-accessible TF database for P. vulgaris, called PvTFDB, which contains 2370 putative TF gene models in 49 TF families. The database provides comprehensive information for each identified TF, including sequence data, functional annotation, SSRs with their primer sets, protein physical properties, chromosomal location, phylogeny, tissue-specific gene expression data, orthologues, cis-regulatory elements and gene ontology (GO) assignment. Altogether, this information can be used to expedite functional genomic studies of specific TFs of interest. The objectives of this database are to support functional genomic studies of common bean TFs and to help identify the regulatory mechanisms underlying various stress responses, thereby easing breeding strategies for variety production. It offers search interfaces by gene ID and functional annotation, and browsing interfaces by family and by chromosome. This database will also serve as a promising central repository for researchers and breeders working towards the improvement of legume crops. In addition, the database provides unrestricted public access, and users can freely download all of the data it contains. Database URL: http://www.multiomics.in/PvTFDB/ PMID:27465131
Is there a strategy I iron uptake mechanism in maize?

PubMed

Li, Suzhen; Zhou, Xiaojin; Chen, Jingtang; Chen, Rumei

2018-04-03

Iron is a metal micronutrient that is essential for plant growth and development. Graminaceous and nongraminaceous plants have evolved different mechanisms to mediate Fe uptake. Generally, strategy I is used by nongraminaceous plants like Arabidopsis, while graminaceous plants, such as rice, barley, and maize, are considered to use strategy II Fe uptake. Upon the functional characterization of OsIRT1 and OsIRT2 in rice, it was suggested that rice, as an exceptional graminaceous plant, utilizes both strategy I and strategy II Fe uptake systems. Similarly, ZmIRT1 and ZmZIP3 were identified as functional zinc and iron transporters in the maize genome, along with the determination of several genes encoding Zn and Fe transporters, raising the possibility that strategy I Fe uptake also occurs in maize. This mini-review integrates previous reports and recent evidence to obtain a better understanding of the mechanisms of Fe uptake in maize.

The OncoPPi network of cancer-focused protein-protein interactions to inform biological insights and therapeutic strategies* | Office of Cancer Genomics

Cancer.gov

As genomics advances reveal the cancer gene landscape, a daunting task is to understand how these genes contribute to dysregulated oncogenic pathways. Integration of cancer genes into networks offers opportunities to reveal protein–protein interactions (PPIs) with functional and therapeutic significance. Here, we report the generation of a cancer-focused PPI network, termed OncoPPi, and identification of >260 cancer-associated PPIs not in other large-scale interactomes.
Novel Strategy for Discrimination of Transcription Factor Binding Motifs Employing Mathematical Neural Network

NASA Astrophysics Data System (ADS)

Sugimoto, Asuka; Sumi, Takuya; Kang, Jiyoung; Tateno, Masaru

2017-07-01

Recognition in biological macromolecular systems, such as DNA-protein recognition, is one of the most crucial problems to solve toward understanding the fundamental mechanisms of various biological processes. Since specific base sequences of genome DNA are discriminated by proteins, such as transcription factors (TFs), finding TF binding motifs (TFBMs) in whole genome DNA sequences is currently a central issue in interdisciplinary biophysical and information sciences. In the present study, a novel strategy to create a discriminant function for discrimination of TFBMs by constituting mathematical neural networks (NNs) is proposed, together with a method to determine the boundary of signals (TFBMs) and noise in the NN-score (output) space. This analysis also leads to the mathematical limitation of discrimination in the recognition of features representing TFBMs, in an information geometrical manifold. Thus, the present strategy enables the identification of the whole space of TFBMs, right up to the noise boundary.
From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

PubMed

Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

2016-04-01

The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention.
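The screening idea described in this abstract (flagging exact 17-mers that occur more than three times) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name repeated_kmers, its parameters, and the choice to scan only the forward strand and to skip windows containing ambiguous bases are all assumptions made for illustration.

```python
from collections import defaultdict


def repeated_kmers(sequence, k=17, min_count=4):
    """Map each exact k-mer seen at least min_count times to its start positions.

    Hypothetical helper: forward strand only, 0-based positions,
    windows containing anything other than A/C/G/T are skipped.
    """
    seq = sequence.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            positions[kmer].append(i)
    # "More than three times" corresponds to min_count=4.
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_count}


# Toy usage with a short k so the repeats are visible by eye.
toy = "ACGTACGTACGTACGTACGT"
hits = repeated_kmers(toy, k=4, min_count=5)
print(hits)
```

On a real mitochondrial genome one would keep the defaults (k=17, min_count=4) and probably also count reverse-complement occurrences, which this sketch deliberately leaves out.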
Challenges and strategies for implementing genomic services in diverse settings: experiences from the Implementing GeNomics In pracTicE (IGNITE) network.

PubMed

Sperber, Nina R; Carpenter, Janet S; Cavallari, Larisa H; Damschroder, Laura J; Cooper-DeHoff, Rhonda M; Denny, Joshua C; Ginsburg, Geoffrey S; Guan, Yue; Horowitz, Carol R; Levy, Kenneth D; Levy, Mia A; Madden, Ebony B; Matheny, Michael E; Pollin, Toni I; Pratt, Victoria M; Rosenman, Marc; Voils, Corrine I; Weitzel, Kristen W; Wilke, Russell A; Wu, R Ryanne; Orlando, Lori A

2017-05-22

To realize potential public health benefits from genetic and genomic innovations, understanding how best to implement the innovations into clinical care is important. The objective of this study was to synthesize data on challenges identified by six diverse projects that are part of a National Human Genome Research Institute (NHGRI)-funded network focused on implementing genomics into practice and strategies to overcome these challenges. We used a multiple-case study approach with each project considered as a case and qualitative methods to elicit and describe themes related to implementation challenges and strategies. We describe challenges and strategies in an implementation framework and typology to enable consistent definitions and cross-case comparisons. Strategies were linked to challenges based on expert review and shared themes. Three challenges were identified by all six projects, and strategies to address these challenges varied across the projects. One common challenge was to increase the relative priority of integrating genomics within the health system electronic health record (EHR). Four projects used data warehousing techniques to accomplish the integration. The second common challenge was to strengthen clinicians' knowledge and beliefs about genomic medicine. To overcome this challenge, all projects developed educational materials and conducted meetings and outreach focused on genomic education for clinicians. The third challenge was engaging patients in the genomic medicine projects. Strategies to overcome this challenge included use of mass media to spread the word, actively involving patients in implementation (e.g., a patient advisory board), and preparing patients to be active participants in their healthcare decisions. This is the first collaborative evaluation focusing on the description of genomic medicine innovations implemented in multiple real-world clinical settings. Findings suggest that strategies to facilitate integration of genomic data within existing EHRs and educate stakeholders about the value of genomic services are considered important for effective implementation. Future work could build on these findings to evaluate which strategies are optimal under what conditions. This information will be useful for guiding translation of discoveries to clinical care, which, in turn, can provide data to inform continual improvement of genomic innovations and their applications.
Genome mining of Streptomyces scabrisporus NF3 reveals symbiotic features including genes related to plant interactions.

PubMed

Ceapă, Corina Diana; Vázquez-Hernández, Melissa; Rodríguez-Luna, Stefany Daniela; Cruz Vázquez, Angélica Patricia; Jiménez Suárez, Verónica; Rodríguez-Sanoja, Romina; Alvarez-Buylla, Elena R; Sánchez, Sergio

2018-01-01

Endophytic bacteria are widespread and associated with plant physiological benefits, yet their genomes and secondary metabolites remain largely unidentified. In this study, we explored the genome of the endophyte Streptomyces scabrisporus NF3 for discovery of potential novel molecules as well as genes and metabolites involved in host interactions. The complete genomes of seven Streptomyces and three other more distantly related bacteria were used to define the functional landscape of this unique microbe. The S. scabrisporus NF3 genome is larger than the average Streptomyces genome and not structured for an obligate endosymbiotic lifestyle; this and the fact that it can grow in R2YE media imply that it could include a soil-living stage. The genome displays an enrichment of genes associated with amino acid production, protein secretion, secondary metabolite and antioxidant production and xenobiotic degradation, indicating that S. scabrisporus NF3 could contribute to the metabolic enrichment of soil microbial communities and of its hosts. Importantly, besides its metabolic advantages, the genome showed evidence for differential functional specificity and diversification of plant interaction molecules, including genes for the production of plant hormones, stress resistance molecules, chitinases, antibiotics and siderophores. Given the diversity of S. scabrisporus mechanisms for host upkeep, we propose that these strategies were necessary for its adaptation to plant hosts and to face changes in environmental conditions.
Genome mining of Streptomyces scabrisporus NF3 reveals symbiotic features including genes related to plant interactions

PubMed Central

Rodríguez-Luna, Stefany Daniela; Cruz Vázquez, Angélica Patricia; Jiménez Suárez, Verónica; Rodríguez-Sanoja, Romina; Alvarez-Buylla, Elena R.; Sánchez, Sergio

2018-01-01

Endophytic bacteria are widespread and associated with plant physiological benefits, yet their genomes and secondary metabolites remain largely unidentified. In this study, we explored the genome of the endophyte Streptomyces scabrisporus NF3 for discovery of potential novel molecules as well as genes and metabolites involved in host interactions. The complete genomes of seven Streptomyces and three other more distantly related bacteria were used to define the functional landscape of this unique microbe. The S. scabrisporus NF3 genome is larger than the average Streptomyces genome and not structured for an obligate endosymbiotic lifestyle; this and the fact that it can grow in R2YE media imply that it could include a soil-living stage. The genome displays an enrichment of genes associated with amino acid production, protein secretion, secondary metabolite and antioxidant production and xenobiotic degradation, indicating that S. scabrisporus NF3 could contribute to the metabolic enrichment of soil microbial communities and of its hosts. Importantly, besides its metabolic advantages, the genome showed evidence for differential functional specificity and diversification of plant interaction molecules, including genes for the production of plant hormones, stress resistance molecules, chitinases, antibiotics and siderophores. Given the diversity of S. scabrisporus mechanisms for host upkeep, we propose that these strategies were necessary for its adaptation to plant hosts and to face changes in environmental conditions. PMID:29447216
Global mapping of DNA conformational flexibility on Saccharomyces cerevisiae.

PubMed

Menconi, Giulia; Bedini, Andrea; Barale, Roberto; Sbrana, Isabella

2015-04-01

In this study we provide the first comprehensive map of DNA conformational flexibility in the complete Saccharomyces cerevisiae genome. Flexibility plays a key role in DNA supercoiling and DNA/protein binding, regulating DNA transcription, replication or repair. Specific interest in flexibility analysis concerns its relationship with human genome instability. Enrichment in flexible sequences has been detected in unstable regions of the human genome defined as fragile sites, where genes map and carry frequent deletions and rearrangements in cancer. Flexible sequences have been suggested to be the determinants of fragile gene proneness to breakage; however, their actual role and properties remain elusive. Our in silico analysis, carried out genome-wide via the StabFlex algorithm, shows the conserved presence of highly flexible regions in the budding yeast genome as well as in genomes of other Saccharomyces sensu stricto species. Flexible peaks in S. cerevisiae identify 175 ORFs mapping on their 3'UTR, a region affecting mRNA translation, localization and stability. (TA)n repeats of different extension shape the central structure of peaks and co-localize with polyadenylation efficiency element (EE) signals. ORFs with flexible peaks share common features. Transcripts are characterized by decreased half-life: this is considered peculiar to genes involved in regulatory systems with high turnover; consistently, their function affects biological processes such as cell cycle regulation or stress response. Our findings support the functional importance of flexibility peaks, suggesting that the flexible sequence may be derived by an expansion of the canonical TAYRTA polyadenylation efficiency element. The flexible (TA)n repeat amplification could be the outcome of an evolutionary neofunctionalization leading to a differential 3'-end processing and expression regulation in genes with peculiar function. Our study provides new support for the functional role of flexibility in genomes and a strategy for its characterization inside human fragile sites.
Global Mapping of DNA Conformational Flexibility on Saccharomyces cerevisiae

PubMed Central

Menconi, Giulia; Bedini, Andrea; Barale, Roberto; Sbrana, Isabella

2015-01-01

In this study we provide the first comprehensive map of DNA conformational flexibility in the complete Saccharomyces cerevisiae genome. Flexibility plays a key role in DNA supercoiling and DNA/protein binding, regulating DNA transcription, replication or repair. Specific interest in flexibility analysis concerns its relationship with human genome instability. Enrichment in flexible sequences has been detected in unstable regions of the human genome defined as fragile sites, where genes map and carry frequent deletions and rearrangements in cancer. Flexible sequences have been suggested to be the determinants of fragile gene proneness to breakage; however, their actual role and properties remain elusive. Our in silico analysis, carried out genome-wide via the StabFlex algorithm, shows the conserved presence of highly flexible regions in the budding yeast genome as well as in genomes of other Saccharomyces sensu stricto species. Flexible peaks in S. cerevisiae identify 175 ORFs mapping on their 3’UTR, a region affecting mRNA translation, localization and stability. (TA)n repeats of different extension shape the central structure of peaks and co-localize with polyadenylation efficiency element (EE) signals. ORFs with flexible peaks share common features. Transcripts are characterized by decreased half-life: this is considered peculiar to genes involved in regulatory systems with high turnover; consistently, their function affects biological processes such as cell cycle regulation or stress response. Our findings support the functional importance of flexibility peaks, suggesting that the flexible sequence may be derived by an expansion of the canonical TAYRTA polyadenylation efficiency element. The flexible (TA)n repeat amplification could be the outcome of an evolutionary neofunctionalization leading to a differential 3’-end processing and expression regulation in genes with peculiar function. Our study provides new support for the functional role of flexibility in genomes and a strategy for its characterization inside human fragile sites. PMID:25860149
Integrative computational approach for genome-based study of microbial lipid-degrading enzymes.

PubMed

Vorapreeda, Tayvich; Thammarongtham, Chinae; Laoteng, Kobkul

2016-07-01

Lipid-degrading or lipolytic enzymes have gained enormous attention in academic and industrial sectors. Several efforts are underway to discover new lipase enzymes from a variety of microorganisms with particular catalytic properties to be used for extensive applications. In addition, various tools and strategies have been implemented to unravel the functional relevance of the versatile lipid-degrading enzymes for special purposes. This review highlights the study of microbial lipid-degrading enzymes through an integrative computational approach. The identification of putative lipase genes from microbial genomes and metagenomic libraries using homology-based mining is discussed, with an emphasis on sequence analysis of conserved motifs and enzyme topology. Molecular modelling of three-dimensional structure on the basis of sequence similarity is shown to be a potential approach for exploring the structural and functional relationships of candidate lipase enzymes. The perspectives on a discriminative framework of cutting-edge tools and technologies, including bioinformatics, computational biology, functional genomics and functional proteomics, intended to facilitate rapid progress in understanding lipolysis mechanism and to discover novel lipid-degrading enzymes of microorganisms are discussed.
Precision medicine for cancer with next-generation functional diagnostics.

PubMed

Friedman, Adam A; Letai, Anthony; Fisher, David E; Flaherty, Keith T

2015-12-01

Precision medicine is about matching the right drugs to the right patients. Although this approach is technology agnostic, in cancer there is a tendency to make precision medicine synonymous with genomics. However, genome-based cancer therapeutic matching is limited by incomplete biological understanding of the relationship between phenotype and cancer genotype. This limitation can be addressed by functional testing of live patient tumour cells exposed to potential therapies. Recently, several 'next-generation' functional diagnostic technologies have been reported, including novel methods for tumour manipulation, molecularly precise assays of tumour responses and device-based in situ approaches; these address the limitations of the older generation of chemosensitivity tests. The promise of these new technologies suggests a future diagnostic strategy that integrates functional testing with next-generation sequencing and immunoprofiling to precisely match combination therapies to individual cancer patients.
Unlocking Triticeae genomics to sustainably feed the future

PubMed Central

Mochida, Keiichi; Shinozaki, Kazuo

2013-01-01

The tribe Triticeae includes the major crops wheat and barley. Within the last few years, the whole genomes of four Triticeae species—barley, wheat, Tausch’s goatgrass (Aegilops tauschii) and wild einkorn wheat (Triticum urartu)—have been sequenced. The availability of these genomic resources for Triticeae plants and innovative analytical applications using next-generation sequencing technologies are helping to revitalize our approaches in genetic work and to accelerate improvement of the Triticeae crops. Comparative genomics and integration of genomic resources from Triticeae plants and the model grass Brachypodium distachyon are aiding the discovery of new genes and functional analyses of genes in Triticeae crops. Innovative approaches and tools such as analysis of next-generation populations, evolutionary genomics and systems approaches with mathematical modeling are new strategies that will help us discover alleles for adaptive traits to future agronomic environments. In this review, we provide an update on genomic tools for use with Triticeae plants and Brachypodium and describe emerging approaches toward crop improvements in Triticeae. PMID:24204022
Application of Genetic/Genomic Approaches to Allergic Disorders

PubMed Central

Baye, Tesfaye M.; Martin, Lisa J.; Khurana Hershey, Gurjit K.

2010-01-01

Completion of the human genome project and rapid progress in genetics and bioinformatics have enabled the development of large public databases, which include genetic and genomic data linked to clinical health data. With the massive amount of information available, clinicians and researchers have the unique opportunity to complement and integrate their daily practice with the existing resources to clarify the underlying etiology of complex phenotypes such as allergic diseases. The genome itself is now often utilized as a starting point for many studies and multiple innovative approaches have emerged applying genetic/genomic strategies to key questions in the field of allergy and immunology. There have been several successes, which have uncovered new insights into the biologic underpinnings of allergic disorders. Herein, we will provide an in depth review of genomic approaches to identifying genes and biologic networks involved in allergic diseases. We will discuss genetic and phenotypic variation, statistical approaches for gene discovery, public databases, functional genomics, clinical implications, and the challenges that remain. PMID:20638111
Ebolavirus comparative genomics

DOE PAGES

Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; ...

2015-07-14

The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
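The abstract does not say how its oligomer frequency analysis was carried out, so the following Python sketch only illustrates the general idea: each genome is summarized as a vector of normalized k-mer frequencies, and genomes are compared by a distance between those vectors. The function names, the choice of k=4, and the use of Euclidean distance are all assumptions for illustration, not details from the study.

```python
import math
from collections import Counter
from itertools import product


def oligomer_profile(sequence, k=4):
    """Return normalized frequencies of all 4**k A/C/G/T oligomers, in a fixed order.

    Hypothetical helper: windows with ambiguous bases are simply not counted.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    oligomers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[o] for o in oligomers)
    return [counts[o] / total if total else 0.0 for o in oligomers]


def profile_distance(a, b):
    """Euclidean distance between two frequency profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy usage: two short made-up sequences, compared with dinucleotide profiles.
p1 = oligomer_profile("ACGTACGTACGT", k=2)
p2 = oligomer_profile("AAAAAAAATTTT", k=2)
print(profile_distance(p1, p1), profile_distance(p1, p2) > 0)
```

With real genomes, the pairwise distances would typically feed a clustering or ordination step; which one the study used is not stated in the abstract.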
Engineered kinesin motor proteins amenable to small-molecule inhibition

PubMed Central

Engelke, Martin F.; Winding, Michael; Yue, Yang; Shastry, Shankar; Teloni, Federico; Reddy, Sanjay; Blasius, T. Lynne; Soppina, Pushpanjali; Hancock, William O.; Gelfand, Vladimir I.; Verhey, Kristen J.

2016-01-01

The human genome encodes 45 kinesin motor proteins that drive cell division, cell motility, intracellular trafficking and ciliary function. Determining the cellular function of each kinesin would benefit from specific small-molecule inhibitors. However, screens have yielded only a few specific inhibitors. Here we present a novel chemical-genetic approach to engineer kinesin motors that can carry out the function of the wild-type motor yet can also be efficiently inhibited by small, cell-permeable molecules. Using kinesin-1 as a prototype, we develop two independent strategies to generate inhibitable motors, and characterize the resulting inhibition in single-molecule assays and in cells. We further apply these two strategies to create analogously inhibitable kinesin-3 motors. These inhibitable motors will be of great utility to study the functions of specific kinesins in a dynamic manner in cells and animals. Furthermore, these strategies can be used to generate inhibitable versions of any motor protein of interest. PMID:27045608
Engineering and Functional Characterization of Fusion Genes Identifies Novel Oncogenic Drivers of Cancer. | Office of Cancer Genomics

Cancer.gov

Oncogenic gene fusions drive many human cancers, but tools to more quickly unravel their functional contributions are needed. Here we describe methodology permitting fusion gene construction for functional evaluation. Using this strategy, we engineered the known fusion oncogenes, BCR-ABL1, EML4-ALK, and ETV6-NTRK3, as well as 20 previously uncharacterized fusion genes identified in TCGA datasets.
Cancer driver-passenger distinction via sporadic human and dog cancer comparison: a proof of principle study with colorectal cancer

PubMed Central

Tang, Jie; Li, Yaping; Lyon, Kenneth; Camps, Jordi; Dalton, Stephen; Ried, Thomas; Zhao, Shaying

2014-01-01

Herein we report a proof of principle study illustrating a novel dog-human comparison strategy that addresses a central aim of cancer research, namely cancer driver–passenger distinction. We previously demonstrated that sporadic canine colorectal cancers (CRCs) share similar molecular pathogenesis mechanisms as their human counterparts. In this study, we compared the genome-wide copy number abnormalities between 29 human- and 10 canine sporadic CRCs. This led to the identification of 73 driver candidate genes (DCGs), altered in both species and with 27 from the whole genome and 46 from dog-human genomic rearrangement breakpoint (GRB) regions, as well as 38 passenger candidate genes (PCGs), altered in humans only and located in GRB regions. We noted that DCGs significantly differ from PCGs in every analysis conducted to assess their cancer relevance and biological functions. Importantly, while PCGs are not enriched in any specific functions, DCGs possess significantly enhanced functionality closely associated with cell proliferation and death regulation, as well as with epithelial cell apicobasal polarity establishment/maintenance. These observations support the notion that, in sporadic CRCs of both species, cell polarity genes not only contribute in preventing cancer cell invasion and spreading, but also likely serve as tumor suppressors by modulating cell growth. This pilot study validates our novel strategy and has uncovered four new potential cell polarity and colorectal tumor suppressor genes (RASA3, NUPL1, DENND5A, and AVL9). Expansion of this study would make more driver-passenger distinctions for cancers with large genomic amplifications or deletions, and address key questions regarding the relationship between cancer pathogenesis and epithelial cell polarity control in mammals. PMID:23416983
Cancer driver-passenger distinction via sporadic human and dog cancer comparison: a proof-of-principle study with colorectal cancer.

PubMed

Tang, J; Li, Y; Lyon, K; Camps, J; Dalton, S; Ried, T; Zhao, S

2014-02-13

Herein we report a proof-of-principle study illustrating a novel dog-human comparison strategy that addresses a central aim of cancer research, namely cancer driver-passenger distinction. We previously demonstrated that sporadic canine colorectal cancers (CRCs) share similar molecular pathogenesis mechanisms as their human counterparts. In this study, we compared the genome-wide copy number abnormalities between 29 human and 10 canine sporadic CRCs. This led to the identification of 73 driver candidate genes (DCGs), altered in both species, and with 27 from the whole genome and 46 from dog-human genomic rearrangement breakpoint (GRB) regions, as well as 38 passenger candidate genes (PCGs), altered in humans only and located in GRB regions. We noted that DCGs significantly differ from PCGs in every analysis conducted to assess their cancer relevance and biological functions. Importantly, although PCGs are not enriched in any specific functions, DCGs possess significantly enhanced functionality closely associated with cell proliferation and death regulation, as well as with epithelial cell apicobasal polarity establishment/maintenance. These observations support the notion that, in sporadic CRCs of both species, cell polarity genes not only contribute in preventing cancer cell invasion and spreading, but also likely serve as tumor suppressors by modulating cell growth. This pilot study validates our novel strategy and has uncovered four new potential cell polarity and colorectal tumor suppressor genes (RASA3, NUPL1, DENND5A and AVL9). Expansion of this study would make more driver-passenger distinctions for cancers with large genomic amplifications or deletions, and address key questions regarding the relationship between cancer pathogenesis and epithelial cell polarity control in mammals.
RNAi screening comes of age: improved techniques and complementary approaches

PubMed Central

Mohr, Stephanie E.; Smith, Jennifer A.; Shamu, Caroline E.; Neumüller, Ralph A.; Perrimon, Norbert

2014-01-01

Gene silencing through sequence-specific targeting of mRNAs by RNAi has enabled genome-wide functional screens in cultured cells and in vivo in model organisms. These screens have resulted in the identification of new cellular pathways and potential drug targets. Considerable progress has been made to improve the quality of RNAi screen data through the development of new experimental and bioinformatics approaches. The recent availability of genome-editing strategies, such as the CRISPR (clustered regularly interspaced short palindromic repeats)-Cas9 system, when combined with RNAi, could lead to further improvements in screen data quality and follow-up experiments, thus promoting our understanding of gene function and gene regulatory networks. PMID:25145850
NCBI GEO: archive for functional genomics data sets—update

PubMed Central

Barrett, Tanya; Wilhite, Stephen E.; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F.; Tomashevsky, Maxim; Marshall, Kimberly A.; Phillippy, Katherine H.; Sherman, Patti M.; Holko, Michelle; Yefanov, Andrey; Lee, Hyeseung; Zhang, Naigong; Robertson, Cynthia L.; Serova, Nadezhda; Davis, Sean; Soboleva, Alexandra

2013-01-01

The Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) is an international public repository for high-throughput microarray and next-generation sequence functional genomic data sets submitted by the research community. The resource supports archiving of raw data, processed data and metadata which are indexed, cross-linked and searchable. All data are freely available for download in a variety of formats. GEO also provides several web-based tools and strategies to assist users to query, analyse and visualize data. This article reports current status and recent database developments, including the release of GEO2R, an R-based web application that helps users analyse GEO data. PMID:23193258
The transcriptional regulator pool of the marine bacterium Rhodopirellula baltica SH 1T as revealed by whole genome comparisons.

PubMed

Lombardot, Thierry; Bauer, Margarete; Teeling, Hanno; Amann, Rudolf; Glöckner, Frank Oliver

2005-01-01

Rhodopirellula baltica (strain SH 1T) is a free-living marine representative of the phylogenetically independent and environmentally relevant phylum Planctomycetes. Little is known about the regulatory strategies of free-living bacteria with large (7.15 Mb) genomes. Therefore, a consistent, quantitative and qualitative description was produced by comparing R. baltica's transcriptional regulator pool with that of 123 publicly available bacterial genomes. The overall results are congruous with earlier observations that in Bacteria, the proportion of genes encoding transcriptional regulators generally increases with genome size. However, R. baltica distinctly stands out from this trend with only 2.4% (174) of all genes predicted to encode transcriptional regulators. The qualitative investigation of R. baltica's transcriptional regulators revealed a clear shift towards high numbers of two-component systems (66) as well as high numbers of sigma factors (49), with more than 76% (37) belonging to the extra-cytoplasmic function subfamily of sigma-70. Only one predicted sigma factor showed a relatively close phylogenetic relationship to that of another bacterium, the sigma factor SigZ of Bacillus subtilis. In summary, analysis of the R. baltica genome revealed disparate regulatory mechanisms and a clear bias towards direct environmental sensing. This strategy might provide a selective advantage for organisms living in habitats with frequently changing environmental conditions.

Diploidy and the selective advantage for sexual reproduction in unicellular organisms.

PubMed

Kleiman, Maya; Tannenbaum, Emmanuel

2009-11-01

This article develops mathematical models describing the evolutionary dynamics of both asexually and sexually reproducing populations of diploid unicellular organisms. The asexual and sexual life cycles are based on the asexual and sexual life cycles in Saccharomyces cerevisiae, Baker's yeast, which normally reproduces by asexual budding, but switches to sexual reproduction when stressed. The mathematical models consider three reproduction pathways: (1) Asexual reproduction, (2) self-fertilization, and (3) sexual reproduction. We also consider two forms of genome organization. In the first case, we assume that the genome consists of two multi-gene chromosomes, whereas in the second case, we consider the opposite extreme and assume that each gene defines a separate chromosome, which we call the multi-chromosome genome. These two cases are considered to explore the role that recombination has on the mutation-selection balance and the selective advantage of the various reproduction strategies. We assume that the purpose of diploidy is to provide redundancy, so that damage to a gene may be repaired using the other, presumably undamaged copy (a process known as homologous recombination repair). As a result, we assume that the fitness of the organism only depends on the number of homologous gene pairs that contain at least one functional copy of a given gene. If the organism has at least one functional copy of every gene in the genome, we assume a fitness of 1. In general, if the organism has $l$ homologous pairs that lack a functional copy of the given gene, then the fitness of the organism is $\kappa(l)$. The $\kappa(l)$ are assumed to be monotonically decreasing, so that $\kappa(0) = 1 > \kappa(1) > \kappa(2) > \cdots > \kappa(\infty) = 0$.
For nearly all of the reproduction strategies we consider, we find, in the limit of large $N$, that the mean fitness at mutation-selection balance is $\max\{2e^{-\mu} - 1, 0\}$, where $N$ is the number of genes in the haploid set of the genome, $\epsilon$ is the probability that a given DNA template strand of a given gene produces a mutated daughter during replication, and $\mu = N\epsilon$. The only exception is the sexual reproduction pathway for the multi-chromosomed genome. Assuming a multiplicative fitness landscape where $\kappa(l) = \alpha^{l}$ for $\alpha \in (0, 1)$, this strategy is found to have a mean fitness that exceeds the mean fitness of all the other strategies. Furthermore, while other reproduction strategies experience a total loss of viability due to the steady accumulation of deleterious mutations once $\mu$ exceeds [Formula: see text], no such transition occurs in the sexual pathway. Indeed, in the limit as $\alpha \to 1$ for the multiplicative landscape, we can show that the mean fitness for the sexual pathway with the multi-chromosomed genome converges to $e^{-2\mu}$, which is always positive. We explicitly allow for mitotic recombination in this study, which, in contrast to previous studies using different models, does not have any advantage over other asexual reproduction strategies. The results of this article provide a basis for understanding the selective advantage of the specific meiotic pathway that is employed by sexually reproducing organisms. The results of this article also suggest an explanation for why unicellular organisms such as Saccharomyces cerevisiae (Baker's yeast) switch to a sexual mode of reproduction when stressed. While the results of this article are based on modeling mutation-propagation in unicellular organisms, they nevertheless suggest that, in more complex organisms with significantly larger genomes, sex is necessary to prevent the loss of viability of a population due to genetic drift.
Finally, and perhaps most importantly, the results of this article demonstrate a selective advantage for sexual reproduction with fewer and much less restrictive assumptions than those of previous studies.
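The two closed-form mean-fitness expressions quoted in the abstract above can be compared numerically. The following is a minimal sketch, not the authors' code: the function names and the sample values of N and epsilon are hypothetical, and only the two formulas themselves (the large-N mean fitness shared by most strategies, and the alpha -> 1 limit for the sexual pathway with the multi-chromosomed genome) come from the abstract.

```python
import math


def mutation_load(n_genes: int, epsilon: float) -> float:
    """mu = N * epsilon, as defined in the abstract."""
    return n_genes * epsilon


def mean_fitness_general(mu: float) -> float:
    """Mean fitness at mutation-selection balance for most strategies:
    max{2 e^{-mu} - 1, 0} (large-N limit quoted in the abstract)."""
    return max(2.0 * math.exp(-mu) - 1.0, 0.0)


def mean_fitness_sexual_limit(mu: float) -> float:
    """Limiting mean fitness e^{-2 mu} for the sexual pathway with the
    multi-chromosomed genome, as alpha -> 1 on the multiplicative landscape."""
    return math.exp(-2.0 * mu)


# Hypothetical parameter values, chosen only for illustration.
for n_genes, epsilon in [(1000, 0.0001), (1000, 0.0005), (1000, 0.001)]:
    mu = mutation_load(n_genes, epsilon)
    print(
        f"mu={mu:.2f}  general={mean_fitness_general(mu):.4f}  "
        f"sexual_limit={mean_fitness_sexual_limit(mu):.4f}"
    )
```

Because e^{-2mu} - (2e^{-mu} - 1) = (1 - e^{-mu})^2 is never negative, the sexual-pathway limit is at least as large as the general expression for every mu, which agrees with the abstract's statement that this strategy is never worse and stays positive.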
Reductive evolution of architectural repertoires in proteomes and the birth of the tripartite world

PubMed Central

Wang, Minglei; Yafremava, Liudmila S.; Caetano-Anollés, Derek; Mittenthal, Jay E.; Caetano-Anollés, Gustavo

2007-01-01

The repertoire of protein architectures in proteomes is evolutionarily conserved and capable of preserving an accurate record of genomic history. Here we use a census of protein architecture in 185 genomes that have been fully sequenced to generate genome-based phylogenies that describe the evolution of the protein world at fold (F) and fold superfamily (FSF) levels. The patterns of representation of F and FSF architectures over evolutionary history suggest three epochs in the evolution of the protein world: (1) architectural diversification, where members of an architecturally rich ancestral community diversified their protein repertoire; (2) superkingdom specification, where superkingdoms Archaea, Bacteria, and Eukarya were specified; and (3) organismal diversification, where F and FSF specific to relatively small sets of organisms appeared as the result of diversification of organismal lineages. Functional annotation of FSF along these architectural chronologies revealed patterns of discovery of biological function. Most importantly, the analysis identified an early and extensive differential loss of architectures occurring primarily in Archaea that segregates the archaeal lineage from the ancient community of organisms and establishes the first organismal divide. Reconstruction of phylogenomic trees of proteomes reflects the timeline of architectural diversification in the emerging lineages. Thus, Archaea undertook a minimalist strategy using only a small subset of the full architectural repertoire and then crystallized into a diversified superkingdom late in evolution. Our analysis also suggests a communal ancestor to all life that was molecularly complex and adopted genomic strategies currently present in Eukarya. PMID:17908824
Reflections on the Anopheles gambiae genome sequence, transgenic mosquitoes and the prospect for controlling malaria and other vector borne diseases.

PubMed

Tabachnick, Walter J

2003-09-01

The completion of the Anopheles gambiae Giles genome sequencing project is a milestone toward developing more effective strategies in reducing the impact of malaria and other vector borne diseases. The successes in developing transgenic approaches using mosquitoes have provided another essential new tool for further progress in basic vector genetics and the goal of disease control. The use of transgenic approaches to develop refractory mosquitoes is also possible. The ability to use genome sequence to identify genes, and transgenic approaches to construct refractory mosquitoes, has provided the opportunity that, with the future development of an appropriate genetic drive system, refractory transgenes can be released into vector populations, leading to An. gambiae populations incapable of transmitting malaria. This compelling strategy will be very difficult to achieve and will require a broad substantial research program for success. The fundamental information that is required on genome structure, gene function and environmental effects on genetic expression is largely unknown. The ability to predict gene effects on phenotype is rudimentary, particularly in natural populations. As a result, the release of a refractory transgene into natural mosquito populations is imprecise and there is little ability to predict unintended consequences. The new genetic tools at hand provide opportunities to address an array of important issues, many of which can have immediate impact on the effectiveness of a host of strategies to control vector borne disease. Transgenic release approaches represent only one strategy that should be pursued. A balanced research program is required.
Identification and characterization of nuclear genes involved in photosynthesis in Populus

PubMed Central

2014-01-01

Background The gap between the real and potential photosynthetic rate under field conditions suggests that photosynthesis could potentially be improved. Nuclear genes provide possible targets for improving photosynthetic efficiency. Hence, genome-wide identification and characterization of the nuclear genes affecting photosynthetic traits in woody plants would provide key insights on genetic regulation of photosynthesis and identify candidate processes for improvement of photosynthesis. Results Using microarray and bulked segregant analysis strategies, we identified differentially expressed nuclear genes for photosynthesis traits in a segregating population of poplar. We identified 515 differentially expressed genes in this population (FC ≥ 2 or FC ≤ 0.5, P < 0.05), 163 up-regulated and 352 down-regulated. Real-time PCR expression analysis confirmed the microarray data. Singular Enrichment Analysis identified 48 significantly enriched GO terms for molecular functions (28), biological processes (18) and cell components (2). Furthermore, we selected six candidate genes for functional examination by a single-marker association approach, which demonstrated that 20 SNPs in five candidate genes significantly associated with photosynthetic traits, and the phenotypic variance explained by each SNP ranged from 2.3% to 12.6%. This revealed that regulation of photosynthesis by the nuclear genome mainly involves transport, metabolism and response to stimulus functions. Conclusions This study provides new genome-scale strategies for the discovery of potential candidate genes affecting photosynthesis in Populus, and for identification of the functions of genes involved in regulation of photosynthesis. This work also suggests that improving photosynthetic efficiency under field conditions will require the consideration of multiple factors, such as stress responses. PMID:24673936
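The selection rule stated in the Results above (FC ≥ 2 or FC ≤ 0.5, with P < 0.05) can be sketched as a simple filter. This is an illustrative sketch only: the record type, the function name, and the example genes are hypothetical and are not taken from the study, which reported 163 up-regulated and 352 down-regulated genes from its own microarray data.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene_id: str        # hypothetical identifier
    fold_change: float  # expression ratio between the two bulks
    p_value: float


def classify(result: GeneResult,
             fc_up: float = 2.0,
             fc_down: float = 0.5,
             alpha: float = 0.05) -> str:
    """Label a gene 'up', 'down', or 'unchanged' using the thresholds
    quoted in the abstract (FC >= 2 or FC <= 0.5, P < 0.05)."""
    if result.p_value >= alpha:
        return "unchanged"
    if result.fold_change >= fc_up:
        return "up"
    if result.fold_change <= fc_down:
        return "down"
    return "unchanged"


# Made-up example records, for illustration only.
examples = [
    GeneResult("gene_A", 3.1, 0.003),
    GeneResult("gene_B", 0.4, 0.020),
    GeneResult("gene_C", 2.6, 0.200),
    GeneResult("gene_D", 1.2, 0.001),
]
for r in examples:
    print(r.gene_id, classify(r))
```

A gene must pass both the significance test and one of the two fold-change cut-offs, so a large fold change with a non-significant P value is still reported as unchanged.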
GAP Final Technical Report 12-14-04

DOE Office of Scientific and Technical Information (OSTI.GOV)

Andrew J. Bordner, PhD, Senior Research Scientist

2004-12-14

The Genomics Annotation Platform (GAP) was designed to develop new tools for high throughput functional annotation and characterization of protein sequences and structures resulting from genomics and structural proteomics, benchmarking and application of those tools. Furthermore, this platform integrated the genomic scale sequence and structural analysis and prediction tools with the advanced structure prediction and bioinformatics environment of ICM. The development of GAP was primarily oriented towards the annotation of new biomolecular structures using both structural and sequence data. Even though the amount of protein X-ray crystal data is growing exponentially, the volume of sequence data is growing even more rapidly. This trend was exploited by leveraging the wealth of sequence data to provide functional annotation for protein structures. The additional information provided by GAP is expected to assist the majority of the commercial users of ICM, who are involved in drug discovery, in identifying promising drug targets as well as in devising strategies for the rational design of therapeutics directed at the protein of interest. The GAP also provided valuable tools for biochemistry education, and structural genomics centers. In addition, GAP incorporates many novel prediction and analysis methods not available in other molecular modeling packages. This development led to signing the first Molsoft agreement in the structural genomics annotation area with the University of Oxford Structural Genomics Center. This commercial agreement validated the Molsoft efforts under the GAP project and provided the basis for further development of the large scale functional annotation platform.
Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes.

PubMed

Thybert, David; Roller, Maša; Navarro, Fábio C P; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C; Laukaitis, Christina M; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M; Odom, Duncan T; Flicek, Paul

2018-04-01

Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. © 2018 Thybert et al.; Published by Cold Spring Harbor Laboratory Press.
A knowledge base for Vitis vinifera functional analysis.

PubMed

Pulvirenti, Alfredo; Giugno, Rosalba; Distefano, Rosario; Pigola, Giuseppe; Mongiovi, Misael; Giudice, Girolamo; Vendramin, Vera; Lombardo, Alessandro; Cattonaro, Federica; Ferro, Alfredo

2015-01-01

Vitis vinifera (Grapevine) is the most important fruit species in the modern world. Wine and table grapes sales contribute significantly to the economy of major wine producing countries. The most relevant goals in wine production concern quality and safety. In order to significantly improve the achievement of these objectives and to gain biological knowledge about cultivars, a genomic approach is the most reliable strategy. The recent grapevine genome sequencing offers the opportunity to study the potential roles of genes and microRNAs in fruit maturation and other physiological and pathological processes. Although several systems allowing the analysis of plant genomes have been reported, none of them has been designed specifically for the functional analysis of grapevine genomes of cultivars under environmental stress in connection with microRNA data. Here we introduce a novel knowledge base, called BIOWINE, designed for the functional analysis of Vitis vinifera genomes of cultivars present in Sicily. The system allows the analysis of RNA-seq experiments of two different cultivars, namely Nero d'Avola and Nerello Mascalese. Samples were taken under different climatic conditions of phenological phases, diseases, and geographic locations. The BIOWINE web interface is equipped with data analysis modules for grapevine genomes. In particular users may analyze the current genome assembly together with the RNA-seq data through a customized version of GBrowse. The web interface allows users to perform gene set enrichment by exploiting third-party databases. BIOWINE is a knowledge base implementing a set of bioinformatics tools for the analysis of grapevine genomes. The system aims to increase our understanding of the grapevine varieties and species of Sicilian products focusing on adaptability to different climatic conditions, phenological phases, diseases, and geographic locations.
Random mutagenesis of the hyperthermophilic archaeon Pyrococcus furiosus using in vitro mariner transposition and natural transformation.

PubMed

Guschinskaya, Natalia; Brunel, Romain; Tourte, Maxime; Lipscomb, Gina L; Adams, Michael W W; Oger, Philippe; Charpentier, Xavier

2016-11-08

Transposition mutagenesis is a powerful tool to identify the function of genes, reveal essential genes and generally to unravel the genetic basis of living organisms. However, transposon-mediated mutagenesis has only been successfully applied to a limited number of archaeal species and has never been reported in Thermococcales. Here, we report random insertion mutagenesis in the hyperthermophilic archaeon Pyrococcus furiosus. The strategy takes advantage of the natural transformability of derivatives of the P. furiosus COM1 strain and of in vitro Mariner-based transposition. A transposon bearing a genetic marker is randomly transposed in vitro in genomic DNA that is then used for natural transformation of P. furiosus. A small-scale transposition reaction routinely generates several hundred and up to two thousand transformants. Southern analysis and sequencing showed that the obtained mutants contain a single and random genomic insertion. Polyploidy has been reported in Thermococcales and P. furiosus is suspected of being polyploid. Yet, about half of the mutants obtained on the first selection are homozygous for the transposon insertion. Two rounds of isolation on selective medium were sufficient to obtain gene conversion in initially heterozygous mutants. This transposition mutagenesis strategy will greatly facilitate functional exploration of the Thermococcales genomes.
Cre/lox-Recombinase-Mediated Cassette Exchange for Reversible Site-Specific Genomic Targeting of the Disease Vector, Aedes aegypti.

PubMed

Häcker, Irina; Harrell Ii, Robert A; Eichner, Gerrit; Pilitt, Kristina L; O'Brochta, David A; Handler, Alfred M; Schetelig, Marc F

2017-03-07

Site-specific genome modification (SSM) is an important tool for mosquito functional genomics and comparative gene expression studies, which contribute to a better understanding of mosquito biology and are thus a key to finding new strategies to eliminate vector-borne diseases. Moreover, it allows for the creation of advanced transgenic strains for vector control programs. SSM circumvents the drawbacks of transposon-mediated transgenesis, where random transgene integration into the host genome results in insertional mutagenesis and variable position effects. We applied the Cre/lox recombinase-mediated cassette exchange (RMCE) system to Aedes aegypti, the vector of dengue, chikungunya, and Zika viruses. In this context we created four target site lines for RMCE and evaluated their fitness costs. Cre-RMCE is functional in a two-step mechanism and with good efficiency in Ae. aegypti. The advantages of Cre-RMCE over existing site-specific modification systems for Ae. aegypti, phiC31-RMCE and CRISPR, originate in the preservation of the recombination sites, which 1) allows successive modifications and rapid expansion or adaptation of existing systems by repeated targeting of the same site; and 2) provides reversibility, thus allowing the excision of undesired sequences. Thereby, Cre-RMCE complements existing genomic modification tools, adding flexibility and versatility to vector genome targeting.
Efficient genome editing of differentiated renal epithelial cells.

PubMed

Hofherr, Alexis; Busch, Tilman; Huber, Nora; Nold, Andreas; Bohn, Albert; Viau, Amandine; Bienaimé, Frank; Kuehn, E Wolfgang; Arnold, Sebastian J; Köttgen, Michael

2017-02-01

Recent advances in genome editing technologies have enabled the rapid and precise manipulation of genomes, including the targeted introduction, alteration, and removal of genomic sequences. However, respective methods have been described mainly in non-differentiated or haploid cell types. Genome editing of well-differentiated renal epithelial cells has been hampered by a range of technological issues, including optimal design, efficient expression of multiple genome editing constructs, attainable mutation rates, and best screening strategies. Here, we present an easily implementable workflow for the rapid generation of targeted heterozygous and homozygous genomic sequence alterations in renal cells using transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeat (CRISPR) system. We demonstrate the versatility of established protocols by generating novel cellular models for studying autosomal dominant polycystic kidney disease (ADPKD). Furthermore, we show that cell culture-validated genetic modifications can be readily applied to mouse embryonic stem cells (mESCs) for the generation of corresponding mouse models. The described procedure for efficient genome editing can be applied to any cell type to study physiological and pathophysiological functions in the context of precisely engineered genotypes.
Genomics England’s implementation of its public engagement strategy: Blurred boundaries between engagement for the United Kingdom’s 100,000 Genomes project and the need for public support

PubMed Central

Samuel, Gabrielle Natalie; Farsides, Bobbie

2017-01-01

The United Kingdom’s 100,000 Genomes Project has the aim of sequencing 100,000 genomes from National Health Service patients such that whole genome sequencing becomes routine clinical practice. It also has a research-focused goal to provide data for scientific discovery. Genomics England is the limited company established by the Department of Health to deliver the project. As an innovative scientific/clinical venture, it is interesting to consider how Genomics England positions itself in relation to public engagement activities. We set out to explore how individuals working at, or associated with, Genomics England enacted public engagement in practice. Our findings show that individuals offered a narrative in which public engagement performed more than one function. On one side, public engagement was seen as ‘good practice’. On the other, public engagement was presented as core to the project’s success – needed to encourage involvement and ultimately recruitment. We discuss the implications of this in this article. PMID:29241419
PvTFDB: a Phaseolus vulgaris transcription factors database for expediting functional genomics in legumes.

PubMed

Bhawna; Bonthala, V S; Gajula, Mnv Prasad

2016-01-01

The common bean [Phaseolus vulgaris (L.)] is one of the essential proteinaceous vegetables grown in developing countries. However, its production is challenged by low yields caused by numerous biotic and abiotic stress conditions. Regulatory transcription factors (TFs) are a key component of the genome and are the most significant targets for producing stress-tolerant crops; hence, functional genomic studies of these TFs are important. Therefore, here we have constructed a web-accessible TF database for P. vulgaris, called PvTFDB, which contains 2370 putative TF gene models in 49 TF families. This database provides comprehensive information for each identified TF, including sequence data, functional annotation, SSRs with their primer sets, protein physical properties, chromosomal location, phylogeny, tissue-specific gene expression data, orthologues, cis-regulatory elements and gene ontology (GO) assignment. Altogether, this information can be used to expedite functional genomic studies of specific TFs of interest. The objectives of this database are to support functional genomics studies of common bean TFs and to help identify the regulatory mechanisms underlying various stress responses, easing breeding strategies for variety production. It offers search interfaces, including by gene ID and by functional annotation, and browsing interfaces, including by family and by chromosome. This database will also serve as a promising central repository for researchers as well as breeders who are working towards improvement of legume crops. In addition, this database provides unrestricted public access, and users can freely download all data present in the database. Database URL: http://www.multiomics.in/PvTFDB/. © The Author(s) 2016. Published by Oxford University Press.
A strategy for implementing genomics into nursing practice informed by three behaviour change theories.

PubMed

Leach, Verity; Tonkin, Emma; Lancastle, Deborah; Kirk, Maggie

2016-06-01

Genomics is an ever increasing aspect of nursing practice, with focus being directed towards improving health. The authors present an implementation strategy for the incorporation of genomics into nursing practice within the UK, based on three behaviour change theories and the identification of individuals who are likely to provide support for change. Individuals identified as Opinion Leaders and Adopters of genomics illustrate how changes in behaviour might occur among the nursing profession. The core philosophy of the strategy is that genomic nurse Adopters and Opinion Leaders who have direct interaction with their peers in practice will be best placed to highlight the importance of genomics within the nursing role. The strategy discussed in this paper provides scope for continued nursing education and development of genomics within nursing practice on a larger scale. The recommendations might be of particular relevance for senior staff and management. © 2016 John Wiley & Sons Australia, Ltd.
Genomics and transcriptomics in drug discovery.

PubMed

Dopazo, Joaquin

2014-02-01

The popularization of genomic high-throughput technologies is causing a revolution in biomedical research and, particularly, is transforming the field of drug discovery. Systems biology offers a framework to understand the extensive human genetic heterogeneity revealed by genomic sequencing in the context of the network of functional, regulatory and physical protein-drug interactions. Thus, approaches to find biomarkers and therapeutic targets will have to take into account the complex system nature of the relationships of the proteins with the disease. Pharmaceutical companies will have to reorient their drug discovery strategies considering the human genetic heterogeneity. Consequently, modeling and computational data analysis will have an increasingly important role in drug discovery. Copyright © 2013 Elsevier Ltd. All rights reserved.
VCGDB: a dynamic genome database of the Chinese population

PubMed Central

2014-01-01

Background The data released by the 1000 Genomes Project contain an increasing number of genome sequences from different nations and populations with a large number of genetic variations. As a result, the focus of human genome studies is changing from single and static to complex and dynamic. The currently available human reference genome (GRCh37) is based on sequencing data from 13 anonymous Caucasian volunteers, which might limit the scope of genomics, transcriptomics, epigenetics, and genome wide association studies. Description We used the massive amount of sequencing data published by the 1000 Genomes Project Consortium to construct the Virtual Chinese Genome Database (VCGDB), a dynamic genome database of the Chinese population based on the whole genome sequencing data of 194 individuals. VCGDB provides dynamic genomic information, which contains 35 million single nucleotide variations (SNVs), 0.5 million insertions/deletions (indels), and 29 million rare variations, together with genomic annotation information. VCGDB also provides a highly interactive user-friendly virtual Chinese genome browser (VCGBrowser) with functions like seamless zooming and real-time searching. In addition, we have established three population-specific consensus Chinese reference genomes that are compatible with mainstream alignment software. Conclusions VCGDB offers a feasible strategy for processing big data to keep pace with the biological data explosion by providing a robust resource for genomics studies; in particular, studies aimed at finding regions of the genome associated with diseases. PMID:24708222
A functional genomics strategy reveals Rora as a component of the mammalian circadian clock.

PubMed

Sato, Trey K; Panda, Satchidananda; Miraglia, Loren J; Reyes, Teresa M; Rudic, Radu D; McNamara, Peter; Naik, Kinnery A; FitzGerald, Garret A; Kay, Steve A; Hogenesch, John B

2004-08-19

The mammalian circadian clock plays an integral role in timing rhythmic physiology and behavior, such as locomotor activity, with anticipated daily environmental changes. The master oscillator resides within the suprachiasmatic nucleus (SCN), which can maintain circadian rhythms in the absence of synchronizing light input. Here, we describe a genomics-based approach to identify circadian activators of Bmal1, itself a key transcriptional activator that is necessary for core oscillator function. Using cell-based functional assays, as well as behavioral and molecular analyses, we identified Rora as an activator of Bmal1 transcription within the SCN. Rora is required for normal Bmal1 expression and consolidation of daily locomotor activity and is regulated by the core clock in the SCN. These results suggest that opposing activities of the orphan nuclear receptors Rora and Rev-erb alpha, which represses Bmal1 expression, are important in the maintenance of circadian clock function.
Dereplication, Aggregation and Scoring Tool (DAS Tool) v1.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

SIEBER, CHRISTIAN

Communities of uncultivated microbes are critical to ecosystem function and microorganism health, and a key objective of metagenomic studies is to analyze organism-specific metabolic pathways and reconstruct community interaction networks. This requires accurate assignment of genes to genomes, yet existing binning methods often fail to predict a reasonable number of genomes and report many bins of low quality and completeness. Furthermore, the performance of existing algorithms varies between samples and biotypes. Here, we present a dereplication, aggregation and scoring strategy, DAS Tool, that combines the strengths of a flexible set of established binning algorithms. DAS Tool applied to a constructed community generated more accurate bins than any automated method. Further, when applied to samples of different complexity, including soil, natural oil seeps, and the human gut, DAS Tool recovered substantially more near-complete genomes than any single binning method alone. Included were three genomes from a novel lineage. The ability to reconstruct many near-complete genomes from metagenomics data will greatly advance genome-centric analyses of ecosystems.
Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula.

PubMed

Moll, Karen M; Zhou, Peng; Ramaraj, Thiruvarangan; Fajardo, Diego; Devitt, Nicholas P; Sadowsky, Michael J; Stupar, Robert M; Tiffin, Peter; Miller, Jason R; Young, Nevin D; Silverstein, Kevin A T; Mudge, Joann

2017-08-04

Third generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner. Here, we present high quality genome assemblies of the model legume plant, Medicago truncatula (R108) using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly. Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly. This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies.
Inducible CRISPR genome-editing tool: classifications and future trends.

PubMed

Dai, Xiaofeng; Chen, Xiao; Fang, Qiuwu; Li, Jia; Bai, Zhonghu

2018-06-01

The discovery of the CRISPR-Cas9/dCas9 system has strengthened our capabilities in genome engineering and revolutionized the field. While Cas9 and dCas9 are programmed to modulate gene expression by introducing DNA breaks, blocking transcription factor recruitment or dragging functional groups towards the targeted sites, sgRNAs determine the genomic loci where the modulation occurs. The off-target problem, due to limited sgRNA specificity and the genome complexity of many species, has raised concerns about the wide application of this revolutionary technique. To solve this problem and, more importantly, gain power over gene functionality and cell fate control, inducible strategies have continuously evolved to offer tailored solutions to specific biological questions. By reviewing recent advances in inducible CRISPR system design and critical elements potentially adding value to such systems, we classify current approaches in this domain into four mechanistically distinct categories, namely, "split system", "allosteric system", "combinatorial system", and "transient delivery system", discuss the pros and cons of each system, and point out the under-explored areas and future directions, with the aim of enriching our toolbox of delicate life engineering.
Integration of biological data by kernels on graph nodes allows prediction of new genes involved in mitotic chromosome condensation

PubMed Central

Hériché, Jean-Karim; Lees, Jon G.; Morilla, Ian; Walter, Thomas; Petrova, Boryana; Roberti, M. Julia; Hossain, M. Julius; Adler, Priit; Fernández, José M.; Krallinger, Martin; Haering, Christian H.; Vilo, Jaak; Valencia, Alfonso; Ranea, Juan A.; Orengo, Christine; Ellenberg, Jan

2014-01-01

The advent of genome-wide RNA interference (RNAi)–based screens puts us in the position to identify genes for all functions human cells carry out. However, for many functions, assay complexity and cost make genome-scale knockdown experiments impossible. Methods to predict genes required for cell functions are therefore needed to focus RNAi screens from the whole genome on the most likely candidates. Although different bioinformatics tools for gene function prediction exist, they lack experimental validation and are therefore rarely used by experimentalists. To address this, we developed an effective computational gene selection strategy that represents public data about genes as graphs and then analyzes these graphs using kernels on graph nodes to predict functional relationships. To demonstrate its performance, we predicted human genes required for a poorly understood cellular function—mitotic chromosome condensation—and experimentally validated the top 100 candidates with a focused RNAi screen by automated microscopy. Quantitative analysis of the images demonstrated that the candidates were indeed strongly enriched in condensation genes, including the discovery of several new factors. By combining bioinformatics prediction with experimental validation, our study shows that kernels on graph nodes are powerful tools to integrate public biological data and predict genes involved in cellular functions of interest. PMID:24943848
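The idea of ranking candidate genes with a kernel on graph nodes can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the toy gene names and edges, the choice of a truncated von Neumann diffusion kernel, and the decay parameter `alpha`. The abstract does not state which kernels or data sources the authors used.

```python
# Toy gene graph; gene names and edges are hypothetical.
genes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]


def adjacency(nodes, edge_list):
    """Build a symmetric 0/1 adjacency matrix as a list of lists."""
    index = {g: i for i, g in enumerate(nodes)}
    n = len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    for u, v in edge_list:
        adj[index[u]][index[v]] = 1.0
        adj[index[v]][index[u]] = 1.0
    return adj


def matmul(x, y):
    """Multiply two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def truncated_von_neumann_kernel(adj, alpha=0.3, steps=6):
    """K = sum over k = 0..steps of alpha^k * A^k (assumed kernel choice)."""
    n = len(adj)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    kernel = [[0.0] * n for _ in range(n)]
    weight = 1.0
    for _ in range(steps + 1):
        for i in range(n):
            for j in range(n):
                kernel[i][j] += weight * power[i][j]
        power = matmul(power, adj)
        weight *= alpha
    return kernel


def rank_candidates(nodes, kernel, known):
    """Rank genes outside `known` by summed kernel similarity to known genes."""
    index = {g: i for i, g in enumerate(nodes)}
    scores = {g: sum(kernel[index[g]][index[s]] for s in known)
              for g in nodes if g not in known}
    return sorted(scores, key=scores.get, reverse=True)


kernel = truncated_von_neumann_kernel(adjacency(genes, edges))
ranking = rank_candidates(genes, kernel, known={"A"})
print(ranking)
```

In this toy chain, genes closer to the known gene receive higher scores, which is the behavior one would want when shortlisting candidates for a focused screen.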

MTAP deletion confers enhanced dependency on the PRMT5 arginine methyltransferase in cancer cells | Office of Cancer Genomics

Cancer.gov

The discovery of cancer dependencies has the potential to inform therapeutic strategies and to identify putative drug targets. Integrating data from comprehensive genomic profiling of cancer cell lines and from functional characterization of cancer cell dependencies, we discovered that loss of the enzyme methylthioadenosine phosphorylase (MTAP) confers a selective dependence on protein arginine methyltransferase 5 (PRMT5) and its binding partner WDR77. MTAP is frequently lost due to its proximity to the commonly deleted tumor suppressor gene, CDKN2A.
Advances in sarcoma genomics and new therapeutic targets

PubMed Central

Taylor, Barry S.; Barretina, Jordi; Maki, Robert G.; Antonescu, Cristina R.; Singer, Samuel; Ladanyi, Marc

2012-01-01

Preface Increasingly, human mesenchymal malignancies are classified by the abnormalities that drive their pathogenesis. While many of these aberrations are highly prevalent within particular sarcoma subtypes, few are currently targeted therapeutically. Indeed, most subtypes of sarcoma are still treated with traditional therapeutic modalities and in many cases are resistant to adjuvant therapies. In this Review, we discuss the core molecular determinants of sarcomagenesis and emphasize the emerging genomic and functional genetic approaches that, coupled to novel therapeutic strategies, have the potential to transform the care of patients with sarcoma. PMID:21753790
Trinity: Transcriptome Assembly for Genetic and Functional Analysis of Cancer | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

The cancer transcriptome is shaped by genetic changes, variation in gene transcription, mRNA processing, editing and stability, and the cancer microbiome. Deciphering this variation and understanding its implications on tumorigenesis requires sophisticated computational analyses. Most RNA-Seq analyses rely on methods that first map short reads to a reference genome, and then compare them to annotated transcripts or assemble them. However, this strategy can be limited when the cancer genome is substantially different than the reference or for detecting sequences from the cancer microbiome.
Observing copepods through a genomic lens

PubMed Central

2011-01-01

Background Copepods outnumber every other multicellular animal group. They are critical components of the world's freshwater and marine ecosystems, sensitive indicators of local and global climate change, key ecosystem service providers, parasites and predators of economically important aquatic animals and potential vectors of waterborne disease. Copepods sustain the world fisheries that nourish and support human populations. Although genomic tools have transformed many areas of biological and biomedical research, their power to elucidate aspects of the biology, behavior and ecology of copepods has only recently begun to be exploited. Discussion The extraordinary biological and ecological diversity of the subclass Copepoda provides both unique advantages for addressing key problems in aquatic systems and formidable challenges for developing a focused genomics strategy. This article provides an overview of genomic studies of copepods and discusses strategies for using genomics tools to address key questions at levels extending from individuals to ecosystems. Genomics can, for instance, help to decipher patterns of genome evolution such as those that occur during transitions from free living to symbiotic and parasitic lifestyles and can assist in the identification of genetic mechanisms and accompanying physiological changes associated with adaptation to new or physiologically challenging environments. The adaptive significance of the diversity in genome size and unique mechanisms of genome reorganization during development could similarly be explored. 
Genome-wide and EST studies of parasitic copepods of salmon and large EST studies of selected free-living copepods have demonstrated the potential utility of modern genomics approaches for the study of copepods and have generated resources such as EST libraries, shotgun genome sequences, BAC libraries, genome maps and inbred lines that will be invaluable in assisting further efforts to provide genomics tools for copepods. Summary Genomics research on copepods is needed to extend our exploration and characterization of their fundamental biological traits, so that we can better understand how copepods function and interact in diverse environments. Availability of large scale genomics resources will also open doors to a wide range of systems biology type studies that view the organism as the fundamental system in which to address key questions in ecology and evolution. PMID:21933388
Prunus transcription factors: breeding perspectives

PubMed Central

Bianchi, Valmor J.; Rubio, Manuel; Trainotti, Livio; Verde, Ignazio; Bonghi, Claudio; Martínez-Gómez, Pedro

2015-01-01

Many plant processes depend on differential gene expression, which is generally controlled by complex proteins called transcription factors (TFs). In peach, 1533 TFs have been identified, accounting for about 5.5% of the 27,852 protein-coding genes. These TFs are the reference for the rest of the Prunus species. TF studies in Prunus have been performed on the gene expression analysis of different agronomic traits, including control of the flowering process, fruit quality, and biotic and abiotic stress resistance. These studies, using quantitative RT-PCR, have mainly been performed in peach, and to a lesser extent in other species, including almond, apricot, black cherry, Fuji cherry, Japanese apricot, plum, and sour and sweet cherry. Other tools have also been used in TF studies, including cDNA-AFLP, LC-ESI-MS, RNA, and DNA blotting or mapping. More recently, new tools assayed include microarray and high-throughput DNA sequencing (DNA-Seq) and RNA sequencing (RNA-Seq). New functional genomics opportunities include genome resequencing and the well-known synteny among Prunus genomes and transcriptomes. These new functional studies should be applied in breeding programs in the development of molecular markers. With the genome sequences available, some strategies that have been used in model systems (such as SNP genotyping assays and genotyping-by-sequencing) may be applicable in the functional analysis of Prunus TFs as well. In addition, the knowledge of the gene functions and position in the peach reference genome of the TFs represents an additional advantage. These facts could greatly facilitate the isolation of genes via QTL (quantitative trait loci) map-based cloning in the different Prunus species, following the association of these TFs with the identified QTLs using the peach reference genome. PMID:26124770
[Advances in microbial genome reduction and modification].

PubMed

Wang, Jianli; Wang, Xiaoyuan

2013-08-01

Microbial genome reduction and modification are important strategies for constructing cellular chassis used for synthetic biology. This article summarized the essential genes and the methods to identify them in microorganisms, compared various strategies for microbial genome reduction, and analyzed the characteristics of some microorganisms with the minimized genome. This review shows the important role of genome reduction in constructing cellular chassis.
Sequence- and Structure-Based Functional Annotation and Assessment of Metabolic Transporters in Aspergillus oryzae: A Representative Case Study

PubMed Central

Raethong, Nachon; Wong-ekkabut, Jirasak; Laoteng, Kobkul; Vongsangnak, Wanwipa

2016-01-01

Aspergillus oryzae is widely used for the industrial production of enzymes. In A. oryzae metabolism, transporters appear to play crucial roles in controlling the flux of molecules for energy generation, nutrients delivery, and waste elimination in the cell. While the A. oryzae genome sequence is available, transporter annotation remains limited and thus the connectivity of metabolic networks is incomplete. In this study, we developed a metabolic annotation strategy to understand the relationship between the sequence, structure, and function for annotation of A. oryzae metabolic transporters. Sequence-based analysis with manual curation showed that 58 genes of 12,096 total genes in the A. oryzae genome encoded metabolic transporters. Under consensus integrative databases, 55 unambiguous metabolic transporter genes were distributed into channels and pores (7 genes), electrochemical potential-driven transporters (33 genes), and primary active transporters (15 genes). To reveal the transporter functional role, a combination of homology modeling and molecular dynamics simulation was implemented to assess the relationship between sequence to structure and structure to function. As in the energy metabolism of A. oryzae, the H+-ATPase encoded by the AO090005000842 gene was selected as a representative case study of multilevel linkage annotation. Our developed strategy can be used for enhancing metabolic network reconstruction. PMID:27274991
Integrated platform for genome-wide screening and construction of high-density genetic interaction maps in mammalian cells

PubMed Central

Kampmann, Martin; Bassik, Michael C.; Weissman, Jonathan S.

2013-01-01

A major challenge of the postgenomic era is to understand how human genes function together in normal and disease states. In microorganisms, high-density genetic interaction (GI) maps are a powerful tool to elucidate gene functions and pathways. We have developed an integrated methodology based on pooled shRNA screening in mammalian cells for genome-wide identification of genes with relevant phenotypes and systematic mapping of all GIs among them. We recently demonstrated the potential of this approach in an application to pathways controlling the susceptibility of human cells to the toxin ricin. Here we present the complete quantitative framework underlying our strategy, including experimental design, derivation of quantitative phenotypes from pooled screens, robust identification of hit genes using ultra-complex shRNA libraries, parallel measurement of tens of thousands of GIs from a single double-shRNA experiment, and construction of GI maps. We describe the general applicability of our strategy. Our pooled approach enables rapid screening of the same shRNA library in different cell lines and under different conditions to determine a range of different phenotypes. We illustrate this strategy here for single- and double-shRNA libraries. We compare the roles of genes for susceptibility to ricin and Shiga toxin in different human cell lines and reveal both toxin-specific and cell line-specific pathways. We also present GI maps based on growth and ricin-resistance phenotypes, and we demonstrate how such a comparative GI mapping strategy enables functional dissection of physical complexes and context-dependent pathways. PMID:23739767
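A genetic interaction score can be sketched as the deviation of a double-knockdown phenotype from what the two single knockdowns would predict. The Python sketch below assumes a simple additive expectation and uses made-up gene names and phenotype values; the abstract does not give the authors' actual formulas, so this is only one common way to define such a score.

```python
def genetic_interaction(single_a, single_b, double_ab):
    """GI score: observed double phenotype minus the additive expectation.

    The additive model is an assumption chosen for illustration.
    """
    expected = single_a + single_b
    return double_ab - expected


def gi_map(single, double):
    """Compute a GI score for every gene pair with a double-knockdown phenotype."""
    return {
        pair: genetic_interaction(single[pair[0]], single[pair[1]], phenotype)
        for pair, phenotype in double.items()
    }


# Hypothetical phenotypes (for example, log-scale resistance scores).
single_phenotypes = {"geneX": -0.5, "geneY": -0.25, "geneZ": 0.25}
double_phenotypes = {
    ("geneX", "geneY"): -0.75,  # matches the additive expectation
    ("geneX", "geneZ"): 0.5,    # more resistant than expected
    ("geneY", "geneZ"): -0.5,   # more sensitive than expected
}

gi_scores = gi_map(single_phenotypes, double_phenotypes)
for pair, score in gi_scores.items():
    print(pair, score)
```

Scores near zero suggest the two genes act independently under the assumed model, while positive or negative scores flag pairs worth placing close together or far apart in a GI map.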
Strategies and tools for whole genome alignments

DOE Office of Scientific and Technical Information (OSTI.GOV)

Couronne, Olivier; Poliakov, Alexander; Bray, Nicolas

2002-11-25

The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for different quality assemblies. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90 percent of known coding exons in the human genome. We have obtained such coverage while preserving specificity. With a view towards the end user, we have developed a suite of tools and websites for automatically aligning, and subsequently browsing and working with whole genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.
Silicon Era of Carbon-Based Life: Application of Genomics and Bioinformatics in Crop Stress Research

PubMed Central

Li, Man-Wah; Qi, Xinpeng; Ni, Meng; Lam, Hon-Ming

2013-01-01

Abiotic and biotic stresses lead to massive reprogramming of different life processes and are the major limiting factors hampering crop productivity. Omics-based research platforms allow for a holistic and comprehensive survey on crop stress responses and hence may bring forth better crop improvement strategies. Since high-throughput approaches generate considerable amounts of data, bioinformatics tools will play an essential role in storing, retrieving, sharing, processing, and analyzing them. Genomic and functional genomic studies in crops still lag far behind similar studies in humans and other animals. In this review, we summarize some useful genomics and bioinformatics resources available to crop scientists. In addition, we also discuss the major challenges and advancements in the “-omics” studies, with an emphasis on their possible impacts on crop stress research and crop improvement. PMID:23759993
Genomically Encoded Analog Memory with Precise In vivo DNA Writing in Living Cell Populations

PubMed Central

Farzadfard, Fahim; Lu, Timothy K.

2014-01-01

Cellular memory is crucial to many natural biological processes and for sophisticated synthetic-biology applications. Existing cellular memories rely on epigenetic switches or recombinases, which are limited in scalability and recording capacity. Here, we use the DNA of living cell populations as genomic ‘tape recorders’ for the analog and distributed recording of long-term event histories. We describe a platform for generating single-stranded DNA (ssDNA) in vivo in response to arbitrary transcriptional signals. When co-expressed with a recombinase, these intracellularly expressed ssDNAs target specific genomic DNA addresses, resulting in precise mutations that accumulate in cell populations as a function of the magnitude and duration of the inputs. This platform could enable long-term cellular recorders for environmental and biomedical applications, biological state machines, and enhanced genome engineering strategies. PMID:25395541
Pan-genome analysis of human gastric pathogen H. pylori: comparative genomics and pathogenomics approaches to identify regions associated with pathogenicity and prediction of potential core therapeutic targets.

PubMed

Ali, Amjad; Naz, Anam; Soares, Siomar C; Bakhtiar, Marriam; Tiwari, Sandeep; Hassan, Syed S; Hanan, Fazal; Ramos, Rommel; Pereira, Ulisses; Barh, Debmalya; Figueiredo, Henrique César Pereira; Ussery, David W; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

2015-01-01

Helicobacter pylori is a human gastric pathogen implicated as the major cause of peptic ulcer and second leading cause of gastric cancer (~70%) around the world. Conversely, an increased resistance to antibiotics and hindrances in the development of vaccines against H. pylori are observed. Pan-genome analyses of the global representative H. pylori isolates consisting of 39 complete genomes are presented in this paper. Phylogenetic analyses have revealed close relationships among geographically diverse strains of H. pylori. The conservation among these genomes was further analyzed by a pan-genome approach; the predicted conserved gene families (1,193) constitute ~77% of the average H. pylori genome and 45% of the global gene repertoire of the species. Reverse vaccinology strategies have been adopted to identify and narrow down the potential core-immunogenic candidates. A total of 28 non-host homolog proteins were characterized as universal therapeutic targets against H. pylori based on their functional annotation and protein-protein interaction. Finally, pathogenomics and genome plasticity analysis revealed 3 highly conserved and 2 highly variable putative pathogenicity islands in all of the H. pylori genomes analyzed.
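The core and pan-genome figures quoted above come down to set operations over gene families. The following Python sketch shows the calculation on three hypothetical strains with made-up family identifiers. It assumes gene families have already been clustered, which is the hard part in practice and is not shown here.

```python
def pan_and_core(genomes):
    """Return (pan-genome, core genome) as sets of gene-family identifiers."""
    families = [set(f) for f in genomes.values()]
    pan = set().union(*families)          # families found in at least one genome
    core = set.intersection(*families)    # families found in every genome
    return pan, core


# Hypothetical strains and gene families, for illustration only.
genomes = {
    "strain1": {"fam1", "fam2", "fam3", "fam4"},
    "strain2": {"fam1", "fam2", "fam3", "fam5"},
    "strain3": {"fam1", "fam2", "fam6"},
}

pan, core = pan_and_core(genomes)
mean_genome_size = sum(len(f) for f in genomes.values()) / len(genomes)
core_share_of_average_genome = len(core) / mean_genome_size
core_share_of_pan = len(core) / len(pan)

print(sorted(core), len(pan))
print(round(core_share_of_average_genome, 3), round(core_share_of_pan, 3))
```

The two ratios correspond to the two percentages reported in the abstract: the core as a share of the average genome and as a share of the global gene repertoire.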
From data to function: functional modeling of poultry genomics data.

PubMed

McCarthy, F M; Lyons, E

2013-09-01

One of the challenges of functional genomics is to create a better understanding of the biological system being studied so that the data produced are leveraged to provide gains for agriculture, human health, and the environment. Functional modeling enables researchers to make sense of these data as it reframes a long list of genes or gene products (mRNA, ncRNA, and proteins) by grouping based upon function, be it individual molecular functions or interactions between these molecules or broader biological processes, including metabolic and signaling pathways. However, poultry researchers have been hampered by a lack of functional annotation data, tools, and training to use these data and tools. Moreover, this lack is becoming more critical as new sequencing technologies enable us to generate data not only for an increasingly diverse range of species but also individual genomes and populations of individuals. We discuss the impact of these new sequencing technologies on poultry research, with a specific focus on what functional modeling resources are available for poultry researchers. We also describe key strategies for researchers who wish to functionally model their own data, providing background information about functional modeling approaches, the data and tools to support these approaches, and the strengths and limitations of each. Specifically, we describe methods for functional analysis using Gene Ontology (GO) functional summaries, functional enrichment analysis, and pathways and network modeling. 
As annotation efforts begin to provide the fundamental data that underpin poultry functional modeling (such as improved gene identification, standardized gene nomenclature, temporal and spatial expression data and gene product function), tool developers are incorporating these data into new and existing tools that are used for functional modeling, and cyberinfrastructure is being developed to provide the necessary extendibility and scalability for storing and analyzing these data. This process will support the efforts of poultry researchers to make sense of their functional genomics data sets, and we provide here a starting point for researchers who wish to take advantage of these tools.
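Functional enrichment analysis of a gene list against a GO term is commonly framed as a one-sided hypergeometric test. The Python sketch below shows that calculation using only the standard library; the counts are invented for illustration, and the abstract does not say which statistical test any particular poultry tool applies.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) for a hypergeometric draw.

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     size of the gene list being tested
    hits:       genes in the list that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Invented example: 20 background genes, 5 annotated, list of 4 with 2 hits.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, hits=2)
print(p_value)
```

When many GO terms are tested at once, the resulting p-values would normally be corrected for multiple testing before interpretation.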
Cyclophilin B is a functional regulator of hepatitis C virus RNA polymerase.

PubMed

Watashi, Koichi; Ishii, Naoto; Hijikata, Makoto; Inoue, Daisuke; Murata, Takayuki; Miyanari, Yusuke; Shimotohno, Kunitada

2005-07-01

Viruses depend on host-derived factors for their efficient genome replication. Here, we demonstrate that a cellular peptidyl-prolyl cis-trans isomerase (PPIase), cyclophilin B (CyPB), is critical for the efficient replication of the hepatitis C virus (HCV) genome. CyPB interacted with the HCV RNA polymerase NS5B to directly stimulate its RNA binding activity. Both the RNA interference (RNAi)-mediated reduction of endogenous CyPB expression and the induced loss of NS5B binding to CyPB decreased the levels of HCV replication. Thus, CyPB functions as a stimulatory regulator of NS5B in HCV replication machinery. This regulation mechanism for viral replication identifies CyPB as a target for antiviral therapeutic strategies.
Simultaneous non-contiguous deletions using large synthetic DNA and site-specific recombinases

PubMed Central

Krishnakumar, Radha; Grose, Carissa; Haft, Daniel H.; Zaveri, Jayshree; Alperovich, Nina; Gibson, Daniel G.; Merryman, Chuck; Glass, John I.

2014-01-01

Toward achieving rapid and large scale genome modification directly in a target organism, we have developed a new genome engineering strategy that uses a combination of bioinformatics aided design, large synthetic DNA and site-specific recombinases. Using Cre recombinase we swapped a target 126-kb segment of the Escherichia coli genome with a 72-kb synthetic DNA cassette, thereby effectively eliminating over 54 kb of genomic DNA from three non-contiguous regions in a single recombination event. We observed complete replacement of the native sequence with the modified synthetic sequence through the action of the Cre recombinase and no competition from homologous recombination. Because of the versatility and high-efficiency of the Cre-lox system, this method can be used in any organism where this system is functional as well as adapted to use with other highly precise genome engineering systems. Compared to present-day iterative approaches in genome engineering, we anticipate this method will greatly speed up the creation of reduced, modularized and optimized genomes through the integration of deletion analyses data, transcriptomics, synthetic biology and site-specific recombination. PMID:24914053
MIPS: a database for genomes and protein sequences.

PubMed Central

Mewes, H W; Heumann, K; Kaps, A; Mayer, K; Pfeiffer, F; Stocker, S; Frishman, D

1999-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried near Munich, Germany, develops and maintains genome oriented databases. It is commonplace that the amount of sequence data available increases rapidly, but not the capacity of qualified manual annotation at the sequence databases. Therefore, our strategy aims to cope with the data stream by the comprehensive application of analysis tools to sequences of complete genomes, the systematic classification of protein sequences and the active support of sequence analysis and functional genomics projects. This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). MIPS provides access through its WWW server (http://www.mips.biochem.mpg.de) to a spectrum of generic databases, including the above mentioned as well as a database of protein families (PROTFAM), the MITOP database, and the all-against-all FASTA database. PMID:9847138
Identification of Thiotetronic Acid Antibiotic Biosynthetic Pathways by Target-directed Genome Mining.

PubMed

Tang, Xiaoyu; Li, Jie; Millán-Aguiñaga, Natalie; Zhang, Jia Jia; O'Neill, Ellis C; Ugalde, Juan A; Jensen, Paul R; Mantovani, Simone M; Moore, Bradley S

2015-12-18

Recent genome sequencing efforts have led to the rapid accumulation of uncharacterized or "orphaned" secondary metabolic biosynthesis gene clusters (BGCs) in public databases. This increase in DNA-sequenced big data has given rise to significant challenges in the applied field of natural product genome mining, including (i) how to prioritize the characterization of orphan BGCs and (ii) how to rapidly connect genes to biosynthesized small molecules. Here, we show that by correlating putative antibiotic resistance genes that encode target-modified proteins with orphan BGCs, we predict the biological function of pathway specific small molecules before they have been revealed in a process we call target-directed genome mining. By querying the pan-genome of 86 Salinispora bacterial genomes for duplicated house-keeping genes colocalized with natural product BGCs, we prioritized an orphan polyketide synthase-nonribosomal peptide synthetase hybrid BGC (tlm) with a putative fatty acid synthase resistance gene. We employed a new synthetic double-stranded DNA-mediated cloning strategy based on transformation-associated recombination to efficiently capture tlm and the related ttm BGCs directly from genomic DNA and to heterologously express them in Streptomyces hosts. We show the production of a group of unusual thiotetronic acid natural products, including the well-known fatty acid synthase inhibitor thiolactomycin that was first described over 30 years ago, yet never at the genetic level in regards to biosynthesis and autoresistance. This finding not only validates the target-directed genome mining strategy for the discovery of antibiotic producing gene clusters without a priori knowledge of the molecule synthesized but also paves the way for the investigation of novel enzymology involved in thiotetronic acid natural product biosynthesis.
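The prioritization step described above, looking for duplicated housekeeping genes that sit inside a biosynthetic gene cluster, can be sketched as a simple coordinate filter. In the Python sketch below, the record layouts, gene and cluster identifiers, family names and coordinates are all hypothetical; the abstract does not describe the authors' actual query implementation.

```python
from collections import Counter


def resistance_candidates(genes, clusters, housekeeping, window=0):
    """Find duplicated housekeeping genes located inside a BGC.

    genes:    list of (gene_id, family, contig, start, end)
    clusters: list of (cluster_id, contig, start, end)
    window:   optional padding (bp) around each cluster boundary
    """
    # Count copies per housekeeping family across the genome.
    copies = Counter(family for _, family, _, _, _ in genes
                     if family in housekeeping)
    hits = []
    for gene_id, family, contig, start, end in genes:
        if copies[family] < 2:
            continue  # not a housekeeping family, or present in one copy only
        for cluster_id, c_contig, c_start, c_end in clusters:
            inside = start >= c_start - window and end <= c_end + window
            if contig == c_contig and inside:
                hits.append((cluster_id, gene_id, family))
    return hits


# Hypothetical annotations for illustration.
genes = [
    ("g1", "fabF", "contig1", 1000, 2200),
    ("g2", "fabF", "contig2", 50500, 51700),
    ("g3", "rpoB", "contig1", 8000, 11500),
    ("g4", "pks", "contig2", 40000, 50000),
]
clusters = [("bgc_example", "contig2", 38000, 60000)]

candidates = resistance_candidates(genes, clusters, housekeeping={"fabF", "rpoB"})
print(candidates)
```

A second copy of a housekeeping gene inside a cluster is treated here as a hint that the cluster's product may target that gene's protein, which is the reasoning the abstract describes.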
Immune subversion by chromatin manipulation: a 'new face' of host-bacterial pathogen interaction.

PubMed

Arbibe, Laurence

2008-08-01

Bacterial pathogens have evolved various strategies to avoid immune surveillance, depending on their in vivo 'lifestyle'. The identification of a few bacterial effectors capable of entering the nucleus and modifying chromatin structure in the host raises the fascinating questions of how pathogens modulate chromatin structure and why. Chromatin is a dynamic structure that maintains the stability and accessibility of the host DNA genome to the transcription machinery. This review describes the various strategies used by pathogens to interface with host chromatin. In some cases, chromatin injury can be a strategy to take control of major cellular functions, such as the cell cycle. In other cases, manipulation of chromatin structure at specific genomic locations by modulating epigenetic information provides a way for the pathogen to impose its own transcriptional signature onto host cells. This emerging field should strongly influence our understanding of chromatin regulation in the interphase nucleus and may provide invaluable openings to the control of immune gene expression in inflammatory and infectious diseases.
Managing microbial communities for sequentially reconstruct genomes from complex metagenomes

NASA Astrophysics Data System (ADS)

Delmont, Tom O.; Vogel, Timothy M.; Simonet, Pascal

2013-04-01

Global understanding of environmental microbial communities is currently limited by the bottleneck of genome reconstruction. Soil is a typical example where individual cells are currently mostly uncultured and metagenomic datasets unassembled. In this study, the microbial community composition of a natural grassland soil was managed under several controlled selective pressures to test a "multi-evenness" stratagem for sequentially reconstructing genomes from a complex metagenome. While poorly represented in the natural community, several newly dominant genomes (an enrichment attaining 10^5 in some cases) were successfully reconstructed under various "harsh" tested conditions. These genomes belong to several genera including (but not restricted to) Leifsonia, Rhodanobacter, Bacillus, Ktedonobacter, Xanthomonas, Streptomyces and Burkholderia. So far, from 10 to 78% of generated metagenomic datasets were reconstructed, providing access to more than 88,000 genes of known or unknown function and to their genetic environment. Adaptive genes directly related to selective pressures were found, mostly in large plasmids. Functions of potential industrial interest (e.g., novel polyketide synthase modules in Streptomyces) were also discovered. Furthermore, an important phage infection snapshot (>1500X coverage for the most represented phage) was observed among the Streptomyces population (three distinct genomes reconstructed) of a particular enrichment (mercury, 0.02 g/kg) during the fourth month of incubation. This "divide and conquer" strategy could be applied to other environments and combined with auxiliary sequencing approaches such as single-cell sequencing to detect, connect and mine taxa and functions of interest while creating an extensive set of reference genomes from across the planet. The next limit could turn out to be our imagination in defining novel selective pressures to sequentially make dominant the 10^30 cells of the biosphere.

Comparative genomics of 9 novel Paenibacillus larvae bacteriophages

PubMed Central

Stamereilers, Casey; LeBlanc, Lucy; Yost, Diane; Amy, Penny S.; Tsourkas, Philippos K.

2016-01-01

American Foulbrood Disease, caused by the bacterium Paenibacillus larvae, is one of the most destructive diseases of the honeybee, Apis mellifera. Our group recently published the sequences of 9 new phages with the ability to infect and lyse P. larvae. Here, we characterize the genomes of these P. larvae phages, compare them to each other and to other sequenced P. larvae phages, and putatively identify protein function. The phage genomes are 38–45 kb in size and contain 68–86 genes, most of which appear to be unique to P. larvae phages. We classify P. larvae phages into 2 main clusters and one singleton based on nucleotide sequence identity. Three of the new phages show sequence similarity to other sequenced P. larvae phages, while the remaining 6 do not. We identified functions for roughly half of the P. larvae phage proteins, including structural, assembly, host lysis, DNA replication/metabolism, regulatory, and host-related functions. Structural and assembly proteins are highly conserved among our phages and are located at the start of the genome. DNA replication/metabolism, regulatory, and host-related proteins are located in the middle and end of the genome, and are not conserved, with many of these genes found in some of our phages but not others. All nine phages code for a conserved N-acetylmuramoyl-L-alanine amidase. Comparative analysis showed the phages use the “cohesive ends with 3′ overhang” DNA packaging strategy. This work is the first in-depth study of P. larvae phage genomics, and serves as a marker for future work in this area. PMID:27738559
Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut.

PubMed

Armero, Alix; Baudouin, Luc; Bocs, Stéphanie; This, Dominique

2017-01-01

The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of these proteins deriving from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/).
Combining guilt-by-association and guilt-by-profiling to predict Saccharomyces cerevisiae gene function

PubMed Central

Tian, Weidong; Zhang, Lan V; Taşan, Murat; Gibbons, Francis D; King, Oliver D; Park, Julie; Wunderlich, Zeba; Cherry, J Michael; Roth, Frederick P

2008-01-01

Background: Learning the function of genes is a major goal of computational genomics. Methods for inferring gene function have typically fallen into two categories: 'guilt-by-profiling', which exploits correlation between function and other gene characteristics; and 'guilt-by-association', which transfers function from one gene to another via biological relationships. Results: We have developed a strategy ('Funckenstein') that performs guilt-by-profiling and guilt-by-association and combines the results. Using a benchmark set of functional categories and input data for protein-coding genes in Saccharomyces cerevisiae, Funckenstein was compared with a previous combined strategy. Subsequently, we applied Funckenstein to 2,455 Gene Ontology terms. In the process, we developed 2,455 guilt-by-profiling classifiers based on 8,848 gene characteristics and 12 functional linkage graphs based on 23 biological relationships. Conclusion: Funckenstein outperforms a previous combined strategy using a common benchmark dataset. The combination of 'guilt-by-profiling' and 'guilt-by-association' gave significant improvement over the component classifiers, showing the greatest synergy for the most specific functions. Performance was evaluated by cross-validation and by literature examination of the top-scoring novel predictions. These quantitative predictions should help prioritize experimental study of yeast gene functions. PMID:18613951
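To make the two inference styles in this abstract concrete, the following Python sketch scores one gene for one functional category. It is an illustration under stated assumptions, not Funckenstein itself: the abstract does not say how the two results are combined, so the weighted average with the hypothetical parameter `alpha`, the neighbour-fraction association score, and all function and gene names are invented for this example.

```python
def association_score(gene, graph, annotated):
    """Guilt-by-association: weighted fraction of a gene's neighbours in a
    functional linkage graph that already carry the function.

    `graph` maps gene -> {neighbour: edge_weight}; `annotated` is the set of
    genes known to have the function.
    """
    neighbours = graph.get(gene, {})
    total = sum(neighbours.values())
    if total == 0:
        return 0.0  # isolated gene: no evidence from associations
    return sum(w for n, w in neighbours.items() if n in annotated) / total


def combined_score(gene, profile_scores, graph, annotated, alpha=0.5):
    """Blend a guilt-by-profiling score with the association score.

    `profile_scores` is assumed to hold per-gene classifier outputs in [0, 1]
    computed from gene characteristics; `alpha` weights the two sources.
    """
    gbp = profile_scores.get(gene, 0.0)
    gba = association_score(gene, graph, annotated)
    return alpha * gbp + (1 - alpha) * gba


# Toy example with placeholder gene names.
graph = {"geneX": {"geneA": 1.0, "geneB": 3.0}}
annotated = {"geneB"}
profile_scores = {"geneX": 0.25}
score = combined_score("geneX", profile_scores, graph, annotated)
```

Here the association score is 3.0 / 4.0 = 0.75, so the blended score with `alpha=0.5` is 0.5. In practice the weighting would be learned and evaluated by cross-validation, as the abstract describes for the real method.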
Domain selection combined with improved cloning strategy for high throughput expression of higher eukaryotic proteins

PubMed Central

Chen, Yunjia; Qiu, Shihong; Luan, Chi-Hao; Luo, Ming

2007-01-01

Background: Expression of higher eukaryotic genes as soluble, stable recombinant proteins is still a bottleneck in biochemical and structural studies of novel proteins. Correct identification of stable domains/fragments within the open reading frame (ORF), combined with proper cloning strategies, can greatly enhance the success rate when higher eukaryotic proteins are expressed as these domains/fragments. Furthermore, a high-throughput (HTP) cloning pipeline that incorporates bioinformatics domain/fragment selection methods will benefit studies in structural and functional genomics/proteomics. Results: With bioinformatics tools, we developed a domain/domain boundary prediction (DDBP) method, which was trained on available experimental data. Combined with an improved cloning strategy, DDBP was applied to 57 proteins from C. elegans. Expression and purification results showed there was a 10-fold increase in terms of obtaining purified proteins. Based on the DDBP method, the improved GATEWAY cloning strategy and a robotic platform, we constructed an HTP cloning pipeline, including PCR primer design, PCR, BP reaction, transformation, plating, colony picking and entry clone extraction, which has been successfully applied to 90 C. elegans genes, 88 Brucella genes, and 188 human genes. More than 97% of the targeted genes were obtained as entry clones. This pipeline has a modular design and can adopt different operations for a variety of cloning/expression strategies. Conclusion: The DDBP method and improved cloning strategy were satisfactory. The cloning pipeline, combined with our recombinant protein HTP expression pipeline and the crystal screening robots, constitutes a complete platform for structural genomics/proteomics. This platform will increase the success rate of purification and crystallization dramatically and promote the further advancement of structural genomics/proteomics. PMID:17663785
Host gene targets for novel influenza therapies elucidated by high-throughput RNA interference screens

PubMed Central

Meliopoulos, Victoria A.; Andersen, Lauren E.; Birrer, Katherine F.; Simpson, Kaylene J.; Lowenthal, John W.; Bean, Andrew G. D.; Stambas, John; Stewart, Cameron R.; Tompkins, S. Mark; van Beusechem, Victor W.; Fraser, Iain; Mhlanga, Musa; Barichievy, Samantha; Smith, Queta; Leake, Devin; Karpilow, Jon; Buck, Amy; Jona, Ghil; Tripp, Ralph A.

2012-01-01

Influenza virus encodes only 11 viral proteins but replicates in a broad range of avian and mammalian species by exploiting host cell functions. Genome-wide RNA interference (RNAi) has proven to be a powerful tool for identifying the host molecules that participate in each step of virus replication. Meta-analysis of findings from genome-wide RNAi screens has shown influenza virus to be dependent on functional nodes in host cell pathways, requiring a wide variety of molecules and cellular proteins for replication. Because rapid evolution of the influenza A viruses persistently complicates the effectiveness of vaccines and therapeutics, a further understanding of the complex host cell pathways coopted by influenza virus for replication may provide new targets and strategies for antiviral therapy. RNAi genome screening technologies together with bioinformatics can provide the ability to rapidly identify specific host factors involved in resistance and susceptibility to influenza virus, allowing for novel disease intervention strategies. PMID:22247330
Sharka: The Past, The Present and The Future

PubMed Central

Sochor, Jiri; Babula, Petr; Adam, Vojtech; Krska, Boris; Kizek, Rene

2012-01-01

Members of the Potyviridae family belong to a group of plant viruses that cause devastating plant diseases with a significant impact on agronomy and economics. Plum pox virus (PPV), as the causative agent of sharka disease, is widely discussed. An understanding of the molecular biology of potyviruses, including PPV, and of the function of individual proteins as products of genome expression is necessary for proposing new antiviral strategies. This review surveys the members of the Potyviridae family with respect to plum pox virus. The genome of potyviruses is discussed with respect to the protein products of its expression and their function. Plum pox virus distribution, genome organization, transmission and biochemical changes in infected plants are introduced. In addition, techniques used in PPV detection are highlighted and discussed, especially with respect to modern techniques of nucleic acid isolation based on a nanotechnological approach. Finally, perspectives on future possibilities for applying nanotechnology in PPV determination/identification are outlined. PMID:23202508
Sharka: the past, the present and the future.

PubMed

Sochor, Jiri; Babula, Petr; Adam, Vojtech; Krska, Boris; Kizek, Rene

2012-11-07

Members of the Potyviridae family belong to a group of plant viruses that cause devastating plant diseases with a significant impact on agronomy and economics. Plum pox virus (PPV), as the causative agent of sharka disease, is widely discussed. An understanding of the molecular biology of potyviruses, including PPV, and of the function of individual proteins as products of genome expression is necessary for proposing new antiviral strategies. This review surveys the members of the Potyviridae family with respect to plum pox virus. The genome of potyviruses is discussed with respect to the protein products of its expression and their function. Plum pox virus distribution, genome organization, transmission and biochemical changes in infected plants are introduced. In addition, techniques used in PPV detection are highlighted and discussed, especially with respect to modern techniques of nucleic acid isolation based on a nanotechnological approach. Finally, perspectives on future possibilities for applying nanotechnology in PPV determination/identification are outlined.
Emerging trends in the functional genomics of the abiotic stress response in crop plants.

PubMed

Vij, Shubha; Tyagi, Akhilesh K

2007-05-01

Plants are exposed to different abiotic stresses, such as water deficit, high temperature, salinity, cold, heavy metals and mechanical wounding, under field conditions. It is estimated that such stress conditions can potentially reduce the yield of crop plants by more than 50%. Investigations of the physiological, biochemical and molecular aspects of stress tolerance have been conducted to unravel the intrinsic mechanisms that plants have developed during evolution to mitigate stress. Before the advent of the genomics era, researchers primarily used a gene-by-gene approach to decipher the function of the genes involved in the abiotic stress response. However, abiotic stress tolerance is a complex trait and, although large numbers of genes have been identified to be involved in the abiotic stress response, there remain large gaps in our understanding of the trait. The availability of the genome sequences of certain important plant species has enabled the use of strategies, such as genome-wide expression profiling, to identify the genes associated with the stress response, followed by the verification of gene function by the analysis of mutants and transgenics. Certain components of both abscisic acid-dependent and -independent cascades involved in the stress response have already been identified. Information originating from the genome-wide analysis of abiotic stress tolerance will help to provide an insight into the stress-responsive network(s), and may allow the modification of this network to reduce the loss caused by stress and to increase agricultural productivity.
A Phylogenetic Strategy Based on a Legume-Specific Whole Genome Duplication Yields Symbiotic Cytokinin Type-A Response Regulators

PubMed Central

Op den Camp, Rik H.M.; De Mita, Stéphane; Lillo, Alessandra; Cao, Qingqin; Limpens, Erik; Bisseling, Ton; Geurts, René

2011-01-01

Legumes host their Rhizobium spp. symbiont in novel root organs called nodules. Nodules originate from differentiated root cortical cells that dedifferentiate and subsequently form nodule primordia, a process controlled by cytokinin. A whole-genome duplication has occurred at the root of the legume Papilionoideae subfamily. We hypothesize that gene pairs that originate from this duplication event and are conserved in distinct Papilionoideae lineages have evolved symbiotic functions. A phylogenetic strategy was applied to search for such gene pairs to identify novel regulators of nodulation, using the cytokinin phosphorelay pathway as a test case. In this way, two paralogous type-A cytokinin response regulators were identified that are involved in root nodule symbiosis. Response Regulator9 (MtRR9) and MtRR11 in medicago (Medicago truncatula) and an ortholog in lotus (Lotus japonicus) are rapidly induced upon Rhizobium spp. Nod factor signaling. Constitutive expression of MtRR9 results in arrested primordia that have emerged from cortical, endodermal, and pericycle cells. In legumes, lateral root primordia are not exclusively formed from pericycle cells but also require the involvement of the root cortical cell layer. Therefore, the MtRR9-induced foci of cell divisions show a strong resemblance to lateral root primordia, suggesting an ancestral function of MtRR9 in this process. Together, these findings provide a proof of principle for the applied phylogenetic strategy to identify genes with a symbiotic function in legumes. PMID:22034625
Construction of a mutagenesis cartridge for poliovirus genome-linked viral protein: isolation and characterization of viable and nonviable mutants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuhn, R.J.; Tada, H.; Ypma-Wong, M.F.

1988-01-01

By following a strategy of genetic analysis of poliovirus, the authors have constructed a synthetic mutagenesis cartridge spanning the genome-linked viral protein coding region and flanking cleavage sites in an infectious cDNA clone of the type I (Mahoney) genome. The insertion of new restriction sites within the infectious clone has allowed them to replace the wild-type sequences with short complementary pairs of synthetic oligonucleotides containing various mutations. A set of mutations have been made that create methionine codons within the genome-linked viral protein region. The resulting viruses have growth characteristics similar to wild type. Experiments that led to an alteration of the tyrosine residue responsible for the linkage to RNA have resulted in nonviable virus. In one mutant, proteolytic processing assayed in vitro appeared unimpaired by the mutation. They suggest that the position of the tyrosine residue is important for genome-linked viral protein function(s).
Trade-off between Transcriptome Plasticity and Genome Evolution in Cephalopods.

PubMed

Liscovitch-Brauer, Noa; Alon, Shahar; Porath, Hagit T; Elstein, Boaz; Unger, Ron; Ziv, Tamar; Admon, Arie; Levanon, Erez Y; Rosenthal, Joshua J C; Eisenberg, Eli

2017-04-06

RNA editing, a post-transcriptional process, allows the diversification of proteomes beyond the genomic blueprint; however it is infrequently used among animals for this purpose. Recent reports suggesting increased levels of RNA editing in squids thus raise the question of the nature and effects of these events. We here show that RNA editing is particularly common in behaviorally sophisticated coleoid cephalopods, with tens of thousands of evolutionarily conserved sites. Editing is enriched in the nervous system, affecting molecules pertinent for excitability and neuronal morphology. The genomic sequence flanking editing sites is highly conserved, suggesting that the process confers a selective advantage. Due to the large number of sites, the surrounding conservation greatly reduces the number of mutations and genomic polymorphisms in protein-coding regions. This trade-off between genome evolution and transcriptome plasticity highlights the importance of RNA recoding as a strategy for diversifying proteins, particularly those associated with neural function. Copyright © 2017 Elsevier Inc. All rights reserved.
Application of CRISPR/Cas9 system in breeding of new antiviral plant germplasm.

PubMed

Zhang, Dao-wei; Zhang, Chao-fan; Dong, Fang; Huang, Yan-lan; Zhang, Ya; Zhou, Hong

2016-09-01

With the development and improvement of the CRISPR/Cas9 system for genome editing, the system has been applied to the prevention and control of animal viral infectious diseases, with considerable success. It has also been applied to the study of highly efficient targeted gene editing in plant virus genomes. CRISPR/Cas9-mediated targeted gene modification has not only achieved genome editing of plant DNA viruses, but has also shown potential for genome editing of plant RNA viruses. In addition, the CRISPR/Cas9 system functions at the transcriptional and post-transcriptional levels, indicating that the system could regulate the replication of plant viruses in different ways. Compared with other plant viral disease control strategies, this system is more accurate in genome editing, more stable in gene expression regulation, and confers a broader spectrum of resistance to viral disease. In this review, we summarize the advantages, main problems and development trends of the CRISPR/Cas9 system in breeding new antiviral plant germplasm.
Non-functional genes repaired at the RNA level.

PubMed

Burger, Gertraud

2016-01-01

Genomes and genes continuously evolve. Gene sequences undergo substitutions, deletions or nucleotide insertions; mobile genetic elements invade genomes and interleave in genes; chromosomes break, even within genes, and pieces reseal in reshuffled order. To maintain functional gene products and assure an organism's survival, two principal strategies are used: repair of the gene itself or repair of its product. I will introduce common types of gene aberrations and how gene function is restored secondarily, and then focus on systematically fragmented genes found in a poorly studied protist group, the diplonemids. Expression of their broken genes involves restitching of pieces at the RNA level, and substantial RNA editing to compensate for point mutations. I will conclude with thoughts on how such a grotesquely unorthodox system may have evolved, and why this group of organisms has persisted and thrived for tens of millions of years. Copyright © 2016 Académie des sciences. Published by Elsevier SAS. All rights reserved.
Synthetic biology. Genomically encoded analog memory with precise in vivo DNA writing in living cell populations.

PubMed

Farzadfard, Fahim; Lu, Timothy K

2014-11-14

Cellular memory is crucial to many natural biological processes and sophisticated synthetic biology applications. Existing cellular memories rely on epigenetic switches or recombinases, which are limited in scalability and recording capacity. In this work, we use the DNA of living cell populations as genomic "tape recorders" for the analog and distributed recording of long-term event histories. We describe a platform for generating single-stranded DNA (ssDNA) in vivo in response to arbitrary transcriptional signals. When coexpressed with a recombinase, these intracellularly expressed ssDNAs target specific genomic DNA addresses, resulting in precise mutations that accumulate in cell populations as a function of the magnitude and duration of the inputs. This platform could enable long-term cellular recorders for environmental and biomedical applications, biological state machines, and enhanced genome engineering strategies. Copyright © 2014, American Association for the Advancement of Science.
Clustering analysis of proteins from microbial genomes at multiple levels of resolution.

PubMed

Zaslavsky, Leonid; Ciufo, Stacy; Fedorov, Boris; Tatusova, Tatiana

2016-08-31

Microbial genomes at the National Center for Biotechnology Information (NCBI) represent a large collection of more than 35,000 assemblies. There are several complexities associated with the data: a great variation in sampling density since human pathogens are densely sampled while other bacteria are less represented; different protein families occur in annotations with different frequencies; and the quality of genome annotation varies greatly. In order to extract useful information from these sophisticated data, the analysis needs to be performed at multiple levels of phylogenomic resolution and protein similarity, with an adequate sampling strategy. Protein clustering is used to construct meaningful and stable groups of similar proteins to be used for analysis and functional annotation. Our approach is to create protein clusters at three levels. First, tight clusters in groups of closely-related genomes (species-level clades) are constructed using a combined approach that takes into account both sequence similarity and genome context. Second, clustroids of conservative in-clade clusters are organized into seed global clusters. Finally, global protein clusters are built around the seed clusters. We propose filtering strategies that allow limiting the protein set included in global clustering. The in-clade clustering procedure, subsequent selection of clustroids and organization into seed global clusters provides a robust representation and high rate of compression. Seed protein clusters are further extended by adding related proteins. Extended seed clusters include a significant part of the data and represent all major known cell machinery. The remaining part, coming from either non-conservative (unique) or rapidly evolving proteins, from rare genomes, or resulting from low-quality annotation, does not group together well. Processing these proteins requires significant computational resources and results in a large number of questionable clusters. The developed filtering strategies make it possible to identify and exclude such peripheral proteins, limiting the protein dataset in global clustering. Overall, the proposed methodology allows the relevant data at different levels of detail to be obtained and data redundancy eliminated while keeping biologically interesting variations.
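The clustroid step mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration, assuming one common definition of a clustroid (the member with the smallest summed distance to all other members) and a toy Hamming distance on equal-length strings. The abstract does not state which definition or distance the authors use, and the names `clustroid` and `hamming` are hypothetical.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def clustroid(members, distance=hamming):
    """Return the member with the smallest summed distance to all members.

    Ties are resolved in favour of the earliest member in `members`.
    """
    if not members:
        raise ValueError("cluster is empty")
    return min(members, key=lambda m: sum(distance(m, o) for o in members))


# Toy in-clade cluster of short peptide-like strings.
cluster = ["AAAA", "AAAT", "AATT"]
representative = clustroid(cluster)
```

For this toy cluster the summed distances are 3, 2 and 3, so `AAAT` is chosen as the representative. Picking one representative per in-clade cluster is what gives the compression described above, since only the representatives need to be compared when building seed global clusters.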
CRISPRi and CRISPRa: New Functional Genomics Tools Provide Complementary Insights into Cancer Biology and Therapeutic Strategies | Office of Cancer Genomics

Cancer.gov

A central goal of research for targeted cancer therapy, or precision oncology, is to reveal the intrinsic vulnerabilities of cancer cells and exploit them as therapeutic targets. Examples of cancer cell vulnerabilities include driver oncogenes that are essential for the initiation and progression of cancer, or non-oncogene addictions resulting from the cancerous state of the cell. To identify vulnerabilities, scientists perform genetic “loss-of-function” and “gain-of-function” studies to better understand the roles of specific genes in cancer cells.
National Plant Genome Initiative

DTIC Science & Technology

2004-01-01

trials have also identified new objectives for vegetable breeding programs, expedited by knowledge and tools from crop genomics and farmer demand...The same tools and resources are being applied to develop improved crops and new breeding strategies, as well. With the sequencing of the rice genome...marker-assisted breeding strategies for wheat • Establishment of a comparative cereal genomics database, Gramene, which uses the complete rice
Draft Genome Sequence, and a Sequence-Defined Genetic Linkage Map of the Legume Crop Species Lupinus angustifolius L

PubMed Central

Zheng, Zequn; Zhang, Qisen; Zhou, Gaofeng; Sweetingham, Mark W.; Howieson, John G.; Li, Chengdao

2013-01-01

Lupin (Lupinus angustifolius L.) is the most recently domesticated crop in major agricultural cultivation. Its seeds are high in protein and dietary fibre, but low in oil and starch. Medical and dietetic studies have shown that consuming lupin-enriched food has significant health benefits. We report the draft assembly from a whole genome shotgun sequencing dataset for this legume species with 26.9x coverage of the genome, which is predicted to contain 57,807 genes. Analysis of the annotated genes with metabolic pathways provided a partial understanding of some key features of lupin, such as the amino acid profile of storage proteins in seeds. Furthermore, we applied the NGS-based RAD-sequencing technology to obtain 8,244 sequence-defined markers for anchoring the genomic sequences. A total of 4,214 scaffolds from the genome sequence assembly were aligned into the genetic map. The combination of the draft assembly and a sequence-defined genetic map made it possible to locate and study functional genes of agronomic interest. The identification of co-segregating SNP markers, scaffold sequences and gene annotation facilitated the identification of a candidate R gene associated with resistance to the major lupin disease anthracnose. We demonstrated that the combination of medium-depth genome sequencing and a high-density genetic linkage map by application of NGS technology is a cost-effective approach to generating genome sequence data and a large number of molecular markers to study the genomics, genetics and functional genes of lupin, and to apply them to molecular plant breeding. This strategy does not require prior genome knowledge, which potentiates its application to a wide range of non-model species. PMID:23734219
Draft genome sequence, and a sequence-defined genetic linkage map of the legume crop species Lupinus angustifolius L.

PubMed

Yang, Huaan; Tao, Ye; Zheng, Zequn; Zhang, Qisen; Zhou, Gaofeng; Sweetingham, Mark W; Howieson, John G; Li, Chengdao

2013-01-01

Lupin (Lupinus angustifolius L.) is the most recently domesticated crop in major agricultural cultivation. Its seeds are high in protein and dietary fibre, but low in oil and starch. Medical and dietetic studies have shown that consuming lupin-enriched food has significant health benefits. We report the draft assembly from a whole genome shotgun sequencing dataset for this legume species with 26.9x coverage of the genome, which is predicted to contain 57,807 genes. Analysis of the annotated genes with metabolic pathways provided a partial understanding of some key features of lupin, such as the amino acid profile of storage proteins in seeds. Furthermore, we applied the NGS-based RAD-sequencing technology to obtain 8,244 sequence-defined markers for anchoring the genomic sequences. A total of 4,214 scaffolds from the genome sequence assembly were aligned into the genetic map. The combination of the draft assembly and a sequence-defined genetic map made it possible to locate and study functional genes of agronomic interest. The identification of co-segregating SNP markers, scaffold sequences and gene annotation facilitated the identification of a candidate R gene associated with resistance to the major lupin disease anthracnose. We demonstrated that the combination of medium-depth genome sequencing and a high-density genetic linkage map by application of NGS technology is a cost-effective approach to generating genome sequence data and a large number of molecular markers to study the genomics, genetics and functional genes of lupin, and to apply them to molecular plant breeding. This strategy does not require prior genome knowledge, which potentiates its application to a wide range of non-model species.
High-throughput screens in mammalian cells using the CRISPR-Cas9 system.

PubMed

Peng, Jingyu; Zhou, Yuexin; Zhu, Shiyou; Wei, Wensheng

2015-06-01

As a powerful genome-editing tool, the clustered regularly interspaced short palindromic repeats (CRISPR)-clustered regularly interspaced short palindromic repeats-associated protein 9 (Cas9) system has been quickly developed into a large-scale function-based screening strategy in mammalian cells. This new type of genetic library is constructed through the lentiviral delivery of single-guide RNA collections that direct Cas9 or inactive dead Cas9 fused with effectors to interrogate gene function or regulate gene transcription in targeted cells. Compared with RNA interference screening, the CRISPR-Cas9 system demonstrates much higher levels of effectiveness and reliability with respect to both loss-of-function and gain-of-function screening. Unlike the RNA interference strategy, a CRISPR-Cas9 library can target both protein-coding sequences and regulatory elements, including promoters, enhancers and elements transcribing microRNAs and long noncoding RNAs. This powerful genetic tool will undoubtedly accelerate the mechanistic discovery of various biological processes. In this mini review, we summarize the general procedure of CRISPR-Cas9 library mediated functional screening, system optimization strategies and applications of this new genetic toolkit. © 2015 FEBS.

Genomic Medicine Without Borders: Which Strategies Should Developing Countries Employ to Invest in Precision Medicine? A New "Fast-Second Winner" Strategy.

PubMed

Mitropoulos, Konstantinos; Cooper, David N; Mitropoulou, Christina; Agathos, Spiros; Reichardt, Jürgen K V; Al-Maskari, Fatima; Chantratita, Wasun; Wonkam, Ambroise; Dandara, Collet; Katsila, Theodora; Lopez-Correa, Catalina; Ali, Bassam R; Patrinos, George P

2017-11-01

Genomic medicine has greatly matured in terms of its technical capabilities, but the diffusion of genomic innovations worldwide faces significant barriers beyond mere access to technology. New global development strategies are sorely needed for biotechnologies such as genomics and their applications toward precision medicine without borders. Moreover, diffusion of genomic medicine globally cannot adhere to a "one-size-fits-all-countries" development strategy, in the same way that drug treatments should be customized. This begs a timely, difficult but crucial question: How should developing countries, and the resource-limited regions of developed countries, invest in genomic medicine? Although a full-scale investment in infrastructure from discovery to the translational implementation of genomic science is ideal, this may not always be feasible in all countries at all times. A simple "transplantation of genomics" from developed to developing countries is unlikely to be feasible. Nor should developing countries be seen as simple recipients and beneficiaries of genomic medicine developed elsewhere because important advances in genomic medicine have materialized in developing countries as well. There are several noteworthy examples of genomic medicine success stories involving resource-limited settings that are contextualized and described in this global genomic medicine innovation analysis. In addition, we outline here a new long-term development strategy for global genomic medicine in a way that recognizes the individual country's pressing public health priorities and disease burdens. We term this approach the "Fast-Second Winner" model of innovation that supports innovation commencing not only "upstream" of discovery science but also "mid-stream," building on emerging highly promising biomarker and diagnostic candidates from the global science discovery pipeline, based on the unique needs of each country. 
A mid-stream entry into innovation can enhance collective learning from other innovators' mistakes upstream in discovery science and boost the probability of success for translation and implementation when resources are limited. This à la carte model of global innovation and development strategy offers multiple entry points into the global genomics innovation ecosystem for developing countries, whether or not extensive and expensive discovery infrastructures are already in place. Ultimately, broadening our thinking beyond the linear model of innovation will help us to enable the vision and practice of genomics without borders in both developed and resource-limited settings.
Metabolomic strategies to map functions of metabolic pathways

PubMed Central

Mulvihill, Melinda M.

2014-01-01

Genome sequencing efforts have revealed a strikingly large number of unannotated and uncharacterized genes that fall into metabolic enzymes classes, likely indicating that our current knowledge of biochemical pathways in normal physiology, let alone in disease states, remains largely incomplete. This realization presents a daunting challenge for post-genomic-era scientists in deciphering the biochemical and (patho)physiological roles of these enzymes and their metabolites and metabolic networks. This is further complicated by many recent studies showing a rewiring of normal metabolic networks in disease states to give rise to unique pathophysiological functions of enzymes, metabolites, and metabolic pathways. This review focuses on recent discoveries made using metabolic mapping technologies to uncover novel pathways and metabolite-mediated posttranslational modifications and epigenetic alterations and their impact on physiology and disease. PMID:24918200
Computational genomic identification and functional reconstitution of plant natural product biosynthetic pathways

PubMed Central

2016-01-01

Covering: 2003 to 2016 The last decade has seen the first major discoveries regarding the genomic basis of plant natural product biosynthetic pathways. Four key computationally driven strategies have been developed to identify such pathways, which make use of physical clustering, co-expression, evolutionary co-occurrence and epigenomic co-regulation of the genes involved in producing a plant natural product. Here, we discuss how these approaches can be used for the discovery of plant biosynthetic pathways encoded by both chromosomally clustered and non-clustered genes. Additionally, we will discuss opportunities to prioritize plant gene clusters for experimental characterization, and end with a forward-looking perspective on how synthetic biology technologies will allow effective functional reconstitution of candidate pathways using a variety of genetic systems. PMID:27321668
RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome.

PubMed

Wenger, Yvan; Galliot, Brigitte

2013-03-25

Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.
RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome

PubMed Central

2013-01-01

Background Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. Results To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48’909 unique sequences including splice variants, representing approximately 24’450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10’597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11’270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. Conclusions We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871
Informatics and computational strategies for the study of lipids.

PubMed

Yetukuri, Laxman; Ekroos, Kim; Vidal-Puig, Antonio; Oresic, Matej

2008-02-01

Recent advances in mass spectrometry (MS)-based techniques for lipidomic analysis have empowered us with the tools that afford studies of lipidomes at the systems level. However, these techniques pose a number of challenges for lipidomic raw data processing, lipid informatics, and the interpretation of lipidomic data in the context of lipid function and structure. Integration of lipidomic data with other systemic levels, such as genomic or proteomic, in the context of molecular pathways and biophysical processes provides a basis for the understanding of lipid function at the systems level. The present report, based on the limited literature, is an update on a young but rapidly emerging field of lipid informatics and related pathway reconstruction strategies.
Synthetic Genetic Arrays: Automation of Yeast Genetics.

PubMed

Kuzmin, Elena; Costanzo, Michael; Andrews, Brenda; Boone, Charles

2016-04-01

Genome-sequencing efforts have led to great strides in the annotation of protein-coding genes and other genomic elements. The current challenge is to understand the functional role of each gene and how genes work together to modulate cellular processes. Genetic interactions define phenotypic relationships between genes and reveal the functional organization of a cell. Synthetic genetic array (SGA) methodology automates yeast genetics and enables large-scale and systematic mapping of genetic interaction networks in the budding yeast, Saccharomyces cerevisiae. SGA facilitates construction of an output array of double mutants from an input array of single mutants through a series of replica pinning steps. Subsequent analysis of genetic interactions from SGA-derived mutants relies on accurate quantification of colony size, which serves as a proxy for fitness. Since its development, SGA has given rise to a variety of other experimental approaches for functional profiling of the yeast genome and has been applied in a multitude of other contexts, such as genome-wide screens for synthetic dosage lethality and integration with high-content screening for systematic assessment of morphology defects. SGA-like strategies can also be implemented similarly in a number of other cell types and organisms, including Schizosaccharomyces pombe, Escherichia coli, Caenorhabditis elegans, and human cancer cell lines. The genetic networks emerging from these studies not only generate functional wiring diagrams but may also play a key role in our understanding of the complex relationship between genotype and phenotype. © 2016 Cold Spring Harbor Laboratory Press.
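As an illustration of how colony size can serve as a fitness proxy when scoring genetic interactions, the sketch below normalizes colony sizes to wild type and compares the double-mutant fitness with the product of the single-mutant fitnesses. The multiplicative null model and the function name `interaction_score` are assumptions made for this example; the abstract does not state which scoring model the SGA pipeline uses.

```python
def interaction_score(wt_size, single_a, single_b, double_ab):
    """Hypothetical helper: genetic interaction score from colony sizes.

    Assumes a multiplicative null model (an assumption, not taken from
    the abstract): expected double-mutant fitness = f_a * f_b.
    """
    # Colony size relative to wild type is used as the fitness proxy.
    f_a = single_a / wt_size
    f_b = single_b / wt_size
    f_ab = double_ab / wt_size
    # Negative score: double mutant is sicker than expected.
    # Positive score: double mutant is fitter than expected.
    return f_ab - f_a * f_b


# Example with made-up colony sizes (arbitrary pixel-area units).
neutral = interaction_score(100, 80, 50, 40)   # matches expectation
negative = interaction_score(100, 80, 50, 10)  # sicker than expected
```

Under this assumed model, a score near zero means the two mutations combine as expected, while a strongly negative score would flag a candidate synthetic sick or lethal pair.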
The functional genomic studies of curcumin.

PubMed

Huminiecki, Lukasz; Horbańczuk, Jarosław; Atanasov, Atanas G

2017-10-01

Curcumin is a natural plant-derived compound that has attracted a lot of attention for its anti-cancer activities. Curcumin can slow proliferation of and induce apoptosis in cancer cell lines, but the precise mechanisms of these effects are not fully understood. However, many lines of evidence suggested that curcumin has a potent impact on gene expression profiles; thus, functional genomics should be the key to understanding how curcumin exerts its anti-cancer activities. Here, we review the published functional genomic studies of curcumin focusing on cancer. Typically, a cancer cell line or a grafted tumor was exposed to curcumin and profiled with microarrays, methylation assays, or RNA-seq. Crucially, these studies are in agreement that curcumin has a powerful effect on gene expression. In the majority of the studies, among differentially expressed genes we found genes involved in cell signaling, apoptosis, and the control of cell cycle. Curcumin can also induce specific methylation changes, and is a powerful regulator of the expression of microRNAs which control oncogenesis. We also reflect on how the broader technological progress in transcriptomics has been reflected in the field of curcumin. We conclude by discussing the areas where more functional genomic studies are highly desirable. Integrated OMICS approaches will clearly be the key to understanding curcumin's anticancer and chemopreventive effects. Such strategies may become a template for elucidating the mode of action of other natural products; many natural products have pleiotropic effects that are well suited for a systems-level analysis. Copyright © 2017 Elsevier Ltd. All rights reserved.
Evolution of Functional Diversification within Quasispecies

PubMed Central

Colizzi, Enrico Sandro; Hogeweg, Paulien

2014-01-01

According to quasispecies theory, high mutation rates limit the amount of information genomes can store (Eigen’s Paradox), whereas genomes with higher degrees of neutrality may be selected even at the expense of higher replication rates (the “survival of the flattest” effect). Introducing a complex genotype to phenotype map, such as RNA folding, epitomizes this effect because of the existence of neutral networks and their exploitation by evolution, affecting both population structure and genome composition. We reexamine these classical results in the light of an RNA-based system that can evolve its own ecology. Contrary to expectations, we find that quasispecies evolving at high mutation rates are steep and characterized by one master sequence. Importantly, the analysis of the system and the characterization of the evolved quasispecies reveal the emergence of functionalities as phenotypes of nonreplicating genotypes, whose presence is crucial for the overall viability and stability of the system. In other words, the master sequence codes for the information of the entire ecosystem, whereas the decoding happens, stochastically, through mutations. We show that this solution quickly outcompetes strategies based on genomes with a high degree of neutrality. In conclusion, individually coded but ecosystem-based diversity evolves and persists indefinitely close to the Information Threshold. PMID:25056399
Stakeholder engagement: a key component of integrating genomic information into electronic health records

PubMed Central

Hartzler, Andrea; McCarty, Catherine A.; Rasmussen, Luke V.; Williams, Marc S.; Brilliant, Murray; Bowton, Erica A.; Clayton, Ellen Wright; Faucett, William A.; Ferryman, Kadija; Field, Julie R.; Fullerton, Stephanie M.; Horowitz, Carol R.; Koenig, Barbara A.; McCormick, Jennifer B.; Ralston, James D.; Sanderson, Saskia C.; Smith, Maureen E.; Trinidad, Susan Brown

2014-01-01

Integrating genomic information into clinical care and the electronic health record can facilitate personalized medicine through genetically guided clinical decision support. Stakeholder involvement is critical to the success of these implementation efforts. Prior work on implementation of clinical information systems provides broad guidance to inform effective engagement strategies. We add to this evidence-based recommendations that are specific to issues at the intersection of genomics and the electronic health record. We describe stakeholder engagement strategies employed by the Electronic Medical Records and Genomics Network, a national consortium of US research institutions funded by the National Human Genome Research Institute to develop, disseminate, and apply approaches that combine genomic and electronic health record data. Through select examples drawn from sites of the Electronic Medical Records and Genomics Network, we illustrate a continuum of engagement strategies to inform genomic integration into commercial and homegrown electronic health records across a range of health-care settings. We frame engagement as activities to consult, involve, and partner with key stakeholder groups throughout specific phases of health information technology implementation. Our aim is to provide insights into engagement strategies to guide genomic integration based on our unique network experiences and lessons learned within the broader context of implementation research in biomedical informatics. On the basis of our collective experience, we describe key stakeholder practices, challenges, and considerations for successful genomic integration to support personalized medicine. PMID:24030437
The genome of the sea urchin Strongylocentrotus purpuratus.

PubMed

Sodergren, Erica; Weinstock, George M; Davidson, Eric H; Cameron, R Andrew; Gibbs, Richard A; Angerer, Robert C; Angerer, Lynne M; Arnone, Maria Ina; Burgess, David R; Burke, Robert D; Coffman, James A; Dean, Michael; Elphick, Maurice R; Ettensohn, Charles A; Foltz, Kathy R; Hamdoun, Amro; Hynes, Richard O; Klein, William H; Marzluff, William; McClay, David R; Morris, Robert L; Mushegian, Arcady; Rast, Jonathan P; Smith, L Courtney; Thorndyke, Michael C; Vacquier, Victor D; Wessel, Gary M; Wray, Greg; Zhang, Lan; Elsik, Christine G; Ermolaeva, Olga; Hlavina, Wratko; Hofmann, Gretchen; Kitts, Paul; Landrum, Melissa J; Mackey, Aaron J; Maglott, Donna; Panopoulou, Georgia; Poustka, Albert J; Pruitt, Kim; Sapojnikov, Victor; Song, Xingzhi; Souvorov, Alexandre; Solovyev, Victor; Wei, Zheng; Whittaker, Charles A; Worley, Kim; Durbin, K James; Shen, Yufeng; Fedrigo, Olivier; Garfield, David; Haygood, Ralph; Primus, Alexander; Satija, Rahul; Severson, Tonya; Gonzalez-Garay, Manuel L; Jackson, Andrew R; Milosavljevic, Aleksandar; Tong, Mark; Killian, Christopher E; Livingston, Brian T; Wilt, Fred H; Adams, Nikki; Bellé, Robert; Carbonneau, Seth; Cheung, Rocky; Cormier, Patrick; Cosson, Bertrand; Croce, Jenifer; Fernandez-Guerra, Antonio; Genevière, Anne-Marie; Goel, Manisha; Kelkar, Hemant; Morales, Julia; Mulner-Lorillon, Odile; Robertson, Anthony J; Goldstone, Jared V; Cole, Bryan; Epel, David; Gold, Bert; Hahn, Mark E; Howard-Ashby, Meredith; Scally, Mark; Stegeman, John J; Allgood, Erin L; Cool, Jonah; Judkins, Kyle M; McCafferty, Shawn S; Musante, Ashlan M; Obar, Robert A; Rawson, Amanda P; Rossetti, Blair J; Gibbons, Ian R; Hoffman, Matthew P; Leone, Andrew; Istrail, Sorin; Materna, Stefan C; Samanta, Manoj P; Stolc, Viktor; Tongprasit, Waraporn; Tu, Qiang; Bergeron, Karl-Frederik; Brandhorst, Bruce P; Whittle, James; Berney, Kevin; Bottjer, David J; Calestani, Cristina; Peterson, Kevin; Chow, Elly; Yuan, Qiu Autumn; Elhaik, Eran; Graur, Dan; Reese, Justin T; Bosdet, 
Ian; Heesun, Shin; Marra, Marco A; Schein, Jacqueline; Anderson, Michele K; Brockton, Virginia; Buckley, Katherine M; Cohen, Avis H; Fugmann, Sebastian D; Hibino, Taku; Loza-Coll, Mariano; Majeske, Audrey J; Messier, Cynthia; Nair, Sham V; Pancer, Zeev; Terwilliger, David P; Agca, Cavit; Arboleda, Enrique; Chen, Nansheng; Churcher, Allison M; Hallböök, F; Humphrey, Glen W; Idris, Mohammed M; Kiyama, Takae; Liang, Shuguang; Mellott, Dan; Mu, Xiuqian; Murray, Greg; Olinski, Robert P; Raible, Florian; Rowe, Matthew; Taylor, John S; Tessmar-Raible, Kristin; Wang, D; Wilson, Karen H; Yaguchi, Shunsuke; Gaasterland, Terry; Galindo, Blanca E; Gunaratne, Herath J; Juliano, Celina; Kinukawa, Masashi; Moy, Gary W; Neill, Anna T; Nomura, Mamoru; Raisch, Michael; Reade, Anna; Roux, Michelle M; Song, Jia L; Su, Yi-Hsien; Townley, Ian K; Voronina, Ekaterina; Wong, Julian L; Amore, Gabriele; Branno, Margherita; Brown, Euan R; Cavalieri, Vincenzo; Duboc, Véronique; Duloquin, Louise; Flytzanis, Constantin; Gache, Christian; Lapraz, François; Lepage, Thierry; Locascio, Annamaria; Martinez, Pedro; Matassi, Giorgio; Matranga, Valeria; Range, Ryan; Rizzo, Francesca; Röttinger, Eric; Beane, Wendy; Bradham, Cynthia; Byrum, Christine; Glenn, Tom; Hussain, Sofia; Manning, Gerard; Miranda, Esther; Thomason, Rebecca; Walton, Katherine; Wikramanayke, Athula; Wu, Shu-Yu; Xu, Ronghui; Brown, C Titus; Chen, Lili; Gray, Rachel F; Lee, Pei Yun; Nam, Jongmin; Oliveri, Paola; Smith, Joel; Muzny, Donna; Bell, Stephanie; Chacko, Joseph; Cree, Andrew; Curry, Stacey; Davis, Clay; Dinh, Huyen; Dugan-Rocha, Shannon; Fowler, Jerry; Gill, Rachel; Hamilton, Cerrissa; Hernandez, Judith; Hines, Sandra; Hume, Jennifer; Jackson, Laronda; Jolivet, Angela; Kovar, Christie; Lee, Sandra; Lewis, Lora; Miner, George; Morgan, Margaret; Nazareth, Lynne V; Okwuonu, Geoffrey; Parker, David; Pu, Ling-Ling; Thorn, Rachel; Wright, Rita

2006-11-10

We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes.
Forward and reverse mutagenesis in C. elegans

PubMed Central

Kutscher, Lena M.; Shaham, Shai

2014-01-01

Mutagenesis drives natural selection. In the lab, mutations allow gene function to be deciphered. C. elegans is highly amenable to functional genetics because of its short generation time, ease of use, and wealth of available gene-alteration techniques. Here we provide an overview of historical and contemporary methods for mutagenesis in C. elegans, and discuss principles and strategies for forward (genome-wide mutagenesis) and reverse (target-selected and gene-specific mutagenesis) genetic studies in this animal. PMID:24449699
MOSAIC: an online database dedicated to the comparative genomics of bacterial strains at the intra-species level.

PubMed

Chiapello, Hélène; Gendrault, Annie; Caron, Christophe; Blum, Jérome; Petit, Marie-Agnès; El Karoui, Meriem

2008-11-27

The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic.
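The backbone/variable partition described above can be pictured as an interval complement on one genome. The following is a minimal sketch under stated assumptions: the conserved intervals are taken as given (in practice they would come from a whole-genome alignment), coordinates are half-open, and the function name `partition_genome` is hypothetical. It is not MOSAIC's actual implementation.

```python
def partition_genome(genome_length, conserved):
    """Split one genome into backbone and variable segments.

    conserved: list of (start, end) half-open intervals assumed to be
    conserved in all compared strains (hypothetical input format).
    Returns (backbone, variable) as sorted, non-overlapping intervals.
    """
    # Merge overlapping or touching conserved intervals into the backbone.
    backbone = []
    for start, end in sorted(conserved):
        if backbone and start <= backbone[-1][1]:
            backbone[-1] = (backbone[-1][0], max(backbone[-1][1], end))
        else:
            backbone.append((start, end))

    # Everything not covered by the backbone is a variable segment.
    variable, cursor = [], 0
    for start, end in backbone:
        if start > cursor:
            variable.append((cursor, start))
        cursor = end
    if cursor < genome_length:
        variable.append((cursor, genome_length))
    return backbone, variable


backbone, variable = partition_genome(100, [(10, 30), (25, 40), (60, 80)])
```

Working at the nucleotide level in this way keeps coding and non-coding regions on the same footing, which is the property the abstract highlights.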
Mining a database of single amplified genomes from Red Sea brine pool extremophiles—improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA)

PubMed Central

Grötzinger, Stefan W.; Alam, Intikhab; Ba Alawi, Wail; Bajic, Vladimir B.; Stingl, Ulrich; Eppinger, Jörg

2014-01-01

Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophiles' genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO) data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO)-terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO-terms and 49 consensus patterns. A subset of INDIGO-sequences consisting of 58 SAGs from six different taxa of bacteria and archaea were selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO-terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO-terms and/or consensus patterns) are present.
Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website. PMID:24778629
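The reliability rule in the abstract above (a hit counts as reliable when at least two relevant descriptors, GO-terms and/or consensus patterns, are present) can be sketched as a simple counting filter. This is an illustrative reading of that one sentence, not the authors' scripts: the function name `reliable_hits`, the input format, and the identifiers in the example are all hypothetical placeholders.

```python
from collections import defaultdict


def reliable_hits(profile_hits, pattern_hits, min_descriptors=2):
    """Keep genes supported by at least `min_descriptors` descriptors.

    profile_hits: iterable of (gene_id, go_term) pairs (profile filter).
    pattern_hits: iterable of (gene_id, prosite_id) pairs (pattern filter).
    Both input formats are assumptions made for this sketch.
    """
    descriptors = defaultdict(set)
    for gene_id, go_term in profile_hits:
        descriptors[gene_id].add(("GO", go_term))
    for gene_id, prosite_id in pattern_hits:
        descriptors[gene_id].add(("PROSITE", prosite_id))
    # Sets make repeated hits for the same descriptor count only once.
    return {
        gene: sorted(found)
        for gene, found in descriptors.items()
        if len(found) >= min_descriptors
    }


# Placeholder identifiers, for illustration only.
profile = [("gene1", "GO:term_A"), ("gene2", "GO:term_A")]
pattern = [("gene1", "PS:pattern_X"), ("gene3", "PS:pattern_X")]
result = reliable_hits(profile, pattern)
```

In this example only `gene1` is retained, because it is the only gene with two independent descriptors; `gene2` and `gene3` each have a single supporting hit.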
Microinjection-based RNA interference knockdown of ecdysteroid biosynthetic genes in a non-model hemipteran pest, Lygus hesperus (western tarnished plant bug)

USDA-ARS's Scientific Manuscript database

RNAi-mediated knockdown of target transcripts offers great potential, both in terms of insect functional genomics and the development of novel insect pest management strategies. Frequently, dsRNAs targeting transcripts of interest are introduced orally to the target organism via feeding. This delive...
Genome Reduction in Psychromonas Species within the Gut of an Amphipod from the Ocean's Deepest Point.

PubMed

Zhang, Weipeng; Tian, Ren-Mao; Sun, Jin; Bougouffa, Salim; Ding, Wei; Cai, Lin; Lan, Yi; Tong, Haoya; Li, Yongxin; Jamieson, Alan J; Bajic, Vladimir B; Drazen, Jeffrey C; Bartlett, Douglas; Qian, Pei-Yuan

2018-01-01

Amphipods are the dominant scavenging metazoan species in the Mariana Trench, the deepest known point in Earth's oceans. Here the gut microbiota of the amphipod Hirondellea gigas collected from the Challenger and Sirena Deeps of the Mariana Trench were investigated. The 11 amphipod individuals included for analyses were dominated by Psychromonas, of which a nearly complete genome was successfully recovered (designated CDP1). Compared with previously reported free-living Psychromonas strains, CDP1 has a highly reduced genome. Genome alignment showed deletion of the trimethylamine N-oxide (TMAO) reducing gene cluster in CDP1, suggesting that the "piezolyte" function of TMAO is more important than its function in respiration, which may lead to TMAO accumulation. In terms of nutrient utilization, the bacterium retains its central carbohydrate metabolism but lacks most of the extended carbohydrate utilization pathways, suggesting the confinement of Psychromonas to the host gut and sequestration from more variable environmental conditions. Moreover, CDP1 contains a complete formate hydrogenlyase complex, which might be involved in energy production. The genomic analyses imply that CDP1 may have developed adaptive strategies for a lifestyle within the gut of the hadal amphipod H. gigas. IMPORTANCE As a unique but poorly investigated habitat within marine ecosystems, hadal trenches have received interest in recent years. This study explores the gut microbial composition and function in hadal amphipods, which are among the dominant carrion feeders in hadal habitats. Further analyses of a dominant strain revealed genomic features that may contribute to its adaptation to the amphipod gut environment. Our findings provide new insights into animal-associated bacteria in the hadal biosphere.
Genome Reduction in Psychromonas Species within the Gut of an Amphipod from the Ocean’s Deepest Point

PubMed Central

Zhang, Weipeng; Tian, Ren-Mao; Sun, Jin; Bougouffa, Salim; Ding, Wei; Cai, Lin; Lan, Yi; Tong, Haoya; Li, Yongxin; Jamieson, Alan J.; Bajic, Vladimir B.; Drazen, Jeffrey C.; Bartlett, Douglas

2018-01-01

ABSTRACT Amphipods are the dominant scavenging metazoan species in the Mariana Trench, the deepest known point in Earth’s oceans. Here the gut microbiota of the amphipod Hirondellea gigas collected from the Challenger and Sirena Deeps of the Mariana Trench were investigated. The 11 amphipod individuals included for analyses were dominated by Psychromonas, of which a nearly complete genome was successfully recovered (designated CDP1). Compared with previously reported free-living Psychromonas strains, CDP1 has a highly reduced genome. Genome alignment showed deletion of the trimethylamine N-oxide (TMAO) reducing gene cluster in CDP1, suggesting that the “piezolyte” function of TMAO is more important than its function in respiration, which may lead to TMAO accumulation. In terms of nutrient utilization, the bacterium retains its central carbohydrate metabolism but lacks most of the extended carbohydrate utilization pathways, suggesting the confinement of Psychromonas to the host gut and sequestration from more variable environmental conditions. Moreover, CDP1 contains a complete formate hydrogenlyase complex, which might be involved in energy production. The genomic analyses imply that CDP1 may have developed adaptive strategies for a lifestyle within the gut of the hadal amphipod H. gigas. IMPORTANCE As a unique but poorly investigated habitat within marine ecosystems, hadal trenches have received interest in recent years. This study explores the gut microbial composition and function in hadal amphipods, which are among the dominant carrion feeders in hadal habitats. Further analyses of a dominant strain revealed genomic features that may contribute to its adaptation to the amphipod gut environment. Our findings provide new insights into animal-associated bacteria in the hadal biosphere. PMID:29657971
Fanconi anemia and the cell cycle: new perspectives on aneuploidy

PubMed Central

2014-01-01

Fanconi anemia (FA) is a complex heterogenic disorder of genomic instability, bone marrow failure, cancer predisposition, and congenital malformations. The FA signaling network orchestrates the DNA damage recognition and repair in interphase as well as proper execution of mitosis. Loss of FA signaling causes chromosome instability by weakening the spindle assembly checkpoint, disrupting centrosome maintenance, disturbing resolution of ultrafine anaphase bridges, and dysregulating cytokinesis. Thus, the FA genes function as guardians of genome stability throughout the cell cycle. This review discusses recent advances in diagnosis and clinical management of Fanconi anemia and presents the new insights into the origins of genomic instability in FA. These new discoveries may facilitate the development of rational therapeutic strategies for FA and for FA-deficient malignancies in the general population. PMID:24765528
Evolutionary scalpels for dissecting tumor ecosystems

PubMed Central

Rosenbloom, Daniel I. S.; Camara, Pablo G.; Chu, Tim; Rabadan, Raul

2017-01-01

Amidst the growing literature on cancer genomics and intratumor heterogeneity, essential principles in evolutionary biology recur time and time again. Here we use these principles to guide the reader through major advances in cancer research, highlighting issues of “hit hard, hit early” treatment strategies, drug resistance, and metastasis. We distinguish between two frameworks for understanding heterogeneous tumors, both of which can inform treatment strategies: (1) The tumor as diverse ecosystem, a Darwinian population of sometimes-competing, sometimes-cooperating cells; (2) The tumor as tightly integrated, self-regulating organ, which may hijack developmental signals to restore functional heterogeneity after treatment. While the first framework dominates literature on cancer evolution, the second framework enjoys support as well. Throughout this review, we illustrate how mathematical models inform understanding of tumor progression and treatment outcomes. Connecting models to genomic data faces computational and technical hurdles, but high-throughput single-cell technologies show promise to clear these hurdles. PMID:27923679
Total chemical synthesis of modified histones

NASA Astrophysics Data System (ADS)

Qi, Yun-Kun; Ai, Hua-Song; Li, Yi-Ming; Yan, Baihui

2018-02-01

In the post-genome era, epigenetics has received increasing attention. The post-translational modifications (PTMs) of the four core histones play central roles in epigenetic regulation of the eukaryotic genome, either by directly altering the biophysical properties of nucleosomes or by recruiting effector proteins. To study the biological functions and structural mechanisms of these histone PTMs, an obligatory step is to prepare a sufficient amount of homogeneously modified histones. This task cannot be fully accomplished by either recombinant technology or enzymatic modification. In this context, synthetic chemists have developed novel protein synthetic tools and state-of-the-art chemical ligation strategies for the preparation of homogeneously modified histones. In this review, we summarize recent advances in the preparation of modified histones, focusing on total chemical synthesis strategies. The importance and potential of synthetic chemistry for the study of the histone code will also be discussed.

Cow genotyping strategies for genomic selection in small dairy cattle population

USDA-ARS's Scientific Manuscript database

This study compares how different cow genotyping strategies increase the accuracy of genomic estimated breeding values (EBV) in dairy cattle breeds with low numbers. In these breeds there are few sires with progeny records and genotyping cows can improve the accuracy of genomic EBV. The Guernsey bre...
Methods of Genomic Competency Integration in Practice

PubMed Central

Jenkins, Jean; Calzone, Kathleen A.; Caskey, Sarah; Culp, Stacey; Weiner, Marsha; Badzek, Laurie

2015-01-01

Purpose Genomics is increasingly relevant to health care, necessitating support for nurses to incorporate genomic competencies into practice. The primary aim of this project was to develop, implement, and evaluate a year-long genomic education intervention that trained, supported, and supervised institutional administrator and educator champion dyads to increase nursing capacity to integrate genomics through assessments of program satisfaction and institutional achieved outcomes. Design Longitudinal study of 23 Magnet Recognition Program® Hospitals (21 intervention, 2 controls) participating in a 1-year new competency integration effort aimed at increasing genomic nursing competency and overcoming barriers to genomics integration in practice. Methods Champion dyads underwent genomic training consisting of one in-person kick-off training meeting followed by monthly education webinars. Champion dyads designed institution-specific action plans detailing objectives, methods or strategies used to engage and educate nursing staff, timeline for implementation, and outcomes achieved. Action plans focused on a minimum of seven genomic priority areas: champion dyad personal development; practice assessment; policy content assessment; staff knowledge needs assessment; staff development; plans for integration; and anticipated obstacles and challenges. Action plans were updated quarterly, outlining progress made as well as inclusion of new methods or strategies. Progress was validated through virtual site visits with the champion dyads and chief nursing officers. Descriptive data were collected on all strategies or methods utilized, and timeline for achievement. Descriptive data were analyzed using content analysis. Findings The complexity of the competency content and the uniqueness of social systems and infrastructure resulted in a significant variation of champion dyad interventions. Conclusions Nursing champions can facilitate change in genomic nursing capacity through varied strategies but require substantial training in order to design and implement interventions. Clinical Relevance Genomics is critical to the practice of all nurses. There is a great opportunity and interest to address genomic knowledge deficits in the practicing nurse workforce as a strategy to improve patient outcomes. Exemplars of champion dyad interventions designed to increase nursing capacity focus on improving education, policy, and healthcare services. PMID:25808828
A high-throughput Sanger strategy for human mitochondrial genome sequencing

PubMed Central

2013-01-01

Background A population reference database of complete human mitochondrial genome (mtGenome) sequences is needed to enable the use of mitochondrial DNA (mtDNA) coding region data in forensic casework applications. However, the development of entire mtGenome haplotypes to forensic data quality standards is difficult and laborious. A Sanger-based amplification and sequencing strategy that is designed for automated processing, yet routinely produces high quality sequences, is needed to facilitate high-volume production of these mtGenome data sets. Results We developed a robust 8-amplicon Sanger sequencing strategy that regularly produces complete, forensic-quality mtGenome haplotypes in the first pass of data generation. The protocol works equally well on samples representing diverse mtDNA haplogroups and DNA input quantities ranging from 50 pg to 1 ng, and can be applied to specimens of varying DNA quality. The complete workflow was specifically designed for implementation on robotic instrumentation, which increases throughput and reduces both the opportunities for error inherent to manual processing and the cost of generating full mtGenome sequences. Conclusions The described strategy will assist efforts to generate complete mtGenome haplotypes which meet the highest data quality expectations for forensic genetic and other applications. Additionally, high-quality data produced using this protocol can be used to assess mtDNA data developed using newer technologies and chemistries. Further, the amplification strategy can be used to enrich for mtDNA as a first step in sample preparation for targeted next-generation sequencing. PMID:24341507
Genome-based Modeling and Design of Metabolic Interactions in Microbial Communities

PubMed Central

Mahadevan, Radhakrishnan; Henson, Michael A.

2012-01-01

Biotechnology research is traditionally focused on individual microbial strains that are perceived to have the necessary metabolic functions, or the capability to have these functions introduced, to achieve a particular task. For many important applications, the development of such omnipotent microbes is an extremely challenging if not impossible task. By contrast, nature employs a radically different strategy based on synergistic combinations of different microbial species that collectively achieve the desired task. These natural communities have evolved to exploit the native metabolic capabilities of each species and are highly adaptive to changes in their environments. However, microbial communities have proven difficult to study due to a lack of suitable experimental and computational tools. With the advent of genome sequencing, omics technologies, bioinformatics and genome-scale modeling, researchers now have unprecedented capabilities to analyze and engineer the metabolism of microbial communities. The goal of this review is to summarize recent applications of genome-scale metabolic modeling to microbial communities. A brief introduction to lumped community models is used to motivate the need for genome-level descriptions of individual species and their metabolic interactions. The review of genome-scale models begins with static modeling approaches, which are appropriate for communities where the extracellular environment can be assumed to be time invariant or slowly varying. Dynamic extensions of the static modeling approach are described, and then applications of genome-scale models for design of synthetic microbial communities are reviewed. The review concludes with a summary of metagenomic tools for analyzing community metabolism and an outlook for future research. PMID:24688668
Integrated genomics and molecular breeding approaches for dissecting the complex quantitative traits in crop plants.

PubMed

Kujur, Alice; Saxena, Maneesha S; Bajaj, Deepak; Laxmi; Parida, Swarup K

2013-12-01

Enormous population growth, climate change and global warming are now considered major threats to agriculture and the world's food security. To improve the productivity and sustainability of agriculture, the development of high-yielding, durable, abiotic and biotic stress-tolerant cultivars and climate-resilient crops is essential. Hence, understanding the molecular mechanisms and dissecting complex quantitative yield and stress tolerance traits are the prime objectives of current agricultural biotechnology research. In recent years, tremendous progress has been made in plant genomics and molecular breeding research pertaining to conventional and next-generation whole genome, transcriptome and epigenome sequencing efforts, generation of huge genomic, transcriptomic and epigenomic resources and development of modern genomics-assisted breeding approaches in diverse crop genotypes with contrasting yield and abiotic stress tolerance traits. Unfortunately, the detailed molecular mechanisms and gene regulatory networks controlling such complex quantitative traits are not yet well understood in crop plants. Therefore, we propose an integrated strategy involving the enormous and diverse traditional and modern -omics (structural, functional, comparative and epigenomics) approaches and resources now available, together with genomics-assisted breeding methods, which agricultural biotechnologists can adopt to dissect and decode the molecular and gene regulatory networks involved in the complex quantitative yield and stress tolerance traits in crop plants. This would provide clues and much-needed inputs for rapid selection of novel functionally relevant molecular tags regulating such complex traits to expedite traditional and modern marker-assisted genetic enhancement studies in target crop species for developing high-yielding stress-tolerant varieties.
Construction of a dairy microbial genome catalog opens new perspectives for the metagenomic analysis of dairy fermented products.

PubMed

Almeida, Mathieu; Hébert, Agnès; Abraham, Anne-Laure; Rasmussen, Simon; Monnet, Christophe; Pons, Nicolas; Delbès, Céline; Loux, Valentin; Batto, Jean-Michel; Leonard, Pierre; Kennedy, Sean; Ehrlich, Stanislas Dusko; Pop, Mihai; Montel, Marie-Christine; Irlinger, Françoise; Renault, Pierre

2014-12-13

Microbial communities of traditional cheeses are complex and insufficiently characterized. The origin, safety and functional role in cheese making of these microbial communities are still not well understood. Metagenomic analysis of these communities by high-throughput shotgun sequencing is a promising approach to characterize their genomic and functional profiles. Such analyses, however, critically depend on the availability of appropriate reference genome databases against which the sequencing reads can be aligned. We built a reference genome catalog suitable for short-read metagenomic analysis using a low-cost sequencing strategy. We selected 142 bacteria isolated from dairy products, belonging to 137 different species and 67 genera, and succeeded in reconstructing the draft genomes of 117 of them at a standard or high quality level, including isolates from the genera Kluyvera, Luteococcus and Marinilactibacillus, which were still missing from public databases. To demonstrate the potential of this catalog, we analysed the microbial composition of the surface of two smear cheeses and one blue-veined cheese, and showed that a significant part of the microbiota of these traditional cheeses was composed of microorganisms newly sequenced in our study. Our study provides data which, combined with publicly available genome references, represent the most expansive catalog to date of cheese-associated bacteria. Using this extended dairy catalog, we revealed the presence in traditional cheese of dominant microorganisms not deliberately inoculated, mainly Gram-negative bacteria such as Pseudoalteromonas haloplanktis or Psychrobacter immobilis, that may contribute to the characteristics of cheese produced through traditional methods.
Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics.

PubMed

Kelly, Benjamin J; Fitch, James R; Hu, Yangqiu; Corsmeier, Donald J; Zhong, Huachun; Wetzel, Amy N; Nordquist, Russell D; Newsom, David L; White, Peter

2015-01-20

While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that scale poorly, are complex to implement and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.
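To make the idea of balanced, deterministic regional parallelization concrete, the following is a minimal Python sketch. It is not Churchill's implementation: the function name `balanced_jobs`, the interval representation and the toy chromosome lengths are all assumptions made for illustration. It only shows one way a genome can be cut into jobs of nearly equal size, with the same input always yielding the same partition.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) with 0-based, half-open coordinates
Interval = Tuple[str, int, int]


def balanced_jobs(chrom_lengths: Dict[str, int], n_jobs: int) -> List[List[Interval]]:
    """Split a genome into n_jobs work units whose sizes differ by at most one base.

    Hypothetical helper for illustration only. A job may span a chromosome
    boundary, in which case it holds more than one interval.
    """
    if n_jobs <= 0:
        raise ValueError("n_jobs must be positive")
    total = sum(chrom_lengths.values())
    base, extra = divmod(total, n_jobs)
    # The first `extra` jobs receive one additional base each.
    quotas = [base + (1 if i < extra else 0) for i in range(n_jobs)]
    jobs: List[List[Interval]] = [[] for _ in range(n_jobs)]
    job = 0
    remaining = quotas[0]
    # Dictionaries keep insertion order, so the walk is deterministic.
    for chrom, length in chrom_lengths.items():
        start = 0
        while start < length:
            while remaining == 0:  # current job is full, move to the next one
                job += 1
                remaining = quotas[job]
            take = min(remaining, length - start)
            jobs[job].append((chrom, start, start + take))
            start += take
            remaining -= take
    return jobs


if __name__ == "__main__":
    toy_genome = {"chr1": 10, "chr2": 5, "chr3": 6}  # made-up lengths
    for index, intervals in enumerate(balanced_jobs(toy_genome, 3)):
        print(index, intervals)
```

Each job could then be handed to a separate worker. Equal job sizes keep workers from idling while one oversized region finishes, which is the load-balancing concern the abstract raises.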
Mitigation of inbreeding while preserving genetic gain in genomic breeding programs for outbred plants.

PubMed

Lin, Zibei; Shi, Fan; Hayes, Ben J; Daetwyler, Hans D

2017-05-01

Heuristic genomic inbreeding controls reduce inbreeding in genomic breeding schemes without reducing genetic gain. Genomic selection is increasingly being implemented in plant breeding programs to accelerate genetic gain of economically important traits. However, it may cause significant loss of genetic diversity when compared with traditional schemes using phenotypic selection. We propose heuristic strategies to control the rate of inbreeding in outbred plants, which can be categorised into three types: controls during mate allocation, during selection, and simultaneous selection and mate allocation. The proposed mate allocation measure GminF allocates two or more parents for mating in mating groups that minimise coancestry using a genomic relationship matrix. Two types of relationship-adjusted genomic breeding values for parent selection candidates ([Formula: see text]) and potential offspring ([Formula: see text]) are devised to control inbreeding during selection and even enabling simultaneous selection and mate allocation. These strategies were tested in a case study using a simulated perennial ryegrass breeding scheme. As compared to the genomic selection scheme without controls, all proposed strategies could significantly decrease inbreeding while achieving comparable genetic gain. In particular, the scenario using [Formula: see text] in simultaneous selection and mate allocation reduced inbreeding to one-third of the original genomic selection scheme. The proposed strategies are readily applicable in any outbred plant breeding program.
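As an illustration of controlling inbreeding at the mate allocation step, the sketch below pairs selected parents so that closely related individuals are not mated. It is a simple greedy heuristic written for this listing, not the GminF measure itself; the function name `greedy_low_relationship_pairs` and the toy relationship matrix are assumptions.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def greedy_low_relationship_pairs(
    G: Sequence[Sequence[float]], candidates: Sequence[int]
) -> List[Tuple[int, int]]:
    """Pair candidates, always taking the least related unpaired pair first.

    G is a symmetric genomic relationship matrix indexed by individual.
    Greedy choice is a heuristic and does not guarantee the global minimum.
    """
    unpaired = set(candidates)
    # Rank all possible pairs by relationship; ties break on the index pair.
    ranked = sorted(
        combinations(sorted(candidates), 2),
        key=lambda pair: (G[pair[0]][pair[1]], pair),
    )
    pairs: List[Tuple[int, int]] = []
    for i, j in ranked:
        if i in unpaired and j in unpaired:
            pairs.append((i, j))
            unpaired -= {i, j}
    return pairs


if __name__ == "__main__":
    # Made-up relationship matrix for four selected parents.
    G = [
        [1.00, 0.50, 0.10, 0.20],
        [0.50, 1.00, 0.30, 0.05],
        [0.10, 0.30, 1.00, 0.40],
        [0.20, 0.05, 0.40, 1.00],
    ]
    print(greedy_low_relationship_pairs(G, [0, 1, 2, 3]))
```

In this toy case the two most closely related parents (0 and 1) are never mated with each other. An exact solution would treat the problem as minimum-weight matching, but the greedy version is enough to show how a relationship matrix can drive mate allocation.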
Structure and Function of the Splice Variants of TMPRSS2-ERG, a Prevalent Genomic Alteration in Prostate Cancer

DTIC Science & Technology

2009-09-01

binding ETS domain) and five type II (without ETS domain). Fusion-positive type I– and type II–containing phages were amplified with T3 and T7 primers...will be performed to identify the authentic 3’ UTRs from the mRNA pool from CaP patient specimens. Using phage excision strategy, we will use to... phage DNA sequences plasmids (cDNA) clones were generated by using phage excision strategy. Figure 1. ERG splice variants in prostate cancer
Modelling Human Regulatory Variation in Mouse: Finding the Function in Genome-Wide Association Studies and Whole-Genome Sequencing

PubMed Central

Schmouth, Jean-François; Bonaguro, Russell J.; Corso-Diaz, Ximena; Simpson, Elizabeth M.

2012-01-01

An increasing body of literature from genome-wide association studies and human whole-genome sequencing highlights the identification of large numbers of candidate regulatory variants of potential therapeutic interest in numerous diseases. Our relatively poor understanding of the functions of non-coding genomic sequence, and the slow and laborious process of experimental validation of the functional significance of human regulatory variants, limits our ability to fully benefit from this information in our efforts to comprehend human disease. Humanized mouse models (HuMMs), in which human genes are introduced into the mouse, suggest an approach to this problem. In the past, HuMMs have been used successfully to study human disease variants; e.g., the complex genetic condition arising from Down syndrome, common monogenic disorders such as Huntington disease and β-thalassemia, and cancer susceptibility genes such as BRCA1. In this commentary, we highlight a novel method for high-throughput single-copy site-specific generation of HuMMs entitled High-throughput Human Genes on the X Chromosome (HuGX). This method can be applied to most human genes for which a bacterial artificial chromosome (BAC) construct can be derived and a mouse-null allele exists. This strategy comprises (1) the use of recombineering technology to create a human variant–harbouring BAC, (2) knock-in of this BAC into the mouse genome using Hprt docking technology, and (3) allele comparison by interspecies complementation. We demonstrate the throughput of the HuGX method by generating a series of seven different alleles for the human NR2E1 gene at Hprt. In future challenges, we consider the current limitations of experimental approaches and call for a concerted effort by the genetics community, for both human and mouse, to solve the challenge of the functional analysis of human regulatory variation. PMID:22396661
Shortening tobacco life cycle accelerates functional gene identification in genomic research.

PubMed

Ning, G; Xiao, X; Lv, H; Li, X; Zuo, Y; Bao, M

2012-11-01

Definitive allocation of function requires the introduction of genetic mutations and analysis of their phenotypic consequences. Novel, rapid and convenient techniques or materials are therefore valuable for accelerating gene identification in functional genomics research. Here, over-expression of PmFT (Prunus mume), a novel FT orthologue, and PtFT (Populus tremula) led to shortening of the tobacco life cycle. A series of novel, stable short life cycle tobacco lines (30-50 days) were developed through repeated self-crossing selection breeding. Based on a second transformation via a gusA reporter gene, the promoter from BpFULL1 in silver birch (Betula pendula) and the gene (CPC) from Arabidopsis thaliana were effectively tested using the short life cycle tobacco lines. Comparative analysis among wild type tobacco, short life cycle tobacco and the Arabidopsis transformation system verified that it is possible to accelerate functional gene studies by shortening the life cycle of the host plant material, at least in these short life cycle tobacco lines. The results verified that the novel short life cycle transgenic tobacco lines not only combine the advantages of economic nursery requirements and a simple transformation system, but also provide a robust, effective and stable host system to accelerate gene analysis. Thus, shortening the tobacco life cycle is a feasible strategy to accelerate heterologous or homologous functional gene identification in genomic research. © 2012 German Botanical Society and The Royal Botanical Society of the Netherlands.
The Genome of the Sea Urchin Strongylocentrotus purpuratus

PubMed Central

2011-01-01

We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes. PMID:17095691
Improvement of genome assembly completeness and identification of novel full-length protein-coding genes by RNA-seq in the giant panda genome.

PubMed

Chen, Meili; Hu, Yibo; Liu, Jingxing; Wu, Qi; Zhang, Chenglin; Yu, Jun; Xiao, Jingfa; Wei, Fuwen; Wu, Jiayan

2015-12-11

High-quality and complete gene models are the basis of whole genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced on the basis of solely short reads, but the genome annotation had lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to globally improve the genome assembly completeness and to detect novel expressed transcripts in 12 tissues from giant pandas, by using a transcriptome reconstruction strategy that combined reference-based and de novo methods. Several aspects of genome assembly completeness in the transcribed regions were effectively improved by the de novo assembled transcripts, including genome scaffolding, the detection of small-size assembly errors, the extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-white pelage, pseudo-thumb and delayed embryonic implantation of giant pandas. The updated genome annotation will help further giant panda studies from both structural and functional perspectives.
Community engagement strategies for genomic studies in Africa: a review of the literature.

PubMed

Tindana, Paulina; de Vries, Jantina; Campbell, Megan; Littler, Katherine; Seeley, Janet; Marshall, Patricia; Troyer, Jennifer; Ogundipe, Morisola; Alibu, Vincent Pius; Yakubu, Aminu; Parker, Michael

2015-04-12

Community engagement has been recognised as an important aspect of the ethical conduct of biomedical research, especially when research is focused on ethnically or culturally distinct populations. While this is a generally accepted tenet of biomedical research, it is unclear what components are necessary for effective community engagement, particularly in the context of genomic research in Africa. We conducted a review of the published literature to identify the community engagement strategies that can support the successful implementation of genomic studies in Africa. Our search strategy involved using online databases: PubMed (National Library of Medicine), Medline and Google Scholar. Search terms included a combination of the following: community engagement, community advisory boards, community consultation, community participation, effectiveness, genetic and genomic research, Africa, developing countries. A total of 44 articles and 1 thesis were retrieved, of which 38 met the selection criteria. Of these, 21 were primary studies on community engagement, while the rest were secondary reports on community engagement efforts in biomedical research studies. Thirty-four related to biomedical research generally, while 4 were specific to genetic and genomic research in Africa. We concluded that there were several community engagement strategies that could support genomic studies in Africa. While many of the strategies could support the early stages of a research project such as the recruitment of research participants, further research is needed to identify effective strategies to engage research participants and their communities beyond the participant recruitment stage. Research is also needed to address how the views of local communities should be incorporated into future uses of human biological samples. Finally, studies evaluating the impact of community engagement (CE) on genetic research are lacking. Systematic evaluation of CE strategies is essential to determine the most effective models of CE for genetic and genomic research conducted in African settings.
Potential pitfalls of CRISPR/Cas9-mediated genome editing.

PubMed

Peng, Rongxue; Lin, Guigao; Li, Jinming

2016-04-01

Recently, a novel technique named the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas)9 system has been rapidly developed. This genome editing tool has improved our ability tremendously with respect to exploring the pathogenesis of diseases and correcting disease mutations, as well as phenotypes. With a short guide RNA, Cas9 can be precisely directed to target sites, and functions as an endonuclease to efficiently produce breaks in DNA double strands. Over the past 30 years, CRISPR has evolved from the 'curious sequences of unknown biological function' into a promising genome editing tool. As a result of the incessant development in the CRISPR/Cas9 system, Cas9 co-expressed with custom guide RNAs has been successfully used in a variety of cells and organisms. This genome editing technology can also be applied to synthetic biology, functional genomic screening, transcriptional modulation and gene therapy. However, although CRISPR/Cas9 has a broad range of action in science, there are several aspects that affect its efficiency and specificity, including Cas9 activity, target site selection and short guide RNA design, delivery methods, off-target effects and the incidence of homology-directed repair. In the present review, we highlight the factors that affect the utilization of CRISPR/Cas9, as well as possible strategies for handling any problems. Addressing these issues will allow us to take better advantage of this technique. In addition, we also review the history and rapid development of the CRISPR/Cas system from the time of its initial discovery in 2012. © 2015 FEBS.
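Target site selection, one of the factors listed above, can be illustrated with a short Python sketch. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3' of a roughly 20-nucleotide target. The function names are hypothetical, and real guide design tools also score off-target risk and sequence composition, which this sketch does not attempt.

```python
import re
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_protospacers(seq: str, guide_len: int = 20) -> List[Tuple[str, int, str]]:
    """List candidate target sites followed by an NGG PAM on either strand.

    Returns (strand, start, protospacer) tuples. For the "-" strand the
    start coordinate refers to the reverse-complemented sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len)
    hits: List[Tuple[str, int, str]] = []
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    toy = "A" * 20 + "TGG"  # made-up sequence with a single forward-strand site
    print(find_protospacers(toy))
```

Candidates found this way would still need to be screened for off-target matches elsewhere in the genome, which is one of the pitfalls the review discusses.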
Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data

PubMed Central

Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P

2018-01-01

Abstract Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets. PMID:29618048
Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains.

PubMed

Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cuff, Alison; Dana, Jose M; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Rackham, Owen J L; Smith, James; Sternberg, Michael J E; Velankar, Sameer; Yeats, Corin; Orengo, Christine

2013-01-01

Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).
Mammalian genomic regulatory regions predicted by utilizing human genomics, transcriptomics, and epigenetics data.

PubMed

Nguyen, Quan H; Tellam, Ross L; Naval-Sanchez, Marina; Porto-Neto, Laercio R; Barendse, William; Reverter, Antonio; Hayes, Benjamin; Kijas, James; Dalrymple, Brian P

2018-03-01

Genome sequences for hundreds of mammalian species are available, but an understanding of their genomic regulatory regions, which control gene expression, is only beginning. A comprehensive prediction of potential active regulatory regions is necessary to functionally study the roles of the majority of genomic variants in evolution, domestication, and animal production. We developed a computational method to predict regulatory DNA sequences (promoters, enhancers, and transcription factor binding sites) in production animals (cows and pigs) and extended its broad applicability to other mammals. The method utilizes human regulatory features identified from thousands of tissues, cell lines, and experimental assays to find homologous regions that are conserved in sequences and genome organization and are enriched for regulatory elements in the genome sequences of other mammalian species. Importantly, we developed a filtering strategy, including a machine learning classification method, to utilize a very small number of species-specific experimental datasets available to select for the likely active regulatory regions. The method finds the optimal combination of sensitivity and accuracy to unbiasedly predict regulatory regions in mammalian species. Furthermore, we demonstrated the utility of the predicted regulatory datasets in cattle for prioritizing variants associated with multiple production and climate change adaptation traits and identifying potential genome editing targets.
mQTL-seq delineates functionally relevant candidate gene harbouring a major QTL regulating pod number in chickpea

PubMed Central

Das, Shouvik; Singh, Mohar; Srivastava, Rishi; Bajaj, Deepak; Saxena, Maneesha S.; Rana, Jai C.; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.

2016-01-01

The present study used a whole-genome, NGS resequencing-based mQTL-seq (multiple QTL-seq) strategy in two inter-specific mapping populations (Pusa 1103 × ILWC 46 and Pusa 256 × ILWC 46) to scan the major genomic region(s) underlying QTL(s) governing pod number trait in chickpea. Essentially, the whole-genome resequencing of low and high pod number-containing parental accessions and homozygous individuals (constituting bulks) from each of these two mapping populations discovered >8 million high-quality homozygous SNPs with respect to the reference kabuli chickpea. The functional significance of the physically mapped SNPs was apparent from the identified 2,264 non-synonymous and 23,550 regulatory SNPs, with 8–10% of these SNP-carrying genes corresponding to transcription factors and disease resistance-related proteins. The utilization of these mined SNPs in Δ (SNP index)-led QTL-seq analysis and their correlation between two mapping populations based on mQTL-seq, narrowed down two (CaqaPN4.1: 867.8 kb and CaqaPN4.2: 1.8 Mb) major genomic regions harbouring robust pod number QTLs into the high-resolution short QTL intervals (CaqbPN4.1: 637.5 kb and CaqbPN4.2: 1.28 Mb) on chickpea chromosome 4. The integration of one novel robust QTL derived from mQTL-seq with QTL region-specific association analysis delineated one pentatricopeptide repeat (PPR) gene, containing regulatory (C/T) and coding (C/A) SNPs, at a major QTL region regulating pod number in chickpea. This target gene exhibited anther, mature pollen and pod-specific expression, including markedly higher (∼3.5-fold) transcript expression in high pod number-containing parental accessions and homozygous individuals of two mapping populations, especially during pollen and pod development. The proposed mQTL-seq-driven combinatorial strategy has profound efficacy in rapid genome-wide scanning of potential candidate gene(s) underlying trait-associated high-resolution robust QTL(s), thereby expediting genomics-assisted breeding and genetic enhancement of crop plants, including chickpea. PMID:26685680
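The Δ (SNP index) statistic named in this abstract can be sketched as follows. The sketch uses the usual QTL-seq definition, in which the SNP index at a site is the fraction of reads carrying the non-reference allele and Δ (SNP index) is the difference between the two bulks. The function names, the read counts, and the choice to subtract the low bulk from the high bulk are illustrative assumptions, not details taken from the study.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

def delta_snp_index(high_bulk, low_bulk):
    """Difference in SNP index between two bulks at one site.

    Each argument is an (alt_reads, total_reads) pair. Subtracting the low
    bulk from the high bulk is a convention assumed for this sketch.
    """
    return snp_index(*high_bulk) - snp_index(*low_bulk)

# Hypothetical read counts at one SNP: 18 of 20 reads carry the alternate
# allele in the high pod number bulk, 2 of 20 in the low pod number bulk.
print(round(delta_snp_index((18, 20), (2, 20)), 3))  # 0.8
```

Values near +1 or -1 indicate a site where the two bulks are fixed for different alleles, and values near 0 indicate no difference. In practice the statistic is averaged over sliding windows along each chromosome before candidate intervals are called.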

Integrative molecular network analysis identifies emergent enzalutamide resistance mechanisms in prostate cancer

PubMed Central

King, Carly J.; Woodward, Josha; Schwartzman, Jacob; Coleman, Daniel J.; Lisac, Robert; Wang, Nicholas J.; Van Hook, Kathryn; Gao, Lina; Urrutia, Joshua; Dane, Mark A.; Heiser, Laura M.; Alumkal, Joshi J.

2017-01-01

Recent work demonstrates that castration-resistant prostate cancer (CRPC) tumors harbor countless genomic aberrations that control many hallmarks of cancer. While some specific mutations in CRPC may be actionable, many others are not. We hypothesized that genomic aberrations in cancer may operate in concert to promote drug resistance and tumor progression, and that organization of these genomic aberrations into therapeutically targetable pathways may improve our ability to treat CRPC. To identify the molecular underpinnings of enzalutamide-resistant CRPC, we performed transcriptional and copy number profiling studies using paired enzalutamide-sensitive and resistant LNCaP prostate cancer cell lines. Gene networks associated with enzalutamide resistance were revealed by performing an integrative genomic analysis with the PAthway Representation and Analysis by Direct Reference on Graphical Models (PARADIGM) tool. Amongst the pathways enriched in the enzalutamide-resistant cells were those associated with MEK, EGFR, RAS, and NFKB. Functional validation studies of 64 genes identified 10 candidate genes whose suppression led to greater effects on cell viability in enzalutamide-resistant cells as compared to sensitive parental cells. Examination of a patient cohort demonstrated that several of our functionally-validated gene hits are deregulated in metastatic CRPC tumor samples, suggesting that they may be clinically relevant therapeutic targets for patients with enzalutamide-resistant CRPC. Altogether, our approach demonstrates the potential of integrative genomic analyses to clarify determinants of drug resistance and rational co-targeting strategies to overcome resistance. PMID:29340039
pTC Plasmids from Sulfolobus Species in the Geothermal Area of Tengchong, China: Genomic Conservation and Naturally-Occurring Variations as a Result of Transposition by Mobile Genetic Elements

PubMed Central

Xiang, Xiaoyu; Huang, Xiaoxing; Wang, Haina; Huang, Li

2015-01-01

Plasmids occur frequently in Archaea. A novel plasmid (denoted pTC1) containing typical conjugation functions has been isolated from Sulfolobus tengchongensis RT8-4, a strain obtained from a hot spring in Tengchong, China, and characterized. The plasmid is a circular double-stranded DNA molecule of 20,417 bp. Among a total of 26 predicted pTC1 ORFs, 23 have homologues in other known Sulfolobus conjugative plasmids (CPs). pTC1 resembles other Sulfolobus CPs in genome architecture, and is most highly conserved in the genomic region encoding conjugation functions. However, attempts to demonstrate experimentally the capacity of the plasmid for conjugational transfer were unsuccessful. A survey revealed that pTC1 and its closely related plasmid variants were widespread in the geothermal area of Tengchong. Variations of the plasmids at the target sites for transposition by an insertion sequence (IS) and a miniature inverted-repeat transposable element (MITE) were readily detected. The IS was efficiently inserted into the pTC1 genome, and the inserted sequence was inactivated and degraded more frequently in an imprecise manner than in a precise manner. These results suggest that the host organism has evolved a strategy to maintain a balance between the insertion and elimination of mobile genetic elements to permit genomic plasticity while inhibiting their fast spreading. PMID:25686154
pTC Plasmids from Sulfolobus Species in the Geothermal Area of Tengchong, China: Genomic Conservation and Naturally-Occurring Variations as a Result of Transposition by Mobile Genetic Elements.

PubMed

Xiang, Xiaoyu; Huang, Xiaoxing; Wang, Haina; Huang, Li

2015-02-12

Plasmids occur frequently in Archaea. A novel plasmid (denoted pTC1) containing typical conjugation functions has been isolated from Sulfolobus tengchongensis RT8-4, a strain obtained from a hot spring in Tengchong, China, and characterized. The plasmid is a circular double-stranded DNA molecule of 20,417 bp. Among a total of 26 predicted pTC1 ORFs, 23 have homologues in other known Sulfolobus conjugative plasmids (CPs). pTC1 resembles other Sulfolobus CPs in genome architecture, and is most highly conserved in the genomic region encoding conjugation functions. However, attempts to demonstrate experimentally the capacity of the plasmid for conjugational transfer were unsuccessful. A survey revealed that pTC1 and its closely related plasmid variants were widespread in the geothermal area of Tengchong. Variations of the plasmids at the target sites for transposition by an insertion sequence (IS) and a miniature inverted-repeat transposable element (MITE) were readily detected. The IS was efficiently inserted into the pTC1 genome, and the inserted sequence was inactivated and degraded more frequently in an imprecise manner than in a precise manner. These results suggest that the host organism has evolved a strategy to maintain a balance between the insertion and elimination of mobile genetic elements to permit genomic plasticity while inhibiting their fast spreading.
Sequence and functional characterization of MIRNA164 promoters from Brassica shows copy number dependent regulatory diversification among homeologs.

PubMed

Jain, Aditi; Anand, Saurabh; Singh, Neer K; Das, Sandip

2018-03-12

The impact of polyploidy on functional diversification of cis-regulatory elements is poorly understood. This is primarily on account of lack of well-defined structure of cis-elements and a universal regulatory code. To the best of our knowledge, this is the first report on characterization of sequence and functional diversification of paralogous and homeologous promoter elements associated with MIR164 from Brassica. The availability of whole genome sequence allowed us to identify and isolate a total of 42 homologous copies of MIR164 from the diploid species Brassica rapa (A-genome), Brassica nigra (B-genome) and Brassica oleracea (C-genome), and the allopolyploids Brassica juncea (AB-genome), Brassica carinata (BC-genome) and Brassica napus (AC-genome). Additionally, we retrieved homologous sequences based on comparative genomics from Arabidopsis lyrata, Capsella rubella, and Thellungiella halophila, spanning ca. 45 million years of evolutionary history of Brassicaceae. Sequence comparison across Brassicaceae revealed lineage-, karyotype-, species-, and sub-genome-specific changes, providing a snapshot of evolutionary dynamics of miRNA promoters in polyploids. Tree topology of cis-elements associated with MIR164 was found to recapitulate the species and family evolutionary history. Phylogenetic shadowing identified transcription factor binding sites (TFBS) conserved across Brassicaceae, some of which are already known as regulators of MIR164 expression. Some of the TFBS were found to be distributed in a sub-genome-specific (e.g., SOX specific to promoter of MIR164c from MF2 sub-genome), lineage-specific (YABBY binding motif, specific to C. rubella in MIR164b), or species-specific (e.g., VOZ in A. thaliana MIR164a) manner, which might contribute towards genetic and adaptive variation. Reporter activity driven by promoters associated with MIR164 paralogs and homeologs was largely in agreement with the known role of miR164 in leaf shaping, regulation of lateral root development and senescence, and revealed one previously undescribed role in trichomes. The impact of polyploidy was most profound when reporter activity across three MIR164c homeologs was compared, revealing negligible overlap, whereas reporter activity among two homeologs of MIR164a displayed significant overlap. A copy number dependent cis-regulatory divergence thus exists in MIR164 genes in Brassica juncea. The full extent of regulatory diversification towards adaptive strategies will only be known when future endeavors analyze the promoter function under duress of stress and hormonal regimes.
Semi-Automatic In Silico Gap Closure Enabled De Novo Assembly of Two Dehalobacter Genomes from Metagenomic Data

PubMed Central

Tang, Shuiquan; Gong, Yunchen; Edwards, Elizabeth A.

2012-01-01

Typically, the assembly and closure of a complete bacterial genome requires substantial additional effort spent in a wet lab for gap resolution and genome polishing. Assembly is further confounded by subspecies polymorphism when starting from metagenome sequence data. In this paper, we describe an in silico gap-resolution strategy that can substantially improve assembly. This strategy resolves assembly gaps in scaffolds using pre-assembled contigs, followed by verification with read mapping. It is capable of resolving assembly gaps caused by repetitive elements and subspecies polymorphisms. Using this strategy, we realized the de novo assembly of the first two Dehalobacter genomes from the metagenomes of two anaerobic mixed microbial cultures capable of reductive dechlorination of chlorinated ethanes and chloroform. Only four additional PCR reactions were required even though the initial assembly with Newbler v. 2.5 produced 101 contigs within 9 scaffolds belonging to two Dehalobacter strains. By applying this strategy to the re-assembly of a recently published genome of Bacteroides, we demonstrate its potential utility for other sequencing projects, both metagenomic and genomic. PMID:23284863
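The gap-resolution idea in this abstract (filling a scaffold gap from a pre-assembled contig that spans it) can be sketched in a few lines. This is only a toy illustration under stated assumptions, not the authors' pipeline: it uses exact string matching of the k bases on either side of the gap, ignores reverse complements and sequencing errors, and omits the read-mapping verification step. The function name `close_gap` and the anchor length `k` are hypothetical.

```python
def close_gap(left_flank, right_flank, contigs, k=8):
    """Return the sequence that fills the gap between two scaffold flanks.

    The last k bases of the left flank and the first k bases of the right
    flank serve as anchors. The first contig containing both anchors in
    order supplies the gap sequence. Returns None if no contig spans the gap.
    """
    left_anchor = left_flank[-k:]
    right_anchor = right_flank[:k]
    for contig in contigs:
        i = contig.find(left_anchor)
        if i == -1:
            continue
        # Search for the right anchor only downstream of the left anchor.
        j = contig.find(right_anchor, i + len(left_anchor))
        if j == -1:
            continue
        return contig[i + len(left_anchor):j]
    return None

# Toy data: the second contig bridges the two flanks with "ACGTAC".
left = "AAAACCCCGGGG"
right = "TTTTAAAACCCC"
contigs = ["GGGGGGGGGGGG", "CCCCGGGGACGTACTTTTAAAA"]
print(close_gap(left, right, contigs))  # ACGTAC
```

A filled gap found this way would still need to be checked by mapping reads across the junctions, as the abstract describes.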
Type-2 diabetes-associated variants with cross-trait relevance: Post-GWAs strategies for biological function interpretation.

PubMed

Frau, Francesca; Crowther, Daniel; Ruetten, Hartmut; Allebrandt, Karla V

2017-05-01

Genome-wide association studies (GWAs) for type 2 diabetes (T2D) have been successful in identifying many loci with robust association signals. Nevertheless, there is a clear need for post-GWAs strategies to understand the mechanism of action and clinical relevance of these variants. The association of several comorbidities with T2D suggests a common etiology for these phenotypes and complicates the management of the disease. In this study, we focused on the genetics underlying these relationships, using systems genomics to identify genetic variation associated with T2D and 12 other traits. GWAs summary statistics for pairwise comparisons were obtained for glycemic traits, obesity, coronary artery disease, and lipids from large consortia GWAs meta-analyses. We used a network medicine approach to leverage experimental information about the identified genes and variants with cross-trait effects for biological function interpretation. We identified a set of 38 genetic variants with cross-trait effects that point to a main network of genes that should be relevant for T2D and its comorbidities. We prioritized the T2D associated genes based on the number of traits they showed association with and the experimental evidence showing their relation to the disease etiology. In this study, we demonstrated how systems genomics and network medicine approaches can shed light on GWAs discoveries, translating findings into a more therapeutically relevant context. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
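One part of the prioritization described in this abstract, ranking genes by the number of traits they are associated with, can be sketched as below. The gene labels, trait labels, and the function name `rank_genes_by_trait_count` are placeholders invented for illustration. The sketch does not model the experimental-evidence criterion or the network analysis the authors also used.

```python
from collections import defaultdict

def rank_genes_by_trait_count(associations):
    """Rank genes by how many distinct traits they are associated with.

    `associations` is an iterable of (gene, trait) pairs, for example pairs
    that passed a significance threshold in GWAS summary statistics.
    Ties are broken alphabetically by gene name.
    """
    traits_by_gene = defaultdict(set)
    for gene, trait in associations:
        traits_by_gene[gene].add(trait)
    counts = [(gene, len(traits)) for gene, traits in traits_by_gene.items()]
    return sorted(counts, key=lambda item: (-item[1], item[0]))

# Hypothetical gene and trait labels, not taken from the study.
pairs = [
    ("GENE_A", "T2D"), ("GENE_A", "obesity"), ("GENE_A", "T2D"),
    ("GENE_B", "T2D"),
    ("GENE_C", "lipids"), ("GENE_C", "T2D"), ("GENE_C", "obesity"),
]
print(rank_genes_by_trait_count(pairs))
# [('GENE_C', 3), ('GENE_A', 2), ('GENE_B', 1)]
```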
Mouse ENU Mutagenesis to Understand Immunity to Infection: Methods, Selected Examples, and Perspectives

PubMed Central

Caignard, Grégory; Eva, Megan M.; van Bruggen, Rebekah; Eveleigh, Robert; Bourque, Guillaume; Malo, Danielle; Gros, Philippe; Vidal, Silvia M.

2014-01-01

Infectious diseases are responsible for over 25% of deaths globally, but many more individuals are exposed to deadly pathogens. The outcome of infection results from a set of diverse factors including pathogen virulence factors, the environment, and the genetic make-up of the host. The completion of the human reference genome sequence in 2004 along with technological advances have tremendously accelerated and renovated the tools to study the genetic etiology of infectious diseases in humans and its best characterized mammalian model, the mouse. Advancements in mouse genomic resources have accelerated genome-wide functional approaches, such as gene-driven and phenotype-driven mutagenesis, bringing to the fore the use of mouse models that reproduce accurately many aspects of the pathogenesis of human infectious diseases. Treatment with the mutagen N-ethyl-N-nitrosourea (ENU) has become the most popular phenotype-driven approach. Our team and others have employed mouse ENU mutagenesis to identify host genes that directly impact susceptibility to pathogens of global significance. In this review, we first describe the strategies and tools used in mouse genetics to understand immunity to infection with special emphasis on chemical mutagenesis of the mouse germ-line together with current strategies to efficiently identify functional mutations using next generation sequencing. Then, we highlight illustrative examples of genes, proteins, and cellular signatures that have been revealed by ENU screens and have been shown to be involved in susceptibility or resistance to infectious diseases caused by parasites, bacteria, and viruses. PMID:25268389
Emerging Role of CRISPR/Cas9 Technology for MicroRNAs Editing in Cancer Research.

PubMed

Aquino-Jarquin, Guillermo

2017-12-15

MicroRNAs (miRNA) are small, noncoding RNA molecules with a master role in the regulation of important tasks in different critical processes of cancer pathogenesis. Because there are different miRNAs implicated in all the stages of cancer, for example, functioning as oncogenes, this makes these small molecules suitable targets for cancer diagnosis and therapy. RNA-mediated interference has been one major approach for sequence-specific regulation of gene expression in eukaryotic organisms. Recently, the CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 system, first identified in bacteria and archaea as an adaptive immune response to invading genetic material, has been explored as a sequence-specific molecular tool for editing genomic sequences for basic research in life sciences and for therapeutic purposes. There is growing evidence that small noncoding RNAs, including miRNAs, can be targeted by the CRISPR/Cas9 system despite their lacking an open reading frame to evaluate functional loss. Thus, CRISPR/Cas9 technology represents a novel gene-editing strategy with compelling robustness, specificity, and stability for the modification of miRNA expression. Here, I summarize key features of current knowledge of genomic editing by CRISPR/Cas9 technology as a feasible strategy for globally interrogating miRNA gene function and miRNA-based therapeutic intervention. Alternative emerging strategies for nonviral delivery of CRISPR/Cas9 core components into human cells in a clinical context are also analyzed critically. Cancer Res; 77(24); 6812-7. ©2017 American Association for Cancer Research.
Protein Function Prediction: Problems and Pitfalls.

PubMed

Pearson, William R

2015-09-03

The characterization of new genomes based on their protein sets has been revolutionized by new sequencing technologies, but biologists seeking to exploit new sequence information are often frustrated by the challenges associated with accurately assigning biological functions to newly identified proteins. Here, we highlight some of the challenges in functional inference from sequence similarity. Investigators can improve the accuracy of function prediction by (1) being conservative about the evolutionary distance to a protein of known function; (2) considering the ambiguous meaning of "functional similarity," and (3) being aware of the limitations of annotations in functional databases. Protein function prediction does not offer "one-size-fits-all" solutions. Prediction strategies work better when the idiosyncrasies of function and functional annotation are better understood. Copyright © 2015 John Wiley & Sons, Inc.
Metabolomic strategies to map functions of metabolic pathways.

PubMed

Mulvihill, Melinda M; Nomura, Daniel K

2014-08-01

Genome sequencing efforts have revealed a strikingly large number of unannotated and uncharacterized genes that fall into metabolic enzymes classes, likely indicating that our current knowledge of biochemical pathways in normal physiology, let alone in disease states, remains largely incomplete. This realization presents a daunting challenge for post-genomic-era scientists in deciphering the biochemical and (patho)physiological roles of these enzymes and their metabolites and metabolic networks. This is further complicated by many recent studies showing a rewiring of normal metabolic networks in disease states to give rise to unique pathophysiological functions of enzymes, metabolites, and metabolic pathways. This review focuses on recent discoveries made using metabolic mapping technologies to uncover novel pathways and metabolite-mediated posttranslational modifications and epigenetic alterations and their impact on physiology and disease. Copyright © 2014 the American Physiological Society.
The CRISPR/Cas9 system sheds new lights on the biology of protozoan parasites.

PubMed

Grzybek, Maciej; Golonko, Aleksandra; Górska, Aleksandra; Szczepaniak, Klaudiusz; Strachecka, Aneta; Lass, Anna; Lisowski, Paweł

2018-06-01

The CRISPR/Cas9 system, a natural defence system of bacterial organisms, has recently been used to modify genomes of the most important protozoan parasites. Successful genome manipulations with the CRISPR/Cas9 system are changing the present view of genetics in parasitology. The application of this system offers a major chance to overcome the current restrictions in culturing, maintaining and analysing protozoan parasites, and allows dynamic analysis of parasite gene functions, leading to a better understanding of pathogenesis. The CRISPR/Cas9 system will have a significant influence on the process of developing novel drugs and treatment strategies against protozoan parasites.
Designing of plant artificial chromosome (PAC) by using the Chlorella smallest chromosome as a model system.

PubMed

Noutoshi, Y; Arai, R; Fujie, M; Yamada, T

1997-01-01

As a model for plant-type chromosomes, we have been characterizing the molecular organization of the Chlorella vulgaris C-169 chromosome I. To identify chromosome structural elements including the centromeric region and replication origins, we constructed a chromosome I specific cosmid library and aligned the cosmid clones to generate contigs. So far, more than 80% of the entire chromosome I has been covered. A complete clonal physical reconstitution of chromosome I provides information on the structure and genomic organization of the plant genome. We propose our strategy to construct an artificial chromosome by assembling the functional chromosome structural elements identified on Chlorella chromosome I.
A High Throughput Barley Stripe Mosaic Virus Vector for Virus Induced Gene Silencing in Monocots and Dicots

PubMed Central

Yan, Lijie; Jackson, Andrew O.; Liu, Zhiyong; Han, Chenggui; Yu, Jialin; Li, Dawei

2011-01-01

Barley stripe mosaic virus (BSMV) is a single-stranded RNA virus with three genome components designated alpha, beta, and gamma. BSMV vectors have previously been shown to be efficient virus induced gene silencing (VIGS) vehicles in barley and wheat and have provided important information about host genes functioning during pathogenesis as well as various aspects of genes functioning in development. To permit more effective use of BSMV VIGS for functional genomics experiments, we have developed an Agrobacterium delivery system for BSMV and have coupled this with a ligation independent cloning (LIC) strategy to mediate efficient cloning of host genes. Infiltrated Nicotiana benthamiana leaves provided excellent sources of virus for secondary BSMV infections and VIGS in cereals. The Agro/LIC BSMV VIGS vectors were able to function in high efficiency down regulation of phytoene desaturase (PDS), magnesium chelatase subunit H (ChlH), and plastid transketolase (TK) gene silencing in N. benthamiana and in the monocots, wheat, barley, and the model grass, Brachypodium distachyon. Suppression of an Arabidopsis orthologue cloned from wheat (TaPMR5) also interfered with wheat powdery mildew (Blumeria graminis f. sp. tritici) infections in a manner similar to that of the A. thaliana PMR5 loss-of-function allele. These results imply that the PMR5 gene has maintained similar functions across monocot and dicot families. Our BSMV VIGS system provides substantial advantages in expense, cloning efficiency, ease of manipulation and ability to apply VIGS for high throughput genomics studies. PMID:22031834
Comparative Genomics Reveals the Core Gene Toolbox for the Fungus-Insect Symbiosis.

PubMed

Wang, Yan; Stata, Matt; Wang, Wei; Stajich, Jason E; White, Merlin M; Moncalvo, Jean-Marc

2018-05-15

Modern genomics has shed light on many entomopathogenic fungi and expanded our knowledge widely; however, little is known about the genomic features of the insect-commensal fungi. Harpellales are obligate commensals living in the digestive tracts of disease-bearing insects (black flies, midges, and mosquitoes). In this study, we produced and annotated whole-genome sequences of nine Harpellales taxa and conducted the first comparative analyses to infer the genomic diversity within the members of the Harpellales. The genomes of the insect gut fungi feature low (26% to 37%) GC content and large genome size variations (25 to 102 Mb). Further comparisons with insect-pathogenic fungi (from both Ascomycota and Zoopagomycota), as well as with free-living relatives (as negative controls), helped to identify a gene toolbox that is essential to the fungus-insect symbiosis. The results not only narrow the genomic scope of fungus-insect interactions from several thousands to eight core players but also distinguish host invasion strategies employed by insect pathogens and commensals. The genomic content suggests that insect commensal fungi rely mostly on adhesion protein anchors that target digestive system, while entomopathogenic fungi have higher numbers of transmembrane helices, signal peptides, and pathogen-host interaction (PHI) genes across the whole genome and enrich genes as well as functional domains to inactivate the host inflammation system and suppress the host defense. Phylogenomic analyses have revealed that genome sizes of Harpellales fungi vary among lineages with an integer-multiple pattern, which implies that ancient genome duplications may have occurred within the gut of insects. IMPORTANCE Insect guts harbor various microbes that are important for host digestion, immune response, and disease dispersal in certain cases. Bacteria, which are among the primary endosymbionts, have been studied extensively. However, fungi, which are also frequently encountered, are poorly known with respect to their biology within the insect guts. To understand the genomic features and related biology, we produced the whole-genome sequences of nine gut commensal fungi from disease-bearing insects (black flies, midges, and mosquitoes). The results show that insect gut fungi tend to have low GC content across their genomes. By comparing these commensals with entomopathogenic and free-living fungi that have available genome sequences, we found a universal core gene toolbox that is unique and thus potentially important for the insect-fungus symbiosis. This comparative work also uncovered different host invasion strategies employed by insect pathogens and commensals, as well as a model system to study ancient fungal genome duplication within the gut of insects. © Crown copyright 2018.
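The GC content figure quoted in this abstract (26% to 37%) is a simple per-genome statistic, sketched here for concreteness. The function name `gc_content` is hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption of this sketch. The abstract does not say how the authors handled them.

```python
def gc_content(seq):
    """Return the percentage of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    # Only A, C, G and T count towards the denominator; N and other
    # ambiguity codes are ignored (an assumption of this sketch).
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted

print(gc_content("ATGCATAT"))  # 25.0
```

For a multi-contig assembly, the same calculation would be applied to the concatenation of all contigs rather than averaged per contig, so that long contigs are weighted by length.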
A Rare SNP Identified a TCP Transcription Factor Essential for Tendril Development in Cucumber.

PubMed

Wang, Shenhao; Yang, Xueyong; Xu, Mengnan; Lin, Xingzhong; Lin, Tao; Qi, Jianjian; Shao, Guangjin; Tian, Nana; Yang, Qing; Zhang, Zhonghua; Huang, Sanwen

2015-12-07

Rare genetic variants are abundant in genomes but less tractable in genome-wide association study. Here we exploit a strategy of rare variation mapping to discover a gene essential for tendril development in cucumber (Cucumis sativus L.). In a collection of >3000 lines, we discovered a unique tendril-less line that forms branches instead of tendrils and, therefore, loses its climbing ability. We hypothesized that this unusual phenotype was caused by a rare variation and subsequently identified the causative single nucleotide polymorphism. The affected gene TEN encodes a TCP transcription factor conserved within the cucurbits and is expressed specifically in tendrils, representing a new organ identity gene. The variation occurs within a protein motif unique to the cucurbits and impairs its function as a transcriptional activator. Analyses of transcriptomes from near-isogenic lines identified downstream genes required for the tendril's capability to sense and climb a support. This study provides an example to explore rare functional variants in plant genomes. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.
Drosophila and experimental neurology in the post-genomic era.

PubMed

Shulman, Joshua M

2015-12-01

For decades, the fruit fly, Drosophila melanogaster, has been among the premier genetic model systems for probing fundamental neurobiology, including elucidation of mechanisms responsible for human neurologic disorders. Flies continue to offer virtually unparalleled versatility and speed for genetic manipulation, strong genomic conservation, and a nervous system that recapitulates a range of cellular and network properties relevant to human disease. I focus here on four critical challenges emerging from recent advances in our understanding of the genomic basis of human neurologic disorders where innovative experimental strategies are urgently needed: (1) pinpointing causal genes from associated genomic loci; (2) confirming the functional impact of allelic variants; (3) elucidating nervous system roles for novel or poorly studied genes; and (4) probing network interactions within implicated regulatory pathways. Drosophila genetic approaches are ideally suited to address each of these potential translational roadblocks, and will therefore contribute to mechanistic insights and potential breakthrough therapies for complex genetic disorders in the coming years. Strategic collaboration between neurologists, human geneticists, and the Drosophila research community holds great promise to accelerate progress in the post-genomic era. Copyright © 2015 Elsevier Inc. All rights reserved.
Coordinated phenotype switching with large-scale chromosome flip-flop inversion observed in bacteria.

PubMed

Cui, Longzhu; Neoh, Hui-min; Iwamoto, Akira; Hiramatsu, Keiichi

2012-06-19

Genome inversions are ubiquitous in organisms ranging from prokaryotes to eukaryotes. Typical examples can be identified by comparing the genomes of two or more closely related organisms, where genome inversion footprints are clearly visible. Although the evolutionary implications of this phenomenon are huge, little is known about the function and biological meaning of this process. Here, we report our findings on a bacterium that generates a reversible, large-scale inversion of its chromosome (about half of its total genome) at high frequencies of up to once every four generations. This inversion switches on or off bacterial phenotypes, including colony morphology, antibiotic susceptibility, hemolytic activity, and expression of dozens of genes. Quantitative measurements and mathematical analyses indicate that this reversible switching is stochastic but self-organized so as to maintain two forms of stable cell populations (i.e., small colony variant, normal colony variant) as a bet-hedging strategy. Thus, this heritable and reversible genome fluctuation seems to govern the bacterial life cycle; it has a profound impact on the course and outcomes of bacterial infections.
TARGETED CAPTURE IN EVOLUTIONARY AND ECOLOGICAL GENOMICS

PubMed Central

Jones, Matthew R.; Good, Jeffrey M.

2016-01-01

The rapid expansion of next-generation sequencing has yielded a powerful array of tools to address fundamental biological questions at a scale that was inconceivable just a few years ago. Various genome partitioning strategies to sequence select subsets of the genome have emerged as powerful alternatives to whole genome sequencing in ecological and evolutionary genomic studies. High throughput targeted capture is one such strategy that involves the parallel enrichment of pre-selected genomic regions of interest. The growing use of targeted capture demonstrates its potential power to address a range of research questions, yet these approaches have yet to expand broadly across labs focused on evolutionary and ecological genomics. In part, the use of targeted capture has been hindered by the logistics of capture design and implementation in species without established reference genomes. Here we aim to 1) increase the accessibility of targeted capture to researchers working in non-model taxa by discussing capture methods that circumvent the need of a reference genome, 2) highlight the evolutionary and ecological applications where this approach is emerging as a powerful sequencing strategy, and 3) discuss the future of targeted capture and other genome partitioning approaches in light of the increasing accessibility of whole genome sequencing. Given the practical advantages and increasing feasibility of high-throughput targeted capture, we anticipate an ongoing expansion of capture-based approaches in evolutionary and ecological research, synergistic with an expansion of whole genome sequencing. PMID:26137993
Off-target Effects in CRISPR/Cas9-mediated Genome Engineering

PubMed Central

Zhang, Xiao-Hui; Tee, Louis Y; Wang, Xiao-Gang; Huang, Qun-Shan; Yang, Shi-Hua

2015-01-01

CRISPR/Cas9 is a versatile genome-editing technology that is widely used for studying the functionality of genetic elements, creating genetically modified organisms, and preclinical research on genetic disorders. However, the high frequency of off-target activity (≥50%), that is, RGEN (RNA-guided endonuclease)-induced mutations at sites other than the intended on-target site, is one major concern, especially for therapeutic and clinical applications. Here, we review the basic mechanisms underlying off-target cutting in the CRISPR/Cas9 system, methods for detecting off-target mutations, and strategies for minimizing off-target cleavage. The improvement of off-target specificity in the CRISPR/Cas9 system will provide solid genotype–phenotype correlations, and thus enable faithful interpretation of genome-editing data, which will certainly facilitate the basic and clinical application of this technology. PMID:26575098
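As a rough illustration of one common first step in detecting candidate off-target sites in silico, the sketch below counts mismatches between a guide sequence and equal-length candidate sites. The function names, the example sequences, and the mismatch threshold are hypothetical and are not taken from the review; real tools also weigh the PAM, mismatch position, and bulges.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Return the number of positions at which guide and site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


def candidate_off_targets(guide, sites, max_mismatches=3):
    """List (site, mismatches) for imperfect matches within the threshold.

    Perfect matches (0 mismatches) are treated as on-target and skipped.
    The default threshold of 3 is an illustrative assumption.
    """
    hits = []
    for site in sites:
        n = count_mismatches(guide, site)
        if 0 < n <= max_mismatches:
            hits.append((site, n))
    # Fewest mismatches first: these are usually the sites checked first.
    return sorted(hits, key=lambda hit: hit[1])
```

Calling `candidate_off_targets(guide, sites)` on a list of genomic windows returns only the near matches, ordered by mismatch count.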
Geminiviruses for biotechnology: the art of parasite taming.

PubMed

Lozano-Durán, Rosa

2016-04-01

Viruses are intracellular pathogens that have evolved efficient strategies for replication and expression of their proteins in the host cells. Geminiviruses - plant viruses with small circular single-stranded DNA genomes - effectively manipulate plant cell processes for viral functions, entailing great potential for biotechnological applications. This potentiality has been realized in the form of protein expression and gene-silencing vectors, and, more recently, vectors for genome editing - a technology that these viruses seem particularly well-suited to facilitate. This insight offers an overview of the biological properties of geminiviruses, with emphasis on those leveraging development of geminivirus-based replicons. It illustrates the basis for engineering geminivirus-based replicons and their applications. Furthermore, it discusses the reported use and future perspectives of geminivirus-based replicons for genome editing. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

Topological analysis of metabolic networks integrating co-segregating transcriptomes and metabolomes in type 2 diabetic rat congenic series.

PubMed

Dumas, Marc-Emmanuel; Domange, Céline; Calderari, Sophie; Martínez, Andrea Rodríguez; Ayala, Rafael; Wilder, Steven P; Suárez-Zamorano, Nicolas; Collins, Stephan C; Wallis, Robert H; Gu, Quan; Wang, Yulan; Hue, Christophe; Otto, Georg W; Argoud, Karène; Navratil, Vincent; Mitchell, Steve C; Lindon, John C; Holmes, Elaine; Cazier, Jean-Baptiste; Nicholson, Jeremy K; Gauguier, Dominique

2016-09-30

The genetic regulation of metabolic phenotypes (i.e., metabotypes) in type 2 diabetes mellitus occurs through complex organ-specific cellular mechanisms and networks contributing to impaired insulin secretion and insulin resistance. Genome-wide gene expression profiling systems can dissect the genetic contributions to metabolome and transcriptome regulations. The integrative analysis of multiple gene expression traits and metabotypes together with their underlying genetic regulation remains a challenge. Here, we introduce a systems genetics approach based on the topological analysis of a combined molecular network made of genes and metabolites identified through expression and metabotype quantitative trait locus mapping (i.e., eQTL and mQTL) to prioritise biological characterisation of candidate genes and traits. We used systematic metabotyping by 1H NMR spectroscopy and genome-wide gene expression in white adipose tissue to map molecular phenotypes to genomic blocks associated with obesity and insulin secretion in a series of rat congenic strains derived from spontaneously diabetic Goto-Kakizaki (GK) and normoglycemic Brown-Norway (BN) rats. We implemented a network biology approach to visualize the shortest paths between metabolites and genes significantly associated with each genomic block. Despite strong genomic similarities (95-99 %) among congenics, each strain exhibited specific patterns of gene expression and metabotypes, reflecting the metabolic consequences of series of linked genetic polymorphisms in the congenic intervals. We subsequently used the congenic panel to map quantitative trait loci underlying specific mQTLs and genome-wide eQTLs. Variation in key metabolites like glucose, succinate, lactate, or 3-hydroxybutyrate and second messenger precursors like inositol was associated with several independent genomic intervals, indicating functional redundancy in these regions.
To navigate through the complexity of these association networks we mapped candidate genes and metabolites onto metabolic pathways and implemented a shortest path strategy to highlight potential mechanistic links between metabolites and transcripts at colocalized mQTLs and eQTLs. Minimizing the shortest path length drove prioritization of biological validations by gene silencing. These results underline the importance of network-based integration of multilevel systems genetics datasets to improve understanding of the genetic architecture of metabotype and transcriptomic regulation and to characterize novel functional roles for genes determining tissue-specific metabolism.
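The shortest-path idea described in this abstract can be sketched with a breadth-first search over an unweighted network. This is a minimal illustration only: the node names and edges below are hypothetical, and the authors' actual pathway database and software are not specified here.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search on an unweighted graph given as an adjacency dict.

    Returns the list of nodes from start to goal, or None if unreachable.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                if neighbour == goal:
                    # Walk back from goal to start, then reverse.
                    path = [goal]
                    while previous[path[-1]] is not None:
                        path.append(previous[path[-1]])
                    return path[::-1]
                queue.append(neighbour)
    return None


# Hypothetical toy network linking a metabolite to a transcript.
toy_network = {
    "glucose": ["enzyme_1"],
    "enzyme_1": ["glucose", "succinate"],
    "succinate": ["enzyme_1", "gene_X"],
    "gene_X": ["succinate"],
}
```

Shorter paths between a metabolite and a transcript mapped to the same locus would then rank higher as candidates for follow-up, which is the prioritization logic the abstract describes.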
Genome Stability Pathways in Head and Neck Cancers

PubMed Central

O'Byrne, Kenneth J.; Panizza, Benedict; Richard, Derek J.

2013-01-01

Genomic instability underlies the transformation of host cells toward malignancy, promotes development of invasion and metastasis and shapes the response of established cancer to treatment. In this review, we discuss recent advances in our understanding of genomic stability in squamous cell carcinoma of the head and neck (HNSCC), with an emphasis on DNA repair pathways. HNSCC is characterized by distinct profiles in genome stability between similarly staged cancers that are reflected in risk, treatment response and outcomes. Defective DNA repair generates chromosomal derangement that can cause subsequent alterations in gene expression, and is a hallmark of progression toward carcinoma. Variable functionality of an increasing spectrum of repair gene polymorphisms is associated with increased cancer risk, while aetiological factors such as human papillomavirus, tobacco and alcohol induce significantly different behaviour in induced malignancy, underpinned by differences in genomic stability. Targeted inhibition of signalling receptors has proven to be a clinically-validated therapy, and protein expression of other DNA repair and signalling molecules associated with cancer behaviour could potentially provide a more refined clinical model for prognosis and treatment prediction. Development and expansion of current genomic stability models is furthering our understanding of HNSCC pathophysiology and uncovering new, promising treatment strategies. PMID:24364026
The Genome Biology of Effector Gene Evolution in Filamentous Plant Pathogens.

PubMed

Sánchez-Vallet, Andrea; Fouché, Simone; Fudal, Isabelle; Hartmann, Fanny E; Soyer, Jessica L; Tellier, Aurélien; Croll, Daniel

2018-05-16

Filamentous pathogens, including fungi and oomycetes, pose major threats to global food security. Crop pathogens cause damage by secreting effectors that manipulate the host to the pathogen's advantage. Genes encoding such effectors are among the most rapidly evolving genes in pathogen genomes. Here, we review how the major characteristics of the emergence, function, and regulation of effector genes are tightly linked to the genomic compartments where these genes are located in pathogen genomes. The presence of repetitive elements in these compartments is associated with elevated rates of point mutations and sequence rearrangements with a major impact on effector diversification. The expression of many effectors converges on an epigenetic control mediated by the presence of repetitive elements. Population genomics analyses showed that rapidly evolving pathogens show high rates of turnover at effector loci and display a mosaic in effector presence-absence polymorphism among strains. We conclude that effective pathogen containment strategies require a thorough understanding of the effector genome biology and the pathogen's potential for rapid adaptation. Expected final online publication date for the Annual Review of Phytopathology Volume 56 is August 25, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Single sample resolution of rare microbial dark matter in a marine invertebrate metagenome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, Ian J.; Weyna, Theodore R.; Fong, Stephen S.

Direct, untargeted sequencing of environmental samples (metagenomics) and de novo genome assembly enable the study of uncultured and phylogenetically divergent organisms. However, separating individual genomes from a mixed community has often relied on the differential-coverage analysis of multiple, deeply sequenced samples. In the metagenomic investigation of the marine bryozoan Bugula neritina, we uncovered seven bacterial genomes associated with a single B. neritina individual that appeared to be transient associates, two of which were unique to one individual and undetectable using certain “universal” 16S rRNA primers and probes. We recovered high quality genome assemblies for several rare instances of “microbial dark matter,” or phylogenetically divergent bacteria lacking genomes in reference databases, from a single tissue sample that was not subjected to any physical or chemical pre-treatment. One of these rare, divergent organisms has a small (593 kbp), poorly annotated genome with low GC content (20.9%) and a 16S rRNA gene with just 65% sequence similarity to the closest reference sequence. Lastly, our findings illustrate the importance of sampling strategy and de novo assembly of metagenomic reads to understand the extent and function of bacterial biodiversity.
Single sample resolution of rare microbial dark matter in a marine invertebrate metagenome

DOE PAGES

Miller, Ian J.; Weyna, Theodore R.; Fong, Stephen S.; ...

2016-09-29

Direct, untargeted sequencing of environmental samples (metagenomics) and de novo genome assembly enable the study of uncultured and phylogenetically divergent organisms. However, separating individual genomes from a mixed community has often relied on the differential-coverage analysis of multiple, deeply sequenced samples. In the metagenomic investigation of the marine bryozoan Bugula neritina, we uncovered seven bacterial genomes associated with a single B. neritina individual that appeared to be transient associates, two of which were unique to one individual and undetectable using certain “universal” 16S rRNA primers and probes. We recovered high quality genome assemblies for several rare instances of “microbial dark matter,” or phylogenetically divergent bacteria lacking genomes in reference databases, from a single tissue sample that was not subjected to any physical or chemical pre-treatment. One of these rare, divergent organisms has a small (593 kbp), poorly annotated genome with low GC content (20.9%) and a 16S rRNA gene with just 65% sequence similarity to the closest reference sequence. Lastly, our findings illustrate the importance of sampling strategy and de novo assembly of metagenomic reads to understand the extent and function of bacterial biodiversity.
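For readers unfamiliar with the GC-content figure quoted above, the following sketch shows how such a percentage is typically computed from an assembled sequence. The function name and the handling of ambiguous bases are assumptions for illustration, not the authors' pipeline.

```python
def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored, which is one common convention.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)
```

Applied to a full genome assembly, a value near 20% would indicate an unusually AT-rich genome like the one reported in this record.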
Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

PubMed

Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

2016-09-01

Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. © 2016 Li et al.; Published by Cold Spring Harbor Laboratory Press.
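The core logic of implicating genes through multiple independent alleles can be sketched as a simple tally. The input format, gene names, and the threshold of two alleles below are hypothetical and chosen only for illustration; the authors' actual variant-calling and filtering steps are not described in this abstract.

```python
from collections import defaultdict


def genes_with_multiple_alleles(mutations, min_alleles=2):
    """Return {gene: number of distinct mutated positions} for recurrent genes.

    mutations: iterable of (mutant_id, gene, position) tuples, one per lesion
    called in a sequenced mutant. Distinct positions within a gene are treated
    as independent alleles; the same position seen twice counts once.
    """
    positions_by_gene = defaultdict(set)
    for _mutant_id, gene, position in mutations:
        positions_by_gene[gene].add(position)
    return {
        gene: len(positions)
        for gene, positions in positions_by_gene.items()
        if len(positions) >= min_alleles
    }


# Hypothetical calls from four independent mutants.
example_calls = [
    ("mutant_1", "geneA", 100),
    ("mutant_2", "geneA", 250),
    ("mutant_3", "geneB", 40),
    ("mutant_4", "geneA", 100),  # same lesion as mutant_1, not a new allele
]
```

Genes hit only once are indistinguishable from background mutations, so only recurrently hit genes are reported.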
Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study.

PubMed

Cerdeira, Louise Teixeira; Carneiro, Adriana Ribeiro; Ramos, Rommel Thiago Jucá; de Almeida, Sintia Silva; D'Afonseca, Vivian; Schneider, Maria Paula Cruz; Baumbach, Jan; Tauch, Andreas; McCulloch, John Anthony; Azevedo, Vasco Ariston Carvalho; Silva, Artur

2011-08-01

Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly), or can be concatenated by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence; however, depending on the NGS platform used, the shortness of read lengths causes tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium with immense importance in veterinary medicine that causes Caseous Lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads and were only oriented using a reference genome by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that with the availability of a reference genome, it pays off to use the hybrid de novo strategy, rather than a classical reference assembly, because more genome sequences are preserved using the former. Copyright © 2011 Elsevier B.V. All rights reserved.
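To make the De Bruijn graph component mentioned above more concrete, here is a minimal sketch of how such a graph is built from short reads. It is a textbook construction under simplifying assumptions (error-free reads, a single strand, a hypothetical k), not the assembler used in the study.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a De Bruijn graph from reads.

    Nodes are (k-1)-mers; each k-mer in a read contributes one directed edge
    from its prefix to its suffix. Repeated k-mers yield repeated edges.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)
```

Contigs correspond to unbranched paths through this graph; short reads force a small k, which is one reason repeats become hard to resolve and gaps remain.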
Genome-scale CRISPR-Cas9 Knockout and Transcriptional Activation Screening

PubMed Central

Joung, Julia; Konermann, Silvana; Gootenberg, Jonathan S.; Abudayyeh, Omar O.; Platt, Randall J.; Brigham, Mark D.; Sanjana, Neville E.; Zhang, Feng

2017-01-01

Forward genetic screens are powerful tools for the unbiased discovery and functional characterization of specific genetic elements associated with a phenotype of interest. Recently, the RNA-guided endonuclease Cas9 from the microbial CRISPR (clustered regularly interspaced short palindromic repeats) immune system has been adapted for genome-scale screening by combining Cas9 with pooled guide RNA libraries. Here we describe a protocol for genome-scale knockout and transcriptional activation screening using the CRISPR-Cas9 system. Custom- or ready-made guide RNA libraries are constructed and packaged into lentiviral vectors for delivery into cells for screening. As each screen is unique, we provide guidelines for determining screening parameters and maintaining sufficient coverage. To validate candidate genes identified from the screen, we further describe strategies for confirming the screening phenotype as well as genetic perturbation through analysis of indel rate and transcriptional activation. Beginning with library design, a genome-scale screen can be completed in 9–15 weeks followed by 4–5 weeks of validation. PMID:28333914
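The coverage guideline mentioned in this abstract comes down to simple arithmetic, sketched below. The formula is a common back-of-the-envelope estimate, and the parameter values in the usage note are illustrative assumptions rather than figures from the protocol.

```python
import math


def cells_to_transduce(n_guides: int, coverage: int, fraction_transduced: float) -> int:
    """Estimate how many cells to expose to virus for a pooled screen.

    n_guides: number of guide RNAs in the library.
    coverage: desired number of transduced cells per guide.
    fraction_transduced: expected fraction of cells that receive a guide
        (kept low in practice so most transduced cells carry a single guide).
    """
    if not 0 < fraction_transduced <= 1:
        raise ValueError("fraction_transduced must be in (0, 1]")
    return math.ceil(n_guides * coverage / fraction_transduced)
```

For example, a hypothetical library of 1,000 guides at 500 cells per guide with half of the cells transduced gives `cells_to_transduce(1000, 500, 0.5)`, that is, 1,000,000 cells.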
Small molecules enhance CRISPR genome editing in pluripotent stem cells.

PubMed

Yu, Chen; Liu, Yanxia; Ma, Tianhua; Liu, Kai; Xu, Shaohua; Zhang, Yu; Liu, Honglei; La Russa, Marie; Xie, Min; Ding, Sheng; Qi, Lei S

2015-02-05

The bacterial CRISPR-Cas9 system has emerged as an effective tool for sequence-specific gene knockout through non-homologous end joining (NHEJ), but it remains inefficient for precise editing of genome sequences. Here we develop a reporter-based screening approach for high-throughput identification of chemical compounds that can modulate precise genome editing through homology-directed repair (HDR). Using our screening method, we have identified small molecules that can enhance CRISPR-mediated HDR efficiency, 3-fold for large fragment insertions and 9-fold for point mutations. Interestingly, we have also observed that a small molecule that inhibits HDR can enhance frame shift insertion and deletion (indel) mutations mediated by NHEJ. The identified small molecules function robustly in diverse cell types with minimal toxicity. The use of small molecules provides a simple and effective strategy to enhance precise genome engineering applications and facilitates the study of DNA repair mechanisms in mammalian cells. Copyright © 2015 Elsevier Inc. All rights reserved.
Genome-scale CRISPR-Cas9 knockout and transcriptional activation screening.

PubMed

Joung, Julia; Konermann, Silvana; Gootenberg, Jonathan S; Abudayyeh, Omar O; Platt, Randall J; Brigham, Mark D; Sanjana, Neville E; Zhang, Feng

2017-04-01

Forward genetic screens are powerful tools for the unbiased discovery and functional characterization of specific genetic elements associated with a phenotype of interest. Recently, the RNA-guided endonuclease Cas9 from the microbial CRISPR (clustered regularly interspaced short palindromic repeats) immune system has been adapted for genome-scale screening by combining Cas9 with pooled guide RNA libraries. Here we describe a protocol for genome-scale knockout and transcriptional activation screening using the CRISPR-Cas9 system. Custom- or ready-made guide RNA libraries are constructed and packaged into lentiviral vectors for delivery into cells for screening. As each screen is unique, we provide guidelines for determining screening parameters and maintaining sufficient coverage. To validate candidate genes identified by the screen, we further describe strategies for confirming the screening phenotype, as well as genetic perturbation, through analysis of indel rate and transcriptional activation. Beginning with library design, a genome-scale screen can be completed in 9-15 weeks, followed by 4-5 weeks of validation.
Transposons As Tools for Functional Genomics in Vertebrate Models.

PubMed

Kawakami, Koichi; Largaespada, David A; Ivics, Zoltán

2017-11-01

Genetic tools and mutagenesis strategies based on transposable elements are currently under development with a vision to link primary DNA sequence information to gene functions in vertebrate models. By virtue of their inherent capacity to insert into DNA, transposons can be developed into powerful tools for chromosomal manipulations. Transposon-based forward mutagenesis screens have numerous advantages including high throughput, easy identification of mutated alleles, and providing insight into genetic networks and pathways based on phenotypes. For example, the Sleeping Beauty transposon has become highly instrumental to induce tumors in experimental animals in a tissue-specific manner with the aim of uncovering the genetic basis of diverse cancers. Here, we describe a battery of mutagenic cassettes that can be applied in conjunction with transposon vectors to mutagenize genes, and highlight versatile experimental strategies for the generation of engineered chromosomes for loss-of-function as well as gain-of-function mutagenesis for functional gene annotation in vertebrate models, including zebrafish, mice, and rats. Copyright © 2017 Elsevier Ltd. All rights reserved.
Functional wiring of the yeast kinome revealed by global analysis of genetic network motifs

PubMed Central

Sharifpoor, Sara; van Dyk, Dewald; Costanzo, Michael; Baryshnikova, Anastasia; Friesen, Helena; Douglas, Alison C.; Youn, Ji-Young; VanderSluis, Benjamin; Myers, Chad L.; Papp, Balázs; Boone, Charles; Andrews, Brenda J.

2012-01-01

A combinatorial genetic perturbation strategy was applied to interrogate the yeast kinome on a genome-wide scale. We assessed the global effects of gene overexpression or gene deletion to map an integrated genetic interaction network of synthetic dosage lethal (SDL) and loss-of-function genetic interactions (GIs) for 92 kinases, producing a meta-network of 8700 GIs enriched for pathways known to be regulated by cognate kinases. Kinases most sensitive to dosage perturbations had constitutive cell cycle or cell polarity functions under standard growth conditions. Condition-specific screens confirmed that the spectrum of kinase dosage interactions can be expanded substantially in activating conditions. An integrated network composed of systematic SDL, negative and positive loss-of-function GIs, and literature-curated kinase–substrate interactions revealed kinase-dependent regulatory motifs predictive of novel gene-specific phenotypes. Our study provides a valuable resource to unravel novel functional relationships and pathways regulated by kinases and outlines a general strategy for deciphering mutant phenotypes from large-scale GI networks. PMID:22282571
Rice functional genomics research in China.

PubMed

Han, Bin; Xue, Yongbiao; Li, Jiayang; Deng, Xing-Wang; Zhang, Qifa

2007-06-29

Rice functional genomics is a scientific approach that seeks to identify and define the function of rice genes, and uncover when and how genes work together to produce phenotypic traits. Rapid progress in rice genome sequencing has facilitated research in rice functional genomics in China. The Ministry of Science and Technology of China has funded two major rice functional genomics research programmes for building up the infrastructures of the functional genomics study such as developing rice functional genomics tools and resources. The programmes were also aimed at cloning and functional analyses of a number of genes controlling important agronomic traits from rice. National and international collaborations on rice functional genomics study are accelerating rice gene discovery and application.
Model Organisms Facilitate Rare Disease Diagnosis and Therapeutic Research

PubMed Central

Wangler, Michael F.; Yamamoto, Shinya; Chao, Hsiao-Tuan; Posey, Jennifer E.; Westerfield, Monte; Postlethwait, John; Hieter, Philip; Boycott, Kym M.; Campeau, Philippe M.; Bellen, Hugo J.

2017-01-01

Efforts to identify the genetic underpinnings of rare undiagnosed diseases increasingly involve the use of next-generation sequencing and comparative genomic hybridization methods. These efforts are limited by a lack of knowledge regarding gene function, and an inability to predict the impact of genetic variation on the encoded protein function. Diagnostic challenges posed by undiagnosed diseases have solutions in model organism research, which provides a wealth of detailed biological information. Model organism geneticists are by necessity experts in particular genes, gene families, specific organs, and biological functions. Here, we review the current state of research into undiagnosed diseases, highlighting large efforts in North America and internationally, including the Undiagnosed Diseases Network (UDN) (Supplemental Material, File S1) and UDN International (UDNI), the Centers for Mendelian Genomics (CMG), and the Canadian Rare Diseases Models and Mechanisms Network (RDMM). We discuss how merging human genetics with model organism research guides experimental studies to solve these medical mysteries, gain new insights into disease pathogenesis, and uncover new therapeutic strategies. PMID:28874452
Synaptogenesis and heritable aspects of executive attention.

PubMed

Fossella, John A; Sommer, Tobias; Fan, Jin; Pfaff, Don; Posner, Michael I

2003-01-01

In humans, changes in brain structure and function can be measured non-invasively during postnatal development. In animals, advanced optical imaging measures can track the formation of synapses during learning and behavior. With the recent progress in these technologies, it is appropriate to begin to assess how the physiological processes of synapse, circuit, and neural network formation relate to the process of cognitive development. Of particular interest is the development of executive function, which develops more gradually in humans. One approach that has shown promise is molecular genetics. The completion of the human genome project and the human genome diversity project make it straightforward to ask whether variation in a particular gene correlates with variation in behavior, brain structure, brain activity, or all of the above. Strategies that unify the wealth of biochemical knowledge pertaining to synapse formation with the functional measures of brain structure and activity may lead to new insights in developmental cognitive psychology. Copyright 2003 Wiley-Liss, Inc.
High-throughput screening of a CRISPR/Cas9 library for functional genomics in human cells.

PubMed

Zhou, Yuexin; Zhu, Shiyou; Cai, Changzu; Yuan, Pengfei; Li, Chunmei; Huang, Yanyi; Wei, Wensheng

2014-05-22

Targeted genome editing technologies are powerful tools for studying biology and disease, and have a broad range of research applications. In contrast to the rapid development of toolkits to manipulate individual genes, large-scale screening methods based on the complete loss of gene expression are only now beginning to be developed. Here we report the development of a focused CRISPR/Cas-based (clustered regularly interspaced short palindromic repeats/CRISPR-associated) lentiviral library in human cells and a method of gene identification based on functional screening and high-throughput sequencing analysis. Using knockout library screens, we successfully identified the host genes essential for the intoxication of cells by anthrax and diphtheria toxins, which were confirmed by functional validation. The broad application of this powerful genetic screening strategy will not only facilitate the rapid identification of genes important for bacterial toxicity but will also enable the discovery of genes that participate in other biological processes.
Phylogenomic evolutionary surveys of subtilase superfamily genes in fungi.

PubMed

Li, Juan; Gu, Fei; Wu, Runian; Yang, JinKui; Zhang, Ke-Qin

2017-03-30

Subtilases belong to a superfamily of serine proteases which are ubiquitous in fungi and are suspected to have developed distinct functional properties to help fungi adapt to different ecological niches. In this study, we conducted a large-scale phylogenomic survey of subtilase protease genes in 83 whole genome sequenced fungal species in order to identify the evolutionary patterns and subsequent functional divergences of different subtilase families among the main lineages of the fungal kingdom. Our comparative genomic analyses of the subtilase superfamily indicated that extensive gene duplications, losses and functional diversifications have occurred in fungi, and that the four families of subtilase enzymes in fungi, including proteinase K-like, pyrolysin, kexin and S53, have distinct evolutionary histories which may have facilitated the adaptation of fungi to a broad array of life strategies. Our study provides new insights into the evolution of the subtilase superfamily in fungi and expands our understanding of the evolution of fungi with different lifestyles.
Improving transcriptome de novo assembly by using a reference genome of a related species: Translational genomics from oil palm to coconut

PubMed Central

Armero, Alix; Bocs, Stéphanie; This, Dominique

2017-01-01

The palms are a family of tropical origin and one of the main constituents of the ecosystems of these regions around the world. The two main species of palm represent different challenges: coconut (Cocos nucifera L.) is a source of multiple goods and services in tropical communities, while oil palm (Elaeis guineensis Jacq) is the main protagonist of the oil market. In this study, we present a workflow that exploits the comparative genomics between a target species (coconut) and a reference species (oil palm) to improve the transcriptomic data, providing a proteome useful to answer functional or evolutionary questions. This workflow reduces redundancy and fragmentation, two inherent problems of transcriptomic data, while preserving the functional representation of the target species. Our approach was validated in Arabidopsis thaliana using Arabidopsis lyrata and Capsella rubella as reference species. This analysis showed the high sensitivity and specificity of our strategy, relatively independent of the reference proteome. The workflow increased the length of protein products in A. thaliana by 13%, often allowing recovery of 100% of the protein sequence length. In addition, redundancy was reduced by a factor greater than 3. In coconut, the approach generated 29,366 proteins, 1,246 of which derived from new contigs obtained with the BRANCH software. The coconut proteome presented a functional profile similar to that observed in rice and an important number of metabolic pathways related to secondary metabolism. The new sequences found with BRANCH software were enriched in functions related to biotic stress. Our strategy can be used as a complementary step to de novo transcriptome assembly to get a representative proteome of a target species. The results of the current analysis are available on the website PalmComparomics (http://palm-comparomics.southgreen.fr/). PMID:28334050
ABrowse--a customizable next-generation genome browser framework.

PubMed

Kong, Lei; Wang, Jun; Zhao, Shuqi; Gu, Xiaocheng; Luo, Jingchu; Gao, Ge

2012-01-05

With the rapid growth of genome sequencing projects, the genome browser is becoming indispensable, not only as a visualization system but also as an interactive platform to support open data access and collaborative work. Thus, a customizable genome browser framework with rich functions and flexible configuration is needed to facilitate various genome research projects. Based on next-generation web technologies, we have developed a general-purpose genome browser framework, ABrowse, which provides an interactive browsing experience, open data access and collaborative work support. By supporting Google-map-like smooth navigation, ABrowse offers end users a highly interactive browsing experience. To facilitate further data analysis, multiple data access approaches are supported for external platforms to retrieve data from ABrowse. To promote collaborative work, an online user-space is provided for end users to create, store and share comments, annotations and landmarks. For data providers, ABrowse is highly customizable and configurable. The framework provides a set of utilities to import annotation data conveniently. To build ABrowse on existing annotation databases, data providers can specify SQL statements according to the database schema, and customized pages for detailed display of annotation entries can be easily plugged in. For developers, new drawing strategies can be integrated into ABrowse for new types of annotation data. In addition, a standard web service is provided for remote data retrieval, offering an underlying machine-oriented programming interface for open data access. The ABrowse framework is valuable for end users, data providers and developers by providing rich user functions and flexible customization approaches. The source code is published under GNU Lesser General Public License v3.0 and is accessible at http://www.abrowse.org/. To demonstrate all the features of ABrowse, a live demo for the Arabidopsis thaliana genome has been built at http://arabidopsis.cbi.edu.cn/.
Factors affecting reproducibility between genome-scale siRNA-based screens

PubMed Central

Barrows, Nicholas J.; Le Sommer, Caroline; Garcia-Blanco, Mariano A.; Pearson, James L.

2011-01-01

RNA interference-based screening is a powerful new genomic technology which addresses gene function en masse. To evaluate factors influencing hit list composition and reproducibility, we performed two identically designed small interfering RNA (siRNA)-based, whole genome screens for host factors supporting yellow fever virus infection. These screens represent two separate experiments completed five months apart and allow the direct assessment of the reproducibility of a given siRNA technology when performed in the same environment. Candidate hit lists generated by sum rank, median absolute deviation, z-score, and strictly standardized mean difference were compared within and between whole genome screens. Application of these analysis methodologies within a single screening dataset using a fixed threshold equivalent to a p-value ≤ 0.001 resulted in hit lists ranging from 82 to 1,140 members and highlighted the tremendous impact analysis methodology has on hit list composition. Intra- and inter-screen reproducibility was significantly influenced by the analysis methodology and ranged from 32% to 99%. This study also highlighted the power of testing at least two independent siRNAs for each gene product in primary screens. To facilitate validation we conclude by suggesting methods to reduce false discovery at the primary screening stage. In this study we present the first comprehensive comparison of multiple analysis strategies, and demonstrate the impact of the analysis methodology on the composition of the “hit list”. Therefore, we propose that the entire dataset derived from functional genome-scale screens, especially if publicly funded, should be made available as is done with data derived from gene expression and genome-wide association studies. PMID:20625183
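To illustrate why the choice of scoring method can change a hit list, the following Python sketch compares a plain z-score with a robust z-score based on the median absolute deviation (MAD) on made-up readouts. The data, the function names and the cutoff of 3 are assumptions for illustration only; they are not the study's data, thresholds or code.

```python
import statistics

def z_scores(values):
    """Classical z-score: centre on the mean, scale by the sample SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def robust_z_scores(values):
    """Robust z-score: centre on the median, scale by 1.4826 * MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad  # makes the MAD comparable to an SD for normal data
    return [(v - med) / scale for v in values]

def hits(scores, threshold):
    """Indices of wells whose score is at or below -threshold."""
    return {i for i, s in enumerate(scores) if s <= -threshold}

# Hypothetical normalised infection readouts for ten siRNA wells.
readouts = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 0.2, 1.02, 0.98, 0.3]

z_hits = hits(z_scores(readouts), threshold=3)
robust_hits = hits(robust_z_scores(readouts), threshold=3)
print("z-score hits:", sorted(z_hits))
print("robust z-score hits:", sorted(robust_hits))
```

In this toy example the two strong knockdowns inflate the standard deviation, so the mean-based score is less sensitive to them than the median-based score at the same cutoff.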

Computational and informatics strategies for identification of specific protein interaction partners in affinity purification mass spectrometry experiments

PubMed Central

Nesvizhskii, Alexey I.

2013-01-01

Analysis of protein interaction networks and protein complexes using affinity purification and mass spectrometry (AP/MS) is among the most commonly used and successful applications of proteomics technologies. One of the foremost challenges of AP/MS data is the large number of false positive protein interactions present in unfiltered datasets. Here we review computational and informatics strategies for detecting specific protein interaction partners in AP/MS experiments, with a focus on incomplete (as opposed to genome-wide) interactome mapping studies. These strategies range from standard statistical approaches, to empirical scoring schemes optimized for a particular type of data, to advanced computational frameworks. The common denominator among these methods is the use of label-free quantitative information such as spectral counts or integrated peptide intensities that can be extracted from AP/MS data. We also discuss related issues such as combining multiple biological or technical replicates, and dealing with data generated using different tagging strategies. Computational approaches for benchmarking of scoring methods are discussed, and the need for generation of reference AP/MS datasets is highlighted. Finally, we discuss the possibility of more extended modeling of experimental AP/MS data, including integration with external information such as protein interaction predictions based on functional genomics data. PMID:22611043
Comparative Genomics Reveals the Core Gene Toolbox for the Fungus-Insect Symbiosis

PubMed Central

Stata, Matt; Wang, Wei; White, Merlin M.; Moncalvo, Jean-Marc

2018-01-01

Modern genomics has shed light on many entomopathogenic fungi and expanded our knowledge widely; however, little is known about the genomic features of the insect-commensal fungi. Harpellales are obligate commensals living in the digestive tracts of disease-bearing insects (black flies, midges, and mosquitoes). In this study, we produced and annotated whole-genome sequences of nine Harpellales taxa and conducted the first comparative analyses to infer the genomic diversity within the members of the Harpellales. The genomes of the insect gut fungi feature low (26% to 37%) GC content and large genome size variations (25 to 102 Mb). Further comparisons with insect-pathogenic fungi (from both Ascomycota and Zoopagomycota), as well as with free-living relatives (as negative controls), helped to identify a gene toolbox that is essential to the fungus-insect symbiosis. The results not only narrow the genomic scope of fungus-insect interactions from several thousands to eight core players but also distinguish host invasion strategies employed by insect pathogens and commensals. The genomic content suggests that insect commensal fungi rely mostly on adhesion protein anchors that target the digestive system, while entomopathogenic fungi have higher numbers of transmembrane helices, signal peptides, and pathogen-host interaction (PHI) genes across the whole genome and enrich genes as well as functional domains to inactivate the host inflammation system and suppress the host defense. Phylogenomic analyses have revealed that genome sizes of Harpellales fungi vary among lineages with an integer-multiple pattern, which implies that ancient genome duplications may have occurred within the gut of insects. PMID:29764946
Bacterial CRISPR: Accomplishments and Prospects

PubMed Central

Peters, Jason M.; Silvis, Melanie R.; Zhao, Dehua; Hawkins, John S.; Gross, Carol A.; Qi, Lei S.

2015-01-01

In this review we briefly describe the development of CRISPR tools for genome editing and control of transcription in bacteria. We focus on the Type II CRISPR/Cas9 system, provide specific examples for use of the system, and highlight the advantages and disadvantages of CRISPR versus other techniques. We suggest potential strategies for combining CRISPR tools with high-throughput approaches to elucidate gene function in bacteria. PMID:26363124
From proteomics to systems biology: MAPA, MASS WESTERN, PROMEX, and COVAIN as a user-oriented platform.

PubMed

Weckwerth, Wolfram; Wienkoop, Stefanie; Hoehenwarter, Wolfgang; Egelhofer, Volker; Sun, Xiaoliang

2014-01-01

Genome sequencing and systems biology are revolutionizing life sciences. Proteomics emerged as a fundamental technique of this novel research area as it is the basis for gene function analysis and modeling of dynamic protein networks. Here a complete proteomics platform suited for functional genomics and systems biology is presented. The strategy includes MAPA (mass accuracy precursor alignment; http://www.univie.ac.at/mosys/software.html) as a rapid exploratory analysis step; MASS WESTERN for targeted proteomics; COVAIN (http://www.univie.ac.at/mosys/software.html) for multivariate statistical analysis, data integration, and data mining; and PROMEX (http://www.univie.ac.at/mosys/databases.html) as a database module for proteogenomics and proteotypic peptides for targeted analysis. Moreover, the presented platform can also be utilized to integrate metabolomics and transcriptomics data for the analysis of metabolite-protein-transcript correlations and time course analysis using COVAIN. Examples for the integration of MAPA and MASS WESTERN data, proteogenomic and metabolic modeling approaches for functional genomics, phosphoproteomics by integration of MOAC (metal-oxide affinity chromatography) with MAPA, and the integration of metabolomics, transcriptomics, proteomics, and physiological data using this platform are presented. All software and step-by-step tutorials for data processing and data mining can be downloaded from http://www.univie.ac.at/mosys/software.html.
A novel transgenic chimaeric mouse system for the rapid functional evaluation of genes encoding secreted proteins

PubMed Central

Kakitani, Makoto; Oshima, Takeshi; Horikoshi, Kaori; Yoshitome, Tetsuo; Ueda, Akiko; Kajikawa, Miwa; Iba, Yumi; Ozone, Yoshinao; Ijima, Yuki; Yoshino, Tohko; Itoh, Mikiko; Seki, Sachiko; Aoki, Ayako; Ishihara, Toshie; Shionoya, Michiyo; Makino, Utako; Kitada, Rina; Ohguma, Atsuko; Ohta, Takami; Yoshida, Yoshimasa; Kudoh, Hiroe; Hanaoka, Kazunori; Sibuya, Kazunori; Ishida, Isao; Kakeda, Minoru; Yagi, Mikio; Yoneya, Takashi; Tomizuka, Kazuma

2005-01-01

A major challenge of the post-genomic era is the functional characterization of anonymous open reading frames (ORFs) identified by the Human Genome Project. In this context, there is a strong requirement for the development of technologies that enhance our ability to analyze gene functions at the level of the whole organism. Here, we describe a rapid and efficient procedure to generate transgenic chimaeric mice that continuously secrete a foreign protein into the systemic circulation. The transgene units were inserted into the genomic site adjacent to the endogenous immunoglobulin (Ig) κ locus by homologous recombination, using a modified mouse embryonic stem (ES) cell line that exhibits a high frequency of homologous recombination at the Igκ region. The resultant ES clones were injected into embryos derived from a B-cell-deficient host strain, thus producing chimaerism-independent, B-cell-specific transgene expression. This feature of the system eliminates the time-consuming breeding typically implemented in standard transgenic strategies and allows for evaluating the effect of ectopic transgene expression directly in the resulting chimaeric mice. To demonstrate the utility of this system we showed high-level protein expression in the sera and severe phenotypes in human EPO (hEPO) and murine thrombopoietin (mTPO) transgenic chimaeras. PMID:15914664
Systems biology-based approaches toward understanding drought tolerance in food crops.

PubMed

Jogaiah, Sudisha; Govind, Sharathchandra Ramsandra; Tran, Lam-Son Phan

2013-03-01

Economically important crops, such as maize, wheat, rice, barley, and other food crops, are affected by even small changes in water potential at important growth stages. Developing a comprehensive understanding of host response to drought requires a global view of the complex mechanisms involved. Research on drought tolerance has generally been conducted using discipline-specific approaches. However, plant stress response is complex and interlinked to a point where discipline-specific approaches do not give a complete global analysis of all the interlinked mechanisms. A systems biology perspective is needed to understand the genome-scale networks required for building long-lasting drought resistance. Network maps have been constructed by integrating multiple functional genomics data from both model plants, such as Arabidopsis thaliana, Lotus japonicus, and Medicago truncatula, and various food crops, such as rice and soybean. Useful functional genomics data have been obtained from genome-wide comparative transcriptome and proteome analyses of drought responses in different crops. This integrative approach, used by many groups, has led to the identification of commonly regulated signaling pathways and genes following exposure to drought. The combination of functional genomics and systems biology is very useful for comparative analysis of other food crops and could help to develop stable food systems worldwide. In addition, studying desiccation tolerance in resurrection plants will unravel how a combination of molecular genetic and metabolic processes interacts to produce a resurrection phenotype. Systems biology-based approaches have helped in understanding how these individual factors and mechanisms (biochemical, molecular, and metabolic) "interact" spatially and temporally. Signaling network maps of such interactions are needed and can be used to design better engineering strategies for improving the drought tolerance of important crop species.
Expanding and reprogramming the genetic code.

PubMed

Chin, Jason W

2017-10-04

Nature uses a limited, conservative set of amino acids to synthesize proteins. The ability to genetically encode an expanded set of building blocks with new chemical and physical properties is transforming the study, manipulation and evolution of proteins, and is enabling diverse applications, including approaches to probe, image and control protein function, and to precisely engineer therapeutics. Underpinning this transformation are strategies to engineer and rewire translation. Emerging strategies aim to reprogram the genetic code so that noncanonical biopolymers can be synthesized and evolved, and to test the limits of our ability to engineer the translational machinery and systematically recode genomes.
Meta-Analysis in Genome-Wide Association Datasets: Strategies and Application in Parkinson Disease

PubMed Central

Evangelou, Evangelos; Maraganore, Demetrius M.; Ioannidis, John P.A.

2007-01-01

Background: Genome-wide association studies hold substantial promise for identifying common genetic variants that regulate susceptibility to complex diseases. However, for the detection of small genetic effects, single studies may be underpowered. Power may be improved by combining genome-wide datasets with meta-analytic techniques. Methodology/Principal Findings: Both single and two-stage genome-wide data may be combined and there are several possible strategies. In the two-stage framework, we considered the options of (1) enhancement of replication data and (2) enhancement of first-stage data, and then, we also considered (3) joint meta-analyses including all first-stage and second-stage data. These strategies were examined empirically using data from two genome-wide association studies (three datasets) on Parkinson disease. In the three strategies, we derived 12, 5, and 49 single nucleotide polymorphisms that show significant associations at conventional levels of statistical significance. None of these remained significant after conservative adjustment for the number of performed analyses in each strategy. However, some may warrant further consideration: 6 SNPs were identified with at least 2 of the 3 strategies and 3 SNPs [rs1000291 on chromosome 3, rs2241743 on chromosome 4 and rs3018626 on chromosome 11] were identified with all 3 strategies and had no or minimal between-dataset heterogeneity (I² = 0, 0 and 15%, respectively). Analyses were primarily limited by the suboptimal overlap of tested polymorphisms across different datasets (e.g., only 31,192 shared polymorphisms between the two tier 1 datasets). Conclusions/Significance: Meta-analysis may be used to improve the power and examine the between-dataset heterogeneity of genome-wide association studies. Prospective designs may be most efficient, if they try to maximize the overlap of genotyping platforms and anticipate the combination of data across many genome-wide association studies. PMID:17332845
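The abstract does not state which meta-analysis model was used. As a minimal sketch, assuming a fixed-effect inverse-variance model, the Python below pools per-dataset effect estimates for one SNP and reports Cochran's Q and the I² heterogeneity statistic. The function name and the example numbers are hypothetical and are not taken from the Parkinson disease datasets.

```python
import math

def fixed_effect_meta(effects, std_errors):
    """Pool per-dataset effects (e.g. log odds ratios) for one SNP.

    Returns (pooled effect, pooled standard error, Cochran's Q, I² in percent).
    Assumes a fixed-effect inverse-variance model.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    df = len(effects) - 1
    # I² = max(0, (Q - df) / Q), expressed as a percentage.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, q, i2

# Hypothetical log odds ratios and standard errors from three datasets.
pooled, pooled_se, q, i2 = fixed_effect_meta([0.20, 0.25, 0.15], [0.08, 0.10, 0.09])
print(f"pooled={pooled:.3f} se={pooled_se:.3f} Q={q:.2f} I2={i2:.0f}%")
```

An I² near 0% indicates that the datasets agree about as well as sampling error alone would predict, which is how the three SNPs highlighted in the abstract are characterized.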
Sox17 drives functional engraftment of endothelium converted from non-vascular cells.

PubMed

Schachterle, William; Badwe, Chaitanya R; Palikuqi, Brisa; Kunar, Balvir; Ginsberg, Michael; Lis, Raphael; Yokoyama, Masataka; Elemento, Olivier; Scandura, Joseph M; Rafii, Shahin

2017-01-16

Transplanting vascular endothelial cells (ECs) to support metabolism and express regenerative paracrine factors is a strategy to treat vasculopathies and to promote tissue regeneration. However, transplantation strategies have been challenging to develop, because ECs are difficult to culture and little is known about how to direct them to stably integrate into vasculature. Here we show that only amniotic cells could convert to cells that maintain EC gene expression. Even so, these converted cells perform sub-optimally in transplantation studies. Constitutive Akt signalling increases expression of EC morphogenesis genes, including Sox17, shifts the genomic targeting of Fli1 to favour nearby Sox consensus sites and enhances the vascular function of converted cells. Enforced expression of Sox17 increases expression of morphogenesis genes and promotes integration of transplanted converted cells into injured vessels. Thus, Ets transcription factors specify non-vascular, amniotic cells to EC-like cells, whereas Sox17 expression is required to confer EC function.
Nutrition and the gut microbiome in the elderly

PubMed Central

Salazar, Nuria; Valdés-Varela, Lorena; González, Sonia; Gueimonde, Miguel; de los Reyes-Gavilán, Clara G.

2017-01-01

The gut microbiota is the assembly of microorganisms living in our intestine, and their genomes are known as the microbiome. The correct composition and functionality of this microbiome is essential for maintaining a “healthy status.” Aging is related to changes in the gut microbiota, which are frequently associated with physiological modifications of the gastrointestinal tract as well as with changes in dietary patterns, together with a concomitant decline in cognitive and immune function, all of which contribute to frailty. Therefore, nutritional strategies directed at restoring the microbiota in the elderly have to be addressed from a global perspective, considering not only the microbiota but also other extra-intestinal targets of action. The present review aims to summarize the current knowledge on intestinal microbiota alterations and other functions impaired in the elderly and to analyze tools for implementing nutritional strategies, through the use of probiotics, prebiotics or specific nutrients, in order to counterbalance such alterations. PMID:27808595
Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets.

PubMed

Salem, Saeed; Ozcaglar, Cagri

2014-01-01

Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways.
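The edge weight described above can be sketched as follows. The abstract does not define the two similarity measures, so this Python example uses Jaccard similarity as a placeholder for both: over shared endpoint genes for the topological term, and over the sets of datasets in which each link occurs for the co-appearance term. The function names, the mixing parameter `alpha` and the toy data are all assumptions, not the authors' definitions.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hybrid_weight(link1, link2, datasets_of, alpha=0.5):
    """Linear combination of two similarities between coexpression links.

    link1, link2: frozensets holding the two genes of each coexpression link.
    datasets_of:  maps each link to the set of dataset ids where it appears.
    alpha:        hypothetical mixing weight for the topological term.
    """
    topological = jaccard(link1, link2)  # placeholder: shared endpoint genes
    co_appearance = jaccard(datasets_of[link1], datasets_of[link2])
    return alpha * topological + (1.0 - alpha) * co_appearance

# Toy example: two links that share gene "B" and co-occur in two datasets.
ab = frozenset({"A", "B"})
bc = frozenset({"B", "C"})
datasets_of = {ab: {1, 2, 3}, bc: {2, 3, 4}}
print(hybrid_weight(ab, bc, datasets_of))
```

Computing this weight for every pair of links gives the weighted graph on which a clustering step would then operate.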
A functional genomics approach reveals CHE as a component of the Arabidopsis circadian clock.

PubMed

Pruneda-Paz, Jose L; Breton, Ghislain; Para, Alessia; Kay, Steve A

2009-03-13

Transcriptional feedback loops constitute the molecular circuitry of the plant circadian clock. In Arabidopsis, a core loop is established between CCA1 and TOC1. Although CCA1 directly represses TOC1, the TOC1 protein has no DNA binding domains, which suggests that it cannot directly regulate CCA1. We established a functional genomic strategy that led to the identification of CHE, a TCP transcription factor that binds specifically to the CCA1 promoter. CHE is a clock component partially redundant with LHY in the repression of CCA1. The expression of CHE is regulated by CCA1, thus adding a CCA1/CHE feedback loop to the Arabidopsis circadian network. Because CHE and TOC1 interact, and CHE binds to the CCA1 promoter, a molecular linkage between TOC1 and CCA1 gene regulation is established.
G protein-coupled receptor 30 in tumor development.

PubMed

Wang, Dengfeng; Hu, Lina; Zhang, Guonan; Zhang, Lin; Chen, Chen

2010-08-01

Estrogen plays several important physiological and pathological roles, not only in the reproductive system but in many other systems as well. Its transcriptional activation has traditionally been described as being mediated by classic nuclear estrogen receptors (ERs). It has, however, recently been established that a novel functional estrogen transmembrane receptor, G protein-coupled receptor 30 (GPR30), modulates both rapid non-genomic events and genomic transcriptional events of estrogen. It has been demonstrated that GPR30 promotes the progression of estrogen-related tumors through mitogen-activated protein kinase (MAPK) signaling pathways. Effects mediated by GPR30 are maintained when classic ERs are absent or blocked. In addition, GPR30 is involved in drug resistance, which often occurs during cancer treatment. All these new findings strongly imply that GPR30 may be an important therapeutic target for estrogen-related tumors. Simultaneously blocking both GPR30 and classic ERs may be a better strategy for the treatment of estrogen-related tumors.
Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets

PubMed Central

Macosko, Evan Z.; Basu, Anindita; Satija, Rahul; Nemesh, James; Shekhar, Karthik; Goldman, Melissa; Tirosh, Itay; Bialas, Allison R.; Kamitaki, Nolan; Martersteck, Emily M.; Trombetta, John J.; Weitz, David A.; Sanes, Joshua R.; Shalek, Alex K.; Regev, Aviv; McCarroll, Steven A.

2015-01-01

Cells, the basic units of biological structure and function, vary broadly in type and state. Single-cell genomics can characterize cell identity and function, but limitations of ease and scale have prevented its broad application. Here we describe Drop-Seq, a strategy for quickly profiling thousands of individual cells by separating them into nanoliter-sized aqueous droplets, associating a different barcode with each cell’s RNAs, and sequencing them all together. Drop-Seq analyzes mRNA transcripts from thousands of individual cells simultaneously while remembering transcripts’ cell of origin. We analyzed transcriptomes from 44,808 mouse retinal cells and identified 39 transcriptionally distinct cell populations, creating a molecular atlas of gene expression for known retinal cell classes and novel candidate cell subtypes. Drop-Seq will accelerate biological discovery by enabling routine transcriptional profiling at single-cell resolution. PMID:26000488
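The idea of pooling all reads and then recovering each transcript's cell of origin from its barcode can be sketched in a few lines of Python. This is only a conceptual illustration: the input format of (cell barcode, gene) pairs, the function name and the toy barcodes are assumptions, and it does not reproduce the published Drop-Seq processing pipeline.

```python
from collections import Counter, defaultdict

def count_by_cell(reads):
    """Group pooled reads by cell barcode and count reads per gene.

    reads: iterable of (cell_barcode, gene) pairs, assumed to be already
           extracted from the sequencing data and aligned to genes.
    Returns a dict mapping each barcode to a Counter of gene counts.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        counts[barcode][gene] += 1
    return counts

# Hypothetical pooled reads from two droplets (barcodes are made up).
pooled_reads = [
    ("AAAC", "Rho"), ("AAAC", "Rho"), ("AAAC", "Gnat1"),
    ("GGTT", "Opn1sw"), ("GGTT", "Arr3"),
]
profiles = count_by_cell(pooled_reads)
for barcode, genes in profiles.items():
    print(barcode, dict(genes))
```

Each per-barcode counter is one cell's expression profile, and a table of such profiles is the kind of input a clustering step would use to separate cell populations.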
Multiple roles of genome-attached bacteriophage terminal proteins

DOE Office of Scientific and Technical Information (OSTI.GOV)

Redrejo-Rodríguez, Modesto; Salas, Margarita, E-mail: msalas@cbm.csic.es

2014-11-15

Protein-primed replication constitutes a generalized mechanism to initiate DNA or RNA synthesis in linear genomes, including viruses, gram-positive bacteria, linear plasmids and mobile elements. By this mechanism a specific amino acid of the terminal protein (TP) primes replication and becomes covalently linked to the genome ends. Despite the fact that TPs lack sequence homology, they share a similar structural arrangement, with the priming residue in the C-terminal half of the protein and an accumulation of positively charged residues at the N-terminal end. In addition, various bacteriophage TPs have been shown to have DNA-binding capacity that targets TPs and their attached genomes to the host nucleoid. Furthermore, a number of bacteriophage TPs from different viral families and with diverse hosts also contain putative nuclear localization signals and localize in the eukaryotic nucleus, which could lead to the transport of the attached DNA. This suggests a possible role of bacteriophage TPs in prokaryote-to-eukaryote horizontal gene transfer. Highlights: • Protein-primed genome replication constitutes a strategy to initiate DNA or RNA synthesis in linear genomes. • Bacteriophage terminal proteins (TPs) are covalently attached to viral genomes through their primary function of priming DNA replication. • TPs are also DNA-binding proteins and target phage genomes to the host nucleoid. • TPs can also localize in the eukaryotic nucleus and may have a role in phage-mediated interkingdom gene transfer.
Functional Genomics of Drought Tolerance in Bioenergy Crops

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yin, Hengfu; Chen, Rick; Yang, Jun

2014-01-01

With the predicted trends in climate change, drought will increasingly impose a grand challenge to biomass production. Most bioenergy crops have some degree of drought susceptibility with low water-use efficiency (WUE). It is imperative to improve drought tolerance and WUE in bioenergy crops for sustainable biomass production in arid and semi-arid regions with minimal water input. Genetics and functional genomics can play a critical role in generating knowledge to inform and aid genetic improvement of drought tolerance in bioenergy crops. The molecular aspect of drought response has been extensively investigated in model plants like Arabidopsis, yet our understanding of the molecular mechanisms underlying drought tolerance in bioenergy crops is limited. Crops exhibit various responses to drought stress depending on species and genotype. A rational strategy for studying drought tolerance in bioenergy crops is to translate the knowledge from model plants and pinpoint the unique features associated with individual species and genotypes. In this review, we summarize the general knowledge about drought responsive pathways in plants, with a focus on the identification of commonality and specialty in drought responsive mechanisms among different species and/or genotypes. We describe the genomic resources developed for bioenergy crops and discuss genetic and epigenetic regulation of drought responses. We also examine comparative and evolutionary genomics to leverage the ever-increasing genomics resources and provide new insights beyond what has been known from studies on individual species. Finally, we outline future exploration of drought tolerance using the emerging new technologies.
Activity-based protein profiling: from enzyme chemistry to proteomic chemistry.

PubMed

Cravatt, Benjamin F; Wright, Aaron T; Kozarich, John W

2008-01-01

Genome sequencing projects have provided researchers with a complete inventory of the predicted proteins produced by eukaryotic and prokaryotic organisms. Assignment of functions to these proteins represents one of the principal challenges for the field of proteomics. Activity-based protein profiling (ABPP) has emerged as a powerful chemical proteomic strategy to characterize enzyme function directly in native biological systems on a global scale. Here, we review the basic technology of ABPP, the enzyme classes addressable by this method, and the biological discoveries attributable to its application.
Rice epigenomics and epigenetics: challenges and opportunities.

PubMed

Chen, Xiangsong; Zhou, Dao-Xiu

2013-05-01

In recent years, rice genome-wide epigenomic information, such as DNA methylation and histone modifications, which are important for genome activity, has accumulated. The function of a number of rice epigenetic regulators has been studied, many of which are found to be involved in a diverse range of developmental and stress-responsive pathways. Analysis of epigenetic variations among different rice varieties indicates that epigenetic modification may lead to inheritable phenotypic variation. Characterizing the phenotypic consequences of rice epigenomic variations and the underlying chromatin mechanism, and identifying epialleles related to important agronomic traits, may provide novel strategies to enhance agronomically favorable traits and grain productivity in rice. Copyright © 2013 Elsevier Ltd. All rights reserved.
Genome Mining for Ribosomally Synthesized Natural Products

PubMed Central

Velásquez, Juan E.; van der Donk, Wilfred

2011-01-01

In recent years, the number of known peptide natural products that are synthesized via the ribosomal pathway has rapidly grown. Taking advantage of sequence homology among genes encoding precursor peptides or biosynthetic proteins, in silico mining of genomes combined with molecular biology approaches has guided the discovery of a large number of new ribosomal natural products, including lantipeptides, cyanobactins, linear thiazole/oxazole-containing peptides, microviridins, lasso peptides, amatoxins, cyclotides, and conopeptides. In this review, we describe the strategies used for the identification of these ribosomally-synthesized and posttranslationally modified peptides (RiPPs) and the structures of newly identified compounds. The increasing number of chemical entities and their remarkable structural and functional diversity may lead to novel pharmaceutical applications. PMID:21095156
Prochlorococcus: Advantages and Limits of Minimalism

NASA Astrophysics Data System (ADS)

Partensky, Frédéric; Garczarek, Laurence

2010-01-01

Prochlorococcus is the key phytoplanktonic organism of tropical gyres, large ocean regions that are depleted of the essential macronutrients needed for photosynthesis and cell growth. This cyanobacterium has adapted itself to oligotrophy by minimizing the resources necessary for life through a drastic reduction of cell and genome sizes. This rarely observed strategy in free-living organisms has conferred on Prochlorococcus a considerable advantage over other phototrophs, including its closest relative Synechococcus, for life in this vast yet little variable ecosystem. However, this strategy seems to reach its limits in the upper layer of the S Pacific gyre, the most oligotrophic region of the world ocean. By losing some important genes and/or functions during evolution, Prochlorococcus has seemingly become dependent on co-occurring microorganisms. In this review, we present some of the recent advances in the ecology, biology, and evolution of Prochlorococcus, which because of its ecological importance and tiny genome is rapidly imposing itself as a model organism in environmental microbiology.

Prochlorococcus: advantages and limits of minimalism.

PubMed

Partensky, Frédéric; Garczarek, Laurence

2010-01-01

Prochlorococcus is the key phytoplanktonic organism of tropical gyres, large ocean regions that are depleted of the essential macronutrients needed for photosynthesis and cell growth. This cyanobacterium has adapted itself to oligotrophy by minimizing the resources necessary for life through a drastic reduction of cell and genome sizes. This rarely observed strategy in free-living organisms has conferred on Prochlorococcus a considerable advantage over other phototrophs, including its closest relative Synechococcus, for life in this vast yet little variable ecosystem. However, this strategy seems to reach its limits in the upper layer of the S Pacific gyre, the most oligotrophic region of the world ocean. By losing some important genes and/or functions during evolution, Prochlorococcus has seemingly become dependent on co-occurring microorganisms. In this review, we present some of the recent advances in the ecology, biology, and evolution of Prochlorococcus, which because of its ecological importance and tiny genome is rapidly imposing itself as a model organism in environmental microbiology.
United States Department of Agriculture-Agricultural Research Service: advances in the molecular genetic analysis of insects and their application to pest management.

PubMed

Handler, Alfred M; Beeman, Richard W

2003-01-01

USDA-ARS scientists have made important contributions to the molecular genetic analysis of agriculturally important insects, and have been at the forefront of using this information for the development of new pest management strategies. Advances have been made in the identification and analysis of genetic systems involved in insect development, reproduction and behavior which enable the identification of new targets for control, as well as the development of highly specific insecticidal products. Other studies have been on the leading edge of developing gene transfer technology to better elucidate these biological processes through functional genomics and to develop new transgenic strains for biological control. Important contributions have also been made to the development and use of molecular markers and methodologies to identify and track insect populations. The use of molecular genetic technology and strategies will become increasingly important to pest management as genomic sequencing information becomes available from important pest insects, their targets and other associated organisms.
Ecological roles of dominant and rare prokaryotes in acid mine drainage revealed by metagenomics and metatranscriptomics.

PubMed

Hua, Zheng-Shuang; Han, Yu-Jiao; Chen, Lin-Xing; Liu, Jun; Hu, Min; Li, Sheng-Jin; Kuang, Jia-Liang; Chain, Patrick S G; Huang, Li-Nan; Shu, Wen-Sheng

2015-06-01

High-throughput sequencing is expanding our knowledge of microbial diversity in the environment. Still, understanding the metabolic potentials and ecological roles of rare and uncultured microbes in natural communities remains a major challenge. To this end, we applied a 'divide and conquer' strategy that partitioned a massive metagenomic data set (>100 Gbp) into subsets based on K-mer frequency in sequence assembly to a low-diversity acid mine drainage (AMD) microbial community and, by integrating with an additional metatranscriptomic assembly, successfully obtained 11 draft genomes most of which represent yet uncultured and/or rare taxa (relative abundance <1%). We report the first genome of a naturally occurring Ferrovum population (relative abundance >90%) and its metabolic potentials and gene expression profile, providing initial molecular insights into the ecological role of these lesser known, but potentially important, microorganisms in the AMD environment. Gene transcriptional analysis of the active taxa revealed major metabolic capabilities executed in situ, including carbon- and nitrogen-related metabolisms associated with syntrophic interactions, iron and sulfur oxidation, which are key in energy conservation and AMD generation, and the mechanisms of adaptation and response to the environmental stresses (heavy metals, low pH and oxidative stress). Remarkably, nitrogen fixation and sulfur oxidation were performed by the rare taxa, indicating their critical roles in the overall functioning and assembly of the AMD community. Our study demonstrates the potential of the 'divide and conquer' strategy in high-throughput sequencing data assembly for genome reconstruction and functional partitioning analysis of both dominant and rare species in natural microbial assemblages.
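The partitioning idea in the abstract above (splitting reads into subsets by k-mer frequency before assembly) can be illustrated with a toy sketch. This is not the authors' pipeline: the function names, the choice of k, the median rule and the threshold are all hypothetical values chosen only to show the principle that reads from abundant genomes carry high-count k-mers while reads from rare genomes carry low-count k-mers.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def partition_reads(reads, k=4, threshold=3):
    """Split reads into (high, low) bins by their median k-mer count.

    Hypothetical illustration: k and threshold are arbitrary, and a real
    metagenomic workflow would use far longer k-mers and disk-backed counting.
    """
    # Count every k-mer across the whole read set.
    counts = Counter(km for read in reads for km in kmers(read, k))
    high, low = [], []
    for read in reads:
        freqs = sorted(counts[km] for km in kmers(read, k))
        if not freqs:
            # Reads shorter than k carry no k-mers; treat them as low coverage.
            low.append(read)
            continue
        median = freqs[len(freqs) // 2]
        (high if median >= threshold else low).append(read)
    return high, low


# Three copies of a read from an "abundant" genome and one "rare" read.
reads = ["ACGTACGT"] * 3 + ["TTGCAAGC"]
high, low = partition_reads(reads)
```

Each bin could then be assembled separately, which is the sense in which the subsets are "divided and conquered"; the sketch stops at the partitioning step.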
Drought response in wheat: key genes and regulatory mechanisms controlling root system architecture and transpiration efficiency

NASA Astrophysics Data System (ADS)

Kulkarni, Manoj; Soolanayakanahally, Raju; Ogawa, Satoshi; Uga, Yusaku; Selvaraj, Michael G.; Kagale, Sateesh

2017-12-01

Abiotic stresses such as drought, heat, salinity and flooding threaten global food security. Crop genetic improvement with increased resilience to abiotic stresses is a critical component of crop breeding strategies. Wheat is an important cereal crop and a staple food source globally. Enhanced drought tolerance in wheat is critical for sustainable food production and global food security. Recent advances in drought tolerance research have uncovered many key genes and transcription regulators governing morpho-physiological traits. Genes controlling root architecture and stomatal development play an important role in soil moisture extraction and its retention, and therefore have been targets of molecular breeding strategies for improving drought tolerance. In this systematic review, we have summarized evidence of beneficial contributions of root and stomatal traits to plant adaptation to drought stress. Specifically, we discuss a few key genes such as DRO1 in rice and ERECTA in Arabidopsis and rice that were identified to be the enhancers of drought tolerance via regulation of root traits and transpiration efficiency. Additionally, we highlight several transcription factor families, such as ERF (ethylene response factors), DREB (dehydration responsive element binding), ZFP (zinc finger proteins), WRKY and MYB that were identified to be both positive and negative regulators of drought responses in wheat, rice, maize and/or Arabidopsis. The overall aim of this review was to provide an overview of candidate genes that have been tested as regulators of drought response in plants. The lack of a reference genome sequence for wheat and nontransgenic approaches for manipulation of gene functions in the past had impeded high-resolution interrogation of functional elements, including genes and QTLs, and their application in cultivar improvement. 
The recent developments in wheat genomics and reverse genetics, including the availability of a gold-standard reference genome sequence and the advent of genome editing technologies, are expected to aid in deciphering the functional roles of genes and regulatory networks underlying adaptive phenological traits, and in utilizing the outcomes of such studies to develop drought-tolerant cultivars.
Drought Response in Wheat: Key Genes and Regulatory Mechanisms Controlling Root System Architecture and Transpiration Efficiency.

PubMed

Kulkarni, Manoj; Soolanayakanahally, Raju; Ogawa, Satoshi; Uga, Yusaku; Selvaraj, Michael G; Kagale, Sateesh

2017-01-01

Abiotic stresses such as drought, heat, salinity, and flooding threaten global food security. Crop genetic improvement with increased resilience to abiotic stresses is a critical component of crop breeding strategies. Wheat is an important cereal crop and a staple food source globally. Enhanced drought tolerance in wheat is critical for sustainable food production and global food security. Recent advances in drought tolerance research have uncovered many key genes and transcription regulators governing morpho-physiological traits. Genes controlling root architecture and stomatal development play an important role in soil moisture extraction and its retention, and therefore have been targets of molecular breeding strategies for improving drought tolerance. In this systematic review, we have summarized evidence of beneficial contributions of root and stomatal traits to plant adaptation to drought stress. Specifically, we discuss a few key genes such as DRO1 in rice and ERECTA in Arabidopsis and rice that were identified to be the enhancers of drought tolerance via regulation of root traits and transpiration efficiency. Additionally, we highlight several transcription factor families, such as ERF (ethylene response factors), DREB (dehydration responsive element binding), ZFP (zinc finger proteins), WRKY, and MYB that were identified to be both positive and negative regulators of drought responses in wheat, rice, maize, and/or Arabidopsis. The overall aim of this review is to provide an overview of candidate genes that have been identified as regulators of drought response in plants. The lack of a reference genome sequence for wheat and non-transgenic approaches for manipulation of gene functions in wheat in the past had impeded high-resolution interrogation of functional elements, including genes and QTLs, and their application in cultivar improvement. The recent developments in wheat genomics and reverse genetics, including the availability of a gold-standard reference genome sequence and the advent of genome editing technologies, are expected to aid in deciphering the functional roles of genes and regulatory networks underlying adaptive phenological traits, and in utilizing the outcomes of such studies to develop drought-tolerant cultivars.
Personalized biochemistry and biophysics.

PubMed

Kroncke, Brett M; Vanoye, Carlos G; Meiler, Jens; George, Alfred L; Sanders, Charles R

2015-04-28

Whole human genome sequencing of individuals is becoming rapid and inexpensive, enabling new strategies for using personal genome information to help diagnose, treat, and even prevent human disorders for which genetic variations are causative or are known to be risk factors. Many of the exploding number of newly discovered genetic variations alter the structure, function, dynamics, stability, and/or interactions of specific proteins and RNA molecules. Accordingly, there are a host of opportunities for biochemists and biophysicists to participate in (1) developing tools to allow accurate and sometimes medically actionable assessment of the potential pathogenicity of individual variations and (2) establishing the mechanistic linkage between pathogenic variations and their physiological consequences, providing a rational basis for treatment or preventive care. In this review, we provide an overview of these opportunities and their associated challenges in light of the current status of genomic science and personalized medicine, the latter often termed precision medicine.
Personalized Biochemistry and Biophysics

PubMed Central

2016-01-01

Whole human genome sequencing of individuals is becoming rapid and inexpensive, enabling new strategies for using personal genome information to help diagnose, treat, and even prevent human disorders for which genetic variations are causative or are known to be risk factors. Many of the exploding number of newly discovered genetic variations alter the structure, function, dynamics, stability, and/or interactions of specific proteins and RNA molecules. Accordingly, there are a host of opportunities for biochemists and biophysicists to participate in (1) developing tools to allow accurate and sometimes medically actionable assessment of the potential pathogenicity of individual variations and (2) establishing the mechanistic linkage between pathogenic variations and their physiological consequences, providing a rational basis for treatment or preventive care. In this review, we provide an overview of these opportunities and their associated challenges in light of the current status of genomic science and personalized medicine, the latter often termed precision medicine. PMID:25856502
Genome Dynamics in Legionella: The Basis of Versatility and Adaptation to Intracellular Replication

PubMed Central

Gomez-Valero, Laura; Buchrieser, Carmen

2013-01-01

Legionella pneumophila is a bacterial pathogen present in aquatic environments that can cause a severe pneumonia called Legionnaires’ disease. Soon after its recognition, it was shown that Legionella replicates inside amoeba, suggesting that bacteria replicating in environmental protozoa are able to exploit conserved signaling pathways in human phagocytic cells. Comparative, evolutionary, and functional genomics suggests that the Legionella–amoeba interaction has shaped this pathogen more than previously thought. A complex evolutionary scenario involving mobile genetic elements, type IV secretion systems, and horizontal gene transfer among Legionella, amoeba, and other organisms seems to take place. This long-lasting coevolution led to the development of very sophisticated virulence strategies and a high level of temporal and spatial fine-tuning of bacteria host–cell interactions. We will discuss current knowledge of the evolution of virulence of Legionella from a genomics perspective and propose our vision of the emergence of this human pathogen from the environment. PMID:23732852
Mutants of Cre recombinase with improved accuracy

PubMed Central

Eroshenko, Nikolai; Church, George M.

2013-01-01

Despite rapid advances in genome engineering technologies, inserting genes into precise locations in the human genome remains an outstanding problem. It has been suggested that site-specific recombinases can be adapted towards use as transgene delivery vectors. The specificity of recombinases can be altered either with directed evolution or via fusions to modular DNA-binding domains. Unfortunately, both wildtype and altered variants often have detectable activities at off-target sites. Here we use bacterial selections to identify mutations in the dimerization surface of Cre recombinase (R32V, R32M, and 303GVSdup) that improve the accuracy of recombination. The mutants are functional in bacteria, in human cells, and in vitro (except for 303GVSdup, which we did not purify), and have improved selectivity against both model off-target sites and the entire E. coli genome. We propose that destabilizing binding cooperativity may be a general strategy for improving the accuracy of dimeric DNA-binding proteins. PMID:24056590
Genome dynamics in Legionella: the basis of versatility and adaptation to intracellular replication.

PubMed

Gomez-Valero, Laura; Buchrieser, Carmen

2013-06-01

Legionella pneumophila is a bacterial pathogen present in aquatic environments that can cause a severe pneumonia called Legionnaires' disease. Soon after its recognition, it was shown that Legionella replicates inside amoeba, suggesting that bacteria replicating in environmental protozoa are able to exploit conserved signaling pathways in human phagocytic cells. Comparative, evolutionary, and functional genomics suggests that the Legionella-amoeba interaction has shaped this pathogen more than previously thought. A complex evolutionary scenario involving mobile genetic elements, type IV secretion systems, and horizontal gene transfer among Legionella, amoeba, and other organisms seems to take place. This long-lasting coevolution led to the development of very sophisticated virulence strategies and a high level of temporal and spatial fine-tuning of bacteria host-cell interactions. We will discuss current knowledge of the evolution of virulence of Legionella from a genomics perspective and propose our vision of the emergence of this human pathogen from the environment.
The Black Queen Hypothesis: evolution of dependencies through adaptive gene loss.

PubMed

Morris, J Jeffrey; Lenski, Richard E; Zinser, Erik R

2012-01-01

Reductive genomic evolution, driven by genetic drift, is common in endosymbiotic bacteria. Genome reduction is less common in free-living organisms, but it has occurred in the numerically dominant open-ocean bacterioplankton Prochlorococcus and "Candidatus Pelagibacter," and in these cases the reduction appears to be driven by natural selection rather than drift. Gene loss in free-living organisms may leave them dependent on cooccurring microbes for lost metabolic functions. We present the Black Queen Hypothesis (BQH), a novel theory of reductive evolution that explains how selection leads to such dependencies; its name refers to the queen of spades in the game Hearts, where the usual strategy is to avoid taking this card. Gene loss can provide a selective advantage by conserving an organism's limiting resources, provided the gene's function is dispensable. Many vital genetic functions are leaky, thereby unavoidably producing public goods that are available to the entire community. Such leaky functions are thus dispensable for individuals, provided they are not lost entirely from the community. The BQH predicts that the loss of a costly, leaky function is selectively favored at the individual level and will proceed until the production of public goods is just sufficient to support the equilibrium community; at that point, the benefit of any further loss would be offset by the cost. Evolution in accordance with the BQH thus generates "beneficiaries" of reduced genomic content that are dependent on leaky "helpers," and it may explain the observed nonuniversality of prototrophy, stress resistance, and other cellular functions in the microbial world.
Hunting for the function of orphan GPCRs – beyond the search for the endogenous ligand

PubMed Central

Ahmad, Raise; Wojciech, Stefanie; Jockers, Ralf

2015-01-01

Seven transmembrane-spanning proteins (7TM), also called GPCRs, are among the most versatile and evolutionarily successful protein families. Out of the 400 non-odourant members identified in the human genome, approximately 100 remain orphans that have not been matched with an endogenous ligand. Apart from the classical deorphanization strategies, several alternative strategies provided recent new insights into the function of these proteins, which hold promise for high therapeutic potential. These alternative strategies consist of the phenotypical characterization of organisms silenced or overexpressing orphan 7TM proteins, the search for constitutive receptor activity and formation of protein complexes including 7TM proteins as well as the development of synthetic, surrogate ligands. Taken together, a variety of ligand-independent functions can be attributed to orphan 7TM proteins that range from constitutive activity to complex formation with other proteins and include ‘true’ orphans for which no ligand exists and ‘conditional’ orphans that behave like orphans in the absence of ligand and as non-orphans in the presence of ligand. PMID:25231237
Ancient Exaptation of a CORE-SINE Retroposon into a Highly Conserved Mammalian Neuronal Enhancer of the Proopiomelanocortin Gene

PubMed Central

Bumaschny, Viviana F; Low, Malcolm J; Rubinstein, Marcelo

2007-01-01

The proopiomelanocortin gene (POMC) is expressed in the pituitary gland and the ventral hypothalamus of all jawed vertebrates, producing several bioactive peptides that function as peripheral hormones or central neuropeptides, respectively. We have recently determined that mouse and human POMC expression in the hypothalamus is conferred by the action of two 5′ distal and unrelated enhancers, nPE1 and nPE2. To investigate the evolutionary origin of the neuronal enhancer nPE2, we searched available vertebrate genome databases and determined that nPE2 is a highly conserved element in placentals, marsupials, and monotremes, whereas it is absent in nonmammalian vertebrates. Following an in silico paleogenomic strategy based on genome-wide searches for paralog sequences, we discovered that opossum and wallaby nPE2 sequences are highly similar to members of the superfamily of CORE-short interspersed nucleotide element (SINE) retroposons, in particular to MAR1 retroposons that are widely present in marsupial genomes. Thus, the neuronal enhancer nPE2 originated from the exaptation of a CORE-SINE retroposon in the lineage leading to mammals and remained under purifying selection in all mammalian orders for the last 170 million years. Expression studies performed in transgenic mice showed that two nonadjacent nPE2 subregions are essential to drive reporter gene expression into POMC hypothalamic neurons, providing the first functional example of an exapted enhancer derived from an ancient CORE-SINE retroposon. In addition, we found that this CORE-SINE family of retroposons is likely to still be active in American and Australian marsupial genomes and that several highly conserved exonic, intronic and intergenic sequences in the human genome originated from the exaptation of CORE-SINE retroposons. Together, our results provide clear evidence of the functional novelties that transposed elements contributed to their host genomes throughout evolution. PMID:17922573
Genomics and functional genomics in Chlamydomonas reinhardtii

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blaby, Ian K.; Blaby-Haas, Crysten E.

The availability of the Chlamydomonas reinhardtii nuclear genome sequence continues to enable researchers to address biological questions relevant to algae, land plants and animals in unprecedented ways. As we continue to characterize and understand biological processes in C. reinhardtii and translate that knowledge to other systems, we are faced with the realization that many genes encode proteins without a defined function. The field of functional genomics aims to close this gap between genome sequence and protein function. Transcriptomes, proteomes and phenomes can each provide layers of gene-specific functional data while supplying a global snapshot of cellular behavior under different conditions. Herein we present a brief history of functional genomics, the present status of the C. reinhardtii genome, how genome-wide experiments can aid in supplying protein function inferences, and provide an outlook for functional genomics in C. reinhardtii.
Genomics and functional genomics in Chlamydomonas reinhardtii

DOE PAGES

Blaby, Ian K.; Blaby-Haas, Crysten E.

2017-03-21

The availability of the Chlamydomonas reinhardtii nuclear genome sequence continues to enable researchers to address biological questions relevant to algae, land plants and animals in unprecedented ways. As we continue to characterize and understand biological processes in C. reinhardtii and translate that knowledge to other systems, we are faced with the realization that many genes encode proteins without a defined function. The field of functional genomics aims to close this gap between genome sequence and protein function. Transcriptomes, proteomes and phenomes can each provide layers of gene-specific functional data while supplying a global snapshot of cellular behavior under different conditions. Herein we present a brief history of functional genomics, the present status of the C. reinhardtii genome, how genome-wide experiments can aid in supplying protein function inferences, and provide an outlook for functional genomics in C. reinhardtii.
A world without bacterial meningitis: how genomic epidemiology can inform vaccination strategy.

PubMed

Rodrigues, Charlene M C; Maiden, Martin C J

2018-01-01

Bacterial meningitis remains an important cause of global morbidity and mortality. Although effective vaccinations exist and are being increasingly used worldwide, bacterial diversity threatens their impact and the ultimate goal of eliminating the disease. Through genomic epidemiology, we can appreciate bacterial population structure and its consequences for transmission dynamics, virulence, antimicrobial resistance, and development of new vaccines. Here, we review what we have learned through genomic epidemiological studies, following the rapid implementation of whole genome sequencing that can help to optimise preventative strategies for bacterial meningitis.
[Fine mapping of complex disease susceptibility loci].

PubMed

Song, Qingfeng; Zhang, Hongxing; Ma, Yilong; Zhou, Gangqiao

2014-01-01

Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers have identified more than 3800 susceptibility loci for more than 660 diseases or traits. However, the most significantly associated variants or causative variants in these loci and their biological functions have remained to be clarified. These causative variants can help to elucidate the pathogenesis and discover new biomarkers of complex diseases. One of the main goals in the post-GWAS era is to identify the causative variants and susceptibility genes, and clarify their functional aspects by fine mapping. For common variants, imputation or re-sequencing based strategies were implemented to increase the number of analyzed variants and help to identify the most significantly associated variants. In addition, functional element, expression quantitative trait locus (eQTL) and haplotype analyses were performed to identify functional common variants and susceptibility genes. For rare variants, fine mapping was carried out by re-sequencing, rare haplotype analysis, family-based analysis, burden test, etc. This review summarizes the strategies and problems for fine mapping.
Intestinal microbiome landscaping: insight in community assemblage and implications for microbial modulation strategies

PubMed Central

Hugenholtz, Floor; Lahti, Leo; Smidt, Hauke; de Vos, Willem M.

2017-01-01

High individuality, large complexity and limited understanding of the mechanisms underlying human intestinal microbiome function remain the major challenges for designing beneficial modulation strategies. Exemplified by the analysis of intestinal bacteria in a thousand Western adults, we discuss key concepts of the human intestinal microbiome landscape, i.e. the compositional and functional ‘core’, the presence of community types and the existence of alternative stable states. Genomic investigation of core taxa revealed functional redundancy, which is expected to stabilize the ecosystem, as well as taxa with specialized functions that have the potential to shape the microbiome landscape. The contrast between Prevotella- and Bacteroides-dominated systems has been well described. However, less known is the effect of not so abundant bacteria, for example, Dialister spp. that have been proposed to exhibit distinct bistable dynamics. Studies employing time-series analysis have highlighted the dynamical variation in the microbiome landscape with and without the effect of defined perturbations, such as the use of antibiotics or dietary changes. We incorporate ecosystem-level observations of the human intestinal microbiota and its keystone species to suggest avenues for designing microbiome modulation strategies to improve host health. PMID:28364729
High precision multi-genome scale reannotation of enzyme function by EFICAz

PubMed Central

Arakaki, Adrian K; Tian, Weidong; Skolnick, Jeffrey

2006-01-01

Background: The functional annotation of most genes in newly sequenced genomes is inferred from similarity to previously characterized sequences, an annotation strategy that often leads to erroneous assignments. We have performed a reannotation of 245 genomes using an updated version of EFICAz, a highly precise method for enzyme function prediction. Results: Based on our three-field EC number predictions, we have obtained lower-bound estimates for the average enzyme content in Archaea (29%), Bacteria (30%) and Eukarya (18%). Most annotations added in KEGG from 2005 to 2006 agree with EFICAz predictions made in 2005. The coverage of EFICAz predictions is significantly higher than that of KEGG, especially for eukaryotes. Thousands of our novel predictions correspond to hypothetical proteins. We have identified a subset of 64 hypothetical proteins with low sequence identity to EFICAz training enzymes, whose biochemical functions have been recently characterized and find that in 96% (84%) of the cases we correctly identified their three-field (four-field) EC numbers. For two of the 64 hypothetical proteins, PA1167 from Pseudomonas aeruginosa, an alginate lyase (EC 4.2.2.3), and Rv1700 of Mycobacterium tuberculosis H37Rv, an ADP-ribose diphosphatase (EC 3.6.1.13), we have detected annotation lag of more than two years in databases. Two examples are presented where EFICAz predictions act as hypothesis generators for understanding the functional roles of hypothetical proteins: FLJ11151, a human protein overexpressed in cancer that EFICAz identifies as an endopolyphosphatase (EC 3.6.1.10), and MW0119, a protein of Staphylococcus aureus strain MW2 that we propose as a candidate virulence factor based on its EFICAz predicted activity, sphingomyelin phosphodiesterase (EC 3.1.4.12). Conclusion: Our results suggest that we have generated enzyme function annotations of high precision and recall. These predictions can be mined and correlated with other information sources to generate biologically significant hypotheses and can be useful for comparative genome analysis and automated metabolic pathway reconstruction. PMID:17166279
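The abstract above distinguishes agreement at the three-field and four-field level of EC numbers. As a minimal sketch of what that comparison means (not EFICAz code; the function names are hypothetical), an EC number can be split on dots and two numbers compared on their leading fields. The second EC number in the usage lines is used only as a differing fourth field for illustration.

```python
def ec_fields(ec):
    """Split an EC number such as '4.2.2.3' into its dot-separated fields."""
    return ec.strip().split(".")


def ec_agreement(predicted, reference, n_fields):
    """Return True if the first n_fields fields of two EC numbers match.

    Hypothetical helper: numbers with fewer than n_fields fields never match.
    """
    p, r = ec_fields(predicted), ec_fields(reference)
    if len(p) < n_fields or len(r) < n_fields:
        return False
    return p[:n_fields] == r[:n_fields]


# Same class, subclass and sub-subclass, but a different serial number.
three_field = ec_agreement("4.2.2.3", "4.2.2.11", 3)
four_field = ec_agreement("4.2.2.3", "4.2.2.11", 4)
```

Under this reading, a prediction can be correct at three fields while still wrong at four, which is consistent with the three-field accuracy reported in the abstract being higher than the four-field accuracy.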
Fungal proteomics: from identification to function.

PubMed

Doyle, Sean

2011-08-01

Some fungi cause disease in humans and plants, while others have demonstrable potential for the control of insect pests. In addition, fungi are also a rich reservoir of therapeutic metabolites and industrially useful enzymes. Detailed analysis of fungal biochemistry is now enabled by multiple technologies including protein mass spectrometry, genome and transcriptome sequencing and advances in bioinformatics. Yet, the assignment of function to fungal proteins, encoded either by in silico annotated, or unannotated genes, remains problematic. The purpose of this review is to describe the strategies used by many researchers to reveal protein function in fungi, and more importantly, to consolidate the nomenclature of 'unknown function protein' as opposed to 'hypothetical protein' - once any protein has been identified by protein mass spectrometry. A combination of approaches including comparative proteomics, pathogen-induced protein expression and immunoproteomics are outlined, which, when used in combination with a variety of other techniques (e.g. functional genomics, microarray analysis, immunochemical and infection model systems), appear to yield comprehensive and definitive information on protein function in fungi. The relative advantages of proteomic, as opposed to transcriptomic-only, analyses are also described. In the future, combined high-throughput, quantitative proteomics, allied to transcriptomic sequencing, are set to reveal much about protein function in fungi. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

RefSeq microbial genomes database: new representation and annotation strategy.

PubMed

Tatusova, Tatiana; Ciufo, Stacy; Fedorov, Boris; O'Neill, Kathleen; Tolstoy, Igor

2014-01-01

The source of the microbial genomic sequences in the RefSeq collection is the set of primary sequence records submitted to the International Nucleotide Sequence Database public archives. These can be accessed through the Entrez search and retrieval system at http://www.ncbi.nlm.nih.gov/genome. Next-generation sequencing has enabled researchers to perform genomic sequencing at rates that were unimaginable in the past. Microbial genomes can now be sequenced in a matter of hours, which has led to a significant increase in the number of assembled genomes deposited in the public archives. This huge increase in DNA sequence data presents new challenges for the annotation, analysis and visualization bioinformatics tools. New strategies have been developed for the annotation and representation of reference genomes and sequence variations derived from population studies and clinical outbreaks.
Targeting Palmitoyl Acyltransferases in Mutant NRAS-Driven Melanoma

DTIC Science & Technology

2015-10-01

activation in melanoma cells using chemical biology and functional genomic approaches. In the first year of the study, we have developed more potent...post-translational modification by adding a 16-carbon palmitate) is required for N-RAS proper membrane localization and its oncogenic activities ...RAS regulation could be a novel strategy to treat N-RAS mutant melanoma. We have developed chemical probes that covalently label the active sites of
Minireview: The Molecular and Genomic Basis for Prostate Cancer Health Disparities

PubMed Central

Bollig-Fischer, Aliccia

2013-01-01

Despite more aggressive screening across all demographics and gradual declines in mortality related to prostate cancer (PCa) in the United States, race disparities persist. For African American men (AAM), PCa is more often an aggressive disease showing increased metastases and greater PCa-related mortality compared with European American men. The earliest research points to how distinctions are likely the result of a combination of factors, including ancestry genetics and lifestyle variables. More recent research considers that cancer, although influenced by external forces, is ultimately a disease primarily driven by aberrations observed in the molecular genetics of the tumor. Research studying PCa predominantly from European American men shows that indolent and advanced or metastatic prostate tumors have distinguishing molecular genomic make-ups. Early yet increasing evidence suggests that clinically distinct PCa from AAM also display molecular distinctions. It is reasonable to predict that further study will reveal molecular subtypes and various frequencies for PCa subtypes among diverse patient groups, thereby providing insight as to the genomic lesions and gene signatures that are functionally implicated in carcinogenesis or aggressive PCa in AAM. That knowledge will prove useful in developing strategies to predict who will develop advanced PCa among AAM and will provide the rationale to develop effective individualized treatment strategies to overcome disparities. PMID:23608645
In silico mining and PCR-based approaches to transcription factor discovery in non-model plants: gene discovery of the WRKY transcription factors in conifers.

PubMed

Liu, Jun-Jun; Xiang, Yu

2011-01-01

WRKY transcription factors are key regulators of numerous biological processes in plant growth and development, as well as plant responses to abiotic and biotic stresses. Research on biological functions of plant WRKY genes has focused in the past on model plant species or species with largely characterized transcriptomes. However, a variety of non-model plants, such as forest conifers, are essential as feed, biofuel, and wood or for sustainable ecosystems. Identification of WRKY genes in these non-model plants is equally important for understanding the evolutionary and function-adaptive processes of this transcription factor family. Because of limited genomic information, the rarity of regulatory gene mRNAs in transcriptomes, and the sequence divergence to model organism genes, identification of transcription factors in non-model plants using methods similar to those generally used for model plants is difficult. This chapter describes a gene family discovery strategy for identification of WRKY transcription factors in conifers by a combination of in silico-based prediction and PCR-based experimental approaches. Compared to traditional cDNA library screening or EST sequencing at transcriptome scales, this integrated gene discovery strategy provides fast, simple, reliable, and specific methods to unveil the WRKY gene family at both genome and transcriptome levels in non-model plants.
Searching new signals for production traits through gene-based association analysis in three Italian cattle breeds.

PubMed

Capomaccio, Stefano; Milanesi, Marco; Bomba, Lorenzo; Cappelli, Katia; Nicolazzi, Ezequiel L; Williams, John L; Ajmone-Marsan, Paolo; Stefanon, Bruno

2015-08-01

Genome-wide association studies (GWAS) have been widely applied to disentangle the genetic basis of complex traits. In cattle breeds, classical GWAS approaches with medium-density marker panels are far from conclusive, especially for complex traits. This is due to the intrinsic limitations of GWAS and the assumptions that are made to step from the association signals to the functional variations. Here, we applied a gene-based strategy to prioritize genotype-phenotype associations found for milk production and quality traits with classical approaches in three Italian dairy cattle breeds with different sample sizes (Italian Brown n = 745; Italian Holstein n = 2058; Italian Simmental n = 477). Although classical regression on single markers revealed only a single genome-wide significant genotype-phenotype association, for Italian Holstein, the gene-based approach identified specific genes in each breed that are associated with milk physiology and mammary gland development. As no standard method has yet been established to step from variation to functional units (i.e., genes), the strategy proposed here may contribute to revealing new genes that play significant roles in complex traits, such as those investigated here, amplifying low association signals using a gene-centric approach. © 2015 Stichting International Foundation for Animal Genetics.
Implications of streamlining theory for microbial ecology

PubMed Central

Giovannoni, Stephen J; Cameron Thrash, J; Temperton, Ben

2014-01-01

Whether a small cell, a small genome or a minimal set of chemical reactions with self-replicating properties, simplicity is beguiling. As Leonardo da Vinci reportedly said, ‘simplicity is the ultimate sophistication’. Two diverging views of simplicity have emerged in accounts of symbiotic and commensal bacteria and cosmopolitan free-living bacteria with small genomes. The small genomes of obligate insect endosymbionts have been attributed to genetic drift caused by small effective population sizes (Ne). In contrast, streamlining theory attributes small cells and genomes to selection for efficient use of nutrients in populations where Ne is large and nutrients limit growth. Regardless of the cause of genome reduction, lost coding potential eventually dictates loss of function. Consequences of reductive evolution in streamlined organisms include atypical patterns of prototrophy and the absence of common regulatory systems, which have been linked to difficulty in culturing these cells. Recent evidence from metagenomics suggests that streamlining is commonplace, may broadly explain the phenomenon of the uncultured microbial majority, and might also explain the highly interdependent (connected) behavior of many microbial ecosystems. Streamlining theory is belied by the observation that many successful bacteria are large cells with complex genomes. To fully appreciate streamlining, we must look to the life histories and adaptive strategies of cells, which impose minimum requirements for complexity that vary with niche. PMID:24739623
Baculovirus-based genome editing in primary cells.

PubMed

Mansouri, Maysam; Ehsaei, Zahra; Taylor, Verdon; Berger, Philipp

2017-03-01

Genome editing in eukaryotes became easier in the last years with the development of nucleases that induce double strand breaks in DNA at user-defined sites. CRISPR/Cas9-based genome editing is currently one of the most powerful strategies. In the easiest case, a nuclease (e.g. Cas9) and a target defining guide RNA (gRNA) are transferred into a target cell. Non-homologous end joining (NHEJ) repair of the DNA break following Cas9 cleavage can lead to inactivation of the target gene. Specific repair or insertion of DNA with Homology Directed Repair (HDR) needs the simultaneous delivery of a repair template. Recombinant Lentivirus or Adenovirus genomes have enough capacity for a nuclease coding sequence and the gRNA but are usually too small to also carry large targeting constructs. We recently showed that a baculovirus-based multigene expression system (MultiPrime) can be used for genome editing in primary cells since it possesses the necessary capacity to carry the nuclease and gRNA expression constructs and the HDR targeting sequences. Here we present new Acceptor plasmids for MultiPrime that allow simplified cloning of baculoviruses for genome editing and we show their functionality in primary cells with limited life span and induced pluripotent stem cells (iPS). Copyright © 2017 Elsevier Inc. All rights reserved.
SG-ADVISER CNV: copy-number variant annotation and interpretation.

PubMed

Erikson, Galina A; Deshpande, Neha; Kesavan, Balachandar G; Torkamani, Ali

2015-09-01

Copy-number variants have been associated with a variety of diseases, especially cancer, autism, schizophrenia, and developmental delay. The majority of clinically relevant events occur de novo, necessitating the interpretation of novel events. In this light, we present the Scripps Genome ADVISER CNV annotation pipeline and Web server, which aims to fill the gap between copy number variant detection and interpretation by performing in-depth annotations and functional predictions for copy number variants. The Scripps Genome ADVISER CNV suite includes a Web server interface to a high-performance computing environment for calculations of annotations and a table-based user interface that allows for the execution of numerous annotation-based variant filtration strategies and statistics. The annotation results include details regarding location, impact on the coding portion of genes, allele frequency information (including allele frequencies from the Scripps Wellderly cohort), and overlap information with other reference data sets (including ClinVar, DGV, DECIPHER). A summary variant classification is produced (ADVISER score) based on the American College of Medical Genetics and Genomics scoring guidelines. We demonstrate >90% sensitivity/specificity for detection of pathogenic events. Scripps Genome ADVISER CNV is designed to allow users with no prior bioinformatics expertise to manipulate large volumes of copy-number variant data. Scripps Genome ADVISER CNV is available at http://genomics.scripps.edu/ADVISER/.
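The overlap information mentioned in the abstract above can be illustrated with a small interval comparison. This is a minimal sketch only: the abstract does not say how SG-ADVISER CNV computes overlap, so the reciprocal-overlap criterion, the 50% default threshold, the half-open coordinate convention and the function names are all assumptions made for illustration.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end), half-open coordinates (assumed convention)


def overlap_fraction(a: Interval, b: Interval) -> float:
    """Fraction of interval `a` that is covered by interval `b`."""
    a_start, a_end = a
    b_start, b_end = b
    shared = max(0, min(a_end, b_end) - max(a_start, b_start))
    length = a_end - a_start
    return shared / length if length > 0 else 0.0


def reciprocal_overlap(a: Interval, b: Interval, threshold: float = 0.5) -> bool:
    """True if each interval covers at least `threshold` of the other.

    The threshold is a hypothetical default, not a value taken from the tool.
    """
    return (overlap_fraction(a, b) >= threshold
            and overlap_fraction(b, a) >= threshold)


if __name__ == "__main__":
    query_cnv = (100, 200)      # hypothetical CNV call
    reference_cnv = (150, 250)  # hypothetical reference entry
    print(overlap_fraction(query_cnv, reference_cnv))
    print(reciprocal_overlap(query_cnv, reference_cnv))
```

Requiring overlap in both directions keeps a very large reference event from "matching" every small call that happens to fall inside it.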
Visualization of RNA structure models within the Integrative Genomics Viewer.

PubMed

Busan, Steven; Weeks, Kevin M

2017-07-01

Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis. © 2017 Busan and Weeks; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
designGG: an R-package and web tool for the optimal design of genetical genomics experiments.

PubMed

Li, Yang; Swertz, Morris A; Vera, Gonzalo; Fu, Jingyuan; Breitling, Rainer; Jansen, Ritsert C

2009-06-18

High-dimensional biomolecular profiling of genetically different individuals in one or more environmental conditions is an increasingly popular strategy for exploring the functioning of complex biological systems. The optimal design of such genetical genomics experiments in a cost-efficient and effective way is not trivial. This paper presents designGG, an R package for designing optimal genetical genomics experiments. A web implementation for designGG is available at http://gbic.biol.rug.nl/designGG. All software, including source code and documentation, is freely available. DesignGG allows users to intelligently select and allocate individuals to experimental units and conditions such as drug treatment. The user can maximize the power and resolution of detecting genetic, environmental and interaction effects in a genome-wide or local mode by giving more weight to genome regions of special interest, such as previously detected phenotypic quantitative trait loci. This will help to achieve high power and more accurate estimates of the effects of interesting factors, and thus yield a more reliable biological interpretation of data. DesignGG is applicable to linkage analysis of experimental crosses, e.g. recombinant inbred lines, as well as to association analysis of natural populations.
A De-Novo Genome Analysis Pipeline (DeNoGAP) for large-scale comparative prokaryotic genomics studies.

PubMed

Thakur, Shalabh; Guttman, David S

2016-06-30

Comparative analysis of whole genome sequence data from closely related prokaryotic species or strains is becoming an increasingly important and accessible approach for addressing both fundamental and applied biological questions. While there are a number of excellent tools developed for performing this task, most scale poorly when faced with hundreds of genome sequences, and many require extensive manual curation. We have developed a de-novo genome analysis pipeline (DeNoGAP) for the automated, iterative and high-throughput analysis of data from comparative genomics projects involving hundreds of whole genome sequences. The pipeline is designed to perform reference-assisted and de novo gene prediction, homolog protein family assignment, ortholog prediction, functional annotation, and pan-genome analysis using a range of proven tools and databases. While most existing methods scale quadratically with the number of genomes since they rely on pairwise comparisons among predicted protein sequences, DeNoGAP scales linearly since the homology assignment is based on iteratively refined hidden Markov models. This iterative clustering strategy enables DeNoGAP to handle a very large number of genomes using minimal computational resources. Moreover, the modular structure of the pipeline permits easy updates as new analysis programs become available. DeNoGAP integrates bioinformatics tools and databases for comparative analysis of a large number of genomes. The pipeline offers tools and algorithms for annotation and analysis of completed and draft genome sequences. The pipeline is developed using Perl, BioPerl and SQLite on Ubuntu Linux version 12.04 LTS. Currently, the software package is accompanied by a script for automated installation of the necessary external programs on Ubuntu Linux; however, the pipeline should also be compatible with other Linux and Unix systems after the necessary external programs are installed. DeNoGAP is freely available at https://sourceforge.net/projects/denogap/.
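The scaling argument in the abstract above can be made concrete by counting comparisons. The sketch below is an illustration under simplifying assumptions, not DeNoGAP's implementation: it treats an all-against-all method as one comparison per unordered genome pair, and the iterative method as one pass per genome against the current model set. Both function names are hypothetical.

```python
def pairwise_comparisons(n_genomes: int) -> int:
    """Unordered genome pairs in an all-against-all scheme: n * (n - 1) / 2."""
    return n_genomes * (n_genomes - 1) // 2


def iterative_passes(n_genomes: int) -> int:
    """One pass per genome against an existing model set (assumed simplification)."""
    return n_genomes


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, pairwise_comparisons(n), iterative_passes(n))
```

Under these assumptions, a hundredfold increase in genomes multiplies the pairwise workload by roughly ten thousand but the iterative workload by only one hundred, which is the gap the abstract describes.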
Comprehensive definition of genome features in Spirodela polyrhiza by high-depth physical mapping and short-read DNA sequencing strategies.

PubMed

Michael, Todd P; Bryant, Douglas; Gutierrez, Ryan; Borisjuk, Nikolai; Chu, Philomena; Zhang, Hanzhong; Xia, Jing; Zhou, Junfei; Peng, Hai; El Baidouri, Moaine; Ten Hallers, Boudewijn; Hastie, Alex R; Liang, Tiffany; Acosta, Kenneth; Gilbert, Sarah; McEntee, Connor; Jackson, Scott A; Mockler, Todd C; Zhang, Weixiong; Lam, Eric

2017-02-01

Spirodela polyrhiza is a fast-growing aquatic monocot with highly reduced morphology, genome size and number of protein-coding genes. Considering these biological features of Spirodela and its basal position in the monocot lineage, understanding its genome architecture could shed light on plant adaptation and genome evolution. Like many draft genomes, however, the 158-Mb Spirodela genome sequence has not been resolved to chromosomes, and important genome characteristics have not been defined. Here we deployed rapid genome-wide physical maps combined with high-coverage short-read sequencing to resolve the 20 chromosomes of Spirodela and to empirically delineate its genome features. Our data revealed a dramatic reduction in the number of the rDNA repeat units in Spirodela to fewer than 100, which is even fewer than that reported for yeast. Consistent with its unique phylogenetic position, small RNA sequencing revealed 29 Spirodela-specific microRNA, with only two being shared with Elaeis guineensis (oil palm) and Musa balbisiana (banana). Combining DNA methylation data and small RNA sequencing enabled the accurate prediction of 20.5% long terminal repeats (LTRs) that doubled the previous estimate, and revealed a high Solo:Intact LTR ratio of 8.2. Interestingly, we found that Spirodela has the lowest global DNA methylation levels (9%) of any plant species tested. Taken together our results reveal a genome that has undergone reduction, likely through eliminating non-essential protein coding genes, rDNA and LTRs. In addition to delineating the genome features of this unique plant, the methodologies described and large-scale genome resources from this work will enable future evolutionary and functional studies of this basal monocot family. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
Calibrating genomic and allelic coverage bias in single-cell sequencing.

PubMed

Zhang, Cheng-Zhong; Adalsteinsson, Viktor A; Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L; Meyerson, Matthew; Love, J Christopher

2015-04-16

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (∼0.1×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.
Calibrating genomic and allelic coverage bias in single-cell sequencing

PubMed Central

Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L.; Meyerson, Matthew; Love, J. Christopher

2016-01-01

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1–10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1 ×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples. PMID:25879913
Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea

PubMed Central

2014-01-01

Background: Brassica oleracea is a valuable vegetable species that has contributed to human health and nutrition for hundreds of years and comprises multiple distinct cultivar groups with diverse morphological and phytochemical attributes. In addition to this phenotypic wealth, B. oleracea offers unique insights into polyploid evolution, as it results from multiple ancestral polyploidy events and a final Brassiceae-specific triplication event. Further, B. oleracea represents one of the diploid genomes that formed the economically important allopolyploid oilseed, Brassica napus. A deeper understanding of B. oleracea genome architecture provides a foundation for crop improvement strategies throughout the Brassica genus. Results: We generate an assembly representing 75% of the predicted B. oleracea genome using a hybrid Illumina/Roche 454 approach. Two dense genetic maps are generated to anchor almost 92% of the assembled scaffolds to nine pseudo-chromosomes. Over 50,000 genes are annotated and 40% of the genome predicted to be repetitive, thus contributing to the increased genome size of B. oleracea compared to its close relative B. rapa. A snapshot of both the leaf transcriptome and methylome allows comparisons to be made across the triplicated sub-genomes, which resulted from the most recent Brassiceae-specific polyploidy event. Conclusions: Differential expression of the triplicated syntelogs and cytosine methylation levels across the sub-genomes suggest residual marks of the genome dominance that led to the current genome architecture. Although cytosine methylation does not correlate with individual gene dominance, the independent methylation patterns of triplicated copies suggest epigenetic mechanisms play a role in the functional diversification of duplicate genes. PMID:24916971
ROCK1 is a potential combinatorial drug target for BRAF mutant melanoma

PubMed Central

Smit, Marjon A; Maddalo, Gianluca; Greig, Kylie; Raaijmakers, Linsey M; Possik, Patricia A; van Breukelen, Bas; Cappadona, Salvatore; Heck, Albert JR; Altelaar, AF Maarten; Peeper, Daniel S

2014-01-01

Treatment of BRAF mutant melanomas with specific BRAF inhibitors leads to tumor remission. However, most patients eventually relapse due to drug resistance. Therefore, we designed an integrated strategy using (phospho)proteomic and functional genomic platforms to identify drug targets whose inhibition sensitizes melanoma cells to BRAF inhibition. We found many proteins to be induced upon PLX4720 (BRAF inhibitor) treatment that are known to be involved in BRAF inhibitor resistance, including FOXD3 and ErbB3. Several proteins were down-regulated, including Rnd3, a negative regulator of ROCK1 kinase. For our genomic approach, we performed two parallel shRNA screens using a kinome library to identify genes whose inhibition sensitizes to BRAF or ERK inhibitor treatment. By integrating our functional genomic and (phospho)proteomic data, we identified ROCK1 as a potential drug target for BRAF mutant melanoma. ROCK1 silencing increased melanoma cell elimination when combined with BRAF or ERK inhibitor treatment. Translating this to a preclinical setting, a ROCK inhibitor showed augmented melanoma cell death upon BRAF or ERK inhibition in vitro. These data merit exploration of ROCK1 as a target in combination with current BRAF mutant melanoma therapies. PMID:25538140
Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea.

PubMed

Makarova, Kira S; Sorokin, Alexander V; Novichkov, Pavel S; Wolf, Yuri I; Koonin, Eugene V

2007-11-27

An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover approximately 88% of the genes in a genome compared with approximately 76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; approximately 40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile that, in addition to the core archaeal functions, encoded more idiosyncratic systems, e.g., the CASS systems of antivirus defense and some toxin-antitoxin systems. The arCOGs provide a convenient, flexible framework for functional annotation of archaeal genomes, comparative genomics and evolutionary reconstructions. Genomic reconstructions suggest that the last common ancestor of archaea might have been (nearly) as advanced as the modern archaeal hyperthermophiles. ArCOGs and related information are available at: ftp://ftp.ncbi.nih.gov/pub/koonin/arCOGs/.
Microsatellite mapping of QTLs affecting resistance to coccidiosis (Eimeria tenella) in a Fayoumi x White Leghorn cross.

PubMed

Pinard-van der Laan, Marie-Hélène; Bed'hom, Bertrand; Coville, Jean-Luc; Pitel, Frédérique; Feve, Katia; Leroux, Sophie; Legros, Hélène; Thomas, Aurélie; Gourichon, David; Repérant, Jean-Michel; Rault, Paul

2009-01-20

Avian coccidiosis is a major parasitic disease of poultry, causing severe economic loss to poultry production by affecting growth and feed efficiency of infected birds. Current control strategies using mainly drugs and more recently vaccination are showing drawbacks and alternative strategies are needed. Using genetic resistance that would limit the negative and very costly effects of the disease would be highly relevant. The purpose of this work was to detect for the first time QTL for disease resistance traits to Eimeria tenella in chicken by performing a genome scan in an F2 cross issued from a resistant Fayoumi line and a susceptible Leghorn line. The QTL analysis detected 21 chromosome-wide significant QTL for the different traits related to disease resistance (body weight growth, plasma coloration, hematocrit, rectal temperature and lesion) on 6 chromosomes. Out of these, a genome-wide very significant QTL for body weight growth was found on GGA1; five genome-wide significant QTL for body weight growth, plasma coloration and hematocrit and one for plasma coloration were found on GGA1 and GGA6, respectively. Two genome-wide suggestive QTL for plasma coloration and rectal temperature were found on GGA1 and GGA2, respectively. Other chromosome-wide significant QTL were identified on GGA2, GGA3, GGA6, GGA15 and GGA23. Parent-of-origin effects were found for QTL for body weight growth and plasma coloration on GGA1 and GGA3. Several QTL for different resistance phenotypes were identified as co-localized on the same location. Using an F2 cross from resistant and susceptible chicken lines proved to be a successful strategy to identify QTL for different resistance traits to Eimeria tenella, opening the way for further gene identification and underlying mechanisms and hopefully possibilities for new breeding strategies for resistance to coccidiosis in the chicken. From the QTL regions identified, several candidate genes and relevant pathways linked to innate immune and inflammatory responses were suggested. These results will be combined with functional genomics approaches on the same lines to provide positional candidate genes for resistance loci for coccidiosis. The results also suggest that further analysis should use models that tackle the complexity of the genetic architecture of these correlated disease resistance traits, including potential epistatic effects.
Adaptive genomic divergence under high gene flow between freshwater and brackish-water ecotypes of prickly sculpin (Cottus asper) revealed by Pool-Seq.

PubMed

Dennenmoser, Stefan; Vamosi, Steven M; Nolte, Arne W; Rogers, Sean M

2017-01-01

Understanding the genomic basis of adaptive divergence in the presence of gene flow remains a major challenge in evolutionary biology. In prickly sculpin (Cottus asper), an abundant euryhaline fish in northwestern North America, high genetic connectivity among brackish-water (estuarine) and freshwater (tributary) habitats of coastal rivers does not preclude the build-up of neutral genetic differentiation and emergence of different life history strategies. Because these two habitats present different osmotic niches, we predicted high genetic differentiation at known teleost candidate genes underlying salinity tolerance and osmoregulation. We applied whole-genome sequencing of pooled DNA samples (Pool-Seq) to explore adaptive divergence between two estuarine and two tributary habitats. Paired-end sequence reads were mapped against genomic contigs of European Cottus, and the gene content of candidate regions was explored based on comparisons with the threespine stickleback genome. Genes showing signals of repeated differentiation among brackish-water and freshwater habitats included functions such as ion transport and structural permeability in freshwater gills, which suggests that local adaptation to different osmotic niches might contribute to genomic divergence among habitats. Overall, the presence of both repeated and unique signatures of differentiation across many loci scattered throughout the genome is consistent with polygenic adaptation from standing genetic variation and locally variable selection pressures in the early stages of life history divergence. © 2016 John Wiley & Sons Ltd.
Comparative Genomics Analysis of Streptococcus Isolates from the Human Small Intestine Reveals their Adaptation to a Highly Dynamic Ecosystem

PubMed Central

Van den Bogert, Bartholomeus; Boekhorst, Jos; Herrmann, Ruth; Smid, Eddy J.; Zoetendal, Erwin G.; Kleerebezem, Michiel

2013-01-01

The human small-intestinal microbiota is characterised by relatively large and dynamic Streptococcus populations. In this study, genome sequences of small-intestinal streptococci from S. mitis, S. bovis, and S. salivarius species-groups were determined and compared with those from 58 Streptococcus strains in public databases. The Streptococcus pangenome consists of 12,403 orthologous groups of which 574 are shared among all sequenced streptococci and are defined as the Streptococcus core genome. Genome mining of the small-intestinal streptococci focused on functions playing an important role in the interaction of these streptococci in the small-intestinal ecosystem, including natural competence and nutrient-transport and metabolism. Analysis of the small-intestinal Streptococcus genomes predicts a high capacity to synthesize amino acids and various vitamins as well as substantial divergence in their carbohydrate transport and metabolic capacities, which is in agreement with observed physiological differences between these Streptococcus strains. Gene-specific PCR-strategies enabled evaluation of conservation of Streptococcus populations in intestinal samples from different human individuals, revealing that the S. salivarius strains were frequently detected in the small-intestine microbiota, supporting the representative value of the genomes provided in this study. Finally, the Streptococcus genomes allow prediction of the effect of dietary substances on Streptococcus population dynamics in the human small-intestine. PMID:24386196
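The pangenome and core genome described in the abstract above correspond to a set union and a set intersection over per-strain orthologous groups. The following is a minimal sketch on toy data: the strain names and group identifiers are invented for illustration, and the function name is hypothetical rather than taken from the study.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pangenome, core genome) from per-strain orthologous-group sets.

    Pangenome: groups present in at least one strain (union).
    Core genome: groups present in every strain (intersection).
    """
    group_sets = list(genomes.values())
    if not group_sets:
        return set(), set()
    pan = set().union(*group_sets)
    core = set.intersection(*group_sets)
    return pan, core


if __name__ == "__main__":
    # Toy data only; identifiers are made up for this example.
    toy = {
        "strain_A": {"OG1", "OG2", "OG3"},
        "strain_B": {"OG1", "OG2", "OG4"},
        "strain_C": {"OG1", "OG5"},
    }
    pan, core = pan_and_core(toy)
    print(len(pan), sorted(core))
```

Because the core is an intersection, it can only shrink or stay the same as strains are added, while the pangenome can only grow or stay the same.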

Genome-reconstruction for eukaryotes from complex natural microbial communities.

PubMed

West, Patrick T; Probst, Alexander J; Grigoriev, Igor V; Thomas, Brian C; Banfield, Jillian F

2018-04-01

Microbial eukaryotes are integral components of natural microbial communities, and their inclusion is critical for many ecosystem studies, yet the majority of published metagenome analyses ignore eukaryotes. In order to include eukaryotes in environmental studies, we propose a method to recover eukaryotic genomes from complex metagenomic samples. A key step for genome recovery is separation of eukaryotic and prokaryotic fragments. We developed a k-mer-based strategy, EukRep, for eukaryotic sequence identification and applied it to environmental samples to show that it enables genome recovery, genome completeness evaluation, and prediction of metabolic potential. We used this approach to test the effect of addition of organic carbon on a geyser-associated microbial community and detected a substantial change of the community metabolism, with selection against almost all candidate phyla bacteria and archaea and for eukaryotes. Near complete genomes were reconstructed for three fungi placed within the Eurotiomycetes and an arthropod. While carbon fixation and sulfur oxidation were important functions in the geyser community prior to carbon addition, the organic carbon-impacted community showed enrichment for secreted proteases, secreted lipases, cellulose targeting CAZymes, and methanol oxidation. We demonstrate the broader utility of EukRep by reconstructing and evaluating relatively high-quality fungal, protist, and rotifer genomes from complex environmental samples. This approach opens the way for cultivation-independent analyses of whole microbial communities. © 2018 West et al.; Published by Cold Spring Harbor Laboratory Press.
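The abstract does not say which k-mer length, features, or classifier EukRep uses, so the following is only a generic sketch of the kind of input a k-mer-based sequence classifier could consume: a normalized k-mer frequency profile of a contig. The function name and the default k = 3 are assumptions made for illustration, not details of EukRep.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq: str, k: int = 3) -> dict:
    """Return normalized frequencies of all 4**k DNA k-mers in seq.

    k-mers containing characters other than A, C, G, T (for example N)
    are ignored, so the frequencies sum to 1 over unambiguous k-mers.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    return {km: (counts[km] / total if total else 0.0) for km in kmers}


# Toy contig; a real workflow would compute one profile per contig
# and pass the resulting vectors to a trained classifier.
profile = kmer_profile("ACGTACGT", k=2)
print(len(profile), round(profile["AC"], 3))
```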
A genome-wide CRISPR library for high-throughput genetic screening in Drosophila cells.

PubMed

Bassett, Andrew R; Kong, Lesheng; Liu, Ji-Long

2015-06-20

The simplicity of the CRISPR/Cas9 system of genome engineering has opened up the possibility of performing genome-wide targeted mutagenesis in cell lines, enabling screening for cellular phenotypes resulting from genetic aberrations. Drosophila cells have proven to be highly effective in identifying genes involved in cellular processes through similar screens using partial knockdown by RNAi. This is in part due to the lower degree of redundancy between genes in this organism, whilst still maintaining highly conserved gene networks and orthologs of many human disease-causing genes. The ability of CRISPR to generate genetic loss of function mutations not only increases the magnitude of any effect over currently employed RNAi techniques, but allows analysis over longer periods of time which can be critical for certain phenotypes. In this study, we have designed and built a genome-wide CRISPR library covering 13,501 genes, among which 8989 genes are targeted by three or more independent single guide RNAs (sgRNAs). Moreover, we describe strategies to monitor the population of guide RNAs by high throughput sequencing (HTS). We hope that this library will provide an invaluable resource for the community to screen loss of function mutations for cellular phenotypes, and as a source of guide RNA designs for future studies. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
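Monitoring a guide RNA population by sequencing comes down, at its simplest, to tallying how many reads match each sgRNA in the library. The sketch below assumes the reads have already been trimmed to the bare guide sequence and that only exact matches count; both are simplifying assumptions, and the guide sequences and IDs are invented placeholders rather than entries from the published library.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_guides(reads: Iterable[str], library: Dict[str, str]) -> Tuple[Counter, int]:
    """Count exact matches of trimmed reads against a guide library.

    library maps a guide sequence to a guide ID.
    Returns (per-guide counts, number of reads matching no guide).
    """
    counts = Counter({guide_id: 0 for guide_id in library.values()})
    unmatched = 0
    for read in reads:
        guide_id = library.get(read.upper())
        if guide_id is None:
            unmatched += 1
        else:
            counts[guide_id] += 1
    return counts, unmatched


# Hypothetical two-guide library and four trimmed reads.
library = {
    "ACGTACGTACGTACGTACGT": "geneA_sg1",
    "TTTTGGGGCCCCAAAATTTT": "geneB_sg1",
}
reads = [
    "ACGTACGTACGTACGTACGT",
    "acgtacgtacgtacgtacgt",
    "TTTTGGGGCCCCAAAATTTT",
    "GGGGGGGGGGGGGGGGGGGG",
]
counts, unmatched = count_guides(reads, library)
print(dict(counts), unmatched)
```

Guides that drop to zero or near-zero counts between time points are the ones a screen would flag for follow-up; normalization for sequencing depth is left out of this sketch.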
Network analysis of genomic alteration profiles reveals co-altered functional modules and driver genes for glioblastoma.

PubMed

Gu, Yunyan; Wang, Hongwei; Qin, Yao; Zhang, Yujing; Zhao, Wenyuan; Qi, Lishuang; Zhang, Yuannv; Wang, Chenguang; Guo, Zheng

2013-03-01

The heterogeneity of genetic alterations in human cancer genomes presents a major challenge to advancing our understanding of cancer mechanisms and identifying cancer driver genes. To tackle this heterogeneity problem, many approaches have been proposed to investigate genetic alterations and predict driver genes at the individual pathway level. However, most of these approaches ignore the correlation of alteration events between pathways and miss many genes with rare alterations collectively contributing to carcinogenesis. Here, we devise a network-based approach to capture the cooperative functional modules hidden in genome-wide somatic mutation and copy number alteration profiles of glioblastoma (GBM) from The Cancer Genome Atlas (TCGA), where a module is a set of altered genes with dense interactions in the protein interaction network. We identify 7 pairs of significantly co-altered modules that involve the main pathways known to be altered in GBM (TP53, RB and RTK signaling pathways) and highlight the striking co-occurring alterations among these GBM pathways. By taking into account the non-random correlation of gene alterations, the property of co-alteration could distinguish oncogenic modules that contain driver genes involved in the progression of GBM. The collaboration among cancer pathways suggests that the redundant models and aggravating models could shed new light on the potential mechanisms during carcinogenesis and provide new indications for the design of cancer therapeutic strategies.
A pipeline for the systematic identification of non-redundant full-ORF cDNAs for polymorphic and evolutionary divergent genomes: Application to the ascidian Ciona intestinalis

DOE PAGES

Gilchrist, Michael J.; Sobral, Daniel; Khoueiry, Pierre; ...

2015-05-27

Genome-wide resources, such as collections of cDNA clones encoding for complete proteins (full-ORF clones), are crucial tools for studying the evolution of gene function and genetic interactions. Non-model organisms, in particular marine organisms, provide a rich source of functional diversity. Marine organism genomes are, however, frequently highly polymorphic and encode proteins that diverge significantly from those of well-annotated model genomes. The construction of full-ORF clone collections from non-model organisms is hindered by the difficulty of predicting accurately the N-terminal ends of proteins, and distinguishing recent paralogs from highly polymorphic alleles. We also report a computational strategy that overcomes these difficulties, and allows for accurate gene level clustering of transcript data followed by the automated identification of full-ORFs with correct 5'- and 3'-ends. It is robust to polymorphism, includes paralog calling and does not require evolutionary proximity to well annotated model organisms. Here, we developed this pipeline for the ascidian Ciona intestinalis, a highly polymorphic member of the divergent sister group of the vertebrates, emerging as a powerful model organism to study chordate gene function, Gene Regulatory Networks and molecular mechanisms underlying human pathologies. Furthermore, using this pipeline we have generated the first full-ORF collection for a highly polymorphic marine invertebrate. It contains 19,163 full-ORF cDNA clones covering 60% of Ciona coding genes, and full-ORF orthologs for approximately half of curated human disease-associated genes.
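In its most basic form, a "full ORF" is a reading frame that begins at a start codon and runs in-frame to a stop codon. The sketch below scans the three forward frames of a cDNA sequence and returns the longest such ORF. It is a deliberately naive illustration: it takes the first ATG in each open stretch as the start, ignores the reverse strand, and does none of the clustering, polymorphism handling, or paralog calling that the abstract attributes to the pipeline.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_full_orf(seq: str) -> str:
    """Return the longest ATG-to-stop ORF in the three forward frames.

    Returns an empty string when no frame contains both a start codon
    and a downstream in-frame stop codon.
    """
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon in this open stretch
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]         # include the stop codon
                if len(orf) > len(best):
                    best = orf
                start = None                   # look for the next ORF in this frame
    return best


print(longest_full_orf("CCATGAAATTTTAGGG"))  # ATGAAATTTTAG
```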
Large-Scale Comparative Phenotypic and Genomic Analyses Reveal Ecological Preferences of Shewanella Species and Identify Metabolic Pathways Conserved at the Genus Level

PubMed Central

Rodrigues, Jorge L. M.; Serres, Margrethe H.; Tiedje, James M.

2011-01-01

The use of comparative genomics for the study of different microbiological species has increased substantially as sequence technologies become more affordable. However, efforts to fully link a genotype to its phenotype remain limited to the development of one mutant at a time. In this study, we provided a high-throughput alternative to this limiting step by coupling comparative genomics to the use of phenotype arrays for five sequenced Shewanella strains. Positive phenotypes were obtained for 441 nutrients (C, N, P, and S sources), with N-based compounds being the most utilized for all strains. Many genes and pathways predicted by genome analyses were confirmed with the comparative phenotype assay, and three degradation pathways believed to be missing in Shewanella were confirmed as missing. A number of previously unknown gene products were predicted to be parts of pathways or to have a function, expanding the number of gene targets for future genetic analyses. Ecologically, the comparative high-throughput phenotype analysis provided insights into niche specialization among the five different strains. For example, Shewanella amazonensis strain SB2B, isolated from the Amazon River delta, was capable of utilizing 60 C compounds, whereas Shewanella sp. strain W3-18-1, isolated from deep marine sediment, utilized only 25 of them. In spite of the large number of nutrient sources yielding positive results, our study indicated that except for the N sources, they were not sufficiently informative to predict growth phenotypes from increasing evolutionary distances. Our results indicate the importance of phenotypic evaluation for confirming genome predictions. This strategy will accelerate the functional discovery of genes and provide an ecological framework for microbial genome sequencing projects. PMID:21642407
Genomic Copy Number Dictates a Gene-Independent Cell Response to CRISPR/Cas9 Targeting.

PubMed

Aguirre, Andrew J; Meyers, Robin M; Weir, Barbara A; Vazquez, Francisca; Zhang, Cheng-Zhong; Ben-David, Uri; Cook, April; Ha, Gavin; Harrington, William F; Doshi, Mihir B; Kost-Alimova, Maria; Gill, Stanley; Xu, Han; Ali, Levi D; Jiang, Guozhi; Pantel, Sasha; Lee, Yenarae; Goodale, Amy; Cherniack, Andrew D; Oh, Coyin; Kryukov, Gregory; Cowley, Glenn S; Garraway, Levi A; Stegmaier, Kimberly; Roberts, Charles W; Golub, Todd R; Meyerson, Matthew; Root, David E; Tsherniak, Aviad; Hahn, William C

2016-08-01

The CRISPR/Cas9 system enables genome editing and somatic cell genetic screens in mammalian cells. We performed genome-scale loss-of-function screens in 33 cancer cell lines to identify genes essential for proliferation/survival and found a strong correlation between increased gene copy number and decreased cell viability after genome editing. Within regions of copy-number gain, CRISPR/Cas9 targeting of both expressed and unexpressed genes, as well as intergenic loci, led to significantly decreased cell proliferation through induction of a G2 cell-cycle arrest. By examining single-guide RNAs that map to multiple genomic sites, we found that this cell response to CRISPR/Cas9 editing correlated strongly with the number of target loci. These observations indicate that genome targeting by CRISPR/Cas9 elicits a gene-independent antiproliferative cell response. This effect has important practical implications for the interpretation of CRISPR/Cas9 screening data and confounds the use of this technology for the identification of essential genes in amplified regions. We found that the number of CRISPR/Cas9-induced DNA breaks dictates a gene-independent antiproliferative response in cells. These observations have practical implications for using CRISPR/Cas9 to interrogate cancer gene function and illustrate that cancer cells are highly sensitive to site-specific DNA damage, which may provide a path to novel therapeutic strategies. Cancer Discov; 6(8); 914-29. ©2016 AACR. See related commentary by Sheel and Xue, p. 824. See related article by Munoz et al., p. 900. This article is highlighted in the In This Issue feature, p. 803. 2016 American Association for Cancer Research.
Widespread signatures of local mRNA folding structure selection in four Dengue virus serotypes

PubMed Central

2015-01-01

Background: It is known that mRNA folding can affect and regulate various gene expression steps both in living organisms and in viruses. Previous studies have recognized functional RNA structures in the genome of the Dengue virus. However, these studies usually focused either on the viral untranslated regions or on very specific and limited regions at the beginning of the coding sequences, in a limited number of strains, and without considering evolutionary selection. Results: Here we performed the first large scale comprehensive genomics analysis of selection for local mRNA folding strength in the Dengue virus coding sequences, based on a total of 1,670 genomes and 4 serotypes. Our analysis identified clusters of positions along the coding regions that may undergo a conserved evolutionary selection for strong or weak local folding maintained across different viral variants. Specifically, 53-66 clusters for strong folding and 49-73 clusters for weak folding (depending on serotype) were recognized, each made up of positions with a significant conservation of folding energy signals (related to partially overlapping local genomic regions). In addition, up to 7% of these positions were found to be conserved in more than 90% of the viral genomes. Although some of the identified positions undergo frequent synonymous/non-synonymous substitutions, the selection for folding strength therein is preserved, and thus cannot be trivially explained based on sequence conservation alone. Conclusions: The fact that many of the positions with significant folding related signals are conserved among different Dengue variants suggests that a better understanding of the mRNA structures in the corresponding regions may promote the development of prospective anti-Dengue vaccination strategies. The comparative genomics approach described here can be employed in the future for detecting functional regions in other pathogens with very high mutation rates. PMID:26449467
Genomics of Three New Bacteriophages Useful in the Biocontrol of Salmonella

PubMed Central

Bardina, Carlota; Colom, Joan; Spricigo, Denis A.; Otero, Jennifer; Sánchez-Osuna, Miquel; Cortés, Pilar; Llagostera, Montserrat

2016-01-01

Non-typhoid Salmonella is the principal pathogen related to food-borne diseases throughout the world. Widespread antibiotic resistance has adversely affected human health and has encouraged the search for alternative antimicrobial agents. The advances in bacteriophage therapy highlight their use in controlling a broad spectrum of food-borne pathogens. One requirement for the use of bacteriophages as antibacterials is the characterization of their genomes. In this work, complete genome sequencing and molecular analyses were carried out for three new virulent Salmonella-specific bacteriophages (UAB_Phi20, UAB_Phi78, and UAB_Phi87) able to infect a broad range of Salmonella strains. Sequence analysis of the genomes of UAB_Phi20, UAB_Phi78, and UAB_Phi87 bacteriophages did not evidence the presence of known virulence-associated and antibiotic resistance genes, and potential immunoreactive food allergens. The UAB_Phi20 genome comprised 41,809 base pairs with 80 open reading frames (ORFs); 24 of them with assigned function. Genome sequence showed a high homology of UAB_Phi20 with Salmonella bacteriophage P22 and other P22likeviruses genus of the Podoviridae family, including ST64T and ST104. The DNA of UAB_Phi78 contained 44,110 bp including direct terminal repeats (DTR) of 179 bp and 58 putative ORFs were predicted and 20 were assigned function. This bacteriophage was assigned to the SP6likeviruses genus of the Podoviridae family based on its high similarity not only with SP6 but also with the K1-5, K1E, and K1F bacteriophages, all of which infect Escherichia coli. The UAB_Phi87 genome sequence consisted of 87,669 bp with terminal direct repeats of 608 bp; although 148 ORFs were identified, putative functions could be assigned to only 29 of them. Sequence comparisons revealed the mosaic structure of UAB_Phi87 and its high similarity with bacteriophages Felix O1 and wV8 of E. coli with respect to genetic content and functional organization. 
Phylogenetic analysis of large terminase subunits confirms their packaging strategies and grouping to the different phage genus type. All these studies are necessary for the development and the use of an efficient cocktail with commercial applications in bacteriophage therapy against Salmonella. PMID:27148229
Employing genome-wide SNP discovery and genotyping strategy to extrapolate the natural allelic diversity and domestication patterns in chickpea

PubMed Central

Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

2015-01-01

The genome-wide discovery and high-throughput genotyping of SNPs in chickpea natural germplasm lines is indispensable to extrapolate their natural allelic diversity, domestication, and linkage disequilibrium (LD) patterns leading to the genetic enhancement of this vital legume crop. We discovered 44,844 high-quality SNPs by sequencing of 93 diverse cultivated desi, kabuli, and wild chickpea accessions using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays that were physically mapped across eight chromosomes of desi and kabuli. Of these, 22,542 SNPs were structurally annotated in different coding and non-coding sequence components of genes. Genes with 3296 non-synonymous and 269 regulatory SNPs could functionally differentiate accessions based on their contrasting agronomic traits. A high experimental validation success rate (92%) and reproducibility (100%) along with strong sensitivity (93–96%) and specificity (99%) of GBS-based SNPs was observed. This infers the robustness of GBS as a high-throughput assay for rapid large-scale mining and genotyping of genome-wide SNPs in chickpea with sub-optimal use of resources. With 23,798 genome-wide SNPs, a relatively high intra-specific polymorphic potential (49.5%) and broader molecular diversity (13–89%)/functional allelic diversity (18–77%) was apparent among 93 chickpea accessions, suggesting their tremendous applicability in rapid selection of desirable diverse accessions/inter-specific hybrids in chickpea crossbred varietal improvement program. The genome-wide SNPs revealed complex admixed domestication pattern, extensive LD estimates (0.54–0.68) and extended LD decay (400–500 kb) in a structured population inclusive of 93 accessions. 
These findings reflect the utility of our identified SNPs for subsequent genome-wide association study (GWAS) and selective sweep-based domestication trait dissection analysis to identify potential genomic loci (gene-associated targets) specifically regulating important complex quantitative agronomic traits in chickpea. The numerous informative genome-wide SNPs, natural allelic diversity-led domestication pattern, and LD-based information generated in our study have got multidimensional applicability with respect to chickpea genomics-assisted breeding. PMID:25873920
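The abstract reports LD estimates without naming the statistic. One widely used pairwise measure is r², the squared correlation between alleles at two biallelic loci. The sketch below computes it from phased haplotypes coded 0/1; the choice of r² and the toy haplotypes are assumptions made for illustration, not a reconstruction of the study's analysis.

```python
from typing import Sequence


def ld_r2(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic SNPs from phased 0/1 haplotypes.

    r^2 = D^2 / (pA(1-pA) pB(1-pB)), with D = pAB - pA*pB.
    Returns NaN when either SNP is monomorphic in the sample.
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")


# Toy haplotypes: identical allele patterns give r^2 = 1,
# statistically independent patterns give r^2 = 0.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```

LD decay, as mentioned in the abstract, is typically summarized by plotting such pairwise values against the physical distance between the two SNPs.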
Exon trapping: a genetic screen to identify candidate transcribed sequences in cloned mammalian genomic DNA.

PubMed

Duyk, G M; Kim, S W; Myers, R M; Cox, D R

1990-11-01

Identification and recovery of transcribed sequences from cloned mammalian genomic DNA remains an important problem in isolating genes on the basis of their chromosomal location. We have developed a strategy that facilitates the recovery of exons from random pieces of cloned genomic DNA. The basis of this "exon trapping" strategy is that, during a retroviral life cycle, genomic sequences of nonviral origin are correctly spliced and may be recovered as a cDNA copy of the introduced segment. By using this genetic assay for cis-acting sequences required for RNA splicing, we have screened approximately 20 kilobase pairs of cloned genomic DNA and have recovered all four predicted exons.
Exon trapping: a genetic screen to identify candidate transcribed sequences in cloned mammalian genomic DNA.

PubMed Central

Duyk, G M; Kim, S W; Myers, R M; Cox, D R

1990-01-01

Identification and recovery of transcribed sequences from cloned mammalian genomic DNA remains an important problem in isolating genes on the basis of their chromosomal location. We have developed a strategy that facilitates the recovery of exons from random pieces of cloned genomic DNA. The basis of this "exon trapping" strategy is that, during a retroviral life cycle, genomic sequences of nonviral origin are correctly spliced and may be recovered as a cDNA copy of the introduced segment. By using this genetic assay for cis-acting sequences required for RNA splicing, we have screened approximately 20 kilobase pairs of cloned genomic DNA and have recovered all four predicted exons. PMID:2247475
Advances in plant gene-targeted and functional markers: a review

PubMed Central

2013-01-01

Public genomic databases have provided new directions for molecular marker development and initiated a shift in the types of PCR-based techniques commonly used in plant science. Alongside commonly used arbitrarily amplified DNA markers, other methods have been developed. Targeted fingerprinting marker techniques are based on the well-established practices of arbitrarily amplified DNA methods, but employ novel methodological innovations such as the incorporation of gene or promoter elements in the primers. These markers provide good reproducibility and increased resolution by the concurrent incidence of dominant and co-dominant bands. Despite their promising features, these semi-random markers suffer from possible problems of collision and non-homology analogous to those found with randomly generated fingerprints. Transposable elements, present in abundance in plant genomes, may also be used to generate fingerprints. These markers provide increased genomic coverage by utilizing specific targeted sites and produce bands that mostly seem to be homologous. The biggest drawback with most of these techniques is that prior genomic information about retrotransposons is needed for primer design, prohibiting universal applications. Another class of recently developed methods exploits length polymorphism present in arrays of multi-copy gene families such as cytochrome P450 and β-tubulin genes to provide cross-species amplification and transferability. A specific class of marker makes use of common features of plant resistance genes to generate bands linked to a given phenotype, or to reveal genetic diversity. Conserved DNA-based strategies have limited genome coverage and may fail to reveal genetic diversity, while resistance genes may be under specific evolutionary selection. Markers may also be generated from functional and/or transcribed regions of the genome using different gene-targeting approaches coupled with the use of RNA information. 
Such techniques have the potential to generate phenotypically linked functional markers, especially when fingerprints are generated from the transcribed or expressed region of the genome. It is to be expected that these recently developed techniques will generate larger datasets, but their shortcomings should also be acknowledged and carefully investigated. PMID:23406322
Foxp2 Regulates Gene Networks Implicated in Neurite Outgrowth in the Developing Brain

PubMed Central

Vernes, Sonja C.; Oliver, Peter L.; Spiteri, Elizabeth; Lockstone, Helen E.; Puliyadi, Rathi; Taylor, Jennifer M.; Ho, Joses; Mombereau, Cedric; Brewer, Ariel; Lowy, Ernesto; Nicod, Jérôme; Groszer, Matthias; Baban, Dilair; Sahgal, Natasha; Cazier, Jean-Baptiste; Ragoussis, Jiannis; Davies, Kay E.; Geschwind, Daniel H.; Fisher, Simon E.

2011-01-01

Forkhead-box protein P2 is a transcription factor that has been associated with intriguing aspects of cognitive function in humans, non-human mammals, and song-learning birds. Heterozygous mutations of the human FOXP2 gene cause a monogenic speech and language disorder. Reduced functional dosage of the mouse version (Foxp2) causes deficient cortico-striatal synaptic plasticity and impairs motor-skill learning. Moreover, the songbird orthologue appears critically important for vocal learning. Across diverse vertebrate species, this well-conserved transcription factor is highly expressed in the developing and adult central nervous system. Very little is known about the mechanisms regulated by Foxp2 during brain development. We used an integrated functional genomics strategy to robustly define Foxp2-dependent pathways, both direct and indirect targets, in the embryonic brain. Specifically, we performed genome-wide in vivo ChIP–chip screens for Foxp2-binding and thereby identified a set of 264 high-confidence neural targets under strict, empirically derived significance thresholds. The findings, coupled to expression profiling and in situ hybridization of brain tissue from wild-type and mutant mouse embryos, strongly highlighted gene networks linked to neurite development. We followed up our genomics data with functional experiments, showing that Foxp2 impacts on neurite outgrowth in primary neurons and in neuronal cell models. Our data indicate that Foxp2 modulates neuronal network formation, by directly and indirectly regulating mRNAs involved in the development and plasticity of neuronal connections. PMID:21765815
Foxp2 regulates gene networks implicated in neurite outgrowth in the developing brain.

PubMed

Vernes, Sonja C; Oliver, Peter L; Spiteri, Elizabeth; Lockstone, Helen E; Puliyadi, Rathi; Taylor, Jennifer M; Ho, Joses; Mombereau, Cedric; Brewer, Ariel; Lowy, Ernesto; Nicod, Jérôme; Groszer, Matthias; Baban, Dilair; Sahgal, Natasha; Cazier, Jean-Baptiste; Ragoussis, Jiannis; Davies, Kay E; Geschwind, Daniel H; Fisher, Simon E

2011-07-01

Forkhead-box protein P2 is a transcription factor that has been associated with intriguing aspects of cognitive function in humans, non-human mammals, and song-learning birds. Heterozygous mutations of the human FOXP2 gene cause a monogenic speech and language disorder. Reduced functional dosage of the mouse version (Foxp2) causes deficient cortico-striatal synaptic plasticity and impairs motor-skill learning. Moreover, the songbird orthologue appears critically important for vocal learning. Across diverse vertebrate species, this well-conserved transcription factor is highly expressed in the developing and adult central nervous system. Very little is known about the mechanisms regulated by Foxp2 during brain development. We used an integrated functional genomics strategy to robustly define Foxp2-dependent pathways, both direct and indirect targets, in the embryonic brain. Specifically, we performed genome-wide in vivo ChIP-chip screens for Foxp2-binding and thereby identified a set of 264 high-confidence neural targets under strict, empirically derived significance thresholds. The findings, coupled to expression profiling and in situ hybridization of brain tissue from wild-type and mutant mouse embryos, strongly highlighted gene networks linked to neurite development. We followed up our genomics data with functional experiments, showing that Foxp2 impacts on neurite outgrowth in primary neurons and in neuronal cell models. Our data indicate that Foxp2 modulates neuronal network formation, by directly and indirectly regulating mRNAs involved in the development and plasticity of neuronal connections.
Cyanobacterial Biofuels: Strategies and Developments on Network and Modeling.

PubMed

Klanchui, Amornpan; Raethong, Nachon; Prommeenate, Peerada; Vongsangnak, Wanwipa; Meechai, Asawin

Cyanobacteria, the phototrophic microorganisms, have attracted much attention recently as a promising source for environmentally sustainable biofuels production. However, barriers for commercial markets of cyanobacteria-based biofuels concern the economic feasibility. Miscellaneous strategies for improving the production performance of cyanobacteria have thus been developed. Among these, the simple ad hoc strategies resulting in failure to optimize fully cell growth coupled with desired product yield are explored. With the advancement of genomics and systems biology, a new paradigm toward systems metabolic engineering has been recognized. In particular, a genome-scale metabolic network reconstruction and modeling is a crucial systems-based tool for whole-cell-wide investigation and prediction. In this review, the cyanobacterial genome-scale metabolic models, which offer a system-level understanding of cyanobacterial metabolism, are described. The main process of metabolic network reconstruction and modeling of cyanobacteria are summarized. Strategies and developments on genome-scale network and modeling through the systems metabolic engineering approach are advanced and employed for efficient cyanobacterial-based biofuels production.
A searchable database for the genome of Phomopsis longicolla (isolate MSPL 10-6).

PubMed

Darwish, Omar; Li, Shuxian; May, Zane; Matthews, Benjamin; Alkharouf, Nadim W

2016-01-01

Phomopsis longicolla (syn. Diaporthe longicolla) is an important seed-borne fungal pathogen that primarily causes Phomopsis seed decay (PSD) in most soybean production areas worldwide. This disease severely decreases soybean seed quality by reducing seed viability and oil quality, altering seed composition, and increasing frequencies of moldy and/or split beans. To facilitate investigation of the genetic base of fungal virulence factors and understand the mechanism of disease development, we designed and developed a database for P. longicolla isolate MSPL 10-6 that contains information about the genome assemblies (contigs), gene models, gene descriptions and GO functional ontologies. A web-based front end to the database was built using ASP.NET, which allows researchers to search and mine the genome of this important fungus. This database represents the first reported genome database for a seed borne fungal pathogen in the Diaporthe-Phomopsis complex. The database will also be a valuable resource for research and agricultural communities. It will aid in the development of new control strategies for this pathogen. http://bioinformatics.towson.edu/Phomopsis_longicolla/HomePage.aspx.
A searchable database for the genome of Phomopsis longicolla (isolate MSPL 10-6)

PubMed Central

May, Zane; Matthews, Benjamin; Alkharouf, Nadim W.

2016-01-01

Phomopsis longicolla (syn. Diaporthe longicolla) is an important seed-borne fungal pathogen that primarily causes Phomopsis seed decay (PSD) in most soybean production areas worldwide. This disease severely decreases soybean seed quality by reducing seed viability and oil quality, altering seed composition, and increasing frequencies of moldy and/or split beans. To facilitate investigation of the genetic base of fungal virulence factors and understand the mechanism of disease development, we designed and developed a database for P. longicolla isolate MSPL 10-6 that contains information about the genome assemblies (contigs), gene models, gene descriptions and GO functional ontologies. A web-based front end to the database was built using ASP.NET, which allows researchers to search and mine the genome of this important fungus. This database represents the first reported genome database for a seed borne fungal pathogen in the Diaporthe–Phomopsis complex. The database will also be a valuable resource for research and agricultural communities. It will aid in the development of new control strategies for this pathogen. Availability: http://bioinformatics.towson.edu/Phomopsis_longicolla/HomePage.aspx PMID:28197060
Investigating the Biosynthesis of Natural Products from Marine Proteobacteria: A Survey of Molecules and Strategies

PubMed Central

Timmermans, Marshall L.; Paudel, Yagya P.; Ross, Avena C.

2017-01-01

The phylum proteobacteria contains a wide array of Gram-negative marine bacteria. With recent advances in genomic sequencing, genome analysis, and analytical chemistry techniques, a whole host of information is being revealed about the primary and secondary metabolism of marine proteobacteria. This has led to the discovery of a growing number of medically relevant natural products, including novel leads for the treatment of multidrug-resistant Staphylococcus aureus (MRSA) and cancer. Of equal interest, marine proteobacteria produce natural products whose structure and biosynthetic mechanisms differ from those of their terrestrial and actinobacterial counterparts. Notable features of secondary metabolites produced by marine proteobacteria include halogenation, sulfur-containing heterocycles, non-ribosomal peptides, and polyketides with unusual biosynthetic logic. As advances are made in the technology associated with functional genomics, such as computational sequence analysis, targeted DNA manipulation, and heterologous expression, it has become easier to probe the mechanisms for natural product biosynthesis. This review will focus on genomics driven approaches to understanding the biosynthetic mechanisms for natural products produced by marine proteobacteria. PMID:28762997
A Novel Prokaryotic Promoter Identified in the Genome of Some Monopartite Begomoviruses

PubMed Central

Wang, Wei-Chen; Hsu, Yau-Heiu; Lin, Na-Sheng; Wu, Chia-Ying; Lai, Yi-Chin; Hu, Chung-Chi

2013-01-01

Geminiviruses are known to exhibit both prokaryotic and eukaryotic features in their genomes, with the ability to express their genes and even replicate in bacterial cells. We have demonstrated previously the existence of unit-length single-stranded circular DNAs of Ageratum yellow vein virus (AYVV, a species in the genus Begomovirus, family Geminiviridae) in Escherichia coli cells, which prompted our search for unknown prokaryotic functions in the begomovirus genomes. By using a promoter trapping strategy, we identified a novel prokaryotic promoter, designated AV3 promoter, in nts 762-831 of the AYVV genome. Activity assays revealed that the AV3 promoter is strong, unidirectional, and constitutive, with an endogenous downstream ribosome binding site and a translatable short open reading frame of eight amino acids. Sequence analyses suggested that the AV3 promoter might be a remnant of prokaryotic ancestors that could be related to certain promoters of bacteria from marine or freshwater environments. The discovery of the prokaryotic AV3 promoter provided further evidence for the prokaryotic origin in the evolutionary history of geminiviruses. PMID:23936138

Comparative Pathogenomics Reveals Horizontally Acquired Novel Virulence Genes in Fungi Infecting Cereal Hosts

PubMed Central

Gardiner, Donald M.; McDonald, Megan C.; Covarelli, Lorenzo; Solomon, Peter S.; Rusu, Anca G.; Marshall, Mhairi; Kazan, Kemal; Chakraborty, Sukumar; McDonald, Bruce A.; Manners, John M.

2012-01-01

Comparative analyses of pathogen genomes provide new insights into how pathogens have evolved common and divergent virulence strategies to invade related plant species. Fusarium crown and root rots are important diseases of wheat and barley world-wide. In Australia, these diseases are primarily caused by the fungal pathogen Fusarium pseudograminearum. Comparative genomic analyses showed that the F. pseudograminearum genome encodes proteins that are present in other fungal pathogens of cereals but absent in non-cereal pathogens. In some cases, these cereal pathogen specific genes were also found in bacteria associated with plants. Phylogenetic analysis of selected F. pseudograminearum genes supported the hypothesis of horizontal gene transfer into diverse cereal pathogens. Two horizontally acquired genes with no previously known role in fungal pathogenesis were studied functionally via gene knockout methods and shown to significantly affect virulence of F. pseudograminearum on the cereal hosts wheat and barley. Our results indicate that using comparative genomics to identify genes specific to pathogens of related hosts reveals novel virulence genes, and they illustrate the importance of horizontal gene transfer in the evolution of plant infecting fungal pathogens. PMID:23028337
The Impact of Chromatin Dynamics on Cas9-Mediated Genome Editing in Human Cells.

PubMed

Daer, René M; Cutts, Josh P; Brafman, David A; Haynes, Karmella A

2017-03-17

In order to efficiently edit eukaryotic genomes, it is critical to test the impact of chromatin dynamics on CRISPR/Cas9 function and develop strategies to adapt the system to eukaryotic contexts. So far, research has extensively characterized the relationship between the CRISPR endonuclease Cas9 and the composition of the RNA-DNA duplex that mediates the system's precision. Evidence suggests that chromatin modifications and DNA packaging can block eukaryotic genome editing by custom-built DNA endonucleases like Cas9; however, the underlying mechanism of Cas9 inhibition is unclear. Here, we demonstrate that closed, gene-silencing-associated chromatin is a mechanism for the interference of Cas9-mediated DNA editing. Our assays use a transgenic cell line with a drug-inducible switch to control chromatin states (open and closed) at a single genomic locus. We show that closed chromatin inhibits binding and editing at specific target sites and that artificial reversal of the silenced state restores editing efficiency. These results provide new insights to improve Cas9-mediated editing in human and other mammalian cells.
Investigating the Biosynthesis of Natural Products from Marine Proteobacteria: A Survey of Molecules and Strategies.

PubMed

Timmermans, Marshall L; Paudel, Yagya P; Ross, Avena C

2017-08-01

The phylum proteobacteria contains a wide array of Gram-negative marine bacteria. With recent advances in genomic sequencing, genome analysis, and analytical chemistry techniques, a whole host of information is being revealed about the primary and secondary metabolism of marine proteobacteria. This has led to the discovery of a growing number of medically relevant natural products, including novel leads for the treatment of multidrug-resistant Staphylococcus aureus (MRSA) and cancer. Of equal interest, marine proteobacteria produce natural products whose structure and biosynthetic mechanisms differ from those of their terrestrial and actinobacterial counterparts. Notable features of secondary metabolites produced by marine proteobacteria include halogenation, sulfur-containing heterocycles, non-ribosomal peptides, and polyketides with unusual biosynthetic logic. As advances are made in the technology associated with functional genomics, such as computational sequence analysis, targeted DNA manipulation, and heterologous expression, it has become easier to probe the mechanisms for natural product biosynthesis. This review will focus on genomics driven approaches to understanding the biosynthetic mechanisms for natural products produced by marine proteobacteria.
Cold adaptive traits revealed by comparative genomic analysis of the eurypsychrophile Rhodococcus sp. JG3 isolated from high elevation McMurdo Dry Valley permafrost, Antarctica.

PubMed

Goordial, Jacqueline; Raymond-Bouchard, Isabelle; Zolotarov, Yevgen; de Bethencourt, Luis; Ronholm, Jennifer; Shapiro, Nicole; Woyke, Tanja; Stromvik, Martina; Greer, Charles W; Bakermans, Corien; Whyte, Lyle

2016-02-01

The permafrost soils of the high elevation McMurdo Dry Valleys are the coldest, most desiccating and most oligotrophic on Earth. Rhodococcus sp. JG3 is one of very few bacterial isolates from Antarctic Dry Valley permafrost, and displays subzero growth down to -5°C. To understand how Rhodococcus sp. JG3 is able to survive extreme permafrost conditions and be metabolically active at subzero temperatures, we sequenced its genome and compared it to the genomes of 14 mesophilic rhodococci. Rhodococcus sp. JG3 possessed a higher copy number of genes for general stress response, UV protection and protection from cold shock, osmotic stress and oxidative stress. We characterized genome wide molecular adaptations to cold, and identified genes that had amino acid compositions favourable for increased flexibility and functionality at low temperatures. Rhodococcus sp. JG3 possesses multiple complementary strategies which may enable its survival in some of the harshest permafrost on Earth.
Pharmacogenomics of the human ABC transporter ABCG2: from functional evaluation to drug molecular design

NASA Astrophysics Data System (ADS)

Ishikawa, Toshihisa; Tamura, Ai; Saito, Hikaru; Wakabayashi, Kanako; Nakagawa, Hiroshi

2005-10-01

In the post-genome-sequencing era, emerging genomic technologies are shifting the paradigm for drug discovery and development. Nevertheless, drug discovery and development still remain high-risk and high-stakes ventures with long and costly timelines. Indeed, the attrition of drug candidates in preclinical and development stages is a major problem in drug design. For at least 30% of the candidates, this attrition is due to poor pharmacokinetics and toxicity. Thus, pharmaceutical companies have begun to seriously re-evaluate their current strategies of drug discovery and development. In that light, we propose that a transport mechanism-based design might help to create new, pharmacokinetically advantageous drugs, and as such should be considered an important component of drug design strategy. Performing enzyme- and/or cell-based drug transporter interaction tests may greatly facilitate drug development and allow the prediction of drug-drug interactions. We recently developed methods for high-speed functional screening and quantitative structure-activity relationship analysis to study the substrate specificity of ABC transporters and to evaluate the effect of genetic polymorphisms on their function. These methods would provide a practical tool to screen synthetic and natural compounds, and these data can be applied to the molecular design of new drugs. In this review article, we present an overview on the genetic polymorphisms of human ABC transporter ABCG2 and new camptothecin analogues that can circumvent ABCG2-associated multidrug resistance of cancer.
Effects of HBV Genetic Variability on RNAi Strategies

PubMed Central

Panjaworayan, Nattanan; Brown, Chris M.

2011-01-01

RNAi strategies present promising antiviral strategies against HBV. RNAi strategies require base pairing between short RNAi effectors and targets in the HBV pregenome or other RNAs. Natural variation in HBV genotypes, quasispecies variation, or mutations selected by the RNAi strategy could potentially make these strategies less effective. However, current and proposed antiviral strategies against HBV are being, or could be, designed to avoid this. This would involve simultaneous targeting of multiple regions of the genome, or regions in which variation or mutation is not tolerated. RNAi strategies against single genotypes or against variable regions of the genome would need to have significant other advantages to be part of robust therapies. PMID:21760994
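The idea of targeting regions in which variation is not tolerated can be illustrated with a small sketch. It assumes the genotype sequences are already aligned to equal coordinates and simply reports every window that is identical in all of them. The function name `conserved_windows`, the window width, and the toy sequences are hypothetical illustrations, not taken from the article and not real HBV data.

```python
from typing import List, Tuple


def conserved_windows(aligned: List[str], width: int) -> List[Tuple[int, str]]:
    """Return (start, sequence) for each window identical in all aligned sequences."""
    if not aligned:
        return []
    # Only scan positions present in every sequence.
    length = min(len(seq) for seq in aligned)
    hits = []
    for start in range(length - width + 1):
        window = aligned[0][start:start + width]
        if "-" in window:
            continue  # skip windows that overlap an alignment gap
        if all(seq[start:start + width] == window for seq in aligned):
            hits.append((start, window))
    return hits


# Toy, made-up aligned sequences (hypothetical; position 8 varies).
toy_alignment = [
    "ACGTTGCAAGGCT",
    "ACGTTGCATGGCT",
    "ACGTTGCACGGCT",
]

print(conserved_windows(toy_alignment, 8))
```

A real design would also need to weigh secondary structure, off-target matches and quasispecies data, none of which this sketch attempts.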
Cow genotyping strategies for genomic selection in a small dairy cattle population.

PubMed

Jenko, J; Wiggans, G R; Cooper, T A; Eaglen, S A E; Luff, W G de L; Bichard, M; Pong-Wong, R; Woolliams, J A

2017-01-01

This study compares how different cow genotyping strategies increase the accuracy of genomic estimated breeding values (EBV) in dairy cattle breeds with low numbers. In these breeds, few sires have progeny records, and genotyping cows can improve the accuracy of genomic EBV. The Guernsey breed is a small dairy cattle breed with approximately 14,000 recorded individuals worldwide. Predictions of phenotypes of milk yield, fat yield, protein yield, and calving interval were made for Guernsey cows from England and Guernsey Island using genomic EBV, with training sets including 197 de-regressed proofs of genotyped bulls, with cows selected from among 1,440 genotyped cows using different genotyping strategies. Accuracies of predictions were tested using 10-fold cross-validation among the cows. Genomic EBV were predicted using 4 different methods: (1) pedigree BLUP, (2) genomic BLUP using only bulls, (3) univariate genomic BLUP using bulls and cows, and (4) bivariate genomic BLUP. Genotyping cows with phenotypes and using their data for the prediction of single nucleotide polymorphism effects increased the correlation between genomic EBV and phenotypes compared with using only bulls by 0.163±0.022 for milk yield, 0.111±0.021 for fat yield, and 0.113±0.018 for protein yield; a decrease of 0.014±0.010 for calving interval from a low base was the only exception. Genetic correlations between phenotypes from bulls and cows were approximately 0.6 for all yield traits and significantly different from 1. Only a very small change occurred in correlation between genomic EBV and phenotypes when using the bivariate model. It was always better to genotype all the cows, but when only half of the cows were genotyped, a divergent selection strategy was better compared with the random or directional selection approach. Divergent selection of 30% of the cows remained superior for the yield traits in 8 of 10 folds.
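The three ways of choosing which cows to genotype can be sketched as follows. This is a minimal illustration under stated assumptions: "divergent" is taken to mean picking cows from both tails of the phenotype ranking, "directional" to mean picking only the top-ranked cows, and "random" to mean a uniform random draw. The function `select_cows`, its parameters, and the toy yields are hypothetical and are not the study's code or data.

```python
import random
from typing import Dict, List


def select_cows(phenotypes: Dict[str, float], fraction: float,
                strategy: str, seed: int = 0) -> List[str]:
    """Pick a fraction of cows to genotype under one of three assumed strategies."""
    ranked = sorted(phenotypes, key=phenotypes.get)  # lowest phenotype first
    n = round(len(ranked) * fraction)
    if n == 0:
        return []
    if strategy == "directional":
        return ranked[-n:]  # assumed: only the highest-ranked cows
    if strategy == "divergent":
        low = n // 2        # assumed: split between the two tails
        high = n - low
        return ranked[:low] + ranked[-high:]
    if strategy == "random":
        return random.Random(seed).sample(ranked, n)
    raise ValueError(f"unknown strategy: {strategy}")


# Toy, made-up milk yields for ten cows (hypothetical values).
toy_yields = {f"cow{i}": 5000 + 100 * i for i in range(10)}

print(select_cows(toy_yields, 0.5, "divergent"))
print(select_cows(toy_yields, 0.5, "directional"))
```

In a 10-fold cross-validation like the one described, a selection of this kind would be applied to the training cows within each fold before fitting the prediction model.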
High-efficiency CRISPR/Cas9 multiplex gene editing using the glycine tRNA-processing system-based strategy in maize.

PubMed

Qi, Weiwei; Zhu, Tong; Tian, Zhongrui; Li, Chaobin; Zhang, Wei; Song, Rentao

2016-08-11

The CRISPR/Cas9 genome editing strategy has been applied to a variety of species, and the tRNA-processing system has been used to compact multiple gRNAs into one synthetic gene for manipulating multiple genes in rice. We optimized and introduced the multiplex gene editing strategy based on the tRNA-processing system into maize. Maize glycine-tRNA was selected to design multiple tRNA-gRNA units for the simultaneous production of numerous gRNAs under the control of one maize U6 promoter. We designed three gRNAs for simplex editing and three multiple tRNA-gRNA units for multiplex editing. The results indicate that this system not only increased the number of targeted sites but also enhanced mutagenesis efficiency in maize. Additionally, we propose an advanced sequence selection of gRNA spacers for relatively more efficient and accurate chromosomal fragment deletion, which is important for complete abolishment of gene function, especially for long non-coding RNAs (lncRNAs). Our results also indicated that up to four tRNA-gRNA units in one expression cassette design can still work in maize. The examples reported here demonstrate the utility of the tRNA-processing system-based strategy as an efficient multiplex genome editing tool to enhance maize genetic research and breeding.
Development and validation of an rDNA operon based primer walking strategy applicable to de novo bacterial genome finishing

PubMed Central

Eastman, Alexander W.; Yuan, Ze-Chun

2015-01-01

Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, all located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the required effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provided a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects. PMID:25653642
Functional genomics platform for pooled screening and mammalian genetic interaction maps

PubMed Central

Kampmann, Martin; Bassik, Michael C.; Weissman, Jonathan S.

2014-01-01

Systematic genetic interaction maps in microorganisms are powerful tools for identifying functional relationships between genes and defining the function of uncharacterized genes. We have recently implemented this strategy in mammalian cells as a two-stage approach. First, genes of interest are robustly identified in a pooled genome-wide screen using complex shRNA libraries. Second, phenotypes for all pairwise combinations of hit genes are measured in a double-shRNA screen and used to construct a genetic interaction map. Our protocol allows for rapid pooled screening under various conditions without a requirement for robotics, in contrast to arrayed approaches. Each stage of the protocol can be implemented in ~2 weeks, with additional time for analysis and generation of reagents. We discuss considerations for screen design, and present complete experimental procedures as well as a full computational analysis suite for identification of hits in pooled screens and generation of genetic interaction maps. While the protocols outlined here were developed for our original shRNA-based approach, they can be applied more generally, including to CRISPR-based approaches. PMID:24992097
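The second stage, turning pairwise double-knockdown phenotypes into a genetic interaction map, can be sketched as below. This is only an illustration: it assumes an additive expectation (the expected double phenotype is the sum of the two single phenotypes) and scores each pair as observed minus expected. The protocol's own analysis suite may define the expectation differently; the function `interaction_scores`, the gene names, and the toy phenotype values are all hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple

Pair = Tuple[str, str]


def interaction_scores(single: Dict[str, float],
                       double: Dict[Pair, float]) -> Dict[Pair, float]:
    """Score each gene pair as observed double phenotype minus additive expectation."""
    scores = {}
    for a, b in combinations(sorted(single), 2):
        # Accept the pair in either orientation.
        observed = double.get((a, b), double.get((b, a)))
        if observed is None:
            continue  # pair was not measured
        expected = single[a] + single[b]  # assumed additive null model
        scores[(a, b)] = observed - expected
    return scores


# Toy, made-up phenotypes (hypothetical values, arbitrary units).
toy_single = {"geneA": -1.0, "geneB": -0.5, "geneC": 0.25}
toy_double = {
    ("geneA", "geneB"): -1.5,   # matches expectation: no interaction
    ("geneA", "geneC"): -0.25,  # milder than expected: buffering
    ("geneC", "geneB"): -1.25,  # stronger than expected: synergistic
}

for pair, score in interaction_scores(toy_single, toy_double).items():
    print(pair, score)
```

Genes can then be clustered by the similarity of their score profiles across all partners, which is how such maps are typically used to group functionally related genes.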
Genome mining for ribosomally synthesized natural products.

PubMed

Velásquez, Juan E; van der Donk, Wilfred A

2011-02-01

In recent years, the number of known peptide natural products that are synthesized via the ribosomal pathway has rapidly grown. Taking advantage of sequence homology among genes encoding precursor peptides or biosynthetic proteins, in silico mining of genomes combined with molecular biology approaches has guided the discovery of a large number of new ribosomal natural products, including lantipeptides, cyanobactins, linear thiazole/oxazole-containing peptides, microviridins, lasso peptides, amatoxins, cyclotides, and conopeptides. In this review, we describe the strategies used for the identification of these ribosomally synthesized and posttranslationally modified peptides (RiPPs) and the structures of newly identified compounds. The increasing number of chemical entities and their remarkable structural and functional diversity may lead to novel pharmaceutical applications.
Treating hepatitis C: can you teach old dogs new tricks?

PubMed

Rice, Charles M; You, Shihyun

2005-12-01

Viruses depend on host-derived factors for their efficient genome replication. Here, we demonstrate that a cellular peptidyl-prolyl cis-trans isomerase (PPIase), cyclophilin B (CyPB), is critical for the efficient replication of the hepatitis C virus genome. CyPB interacted with the HCV RNA polymerase NS5B to directly stimulate its RNA binding activity. Both the RNA interference (RNAi)-mediated reduction of endogenous CyPB expression and the induced loss of NS5B binding to CyPB decreased the levels of HCV replication. Thus, CyPB functions as a stimulatory regulator of NS5B in HCV replication machinery. This regulation mechanism for viral replication identifies CyPB as a target for antiviral therapeutic strategies.
The re-emergence of natural products for drug discovery in the genomics era.

PubMed

Harvey, Alan L; Edrada-Ebel, RuAngelie; Quinn, Ronald J

2015-02-01

Natural products have been a rich source of compounds for drug discovery. However, their use has diminished in the past two decades, in part because of technical barriers to screening natural products in high-throughput assays against molecular targets. Here, we review strategies for natural product screening that harness the recent technical advances that have reduced these barriers. We also assess the use of genomic and metabolomic approaches to augment traditional methods of studying natural products, and highlight recent examples of natural products in antimicrobial drug discovery and as inhibitors of protein-protein interactions. The growing appreciation of functional assays and phenotypic screens may further contribute to a revival of interest in natural products for drug discovery.
Immunoglobulin superfamily members encoded by viruses and their multiple roles in immune evasion.

PubMed

Farré, Domènec; Martínez-Vicente, Pablo; Engel, Pablo; Angulo, Ana

2017-05-01

Pathogens have developed a plethora of strategies to undermine host immune defenses in order to guarantee their survival. For large DNA viruses, these immune evasion mechanisms frequently rely on the expression of genes acquired from host genomes. Horizontally transferred genes include members of the immunoglobulin superfamily, whose products constitute the most diverse group of proteins of vertebrate genomes. Their promiscuous immunoglobulin domains, which comprise the building blocks of these molecules, are involved in a large variety of functions mediated by ligand-binding interactions. The flexible structural nature of the immunoglobulin domains makes them appealing targets for viral capture due to their capacity to generate high functional diversity. Here, we present an up-to-date review of immunoglobulin superfamily gene homologs encoded by herpesviruses, poxviruses, and adenoviruses, that include CD200, CD47, Fc receptors, interleukin-1 receptor 2, interleukin-18 binding protein, CD80, carcinoembryonic antigen-related cell adhesion molecules, and signaling lymphocyte activation molecules. We discuss their distinct structural attributes, binding properties, and functions, shaped by evolutionary pressures to disarm specific immune pathways. We include several novel genes identified from extensive genome database surveys. An understanding of the properties and modes of action of these viral proteins may guide the development of novel immune-modulatory therapeutic tools.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Gilchrist, Michael J.; Sobral, Daniel; Khoueiry, Pierre

Genome-wide resources, such as collections of cDNA clones encoding for complete proteins (full-ORF clones), are crucial tools for studying the evolution of gene function and genetic interactions. Non-model organisms, in particular marine organisms, provide a rich source of functional diversity. Marine organism genomes are, however, frequently highly polymorphic and encode proteins that diverge significantly from those of well-annotated model genomes. The construction of full-ORF clone collections from non-model organisms is hindered by the difficulty of predicting accurately the N-terminal ends of proteins, and distinguishing recent paralogs from highly polymorphic alleles. We also report a computational strategy that overcomes these difficulties, and allows for accurate gene level clustering of transcript data followed by the automated identification of full-ORFs with correct 5'- and 3'-ends. It is robust to polymorphism, includes paralog calling and does not require evolutionary proximity to well annotated model organisms. Here, we developed this pipeline for the ascidian Ciona intestinalis, a highly polymorphic member of the divergent sister group of the vertebrates, emerging as a powerful model organism to study chordate gene function, Gene Regulatory Networks and molecular mechanisms underlying human pathologies. Furthermore, using this pipeline we have generated the first full-ORF collection for a highly polymorphic marine invertebrate. It contains 19,163 full-ORF cDNA clones covering 60% of Ciona coding genes, and full-ORF orthologs for approximately half of curated human disease-associated genes.
A map of human genome variation from population-scale sequencing.

PubMed

Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A

2010-10-28

The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
Functional Study of Genes Essential for Autogamy and Nuclear Reorganization in Paramecium

PubMed Central

Nowak, Jacek K.; Gromadka, Robert; Juszczuk, Marek; Jerka-Dziadosz, Maria; Maliszewska, Kamila; Mucchielli, Marie-Hélène; Gout, Jean-François; Arnaiz, Olivier; Agier, Nicolas; Tang, Thomas; Aggerbeck, Lawrence P.; Cohen, Jean; Delacroix, Hervé; Sperling, Linda; Herbert, Christopher J.; Zagulski, Marek; Bétermier, Mireille

2011-01-01

Like all ciliates, Paramecium tetraurelia is a unicellular eukaryote that harbors two kinds of nuclei within its cytoplasm. At each sexual cycle, a new somatic macronucleus (MAC) develops from the germ line micronucleus (MIC) through a sequence of complex events, which includes meiosis, karyogamy, and assembly of the MAC genome from MIC sequences. The latter process involves developmentally programmed genome rearrangements controlled by noncoding RNAs and a specialized RNA interference machinery. We describe our first attempts to identify genes and biological processes that contribute to the progression of the sexual cycle. Given the high percentage of unknown genes annotated in the P. tetraurelia genome, we applied a global strategy to monitor gene expression profiles during autogamy, a self-fertilization process. We focused this pilot study on the genes carried by the largest somatic chromosome and designed dedicated DNA arrays covering 484 genes from this chromosome (1.2% of all genes annotated in the genome). Transcriptome analysis revealed four major patterns of gene expression, including two successive waves of gene induction. Functional analysis of 15 upregulated genes revealed four that are essential for vegetative growth, one of which is involved in the maintenance of MAC integrity and another in cell division or membrane trafficking. Two additional genes, encoding a MIC-specific protein and a putative RNA helicase localizing to the old and then to the new MAC, are specifically required during sexual processes. Our work provides a proof of principle that genes essential for meiosis and nuclear reorganization can be uncovered following genome-wide transcriptome analysis. PMID:21257794
A Genome-wide Combinatorial Strategy Dissects Complex Genetic Architecture of Seed Coat Color in Chickpea

PubMed Central

Bajaj, Deepak; Das, Shouvik; Upadhyaya, Hari D.; Ranjan, Rajeev; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. Laxmipathi; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

2015-01-01

The study identified 9045 high-quality SNPs employing both genome-wide GBS- and candidate gene-based SNP genotyping assays in 172 chickpea accessions, including 93 cultivated (desi and kabuli) and 79 wild accessions. The GWAS in a structured population of 93 sequenced accessions detected 15 major genomic loci exhibiting significant association with seed coat color. Five seed color-associated major genomic loci underlying robust QTLs mapped on a high-density intra-specific genetic linkage map were validated by QTL mapping. The integration of association and QTL mapping with gene haplotype-specific LD mapping and transcript profiling identified novel allelic variants (non-synonymous SNPs) and haplotypes in a MATE secondary transporter gene regulating light/yellow brown (LB/YB) and beige (BE) seed coat color differentiation in chickpea. The down-regulation and decreased transcript expression of the beige seed coat color-associated MATE gene haplotype was correlated with reduced proanthocyanidin accumulation in the mature seed coats of beige compared with light/yellow brown seed colored desi and kabuli accessions. This seed color-regulating MATE gene revealed strong purifying selection pressure primarily in LB/YB seed colored desi and wild Cicer reticulatum accessions compared with the BE seed colored kabuli accessions. The functionally relevant molecular tags identified have potential to decipher the complex transcriptional regulatory gene function of seed coat coloration and for understanding the selective sweep-based seed color trait evolutionary pattern in cultivated and wild accessions during chickpea domestication. The genome-wide integrated approach employed will expedite marker-assisted genetic enhancement for developing cultivars with desirable seed coat color types in chickpea. PMID:26635822
Genome network medicine: innovation to overcome huge challenges in cancer therapy.

PubMed

Roukos, Dimitrios H

2014-01-01

The post-ENCODE era is shaping a new biomedical research direction for understanding the transcriptional and signaling networks that drive gene expression and core cellular processes such as cell fate, survival, and apoptosis. Over the past half century, the Francis Crick 'central dogma' of single n gene/protein-phenotype (trait/disease) has defined biology, human physiology, disease, diagnostics, and drug discovery. However, the ENCODE project and several other genomic studies using high-throughput sequencing technologies, computational strategies, and imaging techniques to visualize regulatory networks provide evidence that transcriptional processes and gene expression are regulated by highly complex dynamic molecular and signaling networks. This Focus article describes the linear experimentation-based limitations of diagnostics and therapeutics to cure advanced cancer and the need to move on from reductionist to network-based approaches. Given the evident wide genomic heterogeneity, the power and challenges of next-generation sequencing (NGS) technologies to identify a patient's personal mutational landscape for tailoring the best targeted drugs to the individual patient are discussed. However, the available drugs are not capable of targeting aberrant signaling networks, and functional transcriptional heterogeneity and functional genome organization remain poorly understood. Therefore, future clinical genome network medicine, which aims to overcome multiple problems in the new fields of regulatory DNA mapping, noncoding RNA, enhancer RNAs, and the dynamic complexity of transcriptional circuitry, is also discussed, in anticipation of new innovative technology and strong appreciation of clinical data and evidence-based medicine. The problems and potential solutions in the discovery of next-generation, molecular, and signaling circuitry-based biomarkers and drugs are explored.
'Unknown' proteins and 'orphan' enzymes: the missing half of the engineering parts list--and how to find it.

PubMed

Hanson, Andrew D; Pribat, Anne; Waller, Jeffrey C; de Crécy-Lagard, Valérie

2009-12-14

Like other forms of engineering, metabolic engineering requires knowledge of the components (the 'parts list') of the target system. Lack of such knowledge impairs both rational engineering design and diagnosis of the reasons for failures; it also poses problems for the related field of metabolic reconstruction, which uses a cell's parts list to recreate its metabolic activities in silico. Despite spectacular progress in genome sequencing, the parts lists for most organisms that we seek to manipulate remain highly incomplete, due to the dual problem of 'unknown' proteins and 'orphan' enzymes. The former are all the proteins deduced from genome sequence that have no known function, and the latter are all the enzymes described in the literature (and often catalogued in the EC database) for which no corresponding gene has been reported. Unknown proteins constitute up to about half of the proteins in prokaryotic genomes, and much more than this in higher plants and animals. Orphan enzymes make up more than a third of the EC database. Attacking the 'missing parts list' problem is accordingly one of the great challenges for post-genomic biology, and a tremendous opportunity to discover new facets of life's machinery. Success will require a co-ordinated community-wide attack, sustained over years. In this attack, comparative genomics is probably the single most effective strategy, for it can reliably predict functions for unknown proteins and genes for orphan enzymes. Furthermore, it is cost-efficient and increasingly straightforward to deploy owing to a proliferation of databases and associated tools.
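The two halves of the 'missing parts list' described above can be illustrated with a small set computation. This is only a toy sketch: the function name `split_parts_list`, the gene identifiers, and the tiny enzyme catalogue are hypothetical and are not taken from the article. Real analyses would draw on full genome annotations and the EC database.

```python
def split_parts_list(protein_functions, catalogued_ec_numbers):
    """Separate 'unknown' proteins from 'orphan' enzymes.

    protein_functions: dict mapping a protein/gene id to its EC number,
        or None when no function is known (hypothetical input format).
    catalogued_ec_numbers: set of EC numbers described in the literature.
    """
    # Unknown proteins: deduced from sequence but with no known function.
    unknown_proteins = {p for p, ec in protein_functions.items() if ec is None}
    # Orphan enzymes: catalogued activities with no corresponding gene.
    assigned = {ec for ec in protein_functions.values() if ec is not None}
    orphan_enzymes = catalogued_ec_numbers - assigned
    return unknown_proteins, orphan_enzymes


# Toy data, invented for illustration only.
proteins = {"geneA": "1.1.1.1", "geneB": None, "geneC": "2.7.1.1", "geneD": None}
catalogue = {"1.1.1.1", "2.7.1.1", "4.2.1.20"}
unknown, orphans = split_parts_list(proteins, catalogue)
```

In this toy input, half of the proteins are 'unknown' and one catalogued activity is an 'orphan'; comparative genomics would then try to pair entries from the two sets.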

Systematic review of knowledge, confidence and education in nutritional genomics for students and professionals in nutrition and dietetics.

PubMed

Wright, O R L

2014-06-01

This review examines knowledge and confidence of nutrition and dietetics professionals in nutritional genomics and evaluates the teaching strategies in this field within nutrition and dietetics university programmes and professional development courses internationally. A systematic search of 10 literature databases was conducted from January 2000 to December 2012 to identify original research. Any studies of either nutrition and/or dietetics students or dietitians/nutritionists investigating current levels of knowledge or confidence in nutritional genomics, or strategies to improve learning and/or confidence in this area, were eligible. Eighteen articles (15 separate studies) met the inclusion criteria. Three articles were assessed as negative, eight as neutral and seven as positive according to the American Dietetic Association Quality Criteria Checklist. The overall ranking of evidence was low. Dietitians have low involvement, knowledge and confidence in nutritional genomics, and evidence for educational strategies is limited and methodologically weak. There is a need to develop training pathways and material to up-skill nutrition and/or dietetics students and nutrition and/or dietetics professionals in nutritional genomics through multidisciplinary collaboration with content area experts. There is a paucity of high quality evidence on optimum teaching strategies; however, methods promoting repetitive exposure to nutritional genomics material, problem-solving, collaborative and case-based learning are most promising for university and professional development programmes. © 2013 The British Dietetic Association Ltd.
Genome-editing Technologies for Gene and Cell Therapy.

PubMed

Maeder, Morgan L; Gersbach, Charles A

2016-03-01

Gene therapy has historically been defined as the addition of new genes to human cells. However, the recent advent of genome-editing technologies has enabled a new paradigm in which the sequence of the human genome can be precisely manipulated to achieve a therapeutic effect. This includes the correction of mutations that cause disease, the addition of therapeutic genes to specific sites in the genome, and the removal of deleterious genes or genome sequences. This review presents the mechanisms of different genome-editing strategies and describes each of the common nuclease-based platforms, including zinc finger nucleases, transcription activator-like effector nucleases (TALENs), meganucleases, and the CRISPR/Cas9 system. We then summarize the progress made in applying genome editing to various areas of gene and cell therapy, including antiviral strategies, immunotherapies, and the treatment of monogenic hereditary disorders. The current challenges and future prospects for genome editing as a transformative technology for gene and cell therapy are also discussed.
Genome-editing Technologies for Gene and Cell Therapy

PubMed Central

Maeder, Morgan L; Gersbach, Charles A

2016-01-01

Gene therapy has historically been defined as the addition of new genes to human cells. However, the recent advent of genome-editing technologies has enabled a new paradigm in which the sequence of the human genome can be precisely manipulated to achieve a therapeutic effect. This includes the correction of mutations that cause disease, the addition of therapeutic genes to specific sites in the genome, and the removal of deleterious genes or genome sequences. This review presents the mechanisms of different genome-editing strategies and describes each of the common nuclease-based platforms, including zinc finger nucleases, transcription activator-like effector nucleases (TALENs), meganucleases, and the CRISPR/Cas9 system. We then summarize the progress made in applying genome editing to various areas of gene and cell therapy, including antiviral strategies, immunotherapies, and the treatment of monogenic hereditary disorders. The current challenges and future prospects for genome editing as a transformative technology for gene and cell therapy are also discussed. PMID:26755333
Functional genomics reveals that a compact terpene synthase gene family can account for terpene volatile production in apple.

PubMed

Nieuwenhuizen, Niels J; Green, Sol A; Chen, Xiuyin; Bailleul, Estelle J D; Matich, Adam J; Wang, Mindy Y; Atkinson, Ross G

2013-02-01

Terpenes are specialized plant metabolites that act as attractants to pollinators and as defensive compounds against pathogens and herbivores, but they also play an important role in determining the quality of horticultural food products. We show that the genome of cultivated apple (Malus domestica) contains 55 putative terpene synthase (TPS) genes, of which only 10 are predicted to be functional. This low number of predicted functional TPS genes compared with other plant species was supported by the identification of only eight potentially functional TPS enzymes in apple 'Royal Gala' expressed sequence tag databases, including the previously characterized apple (E,E)-α-farnesene synthase. In planta functional characterization of these TPS enzymes showed that they could account for the majority of terpene volatiles produced in cv Royal Gala, including the sesquiterpenes germacrene-D and (E)-β-caryophyllene, the monoterpenes linalool and α-pinene, and the homoterpene (E)-4,8-dimethyl-1,3,7-nonatriene. Relative expression analysis of the TPS genes indicated that floral and vegetative tissues were the primary sites of terpene production in cv Royal Gala. However, production of cv Royal Gala floral-specific terpenes and TPS genes was observed in the fruit of some heritage apple cultivars. Our results suggest that the apple TPS gene family has been shaped by a combination of ancestral and more recent genome-wide duplication events. The relatively small number of functional enzymes suggests that the remaining terpenes produced in floral and vegetative and fruit tissues are maintained under a positive selective pressure, while the small number of terpenes found in the fruit of modern cultivars may be related to commercial breeding strategies.
Functional Genomics Reveals That a Compact Terpene Synthase Gene Family Can Account for Terpene Volatile Production in Apple

PubMed Central

Nieuwenhuizen, Niels J.; Green, Sol A.; Chen, Xiuyin; Bailleul, Estelle J.D.; Matich, Adam J.; Wang, Mindy Y.; Atkinson, Ross G.

2013-01-01

Terpenes are specialized plant metabolites that act as attractants to pollinators and as defensive compounds against pathogens and herbivores, but they also play an important role in determining the quality of horticultural food products. We show that the genome of cultivated apple (Malus domestica) contains 55 putative terpene synthase (TPS) genes, of which only 10 are predicted to be functional. This low number of predicted functional TPS genes compared with other plant species was supported by the identification of only eight potentially functional TPS enzymes in apple ‘Royal Gala’ expressed sequence tag databases, including the previously characterized apple (E,E)-α-farnesene synthase. In planta functional characterization of these TPS enzymes showed that they could account for the majority of terpene volatiles produced in cv Royal Gala, including the sesquiterpenes germacrene-D and (E)-β-caryophyllene, the monoterpenes linalool and α-pinene, and the homoterpene (E)-4,8-dimethyl-1,3,7-nonatriene. Relative expression analysis of the TPS genes indicated that floral and vegetative tissues were the primary sites of terpene production in cv Royal Gala. However, production of cv Royal Gala floral-specific terpenes and TPS genes was observed in the fruit of some heritage apple cultivars. Our results suggest that the apple TPS gene family has been shaped by a combination of ancestral and more recent genome-wide duplication events. The relatively small number of functional enzymes suggests that the remaining terpenes produced in floral and vegetative and fruit tissues are maintained under a positive selective pressure, while the small number of terpenes found in the fruit of modern cultivars may be related to commercial breeding strategies. PMID:23256150
Phytozome Comparative Plant Genomics Portal

DOE Office of Scientific and Technical Information (OSTI.GOV)

Goodstein, David; Batra, Sajeev; Carlson, Joseph

2014-09-09

The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to:
1. Understand and accelerate the improvement (domestication) of bioenergy crops
2. Characterize and moderate plant response to climate change
3. Use comparative genomics to identify constrained elements and infer gene function
4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work
5. Expand functional genomic resources for Plant Flagship genomes
Ecological roles of dominant and rare prokaryotes in acid mine drainage revealed by metagenomics and metatranscriptomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hua, Zheng-Shuang; Han, Yu-Jiao; Chen, Lin-Xing

Here we report that high-throughput sequencing is expanding our knowledge of microbial diversity in the environment. Still, understanding the metabolic potentials and ecological roles of rare and uncultured microbes in natural communities remains a major challenge. To this end, we applied a ‘divide and conquer’ strategy that partitioned a massive metagenomic data set (>100 Gbp) into subsets based on K-mer frequency in sequence assembly to a low-diversity acid mine drainage (AMD) microbial community and, by integrating with an additional metatranscriptomic assembly, successfully obtained 11 draft genomes most of which represent yet uncultured and/or rare taxa (relative abundance <1%). We report the first genome of a naturally occurring Ferrovum population (relative abundance >90%) and its metabolic potentials and gene expression profile, providing initial molecular insights into the ecological role of these lesser known, but potentially important, microorganisms in the AMD environment. Gene transcriptional analysis of the active taxa revealed major metabolic capabilities executed in situ, including carbon- and nitrogen-related metabolisms associated with syntrophic interactions, iron and sulfur oxidation, which are key in energy conservation and AMD generation, and the mechanisms of adaptation and response to the environmental stresses (heavy metals, low pH and oxidative stress). Remarkably, nitrogen fixation and sulfur oxidation were performed by the rare taxa, indicating their critical roles in the overall functioning and assembly of the AMD community. Finally, our study demonstrates the potential of the ‘divide and conquer’ strategy in high-throughput sequencing data assembly for genome reconstruction and functional partitioning analysis of both dominant and rare species in natural microbial assemblages.
Ecological roles of dominant and rare prokaryotes in acid mine drainage revealed by metagenomics and metatranscriptomics

DOE PAGES

Hua, Zheng-Shuang; Han, Yu-Jiao; Chen, Lin-Xing; ...

2014-11-07

Here we report that high-throughput sequencing is expanding our knowledge of microbial diversity in the environment. Still, understanding the metabolic potentials and ecological roles of rare and uncultured microbes in natural communities remains a major challenge. To this end, we applied a ‘divide and conquer’ strategy that partitioned a massive metagenomic data set (>100 Gbp) into subsets based on K-mer frequency in sequence assembly to a low-diversity acid mine drainage (AMD) microbial community and, by integrating with an additional metatranscriptomic assembly, successfully obtained 11 draft genomes most of which represent yet uncultured and/or rare taxa (relative abundance <1%). We report the first genome of a naturally occurring Ferrovum population (relative abundance >90%) and its metabolic potentials and gene expression profile, providing initial molecular insights into the ecological role of these lesser known, but potentially important, microorganisms in the AMD environment. Gene transcriptional analysis of the active taxa revealed major metabolic capabilities executed in situ, including carbon- and nitrogen-related metabolisms associated with syntrophic interactions, iron and sulfur oxidation, which are key in energy conservation and AMD generation, and the mechanisms of adaptation and response to the environmental stresses (heavy metals, low pH and oxidative stress). Remarkably, nitrogen fixation and sulfur oxidation were performed by the rare taxa, indicating their critical roles in the overall functioning and assembly of the AMD community. Finally, our study demonstrates the potential of the ‘divide and conquer’ strategy in high-throughput sequencing data assembly for genome reconstruction and functional partitioning analysis of both dominant and rare species in natural microbial assemblages.
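The idea of partitioning reads by K-mer frequency before assembly can be sketched roughly as follows. This is not the authors' pipeline: the two-bin split, the use of the median K-mer count per read, and the names `partition_reads` and `threshold` are assumptions made for illustration. The intuition is that reads from abundant genomes are built from K-mers seen many times, while reads from rare genomes are not.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def partition_reads(reads, k, threshold):
    """Split reads into 'high' and 'low' abundance bins.

    A read goes to 'high' when the median count of its k-mers across
    the whole data set is at least `threshold` (hypothetical cut-off).
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    bins = {"high": [], "low": []}
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:  # read shorter than k: cannot be scored
            continue
        med = median(counts[km] for km in read_kmers)
        bins["high" if med >= threshold else "low"].append(read)
    return bins


# Toy reads: three copies from a 'dominant' genome, one from a 'rare' one.
toy_reads = ["AAAAAA", "AAAAAA", "AAAAAA", "ACGTAC"]
toy_bins = partition_reads(toy_reads, k=3, threshold=3)
```

Each bin could then be assembled separately, which is the 'divide' step; a real data set of more than 100 Gbp would need a memory-efficient K-mer counter rather than an in-memory `Counter`.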
Powerful tools for genetic modification: Advances in gene editing.

PubMed

Roesch, Erica A; Drumm, Mitchell L

2017-11-01

Recent discoveries and technical advances in genetic engineering, methods called gene or genome editing, provide hope for repairing genes that cause diseases like cystic fibrosis (CF) or otherwise altering a gene for therapeutic benefit. There are both hopes and hurdles with these technologies, with new ideas emerging almost daily. Initial studies using intestinal organoid cultures carrying the common F508del mutation have shown that gene editing by CRISPR/Cas9 can convert cells lacking CFTR function to cells with normal channel function, providing a precedent that this technology can be harnessed for CF. While this is an important precedent, the challenges that remain are not trivial. A logistical issue for this and many other genetic diseases is genetic heterogeneity. Approximately 2000 mutations associated with CF have been found in CFTR, the gene responsible for CF, and thus a feasible strategy that would encompass all individuals affected by the disease is particularly difficult to envision. However, single strategies that would be applicable to all subjects affected by CF have been conceived and are being investigated. With all of these approaches, efficiency (the proportion of cells edited), accuracy (how often other sites in the genome are affected), and delivery of the gene editing components to the desired cells are perhaps the most significant, impending hurdles. Our understanding of each of these areas is increasing rapidly, and while it is impossible to predict when a successful strategy will reach the clinic, there is every reason to believe it is a question of "when" and not "if." © 2017 Wiley Periodicals, Inc.
Non-coding-regulatory regions of human brain genes delineated by bacterial artificial chromosome knock-in mice.

PubMed

Schmouth, Jean-François; Castellarin, Mauro; Laprise, Stéphanie; Banks, Kathleen G; Bonaguro, Russell J; McInerny, Simone C; Borretta, Lisa; Amirabbasi, Mahsa; Korecki, Andrea J; Portales-Casamar, Elodie; Wilson, Gary; Dreolini, Lisa; Jones, Steven J M; Wasserman, Wyeth W; Goldowitz, Daniel; Holt, Robert A; Simpson, Elizabeth M

2013-10-14

The next big challenge in human genetics is understanding the 98% of the genome that comprises non-coding DNA. Hidden in this DNA are sequences critical for gene regulation, and new experimental strategies are needed to understand the functional role of gene-regulation sequences in health and disease. In this study, we build upon our HuGX ('high-throughput human genes on the X chromosome') strategy to expand our understanding of human gene regulation in vivo. In all, ten human genes known to express in therapeutically important brain regions were chosen for study. For eight of these genes, human bacterial artificial chromosome clones were identified, retrofitted with a reporter, knocked single-copy into the Hprt locus in mouse embryonic stem cells, and mouse strains derived. Five of these human genes expressed in mouse, and all expressed in the adult brain region for which they were chosen. This defined the boundaries of the genomic DNA sufficient for brain expression, and refined our knowledge regarding the complexity of gene regulation. We also characterized for the first time the expression of human MAOA and NR2F2, two genes for which the mouse homologs have been extensively studied in the central nervous system (CNS), and AMOTL1 and NOV, for which roles in CNS have been unclear. We have demonstrated the use of the HuGX strategy to functionally delineate non-coding-regulatory regions of therapeutically important human brain genes. Our results also show that a careful investigation, using publicly available resources and bioinformatics, can lead to accurate predictions of gene expression.
Tomato functional genomics database (TFGD): a comprehensive collection and analysis package for tomato functional genomics

USDA-ARS's Scientific Manuscript database

Tomato Functional Genomics Database (TFGD; http://ted.bti.cornell.edu) provides a comprehensive systems biology resource to store, mine, analyze, visualize and integrate large-scale tomato functional genomics datasets. The database is expanded from the previously described Tomato Expression Database...
Chemotherapy-Induced Peripheral Neurotoxicity and Ototoxicity: New Paradigms for Translational Genomics

PubMed Central

Fossa, Sophie D.; Sesso, Howard D.; Frisina, Robert D.; Herrmann, David N.; Beard, Clair J.; Feldman, Darren R.; Pagliaro, Lance C.; Miller, Robert C.; Vaughn, David J.; Einhorn, Lawrence H.; Cox, Nancy J.; Dolan, M. Eileen

2014-01-01

In view of advances in early detection and treatment, the 5-year relative survival rate for all cancer patients combined is now approximately 66%. As a result, there are more than 13.7 million cancer survivors in the United States, with this number increasing by 2% annually. For many patients, improvements in survival have been countered by therapy-associated adverse effects that may seriously impair long-term functional status, workplace productivity, and quality of life. Approximately 20% to 40% of cancer patients given neurotoxic chemotherapy develop chemotherapy-induced peripheral neurotoxicity (CIPN), which represents one of the most common and potentially permanent nonhematologic side effects of chemotherapy. Permanent bilateral hearing loss and/or tinnitus can result from several ototoxic therapies, including cisplatin- or carboplatin-based chemotherapy. CIPN and ototoxicity represent important challenges because of the lack of means for effective prevention, mitigation, or a priori identification of high-risk patients, and few studies have applied modern genomic approaches to understand underlying mechanisms/pathways. Translational genomics, including cell-based models, now offer opportunities to make inroads for the first time to develop preventive and interventional strategies for CIPN, ototoxicity, and other treatment-related complications. This commentary provides current perspective on a successful research strategy, with a focus on cisplatin, developed by an experienced, transdisciplinary group of researchers and clinicians, representing pharmacogenomics, statistical genetics, neurology, hearing science, medical oncology, epidemiology, and cancer survivorship. Principles outlined herein are applicable to the construction of research programs in translational genomics with strong clinical relevance and highlight unprecedented opportunities to understand, prevent, and treat long-term treatment-related morbidities. PMID:24623533
DNMT3A and IDH mutations in acute myeloid leukemia and other myeloid malignancies: associations with prognosis and potential treatment strategies

PubMed Central

Im, AP; Sehgal, AR; Carroll, MP; Smith, BD; Tefferi, A; Johnson, DE; Boyiadzis, M

2014-01-01

The development of effective treatment strategies for most forms of acute myeloid leukemia (AML) has languished for the past several decades. There are a number of reasons for this, but key among them is the considerable heterogeneity of this disease and the paucity of molecular markers that can be used to predict clinical outcomes and responsiveness to different therapies. The recent large-scale sequencing of AML genomes is now providing opportunities for patient stratification and personalized approaches to treatment that are based on individual mutational profiles. It is particularly notable that studies by The Cancer Genome Atlas and others have determined that 44% of patients with AML exhibit mutations in genes that regulate methylation of genomic DNA. In particular, frequent mutation has been observed in the genes encoding DNA methyltransferase 3A (DNMT3A), isocitrate dehydrogenase 1 (IDH1) and isocitrate dehydrogenase 2 (IDH2), as well as Tet oncogene family member 2. This review will summarize the incidence of these mutations, their impact on biochemical functions including epigenetic modification of genomic DNA and their potential usefulness as prognostic indicators. Importantly, the presence of DNMT3A, IDH1 or IDH2 mutations may confer sensitivity to novel therapeutic approaches, including the use of demethylating agents. Therefore, the clinical experience with decitabine and azacitidine in the treatment of patients harboring these mutations will be reviewed. Overall, we propose that understanding the role of these mutations in AML biology will lead to more rational therapeutic approaches targeting molecularly defined subtypes of the disease. PMID:24699305
Temperate and lytic bacteriophages programmed to sensitize and kill antibiotic-resistant bacteria

PubMed Central

Yosef, Ido; Manor, Miriam; Kiro, Ruth

2015-01-01

The increasing threat of pathogen resistance to antibiotics requires the development of novel antimicrobial strategies. Here we present a proof of concept for a genetic strategy that aims to sensitize bacteria to antibiotics and selectively kill antibiotic-resistant bacteria. We use temperate phages to deliver a functional clustered regularly interspaced short palindromic repeats (CRISPR)–CRISPR-associated (Cas) system into the genome of antibiotic-resistant bacteria. The delivered CRISPR-Cas system destroys both antibiotic resistance-conferring plasmids and genetically modified lytic phages. This linkage between antibiotic sensitization and protection from lytic phages is a key feature of the strategy. It allows programming of lytic phages to kill only antibiotic-resistant bacteria while protecting antibiotic-sensitized bacteria. Phages designed according to this strategy may be used on hospital surfaces and hand sanitizers to facilitate replacement of antibiotic-resistant pathogens with sensitive ones. PMID:26060300
Temperate and lytic bacteriophages programmed to sensitize and kill antibiotic-resistant bacteria.

PubMed

Yosef, Ido; Manor, Miriam; Kiro, Ruth; Qimron, Udi

2015-06-09

The increasing threat of pathogen resistance to antibiotics requires the development of novel antimicrobial strategies. Here we present a proof of concept for a genetic strategy that aims to sensitize bacteria to antibiotics and selectively kill antibiotic-resistant bacteria. We use temperate phages to deliver a functional clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) system into the genome of antibiotic-resistant bacteria. The delivered CRISPR-Cas system destroys both antibiotic resistance-conferring plasmids and genetically modified lytic phages. This linkage between antibiotic sensitization and protection from lytic phages is a key feature of the strategy. It allows programming of lytic phages to kill only antibiotic-resistant bacteria while protecting antibiotic-sensitized bacteria. Phages designed according to this strategy may be used on hospital surfaces and hand sanitizers to facilitate replacement of antibiotic-resistant pathogens with sensitive ones.
Novel genomic rearrangements mediated by multiple genetic elements in Streptococcus pyogenes M23ND confer potential for evolutionary persistence

PubMed Central

Bao, Yun-Juan; Liang, Zhong; Mayfield, Jeffrey A.; McShan, William M.; Lee, Shaun W.; Ploplis, Victoria A.; Castellino, Francis J.

2016-01-01

Symmetric genomic rearrangements around replication axes in genomes are commonly observed in prokaryotic genomes, including Group A Streptococcus (GAS). However, asymmetric rearrangements are rare. Our previous studies showed that the hypervirulent invasive GAS strain, M23ND, containing an inactivated transcriptional regulator system, covRS, exhibits unique extensive asymmetric rearrangements, which reconstructed a genomic structure distinct from other GAS genomes. In the current investigation, we identified the rearrangement events and examined the genetic consequences and evolutionary implications underlying the rearrangements. By comparison with a close phylogenetic relative, M18-MGAS8232, we propose a molecular model wherein a series of asymmetric rearrangements have occurred in M23ND, involving translocations, inversions and integrations mediated by multiple factors, viz., rRNA-comX (factor for late competence), transposons and phage-encoded gene segments. Assessments of the cumulative gene orientations and GC skews reveal that the asymmetric genomic rearrangements did not affect the general genomic integrity of the organism. However, functional distributions reveal re-clustering of a broad set of CovRS-regulated actively transcribed genes, including virulence factors and metabolic genes, to the same leading strand, with high confidence (p-value ~10^-10). The re-clustering of the genes suggests a potential selection advantage for the spatial proximity to the transcription complexes, which may contain the global transcriptional regulator, CovRS, and other RNA polymerases. Their proximities allow for efficient transcription of the genes required for growth, virulence and persistence. A new paradigm of survival strategies of GAS strains is provided through multiple genomic rearrangements, while, at the same time, maintaining genomic integrity. PMID:27329479
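A cumulative GC skew profile, one of the genome-integrity checks mentioned above, can be computed along these lines. The skew of a window is conventionally (G − C) / (G + C); the window size, the non-overlapping windowing, and the function names here are illustrative assumptions rather than the settings used in the study. The sequence is assumed to be uppercase.

```python
def gc_skew(window):
    """Return (G - C) / (G + C) for one window, or 0.0 if it has no G or C."""
    g = window.count("G")
    c = window.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def cumulative_gc_skew(genome, window_size):
    """Running sum of GC skew over consecutive non-overlapping windows."""
    total = 0.0
    profile = []
    for start in range(0, len(genome) - window_size + 1, window_size):
        total += gc_skew(genome[start:start + window_size])
        profile.append(total)
    return profile


# Toy sequence: a G-rich window, a C-rich window, then a balanced one.
toy_profile = cumulative_gc_skew("GGGGCCCCGGCC", window_size=4)
```

In a typical bacterial chromosome the cumulative curve tends to turn near the replication origin and terminus, so a profile that keeps this overall shape after rearrangements is consistent with preserved strand organization.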
Sox17 drives functional engraftment of endothelium converted from non-vascular cells

PubMed Central

Schachterle, William; Badwe, Chaitanya R.; Palikuqi, Brisa; Kunar, Balvir; Ginsberg, Michael; Lis, Raphael; Yokoyama, Masataka; Elemento, Olivier; Scandura, Joseph M.; Rafii, Shahin

2017-01-01

Transplanting vascular endothelial cells (ECs) to support metabolism and express regenerative paracrine factors is a strategy to treat vasculopathies and to promote tissue regeneration. However, transplantation strategies have been challenging to develop, because ECs are difficult to culture and little is known about how to direct them to stably integrate into vasculature. Here we show that only amniotic cells could convert to cells that maintain EC gene expression. Even so, these converted cells perform sub-optimally in transplantation studies. Constitutive Akt signalling increases expression of EC morphogenesis genes, including Sox17, shifts the genomic targeting of Fli1 to favour nearby Sox consensus sites and enhances the vascular function of converted cells. Enforced expression of Sox17 increases expression of morphogenesis genes and promotes integration of transplanted converted cells into injured vessels. Thus, Ets transcription factors specify non-vascular, amniotic cells to EC-like cells, whereas Sox17 expression is required to confer EC function. PMID:28091527
Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets

PubMed Central

2014-01-01

Background: Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. Results: We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways. PMID:25221624
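The edge-weight construction described above can be sketched as follows. The abstract does not define the two similarity measures, so this sketch substitutes simple stand-ins that should be read as assumptions: topological similarity is 1 when two links share a gene and 0 otherwise, co-appearance similarity is the Jaccard index of the sets of datasets in which each link occurs, and `alpha` is a hypothetical mixing parameter.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_link_graph(link_datasets, alpha=0.5):
    """Build weighted edges between coexpression links.

    link_datasets: dict mapping a link, given as a (gene, gene) tuple,
        to the set of dataset ids in which that link is observed.
    Returns a dict {(link1, link2): weight} for edges with weight > 0.
    """
    edges = {}
    for l1, l2 in combinations(sorted(link_datasets), 2):
        # Stand-in topological similarity: do the links share a gene?
        topo = 1.0 if set(l1) & set(l2) else 0.0
        # Stand-in co-appearance similarity across datasets.
        coapp = jaccard(link_datasets[l1], link_datasets[l2])
        weight = alpha * topo + (1 - alpha) * coapp
        if weight > 0:
            edges[(l1, l2)] = weight
    return edges


# Toy links: two recur in the same datasets and share gene B.
toy_links = {("A", "B"): {1, 2}, ("B", "C"): {1, 2}, ("D", "E"): {3}}
toy_edges = hybrid_link_graph(toy_links, alpha=0.5)
```

Any graph clustering method applied to these weighted edges would then group recurrent links into candidate modules; the choice of clustering algorithm is left open here because the abstract does not specify it.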
My46: a web-based tool for self-guided management of genomic test results in research and clinical settings

PubMed Central

Tabor, Holly K.; Jamal, Seema M.; Yu, Joon-Ho; Crouch, Julia M.; Shankar, Aditi G.; Dent, Karin M.; Anderson, Nick; Miller, Damon A.; Futral, Brett T.; Bamshad, Michael J.

2016-01-01

A major challenge to implementing precision medicine is the need for an efficient and cost-effective strategy for returning individual genomic test results that is easily scalable and can be incorporated into multiple models of clinical practice. My46 is a web-based tool for managing the return of genetic results that was designed and developed to support a wide range of approaches to results disclosure, ranging from traditional face-to-face disclosure to self-guided models. My46 has five key functions: set and modify results return preferences, return results, educate, manage return of results, and assess return of results. These key functions are supported by six distinct modules and a suite of features that enhance the user experience, ease site navigation, facilitate knowledge sharing, and enable results return tracking. My46 is a potentially effective solution for returning results and supports current trends toward shared decision-making between patient and provider and patient-driven health management. PMID:27632689
Proteomic approaches in brain research and neuropharmacology.

PubMed

Vercauteren, Freya G G; Bergeron, John J M; Vandesande, Frans; Arckens, Lut; Quirion, Rémi

2004-10-01

Numerous applications of genomic technologies have enabled the assembly of unprecedented inventories of genes, expressed in cells under specific physiological and pathophysiological conditions. Complementing the valuable information generated through functional genomics with the integrative knowledge of protein expression and function should enable the development of more efficient diagnostic tools and therapeutic agents. Proteomic analyses are particularly suitable to elucidate posttranslational modifications, expression levels and protein-protein interactions of thousands of proteins at a time. In this review, two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) investigations of brain tissues in neurodegenerative diseases such as Alzheimer's disease, Down syndrome and schizophrenia, and the construction of 2D-PAGE proteome maps of the brain are discussed. The role of the Human Proteome Organization (HUPO) as an international coordinating organization for proteomic efforts, as well as challenges for proteomic technologies and data analysis are also addressed. It is expected that the use of proteomic strategies will have significant impact in neuropharmacology over the coming decade.

A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response

PubMed Central

Adamson, Britt; Norman, Thomas M.; Jost, Marco; Cho, Min Y.; Nuñez, James K.; Chen, Yuwen; Villalta, Jacqueline E.; Gilbert, Luke A.; Horlbeck, Max A.; Hein, Marco Y.; Pak, Ryan A.; Gray, Andrew N.; Gross, Carol A.; Dixit, Atray; Parnas, Oren; Regev, Aviv; Weissman, Jonathan S.

2016-01-01

SUMMARY Functional genomics efforts face tradeoffs between number of perturbations examined and complexity of phenotypes measured. We bridge this gap with Perturb-seq, which combines droplet-based single-cell RNA-seq with a strategy for barcoding CRISPR-mediated perturbations, allowing many perturbations to be profiled in pooled format. We applied Perturb-seq to dissect the mammalian unfolded protein response (UPR) using single and combinatorial CRISPR perturbations. Two genome-scale CRISPR interference (CRISPRi) screens identified genes whose repression perturbs ER homeostasis. Subjecting ~100 hits to Perturb-seq enabled high-precision functional clustering of genes. Single-cell analyses decoupled the three UPR branches, revealed bifurcated UPR branch activation among cells subject to the same perturbation, and uncovered differential activation of the branches across hits, including an isolated feedback loop between the translocon and IRE1α. These studies provide insight into how the three sensors of ER homeostasis monitor distinct types of stress and highlight the ability of Perturb-seq to dissect complex cellular responses. PMID:27984733
The Food and Drug Addiction Epidemic: Targeting Dopamine Homeostasis.

PubMed

Blum, Kenneth; Thanos, Panayotis K; Wang, Gene-Jack; Febo, Marcelo; Demetrovics, Zsolt; Modestino, Edward Justin; Braverman, Eric R; Baron, David; Badgaiyan, Rajendra D; Gold, Mark S

2018-02-12

Obesity is damaging the lives of more than 300 million people worldwide, and maintaining a healthy weight using popular weight loss tactics remains a very difficult undertaking. Managing the obesity problem seems within reach as a better understanding develops of the function of our genome in drug/nutrient responses. Strategies indicated by this understanding of nutriepigenomics and neurogenetics in the treatment and prevention of metabolic syndrome and obesity include moderation of mRNA expression by DNA methylation, and inhibition of histone deacetylation. Based on an individual's genetic makeup, deficient metabolic pathways can be targeted epigenetically by, for example, the provision of dietary supplementation that includes phytochemicals, vitamins, and importantly functional amino acids. Also, the chromatin structure of imprinted genes that control nutrients during fetal development can be modified. Pathways affecting dopamine signaling, molecular transport and nervous system development are implicated in these strategies. Obesity is a subtype of Reward Deficiency Syndrome (RDS) and these new strategies in the treatment and prevention of obesity target improved dopamine function. It is not merely a matter of gastrointestinal signaling linked to hypothalamic peptides, but alternatively, finding novel ways to improve ventral tegmental area (VTA) dopaminergic function and homeostasis. Copyright © Bentham Science Publishers.
Gain-of-function mutagenesis approaches in rice for functional genomics and improvement of crop productivity.

PubMed

Moin, Mazahar; Bakshi, Achala; Saha, Anusree; Dutta, Mouboni; Kirti, P B

2017-07-01

The epitome of any genome research is to identify all the existing genes in a genome and investigate their roles. Various techniques have been applied to unveil the functions either by silencing or over-expressing the genes by targeted expression or random mutagenesis. Rice is the most appropriate model crop for generating a mutant resource for functional genomic studies because of the availability of high-quality genome sequence and relatively smaller genome size. Rice has syntenic relationships with members of other cereals. Hence, characterization of functionally unknown genes in rice will possibly provide key genetic insights and can lead to comparative genomics involving other cereals. The current review attempts to discuss the available gain-of-function mutagenesis techniques for functional genomics, emphasizing the contemporary approach, activation tagging and alterations to this method for the enhancement of yield and productivity of rice. © The Author 2017. Published by Oxford University Press.
Using Zebrafish to Test the Genetic Basis of Human Craniofacial Diseases.

PubMed

Machado, R Grecco; Eames, B Frank

2017-10-01

Genome-wide association studies (GWASs) opened an innovative and productive avenue to investigate the molecular basis of human craniofacial disease. However, GWASs identify candidate genes only; they do not prove whether any particular one is the functional villain underlying disease or just an unlucky genomic bystander. Genetic manipulation of animal models is the best approach to reveal which genetic loci identified from human GWASs are functionally related to specific diseases. The purpose of this review is to discuss the potential of zebrafish to resolve which candidate genetic loci are mechanistic drivers of craniofacial diseases. Many anatomic, embryonic, and genetic features of craniofacial development are conserved among zebrafish and mammals, making zebrafish a good model of craniofacial diseases. Also, the ability to manipulate gene function in zebrafish was greatly expanded over the past 20 years, enabling systems such as Gateway Tol2 and CRISPR-Cas9 to test gain- and loss-of-function alleles identified from human GWASs in coding and noncoding regions of DNA. With the optimization of genetic editing methods, large numbers of candidate genes can be efficiently interrogated. Finding the functional villains that underlie diseases will permit new treatments and prevention strategies and will increase understanding of how gene pathways operate during normal development.
proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes.

PubMed

Mende, Daniel R; Letunic, Ivica; Huerta-Cepas, Jaime; Li, Simone S; Forslund, Kristoffer; Sunagawa, Shinichi; Bork, Peer

2017-01-04

The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Unlocking the Bottleneck in Forward Genetics Using Whole-Genome Sequencing and Identity by Descent to Isolate Causative Mutations

PubMed Central

Siggs, Owen M.; Miosge, Lisa A.; Roots, Carla M.; Enders, Anselm; Bertram, Edward M.; Crockford, Tanya L.; Whittle, Belinda; Potter, Paul K.; Simon, Michelle M.; Mallon, Ann-Marie; Brown, Steve D. M.; Beutler, Bruce; Goodnow, Christopher C.; Lunter, Gerton; Cornall, Richard J.

2013-01-01

Forward genetics screens with N-ethyl-N-nitrosourea (ENU) provide a powerful way to illuminate gene function and generate mouse models of human disease; however, the identification of causative mutations remains a limiting step. Current strategies depend on conventional mapping, so the propagation of affected mice requires non-lethal screens; accurate tracking of phenotypes through pedigrees is complex and uncertain; out-crossing can introduce unexpected modifiers; and Sanger sequencing of candidate genes is inefficient. Here we show how these problems can be efficiently overcome using whole-genome sequencing (WGS) to detect the ENU mutations and then identify regions that are identical by descent (IBD) in multiple affected mice. In this strategy, we use a modification of the Lander-Green algorithm to isolate causative recessive and dominant mutations, even at low coverage, on a pure strain background. Analysis of the IBD regions also allows us to calculate the ENU mutation rate (1.54 mutations per Mb) and to model future strategies for genetic screens in mice. The introduction of this approach will accelerate the discovery of causal variants, permit broader and more informative lethal screens to be used, reduce animal costs, and herald a new era for ENU mutagenesis. PMID:23382690
Particle infectivity of HIV-1 full-length genome infectious molecular clones in a subtype C heterosexual transmission pair following high fidelity amplification and unbiased cloning

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deymier, Martin J., E-mail: mdeymie@emory.edu; Claiborne, Daniel T., E-mail: dclaibo@emory.edu; Ende, Zachary, E-mail: zende@emory.edu

The high genetic diversity of HIV-1 impedes high throughput, large-scale sequencing and full-length genome cloning by common restriction enzyme based methods. Applying novel methods that employ a high-fidelity polymerase for amplification and an unbiased fusion-based cloning strategy, we have generated several HIV-1 full-length genome infectious molecular clones from an epidemiologically linked transmission pair. These clones represent the transmitted/founder virus and phylogenetically diverse non-transmitted variants from the chronically infected individual's diverse quasispecies near the time of transmission. We demonstrate that, using this approach, PCR-induced mutations in full-length clones derived from their cognate single genome amplicons are rare. Furthermore, all eight non-transmitted genomes tested produced functional virus with a range of infectivities, belying the previous assumption that a majority of circulating viruses in chronic HIV-1 infection are defective. Thus, these methods provide important tools to update protocols in molecular biology that can be universally applied to the study of human viral pathogens. - Highlights: • Our novel methodology demonstrates accurate amplification and cloning of full-length HIV-1 genomes. • A majority of plasma derived HIV variants from a chronically infected individual are infectious. • The transmitted/founder was more infectious than the majority of the variants from the chronically infected donor.
DRD4 and DAT1 in ADHD: Functional neurobiology to pharmacogenetics

PubMed Central

Turic, Darko; Swanson, James; Sonuga-Barke, Edmund

2010-01-01

Attention deficit/hyperactivity disorder (ADHD) is a common and potentially very impairing neuropsychiatric disorder of childhood. Statistical genetic studies of twins have shown ADHD to be highly heritable, with the combination of genes and gene by environment interactions accounting for around 80% of phenotypic variance. The initial molecular genetic studies where candidates were selected because of the efficacy of dopaminergic compounds in the treatment of ADHD were remarkably successful and provided strong evidence for the role of DRD4 and DAT1 variants in the pathogenesis of ADHD. However, the recent application of non-candidate gene strategies (eg, genome-wide association scans) has failed to identify additional genes with substantial genetic main effects, and the effects for DRD4 and DAT1 have not been replicated. This is the usual pattern observed for most other physical and mental disorders evaluated with current state-of-the-art methods. In this paper we discuss future strategies for genetic studies in ADHD, highlighting both the pitfalls and possible solutions relating to candidate gene studies, genome-wide studies, defining the phenotype, and statistical approaches. PMID:23226043
Disruption of the psbA gene by the copy correction mechanism reveals that the expression of plastid-encoded genes is regulated by photosynthesis activity.

PubMed

Khan, Muhammad Sarwar; Hameed, Waqar; Nozoe, Mikio; Shiina, Takashi

2007-05-01

The functional analysis of genes encoded by the chloroplast genome of tobacco by reverse genetics is routine. Nevertheless, for a small number of genes their deletion generates heteroplasmic genotypes, complicating their analysis. There is thus the need for additional strategies to develop deletion mutants for these genes. We have developed a homologous copy correction-based strategy for deleting/mutating genes encoded on the chloroplast genome. This system was used to produce psbA knockouts. The resulting plants are homoplasmic and lack photosystem II (PSII) activity. Further, the deletion mutants exhibit a distinct phenotype; young leaves are green, whereas older leaves are bleached, irrespective of light conditions. This suggests that senescence is promoted by the absence of psbA. Analysis of the transcript levels indicates that NEP (nuclear-encoded plastid RNA polymerase)-dependent plastid genes are up regulated in the psbA deletion mutants, whereas the bleached leaves retain plastid-encoded plastid RNA polymerase activity. Hence, the expression of NEP-dependent plastid genes may be regulated by photosynthesis, either directly or indirectly.
“SNP Snappy”: A Strategy for Fast Genome-Wide Association Studies Fitting a Full Mixed Model

PubMed Central

Meyer, Karin; Tier, Bruce

2012-01-01

A strategy to reduce computational demands of genome-wide association studies fitting a mixed model is presented. Improvements are achieved by utilizing a large proportion of calculations that remain constant across the multiple analyses for individual markers involved, with estimates obtained without inverting large matrices. PMID:22021386
The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome.

PubMed

Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A

2015-01-01

A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Estimating P-coverage of biosynthetic pathways in DNA libraries and screening by genetic selection: biotin biosynthesis in the marine microorganism Chromohalobacter.

PubMed

Kim, Eun Jin; Angell, Scott; Janes, Jeff; Watanabe, Coran M H

2008-06-01

Traditional approaches to natural product discovery involve cell-based screening of natural product extracts followed by compound isolation and characterization. Their importance notwithstanding, continued mining leads to depletion of natural resources and the reisolation of previously identified metabolites. Metagenomic strategies aimed at localizing the biosynthetic cluster genes and expressing them in surrogate hosts offer one possible alternative. A fundamental question that naturally arises when pursuing such a strategy is, how large must the genomic library be to effectively represent the genome of an organism(s) and the biosynthetic gene clusters they harbor? Such an issue is certainly augmented in the absence of expensive robotics to expedite colony picking and/or screening of clones. We have developed an algorithm, named BPC (biosynthetic pathway coverage), supported by molecular simulations to deduce the number of BAC clones required to achieve proper coverage of the genome and their respective biosynthetic pathways. The strategy has been applied to the construction of a large-insert BAC library from a marine microorganism, Hon6 (isolated from Honokohau, Maui) thought to represent a new species. The genomic library is constructed with a BAC yeast shuttle vector pClasper lacZ paving the way for the culturing of libraries in both prokaryotic and eukaryotic hosts. Flow cytometric methods are utilized to estimate the genome size of the organism and BPC implemented to assess P-coverage or percent coverage. A genetic selection strategy is illustrated, applications of which could expedite screening efforts in the identification and localization of biosynthetic pathways from marine microbial consortia, offering a powerful complement to genome sequencing and degenerate probe strategies. Implementing this approach, we report on the biotin biosynthetic pathway from the marine microorganism Hon6.
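The library-size question raised in the abstract above has a classical first-order answer, the Clarke and Carbon estimate N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size and P is the desired probability that a given sequence is present. The sketch below implements only that textbook estimate; it is not the BPC algorithm, whose details the abstract does not give. The function name and the example genome and insert sizes are hypothetical.

```python
import math


def clones_needed(genome_size_bp, insert_size_bp, probability):
    """Clarke-Carbon estimate of the number of clones needed so that any
    given sequence is represented with the requested probability."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if not 0 < insert_size_bp < genome_size_bp:
        raise ValueError("insert must be smaller than the genome")
    fraction = insert_size_bp / genome_size_bp  # genome fraction per clone
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# Hypothetical example: 4 Mb genome, 100 kb BAC inserts, 99% probability
n_clones = clones_needed(4_000_000, 100_000, 0.99)  # 182 clones
```

The estimate treats clones as independent random draws and a target as a single point in the genome. A biosynthetic cluster spanning many kilobases must fit entirely within one insert, so the number required in practice would be higher than this figure.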
Comparative genomics of the pathogenic ciliate Ichthyophthirius multifiliis, its free-living relatives and a host species provide insights into adoption of a parasitic lifestyle and prospects for disease control

PubMed Central

2011-01-01

Background Ichthyophthirius multifiliis, commonly known as Ich, is a highly pathogenic ciliate responsible for 'white spot', a disease causing significant economic losses to the global aquaculture industry. Options for disease control are extremely limited, and Ich's obligate parasitic lifestyle makes experimental studies challenging. Unlike most well-studied protozoan parasites, Ich belongs to a phylum composed primarily of free-living members. Indeed, it is closely related to the model organism Tetrahymena thermophila. Genomic studies represent a promising strategy to reduce the impact of this disease and to understand the evolutionary transition to parasitism. Results We report the sequencing, assembly and annotation of the Ich macronuclear genome. Compared with its free-living relative T. thermophila, the Ich genome is reduced approximately two-fold in length and gene density and three-fold in gene content. We analyzed in detail several gene classes with diverse functions in behavior, cellular function and host immunogenicity, including protein kinases, membrane transporters, proteases, surface antigens and cytoskeletal components and regulators. We also mapped by orthology Ich's metabolic pathways in comparison with other ciliates and a potential host organism, the zebrafish Danio rerio. Conclusions Knowledge of the complete protein-coding and metabolic potential of Ich opens avenues for rational testing of therapeutic drugs that target functions essential to this parasite but not to its fish hosts. Also, a catalog of surface protein-encoding genes will facilitate development of more effective vaccines. The potential to use T. thermophila as a surrogate model offers promise toward controlling 'white spot' disease and understanding the adaptation to a parasitic lifestyle. PMID:22004680
Exploiting EST databases for the development and characterisation of 3425 gene-tagged CISP markers in biofuel crop sugarcane and their transferability in cereals and orphan tropical grasses.

PubMed

Chandra, Amaresh; Jain, Radha; Solomon, Sushil; Shrivastava, Shiksha; Roy, Ajoy K

2013-02-04

Sugarcane is an important cash crop, providing 70% of the global raw sugar as well as raw material for biofuel production. Genetic analysis is hindered in sugarcane because of its large and complex polyploid genome and lack of sufficiently informative gene-tagged markers. Modern genomics has produced large numbers of ESTs, which can be exploited to develop molecular markers based on comparative analysis with EST datasets of related crops and the whole rice genome sequence, and accentuate their cross-technical functionality in orphan crops like tropical grasses. Utilising 246,180 Saccharum officinarum EST sequences vis-à-vis its comparative analysis with ESTs of sorghum and barley and the whole rice genome sequence, we have developed 3425 novel gene-tagged markers - namely, conserved-intron scanning primers (CISP) - using the web program GeMprospector. Rice orthologue annotation results indicated homology of 1096 sequences with expressed proteins and 491 with hypothetical proteins. The remaining 1838 were miscellaneous in nature. A total of 367 primer-pairs were tested in a diverse panel of samples. The data indicate amplification of 41% polymorphic bands leading to 0.52 PIC and 3.50 MI with a set of sugarcane varieties and Saccharum species. In addition, a moderate technical functionality of a set of such markers with orphan tropical grasses (22%) and fodder cum cereal oat (33%) is observed. Developed gene-tagged CISP markers exhibited considerable technical functionality with varieties of sugarcane and unexplored species of tropical grasses. These markers would thus be particularly useful in identifying the economical traits in sugarcane and developing conservation strategies for orphan tropical grasses.
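The PIC (polymorphism information content) value quoted in the abstract above is commonly computed from allele or band frequencies with the formula of Botstein and colleagues: PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The sketch below implements that standard formula. Whether the authors used this variant is an assumption, since simplified forms are also common for dominant markers, and the function name is hypothetical.

```python
from itertools import combinations


def polymorphism_information_content(frequencies):
    """PIC for one marker, given allele frequencies that sum to 1."""
    if abs(sum(frequencies) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    squares = [p * p for p in frequencies]
    homozygosity = sum(squares)
    # Correction for matings that are uninformative despite heterozygosity
    cross_term = sum(2.0 * a * b for a, b in combinations(squares, 2))
    return 1.0 - homozygosity - cross_term


# Two equally frequent alleles give the biallelic maximum of 0.375
pic_biallelic = polymorphism_information_content([0.5, 0.5])
```

A marker-set figure such as the 0.52 reported above would typically be the mean over markers, each of which needs more than two alleles on average to exceed the biallelic maximum.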
The Gene Expression Omnibus Database.

PubMed

Clough, Emily; Barrett, Tanya

2016-01-01

The Gene Expression Omnibus (GEO) database is an international public repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets. Created in 2000 as a worldwide resource for gene expression studies, GEO has evolved with rapidly changing technologies and now accepts high-throughput data for many other data applications, including those that examine genome methylation, chromatin structure, and genome-protein interactions. GEO supports community-derived reporting standards that specify provision of several critical study elements including raw data, processed data, and descriptive metadata. The database not only provides access to data for tens of thousands of studies, but also offers various Web-based tools and strategies that enable users to locate data relevant to their specific interests, as well as to visualize and analyze the data. This chapter includes detailed descriptions of methods to query and download GEO data and use the analysis and visualization tools. The GEO homepage is at http://www.ncbi.nlm.nih.gov/geo/.
MToolBox: a highly automated pipeline for heteroplasmy annotation and prioritization analysis of human mitochondrial variants in high-throughput sequencing

PubMed Central

Diroma, Maria Angela; Santorsola, Mariangela; Guttà, Cristiano; Gasparre, Giuseppe; Picardi, Ernesto; Pesole, Graziano; Attimonelli, Marcella

2014-01-01

Motivation: The increasing availability of mitochondria-targeted and off-target sequencing data in whole-exome and whole-genome sequencing studies (WXS and WGS) has raised the demand for effective pipelines to accurately measure heteroplasmy and to easily recognize the most functionally important mitochondrial variants among a huge number of candidates. To this end, we developed MToolBox, a highly automated pipeline to reconstruct and analyze human mitochondrial DNA from high-throughput sequencing data. Results: MToolBox implements an effective computational strategy for mitochondrial genome assembly and haplogroup assignment, also including a prioritization analysis of detected variants. MToolBox provides a Variant Call Format file featuring, for the first time, allele-specific heteroplasmy and annotation files with prioritized variants. MToolBox was tested on simulated samples and applied to 1000 Genomes WXS datasets. Availability and implementation: MToolBox package is available at https://sourceforge.net/projects/mtoolbox/. Contact: marcella.attimonelli@uniba.it Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25028726
The planetary biology of cytochrome P450 aromatases.

PubMed

Gaucher, Eric A; Graddy, Logan G; Li, Tang; Simmen, Rosalia C M; Simmen, Frank A; Schreiber, David R; Liberles, David A; Janis, Christine M; Benner, Steven A

2004-08-17

Joining a model for the molecular evolution of a protein family to the paleontological and geological records (geobiology), and then to the chemical structures of substrates, products, and protein folds, is emerging as a broad strategy for generating hypotheses concerning function in a post-genomic world. This strategy expands systems biology to a planetary context, necessary for a notion of fitness to underlie (as it must) any discussion of function within a biomolecular system. Here, we report an example of such an expansion, where tools from planetary biology were used to analyze three genes from the pig Sus scrofa that encode cytochrome P450 aromatases, enzymes that convert androgens into estrogens. The evolutionary history of the vertebrate aromatase gene family was reconstructed. Transition redundant exchange silent substitution metrics were used to interpolate dates for the divergence of family members, the paleontological record was consulted to identify changes in physiology that correlated in time with the change in molecular behavior, and new aromatase sequences from peccary were obtained. Metrics that detect changing function in proteins were then applied, including KA/KS values and those that exploit structural biology. These identified specific amino acid replacements that were associated with changing substrate and product specificity during the time of presumed adaptive change. The combined analysis suggests that aromatase paralogs arose in pigs as a result of selection for Suoidea with larger litters than their ancestors, and permitted the Suoidea to survive the global climatic trauma that began in the Eocene. This combination of bioinformatics analysis, molecular evolution, paleontology, cladistics, global climatology, structural biology, and organic chemistry serves as a paradigm in planetary biology. As the geological, paleontological, and genomic records improve, this approach should become widely useful to make systems biology statements about high-level function for biomolecular systems.
The planetary biology of cytochrome P450 aromatases

PubMed Central

Gaucher, Eric A; Graddy, Logan G; Li, Tang; Simmen, Rosalia CM; Simmen, Frank A; Schreiber, David R; Liberles, David A; Janis, Christine M; Benner, Steven A

2004-01-01

Background Joining a model for the molecular evolution of a protein family to the paleontological and geological records (geobiology), and then to the chemical structures of substrates, products, and protein folds, is emerging as a broad strategy for generating hypotheses concerning function in a post-genomic world. This strategy expands systems biology to a planetary context, necessary for a notion of fitness to underlie (as it must) any discussion of function within a biomolecular system. Results Here, we report an example of such an expansion, where tools from planetary biology were used to analyze three genes from the pig Sus scrofa that encode cytochrome P450 aromatases, enzymes that convert androgens into estrogens. The evolutionary history of the vertebrate aromatase gene family was reconstructed. Transition redundant exchange silent substitution metrics were used to interpolate dates for the divergence of family members, the paleontological record was consulted to identify changes in physiology that correlated in time with the change in molecular behavior, and new aromatase sequences from peccary were obtained. Metrics that detect changing function in proteins were then applied, including KA/KS values and those that exploit structural biology. These identified specific amino acid replacements that were associated with changing substrate and product specificity during the time of presumed adaptive change. The combined analysis suggests that aromatase paralogs arose in pigs as a result of selection for Suoidea with larger litters than their ancestors, and permitted the Suoidea to survive the global climatic trauma that began in the Eocene. Conclusions This combination of bioinformatics analysis, molecular evolution, paleontology, cladistics, global climatology, structural biology, and organic chemistry serves as a paradigm in planetary biology. As the geological, paleontological, and genomic records improve, this approach should become widely useful to make systems biology statements about high-level function for biomolecular systems. PMID:15315709
GEAR: genomic enrichment analysis of regional DNA copy number changes.

PubMed

Kim, Tae-Min; Jung, Yu-Chae; Rhyu, Mun-Gan; Jung, Myeong Ho; Chung, Yeun-Jun

2008-02-01

We developed an algorithm named GEAR (genomic enrichment analysis of regional DNA copy number changes) for functional interpretation of genome-wide DNA copy number changes identified by array-based comparative genomic hybridization. GEAR selects two types of chromosomal alterations with potential biological relevance, i.e. recurrent and phenotype-specific alterations. Then it performs functional enrichment analysis using a priori selected functional gene sets to identify primary and clinical genomic signatures. The genomic signatures identified by GEAR represent functionally coordinated genomic changes, which can provide clues to the underlying molecular mechanisms related to the phenotypes of interest. GEAR can help identify key molecular functions that are activated or repressed in tumor genomes, leading to an improved understanding of tumor biology. GEAR software is available with an online manual at the website http://www.systemsbiology.co.kr/GEAR/.
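Functional enrichment analysis of the kind mentioned in the abstract above is often carried out with a one-sided hypergeometric test: given the genes located in altered regions, how surprising is their overlap with a predefined gene set? The sketch below shows that generic test only. The abstract does not state which statistic GEAR uses, so the choice of test, the function name and the example counts are assumptions for illustration.

```python
from math import comb


def enrichment_p_value(population, set_size, selected, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    population: genes on the array (background)
    set_size:   genes in the functional gene set
    selected:   genes lying in altered regions
    overlap:    genes that are in both the gene set and the altered regions
    """
    total = comb(population, selected)
    upper = min(set_size, selected)
    tail = sum(
        comb(set_size, k) * comb(population - set_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 150-gene pathway,
# 400 genes in recurrently gained regions, 12 of them in the pathway
p_value = enrichment_p_value(20_000, 150, 400, 12)
```

When many gene sets are tested, the resulting p-values would normally need a multiple-testing correction before being read as signatures.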
Identification of giant Mimivirus protein functions using RNA interference

PubMed Central

Sobhy, Haitham; Scola, Bernard La; Pagnier, Isabelle; Raoult, Didier; Colson, Philippe

2015-01-01

Genomic analysis of giant viruses, such as Mimivirus, has revealed that more than half of the putative genes have no known functions (ORFans). We knocked down Mimivirus genes using short interfering RNA as a proof of concept to determine the functions of giant virus ORFans. As fibers are easy to observe, we targeted a gene encoding a protein absent in a Mimivirus mutant devoid of fibers as well as three genes encoding products identified in a protein concentrate of fibers, including one ORFan and one gene of unknown function. We found that knocking down these four genes was associated with depletion or modification of the fibers. Our strategy of silencing ORFan genes in giant viruses opens a way to identify their complete gene repertoire and may clarify the role of these genes, differentiating between junk DNA and truly used genes. Using this strategy, we were able to annotate four proteins in Mimivirus and 30 homologous proteins in other giant viruses. In addition, we were able to annotate >500 proteins from cellular organisms and 100 from metagenomic databases. PMID:25972846

Biological effects of simple changes in functionality on rhodium metalloinsertors

PubMed Central

Weidmann, Alyson G.; Komor, Alexis C.; Barton, Jacqueline K.

2013-01-01

DNA mismatch repair (MMR) is crucial to ensuring the fidelity of the genome. The inability to correct single base mismatches leads to elevated mutation rates and carcinogenesis. Using metalloinsertors–bulky metal complexes that bind with high specificity to mismatched sites in the DNA duplex–our laboratory has adopted a new chemotherapeutic strategy through the selective targeting of MMR-deficient cells, that is, those that have a propensity for cancerous transformation. Rhodium metalloinsertors display inhibitory effects selectively in cells that are deficient in the MMR machinery, consistent with this strategy. However, a highly sensitive structure–function relationship is emerging with the development of new complexes that highlights the importance of subcellular localization. We have found that small structural modifications, for example a hydroxyl versus a methyl functional group, can yield profound differences in biological function. Despite similar binding affinities and selectivities for DNA mismatches, only one metalloinsertor shows selective inhibition of cellular proliferation in MMR-deficient versus -proficient cells. Studies of whole-cell, nuclear and mitochondrial uptake reveal that this selectivity depends upon targeting DNA mismatches in the cell nucleus. PMID:23776288
Genome editing for human gene therapy.

PubMed

Meissner, Torsten B; Mandal, Pankaj K; Ferreira, Leonardo M R; Rossi, Derrick J; Cowan, Chad A

2014-01-01

The rapid advancement of genome-editing techniques holds much promise for the field of human gene therapy. From bacteria to model organisms and human cells, genome editing tools such as zinc-finger nucleases (ZFNs), TALENs, and CRISPR/Cas9 have been successfully used to manipulate the respective genomes with unprecedented precision. With regard to human gene therapy, it is of great interest to test the feasibility of genome editing in primary human hematopoietic cells that could potentially be used to treat a variety of human genetic disorders such as hemoglobinopathies, primary immunodeficiencies, and cancer. In this chapter, we explore the use of the CRISPR/Cas9 system for the efficient ablation of genes in two clinically relevant primary human cell types, CD4+ T cells and CD34+ hematopoietic stem and progenitor cells. By using two guide RNAs directed at a single locus, we achieve highly efficient and predictable deletions that ablate gene function. The use of a Cas9-2A-GFP fusion protein allows FACS-based enrichment of the transfected cells. The ease of designing, constructing, and testing guide RNAs makes this dual guide strategy an attractive approach for the efficient deletion of clinically relevant genes in primary human hematopoietic stem and effector cells and enables the use of CRISPR/Cas9 for gene therapy.
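The "predictable deletion" produced by two guides at one locus can be sketched as simple string arithmetic. The example below is a simplified illustration, not the chapter's protocol: it assumes both protospacers lie on the forward strand, that Cas9 cuts 3 bp upstream of the PAM, and that the two ends rejoin precisely. The sequences and function names are hypothetical.

```python
def cut_index_forward(seq, protospacer):
    """0-based index of the blunt cut for a forward-strand protospacer.

    Assumption: the cut falls 3 bp upstream of the PAM, i.e. 3 bases
    before the 3' end of the protospacer.
    """
    start = seq.find(protospacer)
    if start < 0:
        raise ValueError("protospacer not found on the forward strand")
    return start + len(protospacer) - 3


def excise_between_cuts(seq, cut_a, cut_b):
    """Remove the segment between two cut sites, assuming precise rejoining.

    Returns the edited sequence and the number of deleted bases.
    """
    left, right = sorted((cut_a, cut_b))
    return seq[:left] + seq[right:], right - left


# Hypothetical 60 bp locus with two 20 nt protospacers, each followed by an NGG PAM.
GUIDE_1 = "ACGTACGTACGTACGTACGT"
GUIDE_2 = "TTGCATTGCATTGCATTGCA"
LOCUS = "CC" + GUIDE_1 + "TGG" + "A" * 10 + GUIDE_2 + "AGG" + "TT"

edited, deleted_bp = excise_between_cuts(
    LOCUS,
    cut_index_forward(LOCUS, GUIDE_1),
    cut_index_forward(LOCUS, GUIDE_2),
)
```

Knowing the expected deletion size in advance is what makes the outcome easy to confirm, for example by a PCR product that is shorter by `deleted_bp` bases.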
Genome-Wide Stochastic Adaptive DNA Amplification at Direct and Inverted DNA Repeats in the Parasite Leishmania

PubMed Central

Plourde, Marie; Gingras, Hélène; Roy, Gaétan; Lapointe, Andréanne; Leprohon, Philippe; Papadopoulou, Barbara; Corbeil, Jacques; Ouellette, Marc

2014-01-01

Gene amplification of specific loci has been described in all kingdoms of life. In the protozoan parasite Leishmania, the product of amplification is usually part of extrachromosomal circular or linear amplicons that are formed at the level of direct or inverted repeated sequences. A bioinformatics screen revealed that repeated sequences are widely distributed in the Leishmania genome and the repeats are chromosome-specific, conserved among species, and generally present in low copy number. Using sensitive PCR assays, we provide evidence that the Leishmania genome is continuously being rearranged at the level of these repeated sequences, which serve as a functional platform for constitutive and stochastic amplification (and deletion) of genomic segments in the population. This process is adaptive as the copy number of advantageous extrachromosomal circular or linear elements increases upon selective pressure and is reversible when selection is removed. We also provide mechanistic insights on the formation of circular and linear amplicons through RAD51 recombinase-dependent and -independent mechanisms, respectively. The whole genome of Leishmania is thus stochastically rearranged at the level of repeated sequences, and the selection of parasite subpopulations with changes in the copy number of specific loci is used as a strategy to respond to a changing environment. PMID:24844805
Whole-genome duplication increases tumor cell sensitivity to MPS1 inhibition.

PubMed

Jemaà, Mohamed; Manic, Gwenola; Lledo, Gwendaline; Lissa, Delphine; Reynes, Christelle; Morin, Nathalie; Chibon, Frédéric; Sistigu, Antonella; Castedo, Maria; Vitale, Ilio; Kroemer, Guido; Abrieu, Ariane

2016-01-05

Several lines of evidence indicate that whole-genome duplication resulting in tetraploidy facilitates carcinogenesis by providing an intermediate and metastable state more prone to generate oncogenic aneuploidy. Here, we report a novel strategy to preferentially kill tetraploid cells based on the abrogation of the spindle assembly checkpoint (SAC) via the targeting of TTK protein kinase (better known as monopolar spindle 1, MPS1). The pharmacological inhibition as well as the knockdown of MPS1 kills tetraploid cells more efficiently than their diploid counterparts. By using time-lapse videomicroscopy, we show that tetraploid cells do not survive the aborted mitosis due to SAC abrogation upon MPS1 depletion. In contrast, diploid cells are able to survive for at least two more cell cycles upon the same treatment. This effect might reflect the enhanced difficulty of cells with whole-genome doubling to tolerate a further increase in ploidy and/or an elevated level of chromosome instability in the absence of SAC functions. We further show that MPS1-inhibited tetraploid cells promote mitotic catastrophe executed by the intrinsic pathway of apoptosis, as indicated by the loss of mitochondrial potential, the release of the pro-apoptotic cytochrome c from mitochondria, and the activation of caspases. Altogether, our results suggest that MPS1 inhibition could be used as a therapeutic strategy for targeting tetraploid cancer cells.
Strategies for the acquisition of transcriptional and epigenetic information in single cells.

PubMed

Li, Guang; Dzilic, Elda; Flores, Nick; Shieh, Alice; Wu, Sean M

2017-03-01

As the basic unit of living organisms, each single cell has unique molecular signatures and functions. Our ability to uncover the transcriptional and epigenetic signature of single cells has been hampered by the lack of tools to explore this area of research. The advent of microfluidic single cell technology along with single cell genome-wide DNA amplification methods has greatly improved our understanding of the expression variation in single cells. Transcriptional expression profiling by multiplex qPCR or genome-wide RNA sequencing has enabled us to examine gene expression in single cells in different tissues. With the new tools, the identification of new cellular heterogeneity, novel marker genes, unique subpopulations, and spatial locations of each single cell can be acquired successfully. Epigenetic modifications for each single cell can also be obtained via similar methods. Based on single cell genome sequencing, single cell epigenetic information including histone modifications, DNA methylation, and chromatin accessibility has been explored and has provided valuable insights regarding gene regulation and disease prognosis. In this article, we review the development of strategies to obtain single cell transcriptional and epigenetic data. Furthermore, we discuss ways in which single cell studies may help to provide greater understanding of the mechanisms of basic cardiovascular biology that will eventually lead to improvement in our ability to diagnose disease and develop new therapies.
Optimization of Swine Breeding Programs Using Genomic Selection with ZPLAN+

PubMed Central

Lopez, B. M.; Kang, H. S.; Kim, T. H.; Viterbo, V. S.; Kim, H. S.; Na, C. S.; Seo, K. S.

2016-01-01

The objective of this study was to evaluate the present conventional selection program of a swine nucleus farm and compare it with a new selection strategy employing genomic enhanced breeding value (GEBV) as the selection criteria. The ZPLAN+ software was employed to calculate and compare the genetic gain, total cost, return and profit of each selection strategy. The first strategy reflected the current conventional breeding program, which was a progeny test system (CS). The second strategy was a selection scheme based strictly on genomic information (GS1). The third scenario was the same as GS1, but the selection by GEBV was further supplemented by the performance test (GS2). The last scenario was a mixture of genomic information and progeny tests (GS3). The results showed that the accuracy of the selection index of young boars of GS1 was 26% higher than that of CS. On the other hand, both GS2 and GS3 gave 31% higher accuracy than CS for young boars. The annual monetary genetic gain of GS1, GS2 and GS3 was 10%, 12%, and 11% higher, respectively, than that of CS. As expected, the discounted costs of genomic selection strategies were higher than those of CS. The costs of GS1, GS2 and GS3 were 35%, 73%, and 89% higher than those of CS, respectively, assuming a genotyping cost of $120. As a result, the discounted profit per animal of GS1 and GS2 was 8% and 2% higher, respectively, than that of CS while GS3 was 6% lower. Comparison among genomic breeding scenarios revealed that GS1 was more profitable than GS2 and GS3. The genomic selection schemes, especially GS1 and GS2, were clearly superior to the conventional scheme in terms of monetary genetic gain and profit. PMID:26954222
Experimental Strategies for Functional Annotation and Metabolism Discovery: Targeted Screening of Solute Binding Proteins and Unbiased Panning of Metabolomes

DOE PAGES

Vetting, Matthew W.; Al-Obaidi, Nawar; Zhao, Suwen; ...

2014-12-25

The rate at which genome sequencing data is accruing demands enhanced methods for functional annotation and metabolism discovery. Solute binding proteins (SBPs) facilitate the transport of the first reactant in a metabolic pathway, thereby constraining the regions of chemical space and the chemistries that must be considered for pathway reconstruction. In this paper, we describe high-throughput protein production and differential scanning fluorimetry platforms, which enabled the screening of 158 SBPs against a 189-component library specifically tailored for this class of proteins. Like all screening efforts, this approach is limited by the practical constraints imposed by construction of the library, i.e., we can study only those metabolites that are known to exist and which can be made in sufficient quantities for experimentation. To move beyond these inherent limitations, we illustrate the promise of crystallographic- and mass spectrometric-based approaches for the unbiased use of entire metabolomes as screening libraries. Together, our approaches identified 40 new SBP ligands, generated experiment-based annotations for 2084 SBPs in 71 isofunctional clusters, and defined numerous metabolic pathways, including novel catabolic pathways for the utilization of ethanolamine as sole nitrogen source and the use of D-Ala-D-Ala as sole carbon source. These efforts begin to define an integrated strategy for realizing the full value of amassing genome sequence data.
Centromere reference models for human chromosomes X and Y satellite arrays

PubMed Central

Miga, Karen H.; Newton, Yulia; Jain, Miten; Altemose, Nicolas; Willard, Huntington F.; Kent, W. James

2014-01-01

The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes. PMID:24501022
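The idea of short-read mappability within a repeat array can be illustrated by asking which positions start a k-mer that occurs only once in the sequence. This is a toy sketch under strong assumptions (exact matches only, one strand, a single sequence rather than a whole genome); it is not the authors' pipeline, and the function name and example sequence are hypothetical.

```python
from collections import Counter


def unique_kmer_positions(seq, k):
    """Return start positions whose k-mer occurs exactly once in `seq`.

    Assumption: a read of length k starting at such a position could be
    placed unambiguously by exact matching.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return [i for i, kmer in enumerate(kmers) if counts[kmer] == 1]


# Toy sequence: "ACG" occurs twice, so positions 0 and 3 are not unique.
positions = unique_kmer_positions("ACGACGTT", 3)
```

In a near-identical satellite array most positions would fail this test, which is why the rare variant sites that do pass are valuable as markers.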
Coordinated international action to accelerate genome-to-phenome with FAANG, the Functional Annotation of Animal Genomes project.

PubMed

Andersson, Leif; Archibald, Alan L; Bottema, Cynthia D; Brauning, Rudiger; Burgess, Shane C; Burt, Dave W; Casas, Eduardo; Cheng, Hans H; Clarke, Laura; Couldrey, Christine; Dalrymple, Brian P; Elsik, Christine G; Foissac, Sylvain; Giuffra, Elisabetta; Groenen, Martien A; Hayes, Ben J; Huang, LuSheng S; Khatib, Hassan; Kijas, James W; Kim, Heebal; Lunney, Joan K; McCarthy, Fiona M; McEwan, John C; Moore, Stephen; Nanduri, Bindu; Notredame, Cedric; Palti, Yniv; Plastow, Graham S; Reecy, James M; Rohrer, Gary A; Sarropoulou, Elena; Schmidt, Carl J; Silverstein, Jeffrey; Tellam, Ross L; Tixier-Boichard, Michele; Tosser-Klopp, Gwenola; Tuggle, Christopher K; Vilkki, Johanna; White, Stephen N; Zhao, Shuhong; Zhou, Huaijun

2015-03-25

We describe the organization of a nascent international effort, the Functional Annotation of Animal Genomes (FAANG) project, whose aim is to produce comprehensive maps of functional elements in the genomes of domesticated animal species.
Coordinated international action to accelerate genome-to-phenome with FAANG, The Functional Annotation of Animal Genomes project

USDA-ARS's Scientific Manuscript database

We describe the organization of a nascent international effort - the "Functional Annotation of ANimal Genomes" project - whose aim is to produce comprehensive maps of functional elements in the genomes of domesticated animal species....
Cost-effective cloud computing: a case study using the comparative genomics tool, roundup.

PubMed

Kudtarkar, Parul; Deluca, Todd F; Fusaro, Vincent A; Tonellato, Peter J; Wall, Dennis P

2010-12-22

Comparative genomics resources, such as ortholog detection tools and repositories, are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource, Roundup, using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon's Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon's computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure.
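Ordering jobs by estimated runtime before submission can be sketched with a standard longest-processing-time-first heuristic. The abstract does not give the authors' runtime model or ordering rule, so both are assumptions here: the runtime is taken to be proportional to the product of the two genome sizes, and each job goes to the currently least-loaded node. All names and numbers are hypothetical.

```python
import heapq


def estimate_runtime(size_a, size_b, scale=1.0):
    """Assumed runtime model: proportional to the product of genome sizes."""
    return scale * size_a * size_b


def schedule_longest_first(jobs, n_nodes):
    """Assign (name, estimated_runtime) jobs to nodes, longest job first.

    Returns the makespan (load of the busiest node) and a mapping from
    node index to the list of job names assigned to it.
    """
    heap = [(0.0, node) for node in range(n_nodes)]  # (current load, node)
    heapq.heapify(heap)
    assignment = {node: [] for node in range(n_nodes)}
    for name, runtime in sorted(jobs, key=lambda job: job[1], reverse=True):
        load, node = heapq.heappop(heap)  # least-loaded node so far
        assignment[node].append(name)
        heapq.heappush(heap, (load + runtime, node))
    return max(load for load, _ in heap), assignment


# Toy example: five comparisons with estimated runtimes in hours, two nodes.
jobs = [("a", 5), ("b", 4), ("c", 3), ("d", 3), ("e", 1)]
makespan, assignment = schedule_longest_first(jobs, 2)
```

Submitting long jobs first tends to keep nodes evenly loaded near the end of a run, which matters when idle cloud nodes are still billed.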
Multiplex CRISPR/Cas9 system impairs HCMV replication by excising an essential viral gene.

PubMed

Gergen, Janina; Coulon, Flora; Creneguy, Alison; Elain-Duret, Nathan; Gutierrez, Alejandra; Pinkenburg, Olaf; Verhoeyen, Els; Anegon, Ignacio; Nguyen, Tuan Huy; Halary, Franck Albert; Haspot, Fabienne

2018-01-01

Anti-HCMV treatments used in immunosuppressed patients reduce viral replication, but resistant viral strains can emerge. Moreover, these drugs do not target latently infected cells. We designed two anti-viral CRISPR/Cas9 strategies to target the UL122/123 gene, a key regulator of lytic replication and reactivation from latency. The singleplex strategy contains one gRNA to target the start codon. The multiplex strategy contains three gRNAs to excise the complete UL122/123 gene. Primary fibroblasts and U-251 MG cells were transduced with lentiviral vectors encoding Cas9 and one or three gRNAs. Both strategies induced mutations in the target gene and a concomitant reduction of immediate early (IE) protein expression in primary fibroblasts. Further detailed analysis in U-251 MG cells showed that the singleplex strategy induced 50% of indels in the viral genome, leading to a reduction in IE protein expression. The multiplex strategy excised the IE gene in 90% of all viral genomes and thus led to the inhibition of IE protein expression. Consequently, viral genome replication and late protein expression were reduced by 90%. Finally, the production of new viral particles was nearly abrogated. In conclusion, the multiplex anti-UL122/123 CRISPR/Cas9 system can target the viral genome efficiently enough to significantly prevent viral replication.
Seed-effect modeling improves the consistency of genome-wide loss-of-function screens and identifies synthetic lethal vulnerabilities in cancer cells.

PubMed

Jaiswal, Alok; Peddinti, Gopal; Akimov, Yevhen; Wennerberg, Krister; Kuznetsov, Sergey; Tang, Jing; Aittokallio, Tero

2017-06-01

Genome-wide loss-of-function profiling is widely used for systematic identification of genetic dependencies in cancer cells; however, the poor reproducibility of RNA interference (RNAi) screens has been a major concern due to frequent off-target effects. Currently, a detailed understanding of the key factors contributing to the sub-optimal consistency is still lacking, especially on how to improve the reliability of future RNAi screens by controlling for factors that determine their off-target propensity. We performed a systematic, quantitative analysis of the consistency between two genome-wide shRNA screens conducted on a compendium of cancer cell lines, and also compared several gene summarization methods for inferring gene essentiality from shRNA level data. We then devised novel concepts of seed essentiality and shRNA family, based on seed region sequences of shRNAs, to study in-depth the contribution of seed-mediated off-target effects to the consistency of the two screens. We further investigated two seed-sequence properties, seed pairing stability, and target abundance in terms of their capability to minimize the off-target effects in post-screening data analysis. Finally, we applied this novel methodology to identify genetic interactions and synthetic lethal partners of cancer drivers, and confirmed differential essentiality phenotypes by detailed CRISPR/Cas9 experiments. Using the novel concepts of seed essentiality and shRNA family, we demonstrate how genome-wide loss-of-function profiling of a common set of cancer cell lines can actually be made fairly reproducible when considering seed-mediated off-target effects. Importantly, by excluding shRNAs having higher propensity for off-target effects, based on their seed-sequence properties, one can remove noise from the genome-wide shRNA datasets. 
As a translational application case, we demonstrate enhanced reproducibility of genetic interaction partners of common cancer drivers, as well as identify novel synthetic lethal partners of a major oncogenic driver, PIK3CA, supported by a complementary CRISPR/Cas9 experiment. We provide practical guidelines for improved design and analysis of genome-wide loss-of-function profiling and demonstrate how this novel strategy can be applied towards improved mapping of genetic dependencies of cancer cells to aid development of targeted anticancer treatments.
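The notion of an shRNA family sharing a seed sequence can be sketched by grouping guide strands on their seed region and summarizing each group. The abstract does not define the seed boundaries or the summary statistic, so this example assumes the seed is nucleotides 2 to 8 of the guide strand and uses the family mean as a stand-in for "seed essentiality". The sequences and scores are made up.

```python
from collections import defaultdict
from statistics import mean


def seed_of(guide, start=1, end=8):
    """Seed region of a guide strand (assumed: nucleotides 2-8, 1-based)."""
    return guide[start:end].upper()


def seed_family_scores(scores):
    """Group shRNAs by seed and average their depletion scores.

    `scores` maps a guide-strand sequence to its screen score. The mean is
    an assumed summary, not necessarily the authors' definition.
    """
    families = defaultdict(list)
    for guide, score in scores.items():
        families[seed_of(guide)].append(score)
    return {seed: mean(values) for seed, values in families.items()}


# Hypothetical guide strands: the first two share the seed "AGCUUAU".
toy_scores = {
    "UAGCUUAUCAGACUGAUGUUG": -2.0,
    "CAGCUUAUGGGGGGGGGGGGG": -1.0,
    "AAAAAAAACCCCCCCCCCCCC": 0.5,
}
family_scores = seed_family_scores(toy_scores)
```

If shRNAs aimed at different genes but sharing a seed show similar depletion, the effect is more plausibly seed-mediated than on-target, which is the kind of signal such a grouping is meant to expose.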
Separating the wheat from the chaff: systematic identification of functionally relevant noncoding variants in ADHD.

PubMed

Tong, J H S; Hawi, Z; Dark, C; Cummins, T D R; Johnson, B P; Newman, D P; Lau, R; Vance, A; Heussler, H S; Matthews, N; Bellgrove, M A; Pang, K C

2016-11-01

Attention deficit hyperactivity disorder (ADHD) is a highly heritable psychiatric condition with negative lifetime outcomes. Uncovering its genetic architecture should yield important insights into the neurobiology of ADHD and assist development of novel treatment strategies. Twenty years of candidate gene investigations and more recently genome-wide association studies have identified an array of potential association signals. In this context, separating the likely true from false associations ('the wheat' from 'the chaff') will be crucial for uncovering the functional biology of ADHD. Here, we defined a set of 2070 DNA variants that showed evidence of association with ADHD (or were in linkage disequilibrium). More than 97% of these variants were noncoding, and were prioritised for further exploration using two tools, genome-wide annotation of variants (GWAVA) and Combined Annotation-Dependent Depletion (CADD), that were recently developed to rank variants based upon their likely pathogenicity. Capitalising on recent efforts such as the Encyclopaedia of DNA Elements and US National Institutes of Health Roadmap Epigenomics Projects to improve understanding of the noncoding genome, we subsequently identified 65 variants to which we assigned functional annotations, based upon their likely impact on alternative splicing, transcription factor binding and translational regulation. We propose that these 65 variants, which possess not only a high likelihood of pathogenicity but also readily testable functional hypotheses, represent a tractable shortlist for future experimental validation in ADHD. Taken together, this study brings into sharp focus the likely relevance of noncoding variants for the genetic risk associated with ADHD, and more broadly suggests a bioinformatics approach that should be relevant to other psychiatric disorders.
Genome Sequence of the Soil Bacterium Janthinobacterium sp. KBS0711

PubMed Central

Shoemaker, William R.; Muscarella, Mario E.

2015-01-01

We present a draft genome of Janthinobacterium sp. KBS0711 that was isolated from agricultural soil. The genome provides insight into the ecological strategies of this bacterium in free-living and host-associated environments. PMID:26089434
Nannochloropsis plastid and mitochondrial phylogenomes reveal organelle diversification mechanism and intragenus phylotyping strategy in microalgae.

PubMed

Wei, Li; Xin, Yi; Wang, Dongmei; Jing, Xiaoyan; Zhou, Qian; Su, Xiaoquan; Jia, Jing; Ning, Kang; Chen, Feng; Hu, Qiang; Xu, Jian

2013-08-05

Microalgae are promising feedstock for production of lipids, sugars, bioactive compounds and in particular biofuels, yet development of sensitive and reliable phylotyping strategies for microalgae has been hindered by the paucity of phylogenetically closely-related finished genomes. Using the oleaginous eustigmatophyte Nannochloropsis as a model, we assessed current intragenus phylotyping strategies by producing the complete plastid (pt) and mitochondrial (mt) genomes of seven strains from six Nannochloropsis species. Genes on the pt and mt genomes have been highly conserved in content, size and order, strongly negatively selected and evolving at a rate 33% and 66% of nuclear genomes respectively. Pt genome diversification was driven by asymmetric evolution of two inverted repeats (IRa and IRb): psbV and clpC in IRb are highly conserved whereas their counterparts in IRa exhibit three lineage-associated types of structural polymorphism via duplication or disruption of whole or partial genes. In the mt genomes, however, a single evolution hotspot varies in copy-number of a 3.5 Kb-long, cox1-harboring repeat. The organelle markers (e.g., cox1, cox2, psbA, rbcL and rrn16_mt) and nuclear markers (e.g., ITS2 and 18S) that are widely used for phylogenetic analysis obtained a divergent phylogeny for the seven strains, largely due to low SNP density. A new strategy for intragenus phylotyping of microalgae was thus proposed that includes (i) twelve sequence markers that are of higher sensitivity than ITS2 for interspecies phylogenetic analysis, (ii) multi-locus sequence typing based on rps11_mt-nad4, rps3_mt and cox2-rrn16_mt for intraspecies phylogenetic reconstruction and (iii) several SSR loci for identification of strains within a given species. This first comprehensive dataset of organelle genomes for a microalgal genus enabled exhaustive assessment and searches of all candidate phylogenetic markers on the organelle genomes. 
A new strategy for intragenus phylotyping of microalgae was proposed which might be generally applicable to other microalgal genera and should serve as a valuable tool in the expanding algal biotechnology industry.
Genetic resources offer efficient tools for rice functional genomics research.

PubMed

Lo, Shuen-Fang; Fan, Ming-Jen; Hsing, Yue-Ie; Chen, Liang-Jwu; Chen, Shu; Wen, Ien-Chie; Liu, Yi-Lun; Chen, Ku-Ting; Jiang, Mirng-Jier; Lin, Ming-Kuang; Rao, Meng-Yen; Yu, Lin-Chih; Ho, Tuan-Hua David; Yu, Su-May

2016-05-01

Rice is an important crop and major model plant for monocot functional genomics studies. With the establishment of various genetic resources for rice genomics, the next challenge is to systematically assign functions to predicted genes in the rice genome. Compared with the robustness of genome sequencing and bioinformatics techniques, progress in understanding the function of rice genes has lagged, hampering the utilization of rice genes for cereal crop improvement. The use of transfer DNA (T-DNA) insertional mutagenesis offers the advantage of uniform distribution throughout the rice genome, but preferentially in gene-rich regions, resulting in direct gene knockout or activation of genes within 20-30 kb up- and downstream of the T-DNA insertion site and high gene tagging efficiency. Here, we summarize the recent progress in functional genomics using the T-DNA-tagged rice mutant population. We also discuss important features of T-DNA activation- and knockout-tagging and promoter-trapping of the rice genome in relation to mutant and candidate gene characterizations and how to more efficiently utilize rice mutant populations and datasets for high-throughput functional genomics and phenomics studies by forward and reverse genetics approaches. These studies may facilitate the translation of rice functional genomics research to improvements of rice and other cereal crops. © 2015 John Wiley & Sons Ltd.
Identification of mitochondrial carriers in Saccharomyces cerevisiae by transport assay of reconstituted recombinant proteins.

PubMed

Palmieri, Ferdinando; Agrimi, Gennaro; Blanco, Emanuela; Castegna, Alessandra; Di Noia, Maria A; Iacobazzi, Vito; Lasorsa, Francesco M; Marobbio, Carlo M T; Palmieri, Luigi; Scarcia, Pasquale; Todisco, Simona; Vozza, Angelo; Walker, John

2006-01-01

The inner membranes of mitochondria contain a family of carrier proteins that are responsible for the transport in and out of the mitochondrial matrix of substrates, products, co-factors and biosynthetic precursors that are essential for the function and activities of the organelle. This family of proteins is characterized by containing three tandem homologous sequence repeats of approximately 100 amino acids, each folded into two transmembrane alpha-helices linked by an extensive polar loop. Each repeat contains a characteristic conserved sequence. These features have been used to determine the extent of the family in genome sequences. The genome of Saccharomyces cerevisiae contains 34 members of the family. The identity of five of them was known before the determination of the genome sequence, but the functions of the remaining family members were not. This review describes how the functions of 15 of these previously unknown transport proteins have been determined by a strategy that consists of expressing the genes in Escherichia coli or Saccharomyces cerevisiae, reconstituting the gene products into liposomes and establishing their functions by transport assay. Genetic and biochemical evidence as well as phylogenetic considerations have guided the choice of substrates that were tested in the transport assays. The physiological roles of these carriers have been verified by genetic experiments. Various pieces of evidence point to the functions of six additional members of the family, but these proposals await confirmation by transport assay. The sequences of many of the newly identified yeast carriers have been used to characterize orthologs in other species, and in man five diseases are presently known to be caused by defects in specific mitochondrial carrier genes. The roles of eight yeast mitochondrial carriers remain to be established.
Discovery of new enzymes and metabolic pathways by using structure and genome context.

PubMed

Zhao, Suwen; Kumar, Ritesh; Sakai, Ayano; Vetting, Matthew W; Wood, B McKay; Brown, Shoshana; Bonanno, Jeffery B; Hillerich, Brandan S; Seidel, Ronald D; Babbitt, Patricia C; Almo, Steven C; Sweedler, Jonathan V; Gerlt, John A; Cronan, John E; Jacobson, Matthew P

2013-10-31

Assigning valid functions to proteins identified in genome projects is challenging: overprediction and database annotation errors are the principal concerns. We and others are developing computation-guided strategies for functional discovery with 'metabolite docking' to experimentally derived or homology-based three-dimensional structures. Bacterial metabolic pathways often are encoded by 'genome neighbourhoods' (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by 'predicting' the intermediates in the glycolytic pathway in Escherichia coli. Metabolite docking to multiple binding proteins and enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. Here we report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and also the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt concentrations was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guided functional predictions to enable the discovery of new metabolic pathways.
Beyond 'knock-out' mice: new perspectives for the programmed modification of the mammalian genome.

PubMed

Cohen-Tannoudji, M; Babinet, C

1998-10-01

The emergence of gene inactivation by homologous recombination methodology in embryonic stem cells has revolutionized the field of mouse genetics. Indeed, the availability of a rapidly growing number of mouse null mutants has represented an invaluable source of knowledge on mammalian development, cellular biology and physiology and has provided many models for human inherited diseases. In recent years, improvements of the original 'knock-out' strategy, as well as the exploitation of exogenous enzymatic systems that are active in the recombination process, have considerably extended the range of genetic manipulations that can be produced. For example, it is now possible to create a mouse bearing a targeted point mutation as the unique change in its entire genome therefore allowing very fine dissection of gene function in vivo. Chromosome alterations such as large deletions, inversions or translocations can also be designed and will facilitate the global functional analysis of the mouse genome. This will extend the possibilities of creating models of human pathologies that frequently originate from various chromosomal disorders. Finally, the advent of methods allowing conditional gene targeting will open the way for the analysis of the consequence of a particular mutation in a defined organ and at a specific time during the life of a mouse.

SGFSC: speeding the gene functional similarity calculation based on hash tables.

PubMed

Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia

2016-11-04

In recent years, many measures of gene functional similarity have been proposed and widely used in all kinds of essential research. These methods are mainly divided into two categories: pairwise approaches and group-wise approaches. However, a common problem with these methods is their time consumption, especially when measuring the gene functional similarities of a large number of gene pairs. The problem of computational efficiency for pairwise approaches is even more prominent because they are dependent on the combination of semantic similarity. Therefore, the efficient measurement of gene functional similarity remains a challenging problem. To speed up current gene functional similarity calculation methods, a novel two-step computing strategy is proposed: (1) establish a hash table for each method to store essential information obtained from the Gene Ontology (GO) graph and (2) measure gene functional similarity based on the corresponding hash table. With the help of the hash table, there is no need to traverse the GO graph repeatedly for each method. The analysis of time complexity shows that the computational efficiency of these methods is significantly improved. We also implement a novel Speeding Gene Functional Similarity Calculation tool, namely SGFSC, which is bundled with seven typical measures using our proposed strategy. Further experiments show the great advantage of SGFSC in measuring gene functional similarity on the whole genomic scale. The proposed strategy is successful in speeding up current gene functional similarity calculation methods. SGFSC is an efficient tool that is freely available at http://nclab.hit.edu.cn/SGFSC. The source code of SGFSC can be downloaded from http://pan.baidu.com/s/1dFFmvpZ.
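The two-step idea in the SGFSC abstract (precompute a hash table from the GO graph, then answer similarity queries from the table alone) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy term graph, the function names, and the Jaccard-over-ancestors term similarity with best-match averaging. It is not SGFSC's code and does not necessarily correspond to any of its seven bundled measures.

```python
# Hypothetical toy ontology: each term maps to its direct parents.
# Real GO term IDs and relations are NOT used here.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A", "B"],
}


def build_ancestor_table(parents):
    """Step 1: walk the graph once and cache every term's ancestor set
    (the term itself included) in a dict, i.e. a hash table."""
    table = {}

    def visit(term):
        if term not in table:
            ancestors = {term}
            for parent in parents[term]:
                ancestors |= visit(parent)
            table[term] = frozenset(ancestors)
        return table[term]

    for term in parents:
        visit(term)
    return table


def term_similarity(table, t1, t2):
    """Step 2a: Jaccard index of two cached ancestor sets.
    No graph traversal happens at query time."""
    a, b = table[t1], table[t2]
    return len(a & b) / len(a | b)


def gene_similarity(table, terms1, terms2):
    """Step 2b: best-match average over two genes' annotation lists."""
    best1 = [max(term_similarity(table, t, u) for u in terms2) for t in terms1]
    best2 = [max(term_similarity(table, u, t) for t in terms1) for u in terms2]
    return (sum(best1) + sum(best2)) / (len(best1) + len(best2))


ANCESTORS = build_ancestor_table(PARENTS)
```

In this sketch the cost of walking the graph is paid once in `build_ancestor_table`; every later call to `gene_similarity` uses only dictionary lookups and set operations, which is the kind of saving the abstract attributes to the hash table.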
GeNemo: a search engine for web-based functional genomic data.

PubMed

Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

2016-07-08

A set of new data types emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo's searches are based on pattern matching of functional genomic regions. This distinguishes GeNemo from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from hundred bases to hundred thousand bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, the user can visually compare her/his data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
The function of dog models in developing gene therapy strategies for human health.

PubMed

Nowend, Keri L; Starr-Moss, Alison N; Murphy, Keith E

2011-08-01

The domestic dog is of great benefit to humankind, not only through companionship and working activities cultivated through domestication and selective breeding, but also as a model for biomedical research. Many single-gene traits have been well-characterized at the genomic level, and recent advances in whole-genome association studies will allow for better understanding of complex, multigenic hereditary diseases. Additionally, the dog serves as an invaluable large animal model for assessment of novel therapeutic agents. Thus, the dog has filled a crucial step in the translation of basic research to new treatment regimens for various human diseases. Four well-characterized diseases in canine models are discussed as they relate to other animal model availability, novel therapeutic approach, and extrapolation to human gene therapy trials.
Glucocorticoid receptor signaling in health and disease

PubMed Central

Kadmiel, Mahita; Cidlowski, John A.

2013-01-01

Glucocorticoids are steroid hormones regulated in a circadian and stress-associated manner to maintain various metabolic and homeostatic functions that are necessary for life. Synthetic glucocorticoids are widely prescribed drugs for many conditions including asthma, chronic obstructive pulmonary disease (COPD), and inflammatory disorders of the eye. Research in the last few years has begun to unravel the profound complexity of glucocorticoid signaling and has contributed remarkably to improved therapeutic strategies. Glucocorticoids signal through the glucocorticoid receptor, a member of the superfamily of nuclear receptors, in both genomic and non-genomic ways in almost every tissue in the human body. In this review, we will provide an update on glucocorticoid receptor signaling and highlight the role of GR signaling in physiological and pathophysiological conditions in the major organ systems in the human body. PMID:23953592
Gene Editing: Regulatory and Translation to Clinic.

PubMed

Ando, Dale; Meyer, Kathleen

2017-10-01

The clinical application and regulatory strategy of genome editing for ex vivo cell therapy is derived from the intersection of two fields of study: viral vector gene therapy trials; and clinical trials with ex vivo purification and engraftment of CD34+ hematopoietic stem cells, T cells, and tumor cell vaccines. This article covers the regulatory and translational preclinical activities needed for a genome editing clinical trial modifying hematopoietic stem cells and the genesis of this current strategy based on previous clinical trials using genome-edited T cells. The SB-728 zinc finger nuclease platform is discussed because this is the most clinically advanced genome editing technology. Copyright © 2017 Elsevier Inc. All rights reserved.
Generalizing genetical genomics: getting added value from environmental perturbation.

PubMed

Li, Yang; Breitling, Rainer; Jansen, Ritsert C

2008-10-01

Genetical genomics is a useful approach for studying the effect of genetic perturbations on biological systems at the molecular level. However, molecular networks depend on the environmental conditions and, thus, a comprehensive understanding of biological systems requires studying them across multiple environments. We propose a generalization of genetical genomics, which combines genetic and sensibly chosen environmental perturbations, to study the plasticity of molecular networks. This strategy forms a crucial step toward understanding why individuals respond differently to drugs, toxins, pathogens, nutrients and other environmental influences. Here we outline a strategy for selecting and allocating individuals to particular treatments, and we discuss the promises and pitfalls of the generalized genetical genomics approach.
Significance of functional disease-causal/susceptible variants identified by whole-genome analyses for the understanding of human diseases.

PubMed

Hitomi, Yuki; Tokunaga, Katsushi

2017-01-01

Human genome variation may cause differences in traits and disease risks. Disease-causal/susceptible genes and variants for both common and rare diseases can be detected by comprehensive whole-genome analyses, such as whole-genome sequencing (WGS), using next-generation sequencing (NGS) technology and genome-wide association studies (GWAS). Here, in addition to the application of NGS as a whole-genome analysis method, we summarize approaches for the identification of functional disease-causal/susceptible variants from abundant genetic variants in the human genome and methods for evaluating their functional effects in human diseases, using NGS and in silico and in vitro functional analyses. We also discuss the clinical applications of the functional disease-causal/susceptible variants to personalized medicine.
Genomics and molecular breeding in lesser explored pulse crops: current trends and future opportunities.

PubMed

Bohra, Abhishek; Jha, Uday Chand; Kishor, P B Kavi; Pandey, Shailesh; Singh, Narendra P

2014-12-01

Pulses are multipurpose crops for providing income, employment and food security in the underprivileged regions, notably the FAO-defined low-income food-deficit countries. Owing to their intrinsic ability to endure environmental adversities and the least input/management requirements, these crops remain central to subsistence farming. Given their pivotal role in rain-fed agriculture, substantial research has been invested to boost the productivity of these pulse crops. To this end, genomic tools and technologies have appeared as the compelling supplement to the conventional breeding. However, the progress in minor pulse crops including dry beans (Vigna spp.), lupins, lablab, lathyrus and vetches has remained unsatisfactory, hence these crops are often labeled as low profile or lesser researched. Nevertheless, recent scientific and technological breakthroughs particularly the next generation sequencing (NGS) are radically transforming the scenario of genomics and molecular breeding in these minor crops. NGS techniques have allowed de novo assembly of whole genomes in these orphan crops. Moreover, the availability of a reference genome sequence would promote re-sequencing of diverse genotypes to unlock allelic diversity at a genome-wide scale. In parallel, NGS has offered high-resolution genetic maps or more precisely, a robust genetic framework to implement whole-genome strategies for crop improvement. As has already been demonstrated in lupin, sequencing-based genotyping of the representative sample provided access to a number of functionally-relevant markers that could be deployed straight away in crop breeding programs. This article attempts to outline the recent progress made in genomics of these lesser explored pulse crops, and examines the prospects of genomics assisted integrated breeding to enhance and stabilize crop yields. Copyright © 2014 Elsevier Inc. All rights reserved.
Genome-wide annotation of the soybean WRKY family and functional characterization of genes involved in response to Phakopsora pachyrhizi infection.

PubMed

Bencke-Malato, Marta; Cabreira, Caroline; Wiebke-Strohm, Beatriz; Bücker-Neto, Lauro; Mancini, Estefania; Osorio, Marina B; Homrich, Milena S; Turchetto-Zolet, Andreia Carina; De Carvalho, Mayra C C G; Stolf, Renata; Weber, Ricardo L M; Westergaard, Gastón; Castagnaro, Atílio P; Abdelnoor, Ricardo V; Marcelino-Guimarães, Francismar C; Margis-Pinheiro, Márcia; Bodanese-Zanettini, Maria Helena

2014-09-10

Many previous studies have shown that soybean WRKY transcription factors are involved in the plant response to biotic and abiotic stresses. Phakopsora pachyrhizi is the causal agent of Asian Soybean Rust, one of the most important soybean diseases. There is evidence that WRKYs are involved in the resistance of some soybean genotypes against that fungus. The WRKY genes already annotated in the soybean genome underrepresented the family. In the present study, a genome-wide annotation of the soybean WRKY family was carried out and members involved in the response to P. pachyrhizi were identified. As a result of a search of soybean genomic databases, 182 WRKY-encoding genes were annotated and 33 putative pseudogenes identified. Genes involved in the response to P. pachyrhizi infection were identified using superSAGE, RNA-Seq of microdissected lesions and microarray experiments. Seventy-five genes were differentially expressed during fungal infection. The expression of eight WRKY genes was validated by RT-qPCR. The expression of these genes in a resistant genotype was earlier and/or stronger compared with a susceptible genotype in response to P. pachyrhizi infection. Soybean somatic embryos were transformed in order to overexpress or silence WRKY genes. Embryos overexpressing a WRKY gene were obtained, but they were unable to convert into plants. When infected with P. pachyrhizi, the leaves of the silenced transgenic line showed a higher number of lesions than the wild-type plants. The present study reports a genome-wide annotation of the soybean WRKY family. The participation of some members in response to P. pachyrhizi infection was demonstrated. The results contribute to the elucidation of gene function and suggest the manipulation of WRKYs as a strategy to increase fungal resistance in soybean plants.
Methods comparison for microsatellite marker development: Different isolation methods, different yield efficiency

NASA Astrophysics Data System (ADS)

Zhan, Aibin; Bao, Zhenmin; Hu, Xiaoli; Lu, Wei; Hu, Jingjie

2009-06-01

Microsatellite markers have become one of the most important molecular tools used in various fields of research. A large number of microsatellite markers are required for whole-genome surveys in the fields of molecular ecology, quantitative genetics and genomics. Therefore, it is necessary to select versatile, low-cost, efficient and time- and labor-saving methods to develop a large panel of microsatellite markers. In this study, we used Zhikong scallop (Chlamys farreri) as the target species to compare the efficiency of the five methods derived from three strategies for microsatellite marker development. The results showed that the strategy of constructing a small-insert genomic DNA library resulted in poor efficiency, while the microsatellite-enriched strategy greatly improved the isolation efficiency. Although the strategy of mining public databases is time- and cost-saving, it is difficult to obtain a large number of microsatellite markers, mainly due to the limited sequence data of non-model species deposited in public databases. Based on the results of this study, we recommend two methods, the microsatellite-enriched library construction method and the FIASCO-colony hybridization method, for large-scale microsatellite marker development. Both methods were derived from the microsatellite-enriched strategy. The experimental results obtained from Zhikong scallop also provide a reference for microsatellite marker development in other species with large genomes.
Genome Editing in the Cricket, Gryllus bimaculatus.

PubMed

Watanabe, Takahito; Noji, Sumihare; Mito, Taro

2017-01-01

Hemimetabolous, or incompletely metamorphosing, insects are phylogenetically basal and include many beneficial and deleterious species. The cricket, Gryllus bimaculatus, is an emerging model for hemimetabolous insects, based on the success of RNA interference (RNAi)-based gene-functional analyses and transgenic technology. Taking advantage of genome editing technologies in this species would greatly promote functional genomics studies. Genome editing has proven to be an effective method for site-specific genome manipulation in various species. Here, we describe a protocol for genome editing including gene knockout and gene knockin in G. bimaculatus for functional genomics studies.
Engineering and Functional Characterization of Fusion Genes Identifies Novel Oncogenic Drivers of Cancer.

PubMed

Lu, Hengyu; Villafane, Nicole; Dogruluk, Turgut; Grzeskowiak, Caitlin L; Kong, Kathleen; Tsang, Yiu Huen; Zagorodna, Oksana; Pantazi, Angeliki; Yang, Lixing; Neill, Nicholas J; Kim, Young Won; Creighton, Chad J; Verhaak, Roel G; Mills, Gordon B; Park, Peter J; Kucherlapati, Raju; Scott, Kenneth L

2017-07-01

Oncogenic gene fusions drive many human cancers, but tools to more quickly unravel their functional contributions are needed. Here we describe methodology permitting fusion gene construction for functional evaluation. Using this strategy, we engineered the known fusion oncogenes, BCR-ABL1, EML4-ALK, and ETV6-NTRK3, as well as 20 previously uncharacterized fusion genes identified in The Cancer Genome Atlas datasets. In addition to confirming oncogenic activity of the known fusion oncogenes engineered by our construction strategy, we validated five novel fusion genes involving MET, NTRK2, and BRAF kinases that exhibited potent transforming activity and conferred sensitivity to FDA-approved kinase inhibitors. Our fusion construction strategy also enabled domain-function studies of BRAF fusion genes. Our results confirmed other reports that the transforming activity of BRAF fusions results from truncation-mediated loss of inhibitory domains within the N-terminus of the BRAF protein. BRAF mutations residing within this inhibitory region may provide a means for BRAF activation in cancer, therefore we leveraged the modular design of our fusion gene construction methodology to screen N-terminal domain mutations discovered in tumors that are wild-type at the BRAF mutation hotspot, V600. We identified an oncogenic mutation, F247L, whose expression robustly activated the MAPK pathway and sensitized cells to BRAF and MEK inhibitors. When applied broadly, these tools will facilitate rapid fusion gene construction for subsequent functional characterization and translation into personalized treatment strategies. Cancer Res; 77(13); 3502-12. ©2017 American Association for Cancer Research.
Genomics Literacy: Implications for Teaching Students with a Range of Special Needs

ERIC Educational Resources Information Center

Rafter, Mary; Gillies, Robyn M.

2018-01-01

Recent developments in genomic-based knowledge are challenging educators to learn more about the early precursors of various difficulties children experience in learning and how they can use this information to identify preventative strategies or strategies that minimise their effect. The purpose of this article is to provide a brief outline of…
Exploration of Genetic and Genomic Resources for Abiotic and Biotic Stress Tolerance in Pearl Millet

PubMed Central

Shivhare, Radha; Lata, Charu

2017-01-01

Pearl millet is one of the most important small-grained C4 Panicoid crops with a large genome size (∼2352 Mb), short life cycle and outbreeding nature. It is highly resilient to areas with scanty rain and high temperature. Pearl millet is a nutritionally superior staple crop for people inhabiting hot, drought-prone arid and semi-arid regions of South Asia and Africa where it is widely grown and used for food, hay, silage, bird feed, building material, and fuel. Having excellent nutrient composition and exceptional buffering capacity against variable climatic conditions and pathogen attack makes pearl millet a wonderful model crop for stress tolerance studies. Pearl millet germplasm shows a large range of genotypic and phenotypic variations including tolerance to abiotic and biotic stresses. Conventional breeding for enhancing abiotic and biotic stress resistance in pearl millet has met with considerable success; however, in the last few years various novel approaches including functional genomics and molecular breeding have been attempted in this crop for augmenting yield under adverse environmental conditions, and there is still a lot of scope for further improvement using genomic tools. Discovery and use of various DNA-based markers such as EST-SSRs, DArT, CISP, and SSCP-SNP in pearl millet not only help in determining population structure and genetic diversity but also prove to be important for developing strategies for crop improvement at a faster rate and greater precision. Molecular marker-based genetic linkage maps and identification of genomic regions determining yield under abiotic stresses, particularly terminal drought, have paved the way for marker-assisted selection and breeding of pearl millet cultivars. Reference collections and marker-assisted backcrossing have also been used to improve biotic stress resistance in pearl millet, specifically to downy mildew. Whole genome sequencing of the pearl millet genome will give new insights for processing of functional genes and assist in crop improvement programs through molecular breeding approaches. This review thus summarizes the exploration of pearl millet genetic and genomic resources for improving abiotic and biotic stress resistance and development of cultivars superior in stress tolerance. PMID:28167949
Synchronized dynamics of bacterial niche-specific functions during biofilm development in a cold seep brine pool.

PubMed

Zhang, Weipeng; Wang, Yong; Bougouffa, Salim; Tian, Renmao; Cao, Huiluo; Li, Yongxin; Cai, Lin; Wong, Yue Him; Zhang, Gen; Zhou, Guowei; Zhang, Xixiang; Bajic, Vladimir B; Al-Suwailem, Abdulaziz; Qian, Pei-Yuan

2015-10-01

The biology of biofilms in deep-sea environments has barely been explored. Here, biofilms were developed at the brine pool (characterized by limited carbon sources) and the normal bottom water adjacent to Thuwal cold seeps. Comparative metagenomics based on 50 Gb datasets identified polysaccharide degradation, nitrate reduction and proteolysis as enriched functional categories for brine biofilms. The genomes of two dominant species in the brine biofilms, a novel Deltaproteobacterium and a novel Epsilonproteobacterium, were reconstructed. Despite rather small genome sizes, the Deltaproteobacterium possessed enhanced polysaccharide fermentation pathways, whereas the Epsilonproteobacterium was a versatile nitrogen reactor possessing nar, nap and nif gene clusters. These metabolic functions, together with specific regulatory and hypersaline-tolerant genes, made the two bacteria unique compared with their close relatives, including those from hydrothermal vents. Moreover, these functions were regulated by biofilm development, as both the abundance and the expression level of key functional genes were higher in later stage biofilms, and co-occurrences between the two dominant bacteria were demonstrated. Collectively, unique mechanisms were revealed: (i) polysaccharide fermentation and proteolysis interacted with nitrogen cycling to form a complex chain for energy generation, and (ii) remarkably exploiting and organizing niche-specific functions would be an important strategy for biofilm-dependent adaptation to the extreme conditions. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.
In-depth comparative analysis of malaria parasite genomes reveals protein-coding genes linked to human disease in Plasmodium falciparum genome.

PubMed

Liu, Xuewu; Wang, Yuanyuan; Liang, Jiao; Wang, Luojun; Qin, Na; Zhao, Ya; Zhao, Gang

2018-05-02

Plasmodium falciparum is the most virulent malaria parasite capable of parasitizing human erythrocytes. The identification of genes related to this capability can enhance our understanding of the molecular mechanisms underlying human malaria and lead to the development of new therapeutic strategies for malaria control. With the availability of several malaria parasite genome sequences, performing computational analysis is now a practical strategy to identify genes contributing to this disease. Here, we developed and used a virtual genome method to assign 33,314 genes from three human malaria parasites, namely, P. falciparum, P. knowlesi and P. vivax, and three rodent malaria parasites, namely, P. berghei, P. chabaudi and P. yoelii, to 4605 clusters. Each cluster consisted of genes whose protein sequences were significantly similar and was considered as a virtual gene. Comparing the enriched values of all clusters in human malaria parasites with those in rodent malaria parasites revealed 115 P. falciparum genes putatively responsible for parasitizing human erythrocytes. These genes are mainly located in the chromosome internal regions and participate in many biological processes, including membrane protein trafficking and thiamine biosynthesis. Meanwhile, 289 P. berghei genes were included in the rodent parasite-enriched clusters. Most are located in subtelomeric regions and encode erythrocyte surface proteins. Comparing cluster values in P. falciparum with those in P. vivax and P. knowlesi revealed 493 candidate genes linked to virulence. Some of them encode proteins present on the erythrocyte surface and participate in cytoadhesion, virulence factor trafficking, or erythrocyte invasion, but many genes with unknown function were also identified. Cerebral malaria is characterized by accumulation of infected erythrocytes at the trophozoite stage in the brain microvasculature. To discover cerebral malaria-related genes, fast Fourier transformation (FFT) was introduced to extract genes highly transcribed at the trophozoite stage. Finally, 55 candidate genes were identified. Considering that parasite-infected erythrocyte surface protein 2 (PIESP2) contains a gap-junction-related Neuromodulin_N domain and that anti-PIESP2 might provide protection against malaria, we chose PIESP2 for further experimental study. Our analysis revealed a limited number of genes linked to human disease in the P. falciparum genome. These genes could be interesting targets for further functional characterization.
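The abstract does not say how the Fourier transformation was applied, so the following Python sketch shows only one plausible reading: estimate, from an evenly sampled expression time course covering one cycle, the amplitude and peak time of the one-cycle-per-series Fourier component, then keep genes whose peak falls inside a stage window. The function names, the 48-point sampling, the window bounds and the amplitude threshold are all hypothetical, and a single directly computed Fourier component stands in for a full FFT.

```python
import cmath
import math


def first_harmonic(series):
    """Return (amplitude, peak_index) of the Fourier component that
    completes exactly one cycle over the whole series."""
    n = len(series)
    coeff = sum(x * cmath.exp(-2j * math.pi * k / n)
                for k, x in enumerate(series))
    amplitude = 2 * abs(coeff) / n
    # The component behaves like A*cos(2*pi*k/n + phase);
    # it peaks where the cosine argument is zero.
    peak_index = (-cmath.phase(coeff) * n / (2 * math.pi)) % n
    return amplitude, peak_index


def select_stage_genes(profiles, window, min_amplitude):
    """Keep genes whose periodic component is strong enough and peaks
    inside the (hypothetical) stage window, given as sample indices."""
    low, high = window
    selected = []
    for name, series in profiles.items():
        amplitude, peak = first_harmonic(series)
        if amplitude >= min_amplitude and low <= peak <= high:
            selected.append(name)
    return sorted(selected)
```

A constant baseline does not affect the result because it contributes nothing to the first harmonic, so the selection in this sketch depends only on the periodic part of each profile.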
Morphology and genomic hallmarks of breast tumours developed by ATM deleterious variant carriers.

PubMed

Renault, Anne-Laure; Mebirouk, Noura; Fuhrmann, Laetitia; Bataillon, Guillaume; Cavaciuti, Eve; Le Gal, Dorothée; Girard, Elodie; Popova, Tatiana; La Rosa, Philippe; Beauvallet, Juana; Eon-Marchais, Séverine; Dondon, Marie-Gabrielle; d'Enghien, Catherine Dubois; Laugé, Anthony; Chemlali, Walid; Raynal, Virginie; Labbé, Martine; Bièche, Ivan; Baulande, Sylvain; Bay, Jacques-Olivier; Berthet, Pascaline; Caron, Olivier; Buecher, Bruno; Faivre, Laurence; Fresnay, Marc; Gauthier-Villars, Marion; Gesta, Paul; Janin, Nicolas; Lejeune, Sophie; Maugard, Christine; Moutton, Sébastien; Venat-Bouvet, Laurence; Zattara, Hélène; Fricker, Jean-Pierre; Gladieff, Laurence; Coupier, Isabelle; Chenevix-Trench, Georgia; Hall, Janet; Vincent-Salomon, Anne; Stoppa-Lyonnet, Dominique; Andrieu, Nadine; Lesueur, Fabienne

2018-04-17

The ataxia telangiectasia mutated (ATM) gene is a moderate-risk breast cancer susceptibility gene; germline loss-of-function variants are found in up to 3% of hereditary breast and ovarian cancer (HBOC) families who undergo genetic testing. So far, no clear histopathological and molecular features of breast tumours occurring in ATM deleterious variant carriers have been described, but identification of an ATM-associated tumour signature may help in patient management. To characterise hallmarks of ATM-associated tumours, we performed systematic pathology review of tumours from 21 participants from ataxia-telangiectasia families and 18 participants from HBOC families, as well as copy number profiling on a subset of 23 tumours. Morphology of ATM-associated tumours was compared with that of 599 patients with no BRCA1 and BRCA2 mutations from a hospital-based series, as well as with data from The Cancer Genome Atlas. Absolute copy number and loss of heterozygosity (LOH) profiles were obtained from the OncoScan SNP array. In addition, we performed whole-genome sequencing on four tumours from ATM loss-of-function variant carriers with available frozen material. We found that ATM-associated tumours belong mostly to the luminal B subtype, are tetraploid and show LOH at the ATM locus at 11q22-23. Unlike tumours in which BRCA1 or BRCA2 is inactivated, tumours arising in ATM deleterious variant carriers are not associated with increased large-scale genomic instability as measured by the large-scale state transitions signature. Losses at 13q14.11-q14.3, 17p13.2-p12, 21p11.2-p11.1 and 22q11.23 were observed. Somatic alterations at these loci may therefore represent biomarkers for ATM testing and harbour driver mutations in potentially 'druggable' genes that would allow patients to be directed towards tailored therapeutic strategies. Although ATM is involved in the DNA damage response, ATM-associated tumours are distinct from BRCA1-associated tumours in terms of morphological characteristics and genomic alterations, and they are also distinguishable from sporadic breast tumours, thus opening up the possibility to identify ATM variant carriers outside the ataxia-telangiectasia disorder and direct them towards effective cancer risk management and therapeutic strategies.
A Network Biology Approach Identifies Molecular Cross-Talk between Normal Prostate Epithelial and Prostate Carcinoma Cells

PubMed Central

Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antczak, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J.; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco

2016-01-01

The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. 
Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication networks in a wide spectrum of biological systems. PMID:27124473
A Network Biology Approach Identifies Molecular Cross-Talk between Normal Prostate Epithelial and Prostate Carcinoma Cells.

PubMed

Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antczak, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J; Guindani, Michele; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco

2016-04-01

The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. 
Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication networks in a wide spectrum of biological systems.
Application of resequencing to rice genomics, functional genomics and evolutionary analysis

PubMed Central

2014-01-01

Rice is a model system used for crop genomics studies. The completion of the rice genome draft sequences in 2002 not only accelerated functional genome studies, but also initiated a new era of resequencing rice genomes. Based on the reference genome in rice, next-generation sequencing (NGS) using the high-throughput sequencing system can efficiently accomplish whole genome resequencing of various genetic populations and diverse germplasm resources. Resequencing technology has been effectively utilized in evolutionary analysis, rice genomics and functional genomics studies. This technique is beneficial for both bridging the knowledge gap between genotype and phenotype and facilitating molecular breeding via gene design in rice. Here, we also discuss the limitation, application and future prospects of rice resequencing. PMID:25006357

Mutagenesis of diploid mammalian genes by gene entrapment

PubMed Central

Lin, Qing; Donahue, Sarah L.; Moore-Jarrett, Tracy; Cao, Shang; Osipovich, Anna B.; Ruley, H. Earl

2006-01-01

The present study describes a genome-wide method for biallelic mutagenesis in mammalian cells. Novel poly(A) gene trap vectors, which contain features for direct cloning vector–cell fusion transcripts and for post-entrapment genome engineering, were used to generate a library of 979 mutant ES cells. The entrapment mutations generally disrupted gene expression and were readily transmitted through the germline, establishing the library as a resource for constructing mutant mice. Cells homozygous for most entrapment loci could be isolated by selecting for enhanced expression of an inserted neomycin-resistance gene that resulted from losses of heterozygosity (LOH). The frequencies of LOH measured at 37 sites in the genome ranged from 1.3 × 10⁻⁵ to 1.2 × 10⁻⁴ per cell and increased with increasing distance from the centromere, implicating mitotic recombination in the process. The ease and efficiency of obtaining homozygous mutations will (i) facilitate genetic studies of gene function in cultured cells, (ii) permit genome-wide studies of recombination events that result in LOH and mediate a type of chromosomal instability important in carcinogenesis, and (iii) provide new strategies for phenotype-driven mutagenesis screens in mammalian cells. PMID:17062627
Genome Editing by CRISPR/Cas9: a Game Change in the Genetic Manipulation of Protists

PubMed Central

Lander, Noelia; Chiurillo, Miguel A.; Docampo, Roberto

2016-01-01

Genome editing by CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 (CRISPR-associated gene 9) system has been transformative in biology. Originally discovered as an adaptive prokaryotic immune system, CRISPR/Cas9 has been repurposed for genome editing in a broad range of model organisms, from yeast to mammalian cells. Protist parasites are unicellular organisms producing important human diseases that affect millions of people around the world. For many of these diseases, such as malaria, Chagas disease, leishmaniasis and cryptosporidiosis, there are no effective treatments or vaccines available. The recent adaptation of the CRISPR/Cas9 technology to several protist models will be playing a key role in the functional study of their proteins, in the characterization of their metabolic pathways, and in the understanding of their biology, and will facilitate the search for new chemotherapeutic targets. In this work we review recent studies where the CRISPR/Cas9 system was adapted to protist parasites, particularly to Apicomplexans and trypanosomatids, emphasizing the different molecular strategies used for genome editing of each organism, as well as their advantages. We also discuss the potential usefulness of this technology in the green alga Chlamydomonas reinhardtii. PMID:27315329
Genome-Wide Analysis in Three Fusarium Pathogens Identifies Rapidly Evolving Chromosomes and Genes Associated with Pathogenicity

PubMed Central

Sperschneider, Jana; Gardiner, Donald M.; Thatcher, Louise F.; Lyons, Rebecca; Singh, Karam B.; Manners, John M.; Taylor, Jennifer M.

2015-01-01

Pathogens and hosts are in an ongoing arms race and genes involved in host–pathogen interactions are likely to undergo diversifying selection. Fusarium plant pathogens have evolved diverse infection strategies, but how they interact with their hosts in the biotrophic infection stage remains puzzling. To address this, we analyzed the genomes of three Fusarium plant pathogens for genes that are under diversifying selection. We found a two-speed genome structure both on the chromosome and gene group level. Diversifying selection acts strongly on the dispensable chromosomes in Fusarium oxysporum f. sp. lycopersici and on distinct core chromosome regions in Fusarium graminearum, all of which have associations with virulence. Members of two gene groups evolve rapidly, namely those that encode proteins with an N-terminal [SG]-P-C-[KR]-P sequence motif and proteins that are conserved predominantly in pathogens. Specifically, 29 F. graminearum genes are rapidly evolving, in planta induced and encode secreted proteins, strongly pointing toward effector function. In summary, diversifying selection in Fusarium is strongly reflected as genomic footprints and can be used to predict a small gene set likely to be involved in host–pathogen interactions for experimental verification. PMID:25994930
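The N-terminal [SG]-P-C-[KR]-P motif mentioned in this abstract is written in PROSITE-style notation, which maps directly onto a regular expression. The following is a minimal sketch of such a scan, not the authors' pipeline: the 60-residue window and the toy sequences are illustrative assumptions that do not come from the study.

```python
import re

# PROSITE-style [SG]-P-C-[KR]-P rewritten as a regular expression.
MOTIF = re.compile(r"[SG]PC[KR]P")


def has_nterminal_motif(protein_seq, window=60):
    """Return True if the motif occurs within the first `window` residues.

    The default window size is an illustrative assumption, not a value
    taken from the study.
    """
    return MOTIF.search(protein_seq[:window]) is not None


# Invented toy sequences, for illustration only.
examples = {
    "toy_with_motif": "MKFSAVLLAASPCKPTTAAQ",
    "toy_without_motif": "MKFSAVLLAASPCAPTTAAQ",
}
for name, seq in examples.items():
    print(name, has_nterminal_motif(seq))
```

Restricting the search to a prefix of the sequence is one simple way to encode "N-terminal"; a real analysis would probably also account for signal peptides.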
Identification of proteins likely to be involved in morphogenesis, cell division, and signal transduction in Planctomycetes by comparative genomics.

PubMed

Jogler, Christian; Waldmann, Jost; Huang, Xiaoluo; Jogler, Mareike; Glöckner, Frank Oliver; Mascher, Thorsten; Kolter, Roberto

2012-12-01

Members of the Planctomycetes clade share many unusual features for bacteria. Their cytoplasm contains membrane-bound compartments, they lack peptidoglycan and FtsZ, they divide by polar budding, and they are capable of endocytosis. Planctomycete genomes have remained enigmatic, generally being quite large (up to 9 Mb), and on average, 55% of their predicted proteins are of unknown function. Importantly, proteins related to the unusual traits of Planctomycetes remain largely unknown. Thus, we embarked on bioinformatic analyses of these genomes in an effort to predict proteins that are likely to be involved in compartmentalization, cell division, and signal transduction. We used three complementary strategies. First, we defined the Planctomycetes core genome and subtracted genes of well-studied model organisms. Second, we analyzed the gene content and synteny of morphogenesis and cell division genes and combined both methods using a "guilt-by-association" approach. Third, we identified signal transduction systems as well as sigma factors. These analyses provide a manageable list of candidate genes for future genetic studies and provide evidence for complex signaling in the Planctomycetes akin to that observed for bacteria with complex life-styles, such as Myxococcus xanthus.
A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data.

PubMed

Lu, Qiongshi; Hu, Yiming; Sun, Jiehuan; Cheng, Yuwei; Cheung, Kei-Hoi; Zhao, Hongyu

2015-05-27

Identifying functional regions in the human genome is a major goal in human genetics. Great efforts have been made to functionally annotate the human genome either through computational predictions, such as genomic conservation, or high-throughput experiments, such as the ENCODE project. These efforts have resulted in a rich collection of functional annotation data of diverse types that need to be jointly analyzed for integrated interpretation and annotation. Here we present GenoCanyon, a whole-genome annotation method that performs unsupervised statistical learning using 22 computational and experimental annotations thereby inferring the functional potential of each position in the human genome. With GenoCanyon, we are able to predict many of the known functional regions. The ability of predicting functional regions as well as its generalizable statistical framework makes GenoCanyon a unique and powerful tool for whole-genome annotation. The GenoCanyon web server is available at http://genocanyon.med.yale.edu.
Absence of Complex I Is Associated with Diminished Respiratory Chain Function in European Mistletoe.

PubMed

Maclean, Andrew E; Hertle, Alexander P; Ligas, Joanna; Bock, Ralph; Balk, Janneke; Meyer, Etienne H

2018-05-21

Parasitism is a life history strategy found across all domains of life whereby nutrition is obtained from a host. It is often associated with reductive evolution of the genome, including loss of genes from the organellar genomes [1, 2]. In some unicellular parasites, the mitochondrial genome (mitogenome) has been lost entirely, with far-reaching consequences for the physiology of the organism [3, 4]. Recently, mitogenome sequences of several species of the hemiparasitic plant mistletoe (Viscum sp.) have been reported [5, 6], revealing a striking loss of genes not seen in any other multicellular eukaryotes. In particular, the nad genes encoding subunits of respiratory complex I are all absent and other protein-coding genes are also lost or highly diverged in sequence, raising the question what remains of the respiratory complexes and mitochondrial functions. Here we show that oxidative phosphorylation (OXPHOS) in European mistletoe, Viscum album, is highly diminished. Complex I activity and protein subunits of complex I could not be detected. The levels of complex IV and ATP synthase were at least 5-fold lower than in the non-parasitic model plant Arabidopsis thaliana, whereas alternative dehydrogenases and oxidases were higher in abundance. Carbon flux analysis indicates that cytosolic reactions including glycolysis are greater contributors to ATP synthesis than the mitochondrial tricarboxylic acid (TCA) cycle. Our results describe the extreme adjustments in mitochondrial functions of the first reported multicellular eukaryote without complex I. Copyright © 2018 Elsevier Ltd. All rights reserved.
CTCF and Cohesin in Genome Folding and Transcriptional Gene Regulation.

PubMed

Merkenschlager, Matthias; Nora, Elphège P

2016-08-31

Genome function, replication, integrity, and propagation rely on the dynamic structural organization of chromosomes during the cell cycle. Genome folding in interphase provides regulatory segmentation for appropriate transcriptional control, facilitates ordered genome replication, and contributes to genome integrity by limiting illegitimate recombination. Here, we review recent high-resolution chromosome conformation capture and functional studies that have informed models of the spatial and regulatory compartmentalization of mammalian genomes, and discuss mechanistic models for how CTCF and cohesin control the functional architecture of mammalian chromosomes.
The changing landscape of gene editing in hematopoietic stem cells: a step towards Cas9 clinical translation.

PubMed

Dever, Daniel P; Porteus, Matthew H

2017-11-01

Since the discovery two decades ago that programmable endonucleases can be engineered to modify human cells at single nucleotide resolution, the concept of genome editing was born. Now these technologies are being applied to therapeutically relevant cell types, including hematopoietic stem cells (HSC), which possess the power to repopulate an entire blood and immune system. The purpose of this review is to discuss the changing landscape of genome editing in hematopoietic stem cells (GE-HSC) from the discovery stage to the preclinical stage, with the imminent goal of clinical translation for the treatment of serious genetic diseases of the blood and immune system. With the discovery in 2012 that the RNA-programmable (sgRNA) clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 nuclease (Cas9/sgRNA) systems can be easily used to precisely modify the human genome, a genome-editing revolution of hematopoietic stem cells (HSC) has bloomed. We have observed that over the last 2 years, academic institutions and small biotech companies are developing HSC-based Cas9/sgRNA genome-editing curative strategies to treat monogenic disorders, including β-hemoglobinopathies and primary immunodeficiencies. We will focus on recent publications (within the past 2 years) that employ different genome-editing strategies to 'hijack' the cell's endogenous double-strand repair pathways to confer a disease-specific therapeutic advantage. The number of genome-editing strategies in HSCs that could offer therapeutic potential for diseases of the blood and immune system has risen dramatically over the past 2 years. The HSC-based genome-editing field is primed to enter clinical trials in the subsequent years. We will summarize the major advancements for the development of novel autologous GE-HSC cell and gene therapy strategies for hematopoietic diseases that are candidates for curative allogeneic bone marrow transplantation.
BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

PubMed

Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

2015-08-18

Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .
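The idea of generating an extended annotation by combining individual ones can be illustrated with a small sketch. This is not BEACON's algorithm or interface: it assumes the annotations have already been matched to shared gene identifiers, and the names `UNKNOWN`, `is_informative` and `extend_annotation`, as well as the toy gene IDs, are hypothetical.

```python
# Labels treated as "no function assigned" (an illustrative assumption).
UNKNOWN = {"", "hypothetical protein"}


def is_informative(function):
    """True if the label is something other than an unknown-function label."""
    return function.strip().lower() not in UNKNOWN


def extend_annotation(annotations):
    """Merge gene -> function dicts, given in priority order.

    A gene keeps the first informative label seen; an uninformative
    label is replaced as soon as a later method supplies a function.
    """
    extended = {}
    for annotation in annotations:
        for gene, function in annotation.items():
            if gene not in extended or (
                not is_informative(extended[gene]) and is_informative(function)
            ):
                extended[gene] = function
    return extended


# Toy annotations from two hypothetical annotation methods.
am1 = {"g1": "DNA polymerase III", "g2": "hypothetical protein"}
am2 = {"g2": "ABC transporter", "g3": "hypothetical protein"}

merged = extend_annotation([am1, am2])
annotated = sum(is_informative(f) for f in merged.values())
print(merged, annotated)
```

Counting informative labels before and after merging gives the kind of "genes annotated with putative functions" comparison the abstract reports.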
Subgenome-anchored physical frameworks of the allotetraploid Upland cotton (Gossypium hirsutum L.) genome, and an approach toward reference-grade assemblies of polyploids

USDA-ARS's Scientific Manuscript database

Like many agricultural crops, the cultivated cotton genome is large and polyploid (~2.5Gb), consisting of two very similar repeat-rich subgenomes, whose size and complexity pose significant challenges for accurate genome reconstruction using whole-genome shotgun approaches. A strategy for accurately...
A High-Resolution InDel (Insertion–Deletion) Markers-Anchored Consensus Genetic Map Identifies Major QTLs Governing Pod Number and Seed Yield in Chickpea

PubMed Central

Srivastava, Rishi; Singh, Mohar; Bajaj, Deepak; Parida, Swarup K.

2016-01-01

Development and large-scale genotyping of user-friendly informative genome/gene-derived InDel markers in natural and mapping populations is vital for accelerating genomics-assisted breeding applications of chickpea with minimal resource expenses. The present investigation employed a high-throughput whole genome next-generation resequencing strategy in low and high pod number parental accessions and homozygous individuals constituting the bulks from each of two inter-specific mapping populations [(Pusa 1103 × ILWC 46) and (Pusa 256 × ILWC 46)] to develop non-erroneous InDel markers at a genome-wide scale. Comparing these high-quality genomic sequences, 82,360 InDel markers with reference to kabuli genome and 13,891 InDel markers exhibiting differentiation between low and high pod number parental accessions and bulks of aforementioned mapping populations were developed. These informative markers were structurally and functionally annotated in diverse coding and non-coding sequence components of genome/genes of kabuli chickpea. The functional significance of regulatory and coding (frameshift and large-effect mutations) InDel markers for establishing marker-trait linkages through association/genetic mapping was apparent. The markers detected a greater amplification (97%) and intra-specific polymorphic potential (58–87%) among a diverse panel of cultivated desi, kabuli, and wild accessions even by using a simpler cost-efficient agarose gel-based assay implicating their utility in large-scale genetic analysis especially in domesticated chickpea with narrow genetic base. Two high-density inter-specific genetic linkage maps generated using aforesaid mapping populations were integrated to construct a consensus 1479 InDel markers-anchored high-resolution (inter-marker distance: 0.66 cM) genetic map for efficient molecular mapping of major QTLs governing pod number and seed yield per plant in chickpea. 
Utilizing these high-density genetic maps as anchors, three major genomic regions each harboring robust QTLs for pod number and seed yield (15–28% phenotypic variation explained) were identified on chromosomes 2, 4, and 6. The integration of genetic and physical maps at these QTLs scaled down the long major QTL intervals into short, high-resolution physical intervals (0.89–2.94 Mb) for the robust pod number and seed yield QTLs, which were validated in multiple genetic backgrounds of two chickpea mapping populations. The genome-wide InDel markers, including natural allelic variants and genomic loci/genes delineated at the six major QTLs, especially at one novel colocalized robust QTL for both pod number and seed yield, mapped on a high-density consensus genetic map, were found most promising in chickpea. These functionally relevant molecular tags can drive marker-assisted genetic enhancement to develop high-yielding cultivars with increased seed/pod number and yield in chickpea. PMID:27695461
Genome-wide screen for modulation of hepatic apolipoprotein A-I (ApoA-I) secretion.

PubMed

Miles, Rebecca R; Perry, William; Haas, Joseph V; Mosior, Marian K; N'Cho, Mathias; Wang, Jian W J; Yu, Peng; Calley, John; Yue, Yong; Carter, Quincy; Han, Bomie; Foxworthy, Patricia; Kowala, Mark C; Ryan, Timothy P; Solenberg, Patricia J; Michael, Laura F

2013-03-01

Control of plasma cholesterol levels is a major therapeutic strategy for management of coronary artery disease (CAD). Although reducing LDL cholesterol (LDL-c) levels decreases morbidity and mortality, this therapeutic intervention only translates into a 25-40% reduction in cardiovascular events. Epidemiological studies have shown that a high LDL-c level is not the only risk factor for CAD; low HDL cholesterol (HDL-c) is an independent risk factor for CAD. Apolipoprotein A-I (ApoA-I) is the major protein component of HDL-c that mediates reverse cholesterol transport from tissues to the liver for excretion. Therefore, increasing ApoA-I levels is an attractive strategy for HDL-c elevation. Using genome-wide siRNA screening, targets that regulate hepatocyte ApoA-I secretion were identified through transfection of 21,789 siRNAs into hepatocytes whereby cell supernatants were assayed for ApoA-I. Approximately 800 genes were identified and triaged using a convergence of information, including genetic associations with HDL-c levels, tissue-specific gene expression, druggability assessments, and pathway analysis. Fifty-nine genes were selected for reconfirmation; 40 genes were confirmed. Here we describe the siRNA screening strategy, assay implementation and validation, data triaging, and example genes of interest. The genes of interest include known and novel genes encoding secreted enzymes, proteases, G-protein-coupled receptors, metabolic enzymes, ion transporters, and proteins of unknown function. Repression of farnesyltransferase (FNTA) by siRNA and the enzyme inhibitor manumycin A caused elevation of ApoA-I secretion from hepatocytes and from transgenic mice expressing hApoA-I and cholesterol ester transfer protein transgenes. In total, this work underscores the power of functional genetic assessment to identify new therapeutic targets.
A Perfect Match Genomic Landscape Provides a Unified Framework for the Precise Detection of Variation in Natural and Synthetic Haploid Genomes

PubMed Central

Palacios-Flores, Kim; García-Sotelo, Jair; Castillo, Alejandra; Uribe, Carina; Aguilar, Luis; Morales, Lucía; Gómez-Romero, Laura; Reyes, José; Garciarubio, Alejandro; Boege, Margareta; Dávila, Guillermo

2018-01-01

We present a conceptually simple, sensitive, precise, and essentially nonstatistical solution for the analysis of genome variation in haploid organisms. The generation of a Perfect Match Genomic Landscape (PMGL), which computes intergenome identity with single nucleotide resolution, reveals signatures of variation wherever a query genome differs from a reference genome. Such signatures encode the precise location of different types of variants, including single nucleotide variants, deletions, insertions, and amplifications, effectively introducing the concept of a general signature of variation. The precise nature of variants is then resolved through the generation of targeted alignments between specific sets of sequence reads and known regions of the reference genome. Thus, the perfect match logic decouples the identification of the location of variants from the characterization of their nature, providing a unified framework for the detection of genome variation. We assessed the performance of the PMGL strategy via simulation experiments. We determined the variation profiles of natural genomes and of a synthetic chromosome, both in the context of haploid yeast strains. Our approach uncovered variants that have previously escaped detection. Moreover, our strategy is ideally suited for further refining high-quality reference genomes. The source codes for the automated PMGL pipeline have been deposited in a public repository. PMID:29367403
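The notion of a landscape that drops wherever the query differs from the reference can be illustrated with exact k-mer matching. This is a deliberately simplified sketch and not the published PMGL pipeline: for every reference position it counts how many read k-mers match the reference k-mer starting there. The function names, the value of `k` and the toy sequences are assumptions made for illustration.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def perfect_match_landscape(reference, reads, k):
    """For each reference position, count the reads' exact k-mer matches."""
    counts = kmer_counts(reads, k)
    return [counts[reference[i:i + k]] for i in range(len(reference) - k + 1)]


# Toy example: the single "read" differs from the reference at one base
# (index 7, G -> C).
reference = "ACGTACGGTCATTGA"
reads = ["ACGTACGCTCATTGA"]
print(perfect_match_landscape(reference, reads, k=4))
```

In this toy case the counts fall to zero for exactly the k reference k-mers that overlap the substituted base, so the position of the variant can be read off the landscape before its nature is resolved by alignment.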
Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks

PubMed Central

Zhao, Yongan; Carey, Knox; Lloyd, David; Sofia, Heidi; Baker, Dixie; Flicek, Paul; Shringarpure, Suyash; Bustamante, Carlos; Wang, Shuang; Jiang, Xiaoqian; Ohno-Machado, Lucila; Tang, Haixu; Wang, XiaoFeng; Hubaux, Jean-Pierre

2018-01-01

The Global Alliance for Genomics and Health (GA4GH) created the Beacon Project as a means of testing the willingness of data holders to share genetic data in the simplest technical context—a query for the presence of a specified nucleotide at a given position within a chromosome. Each participating site (or “beacon”) is responsible for assuring that genomic data are exposed through the Beacon service only with the permission of the individual to whom the data pertains and in accordance with the GA4GH policy and standards. While recognizing the inference risks associated with large-scale data aggregation, and the fact that some beacons contain sensitive phenotypic associations that increase privacy risk, the GA4GH adjudged the risk of re-identification based on the binary yes/no allele-presence query responses as acceptable. However, recent work demonstrated that, given a beacon with specific characteristics (including relatively small sample size and an adversary who possesses an individual’s whole genome sequence), the individual’s membership in a beacon can be inferred through repeated queries for variants present in the individual’s genome. In this paper, we propose three practical strategies for reducing re-identification risks in beacons. The first two strategies manipulate the beacon such that the presence of rare alleles is obscured; the third strategy budgets the number of accesses per user for each individual genome. Using a beacon containing data from the 1000 Genomes Project, we demonstrate that the proposed strategies can effectively reduce re-identification risk in beacon-like datasets. PMID:28339683
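The third strategy, budgeting accesses per user for each individual genome, can be sketched as a toy beacon. This is an illustrative assumption about how such a budget might be enforced, not the scheme evaluated in the paper and not the GA4GH Beacon API: the class `BudgetedBeacon`, its fields and the rule that an exhausted individual is silently hidden from that user are all hypothetical.

```python
class BudgetedBeacon:
    """Toy beacon that answers allele-presence queries under a budget."""

    def __init__(self, genomes, budget):
        # genomes: individual id -> set of (chromosome, position, allele)
        self.genomes = genomes
        self.budget = budget
        self.remaining = {}  # (user, individual) -> answers still allowed

    def query(self, user, variant):
        """Return True if any individual still visible to `user` carries it."""
        answer = False
        for individual, variants in self.genomes.items():
            if variant not in variants:
                continue
            left = self.remaining.get((user, individual), self.budget)
            if left <= 0:
                continue  # budget spent: this genome no longer contributes
            self.remaining[(user, individual)] = left - 1
            answer = True
        return answer


# Toy data: two individuals sharing one variant.
genomes = {
    "ind1": {("1", 100, "A"), ("1", 200, "T")},
    "ind2": {("1", 100, "A")},
}
beacon = BudgetedBeacon(genomes, budget=1)
print(beacon.query("alice", ("1", 100, "A")))
print(beacon.query("alice", ("1", 200, "T")))
```

Only positive contributions are charged against the budget here, on the assumption that queries an individual does not match reveal little about that individual; a real deployment would need a more careful accounting.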
Multi-omics reveal the lifestyle of the acidophilic, mineral-oxidizing model species Leptospirillum ferriphilum T.

PubMed

Christel, Stephan; Herold, Malte; Bellenberg, Sören; El Hajjami, Mohamed; Buetti-Dinh, Antoine; Pivkin, Igor V; Sand, Wolfgang; Wilmes, Paul; Poetsch, Ansgar; Dopson, Mark

2017-11-17

Leptospirillum ferriphilum plays a major role in acidic, metal rich environments where it represents one of the most prevalent iron oxidizers. These milieus include acid rock and mine drainage as well as biomining operations. Despite its perceived importance, no complete genome sequence of this model species' type strain is available, limiting the possibilities to investigate the strategies and adaptations Leptospirillum ferriphilum T applies to survive and compete in its niche. This study presents a complete, circular genome of Leptospirillum ferriphilum T DSM 14647 obtained by PacBio SMRT long read sequencing for use as a high quality reference. Analysis of the functionally annotated genome, mRNA transcripts, and protein concentrations revealed a previously undiscovered nitrogenase cluster for atmospheric nitrogen fixation and elucidated metabolic systems taking part in energy conservation, carbon fixation, pH homeostasis, heavy metal tolerance, oxidative stress response, chemotaxis and motility, quorum sensing, and biofilm formation. Additionally, mRNA transcript counts and protein concentrations were compared between cells grown in continuous culture using ferrous iron as substrate and bioleaching cultures containing chalcopyrite (CuFeS2). Leptospirillum ferriphilum T adaptations to growth on chalcopyrite included a possibly enhanced production of reducing power, reduced carbon dioxide fixation, as well as elevated RNA transcripts and proteins involved in heavy metal resistance, with special emphasis on copper efflux systems. Finally, expression and translation of genes responsible for chemotaxis and motility were enhanced. IMPORTANCE Leptospirillum ferriphilum is one of the most important iron-oxidizers in the context of acidic and metal rich environments during moderately thermophilic biomining. 
A high-quality circular genome of Leptospirillum ferriphilum T coupled with functional omics data provides new insights into its metabolic properties, such as the novel identification of genes for atmospheric nitrogen fixation, and represents an essential step for further accurate proteomic and transcriptomic investigation of this acidophile model species in the future. Additionally, light is shed on Leptospirillum ferriphilum T adaptation strategies to growth on the copper mineral chalcopyrite. This data can be applied to deepen our understanding and optimization of bioleaching and biooxidation, techniques that present sustainable and environmentally friendly alternatives to many traditional methods for metal extraction. Copyright © 2017 Christel et al.
Evolution, language and analogy in functional genomics.

PubMed

Benner, S A; Gaucher, E A

2001-07-01

Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.
Evolution, language and analogy in functional genomics

NASA Technical Reports Server (NTRS)

Benner, S. A.; Gaucher, E. A.

2001-01-01

Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.
Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools.

PubMed

Guizard, Sébastien; Piégu, Benoît; Arensburger, Peter; Guillou, Florian; Bigot, Yves

2016-08-19

The program RepeatMasker and the database Repbase-ISB are part of the most widely used strategy for annotating repeats in animal genomes. They have been used to show that avian genomes have a lower repeat content (8-12%) than the sequenced genomes of many vertebrate species (30-55%). However, the efficiency of such library-based strategies depends on the quality and completeness of the sequences in the database that is used. An alternative to these library-based methods is to identify repeats de novo. De novo methods have existed for at least a decade and may be more powerful than the library-based methods. We have used an annotation strategy involving several complementary de novo tools to determine the repeat content of the model genome galGal4 (1.04 Gbp), including identifying simple sequence repeats (SSRs), tandem repeats and transposable elements (TEs). We annotated over one Gbp of the galGal4 genome and showed that it is composed of approximately 19% SSR and TE repeats. Furthermore, we estimate that the actual genome of the red jungle fowl contains about 31-35% repeats. We find that library-based methods tend to overestimate TE diversity. These results have a major impact on the current understanding of repeat distributions throughout chromosomes in the red jungle fowl. Our results are a proof of concept of the reliability of using de novo tools to annotate repeats in large animal genomes. They have also revealed issues that will need to be resolved in order to develop gold-standard methodologies for annotating repeats in eukaryote genomes.
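As a rough illustration of what de novo (library-free) repeat detection means for the simplest class mentioned above, SSRs, the following minimal sketch scans a sequence for perfect tandem repetitions of short motifs. It is not RepeatMasker, nor any of the tools used in the study. The function names (`find_ssrs`, `repeat_fraction`) and the thresholds (motifs of 1-6 bp, at least 3 copies) are assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, motif, copies) for perfect tandem repeats of short motifs.

    Illustrative only: thresholds are assumptions, and imperfect repeats,
    tandem repeats with longer units, and TEs are not handled.
    """
    seq = seq.upper()
    # The lazy group captures the shortest motif that works; the
    # backreference then requires at least (min_repeats - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


def repeat_fraction(seq, hits):
    """Fraction of the sequence covered by the reported repeats."""
    if not seq:
        return 0.0
    covered = sum(len(motif) * copies for _, motif, copies in hits)
    return covered / len(seq)


if __name__ == "__main__":
    toy = "GGCACACACATTAGTTT"  # made-up sequence, not taken from galGal4
    hits = find_ssrs(toy)
    print(hits)
    print(round(repeat_fraction(toy, hits), 3))
```

No database of known repeats is consulted: the repeats are found from the sequence alone, which is the property that separates de novo approaches from library-based ones. Real de novo tools also handle degenerate copies and dispersed elements, which this sketch does not attempt.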
Navigating the currents of seascape genomics: how spatial analyses can augment population genomic studies

PubMed Central

Crandall, Eric D.; Liggins, Libby; Bongaerts, Pim; Treml, Eric A.

2016-01-01

Population genomic approaches are making rapid inroads in the study of non-model organisms, including marine taxa. To date, these marine studies have predominantly focused on rudimentary metrics describing the spatial and environmental context of their study region (e.g., geographical distance, average sea surface temperature, average salinity). We contend that a more nuanced and considered approach to quantifying seascape dynamics and patterns can strengthen population genomic investigations and help identify spatial, temporal, and environmental factors associated with differing selective regimes or demographic histories. Nevertheless, approaches for quantifying marine landscapes are complicated. Characteristic features of the marine environment, including pelagic living in flowing water (experienced by most marine taxa at some point in their life cycle), require a well-designed spatial-temporal sampling strategy and analysis. Many genetic summary statistics used to describe populations may be inappropriate for marine species with large population sizes, large species ranges, stochastic recruitment, and asymmetrical gene flow. Finally, statistical approaches for testing associations between seascapes and population genomic patterns are still maturing with no single approach able to capture all relevant considerations. None of these issues are completely unique to marine systems and therefore similar issues and solutions will be shared for many organisms regardless of habitat. Here, we outline goals and spatial approaches for landscape genomics with an emphasis on marine systems and review the growing empirical literature on seascape genomics. We review established tools and approaches and highlight promising new strategies to overcome select issues including a strategy to spatially optimize sampling. 
Despite the many challenges, we argue that marine systems may be especially well suited for identifying candidate genomic regions under environmentally mediated selection and that seascape genomic approaches are especially useful for identifying robust locus-by-environment associations. PMID:29491947
Navigating the currents of seascape genomics: how spatial analyses can augment population genomic studies.

PubMed

Riginos, Cynthia; Crandall, Eric D; Liggins, Libby; Bongaerts, Pim; Treml, Eric A

2016-12-01

Population genomic approaches are making rapid inroads in the study of non-model organisms, including marine taxa. To date, these marine studies have predominantly focused on rudimentary metrics describing the spatial and environmental context of their study region (e.g., geographical distance, average sea surface temperature, average salinity). We contend that a more nuanced and considered approach to quantifying seascape dynamics and patterns can strengthen population genomic investigations and help identify spatial, temporal, and environmental factors associated with differing selective regimes or demographic histories. Nevertheless, approaches for quantifying marine landscapes are complicated. Characteristic features of the marine environment, including pelagic living in flowing water (experienced by most marine taxa at some point in their life cycle), require a well-designed spatial-temporal sampling strategy and analysis. Many genetic summary statistics used to describe populations may be inappropriate for marine species with large population sizes, large species ranges, stochastic recruitment, and asymmetrical gene flow. Finally, statistical approaches for testing associations between seascapes and population genomic patterns are still maturing with no single approach able to capture all relevant considerations. None of these issues are completely unique to marine systems and therefore similar issues and solutions will be shared for many organisms regardless of habitat. Here, we outline goals and spatial approaches for landscape genomics with an emphasis on marine systems and review the growing empirical literature on seascape genomics. We review established tools and approaches and highlight promising new strategies to overcome select issues including a strategy to spatially optimize sampling. 
Despite the many challenges, we argue that marine systems may be especially well suited for identifying candidate genomic regions under environmentally mediated selection and that seascape genomic approaches are especially useful for identifying robust locus-by-environment associations.
