uniqueness genome interactions: Topics by Science.gov

Sample records for uniqueness genome interactions

Explaining human uniqueness: genome interactions with environment, behaviour and culture.

PubMed

Varki, Ajit; Geschwind, Daniel H; Eichler, Evan E

2008-10-01

What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, 'anthropogeny' (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any 'genes versus environment' dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture - perhaps relaxing allowable thresholds for large-scale genomic diversity.
Explaining human uniqueness: genome interactions with environment, behaviour and culture

PubMed Central

Varki, Ajit; Geschwind, Daniel H.; Eichler, Evan E.

2009-01-01

What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, ‘anthropogeny’ (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any ‘genes versus environment’ dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture — perhaps relaxing allowable thresholds for large-scale genomic diversity. PMID:18802414
Translating Mendelian and complex inheritance of Alzheimer's disease genes for predicting unique personal genome variants

PubMed Central

Regan, Kelly; Wang, Kanix; Doughty, Emily; Li, Haiquan; Li, Jianrong; Lee, Younghee; Kann, Maricel G

2012-01-01

Objective Although trait-associated genes identified as complex versus single-gene inheritance differ substantially in odds ratio, the authors nonetheless posit that their mechanistic concordance can reveal fundamental properties of the genetic architecture, allowing the automated interpretation of unique polymorphisms within a personal genome. Materials and methods An analytical method, SPADE-gen, spanning three biological scales was developed to demonstrate the mechanistic concordance between Mendelian and complex inheritance of Alzheimer's disease (AD) genes: biological functions (BP), protein interaction modeling, and protein domain implicated in the disease-associated polymorphism. Results Among Gene Ontology (GO) biological processes (BP) enriched at a false detection rate <5% in 15 AD genes of Mendelian inheritance (Online Mendelian Inheritance in Man) and independently in those of complex inheritance (25 host genes of intragenic AD single-nucleotide polymorphisms confirmed in genome-wide association studies), 16 overlapped (empirical p=0.007) and 45 were similar (empirical p<0.009; information theory). SPAN network modeling extended the canonical pathway of AD (KEGG) with 26 new protein interactions (empirical p<0.0001). Discussion The study prioritized new AD-associated biological mechanisms and focused the analysis on previously unreported interactions associated with the biological processes of polymorphisms that affect specific protein domains within characterized AD genes and their direct interactors using (1) concordant GO-BP and (2) domain interactions within STRING protein–protein interactions corresponding to the genomic location of the AD polymorphism (eg, EPHA1, APOE, and CD2AP). Conclusion These results are in line with unique-event polymorphism theory, indicating how disease-associated polymorphisms of Mendelian or complex inheritance relate genetically to those observed as ‘unique personal variants’. They also provide insight for identifying novel targets, for repositioning drugs, and for personal therapeutics. PMID:22319180
Genome and metagenome analyses reveal adaptive evolution of the host and interaction with the gut microbiota in the goose

PubMed Central

Gao, Guangliang; Zhao, Xianzhi; Li, Qin; He, Chuan; Zhao, Wenjing; Liu, Shuyun; Ding, Jinmei; Ye, Weixing; Wang, Jun; Chen, Ye; Wang, Haiwei; Li, Jing; Luo, Yi; Su, Jian; Huang, Yong; Liu, Zuohua; Dai, Ronghua; Shi, Yixiang; Meng, He; Wang, Qigui

2016-01-01

The goose is an economically important waterfowl that exhibits unique characteristics and abilities, such as liver fat deposition and fibre digestion. Here, we report de novo whole-genome assemblies for the goose and swan goose and describe the evolutionary relationships among 7 bird species, including domestic and wild geese, which diverged approximately 3.4~6.3 million years ago (Mya). In contrast to chickens as a proximal species, the expanded and rapidly evolving genes found in the goose genome are mainly involved in metabolism, including energy, amino acid and carbohydrate metabolism. Further integrated analysis of the host genome and gut metagenome indicated that the most widely shared functional enrichment of genes occurs for functions such as glycolysis/gluconeogenesis, starch and sucrose metabolism, propanoate metabolism and the citrate cycle. We speculate that the unique physiological abilities of geese benefit from the adaptive evolution of the host genome and symbiotic interactions with gut microbes. PMID:27608918
Genome-nuclear lamina interactions: from cell populations to single cells.

PubMed

Yáñez-Cuna, J Omar; van Steensel, Bas

2017-04-01

Lamina-associated domains (LADs) are large genomic regions that interact with the nuclear lamina (NL) and help to guide the spatial folding of chromosomes in the interphase nucleus. LADs have been linked to gene repression and other functions. Recent studies have begun to uncover some of the molecular players that drive LAD-NL interactions. A picture emerges in which DNA sequence, chromatin components and nuclear lamina proteins play an important role. Complementary to this, imaging and single-cell genomics approaches have revealed that some LAD-NL interactions are variable from cell to cell, while others are very stable. Understanding LADs can provide a unique perspective into the general process of genome organization. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
phiGENOME: an integrative navigation throughout bacteriophage genomes.

PubMed

Stano, Matej; Klucar, Lubos

2011-11-01

phiGENOME is a web-based genome browser generating dynamic and interactive graphical representation of phage genomes stored in the phiSITE, database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and it was optimised for visualisation of phage genomes with the emphasis on the gene regulatory elements. phiGENOME consists of three components: (i) genome map viewer built using Adobe Flash technology, providing dynamic and interactive graphical display of phage genomes; (ii) sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features on the sequence level and (iii) regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulations. Bringing 542 complete genome sequences accompanied with their rich annotations and references, makes phiGENOME a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.
Plant Microbe Interactions in Post Genomic Era: Perspectives and Applications

PubMed Central

Imam, Jahangir; Singh, Puneet K.; Shukla, Pratyoosh

2016-01-01

Deciphering plant–microbe interactions is a promising aspect to understand the benefits and the pathogenic effect of microbes and crop improvement. The advancement in sequencing technologies and various ‘omics’ tools has impressively accelerated the research in biological sciences in this area. The recent and ongoing developments provide a unique approach to describing these intricate interactions and test hypotheses. In the present review, we discuss the role of plant-pathogen interaction in crop improvement. The plant innate immunity has always been an important aspect of research and leads to some interesting information like the adaptation of unique immune mechanisms of plants against pathogens. The development of new techniques in the post-genomic era has greatly enhanced our understanding of the regulation of plant defense mechanisms against pathogens. The present review also provides an overview of beneficial plant–microbe interactions with special reference to Agrobacterium tumefaciens-plant interactions where plant derived signal molecules and plant immune responses are important in pathogenicity and transformation efficiency. The construction of various genome-scale metabolic models of microorganisms and plants presented a better understanding of all metabolic interactions activated during the interactions. This review also lists the emerging repertoire of phytopathogens and its impact on plant disease resistance. Outline of different aspects of plant-pathogen interactions is presented in this review to bridge the gap between plant microbial ecology and their immune responses. PMID:27725809
From genes to genomes: a new paradigm for studying fungal pathogenesis in Magnaporthe oryzae.

PubMed

Xu, Jin-Rong; Zhao, Xinhua; Dean, Ralph A

2007-01-01

Magnaporthe oryzae is the most destructive fungal pathogen of rice worldwide and because of its amenability to classical and molecular genetic manipulation, availability of a genome sequence, and other resources it has emerged as a leading model system to study host-pathogen interactions. This chapter reviews recent progress toward elucidation of the molecular basis of infection-related morphogenesis, host penetration, invasive growth, and host-pathogen interactions. Related information on genome analysis and genomic studies of plant infection processes is summarized under specific topics where appropriate. Particular emphasis is placed on the role of MAP kinase and cAMP signal transduction pathways and unique features in the genome such as repetitive sequences and expanded gene families. Emerging developments in functional genome analysis through large-scale insertional mutagenesis and gene expression profiling are detailed. The chapter concludes with new prospects in the area of systems biology, such as protein expression profiling, and highlighting remaining crucial information needed to fully appreciate host-pathogen interactions.
Homo sapiens exhibit a distinct pattern of CNV genes regulation: an important role of miRNAs and SNPs in expression plasticity.

PubMed

Dweep, Harsh; Kubikova, Nada; Gretz, Norbert; Voskarides, Konstantinos; Felekkis, Kyriacos

2015-07-16

Gene expression regulation is a complex and highly organized process involving a variety of genomic factors. It is widely accepted that differences in gene expression can contribute to the phenotypic variability between species, and that their interpretation can aid in the understanding of the physiologic variability. CNVs and miRNAs are two major players in the regulation of expression plasticity and may be responsible for the unique phenotypic characteristics observed in different lineages. We have previously demonstrated that a close interaction between these two genomic elements may have contributed to the regulation of gene expression during evolution. This work presents the molecular interactions between CNV and non CNV genes with miRNAs and other genomic elements in eight different species. A comprehensive analysis of these interactions indicates a unique nature of human CNV genes regulation as compared to other species. By using genes with short 3' UTR that abolish the "canonical" miRNA-dependent regulation, as a model, we demonstrate a distinct and tight regulation of human genes that might explain some of the unique features of human physiology. In addition, comparison of gene expression regulation between species indicated that there is a significant difference between humans and mice possibly questioning the effectiveness of the latest as experimental models of human diseases.
Homo sapiens exhibit a distinct pattern of CNV genes regulation: an important role of miRNAs and SNPs in expression plasticity

PubMed Central

Dweep, Harsh; Kubikova, Nada; Gretz, Norbert; Voskarides, Konstantinos; Felekkis, Kyriacos

2015-01-01

Gene expression regulation is a complex and highly organized process involving a variety of genomic factors. It is widely accepted that differences in gene expression can contribute to the phenotypic variability between species, and that their interpretation can aid in the understanding of the physiologic variability. CNVs and miRNAs are two major players in the regulation of expression plasticity and may be responsible for the unique phenotypic characteristics observed in different lineages. We have previously demonstrated that a close interaction between these two genomic elements may have contributed to the regulation of gene expression during evolution. This work presents the molecular interactions between CNV and non CNV genes with miRNAs and other genomic elements in eight different species. A comprehensive analysis of these interactions indicates a unique nature of human CNV genes regulation as compared to other species. By using genes with short 3′ UTR that abolish the “canonical” miRNA-dependent regulation, as a model, we demonstrate a distinct and tight regulation of human genes that might explain some of the unique features of human physiology. In addition, comparison of gene expression regulation between species indicated that there is a significant difference between humans and mice possibly questioning the effectiveness of the latest as experimental models of human diseases. PMID:26178010
CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription.

PubMed

Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T; Wilczynski, Grzegorz M; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

2015-12-17

Spatial genome organization and its effect on transcription remains a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights in the topological mechanism of human variations and diseases. Copyright © 2015 Elsevier Inc. All rights reserved.
The first genome sequence of a metatherian herpesvirus: Macropodid herpesvirus 1.

PubMed

Vaz, Paola K; Mahony, Timothy J; Hartley, Carol A; Fowler, Elizabeth V; Ficorilli, Nino; Lee, Sang W; Gilkerson, James R; Browning, Glenn F; Devlin, Joanne M

2016-01-22

While many placental herpesvirus genomes have been fully sequenced, the complete genome of a marsupial herpesvirus has not been described. Here we present the first genome sequence of a metatherian herpesvirus, Macropodid herpesvirus 1 (MaHV-1). The MaHV-1 viral genome was sequenced using an Illumina MiSeq sequencer, de novo assembly was performed and the genome was annotated. The MaHV-1 genome was 140 kbp in length and clustered phylogenetically with the primate simplexviruses, sharing 67% nucleotide sequence identity with Human herpesviruses 1 and 2. The MaHV-1 genome contained 66 predicted open reading frames (ORFs) homologous to those in other herpesvirus genomes, but lacked homologues of UL3, UL4, UL56 and glycoprotein J. This is the first alphaherpesvirus genome that has been found to lack the UL3 and UL4 homologues. We identified six novel ORFs and confirmed their transcription by RT-PCR. This is the first genome sequence of a herpesvirus that infects metatherians, a taxonomically unique mammalian clade. Members of the Simplexvirus genus are remarkably conserved, so the absence of ORFs otherwise retained in eutherian and avian alphaherpesviruses contributes to our understanding of the Alphaherpesvirinae. Further study of metatherian herpesvirus genetics and pathogenesis provides a unique approach to understanding herpesvirus-mammalian interactions.
The Divided Bacterial Genome: Structure, Function, and Evolution.

PubMed

diCenzo, George C; Finan, Turlough M

2017-09-01

Approximately 10% of bacterial genomes are split between two or more large DNA fragments, a genome architecture referred to as a multipartite genome. This multipartite organization is found in many important organisms, including plant symbionts, such as the nitrogen-fixing rhizobia, and plant, animal, and human pathogens, including the genera Brucella, Vibrio, and Burkholderia. The availability of many complete bacterial genome sequences means that we can now examine on a broad scale the characteristics of the different types of DNA molecules in a genome. Recent work has begun to shed light on the unique properties of each class of replicon, the unique functional role of chromosomal and nonchromosomal DNA molecules, and how the exploitation of novel niches may have driven the evolution of the multipartite genome. The aims of this review are to (i) outline the literature regarding bacterial genomes that are divided into multiple fragments, (ii) provide a meta-analysis of completed bacterial genomes from 1,708 species as a way of reviewing the abundant information present in these genome sequences, and (iii) provide an encompassing model to explain the evolution and function of the multipartite genome structure. This review covers, among other topics, salient genome terminology; mechanisms of multipartite genome formation; the phylogenetic distribution of multipartite genomes; how each part of a genome differs with respect to genomic signatures, genetic variability, and gene functional annotation; how each DNA molecule may interact; as well as the costs and benefits of this genome structure. Copyright © 2017 American Society for Microbiology.
Genomics Portals: integrative web-platform for mining genomics data.

PubMed

Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

2010-01-13

A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
Genomics Portals: integrative web-platform for mining genomics data

PubMed Central

2010-01-01

Background A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909
Predicting human genetic interactions from cancer genome evolution.

PubMed

Lu, Xiaowen; Megchelenbrink, Wout; Notebaart, Richard A; Huynen, Martijn A

2015-01-01

Synthetic Lethal (SL) genetic interactions play a key role in various types of biological research, ranging from understanding genotype-phenotype relationships to identifying drug-targets against cancer. Despite recent advances in empirical measuring SL interactions in human cells, the human genetic interaction map is far from complete. Here, we present a novel approach to predict this map by exploiting patterns in cancer genome evolution. First, we show that empirically determined SL interactions are reflected in various gene presence, absence, and duplication patterns in hundreds of cancer genomes. The most evident pattern that we discovered is that when one member of an SL interaction gene pair is lost, the other gene tends not to be lost, i.e. the absence of co-loss. This observation is in line with expectation, because the loss of an SL interacting pair will be lethal to the cancer cell. SL interactions are also reflected in gene expression profiles, such as an under representation of cases where the genes in an SL pair are both under expressed, and an over representation of cases where one gene of an SL pair is under expressed, while the other one is over expressed. We integrated the various previously unknown cancer genome patterns and the gene expression patterns into a computational model to identify SL pairs. This simple, genome-wide model achieves a high prediction power (AUC = 0.75) for known genetic interactions. It allows us to present for the first time a comprehensive genome-wide list of SL interactions with a high estimated prediction precision, covering up to 591,000 gene pairs. This unique list can potentially be used in various application areas ranging from biotechnology to medical genetics.
The End of a 60-year Riddle: Identification and Genomic Characterization of an Iridovirus, the Causative Agent of White Fat Cell Disease in Zooplankton

PubMed Central

Toenshoff, Elena R.; Fields, Peter D.; Bourgeois, Yann X.; Ebert, Dieter

2018-01-01

The planktonic freshwater crustacean of the genus Daphnia are a model system for biomedical research and, in particular, invertebrate-parasite interactions. Up until now, no virus has been characterized for this system. Here we report the discovery of an iridovirus as the causative agent of White Fat Cell Disease (WFCD) in Daphnia. WFCD is a highly virulent disease of Daphnia that can easily be cultured under laboratory conditions. Although it has been studied from sites across Eurasia for more than 60 years, its causative agent had not been described, nor had an iridovirus been connected to WFCD before now. Here we find that an iridovirus—the Daphnia iridescent virus 1 (DIV-1)—is the causative agent of WFCD. DIV-1 has a genome sequence of about 288 kbp, with 39% G+C content and encodes 367 predicted open reading frames. DIV-1 clusters together with other invertebrate iridoviruses but has by far the largest genome among all sequenced iridoviruses. Comparative genomics reveal that DIV-1 has apparently recently lost a substantial number of unique genes but has also gained genes by horizontal gene transfer from its crustacean host. DIV-1 represents the first invertebrate iridovirus that encodes proteins to purportedly cap RNA, and it contains unique genes for a DnaJ-like protein, a membrane glycoprotein and protein of the immunoglobulin superfamily, which may mediate host–pathogen interactions and pathogenicity. Our findings end a 60-year search for the causative agent of WFCD and add to our knowledge of iridovirus genomics and invertebrate–virus interactions. PMID:29487186
Genome mining of Streptomyces scabrisporus NF3 reveals symbiotic features including genes related to plant interactions.

PubMed

Ceapă, Corina Diana; Vázquez-Hernández, Melissa; Rodríguez-Luna, Stefany Daniela; Cruz Vázquez, Angélica Patricia; Jiménez Suárez, Verónica; Rodríguez-Sanoja, Romina; Alvarez-Buylla, Elena R; Sánchez, Sergio

2018-01-01

Endophytic bacteria are wide-spread and associated with plant physiological benefits, yet their genomes and secondary metabolites remain largely unidentified. In this study, we explored the genome of the endophyte Streptomyces scabrisporus NF3 for discovery of potential novel molecules as well as genes and metabolites involved in host interactions. The complete genomes of seven Streptomyces and three other more distantly related bacteria were used to define the functional landscape of this unique microbe. The S. scabrisporus NF3 genome is larger than the average Streptomyces genome and not structured for an obligate endosymbiotic lifestyle; this and the fact that can grow in R2YE media implies that it could include a soil-living stage. The genome displays an enrichment of genes associated with amino acid production, protein secretion, secondary metabolite and antioxidants production and xenobiotic degradation, indicating that S. scabrisporus NF3 could contribute to the metabolic enrichment of soil microbial communities and of its hosts. Importantly, besides its metabolic advantages, the genome showed evidence for differential functional specificity and diversification of plant interaction molecules, including genes for the production of plant hormones, stress resistance molecules, chitinases, antibiotics and siderophores. Given the diversity of S. scabrisporus mechanisms for host upkeep, we propose that these strategies were necessary for its adaptation to plant hosts and to face changes in environmental conditions.
Genome mining of Streptomyces scabrisporus NF3 reveals symbiotic features including genes related to plant interactions

PubMed Central

Rodríguez-Luna, Stefany Daniela; Cruz Vázquez, Angélica Patricia; Jiménez Suárez, Verónica; Rodríguez-Sanoja, Romina; Alvarez-Buylla, Elena R.; Sánchez, Sergio

2018-01-01

Endophytic bacteria are wide-spread and associated with plant physiological benefits, yet their genomes and secondary metabolites remain largely unidentified. In this study, we explored the genome of the endophyte Streptomyces scabrisporus NF3 for discovery of potential novel molecules as well as genes and metabolites involved in host interactions. The complete genomes of seven Streptomyces and three other more distantly related bacteria were used to define the functional landscape of this unique microbe. The S. scabrisporus NF3 genome is larger than the average Streptomyces genome and not structured for an obligate endosymbiotic lifestyle; this and the fact that can grow in R2YE media implies that it could include a soil-living stage. The genome displays an enrichment of genes associated with amino acid production, protein secretion, secondary metabolite and antioxidants production and xenobiotic degradation, indicating that S. scabrisporus NF3 could contribute to the metabolic enrichment of soil microbial communities and of its hosts. Importantly, besides its metabolic advantages, the genome showed evidence for differential functional specificity and diversification of plant interaction molecules, including genes for the production of plant hormones, stress resistance molecules, chitinases, antibiotics and siderophores. Given the diversity of S. scabrisporus mechanisms for host upkeep, we propose that these strategies were necessary for its adaptation to plant hosts and to face changes in environmental conditions. PMID:29447216
Genomes as geography: using GIS technology to build interactive genome feature maps

PubMed Central

Dolan, Mary E; Holden, Constance C; Beard, M Kate; Bult, Carol J

2006-01-01

Background: Many commonly used genome browsers display sequence annotations and related attributes as horizontal data tracks that can be toggled on and off according to user preferences. Most genome browsers use only simple keyword searches and limit the display of detailed annotations to one chromosomal region of the genome at a time. We have employed concepts, methodologies, and tools that were developed for the display of geographic data to develop a Genome Spatial Information System (GenoSIS) for displaying genomes spatially, and interacting with genome annotations and related attribute data. In contrast to the paradigm of horizontally stacked data tracks used by most genome browsers, GenoSIS uses the concept of registered spatial layers composed of spatial objects for integrated display of diverse data. In addition to basic keyword searches, GenoSIS supports complex queries, including spatial queries, and dynamically generates genome maps. Our adaptation of the geographic information system (GIS) model in a genome context supports spatial representation of genome features at multiple scales with a versatile and expressive query capability beyond that supported by existing genome browsers. Results: We implemented an interactive genome sequence feature map for the mouse genome in GenoSIS, an application that uses ArcGIS, a commercially available GIS software system. The genome features and their attributes are represented as spatial objects and data layers that can be toggled on and off according to user preferences or displayed selectively in response to user queries. GenoSIS supports the generation of custom genome maps in response to complex queries about genome features based on both their attributes and locations. Our example application of GenoSIS to the mouse genome demonstrates the powerful visualization and query capability of mature GIS technology applied in a novel domain. Conclusion: Mapping tools developed specifically for geographic data can be exploited to display, explore and interact with genome data. The approach we describe here is organism independent and is equally useful for linear and circular chromosomes. One of the unique capabilities of GenoSIS compared to existing genome browsers is the capacity to generate genome feature maps dynamically in response to complex attribute and spatial queries. PMID:16984652

MycoCosm, an Integrated Fungal Genomics Resource

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shabalov, Igor; Grigoriev, Igor

2012-03-16

MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.
Unique features of a global human ectoparasite identified through sequencing of the bed bug genome.

PubMed

Benoit, Joshua B; Adelman, Zach N; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C; Szuter, Elise M; Hagan, Richard W; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M; Nelson, David R; Rosendale, Andrew J; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R; Ioannidis, Panagiotis; Waterhouse, Robert M; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J Spencer; Gondhalekar, Ameya D; Scharf, Michael E; Peterson, Brittany F; Raje, Kapil R; Hottel, Benjamin A; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S T; Duncan, Elizabeth J; Murali, Shwetha C; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C; Muzny, Donna M; Wheeler, David; Panfilio, Kristen A; Vargas Jentzsch, Iris M; Vargo, Edward L; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T; Anderson, Michelle A E; Jones, Jeffery W; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D; Attardo, Geoffrey M; Robertson, Hugh M; Zdobnov, Evgeny M; Ribeiro, Jose M C; Gibbs, Richard A; Werren, John H; Palli, Subba R; Schal, Coby; Richards, Stephen

2016-02-02

The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host-symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human-bed bug and symbiont-bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite.
Unique features of a global human ectoparasite identified through sequencing of the bed bug genome

PubMed Central

Benoit, Joshua B.; Adelman, Zach N.; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C.; Szuter, Elise M.; Hagan, Richard W.; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M.; Nelson, David R.; Rosendale, Andrew J.; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M.; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R.; Ioannidis, Panagiotis; Waterhouse, Robert M.; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J. Spencer; Gondhalekar, Ameya D.; Scharf, Michael E.; Peterson, Brittany F.; Raje, Kapil R.; Hottel, Benjamin A.; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S. T.; Duncan, Elizabeth J.; Murali, Shwetha C.; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L.; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C.; Muzny, Donna M.; Wheeler, David; Panfilio, Kristen A.; Vargas Jentzsch, Iris M.; Vargo, Edward L.; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T.; Anderson, Michelle A. E.; Jones, Jeffery W.; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D.; Attardo, Geoffrey M.; Robertson, Hugh M.; Zdobnov, Evgeny M.; Ribeiro, Jose M. C.; Gibbs, Richard A.; Werren, John H.; Palli, Subba R.; Schal, Coby; Richards, Stephen

2016-01-01

The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host–symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human–bed bug and symbiont–bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite. PMID:26836814
Nothing in Evolution Makes Sense Except in the Light of Genomics: Read-Write Genome Evolution as an Active Biological Process.

PubMed

Shapiro, James A

2016-06-08

The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess "Read-Write Genomes" they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification.
Genome sequencing and comparative genomics of honey bee microsporidia, Nosema apis reveal novel insights into host-parasite interactions.

PubMed

Chen, Yan ping; Pettis, Jeffery S; Zhao, Yan; Liu, Xinyue; Tallon, Luke J; Sadzewicz, Lisa D; Li, Renhua; Zheng, Huoqing; Huang, Shaokang; Zhang, Xuan; Hamilton, Michele C; Pernal, Stephen F; Melathopoulos, Andony P; Yan, Xianghe; Evans, Jay D

2013-07-05

The microsporidia parasite Nosema contributes to the steep global decline of honey bees that are critical pollinators of food crops. There are two species of Nosema that have been found to infect honey bees, Nosema apis and N. ceranae. Genome sequencing of N. apis and comparative genome analysis with N. ceranae, a fully sequenced microsporidia species, reveal novel insights into host-parasite interactions underlying the parasite infections. We applied the whole-genome shotgun sequencing approach to sequence and assemble the genome of N. apis which has an estimated size of 8.5 Mbp. We predicted 2,771 protein-coding genes and predicted the function of each putative protein using the Gene Ontology. The comparative genomic analysis led to identification of 1,356 orthologs that are conserved between the two Nosema species and genes that are unique characteristics of the individual species, thereby providing a list of virulence factors and new genetic tools for studying host-parasite interactions. We also identified a highly abundant motif in the upstream promoter regions of N. apis genes. This motif is also conserved in N. ceranae and other microsporidia species and likely plays a role in gene regulation across the microsporidia. The availability of the N. apis genome sequence is a significant addition to the rapidly expanding body of microsporidian genomic data which has been improving our understanding of eukaryotic genome diversity and evolution in a broad sense. The predicted virulent genes and transcriptional regulatory elements are potential targets for innovative therapeutics to break down the life cycle of the parasite.
TumorMap: Exploring the Molecular Similarities of Cancer Samples in an Interactive Portal.

PubMed

Newton, Yulia; Novak, Adam M; Swatloski, Teresa; McColl, Duncan C; Chopra, Sahil; Graim, Kiley; Weinstein, Alana S; Baertsch, Robert; Salama, Sofie R; Ellrott, Kyle; Chopra, Manu; Goldstein, Theodore C; Haussler, David; Morozova, Olena; Stuart, Joshua M

2017-11-01

Vast amounts of molecular data are being collected on tumor samples, which provide unique opportunities for discovering trends within and between cancer subtypes. Such cross-cancer analyses require computational methods that enable intuitive and interactive browsing of thousands of samples based on their molecular similarity. We created a portal called TumorMap to assist in exploration and statistical interrogation of high-dimensional complex "omics" data in an interactive and easily interpretable way. In the TumorMap, samples are arranged on a hexagonal grid based on their similarity to one another in the original genomic space and are rendered with Google's Map technology. While the important feature of this public portal is the ability for the users to build maps from their own data, we pre-built genomic maps from several previously published projects. We demonstrate the utility of this portal by presenting results obtained from The Cancer Genome Atlas project data. Cancer Res; 77(21); e111-4. ©2017 AACR.
TumorMap: Exploring the Molecular Similarities of Cancer Samples in an Interactive Portal

PubMed Central

Newton, Yulia; Novak, Adam M.; Swatloski, Teresa; McColl, Duncan C.; Chopra, Sahil; Graim, Kiley; Weinstein, Alana S.; Baertsch, Robert; Salama, Sofie R.; Ellrott, Kyle; Chopra, Manu; Goldstein, Theodore C.; Haussler, David; Morozova, Olena; Stuart, Joshua M.

2017-01-01

Vast amounts of molecular data are being collected on tumor samples, which provide unique opportunities for discovering trends within and between cancer subtypes. Such cross-cancer analyses require computational methods that enable intuitive and interactive browsing of thousands of samples based on their molecular similarity. We created a portal called TumorMap to assist in exploration and statistical interrogation of high-dimensional complex “omics” data in an interactive and easily interpretable way. In the TumorMap, samples are arranged on a hexagonal grid based on their similarity to one another in the original genomic space and are rendered with Google’s Map technology. While the important feature of this public portal is the ability for the users to build maps from their own data, we pre-built genomic maps from several previously published projects. We demonstrate the utility of this portal by presenting results obtained from The Cancer Genome Atlas project data. PMID:29092953
Unique transposon landscapes are pervasive across Drosophila melanogaster genomes

PubMed Central

Rahman, Reazur; Chirn, Gung-wei; Kanodia, Abhay; Sytnikova, Yuliya A.; Brembs, Björn; Bergman, Casey M.; Lau, Nelson C.

2015-01-01

To understand how transposon landscapes (TLs) vary across animal genomes, we describe a new method called the Transposon Insertion and Depletion AnaLyzer (TIDAL) and a database of >300 TLs in Drosophila melanogaster (TIDAL-Fly). Our analysis reveals pervasive TL diversity across cell lines and fly strains, even for identically named sub-strains from different laboratories such as the ISO1 strain used for the reference genome sequence. On average, >500 novel insertions exist in every lab strain, inbred strains of the Drosophila Genetic Reference Panel (DGRP), and fly isolates in the Drosophila Genome Nexus (DGN). A minority (<25%) of transposon families comprise the majority (>70%) of TL diversity across fly strains. A sharp contrast between insertion and depletion patterns indicates that many transposons are unique to the ISO1 reference genome sequence. Although TL diversity from fly strains reaches asymptotic limits with increasing sequencing depth, rampant TL diversity causes unsaturated detection of TLs in pools of flies. Finally, we show novel transposon insertions negatively correlate with Piwi-interacting RNA (piRNA) levels for most transposon families, except for the highly-abundant roo retrotransposon. Our study provides a useful resource for Drosophila geneticists to understand how transposons create extensive genomic diversity in fly cell lines and strains. PMID:26578579
Nothing in Evolution Makes Sense Except in the Light of Genomics: Read–Write Genome Evolution as an Active Biological Process

PubMed Central

Shapiro, James A.

2016-01-01

The 21st century genomics-based analysis of evolutionary variation reveals a number of novel features impossible to predict when Dobzhansky and other evolutionary biologists formulated the neo-Darwinian Modern Synthesis in the middle of the last century. These include three distinct realms of cell evolution; symbiogenetic fusions forming eukaryotic cells with multiple genome compartments; horizontal organelle, virus and DNA transfers; functional organization of proteins as systems of interacting domains subject to rapid evolution by exon shuffling and exonization; distributed genome networks integrated by mobile repetitive regulatory signals; and regulation of multicellular development by non-coding lncRNAs containing repetitive sequence components. Rather than single gene traits, all phenotypes involve coordinated activity by multiple interacting cell molecules. Genomes contain abundant and functional repetitive components in addition to the unique coding sequences envisaged in the early days of molecular biology. Combinatorial coding, plus the biochemical abilities cells possess to rearrange DNA molecules, constitute a powerful toolbox for adaptive genome rewriting. That is, cells possess “Read–Write Genomes” they alter by numerous biochemical processes capable of rapidly restructuring cellular DNA molecules. Rather than viewing genome evolution as a series of accidental modifications, we can now study it as a complex biological process of active self-modification. PMID:27338490
Transcriptome of interstitial cells of Cajal reveals unique and selective gene signatures

PubMed Central

Park, Paul J.; Fuchs, Robert; Wei, Lai; Jorgensen, Brian G.; Redelman, Doug; Ward, Sean M.; Sanders, Kenton M.

2017-01-01

Transcriptome-scale data can reveal essential clues into understanding the underlying molecular mechanisms behind specific cellular functions and biological processes. Transcriptomics is a continually growing field of research utilized in biomarker discovery. The transcriptomic profile of interstitial cells of Cajal (ICC), which serve as slow-wave electrical pacemakers for gastrointestinal (GI) smooth muscle, has yet to be uncovered. Using copGFP-labeled ICC mice and flow cytometry, we isolated ICC populations from the murine small intestine and colon and obtained their transcriptomes. In analyzing the transcriptome, we identified a unique set of ICC-restricted markers including transcription factors, epigenetic enzymes/regulators, growth factors, receptors, protein kinases/phosphatases, and ion channels/transporters. This analysis provides new and unique insights into the cellular and biological functions of ICC in GI physiology. Additionally, we constructed an interactive ICC genome browser (http://med.unr.edu/physio/transcriptome) based on the UCSC genome database. To our knowledge, this is the first online resource that provides a comprehensive library of all known genetic transcripts expressed in primary ICC. Our genome browser offers a new perspective into the alternative expression of genes in ICC and provides a valuable reference for future functional studies. PMID:28426719
Fundulus as the Premier Teleost Model in Environmental Biology: Opportunities for New Insights Using Genomics

EPA Science Inventory

A strong foundation of basic and applied research documents that the estuarine fish Fundulus heteroclitus and related species are unique laboratory and field models for understanding how individuals and populations interact with their environment. In this paper we summarize an ex...
Comparative genomics of Paracoccus sp. SM22M-07 isolated from coral mucus: insights into bacteria-host interactions.

PubMed

Carlos, Camila; Pereira, Letícia Bianca; Ottoboni, Laura Maria Mariscal

2017-06-01

One of the main goals of coral microbiology is to understand the ways in which coral-bacteria associations are established and maintained. This work describes the sequencing of the genome of Paracoccus sp. SM22M-07 isolated from the mucus of the endemic Brazilian coral species Mussismilia hispida. Comparative analysis was used to identify unique genomic features of SM22M-07 that might be involved in its adaptation to the marine ecosystem and the nutrient-rich environment provided by coral mucus, as well as in the establishment and strengthening of the interaction with the host. These features included genes related to the type IV protein secretion system, erythritol catabolism, and succinoglycan biosynthesis. We experimentally confirmed the production of succinoglycan by Paracoccus sp. SM22M-07 and we hypothesize that it may be involved in the association of the bacterium with coral surfaces.
Investigating the beneficial traits of Trichoderma hamatum GD12 for sustainable agriculture—insights from genomics

PubMed Central

Studholme, David J.; Harris, Beverley; Le Cocq, Kate; Winsbury, Rebecca; Perera, Venura; Ryder, Lauren; Ward, Jane L.; Beale, Michael H.; Thornton, Chris R.; Grant, Murray

2013-01-01

Trichoderma hamatum strain GD12 is unique in that it can promote plant growth, activate biocontrol against pre- and post-emergence soil pathogens and can induce systemic resistance to foliar pathogens. This study extends previous work in lettuce to demonstrate that GD12 can confer beneficial agronomic traits to other plants, providing examples of plant growth promotion in the model dicot, Arabidopsis thaliana and induced foliar resistance to Magnaporthe oryzae in the model monocot rice. We further characterize the lettuce-T. hamatum interaction to show that bran extracts from GD12 and an N-acetyl-β-D-glucosaminidase-deficient mutant differentially promote growth in a concentration dependent manner, and these differences correlate with differences in the small molecule secretome. We show that GD12 mycoparasitises a range of isolates of the pre-emergence soil pathogen Sclerotinia sclerotiorum and that this interaction induces a further increase in plant growth promotion above that conferred by GD12. To understand the genetic potential encoded by T. hamatum GD12 and to facilitate its use as a model beneficial organism to study plant growth promotion, induced systemic resistance and mycoparasitism we present de novo genome sequence data. We compare GD12 with other published Trichoderma genomes and show that T. hamatum GD12 contains unique genomic regions with the potential to encode novel bioactive metabolites that may contribute to GD12's agrochemically important traits. PMID:23908658
CMS: A Web-Based System for Visualization and Analysis of Genome-Wide Methylation Data of Human Cancers

PubMed Central

Huang, Yi-Wen; Roa, Juan C.; Goodfellow, Paul J.; Kizer, E. Lynette; Huang, Tim H. M.; Chen, Yidong

2013-01-01

Background: DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, it can now efficiently determine genome-wide methylation profiles in cancer cells. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Methodology/Principal Findings: Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimen) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. Conclusions/Significance: CMS provides visualization and analytic functions for cancer methylome datasets. A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/. PMID:23630576
CMS: a web-based system for visualization and analysis of genome-wide methylation data of human cancers.

PubMed

Gu, Fei; Doderer, Mark S; Huang, Yi-Wen; Roa, Juan C; Goodfellow, Paul J; Kizer, E Lynette; Huang, Tim H M; Chen, Yidong

2013-01-01

DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, it can now efficiently determine genome-wide methylation profiles in cancer cells. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimen) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. CMS provides visualization and analytic functions for cancer methylome datasets. A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/.
Complete genome sequence and integrated protein localization and interaction map for alfalfa dwarf virus, which combines properties of both cytoplasmic and nuclear plant rhabdoviruses

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bejerman, Nicolás; Giolitti, Fabián

Summary: We have determined the full-length 14,491-nucleotide genome sequence of a new plant rhabdovirus, alfalfa dwarf virus (ADV). Seven open reading frames (ORFs) were identified in the antigenomic orientation of the negative-sense, single-stranded viral RNA, in the order 3′-N-P-P3-M-G-P6-L-5′. The ORFs are separated by conserved intergenic regions and the genome coding region is flanked by complementary 3′ leader and 5′ trailer sequences. Phylogenetic analysis of the nucleoprotein amino acid sequence indicated that this alfalfa-infecting rhabdovirus is related to viruses in the genus Cytorhabdovirus. When transiently expressed as GFP fusions in Nicotiana benthamiana leaves, most ADV proteins accumulated in the cell periphery, but unexpectedly P protein was localized exclusively in the nucleus. ADV P protein was shown by bimolecular fluorescence complementation to have homotypic nuclear interactions and heterotypic nuclear interactions with N, P3 and M proteins. ADV appears unique in that it combines properties of both cytoplasmic and nuclear plant rhabdoviruses. Highlights: • The complete genome of alfalfa dwarf virus is obtained. • An integrated localization and interaction map for ADV is determined. • ADV has a genome sequence similarity and evolutionary links with cytorhabdoviruses. • ADV protein localization and interaction data show an association with the nucleus. • ADV combines properties of both cytoplasmic and nuclear plant rhabdoviruses.
Smooth Muscle Cell Genome Browser: Enabling the Identification of Novel Serum Response Factor Target Genes

PubMed Central

Lee, Moon Young; Park, Chanjae; Berent, Robyn M.; Park, Paul J.; Fuchs, Robert; Syn, Hannah; Chin, Albert; Townsend, Jared; Benson, Craig C.; Redelman, Doug; Shen, Tsai-wei; Park, Jong Kun; Miano, Joseph M.; Sanders, Kenton M.; Ro, Seungil

2015-01-01

Genome-scale expression data on the absolute numbers of gene isoforms offers essential clues in cellular functions and biological processes. Smooth muscle cells (SMCs) perform a unique contractile function through expression of specific genes controlled by serum response factor (SRF), a transcription factor that binds to DNA sites known as the CArG boxes. To identify SRF-regulated genes specifically expressed in SMCs, we isolated SMC populations from mouse small intestine and colon, obtained their transcriptomes, and constructed an interactive SMC genome and CArGome browser. To our knowledge, this is the first online resource that provides a comprehensive library of all genetic transcripts expressed in primary SMCs. The browser also serves as the first genome-wide map of SRF binding sites. The browser analysis revealed novel SMC-specific transcriptional variants and SRF target genes, which provided new and unique insights into the cellular and biological functions of the cells in gastrointestinal (GI) physiology. The SRF target genes in SMCs, which were discovered in silico, were confirmed by proteomic analysis of SMC-specific Srf knockout mice. Our genome browser offers a new perspective into the alternative expression of genes in the context of SRF binding sites in SMCs and provides a valuable reference for future functional studies. PMID:26241044
Proteomic Dissection of the Mitochondrial DNA Metabolism Apparatus in Arabidopsis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sally A. Mackenzie

2004-01-06

This study involves the investigation of nuclear genetic components that regulate mitochondrial genome behavior in higher plants. The approach utilizes the advanced plant model system of Arabidopsis thaliana to identify and functionally characterize multiple components of the mitochondrial DNA replication, recombination and mismatch repair system and their interaction partners. The rationale for the research stems from the central importance of mitochondria to overall cellular metabolism and the essential nature of the mitochondrial genome to mitochondrial function. Relatively little is understood about mitochondrial DNA maintenance and transmission in higher eukaryotes, and the higher plant mitochondrial genome displays unique properties and behavior. This investigation has revealed at least three important properties of plant mitochondrial DNA metabolism components. (1) Many are dual targeted to mitochondria and chloroplasts by novel mechanisms, suggesting that the mitochondria and chloroplast share their genome maintenance apparatus. (2) The MSH1 gene, originating as a component of mismatch repair, has evolved uniquely in plants to participate in differential replication of the mitochondrial genome. (3) This mitochondrial differential replication process, termed substoichiometric shifting and also involving a RecA-related gene, appears to represent an adaptive mechanism to expand plant reproductive capacity and is likely present throughout the plant kingdom.
A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers. | Office of Cancer Genomics

Cancer.gov

We analyzed molecular data on 2,579 tumors from The Cancer Genome Atlas (TCGA) of four gynecological types plus breast. Our aims were to identify shared and unique molecular features, clinically significant subtypes, and potential therapeutic targets. We found 61 somatic copy-number alterations (SCNAs) and 46 significantly mutated genes (SMGs). Eleven SCNAs and 11 SMGs had not been identified in previous TCGA studies of the individual tumor types. We found functionally significant estrogen receptor-regulated long non-coding RNAs (lncRNAs) and gene/lncRNA interaction networks.
Functional interactions of archaea, bacteria and viruses in a hypersaline endolithic community.

PubMed

Crits-Christoph, Alexander; Gelsinger, Diego R; Ma, Bing; Wierzchos, Jacek; Ravel, Jacques; Davila, Alfonso; Casero, M Cristina; DiRuggiero, Jocelyne

2016-06-01

Halite endoliths in the Atacama Desert represent one of the most extreme ecosystems on Earth. Cultivation-independent methods were used to examine the functional adaptations of the microbial consortia inhabiting halite nodules. The community was dominated by haloarchaea and functional analysis attributed most of the autotrophic CO2 fixation to one unique cyanobacterium. The assembled 1.1 Mbp genome of a novel nanohaloarchaeon, Candidatus Nanopetramus SG9, revealed a photoheterotrophic life style and a low median isoelectric point (pI) for all predicted proteins, suggesting a 'salt-in' strategy for osmotic balance. Predicted proteins of the algae identified in the community also had pI distributions similar to 'salt-in' strategists. The Nanopetramus genome contained a unique CRISPR/Cas system with a spacer that matched a partial viral genome from the metagenome. A combination of reference-independent methods identified over 30 complete or near complete viral or proviral genomes with diverse genome structure, genome size, gene content and hosts. Putative hosts included Halobacteriaceae, Nanohaloarchaea and Cyanobacteria. Despite the dependence of the halite community on deliquescence for liquid water availability, this study exposed an ecosystem spanning three phylogenetic domains, containing a large diversity of viruses and predominance of a 'salt-in' strategy to balance the high osmotic pressure of the environment. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.

Aurora kinase A interacts with H-Ras and potentiates Ras-MAPK signaling | Office of Cancer Genomics

Cancer.gov

In cancer, upregulated Ras promotes cellular transformation and proliferation in part through activation of oncogenic Ras-MAPK signaling. While directly inhibiting Ras has proven challenging, new insights into Ras regulation through protein-protein interactions may offer unique opportunities for therapeutic intervention. Here we report the identification and validation of Aurora kinase A (Aurora A) as a novel Ras binding protein. We demonstrate that the kinase domain of Aurora A mediates the interaction with the N-terminal domain of H-Ras.
Fungal Genomics Program

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor

The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.
The genome of the amoeba symbiont "Candidatus Amoebophilus asiaticus" encodes an afp-like prophage possibly used for protein secretion.

PubMed

Penz, Thomas; Horn, Matthias; Schmitz-Esser, Stephan

2010-01-01

The recently sequenced genome of the obligate intracellular amoeba symbiont 'Candidatus Amoebophilus asiaticus' is unique among prokaryotic genomes due to its extremely large fraction of genes encoding proteins harboring eukaryotic domains such as ankyrin-repeats, TPR/SEL1 repeats, leucine-rich repeats, as well as F- and U-box domains, most of which likely serve in the interaction with the amoeba host. Here we provide evidence for the presence of additional proteins which are presumably presented extracellularly and should thus also be important for host cell interaction. Surprisingly, we did not find homologues of any of the well-known protein secretion systems required to translocate effector proteins into the host cell in the A. asiaticus genome, and the type VI secretion system seems to be incomplete. Here we describe the presence of a putative prophage in the A. asiaticus genome, which shows similarity to the antifeeding prophage from the insect pathogen Serratia entomophila. In S. entomophila this system is used to deliver toxins into insect hosts. This putative antifeeding-like prophage might thus represent the missing protein secretion apparatus in A. asiaticus.
Identifying Genotype-by-Environment Interactions in the Metabolism of Germinating Arabidopsis Seeds Using Generalized Genetical Genomics

PubMed Central

Joosen, Ronny Viktor Louis; Arends, Danny; Li, Yang; Willems, Leo A.J.; Keurentjes, Joost J.B.; Ligterink, Wilco; Jansen, Ritsert C.; Hilhorst, Henk W.M.

2013-01-01

A complex phenotype such as seed germination is the result of several genetic and environmental cues and requires the concerted action of many genes. The use of well-structured recombinant inbred lines in combination with “omics” analysis can help to disentangle the genetic basis of such quantitative traits. This so-called genetical genomics approach can effectively capture both genetic and epistatic interactions. However, to understand how the environment interacts with genomic-encoded information, a better understanding of the perception and processing of environmental signals is needed. In a classical genetical genomics setup, this requires replication of the whole experiment in different environmental conditions. A novel generalized setup overcomes this limitation and includes environmental perturbation within a single experimental design. We developed a dedicated quantitative trait loci mapping procedure to implement this approach and used existing phenotypical data to demonstrate its power. In addition, we studied the genetic regulation of primary metabolism in dry and imbibed Arabidopsis (Arabidopsis thaliana) seeds. In the metabolome, many changes were observed that were under both environmental and genetic controls and their interaction. This concept offers unique reduction of experimental load with minimal compromise of statistical power and is of great potential in the field of systems genetics, which requires a broad understanding of both plasticity and dynamic regulation. PMID:23606598
The Genome of the Obligately Intracellular Bacterium Ehrlichia canis Reveals Themes of Complex Membrane Structure and Immune Evasion Strategies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mavromatis, K; Doyle, C Kuyler; Lykidis, A

2006-01-01

Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of noncoding sequence (27%). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences and a unique serine-threonine bias associated with the potential for O glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly(G-C) tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Genes associated with pathogen-host interactions were identified, including a small group encoding proteins (n = 12) with tandem repeats and another group encoding proteins with eukaryote-like ankyrin domains (n = 7).
The genome of obligately intracellular Ehrlichia canis reveals themes of complex membrane structure and immune evasion strategies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mavromatis, K.; Kuyler Doyle, C.; Lykidis, A.

2005-09-01

Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, and 17 putative pseudogenes, and a substantial proportion of non-coding sequence (27 percent). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences, and a unique serine-threonine bias associated with the potential for O-glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly G:C tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Proteins associated with pathogen-host interactions were identified including a small group of proteins (12) with tandem repeats and another with eukaryotic-like ankyrin domains (7).
Evolution of insect proteomes: insights into synapse organization and synaptic vesicle life cycle

PubMed Central

Yanay, Chava; Morpurgo, Noa; Linial, Michal

2008-01-01

Background: The molecular components in synapses that are essential to the life cycle of synaptic vesicles are well characterized. Nonetheless, many aspects of synaptic processes, in particular how they relate to complex behaviour, remain elusive. The genomes of flies, mosquitoes, the honeybee and the beetle are now fully sequenced and span an evolutionary breadth of about 350 million years; this provides a unique opportunity to conduct a comparative genomics study of the synapse. Results: We compiled a list of 120 gene prototypes that comprise the core of presynaptic structures in insects. Insects lack several scaffolding proteins in the active zone, such as bassoon and piccolo, and the most abundant protein in the mammalian synaptic vesicle, namely synaptophysin. The pattern of evolution of synaptic protein complexes is analyzed. According to this analysis, the components of presynaptic complexes as well as proteins that take part in organelle biogenesis are tightly coordinated. Most synaptic proteins are involved in rich protein interaction networks. Overall, the number of interacting proteins and the degrees of sequence conservation between human and insects are closely correlated. Such a correlation holds for exocytotic but not for endocytotic proteins. Conclusion: This comparative study of human with insects sheds light on the composition and assembly of protein complexes in the synapse. Specifically, the nature of the protein interaction graphs differentiates exocytotic from endocytotic proteins and suggests unique evolutionary constraints for each set. General principles in the design of proteins of the presynaptic site can be inferred from a comparative study of human and insect genomes. PMID:18257909
Comparative Genomics of Plant-Associated Pseudomonas spp.: Insights into Diversity and Inheritance of Traits Involved in Multitrophic Interactions

PubMed Central

Loper, Joyce E.; Hassan, Karl A.; Mavrodi, Dmitri V.; Davis, Edward W.; Lim, Chee Kent; Shaffer, Brenda T.; Elbourne, Liam D. H.; Stockwell, Virginia O.; Hartney, Sierra L.; Breakwell, Katy; Henkels, Marcella D.; Tetu, Sasha G.; Rangel, Lorena I.; Kidarsa, Teresa A.; Wilson, Neil L.; van de Mortel, Judith E.; Song, Chunxu; Blumhagen, Rachel; Radune, Diana; Hostetler, Jessica B.; Brinkac, Lauren M.; Durkin, A. Scott; Kluepfel, Daniel A.; Wechter, W. Patrick; Anderson, Anne J.; Kim, Young Cheol; Pierson, Leland S.; Pierson, Elizabeth A.; Lindow, Steven E.; Kobayashi, Donald Y.; Raaijmakers, Jos M.; Weller, David M.; Thomashow, Linda S.; Allen, Andrew E.; Paulsen, Ian T.

2012-01-01

We provide here a comparative genome analysis of ten strains within the Pseudomonas fluorescens group including seven new genomic sequences. These strains exhibit a diverse spectrum of traits involved in biological control and other multitrophic interactions with plants, microbes, and insects. Multilocus sequence analysis placed the strains in three sub-clades, which was reinforced by high levels of synteny, size of core genomes, and relatedness of orthologous genes between strains within a sub-clade. The heterogeneity of the P. fluorescens group was reflected in the large size of its pan-genome, which makes up approximately 54% of the pan-genome of the genus as a whole, and a core genome representing only 45–52% of the genome of any individual strain. We discovered genes for traits that were not known previously in the strains, including genes for the biosynthesis of the siderophores achromobactin and pseudomonine and the antibiotic 2-hexyl-5-propyl-alkylresorcinol; novel bacteriocins; type II, III, and VI secretion systems; and insect toxins. Certain gene clusters, such as those for two type III secretion systems, are present only in specific sub-clades, suggesting vertical inheritance. Almost all of the genes associated with multitrophic interactions map to genomic regions present in only a subset of the strains or unique to a specific strain. To explore the evolutionary origin of these genes, we mapped their distributions relative to the locations of mobile genetic elements and repetitive extragenic palindromic (REP) elements in each genome. The mobile genetic elements and many strain-specific genes fall into regions devoid of REP elements (i.e., REP deserts) and regions displaying atypical tri-nucleotide composition, possibly indicating relatively recent acquisition of these loci. Collectively, the results of this study highlight the enormous heterogeneity of the P. fluorescens group and the importance of the variable genome in tailoring individual strains to their specific lifestyles and functional repertoire. PMID:22792073
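The pan-genome, core genome and strain-specific gene sets mentioned in the abstract above can be illustrated with plain set operations. The following is a minimal sketch only, not the authors' pipeline: it assumes each strain has already been reduced to a set of ortholog-group identifiers, and the strain names, gene labels and function names are all hypothetical.

```python
from typing import Dict, Set, Tuple

Genomes = Dict[str, Set[str]]  # strain name -> set of ortholog-group IDs (assumed input)


def pan_and_core(genomes: Genomes) -> Tuple[Set[str], Set[str]]:
    """Return (pan-genome, core genome) as the union and intersection of all strains."""
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)        # genes present in at least one strain
    core = set.intersection(*gene_sets)  # genes present in every strain
    return pan, core


def core_fraction(genomes: Genomes) -> Dict[str, float]:
    """Fraction of each strain's genes that belong to the core genome."""
    _, core = pan_and_core(genomes)
    return {name: len(core) / len(genes) for name, genes in genomes.items()}


def strain_specific(genomes: Genomes) -> Dict[str, Set[str]]:
    """Genes found in exactly one strain, keyed by that strain."""
    result = {}
    for name, genes in genomes.items():
        others = set().union(*(g for n, g in genomes.items() if n != name))
        result[name] = genes - others
    return result


# Toy data, purely illustrative.
toy = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g5"},
    "strain_C": {"g1", "g2", "g3", "g6"},
}

pan, core = pan_and_core(toy)
fractions = core_fraction(toy)
unique = strain_specific(toy)
```

On the toy data the core genome is {g1, g2}, which is half of strain_A's genes; this is the same kind of ratio as the 45–52% figure reported in the abstract, although real analyses depend heavily on how orthologs are clustered.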
Identification and characterization of microRNAs in oilseed rape (Brassica napus) responsive to infection with the pathogenic fungus Verticillium longisporum using Brassica AA (Brassica rapa) and CC (Brassica oleracea) as reference genomes.

PubMed

Shen, Dan; Suhrkamp, Ina; Wang, Yu; Liu, Shenyi; Menkhaus, Jan; Verreet, Joseph-Alexander; Fan, Longjiang; Cai, Daguang

2014-11-01

Verticillium longisporum, a soil-borne pathogenic fungus, causes vascular disease in oilseed rape (Brassica napus). We proposed that plant microRNAs (miRNAs) are involved in the plant-V. longisporum interaction. To identify oilseed rape miRNAs, we deep-sequenced two small RNA libraries made from V. longisporum infected/noninfected roots and employed Brassica rapa and Brassica oleracea genomes as references for miRNA prediction and characterization. We identified 893 B. napus miRNAs representing 360 conserved and 533 novel miRNAs, and mapped 429 and 464 miRNAs to the AA and CC genomes, respectively. Microsynteny analysis with the conserved miRNAs and their flanking protein coding sequences revealed 137 AA-CC genome syntenic miRNA pairs and 61 AA and 42 CC genome-unique miRNAs. Sixty-two miRNAs were responsive to the V. longisporum infection. We present data for specific interactions and simultaneously reciprocal changes in the expression levels of the miRNAs and their targets in the infected roots. We demonstrate that miRNAs are involved in the plant-fungus interaction and that miRNA168-Argonaute 1 (AGO1) expression modulation might act as a key regulatory module in a compatible plant-V. longisporum interaction. Our results suggest that V. longisporum may have evolved a virulence mechanism by interference with plant miRNAs to reprogram plant gene expression and achieve infection. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
The Translational Genomics Core at Partners Personalized Medicine: Facilitating the Transition of Research towards Personalized Medicine

PubMed Central

Blau, Ashley; Brown, Alison; Mahanta, Lisa; Amr, Sami S.

2016-01-01

The Translational Genomics Core (TGC) at Partners Personalized Medicine (PPM) serves as a fee-for-service core laboratory for Partners Healthcare researchers, providing access to technology platforms and analysis pipelines for genomic, transcriptomic, and epigenomic research projects. The interaction of the TGC with various components of PPM provides it with a unique infrastructure that allows for greater IT and bioinformatics opportunities, such as sample tracking and data analysis. The following article describes some of the unique opportunities available to an academic research core operating within PPM, such as the ability to develop analysis pipelines with a dedicated bioinformatics team and maintain a flexible Laboratory Information Management System (LIMS) with the support of an internal IT team, as well as the operational challenges encountered to respond to emerging technologies, diverse investigator needs, and high staff turnover. In addition, the implementation and operational role of the TGC in the Partners Biobank genotyping project of over 25,000 samples is presented as an example of core activities working with other components of PPM. PMID:26927185
The Translational Genomics Core at Partners Personalized Medicine: Facilitating the Transition of Research towards Personalized Medicine.

PubMed

Blau, Ashley; Brown, Alison; Mahanta, Lisa; Amr, Sami S

2016-02-26

The Translational Genomics Core (TGC) at Partners Personalized Medicine (PPM) serves as a fee-for-service core laboratory for Partners Healthcare researchers, providing access to technology platforms and analysis pipelines for genomic, transcriptomic, and epigenomic research projects. The interaction of the TGC with various components of PPM provides it with a unique infrastructure that allows for greater IT and bioinformatics opportunities, such as sample tracking and data analysis. The following article describes some of the unique opportunities available to an academic research core operating within PPM, such as the ability to develop analysis pipelines with a dedicated bioinformatics team and maintain a flexible Laboratory Information Management System (LIMS) with the support of an internal IT team, as well as the operational challenges encountered to respond to emerging technologies, diverse investigator needs, and high staff turnover. In addition, the implementation and operational role of the TGC in the Partners Biobank genotyping project of over 25,000 samples is presented as an example of core activities working with other components of PPM.
Evolutionary interaction between W/Y chromosome and transposable elements.

PubMed

Śliwińska, Ewa B; Martyka, Rafał; Tryjanowski, Piotr

2016-06-01

The W/Y chromosome is unique among chromosomes as it does not recombine in its mature form. The main side effect of cessation of recombination is evolutionary instability and degeneration of the W/Y chromosome, or frequent W/Y chromosome turnovers. Another important feature of W/Y chromosome degeneration is transposable element (TE) accumulation. Transposon accumulation has been confirmed for all W/Y chromosomes that have been sequenced so far. Models of W/Y chromosome instability include the assemblage of deleterious mutations in protein coding genes, but do not include the influence of transposable elements that are accumulated gradually in the non-recombining genome. The multiple roles of genomic TEs, and the interactions between retrotransposons and genome defense proteins are currently being studied intensively. Small RNAs originating from retrotransposon transcripts appear to be, in some cases, the only mediators of W/Y chromosome function. Based on the review of the most recent publications, we present knowledge on W/Y evolution in relation to retrotransposable element accumulation.
Whipworm genome and dual-species transcriptome analyses provide molecular insights into an intimate host-parasite interaction.

PubMed

Foth, Bernardo J; Tsai, Isheng J; Reid, Adam J; Bancroft, Allison J; Nichol, Sarah; Tracey, Alan; Holroyd, Nancy; Cotton, James A; Stanley, Eleanor J; Zarowiecki, Magdalena; Liu, Jimmy Z; Huckvale, Thomas; Cooper, Philip J; Grencis, Richard K; Berriman, Matthew

2014-07-01

Whipworms are common soil-transmitted helminths that cause debilitating chronic infections in man. These nematodes are only distantly related to Caenorhabditis elegans and have evolved to occupy an unusual niche, tunneling through epithelial cells of the large intestine. We report here the whole-genome sequences of the human-infective Trichuris trichiura and the mouse laboratory model Trichuris muris. On the basis of whole-transcriptome analyses, we identify many genes that are expressed in a sex- or life stage-specific manner and characterize the transcriptional landscape of a morphological region with unique biological adaptations, namely, bacillary band and stichosome, found only in whipworms and related parasites. Using RNA sequencing data from whipworm-infected mice, we describe the regulated T helper 1 (TH1)-like immune response of the chronically infected cecum in unprecedented detail. In silico screening identified numerous new potential drug targets against trichuriasis. Together, these genomes and associated functional data elucidate key aspects of the molecular host-parasite interactions that define chronic whipworm infection.
Whipworm genome and dual-species transcriptome analyses provide molecular insights into an intimate host-parasite interaction

PubMed Central

Nichol, Sarah; Tracey, Alan; Holroyd, Nancy; Cotton, James A.; Stanley, Eleanor J.; Zarowiecki, Magdalena; Liu, Jimmy Z.; Huckvale, Thomas; Cooper, Philip J.; Grencis, Richard K.; Berriman, Matthew

2014-01-01

Whipworms are common soil-transmitted helminths that cause debilitating chronic infections in man. These nematodes are only distantly related to Caenorhabditis elegans and have evolved to occupy an unusual niche, tunneling through epithelial cells of the large intestine. Here we present the genome sequences of the human-infective Trichuris trichiura and the murine laboratory model T. muris. Based on whole transcriptome analyses we identify many genes that are expressed in a gender- or life stage-specific manner and characterise the transcriptional landscape of a morphological region with unique biological adaptations, namely bacillary band and stichosome, found only in whipworms and related parasites. Using RNAseq data from whipworm-infected mice we describe the regulated Th1-like immune response of the chronically infected cecum in unprecedented detail. In silico screening identifies numerous potential new drug targets against trichuriasis. Together, these genomes and associated functional data elucidate key aspects of the molecular host-parasite interactions that define chronic whipworm infection. PMID:24929830
Observing copepods through a genomic lens

PubMed Central

2011-01-01

Background: Copepods outnumber every other multicellular animal group. They are critical components of the world's freshwater and marine ecosystems, sensitive indicators of local and global climate change, key ecosystem service providers, parasites and predators of economically important aquatic animals and potential vectors of waterborne disease. Copepods sustain the world fisheries that nourish and support human populations. Although genomic tools have transformed many areas of biological and biomedical research, their power to elucidate aspects of the biology, behavior and ecology of copepods has only recently begun to be exploited. Discussion: The extraordinary biological and ecological diversity of the subclass Copepoda provides both unique advantages for addressing key problems in aquatic systems and formidable challenges for developing a focused genomics strategy. This article provides an overview of genomic studies of copepods and discusses strategies for using genomics tools to address key questions at levels extending from individuals to ecosystems. Genomics can, for instance, help to decipher patterns of genome evolution such as those that occur during transitions from free living to symbiotic and parasitic lifestyles and can assist in the identification of genetic mechanisms and accompanying physiological changes associated with adaptation to new or physiologically challenging environments. The adaptive significance of the diversity in genome size and unique mechanisms of genome reorganization during development could similarly be explored. Genome-wide and EST studies of parasitic copepods of salmon and large EST studies of selected free-living copepods have demonstrated the potential utility of modern genomics approaches for the study of copepods and have generated resources such as EST libraries, shotgun genome sequences, BAC libraries, genome maps and inbred lines that will be invaluable in assisting further efforts to provide genomics tools for copepods. Summary: Genomics research on copepods is needed to extend our exploration and characterization of their fundamental biological traits, so that we can better understand how copepods function and interact in diverse environments. Availability of large scale genomics resources will also open doors to a wide range of systems biology type studies that view the organism as the fundamental system in which to address key questions in ecology and evolution. PMID:21933388
Systems biology approach in plant abiotic stresses.

PubMed

Mohanta, Tapan Kumar; Bashir, Tufail; Hashem, Abeer; Abd Allah, Elsayed Fathi

2017-12-01

Plant abiotic stresses are the major constraint on plant growth and development, causing enormous crop losses across the world. Plants have unique features to defend themselves against these challenging adverse stress conditions. They modulate their phenotypes upon changes in physiological, biochemical, molecular and genetic information, thus making them tolerant against abiotic stresses. It is of paramount importance to determine the stress-tolerant traits of a diverse range of genotypes of plant species and integrate those traits for crop improvement. Stress-tolerant traits can be identified by conducting genome-wide analysis of stress-tolerant genotypes through the highly advanced structural and functional genomics approach. Specifically, whole-genome sequencing, development of molecular markers, genome-wide association studies and comparative analysis of interaction networks between tolerant and susceptible crop varieties grown under stress conditions can greatly facilitate discovery of novel agronomic traits that protect plants against abiotic stresses. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Exploring the Potential of Direct-To-Consumer Genomic Test Data for Predicting Adverse Drug Events.

PubMed

Zhang, Patrick M; Sarkar, Indra Neil

2018-01-01

Recent technological advancements in genetic testing and the growing accessibility of public genomic data provide researchers with a unique avenue to approach personalized medicine. This feasibility study examined the potential of direct-to-consumer (DTC) genomic tests (focusing on 23andMe) in research and clinical applications. In particular, we combined population genetics information from the Personal Genome Project with adverse event reports from AEOLUS and pharmacogenetic information from PharmGKB. Primarily, associations between drugs based on co-occurring genetic variations and associations between variants and adverse events were used to assess the potential for leveraging single nucleotide polymorphism information from 23andMe. The results of this study suggest potential clinical uses of DTC tests in light of potential drug interactions. Furthermore, the results suggest great potential for analyzing associations at a population level to facilitate knowledge discovery in the realm of predicting adverse drug events.
Powerful multilocus tests of genetic association in the presence of gene-gene and gene-environment interactions.

PubMed

Chatterjee, Nilanjan; Kalaylioglu, Zeynep; Moslehi, Roxana; Peters, Ulrike; Wacholder, Sholom

2006-12-01

In modern genetic epidemiology studies, the association between the disease and a genomic region, such as a candidate gene, is often investigated using multiple SNPs. We propose a multilocus test of genetic association that can account for genetic effects that might be modified by variants in other genes or by environmental factors. We consider use of the venerable and parsimonious Tukey's 1-degree-of-freedom model of interaction, which is natural when individual SNPs within a gene are associated with disease through a common biological mechanism; in contrast, many standard regression models are designed as if each SNP has unique functional significance. On the basis of Tukey's model, we propose a novel but computationally simple generalized test of association that can simultaneously capture both the main effects of the variants within a genomic region and their interactions with the variants in another region or with an environmental exposure. We compared performance of our method with that of two standard tests of association, one ignoring gene-gene/gene-environment interactions and the other based on a saturated model of interactions. We demonstrate major power advantages of our method both in analysis of data from a case-control study of the association between colorectal adenoma and DNA variants in the NAT2 genomic region, which are well known to be related to a common biological phenotype, and under different models of gene-gene interactions with use of simulated data.
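The idea behind a Tukey-style 1-degree-of-freedom interaction model can be written out as a worked equation. This is a sketch under stated assumptions rather than the exact formulation in the paper: the symbols below (disease indicator D, genotype vectors S_1 and S_2 for the two regions, and coefficients alpha, beta_j, gamma_k, theta) are notation introduced here for illustration.

```latex
% Assumed notation: D = disease status; S_{1j} = genotype at SNP j in region 1;
% S_{2k} = genotype at SNP k in region 2 (or an environmental exposure).
\[
\operatorname{logit} \Pr(D = 1 \mid S_1, S_2)
  = \alpha
  + \sum_{j=1}^{J} \beta_j S_{1j}
  + \sum_{k=1}^{K} \gamma_k S_{2k}
  + \theta \Bigl( \sum_{j=1}^{J} \beta_j S_{1j} \Bigr)
           \Bigl( \sum_{k=1}^{K} \gamma_k S_{2k} \Bigr)
\]
% The interaction is the product of the two main-effect scores scaled by a single
% parameter \theta, so it costs one degree of freedom; a saturated interaction
% model would instead need J \times K separate interaction coefficients.
```

Setting theta to zero recovers a main-effects-only model, which is why a single extra parameter can capture interaction when the SNPs within a region act through a common mechanism.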
Lineage-specific Virulence Determinants of Haemophilus influenzae Biogroup aegyptius

PubMed Central

Strouts, Fiona R.; Power, Peter; Croucher, Nicholas J.; Corton, Nicola; van Tonder, Andries; Quail, Michael A.; Langford, Paul R.; Hudson, Michael J.; Parkhill, Julian; Bentley, Stephen D.

2012-01-01

An emergent clone of Haemophilus influenzae biogroup aegyptius (Hae) is responsible for outbreaks of Brazilian purpuric fever (BPF). First recorded in Brazil in 1984, the so-called BPF clone of Hae caused a fulminant disease that started with conjunctivitis but developed into septicemic shock; mortality rates were as high as 70%. To identify virulence determinants, we conducted a pan-genomic analysis. Sequencing of the genomes of the BPF clone strain F3031 and a noninvasive conjunctivitis strain, F3047, and comparison of these sequences with 5 other complete H. influenzae genomes showed that >77% of the F3031 genome is shared among all H. influenzae strains. Delineation of the Hae accessory genome enabled characterization of 163 predicted protein-coding genes; identified differences in established autotransporter adhesins; and revealed a suite of novel adhesins unique to Hae, including novel trimeric autotransporter adhesins and 4 new fimbrial operons. These novel adhesins might play a critical role in host–pathogen interactions. PMID:22377449
Genomic insights into the Ixodes scapularis tick vector of Lyme disease

PubMed Central

Gulia-Nuss, Monika; Nuss, Andrew B.; Meyer, Jason M.; Sonenshine, Daniel E.; Roe, R. Michael; Waterhouse, Robert M.; Sattelle, David B.; de la Fuente, José; Ribeiro, Jose M.; Megy, Karine; Thimmapuram, Jyothi; Miller, Jason R.; Walenz, Brian P.; Koren, Sergey; Hostetler, Jessica B.; Thiagarajan, Mathangi; Joardar, Vinita S.; Hannick, Linda I.; Bidwell, Shelby; Hammond, Martin P.; Young, Sarah; Zeng, Qiandong; Abrudan, Jenica L.; Almeida, Francisca C.; Ayllón, Nieves; Bhide, Ketaki; Bissinger, Brooke W.; Bonzon-Kulichenko, Elena; Buckingham, Steven D.; Caffrey, Daniel R.; Caimano, Melissa J.; Croset, Vincent; Driscoll, Timothy; Gilbert, Don; Gillespie, Joseph J.; Giraldo-Calderón, Gloria I.; Grabowski, Jeffrey M.; Jiang, David; Khalil, Sayed M. S.; Kim, Donghun; Kocan, Katherine M.; Koči, Juraj; Kuhn, Richard J.; Kurtti, Timothy J.; Lees, Kristin; Lang, Emma G.; Kennedy, Ryan C.; Kwon, Hyeogsun; Perera, Rushika; Qi, Yumin; Radolf, Justin D.; Sakamoto, Joyce M.; Sánchez-Gracia, Alejandro; Severo, Maiara S.; Silverman, Neal; Šimo, Ladislav; Tojo, Marta; Tornador, Cristian; Van Zee, Janice P.; Vázquez, Jesús; Vieira, Filipe G.; Villar, Margarita; Wespiser, Adam R.; Yang, Yunlong; Zhu, Jiwei; Arensburger, Peter; Pietrantonio, Patricia V.; Barker, Stephen C.; Shao, Renfu; Zdobnov, Evgeny M.; Hauser, Frank; Grimmelikhuijzen, Cornelis J. P.; Park, Yoonseong; Rozas, Julio; Benton, Richard; Pedra, Joao H. F.; Nelson, David R.; Unger, Maria F.; Tubio, Jose M. C.; Tu, Zhijian; Robertson, Hugh M.; Shumway, Martin; Sutton, Granger; Wortman, Jennifer R.; Lawson, Daniel; Wikel, Stephen K.; Nene, Vishvanath M.; Fraser, Claire M.; Collins, Frank H.; Birren, Bruce; Nelson, Karen E.; Caler, Elisabet; Hill, Catherine A.

2016-01-01

Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retro-transposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick–host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host ‘questing’, prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent. PMID:26856261

Genomic insights into the Ixodes scapularis tick vector of Lyme disease.

PubMed

Gulia-Nuss, Monika; Nuss, Andrew B; Meyer, Jason M; Sonenshine, Daniel E; Roe, R Michael; Waterhouse, Robert M; Sattelle, David B; de la Fuente, José; Ribeiro, Jose M; Megy, Karine; Thimmapuram, Jyothi; Miller, Jason R; Walenz, Brian P; Koren, Sergey; Hostetler, Jessica B; Thiagarajan, Mathangi; Joardar, Vinita S; Hannick, Linda I; Bidwell, Shelby; Hammond, Martin P; Young, Sarah; Zeng, Qiandong; Abrudan, Jenica L; Almeida, Francisca C; Ayllón, Nieves; Bhide, Ketaki; Bissinger, Brooke W; Bonzon-Kulichenko, Elena; Buckingham, Steven D; Caffrey, Daniel R; Caimano, Melissa J; Croset, Vincent; Driscoll, Timothy; Gilbert, Don; Gillespie, Joseph J; Giraldo-Calderón, Gloria I; Grabowski, Jeffrey M; Jiang, David; Khalil, Sayed M S; Kim, Donghun; Kocan, Katherine M; Koči, Juraj; Kuhn, Richard J; Kurtti, Timothy J; Lees, Kristin; Lang, Emma G; Kennedy, Ryan C; Kwon, Hyeogsun; Perera, Rushika; Qi, Yumin; Radolf, Justin D; Sakamoto, Joyce M; Sánchez-Gracia, Alejandro; Severo, Maiara S; Silverman, Neal; Šimo, Ladislav; Tojo, Marta; Tornador, Cristian; Van Zee, Janice P; Vázquez, Jesús; Vieira, Filipe G; Villar, Margarita; Wespiser, Adam R; Yang, Yunlong; Zhu, Jiwei; Arensburger, Peter; Pietrantonio, Patricia V; Barker, Stephen C; Shao, Renfu; Zdobnov, Evgeny M; Hauser, Frank; Grimmelikhuijzen, Cornelis J P; Park, Yoonseong; Rozas, Julio; Benton, Richard; Pedra, Joao H F; Nelson, David R; Unger, Maria F; Tubio, Jose M C; Tu, Zhijian; Robertson, Hugh M; Shumway, Martin; Sutton, Granger; Wortman, Jennifer R; Lawson, Daniel; Wikel, Stephen K; Nene, Vishvanath M; Fraser, Claire M; Collins, Frank H; Birren, Bruce; Nelson, Karen E; Caler, Elisabet; Hill, Catherine A

2016-02-09

Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retro-transposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick-host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host 'questing', prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.
The genome biology of phytoplasma: modulators of plants and insects.

PubMed

Sugio, Akiko; Hogenhout, Saskia A

2012-06-01

Phytoplasmas are bacterial pathogens of plants that are transmitted by insects. These bacteria uniquely multiply intracellularly in both plants (Plantae) and insects (Animalia). Similarly to bacterial endosymbionts, phytoplasmas have reduced genomes with limited metabolic capabilities. Nonetheless, the chromosomes of many phytoplasmas are rich in repeated DNA consisting of mobile elements. Phytoplasmas produce an arsenal of effectors most of which are encoded on these mobile elements and on plasmids. These effectors target conserved plant transcription factors resulting in witches' broom and leafy flower symptoms and suppression of plant defense to insect vectors that transmit the phytoplasmas. Future studies of these fascinating microbes will generate a wealth of new knowledge about forces that shape genomes and microbial interactions with multicellular hosts. Copyright © 2012 Elsevier Ltd. All rights reserved.
Age and Diet: Major interacting factors that drive sporadic intestinal cancer | Division of Cancer Prevention

Cancer.gov

Age and diet are the two most clearly recognized risk factors for common sporadic colon cancer, responsible for >90% of cases in developed countries. We will make use of an important technical advance for whole genome sequencing of single cells recently reported by co-investigator Vijg that can uniquely detect rare mutational events to define the mutational load and spectrum
A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome.

PubMed

Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

2012-06-15

The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development.
A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome

PubMed Central

Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

2012-01-01

The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development. PMID:22508961
The Nostoc punctiforme Genome

DOE Office of Scientific and Technical Information (OSTI.GOV)

John C. Meeks

2001-12-31

Nostoc punctiforme is a filamentous cyanobacterium with extensive phenotypic characteristics and a relatively large genome, approaching 10 Mb. The phenotypic characteristics include a photoautotrophic, diazotrophic mode of growth, but N. punctiforme is also facultatively heterotrophic; its vegetative cells have multiple development alternatives, including terminal differentiation into nitrogen-fixing heterocysts and transient differentiation into spore-like akinetes or motile filaments called hormogonia; and N. punctiforme has broad symbiotic competence with fungi and terrestrial plants, including bryophytes, gymnosperms and an angiosperm. The shotgun-sequencing phase of the N. punctiforme strain ATCC 29133 genome has been completed by the Joint Genome Institute. Annotation of an 8.9 Mb database yielded 7432 open reading frames, 45% of which encode proteins with known or probable known function and 29% of which are unique to N. punctiforme. Comparative analysis of the sequence indicates a genome that is highly plastic and in a state of flux, with numerous insertion sequences and multilocus repeats, as well as genes encoding transposases and DNA modification enzymes. The sequence also reveals the presence of genes encoding putative proteins that collectively define almost all characteristics of cyanobacteria as a group. N. punctiforme has an extensive potential to sense and respond to environmental signals as reflected by the presence of more than 400 genes encoding sensor protein kinases, response regulators and other transcriptional factors. The signal transduction systems and any of the large number of unique genes may play essential roles in the cell differentiation and symbiotic interaction properties of N. punctiforme.
Comparative Functional Genomics of Lactobacillus spp. Reveals Possible Mechanisms for Specialization of Vaginal Lactobacilli to Their Environment

PubMed Central

Suzuki, Haruo; Hickey, Roxana J.; Forney, Larry J.

2014-01-01

Lactobacilli are found in a wide variety of habitats. Four species, Lactobacillus crispatus, L. gasseri, L. iners, and L. jensenii, are common and abundant in the human vagina and absent from other habitats. These may be adapted to the vagina and possess characteristics enabling them to thrive in that environment. Furthermore, stable codominance of multiple Lactobacillus species in a single community is infrequently observed. Thus, it is possible that individual vaginal Lactobacillus species possess unique characteristics that confer to them host-specific competitive advantages. We performed comparative functional genomic analyses of representatives of 25 species of Lactobacillus, searching for habitat-specific traits in the genomes of the vaginal lactobacilli. We found that the genomes of the vaginal species were significantly smaller and had significantly lower GC content than those of the nonvaginal species. No protein families were found to be specific to the vaginal species analyzed, but some were either over- or underrepresented relative to nonvaginal species. We also found that within the vaginal species, each genome coded for species-specific protein families. Our results suggest that even though the vaginal species show no general signatures of adaptation to the vaginal environment, each species has specific and perhaps unique ways of interacting with its environment, be it the host or other microbes in the community. These findings will serve as a foundation for further exploring the role of lactobacilli in the ecological dynamics of vaginal microbial communities and their ultimate impact on host health. PMID:24488312
PopHuman: the human population genomics browser

PubMed Central

Mulet, Roger; Villegas-Mirón, Pablo; Hervas, Sergi; Sanz, Esteve; Velasco, Daniel; Bertranpetit, Jaume; Laayouni, Hafid

2018-01-01

The 1000 Genomes Project (1000GP) represents the most comprehensive world-wide nucleotide variation data set so far in humans, providing the sequencing and analysis of 2504 genomes from 26 populations and reporting >84 million variants. The availability of this sequence data provides the human lineage with an invaluable resource for population genomics studies, allowing the testing of molecular population genetics hypotheses and eventually the understanding of the evolutionary dynamics of genetic variation in human populations. Here we present PopHuman, a new population genomics-oriented genome browser based on JBrowse that allows the interactive visualization and retrieval of an extensive inventory of population genetics metrics. Efficient and reliable parameter estimates have been computed using a novel pipeline that faces the unique features and limitations of the 1000GP data, and include a battery of nucleotide variation measures, divergence and linkage disequilibrium parameters, as well as different tests of neutrality, estimated in non-overlapping windows along the chromosomes and in annotated genes for all 26 populations of the 1000GP. PopHuman is open and freely available at http://pophuman.uab.cat. PMID:29059408
Africa: continent of genome contrasts with implications for biomedical research and health.

PubMed

Ramsay, Michèle

2012-08-31

The genomic architecture of African populations is poorly understood and there is considerable variation between ethno-linguistic groups. Genome-wide approaches have been extensively applied to search for genetic associations to complex traits in Europeans, but rarely in Africans. This is largely attributed to lower levels of funding, poor infrastructure and public health systems, and to the small pool of trained scientists. High levels of genetic variation and underlying population structure in Africans present significant challenges, but lower levels of linkage disequilibrium provide an opportunity for more effective localisation of causal variants. High throughput technologies, including dense genotyping arrays, genome sequencing and epigenome studies, together with plummeting costs, are making research more affordable, even for African scientists. Understanding the interactions between genome structure and environmental influences is essential to interpreting their contributions to the increase in infectious diseases and non-communicable diseases, exacerbated by adverse environments and lifestyle choices. The unique genome dynamics in African populations have an important role to play in understanding human health and susceptibility to disease. Copyright © 2012. Published by Elsevier B.V.
Species-specific chitin-binding module 18 expansion in the amphibian pathogen Batrachochytrium dendrobatidis.

PubMed

Abramyan, John; Stajich, Jason E

2012-01-01

Batrachochytrium dendrobatidis is the causative agent of chytridiomycosis, which is considered one of the driving forces behind the worldwide decline in populations of amphibians. As a member of the phylum Chytridiomycota, B. dendrobatidis has diverged significantly to emerge as the only pathogen of adult vertebrates. Such shifts in lifestyle are generally accompanied by various degrees of genomic modifications, yet neither its mode of pathogenicity nor any factors associated with it have ever been identified. Presented here is the identification and characterization of a unique expansion of the carbohydrate-binding module family 18 (CBM18), specific to B. dendrobatidis. CBM (chitin-binding module) expansions have been likened to the evolution of pathogenicity in a variety of fungus species, making this expanded group a prime candidate for the identification of potential pathogenicity factors. Furthermore, the CBM18 expansions are confined to three categories of genes, each having been previously implicated in host-pathogen interactions. These correlations highlight this specific domain expansion as a potential key player in the mode of pathogenicity in this unique fungus. The expansion of CBM18 in B. dendrobatidis is exceptional in its size and diversity compared to other pathogenic species of fungi, making this genomic feature unique in an evolutionary context as well as in pathogenicity. Amphibian populations are declining worldwide at an unprecedented rate. Although various factors are thought to contribute to this phenomenon, chytridiomycosis has been identified as one of the leading causes. This deadly fungal disease is caused by Batrachochytrium dendrobatidis, a chytrid fungus species unique in its pathogenicity and, furthermore, its specificity to amphibians. Despite more than two decades of research, the biology of this fungus species and its deadly interaction with amphibians had been notoriously difficult to unravel.
Due to the alarming rate of worldwide spread and associated decline in amphibian populations, it is imperative to incorporate novel genomic and genetic techniques into the study of this species. In this study, we present the first reported potential pathogenicity factors in B. dendrobatidis. In silico studies such as this allow us to identify putative targets for more specific molecular analyses, furthering our hope for the control of this pathogen.
Oxidized Base Damage and Single-Strand Break Repair in Mammalian Genomes: Role of Disordered Regions and Posttranslational Modifications in Early Enzymes

PubMed Central

Hegde, Muralidhar L.; Izumi, Tadahide; Mitra, Sankar

2012-01-01

Oxidative genome damage induced by reactive oxygen species includes oxidized bases, abasic (AP) sites, and single-strand breaks, all of which are repaired via the evolutionarily conserved base excision repair/single-strand break repair (BER/SSBR) pathway. BER/SSBR in mammalian cells is complex, with preferred and backup sub-pathways, and is linked to genome replication and transcription. The early BER/SSBR enzymes, namely, DNA glycosylases (DGs) and the end-processing proteins such as abasic endonuclease 1 (APE1), form complexes with downstream repair (and other noncanonical) proteins via pairwise interactions. Furthermore, a unique feature of mammalian early BER/SSBR enzymes is the presence of a disordered terminal extension that is absent in their Escherichia coli prototypes. These nonconserved segments usually contain organelle-targeting signals, common interaction interfaces, and sites of posttranslational modifications that may be involved in regulating their repair function including lesion scanning. Finally, the linkage of BER/SSBR deficiency to cancer, aging, and human neurodegenerative diseases, and therapeutic targeting of BER/SSBR are discussed. PMID:22749145
The Einstein Genome Gateway using WASP - a high throughput multi-layered life sciences portal for XSEDE.

PubMed

Golden, Aaron; McLellan, Andrew S; Dubin, Robert A; Jing, Qiang; O Broin, Pilib; Moskowitz, David; Zhang, Zhengdong; Suzuki, Masako; Hargitai, Joseph; Calder, R Brent; Greally, John M

2012-01-01

Massively-parallel sequencing (MPS) technologies and their diverse applications in genomics and epigenomics research have yielded enormous new insights into the physiology and pathophysiology of the human genome. The biggest hurdle remains the magnitude and diversity of the datasets generated, compromising our ability to manage, organize, process and ultimately analyse data. The Wiki-based Automated Sequence Processor (WASP), developed at the Albert Einstein College of Medicine (hereafter Einstein), uniquely manages to tightly couple the sequencing platform, the sequencing assay, sample metadata and the automated workflows deployed on a heterogeneous high performance computing cluster infrastructure that yield sequenced, quality-controlled and 'mapped' sequence data, all within the one operating environment accessible by a web-based GUI. WASP at Einstein processes 4-6 TB of data per week and since its production cycle commenced it has processed ~1 PB of data overall and has revolutionized interactivity with these new genomic technologies for users, who remain blissfully unaware of the data storage, management and most importantly processing services they request. The abstraction of such computational complexity for the user in effect makes WASP an ideal middleware solution, and an appropriate basis for the development of a grid-enabled resource - the Einstein Genome Gateway - as part of the Extreme Science and Engineering Discovery Environment (XSEDE) program. In this paper we discuss the existing WASP system, its proposed middleware role, and its planned interaction with XSEDE to form the Einstein Genome Gateway.
The enigmatic archaeal virosphere.

PubMed

Prangishvili, David; Bamford, Dennis H; Forterre, Patrick; Iranzo, Jaime; Koonin, Eugene V; Krupovic, Mart

2017-11-10

One of the most prominent features of archaea is the extraordinary diversity of their DNA viruses. Many archaeal viruses differ substantially in morphology from bacterial and eukaryotic viruses and represent unique virus families. The distinct nature of archaeal viruses also extends to the gene composition and architectures of their genomes and the properties of the proteins that they encode. Environmental research has revealed prominent roles of archaeal viruses in influencing microbial communities in ocean ecosystems, and recent metagenomic studies have uncovered new groups of archaeal viruses that infect extremophiles and mesophiles in diverse habitats. In this Review, we summarize recent advances in our understanding of the genomic and morphological diversity of archaeal viruses and the molecular biology of their life cycles and virus-host interactions, including interactions with archaeal CRISPR-Cas systems. We also examine the potential origins and evolution of archaeal viruses and discuss their place in the global virosphere.
mySyntenyPortal: an application package to construct websites for synteny block analysis.

PubMed

Lee, Jongin; Lee, Daehwan; Sim, Mikang; Kwon, Daehong; Kim, Juyeon; Ko, Younhee; Kim, Jaebum

2018-06-05

Advances in sequencing technologies have facilitated large-scale comparative genomics based on whole genome sequencing. Constructing and investigating conserved genomic regions among multiple species (called synteny blocks) are essential in comparative genomics. However, they require significant amounts of computational resources and time in addition to bioinformatics skills. Many web interfaces have been developed to make such tasks easier. However, these web interfaces cannot be customized for users who want to use their own set of genome sequences or definition of synteny blocks. To resolve this limitation, we present mySyntenyPortal, a stand-alone application package to construct websites for synteny block analyses by using users' own genome data. mySyntenyPortal provides both command line and web-based interfaces to build and manage websites for large-scale comparative genomic analyses. The websites can also be easily published and accessed by other users. To demonstrate the usability of mySyntenyPortal, we present an example study for building websites to compare genomes of three mammalian species (human, mouse, and cow) and show how they can be easily utilized to identify potential genes affected by genome rearrangements. mySyntenyPortal will contribute to extended comparative genomic analyses based on large-scale whole genome sequences by providing unique functionality to support the easy creation of interactive websites for synteny block analyses from users' own genome data.
Bat Biology, Genomes, and the Bat1K Project: To Generate Chromosome-Level Genomes for All Living Bat Species.

PubMed

Teeling, Emma C; Vernes, Sonja C; Dávalos, Liliana M; Ray, David A; Gilbert, M Thomas P; Myers, Eugene

2018-02-15

Bats are unique among mammals, possessing some of the rarest mammalian adaptations, including true self-powered flight, laryngeal echolocation, exceptional longevity, unique immunity, contracted genomes, and vocal learning. They provide key ecosystem services, pollinating tropical plants, dispersing seeds, and controlling insect pest populations, thus driving healthy ecosystems. They account for more than 20% of all living mammalian diversity, and their crown-group evolutionary history dates back to the Eocene. Despite their great numbers and diversity, many species are threatened and endangered. Here we announce Bat1K, an initiative to sequence the genomes of all living bat species (n∼1,300) to chromosome-level assembly. The Bat1K genome consortium unites bat biologists (>148 members as of writing), computational scientists, conservation organizations, genome technologists, and any interested individuals committed to a better understanding of the genetic and evolutionary mechanisms that underlie the unique adaptations of bats. Our aim is to catalog the unique genetic diversity present in all living bats to better understand the molecular basis of their unique adaptations; uncover their evolutionary history; link genotype with phenotype; and ultimately better understand, promote, and conserve bats. Here we review the unique adaptations of bats and highlight how chromosome-level genome assemblies can uncover the molecular basis of these traits. We present a novel sequencing and assembly strategy and review the striking societal and scientific benefits that will result from the Bat1K initiative.
Proteomic strategy for the identification of critical actors in reorganization of the post-meiotic male genome.

PubMed

Govin, Jerome; Gaucher, Jonathan; Ferro, Myriam; Debernardi, Alexandra; Garin, Jerome; Khochbin, Saadi; Rousseaux, Sophie

2012-01-01

After meiosis, during the final stages of spermatogenesis, the haploid male genome undergoes major structural changes, resulting in a shift from a nucleosome-based genome organization to the sperm-specific, highly compacted nucleoprotamine structure. Recent data support the idea that region-specific programming of the haploid male genome is of high importance for the post-fertilization events and for successful embryo development. Although these events constitute a unique and essential step in reproduction, the mechanisms by which they occur have remained completely obscure and the factors involved have mostly remained uncharacterized. Here, we sought a strategy to significantly increase our understanding of proteins controlling the haploid male genome reprogramming, based on the identification of proteins in two specific pools: those with the potential to bind nucleic acids (basic proteins) and proteins capable of binding basic proteins (acidic proteins). For the identification of acidic proteins, we developed an approach involving a transition-protein (TP)-based chromatography, which has the advantage of retaining not only acidic proteins due to the charge interactions, but also potential TP-interacting factors. A second strategy, based on an in-depth bioinformatic analysis of the identified proteins, was then applied to pinpoint, within the lists obtained, factors expressed in male germ cells that are relevant to the post-meiotic genome organization. This approach reveals a functional network of DNA-packaging proteins and their putative chaperones and sheds new light on the way the critical transitions in genome organizations could take place. This work also points to a new area of research in male infertility and sperm quality assessments.
Light-harvesting antenna complexes in the moss Physcomitrella patens: implications for the evolutionary transition from green algae to land plants.

PubMed

Iwai, Masakazu; Yokono, Makio

2017-06-01

Plants have successfully adapted to a vast range of terrestrial environments during their evolution. To elucidate the evolutionary transition of light-harvesting antenna proteins from green algae to land plants, the moss Physcomitrella patens is ideally placed basally among land plants. Compared to the genomes of green algae and land plants, the P. patens genome codes for more diverse and redundant light-harvesting antenna proteins. It also encodes Lhcb9, which has characteristics not found in other light-harvesting antenna proteins. The unique complement of light-harvesting antenna proteins in P. patens appears to facilitate protein interactions that include those lost in both green algae and land plants with regard to stromal electron transport pathways and photoprotection mechanisms. This review will highlight unique characteristics of the P. patens light-harvesting antenna system and the resulting implications about the evolutionary transition during plant terrestrialization. Copyright © 2017 Elsevier Ltd. All rights reserved.
The genome sequence of Leishmania (Leishmania) amazonensis: functional annotation and extended analysis of gene models.

PubMed

Real, Fernando; Vidal, Ramon Oliveira; Carazzolle, Marcelo Falsarella; Mondego, Jorge Maurício Costa; Costa, Gustavo Gilson Lacerda; Herai, Roberto Hirochi; Würtele, Martin; de Carvalho, Lucas Miguel; Carmona e Ferreira, Renata; Mortara, Renato Arruda; Barbiéri, Clara Lucia; Mieczkowski, Piotr; da Silveira, José Franco; Briones, Marcelo Ribeiro da Silva; Pereira, Gonçalo Amarante Guimarães; Bahia, Diana

2013-12-01

We present the sequencing and annotation of the Leishmania (Leishmania) amazonensis genome, an etiological agent of human cutaneous leishmaniasis in the Amazon region of Brazil. L. (L.) amazonensis shares features with Leishmania (L.) mexicana but also exhibits unique characteristics regarding geographical distribution and clinical manifestations of cutaneous lesions (e.g. borderline disseminated cutaneous leishmaniasis). Predicted genes were scored for orthologous gene families and conserved domains in comparison with other human pathogenic Leishmania spp. Carboxypeptidase, aminotransferase, and 3'-nucleotidase genes and ATPase, thioredoxin, and chaperone-related domains were represented more abundantly in L. (L.) amazonensis and L. (L.) mexicana species. Phylogenetic analysis revealed that these two species share groups of amastin surface proteins unique to the genus that could be related to specific features of disease outcomes and host cell interactions. Additionally, we describe a hypothetical hybrid interactome of potentially secreted L. (L.) amazonensis proteins and host proteins under the assumption that parasite factors mimic their mammalian counterparts. The model predicts an interaction between an L. (L.) amazonensis heat-shock protein and mammalian Toll-like receptor 9, which is implicated in important immune responses such as cytokine and nitric oxide production. The analysis presented here represents valuable information for future studies of leishmaniasis pathogenicity and treatment.
The Genome Sequence of Leishmania (Leishmania) amazonensis: Functional Annotation and Extended Analysis of Gene Models

PubMed Central

Real, Fernando; Vidal, Ramon Oliveira; Carazzolle, Marcelo Falsarella; Mondego, Jorge Maurício Costa; Costa, Gustavo Gilson Lacerda; Herai, Roberto Hirochi; Würtele, Martin; de Carvalho, Lucas Miguel; e Ferreira, Renata Carmona; Mortara, Renato Arruda; Barbiéri, Clara Lucia; Mieczkowski, Piotr; da Silveira, José Franco; Briones, Marcelo Ribeiro da Silva; Pereira, Gonçalo Amarante Guimarães; Bahia, Diana

2013-01-01

We present the sequencing and annotation of the Leishmania (Leishmania) amazonensis genome, an etiological agent of human cutaneous leishmaniasis in the Amazon region of Brazil. L. (L.) amazonensis shares features with Leishmania (L.) mexicana but also exhibits unique characteristics regarding geographical distribution and clinical manifestations of cutaneous lesions (e.g. borderline disseminated cutaneous leishmaniasis). Predicted genes were scored for orthologous gene families and conserved domains in comparison with other human pathogenic Leishmania spp. Carboxypeptidase, aminotransferase, and 3′-nucleotidase genes and ATPase, thioredoxin, and chaperone-related domains were represented more abundantly in L. (L.) amazonensis and L. (L.) mexicana species. Phylogenetic analysis revealed that these two species share groups of amastin surface proteins unique to the genus that could be related to specific features of disease outcomes and host cell interactions. Additionally, we describe a hypothetical hybrid interactome of potentially secreted L. (L.) amazonensis proteins and host proteins under the assumption that parasite factors mimic their mammalian counterparts. The model predicts an interaction between an L. (L.) amazonensis heat-shock protein and mammalian Toll-like receptor 9, which is implicated in important immune responses such as cytokine and nitric oxide production. The analysis presented here represents valuable information for future studies of leishmaniasis pathogenicity and treatment. PMID:23857904
Transcriptomics exposes the uniqueness of parasitic plants.

PubMed

Ichihashi, Yasunori; Mutuku, J Musembi; Yoshida, Satoko; Shirasu, Ken

2015-07-01

Parasitic plants have the ability to obtain nutrients directly from other plants, and several species are serious biological threats to agriculture by parasitizing crops of high economic importance. The uniqueness of parasitic plants is characterized by the presence of a multicellular organ called a haustorium, which facilitates plant-plant interactions, and by the shutting down or reduction of their own photosynthesis. Current technical advances in next-generation sequencing and bioinformatics have allowed us to dissect the molecular mechanisms behind the uniqueness of parasitic plants at the genome-wide level. In this review, we summarize recent key findings mainly in transcriptomics that will give us insights into the future direction of parasitic plant research. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

Analysis of the Pantoea ananatis pan-genome reveals factors underlying its ability to colonize and interact with plant, insect and vertebrate hosts.

PubMed

De Maayer, Pieter; Chan, Wai Yin; Rubagotti, Enrico; Venter, Stephanus N; Toth, Ian K; Birch, Paul R J; Coutinho, Teresa A

2014-05-27

Pantoea ananatis is found in a wide range of natural environments, including water, soil, as part of the epi- and endophytic flora of various plant hosts, and in the insect gut. Some strains have proven effective as biological control agents and plant-growth promoters, while other strains have been implicated in diseases of a broad range of plant hosts and humans. By analysing the pan-genome of eight sequenced P. ananatis strains isolated from different sources we identified factors potentially underlying its ability to colonize and interact with hosts in both the plant and animal Kingdoms. The pan-genome of the eight compared P. ananatis strains consisted of a core genome comprised of 3,876 protein coding sequences (CDSs) and a sizeable accessory genome consisting of 1,690 CDSs. We estimate that ~106 unique CDSs would be added to the pan-genome with each additional P. ananatis genome sequenced in the future. The accessory fraction is derived mainly from integrated prophages and codes mostly for proteins of unknown function. Comparison of the translated CDSs on the P. ananatis pan-genome with the proteins encoded on all sequenced bacterial genomes currently available revealed that P. ananatis carries a number of CDSs with orthologs restricted to bacteria associated with distinct hosts, namely plant-, animal- and insect-associated bacteria. These CDSs encode proteins with putative roles in transport and metabolism of carbohydrate and amino acid substrates, adherence to host tissues, protection against plant and animal defense mechanisms and the biosynthesis of potential pathogenicity determinants including insecticidal peptides, phytotoxins and type VI secretion system effectors. P. ananatis has an 'open' pan-genome typical of bacterial species that colonize several different environments. The pan-genome incorporates a large number of genes encoding proteins that may enable P. ananatis to colonize, persist in and potentially cause disease symptoms in a wide range of plant and animal hosts.
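As a rough illustration of the core/accessory split described in the record above, the following Python sketch partitions ortholog clusters by their presence across strains. It is not the authors' pipeline; the function name `split_pan_genome`, the strain labels, and the cluster identifiers are all hypothetical toy values chosen for the example.

```python
from typing import Dict, Set, Tuple


def split_pan_genome(
    presence: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Dict[str, Set[str]]]:
    """Partition clusters into core, accessory, and strain-unique sets.

    `presence` maps a strain name to the set of ortholog clusters it carries.
    """
    strains = list(presence)
    # Pan-genome: every cluster seen in at least one strain.
    pan = set().union(*presence.values())
    # Core genome: clusters shared by every strain.
    core = set.intersection(*presence.values())
    # Accessory genome: everything that is not core.
    accessory = pan - core
    # Strain-unique clusters: present in one strain and in no other.
    unique = {}
    for s in strains:
        others = set().union(*(presence[o] for o in strains if o != s))
        unique[s] = presence[s] - others
    return core, accessory, unique


# Hypothetical toy data (not taken from the study).
toy = {
    "strainA": {"c1", "c2", "c3"},
    "strainB": {"c1", "c2", "c4"},
    "strainC": {"c1", "c2", "c3", "c5"},
}
core, accessory, unique = split_pan_genome(toy)
```

Counting how many new strain-unique clusters each added genome contributes is the quantity that an 'open' pan-genome estimate, such as the ~106 CDSs per genome reported above, summarizes.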
Gene expression profiling--Opening the black box of plant ecosystem responses to global change

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leakey, A.D.B.; Ainsworth, E.A.; Bernard, S.M.

The use of genomic techniques to address ecological questions is emerging as the field of genomic ecology. Experimentation under environmentally realistic conditions to investigate the molecular response of plants to meaningful changes in growth conditions and ecological interactions is the defining feature of genomic ecology. Since the impact of global change factors on plant performance is mediated by direct effects at the molecular, biochemical and physiological scales, gene expression analysis promises important advances in understanding factors that have previously been consigned to the 'black box' of unknown mechanism. Various tools and approaches are available for assessing gene expression in model and non-model species as part of global change biology studies. Each approach has its own unique advantages and constraints. A first generation of genomic ecology studies in managed ecosystems and mesocosms has provided a testbed for the approach and has begun to reveal how the experimental design and data analysis of gene expression studies can be tailored for use in an ecological context.
Extensive Mobilome-Driven Genome Diversification in Mouse Gut-Associated Bacteroides vulgatus mpk

PubMed Central

Lange, Anna; Beier, Sina; Steimle, Alex; Autenrieth, Ingo B.; Huson, Daniel H.; Frick, Julia-Stefanie

2016-01-01

Like many other Bacteroides species, Bacteroides vulgatus strain mpk, a mouse fecal isolate which was shown to promote intestinal homeostasis, utilizes a variety of mobile elements for genome evolution. Based on sequences collected by Pacific Biosciences SMRT sequencing technology, we discuss the challenges of assembling and studying a bacterial genome of high plasticity. Additionally, we conducted comparative genomics comparing this commensal strain with the B. vulgatus type strain ATCC 8482 as well as multiple other Bacteroides and Parabacteroides strains to reveal the most important differences and identify the unique features of B. vulgatus mpk. The genome of B. vulgatus mpk harbors a large and diverse set of mobile element proteins compared with other sequenced Bacteroides strains. We found evidence of a number of different horizontal gene transfer events and a genome landscape that has been extensively altered by different mobilization events. A CRISPR/Cas system could be identified that provides a possible mechanism for preventing the integration of invading external DNA. We propose that the high genome plasticity and the introduced genome instabilities of B. vulgatus mpk arising from the various mobilization events might play an important role not only in its adaptation to the challenging intestinal environment in general, but also in its ability to interact with the gut microbiota. PMID:27071651
Breaking-Cas—interactive design of guide RNAs for CRISPR-Cas experiments for ENSEMBL genomes

PubMed Central

Oliveros, Juan C.; Franch, Mònica; Tabas-Madrid, Daniel; San-León, David; Montoliu, Lluis; Cubas, Pilar; Pazos, Florencio

2016-01-01

The CRISPR/Cas technology is enabling targeted genome editing in multiple organisms with unprecedented accuracy and specificity by using RNA-guided nucleases. A critical point when planning a CRISPR/Cas experiment is the design of the guide RNA (gRNA), which directs the nuclease and associated machinery to the desired genomic location. This gRNA has to fulfil the requirements of the nuclease and lack homology with other genome sites that could lead to off-target effects. Here we introduce the Breaking-Cas system for the design of gRNAs for CRISPR/Cas experiments, including those based in the Cas9 nuclease as well as others recently introduced. The server has unique features not available in other tools, including the possibility of using all eukaryotic genomes available in ENSEMBL (currently around 700), placing variable PAM sequences at 5′ or 3′ and setting the guide RNA length and the scores per nucleotides. It can be freely accessed at: http://bioinfogp.cnb.csic.es/tools/breakingcas, and the code is available upon request. PMID:27166368
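To make the gRNA design constraints above concrete, here is a minimal Python sketch that scans a sequence for candidate guide sites with a configurable PAM, PAM side (5′ or 3′), and guide length. It is not the Breaking-Cas implementation: the function `find_guides`, its parameters, and the test sequences are hypothetical. It only scans the forward strand and does no off-target scoring. The `NGG` (3′) and `TTTV` (5′) defaults follow the commonly cited SpCas9 and Cas12a PAM conventions.

```python
import re
from typing import List, Tuple

# IUPAC codes needed for the PAMs used here (a subset, for illustration).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "V": "[ACG]",
}


def find_guides(
    seq: str, pam: str = "NGG", guide_len: int = 20, pam_side: str = "3prime"
) -> List[Tuple[int, str, str]]:
    """Return (guide_start, guide, pam_sequence) for forward-strand sites."""
    seq = seq.upper()
    pam_re = "".join(IUPAC[base] for base in pam.upper())
    guide_re = f"[ACGT]{{{guide_len}}}"
    # A zero-width lookahead lets overlapping candidate sites all be reported.
    if pam_side == "3prime":
        pattern = f"(?=({guide_re})({pam_re}))"
    elif pam_side == "5prime":
        pattern = f"(?=({pam_re})({guide_re}))"
    else:
        raise ValueError("pam_side must be '3prime' or '5prime'")

    hits = []
    for m in re.finditer(pattern, seq):
        if pam_side == "3prime":
            guide, pam_seq, start = m.group(1), m.group(2), m.start()
        else:
            pam_seq, guide = m.group(1), m.group(2)
            start = m.start() + len(pam)  # guide begins right after the PAM
        hits.append((start, guide, pam_seq))
    return hits


# Hypothetical test sequences.
cas9_hits = find_guides("A" * 20 + "TGG")
cas12a_hits = find_guides("TTTA" + "C" * 20, pam="TTTV", pam_side="5prime")
```

A real design tool would additionally search the reverse strand and rank each candidate by its similarity to other genomic sites, which is the off-target check the abstract refers to.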
Tissue-Specific Chromatin Modifications at a Multigene Locus Generate Asymmetric Transcriptional Interactions

PubMed Central

Yoo, Eung Jae; Cajiao, Isabela; Kim, Jeong-Seon; Kimura, Atsushi P.; Zhang, Aiwen; Cooke, Nancy E.; Liebhaber, Stephen A.

2006-01-01

Random assortment within mammalian genomes juxtaposes genes with distinct expression profiles. This organization, along with the prevalence of long-range regulatory controls, generates a potential for aberrant transcriptional interactions. The human CD79b/GH locus contains six tightly linked genes with three mutually exclusive tissue specificities and interdigitated control elements. One consequence of this compact organization is that the pituitary cell-specific transcriptional events that activate hGH-N also trigger ectopic activation of CD79b. However, the B-cell-specific events that activate CD79b do not trigger reciprocal activation of hGH-N. Here we utilized DNase I hypersensitive site mapping, chromatin immunoprecipitation, and transgenic models to explore the basis for this asymmetric relationship. The results reveal tissue-specific patterns of chromatin structures and transcriptional controls at the CD79b/GH locus in B cells distinct from those in the pituitary gland and placenta. These three unique transcriptional environments suggest a set of corresponding gene expression pathways and transcriptional interactions that are likely to be found juxtaposed at multiple sites within the eukaryotic genome. PMID:16847312
The ChIP-exo Method: Identifying Protein-DNA Interactions with Near Base Pair Precision.

PubMed

Perreault, Andrea A; Venters, Bryan J

2016-12-23

Chromatin immunoprecipitation (ChIP) is an indispensable tool in the fields of epigenetics and gene regulation that isolates specific protein-DNA interactions. ChIP coupled to high throughput sequencing (ChIP-seq) is commonly used to determine the genomic location of proteins that interact with chromatin. However, ChIP-seq is hampered by relatively low mapping resolution of several hundred base pairs and high background signal. The ChIP-exo method is a refined version of ChIP-seq that substantially improves upon both resolution and noise. The key distinction of the ChIP-exo methodology is the incorporation of lambda exonuclease digestion in the library preparation workflow to effectively footprint the left and right 5' DNA borders of the protein-DNA crosslink site. The ChIP-exo libraries are then subjected to high throughput sequencing. The resulting data can be leveraged to provide unique and ultra-high resolution insights into the functional organization of the genome. Here, we describe the ChIP-exo method that we have optimized and streamlined for mammalian systems and next-generation sequencing-by-synthesis platform.
PopHuman: the human population genomics browser.

PubMed

Casillas, Sònia; Mulet, Roger; Villegas-Mirón, Pablo; Hervas, Sergi; Sanz, Esteve; Velasco, Daniel; Bertranpetit, Jaume; Laayouni, Hafid; Barbadilla, Antonio

2018-01-04

The 1000 Genomes Project (1000GP) represents the most comprehensive world-wide nucleotide variation data set so far in humans, providing the sequencing and analysis of 2504 genomes from 26 populations and reporting >84 million variants. The availability of this sequence data provides the human lineage with an invaluable resource for population genomics studies, allowing the testing of molecular population genetics hypotheses and eventually the understanding of the evolutionary dynamics of genetic variation in human populations. Here we present PopHuman, a new population genomics-oriented genome browser based on JBrowse that allows the interactive visualization and retrieval of an extensive inventory of population genetics metrics. Efficient and reliable parameter estimates have been computed using a novel pipeline that faces the unique features and limitations of the 1000GP data, and include a battery of nucleotide variation measures, divergence and linkage disequilibrium parameters, as well as different tests of neutrality, estimated in non-overlapping windows along the chromosomes and in annotated genes for all 26 populations of the 1000GP. PopHuman is open and freely available at http://pophuman.uab.cat. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
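One of the nucleotide variation measures typically reported in non-overlapping windows is nucleotide diversity (π), the average number of pairwise differences per site. The Python sketch below shows that calculation on a toy alignment. It is an illustrative assumption rather than the PopHuman pipeline: the function names, the window size, and the haplotypes are hypothetical, and it ignores missing data and unphased genotypes.

```python
from itertools import combinations
from typing import List, Tuple


def pairwise_diffs(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def windowed_pi(haplotypes: List[str], window: int) -> List[Tuple[int, float]]:
    """Return (window_start, pi) for each full non-overlapping window."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be aligned to the same length")

    pairs = list(combinations(haplotypes, 2))
    result = []
    # Step by `window` so that windows never overlap; a trailing partial
    # window is skipped.
    for start in range(0, length - window + 1, window):
        end = start + window
        total = sum(pairwise_diffs(a[start:end], b[start:end]) for a, b in pairs)
        # Average over all pairs, then normalize per site.
        result.append((start, total / (len(pairs) * window)))
    return result


# Hypothetical aligned haplotypes.
toy_haplotypes = ["AAAAAAAA", "AAATAAAA", "AAATAAAC"]
pi_by_window = windowed_pi(toy_haplotypes, window=4)
```

In this toy alignment each window of four sites contains two pairwise differences across three pairs, so π is 2 / (3 × 4) in both windows.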
Unique core genomes of the bacterial family vibrionaceae: insights into niche adaptation and speciation.

PubMed

Kahlke, Tim; Goesmann, Alexander; Hjerde, Erik; Willassen, Nils Peder; Haugen, Peik

2012-05-10

The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data is becoming available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon exist and to investigate their potential role in adaptation of bacteria to their specific niche. These genes were named unique core genes. Additionally, we investigate the existence and importance of unique core genes that are found in isolates of phylogenetically non-coherent groups. These groups of isolates, that share a genetic feature without sharing a closest common ancestor, are termed genophyletic groups. The bacterial family Vibrionaceae was used as the model, and we compiled and compared genome sequences of 64 different isolates. Using the software orthoMCL we determined clusters of homologous genes among the investigated genome sequences. We used multilocus sequence analysis to build a host phylogeny and mapped the numbers of unique core genes of all distinct groups of isolates onto the tree. The results show that unique core genes are more likely to be found in monophyletic groups of isolates. Genophyletic groups of isolates, in contrast, are less common, especially for large groups of isolates. The subsequent annotation of unique core genes that are present in genophyletic groups indicates a high degree of horizontally transferred genes. Finally, the annotation of the unique core genes of Vibrio cholerae revealed genes involved in aerotaxis and biosynthesis of the iron-chelator vibriobactin. The presented work indicates that genes specific for any taxon inside the bacterial family Vibrionaceae exist. These unique core genes encode conserved metabolic functions that can shed light on the adaptation of a species to its ecological niche. Additionally, our study suggests that unique core genes can be used to aid classification of bacteria and contribute to a bacterial species definition on a genomic level. Furthermore, these genes may be of importance in clinical diagnostics and drug development.
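The definition of unique core genes given above (genes common to every genome in a group and specific to that group) can be sketched in a few lines of Python once homolog clusters are available. This is a simplified reading under stated assumptions, not the study's workflow: the function `unique_core_genes`, the cluster identifiers, and the isolate names are hypothetical, and clusters are assumed to have been computed beforehand, for example by a tool such as orthoMCL.

```python
from typing import Dict, Iterable, Set


def unique_core_genes(
    clusters: Dict[str, Set[str]], group: Iterable[str]
) -> Set[str]:
    """Return clusters found in every isolate of `group` and in no other.

    `clusters` maps a cluster ID to the set of isolates that carry it, so a
    cluster is a unique core gene of the group exactly when its isolate set
    equals the group.
    """
    target = set(group)
    return {cid for cid, isolates in clusters.items() if set(isolates) == target}


# Hypothetical toy clusters over four hypothetical isolates.
toy_clusters = {
    "cl1": {"iso1", "iso2", "iso3", "iso4"},  # core to all isolates
    "cl2": {"iso1", "iso2"},                  # unique core of {iso1, iso2}
    "cl3": {"iso1"},                          # isolate-specific
    "cl4": {"iso1", "iso2", "iso3"},          # shared beyond {iso1, iso2}
}
pair_specific = unique_core_genes(toy_clusters, ["iso1", "iso2"])
all_shared = unique_core_genes(toy_clusters, ["iso1", "iso2", "iso3", "iso4"])
```

Whether a group returned this way is monophyletic or genophyletic cannot be decided from the clusters alone; that requires comparing the group against a phylogeny such as the multilocus sequence analysis tree mentioned in the abstract.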
Hepatitis A Virus Genome Organization and Replication Strategy.

PubMed

McKnight, Kevin L; Lemon, Stanley M

2018-04-02

Hepatitis A virus (HAV) is a positive-strand RNA virus classified in the genus Hepatovirus of the family Picornaviridae. It is an ancient virus with a long evolutionary history and multiple features of its capsid structure, genome organization, and replication cycle that distinguish it from other mammalian picornaviruses. HAV proteins are produced by cap-independent translation of a single, long open reading frame under direction of an inefficient, upstream internal ribosome entry site (IRES). Genome replication occurs slowly and is noncytopathic, with transcription likely primed by a uridylated protein primer as in other picornaviruses. Newly produced quasi-enveloped virions (eHAV) are released from cells in a nonlytic fashion in a unique process mediated by interactions of capsid proteins with components of the host cell endosomal sorting complexes required for transport (ESCRT) system. Copyright © 2018 Cold Spring Harbor Laboratory Press; all rights reserved.
Somatic cell nuclear transfer: infinite reproduction of a unique diploid genome.

PubMed

Kishigami, Satoshi; Wakayama, Sayaka; Hosoi, Yoshihiko; Iritani, Akira; Wakayama, Teruhiko

2008-06-10

In mammals, the diploid genome of an individual, formed by fertilization of an egg by a spermatozoon, is unique and irreproducible. This implies that the unique diploid genome is lost when the individual dies. Even cultured cells from the individual cannot normally proliferate in perpetuity because of the "Hayflick limit". However, Dolly, the sheep cloned from an adult mammary gland cell, changes this scenario. Somatic cell nuclear transfer (SCNT) enables us to produce offspring without germ cells, that is, to "passage" a unique diploid genome. Animal cloning has also proven to be a powerful research tool for reprogramming in many mammals, notably mouse and cow. The mechanism underlying reprogramming, however, remains largely unknown, and animal cloning has been inefficient as a result. More importantly, in addition to abortion and fetal mortality, some cloned animals display possible premature aging phenotypes including early death and short telomere lengths. Under these inauspicious conditions, is it really possible for SCNT to preserve a diploid genome? Encouragingly, in mouse and recently in primate, using SCNT we can produce nuclear transfer ES cells (ntES) more efficiently, which can preserve the eternal lifespan for the "passage" of a unique diploid genome. Further, a new somatic cloning technique using histone-deacetylase inhibitors has been developed, which can significantly increase previous cloning rates two to six times. Here, we introduce SCNT and its value as a preservation tool for a diploid genome while reviewing aging of cloned animals on cellular and individual levels.
Gene: a gene-centered information resource at NCBI.

PubMed

Brown, Garth R; Hem, Vichet; Katz, Kenneth S; Ovetsky, Michael; Wallin, Craig; Ermolaeva, Olga; Tolstoy, Igor; Tatusova, Tatiana; Pruitt, Kim D; Maglott, Donna R; Murphy, Terence D

2015-01-01

The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Genomic and Proteomic Dissection of the Ubiquitous Plant Pathogen, Armillaria mellea: Toward a New Infection Model System

PubMed Central

2013-01-01

Armillaria mellea is a major plant pathogen. Yet, no large-scale “-omics” data are available to enable new studies, and limited experimental models are available to investigate basidiomycete pathogenicity. Here we reveal that the A. mellea genome comprises 58.35 Mb, contains 14473 gene models, of average length 1575 bp (4.72 introns/gene). Tandem mass spectrometry identified 921 mycelial (n = 629 unique) and secreted (n = 183 unique) proteins. Almost 100 mycelial proteins were either species-specific or previously unidentified at the protein level. A number of proteins (n = 111) was detected in both mycelia and culture supernatant extracts. Signal sequence occurrence was 4-fold greater for secreted (50.2%) compared to mycelial (12%) proteins. Analyses revealed a rich reservoir of carbohydrate degrading enzymes, laccases, and lignin peroxidases in the A. mellea proteome, reminiscent of both basidiomycete and ascomycete glycodegradative arsenals. We discovered that A. mellea exhibits a specific killing effect against Candida albicans during coculture. Proteomic investigation of this interaction revealed the unique expression of defensive and potentially offensive A. mellea proteins (n = 30). Overall, our data reveal new insights into the origin of basidiomycete virulence and we present a new model system for further studies aimed at deciphering fungal pathogenic mechanisms. PMID:23656496
Analysis of the Legionella longbeachae Genome and Transcriptome Uncovers Unique Strategies to Cause Legionnaires' Disease

PubMed Central

Rusniok, Christophe; Lomma, Mariella; Dervins-Ravault, Delphine; Newton, Hayley J.; Sansom, Fiona M.; Jarraud, Sophie; Zidane, Nora; Ma, Laurence; Bouchier, Christiane; Etienne, Jerôme; Hartland, Elizabeth L.; Buchrieser, Carmen

2010-01-01

Legionella pneumophila and L. longbeachae are two species of a large genus of bacteria that are ubiquitous in nature. L. pneumophila is mainly found in natural and artificial water circuits while L. longbeachae is mainly present in soil. Under the appropriate conditions both species are human pathogens, capable of causing a severe form of pneumonia termed Legionnaires' disease. Here we report the sequencing and analysis of four L. longbeachae genomes: one complete genome sequence of L. longbeachae strain NSW150 serogroup (Sg) 1, and three draft genome sequences, one belonging to Sg1 and two to Sg2. The genome organization and gene content of the four L. longbeachae genomes are highly conserved, indicating strong pressure for niche adaptation. Analysis and comparison of L. longbeachae strain NSW150 with L. pneumophila revealed common but also unexpected features specific to this pathogen. The interaction with host cells shows distinct features from L. pneumophila, as L. longbeachae possesses a unique repertoire of putative Dot/Icm type IV secretion system substrates, eukaryotic-like and eukaryotic domain proteins, and encodes additional secretion systems. However, analysis of the ability of a dotA mutant of L. longbeachae NSW150 to replicate in Acanthamoeba castellanii and in a mouse lung infection model showed that the Dot/Icm type IV secretion system is also essential for the virulence of L. longbeachae. In contrast to L. pneumophila, L. longbeachae does not encode flagella, thereby providing a possible explanation for differences in mouse susceptibility to infection between the two pathogens. Furthermore, transcriptome analysis revealed that L. longbeachae has a less pronounced biphasic life cycle as compared to L. pneumophila, and genome analysis and electron microscopy suggested that L. longbeachae is encapsulated. These species-specific differences may account for the different environmental niches and disease epidemiology of these two Legionella species. PMID:20174605
The Genome of the “Great Speciator” Provides Insights into Bird Diversification

PubMed Central

Cornetti, Luca; Valente, Luis M.; Dunning, Luke T.; Quan, Xueping; Black, Richard A.; Hébert, Olivier; Savolainen, Vincent

2015-01-01

Among birds, white-eyes (genus Zosterops) have diversified so extensively that Jared Diamond and Ernst Mayr referred to them as the “great speciator.” The Zosterops lineage exhibits some of the fastest rates of species diversification among vertebrates, and its members are the most prolific passerine island colonizers. We present a high-quality genome assembly for the silvereye (Zosterops lateralis), a white-eye species consisting of several subspecies distributed across multiple islands. We investigate the genetic basis of rapid diversification in white-eyes by conducting genomic analyses at varying taxonomic levels. First, we compare the silvereye genome with those of birds from different families and searched for genomic features that may be unique to Zosterops. Second, we compare the genomes of different species of white-eyes from Lifou island (South Pacific), using whole genome resequencing and restriction site associated DNA. Third, we contrast the genomes of two subspecies of silvereye that differ in plumage color. In accordance with theory, we show that white-eyes have high rates of substitutions, gene duplication, and positive selection relative to other birds. Below genus level, we find that genomic differentiation accumulates rapidly and reveals contrasting demographic histories between sympatric species on Lifou, indicative of past interspecific interactions. Finally, we highlight genes possibly involved in color polymorphism between the subspecies of silvereye. By providing the first whole-genome sequence resources for white-eyes and by conducting analyses at different taxonomic levels, we provide genomic evidence underpinning this extraordinary bird radiation. PMID:26338191
Signatures of Pleiotropy, Economy and Convergent Evolution in a Domain-Resolved Map of Human–Virus Protein–Protein Interaction Networks

PubMed Central

Garamszegi, Sara; Franzosa, Eric A.; Xia, Yu

2013-01-01

A central challenge in host-pathogen systems biology is the elucidation of general, systems-level principles that distinguish host-pathogen interactions from within-host interactions. Current analyses of host-pathogen and within-host protein-protein interaction networks are largely limited by their resolution, treating proteins as nodes and interactions as edges. Here, we construct a domain-resolved map of human-virus and within-human protein-protein interaction networks by annotating protein interactions with high-coverage, high-accuracy, domain-centric interaction mechanisms: (1) domain-domain interactions, in which a domain in one protein binds to a domain in a second protein, and (2) domain-motif interactions, in which a domain in one protein binds to a short, linear peptide motif in a second protein. Analysis of these domain-resolved networks reveals, for the first time, significant mechanistic differences between virus-human and within-human interactions at the resolution of single domains. While human proteins tend to compete with each other for domain binding sites by means of sequence similarity, viral proteins tend to compete with human proteins for domain binding sites in the absence of sequence similarity. Independent of their previously established preference for targeting human protein hubs, viral proteins also preferentially target human proteins containing linear motif-binding domains. Compared to human proteins, viral proteins participate in more domain-motif interactions, target more unique linear motif-binding domains per residue, and contain more unique linear motifs per residue. Together, these results suggest that viruses surmount genome size constraints by convergently evolving multiple short linear motifs in order to effectively mimic, hijack, and manipulate complex host processes for their survival. Our domain-resolved analyses reveal unique signatures of pleiotropy, economy, and convergent evolution in viral-host interactions that are otherwise hidden in the traditional binary network, highlighting the power and necessity of high-resolution approaches in host-pathogen systems biology. PMID:24339775
Signatures of pleiotropy, economy and convergent evolution in a domain-resolved map of human-virus protein-protein interaction networks.

PubMed

Garamszegi, Sara; Franzosa, Eric A; Xia, Yu

2013-01-01

A central challenge in host-pathogen systems biology is the elucidation of general, systems-level principles that distinguish host-pathogen interactions from within-host interactions. Current analyses of host-pathogen and within-host protein-protein interaction networks are largely limited by their resolution, treating proteins as nodes and interactions as edges. Here, we construct a domain-resolved map of human-virus and within-human protein-protein interaction networks by annotating protein interactions with high-coverage, high-accuracy, domain-centric interaction mechanisms: (1) domain-domain interactions, in which a domain in one protein binds to a domain in a second protein, and (2) domain-motif interactions, in which a domain in one protein binds to a short, linear peptide motif in a second protein. Analysis of these domain-resolved networks reveals, for the first time, significant mechanistic differences between virus-human and within-human interactions at the resolution of single domains. While human proteins tend to compete with each other for domain binding sites by means of sequence similarity, viral proteins tend to compete with human proteins for domain binding sites in the absence of sequence similarity. Independent of their previously established preference for targeting human protein hubs, viral proteins also preferentially target human proteins containing linear motif-binding domains. Compared to human proteins, viral proteins participate in more domain-motif interactions, target more unique linear motif-binding domains per residue, and contain more unique linear motifs per residue. Together, these results suggest that viruses surmount genome size constraints by convergently evolving multiple short linear motifs in order to effectively mimic, hijack, and manipulate complex host processes for their survival. Our domain-resolved analyses reveal unique signatures of pleiotropy, economy, and convergent evolution in viral-host interactions that are otherwise hidden in the traditional binary network, highlighting the power and necessity of high-resolution approaches in host-pathogen systems biology.
THE INVOLVEMENT OF HUMAN MONOGENIC CARDIOMYOPATHY GENES IN EXPERIMENTAL POLYGENIC CARDIAC HYPERTROPHY.

PubMed

Prestes, Priscilla R; Marques, Francine Z; Lopez-Campos, Guillermo; Lewandowski, Paul; Delbridge, Lea M D; Charchar, Fadi J; Harrap, Stephen B

2018-05-18

Hypertrophic cardiomyopathy thickens heart muscles reducing functionality and increasing risk of cardiac disease and morbidity. Genetic factors are involved, but their contribution is poorly understood. We used the hypertrophic heart rat (HHR), a unique normotensive polygenic model of cardiac hypertrophy and heart failure to investigate the role of genes associated with monogenic human cardiomyopathy. We selected 42 genes involved in monogenic human cardiomyopathies to study: 1) DNA variants, by sequencing the whole-genome of 13-week old HHR and age-matched normal heart rat (NHR), its genetic control strain; 2) mRNA expression, by targeted RNA-sequencing in left ventricles of HHR and NHR at five ages (2-days old, 4-, 13-, 33- and 50-weeks old) compared to human idiopathic dilated data; and 3) microRNA expression, with rat microRNA microarrays in left ventricles of 2-days old HHR and age-matched NHR. We also investigated experimentally validated microRNA-mRNA interactions. Whole-genome sequencing revealed unique variants mostly located in non-coding regions of HHR and NHR. We found 29 genes differentially expressed in at least one age. Genes encoding desmoglein 2 (Dsg2) and transthyretin (Ttr) were significantly differentially expressed at all ages in the HHR, but only Ttr was also differentially expressed in human idiopathic cardiomyopathy. Lastly, only two microRNAs differentially expressed in the HHR were present in our comparison of validated microRNA-mRNA interactions. These two microRNAs interact with five of the genes studied. Our study shows that genes involved in monogenic forms of human cardiomyopathies may also influence polygenic forms of the disease.
The Dynamic Interplay Between DNA Topoisomerases and DNA Topology.

PubMed

Seol, Yeonee; Neuman, Keir C

2016-09-01

Topological properties of DNA influence its structure and biochemical interactions. Within the cell, DNA topology is constantly in flux. Transcription and other essential processes, including DNA replication and repair, alter the topology of the genome while introducing additional complications associated with DNA knotting and catenation. These topological perturbations are counteracted by the action of topoisomerases, a specialized class of highly conserved and essential enzymes that actively regulate the topological state of the genome. This dynamic interplay among DNA topology, DNA processing enzymes, and DNA topoisomerases is a pervasive factor that influences DNA metabolism in vivo. Building on the extensive structural and biochemical characterization over the past four decades that established the fundamental mechanistic basis of topoisomerase activity, the unique roles played by DNA topology in modulating and influencing the activity of topoisomerases have begun to be explored. In this review we survey established and emerging DNA topology-dependent protein-DNA interactions with a focus on in vitro measurements of the dynamic interplay between DNA topology and topoisomerase activity.
The dynamic interplay between DNA topoisomerases and DNA topology.

PubMed

Seol, Yeonee; Neuman, Keir C

2016-11-01

Topological properties of DNA influence its structure and biochemical interactions. Within the cell, DNA topology is constantly in flux. Transcription and other essential processes, including DNA replication and repair, not only alter the topology of the genome but also introduce additional complications associated with DNA knotting and catenation. These topological perturbations are counteracted by the action of topoisomerases, a specialized class of highly conserved and essential enzymes that actively regulate the topological state of the genome. This dynamic interplay among DNA topology, DNA processing enzymes, and DNA topoisomerases is a pervasive factor that influences DNA metabolism in vivo. Building on the extensive structural and biochemical characterization over the past four decades that has established the fundamental mechanistic basis of topoisomerase activity, scientists have begun to explore the unique roles played by DNA topology in modulating and influencing the activity of topoisomerases. In this review we survey established and emerging DNA topology-dependent protein-DNA interactions with a focus on in vitro measurements of the dynamic interplay between DNA topology and topoisomerase activity.
Position-based scanning for comparative genomics and identification of genetic islands in Haemophilus influenzae type b.

PubMed

Bergman, Nicholas H; Akerley, Brian J

2003-03-01

Bacteria exhibit extensive genetic heterogeneity within species. In many cases, these differences account for virulence properties unique to specific strains. Several such loci have been discovered in the genome of the type b serotype of Haemophilus influenzae, a human pathogen able to cause meningitis, pneumonia, and septicemia. Here we report application of a PCR-based scanning procedure to compare the genome of a virulent type b (Hib) strain with that of the laboratory-passaged Rd KW20 strain for which a complete genome sequence is available. We have identified seven DNA segments or H. influenzae genetic islands (HiGIs) present in the type b genome and absent from the Rd genome. These segments vary in size and content and show signs of horizontal gene transfer in that their percent G+C content differs from that of the rest of the H. influenzae genome, they contain genes similar to those found on phages or other mobile elements, or they are flanked by DNA repeats. Several of these loci represent potential pathogenicity islands, because they contain genes likely to mediate interactions with the host. These newly identified genetic islands provide areas of investigation into both the evolution and pathogenesis of H. influenzae. In addition, the genome scanning approach developed to identify these islands provides a rapid means to compare the genomes of phenotypically diverse bacterial strains once the genome sequence of one representative strain has been determined.
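One of the horizontal-transfer signals mentioned above, a percent G+C content that differs from the rest of the genome, can be illustrated with a small sliding-window scan. This is only a sketch and not the PCR-based scanning procedure of the paper; the function names, window size, step and threshold are assumptions chosen for illustration.

```python
def gc_percent(seq: str) -> float:
    """Return the percent G+C of a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def atypical_windows(genome: str, window: int = 1000, step: int = 500,
                     threshold: float = 5.0):
    """List windows whose G+C differs from the genome-wide value.

    Returns (start, end, difference) tuples for every window whose
    percent G+C deviates from the whole-sequence percent G+C by at
    least `threshold` percentage points (hypothetical cutoff).
    """
    genome_gc = gc_percent(genome)
    hits = []
    last_start = max(len(genome) - window + 1, 1)
    for start in range(0, last_start, step):
        win = genome[start:start + window]
        diff = gc_percent(win) - genome_gc
        if abs(diff) >= threshold:
            hits.append((start, start + len(win), round(diff, 2)))
    return hits
```

In practice a deviation in G+C is only suggestive; the abstract also lists phage-like genes and flanking repeats as independent signs of transfer.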

MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

PubMed

Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

2016-06-01

MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources. © 2016 WILEY PERIODICALS, INC.
Phaeobacter gallaeciensis genomes from globally opposite locations reveal high similarity of adaptation to surface life

PubMed Central

Thole, Sebastian; Kalhoefer, Daniela; Voget, Sonja; Berger, Martine; Engelhardt, Tim; Liesegang, Heiko; Wollherr, Antje; Kjelleberg, Staffan; Daniel, Rolf; Simon, Meinhard; Thomas, Torsten; Brinkhoff, Thorsten

2012-01-01

Phaeobacter gallaeciensis, a member of the abundant marine Roseobacter clade, is known to be an effective colonizer of biotic and abiotic marine surfaces. Production of the antibiotic tropodithietic acid (TDA) makes P. gallaeciensis a strong antagonist of many bacteria, including fish and mollusc pathogens. In addition to TDA, several other secondary metabolites are produced, allowing the mutualistic bacterium to also act as an opportunistic pathogen. Here we provide the manually annotated genome sequences of the P. gallaeciensis strains DSM 17395 and 2.10, isolated at the Atlantic coast of north western Spain and near Sydney, Australia, respectively. Despite their isolation sites from the two different hemispheres, the genome comparison demonstrated a surprisingly high level of synteny (only 3% nucleotide dissimilarity and 88% and 93% shared genes). Minor differences in the genomes result from horizontal gene transfer and phage infection. Comparison of the P. gallaeciensis genomes with those of other roseobacters revealed unique genomic traits, including the production of iron-scavenging siderophores. Experiments supported the predicted capacity of both strains to grow on various algal osmolytes. Transposon mutagenesis was used to expand the current knowledge on the TDA biosynthesis pathway in strain DSM 17395. This first comparative genomic analysis of finished genomes of two closely related strains belonging to one species of the Roseobacter clade revealed features that provide competitive advantages and facilitate surface attachment and interaction with eukaryotic hosts. PMID:22717884
Extensive Mobilome-Driven Genome Diversification in Mouse Gut-Associated Bacteroides vulgatus mpk.

PubMed

Lange, Anna; Beier, Sina; Steimle, Alex; Autenrieth, Ingo B; Huson, Daniel H; Frick, Julia-Stefanie

2016-04-25

Like many other Bacteroides species, Bacteroides vulgatus strain mpk, a mouse fecal isolate which was shown to promote intestinal homeostasis, utilizes a variety of mobile elements for genome evolution. Based on sequences collected by Pacific Biosciences SMRT sequencing technology, we discuss the challenges of assembling and studying a bacterial genome of high plasticity. Additionally, we conducted comparative genomics comparing this commensal strain with the B. vulgatus type strain ATCC 8482 as well as multiple other Bacteroides and Parabacteroides strains to reveal the most important differences and identify the unique features of B. vulgatus mpk. The genome of B. vulgatus mpk harbors a large and diverse set of mobile element proteins compared with other sequenced Bacteroides strains. We found evidence of a number of different horizontal gene transfer events and a genome landscape that has been extensively altered by different mobilization events. A CRISPR/Cas system could be identified that provides a possible mechanism for preventing the integration of invading external DNA. We propose that the high genome plasticity and the introduced genome instabilities of B. vulgatus mpk arising from the various mobilization events might play an important role not only in its adaptation to the challenging intestinal environment in general, but also in its ability to interact with the gut microbiota. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
SARS-unique fold in the Rousettus bat coronavirus HKU9.

PubMed

Hammond, Robert G; Tan, Xuan; Johnson, Margaret A

2017-09-01

The coronavirus nonstructural protein 3 (nsp3) is a multifunctional protein that comprises multiple structural domains. This protein assists viral polyprotein cleavage, host immune interference, and may play other roles in genome replication or transcription. Here, we report the solution NMR structure of a protein from the "SARS-unique region" of the bat coronavirus HKU9. The protein contains a frataxin fold or double-wing motif, which is an α + β fold that is associated with protein/protein interactions, DNA binding, and metal ion binding. High structural similarity to the human severe acute respiratory syndrome (SARS) coronavirus nsp3 is present. A possible functional site that is conserved among some betacoronaviruses has been identified using bioinformatics and biochemical analyses. This structure provides strong experimental support for the recent proposal advanced by us and others that the "SARS-unique" region is not unique to the human SARS virus, but is conserved among several different phylogenetic groups of coronaviruses and provides essential functions. © 2017 The Protein Society.
Sequencing Needs for Viral Diagnostics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gardner, S N; Lam, M; Mulakken, N J

2004-01-26

We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives ("near neighbors") that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.
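The two selection criteria described above (conserved among target strains, unique relative to other species) can be sketched with exact-match k-mers. This is a simplified illustration and not the authors' signature prediction pipeline, whose internals the abstract does not describe; the function names and the default `k` are assumptions.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_signatures(targets: list[str], non_targets: list[str],
                         k: int = 20) -> set[str]:
    """Return k-mers present in every target and absent from all non-targets.

    Conservation (present in all target strains) guards against false
    negatives; uniqueness (absent from near neighbors and other
    species) guards against false positives.
    """
    if not targets:
        return set()
    conserved = set.intersection(*(kmers(t, k) for t in targets))
    background = set().union(*(kmers(n, k) for n in non_targets))
    return conserved - background
```

Under this toy model, adding target genomes can only shrink the conserved set and adding near neighbors can only shrink the unique set, which mirrors why the abstract asks how many genomes of each kind are needed before predictions stabilize.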
Cross-species comparison of the gut: Differential gene expression sheds light on biological differences in closely related tenebrionids.

PubMed

Oppert, Brenda; Perkin, Lindsey; Martynov, Alexander G; Elpidina, Elena N

2018-04-01

The gut is one of the primary interfaces between an insect and its environment. Understanding gene expression profiles in the insect gut can provide insight into interactions with the environment as well as identify potential control methods for pests. We compared the expression profiles of transcripts from the gut of larval stages of two coleopteran insects, Tenebrio molitor and Tribolium castaneum. These tenebrionids have different life cycles, varying in the duration and number of larval instars. T. castaneum has a sequenced genome and has been a model for coleopterans, and we recently obtained a draft genome for T. molitor. We assembled gut transcriptome reads from each insect to their respective genomes and filtered mapped reads to RPKM>1, yielding 11,521 and 17,871 genes in the T. castaneum and T. molitor datasets, respectively. There were identical GO terms in each dataset, and enrichment analyses also identified shared GO terms. From these datasets, we compiled an ortholog list of 6907 genes; 45% of the total assembled reads from T. castaneum were found in the top 25 orthologs, but only 27% of assembled reads were found in the top 25 T. molitor orthologs. There were 2281 genes unique to T. castaneum, and 2088 predicted genes unique to T. molitor, although improvements to the T. molitor genome will likely reduce these numbers as more orthologs are identified. We highlight a few unique genes in T. castaneum or T. molitor that may relate to distinct biological functions. A large number of putative genes expressed in the larval gut with uncharacterized functions (36 and 68% from T. castaneum and T. molitor, respectively) support the need for further research. These data are the first step in building a comprehensive understanding of the physiology of the gut in tenebrionid insects, illustrating commonalities and differences that may be related to speciation and environmental adaptation. Published by Elsevier Ltd.
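The RPKM>1 filter mentioned above can be illustrated with the standard definition of RPKM (reads per kilobase of transcript per million mapped reads). The sketch below is not the authors' code; the gene names, the dictionary layout and the use of the summed counts as the total of mapped reads are assumptions made for illustration.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase per million mapped reads.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def filter_expressed(counts: dict[str, int], lengths: dict[str, int],
                     threshold: float = 1.0) -> dict[str, float]:
    """Keep genes whose RPKM exceeds the threshold (default: RPKM > 1)."""
    total = sum(counts.values())  # assumption: all mapped reads are in `counts`
    if total == 0:
        return {}
    values = {gene: rpkm(n, lengths[gene], total) for gene, n in counts.items()}
    return {gene: value for gene, value in values.items() if value > threshold}
```

For example, a 1,000 bp gene with 500 reads out of one million mapped reads has an RPKM of 500 and passes the filter, while a 2,000 bp gene with a single read has an RPKM of 0.5 and is dropped.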
Breaking-Cas-interactive design of guide RNAs for CRISPR-Cas experiments for ENSEMBL genomes.

PubMed

Oliveros, Juan C; Franch, Mònica; Tabas-Madrid, Daniel; San-León, David; Montoliu, Lluis; Cubas, Pilar; Pazos, Florencio

2016-07-08

The CRISPR/Cas technology is enabling targeted genome editing in multiple organisms with unprecedented accuracy and specificity by using RNA-guided nucleases. A critical point when planning a CRISPR/Cas experiment is the design of the guide RNA (gRNA), which directs the nuclease and associated machinery to the desired genomic location. This gRNA has to fulfil the requirements of the nuclease and lack homology with other genome sites that could lead to off-target effects. Here we introduce the Breaking-Cas system for the design of gRNAs for CRISPR/Cas experiments, including those based on the Cas9 nuclease as well as others recently introduced. The server has unique features not available in other tools, including the possibility of using all eukaryotic genomes available in ENSEMBL (currently around 700), placing variable PAM sequences at 5' or 3', and setting the guide RNA length and the per-nucleotide scores. It can be freely accessed at: http://bioinfogp.cnb.csic.es/tools/breakingcas, and the code is available upon request. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
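The first step of gRNA design described above, locating candidate target sites next to a PAM that may sit at the 5' or 3' end, can be sketched as a simple sequence scan. This is not the Breaking-Cas implementation (its code is available only on request); the function names, the partial IUPAC table and the forward-strand-only search are assumptions, and off-target scoring is not attempted.

```python
import re

# Partial IUPAC table (assumption: only the codes needed for common PAMs).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "V": "[ACG]"}


def pam_to_regex(pam: str) -> str:
    """Translate a PAM written with IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_guides(seq: str, pam: str = "NGG", guide_len: int = 20,
                pam_side: str = "3prime"):
    """Return (guide_start, guide, pam) tuples on the forward strand.

    pam_side="3prime" looks for guide followed by PAM (Cas9-like);
    pam_side="5prime" looks for PAM followed by guide.
    A lookahead is used so that overlapping sites are all reported.
    """
    seq = seq.upper()
    pam_re = pam_to_regex(pam)
    guide_re = f"[ACGT]{{{guide_len}}}"
    if pam_side == "3prime":
        pattern = f"(?=({guide_re})({pam_re}))"
    elif pam_side == "5prime":
        pattern = f"(?=({pam_re})({guide_re}))"
    else:
        raise ValueError("pam_side must be '3prime' or '5prime'")

    sites = []
    for match in re.finditer(pattern, seq):
        if pam_side == "3prime":
            guide, pam_seq = match.group(1), match.group(2)
            start = match.start()
        else:
            pam_seq, guide = match.group(1), match.group(2)
            start = match.start() + len(pam_seq)
        sites.append((start, guide, pam_seq))
    return sites
```

A real design tool would also scan the reverse strand and then compare each candidate against the whole genome to rank potential off-target sites, which is the part the abstract describes as requiring a lack of homology with other genome sites.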
Deciphering the genomes of 16 Acanthamoeba species does not provide evidence of integration of known giant virus-associated mobile genetic elements.

PubMed

Chelkha, Nisrine; Colson, Philippe; Levasseur, Anthony; La Scola, Bernard

2018-06-02

Giant viruses infect protozoa, especially amoebae of the genus Acanthamoeba. These viruses possess mobile genetic elements collectively termed the mobilome. So far, this mobilome comprises provirophages, which are integrated into the genome of their hosts, transpovirons, and Maverick/Polintons. Virophages replicate inside virus factories within Acanthamoeba and can decrease the infectivity of giant viruses. The virophage infecting CroV was found to be integrated in the host of CroV, Cafeteria roenbergensis, thus protecting C. roenbergensis by reduction of CroV multiplication. Because of this unique property, assessment of the mechanisms of replication of virophages and their relationship with giant viruses is a key element of this investigation. This work aimed at evaluating the presence and the dynamics of these mobile elements in sixteen Acanthamoeba genomes. No significant traces of the integration of genomes or sequences from known virophages were identified in any of the available Acanthamoeba genomes. These results led us to hypothesize that the interactions between mimiviruses and their virophages might occur through different mechanisms, or at low frequency. An additional explanation could be that our knowledge of the diversity of virophages is still very limited. Copyright © 2018 Elsevier B.V. All rights reserved.
Whole-Genome Sequence Analysis of Bombella intestini LMG 28161T, a Novel Acetic Acid Bacterium Isolated from the Crop of a Red-Tailed Bumble Bee, Bombus lapidarius.

PubMed

Li, Leilei; Illeghems, Koen; Van Kerrebroeck, Simon; Borremans, Wim; Cleenwerck, Ilse; Smagghe, Guy; De Vuyst, Luc; Vandamme, Peter

2016-01-01

The whole-genome sequence of Bombella intestini LMG 28161T, an endosymbiotic acetic acid bacterium (AAB) occurring in bumble bees, was determined to investigate the molecular mechanisms underlying its metabolic capabilities. The draft genome sequence of B. intestini LMG 28161T was 2.02 Mb. Metabolic carbohydrate pathways were in agreement with the metabolite analyses of fermentation experiments and revealed its oxidative capacity towards sucrose, D-glucose, D-fructose and D-mannitol, but not ethanol and glycerol. The results of the fermentation experiments also demonstrated that the lack of effective aeration in small-scale carbohydrate consumption experiments may be responsible for the lack of reproducibility of such results in taxonomic studies of AAB. Finally, compared to the genome sequences of its nearest phylogenetic neighbor and of three other insect associated AAB strains, the B. intestini LMG 28161T genome lost 69 orthologs and included 89 unique genes. Although many of the latter were hypothetical they also included several type IV secretion system proteins, amino acid transporter/permeases and membrane proteins which might play a role in the interaction with the bumble bee host.
The Ever-Evolving Concept of the Gene: The Use of RNA/Protein Experimental Techniques to Understand Genome Functions

PubMed Central

Cipriano, Andrea; Ballarino, Monica

2018-01-01

The completion of the human genome sequence together with advances in sequencing technologies have shifted the paradigm of the genome, as composed of discrete and hereditable coding entities, and have shown the abundance of functional noncoding DNA. This part of the genome, previously dismissed as “junk” DNA, increases proportionally with organismal complexity and contributes to gene regulation beyond the boundaries of known protein-coding genes. Different classes of functionally relevant nonprotein-coding RNAs are transcribed from noncoding DNA sequences. Among them are the long noncoding RNAs (lncRNAs), which are thought to participate in the basal regulation of protein-coding genes at both transcriptional and post-transcriptional levels. Although knowledge of this field is still limited, the ability of lncRNAs to localize in different cellular compartments, to fold into specific secondary structures and to interact with different molecules (RNA or proteins) endows them with multiple regulatory mechanisms. It is becoming evident that lncRNAs may play a crucial role in most biological processes such as the control of development, differentiation and cell growth. This review places the evolution of the concept of the gene in its historical context, from Darwin's hypothetical mechanism of heredity to the post-genomic era. We discuss how the original idea of protein-coding genes as unique determinants of phenotypic traits has been reconsidered in light of the existence of noncoding RNAs. We summarize the technological developments which have been made in the genome-wide identification and study of lncRNAs and emphasize the methodologies that have aided our understanding of the complexity of lncRNA-protein interactions in recent years. PMID:29560353
Making the Bend: DNA Tertiary Structure and Protein-DNA Interactions

PubMed Central

Harteis, Sabrina; Schneider, Sabine

2014-01-01

DNA structure functions as an overlapping code to the DNA sequence. Rapid progress in understanding the role of DNA structure in gene regulation, DNA damage recognition and genome stability has been made. The three-dimensional structure of both proteins and DNA plays a crucial role in their specific interaction, and proteins can recognise the chemical signature of DNA sequence (“base readout”) as well as the intrinsic DNA structure (“shape recognition”). These recognition mechanisms do not exist in isolation but, depending on the individual interaction partners, are combined to various extents. The driving force for the interaction between protein and DNA remains the unique thermodynamics of each individual DNA-protein pair. In this review we focus on the structures and conformations adopted by DNA, both influenced by and influencing the specific interaction with the corresponding protein binding partner, as well as their underlying thermodynamics. PMID:25026169
Improving the annotation of the Heterorhabditis bacteriophora genome.

PubMed

McLean, Florence; Berger, Duncan; Laetsch, Dominik R; Schwartz, Hillel T; Blaxter, Mark

2018-04-01

Genome assembly and annotation remain exacting tasks. As the tools available for these tasks improve, it is useful to return to data produced with earlier techniques to assess their credibility and correctness. The entomopathogenic nematode Heterorhabditis bacteriophora is widely used to control insect pests in horticulture. The genome sequence for this species was reported to encode an unusually high proportion of unique proteins and a paucity of secreted proteins compared to other related nematodes. We revisited the H. bacteriophora genome assembly and gene predictions to determine whether these unusual characteristics were biological or methodological in origin. We mapped an independent resequencing dataset to the genome and used the blobtools pipeline to identify potential contaminants. While present (0.2% of the genome span, 0.4% of predicted proteins), assembly contamination was not significant. Re-prediction of the gene set using BRAKER1 and published transcriptome data generated a predicted proteome that was very different from the published one. The new gene set had a much reduced complement of unique proteins, better completeness values that were in line with other related species' genomes, and an increased number of proteins predicted to be secreted. It is thus likely that methodological issues drove the apparent uniqueness of the initial H. bacteriophora genome annotation and that similar contamination and misannotation issues affect other published genome assemblies.
CartograTree: connecting tree genomes, phenotypes and environment.

PubMed

Vasquez-Gross, Hans A; Yu, John J; Figueroa, Ben; Gessler, Damian D G; Neale, David B; Wegrzyn, Jill L

2013-05-01

Today, researchers spend a tremendous amount of time gathering, formatting, filtering and visualizing data collected from disparate sources. Under the umbrella of forest tree biology, we seek to provide a platform and leverage modern technologies to connect biotic and abiotic data. Our goal is to provide an integrated web-based workspace that connects environmental, genomic and phenotypic data via geo-referenced coordinates. Here, we connect the genomic query web-based workspace, DiversiTree and a novel geographical interface called CartograTree to data housed on the TreeGenes database. To accomplish this goal, we implemented Simple Semantic Web Architecture and Protocol to enable the primary genomics database, TreeGenes, to communicate with semantic web services regardless of platform or back-end technologies. The novelty of CartograTree lies in the interactive workspace that allows for geographical visualization and engagement of high performance computing (HPC) resources. The application provides a unique tool set to facilitate research on the ecology, physiology and evolution of forest tree species. CartograTree can be accessed at: http://dendrome.ucdavis.edu/cartogratree. © 2013 Blackwell Publishing Ltd.
Identification and analysis of the bacterial endosymbiont specialized for production of the chemotherapeutic natural product ET-743

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schofield, Michael M.; Jain, Sunit; Porat, Daphne

Ecteinascidin 743 (ET-743, Yondelis) is a clinically approved chemotherapeutic natural product isolated from the Caribbean mangrove tunicate Ecteinascidia turbinata. Researchers have long suspected that a microorganism may be the true producer of the anti-cancer drug, but its genome has remained elusive due to our inability to culture the bacterium in the laboratory using standard techniques. Here, we sequenced and assembled the complete genome of the ET-743 producer, Candidatus Endoecteinascidia frumentensis, directly from metagenomic DNA isolated from the tunicate. Analysis of the ~631 kb microbial genome revealed strong evidence of an endosymbiotic lifestyle and extreme genome reduction. Phylogenetic analysis suggested that the producer of the anti-cancer drug is taxonomically distinct from other sequenced microorganisms and could represent a new family of Gammaproteobacteria. The complete genome has also greatly expanded our understanding of ET-743 production and revealed new biosynthetic genes dispersed across more than 173 kb of the small genome. The gene cluster’s architecture and its preservation demonstrate that the drug is likely essential to the interactions of the microorganism with its mangrove tunicate host. Taken together, these studies elucidate the lifestyle of a unique and pharmaceutically important microorganism and highlight the wide diversity of bacteria capable of making potent natural products.
Identification and analysis of the bacterial endosymbiont specialized for production of the chemotherapeutic natural product ET-743

DOE PAGES

Schofield, Michael M.; Jain, Sunit; Porat, Daphne; ...

2015-07-21

Ecteinascidin 743 (ET-743, Yondelis) is a clinically approved chemotherapeutic natural product isolated from the Caribbean mangrove tunicate Ecteinascidia turbinata. Researchers have long suspected that a microorganism may be the true producer of the anti-cancer drug, but its genome has remained elusive due to our inability to culture the bacterium in the laboratory using standard techniques. Here, we sequenced and assembled the complete genome of the ET-743 producer, Candidatus Endoecteinascidia frumentensis, directly from metagenomic DNA isolated from the tunicate. Analysis of the ~631 kb microbial genome revealed strong evidence of an endosymbiotic lifestyle and extreme genome reduction. Phylogenetic analysis suggested that the producer of the anti-cancer drug is taxonomically distinct from other sequenced microorganisms and could represent a new family of Gammaproteobacteria. The complete genome has also greatly expanded our understanding of ET-743 production and revealed new biosynthetic genes dispersed across more than 173 kb of the small genome. The gene cluster’s architecture and its preservation demonstrate that the drug is likely essential to the interactions of the microorganism with its mangrove tunicate host. Taken together, these studies elucidate the lifestyle of a unique and pharmaceutically important microorganism and highlight the wide diversity of bacteria capable of making potent natural products.
Birth and death of genes linked to chromosomal inversion

PubMed Central

Furuta, Yoshikazu; Kawai, Mikihiko; Yahara, Koji; Takahashi, Noriko; Handa, Naofumi; Tsuru, Takeshi; Oshima, Kenshiro; Yoshida, Masaru; Azuma, Takeshi; Hattori, Masahira; Uchiyama, Ikuo; Kobayashi, Ichizo

2011-01-01

The birth and death of genes is central to adaptive evolution, yet the underlying genome dynamics remain elusive. The availability of closely related complete genome sequences helps to follow changes in gene contents and clarify their relationship to overall genome organization. Helicobacter pylori, bacteria in our stomach, are known for their extreme genome plasticity through mutation and recombination and will make a good target for such an analysis. In comparing their complete genome sequences, we found that gain and loss of genes (loci) for outer membrane proteins, which mediate host interaction, occurred at breakpoints of chromosomal inversions. Sequence comparison there revealed a unique mechanism of DNA duplication: DNA duplication associated with inversion. In this process, a DNA segment at one chromosomal locus is copied and inserted, in an inverted orientation, into a distant locus on the same chromosome, while the entire region between these two loci is also inverted. Recognition of this and three more inversion modes, which occur through reciprocal recombination between long or short sequence similarity or adjacent to a mobile element, allowed reconstruction of synteny evolution through inversion events in this species. These results will guide the interpretation of extensive DNA sequencing results for understanding long- and short-term genome evolution in various organisms and in cancer cells. PMID:21212362
A "candidate-interactome" aggregate analysis of genome-wide association data in multiple sclerosis.

PubMed

Mechelli, Rosella; Umeton, Renato; Policano, Claudia; Annibali, Viviana; Coarelli, Giulia; Ricigliano, Vito A G; Vittori, Danila; Fornasiero, Arianna; Buscarinu, Maria Chiara; Romano, Silvia; Salvetti, Marco; Ristori, Giovanni

2013-01-01

Though difficult, the study of gene-environment interactions in multifactorial diseases is crucial for interpreting the relevance of non-heritable factors and helps avoid overlooking genetic associations with small but measurable effects. We propose a "candidate interactome" (i.e. a group of genes whose products are known to physically interact with environmental factors that may be relevant for disease pathogenesis) analysis of genome-wide association data in multiple sclerosis. We looked for statistical enrichment of associations among interactomes that, at the current state of knowledge, may be representative of gene-environment interactions of potential, uncertain or unlikely relevance for multiple sclerosis pathogenesis: Epstein-Barr virus, human immunodeficiency virus, hepatitis B virus, hepatitis C virus, cytomegalovirus, HHV8-Kaposi sarcoma, H1N1-influenza, JC virus, human innate immunity interactome for type I interferon, autoimmune regulator, vitamin D receptor, aryl hydrocarbon receptor and a panel of proteins targeted by 70 innate immune-modulating viral open reading frames from 30 viral species. Interactomes were either obtained from the literature or were manually curated. The P values of all single nucleotide polymorphisms mapping to a given interactome were obtained from the last genome-wide association study of the International Multiple Sclerosis Genetics Consortium & the Wellcome Trust Case Control Consortium, 2. The interaction between genotype and Epstein-Barr virus emerges as relevant for multiple sclerosis etiology. However, in line with recent data on the coexistence of common and unique strategies used by viruses to perturb the human molecular system, other viruses also have a similar potential, though they are probably less relevant in epidemiological terms.
A “Candidate-Interactome” Aggregate Analysis of Genome-Wide Association Data in Multiple Sclerosis

PubMed Central

Policano, Claudia; Annibali, Viviana; Coarelli, Giulia; Ricigliano, Vito A. G.; Vittori, Danila; Fornasiero, Arianna; Buscarinu, Maria Chiara; Romano, Silvia; Salvetti, Marco; Ristori, Giovanni

2013-01-01

Though difficult, the study of gene-environment interactions in multifactorial diseases is crucial for interpreting the relevance of non-heritable factors and helps avoid overlooking genetic associations with small but measurable effects. We propose a “candidate interactome” (i.e. a group of genes whose products are known to physically interact with environmental factors that may be relevant for disease pathogenesis) analysis of genome-wide association data in multiple sclerosis. We looked for statistical enrichment of associations among interactomes that, at the current state of knowledge, may be representative of gene-environment interactions of potential, uncertain or unlikely relevance for multiple sclerosis pathogenesis: Epstein-Barr virus, human immunodeficiency virus, hepatitis B virus, hepatitis C virus, cytomegalovirus, HHV8-Kaposi sarcoma, H1N1-influenza, JC virus, human innate immunity interactome for type I interferon, autoimmune regulator, vitamin D receptor, aryl hydrocarbon receptor and a panel of proteins targeted by 70 innate immune-modulating viral open reading frames from 30 viral species. Interactomes were either obtained from the literature or were manually curated. The P values of all single nucleotide polymorphisms mapping to a given interactome were obtained from the last genome-wide association study of the International Multiple Sclerosis Genetics Consortium & the Wellcome Trust Case Control Consortium, 2. The interaction between genotype and Epstein-Barr virus emerges as relevant for multiple sclerosis etiology. However, in line with recent data on the coexistence of common and unique strategies used by viruses to perturb the human molecular system, other viruses also have a similar potential, though they are probably less relevant in epidemiological terms. PMID:23696811
Genomic Expression Patterns in Menstrually-Related Migraine in Adolescents

PubMed Central

Hershey, Andrew; Horn, Paul; Kabbouche, Marielle; O'Brien, Hope; Powers, Scott

2011-01-01

Background: Exacerbation of migraine with menses is common in adolescent girls and women with migraine, occurring in up to 60% of females with migraine. These migraines are oftentimes longer and more disabling and may be related to estrogen levels and hormonal fluctuations. Objective: This study identifies the unique genomic expression pattern of menstrually-related migraine (MRM) in comparison to migraine occurring outside the menstrual period and headache free controls. Methods: Whole blood samples were obtained from female subjects having an acute migraine during their menstrual period (MRM) or outside of their menstrual period (nonMRM) and controls (C) – females having a menstrual period without any history of headache. The mRNA was isolated from these samples and genomic profile was assessed. Affymetrix Human Exon ST 1.0 arrays were used to examine the genomic expression pattern differences between these three groups. Results: Blood genomic expression patterns were obtained on 56 subjects (MRM = 18, nonMRM = 18 and C = 20). Unique genomic expression patterns were observed for both MRM and nonMRM. For MRM, 77 genes were identified that were unique to MRM, while 61 genes were commonly expressed for MRM and nonMRM and 127 genes appeared to have a unique expression pattern for nonMRM. In addition, there were 279 genes that differentially expressed for MRM compared to nonMRM that were not differentially expressed for nonMRM. Gene ontology of these samples indicated many of these groups of genes were functionally related and included categories of immunomodulation/inflammation, mitochondrial function and DNA homeostasis. Conclusions: Blood genomic patterns can accurately differentiate MRM from nonMRM. These results indicate that MRM involves a unique molecular biology pathway that can be identified with a specific biomarker and suggest that individuals with MRM have a different underlying genetic etiology. PMID:22220971
Toward a rigorous network of protein-protein interactions of the model sulfate reducer Desulfovibrio vulgaris Hildenborough

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chhabra, S.R.; Joachimiak, M.P.; Petzold, C.J.

Protein–protein interactions offer an insight into cellular processes beyond what may be obtained by the quantitative functional genomics tools of proteomics and transcriptomics. The aforementioned tools have been extensively applied to study E. coli and other aerobes and more recently to study the stress response behavior of Desulfovibrio vulgaris Hildenborough, a model anaerobe and sulfate reducer. In this paper we present the first attempt to identify protein-protein interactions in an obligate anaerobic bacterium. We used suicide vector-assisted chromosomal modification of 12 open reading frames encoded by this sulfate reducer to append an eight amino acid affinity tag to the carboxy-terminus of the chosen proteins. Three biological replicates of the 10 ‘pulled-down’ proteins were separated and analyzed using liquid chromatography-mass spectrometry. Replicate agreement ranged between 35% and 69%. An interaction network among 12 bait and 90 prey proteins was reconstructed based on 134 bait-prey interactions computationally identified to be of high confidence. We discuss the biological significance of several unique metabolic features of D. vulgaris revealed by this protein-protein interaction data and protein modifications that were observed. These include the distinct role of the putative carbon monoxide-induced hydrogenase, unique electron transfer routes associated with different oxidoreductases, and the possible role of methylation in regulating sulfate reduction.

Structure, proteome and genome of Sinorhizobium meliloti phage ΦM5: A virus with LUZ24-like morphology and a highly mosaic genome.

PubMed

Johnson, Matthew C; Sena-Velez, Marta; Washburn, Brian K; Platt, Georgia N; Lu, Stephen; Brewer, Tess E; Lynn, Jason S; Stroupe, M Elizabeth; Jones, Kathryn M

2017-12-01

Bacteriophages of nitrogen-fixing rhizobial bacteria are revealing a wealth of novel structures, diverse enzyme combinations and genomic features. Here we report the cryo-EM structure of the phage capsid at 4.9-5.7Å-resolution, the phage particle proteome, and the genome of the Sinorhizobium meliloti-infecting Podovirus ΦM5. This is the first structure of a phage with a capsid and capsid-associated structural proteins related to those of the LUZ24-like viruses that infect Pseudomonas aeruginosa. Like many other Podoviruses, ΦM5 is a T=7 icosahedron with a smooth capsid and short, relatively featureless tail. Nonetheless, this group is phylogenetically quite distinct from Podoviruses of the well-characterized T7, P22, and epsilon 15 supergroups. Structurally, a distinct bridge of density that appears unique to ΦM5 reaches down the body of the coat protein to the extended loop that interacts with the next monomer in a hexamer, perhaps stabilizing the mature capsid. Further, the predicted tail fibers of ΦM5 are quite different from those of enteric bacteria phages, but have domains in common with other rhizophages. Genomically, ΦM5 is highly mosaic. The ΦM5 genome is 44,005bp with 357bp direct terminal repeats (DTRs) and 58 unique ORFs. Surprisingly, the capsid structural module, the tail module, the DNA-packaging terminase, the DNA replication module and the integrase each appear to be from a different lineage. One of the most unusual features of ΦM5 is its terminase whose large subunit is quite different from previously-described short-DTR-generating packaging machines and does not fit into any of the established phylogenetic groups. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane

PubMed Central

Benevenuto, Juliana; Peters, Leila P.; Carvalho, Giselle; Palhares, Alessandra; Quecine, Maria C.; Nunes, Filipe R. S.; Kmit, Maria C. P.; Wai, Alvan; Hausner, Georg; Aitken, Karen S.; Berkman, Paul J.; Fraser, James A.; Moolhuijzen, Paula M.; Coutinho, Luiz L.; Creste, Silvana; Vieira, Maria L. C.; Kitajima, João P.; Monteiro-Vitorello, Claudia B.

2015-01-01

Sporisorium scitamineum is a biotrophic fungus responsible for the sugarcane smut, a worldwide spread disease. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis to previous available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of our estimated number of copies as 130. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNAseq data was also used to reassure gene predictions. PMID:26065709
MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

PubMed Central

Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T.; van Oven, Mannis; Wallace, Douglas C.; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F.; Attimonelli, Marcella; Zuchner, Stephan

2016-01-01

MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and disease. MSeqDR-LSDB is a locus specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar-compliant variant annotations. PhenoTips is used for phenotypic data submission on de-identified patients using human phenotype ontology terminology. Development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources. PMID:26919060
The Laccaria and Tuber Genomes Reveal Unique Signatures of Mycorrhizal Symbiosis Evolution (2010 JGI User Meeting)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Martin, Francis

Francis Martin from the French National Institute for Agricultural Research (INRA) talks on how "The Laccaria and Tuber genomes reveal unique signatures of mycorrhizal symbiosis evolution" on March 24, 2010 at the 5th Annual DOE JGI User Meeting
Genome Sequence of a Canadian Vibrio parahaemolyticus Isolate with Unique Mobilizing Capacity.

PubMed

Bioteau, Audrey; Huguet, Kévin; Burrus, Vincent; Banerjee, Swapan

2018-06-14

Vibrio parahaemolyticus is a clinically significant marine bacterium implicated in gastroenteritis among consumers of raw or undercooked seafood. This report presents the whole-genome sequence of a unique strain of V. parahaemolyticus isolated from oysters harvested in Canada. © Crown copyright 2018.
Human Heat shock protein 40 (Hsp40/DnaJB1) promotes influenza A virus replication by assisting nuclear import of viral ribonucleoproteins.

PubMed

Batra, Jyoti; Tripathi, Shashank; Kumar, Amrita; Katz, Jacqueline M; Cox, Nancy J; Lal, Renu B; Sambhara, Suryaprakash; Lal, Sunil K

2016-01-11

A unique feature of influenza A virus (IAV) life cycle is replication of the viral genome in the host cell nucleus. The nuclear import of IAV genome is an indispensable step in establishing virus infection. IAV nucleoprotein (NP) is known to mediate the nuclear import of viral genome via its nuclear localization signals. Here, we demonstrate that cellular heat shock protein 40 (Hsp40/DnaJB1) facilitates the nuclear import of incoming IAV viral ribonucleoproteins (vRNPs) and is important for efficient IAV replication. Hsp40 was found to interact with NP component of IAV RNPs during early stages of infection. This interaction is mediated by the J domain of Hsp40 and N-terminal region of NP. Drug or RNAi mediated inhibition of Hsp40 resulted in reduced nuclear import of IAV RNPs, diminished viral polymerase function and attenuates overall viral replication. Hsp40 was also found to be required for efficient association between NP and importin alpha, which is crucial for IAV RNP nuclear translocation. These studies demonstrate an important role for cellular chaperone Hsp40/DnaJB1 in influenza A virus life cycle by assisting nuclear trafficking of viral ribonucleoproteins.
Comparative molecular dynamics studies of heterozygous open reading frames of DNA polymerase eta (η) in pathogenic yeast Candida albicans

NASA Astrophysics Data System (ADS)

Satpati, Suresh; Manohar, Kodavati; Acharya, Narottam; Dixit, Anshuman

2017-01-01

Genomic instability in Candida albicans is believed to play a crucial role in fungal pathogenesis. DNA polymerases contribute significantly to stability of any genome. Although Candida Genome database predicts presence of S. cerevisiae DNA polymerase orthologs; functional and structural characterizations of Candida DNA polymerases are still unexplored. DNA polymerase eta (Polη) is unique as it promotes efficient bypass of cyclobutane pyrimidine dimers. Interestingly, C. albicans is heterozygous in carrying two Polη genes and the nucleotide substitutions were found only in the ORFs. As allelic differences often result in functional differences of the encoded proteins, comparative analyses of structural models and molecular dynamic simulations were performed to characterize these orthologs of DNA Polη. Overall structures of both the ORFs remain conserved except subtle differences in the palm and PAD domains. The complementation analysis showed that both the ORFs equally suppressed UV sensitivity of yeast rad30 deletion strain. Our study has predicted two novel molecular interactions, a highly conserved molecular tetrad of salt bridges and a series of π-π interactions spanning from thumb to PAD. This study suggests these ORFs as the homologues of yeast Polη, and due to its heterogeneity in C. albicans they may play a significant role in pathogenicity.
Human Heat shock protein 40 (Hsp40/DnaJB1) promotes influenza A virus replication by assisting nuclear import of viral ribonucleoproteins

PubMed Central

Batra, Jyoti; Tripathi, Shashank; Kumar, Amrita; Katz, Jacqueline M.; Cox, Nancy J.; Lal, Renu B.; Sambhara, Suryaprakash; Lal, Sunil K.

2016-01-01

A unique feature of influenza A virus (IAV) life cycle is replication of the viral genome in the host cell nucleus. The nuclear import of IAV genome is an indispensable step in establishing virus infection. IAV nucleoprotein (NP) is known to mediate the nuclear import of viral genome via its nuclear localization signals. Here, we demonstrate that cellular heat shock protein 40 (Hsp40/DnaJB1) facilitates the nuclear import of incoming IAV viral ribonucleoproteins (vRNPs) and is important for efficient IAV replication. Hsp40 was found to interact with NP component of IAV RNPs during early stages of infection. This interaction is mediated by the J domain of Hsp40 and N-terminal region of NP. Drug or RNAi mediated inhibition of Hsp40 resulted in reduced nuclear import of IAV RNPs, diminished viral polymerase function and attenuates overall viral replication. Hsp40 was also found to be required for efficient association between NP and importin alpha, which is crucial for IAV RNP nuclear translocation. These studies demonstrate an important role for cellular chaperone Hsp40/DnaJB1 in influenza A virus life cycle by assisting nuclear trafficking of viral ribonucleoproteins. PMID:26750153
A predicted protein interactome identifies conserved global networks and disease resistance subnetworks in maize

PubMed Central

Musungu, Bryan; Bhatnagar, Deepak; Brown, Robert L.; Fakhoury, Ahmad M.; Geisler, Matt

2015-01-01

Interactomes are genome-wide roadmaps of protein-protein interactions. They have been produced for humans, yeast, the fruit fly, and Arabidopsis thaliana and have become invaluable tools for generating and testing hypotheses. A predicted interactome for Zea mays (PiZeaM) is presented here as an aid to the research community for this valuable crop species. PiZeaM was built using a proven method of interologs (interacting orthologs) that were identified using both one-to-one and many-to-many orthology between genomes of maize and reference species. Where both maize orthologs occurred for an experimentally determined interaction in the reference species, we predicted a likely interaction in maize. A total of 49,026 unique interactions for 6004 maize proteins were predicted. These interactions are enriched for processes that are evolutionarily conserved, but include many otherwise poorly annotated proteins in maize. The predicted maize interactions were further analyzed by comparing annotation of interacting proteins, including different layers of ontology. A map of pairwise gene co-expression was also generated and compared to predicted interactions. Two global subnetworks were constructed for highly conserved interactions. These subnetworks showed clear clustering of proteins by function. Another subnetwork was created for disease response using a bait and prey strategy to capture interacting partners for proteins that respond to other organisms. Closer examination of this subnetwork revealed the connectivity between biotic and abiotic hormone stress pathways. We believe PiZeaM will provide a useful tool for the prediction of protein function and analysis of pathways for Z. mays researchers and is presented in this paper as a reference tool for the exploration of protein interactions in maize. PMID:26089837
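The interolog rule described in the record above (predict an interaction in the target species when both partners of an experimentally determined reference interaction have orthologs there) can be illustrated with a minimal sketch. This is not the PiZeaM pipeline: the function name, the data layout, and the toy identifiers are all hypothetical, and real orthology tables and confidence filtering are omitted.

```python
from itertools import product


def predict_interologs(reference_interactions, orthologs):
    """Transfer reference-species interactions to a target species.

    reference_interactions: iterable of (protein_a, protein_b) pairs
        observed experimentally in the reference species.
    orthologs: dict mapping a reference protein to the set of its
        target-species orthologs (one-to-one or many-to-many).
    """
    predicted = set()
    for a, b in reference_interactions:
        # An interaction is transferred only when BOTH partners have
        # at least one ortholog; a missing key yields no pairs at all.
        for x, y in product(orthologs.get(a, ()), orthologs.get(b, ())):
            # Sort each pair so that A-B and B-A are counted once.
            predicted.add(tuple(sorted((x, y))))
    return predicted


# Toy, made-up identifiers (assumptions for illustration only).
reference = [("AT1", "AT2"), ("AT2", "AT3"), ("AT4", "AT5")]
orthologs = {
    "AT1": {"Zm1"},
    "AT2": {"Zm2a", "Zm2b"},  # many-to-many orthology
    "AT3": {"Zm3"},
    "AT4": {"Zm4"},           # AT5 has no ortholog, so AT4-AT5 is dropped
}
predicted = predict_interologs(reference, orthologs)
```

Storing pairs as sorted tuples in a set is one simple way to count unique interactions when many-to-many orthology produces the same pair more than once.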
Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

PubMed Central

Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinIzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

2013-01-01

The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP–encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence. PMID:23357949
Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

DOE Office of Scientific and Technical Information (OSTI.GOV)

Condon, Bradford J.; Leng, Yueqiang; Wu, Dongliang

The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP–encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence.
Fundulus as the Premier Teleost Model in Environmental Biology: Opportunities for New Insights Using Genomics

PubMed Central

Burnett, Karen G.; Bain, Lisa J.; Baldwin, William S.; Callard, Gloria V.; Cohen, Sarah; Di Giulio, Richard T.; Evans, David H.; Gómez-Chiarri, Marta; Hahn, Mark E.; Hoover, Cindi A.; Karchner, Sibel I.; Katoh, Fumi; MacLatchy, Deborah L.; Marshall, William S.; Meyer, Joel N.; Nacci, Diane E.; Oleksiak, Marjorie F.; Rees, Bernard B.; Singer, Thomas D.; Stegeman, John J.; Towle, David W.; Van Veld, Peter A.; Vogelbein, Wolfgang K.; Whitehead, Andrew; Winn, Richard N.; Crawford, Douglas L.

2007-01-01

A strong foundation of basic and applied research documents that the estuarine fish Fundulus heteroclitus and related species are unique laboratory and field models for understanding how individuals and populations interact with their environment. In this paper we summarize an extensive body of work examining the adaptive responses of Fundulus species to environmental conditions, and describe how this research has contributed importantly to our understanding of physiology, gene regulation, toxicology, and ecological and evolutionary genetics of teleosts and other vertebrates. These explorations have reached a critical juncture at which advancement is hindered by the lack of genomic resources for these species. We suggest that a more complete genomics toolbox for F. heteroclitus and related species will permit researchers to exploit the power of this model organism to rapidly advance our understanding of fundamental biological and pathological mechanisms among vertebrates, as well as ecological strategies and evolutionary processes common to all living organisms. PMID:18071578
Sequence of events in measles virus replication: role of phosphoprotein-nucleocapsid interactions.

PubMed

Brunel, Joanna; Chopy, Damien; Dosnon, Marion; Bloyet, Louis-Marie; Devaux, Patricia; Urzua, Erica; Cattaneo, Roberto; Longhi, Sonia; Gerlier, Denis

2014-09-01

The genome of nonsegmented negative-strand RNA viruses is tightly embedded within a nucleocapsid made of a nucleoprotein (N) homopolymer. To ensure processive RNA synthesis, the viral polymerase L in complex with its cofactor phosphoprotein (P) binds the nucleocapsid that constitutes the functional template. Measles virus P and N interact through two binding sites. While binding of the P amino terminus with the core of N (NCORE) prevents illegitimate encapsidation of cellular RNA, the interaction between their C-terminal domains, P(XD) and N(TAIL) is required for viral RNA synthesis. To investigate the binding dynamics between the two latter domains, the P(XD) F497 residue that makes multiple hydrophobic intramolecular interactions was mutated. Using a quantitative mammalian protein complementation assay and recombinant viruses, we found that an increase in P(XD)-to-N(TAIL) binding strength is associated with a slower transcript accumulation rate and that abolishing the interaction renders the polymerase nonfunctional. The use of a newly developed system allowing conditional expression of wild-type or mutated P genes, revealed that the loss of the P(XD)-N(TAIL) interaction results in reduced transcription by preformed transcriptases, suggesting reduced engagement on the genomic template. These intracellular data indicate that the viral polymerase entry into and progression along its genomic template relies on a protein-protein interaction that serves as a tightly controlled dynamic anchor. Mononegavirales have a unique machinery to replicate RNA. Processivity of their polymerase is only achieved when the genome template is entirely embedded into a helical homopolymer of nucleoproteins that constitutes the nucleocapsid. The polymerase binds to the nucleocapsid template through the phosphoprotein. 
How the polymerase complex enters and travels along the nucleocapsid template to ensure uninterrupted synthesis of up to ∼6,700-nucleotide messenger RNAs from six to ten consecutive genes is unknown. Using a quantitative protein complementation assay and a biGene-biSilencing system allowing conditional expression of two P gene copies, the role of the P-to-N interaction in polymerase function was further characterized. We report here a dynamic protein anchoring mechanism that differs from all other known polymerases that rely only on a sustained and direct binding to their nucleic acid template. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Yeast for virus research

PubMed Central

Zhao, Richard Yuqi

2017-01-01

Budding yeast (Saccharomyces cerevisiae) and fission yeast (Schizosaccharomyces pombe) are two popular model organisms for virus research. They are natural hosts for viruses as they carry their own indigenous viruses. Both yeasts have been used for studies of plant, animal and human viruses. Many positive sense (+) RNA viruses and some DNA viruses replicate at various levels in yeasts, thus allowing study of those viral activities during the viral life cycle. Yeasts are single cell eukaryotic organisms. Hence, many of the fundamental cellular functions such as cell cycle regulation or programmed cell death are highly conserved from yeasts to higher eukaryotes. Therefore, they are particularly suited to study the impact of those viral activities on related cellular activities during virus-host interactions. Yeasts present many unique advantages in virus research over higher eukaryotes. Yeast cells are easy to maintain in the laboratory with relatively short doubling times. They are non-biohazardous and genetically amenable, with small genomes that permit genome-wide analysis of virologic and cellular functions. In this review, similarities and differences of these two yeasts are described. Studies of virologic activities such as viral translation, viral replication and genome-wide study of virus-cell interactions in yeasts are highlighted. Impacts of viral proteins on basic cellular functions such as cell cycle regulation and programmed cell death are discussed. Potential applications of using yeasts as hosts to carry out functional analysis of small viral genomes and to develop high throughput drug screening platforms for the discovery of antiviral drugs are presented. PMID:29082230
High-resolution analysis of CpG methylation and in vivo protein-DNA interactions at the alternative Epstein-Barr virus latency promoters Qp and Cp in the nasopharyngeal carcinoma cell line C666-1.

PubMed

Bakos, Agnes; Banati, Ferenc; Koroknai, Anita; Takacs, Maria; Salamon, Daniel; Minarovits-Kormuta, Susanna; Schwarzmann, Fritz; Wolf, Hans; Niller, Hans Helmut; Minarovits, Janos

2007-10-01

Transcripts for the Epstein-Barr virus (EBV) encoded nuclear antigens (EBNAs) are initiated at alternative promoters (Wp, Cp, for EBNA 1-6 transcripts and Qp, for EBNA 1 transcripts only) located in the BamHI W, C or Q fragment of the viral genome. To understand the host-cell dependent expression of EBNAs in EBV-associated tumors (lymphomas and carcinomas) and in vitro transformed cell lines, it is necessary to analyse the regulatory mechanisms governing the activity of the alternative promoters of EBNA transcripts. Such studies focused mainly on lymphoid cell lines carrying latent EBV genomes, due to the lack of EBV-associated carcinoma cell lines maintaining latent EBV genomes during cultivation in tissue culture. We took advantage of the unique nasopharyngeal carcinoma cell line, C666-1, harboring EBV genomes, and undertook a detailed analysis of CpG methylation patterns and in vivo protein-DNA interactions at the latency promoters Qp and Cp. We found that the active, unmethylated Qp was marked with strong footprints of cellular transcription factors and the viral protein EBNA 1. In contrast, we could not detect binding of relevant transcription factors to the methylated, silent Cp. We concluded that the epigenetic marks at Qp and Cp in C666-1 cells of epithelial origin resemble those of group I Burkitt's lymphoma cell lines.
Ecological genomics of the newly discovered diazotrophic filamentous cyanobacterium ESFC-1

NASA Astrophysics Data System (ADS)

Everroad, C.; Bebout, B.; Bebout, L. E.; Detweiler, A. M.; Lee, J.; Mayali, X.; Singer, S. W.; Stuart, R.; Weber, P. K.; Woebken, D.; Pett-Ridge, J.

2014-12-01

Cyanobacteria-dominated microbial mats played a key role in the evolution of the early Earth and provide a model for exploring the relationships between ecology, evolution and biogeochemistry. A recently described nonheterocystous filamentous cyanobacterium, strain ESFC-1, has been shown to be a major diazotroph year round in the intertidal microbial mat system at Elkhorn Slough, CA, USA. Based on phylogenetic analyses of the 16S rRNA gene, ESFC-1 appears to belong to a unique, genus-level divergence within the cyanobacteria. Consequently, the draft genome sequence of this strain has been determined. Here we report features of this genome, particularly as they relate to the ecological functions and capabilities of strain ESFC-1. One striking feature of this cyanobacterium is the apparent lack of a functional bi-directional hydrogenase typically expected to be found within a diazotroph; consortia- and culture-based experiments exploring the metabolic processes of ESFC-1 also indicate that this hydrogenase is absent. Co-culture studies with ESFC-1 and some of the dominant heterotrophic members within the microbial mat system, including the ubiquitous Flavobacterium Muricauda sp., which often is found associated with cyanobacteria in nature and in culture collections worldwide, have also been performed. We report on these species-species interactions, including materials exchange between the cyanobacterium and heterotrophic bacterium. The combination of genomics with culture- and consortia-based experimental research is a powerful tool for understanding microbial processes and interactions in complex ecosystems.
Comparative Genomics Reveals the Core Gene Toolbox for the Fungus-Insect Symbiosis.

PubMed

Wang, Yan; Stata, Matt; Wang, Wei; Stajich, Jason E; White, Merlin M; Moncalvo, Jean-Marc

2018-05-15

Modern genomics has shed light on many entomopathogenic fungi and expanded our knowledge widely; however, little is known about the genomic features of the insect-commensal fungi. Harpellales are obligate commensals living in the digestive tracts of disease-bearing insects (black flies, midges, and mosquitoes). In this study, we produced and annotated whole-genome sequences of nine Harpellales taxa and conducted the first comparative analyses to infer the genomic diversity within the members of the Harpellales. The genomes of the insect gut fungi feature low (26% to 37%) GC content and large genome size variations (25 to 102 Mb). Further comparisons with insect-pathogenic fungi (from both Ascomycota and Zoopagomycota), as well as with free-living relatives (as negative controls), helped to identify a gene toolbox that is essential to the fungus-insect symbiosis. The results not only narrow the genomic scope of fungus-insect interactions from several thousands to eight core players but also distinguish host invasion strategies employed by insect pathogens and commensals. The genomic content suggests that insect commensal fungi rely mostly on adhesion protein anchors that target digestive system, while entomopathogenic fungi have higher numbers of transmembrane helices, signal peptides, and pathogen-host interaction (PHI) genes across the whole genome and enrich genes as well as functional domains to inactivate the host inflammation system and suppress the host defense. Phylogenomic analyses have revealed that genome sizes of Harpellales fungi vary among lineages with an integer-multiple pattern, which implies that ancient genome duplications may have occurred within the gut of insects. IMPORTANCE Insect guts harbor various microbes that are important for host digestion, immune response, and disease dispersal in certain cases. Bacteria, which are among the primary endosymbionts, have been studied extensively. 
However, fungi, which are also frequently encountered, are poorly known with respect to their biology within the insect guts. To understand the genomic features and related biology, we produced the whole-genome sequences of nine gut commensal fungi from disease-bearing insects (black flies, midges, and mosquitoes). The results show that insect gut fungi tend to have low GC content across their genomes. By comparing these commensals with entomopathogenic and free-living fungi that have available genome sequences, we found a universal core gene toolbox that is unique and thus potentially important for the insect-fungus symbiosis. This comparative work also uncovered different host invasion strategies employed by insect pathogens and commensals, as well as a model system to study ancient fungal genome duplication within the gut of insects. © Crown copyright 2018.
Genomic profiling of neutrophil transcripts in Asian Qigong practitioners: a pilot study in gene regulation by mind-body interaction.

PubMed

Li, Quan-Zhen; Li, Ping; Garcia, Gabriela E; Johnson, Richard J; Feng, Lili

2005-02-01

The great similarity of the genomes of humans and other species stimulated us to search for genes regulated by elements associated with human uniqueness, such as the mind-body interaction. DNA microarray technology offers the advantage of analyzing thousands of genes simultaneously, with the potential to determine healthy phenotypic changes in gene expression. The aim of this study was to determine the genomic profile and function of neutrophils in Falun Gong (FLG, an ancient Chinese Qigong) practitioners, with healthy subjects as controls. Six (6) Asian FLG practitioners and 6 Asian normal healthy controls were recruited for our study. The practitioners have practiced FLG for at least 1 year (range, 1-5 years). The practice includes daily reading of FLG books and daily practice of exercises lasting 1-2 hours. Selected normal healthy controls did not perform Qigong, yoga, t'ai chi, or any other type of mind-body practice, and had not followed any conventional physical exercise program for at least 1 year. Neutrophils were isolated from fresh blood and assayed for gene expression, using microarrays and RNase protection assay (RPA), as well as for function (phagocytosis) and survival (apoptosis). The changes in gene expression of FLG practitioners in contrast to normal healthy controls were characterized by enhanced immunity, downregulation of cellular metabolism, and alteration of apoptotic genes in favor of a rapid resolution of inflammation. The lifespan of normal neutrophils was prolonged, while the inflammatory neutrophils displayed accelerated cell death in FLG practitioners as determined by enzyme-linked immunosorbent assay. Correlating with enhanced immunity reflected by microarray data, neutrophil phagocytosis was significantly increased in Qigong practitioners. Some of the altered genes observed by microarray were confirmed by RPA. Qigong practice may regulate immunity, metabolic rate, and cell death, possibly at the transcriptional level. 
Our pilot study provides the first evidence that Qigong practice may exert transcriptional regulation at a genomic level. New approaches are needed to study how genes are regulated by elements associated with human uniqueness, such as consciousness, cognition, and spirituality.
Novel Insights into Tree Biology and Genome Evolution as Revealed Through Genomics.

PubMed

Neale, David B; Martínez-García, Pedro J; De La Torre, Amanda R; Montanari, Sara; Wei, Xiao-Xin

2017-04-28

Reference genome sequences are the key to the discovery of genes and gene families that determine traits of interest. Recent progress in sequencing technologies has enabled a rapid increase in genome sequencing of tree species, allowing the dissection of complex characters of economic importance, such as fruit and wood quality and resistance to biotic and abiotic stresses. Although the number of reference genome sequences for trees lags behind those for other plant species, it is not too early to gain insight into the unique features that distinguish trees from nontree plants. Our review of the published data suggests that, although many gene families are conserved among herbaceous and tree species, some gene families, such as those involved in resistance to biotic and abiotic stresses and in the synthesis and transport of sugars, are often expanded in tree genomes. As the genomes of more tree species are sequenced, comparative genomics will further elucidate the complexity of tree genomes and how this relates to traits unique to trees.
Mosaic Graphs and Comparative Genomics in Phage Communities

PubMed Central

Belcaid, Mahdi; Bergeron, Anne

2010-01-01

Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breadth and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications for phage metagenomics assembly, for the horizontal gene transfer paradigm, and more generally for the understanding of the composition and evolutionary dynamics of virus communities. PMID:20874413

Genetics of the Framingham Heart Study Population

PubMed Central

Govindaraju, Diddahally R.; Cupples, L. Adrienne; Kannel, William B.; O’Donnell, Christopher J.; Atwood, Larry D.; D’Agostino, Ralph B.; Fox, Caroline S.; Larson, Marty; Levy, Daniel; Morabito, Joanne; Vasan, Ramachandran S.; Splansky, Greta Lee; Wolf, Philip A.; Benjamin, Emelia J.

2010-01-01

This article provides an introduction to the Framingham Heart Study (FHS) and the genetic research related to cardiovascular diseases conducted in this unique population. It briefly describes the origins of the study, the risk factors that contribute to heart disease and the approaches taken to discover the genetic basis of some of these risk factors. The genetic architecture of several biological risk factors has been explained using family studies, segregation analysis, heritability, phenotypic and genetic correlations. Many quantitative trait loci underlying cardiovascular diseases have been discovered using different molecular markers. Additionally, results from genome-wide association studies using 100,000 markers, and the prospects of using 550,000 markers for association studies are presented. Finally, the use of this unique sample in genotype and environment interaction is described. PMID:19010253
GENOMIC ORGANIZATION OF THE SP22 GENE AND A UNIQUE PATTERN OF EXPRESSION IN SPERMATOGENIC CELLS

EPA Science Inventory

JE Welch*, RR Barbee*, JD Suarez*, NL Roberts*, and GR Klinefelter. Reproductive Toxicology Division, NHEERL, U.S. EPA, Research Triangle Park, NC, USA.
Our laboratory has rep...
Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits.

PubMed

Larsson, John; Nylander, Johan Aa; Bergman, Birgitta

2011-06-30

Cyanobacteria belong to an ancient group of photosynthetic prokaryotes with pronounced variations in their cellular differentiation strategies, physiological capacities and choice of habitat. Sequencing efforts have shown that genomes within this phylum are equally diverse in terms of size and protein-coding capacity. To increase our understanding of genomic changes in the lineage, the genomes of 58 contemporary cyanobacteria were analysed for shared and unique orthologs. A total of 404 protein families, present in all cyanobacterial genomes, were identified. Two of these are unique to the phylum, corresponding to an AbrB family transcriptional regulator and a gene that escapes functional annotation although its genomic neighbourhood is conserved among the organisms examined. The evolution of cyanobacterial genome sizes involves a mix of gains and losses in the clade encompassing complex cyanobacteria, while a single event of reduction is evident in a clade dominated by unicellular cyanobacteria. Genome sizes and gene family copy numbers evolve at a higher rate in the former clade, and multi-copy genes were predominant in large genomes. Orthologs unique to cyanobacteria exhibiting specific characteristics, such as filament formation, heterocyst differentiation, diazotrophy and symbiotic competence, were also identified. An ancestral character reconstruction suggests that the most recent common ancestor of cyanobacteria had a genome size of approx. 4.5 Mbp and 1678 to 3291 protein-coding genes, 4%-6% of which are unique to cyanobacteria today. The different rates of genome-size evolution and multi-copy gene abundance suggest two routes of genome development in the history of cyanobacteria. The expansion strategy is driven by gene-family enlargement and generates a broad adaptive potential, while the genome streamlining strategy imposes adaptations to highly specific niches, also reflected in their different functional capacities. 
A few genomes display extreme proliferation of non-coding nucleotides which is likely to be the result of initial expansion of genomes/gene copy number to gain adaptive potential, followed by a shift to a life-style in a highly specific niche (e.g. symbiosis). This transition results in redundancy of genes and gene families, leading to an increase in junk DNA and eventually to gene loss. A few orthologs can be correlated with specific phenotypes in cyanobacteria, such as filament formation and symbiotic competence; these constitute exciting exploratory targets.
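
The distinction the abstract above draws between shared and unique orthologs can be illustrated with plain set operations. The following is a minimal sketch only: it assumes ortholog clustering has already been done, and the genome names, family identifiers and function names are hypothetical placeholders, not data or methods from the study.

```python
def core_families(genomes):
    """Return the families present in every genome (set intersection)."""
    family_sets = list(genomes.values())
    return set.intersection(*family_sets) if family_sets else set()


def unique_families(genomes, name):
    """Return the families found in genome `name` and in no other genome."""
    others = set().union(
        *(fams for g, fams in genomes.items() if g != name)
    )
    return genomes[name] - others


# Hypothetical toy input: genome name -> set of protein-family identifiers.
genomes = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam2", "fam5", "fam6"},
}

core = core_families(genomes)                    # families shared by all three
unique_c = unique_families(genomes, "genome_C")  # families only in genome_C
```

With real data, the size of `core` would play the role of the 404 protein families reported as present in all genomes examined.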
Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits

PubMed Central

2011-01-01

Background: Cyanobacteria belong to an ancient group of photosynthetic prokaryotes with pronounced variations in their cellular differentiation strategies, physiological capacities and choice of habitat. Sequencing efforts have shown that genomes within this phylum are equally diverse in terms of size and protein-coding capacity. To increase our understanding of genomic changes in the lineage, the genomes of 58 contemporary cyanobacteria were analysed for shared and unique orthologs. Results: A total of 404 protein families, present in all cyanobacterial genomes, were identified. Two of these are unique to the phylum, corresponding to an AbrB family transcriptional regulator and a gene that escapes functional annotation although its genomic neighbourhood is conserved among the organisms examined. The evolution of cyanobacterial genome sizes involves a mix of gains and losses in the clade encompassing complex cyanobacteria, while a single event of reduction is evident in a clade dominated by unicellular cyanobacteria. Genome sizes and gene family copy numbers evolve at a higher rate in the former clade, and multi-copy genes were predominant in large genomes. Orthologs unique to cyanobacteria exhibiting specific characteristics, such as filament formation, heterocyst differentiation, diazotrophy and symbiotic competence, were also identified. An ancestral character reconstruction suggests that the most recent common ancestor of cyanobacteria had a genome size of approx. 4.5 Mbp and 1678 to 3291 protein-coding genes, 4%-6% of which are unique to cyanobacteria today. Conclusions: The different rates of genome-size evolution and multi-copy gene abundance suggest two routes of genome development in the history of cyanobacteria. The expansion strategy is driven by gene-family enlargement and generates a broad adaptive potential, while the genome streamlining strategy imposes adaptations to highly specific niches, also reflected in their different functional capacities. 
A few genomes display extreme proliferation of non-coding nucleotides which is likely to be the result of initial expansion of genomes/gene copy number to gain adaptive potential, followed by a shift to a life-style in a highly specific niche (e.g. symbiosis). This transition results in redundancy of genes and gene families, leading to an increase in junk DNA and eventually to gene loss. A few orthologs can be correlated with specific phenotypes in cyanobacteria, such as filament formation and symbiotic competence; these constitute exciting exploratory targets. PMID:21718514
Mitochondrial DNA as a non-invasive biomarker: Accurate quantification using real time quantitative PCR without co-amplification of pseudogenes and dilution bias

DOE Office of Scientific and Technical Information (OSTI.GOV)

Malik, Afshan N.; Shahni, Rojeen; Rodriguez-de-Ledesma, Ana

2011-08-19

Highlights: Mitochondrial dysfunction is central to many diseases of oxidative stress. 95% of the mitochondrial genome is duplicated in the nuclear genome. Dilution of untreated genomic DNA leads to dilution bias. Unique primers and template pretreatment are needed to accurately measure mitochondrial DNA content. Abstract: Circulating mitochondrial DNA (MtDNA) is a potential non-invasive biomarker of cellular mitochondrial dysfunction, the latter known to be central to a wide range of human diseases. Changes in MtDNA are usually determined by quantification of MtDNA relative to nuclear DNA (Mt/N) using real time quantitative PCR. We propose that the methodology for measuring Mt/N needs to be improved and we have identified that current methods have at least one of the following three problems: (1) as much of the mitochondrial genome is duplicated in the nuclear genome, many commonly used MtDNA primers co-amplify homologous pseudogenes found in the nuclear genome; (2) use of regions from genes such as β-actin and 18S rRNA, which are repetitive and/or highly variable, for qPCR of the nuclear genome leads to errors; and (3) the size difference of mitochondrial and nuclear genomes causes a 'dilution bias' when template DNA is diluted. We describe a PCR-based method using unique regions in the human mitochondrial genome not duplicated in the nuclear genome, a unique single-copy region in the nuclear genome, and template treatment to remove dilution bias, to accurately quantify MtDNA from human samples.
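
As an illustration of how an Mt/N value can be derived from real-time qPCR readings, the sketch below uses the widely used delta-Ct convention. This is an assumption made for illustration, not necessarily the calculation used by the authors: it presumes both amplicons amplify with the same efficiency, and the function name and Ct values are hypothetical.

```python
def mt_to_nuclear_ratio(ct_mito, ct_nuclear, efficiency=2.0):
    """Relative MtDNA-to-nuclear-DNA ratio from threshold cycles.

    A target that crosses the threshold earlier (lower Ct) was more
    abundant at the start, so the ratio is efficiency ** (Ct_N - Ct_Mt).
    efficiency=2.0 assumes perfect doubling in every cycle.
    """
    return efficiency ** (ct_nuclear - ct_mito)


# Hypothetical readings: the mitochondrial amplicon appears 8 cycles earlier.
ratio = mt_to_nuclear_ratio(ct_mito=18.0, ct_nuclear=26.0)
```

A ratio computed this way is only as good as its inputs: primers that co-amplify nuclear pseudogenes, or a template affected by dilution bias, would shift the Ct values and hence the ratio, which is the concern the abstract raises.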
Unique physiology of host-parasite interactions in microsporidia infections.

PubMed

Williams, Bryony A P

2009-11-01

Microsporidia are intracellular parasites of all major animal lineages and have a described diversity of over 1200 species and an actual diversity that is estimated to be much higher. They are important pathogens of mammals, and are now one of the most common infections among immunocompromised humans. Although related to fungi, microsporidia are atypical in genomic biology, cell structure and infection mechanism. Host cell infection involves the rapid expulsion of a polar tube from a dormant spore to pierce the host cell membrane and allow the direct transfer of the spore contents into the host cell cytoplasm. This intimate relationship between parasite and host is unique. It allows the microsporidia to be highly exploitative of the host cell environment and cause such diverse effects as the induction of hypertrophied cells to harbour prolific spore development, host sex ratio distortion and host cell organelle and microtubule reorganization. Genome sequencing has revealed that microsporidia have achieved this high level of parasite sophistication with radically reduced proteomes and with many typical eukaryotic pathways pared-down to what appear to be minimal functional units. These traits make microsporidia intriguing model systems for understanding the extremes of reductive parasite evolution and host cell manipulation.
The genome of Eucalyptus grandis.

PubMed

Myburg, Alexander A; Grattapaglia, Dario; Tuskan, Gerald A; Hellsten, Uffe; Hayes, Richard D; Grimwood, Jane; Jenkins, Jerry; Lindquist, Erika; Tice, Hope; Bauer, Diane; Goodstein, David M; Dubchak, Inna; Poliakov, Alexandre; Mizrachi, Eshchar; Kullan, Anand R K; Hussey, Steven G; Pinard, Desre; van der Merwe, Karen; Singh, Pooja; van Jaarsveld, Ida; Silva-Junior, Orzenil B; Togawa, Roberto C; Pappas, Marilia R; Faria, Danielle A; Sansaloni, Carolina P; Petroli, Cesar D; Yang, Xiaohan; Ranjan, Priya; Tschaplinski, Timothy J; Ye, Chu-Yu; Li, Ting; Sterck, Lieven; Vanneste, Kevin; Murat, Florent; Soler, Marçal; Clemente, Hélène San; Saidi, Naijib; Cassan-Wang, Hua; Dunand, Christophe; Hefer, Charles A; Bornberg-Bauer, Erich; Kersting, Anna R; Vining, Kelly; Amarasinghe, Vindhya; Ranik, Martin; Naithani, Sushma; Elser, Justin; Boyd, Alexander E; Liston, Aaron; Spatafora, Joseph W; Dharmwardhana, Palitha; Raja, Rajani; Sullivan, Christopher; Romanel, Elisson; Alves-Ferreira, Marcio; Külheim, Carsten; Foley, William; Carocha, Victor; Paiva, Jorge; Kudrna, David; Brommonschenkel, Sergio H; Pasquali, Giancarlo; Byrne, Margaret; Rigault, Philippe; Tibbits, Josquin; Spokevicius, Antanas; Jones, Rebecca C; Steane, Dorothy A; Vaillancourt, René E; Potts, Brad M; Joubert, Fourie; Barry, Kerrie; Pappas, Georgios J; Strauss, Steven H; Jaiswal, Pankaj; Grima-Pettenati, Jacqueline; Salse, Jérôme; Van de Peer, Yves; Rokhsar, Daniel S; Schmutz, Jeremy

2014-06-19

Eucalypts are the world's most widely planted hardwood trees. Their outstanding diversity, adaptability and growth have made them a global renewable resource of fibre and energy. We sequenced and assembled >94% of the 640-megabase genome of Eucalyptus grandis. Of 36,376 predicted protein-coding genes, 34% occur in tandem duplications, the largest proportion thus far in plant genomes. Eucalyptus also shows the highest diversity of genes for specialized metabolites such as terpenes that act as chemical defence and provide unique pharmaceutical oils. Genome sequencing of the E. grandis sister species E. globulus and a set of inbred E. grandis tree genomes reveals dynamic genome evolution and hotspots of inbreeding depression. The E. grandis genome is the first reference for the eudicot order Myrtales and is placed here sister to the eurosids. This resource expands our understanding of the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.
Massive gene acquisitions in Mycobacterium indicus pranii provide a perspective on mycobacterial evolution

PubMed Central

Saini, Vikram; Raghuvanshi, Saurabh; Khurana, Jitendra P.; Ahmed, Niyaz; Hasnain, Seyed E.; Tyagi, Akhilesh K.; Tyagi, Anil K.

2012-01-01

Understanding the evolutionary and genomic mechanisms responsible for turning the soil-derived saprophytic mycobacteria into lethal intracellular pathogens is a critical step towards the development of strategies for the control of mycobacterial diseases. In this context, Mycobacterium indicus pranii (MIP) is of specific interest because of its unique immunological and evolutionary significance. Evolutionarily, it is the progenitor of opportunistic pathogens belonging to M. avium complex and is endowed with features that place it between saprophytic and pathogenic species. Herein, we have sequenced the complete MIP genome to understand its unique life style, basis of immunomodulation and habitat diversification in mycobacteria. As a case of massive gene acquisitions, 50.5% of MIP open reading frames (ORFs) are laterally acquired. We show, for the first time for Mycobacterium, that MIP genome has mosaic architecture. These gene acquisitions have led to the enrichment of selected gene families critical to MIP physiology. Comparative genomic analysis indicates a higher antigenic potential of MIP imparting it a unique ability for immunomodulation. Besides, it also suggests an important role of genomic fluidity in habitat diversification within mycobacteria and provides a unique view of evolutionary divergence and putative bottlenecks that might have eventually led to intracellular survival and pathogenic attributes in mycobacteria. PMID:22965120
Chromosome-wise dissection of the genome of the extremely big mouse line DU6i.

PubMed

Bevova, Marianna R; Aulchenko, Yurii S; Aksu, Soner; Renne, Ulla; Brockmann, Gudrun A

2006-01-01

The extreme high-body-weight-selected mouse line DU6i is a polygenic model for growth research, harboring many small-effect QTL. We dissected the genome of this line into 19 autosomes and the Y chromosome by the construction of a new panel of chromosome substitution strains (CSS). The DU6i chromosomes were transferred to a DBA/2 mouse genetic background by marker-assisted recurrent backcrossing. Mitochondria and the X chromosome were of DBA/2 origin in the backcross. During the construction of these novel strains, >4000 animals were generated, phenotyped, and genotyped. Using these data, we studied the genetic control of variation in body weight and weight gain at 21, 42, and 63 days. The unique data set facilitated the analysis of chromosomal interaction with sex and parent-of-origin effects. All analyzed chromosomes affected body weight and weight gain either directly or in interaction with sex or parent of origin. The effects were age specific, with some chromosomes showing opposite effects at different stages of development.
FARE-CAFE: a database of functional and regulatory elements of cancer-associated fusion events.

PubMed

Korla, Praveen Kumar; Cheng, Jack; Huang, Chien-Hung; Tsai, Jeffrey J P; Liu, Yu-Hsuan; Kurubanjerdjit, Nilubon; Hsieh, Wen-Tsong; Chen, Huey-Yi; Ng, Ka-Lok

2015-01-01

Chromosomal translocation (CT) is of enormous clinical interest because this disorder is associated with various major solid tumors and leukemia. A tumor-specific fusion gene event may occur when a translocation joins two separate genes. Currently, various CT databases provide information about fusion genes and their genomic elements. However, no database of the roles of fusion genes, in terms of essential functional and regulatory elements in oncogenesis, is available. FARE-CAFE is a unique combination of CTs, fusion proteins, protein domains, domain-domain interactions, protein-protein interactions, transcription factors and microRNAs, with subsequent experimental information, which cannot be found in any other CT database. Genomic DNA information including, for example, manually collected exact locations of the first and second break points, sequences and karyotypes of fusion genes is included. FARE-CAFE will substantially facilitate the cancer biologist's mission of elucidating the pathogenesis of various types of cancer. This database will ultimately help to develop 'novel' therapeutic approaches. Database URL: http://ppi.bioinfo.asia.edu.tw/FARE-CAFE. © The Author(s) 2015. Published by Oxford University Press.
FARE-CAFE: a database of functional and regulatory elements of cancer-associated fusion events

PubMed Central

Korla, Praveen Kumar; Cheng, Jack; Huang, Chien-Hung; Tsai, Jeffrey J. P.; Liu, Yu-Hsuan; Kurubanjerdjit, Nilubon; Hsieh, Wen-Tsong; Chen, Huey-Yi; Ng, Ka-Lok

2015-01-01

Chromosomal translocation (CT) is of enormous clinical interest because this disorder is associated with various major solid tumors and leukemia. A tumor-specific fusion gene event may occur when a translocation joins two separate genes. Currently, various CT databases provide information about fusion genes and their genomic elements. However, no database of the roles of fusion genes, in terms of essential functional and regulatory elements in oncogenesis, is available. FARE-CAFE is a unique combination of CTs, fusion proteins, protein domains, domain–domain interactions, protein–protein interactions, transcription factors and microRNAs, with subsequent experimental information, which cannot be found in any other CT database. Genomic DNA information including, for example, manually collected exact locations of the first and second break points, sequences and karyotypes of fusion genes is included. FARE-CAFE will substantially facilitate the cancer biologist’s mission of elucidating the pathogenesis of various types of cancer. This database will ultimately help to develop ‘novel’ therapeutic approaches. Database URL: http://ppi.bioinfo.asia.edu.tw/FARE-CAFE PMID:26384373
Comparative genomic analysis of Helicobacter pylori from Malaysia identifies three distinct lineages suggestive of differential evolution

PubMed Central

Kumar, Narender; Mariappan, Vanitha; Baddam, Ramani; Lankapalli, Aditya K.; Shaik, Sabiha; Goh, Khean-Lee; Loke, Mun Fai; Perkins, Tim; Benghezal, Mohammed; Hasnain, Seyed E.; Vadivelu, Jamuna; Marshall, Barry J.; Ahmed, Niyaz

2015-01-01

The discordant prevalence of Helicobacter pylori and its related diseases, for a long time, fostered certain enigmatic situations observed in the countries of the southern world. Variation in H. pylori infection rates and disease outcomes among different populations in multi-ethnic Malaysia provides a unique opportunity to understand dynamics of host–pathogen interaction and genome evolution. In this study, we extensively analyzed and compared genomes of 27 Malaysian H. pylori isolates and identified three major phylogeographic lineages: hspEastAsia, hpEurope and hpSouthIndia. The analysis of the virulence genes within the core genome, however, revealed a comparable pathogenic potential of the strains. In addition, we identified four genes limited to strains of East-Asian lineage. Our analyses identified a few strain-specific genes encoding restriction modification systems and outlined 311 core genes possibly under differential evolutionary constraints, among the strains representing different ethnic groups. The cagA and vacA genes also showed variations in accordance with the host genetic background of the strains. Moreover, restriction modification genes were found to be significantly enriched in East-Asian strains. An understanding of these variations in the genome content would provide significant insights into various adaptive and host modulation strategies harnessed by H. pylori to effectively persist in a host-specific manner. PMID:25452339
Optimizing Restriction Site Placement for Synthetic Genomes

NASA Astrophysics Data System (ADS)

Montes, Pablo; Memelli, Heraldo; Ward, Charles; Kim, Joondong; Mitchell, Joseph S. B.; Skiena, Steven

Restriction enzymes are the workhorses of molecular biology. We introduce a new problem that arises in the course of our project to design virus variants to serve as potential vaccines: we wish to modify virus-length genomes to introduce large numbers of unique restriction enzyme recognition sites while preserving wild-type function by substitution of synonymous codons. We show that the resulting problem is NP-Complete, give an exponential-time algorithm, and propose effective heuristics, which we show give excellent results for five sample viral genomes. Our resulting modified genomes have several times more unique restriction sites and reduce the maximum gap between adjacent sites by three to nine-fold.
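The two quantities reported in the abstract above, the number of unique restriction sites and the maximum gap between adjacent sites, can be illustrated with a small sketch. This is not the authors' algorithm (which additionally searches over synonymous codon substitutions); it only scores a given sequence. The function names and the toy sequence are assumptions made for illustration, and only the forward strand is scanned, which is sufficient for the palindromic sites used here.

```python
def site_positions(genome: str, site: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) occurrence of site."""
    positions = []
    start = genome.find(site)
    while start != -1:
        positions.append(start)
        start = genome.find(site, start + 1)
    return positions


def unique_sites(genome: str, enzymes: dict[str, str]) -> dict[str, int]:
    """Map enzyme name -> position, keeping only enzymes whose site occurs exactly once."""
    result = {}
    for name, site in enzymes.items():
        hits = site_positions(genome, site)
        if len(hits) == 1:
            result[name] = hits[0]
    return result


def max_gap(genome_length: int, positions: list[int]) -> int:
    """Largest distance between adjacent site positions, counting both ends of a linear genome."""
    points = [0] + sorted(positions) + [genome_length]
    return max(b - a for a, b in zip(points, points[1:]))


# Recognition sequences of three common enzymes (all palindromic).
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Toy sequence, invented for illustration: one EcoRI site, two BamHI sites, no HindIII site.
toy = "ATGGAATTCAAAGGATCCTTTGGATCCTAA"
found = unique_sites(toy, ENZYMES)
print(found, max_gap(len(toy), list(found.values())))
```

A design search along the lines described in the abstract would call a scoring routine like this repeatedly while trying synonymous codon swaps, keeping changes that add unique sites or shrink the largest gap.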
Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

PubMed

Hur, Junguk; Özgür, Arzucan; He, Yongqun

2017-03-14

Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. For more rational development of effective and safe E. coli vaccines, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in the vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this big network, a sub-network consisting of 5 E. coli vaccine genes, including carA, carB, fimH, fepA, and vat, and 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect in that the two genes participated in the specified interaction process in a required but indirect process. Our centrality analysis of these gene interaction networks identified top ranked E. coli genes and 6 INO interaction types (e.g., regulation and gene expression). A vaccine-related E. coli gene-gene interaction network was constructed using an ontology-based literature mining strategy, which identified important E. coli vaccine genes and their interactions with other genes through specific interaction types.
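Two of the four centrality metrics named in the abstract, degree and closeness, are simple enough to sketch with the standard library. The example below is a minimal illustration, assuming an undirected, connected graph with at least two nodes; the node names are placeholders and do not represent interactions reported by the study. Eigenvector and betweenness centrality are omitted for brevity.

```python
from collections import deque


def build_graph(pairs):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable-node count divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Placeholder star-shaped network: one hub linked to three other nodes.
toy_graph = build_graph([("hub", "g1"), ("hub", "g2"), ("hub", "g3")])
print(degree_centrality(toy_graph))
print(closeness_centrality(toy_graph))
```

In the star example the hub scores 1.0 on both metrics, while each leaf has degree centrality 1/3 and closeness 3/5, which is the kind of contrast used to rank highly connected genes.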
Molecular Innovation in Ciliates with Complex Genome Rearrangements

NASA Astrophysics Data System (ADS)

Neme, R.; Landweber, L. F.

2017-07-01

We study molecular innovation in several ciliate species with unique massive genome rearrangements to understand how a radically distinct genome architecture can shape the process of acquiring new functions, genes and structures.
Expression of virus-encoded proteinases: functional and structural similarities with cellular enzymes.

PubMed Central

Dougherty, W G; Semler, B L

1993-01-01

Many viruses express their genome, or part of their genome, initially as a polyprotein precursor that undergoes proteolytic processing. Molecular genetic analyses of viral gene expression have revealed that many of these processing events are mediated by virus-encoded proteinases. Biochemical activity studies and structural analyses of these viral enzymes reveal that they have remarkable similarities to cellular proteinases. However, the viral proteinases have evolved unique features that permit them to function in a cellular environment. In this article, the current status of plant and animal virus proteinases is described along with their role in the viral replication cycle. The reactions catalyzed by viral proteinases are not simple enzyme-substrate interactions; rather, the processing steps are highly regulated, are coordinated with other viral processes, and frequently involve the participation of other factors. Images PMID:8302216
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kishigami, Satoshi; Kinki University, 930 Nishimitani, Kinokawa 599-5993; Wakayama, Sayaka

In mammals, a diploid genome of an individual following fertilization of an egg and a spermatozoon is unique and irreproducible. This implies that the generated unique diploid genome is doomed with the individual ending. Even as cultured cells from the individual, they cannot normally proliferate in perpetuity because of the 'Hayflick limit'. However, Dolly, the sheep cloned from an adult mammary gland cell, changes this scenario. Somatic cell nuclear transfer (SCNT) enables us to produce offspring without germ cells, that is, to 'passage' a unique diploid genome. Animal cloning has also proven to be a powerful research tool for reprogramming in many mammals, notably mouse and cow. The mechanism underlying reprogramming, however, remains largely unknown and, animal cloning has been inefficient as a result. More momentously, in addition to abortion and fetal mortality, some cloned animals display possible premature aging phenotypes including early death and short telomere lengths. Under these inauspicious conditions, is it really possible for SCNT to preserve a diploid genome? Delightfully, in mouse and recently in primate, using SCNT we can produce nuclear transfer ES cells (ntES) more efficiently, which can preserve the eternal lifespan for the 'passage' of a unique diploid genome. Further, new somatic cloning technique using histone-deacetylase inhibitors has been developed which can significantly increase the previous cloning rates two to six times. Here, we introduce SCNT and its value as a preservation tool for a diploid genome while reviewing aging of cloned animals on cellular and individual levels.
Complete genome sequence of Brachyspira intermedia reveals unique genomic features in Brachyspira species and phage-mediated horizontal gene transfer

PubMed Central

2011-01-01

Background: Brachyspira spp. colonize the intestines of some mammalian and avian species and show different degrees of enteropathogenicity. Brachyspira intermedia can cause production losses in chickens and strain PWS/AT now becomes the fourth genome to be completed in the genus Brachyspira. Results: 15 classes of unique and shared genes were analyzed in B. intermedia, B. murdochii, B. hyodysenteriae and B. pilosicoli. The largest number of unique genes was found in B. intermedia and B. murdochii. This indicates the presence of larger pan-genomes. In general, hypothetical protein annotations are overrepresented among the unique genes. A 3.2 kb plasmid was found in B. intermedia strain PWS/AT. The plasmid was also present in the B. murdochii strain but not in nine other Brachyspira isolates. Within the Brachyspira genomes, genes had been translocated and also frequently switched between leading and lagging strands, a process that can be followed by different AT-skews in the third positions of synonymous codons. We also found evidence that bacteriophages were being remodeled and genes incorporated into them. Conclusions: The accessory gene pool shapes species-specific traits. It is also influenced by reductive genome evolution and horizontal gene transfer. Gene-transfer events can cross both species and genus boundaries and bacteriophages appear to play an important role in this process. A mechanism for horizontal gene transfer appears to be gene translocations leading to remodeling of bacteriophages in combination with broad tropism. PMID:21816042
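The AT-skew mentioned in the abstract above is conventionally computed as (A - T) / (A + T). The sketch below applies that formula to the third position of every codon in a coding sequence. It is a simplification made for illustration: the abstract refers specifically to synonymous codons, whereas this version uses all third positions, and the function name and toy sequence are assumptions.

```python
def third_position_at_skew(cds: str) -> float:
    """AT-skew, (A - T) / (A + T), over the third base of each codon in an in-frame CDS."""
    thirds = cds.upper()[2::3]  # bases at codon positions 3, 6, 9, ...
    a_count = thirds.count("A")
    t_count = thirds.count("T")
    if a_count + t_count == 0:
        return 0.0  # no A or T at third positions; skew is undefined, report 0
    return (a_count - t_count) / (a_count + t_count)


# Toy in-frame sequence (four alanine codons): GCA GCA GCT GCG
print(third_position_at_skew("GCAGCAGCTGCG"))
```

Comparing this value between genes on the leading and lagging strands is one way a strand switch could show up, since the sign of the skew is expected to differ between strands.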
Endogenous retroviruses of sheep: a model system for understanding physiological adaptation to an evolving ruminant genome.

PubMed

Spencer, Thomas E; Palmarini, Massimo

2012-01-01

Endogenous retroviruses (ERVs) are present in the genome of all vertebrates and are remnants of ancient exogenous retroviral infections of the host germline transmitted vertically from generation to generation. Sheep betaretroviruses offer a unique model system to study the complex interaction between retroviruses and their host. The sheep genome contains 27 endogenous betaretroviruses (enJSRVs) related to the exogenous and pathogenic Jaagsiekte sheep retrovirus (JSRV), the causative agent of a transmissible lung cancer in sheep. The enJSRVs can protect their host against JSRV infection by blocking early and late steps of the JSRV replication cycle. In the female reproductive tract, enJSRVs are specifically expressed in the uterine luminal and glandular epithelia as well as in the conceptus (embryo and associated extraembryonic membranes) trophectoderm, and in utero loss-of-function experiments found the enJSRVs envelope (env) to be essential for conceptus elongation and trophectoderm growth and development. Collectively, available evidence in sheep and other mammals indicates that ERVs coevolved with their hosts for millions of years and were positively selected for biological roles in genome plasticity and evolution, protection of the host against infection of related pathogenic and exogenous retroviruses, and placental development.
Herbicide targets and detoxification proteins in sugarcane: from gene assembly to structure modelling.

PubMed

Lloyd Evans, Dyfed; Joshi, Shailesh Vinay

2017-07-01

In a genome context, sugarcane is a classic orphan crop, in that no genome and only very few genes have been assembled. We have devised a novel exome assembly methodology that has allowed us to assemble and characterize 49 genes that serve as herbicide targets, safener interacting proteins, and members of herbicide detoxification pathways within the sugarcane genome. We have structurally modelled the products of each of these genes, as well as determining allelic, genomic, and RNA-Seq based polymorphisms for each gene. This study provides the largest collection of sugarcane structures modelled to date. We demonstrate that sugarcane genes are highly polymorphic, revealing that each genotype is evolving both uniquely and independently. In addition, we present an exome assembly system for orphan crops that can be executed on commodity infrastructure, making exome assembly practical for any group. In terms of knowledge about herbicide modes of action and detoxification, we have advanced sugarcane from a crop where no information about any herbicide-associated gene was available to the situation where sugarcane is now a species with the single largest collection of known and annotated herbicide-associated genes.

The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome.

PubMed

Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A

2015-01-01

A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
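Because a TTS is described above as a stretch of polypurines, a first-pass scan for candidate sites can be sketched as a search for long runs of A and G. This is only an illustration and not the TTSMI pipeline: the minimum length of 15 is an assumed parameter, the function name is hypothetical, and only the given strand is scanned (a polypyrimidine run would correspond to a polypurine run on the complementary strand).

```python
import re


def polypurine_tracts(seq: str, min_len: int = 15) -> list[tuple[int, int, str]]:
    """Return (start, end, tract) for each maximal run of A/G at least min_len long.

    Coordinates are 0-based and end-exclusive. min_len is an illustrative default.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]


# Toy sequence, invented for illustration, scanned with a short threshold.
print(polypurine_tracts("CTAGGAAGGAGACT", min_len=5))
```

Filtering on stability or specificity, as the database offers, would be applied on top of a candidate list like this one.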
Draft Genome Sequence of the Spore-Forming Probiotic Strain Bacillus coagulans Unique IS-2

PubMed Central

Upadrasta, Aditya; Pitta, Swetha

2016-01-01

Bacillus coagulans Unique IS-2 is a potential spore-forming probiotic that is commercially available on the market. The draft genome sequence presented here provides deep insight into the beneficial features of this strain for its safe use as a probiotic for various human and animal health applications. PMID:27103709
Identification of a unique library of complex, but ordered, arrays of repetitive elements in the human genome and implication of their potential involvement in pathobiology.

PubMed

Lee, Kang-Hoon; Lee, Young-Kwan; Kwon, Deug-Nam; Chiu, Sophia; Chew, Victoria; Rah, Hyungchul; Kujawski, Gregory; Melhem, Ramzi; Hsu, Karen; Chung, Cecilia; Greenhalgh, David G; Cho, Kiho

2011-06-01

Approximately 2% of the human genome is reported to be occupied by genes. Various forms of repetitive elements (REs), both characterized and uncharacterized, are presumed to make up the vast majority of the rest of the genomes of human and other species. In conjunction with a comprehensive annotation of genes, information regarding components of genome biology, such as gene polymorphisms, non-coding RNAs, and certain REs, is found in human genome databases. However, the genome-wide profile of unique RE arrangements formed by different groups of REs has not been fully characterized yet. In this study, the entire human genome was subjected to an unbiased RE survey to establish a whole-genome profile of REs and their arrangements. Due to the limitation in query size within the bl2seq alignment program (National Center for Biotechnology Information [NCBI]) utilized for the RE survey, the entire NCBI reference human genome was fragmented into 6206 units of 0.5M nucleotides. A number of RE arrangements with varying complexities and patterns were identified throughout the genome. Each chromosome had unique profiles of RE arrangements and density, and high levels of RE density were measured near the centromere regions. Subsequently, 175 complex RE arrangements, which were selected throughout the genome, were subjected to a comparison analysis using five different human genome sequences. Interestingly, three of the five human genome databases shared exactly the same arrangement patterns and sequences for all 175 RE arrangement regions (a total of 12,765,625 nucleotides). The findings from this study demonstrate that a substantial fraction of REs in the human genome are clustered into various forms of ordered structures. Further investigations are needed to examine whether some of these ordered RE arrangements contribute to human pathobiology as a functional genome unit. Copyright © 2011 Elsevier Inc. All rights reserved.
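The fragmentation step described above, splitting a reference sequence into units of 0.5M nucleotides to fit an alignment tool's query limit, can be sketched as a simple coordinate generator. The function name is hypothetical and the abstract does not say how the last, shorter unit of each chromosome was handled; this version simply keeps it.

```python
def fragment_coordinates(length: int, unit: int = 500_000) -> list[tuple[int, int]]:
    """Split a sequence of the given length into consecutive (start, end) units.

    Coordinates are 0-based and end-exclusive; the final unit may be shorter than `unit`.
    """
    return [(start, min(start + unit, length)) for start in range(0, length, unit)]


# A 1.2 Mb toy chromosome yields two full units and one partial unit.
print(fragment_coordinates(1_200_000))
```

As a consistency check on the reported figure, 6206 units of at most 500,000 nucleotides cover at most 3,103,000,000 nucleotides, which is in line with the size of a human reference assembly.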
Comparison of the protein-coding gene content of Chlamydia trachomatis and Protochlamydia amoebophila using a Raspberry Pi computer.

PubMed

Robson, James F; Barker, Daniel

2015-10-13

To demonstrate the bioinformatics capabilities of a low-cost computer, the Raspberry Pi, we present a comparison of the protein-coding gene content of two species in phylum Chlamydiae: Chlamydia trachomatis, a common sexually transmitted infection of humans, and Candidatus Protochlamydia amoebophila, a recently discovered amoebal endosymbiont. Identifying species-specific proteins and differences in protein families could provide insights into the unique phenotypes of the two species. Using a Raspberry Pi computer, sequence similarity-based protein families were predicted across the two species, C. trachomatis and P. amoebophila, and their members counted. Examples include nine multi-protein families unique to C. trachomatis, 132 multi-protein families unique to P. amoebophila and one family with multiple copies in both. Most families unique to C. trachomatis were polymorphic outer-membrane proteins. Additionally, multiple protein families lacking functional annotation were found. Predicted functional interactions suggest one of these families is involved with the exodeoxyribonuclease V complex. The Raspberry Pi computer is adequate for a comparative genomics project of this scope. The protein families unique to P. amoebophila may provide a basis for investigating the host-endosymbiont interaction. However, additional species should be included; and further laboratory research is required to identify the functions of unknown or putative proteins. Multiple outer membrane proteins were found in C. trachomatis, suggesting importance for host evasion. The tyrosine transport protein family is shared between both species, with four proteins in C. trachomatis and two in P. amoebophila. Shared protein families could provide a starting point for discovery of wide-spectrum drugs against Chlamydiae.
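Once proteins have been clustered into families, counting the multi-protein families confined to one species is a small bookkeeping step. The sketch below assumes the clustering has already been done by some similarity-based method and that each family is given as a list of (species, protein) pairs; the data layout, function name and toy identifiers are all illustrative assumptions, not the authors' file format.

```python
def species_specific_families(families: dict, min_size: int = 2) -> dict:
    """Group families with at least min_size members by the single species they are confined to.

    families maps family_id -> list of (species, protein_id) tuples.
    Families spanning more than one species, and singletons, are skipped.
    """
    unique = {}
    for family_id, members in families.items():
        species = {sp for sp, _ in members}
        if len(members) >= min_size and len(species) == 1:
            unique.setdefault(species.pop(), []).append(family_id)
    return unique


# Toy clustering result with placeholder identifiers.
toy_families = {
    "F1": [("C_trachomatis", "p1"), ("C_trachomatis", "p2")],
    "F2": [("P_amoebophila", "q1"), ("P_amoebophila", "q2"), ("P_amoebophila", "q3")],
    "F3": [("C_trachomatis", "p3"), ("P_amoebophila", "q4")],  # shared family
    "F4": [("C_trachomatis", "p5")],                           # singleton
}
print(species_specific_families(toy_families))
```

A pass like this is light enough for low-cost hardware; the expensive part of such a project is the all-against-all similarity search that produces the families in the first place.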
Genotype by environment (climate) interaction improves genomic prediction for production traits in US Holstein cattle.

PubMed

Tiezzi, F; de Los Campos, G; Parker Gaddis, K L; Maltecca, C

2017-03-01

Genotype by environment interaction (G × E) in dairy cattle productive traits has been shown to exist, but current genetic evaluation methods do not take this component into account. As several environmental descriptors (e.g., climate, farming system) are known to vary within the United States, not accounting for the G × E could lead to reranking of bulls and loss in genetic gain. Using test-day records on milk yield, somatic cell score, fat, and protein percentage from all over the United States, we computed within herd-year-season daughter yield deviations for 1,087 Holstein bulls and regressed them on genetic and environmental information to estimate variance components and to assess prediction accuracy. Genomic information was obtained from a 50k SNP marker panel. Environmental effect inputs included herd (160 levels), geographical region (7 levels), geographical location (2 variables), climate information (7 variables), and management conditions of the herds (16 total variables divided in 4 subgroups). For each set of environmental descriptors, environmental, genomic, and G × E components were sequentially fitted. Variance components estimates confirmed the presence of G × E on milk yield, with its effect being larger than the main genetic effect and the environmental effect for some models. Conversely, G × E was moderate for somatic cell score and small for milk composition. Genotype by environment interaction, when included, partially eroded the genomic effect (as compared with the models where G × E was not included), suggesting that the genomic variance could at least in part be attributed to G × E not appropriately accounted for. Model predictive ability was assessed using 3 cross-validation schemes (new bulls, incomplete progeny test, and new environmental conditions), and performance was compared with a reference model including only the main genomic effect. In each scenario, at least 1 of the models including G × E was able to perform better than the reference model, although it was not possible to find the overall best-performing model that included the same set of environmental descriptors. In general, the methodology used is promising in accounting for G × E in genomic predictions, but challenges exist in identifying a unique set of covariates capable of describing the entire variety of environments. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Evolutionary insights from Erwinia amylovora genomics.

PubMed

Smits, Theo H M; Rezzonico, Fabio; Duffy, Brion

2011-08-20

Evolutionary genomics is coming into focus with the recent availability of complete sequences for many bacterial species. A hypothesis on the evolution of virulence factors in the plant pathogen Erwinia amylovora, the causative agent of fire blight, was generated using comparative genomics with the genomes of E. amylovora, Erwinia pyrifoliae and Erwinia tasmaniensis. Putative virulence factors were mapped to the proposed genealogy of the genus Erwinia that is based on phylogenetic and genomic data. Ancestral origin of several virulence factors was identified, including levan biosynthesis, sorbitol metabolism, three T3SS and two T6SS. Other factors appeared to have been acquired after divergence of pathogenic species, including a second flagellar gene and two glycosyltransferases involved in amylovoran biosynthesis. E. amylovora singletons include 3 unique T3SS effectors that may explain differential virulence/host ranges. E. amylovora also has a unique T1SS export system, and a unique third T6SS gene cluster. Genetic analysis revealed signatures of foreign DNA suggesting that horizontal gene transfer is responsible for some of these differential features between the three species. Copyright © 2010 Elsevier B.V. All rights reserved.
Genome-wide identification of nuclear receptor (NR) superfamily genes in the copepod Tigriopus japonicus.

PubMed

Hwang, Dae-Sik; Lee, Bo-Young; Kim, Hui-Su; Lee, Min Chul; Kyung, Do-Hyun; Om, Ae-Son; Rhee, Jae-Sung; Lee, Jae-Seong

2014-11-18

Nuclear receptors (NRs) are a large superfamily of proteins defined by a DNA-binding domain (DBD) and a ligand-binding domain (LBD). They function as transcriptional regulators to control expression of genes involved in development, homeostasis, and metabolism. The number of NRs differs from species to species, because of gene duplications and/or lineage-specific gene losses during metazoan evolution. Many NRs in arthropods interact with the ecdysteroid hormone and are involved in ecdysone-mediated signaling. The nuclear receptor superfamily complement has been reported in several arthropods, including crustaceans, but not in copepods. We identified the entire NR repertoire of the copepod Tigriopus japonicus, which is an important marine model species for ecotoxicology and environmental genomics. Using whole genome and transcriptome sequences, we identified a total of 31 nuclear receptors in the genome of T. japonicus. Nomenclature of the nuclear receptors was determined based on the sequence similarities of the DNA-binding domain (DBD) and ligand-binding domain (LBD). The 7 subfamilies of NRs separate into five major clades (subfamilies NR1, NR2, NR3, NR4, and NR5/6). Although the repertoire of NR members in T. japonicus was similar to that reported for other arthropods, there was an expansion of the NR1 subfamily in T. japonicus. The twelve unique nuclear receptors identified in T. japonicus are members of NR1L. This expansion may be a unique lineage-specific feature of crustaceans. Interestingly, E78 and HR83, which are present in other arthropods, were absent from the genomes of T. japonicus and two congeneric copepod species (T. japonicus and Tigriopus californicus), suggesting copepod lineage-specific gene loss. We identified all NRs present in the copepod T. japonicus. Knowledge of the copepod nuclear receptor repertoire will contribute to a better understanding of copepod- and crustacean-specific NR evolution.
Mechanisms of action of Coxiella burnetii effectors inferred from host-pathogen protein interactions.

PubMed

Wallqvist, Anders; Wang, Hao; Zavaljevski, Nela; Memišević, Vesna; Kwon, Keehwan; Pieper, Rembert; Rajagopala, Seesandra V; Reifman, Jaques

2017-01-01

Coxiella burnetii is an obligate Gram-negative intracellular pathogen and the etiological agent of Q fever. Successful infection requires a functional Type IV secretion system, which translocates more than 100 effector proteins into the host cytosol to establish the infection, restructure the intracellular host environment, and create a parasitophorous vacuole where the replicating bacteria reside. We used yeast two-hybrid (Y2H) screening of 33 selected C. burnetii effectors against whole genome human and murine proteome libraries to generate a map of potential host-pathogen protein-protein interactions (PPIs). We detected 273 unique interactions between 20 pathogen and 247 human proteins, and 157 between 17 pathogen and 137 murine proteins. We used orthology to combine the data and create a single host-pathogen interaction network containing 415 unique interactions between 25 C. burnetii and 363 human proteins. We further performed complementary pairwise Y2H testing of 43 out of 91 C. burnetii-human interactions involving five pathogen proteins. We used the combined data to 1) perform enrichment analyses of target host cellular processes and pathways, 2) examine effectors with known infection phenotypes, and 3) infer potential mechanisms of action for four effectors with uncharacterized functions. The host-pathogen interaction profiles supported known Coxiella phenotypes, such as adapting cell morphology through cytoskeletal re-arrangements, protein processing and trafficking, organelle generation, cholesterol processing, innate immune modulation, and interactions with the ubiquitin and proteasome pathways. The generated dataset of PPIs-the largest collection of unbiased Coxiella host-pathogen interactions to date-represents a rich source of information with respect to secreted pathogen effector proteins and their interactions with human host proteins.
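The step of combining human and murine screens through orthology can be pictured as mapping each murine prey to its human ortholog and taking the union of the resulting pairs, so that an interaction seen in both screens is counted once. The sketch below is one plausible reading of that step, not the authors' procedure; the function name, the ortholog table and all identifiers are placeholders.

```python
def merge_by_orthology(human_ppis, murine_ppis, mouse_to_human):
    """Union of (effector, human_protein) pairs after mapping murine preys to human orthologs.

    human_ppis and murine_ppis are iterables of (effector, host_protein) tuples.
    mouse_to_human maps a murine protein ID to its human ortholog ID.
    Murine interactions without a known ortholog are dropped in this sketch.
    """
    merged = set(human_ppis)
    for effector, mouse_protein in murine_ppis:
        human_protein = mouse_to_human.get(mouse_protein)
        if human_protein is not None:
            merged.add((effector, human_protein))
    return merged


# Placeholder data: E* are effectors, H* human proteins, M* murine proteins.
human = {("E1", "H1"), ("E2", "H2")}
murine = {("E1", "M1"), ("E3", "M3"), ("E2", "M9")}
orthologs = {"M1": "H1", "M3": "H3"}  # M9 has no ortholog in this toy table

print(sorted(merge_by_orthology(human, murine, orthologs)))
```

Deduplication of this kind would be consistent with the combined network in the abstract (415 unique interactions) being smaller than the sum of the two screens (273 + 157), although the abstract does not spell out how unmapped proteins were treated.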
Spatial organization of the budding yeast genome in the cell nucleus and identification of specific chromatin interactions from multi-chromosome constrained chromatin model.

PubMed

Gürsoy, Gamze; Xu, Yun; Liang, Jie

2017-07-01

Nuclear landmarks and biochemical factors play important roles in the organization of the yeast genome. The interaction patterns of budding yeast as measured from genome-wide 3C studies are largely recapitulated by model polymer genomes subject to landmark constraints. However, the origin of inter-chromosomal interactions, specific roles of individual landmarks, and the roles of biochemical factors in yeast genome organization remain unclear. Here we describe a multi-chromosome constrained self-avoiding chromatin model (mC-SAC) to gain understanding of the budding yeast genome organization. With significantly improved sampling of genome structures, both intra- and inter-chromosomal interaction patterns from genome-wide 3C studies are accurately captured in our model at higher resolution than previous studies. We show that nuclear confinement is a key determinant of the intra-chromosomal interactions, and centromere tethering is responsible for the inter-chromosomal interactions. In addition, important genomic elements such as fragile sites and tRNA genes are found to be clustered spatially, largely due to centromere tethering. We uncovered previously unknown interactions that were not captured by genome-wide 3C studies, which are found to be enriched with tRNA genes, RNAPIII and TFIIS binding. Moreover, we identified specific high-frequency genome-wide 3C interactions that are unaccounted for by polymer effects under landmark constraints. These interactions are enriched with important genes and likely play biological roles.
Zinc Finger Independent Genome-Wide Binding of Sp2 Potentiates Recruitment of Histone-Fold Protein Nf-y Distinguishing It from Sp1 and Sp3

PubMed Central

Finkernagel, Florian; Stiewe, Thorsten; Nist, Andrea; Suske, Guntram

2015-01-01

Transcription factors are grouped into families based on sequence similarity within functional domains, particularly DNA-binding domains. The Specificity proteins Sp1, Sp2 and Sp3 are paradigmatic of closely related transcription factors. They share amino-terminal glutamine-rich regions and a conserved carboxy-terminal zinc finger domain that can bind to GC rich motifs in vitro. All three Sp proteins are ubiquitously expressed; yet they carry out unique functions in vivo raising the question of how specificity is achieved. Crucially, it is unknown whether they bind to distinct genomic sites and, if so, how binding site selection is accomplished. In this study, we have examined the genomic binding patterns of Sp1, Sp2 and Sp3 in mouse embryonic fibroblasts by ChIP-seq. Sp1 and Sp3 essentially occupy the same promoters and localize to GC boxes. The genomic binding pattern of Sp2 is different; Sp2 primarily localizes at CCAAT motifs. Consistently, re-expression of Sp2 and Sp3 mutants in corresponding knockout MEFs revealed strikingly different modes of genomic binding site selection. Most significantly, while the zinc fingers dictate genomic binding of Sp3, they are completely dispensable for binding of Sp2. Instead, the glutamine-rich amino-terminal region is sufficient for recruitment of Sp2 to its target promoters in vivo. We have identified the trimeric histone-fold CCAAT box binding transcription factor Nf-y as the major partner for Sp2-chromatin interaction. Nf-y is critical for recruitment of Sp2 to co-occupied regulatory elements. Equally, Sp2 potentiates binding of Nf-y to shared sites indicating the existence of an extensive Sp2-Nf-y interaction network. Our results unveil strikingly different recruitment mechanisms of Sp1/Sp2/Sp3 transcription factor members uncovering an unexpected layer of complexity in their binding to chromatin in vivo. PMID:25793500
The Naegleria genome: a free-living microbial eukaryote lends unique insights into core eukaryotic cell biology

PubMed Central

Fritz-Laylin, Lillian K.; Ginger, Michael L.; Walsh, Charles; Dawson, Scott C.; Fulton, Chandler

2016-01-01

Naegleria gruberi, a free-living protist, has long been treasured as a model for basal body and flagellar assembly due to its ability to differentiate from crawling amoebae into swimming flagellates. The full genome sequence of Naegleria gruberi has recently been used to estimate gene families ancestral to all eukaryotes and to identify novel aspects of Naegleria biology, including likely facultative anaerobic metabolism, extensive signaling cascades, and evidence for sexuality. Distinctive features of the Naegleria genome and nuclear biology provide unique perspectives for comparative cell biology, including cell division, RNA processing and nucleolar assembly. We highlight here exciting novel aspects of Naegleria biology identified through genomic analysis. PMID:21392573
Structural Mechanisms of Plant Glucan Phosphatases in Starch Metabolism

PubMed Central

Meekins, David A.; Vander Kooi, Craig W.; Gentry, Matthew S.

2016-01-01

Glucan phosphatases are a recently discovered class of enzymes that dephosphorylate starch and glycogen, thereby regulating energy metabolism. Plant genomes encode for two glucan phosphatases called Starch EXcess4 (SEX4) and Like Sex Four2 (LSF2) that regulate starch metabolism by selectively dephosphorylating glucose moieties within starch glucan chains. Recently, the structures of both SEX4 and LSF2 were determined, with and without phosphoglucan products bound, revealing the mechanism for their unique activities. This review explores the structural and enzymatic features of the plant glucan phosphatases and outlines how they are uniquely adapted for carrying out their cellular functions. We outline the physical mechanisms employed by SEX4 and LSF2 to interact with starch glucans: SEX4 binds glucan chains via a continuous glucan binding platform comprised of its Dual Specificity Phosphatase (DSP) domain and Carbohydrate Binding Module (CBM) while LSF2 utilizes Surface Binding Sites (SBSs). SEX4 and LSF2 both contain a unique network of aromatic residues in their catalytic DSP domains that serve as glucan engagement platforms and are unique to the glucan phosphatases. We also discuss the phosphoglucan substrate specificities inherent to SEX4 and LSF2 and outline structural features within the active site that govern glucan orientation. This review defines the structural mechanism of the plant glucan phosphatases with respect to phosphatases, starch metabolism, and protein-glucan interaction; thereby providing a framework for their applications in both agricultural and industrial settings. PMID:26934589
Genome sequences of the human body louse and its primary endosymbiont provide insights into the permanent parasitic lifestyle.

PubMed

Kirkness, Ewen F; Haas, Brian J; Sun, Weilin; Braig, Henk R; Perotti, M Alejandra; Clark, John M; Lee, Si Hyeock; Robertson, Hugh M; Kennedy, Ryan C; Elhaik, Eran; Gerlach, Daniel; Kriventseva, Evgenia V; Elsik, Christine G; Graur, Dan; Hill, Catherine A; Veenstra, Jan A; Walenz, Brian; Tubío, José Manuel C; Ribeiro, José M C; Rozas, Julio; Johnston, J Spencer; Reese, Justin T; Popadic, Aleksandar; Tojo, Marta; Raoult, Didier; Reed, David L; Tomoyasu, Yoshinori; Kraus, Emily; Mittapalli, Omprakash; Margam, Venu M; Li, Hong-Mei; Meyer, Jason M; Johnson, Reed M; Romero-Severson, Jeanne; Vanzee, Janice Pagel; Alvarez-Ponce, David; Vieira, Filipe G; Aguadé, Montserrat; Guirao-Rico, Sara; Anzola, Juan M; Yoon, Kyong S; Strycharz, Joseph P; Unger, Maria F; Christley, Scott; Lobo, Neil F; Seufferheld, Manfredo J; Wang, Naikuan; Dasch, Gregory A; Struchiner, Claudio J; Madey, Greg; Hannick, Linda I; Bidwell, Shelby; Joardar, Vinita; Caler, Elisabet; Shao, Renfu; Barker, Stephen C; Cameron, Stephen; Bruggner, Robert V; Regier, Allison; Johnson, Justin; Viswanathan, Lakshmi; Utterback, Terry R; Sutton, Granger G; Lawson, Daniel; Waterhouse, Robert M; Venter, J Craig; Strausberg, Robert L; Berenbaum, May R; Collins, Frank H; Zdobnov, Evgeny M; Pittendrigh, Barry R

2010-07-06

As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes less than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens.
Genome sequences of the human body louse and its primary endosymbiont provide insights into the permanent parasitic lifestyle

PubMed Central

Kirkness, Ewen F.; Haas, Brian J.; Sun, Weilin; Braig, Henk R.; Perotti, M. Alejandra; Clark, John M.; Lee, Si Hyeock; Robertson, Hugh M.; Kennedy, Ryan C.; Elhaik, Eran; Gerlach, Daniel; Kriventseva, Evgenia V.; Elsik, Christine G.; Graur, Dan; Hill, Catherine A.; Veenstra, Jan A.; Walenz, Brian; Tubío, José Manuel C.; Ribeiro, José M. C.; Rozas, Julio; Johnston, J. Spencer; Reese, Justin T.; Popadic, Aleksandar; Tojo, Marta; Raoult, Didier; Reed, David L.; Tomoyasu, Yoshinori; Kraus, Emily; Mittapalli, Omprakash; Margam, Venu M.; Li, Hong-Mei; Meyer, Jason M.; Johnson, Reed M.; Romero-Severson, Jeanne; VanZee, Janice Pagel; Alvarez-Ponce, David; Vieira, Filipe G.; Aguadé, Montserrat; Guirao-Rico, Sara; Anzola, Juan M.; Yoon, Kyong S.; Strycharz, Joseph P.; Unger, Maria F.; Christley, Scott; Lobo, Neil F.; Seufferheld, Manfredo J.; Wang, NaiKuan; Dasch, Gregory A.; Struchiner, Claudio J.; Madey, Greg; Hannick, Linda I.; Bidwell, Shelby; Joardar, Vinita; Caler, Elisabet; Shao, Renfu; Barker, Stephen C.; Cameron, Stephen; Bruggner, Robert V.; Regier, Allison; Johnson, Justin; Viswanathan, Lakshmi; Utterback, Terry R.; Sutton, Granger G.; Lawson, Daniel; Waterhouse, Robert M.; Venter, J. Craig; Strausberg, Robert L.; Collins, Frank H.; Zdobnov, Evgeny M.; Pittendrigh, Barry R.

2010-01-01

As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes less than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens. PMID:20566863
Draft genome sequence of marine alphaproteobacterial strain HIMB11, the first cultivated representative of a unique lineage within the Roseobacter clade possessing an unusually small genome

PubMed Central

Durham, Bryndan P.; Grote, Jana; Whittaker, Kerry A.; Bender, Sara J.; Luo, Haiwei; Grim, Sharon L.; Brown, Julia M.; Casey, John R.; Dron, Antony; Florez-Leiva, Lennin; Krupke, Andreas; Luria, Catherine M.; Mine, Aric H.; Nigro, Olivia D.; Pather, Santhiska; Talarmin, Agathe; Wear, Emma K.; Weber, Thomas S.; Wilson, Jesse M.; Church, Matthew J.; DeLong, Edward F.; Karl, David M.; Steward, Grieg F.; Eppley, John M.; Kyrpides, Nikos C.; Schuster, Stephan; Rappé, Michael S.

2014-01-01

Strain HIMB11 is a planktonic marine bacterium isolated from coastal seawater in Kaneohe Bay, Oahu, Hawaii belonging to the ubiquitous and versatile Roseobacter clade of the alphaproteobacterial family Rhodobacteraceae. Here we describe the preliminary characteristics of strain HIMB11, including annotation of the draft genome sequence and comparative genomic analysis with other members of the Roseobacter lineage. The 3,098,747 bp draft genome is arranged in 34 contigs and contains 3,183 protein-coding genes and 54 RNA genes. Phylogenomic and 16S rRNA gene analyses indicate that HIMB11 represents a unique sublineage within the Roseobacter clade. Comparison with other publicly available genome sequences from members of the Roseobacter lineage reveals that strain HIMB11 has the genomic potential to utilize a wide variety of energy sources (e.g. organic matter, reduced inorganic sulfur, light, carbon monoxide), while possessing a reduced number of substrate transporters. PMID:25197450
Draft genome sequence of marine alphaproteobacterial strain HIMB11, the first cultivated representative of a unique lineage within the Roseobacter clade possessing an unusually small genome.

PubMed

Durham, Bryndan P; Grote, Jana; Whittaker, Kerry A; Bender, Sara J; Luo, Haiwei; Grim, Sharon L; Brown, Julia M; Casey, John R; Dron, Antony; Florez-Leiva, Lennin; Krupke, Andreas; Luria, Catherine M; Mine, Aric H; Nigro, Olivia D; Pather, Santhiska; Talarmin, Agathe; Wear, Emma K; Weber, Thomas S; Wilson, Jesse M; Church, Matthew J; DeLong, Edward F; Karl, David M; Steward, Grieg F; Eppley, John M; Kyrpides, Nikos C; Schuster, Stephan; Rappé, Michael S

2014-06-15

Strain HIMB11 is a planktonic marine bacterium isolated from coastal seawater in Kaneohe Bay, Oahu, Hawaii belonging to the ubiquitous and versatile Roseobacter clade of the alphaproteobacterial family Rhodobacteraceae. Here we describe the preliminary characteristics of strain HIMB11, including annotation of the draft genome sequence and comparative genomic analysis with other members of the Roseobacter lineage. The 3,098,747 bp draft genome is arranged in 34 contigs and contains 3,183 protein-coding genes and 54 RNA genes. Phylogenomic and 16S rRNA gene analyses indicate that HIMB11 represents a unique sublineage within the Roseobacter clade. Comparison with other publicly available genome sequences from members of the Roseobacter lineage reveals that strain HIMB11 has the genomic potential to utilize a wide variety of energy sources (e.g. organic matter, reduced inorganic sulfur, light, carbon monoxide), while possessing a reduced number of substrate transporters.
Global Genome and Transcriptome Analyses of Magnaporthe oryzae Epidemic Isolate 98-06 Uncover Novel Effectors and Pathogenicity-Related Genes, Revealing Gene Gain and Lose Dynamics in Genome Evolution

PubMed Central

Dong, Yanhan; Li, Ying; Zhao, Miaomiao; Jing, Maofeng; Liu, Xinyu; Liu, Muxing; Guo, Xianxian; Zhang, Xing; Chen, Yue; Liu, Yongfeng; Liu, Yanhong; Ye, Wenwu; Zhang, Haifeng; Wang, Yuanchao; Zheng, Xiaobo; Wang, Ping; Zhang, Zhengguang

2015-01-01

Genome dynamics of pathogenic organisms are driven by pathogen and host co-evolution, in which pathogen genomes are shaped to overcome stresses imposed by hosts with various genetic backgrounds through generation of a variety of isolates. This same principle applies to the rice blast pathogen Magnaporthe oryzae and the rice host; however, genetic variations among different isolates of M. oryzae remain largely unknown, particularly at genome and transcriptome levels. Here, we applied genomic and transcriptomic analytical tools to investigate M. oryzae isolate 98-06 that is the most aggressive in infection of susceptible rice cultivars. A unique 1.4 Mb of genomic sequences was found in isolate 98-06 in comparison to reference strain 70-15. Genome-wide expression profiling revealed the presence of two critical expression patterns of M. oryzae based on 64 known pathogenicity-related (PaR) genes. In addition, 134 candidate effectors with various segregation patterns were identified. Five tested proteins could suppress BAX-mediated programmed cell death in Nicotiana benthamiana leaves. Characterization of isolate-specific effector candidates Iug6 and Iug9 and PaR candidate Iug18 revealed that they have a role in fungal propagation and pathogenicity. Moreover, Iug6 and Iug9 are located exclusively in the biotrophic interfacial complex (BIC) and their overexpression leads to suppression of defense-related gene expression in rice, suggesting that they might participate in biotrophy by inhibiting the SA and ET pathways within the host. Thus, our studies identify novel effector and PaR proteins involved in pathogenicity of the highly aggressive M. oryzae field isolate 98-06, and reveal molecular and genomic dynamics in the evolution of M. oryzae and rice host interactions. PMID:25837042
Comparative Genomic Analysis of Phylogenetically Closely Related Hydrogenobaculum sp. Isolates from Yellowstone National Park

PubMed Central

Romano, Christine; D'Imperio, Seth; Woyke, Tanja; Mavromatis, Konstantinos; Lasken, Roger; Shock, Everett L.

2013-01-01

We describe the complete genome sequences of four closely related Hydrogenobaculum sp. isolates (≥99.7% 16S rRNA gene identity) that were isolated from the outflow channel of Dragon Spring (DS), Norris Geyser Basin, in Yellowstone National Park (YNP), WY. The genomes range in size from 1,552,607 to 1,552,931 bp, contain 1,667 to 1,676 predicted genes, and are highly syntenic. There are subtle differences among the DS isolates, which as a group are different from Hydrogenobaculum sp. strain Y04AAS1 that was previously isolated from a geographically distinct YNP geothermal feature. Genes unique to the DS genomes encode arsenite [As(III)] oxidation, NADH-ubiquinone-plastoquinone (complex I), NADH-ubiquinone oxidoreductase chain, a DNA photolyase, and elements of a type II secretion system. Functions unique to strain Y04AAS1 include thiosulfate metabolism, nitrate respiration, and mercury resistance determinants. DS genomes contain seven CRISPR loci that are almost identical but are different from the single CRISPR locus in strain Y04AAS1. Other differences between the DS and Y04AAS1 genomes include average nucleotide identity (94.764%) and percentage conserved DNA (80.552%). Approximately half of the genes unique to Y04AAS1 are predicted to have been acquired via horizontal gene transfer. Fragment recruitment analysis and marker gene searches demonstrated that the DS metagenome was more similar to the DS genomes than to the Y04AAS1 genome, but that the DS community is likely comprised of a continuum of Hydrogenobaculum genotypes that span from the DS genomes described here to an Y04AAS1-like organism, which appears to represent a distinct ecotype relative to the DS genomes characterized. PMID:23435891
Chromatin interaction networks revealed unique connectivity patterns of broad H3K4me3 domains and super enhancers in 3D chromatin.

PubMed

Thibodeau, Asa; Márquez, Eladio J; Shin, Dong-Guk; Vera-Licona, Paola; Ucar, Duygu

2017-10-31

Broad domain promoters and super enhancers are regulatory elements that govern cell-specific functions and harbor disease-associated sequence variants. These elements are characterized by distinct epigenomic profiles, such as expanded deposition of histone marks H3K27ac for super enhancers and H3K4me3 for broad domains; however, little is known about how they interact with each other and the rest of the genome in three-dimensional chromatin space. Using network theory methods, we studied chromatin interactions between broad domains and super enhancers in three ENCODE cell lines (K562, MCF7, GM12878) obtained via ChIA-PET, Hi-C, and Hi-CHIP assays. In these networks, broad domains and super enhancers interact more frequently with each other compared to their typical counterparts. Network measures and graphlets revealed distinct connectivity patterns associated with these regulatory elements that are robust across cell types and alternative assays. Machine learning models showed that these connectivity patterns could effectively discriminate broad domains from typical promoters and super enhancers from typical enhancers. Finally, targets of broad domains in these networks were enriched in disease-causing SNPs of cognate cell types. Taken together these results suggest a robust and unique organization of the chromatin around broad domains and super enhancers: loci critical for pathologies and cell-specific functions.
Peering down the barrel of a bacteriophage portal: the genome packaging and release valve in p22.

PubMed

Tang, Jinghua; Lander, Gabriel C; Olia, Adam S; Li, Rui; Casjens, Sherwood; Prevelige, Peter; Cingolani, Gino; Baker, Timothy S; Johnson, John E

2011-04-13

The encapsidated genome in all double-strand DNA bacteriophages is packaged to liquid crystalline density through a unique vertex in the procapsid assembly intermediate, which has a portal protein dodecamer in place of five coat protein subunits. The portal orchestrates DNA packaging and exit, through a series of varying interactions with the scaffolding, terminase, and closure proteins. Here, we report an asymmetric cryoEM reconstruction of the entire P22 virion at 7.8 Å resolution. X-ray crystal structure models of the full-length portal and of the portal lacking 123 residues at the C terminus in complex with gene product 4 (Δ123portal-gp4) obtained by Olia et al. (2011) were fitted into this reconstruction. The interpreted density map revealed that the 150 Å, coiled-coil, barrel portion of the portal entraps the last DNA to be packaged and suggests a mechanism for head-full DNA signaling and transient stabilization of the genome during addition of closure proteins. Copyright © 2011 Elsevier Ltd. All rights reserved.

Protoparvovirus Knocking at the Nuclear Door.

PubMed

Mäntylä, Elina; Kann, Michael; Vihinen-Ranta, Maija

2017-10-02

Protoparvoviruses target the nucleus due to their dependence on the cellular reproduction machinery during the replication and expression of their single-stranded DNA genome. In recent years, our understanding of the multistep process of the capsid nuclear import has improved, and led to the discovery of unique viral nuclear entry strategies. Preceded by endosomal transport, endosomal escape and microtubule-mediated movement to the vicinity of the nuclear envelope, the protoparvoviruses interact with the nuclear pore complexes. The capsids are transported actively across the nuclear pore complexes using nuclear import receptors. The nuclear import is sometimes accompanied by structural changes in the nuclear envelope, and is completed by intranuclear disassembly of capsids and chromatinization of the viral genome. This review discusses the nuclear import strategies of protoparvoviruses and describes its dynamics comprising active and passive movement, and directed and diffusive motion of capsids in the molecularly crowded environment of the cell.
Breaking barriers in the genomics and pharmacogenetics of drug addiction

PubMed Central

Ho, MK; Goldman, D; Heinz, A; Kaprio, J; Kreek, MJ; Li, MD; Munafò, MR; Tyndale, RF

2013-01-01

Drug addictions remain a substantial health issue, with limited treatment options currently available. Despite considerable advances in the understanding of our genetic architecture, the genetic underpinning of complex disorders remains elusive. Numerous candidate genes have been implicated in the etiology and response to treatment for different addictions based on our current understanding of the neurobiology. Genome-wide association studies have also provided novel targets. However, replication of these studies is often lacking which complicates interpretation; this will improve as issues such as phenotypic characterization, the apparent “missing heritability”, the identification of functional variants, and possible gene-environment interactions are addressed. In addition, there is growing evidence that genetic information can be useful for refining the choice of addiction treatment. As genetic testing becomes more common in the practice of medicine, a variety of ethical and practical challenges, some of which are unique to drug addiction, will also need to be considered. PMID:20981002
Mammalian-specific genomic functions: Newly acquired traits generated by genomic imprinting and LTR retrotransposon-derived genes in mammals

PubMed Central

KANEKO-ISHINO, Tomoko; ISHINO, Fumitoshi

2015-01-01

Mammals, including human beings, have evolved a unique viviparous reproductive system and a highly developed central nervous system. How did these unique characteristics emerge in mammalian evolution, and what kinds of changes did occur in the mammalian genomes as evolution proceeded? A key conceptual term in approaching these issues is “mammalian-specific genomic functions”, a concept covering both mammalian-specific epigenetics and genetics. Genomic imprinting and LTR retrotransposon-derived genes are reviewed as the representative, mammalian-specific genomic functions that are essential not only for the current mammalian developmental system, but also mammalian evolution itself. First, the essential roles of genomic imprinting in mammalian development, especially related to viviparous reproduction via placental function, as well as the emergence of genomic imprinting in mammalian evolution, are discussed. Second, we introduce the novel concept of “mammalian-specific traits generated by mammalian-specific genes from LTR retrotransposons”, based on the finding that LTR retrotransposons served as a critical driving force in the mammalian evolution via generating mammalian-specific genes. PMID:26666304
Mammalian-specific genomic functions: Newly acquired traits generated by genomic imprinting and LTR retrotransposon-derived genes in mammals.

PubMed

Kaneko-Ishino, Tomoko; Ishino, Fumitoshi

2015-01-01

Mammals, including human beings, have evolved a unique viviparous reproductive system and a highly developed central nervous system. How did these unique characteristics emerge in mammalian evolution, and what kinds of changes did occur in the mammalian genomes as evolution proceeded? A key conceptual term in approaching these issues is "mammalian-specific genomic functions", a concept covering both mammalian-specific epigenetics and genetics. Genomic imprinting and LTR retrotransposon-derived genes are reviewed as the representative, mammalian-specific genomic functions that are essential not only for the current mammalian developmental system, but also mammalian evolution itself. First, the essential roles of genomic imprinting in mammalian development, especially related to viviparous reproduction via placental function, as well as the emergence of genomic imprinting in mammalian evolution, are discussed. Second, we introduce the novel concept of "mammalian-specific traits generated by mammalian-specific genes from LTR retrotransposons", based on the finding that LTR retrotransposons served as a critical driving force in the mammalian evolution via generating mammalian-specific genes.
Hepatitis A virus: host interactions, molecular epidemiology and evolution.

PubMed

Vaughan, Gilberto; Goncalves Rossi, Livia Maria; Forbi, Joseph C; de Paula, Vanessa S; Purdy, Michael A; Xia, Guoliang; Khudyakov, Yury E

2014-01-01

Infection with hepatitis A virus (HAV) is the commonest viral cause of liver disease and presents an important public health problem worldwide. Several unique HAV properties and molecular mechanisms of its interaction with host were recently discovered and should aid in clarifying the pathogenesis of hepatitis A. Genetic characterization of HAV strains have resulted in the identification of different genotypes and subtypes, which exhibit a characteristic worldwide distribution. Shifts in HAV endemicity occurring in different parts of the world, introduction of genetically diverse strains from geographically distant regions, genotype displacement observed in some countries and population expansion detected in the last decades of the 20th century using phylogenetic analysis are important factors contributing to the complex dynamics of HAV infections worldwide. Strong selection pressures, some of which, like usage of deoptimized codons, are unique to HAV, limit genetic variability of the virus. Analysis of subgenomic regions has been proven useful for outbreak investigations. However, sharing short sequences among epidemiologically unrelated strains indicates that specific identification of HAV strains for molecular surveillance can be achieved only using whole-genome sequences. Here, we present up-to-date information on the HAV molecular epidemiology and evolution, and highlight the most relevant features of the HAV-host interactions. Published by Elsevier B.V.
Accessing the genomic effects of naked nanoceria in murine neuronal cells.

PubMed

Lee, Tin-Lap; Raitano, Joan M; Rennert, Owen M; Chan, Siu-Wai; Chan, Wai-Yee

2012-07-01

Cerium oxide nanoparticles (nanoceria) are engineered nanoparticles whose versatility is due to their unique redox properties. We and others have demonstrated that naked nanoceria can act as antioxidants to protect cells against oxidative damage. Although the redox properties may be beneficial, the genome-wide effects of nanoceria on gene transcription and associated biological processes remain elusive. Here we applied a functional genomic approach to examine the genome-wide effects of nanoceria on global gene transcription and cellular functions in mouse neuronal cells. Importantly, we demonstrated that nanoceria induced chemical- and size-specific changes in the murine neuronal cell transcriptome. The nanoceria contributed more than 83% of the population of uniquely altered genes and were associated with a unique spectrum of genes related to neurological disease, cell cycle control, and growth. These observations suggest that an in-depth assessment of potential health effects of naked nanoceria and other naked nanoparticles is both necessary and imminent. Cerium oxide nanoparticles are important antioxidants, with potential applications in neurodegenerative conditions. This team of investigators demonstrated the genomic effects of nanoceria, showing that it induced chemical- and size-specific changes in the murine neuronal cell transcriptome. Published by Elsevier Inc.
First draft genome of an iconic clownfish species (Amphiprion frenatus).

PubMed

Marcionetti, Anna; Rossier, Victor; Bertrand, Joris A M; Litsios, Glenn; Salamin, Nicolas

2018-02-17

Clownfishes (or anemonefishes) form an iconic group of coral reef fishes, principally known for their mutualistic interaction with sea anemones. They are characterized by particular life history traits, such as a complex social structure and mating system involving sequential hermaphroditism, coupled with an exceptionally long lifespan. Additionally, clownfishes are considered to be one of the rare groups to have experienced an adaptive radiation in the marine environment. Here, we assembled and annotated the first genome of a clownfish species, the tomato clownfish (Amphiprion frenatus). We obtained 17,801 assembled scaffolds, containing a total of 26,917 genes. The completeness of the assembly and annotation was satisfying, with 96.5% of the Actinopterygii Benchmarking Universal Single-Copy Orthologs (BUSCOs) being retrieved in A. frenatus assembly. The quality of the resulting assembly is comparable to other bony fish assemblies. This resource is valuable for advancing studies of the particular life history traits of clownfishes, as well as being useful for population genetic studies and the development of new phylogenetic markers. It will also open the way to comparative genomics. Indeed, future genomic comparison among closely related fishes may provide means to identify genes related to the unique adaptations to different sea anemone hosts, as well as better characterize the genomic signatures of an adaptive radiation. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
Comparative genomic analysis of Helicobacter pylori from Malaysia identifies three distinct lineages suggestive of differential evolution.

PubMed

Kumar, Narender; Mariappan, Vanitha; Baddam, Ramani; Lankapalli, Aditya K; Shaik, Sabiha; Goh, Khean-Lee; Loke, Mun Fai; Perkins, Tim; Benghezal, Mohammed; Hasnain, Seyed E; Vadivelu, Jamuna; Marshall, Barry J; Ahmed, Niyaz

2015-01-01

The discordant prevalence of Helicobacter pylori and its related diseases, for a long time, fostered certain enigmatic situations observed in the countries of the southern world. Variation in H. pylori infection rates and disease outcomes among different populations in multi-ethnic Malaysia provides a unique opportunity to understand dynamics of host-pathogen interaction and genome evolution. In this study, we extensively analyzed and compared genomes of 27 Malaysian H. pylori isolates and identified three major phylogeographic lineages: hspEastAsia, hpEurope and hpSouthIndia. The analysis of the virulence genes within the core genome, however, revealed a comparable pathogenic potential of the strains. In addition, we identified four genes limited to strains of East-Asian lineage. Our analyses identified a few strain-specific genes encoding restriction modification systems and outlined 311 core genes possibly under differential evolutionary constraints, among the strains representing different ethnic groups. The cagA and vacA genes also showed variations in accordance with the host genetic background of the strains. Moreover, restriction modification genes were found to be significantly enriched in East-Asian strains. An understanding of these variations in the genome content would provide significant insights into various adaptive and host modulation strategies harnessed by H. pylori to effectively persist in a host-specific manner. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
The PathoYeastract database: an information system for the analysis of gene and genomic transcription regulation in pathogenic yeasts.

PubMed

Monteiro, Pedro Tiago; Pais, Pedro; Costa, Catarina; Manna, Sauvagya; Sá-Correia, Isabel; Teixeira, Miguel Cacho

2017-01-04

We present the PATHOgenic YEAst Search for Transcriptional Regulators And Consensus Tracking (PathoYeastract - http://pathoyeastract.org) database, a tool for the analysis and prediction of transcription regulatory associations at the gene and genomic levels in the pathogenic yeasts Candida albicans and C. glabrata. Upon data retrieval from hundreds of publications, followed by curation, the database currently includes 28 000 unique documented regulatory associations between transcription factors (TF) and target genes and 107 DNA binding sites, considering 134 TFs in both species. Following the structure used for the YEASTRACT database, PathoYeastract makes available bioinformatics tools that enable the user to exploit the existing information to predict the TFs involved in the regulation of a gene or genome-wide transcriptional response, while ranking those TFs in order of their relative importance. Each search can be filtered based on the selection of specific environmental conditions, experimental evidence or positive/negative regulatory effect. Promoter analysis tools and interactive visualization tools for the representation of TF regulatory networks are also provided. The PathoYeastract database further provides simple tools for the prediction of gene and genomic regulation based on orthologous regulatory associations described for other yeast species, a comparative genomics setup for the study of cross-species evolution of regulatory networks. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
MIPS: a database for protein sequences, homology data and yeast genome information.

PubMed Central

Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

1997-01-01

The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
A rice kinase-protein interaction map.

PubMed

Ding, Xiaodong; Richter, Todd; Chen, Mei; Fujii, Hiroaki; Seo, Young Su; Xie, Mingtang; Zheng, Xianwu; Kanrar, Siddhartha; Stevenson, Rebecca A; Dardick, Christopher; Li, Ying; Jiang, Hao; Zhang, Yan; Yu, Fahong; Bartley, Laura E; Chern, Mawsheng; Bart, Rebecca; Chen, Xiuhua; Zhu, Lihuang; Farmerie, William G; Gribskov, Michael; Zhu, Jian-Kang; Fromm, Michael E; Ronald, Pamela C; Song, Wen-Yuan

2009-03-01

Plants uniquely contain large numbers of protein kinases, and for the vast majority of the 1,429 kinases predicted in the rice (Oryza sativa) genome, little is known of their functions. Genetic approaches often fail to produce observable phenotypes; thus, new strategies are needed to delineate kinase function. We previously developed a cost-effective high-throughput yeast two-hybrid system. Using this system, we have generated a protein interaction map of 116 representative rice kinases and 254 of their interacting proteins. Overall, the resulting interaction map supports a large number of known or predicted kinase-protein interactions from both plants and animals and reveals many new functional insights. Notably, we found a potential widespread role for E3 ubiquitin ligases in pathogen defense signaling mediated by receptor-like kinases, particularly by the kinases that may have evolved from recently expanded kinase subfamilies in rice. We anticipate that the data provided here will serve as a foundation for targeted functional studies in rice and other plants. The application of yeast two-hybrid and TAPtag analyses for large-scale plant protein interaction studies is also discussed.
Sequence analysis of the PIP5K locus in Eimeria maxima provides further evidence for eimerian genome plasticity and segmental organization.

PubMed

Song, B K; Pan, M Z; Lau, Y L; Wan, K L

2014-07-29

Commercial flocks infected by Eimeria species parasites, including Eimeria maxima, have an increased risk of developing clinical or subclinical coccidiosis; an intestinal enteritis associated with increased mortality rates in poultry. Currently, infection control is largely based on chemotherapy or live vaccines; however, drug resistance is common and vaccines are relatively expensive. The development of new cost-effective intervention measures will benefit from unraveling the complex genetic mechanisms that underlie host-parasite interactions, including the identification and characterization of genes encoding proteins such as phosphatidylinositol 4-phosphate 5-kinase (PIP5K). We previously identified a PIP5K coding sequence within the E. maxima genome. In this study, we analyzed two bacterial artificial chromosome clones presenting a ~145-kb E. maxima (Weybridge strain) genomic region spanning the PIP5K gene locus. Sequence analysis revealed that ~95% of the simple sequence repeats detected were located within regions comparable to the previously described feature-rich segments of the Eimeria tenella genome. Comparative sequence analysis with the orthologous E. maxima (Houghton strain) region revealed a moderate level of conserved synteny. Unique segmental organizations and telomere-like repeats were also observed in both genomes. A number of incomplete transposable elements were detected and further scrutiny of these elements in both orthologous segments revealed interesting nesting events, which may play a role in facilitating genome plasticity in E. maxima. The current analysis provides more detailed information about the genome organization of E. maxima and may help to reveal genotypic differences that are important for expression of traits related to pathogenicity and virulence.
Microbial ecology of the skin in the era of metagenomics and molecular microbiology.

PubMed

Hannigan, Geoffrey D; Grice, Elizabeth A

2013-12-01

The skin is the primary physical barrier between the body and the external environment and is also a substrate for the colonization of numerous microbes. Previously, dermatological microbiology research was dominated by culture-based techniques, but significant advances in genomic technologies have enabled the development of less-biased, culture-independent approaches to characterize skin microbial communities. These molecular microbiology approaches illustrate the great diversity of microbiota colonizing the skin and highlight unique features such as site specificity, temporal dynamics, and interpersonal variation. Disruptions in skin commensal microbiota are associated with the progression of many dermatological diseases. A greater understanding of how skin microbes interact with each other and with their host, and how we can therapeutically manipulate those interactions, will provide powerful tools for treating and preventing dermatological disease.
Primordial germ cell-mediated transgenesis and genome editing in birds.

PubMed

Han, Jae Yong; Park, Young Hyun

2018-01-01

Transgenesis and genome editing in birds are based on a unique germline transmission system using primordial germ cells (PGCs), which is quite different from the mammalian transgenic and genome editing system. PGCs are progenitor cells of gametes that can deliver genetic information to the next generation. Since avian PGCs were first discovered in the nineteenth century, there have been numerous efforts to reveal their origin, specification, and unique migration pattern, and to improve germline transmission efficiency. Recent advances in the isolation and in vitro culture of avian PGCs with genetic manipulation and genome editing tools enable the development of valuable avian models that were unavailable before. However, many challenges remain in the production of transgenic and genome-edited birds, including the precise control of germline transmission, introduction of exogenous genes, and genome editing in PGCs. Therefore, establishing reliable germline-competent PGCs and applying precise genome editing systems are critical current issues in the production of avian models. Here, we introduce a historical overview of avian PGCs and their application, including improved techniques and methodologies in the production of transgenic and genome-edited birds, and we discuss the future potential applications of transgenic and genome-edited birds to provide opportunities and benefits for humans.
The compact genome of the plant pathogen Plasmodiophora brassicae is adapted to intracellular interactions with host Brassica spp.

PubMed

Rolfe, Stephen A; Strelkov, Stephen E; Links, Matthew G; Clarke, Wayne E; Robinson, Stephen J; Djavaheri, Mohammad; Malinowski, Robert; Haddadi, Parham; Kagale, Sateesh; Parkin, Isobel A P; Taheri, Ali; Borhan, M Hossein

2016-03-31

The protist Plasmodiophora brassicae is a soil-borne pathogen of cruciferous species and the causal agent of clubroot disease of Brassicas including agriculturally important crops such as canola/rapeseed (Brassica napus). P. brassicae has remained an enigmatic plant pathogen and is a rare example of an obligate biotroph that resides entirely inside the host plant cell. The pathogen is the cause of severe yield losses and can render infested fields unsuitable for Brassica crop growth due to the persistence of resting spores in the soil for up to 20 years. To provide insight into the biology of the pathogen and its interaction with its primary host B. napus, we produced a draft genome of P. brassicae pathotypes 3 and 6 (Pb3 and Pb6) that differ in their host range. Pb3 is highly virulent on B. napus (but also infects other Brassica species) while Pb6 infects only vegetable Brassica crops. Both the Pb3 and Pb6 genomes are highly compact, each with a total size of 24.2 Mb, and contain less than 2 % repetitive DNA. Clustering of genome-wide single nucleotide polymorphisms (SNP) of Pb3, Pb6 and three additional re-sequenced pathotypes (Pb2, Pb5 and Pb8) shows a high degree of correlation of cluster grouping with host range. The Pb3 genome features significant reduction of intergenic space with multiple examples of overlapping untranslated regions (UTRs). Dependency on the host for essential nutrients is evident from the loss of genes for the biosynthesis of thiamine and some amino acids and the presence of a wide range of transport proteins, including some unique to P. brassicae. The annotated genes of Pb3 include those with a potential role in the regulation of the plant growth hormones cytokinin and auxin. The expression profile of Pb3 genes, including putative effectors, during infection and their potential role in manipulation of host defence is discussed. The P. brassicae genome sequence reveals a compact genome, a dependency of the pathogen on its host for some essential nutrients and a potential role in the regulation of host plant cytokinin and auxin. Genome annotation supported by RNA sequencing reveals significant reduction in intergenic space which, in addition to low repeat content, has likely contributed to the P. brassicae compact genome.
From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

PubMed

Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

2016-04-01

The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
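The repeat screening described in the abstract above (exact 17-mers occurring more than three times) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `repeated_kmers` and its parameters are hypothetical, and the sketch assumes that only the forward strand is scanned and that overlapping occurrences are counted.

```python
from collections import Counter


def repeated_kmers(sequence, k=17, min_count=4):
    """Return {k-mer: count} for exact k-mers seen at least min_count times.

    Assumptions (not taken from the abstract): forward strand only,
    overlapping windows counted, input upper-cased before counting.
    "More than three times" is expressed here as min_count=4.
    """
    seq = sequence.upper()
    # Slide a window of width k over the sequence and tally each k-mer.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}
```

Calling `repeated_kmers(mtdna_sequence)` on a mitochondrial genome string would list candidate repeats for closer inspection; a real screen would probably also consider reverse complements.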
Analysis of strain-specific genes in glutamic acid-producing Corynebacterium glutamicum ssp. lactofermentum AJ 1511.

PubMed

Nishio, Yousuke; Koseki, Chie; Tonouchi, Naoto; Matsui, Kazuhiko; Sugimoto, Shinichi; Usuda, Yoshihiro

2017-07-11

Strains of the bacterium, Corynebacterium glutamicum, are widely used for the industrial production of L-glutamic acid and various other substances. C. glutamicum ssp. lactofermentum AJ 1511, formerly classified as Brevibacterium lactofermentum, and the closely related C. glutamicum ATCC 13032 have been used as industrial strains for more than 50 years. We determined the whole genome sequence of C. glutamicum AJ 1511 and performed genome-wide comparative analysis with C. glutamicum ATCC 13032 to determine strain-specific genetic differences. This analysis revealed that the genomes of the two industrial strains are highly similar despite the phenotypic differences between the two strains. Both strains harbored unique genes but gene transpositions or inversions were not observed. The largest unique region, a 220-kb AT-rich region located between 1.78 and 2.00 Mb position in C. glutamicum ATCC 13032 genome, was missing in the genome of C. glutamicum AJ 1511. The next two largest unique regions were present in C. glutamicum AJ 1511. The first region (413-484 kb position) contains several predicted transport proteins, enzymes involved in sugar metabolism, and transposases. The second region (1.47-1.50 Mb position) encodes restriction modification systems. A gene predicted to encode NADH-dependent glutamate dehydrogenase, which is involved in L-glutamate biosynthesis, is present in C. glutamicum AJ 1511. Strain-specific genes identified in this study are likely to govern phenotypes unique to each strain.
Enigmatic, ultrasmall, uncultivated Archaea

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Brett J.; Comolli, Luis; Dick, Gregory J.

Metagenomics has provided access to genomes of as yet uncultivated microorganisms in natural environments, yet there are gaps in our knowledge particularly for Archaea that occur at relatively low abundance and in extreme environments. Ultrasmall cells (<500 nm in diameter) from lineages without cultivated representatives that branch near the crenarchaeal/euryarchaeal divide have been detected in a variety of acidic ecosystems. We reconstructed composite, near-complete 1-Mb genomes for three lineages, referred to as ARMAN (archaeal Richmond Mine acidophilic nanoorganisms), from environmental samples and a biofilm filtrate. Genes of two lineages are among the smallest yet described, enabling a 10% higher coding density than found in genomes of the same size, and there are noncontiguous genes. No biological function could be inferred for up to 45% of genes and no more than 63% of the predicted proteins could be assigned to a revised set of archaeal clusters of orthologous groups. Some core metabolic genes are more common in Crenarchaeota than Euryarchaeota, up to 21% of genes have the highest sequence identity to bacterial genes, and 12 belong to clusters of orthologous groups that were previously exclusive to bacteria. A small subset of 3D cryo-electron tomographic reconstructions clearly show penetration of the ARMAN cell wall and cytoplasmic membranes by protuberances extended from cells of the archaeal order Thermoplasmatales. Interspecies interactions, the presence of a unique internal tubular organelle [Comolli, et al. (2009) ISME J 3:159-167], and many genes previously only affiliated with Crenarchaea or Bacteria indicate extensive unique physiology in organisms that branched close to the time that Cren- and Euryarchaeotal lineages diverged.
Enigmatic, ultrasmall, uncultivated Archaea

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Brett J.; Comolli, Luis; Dick, Gregory J.

Metagenomics has provided access to genomes of as yet uncultivated microorganisms in natural environments, yet there are gaps in our knowledge, particularly for Archaea that occur at relatively low abundance and in extreme environments. Ultrasmall cells (<500 nm in diameter) from lineages without cultivated representatives that branch near the crenarchaeal/euryarchaeal divide have been detected in a variety of acidic ecosystems. We reconstructed composite, near-complete ~1-Mb genomes for three lineages, referred to as ARMAN (archaeal Richmond Mine acidophilic nanoorganisms), from environmental samples and a biofilm filtrate. Genes of two lineages are among the smallest yet described, enabling a 10% higher coding density than found in genomes of the same size, and there are noncontiguous genes. No biological function could be inferred for up to 45% of genes and no more than 63% of the predicted proteins could be assigned to a revised set of archaeal clusters of orthologous groups. Some core metabolic genes are more common in Crenarchaeota than Euryarchaeota, up to 21% of genes have the highest sequence identity to bacterial genes, and 12 belong to clusters of orthologous groups that were previously exclusive to bacteria. A small subset of 3D cryo-electron tomographic reconstructions clearly show penetration of the ARMAN cell wall and cytoplasmic membranes by protuberances extended from cells of the archaeal order Thermoplasmatales. Interspecies interactions, the presence of a unique internal tubular organelle [Comolli, et al. (2009) ISME J 3: 159-167], and many genes previously only affiliated with Crenarchaea or Bacteria indicate extensive unique physiology in organisms that branched close to the time that Cren- and Euryarchaeotal lineages diverged.
Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

PubMed Central

Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.

2016-01-01

Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes clubbed with Unique Sequence-Predictor (USP), a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif in the 5′→3′ direction, the 3′→5′ direction, or both, adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome using the Frequency Counter programme. This step is iterated until the frequency of the extended motif becomes unity in the genome. Thus, for each given motif, we get three possible unique sequences. The Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-Protein structural information further makes Onco-Regulon a highly informative repository for gene specific drug development. We believe that Onco-Regulon will help researchers to design drugs which will bind to an exclusive site in the genome with no off-target effects, theoretically. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
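The motif-extension idea described in the abstract above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the USP or Frequency Counter code: the function names are hypothetical, the genome is a plain string, only the forward strand is searched, and for the "both" direction the sketch assumes one nucleotide is added on each side per step.

```python
def count_occurrences(genome, pattern):
    """Count possibly overlapping occurrences of pattern (forward strand only)."""
    count = 0
    pos = genome.find(pattern)
    while pos != -1:
        count += 1
        pos = genome.find(pattern, pos + 1)
    return count


def extend_until_unique(genome, start, end, direction="both"):
    """Grow genome[start:end] until it occurs exactly once in genome.

    direction is "left", "right" or "both" (hypothetical labels for
    extension upstream, downstream, or on both sides of the motif).
    Returns (start, end) of the unique sequence, or None if the chosen
    side(s) hit the end of the genome before the sequence becomes unique.
    """
    while count_occurrences(genome, genome[start:end]) > 1:
        grew = False
        if direction in ("left", "both") and start > 0:
            start -= 1
            grew = True
        if direction in ("right", "both") and end < len(genome):
            end += 1
            grew = True
        if not grew:
            return None
    return start, end
```

Running the function three times on the same motif position, once per direction, yields up to three candidate unique sequences, which mirrors the three outcomes mentioned in the abstract.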

Effect of reference genome selection on the performance of computational methods for genome-wide protein-protein interaction prediction.

PubMed

Muley, Vijaykumar Yogesh; Ranjan, Akash

2012-01-01

Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced out for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs in multiple genomes known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood and co-occurrence of the orthologous protein coding genes in the same cluster or operon. These are collectively known as genomic context methods. On the other hand a method called mirrortree is based on the similarity of phylogenetic trees between two interacting proteins. Comprehensive performance analyses of these methods have been frequently reported in literature. However, very few studies provide insight into the effect of reference genome selection on detection of meaningful protein interactions. We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled in accordance with phylogenetic diversity and relationship between organisms from 565 bacteria. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods. Higher performance for predicting protein-protein interactions was achievable even with 100-150 bacterial genomes out of 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that in order to obtain a good performance, it is better to sample few genomes of related genera of prokaryotes from the large number of available genomes. Moreover, such a sampling allows for selecting 50-100 genomes for comparable accuracy of predictions when computational resources are limited.
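Of the genomic context methods named in the abstract above, phylogenetic profiling is the simplest to illustrate. The Python sketch below is an assumption-laden toy, not the scoring used in the study: each protein is represented by a presence/absence vector over a set of reference genomes, and pairs are ranked by Jaccard similarity, which is one of several possible profile similarity measures. The names `jaccard` and `rank_pairs` are hypothetical.

```python
from itertools import combinations


def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles (sequences of 0/1)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


def rank_pairs(profiles):
    """Score every protein pair and return (score, name1, name2), best first.

    profiles maps a protein name to its profile; position i in every
    profile refers to the same reference genome.
    """
    scored = [
        (jaccard(profiles[p], profiles[q]), p, q)
        for p, q in combinations(sorted(profiles), 2)
    ]
    return sorted(scored, reverse=True)
```

In this framing, changing the reference genome set changes the columns of every profile, which is why the choice of reference genomes can alter which pairs score highly.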
Exploration of Panviral Proteome: High-Throughput Cloning and Functional Implications in Virus-host Interactions

PubMed Central

Yu, Xiaobo; Bian, Xiaofang; Throop, Andrea; Song, Lusheng; Moral, Lerys Del; Park, Jin; Seiler, Catherine; Fiacco, Michael; Steel, Jason; Hunter, Preston; Saul, Justin; Wang, Jie; Qiu, Ji; Pipas, James M.; LaBaer, Joshua

2014-01-01

Throughout the long history of virus-host co-evolution, viruses have developed delicate strategies to facilitate their invasion and replication of their genome, while silencing the host immune responses through various mechanisms. The systematic characterization of viral protein-host interactions would yield invaluable information in the understanding of viral invasion/evasion, diagnosis and therapeutic treatment of a viral infection, and mechanisms of host biology. With more than 2,000 viral genomes sequenced, only a small percent of them are well investigated. The access of these viral open reading frames (ORFs) in a flexible cloning format would greatly facilitate both in vitro and in vivo virus-host interaction studies. However, the overall progress of viral ORF cloning has been slow. To facilitate viral studies, we are releasing the initiation of our panviral proteome collection of 2,035 ORF clones from 830 viral genes in the Gateway® recombinational cloning system. Here, we demonstrate several uses of our viral collection including highly efficient production of viral proteins using human cell-free expression system in vitro, global identification of host targets for rubella virus using Nucleic Acid Programmable Protein Arrays (NAPPA) containing 10,000 unique human proteins, and detection of host serological responses using micro-fluidic multiplexed immunoassays. The studies presented here begin to elucidate host-viral protein interactions with our systemic utilization of viral ORFs, high-throughput cloning, and proteomic technologies. These valuable plasmid resources will be available to the research community to enable continued viral functional studies. PMID:24955142
Exploration of panviral proteome: high-throughput cloning and functional implications in virus-host interactions.

PubMed

Yu, Xiaobo; Bian, Xiaofang; Throop, Andrea; Song, Lusheng; Moral, Lerys Del; Park, Jin; Seiler, Catherine; Fiacco, Michael; Steel, Jason; Hunter, Preston; Saul, Justin; Wang, Jie; Qiu, Ji; Pipas, James M; LaBaer, Joshua

2014-01-01

Throughout the long history of virus-host co-evolution, viruses have developed delicate strategies to facilitate their invasion and replication of their genome, while silencing the host immune responses through various mechanisms. The systematic characterization of viral protein-host interactions would yield invaluable information in the understanding of viral invasion/evasion, diagnosis and therapeutic treatment of a viral infection, and mechanisms of host biology. With more than 2,000 viral genomes sequenced, only a small percent of them are well investigated. The access of these viral open reading frames (ORFs) in a flexible cloning format would greatly facilitate both in vitro and in vivo virus-host interaction studies. However, the overall progress of viral ORF cloning has been slow. To facilitate viral studies, we are releasing the initiation of our panviral proteome collection of 2,035 ORF clones from 830 viral genes in the Gateway® recombinational cloning system. Here, we demonstrate several uses of our viral collection including highly efficient production of viral proteins using human cell-free expression system in vitro, global identification of host targets for rubella virus using Nucleic Acid Programmable Protein Arrays (NAPPA) containing 10,000 unique human proteins, and detection of host serological responses using micro-fluidic multiplexed immunoassays. The studies presented here begin to elucidate host-viral protein interactions with our systemic utilization of viral ORFs, high-throughput cloning, and proteomic technologies. These valuable plasmid resources will be available to the research community to enable continued viral functional studies.
Comparative Analysis of the Genomes of Two Field Isolates of the Rice Blast Fungus Magnaporthe oryzae

PubMed Central

Li, Zhigang; Hu, Songnian; Yao, Nan; Dean, Ralph A.; Zhao, Wensheng; Shen, Mi; Zhang, Haiwang; Li, Chao; Liu, Liyuan; Cao, Lei; Xu, Xiaowen; Xing, Yunfei; Hsiang, Tom; Zhang, Ziding; Xu, Jin-Rong; Peng, You-Liang

2012-01-01

Rice blast caused by Magnaporthe oryzae is one of the most destructive diseases of rice worldwide. The fungal pathogen is notorious for its ability to overcome host resistance. To better understand its genetic variation in nature, we sequenced the genomes of two field isolates, Y34 and P131. In comparison with the previously sequenced laboratory strain 70-15, both field isolates had a similar genome size but slightly more genes. Sequences from the field isolates were used to improve genome assembly and gene prediction of 70-15. Although the overall genome structure is similar, a number of gene families that are likely involved in plant-fungal interactions are expanded in the field isolates. Genome-wide analysis on nonsynonymous to synonymous nucleotide substitution rates revealed that many infection-related genes underwent diversifying selection. The field isolates also have hundreds of isolate-specific genes and a number of isolate-specific gene duplication events. Functional characterization of randomly selected isolate-specific genes revealed that they play diverse roles, some of which affect virulence. Furthermore, each genome contains thousands of loci of transposon-like elements, but less than 30% of them are conserved among different isolates, suggesting active transposition events in M. oryzae. A total of approximately 200 genes were disrupted in these three strains by transposable elements. Interestingly, transposon-like elements tend to be associated with isolate-specific or duplicated sequences. Overall, our results indicate that gain or loss of unique genes, DNA duplication, gene family expansion, and frequent translocation of transposon-like elements are important factors in genome variation of the rice blast fungus. PMID:22876203
Sequence Analysis of Leuconostoc mesenteroides Bacteriophage Φ1-A4 Isolated from an Industrial Vegetable Fermentation

PubMed Central

Lu, Z.; Altermann, E.; Breidt, F.; Kozyavkin, S.

2010-01-01

Vegetable fermentations rely on the proper succession of a variety of lactic acid bacteria (LAB). Leuconostoc mesenteroides initiates fermentation. As fermentation proceeds, L. mesenteroides dies off and other LAB complete the fermentation. Phages infecting L. mesenteroides may significantly influence the die-off of L. mesenteroides. However, no L. mesenteroides phages have been previously genetically characterized. Knowledge of more phage genome sequences may provide new insights into phage genomics, phage evolution, and phage-host interactions. We have determined the complete genome sequence of L. mesenteroides phage Φ1-A4, isolated from an industrial sauerkraut fermentation. The phage possesses a linear, double-stranded DNA genome consisting of 29,508 bp with a G+C content of 36%. Fifty open reading frames (ORFs) were predicted. Putative functions were assigned to 26 ORFs (52%), including 5 ORFs of structural proteins. The phage genome was modularly organized, containing DNA replication, DNA-packaging, head and tail morphogenesis, cell lysis, and DNA regulation/modification modules. In silico analyses showed that Φ1-A4 is a unique lytic phage with a large-scale genome inversion (∼30% of the genome). The genome inversion encompassed the lysis module, part of the structural protein module, and a cos site. The endolysin gene was flanked by two holin genes. The tail morphogenesis module was interspersed with cell lysis genes and other genes with unknown functions. The predicted amino acid sequences of the phage proteins showed little similarity to other phages, but functional analyses showed that Φ1-A4 clusters with several Lactococcus phages. To our knowledge, Φ1-A4 is the first genetically characterized L. mesenteroides phage. PMID:20118355
Accumulation of slightly deleterious mutations in the mitochondrial genome: a hallmark of animal domestication.

PubMed

Hughes, Austin L

2013-02-15

The hypothesis that domestication leads to a relaxation of purifying selection on mitochondrial (mt) genomes was tested by comparative analysis of mt genes from dog, pig, chicken, and silkworm. The three vertebrate species showed mt genome phylogenies in which domestic and wild isolates were intermingled, whereas the domestic silkworm (Bombyx mori) formed a distinct cluster nested within its closest wild relative (Bombyx mandarina). In spite of these differences in phylogenetic pattern, significantly greater proportions of nonsynonymous SNPs than of synonymous SNPs were unique to the domestic populations of all four species. Likewise, in all four species, significantly greater proportions of RNA-encoding SNPs than of synonymous SNPs were unique to the domestic populations. Thus, domestic populations were characterized by an excess of unique polymorphisms in two categories generally subject to purifying selection: nonsynonymous sites and RNA-encoding sites. Many of these unique polymorphisms thus seem likely to be slightly deleterious; the latter hypothesis was supported by the generally lower gene diversities of polymorphisms unique to domestic populations in comparison to those of polymorphisms shared by domestic and wild populations. Copyright © 2012 Elsevier B.V. All rights reserved.
Brief Overview of a Decade of Genome-Wide Association Studies on Primary Hypertension.

PubMed

Azam, Afifah Binti; Azizan, Elena Aisha Binti

2018-01-01

Primary hypertension is widely believed to be a complex polygenic disorder with the manifestation influenced by the interactions of genomic and environmental factors, making identification of susceptibility genes a major challenge. With major advancement in high-throughput genotyping technology, genome-wide association study (GWAS) has become a powerful tool for researchers studying genetically complex diseases. GWASs work through revealing links between DNA sequence variation and a disease or trait with biomedical importance. The human genome is a very long DNA sequence which consists of billions of nucleotides arranged in a unique way. A single base-pair change in the DNA sequence is known as a single nucleotide polymorphism (SNP). With the help of modern genotyping techniques such as chip-based genotyping arrays, thousands of SNPs can be genotyped easily. Large-scale GWASs, in which more than half a million common SNPs are genotyped and analyzed for disease association in hundreds of thousands of cases and controls, have been broadly successful in identifying SNPs associated with heart diseases, diabetes, autoimmune diseases, and psychiatric disorders. It is however still debatable whether GWAS is the best approach for hypertension. The following is a brief overview on the outcomes of a decade of GWASs on primary hypertension.
The Atlantic salmon genome provides insights into rediploidization

USDA-ARS's Scientific Manuscript database

The common ancestor of salmonids underwent an autotetraploid whole genome duplication event (Ss4R) approximately eighty million years ago, which provides unique opportunities to study the early evolutionary fate of a duplicated vertebrate genome in different extant lineages. Here, we present a high ...
Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing

USDA-ARS's Scientific Manuscript database

Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitat...
Mechanisms and dynamics of nuclear lamina-genome interactions.

PubMed

Amendola, Mario; van Steensel, Bas

2014-06-01

The nuclear lamina (NL) interacts with the genomic DNA and is thought to influence chromosome organization and gene expression. Both DNA sequences and histone modifications are important for NL tethering of the genomic DNA. These interactions are dynamic in individual cells and can change during differentiation and development. Evidence is accumulating that the NL contributes to the repression of transcription. Advances in mapping, genome-editing and microscopy techniques are increasing our understanding of the molecular mechanisms involved in NL-genome interactions. Copyright © 2014 Elsevier Ltd. All rights reserved.
De novo assembly, characterization and functional annotation of pineapple fruit transcriptome through massively parallel sequencing.

PubMed

Ong, Wen Dee; Voo, Lok-Yung Christopher; Kumar, Vijay Subbiah

2012-01-01

Pineapple (Ananas comosus var. comosus) is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanisms and processes underlying fruit ripening would enable scientists to improve quality traits such as flavor, texture, appearance and fruit sweetness. Although the pineapple is an important fruit, insufficient transcriptomic or genomic information is available in public databases. Application of high-throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed. To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 million Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. Sequence similarity search against the non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%) which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown. The unique transcripts derived from this work have rapidly increased the number of pineapple fruit mRNA transcripts available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple.
De Novo Assembly, Characterization and Functional Annotation of Pineapple Fruit Transcriptome through Massively Parallel Sequencing

PubMed Central

Ong, Wen Dee; Voo, Lok-Yung Christopher; Kumar, Vijay Subbiah

2012-01-01

Background: Pineapple (Ananas comosus var. comosus) is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanisms and processes underlying fruit ripening would enable scientists to improve quality traits such as flavor, texture, appearance and fruit sweetness. Although the pineapple is an important fruit, insufficient transcriptomic or genomic information is available in public databases. Application of high-throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed. Methodology/Principal Findings: To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 million Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. Sequence similarity search against the non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%) which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown. Conclusions: The unique transcripts derived from this work have rapidly increased the number of pineapple fruit mRNA transcripts available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple. PMID:23091603
The Comprehensive Antibiotic Resistance Database

PubMed Central

McArthur, Andrew G.; Waglechner, Nicholas; Nizam, Fazmin; Yan, Austin; Azad, Marisa A.; Baylay, Alison J.; Bhullar, Kirandeep; Canova, Marc J.; De Pascale, Gianfranco; Ejim, Linda; Kalan, Lindsay; King, Andrew M.; Koteva, Kalinka; Morar, Mariya; Mulvey, Michael R.; O'Brien, Jonathan S.; Pawlowski, Andrew C.; Piddock, Laura J. V.; Spanogiannopoulos, Peter; Sutherland, Arlene D.; Tang, Irene; Taylor, Patricia L.; Thaker, Maulik; Wang, Wenliang; Yan, Marie; Yu, Tennison

2013-01-01

The field of antibiotic drug discovery and the monitoring of new antibiotic resistance elements have yet to fully exploit the power of the genome revolution. Despite the fact that the first genomes sequenced of free living organisms were those of bacteria, there have been few specialized bioinformatic tools developed to mine the growing amount of genomic data associated with pathogens. In particular, there are few tools to study the genetics and genomics of antibiotic resistance and how it impacts bacterial populations, ecology, and the clinic. We have initiated development of such tools in the form of the Comprehensive Antibiotic Research Database (CARD; http://arpcard.mcmaster.ca). The CARD integrates disparate molecular and sequence data, provides a unique organizing principle in the form of the Antibiotic Resistance Ontology (ARO), and can quickly identify putative antibiotic resistance genes in new unannotated genome sequences. This unique platform provides an informatic tool that bridges antibiotic resistance concerns in health care, agriculture, and the environment. PMID:23650175
Genome sequences and comparative genomics of two Lactobacillus ruminis strains from the bovine and human intestinal tracts

PubMed Central

2011-01-01

Background: The genus Lactobacillus is characterized by an extraordinary degree of phenotypic and genotypic diversity, which recent genomic analyses have further highlighted. However, the choice of species for sequencing has been non-random and unequal in distribution, with only a single representative genome from the L. salivarius clade available to date. Furthermore, there are no data to facilitate a functional genomic analysis of motility in the lactobacilli, a trait that is restricted to the L. salivarius clade. Results: The 2.06 Mb genome of the bovine isolate Lactobacillus ruminis ATCC 27782 comprises a single circular chromosome, and has a G+C content of 44.4%. In silico analysis identified 1901 coding sequences, including genes for a pediocin-like bacteriocin, a single large exopolysaccharide-related cluster, two sortase enzymes, two CRISPR loci and numerous IS elements and pseudogenes. A cluster of genes related to a putative pilin was identified, and shown to be transcribed in vitro. A high quality draft assembly of the genome of a second L. ruminis strain, ATCC 25644 isolated from humans, suggested a slightly larger genome of 2.138 Mb that exhibited a high degree of synteny with the ATCC 27782 genome. In contrast, comparative analysis of L. ruminis and L. salivarius identified a lack of long-range synteny between these closely related species. Comparison of the L. salivarius clade core proteins with those of nine other Lactobacillus species distributed across 4 major phylogenetic groups identified the set of shared proteins, and proteins unique to each group. Conclusions: The genome of L. ruminis provides a comparative tool for directing functional analyses of other members of the L. salivarius clade, and it increases understanding of the divergence of this distinct Lactobacillus lineage from other commensal lactobacilli. The genome sequence provides a definitive resource to facilitate investigation of the genetics, biochemistry and host interactions of these motile intestinal lactobacilli. PMID:21995554
Pgltools: a genomic arithmetic tool suite for manipulation of Hi-C peak and other chromatin interaction data.

PubMed

Greenwald, William W; Li, He; Smith, Erin N; Benaglio, Paola; Nariai, Naoki; Frazer, Kelly A

2017-04-07

Genomic interaction studies use next-generation sequencing (NGS) to examine the interactions between two loci on the genome, with subsequent bioinformatics analyses typically including annotation, intersection, and merging of data from multiple experiments. While many file types and analysis tools exist for storing and manipulating single locus NGS data, there is currently no file standard or analysis tool suite for manipulating and storing paired-genomic-loci: the data type resulting from "genomic interaction" studies. As genomic interaction sequencing data are becoming prevalent, a standard file format and tools for working with these data conveniently and efficiently are needed. This article details a file standard and novel software tool suite for working with paired-genomic-loci data. We present the paired-genomic-loci (PGL) file standard for genomic-interactions data, and the accompanying analysis tool suite "pgltools": a cross platform, pypy compatible python package available both as an easy-to-use UNIX package, and as a python module, for integration into pipelines of paired-genomic-loci analyses. Pgltools is a freely available, open source tool suite for manipulating paired-genomic-loci data. Source code, an in-depth manual, and a tutorial are available publicly at www.github.com/billgreenwald/pgltools , and a python module of the operations can be installed from PyPI via the PyGLtools module.
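To illustrate the kind of operation such a tool suite performs, the minimal Python sketch below tests whether two paired-genomic-loci entries intersect. It is not pgltools code and does not reproduce the PGL file standard: the `PairedLocus` tuple, its field names, the half-open coordinate convention, and the rule that both anchors must overlap are all assumptions made for this example.

```python
from typing import NamedTuple


class PairedLocus(NamedTuple):
    """Hypothetical record: two genomic intervals (anchors A and B) that interact."""
    chrom_a: str
    start_a: int
    end_a: int
    chrom_b: str
    start_b: int
    end_b: int


def intervals_overlap(chrom1, start1, end1, chrom2, start2, end2):
    """True if two half-open intervals [start, end) share at least one base."""
    return chrom1 == chrom2 and start1 < end2 and start2 < end1


def pairs_intersect(x: PairedLocus, y: PairedLocus) -> bool:
    """Assumed rule: anchor A of x overlaps anchor A of y, and likewise for anchor B."""
    return (
        intervals_overlap(x.chrom_a, x.start_a, x.end_a, y.chrom_a, y.start_a, y.end_a)
        and intervals_overlap(x.chrom_b, x.start_b, x.end_b, y.chrom_b, y.start_b, y.end_b)
    )
```

The sketch assumes that anchors are stored in a consistent order (A before B) in every entry. Without that assumption, the swapped orientation would also have to be checked.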
Multifaceted Genomic Risk for Brain Function in Schizophrenia

PubMed Central

Chen, Jiayu; Calhoun, Vince D.; Pearlson, Godfrey D.; Ehrlich, Stefan; Turner, Jessica A.; Ho, Beng-Choon; Wassink, Thomas H.; Michael, Andrew M; Liu, Jingyu

2012-01-01

Recently, deriving candidate endophenotypes from brain imaging data has become a valuable approach to study genetic influences on schizophrenia (SZ), whose pathophysiology remains unclear. In this work we utilized a multivariate approach, parallel independent component analysis, to identify genomic risk components associated with brain function abnormalities in SZ. 5157 candidate single nucleotide polymorphisms (SNPs) were derived from genome-wide array based on their possible connections with SZ and further investigated for their associations with brain activations captured with functional magnetic resonance imaging (fMRI) during a sensorimotor task. Using data from 92 SZ patients and 116 healthy controls, we detected a significant correlation (r = 0.29; p = 2.41 × 10^−5) between one fMRI component and one SNP component, both of which significantly differentiated patients from controls. The fMRI component mainly consisted of precentral and postcentral gyri, the major activated regions in the motor task. On average, higher activation in these regions was observed in participants with higher loadings of the linked SNP component, predominantly contributed to by 253 SNPs. 138 identified SNPs were from known coding regions of 100 unique genes. 31 identified SNPs did not differ between groups, but moderately correlated with some other group-discriminating SNPs, indicating interactions among alleles contributing towards elevated SZ susceptibility. The genes associated with the identified SNPs participated in four neurotransmitter pathways: GABA receptor signaling, dopamine receptor signaling, neuregulin signaling and glutamate receptor signaling. In summary, our work provides further evidence for the complexity of genomic risk to the functional brain abnormality in SZ and suggests a pathological role of interactions between SNPs, genes and multiple neurotransmitter pathways. PMID:22440650
Identification of genes containing expanded purine repeats in the human genome and their apparent protective role against cancer.

PubMed

Singh, Himanshu Narayan; Rajeswari, Moganty R

2016-01-01

Purine repeat sequences present in a gene are unique as they have a high propensity to form unusual DNA-triple helix structures. Friedreich's ataxia is the only human disease that is well known to be associated with DNA-triplexes formed by purine repeats. The purpose of this study was to recognize the expanded purine repeats (EPRs) in the human genome and find their correlation with cancer pathogenesis. We developed the "PuRepeatFinder.pl" algorithm to identify non-overlapping EPRs without pyrimidine interruptions in the human genome and customized it for searching repeat lengths, n ≥ 200. A total of 1158 EPRs were identified in the genome, which followed a Wakeby distribution. Two hundred and ninety-six EPRs were found in genic regions of 282 genes (EPR-genes). Gene clustering of EPR-genes was done based on their cellular function and a large number of EPR-genes were found to be enzymes/enzyme modulators. Meta-analysis of 282 EPR-genes identified only 63 EPR-genes in association with cancer, mostly in breast, lung, and blood cancers. Protein-protein interaction network analysis of all 282 EPR-genes identified proteins including those in cadherins and VEGF. Two observations, that EPRs can induce mutations under malignant conditions and that some EPR-gene products were identified in vital cell signaling-mediated pathways, together suggest a crucial role of EPRs in carcinogenesis. The new link between EPR-genes and their functionally interacting proteins adds a new dimension to the present understanding of cancer pathogenesis and can help in planning therapeutic strategies. Validation of the present results using techniques like NGS is required to establish the role of the EPR genes in cancer pathology.
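The search criterion described in this abstract (uninterrupted purine runs of at least a minimum length) can be sketched in a few lines of Python. This is not the authors' PuRepeatFinder.pl, whose implementation is not given here. The function name and return format are hypothetical, and the sketch scans only the strand it is given.

```python
import re


def find_purine_repeats(sequence: str, min_length: int = 200):
    """Return (start, end, length) for each maximal run of purines (A or G).

    Coordinates are 0-based and half-open. Any pyrimidine (C or T) or other
    character ends a run, so reported runs have no pyrimidine interruptions
    and cannot overlap one another.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_length)
    return [
        (m.start(), m.end(), m.end() - m.start())
        for m in pattern.finditer(sequence.upper())
    ]


# Toy input with a small threshold so the behaviour is easy to follow.
toy = "CT" + "AG" * 3 + "C" + "AA" + "T" + "GGGGA"
print(find_purine_repeats(toy, min_length=5))
```

A purine run on one strand corresponds to a pyrimidine run on the complementary strand, so a genome-wide scan along these lines would presumably also need to search for `[CT]` runs or scan the reverse complement.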
A Genomic and Protein-Protein Interaction Analyses of Nonsyndromic Hearing Impairment in Cameroon Using Targeted Genomic Enrichment and Massively Parallel Sequencing.

PubMed

Lebeko, Kamogelo; Manyisa, Noluthando; Chimusa, Emile R; Mulder, Nicola; Dandara, Collet; Wonkam, Ambroise

2017-02-01

Hearing impairment (HI) is one of the leading causes of disability in the world, impacting the social, economic, and psychological well-being of the affected individual. This is particularly true in sub-Saharan Africa, which carries one of the highest burdens of this condition. Despite this, there are limited data on the most prevalent genes or mutations that cause HI among sub-Saharan Africans. Next-generation technologies, such as targeted genomic enrichment and massively parallel sequencing, offer new promise in this context. This study reports, for the first time to the best of our knowledge, on the prevalence of novel mutations identified through a platform of 116 HI genes (OtoSCOPE®), among 82 African probands with HI. Only variants OTOF NM_194248.2:c.766-2A>G and MYO7A NM_000260.3:c.1996C>T, p.Arg666Stop were found in 3 (3.7%) and 5 (6.1%) patients, respectively. In addition and uniquely, the analysis of protein-protein interactions (PPI), through interrogation of gene subnetworks, using a custom script and two databases (Enrichr and PANTHER), and an algorithm in the igraph package of R, identified the enrichment of sensory perception and mechanical stimulus biological processes, and the most significant molecular functions of these variants pertained to binding or structural activity. Furthermore, 10 genes (MYO7A, MYO6, KCTD3, NUMA1, MYH9, KCNQ1, UBC, DIAPH1, PSMC2, and RDX) were identified as significant hubs within the subnetworks. Results reveal that the novel variants identified among familial cases of HI in Cameroon are not common, and PPI analysis has highlighted the role of 10 genes, potentially important in understanding HI genomics among Africans.
Genomic sequence for the aflatoxigenic filamentous fungus Aspergillus nomius

USDA-ARS's Scientific Manuscript database

The genome of the A. nomius type strain was sequenced using a personal genome machine. Annotation of the genes was undertaken, followed by gene ontology and an investigation into the number of secondary metabolite clusters. Comparative studies with other Aspergillus species involved shared/unique ge...
Identification and verification of potential piRNAs from domesticated yak testis.

PubMed

Gong, Jishang; Zhang, Quanwei; Wang, Qi; Ma, Youji; Du, Jiaxiang; Zhang, Yong; Zhao, Xingxu

2018-02-01

PIWI-interacting RNAs (piRNA) are small non-coding RNA molecules expressed in animal germ cells that interact with PIWI family proteins to form RNA-protein complexes involved in epigenetic and post-transcriptional gene silencing of retrotransposons and other genetic elements in germ line cells, including reproductive stem cell self-sustainment, differentiation, meiosis and spermatogenesis. In the present study, we performed high-throughput sequencing of piRNAs in testis samples from yaks in different stages of sexual maturity. Deep sequencing of the small RNAs (18-40 nt in length) yielded 4,900,538 unique reads from a total of 53,035,635 reads. We identified yak small RNAs (18-30 nt) and performed functional characterization. Yak small RNAs showed a bimodal length distribution, with two peaks at 22 nt and >28 nt. More than 80% of the 3,106,033 putative piRNAs were mapped to 4637 piRNA-producing genomic clusters using RPKM. 6388 candidate piRNAs were identified from clean reads and the annotations were compared with the yak reference genome repeat region. Integrated network analysis suggested that some differentially expressed genes were involved in spermatogenesis through ECM-receptor interaction and PI3K-Akt signaling pathways. Our data provide novel insights into the molecular expression and regulation similarities and diversities in spermatogenesis and testicular development in yaks at different stages of sexual maturity. © 2018 The authors.
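The abstract does not say how RPKM was computed for the piRNA clusters. As a reminder of the standard definition (reads per kilobase of region per million mapped reads), here is a small Python sketch. The function name and the example numbers are illustrative assumptions, not values from the study.

```python
def rpkm(reads_in_region: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of region per million mapped reads.

    RPKM = reads_in_region * 1e9 / (region_length_bp * total_mapped_reads)
    """
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    return reads_in_region * 1e9 / (region_length_bp * total_mapped_reads)


# Hypothetical cluster: 500 reads over a 2,000 bp region, 10 million mapped reads.
print(rpkm(500, 2_000, 10_000_000))  # 25.0
```

Normalizing by both region length and sequencing depth makes values comparable across clusters of different sizes and across libraries of different depths.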

Identification and verification of potential piRNAs from domesticated yak testis

PubMed Central

Gong, Jishang; Zhang, Quanwei; Wang, Qi; Ma, Youji; Du, Jiaxiang; Zhang, Yong

2018-01-01

PIWI-interacting RNAs (piRNA) are small non-coding RNA molecules expressed in animal germ cells that interact with PIWI family proteins to form RNA–protein complexes involved in epigenetic and post-transcriptional gene silencing of retrotransposons and other genetic elements in germ line cells, including reproductive stem cell self-sustainment, differentiation, meiosis and spermatogenesis. In the present study, we performed high-throughput sequencing of piRNAs in testis samples from yaks in different stages of sexual maturity. Deep sequencing of the small RNAs (18–40 nt in length) yielded 4,900,538 unique reads from a total of 53,035,635 reads. We identified yak small RNAs (18–30 nt) and performed functional characterization. Yak small RNAs showed a bimodal length distribution, with two peaks at 22 nt and >28 nt. More than 80% of the 3,106,033 putative piRNAs were mapped to 4637 piRNA-producing genomic clusters using RPKM. 6388 candidate piRNAs were identified from clean reads and the annotations were compared with the yak reference genome repeat region. Integrated network analysis suggested that some differentially expressed genes were involved in spermatogenesis through ECM–receptor interaction and PI3K-Akt signaling pathways. Our data provide novel insights into the molecular expression and regulation similarities and diversities in spermatogenesis and testicular development in yaks at different stages of sexual maturity. PMID:29101267
Genome wide re-sequencing of newly developed Rice Lines from common wild rice (Oryza rufipogon Griff.) for the identification of NBS-LRR genes.

PubMed

Liu, Wen; Ghouri, Fozia; Yu, Hang; Li, Xiang; Yu, Shuhong; Shahid, Muhammad Qasim; Liu, Xiangdong

2017-01-01

Common wild rice (Oryza rufipogon Griff.) is an important germplasm for rice breeding, which contains many resistance genes. Re-sequencing provides an unprecedented opportunity to explore the abundant useful genes at whole genome level. Here, we identified the nucleotide-binding site leucine-rich repeat (NBS-LRR) encoding genes by re-sequencing of two wild rice lines (i.e. Huaye 1 and Huaye 2) that were developed from common wild rice. We obtained 128 to 147 million reads with approximately 32.5-fold coverage depth, and uniquely covered more than 89.6% (≥1-fold) of reference genomes. Two wild rice lines showed high SNP (single-nucleotide polymorphisms) variation rate in 12 chromosomes against the reference genomes of Nipponbare (japonica cultivar) and 93-11 (indica cultivar). InDels (insertion/deletion polymorphisms) count-length distribution exhibited normal distribution in the two lines, and most of the InDels ranged from -5 to 5 bp. With reference to the Nipponbare genome sequence, we detected a total of 1,209,308 SNPs, 161,117 InDels and 4,192 SVs (structural variations) in Huaye 1, and 1,387,959 SNPs, 180,226 InDels and 5,305 SVs in Huaye 2. A total of 44.9% and 46.9% of genes exhibited sequence variations in two wild rice lines compared to the Nipponbare and 93-11 reference genomes, respectively. Analysis of NBS-LRR mutant candidate genes showed that they were mainly distributed on chromosome 11, and NBS domain was more conserved than LRR domain in both wild rice lines. NBS genes depicted higher levels of genetic diversity in Huaye 1 than that found in Huaye 2. Furthermore, protein-protein interaction analysis showed that NBS genes mostly interacted with the cytochrome C protein (Os05g0420600, Os01g0885000 and BGIOSGA038922), while some NBS genes interacted with heat shock protein, DNA-binding activity, Phosphoinositide 3-kinase and a coiled coil region. 
We explored abundant NBS-LRR encoding genes in two common wild rice lines through genome wide re-sequencing, which proved to be a useful tool to exploit elite NBS-LRR genes in wild rice. The data here provide a foundation for future work aimed at dissecting the genetic basis of disease resistance in rice, and the two wild rice lines will be useful germplasm for the molecular improvement of cultivated rice.
Genome wide re-sequencing of newly developed Rice Lines from common wild rice (Oryza rufipogon Griff.) for the identification of NBS-LRR genes

PubMed Central

Yu, Hang; Li, Xiang; Yu, Shuhong; Shahid, Muhammad Qasim

2017-01-01

Common wild rice (Oryza rufipogon Griff.) is an important germplasm for rice breeding, which contains many resistance genes. Re-sequencing provides an unprecedented opportunity to explore the abundant useful genes at whole genome level. Here, we identified the nucleotide-binding site leucine-rich repeat (NBS-LRR) encoding genes by re-sequencing of two wild rice lines (i.e. Huaye 1 and Huaye 2) that were developed from common wild rice. We obtained 128 to 147 million reads with approximately 32.5-fold coverage depth, and uniquely covered more than 89.6% (≥1-fold) of reference genomes. Two wild rice lines showed high SNP (single-nucleotide polymorphisms) variation rate in 12 chromosomes against the reference genomes of Nipponbare (japonica cultivar) and 93–11 (indica cultivar). InDels (insertion/deletion polymorphisms) count-length distribution exhibited normal distribution in the two lines, and most of the InDels ranged from -5 to 5 bp. With reference to the Nipponbare genome sequence, we detected a total of 1,209,308 SNPs, 161,117 InDels and 4,192 SVs (structural variations) in Huaye 1, and 1,387,959 SNPs, 180,226 InDels and 5,305 SVs in Huaye 2. A total of 44.9% and 46.9% of genes exhibited sequence variations in two wild rice lines compared to the Nipponbare and 93–11 reference genomes, respectively. Analysis of NBS-LRR mutant candidate genes showed that they were mainly distributed on chromosome 11, and NBS domain was more conserved than LRR domain in both wild rice lines. NBS genes depicted higher levels of genetic diversity in Huaye 1 than that found in Huaye 2. Furthermore, protein-protein interaction analysis showed that NBS genes mostly interacted with the cytochrome C protein (Os05g0420600, Os01g0885000 and BGIOSGA038922), while some NBS genes interacted with heat shock protein, DNA-binding activity, Phosphoinositide 3-kinase and a coiled coil region. 
We explored abundant NBS-LRR encoding genes in two common wild rice lines through genome wide re-sequencing, which proved to be a useful tool to exploit elite NBS-LRR genes in wild rice. The data here provide a foundation for future work aimed at dissecting the genetic basis of disease resistance in rice, and the two wild rice lines will be useful germplasm for the molecular improvement of cultivated rice. PMID:28700714
Ephemeral association between gene CG5762 and hybrid male sterility in Drosophila sibling species.

PubMed

Ma, Daina; Michalak, Pawel

2011-10-01

Interspecies divergence in regulatory pathways may result in hybrid male sterility (HMS) when dominance and epistatic interactions between alleles that are functional within one genome are disrupted in hybrid genomes. The identification of genes contributing to HMS and other hybrid dysfunctions is essential for understanding the origin of new species (speciation). Previously, we identified a panel of male-specific loci misexpressed in sterile male hybrids of Drosophila simulans and D. mauritiana relative to parental species. In the current work, we attempt to dissect the genetic associations between HMS and one of the genes, CG5762, a Drosophila-unique locus characterized by rapid sequence divergence within the genus, presumably driven by positive natural selection. CG5762 is underexpressed in sterile backcross males compared with their fertile brothers. In CG5762 heterozygotes, the D. mauritiana allele is consistently overexpressed on both the D. simulans and D. mauritiana backcross genomic background, suggesting a cis-acting regulation factor. There is a significant association between heterozygosity and HMS in hybrid males from early but not later backcross generations. Microsatellite markers spanning CG5762 fail to associate with HMS. These observations lead to a conclusion that CG5762 is not a causative factor of HMS. Although genetic linkage between CG5762 and a neighboring causative introgression cannot be ruled out, it seems that the pattern is most consistent with CG5762 participating in epistatic interactions that are disrupted in flies with HMS.
The genome of Eucalyptus grandis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Myburg, Alexander A.; Grattapaglia, Dario; Tuskan, Gerald A.

Eucalypts are the world's most widely planted hardwood trees. Their broad adaptability, rich species diversity, fast growth and superior multipurpose wood have made them a global renewable resource of fiber and energy that mitigates human pressures on natural forests. We sequenced and assembled >94% of the 640 Mbp genome of Eucalyptus grandis into its 11 chromosomes. A set of 36,376 protein coding genes were predicted revealing that 34% occur in tandem duplications, the largest proportion found thus far in any plant genome. Eucalypts also show the highest diversity of genes for plant specialized metabolism that act as chemical defence against biotic agents and provide unique pharmaceutical oils. Resequencing of a set of inbred tree genomes revealed regions of strongly conserved heterozygosity, likely hotspots of inbreeding depression. The resequenced genome of the sister species E. globulus underscored the high inter-specific genome colinearity despite substantial genome size variation in the genus. The genome of E. grandis is the first reference for the early diverging Rosid order Myrtales and is placed here basal to the Eurosids. This resource expands knowledge on the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.
Identification of shared and unique susceptibility pathways among cancers of the lung, breast, and prostate from genome-wide association studies and tissue-specific protein interactions

PubMed Central

Qian, David C.; Byun, Jinyoung; Han, Younghun; Greene, Casey S.; Field, John K.; Hung, Rayjean J.; Brhane, Yonathan; Mclaughlin, John R.; Fehringer, Gordon; Landi, Maria Teresa; Rosenberger, Albert; Bickeböller, Heike; Malhotra, Jyoti; Risch, Angela; Heinrich, Joachim; Hunter, David J.; Henderson, Brian E.; Haiman, Christopher A.; Schumacher, Fredrick R.; Eeles, Rosalind A.; Easton, Douglas F.; Seminara, Daniela; Amos, Christopher I.

2015-01-01

Results from genome-wide association studies (GWAS) have indicated that strong single-gene effects are the exception, not the rule, for most diseases. We assessed the joint effects of germline genetic variations through a pathway-based approach that considers the tissue-specific contexts of GWAS findings. From GWAS meta-analyses of lung cancer (12 160 cases/16 838 controls), breast cancer (15 748 cases/18 084 controls) and prostate cancer (14 160 cases/12 724 controls) in individuals of European ancestry, we determined the tissue-specific interaction networks of proteins expressed from genes that are likely to be affected by disease-associated variants. Reactome pathways exhibiting enrichment of proteins from each network were compared across the cancers. Our results show that pathways associated with all three cancers tend to be broad cellular processes required for growth and survival. Significant examples include the nerve growth factor (P = 7.86 × 10^−33), epidermal growth factor (P = 1.18 × 10^−31) and fibroblast growth factor (P = 2.47 × 10^−31) signaling pathways. However, within these shared pathways, the genes that influence risk largely differ by cancer. Pathways found to be unique for a single cancer focus on more specific cellular functions, such as interleukin signaling in lung cancer (P = 1.69 × 10^−15), apoptosis initiation by Bad in breast cancer (P = 3.14 × 10^−9) and cellular responses to hypoxia in prostate cancer (P = 2.14 × 10^−9). We present the largest comparative cross-cancer pathway analysis of GWAS to date. Our approach can also be applied to the study of inherited mechanisms underlying risk across multiple diseases in general. PMID:26483192
Using peptide array to identify binding motifs and interaction networks for modular domains.

PubMed

Li, Shawn S-C; Wu, Chenggang

2009-01-01

Specific protein-protein interactions underlie all essential biological processes and form the basis of cellular signal transduction. The recognition of a short, linear peptide sequence in one protein by a modular domain in another represents a common theme of macromolecular recognition in cells, and the importance of this mode of protein-protein interaction is highlighted by the large number of peptide-binding domains encoded by the human genome. This phenomenon also provides a unique opportunity to identify protein-protein binding events using peptide arrays and complementary biochemical assays. Accordingly, high-density peptide array has emerged as a useful tool by which to map domain-mediated protein-protein interaction networks at the proteome level. Using the Src-homology 2 (SH2) and 3 (SH3) domains as examples, we describe the application of oriented peptide array libraries in uncovering specific motifs recognized by an SH2 domain and the use of high-density peptide arrays in identifying interaction networks mediated by the SH3 domain. Methods reviewed here could also be applied to other modular domains, including catalytic domains, that recognize linear peptide sequences.
LAMP detection assays for boxwood blight pathogens: A comparative genomics approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Malapi-Wight, Martha; Demers, Jill E.; Veltri, Daniel

Rapid and accurate molecular diagnostic tools are critical to efforts to minimize the impact and spread of emergent pathogens. The identification of diagnostic markers for novel pathogens presents several challenges, especially in the absence of information about population diversity and where genetic resources are limited. The objective of this study was to use comparative genomics datasets to find unique target regions suitable for the diagnosis of two fungal species causing a newly emergent blight disease of boxwood. Candidate marker regions for loop-mediated isothermal amplification (LAMP) assays were identified from draft genomes of Calonectria henricotiae and C. pseudonaviculata, as well as three related species not associated with this disease. To increase the probability of identifying unique targets, we used three approaches to mine genome datasets, based on (i) unique regions, (ii) polymorphisms, and (iii) presence/absence of regions across datasets. From a pool of candidate markers, we demonstrate LAMP assay specificity by testing related fungal species, common boxwood pathogens, and environmental samples containing 445 diverse fungal taxa. In conclusion, this comparative-genomics-based approach to the development of LAMP diagnostic assays is the first of its kind for fungi and could be easily applied to diagnostic marker development for other newly emergent plant pathogens.
LAMP detection assays for boxwood blight pathogens: A comparative genomics approach

DOE PAGES

Malapi-Wight, Martha; Demers, Jill E.; Veltri, Daniel; ...

2016-05-20

Rapid and accurate molecular diagnostic tools are critical to efforts to minimize the impact and spread of emergent pathogens. The identification of diagnostic markers for novel pathogens presents several challenges, especially in the absence of information about population diversity and where genetic resources are limited. The objective of this study was to use comparative genomics datasets to find unique target regions suitable for the diagnosis of two fungal species causing a newly emergent blight disease of boxwood. Candidate marker regions for loop-mediated isothermal amplification (LAMP) assays were identified from draft genomes of Calonectria henricotiae and C. pseudonaviculata, as well as three related species not associated with this disease. To increase the probability of identifying unique targets, we used three approaches to mine genome datasets, based on (i) unique regions, (ii) polymorphisms, and (iii) presence/absence of regions across datasets. From a pool of candidate markers, we demonstrate LAMP assay specificity by testing related fungal species, common boxwood pathogens, and environmental samples containing 445 diverse fungal taxa. In conclusion, this comparative-genomics-based approach to the development of LAMP diagnostic assays is the first of its kind for fungi and could be easily applied to diagnostic marker development for other newly emergent plant pathogens.
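The "unique regions" and "presence/absence" strategies named in the abstract can be pictured with a k-mer set difference: keep the substrings of the target genome that never occur in any related genome. This is a minimal sketch under that assumption, not the authors' pipeline. The function names and toy sequences are hypothetical, and the sketch ignores reverse complements and sequencing errors, which a real marker search would have to handle.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_kmers(target, others, k):
    """Return k-mers present in target and absent from every sequence in others."""
    background = set()
    for seq in others:
        background |= kmers(seq, k)
    return kmers(target, k) - background


# Toy sequences standing in for a target genome and two related genomes.
target = "ACGTACGGTTAC"
related = ["ACGTACGA", "TTACGTAC"]
candidates = unique_kmers(target, related, k=5)
print(sorted(candidates))
```

Runs of adjacent unique k-mers would then mark candidate regions for primer design, subject to laboratory validation of specificity as described in the abstract.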
Divergent and convergent modes of interaction between wheat and Puccinia graminis f. sp. tritici isolates revealed by the comparative gene co-expression network and genome analyses.

PubMed

Rutter, William B; Salcedo, Andres; Akhunova, Alina; He, Fei; Wang, Shichen; Liang, Hanquan; Bowden, Robert L; Akhunov, Eduard

2017-04-12

Two opposing evolutionary constraints exert pressure on plant pathogens: one to diversify virulence factors in order to evade plant defenses, and the other to retain virulence factors critical for maintaining a compatible interaction with the plant host. To better understand how the diversified arsenals of fungal genes promote interaction with the same compatible wheat line, we performed a comparative genomic analysis of two North American isolates of Puccinia graminis f. sp. tritici (Pgt). The patterns of inter-isolate divergence in the secreted candidate effector genes were compared with the levels of conservation and divergence of plant-pathogen gene co-expression networks (GCN) developed for each isolate. Comparative genomic analyses revealed a substantial level of inter-isolate divergence in effector gene complement and sequence. Gene Ontology (GO) analyses of the conserved and unique parts of the isolate-specific GCNs identified a number of conserved host pathways targeted by both isolates. Interestingly, the degree of inter-isolate sub-network conservation varied widely for the different host pathways and was positively associated with the proportion of conserved effector candidates associated with each sub-network. While different Pgt isolates tended to exploit similar wheat pathways for infection, the mode of plant-pathogen interaction varied for different pathways with some pathways being associated with the conserved set of effectors and others being linked with the diverged or isolate-specific effectors. Our data suggest that at the intra-species level pathogen populations likely maintain divergent sets of effectors capable of targeting the same plant host pathways.
This functional redundancy may play an important role in the dynamic of the "arms-race" between host and pathogen serving as the basis for diverse virulence strategies and creating conditions where mutations in certain effector groups will not have a major effect on the pathogen's ability to infect the host.
Comparative genome analysis of rice-pathogenic Burkholderia provides insight into capacity to adapt to different environments and hosts.

PubMed

Seo, Young-Su; Lim, Jae Yun; Park, Jungwook; Kim, Sunyoung; Lee, Hyun-Hee; Cheong, Hoon; Kim, Sang-Mok; Moon, Jae Sun; Hwang, Ingyu

2015-05-06

In addition to human and animal diseases, bacteria of the genus Burkholderia can cause plant diseases. The representative species of rice-pathogenic Burkholderia are Burkholderia glumae, B. gladioli, and B. plantarii, which primarily cause grain rot, sheath rot, and seedling blight, respectively, resulting in severe reductions in rice production. Though Burkholderia rice pathogens cause problems in rice-growing countries, comprehensive studies of these rice-pathogenic species aiming to control Burkholderia-mediated diseases are only in the early stages. We first sequenced the complete genome of B. plantarii ATCC 43733T. Second, we conducted comparative analysis of the newly sequenced B. plantarii ATCC 43733T genome with eleven complete or draft genomes of B. glumae and B. gladioli strains. Furthermore, we compared the genome of three rice Burkholderia pathogens with those of other Burkholderia species such as those found in environmental habitats and those known as animal/human pathogens. These B. glumae, B. gladioli, and B. plantarii strains have unique genes involved in toxoflavin or tropolone toxin production and the clustered regularly interspaced short palindromic repeats (CRISPR)-mediated bacterial immune system. Although the genome of B. plantarii ATCC 43733T has many common features with those of B. glumae and B. gladioli, this B. plantarii strain has several unique features, including quorum sensing and CRISPR/CRISPR-associated protein (Cas) systems. The complete genome sequence of B. plantarii ATCC 43733T and publicly available genomes of B. glumae BGR1 and B. gladioli BSR3 enabled comprehensive comparative genome analyses among three rice-pathogenic Burkholderia species responsible for tissue rotting and seedling blight. Our results suggest that B. glumae has evolved rapidly, or has undergone rapid genome rearrangements or deletions, in response to the hosts. 
It also clarifies the unique features of rice-pathogenic Burkholderia species relative to other animal and human Burkholderia species.
Comparative and genetic analysis of the four sequenced Paenibacillus polymyxa genomes reveals a diverse metabolism and conservation of genes relevant to plant-growth promotion and competitiveness.

PubMed

Eastman, Alexander W; Heinrichs, David E; Yuan, Ze-Chun

2014-10-03

Members of the genus Paenibacillus are important plant growth-promoting rhizobacteria that can serve as bio-reactors. Paenibacillus polymyxa promotes the growth of a variety of economically important crops. Our lab recently completed the genome sequence of Paenibacillus polymyxa CR1. As of January 2014, four P. polymyxa genomes have been completely sequenced but no comparative genomic analyses have been reported. Here we report the comparative and genetic analyses of four sequenced P. polymyxa genomes, which revealed a significantly conserved core genome. Complex metabolic pathways and regulatory networks were highly conserved and allow P. polymyxa to rapidly respond to dynamic environmental cues. Genes responsible for phytohormone synthesis, phosphate solubilization, iron acquisition, transcriptional regulation, σ-factors, stress responses, transporters and biomass degradation were well conserved, indicating an intimate association with plant hosts and the rhizosphere niche. In addition, genes responsible for antimicrobial resistance and non-ribosomal peptide/polyketide synthesis are present in both the core and accessory genome of each strain. Comparative analyses also reveal variations in the accessory genome, including large plasmids present in strains M1 and SC2. Furthermore, a considerable number of strain-specific genes and genomic islands are irregularly distributed throughout each genome. Although a variety of plant-growth promoting traits are encoded by all strains, only P. polymyxa CR1 encodes the unique nitrogen fixation cluster found in other Paenibacillus sp. Our study revealed that genomic loci relevant to host interaction and ecological fitness are highly conserved within the P. polymyxa genomes analysed, despite variations in the accessory genome. This work suggests that plant-growth promotion by P. polymyxa is mediated largely through phytohormone production, increased nutrient availability and bio-control mechanisms.
This study provides an in-depth understanding of the genome architecture of this species, thus facilitating future genetic engineering and applications in agriculture, industry and medicine. Furthermore, this study highlights the current gap in our understanding of complex plant biomass metabolism in Gram-positive bacteria.
The complete chloroplast genome sequence of Epipremnum aureum and its comparative analysis among eight Araceae species

PubMed Central

Han, Limin; Chen, Chen; Wang, Zhezhi

2018-01-01

Epipremnum aureum is an important foliage plant in the Araceae family. In this study, we have sequenced the complete chloroplast genome of E. aureum by using Illumina Hiseq sequencing platforms. This genome is a double-stranded circular DNA sequence of 164,831 bp that contains 35.8% GC. The two inverted repeats (IRa and IRb; 26,606 bp) are spaced by a small single-copy region (22,868 bp) and a large single-copy region (88,751 bp). The chloroplast genome has 131 (113 unique) functional genes, including 86 (79 unique) protein-coding genes, 37 (30 unique) tRNA genes, and eight (four unique) rRNA genes. Tandem repeats comprise the majority of the 43 long repetitive sequences. In addition, 111 simple sequence repeats are present, with mononucleotides being the most common type and di- and tetranucleotides being infrequent events. Positive selection pressure on rps12 in the E. aureum chloroplast has been demonstrated via synonymous and nonsynonymous substitution rates and selection pressure sites analyses. Ycf15 and infA are pseudogenes in this species. We constructed a Maximum Likelihood phylogenetic tree based on the complete chloroplast genomes of 38 species from 13 families. Those results strongly indicated that E. aureum is positioned as the sister of Colocasia esculenta within the Araceae family. This work may provide information for further study of the molecular phylogenetic relationships within Araceae, as well as molecular markers and breeding novel varieties by chloroplast genetic-transformation of E. aureum in particular. PMID:29529038
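The region lengths and gene counts reported above are internally consistent, which a few lines of arithmetic confirm. The sketch assumes the standard quadripartite chloroplast layout (one large single-copy region, one small single-copy region, and two copies of the inverted repeat); the variable names are illustrative.

```python
# Region lengths in bp, as reported in the abstract.
large_single_copy = 88_751
small_single_copy = 22_868
inverted_repeat = 26_606  # present twice (IRa and IRb)

genome_length = large_single_copy + small_single_copy + 2 * inverted_repeat

# (total, unique) gene counts per class, as reported in the abstract.
gene_counts = {"protein-coding": (86, 79), "tRNA": (37, 30), "rRNA": (8, 4)}
total_genes = sum(total for total, _ in gene_counts.values())
unique_genes = sum(unique for _, unique in gene_counts.values())

print(genome_length, total_genes, unique_genes)
```

The sums reproduce the stated genome size of 164,831 bp and the stated totals of 131 functional genes, 113 of them unique.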
Genomic expression patterns in medication overuse headaches

PubMed Central

Hershey, Andrew D; Burdine, Danny; Kabbouche, Marielle A; Powers, Scott W

2016-01-01

Background: Chronic daily headache (CDH) and chronic migraine (CM) are among the most frequent problems encountered in neurology, are often difficult to treat, and are frequently complicated by medication-overuse headache (MOH). Proper recognition of MOH may alter treatment outcome and prevent long-term disability. Objective: This study identifies the unique genomic expression pattern of MOH that responds to cessation of the overused medication. Methods: Baseline occurrence of MOH and typical pattern of response to medication cessation were measured from a large database. Whole blood samples from patients with CM with or without MOH were obtained and their genomic profile was assessed. Affymetrix human U133 plus2 arrays were used to examine the genomic expression patterns prior to treatment and 6–12 weeks later. Headache characterisation and response to treatment based on headache frequency and disability were compared. Results: Of 1311 patients reporting daily or continuous headaches, 513 (39.1%) reported overusing analgesic medication. At follow-up, 44.5% had a 50% or greater reduction in headache frequency, while 41.6% had no change. Blood genomic expression patterns were obtained on 33 patients with 19 (57.6%) overusing analgesic medication with a unique genomic expression pattern in MOH that responded to cessation of analgesics. Gene ontology of these samples indicated a significant number were involved with brain and immunological tissues, including multiple signalling pathways and apoptosis. Conclusions: Blood genomic patterns can accurately identify MOH patients that respond to medication cessation. These results suggest that MOH involves a unique molecular biology pathway that can be identified with a specific biomarker. PMID:20974594
Genomic Variation in Natural Populations of Drosophila melanogaster

PubMed Central

Langley, Charles H.; Stevens, Kristian; Cardeno, Charis; Lee, Yuh Chwen G.; Schrider, Daniel R.; Pool, John E.; Langley, Sasha A.; Suarez, Charlyn; Corbett-Detig, Russell B.; Kolaczkowski, Bryan; Fang, Shu; Nista, Phillip M.; Holloway, Alisha K.; Kern, Andrew D.; Dewey, Colin N.; Song, Yun S.; Hahn, Matthew W.; Begun, David J.

2012-01-01

This report of independent genome sequences of two natural populations of Drosophila melanogaster (37 from North America and 6 from Africa) provides unique insight into forces shaping genomic polymorphism and divergence. Evidence of interactions between natural selection and genetic linkage is abundant not only in centromere- and telomere-proximal regions, but also throughout the euchromatic arms. Linkage disequilibrium, which decays within 1 kbp, exhibits a strong bias toward coupling of the more frequent alleles and provides a high-resolution map of recombination rate. The juxtaposition of population genetics statistics in small genomic windows with gene structures and chromatin states yields a rich, high-resolution annotation, including the following: (1) 5′- and 3′-UTRs are enriched for regions of reduced polymorphism relative to lineage-specific divergence; (2) exons overlap with windows of excess relative polymorphism; (3) epigenetic marks associated with active transcription initiation sites overlap with regions of reduced relative polymorphism and relatively reduced estimates of the rate of recombination; (4) the rate of adaptive nonsynonymous fixation increases with the rate of crossing over per base pair; and (5) both duplications and deletions are enriched near origins of replication and their density correlates negatively with the rate of crossing over. Available demographic models of X and autosome descent cannot account for the increased divergence on the X and loss of diversity associated with the out-of-Africa migration. Comparison of the variation among these genomes to variation among genomes from D. simulans suggests that many targets of directional selection are shared between these species. PMID:22673804
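Linkage disequilibrium between two biallelic sites is usually summarised by the coefficient D and its normalised square, r². The abstract does not state which estimator the authors used, so the sketch below shows only the textbook definitions. The sign of D indicates whether the alleles coded 1 tend to occur on the same haplotype (coupling) or on different haplotypes (repulsion). The function name and the toy haplotypes are hypothetical.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic sites.

    haplotypes: list of (a, b) pairs, one per sampled chromosome,
                with alleles at each site coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n      # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n      # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                         # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d, d * d / denom


# Toy samples: complete association versus no association.
d_linked, r2_linked = linkage_disequilibrium([(1, 1), (1, 1), (0, 0), (0, 0)])
d_free, r2_free = linkage_disequilibrium([(1, 1), (1, 0), (0, 1), (0, 0)])
print(d_linked, r2_linked, d_free, r2_free)
```

Computing r² for pairs of sites binned by physical distance gives the kind of decay curve summarised in the abstract as "decays within 1 kbp".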
Comparative Genomics Suggests an Independent Origin of Cytoplasmic Incompatibility in Cardinium hertigii

PubMed Central

Kelly, Suzanne E.; Cass, Bodil N.; Müller, Anneliese; Woyke, Tanja; Malfatti, Stephanie A.; Hunter, Martha S.; Horn, Matthias

2012-01-01

Terrestrial arthropods are commonly infected with maternally inherited bacterial symbionts that cause cytoplasmic incompatibility (CI). In CI, the outcome of crosses between symbiont-infected males and uninfected females is reproductive failure, increasing the relative fitness of infected females and leading to spread of the symbiont in the host population. CI symbionts have profound impacts on host genetic structure and ecology and may lead to speciation and the rapid evolution of sex determination systems. Cardinium hertigii, a member of the Bacteroidetes and symbiont of the parasitic wasp Encarsia pergandiella, is the only known bacterium other than the Alphaproteobacteria Wolbachia to cause CI. Here we report the genome sequence of Cardinium hertigii cEper1. Comparison with the genomes of CI–inducing Wolbachia pipientis strains wMel, wRi, and wPip provides a unique opportunity to pinpoint shared proteins mediating host cell interaction, including some candidate proteins for CI that have not previously been investigated. The genome of Cardinium lacks all major biosynthetic pathways but harbors a complete biotin biosynthesis pathway, suggesting a potential role for Cardinium in host nutrition. Cardinium lacks known protein secretion systems but encodes a putative phage-derived secretion system distantly related to the antifeeding prophage of the entomopathogen Serratia entomophila. Lastly, while Cardinium and Wolbachia genomes show only a functional overlap of proteins, they show no evidence of laterally transferred elements that would suggest common ancestry of CI in both lineages. Instead, comparative genomics suggests an independent evolution of CI in Cardinium and Wolbachia and provides a novel context for understanding the mechanistic basis of CI. PMID:23133394
The genome and structural proteome of YuA, a new Pseudomonas aeruginosa phage resembling M6.

PubMed

Ceyssens, Pieter-Jan; Mesyanzhinov, Vadim; Sykilinda, Nina; Briers, Yves; Roucourt, Bart; Lavigne, Rob; Robben, Johan; Domashin, Artem; Miroshnikov, Konstantin; Volckaert, Guido; Hertveldt, Kirsten

2008-02-01

Pseudomonas aeruginosa phage YuA (Siphoviridae) was isolated from a pond near Moscow, Russia. It has an elongated head, encapsulating a circularly permuted genome of 58,663 bp, and a flexible, noncontractile tail, which is terminally and subterminally decorated with short fibers. The YuA genome is neither Mu- nor lambda-like and encodes 78 gene products that cluster in three major regions involved in (i) DNA metabolism and replication, (ii) host interaction, and (iii) phage particle formation and host lysis. At the protein level, YuA displays significant homology with phages M6, phiJL001, 73, B3, DMS3, and D3112. Eighteen YuA proteins were identified as part of the phage particle by mass spectrometry analysis. Five different bacterial promoters were experimentally identified using a promoter trap assay, three of which have a sigma54-specific binding site and regulate transcription in the genome region involved in phage particle formation and host lysis. The dependency of these promoters on the host sigma54 factor was confirmed by analysis of an rpoN mutant strain of P. aeruginosa PAO1. At the DNA level, YuA is 91% identical to the recently (July 2007) annotated phage M6 of the Lindberg typing set. Despite this level of DNA homology throughout the genome, both phages combined have 15 unique genes that do not occur in the other phage. The genome organization of both phages differs substantially from those of the other known Pseudomonas-infecting Siphoviridae, delineating them as a distinct genus within this family.
Population genomics of Fusarium graminearum reveals signatures of divergent evolution within a major cereal pathogen

PubMed Central

2018-01-01

The cereal pathogen Fusarium graminearum is the primary cause of Fusarium head blight (FHB) and a significant threat to food safety and crop production. To elucidate population structure and identify genomic targets of selection within major FHB pathogen populations in North America we sequenced the genomes of 60 diverse F. graminearum isolates. We also assembled the first pan-genome for F. graminearum to clarify population-level differences in gene content potentially contributing to pathogen diversity. Bayesian and phylogenomic analyses revealed genetic structure associated with isolates that produce the novel NX-2 mycotoxin, suggesting a North American population that has remained genetically distinct from other endemic and introduced cereal-infecting populations. Genome scans uncovered distinct signatures of selection within populations, focused in high diversity, frequently recombining regions. These patterns suggested selection for genomic divergence at the trichothecene toxin gene cluster and thirteen additional regions containing genes potentially involved in pathogen specialization. Gene content differences further distinguished populations, in that 121 genes showed population-specific patterns of conservation. Genes that differentiated populations had predicted functions related to pathogenesis, secondary metabolism and antagonistic interactions, though a subset had unique roles in temperature and light sensitivity. Our results indicated that F. graminearum populations are distinguished by dozens of genes with signatures of selection and an array of dispensable accessory genes, suggesting that FHB pathogen populations may be equipped with different traits to exploit the agroecosystem. 
These findings provide insights into the evolutionary processes and genomic features contributing to population divergence in plant pathogens, and highlight candidate genes for future functional studies of pathogen specialization across evolutionarily and ecologically diverse fungi. PMID:29584736
Coevolution study of mitochondria respiratory chain proteins: toward the understanding of protein--protein interaction.

PubMed

Yang, Ming; Ge, Yan; Wu, Jiayan; Xiao, Jingfa; Yu, Jun

2011-05-20

Coevolution can be seen as the interdependency between evolutionary histories. In the context of protein evolution, functionally correlated proteins show ever-present coordinated evolutionary characters without disrupting organismal integrity. In complex systems, there are two forms of protein--protein interaction in vivo: inter-complex interaction and intra-complex interaction. In this paper, we studied the difference in coevolution characters between inter-complex interaction and intra-complex interaction using the "Mirror tree" method on the respiratory chain (RC) proteins. We divided the correlation coefficients of all pairs of RC proteins into two groups, corresponding to binary protein--protein interactions within a complex and binary protein--protein interactions between complexes, respectively. A dramatic discrepancy is detected between the coevolution characters of the two sets of protein interactions (Wilcoxon test, p-value = 4.4 × 10^−6). Our finding reveals some critical information for coevolutionary studies and assists the mechanistic investigation of protein--protein interaction. Furthermore, the results also provide some unique clues to the supramolecular organization of protein complexes in the mitochondrial inner membrane. More detailed binding-site maps and genome information for nuclear-encoded RC proteins will be extraordinarily valuable for further study of mitochondrial dynamics. Copyright © 2011. Published by Elsevier Ltd.
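In its basic form, the mirror tree method scores a pair of protein families by correlating their evolutionary distance matrices, each built over the same set of species. A high correlation suggests that the two families have similar trees. The sketch below assumes that basic form with a Pearson correlation; the abstract does not describe the authors' exact variant, and the function names and toy matrices are hypothetical.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mirror_tree_score(dist_a, dist_b):
    """Correlate two distance matrices whose rows/columns are the same species."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Toy distances for four species; family B evolves twice as fast as family A.
dist_a = [
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.3, 0.6],
    [0.4, 0.3, 0.0, 0.2],
    [0.5, 0.6, 0.2, 0.0],
]
dist_b = [[2 * d for d in row] for row in dist_a]
score = mirror_tree_score(dist_a, dist_b)
print(f"mirror tree correlation: {score:.3f}")
```

Scores computed this way for every protein pair could then be split into intra-complex and inter-complex groups and compared with a rank-based test, as the abstract describes.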
Differential principal component analysis of ChIP-seq.

PubMed

Ji, Hongkai; Li, Xia; Wang, Qian-fei; Ning, Yang

2013-04-23

We propose differential principal component analysis (dPCA) for analyzing multiple ChIP-sequencing datasets to identify differential protein-DNA interactions between two biological conditions. dPCA integrates unsupervised pattern discovery, dimension reduction, and statistical inference into a single framework. It uses a small number of principal components to summarize concisely the major multiprotein synergistic differential patterns between the two conditions. For each pattern, it detects and prioritizes differential genomic loci by comparing the between-condition differences with the within-condition variation among replicate samples. dPCA provides a unique tool for efficiently analyzing large amounts of ChIP-sequencing data to study dynamic changes of gene regulation across different biological conditions. We demonstrate this approach through analyses of differential chromatin patterns at transcription factor binding sites and promoters as well as allele-specific protein-DNA interactions.

Complete genome sequencing and evolutionary analysis of Indian isolates of Dengue virus type 2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dash, Paban Kumar; Sharma, Shashi; Soni, Manisha

Highlights: • Complete genome of Indian DENV-2 was deciphered for the first time in this study. • The recent Indian DENV-2 revealed presence of many unique amino acid residues. • Genotype shift (American to Cosmopolitan) characterizes evolution of DENV-2 in India. • Circulation of a unique clade of DENV-2 in South Asia was identified. -- Abstract: Dengue is the most important arboviral infection of global public health significance. It is now endemic in most parts of South East Asia including India. Though Dengue virus type 2 (DENV-2) is predominantly associated with major outbreaks in India, complete genome information of Indian DENV-2 is not available. In this study, the full-length genome of five DENV-2 isolates (four from 2001 to 2011 and one from 1960), from different parts of India was determined. The complete genome of the Indian DENV-2 was found to be 10,670 bases long with an open reading frame coding for 3391 amino acids. The recent Indian DENV-2 (2001–2011) revealed a nucleotide sequence identity of around 90% and 97% with an older Indian DENV-2 (1960) and closely related Sri Lankan and Chinese DENV-2 respectively. Presence of unique amino acid residues and non-conservative substitutions in critical amino acid residues of major structural and non-structural proteins was observed in recent Indian DENV-2. Selection pressure analysis revealed positive selection in a few amino acid sites of the genes encoding structural and non-structural proteins. The molecular phylogenetic analysis based on comparison of both complete coding region and envelope protein gene with globally diverse DENV-2 viruses classified the recent Indian isolates into a unique South Asian clade within Cosmopolitan genotype. A shift of genotype from American to Cosmopolitan in the 1970s characterized the evolution of DENV-2 in India.
The present study is the first report on complete genome characterization of emerging DENV-2 isolates from India and highlights the circulation of a unique clade in South Asia.
Insights into the genomic plasticity of Pseudomonas putida KF715, a strain with unique biphenyl-utilizing activity and genome instability properties.

PubMed

Suenaga, Hikaru; Fujihara, Hidehiko; Kimura, Nobutada; Hirose, Jun; Watanabe, Takahito; Futagami, Taiki; Goto, Masatoshi; Shimodaira, Jun; Furukawa, Kensuke

2017-10-01

Pseudomonas putida KF715 exhibits unique properties in both catabolic activity and genome plasticity. Our previous studies revealed that the DNA region containing biphenyl and salicylate metabolism gene clusters (termed the bph-sal element) was frequently deleted and transferred by conjugation to closely related P. putida strains. In this study, we first determined the complete nucleotide sequence of the KF715 genome. Next, to determine the underlying cause of genome plasticity in KF715, we compared the KF715 genome with the genomes of one KF715 defective mutant, two transconjugants, and several P. putida strains available from public databases. The gapless KF715 genome sequence revealed five replicons: one circular chromosome, and four plasmids. Southern blot analysis indicated that most of the KF715 cell population carries the bph-sal element on the chromosome whereas a small number carry it on a huge plasmid, pKF715A. Moreover, the bph-sal element is present stably on the plasmid and did not integrate into the chromosome of its transconjugants. Comparative genome analysis and experiments showed that a number of diverse putative genetic elements are present in KF715 and are likely involved in genome rearrangement. These data provide insights into the genetic plasticity and adaptability of microorganisms for survival in various ecological niches. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.
Detection of PIWI and piRNAs in the mitochondria of mammalian cancer cells.

PubMed

Kwon, ChangHyuk; Tak, Hyosun; Rho, Mina; Chang, Hae Ryung; Kim, Yon Hui; Kim, Kyung Tae; Balch, Curt; Lee, Eun Kyung; Nam, Seungyoon

2014-03-28

Piwi-interacting RNAs (piRNAs) are 26-31 nt small noncoding RNAs that are processed from their longer precursor transcripts by Piwi proteins. Localization of Piwi and piRNA has been reported mostly in the nucleus and cytoplasm of germ-line cells of higher eukaryotes, where it is believed that known piRNA sequences are located in repeat regions of the nuclear genome. However, localization of PIWI and piRNA in mammalian somatic cell mitochondria remains largely unknown. We identified 29 piRNA sequence alignments from various regions of the human mitochondrial genome. Twelve of the 29 piRNA sequences matched stem-loop fragment sequences of seven distinct tRNAs. We observed their actual expression in mitochondria subcellular fractions by inspecting mitochondrial-specific small RNA-Seq datasets. Of interest, the majority of the 29 piRNAs overlapped with multiple longer transcripts (expressed sequence tags) that are unique to the human mitochondrial genome. The presence of mature piRNAs in mitochondria was detected by qRT-PCR of mitochondrial subcellular RNAs. Further validation showed detection of Piwi by colocalization using anti-Piwil1 and mitochondria organelle-specific protein antibodies. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Applied genomics in ruminants-new discoveries and model for predictive medicine

USDA-ARS's Scientific Manuscript database

An overview of the progress for Dr. Sonstegard’s work in applied genomics in dairy cattle will be presented. The overview will include how applied research in livestock offers unique investigative models to discover gene function as a result of genetic load or inbreeding and also how genome selectio...
Complete Genome Sequences of Bacillus Phages Janet and OTooleKemple52

PubMed Central

2018-01-01

ABSTRACT We report here the genome sequences of two novel Bacillus cereus group-infecting bacteriophages, Janet and OTooleKemple52. These bacteriophages are double-stranded DNA-containing Myoviridae isolated from soil samples. While their genomes share a high degree of sequence identity with one another, their host preferences are unique. PMID:29748396
Long Noncoding RNAs: a New Regulatory Code in Metabolic Control

PubMed Central

Zhao, Xu-Yun; Lin, Jiandie D.

2015-01-01

Long noncoding RNAs (lncRNAs) are emerging as an integral part of the regulatory information encoded in the genome. LncRNAs possess the unique capability to interact with nucleic acids and proteins and exert discrete effects on numerous biological processes. Recent studies have delineated multiple lncRNA pathways that control metabolic tissue development and function. The expansion of the regulatory code that links nutrient and hormonal signals to tissue metabolism gives new insights into the genetic and pathogenic mechanisms underlying metabolic disease. This review discusses lncRNA biology with a focus on its role in the development, signaling, and function of key metabolic tissues. PMID:26410599
[What makes a parasite "transforming"? Insights into cancer from the agents of an exotic pathology, Theileria spp].

PubMed

Cheeseman, K M; Weitzman, J B

2017-02-01

Theileria are obligate eukaryotic intracellular parasites of cattle. The diseases they cause, Tropical theileriosis and East Coast Fever, cause huge economic loss in East African, Mediterranean and central and South-East Asian countries. These apicomplexan parasites are the only intracellular eukaryotic parasites known to transform their host cell and represent a unique model to study host-parasite interactions and mechanisms of cancer onset. Here, we review how Theileria parasites induce transformation of their leukocyte host cell and discuss similarities with tumorigenesis. We describe how genomic innovation, epigenetic changes and hijacking of signal transduction enable a eukaryotic parasite to transform its host cell.
Insights on genome size evolution from a miniature inverted repeat transposon driving a satellite DNA.

PubMed

Scalvenzi, Thibault; Pollet, Nicolas

2014-12-01

The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size. Copyright © 2014 Elsevier Inc. All rights reserved.
GenomeD3Plot: a library for rich, interactive visualizations of genomic data in web applications.

PubMed

Laird, Matthew R; Langille, Morgan G I; Brinkman, Fiona S L

2015-10-15

A simple static image of genomes and associated metadata is very limiting, as researchers expect rich, interactive tools similar to the web applications found in the post-Web 2.0 world. GenomeD3Plot is a lightweight visualization library written in JavaScript using the D3 library. GenomeD3Plot provides a rich API to allow the rapid visualization of complex genomic data using a convenient standards-based JSON configuration file. When integrated into existing web services, GenomeD3Plot allows researchers to interact with data, dynamically alter the view, or even resize or reposition the visualization in their browser window. In addition, GenomeD3Plot has built-in functionality to export any resulting genome visualization in PNG or SVG format for easy inclusion in manuscripts or presentations. GenomeD3Plot is being utilized in the recently released Islandviewer 3 (www.pathogenomics.sfu.ca/islandviewer/) to visualize predicted genomic islands with other genome annotation data. However, its features enable it to be more widely applicable for dynamic visualization of genomic data in general. GenomeD3Plot is licensed under the GNU-GPL v3 at https://github.com/brinkmanlab/GenomeD3Plot/. Contact: brinkman@sfu.ca. © The Author 2015. Published by Oxford University Press.
An archaeal genomic signature

NASA Technical Reports Server (NTRS)

Graham, D. E.; Overbeek, R.; Olsen, G. J.; Woese, C. R.

2000-01-01

Comparisons of complete genome sequences allow the most objective and comprehensive descriptions possible of a lineage's evolution. This communication uses the completed genomes from four major euryarchaeal taxa to define a genomic signature for the Euryarchaeota and, by extension, the Archaea as a whole. The signature is defined in terms of the set of protein-encoding genes found in at least two diverse members of the euryarchaeal taxa that function uniquely within the Archaea; most signature proteins have no recognizable bacterial or eukaryal homologs. By this definition, 351 clusters of signature proteins have been identified. Functions of most proteins in this signature set are currently unknown. At least 70% of the clusters that contain proteins from all the euryarchaeal genomes also have crenarchaeal homologs. This conservative set, which appears refractory to horizontal gene transfer to the Bacteria or the Eukarya, would seem to reflect the significant innovations that were unique and fundamental to the archaeal "design fabric." Genomic protein signature analysis methods may be extended to characterize the evolution of any phylogenetically defined lineage. The complete set of protein clusters for the archaeal genomic signature is presented as supplementary material (see the PNAS web site, www.pnas.org).
InFlo: a novel systems biology framework identifies cAMP-CREB1 axis as a key modulator of platinum resistance in ovarian cancer.

PubMed

Dimitrova, N; Nagaraj, A B; Razi, A; Singh, S; Kamalakaran, S; Banerjee, N; Joseph, P; Mankovich, A; Mittal, P; DiFeo, A; Varadan, V

2017-04-27

Characterizing the complex interplay of cellular processes in cancer would enable the discovery of key mechanisms underlying its development and progression. Published approaches to decipher driver mechanisms do not explicitly model tissue-specific changes in pathway networks and the regulatory disruptions related to genomic aberrations in cancers. We therefore developed InFlo, a novel systems biology approach for characterizing complex biological processes using a unique multidimensional framework integrating transcriptomic, genomic and/or epigenomic profiles for any given cancer sample. We show that InFlo robustly characterizes tissue-specific differences in activities of signalling networks on a genome scale using unique probabilistic models of molecular interactions on a per-sample basis. Using large-scale multi-omics cancer datasets, we show that InFlo exhibits higher sensitivity and specificity in detecting pathway networks associated with specific disease states when compared to published pathway network modelling approaches. Furthermore, InFlo's ability to infer the activity of unmeasured signalling network components was also validated using orthogonal gene expression signatures. We then evaluated multi-omics profiles of primary high-grade serous ovarian cancer tumours (N=357) to delineate mechanisms underlying resistance to frontline platinum-based chemotherapy. InFlo was the only algorithm to identify hyperactivation of the cAMP-CREB1 axis as a key mechanism associated with resistance to platinum-based therapy, a finding that we subsequently experimentally validated. We confirmed that inhibition of CREB1 phosphorylation potently sensitized resistant cells to platinum therapy and was effective in killing ovarian cancer stem cells that contribute to both platinum-resistance and tumour recurrence. Thus, we propose InFlo to be a scalable and widely applicable and robust integrative network modelling framework for the discovery of evidence-based biomarkers and therapeutic targets.
Enhanced guide-RNA design and targeting analysis for precise CRISPR genome editing of single and consortia of industrially relevant and non-model organisms.

PubMed

Mendoza, Brian J; Trinh, Cong T

2018-01-01

Genetic diversity of non-model organisms offers a repertoire of unique phenotypic features for exploration and cultivation for synthetic biology and metabolic engineering applications. To realize this enormous potential, it is critical to have an efficient genome editing tool for rapid strain engineering of these organisms to perform novel programmed functions. To accommodate the use of CRISPR/Cas systems for genome editing across organisms, we have developed a novel method, named CRISPR Associated Software for Pathway Engineering and Research (CASPER), for identifying on- and off-targets with enhanced predictability coupled with an analysis of non-unique (repeated) targets to assist in editing any organism with various endonucleases. Utilizing CASPER, we demonstrated a modest 2.4% and significant 30.2% improvement (F-test, P < 0.05) over the conventional methods for predicting on- and off-target activities, respectively. Further, we used CASPER to develop novel applications in genome editing: multitargeting analysis (i.e. simultaneous multiple-site modification on a target genome with a sole guide-RNA requirement) and multispecies population analysis (i.e. guide-RNA design for genome editing across a consortium of organisms). Our analysis on a selection of industrially relevant organisms revealed a number of non-unique target sites associated with genes and transposable elements that can be used as potential sites for multitargeting. The analysis also identified shared and unshared targets that enable genome editing of single or multiple genomes in a consortium of interest. We envision CASPER as a useful platform to enhance the precise CRISPR genome editing for metabolic engineering and synthetic biology applications. Availability: https://github.com/TrinhLab/CASPER. Contact: ctrinh@utk.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
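The distinction between unique and non-unique (repeated) target sites in the abstract above can be illustrated with a small sketch. This is not CASPER's algorithm or scoring. It assumes a 20-nt spacer followed by an NGG PAM, scans the forward strand only, and the function names `find_targets` and `split_unique` are hypothetical.

```python
import re
from collections import defaultdict


def find_targets(genome, spacer_len=20):
    """Map each spacer to the forward-strand positions where an NGG PAM follows it.

    Assumption for illustration: NGG PAM, forward strand only.
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % spacer_len)
    sites = defaultdict(list)
    for match in pattern.finditer(genome.upper()):
        sites[match.group(1)].append(match.start())
    return dict(sites)


def split_unique(sites):
    """Separate spacers that occur once from spacers that occur at several positions."""
    unique = {s: p for s, p in sites.items() if len(p) == 1}
    repeated = {s: p for s, p in sites.items() if len(p) > 1}
    return unique, repeated


# Toy sequence (made up): one spacer occurs twice, another occurs once.
repeat = "ACGTACGTACGTACGTACGT"
single = "AAAAACCCCCAAAAACCCCC"
toy_genome = repeat + "TGG" + "TTTTT" + repeat + "TGG" + "CCCCC" + single + "AGG"

unique_sites, repeated_sites = split_unique(find_targets(toy_genome))
print("unique:", unique_sites)
print("repeated:", repeated_sites)
```

In this toy setting a repeated spacer is a candidate for multitargeting with a single guide RNA, while a unique spacer addresses exactly one site; a real analysis would also scan the reverse strand and score off-target similarity.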
Elucidation of the genome organization of tobacco mosaic virus.

PubMed Central

Zaitlin, M

1999-01-01

Proteins unique to tobacco mosaic virus (TMV)-infected plants were detected in the 1970s by electrophoretic analyses of extracts of virus-infected tissues, comparing their proteins to those generated in extracts of uninfected tissues. The genome organization of TMV was deduced principally from studies involving in vitro translation of proteins from the genomic and subgenomic messenger RNAs. The ultimate analysis of the TMV genome came in 1982 when P. Goelet and colleagues sequenced the entire genome. Studies leading to the elucidation of the TMV genome organization are described below. PMID:10212938
Complete chloroplast genome sequence of a major allogamous forage species, perennial ryegrass (Lolium perenne L.).

PubMed

Diekmann, Kerstin; Hodkinson, Trevor R; Wolfe, Kenneth H; van den Bekerom, Rob; Dix, Philip J; Barth, Susanne

2009-06-01

Lolium perenne L. (perennial ryegrass) is globally one of the most important forage and grassland crops. We sequenced the chloroplast (cp) genome of Lolium perenne cultivar Cashel. The L. perenne cp genome is 135 282 bp with a typical quadripartite structure. It contains genes for 76 unique proteins, 30 tRNAs and four rRNAs. As in other grasses, the genes accD, ycf1 and ycf2 are absent. The genome is of average size within its subfamily Pooideae and of medium size within the Poaceae. Genome size differences are mainly due to length variations in non-coding regions. However, considerable length differences of 1-27 codons in comparison of L. perenne to other Poaceae and 1-68 codons among all Poaceae were also detected. Within the cp genome of this outcrossing cultivar, 10 insertion/deletion polymorphisms and 40 single nucleotide polymorphisms were detected. Two of the polymorphisms involve tiny inversions within hairpin structures. By comparing the genome sequence with RT-PCR products of transcripts for 33 genes, 31 mRNA editing sites were identified, five of them unique to Lolium. The cp genome sequence of L. perenne is available under Accession number AM777385 at the European Molecular Biology Laboratory, National Center for Biotechnology Information and DNA DataBank of Japan.
Next Generation Protein Interactomes for Plant Systems Biology and Biomass Feedstock Research

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ecker, Joseph Robert; Trigg, Shelly; Garza, Renee

Biofuel crop cultivation is a necessary step in heading towards a sustainable future, making their genomic studies a priority. While technology platforms that currently exist for studying non-model crop species, like switchgrass or sorghum, have yielded large quantities of genomic and expression data, still a large gap exists between molecular mechanism and phenotype. The aspect of molecular activity at the level of protein-protein interactions has recently begun to bridge this gap, providing a more global perspective. Interactome analysis has defined more specific functional roles of proteins based on their interaction partners, neighborhoods, and other network features, making it possible to distinguish unique modules of immune response to different plant pathogens (Jiang, Dong, and Zhang 2016). As we work towards cultivating hardier biofuel crops, interactome data will lead to uncovering crop-specific defense and development networks. However, the collection of protein interaction data has been limited to expensive, time-consuming, hard-to-scale assays that mostly require cloned ORF collections. For these reasons, we have successfully developed a highly scalable, economical, and sensitive yeast two-hybrid assay, ProCREate, that can be universally applied to generate proteome-wide primary interactome data. ProCREate enables en masse pooling and massively parallel sequencing for the identification of interacting proteins by exploiting Cre-lox recombination. ProCREate can be used to screen ORF/cDNA libraries from feedstock plant tissues. The interactome data generated will yield deeper insight into many molecular processes and pathways that can be used to guide improvement of feedstock productivity and sustainability.
Synchronized dynamics of bacterial niche-specific functions during biofilm development in a cold seep brine pool.

PubMed

Zhang, Weipeng; Wang, Yong; Bougouffa, Salim; Tian, Renmao; Cao, Huiluo; Li, Yongxin; Cai, Lin; Wong, Yue Him; Zhang, Gen; Zhou, Guowei; Zhang, Xixiang; Bajic, Vladimir B; Al-Suwailem, Abdulaziz; Qian, Pei-Yuan

2015-10-01

The biology of biofilms in deep-sea environments has barely been explored. Here, biofilms were developed at the brine pool (characterized by limited carbon sources) and the normal bottom water adjacent to Thuwal cold seeps. Comparative metagenomics based on 50 Gb datasets identified polysaccharide degradation, nitrate reduction and proteolysis as enriched functional categories for brine biofilms. The genomes of two dominant species in the brine biofilms, a novel Deltaproteobacterium and a novel Epsilonproteobacterium, were reconstructed. Despite rather small genome sizes, the Deltaproteobacterium possessed enhanced polysaccharide fermentation pathways, whereas the Epsilonproteobacterium was a versatile nitrogen reactor possessing nar, nap and nif gene clusters. These metabolic functions, together with specific regulatory and hypersaline-tolerant genes, made the two bacteria unique compared with their close relatives, including those from hydrothermal vents. Moreover, these functions were regulated by biofilm development, as both the abundance and the expression level of key functional genes were higher in later stage biofilms, and co-occurrences between the two dominant bacteria were demonstrated. Collectively, unique mechanisms were revealed: (i) polysaccharide fermentation and proteolysis interacted with nitrogen cycling to form a complex chain for energy generation, and (ii) the exploitation and organization of niche-specific functions would be an important strategy for biofilm-dependent adaptation to the extreme conditions. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.
The sea cucumber genome provides insights into morphological evolution and visceral regeneration

PubMed Central

Dai, Hui; Hamel, Jean-François; Liu, Chengzhang; Yu, Yang; Liu, Shilin; Lin, Wenchao; Guo, Kaimin; Jin, Songjun; Xu, Peng; Storey, Kenneth B.; Huan, Pin; Zhang, Tao; Zhou, Yi; Zhang, Jiquan; Lin, Chenggang; Li, Xiaoni; Xing, Lili; Huo, Da; Sun, Mingzhe; Wang, Lei; Mercier, Annie; Li, Fuhua; Yang, Hongsheng

2017-01-01

Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs. PMID:29023486
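The assembly statistics quoted above (contig N50 and scaffold N50) follow a standard definition: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative and are not taken from the sea cucumber assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of the total length is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths (made up): total 300, so half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of the total, so N50 is 70
```

A larger N50 for scaffolds than for contigs, as in the abstract, is expected because scaffolding joins contigs into longer sequences.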
The sea cucumber genome provides insights into morphological evolution and visceral regeneration.

PubMed

Zhang, Xiaojun; Sun, Lina; Yuan, Jianbo; Sun, Yamin; Gao, Yi; Zhang, Libin; Li, Shihao; Dai, Hui; Hamel, Jean-François; Liu, Chengzhang; Yu, Yang; Liu, Shilin; Lin, Wenchao; Guo, Kaimin; Jin, Songjun; Xu, Peng; Storey, Kenneth B; Huan, Pin; Zhang, Tao; Zhou, Yi; Zhang, Jiquan; Lin, Chenggang; Li, Xiaoni; Xing, Lili; Huo, Da; Sun, Mingzhe; Wang, Lei; Mercier, Annie; Li, Fuhua; Yang, Hongsheng; Xiang, Jianhai

2017-10-01

Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs.
Co-regulation of pluripotency and genetic integrity at the genomic level.

PubMed

Cooper, Daniel J; Walter, Christi A; McCarrey, John R

2014-11-01

The Disposable Soma Theory holds that genetic integrity will be maintained at more pristine levels in germ cells than in somatic cells because of the unique role germ cells play in perpetuating the species. We tested the hypothesis that the same concept applies to pluripotent cells compared to differentiated cells. Analyses of transcriptome and cistrome databases, along with canonical pathway analysis and chromatin immunoprecipitation confirmed differential expression of DNA repair and cell death genes in embryonic stem cells and induced pluripotent stem cells relative to fibroblasts, and predicted extensive direct and indirect interactions between the pluripotency and genetic integrity gene networks in pluripotent cells. These data suggest that enhanced maintenance of genetic integrity is fundamentally linked to the epigenetic state of pluripotency at the genomic level. In addition, these findings demonstrate how a small number of key pluripotency factors can regulate large numbers of downstream genes in a pathway-specific manner. Copyright © 2014. Published by Elsevier B.V.
Comparative promoter analysis allows de novo identification of specialized cell junction-associated proteins.

PubMed

Cohen, Clemens D; Klingenhoff, Andreas; Boucherot, Anissa; Nitsche, Almut; Henger, Anna; Brunner, Bodo; Schmid, Holger; Merkle, Monika; Saleem, Moin A; Koller, Klaus-Peter; Werner, Thomas; Gröne, Hermann-Josef; Nelson, Peter J; Kretzler, Matthias

2006-04-11

Shared transcription factor binding sites that are conserved in distance and orientation help control the expression of gene products that act together in the same biological context. New bioinformatics approaches allow the rapid characterization of shared promoter structures and can be used to find novel interacting molecules. Here, these principles are demonstrated by using molecules linked to the unique functional unit of the glomerular slit diaphragm. An evolutionarily conserved promoter model was generated by comparative genomics in the proximal promoter regions of the slit diaphragm-associated molecule nephrin. Phylogenetic promoter fingerprints of known elements of the slit diaphragm complex identified the nephrin model in the promoter region of zonula occludens-1 (ZO-1). Genome-wide scans using this promoter model effectively predicted a previously unrecognized slit diaphragm molecule, cadherin-5. Nephrin, ZO-1, and cadherin-5 mRNA showed stringent coexpression across a diverse set of human glomerular diseases. Comparative promoter analysis can identify regulatory pathways at work in tissue homeostasis and disease processes.

Workshop on Molecular Evolution

NASA Technical Reports Server (NTRS)

Cummings, Michael P.

2004-01-01

Molecular evolution has become the nexus of many areas of biological research. It both brings together and enriches such areas as biochemistry, molecular biology, microbiology, population genetics, systematics, developmental biology, genomics, bioinformatics, in vitro evolution, and molecular ecology. The Workshop provides an important contribution to these fields in that it promotes interdisciplinary research and interaction, and thus provides a glue that sticks together disparate fields. Due to the wide range of fields addressed by the study of molecular evolution, it is difficult to offer a comprehensive course in a university setting. It is rare for a single institution to maintain expertise in all necessary areas. In contrast, the Workshop is uniquely able to provide necessary breadth and depth by utilizing a large number of faculty with appropriate expertise. Furthermore, the flexible nature of the Workshop allows for rapid adaptation to changes in the dynamic field of molecular evolution. For example, the 2003 Workshop included recently emergent research areas of molecular evolution of development and genomics.
Starch Catabolism by a Prominent Human Gut Symbiont Is Directed by the Recognition of Amylose Helices

DOE Office of Scientific and Technical Information (OSTI.GOV)

Koropatkin, Nicole M.; Martens, Eric C.; Gordon, Jeffrey I.

2009-01-12

The human gut microbiota performs functions that are not encoded in our Homo sapiens genome, including the processing of otherwise undigestible dietary polysaccharides. Defining the structures of proteins involved in the import and degradation of specific glycans by saccharolytic bacteria complements genomic analysis of the nutrient-processing capabilities of gut communities. Here, we describe the atomic structure of one such protein, SusD, required for starch binding and utilization by Bacteroides thetaiotaomicron, a prominent adaptive forager of glycans in the distal human gut microbiota. The binding pocket of this unique α-helical protein contains an arc of aromatic residues that complements the natural helical structure of starch and imposes this conformation on bound maltoheptaose. Furthermore, SusD binds cyclic oligosaccharides with higher affinity than linear forms. The structures of several SusD/oligosaccharide complexes reveal an inherent ligand recognition plasticity dominated by the three-dimensional conformation of the oligosaccharides rather than specific interactions with the composite sugars.
Context influences on TALE–DNA binding revealed by quantitative profiling

PubMed Central

Rogers, Julia M.; Barrera, Luis A.; Reyon, Deepak; Sander, Jeffry D.; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L.

2015-01-01

Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE–DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000–20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE–DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design. PMID:26067805
Context influences on TALE-DNA binding revealed by quantitative profiling.

PubMed

Rogers, Julia M; Barrera, Luis A; Reyon, Deepak; Sander, Jeffry D; Kellis, Manolis; Joung, J Keith; Bulyk, Martha L

2015-06-11

Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE-DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000-20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE-DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design.
The Generation R Study: Biobank update 2015.

PubMed

Kruithof, Claudia J; Kooijman, Marjolein N; van Duijn, Cornelia M; Franco, Oscar H; de Jongste, Johan C; Klaver, Caroline C W; Mackenbach, Johan P; Moll, Henriëtte A; Raat, Hein; Rings, Edmond H H M; Rivadeneira, Fernando; Steegers, Eric A P; Tiemeier, Henning; Uitterlinden, Andre G; Verhulst, Frank C; Wolvius, Eppo B; Hofman, Albert; Jaddoe, Vincent W V

2014-12-01

The Generation R Study is a population-based prospective cohort study from fetal life until adulthood. The study is designed to identify early environmental and genetic causes and causal pathways leading to normal and abnormal growth, development and health from fetal life, childhood and young adulthood. In total, 9,778 mothers were enrolled in the study. Data collection in children and their parents includes questionnaires, interviews, detailed physical and ultrasound examinations, behavioural observations, Magnetic Resonance Imaging and biological samples. Efforts have been made to collect biological samples, including blood, hair, faeces, nasal swabs, saliva and urine samples, and to generate genomics data on DNA, RNA and the microbiome. In this paper, we give an update of the collection, processing and storage of these biological samples and available measures. Together with detailed phenotype measurements, these biological samples provide a unique resource for epidemiological studies focused on environmental exposures, genetic and genomic determinants and their interactions in relation to growth, health and development from fetal life onwards.
Analysis and functional annotation of expressed sequence tags from the fall armyworm Spodoptera frugiperda

PubMed Central

Deng, Youping; Dong, Yinghua; Thodima, Venkata; Clem, Rollie J; Passarelli, A Lorena

2006-01-01

Background: Little is known about the genome sequences of lepidopteran insects, although this group of insects has been studied extensively in the fields of endocrinology, development, immunity, and pathogen-host interactions. In addition, cell lines derived from Spodoptera frugiperda and other lepidopteran insects are routinely used for baculovirus foreign gene expression. This study reports the results of an expressed sequence tag (EST) sequencing project in cells from the lepidopteran insect S. frugiperda, the fall armyworm. Results: We have constructed an EST database using two cDNA libraries from the S. frugiperda-derived cell line, SF-21. The database consists of 2,367 ESTs which were assembled into 244 contigs and 951 singlets for a total of 1,195 unique sequences. Conclusion: S. frugiperda is an agriculturally important pest insect and genomic information will be instrumental for establishing initial transcriptional profiling and gene function studies, and for obtaining information about genes manipulated during infections by insect pathogens such as baculoviruses. PMID:17052344
In Silico Pattern-Based Analysis of the Human Cytomegalovirus Genome

PubMed Central

Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T.; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas

2003-01-01

More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/). PMID:12634390
In silico pattern-based analysis of the human cytomegalovirus genome.

PubMed

Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas

2003-04-01

More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/).
A comprehensive draft genome sequence for lupin (Lupinus angustifolius), an emerging health food: insights into plant-microbe interactions and legume evolution.

PubMed

Hane, James K; Ming, Yao; Kamphuis, Lars G; Nelson, Matthew N; Garg, Gagan; Atkins, Craig A; Bayer, Philipp E; Bravo, Armando; Bringans, Scott; Cannon, Steven; Edwards, David; Foley, Rhonda; Gao, Ling-Ling; Harrison, Maria J; Huang, Wei; Hurgobin, Bhavna; Li, Sean; Liu, Cheng-Wu; McGrath, Annette; Morahan, Grant; Murray, Jeremy; Weller, James; Jian, Jianbo; Singh, Karam B

2017-03-01

Lupins are important grain legume crops that form a critical part of sustainable farming systems, reducing fertilizer use and providing disease breaks. They have a basal phylogenetic position relative to other crop and model legumes and a high speciation rate. Narrow-leafed lupin (NLL; Lupinus angustifolius L.) is gaining popularity as a health food, which is high in protein and dietary fibre but low in starch and gluten-free. We report the draft genome assembly (609 Mb) of NLL cultivar Tanjil, which has captured >98% of the gene content, sequences of additional lines and a dense genetic map. Lupins are unique among legumes and differ from most other land plants in that they do not form mycorrhizal associations. Remarkably, we find that NLL has lost all mycorrhiza-specific genes, but has retained genes commonly required for mycorrhization and nodulation. In addition, the genome also provided candidate genes for key disease resistance and domestication traits. We also find evidence of a whole-genome triplication around 25 million years ago in the genistoid lineage leading to Lupinus. Our results will support detailed studies of legume evolution and accelerate lupin breeding programmes. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Comparative genome analysis of Burkholderia phytofirmans PsJN reveals a wide spectrum of endophytic lifestyles based on interaction strategies with host plants

PubMed Central

Mitter, Birgit; Petric, Alexandra; Shin, Maria W.; Chain, Patrick S. G.; Hauberg-Lotte, Lena; Reinhold-Hurek, Barbara; Nowak, Jerzy; Sessitsch, Angela

2013-01-01

Burkholderia phytofirmans PsJN is a naturally occurring plant-associated bacterial endophyte that effectively colonizes a wide range of plants and stimulates their growth and vitality. Here we analyze the whole genomes of PsJN and of eight other endophytic bacteria. This study illustrates that a wide spectrum of endophytic lifestyles exists. Although we postulate the existence of typical endophytic traits, no unique gene cluster could be exclusively linked to the endophytic lifestyle. Furthermore, our study revealed a high genetic diversity among bacterial endophytes as reflected in their genotypic and phenotypic features. B. phytofirmans PsJN is in many aspects outstanding among the selected endophytes. It has the biggest genome, consisting of two chromosomes and one plasmid, well-equipped with genes for the degradation of complex organic compounds and detoxification, e.g., 24 glutathione-S-transferase (GST) genes. Furthermore, strain PsJN has a high number of cell surface signaling and secretion systems and harbors the 3-OH-PAME quorum-sensing system that coordinates the switch from the free-living to the symbiotic lifestyle in the plant pathogen R. solanacearum. The ability of B. phytofirmans PsJN to successfully colonize such a wide variety of plant species might be based on its large genome harboring a broad range of physiological functions. PMID:23641251
BuD, a helix–loop–helix DNA-binding domain for genome modification

PubMed Central

Stella, Stefano; Molina, Rafael; López-Méndez, Blanca; Juillerat, Alexandre; Bertonati, Claudia; Daboussi, Fayza; Campos-Olivas, Ramon; Duchateau, Phillippe; Montoya, Guillermo

2014-01-01

DNA editing offers new possibilities in synthetic biology and biomedicine for modulation or modification of cellular functions to organisms. However, inaccuracy in this process may lead to genome damage. To address this important problem, a strategy allowing specific gene modification has been achieved through the addition, removal or exchange of DNA sequences using customized proteins and the endogenous DNA-repair machinery. Therefore, the engineering of specific protein–DNA interactions in protein scaffolds is key to providing ‘toolkits’ for precise genome modification or regulation of gene expression. In a search for putative DNA-binding domains, BurrH, a protein that recognizes a 19 bp DNA target, was identified. Here, its apo and DNA-bound crystal structures are reported, revealing a central region containing 19 repeats of a helix–loop–helix modular domain (BurrH domain; BuD), which identifies the DNA target by a single residue-to-nucleotide code, thus facilitating its redesign for gene targeting. New DNA-binding specificities have been engineered in this template, showing that BuD-derived nucleases (BuDNs) induce high levels of gene targeting in a locus of the human haemoglobin β (HBB) gene close to mutations responsible for sickle-cell anaemia. Hence, the unique combination of high efficiency and specificity of the BuD arrays can push forward diverse genome-modification approaches for cell or organism redesign, opening new avenues for gene editing. PMID:25004980
Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis.

PubMed

Mehdizadeh Gohari, Iman; Kropinski, Andrew M; Weese, Scott J; Parreira, Valeria R; Whitehead, Ashley E; Boerlin, Patrick; Prescott, John F

2016-01-01

The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus.
Predicting Protein Function by Genomic Context: Quantitative Evaluation and Qualitative Inferences

PubMed Central

Huynen, Martijn; Snel, Berend; Lathe, Warren; Bork, Peer

2000-01-01

Various new methods have been proposed to predict functional interactions between proteins based on the genomic context of their genes. The types of genomic context that they use are Type I: the fusion of genes; Type II: the conservation of gene-order or co-occurrence of genes in potential operons; and Type III: the co-occurrence of genes across genomes (phylogenetic profiles). Here we compare these types for their coverage, their correlations with various types of functional interaction, and their overlap with homology-based function assignment. We apply the methods to Mycoplasma genitalium, the standard benchmarking genome in computational and experimental genomics. Quantitatively, conservation of gene order is the technique with the highest coverage, applying to 37% of the genes. By combining gene order conservation with gene fusion (6%), the co-occurrence of genes in operons in absence of gene order conservation (8%), and the co-occurrence of genes across genomes (11%), significant context information can be obtained for 50% of the genes (the categories overlap). Qualitatively, we observe that the functional interactions between genes are stronger as the requirements for physical neighborhood on the genome are more stringent, while the fraction of potential false positives decreases. Moreover, only in cases in which gene order is conserved in a substantial fraction of the genomes, in this case six out of twenty-five, does a single type of functional interaction (physical interaction) clearly dominate (>80%). In other cases, complementary function information from homology searches, which is available for most of the genes with significant genomic context, is essential to predict the type of interaction. Using a combination of genomic context and homology searches, new functional features can be predicted for 10% of M. genitalium genes. PMID:10958638
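The Type III context described in the record above (co-occurrence of genes across genomes, i.e. phylogenetic profiles) can be illustrated with a minimal sketch. The gene names, the six-genome presence/absence vectors, the Jaccard score and the 0.8 threshold are all assumptions made for illustration; the abstract does not state how the authors scored profile similarity.

```python
from itertools import combinations

# Hypothetical presence/absence profiles: one entry per genome (toy data,
# not taken from the study). 1 = gene present in that genome, 0 = absent.
profiles = {
    "geneA": (1, 1, 0, 1, 0, 1),
    "geneB": (1, 1, 0, 1, 0, 1),
    "geneC": (0, 1, 1, 0, 1, 0),
    "geneD": (1, 1, 0, 1, 1, 1),
}


def jaccard(p, q):
    """Genomes containing both genes divided by genomes containing either."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def cooccurring_pairs(profiles, threshold=0.8):
    """Return (gene1, gene2, score) for pairs with profile similarity >= threshold."""
    pairs = []
    for (g1, p1), (g2, p2) in combinations(sorted(profiles.items()), 2):
        score = jaccard(p1, p2)
        if score >= threshold:
            pairs.append((g1, g2, round(score, 2)))
    return pairs


result = cooccurring_pairs(profiles)
print(result)
```

Pairs that pass the threshold would be candidates for a functional link; as the abstract notes, homology information is usually still needed to say what kind of interaction it is.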
Genome sequence of an aflatoxigenic pathogen of Argentinian peanut, Aspergillus arachidicola

USDA-ARS's Scientific Manuscript database

In this study we sequenced the genome of the A. arachidicola Type strain (CBS 117610) and found its genome size to be 38.9 Mb, and its number of predicted genes to be 12,091, which are values comparable to those in other sequenced Aspergilli. Of its predicted genes, 691 were identified as unique to ...
Complete Genome Sequences of Bacillus Phages Janet and OTooleKemple52.

PubMed

Kent, Brenna; Raymond, Thomas; Mosier, Philip D; Johnson, Allison A

2018-05-10

We report here the genome sequences of two novel Bacillus cereus group-infecting bacteriophages, Janet and OTooleKemple52. These bacteriophages are double-stranded DNA-containing Myoviridae isolated from soil samples. While their genomes share a high degree of sequence identity with one another, their host preferences are unique. Copyright © 2018 Kent et al.
Odonata (dragonflies and damselflies) as a bridge between ecology and evolutionary genomics.

PubMed

Bybee, Seth; Córdoba-Aguilar, Alex; Duryea, M Catherine; Futahashi, Ryo; Hansson, Bengt; Lorenzo-Carballa, M Olalla; Schilder, Ruud; Stoks, Robby; Suvorov, Anton; Svensson, Erik I; Swaegers, Janne; Takahashi, Yuma; Watts, Phillip C; Wellenreuther, Maren

2016-01-01

Odonata (dragonflies and damselflies) present an unparalleled insect model to integrate evolutionary genomics with ecology for the study of insect evolution. Key features of Odonata include their ancient phylogenetic position, extensive phenotypic and ecological diversity, several unique evolutionary innovations, ease of study in the wild and usefulness as bioindicators for freshwater ecosystems worldwide. In this review, we synthesize studies on the evolution, ecology and physiology of odonates, highlighting those areas where the integration of ecology with genomics would yield significant insights into the evolutionary processes that would not be gained easily by working on other animal groups. We argue that the unique features of this group combined with their complex life cycle, flight behaviour, diversity in ecological niches and their sensitivity to anthropogenic change make odonates a promising and fruitful taxon for genomics focused research. Future areas of research that deserve increased attention are also briefly outlined.
Transposable elements in Drosophila.

PubMed

McCullers, Tabitha J; Steiniger, Mindy

2017-01-01

Transposable elements (TEs) are mobile genetic elements that can mobilize within host genomes. As TEs comprise more than 40% of the human genome and are linked to numerous diseases, understanding their mechanisms of mobilization and regulation is important. Drosophila melanogaster is an ideal model organism for the study of eukaryotic TEs as its genome contains a diverse array of active TEs. TEs universally impact host genome size via transposition and deletion events, but may also adopt unique functional roles in host organisms. There are 2 main classes of TEs: DNA transposons and retrotransposons. These classes are further divided into subgroups of TEs with unique structural and functional characteristics, demonstrating the significant variability among these elements. Despite this variability, D. melanogaster and other eukaryotic organisms utilize conserved mechanisms to regulate TEs. This review focuses on the transposition mechanisms and regulatory pathways of TEs, and their functional roles in D. melanogaster.
Transposable elements in Drosophila

PubMed Central

McCullers, Tabitha J.; Steiniger, Mindy

2017-01-01

Transposable elements (TEs) are mobile genetic elements that can mobilize within host genomes. As TEs comprise more than 40% of the human genome and are linked to numerous diseases, understanding their mechanisms of mobilization and regulation is important. Drosophila melanogaster is an ideal model organism for the study of eukaryotic TEs as its genome contains a diverse array of active TEs. TEs universally impact host genome size via transposition and deletion events, but may also adopt unique functional roles in host organisms. There are 2 main classes of TEs: DNA transposons and retrotransposons. These classes are further divided into subgroups of TEs with unique structural and functional characteristics, demonstrating the significant variability among these elements. Despite this variability, D. melanogaster and other eukaryotic organisms utilize conserved mechanisms to regulate TEs. This review focuses on the transposition mechanisms and regulatory pathways of TEs, and their functional roles in D. melanogaster. PMID:28580197
Intrinsically-disordered N-termini in human parechovirus 1 capsid proteins bind encapsidated RNA.

PubMed

Shakeel, Shabih; Evans, James D; Hazelbaker, Mark; Kao, C Cheng; Vaughan, Robert C; Butcher, Sarah J

2018-04-11

Human parechoviruses (HPeV) are picornaviruses with a highly-ordered RNA genome contained within icosahedrally-symmetric capsids. Ordered RNA structures have recently been shown to interact with capsid proteins VP1 and VP3 and facilitate virus assembly in HPeV1. Using an assay that combines reversible cross-linking, RNA affinity purification and peptide mass fingerprinting (RCAP), we mapped the RNA-interacting regions of the capsid proteins from the whole HPeV1 virion in solution. The intrinsically-disordered N-termini of capsid proteins VP1 and VP3, and unexpectedly, VP0, were found to interact with RNA. Comparing these results to those obtained using recombinantly-expressed VP0 and VP1 confirmed the virion binding regions, and revealed unique RNA binding regions in the isolated VP0 not previously observed in the crystal structure of HPeV1. We used RNA fluorescence anisotropy to confirm the RNA-binding competency of each of the capsid proteins' N-termini. These findings suggest that dynamic interactions between the viral RNA and the capsid proteins modulate virus assembly, and suggest a novel role for VP0.
Crowd Sourcing a New Paradigm for Interactome Driven Drug Target Identification in Mycobacterium tuberculosis

PubMed Central

Rohira, Harsha; Bhat, Ashwini G.; Passi, Anurag; Mukherjee, Keya; Choudhary, Kumari Sonal; Kumar, Vikas; Arora, Anshula; Munusamy, Prabhakaran; Subramanian, Ahalyaa; Venkatachalam, Aparna; S, Gayathri; Raj, Sweety; Chitra, Vijaya; Verma, Kaveri; Zaheer, Salman; J, Balaganesh; Gurusamy, Malarvizhi; Razeeth, Mohammed; Raja, Ilamathi; Thandapani, Madhumohan; Mevada, Vishal; Soni, Raviraj; Rana, Shruti; Ramanna, Girish Muthagadhalli; Raghavan, Swetha; Subramanya, Sunil N.; Kholia, Trupti; Patel, Rajesh; Bhavnani, Varsha; Chiranjeevi, Lakavath; Sengupta, Soumi; Singh, Pankaj Kumar; Atray, Naresh; Gandhi, Swati; Avasthi, Tiruvayipati Suma; Nisthar, Shefin; Anurag, Meenakshi; Sharma, Pratibha; Hasija, Yasha; Dash, Debasis; Sharma, Arun; Scaria, Vinod; Thomas, Zakir; Chandra, Nagasuma; Brahmachari, Samir K.; Bhardwaj, Anshu

2012-01-01

A decade since the availability of Mycobacterium tuberculosis (Mtb) genome sequence, no promising drug has seen the light of the day. This not only indicates the challenges in discovering new drugs but also suggests a gap in our current understanding of Mtb biology. We attempt to bridge this gap by carrying out extensive re-annotation and constructing a systems level protein interaction map of Mtb with an objective of finding novel drug target candidates. Towards this, we synergized crowd sourcing and social networking methods through an initiative ‘Connect to Decode’ (C2D) to generate the first and largest manually curated interactome of Mtb termed ‘interactome pathway’ (IPW), encompassing a total of 1434 proteins connected through 2575 functional relationships. Interactions leading to gene regulation, signal transduction, metabolism, structural complex formation have been catalogued. In the process, we have functionally annotated 87% of the Mtb genome in context of gene products. We further combine IPW with STRING based network to report central proteins, which may be assessed as potential drug targets for development of drugs with least possible side effects. The fact that five of the 17 predicted drug targets are already experimentally validated either genetically or biochemically lends credence to our unique approach. PMID:22808064

Transcript map of the Ovum mutant (Om) locus: isolation by exon trapping of new candidate genes for the DDK syndrome.

PubMed

Le Bras, Stéphanie; Cohen-Tannoudji, Michel; Guyot, Valérie; Vandormael-Pournin, Sandrine; Coumailleau, Franck; Babinet, Charles; Baldacci, Patricia

2002-08-21

The DDK syndrome is defined as the embryonic lethality of F1 mouse embryos from crosses between DDK females and males from other strains (named hereafter as non-DDK strains). Genetically controlled by the Ovum mutant (Om) locus, it is due to a deleterious interaction between a maternal factor present in DDK oocytes and the non-DDK paternal pronucleus. Therefore, the DDK syndrome constitutes a unique genetic tool to study the crucial interactions that take place between the parental genomes and the egg cytoplasm during mammalian development. In this paper, we present an extensive analysis performed by exon trapping on the Om region. Twenty-seven trapped sequences were from genes in the databases: beta-adaptin, CCT zeta2, DNA LigaseIII, Notchless, Rad51l3 and Scya1. Twenty-eight other sequences presented similarities with expressed sequence tags and genomic sequences whereas 57 did not. The pattern of expression of 37 of these markers was established. Importantly, five of them are expressed in DDK oocytes and are candidate genes for the maternal factor, and 20 are candidate genes for the paternal factor since they are expressed in testis. This data is an important step towards identifying the genes responsible for the DDK syndrome.
From Wolves to Dogs, and Back: Genetic Composition of the Czechoslovakian Wolfdog.

PubMed

Smetanová, Milena; Černá Bolfíková, Barbora; Randi, Ettore; Caniglia, Romolo; Fabbri, Elena; Galaverni, Marco; Kutal, Miroslav; Hulva, Pavel

2015-01-01

The Czechoslovakian Wolfdog is a unique dog breed that originated from hybridization between German Shepherds and wild Carpathian wolves in the 1950s as a military experiment. This breed was used for guarding the Czechoslovakian borders during the cold war and is currently kept by civilian breeders all round the world. The aim of our study was to characterize, for the first time, the genetic composition of this breed in relation to its known source populations. We sequenced the hypervariable part of the mtDNA control region and genotyped the Amelogenin gene, four sex-linked microsatellites and 39 autosomal microsatellites in 79 Czechoslovakian Wolfdogs, 20 German Shepherds and 28 Carpathian wolves. We performed a range of population genetic analyses based on both empirical and simulated data. Only two mtDNA and two Y-linked haplotypes were found in Czechoslovakian Wolfdogs. Both mtDNA haplotypes were of domestic origin, while only one of the Y-haplotypes was shared with German Shepherds and the other was unique to Czechoslovakian Wolfdogs. The observed inbreeding coefficient was low despite the small effective population size of the breed, possibly due to heterozygote advantages determined by introgression of wolf alleles. Moreover, Czechoslovakian Wolfdog genotypes were distinct from both parental populations, indicating the role of founder effect, drift and/or genetic hitchhiking. The results revealed the peculiar genetic composition of the Czechoslovakian Wolfdog, showing a limited introgression of wolf alleles within a higher proportion of the dog genome, consistent with the reiterated backcrossing used in the pedigree. Artificial selection aiming to keep wolf-like phenotypes but dog-like behavior resulted in a distinctive genetic composition of Czechoslovakian Wolfdogs, which provides a unique example to study the interactions between dog and wolf genomes.
Conserved small mRNA with an unique, extended Shine-Dalgarno sequence

PubMed Central

Hahn, Julia; Migur, Anzhela; von Boeselager, Raphael Freiherr; Kubatova, Nina; Kubareva, Elena; Schwalbe, Harald

2017-01-01

Up to now, very small protein-coding genes have remained unrecognized in sequenced genomes. We identified an mRNA of 165 nucleotides (nt), which is conserved in Bradyrhizobiaceae and encodes a polypeptide with 14 amino acid residues (aa). The small mRNA harboring a unique Shine-Dalgarno sequence (SD) with a length of 17 nt was localized predominantly in the ribosome-containing P100 fraction of Bradyrhizobium japonicum USDA 110. Strong interaction between the mRNA and 30S ribosomal subunits was demonstrated by their co-sedimentation in a sucrose density gradient. Using translational fusions with egfp, we detected weak translation and found that it is impeded by both the extended SD and the GTG start codon (instead of ATG). Biophysical characterization (CD- and NMR-spectroscopy) showed that the synthesized polypeptide remained unstructured in physiological buffer. Replacement of the start codon by a stop codon increased the stability of the transcript, strongly suggesting additional posttranscriptional regulation at the ribosome. Therefore, the small gene was named rreB (ribosome-regulated expression in Bradyrhizobiaceae). Assuming that the unique ribosome binding site (RBS) is a hallmark of rreB homologs or similarly regulated genes, we looked for similar putative RBS in bacterial genomes and detected regions with at least 16 nt complementarity to the 3′-end of 16S rRNA upstream of sORFs in Caulobacterales, Rhizobiales, Rhodobacterales and Rhodospirillales. In the Rhodobacter/Roseobacter lineage of α-proteobacteria the corresponding gene (rreR) is conserved and encodes an 18 aa protein. This shows how specific RBS features can be used to identify new genes with presumably similar control of expression at the RNA level. PMID:27834614
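The search described in the record above (regions with long complementarity to the 3′-end of 16S rRNA upstream of small ORFs) can be sketched as a longest-perfect-match scan. This is a simplified illustration and not the authors' pipeline: the function names are hypothetical, both sequences are toy strings rather than Bradyrhizobium data, and only Watson-Crick pairs are counted (G-U wobble pairs are ignored).

```python
# Watson-Crick pairing only; G-U wobble is deliberately ignored (assumption).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def longest_sd_match(upstream, rrna_tail):
    """Longest stretch of `upstream` (mRNA, 5'->3') that pairs perfectly
    with some part of `rrna_tail` (16S rRNA 3' end, 5'->3').

    Returns (length, start_index_in_upstream); start is -1 if nothing pairs.
    """
    target = reverse_complement(rrna_tail)  # ideal SD as it would read on the mRNA
    best_len, best_start = 0, -1
    prev = [0] * (len(target) + 1)  # longest-common-substring DP, row by row
    for i, base in enumerate(upstream, 1):
        cur = [0] * (len(target) + 1)
        for j, t in enumerate(target, 1):
            if base == t:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_start = cur[j], i - cur[j]
        prev = cur
    return best_len, best_start


# Toy inputs chosen for illustration only (not real B. japonicum sequences).
rrna_tail = "GAUCACCUCCUUA"
upstream = "CCAUAAGGAGGUGAUAACAUG"

length, start = longest_sd_match(upstream, rrna_tail)
print(length, start, upstream[start:start + length])
```

In a genome-wide scan one would apply such a function to the region upstream of each candidate sORF start codon and keep hits above a length cutoff such as the 16 nt mentioned in the abstract.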
Comparative analyses of Xanthomonas and Xylella complete genomes.

PubMed

Moreira, Leandro M; De Souza, Robson F; Digiampietri, Luciano A; Da Silva, Ana C R; Setubal, João C

2005-01-01

Computational analyses of four bacterial genomes of the Xanthomonadaceae family reveal new unique genes that may be involved in adaptation, pathogenicity, and host specificity. The Xanthomonas genus presents 3636 unique genes distributed in 1470 families, while Xylella genus presents 1026 unique genes distributed in 375 families. Among Xanthomonas-specific genes, we highlight a large number of cell wall degrading enzymes, proteases, and iron receptors, a set of energy metabolism genes, second copy of the type II secretion system, type III secretion system, flagella and chemotactic machinery, and the xanthomonadin synthesis gene cluster. Important genes unique to the Xylella genus are an additional copy of a type IV pili gene cluster and the complete machinery of colicin V synthesis and secretion. Intersections of gene sets from both genera reveal a cluster of genes homologous to Salmonella's SPI-7 island in Xanthomonas axonopodis pv citri and Xylella fastidiosa 9a5c, which might be involved in host specificity. Each genome also presents important unique genes, such as an HMS cluster, the kdgT gene, and O-antigen in Xanthomonas axonopodis pv citri; a number of avrBS genes and a distinct O-antigen in Xanthomonas campestris pv campestris, a type I restriction-modification system and a nickase gene in Xylella fastidiosa 9a5c, and a type II restriction-modification system and two genes related to peptidoglycan biosynthesis in Xylella fastidiosa temecula 1. All these differences imply a considerable number of gene gains and losses during the divergence of the four lineages, and are associated with structural genome modifications that may have a direct relation with the mode of transmission, adaptation to specific environments and pathogenicity of each organism.
Secretome Analysis from the Ectomycorrhizal Ascomycete Cenococcum geophilum

PubMed Central

de Freitas Pereira, Maíra; Veneault-Fourrey, Claire; Vion, Patrice; Guinet, Fréderic; Morin, Emmanuelle; Barry, Kerrie W.; Lipzen, Anna; Singan, Vasanth; Pfister, Stephanie; Na, Hyunsoo; Kennedy, Megan; Egli, Simon; Grigoriev, Igor; Martin, Francis; Kohler, Annegret; Peter, Martina

2018-01-01

Cenococcum geophilum is an ectomycorrhizal fungus with global distribution in numerous habitats and associates with a large range of host species including gymnosperm and angiosperm trees. Moreover, C. geophilum is the unique ectomycorrhizal species within the clade Dothideomycetes, the largest class of Ascomycetes containing predominantly saprotrophic and many devastating phytopathogenic fungi. Recent studies highlight that mycorrhizal fungi, as pathogenic ones, use effectors in the form of Small Secreted Proteins (SSPs) as molecular keys to promote symbiosis. In order to better understand the biotic interaction of C. geophilum with its host plants, the goal of this work was to characterize mycorrhiza-induced small-secreted proteins (MiSSPs) that potentially play a role in the ectomycorrhiza formation and functioning of this ecologically very important species. We combined different approaches such as gene expression profiling, genome localization and conservation of MiSSP genes in different C. geophilum strains and closely related species as well as protein subcellular localization studies of potential targets of MiSSPs in interacting plants using tobacco leaf cells. Gene expression analyses of C. geophilum interacting with Pinus sylvestris (pine) and Populus tremula × Populus alba (poplar) showed that similar sets of genes coding for secreted proteins were up-regulated and only few were specific to each host. Whereas pine induced more carbohydrate active enzymes (CAZymes), the interaction with poplar induced the expression of specific SSPs. We identified a set of 22 MiSSPs, which are located in both gene-rich, repeat-poor and gene-sparse, repeat-rich regions of the C. geophilum genome, a genome showing a bipartite architecture as seen for some pathogens but not yet for an ectomycorrhizal fungus. Genome re-sequencing data of 15 C. geophilum strains and two close relatives Glonium stellatum and Lepidopterella palustris were used to study sequence conservation of MiSSP-encoding genes. The 22 MiSSPs showed a high presence-absence polymorphism among the studied C. geophilum strains suggesting an evolution through gene gain/gene loss. Finally, we showed that six CgMiSSPs target four distinct sub-cellular compartments such as endoplasmic reticulum, plasma membrane, cytosol and tonoplast. Overall, this work presents a comprehensive analysis of secreted proteins and MiSSPs at different genetic levels of C. geophilum, opening a valuable resource for future functional analysis. PMID:29487573
Genus-Wide Comparative Genomics of Malassezia Delineates Its Phylogeny, Physiology, and Niche Adaptation on Human Skin

PubMed Central

Wu, Guangxi; Zhao, He; Li, Chenhao; Rajapakse, Menaka Priyadarsani; Wong, Wing Cheong; Xu, Jun; Saunders, Charles W.; Reeder, Nancy L.; Reilman, Raymond A.; Scheynius, Annika; Sun, Sheng; Billmyre, Blake Robert; Li, Wenjun; Averette, Anna Floyd; Mieczkowski, Piotr; Heitman, Joseph; Theelen, Bart; Schröder, Markus S.; De Sessions, Paola Florez; Butler, Geraldine; Maurer-Stroh, Sebastian; Boekhout, Teun; Nagarajan, Niranjan; Dawson, Thomas L.

2015-01-01

Malassezia is a unique lipophilic genus in class Malasseziomycetes in Ustilaginomycotina, (Basidiomycota, fungi) that otherwise consists almost exclusively of plant pathogens. Malassezia are typically isolated from warm-blooded animals, are dominant members of the human skin mycobiome and are associated with common skin disorders. To characterize the genetic basis of the unique phenotypes of Malassezia spp., we sequenced the genomes of all 14 accepted species and used comparative genomics against a broad panel of fungal genomes to comprehensively identify distinct features that define the Malassezia gene repertoire: gene gain and loss; selection signatures; and lineage-specific gene family expansions. Our analysis revealed key gene gain events (64) with a single gene conserved across all Malassezia but absent in all other sequenced Basidiomycota. These likely horizontally transferred genes provide intriguing gain-of-function events and prime candidates to explain the emergence of Malassezia. A larger set of genes (741) were lost, with enrichment for glycosyl hydrolases and carbohydrate metabolism, concordant with adaptation to skin’s carbohydrate-deficient environment. Gene family analysis revealed extensive turnover and underlined the importance of secretory lipases, phospholipases, aspartyl proteases, and other peptidases. Combining genomic analysis with a re-evaluation of culture characteristics, we establish the likely lipid-dependence of all Malassezia. Our phylogenetic analysis sheds new light on the relationship between Malassezia and other members of Ustilaginomycotina, as well as phylogenetic lineages within the genus. Overall, our study provides a unique genomic resource for understanding Malassezia niche-specificity and potential virulence, as well as their abundance and distribution in the environment and on human skin. PMID:26539826
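The gain and loss categories in the record above (genes conserved across all Malassezia but absent from other Basidiomycota, and genes lost in the genus) map naturally onto set operations over gene-family presence tables. The sketch below uses hypothetical species labels and family identifiers, and its strict all-or-none definitions are an assumption; the abstract does not describe the criteria the authors actually applied.

```python
def lineage_specific(ingroup, outgroup):
    """Families present in every ingroup genome and absent from every outgroup genome."""
    core = set.intersection(*ingroup.values())
    seen_outside = set.union(*outgroup.values())
    return core - seen_outside


def lost_in_lineage(ingroup, outgroup):
    """Families present in every outgroup genome but found in no ingroup genome."""
    outgroup_core = set.intersection(*outgroup.values())
    seen_inside = set.union(*ingroup.values())
    return outgroup_core - seen_inside


# Toy presence tables (hypothetical names): genome -> set of gene families.
ingroup = {
    "ingroup_sp1": {"fam1", "fam2", "fam5"},
    "ingroup_sp2": {"fam1", "fam2", "fam6"},
}
outgroup = {
    "outgroup_sp1": {"fam2", "fam3", "fam4"},
    "outgroup_sp2": {"fam2", "fam3"},
}

gained = lineage_specific(ingroup, outgroup)
lost = lost_in_lineage(ingroup, outgroup)
print(sorted(gained), sorted(lost))
```

Real comparative analyses typically relax these definitions (for example, tolerating absence in a few genomes to allow for assembly gaps), which is why the thresholds here should be read as placeholders.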
Genus-Wide Comparative Genomics of Malassezia Delineates Its Phylogeny, Physiology, and Niche Adaptation on Human Skin.

PubMed

Wu, Guangxi; Zhao, He; Li, Chenhao; Rajapakse, Menaka Priyadarsani; Wong, Wing Cheong; Xu, Jun; Saunders, Charles W; Reeder, Nancy L; Reilman, Raymond A; Scheynius, Annika; Sun, Sheng; Billmyre, Blake Robert; Li, Wenjun; Averette, Anna Floyd; Mieczkowski, Piotr; Heitman, Joseph; Theelen, Bart; Schröder, Markus S; De Sessions, Paola Florez; Butler, Geraldine; Maurer-Stroh, Sebastian; Boekhout, Teun; Nagarajan, Niranjan; Dawson, Thomas L

2015-11-01

Malassezia is a unique lipophilic genus in class Malasseziomycetes in Ustilaginomycotina, (Basidiomycota, fungi) that otherwise consists almost exclusively of plant pathogens. Malassezia are typically isolated from warm-blooded animals, are dominant members of the human skin mycobiome and are associated with common skin disorders. To characterize the genetic basis of the unique phenotypes of Malassezia spp., we sequenced the genomes of all 14 accepted species and used comparative genomics against a broad panel of fungal genomes to comprehensively identify distinct features that define the Malassezia gene repertoire: gene gain and loss; selection signatures; and lineage-specific gene family expansions. Our analysis revealed key gene gain events (64) with a single gene conserved across all Malassezia but absent in all other sequenced Basidiomycota. These likely horizontally transferred genes provide intriguing gain-of-function events and prime candidates to explain the emergence of Malassezia. A larger set of genes (741) were lost, with enrichment for glycosyl hydrolases and carbohydrate metabolism, concordant with adaptation to skin's carbohydrate-deficient environment. Gene family analysis revealed extensive turnover and underlined the importance of secretory lipases, phospholipases, aspartyl proteases, and other peptidases. Combining genomic analysis with a re-evaluation of culture characteristics, we establish the likely lipid-dependence of all Malassezia. Our phylogenetic analysis sheds new light on the relationship between Malassezia and other members of Ustilaginomycotina, as well as phylogenetic lineages within the genus. Overall, our study provides a unique genomic resource for understanding Malassezia niche-specificity and potential virulence, as well as their abundance and distribution in the environment and on human skin.
Polygenic risk score, genome-wide association, and gene set analyses of cognitive domain deficits in schizophrenia.

PubMed

Nakahara, Soichiro; Medland, Sarah; Turner, Jessica A; Calhoun, Vince D; Lim, Kelvin O; Mueller, Bryon A; Bustillo, Juan R; O'Leary, Daniel S; Vaidya, Jatin G; McEwen, Sarah; Voyvodic, James; Belger, Aysenil; Mathalon, Daniel H; Ford, Judith M; Guffanti, Guia; Macciardi, Fabio; Potkin, Steven G; van Erp, Theo G M

2018-06-12

This study assessed genetic contributions to six cognitive domains, identified by the MATRICS Cognitive Consensus Battery as relevant for schizophrenia, cognition-enhancing, clinical trials. Psychiatric Genomics Consortium Schizophrenia polygenic risk scores showed significant negative correlations with each cognitive domain. Genome-wide association analyses identified loci associated with attention/vigilance (rs830786 within HNF4G), verbal memory (rs67017972 near NDUFS4), and reasoning/problem solving (rs76872642 within HDAC9). Gene set analysis identified unique and shared genes across cognitive domains. These findings suggest involvement of common and unique mechanisms across cognitive domains and may contribute to the discovery of new therapeutic targets to treat cognitive deficits in schizophrenia.
The enemy within: Targeting host–parasite interaction for antileishmanial drug discovery

PubMed Central

Lamotte, Suzanne; Späth, Gerald F.; Rachidi, Najma; Prina, Eric

2017-01-01

The state of antileishmanial chemotherapy is strongly compromised by the emergence of drug-resistant Leishmania. The evolution of drug-resistant phenotypes has been linked to the parasites’ intrinsic genome instability, with frequent gene and chromosome amplifications causing fitness gains that are directly selected by environmental factors, including the presence of antileishmanial drugs. Thus, even though the unique eukaryotic biology of Leishmania and its dependence on parasite-specific virulence factors provide valid opportunities for chemotherapeutical intervention, all strategies that target the parasite in a direct fashion are likely prone to select for resistance. Here, we review the current state of antileishmanial chemotherapy and discuss the limitations of ongoing drug discovery efforts. We finally propose new strategies that target Leishmania viability indirectly via mechanisms of host–parasite interaction, including parasite-released ectokinases and host epigenetic regulation, which modulate host cell signaling and transcriptional regulation, respectively, to establish permissive conditions for intracellular Leishmania survival. PMID:28594938
The enemy within: Targeting host-parasite interaction for antileishmanial drug discovery.

PubMed

Lamotte, Suzanne; Späth, Gerald F; Rachidi, Najma; Prina, Eric

2017-06-01

The state of antileishmanial chemotherapy is strongly compromised by the emergence of drug-resistant Leishmania. The evolution of drug-resistant phenotypes has been linked to the parasites' intrinsic genome instability, with frequent gene and chromosome amplifications causing fitness gains that are directly selected by environmental factors, including the presence of antileishmanial drugs. Thus, even though the unique eukaryotic biology of Leishmania and its dependence on parasite-specific virulence factors provide valid opportunities for chemotherapeutical intervention, all strategies that target the parasite in a direct fashion are likely prone to select for resistance. Here, we review the current state of antileishmanial chemotherapy and discuss the limitations of ongoing drug discovery efforts. We finally propose new strategies that target Leishmania viability indirectly via mechanisms of host-parasite interaction, including parasite-released ectokinases and host epigenetic regulation, which modulate host cell signaling and transcriptional regulation, respectively, to establish permissive conditions for intracellular Leishmania survival.
A Phyletically Rare Gene Promotes the Niche-specific Fitness of an E. coli Pathogen during Bacteremia

PubMed Central

Wiles, Travis J.; Lewis, Adam J.; Mobley, Harry L. T.; Casjens, Sherwood R.; Mulvey, Matthew A.

2013-01-01

In bacteria, laterally acquired genes are often concentrated within chromosomal regions known as genomic islands. Using a recently developed zebrafish infection model, we set out to identify unique factors encoded within genomic islands that contribute to the fitness and virulence of a reference urosepsis isolate—extraintestinal pathogenic Escherichia coli strain CFT073. By screening a series of deletion mutants, we discovered a previously uncharacterized gene, neaT, that is conditionally required by the pathogen during systemic infections. In vitro assays indicate that neaT can limit bacterial interactions with host phagocytes and alter the aggregative properties of CFT073. The neaT gene is localized within an integrated P2-like bacteriophage in CFT073, but was rarely found within other proteobacterial genomes. Sequence-based analyses revealed that neaT homologues are present, but discordantly conserved, within a phyletically diverse set of bacterial species. In CFT073, neaT appears to be unameliorated, having an exceptionally A+T-rich composition along with a notably altered codon bias. These data suggest that neaT was recently brought into the proteobacterial pan-genome from an extra-phyletic source. Interestingly, even in G+C-poor genomes, as found within the Firmicutes lineage, neaT-like genes are often unameliorated. Sequence-level features of neaT homologues challenge the common supposition that the A+T-rich nature of many recently acquired genes reflects the nucleotide composition of their genomes of origin. In total, these findings highlight the complexity of the evolutionary forces that can affect the acquisition, utilization, and assimilation of rare genes that promote the niche-dependent fitness and virulence of a bacterial pathogen. PMID:23459509
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

PubMed Central

Seaver, Samuel M. D.; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M. T.; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D.; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D.; Henry, Christopher S.

2014-01-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today’s annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed. PMID:24927599
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource.

PubMed

Seaver, Samuel M D; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M T; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D; Henry, Christopher S

2014-07-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.
Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies

PubMed Central

2014-01-01

Background: The size and complexity of conifer genomes has, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. Results: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly was used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. Conclusions: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied. PMID:24647006
Interactive or static reports to guide clinical interpretation of cancer genomics.

PubMed

Gray, Stacy W; Gagan, Jeffrey; Cerami, Ethan; Cronin, Angel M; Uno, Hajime; Oliver, Nelly; Lowenstein, Carol; Lederman, Ruth; Revette, Anna; Suarez, Aaron; Lee, Charlotte; Bryan, Jordan; Sholl, Lynette; Van Allen, Eliezer M

2018-05-01

Misinterpretation of complex genomic data presents a major challenge in the implementation of precision oncology. We sought to determine whether interactive genomic reports with embedded clinician education and optimized data visualization improved genomic data interpretation. We conducted a randomized, vignette-based survey study to determine whether exposure to interactive reports for a somatic gene panel, as compared to static reports, improves physicians' genomic comprehension and report-related satisfaction (overall scores calculated across 3 vignettes, range 0-18 and 1-4, respectively, higher score corresponding with improved endpoints). One hundred and five physicians at a tertiary cancer center participated (29% participation rate): 67% medical, 20% pediatric, 7% radiation, and 7% surgical oncology; 37% female. Prior to viewing the case-based vignettes, 34% of the physicians reported difficulty making treatment recommendations based on the standard static report. After vignette/report exposure, physicians' overall comprehension scores did not differ by report type (mean score: interactive 11.6 vs static 10.5, difference = 1.1, 95% CI, -0.3, 2.5, P = .13). However, physicians exposed to the interactive report were more likely to correctly assess sequencing quality (P < .001) and understand when reports needed to be interpreted with caution (eg, low tumor purity; P = .02). Overall satisfaction scores were higher in the interactive group (mean score 2.5 vs 2.1, difference = 0.4, 95% CI, 0.2-0.7, P = .001). Interactive genomic reports may improve physicians' ability to accurately assess genomic data and increase report-related satisfaction. Additional research in users' genomic needs and efforts to integrate interactive reports into electronic health records may facilitate the implementation of precision oncology.
Genomic survey of pathogenicity determinants and VNTR markers in the cassava bacterial pathogen Xanthomonas axonopodis pv. Manihotis strain CIO151.

PubMed

Arrieta-Ortiz, Mario L; Rodríguez-R, Luis M; Pérez-Quintero, Álvaro L; Poulin, Lucie; Díaz, Ana C; Arias Rojas, Nathalia; Trujillo, Cesar; Restrepo Benavides, Mariana; Bart, Rebecca; Boch, Jens; Boureau, Tristan; Darrasse, Armelle; David, Perrine; Dugé de Bernonville, Thomas; Fontanilla, Paula; Gagnevin, Lionel; Guérin, Fabien; Jacques, Marie-Agnès; Lauber, Emmanuelle; Lefeuvre, Pierre; Medina, Cesar; Medina, Edgar; Montenegro, Nathaly; Muñoz Bodnar, Alejandra; Noël, Laurent D; Ortiz Quiñones, Juan F; Osorio, Daniela; Pardo, Carolina; Patil, Prabhu B; Poussier, Stéphane; Pruvost, Olivier; Robène-Soustrade, Isabelle; Ryan, Robert P; Tabima, Javier; Urrego Morales, Oscar G; Vernière, Christian; Carrere, Sébastien; Verdier, Valérie; Szurek, Boris; Restrepo, Silvia; López, Camilo; Koebnik, Ralf; Bernal, Adriana

2013-01-01

Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi-locus VNTR analysis scheme for epidemiological surveillance of this disease.
Genomic Survey of Pathogenicity Determinants and VNTR Markers in the Cassava Bacterial Pathogen Xanthomonas axonopodis pv. Manihotis Strain CIO151

PubMed Central

Arrieta-Ortiz, Mario L.; Rodríguez-R, Luis M.; Pérez-Quintero, Álvaro L.; Poulin, Lucie; Díaz, Ana C.; Arias Rojas, Nathalia; Trujillo, Cesar; Restrepo Benavides, Mariana; Bart, Rebecca; Boch, Jens; Boureau, Tristan; Darrasse, Armelle; David, Perrine; Dugé de Bernonville, Thomas; Fontanilla, Paula; Gagnevin, Lionel; Guérin, Fabien; Jacques, Marie-Agnès; Lauber, Emmanuelle; Lefeuvre, Pierre; Medina, Cesar; Medina, Edgar; Montenegro, Nathaly; Muñoz Bodnar, Alejandra; Noël, Laurent D.; Ortiz Quiñones, Juan F.; Osorio, Daniela; Pardo, Carolina; Patil, Prabhu B.; Poussier, Stéphane; Pruvost, Olivier; Robène-Soustrade, Isabelle; Ryan, Robert P.; Tabima, Javier; Urrego Morales, Oscar G.; Vernière, Christian; Carrere, Sébastien; Verdier, Valérie; Szurek, Boris; Restrepo, Silvia; López, Camilo; Koebnik, Ralf; Bernal, Adriana

2013-01-01

Xanthomonas axonopodis pv. manihotis (Xam) is the causal agent of bacterial blight of cassava, which is among the main components of human diet in Africa and South America. Current information about the molecular pathogenicity factors involved in the infection process of this organism is limited. Previous studies in other bacteria in this genus suggest that advanced draft genome sequences are valuable resources for molecular studies on their interaction with plants and could provide valuable tools for diagnostics and detection. Here we have generated the first manually annotated high-quality draft genome sequence of Xam strain CIO151. Its genomic structure is similar to that of other xanthomonads, especially Xanthomonas euvesicatoria and Xanthomonas citri pv. citri species. Several putative pathogenicity factors were identified, including type III effectors, cell wall-degrading enzymes and clusters encoding protein secretion systems. Specific characteristics in this genome include changes in the xanthomonadin cluster that could explain the lack of typical yellow color in all strains of this pathovar and the presence of 50 regions in the genome with atypical nucleotide composition. The genome sequence was used to predict and evaluate 22 variable number of tandem repeat (VNTR) loci that were subsequently demonstrated as polymorphic in representative Xam strains. Our results demonstrate that Xanthomonas axonopodis pv. manihotis strain CIO151 possesses ten clusters of pathogenicity factors conserved within the genus Xanthomonas. We report 126 genes that are potentially unique to Xam, as well as potential horizontal transfer events in the history of the genome. The relation of these regions with virulence and pathogenicity could explain several aspects of the biology of this pathogen, including its ability to colonize both vascular and non-vascular tissues of cassava plants. A set of 16 robust, polymorphic VNTR loci will be useful to develop a multi-locus VNTR analysis scheme for epidemiological surveillance of this disease. PMID:24278159
Deciphering the Cryptic Genome: Genome-wide Analyses of the Rice Pathogen Fusarium fujikuroi Reveal Complex Regulation of Secondary Metabolism and Novel Metabolites

PubMed Central

Studt, Lena; Niehaus, Eva-Maria; Espino, Jose J.; Huß, Kathleen; Michielse, Caroline B.; Albermann, Sabine; Wagner, Dominik; Bergner, Sonja V.; Connolly, Lanelle R.; Fischer, Andreas; Reuter, Gunter; Kleigrewe, Karin; Bald, Till; Wingfield, Brenda D.; Ophir, Ron; Freeman, Stanley; Hippler, Michael; Smith, Kristina M.; Brown, Daren W.; Proctor, Robert H.; Münsterkötter, Martin; Freitag, Michael; Humpf, Hans-Ulrich; Güldener, Ulrich; Tudzynski, Bettina

2013-01-01

The fungus Fusarium fujikuroi causes “bakanae” disease of rice due to its ability to produce gibberellins (GAs), but it is also known for producing harmful mycotoxins. However, the genetic capacity for the whole arsenal of natural compounds and their role in the fungus' interaction with rice remained unknown. Here, we present a high-quality genome sequence of F. fujikuroi that was assembled into 12 scaffolds corresponding to the 12 chromosomes described for the fungus. We used the genome sequence along with ChIP-seq, transcriptome, proteome, and HPLC-FTMS-based metabolome analyses to identify the potential secondary metabolite biosynthetic gene clusters and to examine their regulation in response to nitrogen availability and plant signals. The results indicate that expression of most but not all gene clusters correlate with proteome and ChIP-seq data. Comparison of the F. fujikuroi genome to those of six other fusaria revealed that only a small number of gene clusters are conserved among these species, thus providing new insights into the divergence of secondary metabolism in the genus Fusarium. Noteworthy, GA biosynthetic genes are present in some related species, but GA biosynthesis is limited to F. fujikuroi, suggesting that this provides a selective advantage during infection of the preferred host plant rice. Among the genome sequences analyzed, one cluster that includes a polyketide synthase gene (PKS19) and another that includes a non-ribosomal peptide synthetase gene (NRPS31) are unique to F. fujikuroi. The metabolites derived from these clusters were identified by HPLC-FTMS-based analyses of engineered F. fujikuroi strains overexpressing cluster genes. In planta expression studies suggest a specific role for the PKS19-derived product during rice infection. Thus, our results indicate that combined comparative genomics and genome-wide experimental analyses identified novel genes and secondary metabolites that contribute to the evolutionary success of F. fujikuroi as a rice pathogen. PMID:23825955
Personalized medicine, genomics, and pharmacogenomics: a primer for nurses.

PubMed

Blix, Andrew

2014-08-01

Personalized medicine is the study of patients' unique environmental influences as well as the totality of their genetic code (their genome) to tailor personalized risk assessments, diagnoses, prognoses, and treatments. The study of how patients' genomes affect responses to medications, or pharmacogenomics, is a related field. Personalized medicine and genomics are particularly relevant in oncology because of the genetic basis of cancer. Nurses need to understand related issues such as the role of genetic and genomic counseling, the ethical and legal questions surrounding genomics, and the growing direct-to-consumer genomics industry. As genomics research is incorporated into health care, nurses need to understand the technology to provide advocacy and education for patients and their families.
A diversity study of Saccharomycopsis fibuligera in rice wine starter nuruk, reveals the evolutionary process associated with its interspecies hybrid.

PubMed

Farh, Mohamed El-Agamy; Cho, Yunjoo; Lim, Jae Yun; Seo, Jeong-Ah

2017-05-01

The amylolytic yeast Saccharomycopsis fibuligera is the predominant yeast in the starter product, nuruk, which is utilized for rice wine production in South Korea. Latest molecular studies explore a recently developed interspecific hybridization among strains of S. fibuligera with a unique genetic feature. However, the origin of the natural hybridization occurrence is still unclear. Thus, to respectively distinguish parental and hybrid strains, specific primer sets were applied on 141 yeast strains isolated from different nuruk samples fermented in different provinces. Sixty-seven strains were defined accordingly as parental species with genome A while 8 strains were defined as hybrid strains. Unexpectedly, another parental species with genome B could not be found among the strain pools yet. Furthermore, it was observed that hybrid strains are phenotypically different from A genome strains; asci containing tetrad ascospores were observed in A genome strains more frequently than in hybrid strains. Nevertheless, hybrid strains were slightly more thermotolerant than A genome strains. Interestingly, all hybrid strains were located only in Jeju province. Based on these sets of data, we speculated that the unique climate of Jeju province might play an evolutionary role in the interspecific hybridization between A genome strains, as well as the unculturable allopatric B genome strains.

Global Organization of a Positive-strand RNA Virus Genome

PubMed Central

Wu, Baodong; Grigull, Jörg; Ore, Moriam O.; Morin, Sylvie; White, K. Andrew

2013-01-01

The genomes of plus-strand RNA viruses contain many regulatory sequences and structures that direct different viral processes. The traditional view of these RNA elements is as local structures present in non-coding regions. However, this view is changing due to the discovery of regulatory elements in coding regions and functional long-range intra-genomic base pairing interactions. The ∼4.8 kb long RNA genome of the tombusvirus tomato bushy stunt virus (TBSV) contains these types of structural features, including six different functional long-distance interactions. We hypothesized that to achieve these multiple interactions this viral genome must utilize a large-scale organizational strategy and, accordingly, we sought to assess the global conformation of the entire TBSV genome. Atomic force micrographs of the genome indicated a mostly condensed structure composed of interconnected protrusions extending from a central hub. This configuration was consistent with the genomic secondary structure model generated using high-throughput selective 2′-hydroxyl acylation analysed by primer extension (i.e. SHAPE), which predicted different sized RNA domains originating from a central region. Known RNA elements were identified in both domain and inter-domain regions, and novel structural features were predicted and functionally confirmed. Interestingly, only two of the six long-range interactions known to form were present in the structural model. However, for those interactions that did not form, complementary partner sequences were positioned relatively close to each other in the structure, suggesting that the secondary structure level of viral genome structure could provide a basic scaffold for the formation of different long-range interactions. The higher-order structural model for the TBSV RNA genome provides a snapshot of the complex framework that allows multiple functional components to operate in concert within a confined context. PMID:23717202
Informing the Design of Direct-to-Consumer Interactive Personal Genomics Reports

PubMed Central

Shaer, Orit; Nov, Oded; Okerlund, Johanna; Balestra, Martina; Stowell, Elizabeth; Ascher, Laura; Bi, Joanna; Schlenker, Claire; Ball, Madeleine

2015-01-01

Background: In recent years, people who sought direct-to-consumer genetic testing services have been increasingly confronted with an unprecedented amount of personal genomic information, which influences their decisions, emotional state, and well-being. However, these users of direct-to-consumer genetic services, who vary in their education and interests, frequently have little relevant experience or tools for understanding, reasoning about, and interacting with their personal genomic data. Online interactive techniques can play a central role in making personal genomic data useful for these users. Objective: We sought to (1) identify the needs of diverse users as they make sense of their personal genomic data, (2) consequently develop effective interactive visualizations of genomic trait data to address these users’ needs, and (3) evaluate the effectiveness of the developed visualizations in facilitating comprehension. Methods: The first two user studies, conducted with 63 volunteers in the Personal Genome Project and with 36 personal genomic users who participated in a design workshop, respectively, employed surveys and interviews to identify the needs and expectations of diverse users. Building on the two initial studies, the third study was conducted with 730 Amazon Mechanical Turk users and employed a controlled experimental design to examine the effectiveness of different design interventions on user comprehension. Results: The first two studies identified searching, comparing, sharing, and organizing data as fundamental to users’ understanding of personal genomic data. The third study demonstrated that interactive and visual design interventions could improve the understandability of personal genomic reports for consumers. In particular, results showed that a new interactive bubble chart visualization designed for the study resulted in the highest comprehension scores, as well as the highest perceived comprehension scores. These scores were significantly higher than scores received using the industry standard tabular reports currently used for communicating personal genomic information. Conclusions: Drawing on multiple research methods and populations, the findings of the studies reported in this paper offer deep understanding of users’ needs and practices, and demonstrate that interactive online design interventions can improve the understandability of personal genomic reports for consumers. We discuss implications for designers and researchers. PMID:26070951
Informing the Design of Direct-to-Consumer Interactive Personal Genomics Reports.

PubMed

Shaer, Orit; Nov, Oded; Okerlund, Johanna; Balestra, Martina; Stowell, Elizabeth; Ascher, Laura; Bi, Joanna; Schlenker, Claire; Ball, Madeleine

2015-06-12

In recent years, people who sought direct-to-consumer genetic testing services have been increasingly confronted with an unprecedented amount of personal genomic information, which influences their decisions, emotional state, and well-being. However, these users of direct-to-consumer genetic services, who vary in their education and interests, frequently have little relevant experience or tools for understanding, reasoning about, and interacting with their personal genomic data. Online interactive techniques can play a central role in making personal genomic data useful for these users. We sought to (1) identify the needs of diverse users as they make sense of their personal genomic data, (2) consequently develop effective interactive visualizations of genomic trait data to address these users' needs, and (3) evaluate the effectiveness of the developed visualizations in facilitating comprehension. The first two user studies, conducted with 63 volunteers in the Personal Genome Project and with 36 personal genomic users who participated in a design workshop, respectively, employed surveys and interviews to identify the needs and expectations of diverse users. Building on the two initial studies, the third study was conducted with 730 Amazon Mechanical Turk users and employed a controlled experimental design to examine the effectiveness of different design interventions on user comprehension. The first two studies identified searching, comparing, sharing, and organizing data as fundamental to users' understanding of personal genomic data. The third study demonstrated that interactive and visual design interventions could improve the understandability of personal genomic reports for consumers. In particular, results showed that a new interactive bubble chart visualization designed for the study resulted in the highest comprehension scores, as well as the highest perceived comprehension scores. 
These scores were significantly higher than scores received using the industry standard tabular reports currently used for communicating personal genomic information. Drawing on multiple research methods and populations, the findings of the studies reported in this paper offer deep understanding of users' needs and practices, and demonstrate that interactive online design interventions can improve the understandability of personal genomic reports for consumers. We discuss implications for designers and researchers.
Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont

PubMed Central

Lindsey, Amelia R. I.; Werren, John H.; Richards, Stephen; Stouthamer, Richard

2016-01-01

Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum. The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. PMID:27194801
Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont.

PubMed

Lindsey, Amelia R I; Werren, John H; Richards, Stephen; Stouthamer, Richard

2016-07-07

Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum. The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. Copyright © 2016 Lindsey et al.
Draft genome analysis provides insights into the fiber yield, crude protein biosynthesis, and vegetative growth of domesticated ramie (Boehmeria nivea L. Gaud).

PubMed

Liu, Chan; Zeng, Liangbin; Zhu, Siyuan; Wu, Lingqing; Wang, Yanzhou; Tang, Shouwei; Wang, Hongwu; Zheng, Xia; Zhao, Jian; Chen, Xiaorong; Dai, Qiuzhong; Liu, Touming

2017-11-15

Plentiful bast fiber, a high crude protein content, and vigorous vegetative growth make ramie a popular fiber and forage crop. Here, we report the draft genome of ramie, along with a genomic comparison and evolutionary analysis. The draft genome contained a sequence of approximately 335.6 Mb with 42,463 predicted genes. A high-density genetic map with 4,338 single nucleotide polymorphisms (SNPs) was developed and used to anchor the genome sequence, thus creating an integrated genetic and physical map containing a 58.2-Mb genome sequence and 4,304 molecular markers. A genomic comparison identified 1,075 unique gene families in ramie, containing 4,082 genes. Among these unique genes, five were cellulose synthase genes that were specifically expressed in stem bark, and three encoded a WAT1-related protein, suggesting that they are probably related to high bast fiber yield. An evolutionary analysis detected 106 positively selected genes, 22 of which were related to nitrogen metabolism, indicating that they are probably responsible for the crude protein content and vegetative growth of domesticated varieties. This study is the first to characterize the genome and develop a high-density genetic map of ramie and provides a basis for the genetic and molecular study of this crop. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Giraffe genome sequence reveals clues to its unique morphology and physiology

PubMed Central

Agaba, Morris; Ishengoma, Edson; Miller, Webb C.; McGrath, Barbara C.; Hudson, Chelsea N.; Bedoya Reina, Oscar C.; Ratan, Aakrosh; Burhans, Rico; Chikhi, Rayan; Medvedev, Paul; Praul, Craig A.; Wu-Cavener, Lan; Wood, Brendan; Robertson, Heather; Penfold, Linda; Cavener, Douglas R.

2016-01-01

The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison, to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and through comparative analyses genes and pathways were identified that exhibit unique genetic changes and likely contribute to giraffe's unique features. Some of these genes are in the HOX, NOTCH and FGF signalling pathways, which regulate both skeletal and cardiovascular development, suggesting that giraffe's stature and cardiovascular adaptations evolved in parallel through changes in a small number of genes. Mitochondrial metabolism and volatile fatty acids transport genes are also evolutionarily diverged in giraffe and may be related to its unusual diet that includes toxic plants. Unexpectedly, substantial evolutionary changes have occurred in giraffe and okapi in double-strand break repair and centrosome functions. PMID:27187213
The protist, Monosiga brevicollis, has a tyrosine kinase signaling network more elaborate and diverse than found in any known metazoan.

PubMed

Manning, Gerard; Young, Susan L; Miller, W Todd; Zhai, Yufeng

2008-07-15

Tyrosine kinase signaling has long been considered a hallmark of intercellular communication, unique to multicellular animals. Our genomic analysis of the unicellular choanoflagellate Monosiga brevicollis discovers a remarkable count of 128 tyrosine kinases, 38 tyrosine phosphatases, and 123 phosphotyrosine (pTyr)-binding SH2 proteins, all higher counts than seen in any metazoan. This elaborate signaling network shows little orthology to metazoan counterparts yet displays many innovations reminiscent of metazoans. These include extracellular domains structurally related to those of metazoan receptor kinases, alternative methods for membrane anchoring and phosphotyrosine interaction in cytoplasmic kinases, and domain combinations that link kinases to small GTPase signaling and transcription. These proteins also display a wealth of combinations of known signaling domains. This uniquely divergent and elaborate signaling network illuminates the early evolution of pTyr signaling, explores innovative ways to traverse the cellular signaling circuitry, and shows extensive convergent evolution, highlighting pervasive constraints on pTyr signaling.
A Single Banana Streak Virus Integration Event in the Banana Genome as the Origin of Infectious Endogenous Pararetrovirus

PubMed Central

Gayral, Philippe; Noa-Carrazana, Juan-Carlos; Lescot, Magali; Lheureux, Fabrice; Lockhart, Benham E. L.; Matsumoto, Takashi; Piffanelli, Pietro; Iskra-Caruana, Marie-Line

2008-01-01

Sequencing of plant nuclear genomes reveals the widespread presence of integrated viral sequences known as endogenous pararetroviruses (EPRVs). Banana is one of the three plant species known to harbor infectious EPRVs. Musa balbisiana carries integrated copies of Banana streak virus (BSV), which are infectious by releasing virions in interspecific hybrids. Here, we analyze the organization of the EPRV of BSV Goldfinger (BSGfV) present in the wild diploid M. balbisiana cv. Pisang Klutuk Wulung (PKW) revealed by the study of Musa bacterial artificial chromosome resources and interspecific genetic cross. cv. PKW contains two similar EPRVs of BSGfV. Genotyping of these integrants and studies of their segregation pattern show an allelic insertion. Despite the fact that integrated BSGfV has undergone extensive rearrangement, both EPRVs contain the full-length viral genome. The high degree of sequence conservation between the integrated and episomal form of the virus indicates a recent integration event; however, only one allele is infectious. Analysis of BSGfV EPRV segregation among an F1 population from an interspecific genetic cross revealed that these EPRV sequences correspond to two alleles originating from a single integration event. We describe here for the first time the full genomic and genetic organization of the two EPRVs of BSGfV present in cv. PKW in response to the challenge facing both scientists and breeders to identify and generate genetic resources free from BSV. We discuss the consequences of this unique host-pathogen interaction in terms of genetic and genomic plant defenses versus strategies of infectious BSGfV EPRVs. PMID:18417582
A Single Amino Acid Substitution within the Paramyxovirus Sendai Virus Nucleoprotein Is a Critical Determinant for Production of Interferon-Beta-Inducing Copyback-Type Defective Interfering Genomes.

PubMed

Yoshida, Asuka; Kawabata, Ryoko; Honda, Tomoyuki; Sakai, Kouji; Ami, Yasushi; Sakaguchi, Takemasa; Irie, Takashi

2018-03-01

One of the first defenses against infecting pathogens is the innate immune system activated by cellular recognition of pathogen-associated molecular patterns (PAMPs). Although virus-derived RNA species, especially copyback (cb)-type defective interfering (DI) genomes, have been shown to serve as real PAMPs, which strongly induce interferon-beta (IFN-β) during mononegavirus infection, the mechanisms underlying DI generation remain unclear. Here, for the first time, we identified a single amino acid substitution causing production of cbDI genomes by successful isolation of two distinct types of viral clones with cbDI-producing and cbDI-nonproducing phenotypes from the stock Sendai virus (SeV) strain Cantell, which has been widely used in a number of studies on antiviral innate immunity as a representative IFN-β-inducing virus. IFN-β induction was totally dependent on the presence of a significant amount of cbDI genome-containing viral particles (DI particles) in the viral stock, but not on deficiency of the IFN-antagonistic viral accessory proteins C and V. Comparison of the isolates indicated that a single amino acid substitution found within the N protein of the cbDI-producing clone was enough to cause the emergence of DI genomes. The mutated N protein of the cbDI-producing clone resulted in a lower density of nucleocapsids than that of the DI-nonproducing clone, probably causing both production of the DI genomes and their formation of a stem-loop structure, which serves as an ideal ligand for RIG-I. These results suggested that the integrity of mononegaviral nucleocapsids might be a critical factor in avoiding the undesirable recognition of infection by host cells. IMPORTANCE: The type I interferon (IFN) system is a pivotal defense against infecting RNA viruses that is activated by sensing viral RNA species.
RIG-I is a major sensor for infection with most mononegaviruses, and copyback (cb)-type defective interfering (DI) genomes have been shown to serve as strong RIG-I ligands in real infections. However, the mechanism underlying production of cbDI genomes remains unclear, although DI genomes emerge as the result of an error during viral replication with high doses of viruses. Sendai virus has been extensively studied and is unique in that its interaction with innate immunity reveals opposing characteristics, such as high-level IFN-β induction and strong inhibition of type I IFN pathways. Our findings provide novel insights into the mechanism of production of mononegaviral cbDI genomes, as well as virus-host interactions during innate immunity. Copyright © 2018 American Society for Microbiology.
The resurrection genome of Boea hygrometrica: A blueprint for survival of dehydration.

PubMed

Xiao, Lihong; Yang, Ge; Zhang, Liechi; Yang, Xinhua; Zhao, Shuang; Ji, Zhongzhong; Zhou, Qing; Hu, Min; Wang, Yu; Chen, Ming; Xu, Yu; Jin, Haijing; Xiao, Xuan; Hu, Guipeng; Bao, Fang; Hu, Yong; Wan, Ping; Li, Legong; Deng, Xin; Kuang, Tingyun; Xiang, Chengbin; Zhu, Jian-Kang; Oliver, Melvin J; He, Yikun

2015-05-05

"Drying without dying" is an essential trait in land plant evolution. Unraveling how a unique group of angiosperms, the Resurrection Plants, survive desiccation of their leaves and roots has been hampered by the lack of a foundational genome perspective. Here we report the ∼1,691-Mb sequenced genome of Boea hygrometrica, an important resurrection plant model. The sequence revealed evidence for two historical genome-wide duplication events, a complement of 49,374 protein-coding genes, 29.15% of which are unique (orphan) to Boea and 20% of which (9,888) significantly respond to desiccation at the transcript level. Expansion of early light-inducible protein (ELIP) and 5S rRNA genes highlights the importance of the protection of the photosynthetic apparatus during drying and the rapid resumption of protein synthesis in the resurrection capability of Boea. Transcriptome analysis reveals extensive alternative splicing of transcripts and a focus on cellular protection strategies. The lack of desiccation tolerance-specific genome organizational features suggests the resurrection phenotype evolved mainly by an alteration in the control of dehydration response genes.
Incoming human papillomavirus type 16 genome resides in a vesicular compartment throughout mitosis.

PubMed

DiGiuseppe, Stephen; Luszczek, Wioleta; Keiffer, Timothy R; Bienkowska-Haba, Malgorzata; Guion, Lucile G M; Sapp, Martin J

2016-05-31

During the entry process, the human papillomavirus (HPV) capsid is trafficked to the trans-Golgi network (TGN), whereupon it enters the nucleus during mitosis. We previously demonstrated that the minor capsid protein L2 assumes a transmembranous conformation in the TGN. Here we provide evidence that the incoming viral genome dissociates from the TGN and associates with microtubules after the onset of mitosis. Deposition onto mitotic chromosomes is L2-mediated. Using differential staining of an incoming viral genome by small molecular dyes in selectively permeabilized cells, nuclease protection, and flotation assays, we found that HPV resides in a membrane-bound vesicle until mitosis is completed and the nuclear envelope has reformed. As a result, expression of the incoming viral genome is delayed. Taken together, these data provide evidence that HPV has evolved a unique strategy for delivering the viral genome to the nucleus of dividing cells. Furthermore, it is unlikely that nuclear vesicles are unique to HPV, and thus we may have uncovered a hitherto unrecognized cellular pathway that may be of interest for future cell biological studies.
Nucleotide sequence of the Kaposi sarcoma-associated herpesvirus (HHV8)

PubMed Central

Russo, James J.; Bohenzky, Roy A.; Chien, Ming-Cheng; Chen, Jing; Yan, Ming; Maddalena, Dawn; Parry, J. Preston; Peruzzi, Daniela; Edelman, Isidore S.; Chang, Yuan; Moore, Patrick S.

1996-01-01

The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor. PMID:8962146
Personalized Medicine in a New Genomic Era: Ethical and Legal Aspects.

PubMed

Shoaib, Maria; Rameez, Mansoor Ali Merchant; Hussain, Syed Ather; Madadin, Mohammed; Menezes, Ritesh G

2017-08-01

The genomes of two completely unrelated individuals are quite similar apart from minor variations called single nucleotide polymorphisms, which contribute to the uniqueness of each and every person. These single nucleotide polymorphisms are of great clinical interest as they are useful in determining the susceptibility of certain individuals to particular diseases and for recognizing varied responses to pharmacological interventions. This gives rise to the idea of 'personalized medicine' as an exciting new therapeutic science in this genomic era. Personalized medicine suggests a unique treatment strategy based on an individual's genetic make-up. Its key principles revolve around applied pharmaco-genomics, pharmaco-kinetics and pharmaco-proteomics. Herein, the ethical and legal aspects of personalized medicine in a new genomic era are briefly addressed. The ultimate goal is to comprehensively recognize all relevant forms of genetic variation in each individual and be able to interpret this information in a clinically meaningful manner within the ambit of ethical and legal considerations. The authors of this article firmly believe that personalized medicine has the potential to revolutionize the current landscape of medicine as it makes its way into clinical practice.
Genome-Wide Analysis of Grain Yield Stability and Environmental Interactions in a Multiparental Soybean Population.

PubMed

Xavier, Alencar; Jarquin, Diego; Howard, Reka; Ramasubramanian, Vishnu; Specht, James E; Graef, George L; Beavis, William D; Diers, Brian W; Song, Qijian; Cregan, Perry B; Nelson, Randall; Mian, Rouf; Shannon, J Grover; McHale, Leah; Wang, Dechun; Schapaugh, William; Lorenz, Aaron J; Xu, Shizhong; Muir, William M; Rainey, Katy M

2018-02-02

Genetic improvement toward optimized and stable agronomic performance of soybean genotypes is desirable for food security. Understanding how genotypes perform in different environmental conditions helps breeders develop sustainable cultivars adapted to target regions. Complex traits of importance are known to be controlled by a large number of genomic regions with small effects whose magnitude and direction are modulated by environmental factors. Knowledge of the constraints and undesirable effects resulting from genotype by environmental interactions is a key objective in improving selection procedures in soybean breeding programs. In this study, the genetic basis of soybean grain yield responsiveness to environmental factors was examined in a large soybean nested association population. For this, a genome-wide association to performance stability estimates generated from a Finlay-Wilkinson analysis and the inclusion of the interaction between marker genotypes and environmental factors was implemented. Genomic footprints were investigated by analysis and meta-analysis using a recently published multiparent model. Results indicated that specific soybean genomic regions were associated with stability, and that multiplicative interactions were present between environments and genetic background. Seven genomic regions in six chromosomes were identified as being associated with genotype-by-environment interactions. This study provides insight into genomic assisted breeding aimed at achieving a more stable agronomic performance of soybean, and documented opportunities to exploit genomic regions that were specifically associated with interactions involving environments and subpopulations. Copyright © 2018 Xavier et al.
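The Finlay-Wilkinson analysis named in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the toy yields, and the choice of a centered environment-mean index are assumptions made for illustration. Each genotype's yield is regressed on an environmental index (the mean yield of all genotypes in that environment, centered on the grand mean), and the resulting slope serves as the stability estimate.

```python
from statistics import mean


def finlay_wilkinson_slopes(yields):
    """Return the regression slope of each genotype on the environmental index.

    `yields` maps a genotype name to a list of yields, one per environment,
    with environments in the same order for every genotype (hypothetical layout).
    """
    genotypes = list(yields)
    n_env = len(yields[genotypes[0]])

    # Environmental index: mean yield across genotypes, centered on the grand mean.
    env_means = [mean(yields[g][j] for g in genotypes) for j in range(n_env)]
    grand_mean = mean(env_means)
    index = [m - grand_mean for m in env_means]
    sum_sq = sum(e * e for e in index)
    if sum_sq == 0:
        raise ValueError("environments do not differ; slopes are undefined")

    # Ordinary least-squares slope of each genotype's yields on the index.
    slopes = {}
    for g in genotypes:
        g_mean = mean(yields[g])
        slopes[g] = sum(e * (y - g_mean) for e, y in zip(index, yields[g])) / sum_sq
    return slopes


# Toy data (invented): three genotypes grown in three environments.
toy = {
    "A": [2.0, 3.0, 4.0],
    "B": [2.5, 3.0, 3.5],
    "C": [1.5, 3.0, 4.5],
}
print(finlay_wilkinson_slopes(toy))
```

In this framing a slope near 1 indicates average responsiveness, a slope below 1 indicates a genotype whose yield varies less than average across environments, and a slope above 1 indicates a genotype that responds more strongly to favorable environments. Slopes of this kind are the sort of per-genotype stability estimate that can then be used as a phenotype in an association analysis.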
Complete genome sequence of uropathogenic Escherichia coli isolate UPEC 26-1.

PubMed

Subhadra, Bindu; Kim, Dong Ho; Kim, Jaeseok; Woo, Kyungho; Sohn, Kyung Mok; Kim, Hwa-Jung; Han, Kyudong; Oh, Man Hwan; Choi, Chul Hee

2018-06-01

Urinary tract infections (UTIs) are among the most common infections in humans, predominantly caused by uropathogenic Escherichia coli (UPEC). The diverse genomes of UPEC strains mostly impede disease prevention and control measures. In this study, we comparatively analyzed the whole genome sequence of a highly virulent UPEC strain, namely UPEC 26-1, which was isolated from a urine sample of a patient suffering from UTI in Korea. Whole genome analysis showed that the genome consists of one circular chromosome of 5,329,753 bp, comprising 5064 protein-coding genes, 122 RNA genes (94 tRNA, 22 rRNA and 6 ncRNA genes), and 100 pseudogenes, with an average G+C content of 50.56%. In addition, we identified 8 prophage regions (5 intact, 2 incomplete and 1 questionable) and 63 genomic islands (GIs), suggesting the possibility of horizontal gene transfer in this strain. Comparative genome analysis of UPEC 26-1 with the UPEC strain CFT073 revealed an average nucleotide identity of 99.7%. The genome comparison with CFT073 reveals major differences in the genome of UPEC 26-1 that would explain its increased virulence and biofilm formation. Nineteen of the total GIs were unique to UPEC 26-1 compared to CFT073 and nine of them harbored unique genes that are involved in virulence, multidrug resistance, biofilm formation and bacterial pathogenesis. The data from this study will assist in future studies of UPEC strains to develop effective control measures.
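As a small illustration of the G+C content statistic reported in the abstract above, the following sketch computes the percentage of G and C bases in a nucleotide string. The function name and the decision to ignore ambiguous bases such as N are assumptions made here, not details taken from the study.

```python
def gc_content(sequence: str) -> float:
    """Return G+C content as a percentage of the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


# Invented example sequence, not from the UPEC 26-1 genome.
print(gc_content("GGCCAATT"))
```

Applied to a full chromosome sequence read from a FASTA file, the same calculation would give a genome-wide figure comparable to the percentage quoted in the abstract.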
Comprehensive definition of genome features in Spirodela polyrhiza by high-depth physical mapping and short-read DNA sequencing strategies.

PubMed

Michael, Todd P; Bryant, Douglas; Gutierrez, Ryan; Borisjuk, Nikolai; Chu, Philomena; Zhang, Hanzhong; Xia, Jing; Zhou, Junfei; Peng, Hai; El Baidouri, Moaine; Ten Hallers, Boudewijn; Hastie, Alex R; Liang, Tiffany; Acosta, Kenneth; Gilbert, Sarah; McEntee, Connor; Jackson, Scott A; Mockler, Todd C; Zhang, Weixiong; Lam, Eric

2017-02-01

Spirodela polyrhiza is a fast-growing aquatic monocot with highly reduced morphology, genome size and number of protein-coding genes. Considering these biological features of Spirodela and its basal position in the monocot lineage, understanding its genome architecture could shed light on plant adaptation and genome evolution. Like many draft genomes, however, the 158-Mb Spirodela genome sequence has not been resolved to chromosomes, and important genome characteristics have not been defined. Here we deployed rapid genome-wide physical maps combined with high-coverage short-read sequencing to resolve the 20 chromosomes of Spirodela and to empirically delineate its genome features. Our data revealed a dramatic reduction in the number of the rDNA repeat units in Spirodela to fewer than 100, which is even fewer than that reported for yeast. Consistent with its unique phylogenetic position, small RNA sequencing revealed 29 Spirodela-specific microRNA, with only two being shared with Elaeis guineensis (oil palm) and Musa balbisiana (banana). Combining DNA methylation data and small RNA sequencing enabled the accurate prediction of 20.5% long terminal repeats (LTRs) that doubled the previous estimate, and revealed a high Solo:Intact LTR ratio of 8.2. Interestingly, we found that Spirodela has the lowest global DNA methylation levels (9%) of any plant species tested. Taken together our results reveal a genome that has undergone reduction, likely through eliminating non-essential protein coding genes, rDNA and LTRs. In addition to delineating the genome features of this unique plant, the methodologies described and large-scale genome resources from this work will enable future evolutionary and functional studies of this basal monocot family. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

PubMed Central

Jackson, Christopher J; Norman, John E; Schnare, Murray N; Gray, Michael W; Keeling, Patrick J; Waller, Ross F

2007-01-01

Background: Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results: From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion: The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics.
Most notable are the expansion of gene copy numbers and their arrangements within the genome, RNA editing, loss of stop codons, and use of trans-splicing. PMID:17897476
The carnivorous pale pitcher plant harbors diverse, distinct, and time-dependent bacterial communities.

PubMed

Koopman, Margaret M; Fuselier, Danielle M; Hird, Sarah; Carstens, Bryan C

2010-03-01

The ability of American carnivorous pitcher plants (Sarracenia) to digest insect prey is facilitated by microbial associations. Knowledge of the details surrounding this interaction has been limited by our capability to characterize bacterial diversity in this system. To describe microbial diversity within and between pitchers of one species, Sarracenia alata, and to explore how these communities change over time as pitchers accumulate and digest insect prey, we collected and analyzed environmental sequence tag (454 pyrosequencing) and genomic fingerprint (automated ribosomal intergenic spacer analysis and terminal restriction fragment length polymorphism) data. Microbial richness associated with pitcher plant fluid is high; more than 1,000 unique phylogroups were identified across at least seven phyla and 50 families. We documented an increase in bacterial diversity and abundance with time and observed repeated changes in bacterial community composition. Pitchers from different plants harbored significantly more similar bacterial communities at a given time point than communities coming from the same genetic host over time. The microbial communities in pitcher plant fluid also differ significantly from those present in the surrounding soil. These findings indicate that the bacteria associated with pitcher plant leaves are far from random assemblages and represent an important step toward understanding this unique plant-microbe interaction.
The Carnivorous Pale Pitcher Plant Harbors Diverse, Distinct, and Time-Dependent Bacterial Communities▿ †

PubMed Central

Koopman, Margaret M.; Fuselier, Danielle M.; Hird, Sarah; Carstens, Bryan C.

2010-01-01

The ability of American carnivorous pitcher plants (Sarracenia) to digest insect prey is facilitated by microbial associations. Knowledge of the details surrounding this interaction has been limited by our capability to characterize bacterial diversity in this system. To describe microbial diversity within and between pitchers of one species, Sarracenia alata, and to explore how these communities change over time as pitchers accumulate and digest insect prey, we collected and analyzed environmental sequence tag (454 pyrosequencing) and genomic fingerprint (automated ribosomal intergenic spacer analysis and terminal restriction fragment length polymorphism) data. Microbial richness associated with pitcher plant fluid is high; more than 1,000 unique phylogroups were identified across at least seven phyla and 50 families. We documented an increase in bacterial diversity and abundance with time and observed repeated changes in bacterial community composition. Pitchers from different plants harbored significantly more similar bacterial communities at a given time point than communities coming from the same genetic host over time. The microbial communities in pitcher plant fluid also differ significantly from those present in the surrounding soil. These findings indicate that the bacteria associated with pitcher plant leaves are far from random assemblages and represent an important step toward understanding this unique plant-microbe interaction. PMID:20097807

Chicken Interferon-induced Protein with Tetratricopeptide Repeats 5 Antagonizes Replication of RNA Viruses.

PubMed

Santhakumar, Diwakar; Rohaim, Mohammed Abdel Mohsen Shahaat; Hussein, Hussein A; Hawes, Pippa; Ferreira, Helena Lage; Behboudi, Shahriar; Iqbal, Munir; Nair, Venugopal; Arns, Clarice W; Munir, Muhammad

2018-05-01

The intracellular actions of interferon (IFN)-regulated proteins, including IFN-induced proteins with tetratricopeptide repeats (IFITs), constitute a major component of the protective antiviral host defense. Here we applied genomics approaches to annotate the chicken IFIT locus and identified a single IFIT (chIFIT5) gene. The profound transcriptional level of this effector of innate immunity was mapped within its unique cis-acting elements. This highly virus- and IFN-responsive chIFIT5 protein interacted with negative sense viral RNA structures that carried a triphosphate group on their 5' terminus (ppp-RNA). This interaction reduced the replication of RNA viruses in lentivirus-mediated IFIT5-stable chicken fibroblasts, whereas CRISPR/Cas9-edited chIFIT5 gene knockout fibroblasts supported the replication of RNA viruses. Finally, we generated mosaic transgenic chicken embryos stably expressing chIFIT5 protein or knocked down for the endogenous chIFIT5 gene. Replication kinetics of RNA viruses in these transgenic chicken embryos demonstrated the antiviral potential of chIFIT5 in ovo. Taken together, these findings suggest that IFIT5 specifically antagonizes RNA viruses by sequestering viral nucleic acids in chickens, which are unique in innate immune sensing and responses to viruses of both poultry and human health significance.
Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences

PubMed Central

Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

2016-01-01

Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. PMID:27289096
Putative and unique gene sequence utilization for the design of species specific probes as modeled by Lactobacillus plantarum

USDA-ARS's Scientific Manuscript database

The concept of utilizing putative and unique gene sequences for the design of species specific probes was tested. The abundance profile of assigned functions within the Lactobacillus plantarum genome was used for the identification of the putative and unique gene sequence, csh. The targeted gene (cs...
Genomic comparison of multi-drug resistant invasive and colonizing Acinetobacter baumannii isolated from diverse human body sites reveals genomic plasticity.

PubMed

Sahl, Jason W; Johnson, J Kristie; Harris, Anthony D; Phillippy, Adam M; Hsiao, William W; Thom, Kerri A; Rasko, David A

2011-06-04

Acinetobacter baumannii has recently emerged as a significant global pathogen, with a surprisingly rapid acquisition of antibiotic resistance and spread within hospitals and health care institutions. This study examines the genomic content of three A. baumannii strains isolated from distinct body sites. Isolates from blood, peri-anal, and wound sources were examined in an attempt to identify genetic features that could be correlated to each isolation source. Pulsed-field gel electrophoresis, multi-locus sequence typing and antibiotic resistance profiles demonstrated genotypic and phenotypic variation. Each isolate was sequenced to high-quality draft status, which allowed for comparative genomic analyses with existing A. baumannii genomes. A high resolution, whole genome alignment method detailed the phylogenetic relationships of sequenced A. baumannii and found no correlation between phylogeny and body site of isolation. This method identified genomic regions unique to both those isolates found on the surface of the skin or in wounds, termed colonization isolates, and those identified from body fluids, termed invasive isolates; these regions may play a role in the pathogenesis and spread of this important pathogen. A PCR-based screen of 74 A. baumannii isolates demonstrated that these unique genes are not exclusive to either phenotype or isolation source; however, a conserved genomic region exclusive to all sequenced A. baumannii was identified and verified. The results of the comparative genome analysis and PCR assay show that A. baumannii is a diverse and genomically variable pathogen that appears to have the potential to cause a range of human disease regardless of the isolation source.
Molecular Cloning, Functional Characterization, and Evolutionary Analysis of Vitamin D Receptors Isolated from Basal Vertebrates

PubMed Central

Kollitz, Erin M.; Zhang, Guozhu; Hawkins, Mary Beth; Whitfield, G. Kerr; Reif, David M.; Kullman, Seth W.

2015-01-01

The vertebrate genome is a result of two rapid and successive rounds of whole genome duplication, referred to as 1R and 2R. Furthermore, teleost fish have undergone a third whole genome duplication (3R) specific to their lineage, resulting in the retention of multiple gene paralogs. The more recent 3R event in teleosts provides a unique opportunity to gain insight into how genes evolve through specific evolutionary processes. In this study we compare molecular activities of vitamin D receptors (VDR) from basal species that diverged at key points in vertebrate evolution in order to infer derived and ancestral VDR functions of teleost paralogs. Species include the sea lamprey (Petromyzon marinus), a 1R jawless fish; the little skate (Leucoraja erinacea), a cartilaginous fish that diverged after the 2R event; and the Senegal bichir (Polypterus senegalus), a primitive 2R ray-finned fish. Saturation binding assays and gel mobility shift assays demonstrate that high affinity ligand binding and classic DNA binding characteristics of VDR have been conserved across vertebrate evolution. Concentration response curves in transient transfection assays reveal EC50 values in the low nanomolar range; however, maximum transactivational efficacy varies significantly between receptor orthologs. Protein-protein interactions were investigated using co-transfection, mammalian 2-hybrid assays, and mutations of coregulator activation domains. We then combined these results with our previous study of VDR paralogs from 3R teleosts into a bioinformatics analysis. Our results suggest that 1, 25D3 acts as a partial agonist in basal species. Furthermore, our bioinformatics analysis suggests that functional differences between VDR orthologs and paralogs are influenced by differential protein interactions with essential coregulator proteins. 
We speculate that we may be observing a change in the pharmacodynamic relationship between VDR and 1, 25D3 throughout vertebrate evolution that may have been driven by changes in protein-protein interactions between VDR and essential coregulators. PMID:25855982
The Dynamic Genome and Transcriptome of the Human Fungal Pathogen Blastomyces and Close Relative Emmonsia

PubMed Central

Gallo, Juan E.; Holder, Jason; Sullivan, Thomas D.; Marty, Amber J.; Carmen, John C.; Chen, Zehua; Ding, Li; Gujja, Sharvari; Magrini, Vincent; Misas, Elizabeth; Mitreva, Makedonka; Priest, Margaret; Saif, Sakina; Whiston, Emily A.; Young, Sarah; Zeng, Qiandong; Goldman, William E.; Mardis, Elaine R.; Taylor, John W.; McEwen, Juan G.; Clay, Oliver K.; Klein, Bruce S.; Cuomo, Christina A.

2015-01-01

Three closely related thermally dimorphic pathogens are causal agents of major fungal diseases affecting humans in the Americas: blastomycosis, histoplasmosis and paracoccidioidomycosis. Here we report the genome sequence and analysis of four strains of the etiological agent of blastomycosis, Blastomyces, and two species of the related genus Emmonsia, typically pathogens of small mammals. Compared to related species, Blastomyces genomes are highly expanded, with long, often sharply demarcated tracts of low GC-content sequence. These GC-poor isochore-like regions are enriched for gypsy elements, are variable in total size between isolates, and are least expanded in the avirulent B. dermatitidis strain ER-3 as compared with the virulent B. gilchristii strain SLH14081. The lack of similar regions in related species suggests these isochore-like regions originated recently in the ancestor of the Blastomyces lineage. While gene content is highly conserved between Blastomyces and related fungi, we identified changes in copy number of genes potentially involved in host interaction, including proteases and characterized antigens. In addition, we studied gene expression changes of B. dermatitidis during the interaction of the infectious yeast form with macrophages and in a mouse model. Both experiments highlight a strong antioxidant defense response in Blastomyces, and upregulation of dioxygenases in vivo suggests that dioxide produced by antioxidants may be further utilized for amino acid metabolism. We identify a number of functional categories upregulated exclusively in vivo, such as secreted proteins, zinc acquisition proteins, and cysteine and tryptophan metabolism, which may include critical virulence factors missed before in in vitro studies. Across the dimorphic fungi, loss of certain zinc acquisition genes and differences in amino acid metabolism suggest unique adaptations of Blastomyces to its host environment. 
These results reveal the dynamics of genome evolution and of factors contributing to virulence in Blastomyces. PMID:26439490
Comparative Genomics of Carp Herpesviruses

PubMed Central

Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P.; Waltzek, Thomas B.

2013-01-01

Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803
Complete mitochondrial genome of Ostrea denselamellosa (Bivalvia, Ostreidae).

PubMed

Yu, Hong; Kong, Lingfeng; Li, Qi

2016-01-01

The complete mitochondrial (mt) genome of the flat oyster, Ostrea denselamellosa, was determined using Long-PCR and genome walking techniques in this study. The total length of the mt genome sequence of O. denselamellosa was 16,227 bp, which is the smallest reported Ostreidae mt genome to date. It contained 12 protein-coding genes (lacking ATP8), 23 transfer RNA genes, and two ribosomal RNA genes. A bias towards a higher representation of nucleotides A and T (60.7%) was detected in the mt genome of O. denselamellosa. The rrnL was split into two fragments (3' half, 711 bp; 5' half, 509 bp), which seems to be a unique characteristic of Ostreidae mt genomes.
Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

PubMed Central

Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

2007-01-01

Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms: Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a high-performance, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
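The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and not part of the cited work; it assumes each accession consists of a two-letter prefix followed by digits and that each range is inclusive. The helper name `accession_range_size` is hypothetical.

```python
# Cross-check of the EST counts quoted in the abstract above.
# Assumption: accessions are a two-letter prefix plus digits, ranges inclusive.

def accession_range_size(first: str, last: str) -> int:
    """Number of accessions in an inclusive range such as EH669363-EH672369."""
    prefix_len = 2
    if first[:prefix_len] != last[:prefix_len]:
        raise ValueError("range endpoints must share the same prefix")
    return int(last[prefix_len:]) - int(first[prefix_len:]) + 1

# The two dbEST accession ranges given in the abstract.
est_ranges = [("EH669363", "EH672369"), ("EL515444", "EL515580")]
total_ests = sum(accession_range_size(a, b) for a, b in est_ranges)

# Clustering figures given in the abstract.
contigs, ests_in_contigs, singletons = 448, 1361, 1783
unique_sequences = contigs + singletons

# Share of unique sequences with high similarity to GenBank nr entries.
similar = 743
percent_similar = round(100 * similar / unique_sequences)

print(total_ests, unique_sequences, percent_similar)
```

Under these assumptions the two accession ranges add up to the 3144 good quality ESTs, the ESTs in contigs plus the singletons give the same total, and 743 of 2231 unique sequences rounds to the stated 33%.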
Whole genome analyses of marine fish pathogenic isolate, Mycobacterium sp. 012931.

PubMed

Kurokawa, Satoru; Kabayama, Jun; Hwang, Seong Don; Nho, Seong Won; Hikima, Jun-ichi; Jung, Tae Sung; Kondo, Hidehiro; Hirono, Ikuo; Takeyama, Haruko; Mori, Tetsushi; Aoki, Takashi

2014-10-01

Mycobacterium is a genus within the order Actinomycetales that comprises a large number of well-characterized species, several of which are pathogens known to cause serious disease in humans and animals. Here, we report the whole genome sequence of Mycobacterium sp. strain 012931 isolated from the marine fish, yellowtail (Seriola quinqueradiata). Mycobacterium sp. 012931 is a fish pathogen causing serious damage to aquaculture farms in Japan. DNA dot plot analysis showed that Mycobacterium sp. 012931 was more closely related to Mycobacterium marinum when compared across several Mycobacterium species. However, little conservation of the gene order was observed between the Mycobacterium sp. 012931 and M. marinum genomes. The 5,464 annotated genes of Mycobacterium sp. 012931 were classified into 26 subsystems. The insertion/deletion gene analysis shows that Mycobacterium sp. 012931 had 643 unique genes that were not found in the M. marinum strains. In the virulence, disease, and defense subsystem, both insertion and deletion genes of Mycobacterium sp. 012931 were associated with the PPE gene cluster of Mycobacteria. Of seven plcB genes in Mycobacterium sp. 012931, plcB_2 and plcB_3 showed low identities with those of M. marinum strains. Therefore, Mycobacterium sp. 012931 differs from M. marinum genetically and in virulence and may induce different interaction mechanisms between host and pathogen.
The role of internal duplication in the evolution of multi-domain proteins.

PubMed

Nacher, J C; Hayashida, M; Akutsu, T

2010-08-01

Many proteins consist of several structural domains. These multi-domain proteins have likely been generated by selective genome growth dynamics during evolution to perform new functions as well as to create structures that fold on a biologically feasible time scale. Domain units frequently evolved through a variety of genetic shuffling mechanisms. Here we examine the protein domain statistics of more than 1000 organisms including eukaryotic, archaeal and bacterial species. The analysis extends earlier findings on asymmetric statistical laws for the proteome to a wider variety of species. While proteins are composed of a wide range of domains, displaying a power-law decay, the computation of domain families for each protein reveals an exponential distribution, characterizing a protein universe composed of a thin number of unique families. Structural studies in proteomics have shown that domain repeats, or internal duplicated domains, represent a small but significant fraction of the genome. In spite of its importance, this observation has been largely overlooked until recently. We model the evolutionary dynamics of the proteome and demonstrate that these distinct distributions are in fact rooted in an internal duplication mechanism. This process generates the contemporary protein structural domain universe, determines its reduced thickness, and tames its growth. These findings have important implications, ranging from protein interaction network modeling to evolutionary studies based on fundamental mechanisms governing genome expansion.
QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks.

PubMed

Thibodeau, Asa; Márquez, Eladio J; Luo, Oscar; Ruan, Yijun; Menghi, Francesca; Shin, Dong-Guk; Stitzel, Michael L; Vera-Licona, Paola; Ucar, Duygu

2016-06-01

Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and HiC, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1) building and visualizing chromatin interaction networks, 2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3) querying network components based on gene name or chromosome location, and 4) utilizing network based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. QuIN's web server is available at http://quin.jax.org. QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and MySQL database, and the source code is available on GitHub under the GPLv3 license: https://github.com/UcarLab/QuIN/.
PAPST, a User Friendly and Powerful Java Platform for ChIP-Seq Peak Co-Localization Analysis and Beyond.

PubMed

Bible, Paul W; Kanno, Yuka; Wei, Lai; Brooks, Stephen R; O'Shea, John J; Morasso, Maria I; Loganantharaj, Rasiah; Sun, Hong-Wei

2015-01-01

Comparative co-localization analysis of transcription factors (TFs) and epigenetic marks (EMs) in specific biological contexts is one of the most critical areas of ChIP-Seq data analysis beyond peak calling. Yet there is a significant lack of user-friendly and powerful tools geared towards co-localization analysis based exploratory research. Most tools currently used for co-localization analysis are command line only and require extensive installation procedures and Linux expertise. Online tools partially address the usability issues of command line tools, but slow response times and few customization features make them unsuitable for rapid data-driven interactive exploratory research. We have developed PAPST: Peak Assignment and Profile Search Tool, a user-friendly yet powerful platform with a unique design, which integrates both gene-centric and peak-centric co-localization analysis into a single package. Most of PAPST's functions can be completed in less than five seconds, allowing quick cycles of data-driven hypothesis generation and testing. With PAPST, a researcher with or without computational expertise can perform sophisticated co-localization pattern analysis of multiple TFs and EMs, either against all known genes or a set of genomic regions obtained from public repositories or prior analysis. PAPST is a versatile, efficient, and customizable tool for genome-wide data-driven exploratory research. Creatively used, PAPST can be quickly applied to any genomic data analysis that involves a comparison of two or more sets of genomic coordinate intervals, making it a powerful tool for a wide range of exploratory genomic research. We first present PAPST's general purpose features then apply it to several public ChIP-Seq data sets to demonstrate its rapid execution and potential for cutting-edge research with a case study in enhancer analysis. 
To our knowledge, PAPST is the first software of its kind to provide efficient and sophisticated post peak-calling ChIP-Seq data analysis as an easy-to-use interactive application. PAPST is available at https://github.com/paulbible/papst and is a public domain work.
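The abstract describes co-localization analysis as a comparison of two or more sets of genomic coordinate intervals. The core of such a comparison can be sketched as follows. This is a minimal illustration, not PAPST's actual algorithm or API (PAPST itself is a Java application); the half-open coordinate convention, the function name `overlapping_pairs`, and the toy peak lists are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Assumption: an interval is (chromosome, start, end) with half-open
# coordinates [start, end), as in BED-style peak files.
Interval = Tuple[str, int, int]

def overlapping_pairs(
    a: List[Interval], b: List[Interval]
) -> List[Tuple[Interval, Interval]]:
    """Return every pair (x, y), x from a and y from b, that share a base."""
    # Group the second set by chromosome so only same-chromosome
    # intervals are compared.
    by_chrom: Dict[str, List[Interval]] = {}
    for interval in b:
        by_chrom.setdefault(interval[0], []).append(interval)

    pairs = []
    for chrom, start, end in a:
        for other in by_chrom.get(chrom, []):
            # Two half-open intervals overlap when each starts
            # before the other ends.
            if start < other[2] and other[1] < end:
                pairs.append(((chrom, start, end), other))
    return pairs

# Hypothetical peak sets for a transcription factor and an epigenetic mark.
tf_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
em_peaks = [("chr1", 150, 250), ("chr1", 600, 700), ("chr2", 10, 60)]

co_localized = overlapping_pairs(tf_peaks, em_peaks)
print(len(co_localized))
```

This nested scan is quadratic per chromosome; a tool aiming at interactive response times would more plausibly sort the intervals or use an interval tree, but the overlap test itself stays the same.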
Cytoplasmic genome substitution in wheat affects the nuclear-cytoplasmic cross-talk leading to transcript and metabolite alterations

PubMed Central

2013-01-01

Background Alloplasmic lines provide a unique tool to study nuclear-cytoplasmic interactions. Three alloplasmic lines, with nuclear genomes from Triticum aestivum and harboring cytoplasm from Aegilops uniaristata, Aegilops tauschii and Hordeum chilense, were investigated by transcript and metabolite profiling to identify the effects of cytoplasmic substitution on nuclear-cytoplasmic signaling mechanisms. Results In combining the wheat nuclear genome with a cytoplasm of H. chilense, 540 genes were significantly altered, whereas 11 and 28 genes were significantly changed in the alloplasmic lines carrying the cytoplasm of Ae. uniaristata or Ae. tauschii, respectively. We identified the RNA maturation-related process as one of the most sensitive to a perturbation of the nuclear-cytoplasmic interaction. Several key components of the ROS chloroplast retrograde signaling, together with the up-regulation of the ROS scavenging system, showed that changes in the chloroplast genome have a direct impact on nuclear-cytoplasmic cross-talk. Remarkably, the H. chilense alloplasmic line down-regulated some genes involved in the determination of cytoplasmic male sterility without expressing the male sterility phenotype. Metabolic profiling showed a comparable response of the central metabolism of the alloplasmic and euplasmic lines to light, while exposing larger metabolite alterations in the H. chilense alloplasmic line as compared with the Aegilops lines, in agreement with the transcriptomic data. Several stress-related metabolites, remarkably raffinose, were altered in content in the H. chilense alloplasmic line when exposed to high light, while amino acids, as well as organic acids were significantly decreased. Alterations in the levels of transcript, related to raffinose, and the photorespiration-related metabolisms were associated with changes in the level of related metabolites. 
Conclusion The replacement of a wheat cytoplasm with the cytoplasm of a related species affects the nuclear-cytoplasmic cross-talk leading to transcript and metabolite alterations. The extent of these modifications was limited in the alloplasmic lines with Aegilops cytoplasm, and more evident in the alloplasmic line with H. chilense cytoplasm. We consider that this finding might be linked to the phylogenetic distance of the genomes. PMID:24320731
PAPST, a User Friendly and Powerful Java Platform for ChIP-Seq Peak Co-Localization Analysis and Beyond

PubMed Central

Bible, Paul W.; Kanno, Yuka; Wei, Lai; Brooks, Stephen R.; O’Shea, John J.; Morasso, Maria I.; Loganantharaj, Rasiah; Sun, Hong-Wei

2015-01-01

Comparative co-localization analysis of transcription factors (TFs) and epigenetic marks (EMs) in specific biological contexts is one of the most critical areas of ChIP-Seq data analysis beyond peak calling. Yet there is a significant lack of user-friendly and powerful tools geared towards co-localization analysis based exploratory research. Most tools currently used for co-localization analysis are command line only and require extensive installation procedures and Linux expertise. Online tools partially address the usability issues of command line tools, but slow response times and few customization features make them unsuitable for rapid data-driven interactive exploratory research. We have developed PAPST: Peak Assignment and Profile Search Tool, a user-friendly yet powerful platform with a unique design, which integrates both gene-centric and peak-centric co-localization analysis into a single package. Most of PAPST’s functions can be completed in less than five seconds, allowing quick cycles of data-driven hypothesis generation and testing. With PAPST, a researcher with or without computational expertise can perform sophisticated co-localization pattern analysis of multiple TFs and EMs, either against all known genes or a set of genomic regions obtained from public repositories or prior analysis. PAPST is a versatile, efficient, and customizable tool for genome-wide data-driven exploratory research. Creatively used, PAPST can be quickly applied to any genomic data analysis that involves a comparison of two or more sets of genomic coordinate intervals, making it a powerful tool for a wide range of exploratory genomic research. We first present PAPST’s general purpose features then apply it to several public ChIP-Seq data sets to demonstrate its rapid execution and potential for cutting-edge research with a case study in enhancer analysis. 
To our knowledge, PAPST is the first software of its kind to provide efficient and sophisticated post peak-calling ChIP-Seq data analysis as an easy-to-use interactive application. PAPST is available at https://github.com/paulbible/papst and is a public domain work. PMID:25970601
Contextualization of drug-mediator relations using evidence networks.

PubMed

Tran, Hai Joey; Speyer, Gil; Kiefer, Jeff; Kim, Seungchan

2017-05-31

Genomic analysis of drug response can provide unique insights into therapies that can be used to match the "right drug to the right patient." However, the process of discovering such therapeutic insights using genomic data is not straightforward and represents an area of active investigation. EDDY (Evaluation of Differential DependencY), a statistical test to detect differential statistical dependencies, is one method that leverages genomic data to identify differential genetic dependencies. EDDY has been used in conjunction with the Cancer Therapeutics Response Portal (CTRP), a dataset with drug-response measurements for more than 400 small molecules, and RNAseq data of cell lines in the Cancer Cell Line Encyclopedia (CCLE) to find potential drug-mediator pairs. Mediators were identified as genes that showed significant change in genetic statistical dependencies within annotated pathways between drug sensitive and drug non-sensitive cell lines, and the results are presented as a public web-portal (EDDY-CTRP). However, the interpretability of drug-mediator pairs currently hinders further exploration of these potentially valuable results. In this study, we address this challenge by constructing evidence networks built with protein and drug interactions from the STITCH and STRING interaction databases. STITCH and STRING are sister databases that catalog known and predicted drug-protein interactions and protein-protein interactions, respectively. Using these two databases, we have developed a method to construct evidence networks to "explain" the relation between a drug and a mediator. RESULTS: We applied this approach to drug-mediator relations discovered in EDDY-CTRP analysis and identified evidence networks for ~70% of drug-mediator pairs where most mediators were not known direct targets for the drug. Constructed evidence networks enable researchers to contextualize the drug-mediator pair with current research and knowledge. 
Using evidence networks, we were able to improve the interpretability of the EDDY-CTRP results by linking the drugs and mediators with genes associated with both the drug and the mediator. We anticipate that these evidence networks will help inform EDDY-CTRP results and enhance the generation of important insights to drug sensitivity that will lead to improved precision medicine applications.
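The core idea above, linking a drug and a mediator through genes associated with both, can be illustrated with a minimal sketch. The interaction tables below are toy, hypothetical data (not actual STITCH or STRING records), and the function names `evidence_genes` and `has_evidence` are assumptions made for illustration; the published method may well differ in detail.

```python
from typing import Dict, Set

# Toy, hypothetical interaction tables (NOT real STITCH/STRING records).
# Drug -> proteins the drug is linked to (STITCH-like drug-protein table).
DRUG_PROTEIN: Dict[str, Set[str]] = {
    "drugA": {"P1", "P2", "P3"},
}
# Mediator -> proteins the mediator is linked to (STRING-like table).
PROTEIN_PROTEIN: Dict[str, Set[str]] = {
    "M1": {"P2", "P3", "P4"},
    "M2": {"P5"},
}


def evidence_genes(drug: str, mediator: str) -> Set[str]:
    """Return the genes linked to both the drug and the mediator."""
    drug_side = DRUG_PROTEIN.get(drug, set())
    mediator_side = PROTEIN_PROTEIN.get(mediator, set())
    return drug_side & mediator_side


def has_evidence(drug: str, mediator: str) -> bool:
    """True if at least one shared gene connects the drug and the mediator."""
    return bool(evidence_genes(drug, mediator))
```

In this toy setting, a drug-mediator pair with a non-empty intersection would count as having an evidence network, and a pair with an empty intersection would remain unexplained.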
In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics.

PubMed

Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

2017-01-01

Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/.
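The count of 43 16S rRNA sequences follows directly from the copy numbers stated in the abstract. A small sketch of that arithmetic is given below, together with a hypothetical helper (`count_unique`, an assumption for illustration) showing how identical sequences could be collapsed into unique ones; no real sequence data are included.

```python
from typing import Iterable

# Figures taken from the abstract.
COMPLETE_GENOMES = 8
DRAFT_GENOMES = 11
COPIES_PER_COMPLETE = 4  # 16S rRNA copies found in each complete genome
COPIES_PER_DRAFT = 1     # 16S rRNA copies found in each draft genome

total_genomes = COMPLETE_GENOMES + DRAFT_GENOMES
total_16s = (COMPLETE_GENOMES * COPIES_PER_COMPLETE
             + DRAFT_GENOMES * COPIES_PER_DRAFT)


def count_unique(sequences: Iterable[str]) -> int:
    """Count distinct sequences, ignoring case (hypothetical helper)."""
    return len({s.upper() for s in sequences})
```

Here `total_genomes` is 19 and `total_16s` is 8 × 4 + 11 × 1 = 43, matching the abstract; applying something like `count_unique` to the real 43 sequences is how a figure such as the reported 24 unique sequences would be obtained.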
The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolf, Paul G.; Karol, Kenneth G.; Mandoli, Dina F.

2005-02-01

We used a unique combination of techniques to sequence the first complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade hypothesized to represent the sister group to all other vascular plants. We used fluorescence-activated cell sorting (FACS) to isolate the organelles, rolling circle amplification (RCA) to amplify the genome, and shotgun sequencing to 8x depth coverage to obtain the complete chloroplast genome sequence. The genome is 154,373 bp, containing inverted repeats of 15,314 bp each, a large single-copy region of 104,088 bp, and a small single-copy region of 19,671 bp. Gene order is more similar to those of mosses, liverworts, and hornworts than to gene order for other vascular plants. For example, the Huperzia chloroplast genome possesses the bryophyte gene order for a previously characterized 30 kb inversion, thus supporting the hypothesis that lycophytes are sister to all other extant vascular plants. The lycophyte chloroplast genome data also enable a better reconstruction of the basal tracheophyte genome, which is useful for inferring relationships among bryophyte lineages. Several unique characters are observed in Huperzia, such as movement of the gene ndhF from the small single copy region into the inverted repeat. We present several analyses of evolutionary relationships among land plants by using nucleotide data, amino acid sequences, and by comparing gene arrangements from chloroplast genomes. The results, while still tentative pending the large number of chloroplast genomes from other key lineages that are soon to be sequenced, are intriguing in themselves, and contribute to a growing comparative database of genomic and morphological data across the green plants.
In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics

PubMed Central

Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

2017-01-01

Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563
The mitochondrial genome sequences of the round goby and the sand goby reveal patterns of recent evolution in gobiid fish.

PubMed

Adrian-Kalchhauser, Irene; Svensson, Ola; Kutschera, Verena E; Alm Rosenblad, Magnus; Pippel, Martin; Winkler, Sylke; Schloissnig, Siegfried; Blomberg, Anders; Burkhardt-Holm, Patricia

2017-02-16

Vertebrate mitochondrial genomes are optimized for fast replication and low cost of RNA expression. Accordingly, they are devoid of introns, are transcribed as polycistrons and contain very little intergenic sequences. Usually, vertebrate mitochondrial genomes measure between 16.5 and 17 kilobases (kb). During genome sequencing projects for two novel vertebrate models, the invasive round goby and the sand goby, we found that the sand goby genome is exceptionally small (16.4 kb), while the mitochondrial genome of the round goby is much larger than expected for a vertebrate. It is 19 kb in size and is thus one of the largest fish and even vertebrate mitochondrial genomes known to date. The expansion is attributable to a sequence insertion downstream of the putative transcriptional start site. This insertion carries traces of repeats from the control region, but is mostly novel. To get more information about this phenomenon, we gathered all available mitochondrial genomes of Gobiidae and of nine gobioid species, performed phylogenetic analyses, analysed gene arrangements, and compared gobiid mitochondrial genome sizes, ecological information and other species characteristics with respect to the mitochondrial phylogeny. This allowed us amongst others to identify a unique arrangement of tRNAs among Ponto-Caspian gobies. Our results indicate that the round goby mitochondrial genome may contain novel features. Since mitochondrial genome organisation is tightly linked to energy metabolism, these features may be linked to its invasion success. Also, the unique tRNA arrangement among Ponto-Caspian gobies may be helpful in studying the evolution of this highly adaptive and invasive species group. Finally, we find that the phylogeny of gobiids can be further refined by the use of longer stretches of linked DNA sequence.

Systematic CpT (ApG) depletion and CpG excess are unique genomic signatures of large DNA viruses infecting invertebrates.

PubMed

Upadhyay, Mohita; Sharma, Neha; Vivekanandan, Perumal

2014-01-01

Differences in the relative abundance of dinucleotides, if any, may provide important clues on host-driven evolution of viruses. We studied dinucleotide frequencies of large DNA viruses infecting vertebrates (n = 105; viruses infecting mammals = 99; viruses infecting aves = 6; viruses infecting reptiles = 1) and invertebrates (n = 88; viruses infecting insects = 84; viruses infecting crustaceans = 4). We have identified systematic depletion of CpT(ApG) dinucleotides and over-representation of CpG dinucleotides as the unique genomic signature of large DNA viruses infecting invertebrates. Detailed investigation of this unique genomic signature suggests the existence of invertebrate host-induced pressures specifically targeting CpT(ApG) and CpG dinucleotides. The depletion of CpT dinucleotides among large DNA viruses infecting invertebrates is, at least in part, explained by non-canonical DNA methylation by the infected host. Our findings highlight the role of invertebrate host-related factors in shaping virus evolution, and they also provide the necessary framework for future studies on evolution, epigenetics and molecular biology of viruses infecting this group of hosts.
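One common way to quantify dinucleotide depletion or excess is the observed/expected odds ratio, rho(XY) = f(XY) / (f(X) · f(Y)), where values well below 1 suggest depletion and values well above 1 suggest over-representation. The abstract does not say which statistic the authors used, so the sketch below is only an assumption-laden illustration on a single strand; the function name `dinucleotide_odds_ratios` is hypothetical.

```python
from collections import Counter
from typing import Dict


def dinucleotide_odds_ratios(seq: str) -> Dict[str, float]:
    """Observed/expected ratio for every dinucleotide present in seq.

    Single-strand sketch: it does not pool a dinucleotide with its
    reverse complement (e.g. CpT with ApG) and does not filter
    ambiguous bases such as N.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = Counter(seq)
    # Overlapping dinucleotides: positions 0-1, 1-2, ..., (n-2)-(n-1).
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    ratios = {}
    for pair, count in di.items():
        f_xy = count / (n - 1)      # observed dinucleotide frequency
        f_x = mono[pair[0]] / n     # frequency of the first base
        f_y = mono[pair[1]] / n     # frequency of the second base
        ratios[pair] = f_xy / (f_x * f_y)
    return ratios
```

Applied to a viral genome, the entries for "CT", "AG" and "CG" would be the ones of interest for the signature described above; a strand-symmetric variant would first concatenate the sequence with its reverse complement.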
Complete genome sequence of the acetylene-fermenting Pelobacter sp. strain SFB93

USGS Publications Warehouse

Sutton, John M.; Baesman, Shaun; Fierst, Janna L.; Poret-Peterson, Amisha T.; Oremland, Ronald S.; Dunlap, Darren S.; Akob, Denise M.

2017-01-01

Acetylene fermentation is a rare metabolism that was previously reported as being unique to Pelobacter acetylenicus. Here, we report the genome sequence of Pelobacter sp. strain SFB93, an acetylene-fermenting bacterium isolated from sediments collected in San Francisco Bay, CA.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chauhan, Archana; Layton, Alice; Williams, Daniel W

Pseudomonas fluorescens strain HK44 (DSM 6700) is a genetically engineered lux-based bioluminescent bioreporter. Here we report the draft genome sequence of strain HK44. Annotation of ~6.1 Mb sequence indicates that 30% of the traits are unique and distributed over 5 genomic islands, a prophage and two plasmids.
Developing improved durum wheat germplasm by altering the cytoplasmic genome

USDA-ARS's Scientific Manuscript database

In eukaryotic organisms, nuclear and cytoplasmic genomes interact to drive cellular functions. These genomes have co-evolved to form specific nuclear-cytoplasmic interactions that are essential to the origin, success, and evolution of diploid and polyploid species. Hundreds of genetic diseases in h...
The genome of Mesobuthus martensii reveals a unique adaptation model of arthropods

PubMed Central

Cao, Zhijian; Yu, Yao; Wu, Yingliang; Hao, Pei; Di, Zhiyong; He, Yawen; Chen, Zongyun; Yang, Weishan; Shen, Zhiyong; He, Xiaohua; Sheng, Jia; Xu, Xiaobo; Pan, Bohu; Feng, Jing; Yang, Xiaojuan; Hong, Wei; Zhao, Wenjuan; Li, Zhongjie; Huang, Kai; Li, Tian; Kong, Yimeng; Liu, Hui; Jiang, Dahe; Zhang, Binyan; Hu, Jun; Hu, Youtian; Wang, Bin; Dai, Jianliang; Yuan, Bifeng; Feng, Yuqi; Huang, Wei; Xing, Xiaojing; Zhao, Guoping; Li, Xuan; Li, Yixue; Li, Wenxin

2013-01-01

Representing a basal branch of arachnids, scorpions are known as ‘living fossils’ that maintain an ancient anatomy and are adapted to have survived extreme climate changes. Here we report the genome sequence of Mesobuthus martensii, containing 32,016 protein-coding genes, the most among sequenced arthropods. Although M. martensii appears to evolve conservatively, it has a greater gene family turnover than the insects that have undergone diverse morphological and physiological changes, suggesting the decoupling of the molecular and morphological evolution in scorpions. Underlying the long-term adaptation of scorpions is the expansion of the gene families enriched in basic metabolic pathways, signalling pathways, neurotoxins and cytochrome P450, and the different dynamics of expansion between the shared and the scorpion lineage-specific gene families. Genomic and transcriptomic analyses further illustrate the important genetic features associated with prey, nocturnal behaviour, feeding and detoxification. The M. martensii genome reveals a unique adaptation model of arthropods, offering new insights into the genetic bases of the living fossils. PMID:24129506
The genome of Th17 cell-inducing segmented filamentous bacteria reveals extensive auxotrophy and adaptations to the intestinal environment

PubMed Central

Sczesnak, Andrew; Segata, Nicola; Qin, Xiang; Gevers, Dirk; Petrosino, Joseph F.; Huttenhower, Curtis; Littman, Dan R.; Ivanov, Ivaylo I.

2011-01-01

Summary Perturbations of the composition of the symbiotic intestinal microbiota can have profound consequences for host metabolism and immunity. In mice, segmented filamentous bacteria (SFB) direct the accumulation of potentially pro-inflammatory Th17 cells in the intestinal lamina propria. We present the genome sequence of SFB isolated from mono-colonized mice, which classifies SFB phylogenetically as a unique member of Clostridiales with a highly reduced genome. Annotation analysis demonstrates that SFB depends on its environment for amino acids and essential nutrients and may utilize host and dietary glycans for carbon, nitrogen, and energy. Comparative analyses reveal that SFB is functionally related to members of the genus Clostridium and several pathogenic or commensal “minimal” genera, including Finegoldia, Mycoplasma, Borrelia, and Phytoplasma. However, SFB is functionally distinct from all 1,200 examined genomes, indicating a gene complement representing biology relatively unique to its role as a gut commensal closely tied to host metabolism and immunity. PMID:21925113
Genome analysis of three Pneumocystis species reveals adaptation mechanisms to life exclusively in mammalian hosts

PubMed Central

Ma, Liang; Chen, Zehua; Huang, Da Wei; Kutty, Geetha; Ishihara, Mayumi; Wang, Honghui; Abouelleil, Amr; Bishop, Lisa; Davey, Emma; Deng, Rebecca; Deng, Xilong; Fan, Lin; Fantoni, Giovanna; Fitzgerald, Michael; Gogineni, Emile; Goldberg, Jonathan M.; Handley, Grace; Hu, Xiaojun; Huber, Charles; Jiao, Xiaoli; Jones, Kristine; Levin, Joshua Z.; Liu, Yueqin; Macdonald, Pendexter; Melnikov, Alexandre; Raley, Castle; Sassi, Monica; Sherman, Brad T.; Song, Xiaohong; Sykes, Sean; Tran, Bao; Walsh, Laura; Xia, Yun; Yang, Jun; Young, Sarah; Zeng, Qiandong; Zheng, Xin; Stephens, Robert; Nusbaum, Chad; Birren, Bruce W.; Azadi, Parastoo; Lempicki, Richard A.; Cuomo, Christina A.; Kovacs, Joseph A.

2016-01-01

Pneumocystis jirovecii is a major cause of life-threatening pneumonia in immunosuppressed patients including transplant recipients and those with HIV/AIDS, yet surprisingly little is known about the biology of this fungal pathogen. Here we report near complete genome assemblies for three Pneumocystis species that infect humans, rats and mice. Pneumocystis genomes are highly compact relative to other fungi, with substantial reductions of ribosomal RNA genes, transporters, transcription factors and many metabolic pathways, but contain expansions of surface proteins, especially a unique and complex surface glycoprotein superfamily, as well as proteases and RNA processing proteins. Unexpectedly, the key fungal cell wall components chitin and outer chain N-mannans are absent, based on genome content and experimental validation. Our findings suggest that Pneumocystis has developed unique mechanisms of adaptation to life exclusively in mammalian hosts, including dependence on the lungs for gas and nutrients and highly efficient strategies to escape both host innate and acquired immune defenses. PMID:26899007
Differential Sox10 Genomic Occupancy in Myelinating Glia

PubMed Central

Lopez-Anido, Camila; Sun, Guannan; Koenning, Matthias; Srinivasan, Rajini; Hung, Holly A.; Emery, Ben; Keles, Sunduz; Svaren, John

2015-01-01

Myelin is formed by specialized myelinating glia: oligodendrocytes and Schwann cells in the central and peripheral nervous systems, respectively. While there are distinct developmental aspects and regulatory pathways in these two cell types, myelination in both systems requires the transcriptional activator Sox10. Sox10 interacts with cell type-specific transcription factors at some loci to induce myelin gene expression, but it is largely unknown how Sox10 transcriptional networks globally compare between oligodendrocytes and Schwann cells. We used in vivo ChIP-Seq analysis of spinal cord and peripheral nerve (sciatic nerve) to identify unique and shared Sox10 binding sites and assess their correlation with active enhancers and transcriptional profiles in oligodendrocytes and Schwann cells. Sox10 binding sites overlap with active enhancers and critical cell type-specific regulators of myelination, such as Olig2 and Myrf in oligodendrocytes, and Egr2/Krox20 in Schwann cells. Sox10 sites also associate with genes critical for myelination in both oligodendrocytes and Schwann cells, and are found within super-enhancers previously defined in brain. In Schwann cells, Sox10 sites contain binding motifs of putative partners in the Sp/Klf, Tead, and nuclear receptor protein families. Specifically, siRNA analysis of nuclear receptors Nr2f1 and Nr2f2 revealed downregulation of myelin genes Mbp and Ndrg1 in primary Schwann cells. Our analysis highlights different mechanisms that establish cell type-specific genomic occupancy of Sox10, which reflects the unique characteristics of oligodendrocyte and Schwann cell differentiation. PMID:25974668
Differential gene transcription across the life cycle in Daphnia magna using a new all genome custom-made microarray.

PubMed

Campos, Bruno; Fletcher, Danielle; Piña, Benjamín; Tauler, Romà; Barata, Carlos

2018-05-18

Unravelling the link between genes and environment across the life cycle is a challenging goal that requires model organisms with well-characterized life-cycles, ecological interactions in nature, tractability in the laboratory, and available genomic tools. Very few well-studied invertebrate model species meet these requirements, the waterflea Daphnia magna being one of them. Here we report a full genome transcription profiling of D. magna during its life-cycle. The study was performed using a new microarray platform designed from the complete set of gene models representing the whole transcribed genome of D. magna. Up to 93% of the existing 41,317 D. magna gene models showed differential transcription patterns across the developmental stages of D. magna, 59% of which were functionally annotated. Embryos showed the highest number of unique transcribed genes, mainly related to DNA, RNA, and ribosome biogenesis, likely related to cellular proliferation and morphogenesis of the several body organs. Adult females showed an enrichment of transcripts for genes involved in reproductive processes. These female-specific transcripts were essentially absent in males, whose transcriptome was enriched in genes specific to male sexual differentiation, like doublesex. Our results define major characteristics of transcriptional programs involved in the life-cycle, differentiate males and females, and show that large scale gene-transcription data collected in whole animals can be used to identify genes involved in specific biological and biochemical processes.
Complete Genome Sequence of Yersinia pestis Strains Antiqua and Nepal516: Evidence of Gene Reduction in an Emerging Pathogen

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chain, Patrick S.G.; Hu, Ping; Malfatti, Stephanie A.

2006-01-16

Yersinia pestis, the causative agent of bubonic and pneumonic plague, has undergone detailed study at the molecular level. To further investigate the genomic diversity among this group and to help characterize lineages of the plague organism that have no sequenced members, we present here the genomes of two isolates of the "classical" Antiqua biovar, strains Antiqua and Nepal516. The genomes of Antiqua and Nepal516 are 4.7 Mb and 4.5 Mb and encode 4,138 and 3,956 open reading frames respectively. Though both strains belong to one of the three classical biovars, they represent separate lineages defined by recent phylogenetic studies. We compare all five currently sequenced Y. pestis genomes and the corresponding features in Y. pseudotuberculosis. There are strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms and a unique distribution of insertion sequences. We found 453 single nucleotide polymorphisms in protein coding regions, which were used to assess evolutionary relationships of these Y. pestis strains. Gene reduction analysis revealed that the gene deletion processes are under selective pressure and many of the inactivations are probably related to the organism's interaction with its host environment. The results presented here clearly demonstrate the differences between the two Antiqua lineages and support the notion that grouping Y. pestis strains based strictly on the classical definition of biovars (predicated upon two biochemical assays) does not accurately reflect the phylogenetic relationships within this species. Comparison of four virulent Y. pestis strains with the human-avirulent strain 91001 provides further insight into the genetic basis of virulence to humans.
Detection of PIWI and piRNAs in the mitochondria of mammalian cancer cells

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kwon, ChangHyuk; Tak, Hyosun; Rho, Mina

2014-03-28

Highlights: • piRNA sequences were mapped to the human mitochondrial (mt) genome. • We inspected small RNA-Seq datasets from somatic cell mt subcellular fractions. • Piwi and piRNA transcripts are present in mammalian somatic cancer cell mt fractions. Abstract: Piwi-interacting RNAs (piRNAs) are 26–31 nt small noncoding RNAs that are processed from their longer precursor transcripts by Piwi proteins. Localization of Piwi and piRNA has been reported mostly in the nucleus and cytoplasm of germ-line cells of higher eukaryotes, where it is believed that known piRNA sequences are located in repeat regions of the nuclear genome. However, localization of PIWI and piRNA in mammalian somatic cell mitochondria remains largely unknown. We identified 29 piRNA sequence alignments from various regions of the human mitochondrial genome. Twelve out of 29 piRNA sequences matched stem-loop fragment sequences of seven distinct tRNAs. We observed their actual expression in mitochondria subcellular fractions by inspecting mitochondrial-specific small RNA-Seq datasets. Of interest, the majority of the 29 piRNAs overlapped with multiple longer transcripts (expressed sequence tags) that are unique to the human mitochondrial genome. The presence of mature piRNAs in mitochondria was detected by qRT-PCR of mitochondrial subcellular RNAs. Further validation showed detection of Piwi by colocalization using anti-Piwil1 and mitochondria organelle-specific protein antibodies.
Single-cell sequencing provides clues about the host interactions of segmented filamentous bacteria (SFB)

PubMed Central

Pamp, Sünje J.; Harrington, Eoghan D.; Quake, Stephen R.; Relman, David A.; Blainey, Paul C.

2012-01-01

Segmented filamentous bacteria (SFB) are host-specific intestinal symbionts that comprise a distinct clade within the Clostridiaceae, designated Candidatus Arthromitus. SFB display a unique life cycle within the host, involving differentiation into multiple cell types. The latter include filaments that attach intimately to intestinal epithelial cells, and from which “holdfasts” and spores develop. SFB induce a multifaceted immune response, leading to host protection from intestinal pathogens. Cultivation resistance has hindered characterization of these enigmatic bacteria. In the present study, we isolated five SFB filaments from a mouse using a microfluidic device equipped with laser tweezers, generated genome sequences from each, and compared these sequences with each other, as well as to recently published SFB genome sequences. Based on the resulting analyses, SFB appear to be dependent on the host for a variety of essential nutrients. SFB have a relatively high abundance of predicted proteins devoted to cell cycle control and to envelope biogenesis, and have a group of SFB-specific autolysins and a dynamin-like protein. Among the five filament genomes, an average of 8.6% of predicted proteins were novel, including a family of secreted SFB-specific proteins. Four ADP-ribosyltransferase (ADPRT) sequence types, and a myosin-cross-reactive antigen (MCRA) protein were discovered; we hypothesize that they are involved in modulation of host responses. The presence of polymorphisms among mouse SFB genomes suggests the evolution of distinct SFB lineages. Overall, our results reveal several aspects of SFB adaptation to the mammalian intestinal tract. PMID:22434425
Sex and parasites: genomic and transcriptomic analysis of Microbotryum lychnidis-dioicae, the biotrophic and plant-castrating anther smut fungus.

PubMed

Perlin, Michael H; Amselem, Joelle; Fontanillas, Eric; Toh, Su San; Chen, Zehua; Goldberg, Jonathan; Duplessis, Sebastien; Henrissat, Bernard; Young, Sarah; Zeng, Qiandong; Aguileta, Gabriela; Petit, Elsa; Badouin, Helene; Andrews, Jared; Razeeq, Dominique; Gabaldón, Toni; Quesneville, Hadi; Giraud, Tatiana; Hood, Michael E; Schultz, David J; Cuomo, Christina A

2015-06-16

The genus Microbotryum includes plant pathogenic fungi afflicting a wide variety of hosts with anther smut disease. Microbotryum lychnidis-dioicae infects Silene latifolia and replaces host pollen with fungal spores, exhibiting biotrophy and necrosis associated with altering plant development. We determined the haploid genome sequence for M. lychnidis-dioicae and analyzed whole transcriptome data from plant infections and other stages of the fungal lifecycle, revealing the inventory and expression level of genes that facilitate pathogenic growth. Compared to related fungi, an expanded number of major facilitator superfamily transporters and secretory lipases were detected; lipase gene expression was found to be altered by exposure to lipid compounds, which signaled a switch to dikaryotic, pathogenic growth. In addition, while enzymes to digest cellulose, xylan, xyloglucan, and highly substituted forms of pectin were absent, along with depletion of peroxidases and superoxide dismutases that protect the fungus from oxidative stress, the repertoire of glycosyltransferases and of enzymes that could manipulate host development has expanded. A total of 14% of the genome was categorized as repetitive sequences. Transposable elements have accumulated in mating-type chromosomal regions and were also associated across the genome with gene clusters of small secreted proteins, which may mediate host interactions. The unique absence of enzyme classes for plant cell wall degradation and maintenance of enzymes that break down components of pollen tubes and flowers provides a striking example of biotrophic host adaptation.
Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis

PubMed Central

Mehdizadeh Gohari, Iman; Kropinski, Andrew M.; Weese, Scott J.; Parreira, Valeria R.; Whitehead, Ashley E.; Boerlin, Patrick; Prescott, John F.

2016-01-01

The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. 
These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus. PMID:26859667
Tissue-aware data integration approach for the inference of pathway interactions in metazoan organisms

PubMed Central

Park, Christopher Y.; Krishnan, Arjun; Zhu, Qian; Wong, Aaron K.; Lee, Young-Suk; Troyanskaya, Olga G.

2015-01-01

Motivation: Leveraging the large compendium of genomic data to predict biomedical pathways and specific mechanisms of protein interactions genome-wide in metazoan organisms has been challenging. In contrast to unicellular organisms, biological and technical variation originating from diverse tissues and cell-lineages is often the largest source of variation in metazoan data compendia. Therefore, a new computational strategy accounting for the tissue heterogeneity in the functional genomic data is needed to accurately translate the vast amount of human genomic data into specific interaction-level hypotheses. Results: We developed an integrated, scalable strategy for inferring multiple human gene interaction types that takes advantage of data from diverse tissue and cell-lineage origins. Our approach specifically predicts both the presence of a functional association and also the most likely interaction type among human genes or their protein products on a whole-genome scale. We demonstrate that directly incorporating tissue contextual information improves the accuracy of our predictions, and further, that such genome-wide results can be used to significantly refine regulatory interactions from primary experimental datasets (e.g. ChIP-Seq, mass spectrometry). Availability and implementation: An interactive website hosting all of our interaction predictions is publicly available at http://pathwaynet.princeton.edu. Software was implemented using the open-source Sleipnir library, which is available for download at https://bitbucket.org/libsleipnir/libsleipnir.bitbucket.org. Contact: ogt@cs.princeton.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25431329
Draft Genome Sequence of Pedobacter sp. Strain V48, Isolated from a Coastal Sand Dune in the Netherlands

PubMed Central

Bitzer, Adam S.; Garbeva, Paolina

2014-01-01

Pedobacter sp. strain V48 participates in an interaction with Pseudomonas fluorescens which elicits interaction-induced phenotypes. We report the draft genome sequence of Pedobacter sp. V48, consisting of 6.46 Mbp. The sequence will contribute to improved understanding of the genus and facilitate genomic analysis of the model interspecies interaction with P. fluorescens. PMID:24578271
Whole-genome sequence, SNP chips and pedigree structure: building demographic profiles in domestic dog breeds to optimize genetic-trait mapping.

PubMed

Dreger, Dayna L; Rimbault, Maud; Davis, Brian W; Bhatnagar, Adrienne; Parker, Heidi G; Ostrander, Elaine A

2016-12-01

In the decade following publication of the draft genome sequence of the domestic dog, extraordinary advances with application to several fields have been credited to the canine genetic system. Taking advantage of closed breeding populations and the subsequent selection for aesthetic and behavioral characteristics, researchers have leveraged the dog as an effective natural model for the study of complex traits, such as disease susceptibility, behavior and morphology, generating unique contributions to human health and biology. When designing genetic studies using purebred dogs, it is essential to consider the unique demography of each population, including estimation of effective population size and timing of population bottlenecks. The analytical design approach for genome-wide association studies (GWAS) and analysis of whole-genome sequence (WGS) experiments are inextricable from demographic data. We have performed a comprehensive study of genomic homozygosity, using high-depth WGS data for 90 individuals, and Illumina HD SNP data from 800 individuals representing 80 breeds. These data were coupled with extensive pedigree data analyses for 11 breeds that, together, allowed us to compute breed structure, demography, and molecular measures of genome diversity. Our comparative analyses characterize the extent, formation and implication of breed-specific diversity as it relates to population structure. These data demonstrate the relationship between breed-specific genome dynamics and population architecture, and provide important considerations influencing the technological and cohort design of association and other genomic studies. © 2016. Published by The Company of Biologists Ltd.
Comparative genomic analysis shows that Streptococcus suis meningitis isolate SC070731 contains a unique 105K genomic island.

PubMed

Wu, Zongfu; Wang, Weixue; Tang, Min; Shao, Jing; Dai, Chen; Zhang, Wei; Fan, Hongjie; Yao, Huochun; Zong, Jie; Chen, Dai; Wang, Junning; Lu, Chengping

2014-02-10

Streptococcus suis (SS) is an important swine pathogen worldwide that occasionally causes serious infections in humans. SS infection may result in meningitis in pigs and humans. The pathogenic mechanisms of SS are poorly understood. Here, we provide the complete genome sequence of S. suis serotype 2 (SS2) strain SC070731 isolated from a pig with meningitis. The chromosome is 2,138,568bp in length. There are 1933 predicted protein coding sequences and 96.7% (57/59) of the known virulence-associated genes are present in the genome. Strain SC070731 showed similar virulence with SS2 virulent strains HA9801 and ZY05719, but was more virulent than SS2 virulent strain P1/7 in the zebrafish infection model. Comparative genomic analysis revealed a unique 105K genomic island in strain SC070731 that is absent in seven other sequenced SS2 strains. Further analysis of the 105K genomic island indicated that it contained a complete nisin locus similar to the nisin U locus in S. uberis strain 42, a prophage similar to S. oralis phage PH10 and several antibiotic resistance genes. Several proteins in the 105K genomic island, including nisin and RelBE toxin-antitoxin system, contribute to the bacterial fitness and virulence in other pathogenic bacteria. Further investigation of newly identified gene products, including four putative new virulence-associated surface proteins, will improve our understanding of SS pathogenesis. Copyright © 2013 Elsevier B.V. All rights reserved.
Whole-genome sequence, SNP chips and pedigree structure: building demographic profiles in domestic dog breeds to optimize genetic-trait mapping

PubMed Central

Dreger, Dayna L.; Rimbault, Maud; Davis, Brian W.; Bhatnagar, Adrienne; Parker, Heidi G.

2016-01-01

In the decade following publication of the draft genome sequence of the domestic dog, extraordinary advances with application to several fields have been credited to the canine genetic system. Taking advantage of closed breeding populations and the subsequent selection for aesthetic and behavioral characteristics, researchers have leveraged the dog as an effective natural model for the study of complex traits, such as disease susceptibility, behavior and morphology, generating unique contributions to human health and biology. When designing genetic studies using purebred dogs, it is essential to consider the unique demography of each population, including estimation of effective population size and timing of population bottlenecks. The analytical design approach for genome-wide association studies (GWAS) and analysis of whole-genome sequence (WGS) experiments are inextricable from demographic data. We have performed a comprehensive study of genomic homozygosity, using high-depth WGS data for 90 individuals, and Illumina HD SNP data from 800 individuals representing 80 breeds. These data were coupled with extensive pedigree data analyses for 11 breeds that, together, allowed us to compute breed structure, demography, and molecular measures of genome diversity. Our comparative analyses characterize the extent, formation and implication of breed-specific diversity as it relates to population structure. These data demonstrate the relationship between breed-specific genome dynamics and population architecture, and provide important considerations influencing the technological and cohort design of association and other genomic studies. PMID:27874836
Visualization for genomics: the Microbial Genome Viewer.

PubMed

Kerkhoven, Robert; van Enckevort, Frank H J; Boekhorst, Jos; Molenaar, Douwe; Siezen, Roland J

2004-07-22

A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a MySQL database. The generated images are in scalable vector graphics (SVG) format, which is suitable for creating high-quality scalable images and dynamic Web representations. Gene-related data such as transcriptome and time-course microarray experiments can be superimposed on the maps for visual inspection. The Microbial Genome Viewer 1.0 is freely available at http://www.cmbi.kun.nl/MGV

Integrated genome browser: visual analytics platform for genomics.

PubMed

Freese, Nowlan H; Norris, David C; Loraine, Ann E

2016-07-15

Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experiments. To demonstrate, we describe example visualizations and analyses of datasets from RNA-Seq, ChIP-Seq and bisulfite sequencing experiments. Understanding results from genome-scale experiments requires viewing the data in the context of reference genome annotations and other related datasets. To facilitate this, we enhanced IGB's ability to consume data from diverse sources, including Galaxy, Distributed Annotation and IGB-specific Quickload servers. To support future visualization needs as new genome-scale assays enter wide use, we transformed the IGB codebase into a modular, extensible platform for developers to create and deploy all-new visualizations of genomic data. IGB is open source and is freely available from http://bioviz.org/igb. Contact: aloraine@uncc.edu. © The Author 2016. Published by Oxford University Press.
Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome.

PubMed

Liu, Qiang; Wang, Xue-Feng; Ma, Jian; He, Xi-Jun; Wang, Xiao-Jun; Zhou, Jian-Hua

2015-06-19

Human immunodeficiency virus (HIV)-1 has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors.
Advancing stroke genomic research in the age of Trans-Omics big data science: Emerging priorities and opportunities.

PubMed

Owolabi, Mayowa; Peprah, Emmanuel; Xu, Huichun; Akinyemi, Rufus; Tiwari, Hemant K; Irvin, Marguerite R; Wahab, Kolawole Wasiu; Arnett, Donna K; Ovbiagele, Bruce

2017-11-15

We systematically reviewed the genetic variants associated with stroke in genome-wide association studies (GWAS) and examined the emerging priorities and opportunities for rapidly advancing stroke research in the era of Trans-Omics science. Using the PRISMA guideline, we searched PubMed and the NHGRI-EBI GWAS catalog for stroke studies from 2007 to May 2017. We included 31 studies. The major challenge is that the few validated variants could not account for the full genetic risk of stroke and have not been translated for clinical use. None of the studies included continental Africans. Genomic study of stroke among Africans presents a unique opportunity for the discovery, validation, functional annotation, Trans-Omics study and translation of genomic determinants of stroke with implications for global populations. This is because all humans originated from Africa, a continent with a unique genomic architecture and a distinctive epidemiology of stroke, as well as substantially higher heritability and resolution of fine mapping of stroke genes. Understanding the genomic determinants of stroke and the corresponding molecular mechanisms will revolutionize the development of a new set of precise biomarkers for stroke prediction, diagnosis and prognostic estimates as well as personalized interventions for reducing the global burden of stroke. Copyright © 2017 Elsevier B.V. All rights reserved.
Comparative Genomic Analyses of Clavibacter michiganensis subsp. insidiosus and Pathogenicity on Medicago truncatula.

PubMed

Lu, You; Ishimaru, Carol A; Glazebrook, Jane; Samac, Deborah A

2018-02-01

Clavibacter michiganensis is the most economically important gram-positive bacterial plant pathogen, with subspecies that cause serious diseases of maize, wheat, tomato, potato, and alfalfa. Much less is known about pathogenesis involving gram-positive plant pathogens than is known for gram-negative bacteria. Comparative genome analyses of C. michiganensis subspecies affecting tomato, potato, and maize have provided insights on pathogenicity. In this study, we identified strains of C. michiganensis subsp. insidiosus with contrasting pathogenicity on three accessions of the model legume Medicago truncatula. We generated complete genome sequences for two strains and compared these to a previously sequenced strain and genome sequences of four other subspecies. The three C. michiganensis subsp. insidiosus strains varied in gene content due to genome rearrangements, most likely facilitated by insertion elements, and plasmid number, which varied from one to three depending on strain. The core C. michiganensis genome consisted of 1,917 genes, with 379 genes unique to C. michiganensis subsp. insidiosus. An operon for synthesis of the extracellular blue pigment indigoidine, enzymes for pectin degradation, and an operon for inositol metabolism are among the unique features. Secreted serine proteases belonging to both the pat-1 and ppa families were present but highly diverged from those in other subspecies.
Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome

PubMed Central

Liu, Qiang; Wang, Xue-Feng; Ma, Jian; He, Xi-Jun; Wang, Xiao-Jun; Zhou, Jian-Hua

2015-01-01

Human immunodeficiency virus (HIV)-1 has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors. PMID:26102582
Selective intra-dinucleotide interactions and periodicities of bases separated by K sites: a new vision and tool for phylogeny analyses.

PubMed

Valenzuela, Carlos Y

2017-02-13

Direct tests of the random or non-random distribution of nucleotides on genomes have been devised to test the hypothesis of neutral, nearly-neutral or selective evolution. These tests are based on the direct base distribution and are independent of the functional (coding or non-coding) or structural (repeated or unique sequences) properties of the DNA. The first approach described the longitudinal distribution of bases in tandem repeats under the Bose-Einstein statistics. A huge deviation from randomness was found. A second approach was the study of the base distribution within dinucleotides whose bases were separated by 0, 1, 2… K nucleotides. Again an enormous difference from the random distribution was found, with significances beyond the range of tables and programs. These test values were periodic and included the 16 dinucleotides. For example, a high "positive" (more dinucleotides observed than expected) value, found in dinucleotides whose bases were separated by (3K + 2) sites, was preceded by two smaller "negative" (fewer dinucleotides observed than expected) values, whose bases were separated by (3K) or (3K + 1) sites. We examined mtDNAs, prokaryote genomes and some eukaryote chromosomes and found that the significant non-random interactions and periodicities were present up to 1000 or more sites of base separation and, in human chromosome 21, up to separations of more than 10 million sites. Each nucleotide has its own significant value of its distance to neutrality; this yields 16 hierarchical significances. A three-dimensional table with the number of sites of separation between the bases and the 16 significances (the third dimension is the dinucleotide, individual or taxon involved) gives directly an evolutionary state of the analyzed genome that can be used to obtain phylogenies. An example is provided.
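As an illustration of the second approach in the record above, the following Python sketch counts pairs of bases separated by K sites and compares them with the counts expected if the two positions were independent. The function names and the chi-square-style summary are assumptions made for this example; the abstract does not specify the exact test statistic used by the author.

```python
from collections import Counter

BASES = "ACGT"

def spaced_dinucleotide_table(seq, k):
    """Observed and expected counts of base pairs separated by k sites.

    k = 0 means adjacent bases. Expected counts assume independence of the
    two positions (an assumption of this sketch), using the base frequencies
    seen at the first and second positions of the pairs.
    Returns a dict mapping each 2-letter pair to (observed, expected).
    """
    seq = seq.upper()
    gap = k + 1
    # Keep only pairs where both positions hold an unambiguous base.
    pairs = [(seq[i], seq[i + gap])
             for i in range(len(seq) - gap)
             if seq[i] in BASES and seq[i + gap] in BASES]
    n = len(pairs)
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    observed = Counter(pairs)
    table = {}
    for a in BASES:
        for b in BASES:
            expected = first[a] * second[b] / n if n else 0.0
            table[a + b] = (observed[(a, b)], expected)
    return table

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over the 16 dinucleotides."""
    return sum((o - e) ** 2 / e for o, e in table.values() if e > 0)
```

Calling `spaced_dinucleotide_table` for a range of K values and plotting `chi_square` against K is one way to look for the kind of periodicity the abstract describes.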
Alu-miRNA interactions modulate transcript isoform diversity in stress response and reveal signatures of positive selection

NASA Astrophysics Data System (ADS)

Pandey, Rajesh; Bhattacharya, Aniket; Bhardwaj, Vivek; Jha, Vineet; Mandal, Amit K.; Mukerji, Mitali

2016-09-01

Primate-specific Alus harbor different regulatory features, including miRNA targets. In this study, we provide evidence for miRNA-mediated modulation of transcript isoform levels during heat-shock response through exaptation of Alu-miRNA sites in mature mRNA. We performed genome-wide expression profiling coupled with functional validation of miRNA target sites within exonized Alus, and analyzed conservation of these targets across primates. We observed that two miRNAs (miR-15a-3p and miR-302d-3p) elevated in stress response, target RAD1, GTSE1, NR2C1, FKBP9 and UBE2I exclusively within Alu. These genes map onto the p53 regulatory network. Ectopic overexpression of miR-15a-3p downregulates GTSE1 and RAD1 at the protein level and enhances cell survival. This Alu-mediated fine-tuning seems to be unique to humans as evident from the absence of orthologous sites in other primate lineages. We further analyzed signatures of selection on Alu-miRNA targets in the genome, using 1000 Genomes Phase-I data. We found that 198 out of 3177 Alu-exonized genes exhibit signatures of selection within Alu-miRNA sites, with 60 of them containing SNPs supported by multiple evidences (global-FST > 0.3, pair-wise-FST > 0.5, Fay-Wu’s H < -20, iHS > 2.0, high ΔDAF) and implicated in p53 network. We propose that by affecting multiple genes, Alu-miRNA interactions have the potential to facilitate population-level adaptations in response to environmental challenges.
Integration and visualization of systems biology data in context of the genome

PubMed Central

2010-01-01

Background: High-density tiling arrays and new sequencing technologies are generating rapidly increasing volumes of transcriptome and protein-DNA interaction data. Visualization and exploration of this data is critical to understanding the regulatory logic encoded in the genome by which the cell dynamically affects its physiology and interacts with its environment. Results: The Gaggle Genome Browser is a cross-platform desktop program for interactively visualizing high-throughput data in the context of the genome. Important features include dynamic panning and zooming, keyword search and open interoperability through the Gaggle framework. Users may bookmark locations on the genome with descriptive annotations and share these bookmarks with other users. The program handles large sets of user-generated data using an in-process database and leverages the facilities of SQL and the R environment for importing and manipulating data. A key aspect of the Gaggle Genome Browser is interoperability. By connecting to the Gaggle framework, the genome browser joins a suite of interconnected bioinformatics tools for analysis and visualization with connectivity to major public repositories of sequences, interactions and pathways. To this flexible environment for exploring and combining data, the Gaggle Genome Browser adds the ability to visualize diverse types of data in relation to its coordinates on the genome. Conclusions: Genomic coordinates function as a common key by which disparate biological data types can be related to one another. In the Gaggle Genome Browser, heterogeneous data are joined by their location on the genome to create information-rich visualizations yielding insight into genome organization, transcription and its regulation and, ultimately, a better understanding of the mechanisms that enable the cell to dynamically respond to its environment. PMID:20642854
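The idea in the record above that genomic coordinates act as a common key can be sketched in a few lines of Python. This is only an illustration of joining two data types by location; the tuple layouts and function names are hypothetical and are not taken from the Gaggle Genome Browser.

```python
def overlaps(a, b):
    """True if two (sequence, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def join_by_location(genes, signals):
    """Attach measured values to genes that share genomic coordinates.

    genes:   list of (name, sequence, start, end)      # hypothetical layout
    signals: list of (sequence, start, end, value)     # e.g. tiling-array probes
    Returns a dict mapping gene name to the list of overlapping values.
    """
    joined = {}
    for name, sequence, start, end in genes:
        joined[name] = [value
                        for (s, lo, hi, value) in signals
                        if overlaps((sequence, start, end), (s, lo, hi))]
    return joined
```

A linear scan is enough for a small example; for genome-scale data an indexed structure such as an interval tree or a database query would normally replace the inner loop.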
Behavioral Economics: A New Lens for Understanding Genomic Decision Making.

PubMed

Moore, Scott Emory; Ulbrich, Holley H; Hepburn, Kenneth; Holaday, Bonnie; Mayo, Rachel; Sharp, Julia; Pruitt, Rosanne H

2018-05-01

This article seeks to take the next step in examining the insights that nurses and other healthcare providers can derive from applying behavioral economic concepts to support genomic decision making. As genomic science continues to permeate clinical practice, nurses must continue to adapt practice to meet new challenges. Decisions associated with genomics are often not simple and dichotomous in nature. They can be complex and challenging for all involved. This article offers an introduction to behavioral economics as a possible tool to help support patients', families', and caregivers' decision making related to genomics. Using current writings from nursing, ethics, behavioral economic, and other healthcare scholars, we review key concepts of behavioral economics and discuss their relevance to supporting genomic decision making. Behavioral economic concepts, particularly relativity, deliberation, and choice architecture, are specifically examined as new ways to view the complexities of genomic decision making. Each concept is explored through patient decision making and clinical practice examples. This article also discusses next steps and practice implications for further development of the behavioral economic lens in nursing. Behavioral economics provides valuable insight into the unique nature of genetic decision-making practices. Nurses are often a source of information and support for patients during clinical decision making. This article seeks to offer behavioral economic concepts as a framework for understanding and examining the unique nature of genomic decision making. As genetic and genomic testing become more common in practice, it will continue to grow in importance for nurses to be able to support the autonomous decision making of patients, their families, and caregivers. © 2018 Sigma Theta Tau International.
Analysis of Protein-DNA Interaction by Chromatin Immunoprecipitation and DNA Tiling Microarray (ChIP-on-chip).

PubMed

Gao, Hui; Zhao, Chunyan

2018-01-01

Chromatin immunoprecipitation (ChIP) has become the most effective and widely used tool to study the interactions between specific proteins or modified forms of proteins and a genomic DNA region. Combined with genome-wide profiling technologies, such as microarray hybridization (ChIP-on-chip) or massively parallel sequencing (ChIP-seq), ChIP could provide a genome-wide mapping of in vivo protein-DNA interactions in various organisms. Here, we describe a protocol of ChIP-on-chip that uses tiling microarray to obtain a genome-wide profiling of ChIPed DNA.
Increased prediction accuracy in wheat breeding trials using a marker x environment interaction genomic selection model

USDA-ARS's Scientific Manuscript database

Genomic selection (GS) models use genome-wide genetic information to predict genetic values of candidates for selection. Originally these models were developed without considering genotype × environment interaction (GE). Several authors have proposed extensions of the canonical GS model that accomm...
QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks

PubMed Central

Thibodeau, Asa; Márquez, Eladio J.; Luo, Oscar; Ruan, Yijun; Shin, Dong-Guk; Stitzel, Michael L.; Ucar, Duygu

2016-01-01

Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and HiC, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1) building and visualizing chromatin interaction networks, 2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3) querying network components based on gene name or chromosome location, and 4) utilizing network-based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. AVAILABILITY: QuIN’s web server is available at http://quin.jax.org. QuIN is developed in Java and JavaScript, utilizing an Apache Tomcat web server and MySQL database, and the source code is available under the GPLv3 license on GitHub: https://github.com/UcarLab/QuIN/. PMID:27336171
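The network view described in the record above can be sketched with a small Python example: interacting regions become nodes, interactions become undirected edges, degree serves as a simple network-based measure, and a breadth-first search finds direct and indirect partners. This is a generic sketch, not QuIN's implementation (which the abstract says is written in Java and JavaScript); the function names and the choice of degree as the ranking measure are assumptions.

```python
from collections import defaultdict, deque

def build_network(interactions):
    """Build an undirected graph from (region_a, region_b) interaction pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def rank_by_degree(graph):
    """Nodes sorted by number of direct partners (ties broken by name)."""
    return sorted(graph, key=lambda node: (-len(graph[node]), node))

def reachable(graph, start):
    """All direct and indirect interaction partners of `start` (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen
```

Degree is only the simplest possible measure; other centrality scores could be substituted in `rank_by_degree` without changing the rest of the sketch.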
FROG - Fingerprinting Genomic Variation Ontology

PubMed Central

Bhardwaj, Anshu

2015-01-01

Genetic variations play a crucial role in differential phenotypic outcomes. Given the complexity in establishing this correlation and the enormous data available today, it is imperative to design machine-readable, efficient methods to store, label, search and analyze this data. A semantic approach, FROG: “FingeRprinting Ontology of Genomic variations”, is implemented to label variation data, based on its location, function and interactions. FROG has six levels to describe the variation annotation, namely, chromosome, DNA, RNA, protein, variations and interactions. Each level is a conceptual aggregation of logically connected attributes, each of which comprises various properties for the variant. For example, in the chromosome level, one of the attributes is the location of the variation, which has two properties, allosomes or autosomes. Another attribute is variation kind, which has four properties, namely, indel, deletion, insertion, substitution. Likewise, there are 48 attributes and 278 properties to capture the variation annotation across six levels. Each property is then assigned a bit score, which in turn leads to generation of a binary fingerprint based on the combination of these properties (mostly taken from existing variation ontologies). FROG is a novel and unique method designed for the purpose of labeling the entire variation data generated to date for efficient storage, search and analysis. A web-based platform is designed as a test case for users to navigate sample datasets and generate fingerprints. The platform is available at http://ab-openlab.csir.res.in/frog. PMID:26244889
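The binary fingerprint described in the record above can be illustrated with a minimal Python sketch in which each property occupies one bit position. The six properties listed are the chromosome-level examples named in the abstract; their ordering, the tuple layout and the function names are assumptions for this example, and the real ontology defines 278 properties.

```python
# Hypothetical, shortened property list: (level, attribute, property).
# The bit order below is an assumption, not FROG's actual encoding.
PROPERTIES = [
    ("chromosome", "location", "autosome"),
    ("chromosome", "location", "allosome"),
    ("chromosome", "kind", "indel"),
    ("chromosome", "kind", "deletion"),
    ("chromosome", "kind", "insertion"),
    ("chromosome", "kind", "substitution"),
]

def fingerprint(annotations):
    """Return a bit string with '1' for every property the variant has."""
    present = set(annotations)
    unknown = present - set(PROPERTIES)
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    return "".join("1" if prop in present else "0" for prop in PROPERTIES)

def shared_bits(fp_a, fp_b):
    """Number of properties set in both fingerprints."""
    return sum(a == b == "1" for a, b in zip(fp_a, fp_b))
```

Fixed-length bit strings of this kind make storage compact and let searches reduce to bitwise comparisons, which fits the efficient storage and search goal stated in the abstract.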
Direct colorimetric detection of unamplified pathogen DNA by dextrin-capped gold nanoparticles.

PubMed

Baetsen-Young, Amy M; Vasher, Matthew; Matta, Leann L; Colgan, Phil; Alocilja, Evangelyn C; Day, Brad

2018-03-15

The interaction between gold nanoparticles (AuNPs) and nucleic acids has facilitated a variety of diagnostic applications, with further diversification of synthesis to match bio-applications while reducing biotoxicity. However, DNA interactions with unique surface capping agents have not been fully defined. Using dextrin-capped AuNPs (d-AuNPs), we have developed a novel unamplified genomic DNA (gDNA) nanosensor, exploiting dispersion and aggregation characteristics of d-AuNPs, in the presence of gDNA, for sequence-specific detection. We demonstrate that d-AuNPs are stable in a five-fold greater salt concentration than citrate-capped AuNPs and that the d-AuNPs were stabilized by a single-stranded DNA probe (ssDNAp). However, in the elevated salt concentrations of the DNA detection assay, the target reactions were surprisingly further stabilized by the formation of a ssDNAp-target gDNA complex. The results presented herein lead us to propose a mechanism whereby genomic ssDNA secondary structure formation during ssDNAp-to-target gDNA binding enables d-AuNP stabilization in elevated ionic environments. Using the assay described herein, we were successful in detecting as little as 2.94 fM of pathogen DNA, and using crude extractions of a pathogen matrix, as few as 18 spores/µL. Copyright © 2017 Elsevier B.V. All rights reserved.
Genome-wide gene expression changes associated with exposure of rat liver, heart, and kidney cells to endosulfan.

PubMed

Liu, Ruifeng; Printz, Richard L; Jenkins, Erin C; O'Brien, Tracy P; Te, Jerez A; Shiota, Masakazu; Wallqvist, Anders

2018-04-01

Endosulfan was once the most commonly used pesticide in agriculture and horticulture. It is an environmentally persistent organochlorine compound with the potential to bioaccumulate as it progresses through the food chain. Its acute and chronic toxicity to mammals, including humans, is well known, but the molecular mechanisms of its toxicity are not fully understood. To gain insight to these mechanisms, we examined genome-wide gene expression changes of rat liver, heart, and kidney cells induced by endosulfan exposure. We found that among the cell types examined, kidney and liver cells were the most sensitive and most resilient, respectively, to endosulfan insult. We acquired RNA sequencing information from cells exposed to endosulfan to identify differentially expressed genes, which we further examined to determine the cellular pathways that were affected. In kidney cells, exposure to endosulfan was uniquely associated with altered expression levels of genes constituting the hypoxia-inducible factor-1 (HIF-1) signaling pathway. In heart and liver cells, exposure to endosulfan altered the expression levels of genes for many members of the extracellular matrix (ECM)-receptor interaction pathway. Because both HIF-1 signaling and ECM-receptor interaction pathways directly or indirectly control cell growth, differentiation, proliferation, and apoptosis, our findings suggest that dysregulation of these pathways is responsible for endosulfan-induced cell death. Copyright © 2018 Elsevier Ltd. All rights reserved.
Inferring coarse-grain histone-DNA interaction potentials from high-resolution structures of the nucleosome

NASA Astrophysics Data System (ADS)

Meyer, Sam; Everaers, Ralf

2015-02-01

The histone-DNA interaction in the nucleosome is a fundamental mechanism of genomic compaction and regulation, which remains largely unknown despite increasing structural knowledge of the complex. In this paper, we propose a framework for the extraction of a nanoscale histone-DNA force-field from a collection of high-resolution structures, which may be adapted to a larger class of protein-DNA complexes. We applied the procedure to a large crystallographic database extended by snapshots from molecular dynamics simulations. The comparison of the structural models first shows that, at histone-DNA contact sites, the DNA base-pairs are shifted outwards locally, consistent with locally repulsive forces exerted by the histones. The second step shows that the various force profiles of the structures under analysis derive locally from a unique, sequence-independent, quadratic repulsive force-field, while the sequence preferences are entirely due to internal DNA mechanics. We have thus obtained the first knowledge-derived nanoscale interaction potential for histone-DNA in the nucleosome. The conformations obtained by relaxation of nucleosomal DNA with high-affinity sequences in this potential accurately reproduce the experimental values of binding preferences. Finally we address the more generic binding mechanisms relevant to the 80% genomic sequences incorporated in nucleosomes, by computing the conformation of nucleosomal DNA with sequence-averaged properties. This conformation differs from those found in crystals, and the analysis suggests that repulsive histone forces are related to local stretch tension in nucleosomal DNA, mostly between adjacent contact points. This tension could play a role in the stability of the complex.
Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences.

PubMed

Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

2016-07-12

Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Crossed wires: 3D genome misfolding in human disease.

PubMed

Norton, Heidi K; Phillips-Cremins, Jennifer E

2017-11-06

Mammalian genomes are folded into unique topological structures that undergo precise spatiotemporal restructuring during healthy development. Here, we highlight recent advances in our understanding of how the genome folds inside the 3D nucleus and how these folding patterns are miswired during the onset and progression of mammalian disease states. We discuss potential mechanisms underlying the link among genome misfolding, genome dysregulation, and aberrant cellular phenotypes. We also discuss cases in which the endogenous 3D genome configurations in healthy cells might be particularly susceptible to mutation or translocation. Together, these data support an emerging model in which genome folding and misfolding is critically linked to the onset and progression of a broad range of human diseases. © 2017 Norton and Phillips-Cremins.
GenomePeek—an online tool for prokaryotic genome and metagenome analysis

DOE PAGES

McNair, Katelyn; Edwards, Robert A.

2015-06-16

As prokaryotic sequencing increases, a method to quickly and accurately analyze this data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach where reads to a set of conserved genes are extracted, assembled and then aligned against the highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.
Insights from the complete chloroplast genome into the evolution of Sesamum indicum L.

PubMed

Zhang, Haiyang; Li, Chun; Miao, Hongmei; Xiong, Songjin

2013-01-01

Sesame (Sesamum indicum L.) is one of the oldest oilseed crops. In order to investigate the evolutionary characters according to the Sesame Genome Project, apart from sequencing its nuclear genome, we sequenced the complete chloroplast genome of S. indicum cv. Yuzhi 11 (white seeded) using Illumina and 454 sequencing. Comparisons of chloroplast genomes between S. indicum and the 18 other higher plants were then analyzed. The chloroplast genome of cv. Yuzhi 11 contains 153,338 bp and a total of 114 unique genes (KC569603). The number of chloroplast genes in sesame is the same as that in Nicotiana tabacum, Vitis vinifera and Platanus occidentalis. The variation in the length of the large single-copy (LSC) regions and inverted repeats (IR) in sesame compared to 18 other higher plant species was the main contributor to size variation in the cp genome in these species. The 77 functional chloroplast genes, except for ycf1 and ycf2, were highly conserved. The deletion of the cp ycf1 gene sequence in cp genomes may be due either to its transfer to the nuclear genome, as has occurred in sesame, or direct deletion, as has occurred in Panax ginseng and Cucumis sativus. The sesame ycf2 gene is only 5,721 bp in length and has lost about 1,179 bp. Nucleotides 1-585 of ycf2 when queried in BLAST had hits in the sesame draft genome. Five repeats (R10, R12, R13, R14 and R17) were unique to the sesame chloroplast genome. We also found that IR contraction/expansion in the cp genome alters its rate of evolution. Chloroplast genes and repeats display the signature of convergent evolution in sesame and other species. These findings provide a foundation for further investigation of cp genome evolution in Sesamum and other higher plants.

A novel mycovirus from Aspergillus fumigatus contains four unique dsRNAs as its genome and is infectious as dsRNA

PubMed Central

Kanhayuwa, Lakkhana; Kotta-Loizou, Ioly; Özkan, Selin; Gunning, A. Patrick; Coutts, Robert H. A.

2015-01-01

We report the discovery and characterization of a double-stranded RNA (dsRNA) mycovirus isolated from the human pathogenic fungus Aspergillus fumigatus, Aspergillus fumigatus tetramycovirus-1 (AfuTmV-1), which reveals several unique features not found previously in positive-strand RNA viruses, including the fact that it represents the first dsRNA (to our knowledge) that is not only infectious as a purified entity but also as a naked dsRNA. The AfuTmV-1 genome consists of four capped dsRNAs, the largest of which encodes an RNA-dependent RNA polymerase (RdRP) containing a unique GDNQ motif normally characteristic of negative-strand RNA viruses. The third largest dsRNA encodes an S-adenosyl methionine–dependent methyltransferase capping enzyme and the smallest dsRNA a P-A-S–rich protein that apparently coats but does not encapsidate the viral genome as visualized by atomic force microscopy. A combination of a capping enzyme with a picorna-like RdRP in the AfuTmV-1 genome is a striking case of chimerism and the first example (to our knowledge) of such a phenomenon. AfuTmV-1 appears to be intermediate between dsRNA and positive-strand ssRNA viruses, as well as between encapsidated and capsidless RNA viruses. PMID:26139522
The Complete Plastome Sequence of an Antarctic Bryophyte Sanionia uncinata (Hedw.) Loeske

PubMed Central

Park, Mira; Park, Hyun; Lee, Hyoungseok; Lee, Byeong-ha

2018-01-01

Organellar genomes of bryophytes are poorly represented with chloroplast genomes of only four mosses, four liverworts and two hornworts having been sequenced and annotated. Moreover, while Antarctic vegetation is dominated by the bryophytes, there are few reports on the plastid genomes for the Antarctic bryophytes. Sanionia uncinata (Hedw.) Loeske is one of the most dominant moss species in the maritime Antarctic. It has been researched as an important marker for ecological studies and as an extremophile plant for studies on stress tolerance. Here, we report the complete plastome sequence of S. uncinata, which can be exploited in comparative studies to identify the lineage-specific divergence across different species. The complete plastome of S. uncinata is 124,374 bp in length with a typical quadripartite structure of 114 unique genes including 82 unique protein-coding genes, 37 tRNA genes and four rRNA genes. However, two genes encoding the α subunit of RNA polymerase (rpoA) and encoding the cytochrome b6/f complex subunit VIII (petN) were absent. We could identify nuclear genes homologous to those genes, which suggests that rpoA and petN might have been relocated from the chloroplast genome to the nuclear genome. PMID:29494552
Functional genomics to discover antibiotic resistance genes: The paradigm of resistance to colistin mediated by ethanolamine phosphotransferase in Shewanella algae MARS 14.

PubMed

Telke, Amar A; Rolain, Jean-Marc

2015-12-01

Shewanella algae MARS 14 is a colistin-resistant clinical isolate retrieved from bronchoalveolar lavage of a hospitalised patient. A functional genomics strategy was employed to discover the molecular support for colistin resistance in S. algae MARS 14. A pZE21 MCS-1 plasmid-based genomic expression library was constructed in Escherichia coli TOP10. The estimated library size was 1.30×10^8 bp. Functional screening of colistin-resistant clones was carried out on Luria-Bertani agar containing 8 mg/L colistin. Five colistin-resistant clones were obtained after complete screening of the genomic expression library. Analysis of DNA sequencing results found a unique gene in all selected clones. Amino acid sequence analysis of this unique gene using the Integrated Microbial Genomes (IMG) and KEGG databases revealed that this gene encodes ethanolamine phosphotransferase (EptA, or so-called PmrC). Reverse transcription PCR analysis indicated that resistance to colistin in S. algae MARS 14 was associated with overexpression of EptA (27-fold increase), which plays a crucial role in the arrangement of outer membrane lipopolysaccharide. Copyright © 2015 Elsevier B.V. and the International Society of Chemotherapy. All rights reserved.
Salmonella Strains Isolated from Galápagos Iguanas Show Spatial Structuring of Serovar and Genomic Diversity

PubMed Central

Lankau, Emily W.; Cruz Bedon, Lenin; Mackie, Roderick I.

2012-01-01

It is thought that dispersal limitation primarily structures host-associated bacterial populations because host distributions inherently limit transmission opportunities. However, enteric bacteria may disperse great distances during food-borne outbreaks. It is unclear if such rapid long-distance dispersal events happen regularly in natural systems or if these events represent an anthropogenic exception. We characterized Salmonella enterica isolates from the feces of free-living Galápagos land and marine iguanas from five sites on four islands using serotyping and genomic fingerprinting. Each site hosted unique and nearly exclusive serovar assemblages. Genomic fingerprint analysis offered a more complex model of S. enterica biogeography, with evidence of both unique strain pools and of spatial population structuring along a geographic gradient. These findings suggest that even relatively generalist enteric bacteria may be strongly dispersal limited in a natural system with strong barriers, such as oceanic divides. Yet the differing results from the two typing methods also suggest that genomic variation is less dispersal limited, allowing for different ecological processes to shape biogeographical patterns of the core and flexible portions of this bacterial species' genome. PMID:22615968
Whole genome amplification of DNA extracted from FFPE tissues.

PubMed

Bosso, Mira; Al-Mulla, Fahd

2011-01-01

Whole genome amplification systems were developed to meet the increasing research demands on DNA resources and to avoid DNA shortage. The technology enables amplification of nanogram amounts of DNA into microgram quantities and is increasingly used in the amplification of DNA from multiple origins such as blood, fresh frozen tissue, formalin-fixed paraffin-embedded tissues, saliva, buccal swabs, bacteria, and plant and animal sources. This chapter focuses on the use of the GenomePlex® Tissue Whole Genome Amplification Kit to amplify DNA directly from archived tissue. In addition, this chapter documents our unique experience with the utilization of GenomePlex® amplified DNA using several molecular techniques including metaphase Comparative Genomic Hybridization, array Comparative Genomic Hybridization, and real-time quantitative polymerase chain reaction assays. GenomePlex® is a registered trademark of Rubicon Genomics Incorporation.
Complete Genome Sequence and Comparative Genomics of a Novel Myxobacterium Myxococcus hansupus

PubMed Central

Sharma, Gaurav; Narwani, Tarun; Subramanian, Srikrishna

2016-01-01

Myxobacteria, a group of Gram-negative aerobes, belong to the class δ-proteobacteria and order Myxococcales. Unlike anaerobic δ-proteobacteria, they exhibit several unusual physiogenomic properties like gliding motility, desiccation-resistant myxospores and large genomes with high coding density. Here we report a 9.5 Mbp complete genome of Myxococcus hansupus that encodes 7,753 proteins. Phylogenomic and genome-genome distance based analysis suggest that Myxococcus hansupus is a novel member of the genus Myxococcus. Comparative genome analysis with other members of the genus Myxococcus was performed to explore their genome diversity. The variation in number of unique proteins observed across different species is suggestive of diversity at the genus level while the overrepresentation of several Pfam families indicates the extent and mode of genome expansion as compared to non-Myxococcales δ-proteobacteria. PMID:26900859
Survey of protein–DNA interactions in Aspergillus oryzae on a genomic scale

PubMed Central

Wang, Chao; Lv, Yangyong; Wang, Bin; Yin, Chao; Lin, Ying; Pan, Li

2015-01-01

The genome-scale delineation of in vivo protein–DNA interactions is key to understanding genome function. Only ∼5% of transcription factors (TFs) in the Aspergillus genus have been identified using traditional methods. Although the Aspergillus oryzae genome contains >600 TFs, knowledge of the in vivo genome-wide TF-binding sites (TFBSs) in aspergilli remains limited because of the lack of high-quality antibodies. We investigated the landscape of in vivo protein–DNA interactions across the A. oryzae genome through coupling the DNase I digestion of intact nuclei with massively parallel sequencing and the analysis of cleavage patterns in protein–DNA interactions at single-nucleotide resolution. The resulting map identified overrepresented de novo TF-binding motifs from genomic footprints, and provided the detailed chromatin remodeling patterns and the distribution of digital footprints near transcription start sites. The TFBSs of 19 known Aspergillus TFs were also identified based on DNase I digestion data surrounding potential binding sites in conjunction with TF binding specificity information. We observed that the cleavage patterns of TFBSs were dependent on the orientation of TF motifs and independent of strand orientation, consistent with the DNA shape features of binding motifs with flanking sequences. PMID:25883143
Population-based 3D genome structure analysis reveals driving forces in spatial genome organization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tjong, Harianto; Li, Wenyuan; Kalhor, Reza

Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization.
Population-based 3D genome structure analysis reveals driving forces in spatial genome organization

DOE PAGES

Tjong, Harianto; Li, Wenyuan; Kalhor, Reza; ...

2016-03-07

Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization.
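The idea of reading an ensemble-averaged contact map as a population of distinct structures can be illustrated with a toy sketch. This is not the authors' probabilistic method, which optimizes full 3D structures; it only shows, under simplifying assumptions, how a contact frequency p can be realized by placing that contact in roughly p × M of M model structures. The function names, the seed, and the example matrix are hypothetical.

```python
import random

def assign_contacts(contact_prob, n_structures, seed=0):
    """Distribute pairwise contacts over a population of model structures.

    contact_prob is a symmetric matrix of ensemble contact frequencies.
    Each pair (i, j) is placed in round(p * n_structures) randomly chosen
    structures. Toy assumption: contacts are assigned independently and
    no 3D coordinates are modeled.
    """
    rng = random.Random(seed)
    population = [set() for _ in range(n_structures)]
    n = len(contact_prob)
    for i in range(n):
        for j in range(i + 1, n):
            k = round(contact_prob[i][j] * n_structures)
            for s in rng.sample(range(n_structures), k):
                population[s].add((i, j))
    return population

def ensemble_frequency(population, pair):
    """Fraction of structures in the population that contain the contact."""
    return sum(pair in contacts for contacts in population) / len(population)

# Hypothetical 3-region contact frequency matrix.
probs = [
    [0.0, 0.5, 0.1],
    [0.5, 0.0, 0.0],
    [0.1, 0.0, 0.0],
]
population = assign_contacts(probs, n_structures=10)
print(ensemble_frequency(population, (0, 1)))
```

Averaging over the population recovers the input frequencies, while any single model structure carries only a subset of the contacts, which is the property that lets co-occurring interactions be examined per structure.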
Statistical Significance of Optical Map Alignments

PubMed Central

Sarkar, Deepayan; Goldstein, Steve; Schwartz, David C.

2012-01-01

The Optical Mapping System constructs ordered restriction maps spanning entire genomes through the assembly and analysis of large datasets comprising individually analyzed genomic DNA molecules. Such restriction maps uniquely reveal mammalian genome structure and variation, but also raise computational and statistical questions beyond those that have been solved in the analysis of smaller, microbial genomes. We address the problem of how to filter maps that align poorly to a reference genome. We obtain map-specific thresholds that control errors and improve iterative assembly. We also show how an optimal self-alignment score provides an accurate approximation to the probability of alignment, which is useful in applications seeking to identify structural genomic abnormalities. PMID:22506568
Transcriptional reprogramming underpins enhanced plant growth promotion by the biocontrol fungus Trichoderma hamatum GD12 during antagonistic interactions with Sclerotinia sclerotiorum in soil.

PubMed

Shaw, Sophie; Le Cocq, Kate; Paszkiewicz, Konrad; Moore, Karen; Winsbury, Rebecca; de Torres Zabala, Marta; Studholme, David J; Salmon, Deborah; Thornton, Christopher R; Grant, Murray R

2016-12-01

The free-living soil fungus Trichoderma hamatum strain GD12 is notable amongst Trichoderma strains in both controlling plant diseases and stimulating plant growth, a property enhanced during its antagonistic interactions with pathogens in soil. These attributes, alongside its markedly expanded genome and proteome compared with other biocontrol and plant growth-promoting Trichoderma strains, imply a rich potential for sustainable alternatives to synthetic pesticides and fertilizers for the control of plant disease and for increasing yields. The purpose of this study was to investigate the transcriptional responses of GD12 underpinning its biocontrol and plant growth promotion capabilities during antagonistic interactions with the pathogen Sclerotinia sclerotiorum in soil. Using an extensive mRNA-seq study capturing different time points during the pathogen-antagonist interaction in soil, we show that dynamic and biphasic signatures in the GD12 transcriptome underpin its biocontrol and plant (lettuce) growth-promoting activities. Functional predictions of differentially expressed genes demonstrate the enrichment of transcripts encoding proteins involved in transportation and oxidation-reduction reactions during both processes and an over-representation of siderophores. We identify a biphasic response during biocontrol characterized by a significant induction of transcripts encoding small-secreted cysteine-rich proteins, secondary metabolite-producing gene clusters and genes unique to GD12. These data support the hypothesis that Sclerotinia biocontrol is mediated by the synthesis and secretion of antifungal compounds and that GD12's unique reservoir of uncharacterized genes is actively recruited during the effective biological control of a plurivorous plant pathogen. © 2016 The Authors. Molecular Plant Pathology published by British Society for Plant Pathology and John Wiley & Sons Ltd.
A unique circovirus-like genome detected in pig feces

USDA-ARS's Scientific Manuscript database

Using a metagenomic approach and molecular cloning methods, we identified, cloned, and sequenced the complete genome of a novel circular DNA virus, porcine stool-associated virus (PoSCV4), from pig feces. Phylogenetic analysis of the deduced replication initiator protein showed that PoSCV4 is most r...
Effector diversification within compartments of the Leptosphaeria maculans genome affected by repeat induced point mutations

USDA-ARS's Scientific Manuscript database

The genome sequence of the phytopathogenic fungus Leptosphaeria maculans has been determined. It has a unique bipartite structure, divided between distinct GC-equilibrated and AT-rich regions (isochores), reminiscent of some plants and animals but not previously observed in fungi. The GC-equilibrate...
The genome of Diuraphis noxia, a global pest of small grains

USDA-ARS's Scientific Manuscript database

The Russian wheat aphid (Diuraphis noxia) is the world's most destructive grain aphid, producing unique phytotoxic damage symptoms that result directly from salivary proteins injected into the host plant while feeding. We sequenced and assembled the genome of D. noxia biotype 2, the most widely des...
Genome-wide analysis of miRNAs in the ovaries of Jining Grey and Laiwu Black goats to explore the regulation of fecundity.

PubMed

Miao, Xiangyang; Luo, Qingmiao; Zhao, Huijing; Qin, Xiaoyu

2016-11-29

Goat fecundity is important for agriculture and varies depending on the genetic background of the goat. Two excellent domestic breeds in China, the Jining Grey and Laiwu Black goats, have different fecundity and prolificacies. To explore the potential miRNAs that regulate the expression of the genes involved in these prolific differences and to potentially discover new miRNAs, we performed a genome-wide analysis of the miRNAs in the ovaries from these two goats using RNA-Seq technology. Thirty miRNAs were differentially expressed between the Jining Grey and Laiwu Black goats. Gene Ontology and KEGG pathway analyses revealed that the target genes of the differentially expressed miRNAs were significantly enriched in several biological processes and pathways. A protein-protein interaction analysis indicated that the miRNAs and their target genes were related to the reproduction complex regulation network. The differential miRNA expression profiles found in the ovaries between the two distinctive breeds of goats studied here provide a unique resource for addressing fecundity differences in goats.
The ins and outs of algal metal transport

PubMed Central

Blaby-Haas, Crysten E.; Merchant, Sabeeha S.

2012-01-01

Metal transporters are a central component in the interaction of algae with their environment. They represent the first line of defense to cellular perturbations in metal concentration, and by analyzing algal metal transporter repertoires, we gain insight into a fundamental aspect of algal biology. The ability of individual algae to thrive in environments with unique geochemistry, compared to non-algal species commonly used as reference organisms for metal homeostasis, provides an opportunity to broaden our understanding of biological metal requirements, preferences and trafficking. Chlamydomonas reinhardtii is the best developed reference organism for the study of algal biology, especially with respect to metal metabolism; however, the diversity of algal niches necessitates a comparative genomic analysis of all sequenced algal genomes. A comparison between known and putative proteins in animals, plants, fungi and algae using protein similarity networks has revealed the presence of novel metal metabolism components in Chlamydomonas including new iron and copper transporters. This analysis also supports the concept that, in terms of metal metabolism, algae from similar niches are more related to one another than to algae from the same phylogenetic clade. PMID:22569643
The novel asymmetric entry intermediate of a picornavirus captured with nanodiscs

PubMed Central

Lee, Hyunwook; Shingler, Kristin L.; Organtini, Lindsey J.; Ashley, Robert E.; Makhov, Alexander M.; Conway, James F.; Hafenstein, Susan

2016-01-01

Many nonenveloped viruses engage host receptors that initiate capsid conformational changes necessary for genome release. Structural studies on the mechanisms of picornavirus entry have relied on in vitro approaches of virus incubated at high temperatures or with excess receptor molecules to trigger the entry intermediate or A-particle. We have induced the coxsackievirus B3 entry intermediate by triggering the virus with full-length receptors embedded in lipid bilayer nanodiscs. These asymmetrically formed A-particles were reconstructed using cryo-electron microscopy and a direct electron detector. These first high-resolution structures of a picornavirus entry intermediate captured at a membrane with and without imposing icosahedral symmetry (3.9 and 7.8 Å, respectively) revealed a novel A-particle that is markedly different from the classical A-particles. The asymmetric receptor binding triggers minimal global capsid expansion but marked local conformational changes at the site of receptor interaction. In addition, viral proteins extrude from the capsid only at the site of extensive protein remodeling adjacent to the nanodisc. Thus, the binding of the receptor triggers formation of a unique site in preparation for genome release. PMID:27574701
Patterns of amino acid conservation in human and animal immunodeficiency viruses.

PubMed

Voitenko, Olga S; Dhroso, Andi; Feldmann, Anna; Korkin, Dmitry; Kalinina, Olga V

2016-09-01

Due to their high genomic variability, RNA viruses and retroviruses present a unique opportunity for detailed study of molecular evolution. Lentiviruses, with HIV being a notable example, are one of the best studied viral groups: hundreds of thousands of sequences are available together with experimentally resolved three-dimensional structures for most viral proteins. In this work, we use these data to study specific patterns of evolution of the viral proteins, and their relationship to protein interactions and immunogenicity. We propose a method for identification of two types of surface residue clusters with abnormal conservation: extremely conserved and extremely variable clusters. We identify them on the surface of proteins from HIV and other animal immunodeficiency viruses. Both types of clusters are overrepresented on the interaction interfaces of viral proteins with other proteins, nucleic acids or low molecular-weight ligands, both in the viral particle and between the virus and its host. In the immunodeficiency viruses, the interaction interfaces are not more conserved than the corresponding proteins on average, and we show that extremely conserved clusters coincide with protein-protein interaction hotspots, predicted as the residues with the largest energetic contribution to the interaction. Extremely variable clusters have been identified here for the first time. In the HIV-1 envelope protein gp120, they overlap with known antigenic sites. These antigenic sites also contain many residues from extremely conserved clusters, hence representing a unique interacting interface enriched both in extremely conserved and in extremely variable clusters of residues. This observation may have important implications for antiretroviral vaccine development. Availability: A Python package is available at https://bioinf.mpi-inf.mpg.de/publications/viral-ppi-pred/. Contact: voitenko@mpi-inf.mpg.de or kalinina@mpi-inf.mpg.de. Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
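To make the notion of "extremely conserved" versus "extremely variable" positions concrete, the following minimal sketch scores each column of a toy alignment by Shannon entropy. It is a generic illustration, not the method or the Python package described in the abstract: the published approach works on clusters of surface residues in 3D structures, whereas this sketch looks only at single alignment columns. The thresholds and the toy sequences are arbitrary assumptions.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def classify_columns(alignment, low=0.1, high=1.5):
    """Label each column 'conserved', 'variable', or 'intermediate'.

    low and high are illustrative entropy cutoffs, not values taken
    from the paper.
    """
    labels = []
    for column in zip(*alignment):
        h = column_entropy(column)
        if h <= low:
            labels.append("conserved")
        elif h >= high:
            labels.append("variable")
        else:
            labels.append("intermediate")
    return labels

# Hypothetical four-sequence alignment of equal-length fragments.
toy_alignment = ["MKVA", "MRVC", "MKVD", "MKVE"]
print(classify_columns(toy_alignment))
```

In a structure-aware analysis, per-position scores like these would then be grouped by spatial proximity on the protein surface, a step this sketch leaves out.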
Extremotolerant tardigrade genome and improved radiotolerance of human cultured cells by tardigrade-unique protein

PubMed Central

Hashimoto, Takuma; Horikawa, Daiki D.; Saito, Yuki; Kuwahara, Hirokazu; Kozuka-Hata, Hiroko; Shin-I, Tadasu; Minakuchi, Yohei; Ohishi, Kazuko; Motoyama, Ayuko; Aizu, Tomoyuki; Enomoto, Atsushi; Kondo, Koyuki; Tanaka, Sae; Hara, Yuichiro; Koshikawa, Shigeyuki; Sagara, Hiroshi; Miura, Toru; Yokobori, Shin-ichi; Miyagawa, Kiyoshi; Suzuki, Yutaka; Kubo, Takeo; Oyama, Masaaki; Kohara, Yuji; Fujiyama, Asao; Arakawa, Kazuharu; Katayama, Toshiaki; Toyoda, Atsushi; Kunieda, Takekazu

2016-01-01

Tardigrades, also known as water bears, are small aquatic animals. Some tardigrade species tolerate almost complete dehydration and exhibit extraordinary tolerance to various physical extremes in the dehydrated state. Here we determine a high-quality genome sequence of Ramazzottius varieornatus, one of the most stress-tolerant tardigrade species. Precise gene repertoire analyses reveal the presence of a small proportion (1.2% or less) of putative foreign genes, loss of gene pathways that promote stress damage, expansion of gene families related to ameliorating damage, and evolution and high expression of novel tardigrade-unique proteins. Minor changes in the gene expression profiles during dehydration and rehydration suggest constitutive expression of tolerance-related genes. Using human cultured cells, we demonstrate that a tardigrade-unique DNA-associating protein suppresses X-ray-induced DNA damage by ∼40% and improves radiotolerance. These findings indicate the relevance of tardigrade-unique proteins to tolerability and tardigrades could be a bountiful source of new protection genes and mechanisms. PMID:27649274
Extremotolerant tardigrade genome and improved radiotolerance of human cultured cells by tardigrade-unique protein.

PubMed

Hashimoto, Takuma; Horikawa, Daiki D; Saito, Yuki; Kuwahara, Hirokazu; Kozuka-Hata, Hiroko; Shin-I, Tadasu; Minakuchi, Yohei; Ohishi, Kazuko; Motoyama, Ayuko; Aizu, Tomoyuki; Enomoto, Atsushi; Kondo, Koyuki; Tanaka, Sae; Hara, Yuichiro; Koshikawa, Shigeyuki; Sagara, Hiroshi; Miura, Toru; Yokobori, Shin-Ichi; Miyagawa, Kiyoshi; Suzuki, Yutaka; Kubo, Takeo; Oyama, Masaaki; Kohara, Yuji; Fujiyama, Asao; Arakawa, Kazuharu; Katayama, Toshiaki; Toyoda, Atsushi; Kunieda, Takekazu

2016-09-20

Tardigrades, also known as water bears, are small aquatic animals. Some tardigrade species tolerate almost complete dehydration and exhibit extraordinary tolerance to various physical extremes in the dehydrated state. Here we determine a high-quality genome sequence of Ramazzottius varieornatus, one of the most stress-tolerant tardigrade species. Precise gene repertoire analyses reveal the presence of a small proportion (1.2% or less) of putative foreign genes, loss of gene pathways that promote stress damage, expansion of gene families related to ameliorating damage, and evolution and high expression of novel tardigrade-unique proteins. Minor changes in the gene expression profiles during dehydration and rehydration suggest constitutive expression of tolerance-related genes. Using human cultured cells, we demonstrate that a tardigrade-unique DNA-associating protein suppresses X-ray-induced DNA damage by ∼40% and improves radiotolerance. These findings indicate the relevance of tardigrade-unique proteins to tolerability and tardigrades could be a bountiful source of new protection genes and mechanisms.

Introduction

PubMed Central

Taussig, Karen-Sue; Gibbon, Sahra Elizabeth

2013-01-01

We introduce this special issue of Medical Anthropology Quarterly on public health genomics by exploring both the unique contribution of ethnographic sensibility that medical anthropologists bring to the study of genomics and some of the key insights offered by the essays in this collection. As anthropologists, we are concerned with the power dynamics and larger cultural commitments embedded in practices associated with public health. We seek to understand, first, the broad significance of genomics as a cultural object and, second, the social action set into motion as researchers seek to translate genomic knowledge and technology into public health benefits. PMID:24214906
Accelerated Gene Evolution and Subfunctionalization in the Pseudotetraploid Frog Xenopus Laevis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hellsten, Uffe; Khokha, Mustafa K.; Grammar, Timothy C.

2007-03-01

Ancient whole genome duplications have been implicated in the vertebrate and teleost radiations, and in the emergence of diverse angiosperm lineages, but the evolutionary response to such a perturbation is still poorly understood. The African clawed frog Xenopus laevis experienced a relatively recent tetraploidization approximately 40 million years ago. Analysis of the considerable amount of EST sequence available for this species together with the genome sequence of the related diploid Xenopus tropicalis provides a unique opportunity to study the genomic response to whole genome duplication.
Complete Genome Sequence of Yersinia pestis Strains Antiqua and Nepal516: Evidence of Gene Reduction in an Emerging Pathogen

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chain, Patrick S; Hu, Ping; Malfatti, Stephanie

2006-01-01

Yersinia pestis, the causative agent of bubonic and pneumonic plagues, has undergone detailed study at the molecular level. To further investigate the genomic diversity among this group and to help characterize lineages of the plague organism that have no sequenced members, we present here the genomes of two isolates of the "classical" antiqua biovar, strains Antiqua and Nepal516. The genomes of Antiqua and Nepal516 are 4.7 Mb and 4.5 Mb and encode 4,138 and 3,956 open reading frames, respectively. Though both strains belong to one of the three classical biovars, they represent separate lineages defined by recent phylogenetic studies. We compare all five currently sequenced Y. pestis genomes and the corresponding features in Yersinia pseudotuberculosis. There are strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms, and a unique distribution of insertion sequences. We found 453 single nucleotide polymorphisms in protein-coding regions, which were used to assess the evolutionary relationships of these Y. pestis strains. Gene reduction analysis revealed that the gene deletion processes are under selective pressure, and many of the inactivations are probably related to the organism's interaction with its host environment. The results presented here clearly demonstrate the differences between the two biovar antiqua lineages and support the notion that grouping Y. pestis strains based strictly on the classical definition of biovars (predicated upon two biochemical assays) does not accurately reflect the phylogenetic relationships within this species. A comparison of four virulent Y. pestis strains with the human-avirulent strain 91001 provides further insight into the genetic basis of virulence to humans.
Chompy: an infestation of MITE-like repetitive elements in the crocodilian genome.

PubMed

Ray, David A; Hedges, Dale J; Herke, Scott W; Fowlkes, Justin D; Barnes, Erin W; LaVie, Daniel K; Goodwin, Lindsey M; Densmore, Llewellyn D; Batzer, Mark A

2005-12-05

Interspersed repeats are a major component of most eukaryotic genomes and have an impact on genome size and stability, but the repetitive element landscape of crocodilian genomes has not yet been fully investigated. In this report, we provide the first detailed characterization of an interspersed repeat element in any crocodilian genome. Chompy is a putative miniature inverted-repeat transposable element (MITE) family initially recovered from the genome of Alligator mississippiensis (American alligator) but also present in the genomes of Crocodylus moreletii (Morelet's crocodile) and Gavialis gangeticus (Indian gharial). The element has all of the hallmarks of MITEs including terminal inverted repeats, possible target site duplications, and a tendency to form secondary structures. We estimate the copy number in the alligator genome to be approximately 46,000 copies. As a result of their size and unique properties, Chompy elements may provide a useful source of genomic variation for crocodilian comparative genomics.
RUCS: rapid identification of PCR primers for unique core sequences.

PubMed

Thomsen, Martin Christen Frølund; Hasman, Henrik; Westh, Henrik; Kaya, Hülya; Lund, Ole

2017-12-15

Designing PCR primers to target a specific selection of whole genome sequenced strains can be a long, arduous and sometimes impractical task. Such tasks would benefit greatly from an automated tool to both identify unique targets, and to validate the vast number of potential primer pairs for the targets in silico. Here we present RUCS, a program that will find PCR primer pairs and probes for the unique core sequences of a positive genome dataset complement to a negative genome dataset. The resulting primer pairs and probes are in addition to simple selection also validated through a complex in silico PCR simulation. We compared our method, which identifies the unique core sequences, against an existing tool called ssGeneFinder, and found that our method was 6.5-20 times more sensitive. We used RUCS to design primer pairs that would target a set of genomes known to contain the mcr-1 colistin resistance gene. Three of the predicted pairs were chosen for experimental validation using PCR and gel electrophoresis. All three pairs successfully produced an amplicon with the target length for the samples containing mcr-1 and no amplification products were produced for the negative samples. The novel methods presented in this manuscript can reduce the time needed to identify target sequences, and provide a quick virtual PCR validation to eliminate time wasted on ambiguously binding primers. Source code is freely available on https://bitbucket.org/genomicepidemiology/rucs. Web service is freely available on https://cge.cbs.dtu.dk/services/RUCS. Contact: mcft@cbs.dtu.dk. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
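The "unique core sequence" idea in the RUCS record above can be pictured as a set operation on k-mers: keep what every positive genome shares, then discard anything seen in a negative genome. The sketch below is only an illustration of that idea and is not RUCS's actual implementation; the function names, the fixed k-mer length and the toy sequences are assumptions made here, and reverse complements and primer validation are ignored.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_core_kmers(positives, negatives, k):
    """k-mers found in every positive genome and in no negative genome.

    Hypothetical helper for illustration; expects at least one positive genome.
    """
    core = set.intersection(*(kmers(g, k) for g in positives))
    background = set().union(*(kmers(g, k) for g in negatives))
    return core - background


# Toy sequences, invented for this example.
positives = ["ACGTACGGA", "TTACGGACC"]
negatives = ["GGTACGTT"]

result = unique_core_kmers(positives, negatives, 4)
print(sorted(result))  # ['ACGG', 'CGGA']
```

Here the 4-mer TACG is shared by both positive sequences but is dropped because it also occurs in the negative sequence, which is the filtering step that makes a core sequence "unique".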
Evolution and Diversity of the Human Hepatitis D Virus Genome

PubMed Central

Huang, Chi-Ruei; Lo, Szecheng J.

2010-01-01

Human hepatitis delta virus (HDV) is the smallest RNA virus in terms of genome size. The HDV genome is divided into a viroid-like sequence and a protein-coding sequence, which could have originated from different sources, and the HDV genome was eventually constituted through RNA recombination. The genome subsequently diversified through accumulation of mutations selected by interactions between the mutated RNA and proteins with host factors to successfully form the infectious virions. Therefore, we propose that the conservation of HDV nucleotide sequence is highly related to its functionality. Genome analysis of known HDV isolates shows that the C-terminal coding sequences of large delta antigen (LDAg) show higher diversity than other regions of the protein-coding sequences, but variants that still retain the biological functionality to interact with the heavy chain of clathrin can be selected and maintained. Since viruses interact with many host factors, including in escaping the host immune response, designing a program to predict RNA genome evolution is a great challenge. PMID:20204073
Genomic identification of potential targets unique to Candida albicans for the discovery of antifungal agents.

PubMed

Tripathi, Himanshu; Luqman, Suaib; Meena, Abha; Khan, Feroz

2014-01-01

Despite modern antifungal therapy, the mortality rates of invasive infection with the human fungal pathogen Candida albicans are up to 40%. Studies suggest that drug resistance in the three most common species of human fungal pathogens, viz. C. albicans, Aspergillus fumigatus (causing mortality rates up to 90%) and Cryptococcus neoformans (causing mortality rates up to 70%), is due to mutations in the target enzymes or high expression of drug transporter genes. Drug resistance in human fungal pathogens has led to an imperative need for the identification of new targets unique to fungal pathogens. In the present study, we have used a comparative genomics approach to find potential target proteins unique to C. albicans, an opportunistic fungus responsible for severe infection in immune-compromised humans. Interestingly, many target proteins of existing antifungal agents showed orthologs in human cells. To identify unique proteins, we compared the proteome of C. albicans [SC5314], i.e., 14,633 total proteins retrieved from the RefSeq database of NCBI, USA, with the proteomes of human and the non-pathogenic yeast Saccharomyces cerevisiae. Results showed that 4,568 proteins were unique to C. albicans as compared to those of human; when these unique proteins were then compared with the S. cerevisiae proteome, 2,161 proteins were identified as unique, and after removing repeats a total of 1,618 unique proteins (42 functionally known, 1,566 hypothetical and 10 unknown) were selected as potential antifungal drug targets unique to C. albicans.
VESPA: Software to Facilitate Genomic Annotation of Prokaryotic Organisms Through Integration of Proteomic and Transcriptomic Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.

2012-04-25

Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify location of interest within the genome and evaluate potential mis-annotations.
Comparative genomics of plant-associated Pseudomonas spp.: Insights into diversity and inheritance of traits involved in multitrophic interactions

USDA-ARS's Scientific Manuscript database

We provide here a comparative genome analysis of the Pseudomonas fluorescens group, including seven new genomic sequences for plant-associated strains. These strains exhibit a diverse spectrum of traits involved in biological control and other multitrophic interactions with plants, microbes, and ins...
ShinyGPAS: interactive genomic prediction accuracy simulator based on deterministic formulas.

PubMed

Morota, Gota

2017-12-20

Deterministic formulas for the accuracy of genomic predictions highlight the relationships among prediction accuracy and potential factors influencing prediction accuracy prior to performing computationally intensive cross-validation. Visualizing such deterministic formulas in an interactive manner may lead to a better understanding of how genetic factors control prediction accuracy. The software to simulate deterministic formulas for genomic prediction accuracy was implemented in R and encapsulated as a web-based Shiny application. Shiny genomic prediction accuracy simulator (ShinyGPAS) simulates various deterministic formulas and delivers dynamic scatter plots of prediction accuracy versus genetic factors impacting prediction accuracy, while requiring only mouse navigation in a web browser. ShinyGPAS is available at: https://chikudaisei.shinyapps.io/shinygpas/ . ShinyGPAS is a shiny-based interactive genomic prediction accuracy simulator using deterministic formulas. It can be used for interactively exploring potential factors that influence prediction accuracy in genome-enabled prediction, simulating achievable prediction accuracy prior to genotyping individuals, or supporting in-class teaching. ShinyGPAS is open source software and it is hosted online as a freely available web-based resource with an intuitive graphical user interface.
GPU Accelerated Browser for Neuroimaging Genomics.

PubMed

Zigon, Bob; Li, Huang; Yao, Xiaohui; Fang, Shiaofen; Hasan, Mohammad Al; Yan, Jingwen; Moore, Jason H; Saykin, Andrew J; Shen, Li

2018-04-25

Neuroimaging genomics is an emerging field that provides exciting opportunities to understand the genetic basis of brain structure and function. The unprecedented scale and complexity of the imaging and genomics data, however, have presented critical computational bottlenecks. In this work we present our initial efforts towards building an interactive visual exploratory system for mining big data in neuroimaging genomics. A GPU accelerated browsing tool for neuroimaging genomics is created that implements the ANOVA algorithm for single nucleotide polymorphism (SNP) based analysis and the VEGAS algorithm for gene-based analysis, and executes them at interactive rates. The ANOVA algorithm is 110 times faster than the 4-core OpenMP version, while the VEGAS algorithm is 375 times faster than its 4-core OpenMP counterpart. This approach lays a solid foundation for researchers to address the challenges of mining large-scale imaging genomics datasets via interactive visual exploration.
Three invariant Hi-C interaction patterns: Applications to genome assembly.

PubMed

Oddes, Sivan; Zelig, Aviv; Kaplan, Noam

2018-06-01

Assembly of reference-quality genomes from next-generation sequencing data is a key challenge in genomics. Recently, we and others have shown that Hi-C data can be used to address several outstanding challenges in the field of genome assembly. This principle has since been developed in academia and industry, and has been used in the assembly of several major genomes. In this paper, we explore the central principles underlying Hi-C-based assembly approaches, by quantitatively defining and characterizing three invariant Hi-C interaction patterns on which these approaches can build: Intrachromosomal interaction enrichment, distance-dependent interaction decay and local interaction smoothness. Specifically, we evaluate to what degree each invariant pattern holds on a single locus level in different species, cell types and Hi-C map resolutions. We find that these patterns are generally consistent across species and cell types but are affected by sequencing depth, and that matrix balancing improves consistency of loci with all three invariant patterns. Finally, we overview current Hi-C-based assembly approaches in light of these invariant patterns and demonstrate how local interaction smoothness can be used to easily detect scaffolding errors in extremely sparse Hi-C maps. We suggest that simultaneously considering all three invariant patterns may lead to better Hi-C-based genome assembly methods. Copyright © 2018 Elsevier Inc. All rights reserved.
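One of the three patterns named in the record above, distance-dependent interaction decay, can be illustrated by averaging a contact map along its diagonals: bins that are further apart on the same chromosome should, on average, contact each other less. The sketch below is a minimal illustration under stated assumptions, not the authors' method; the function name and the 4-bin contact matrix are invented for this example.

```python
def mean_contact_by_distance(matrix):
    """Average contact count at each bin separation d (the d-th diagonal).

    `matrix` is a square, symmetric list of lists for one chromosome.
    Hypothetical helper written for illustration.
    """
    n = len(matrix)
    return [
        sum(matrix[i][i + d] for i in range(n - d)) / (n - d)
        for d in range(n)
    ]


# Toy symmetric contact map; the values are invented.
toy = [
    [10, 6, 3, 1],
    [6, 10, 5, 2],
    [3, 5, 10, 7],
    [1, 2, 7, 10],
]

profile = mean_contact_by_distance(toy)
print(profile)  # [10.0, 6.0, 2.5, 1.0]
```

In this toy map the averaged profile falls steadily as the separation grows, which is the expected shape of the decay; a real analysis would work on balanced matrices at a chosen resolution.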
Functional genomics of lactic acid bacteria: from food to health

PubMed Central

2014-01-01

Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health. PMID:25186768
Functional genomics of lactic acid bacteria: from food to health.

PubMed

Douillard, François P; de Vos, Willem M

2014-08-29

Genome analysis using next generation sequencing technologies has revolutionized the characterization of lactic acid bacteria and complete genomes of all major groups are now available. Comparative genomics has provided new insights into the natural and laboratory evolution of lactic acid bacteria and their environmental interactions. Moreover, functional genomics approaches have been used to understand the response of lactic acid bacteria to their environment. The results have been instrumental in understanding the adaptation of lactic acid bacteria in artisanal and industrial food fermentations as well as their interactions with the human host. Collectively, this has led to a detailed analysis of genes involved in colonization, persistence, interaction and signaling towards the human host and its health. Finally, massive parallel genome re-sequencing has provided new opportunities in applied genomics, specifically in the characterization of novel non-GMO strains that have potential to be used in the food industry. Here, we provide an overview of the state of the art of these functional genomics approaches and their impact in understanding, applying and designing lactic acid bacteria for food and health.
Defining Function in the Functional Medicine Model.

PubMed

Bland, Jeffrey

2017-02-01

In the functional medicine model, the word function is aligned with the evolving understanding that disease is an endpoint and function is a process. Function can move both forward and backward. The vector of change in function through time is, in part, determined by the unique interaction of an individual's genome with their environment, diet, and lifestyle. The functional medicine model for health care is concerned less with what we call the dysfunction or disease, and more about the dynamic processes that resulted in the person's dysfunction. The previous concept of functional somatic syndromes as psychosomatic in origin has now been replaced with a new concept of function that is rooted in the emerging 21st-century understanding of systems network-enabled biology.
Defining Function in the Functional Medicine Model

PubMed Central

Bland, Jeffrey

2017-01-01

In the functional medicine model, the word function is aligned with the evolving understanding that disease is an endpoint and function is a process. Function can move both forward and backward. The vector of change in function through time is, in part, determined by the unique interaction of an individual’s genome with their environment, diet, and lifestyle. The functional medicine model for health care is concerned less with what we call the dysfunction or disease, and more about the dynamic processes that resulted in the person’s dysfunction. The previous concept of functional somatic syndromes as psychosomatic in origin has now been replaced with a new concept of function that is rooted in the emerging 21st-century understanding of systems network-enabled biology. PMID:28223904
Using Arabidopsis to understand centromere function: progress and prospects.

PubMed

Copenhaver, Gregory P

2003-01-01

Arabidopsis thaliana has emerged in recent years as a leading model for understanding the structure and function of higher eukaryotic centromeres. Arabidopsis centromeres, like those of virtually all higher eukaryotes, encompass large DNA domains consisting of a complex combination of unique, dispersed middle repetitive and highly repetitive DNA. For this reason, they have required creative analysis using molecular, genetic, cytological and genomic techniques. This synergy of approaches, reinforced by rapid progress in understanding how proteins interact with the centromere DNA to form a complete functional unit, has made Arabidopsis one of the best understood centromere systems. Yet major problems remain to be solved: gaining a complete structural definition of the centromere has been surprisingly difficult, and developing synthetic mini-chromosomes in plants has been even more challenging.
Genome empowerment for the Puerto Rican parrot – Amazona vittata

PubMed Central

2012-01-01

A unique community-funded project in Puerto Rico has launched whole-genome sequencing of the critically endangered Puerto Rican Parrot (Amazona vittata), with interpretation by genome bioinformaticians and students, and deposition into public online databases. This is the first article that focuses on the whole genome of a parrot species, one endemic to the USA and recently threatened with extinction. It provides invaluable conservation tools and a vivid example of hopeful prospects for future genome assessment of so many new species. It also demonstrates inventive ways for smaller institutions to contribute to a field largely considered the domain of large sequencing centers. PMID:23587407
Identification of a precursor genomic segment that provided a sequence unique to glycophorin B and E genes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Onda, M.; Kudo, S.; Fukuda, M.

Human glycophorin A, B, and E (GPA, GPB, and GPE) genes belong to a gene family located at the long arm of chromosome 4. These three genes are homologous from the 5'-flanking sequence to the Alu sequence, which is 1 kb downstream from the exon encoding the transmembrane domain. Analysis of the Alu sequence and flanking direct repeat sequences suggested that the GPA gene most closely resembles the ancestral gene, whereas the GPB and GPE gene arose by homologous recombination within the Alu sequence, acquiring 3' sequences from an unrelated precursor genomic segment. Here the authors describe the identification of this putative precursor genomic segment. A human genomic library was screened by using the sequence of the 3' region of the GPB gene as a probe. The genomic clones isolated were found to contain an Alu sequence that appeared to be involved in the recombination. Downstream from the Alu sequence, the nucleotide sequence of the precursor genomic segment is almost identical to that of the GPB or GPE gene. In contrast, the upstream sequence of the genomic segment differs entirely from that of the GPA, GPB, and GPE genes. Conservation of the direct repeats flanking the Alu sequence of the genomic segment strongly suggests that the sequence of this genomic segment has been maintained during evolution. This identified genomic segment was found to reside downstream from the GPA gene by both gene mapping and in situ chromosomal localization. The precursor genomic segment was also identified in the orangutan genome, which is known to lack GPB and GPE genes. These results indicate that one of the duplicated ancestral glycophorin genes acquired a unique 3' sequence by unequal crossing-over through its Alu sequence and the further downstream Alu sequence present in the duplicated gene. Further duplication and divergence of this gene yielded the GPB and GPE genes. 37 refs., 5 figs.
The Legionella pneumophila genome evolved to accommodate multiple regulatory mechanisms controlled by the CsrA-system

PubMed Central

Sahr, Tobias; Rusniok, Christophe; Impens, Francis; Oliva, Giulia; Sismeiro, Odile; Coppée, Jean-Yves

2017-01-01

The carbon storage regulator protein CsrA regulates cellular processes post-transcriptionally by binding to target-RNAs altering translation efficiency and/or their stability. Here we identified and analyzed the direct targets of CsrA in the human pathogen Legionella pneumophila. Genome wide transcriptome, proteome and RNA co-immunoprecipitation followed by deep sequencing of a wild type and a csrA mutant strain identified 479 RNAs with potential CsrA interaction sites located in the untranslated and/or coding regions of mRNAs or of known non-coding sRNAs. Further analyses revealed that CsrA exhibits a dual regulatory role in virulence as it affects the expression of the regulators FleQ, LqsR, LetE and RpoS but it also directly regulates the timely expression of over 40 Dot/Icm substrates. CsrA controls its own expression and the stringent response through a regulatory feedback loop as evidenced by its binding to RelA-mRNA and links it to quorum sensing and motility. CsrA is a central player in the carbon, amino acid, fatty acid metabolism and energy transfer and directly affects the biosynthesis of cofactors, vitamins and secondary metabolites. We describe the first L. pneumophila riboswitch, a thiamine pyrophosphate riboswitch whose regulatory impact is fine-tuned by CsrA, and identified a unique regulatory mode of CsrA, the active stabilization of RNA anti-terminator conformations inside a coding sequence preventing Rho-dependent termination of the gap operon through transcriptional polarity effects. This allows L. pneumophila to regulate the pentose phosphate pathway and the glycolysis combined or individually although they share genes in a single operon. Thus the L. pneumophila genome has evolved to acclimate at least five different modes of regulation by CsrA giving it a truly unique position in its life cycle. PMID:28212376

From the Cover: Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features

NASA Astrophysics Data System (ADS)

Derelle, Evelyne; Ferraz, Conchita; Rombauts, Stephane; Rouzé, Pierre; Worden, Alexandra Z.; Robbens, Steven; Partensky, Frédéric; Degroeve, Sven; Echeynié, Sophie; Cooke, Richard; Saeys, Yvan; Wuyts, Jan; Jabbari, Kamel; Bowler, Chris; Panaud, Olivier; Piégu, Benoît; Ball, Steven G.; Ral, Jean-Philippe; Bouget, François-Yves; Piganeau, Gwenael; de Baets, Bernard; Picard, André; Delseny, Michel; Demaille, Jacques; van de Peer, Yves; Moreau, Hervé

2006-08-01

The green lineage is reportedly 1,500 million years old, evolving shortly after the endosymbiosis event that gave rise to early photosynthetic eukaryotes. In this study, we unveil the complete genome sequence of an ancient member of this lineage, the unicellular green alga Ostreococcus tauri (Prasinophyceae). This cosmopolitan marine primary producer is the world's smallest free-living eukaryote known to date. Features likely reflecting optimization of environmentally relevant pathways, including resource acquisition, unusual photosynthesis apparatus, and genes potentially involved in C4 photosynthesis, were observed, as was downsizing of many gene families. Overall, the 12.56-Mb nuclear genome has an extremely high gene density, in part because of extensive reduction of intergenic regions and other forms of compaction such as gene fusion. However, the genome is structurally complex. It exhibits previously unobserved levels of heterogeneity for a eukaryote. Two chromosomes differ structurally from the other eighteen. Both have a significantly biased G+C content, and, remarkably, they contain the majority of transposable elements. Many chromosome 2 genes also have unique codon usage and splicing, but phylogenetic analysis and composition do not support alien gene origin. In contrast, most chromosome 19 genes show no similarity to green lineage genes and a large number of them are specialized in cell surface processes. Taken together, the complete genome sequence, unusual features, and downsized gene families, make O. tauri an ideal model system for research on eukaryotic genome evolution, including chromosome specialization and green lineage ancestry. Keywords: genome heterogeneity | genome sequence | green alga | Prasinophyceae | gene prediction
Chætognath transcriptome reveals ancestral and unique features among bilaterians

PubMed Central

Marlétaz, Ferdinand; Gilles, André; Caubit, Xavier; Perez, Yvan; Dossat, Carole; Samain, Sylvie; Gyapay, Gabor; Wincker, Patrick; Le Parco, Yannick

2008-01-01

Background: The chætognaths (arrow worms) have puzzled zoologists for years because of their astonishing morphological and developmental characteristics. Despite their deuterostome-like development, phylogenomic studies recently positioned the chætognath phylum in protostomes, most likely in an early branching. This key phylogenetic position and the peculiar characteristics of chætognaths prompted further investigation of their genomic features. Results: Transcriptomic and genomic data were collected from the chætognath Spadella cephaloptera through the sequencing of expressed sequence tags and genomic bacterial artificial chromosome clones. Transcript comparisons at various taxonomic scales emphasized the conservation of a core gene set and phylogenomic analysis confirmed the basal position of chætognaths among protostomes. A detailed survey of transcript diversity and individual genotyping revealed a past genome duplication event in the chætognath lineage, which was, surprisingly, followed by a high retention rate of duplicated genes. Moreover, striking genetic heterogeneity was detected within the sampled population at the nuclear and mitochondrial levels but cannot be explained by cryptic speciation. Finally, we found evidence for trans-splicing maturation of transcripts through splice-leader addition in the chætognath phylum and we further report that this processing is associated with operonic transcription. Conclusion: These findings reveal both shared ancestral and unique derived characteristics of the chætognath genome, which suggests that this genome is likely the product of a very original evolutionary history. These features promote chætognaths as a pivotal model for comparative genomics, which could provide new clues for the investigation of the evolution of animal genomes. PMID:18533022
Investigating the Relatedness of Enteroinvasive Escherichia coli to Other E. coli and Shigella Isolates by Using Comparative Genomics

PubMed Central

Hazen, Tracy H.; Leonard, Susan R.; Lampel, Keith A.; Lacher, David W.

2016-01-01

Enteroinvasive Escherichia coli (EIEC) is a unique pathovar that has a pathogenic mechanism nearly indistinguishable from that of Shigella species. In contrast to isolates of the four Shigella species, which are widespread and can be frequent causes of human illness, EIEC causes far fewer reported illnesses each year. In this study, we analyzed the genome sequences of 20 EIEC isolates, including 14 first described in this study. Phylogenomic analysis of the EIEC genomes demonstrated that 17 of the isolates are present in three distinct lineages that contained only EIEC genomes, compared to reference genomes from each of the E. coli pathovars and Shigella species. Comparative genomic analysis identified genes that were unique to each of the three identified EIEC lineages. While many of the EIEC lineage-specific genes have unknown functions, those with predicted functions included a colicin and putative proteins involved in transcriptional regulation or carbohydrate metabolism. In silico detection of the Shigella virulence plasmid (pINV), which is essential for the invasion of host cells, demonstrated that a form of pINV was present in nearly all EIEC genomes, but the Mxi-Spa-Ipa region of the plasmid that encodes the invasion-associated proteins was absent from several of the EIEC isolates. The comparative genomic findings in this study support the hypothesis that multiple EIEC lineages have evolved independently from multiple distinct lineages of E. coli via the acquisition of the Shigella virulence plasmid and, in some cases, the Shigella pathogenicity islands. PMID:27271741
Investigating the Relatedness of Enteroinvasive Escherichia coli to Other E. coli and Shigella Isolates by Using Comparative Genomics.

PubMed

Hazen, Tracy H; Leonard, Susan R; Lampel, Keith A; Lacher, David W; Maurelli, Anthony T; Rasko, David A

2016-08-01

Enteroinvasive Escherichia coli (EIEC) is a unique pathovar that has a pathogenic mechanism nearly indistinguishable from that of Shigella species. In contrast to isolates of the four Shigella species, which are widespread and can be frequent causes of human illness, EIEC causes far fewer reported illnesses each year. In this study, we analyzed the genome sequences of 20 EIEC isolates, including 14 first described in this study. Phylogenomic analysis of the EIEC genomes demonstrated that 17 of the isolates are present in three distinct lineages that contained only EIEC genomes, compared to reference genomes from each of the E. coli pathovars and Shigella species. Comparative genomic analysis identified genes that were unique to each of the three identified EIEC lineages. While many of the EIEC lineage-specific genes have unknown functions, those with predicted functions included a colicin and putative proteins involved in transcriptional regulation or carbohydrate metabolism. In silico detection of the Shigella virulence plasmid (pINV), which is essential for the invasion of host cells, demonstrated that a form of pINV was present in nearly all EIEC genomes, but the Mxi-Spa-Ipa region of the plasmid that encodes the invasion-associated proteins was absent from several of the EIEC isolates. The comparative genomic findings in this study support the hypothesis that multiple EIEC lineages have evolved independently from multiple distinct lineages of E. coli via the acquisition of the Shigella virulence plasmid and, in some cases, the Shigella pathogenicity islands. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project.

PubMed

Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A

2015-08-29

The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Threshold models for genome-enabled prediction of ordinal categorical traits in plant breeding.

PubMed

Montesinos-López, Osval A; Montesinos-López, Abelardo; Pérez-Rodríguez, Paulino; de Los Campos, Gustavo; Eskridge, Kent; Crossa, José

2014-12-23

Categorical scores for disease susceptibility or resistance often are recorded in plant breeding. The aim of this study was to introduce genomic models for analyzing ordinal characters and to assess the predictive ability of genomic predictions for ordered categorical phenotypes using a threshold model counterpart of the Genomic Best Linear Unbiased Predictor (i.e., TGBLUP). The threshold model was used to relate a hypothetical underlying scale to the outward categorical response. We present an empirical application where a total of nine models, five without interaction and four with genomic × environment interaction (G×E) and genomic additive × additive × environment interaction (G×G×E), were used. We assessed the proposed models using data consisting of 278 maize lines genotyped with 46,347 single-nucleotide polymorphisms and evaluated for disease resistance [with ordinal scores from 1 (no disease) to 5 (complete infection)] in three environments (Colombia, Zimbabwe, and Mexico). Models with G×E captured a sizeable proportion of the total variability, which indicates the importance of introducing interaction to improve prediction accuracy. Relative to models based on main effects only, the models that included G×E achieved 9-14% gains in prediction accuracy; adding additive × additive interactions did not increase prediction accuracy consistently across locations. Copyright © 2015 Montesinos-López et al.
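The threshold idea summarized in this abstract can be written out as a short worked formula. This is a generic ordinal probit sketch, not the authors' exact specification: the symbols (latent liability l, linear predictor eta, thresholds gamma, relationship matrix G) are assumed notation, and sign conventions differ between texts.

```latex
% Assumed notation: line i, environment j, C = 5 ordered categories.
% Latent (unobserved) liability with a standard normal residual:
\[
  l_{ij} = \eta_{ij} + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, 1)
\]
% The observed score is the interval into which the liability falls:
\[
  y_{ij} = c \iff \gamma_{c-1} < l_{ij} \le \gamma_{c}, \qquad
  -\infty = \gamma_{0} < \gamma_{1} < \dots < \gamma_{C} = \infty
\]
% Category probabilities follow from the normal CDF \Phi:
\[
  P(y_{ij} = c) = \Phi(\gamma_{c} - \eta_{ij}) - \Phi(\gamma_{c-1} - \eta_{ij})
\]
% One possible predictor with a G x E term (an illustrative choice):
\[
  \eta_{ij} = E_{j} + g_{i} + (gE)_{ij}, \qquad
  \mathbf{g} \sim N(\mathbf{0}, \sigma^{2}_{g}\,\mathbf{G})
\]
```

Dropping the (gE) term gives a main-effects-only model of the kind the abstract uses as its baseline.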
BAUM: improving genome assembly by adaptive unique mapping and local overlap-layout-consensus approach.

PubMed

Wang, Anqi; Wang, Zhanyu; Li, Zheng; Li, Lei M

2018-06-15

It is highly desirable to assemble genomes of high continuity and consistency at low cost. The current bottleneck of draft genome continuity using second-generation sequencing (SGS) reads is primarily caused by uncertainty among repetitive sequences. Even though single-molecule real-time sequencing technology is very promising for overcoming the uncertainty issue, its relatively high cost and error rate add to the budget or computational burden. Many long-read assemblers adopt the overlap-layout-consensus (OLC) paradigm, which is less sensitive to sequencing errors, heterozygosity and variability of coverage. However, current assemblers of SGS data do not sufficiently take advantage of the OLC approach. Aiming at minimizing uncertainty, the proposed method, BAUM, breaks the whole genome into regions by adaptive unique mapping; then the local OLC is used to assemble each region in parallel. BAUM can (i) perform reference-assisted assembly based on the genome of a closely related species or (ii) improve the results of existing assemblies that are obtained based on short or long sequencing reads. The tests on two eukaryote genomes, a wild rice Oryza longistaminata and a parrot Melopsittacus undulatus, show that BAUM achieved substantial improvement on genome size and continuity. In addition, BAUM reconstructed a considerable amount of repetitive regions that failed to be assembled by existing short-read assemblers. We also propose statistical approaches to control the uncertainty in different steps of BAUM. http://www.zhanyuwang.xin/wordpress/index.php/2017/07/21/baum. Supplementary data are available at Bioinformatics online.
WGSSAT: A High-Throughput Computational Pipeline for Mining and Annotation of SSR Markers From Whole Genomes.

PubMed

Pandey, Manmohan; Kumar, Ravindra; Srivastava, Prachi; Agarwal, Suyash; Srivastava, Shreya; Nagpure, Naresh S; Jena, Joy K; Kushwaha, Basdeo

2018-03-16

Mining and characterization of Simple Sequence Repeat (SSR) markers from whole genomes provide valuable information about the biological significance of SSR distribution and also facilitate development of markers for genetic analysis. Whole genome sequencing (WGS)-SSR Annotation Tool (WGSSAT) is a graphical user interface pipeline, developed using Java Netbeans and Perl scripts, that simplifies the process of SSR mining and characterization. WGSSAT takes input in FASTA format and automates the prediction of genes, noncoding RNA (ncRNA), core genes, repeats and SSRs from whole genomes, followed by mapping of the predicted SSRs onto a genome (classified according to genes, ncRNA, repeats, exonic, intronic, and core gene region) along with primer identification and mining of cross-species markers. The program also generates a detailed statistical report along with visualization of mapped SSRs, genes, core genes, and RNAs. The features of WGSSAT were demonstrated using Takifugu rubripes data. This yielded a total of 139 057 SSRs, out of which 113 703 SSR primer pairs were uniquely amplified in silico onto a T. rubripes (fugu) genome. Out of 113 703 mined SSRs, 81 463 were from coding regions (including 4286 exonic and 77 177 intronic), 7 from RNA, and 267 from core genes of fugu, whereas 105 641 SSRs and 601 SSR primer pairs were uniquely mapped onto the medaka genome. WGSSAT is tested under Ubuntu Linux. The source code, documentation, user manual, example dataset and scripts are available online at https://sourceforge.net/projects/wgssat-nbfgr.
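As a rough illustration of the SSR mining step described in this abstract, the sketch below finds perfect mono-, di- and trinucleotide repeats with regular expressions. It is not WGSSAT's implementation: the function name `find_ssrs` and the minimum repeat counts are hypothetical choices for this example.

```python
import re

# Hypothetical minimum repeat counts per motif length (an assumption,
# not the thresholds used by WGSSAT).
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def is_redundant(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA')."""
    k = len(motif)
    return any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, end, motif, repeat_count) for perfect SSRs.

    Coordinates are 0-based and end-exclusive.
    """
    min_repeats = min_repeats or DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # A k-base motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_redundant(motif):
                continue  # already reported under the shorter motif length
            count = (match.end() - match.start()) // k
            hits.append((match.start(), match.end(), motif, count))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "A" * 12 + "TTG" + "CA" * 7 + "GTC" + "ATG" * 5 + "CC"
    for hit in find_ssrs(demo):
        print(hit)
```

A production pipeline would also handle longer motifs, imperfect and compound repeats, and flanking-sequence extraction for primer design, none of which this sketch attempts.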
A 1000 Arab genome project to study the Emirati population.

PubMed

Al-Ali, Mariam; Osman, Wael; Tay, Guan K; AlSafar, Habiba S

2018-04-01

Discoveries from the human genome, HapMap, and 1000 genome projects have collectively contributed toward the creation of a catalog of human genetic variations that has improved our understanding of human diversity. Despite the collegial nature of many of these genome study consortiums, which has led to the cataloging of genetic variations of different ethnic groups from around the world, genome data on the Arab population remains overwhelmingly underrepresented. The National Arab Genome project in the United Arab Emirates (UAE) aims to address this deficiency by using Next Generation Sequencing (NGS) technology to provide data to improve our understanding of the Arab genome and catalog variants that are unique to the Arab population of the UAE. The project was conceived to shed light on the similarities and differences between the Arab genome and those of the other ethnic groups.
Insights from Human/Mouse genome comparisons

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pennacchio, Len A.

2003-03-30

Large-scale public genomic sequencing efforts have provided a wealth of vertebrate sequence data poised to provide insights into mammalian biology. These include deep genomic sequence coverage of human, mouse, rat, zebrafish, and two pufferfish (Fugu rubripes and Tetraodon nigroviridis) (Aparicio et al. 2002; Lander et al. 2001; Venter et al. 2001; Waterston et al. 2002). In addition, a high priority has been placed on determining the genomic sequence of chimpanzee, dog, cow, frog, and chicken (Boguski 2002). While only recently available, whole genome sequence data have provided the unique opportunity to globally compare complete genome contents. Furthermore, the shared evolutionary ancestry of vertebrate species has allowed the development of comparative genomic approaches to identify ancient conserved sequences with functionality. Accordingly, this review focuses on the initial comparison of available mammalian genomes and describes various insights derived from such analysis.
The future is now: single-cell genomics of bacteria and archaea

PubMed Central

Blainey, Paul C.

2013-01-01

Interest in the expanding catalog of uncultivated microorganisms, increasing recognition of heterogeneity among seemingly similar cells, and technological advances in whole-genome amplification and single-cell manipulation are driving considerable progress in single-cell genomics. Here, the spectrum of applications for single-cell genomics, key advances in the development of the field, and emerging methodology for single-cell genome sequencing are reviewed by example with attention to the diversity of approaches and their unique characteristics. Experimental strategies transcending specific methodologies are identified and organized as a road map for future studies in single-cell genomics of environmental microorganisms. Over the next decade, increasingly powerful tools for single-cell genome sequencing and analysis will play key roles in accessing the genomes of uncultivated organisms, determining the basis of microbial community functions, and fundamental aspects of microbial population biology. PMID:23298390
Reconstitution of wild type viral DNA in simian cells transfected with early and late SV40 defective genomes.

PubMed

O'Neill, F J; Gao, Y; Xu, X

1993-11-01

The DNAs of polyomaviruses ordinarily exist as a single circular molecule of approximately 5000 base pairs. Variants of SV40, BKV and JCV have been described which contain two complementing defective DNA molecules. These defectives, which form a bipartite genome structure, contain either the viral early region or the late region. The defectives have the unique property of being able to tolerate variable sized reiterations of regulatory and terminus region sequences, and portions of the coding region. They can also exchange coding region sequences with other polyomaviruses. It has been suggested that the bipartite genome structure might be a stage in the evolution of polyomaviruses which can uniquely sustain genome and sequence diversity. However, it is not known if the regulatory and terminus region sequences are highly mutable. Also, it is not known if the bipartite genome structure is reversible and what the conditions might be which would favor restoration of the monomolecular genome structure. We addressed the first question by sequencing the reiterated regulatory and terminus regions of E- and L-SV40 DNAs. This revealed a large number of mutations in the regulatory regions of the defective genomes, including deletions, insertions, rearrangements and base substitutions. We also detected insertions and base substitutions in the T-antigen gene. We addressed the second question by introducing into permissive simian cells, E- and L-SV40 genomes which had been engineered to contain only a single regulatory region. Analysis of viral DNA from transfected cells demonstrated recombined genomes containing a wild type monomolecular DNA structure. However, the complete defectives, containing reiterated regulatory regions, could often compete away the wild type genomes. The recombinant monomolecular genomes were isolated, cloned and found to be infectious. 
All of the DNA alterations identified in one of the regulatory regions of E-SV40 DNA were present in the recombinant monomolecular genomes. These and other findings indicate that the bipartite genome state can sustain many mutations which wtSV40 cannot directly sustain. However, the mutations can later be introduced into the wild type genomes when the E- and L-SV40 DNAs recombine to generate a new monomolecular genome structure.
Whole genome sequencing of the fish pathogen Francisella noatunensis subsp. orientalis Toba04 gives novel insights into Francisella evolution and pathogenecity

PubMed Central

2012-01-01

Background Francisella is a genus of gram-negative bacteria that are highly virulent in fish and humans; F. tularensis causes the serious human disease tularaemia. Recently, Francisella species have been reported to cause mortality in aquaculture species such as Atlantic cod and tilapia. We have completed the sequencing and draft assembly of the Francisella noatunensis subsp. orientalis Toba04 strain isolated from farmed tilapia. Compared to other available Francisella genomes, it is most similar to the genome of Francisella philomiragia subsp. philomiragia, a free-living bacterium not virulent to humans. Results The genome is rearranged compared to the available Francisella genomes even though we found no IS-elements in the genome. Nearly 16% of the predicted ORFs are pseudogenes. Computational pathway analysis indicates that a number of the metabolic pathways are disrupted due to pseudogenes. Comparing the novel genome with other available Francisella genomes, we found around 2.5% of genes unique to Francisella noatunensis subsp. orientalis Toba04 and a list of genes uniquely present in the human-pathogenic Francisella subspecies. Most of these genes might have been transferred from bacterial species through horizontal gene transfer. Comparative analysis between human and fish pathogens also provides insights into genes responsible for pathogenicity. Our analysis of pseudogenes indicates that pseudogene formation in the Francisella subspecies from tilapia is old, with a large number of pseudogenes having more than one inactivating mutation. Conclusions The fish pathogen lost non-essential genes some time ago. Evolutionary analysis of the Francisella genomes strongly suggests that human and fish pathogenic Francisella species have evolved independently from free-living, metabolically competent Francisella species. These findings will contribute to understanding the evolution of Francisella species and pathogenesis. PMID:23131096
Genomic characterization of a Helicobacter pylori isolate from a patient with gastric cancer in China

PubMed Central

2014-01-01

Background Helicobacter pylori is well known for its relationship with the occurrence of several severe gastric diseases. The mechanisms of pathogenesis triggered by H. pylori are less well known. In this study, we report the genome sequence and genomic characterizations of H. pylori strain HLJ039 that was isolated from a patient with gastric cancer in the Chinese province of Heilongjiang, where there is a high incidence of gastric cancer. To investigate potential genomic features that may be involved in pathogenesis of carcinoma, the genome was compared to three previously sequenced genomes in this area. Result We obtained 42 contigs with a total length of 1,611,192 bp and predicted 1,687 coding sequences. Compared to strains isolated from gastritis and ulcers in this area, 10 different regions were identified as being unique for HLJ039; they mainly encoded type II restriction-modification enzyme, type II m6A methylase, DNA-cytosine methyltransferase, DNA methylase, and hypothetical proteins. A unique 547-bp fragment sharing 93% identity with a hypothetical protein of Helicobacter cinaedi ATCC BAA-847 was not present in any other previous H. pylori strains. Phylogenetic analysis based on core genome single nucleotide polymorphisms shows that HLJ039 is defined as hspEAsia subgroup, which belongs to the hpEastAsia group. Conclusion DNA methylations, variations of the genomic regions involved in restriction and modification systems, are the “hot” regions that may be related to the mechanism of H. pylori-induced gastric cancer. The genome sequence will provide useful information for the deep mining of potential mechanisms related to East Asian gastric cancer. PMID:24565107
Virulence factors encoded by Legionella longbeachae identified on the basis of the genome sequence analysis of clinical isolate D-4968.

PubMed

Kozak, Natalia A; Buss, Meghan; Lucas, Claressa E; Frace, Michael; Govil, Dhwani; Travis, Tatiana; Olsen-Rasmussen, Melissa; Benson, Robert F; Fields, Barry S

2010-02-01

Legionella longbeachae causes most cases of legionellosis in Australia and may be underreported worldwide due to the lack of L. longbeachae-specific diagnostic tests. L. longbeachae displays distinctive differences in intracellular trafficking, caspase 1 activation, and infection in mouse models compared to Legionella pneumophila, yet these two species have indistinguishable clinical presentations in humans. Unlike other legionellae, which inhabit freshwater systems, L. longbeachae is found predominantly in moist soil. In this study, we sequenced and annotated the genome of an L. longbeachae clinical isolate from Oregon, isolate D-4968, and compared it to the previously published genomes of L. pneumophila. The results revealed that the D-4968 genome is larger than the L. pneumophila genome and has a gene order that is different from that of the L. pneumophila genome. Genes encoding structural components of type II, type IV Lvh, and type IV Icm/Dot secretion systems are conserved. In contrast, only 42/140 homologs of genes encoding L. pneumophila Icm/Dot substrates have been found in the D-4968 genome. L. longbeachae encodes numerous proteins with eukaryotic motifs and eukaryote-like proteins unique to this species, including 16 ankyrin repeat-containing proteins and a novel U-box protein. We predict that these proteins are secreted by the L. longbeachae Icm/Dot secretion system. In contrast to the L. pneumophila genome, the L. longbeachae D-4968 genome does not contain flagellar biosynthesis genes, yet it contains a chemotaxis operon. The lack of a flagellum explains the failure of L. longbeachae to activate caspase 1 and trigger pyroptosis in murine macrophages. These unique features of L. longbeachae may reflect adaptation of this species to life in soil.
Web-based visual analysis for high-throughput genomics

PubMed Central

2013-01-01

Background Visualization plays an essential role in genomics research by making it possible to observe correlations and trends in large datasets as well as communicate findings to others. Visual analysis, which combines visualization with analysis tools to enable seamless use of both approaches for scientific investigation, offers a powerful method for performing complex genomic analyses. However, there are numerous challenges that arise when creating rich, interactive Web-based visualizations/visual analysis applications for high-throughput genomics. These challenges include managing data flow from Web server to Web browser, integrating analysis tools and visualizations, and sharing visualizations with colleagues. Results We have created a platform that simplifies the creation of Web-based visualization/visual analysis applications for high-throughput genomics. This platform provides components that make it simple to efficiently query very large datasets, draw common representations of genomic data, integrate with analysis tools, and share or publish fully interactive visualizations. Using this platform, we have created a Circos-style genome-wide viewer, a generic scatter plot for correlation analysis, an interactive phylogenetic tree, a scalable genome browser for next-generation sequencing data, and an application for systematically exploring tool parameter spaces to find good parameter values. All visualizations are interactive and fully customizable. The platform is integrated with the Galaxy (http://galaxyproject.org) genomics workbench, making it easy to integrate new visual applications into Galaxy. Conclusions Visualization and visual analysis play an important role in high-throughput genomics experiments, and approaches are needed to make it easier to create applications for these activities. Our framework provides a foundation for creating Web-based visualizations and integrating them into Galaxy. Finally, the visualizations we have created using the framework are useful tools for high-throughput genomics experiments. PMID:23758618
Systematic CpT (ApG) Depletion and CpG Excess Are Unique Genomic Signatures of Large DNA Viruses Infecting Invertebrates

PubMed Central

Upadhyay, Mohita; Sharma, Neha; Vivekanandan, Perumal

2014-01-01

Differences in the relative abundance of dinucleotides, if any, may provide important clues on host-driven evolution of viruses. We studied dinucleotide frequencies of large DNA viruses infecting vertebrates (n = 105; viruses infecting mammals = 99; viruses infecting aves = 6; viruses infecting reptiles = 1) and invertebrates (n = 88; viruses infecting insects = 84; viruses infecting crustaceans = 4). We have identified systematic depletion of CpT(ApG) dinucleotides and over-representation of CpG dinucleotides as the unique genomic signature of large DNA viruses infecting invertebrates. Detailed investigation of this unique genomic signature suggests the existence of invertebrate host-induced pressures specifically targeting CpT(ApG) and CpG dinucleotides. The depletion of CpT dinucleotides among large DNA viruses infecting invertebrates is, at least in part, explained by non-canonical DNA methylation by the infected host. Our findings highlight the role of invertebrate host-related factors in shaping virus evolution, and they also provide the necessary framework for future studies on evolution, epigenetics and molecular biology of viruses infecting this group of hosts. PMID:25369195
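Depletion or excess of a dinucleotide is commonly expressed as an observed/expected odds ratio, where the expected frequency is the product of the two single-base frequencies. The abstract does not state which measure the authors used, so the sketch below is one conventional way to compute relative abundance, with a hypothetical function name and the assumption that the input holds only A, C, G and T.

```python
from collections import Counter

BASES = "ACGT"


def dinucleotide_odds_ratios(seq):
    """Map each observed dinucleotide XY to f(XY) / (f(X) * f(Y)).

    Single-strand calculation; pairs containing symbols other than
    A, C, G or T are skipped.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return {}
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    n_pairs = n - 1  # number of overlapping dinucleotide positions
    ratios = {}
    for pair, count in pair_counts.items():
        x, y = pair
        if x in BASES and y in BASES:
            ratios[pair] = (count / n_pairs) / (base_freq[x] * base_freq[y])
    return ratios


if __name__ == "__main__":
    ratios = dinucleotide_odds_ratios("ACGT" * 25)
    for pair in sorted(ratios):
        print(pair, round(ratios[pair], 3))
```

Ratios well below 1 suggest depletion and ratios well above 1 suggest excess. Because CpT on one strand pairs with ApG on the other, a strand-symmetric variant would pool counts from the sequence and its reverse complement before taking the ratio.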
Construction, database integration, and application of an Oenothera EST library.

PubMed

Mrácek, Jaroslav; Greiner, Stephan; Cho, Won Kyong; Rauwolf, Uwe; Braun, Martha; Umate, Pavan; Altstätter, Johannes; Stoppel, Rhea; Mlcochová, Lada; Silber, Martina V; Volz, Stefanie M; White, Sarah; Selmeier, Renate; Rudd, Stephen; Herrmann, Reinhold G; Meurer, Jörg

2006-09-01

Coevolution of cellular genetic compartments is a fundamental aspect in eukaryotic genome evolution that becomes apparent in serious developmental disturbances after interspecific organelle exchanges. The genus Oenothera represents a unique, at present the only available, resource to study the role of the compartmentalized plant genome in diversification of populations and speciation processes. An integrated approach involving cDNA cloning, EST sequencing, and bioinformatic data mining was chosen using Oenothera elata with the genetic constitution nuclear genome AA with plastome type I. The Gene Ontology system grouped 1621 unique gene products into 17 different functional categories. Application of arrays generated from a selected fraction of ESTs revealed significantly differing expression profiles among closely related Oenothera species possessing the potential to generate fertile and incompatible plastid/nuclear hybrids (hybrid bleaching). Furthermore, the EST library provides a valuable source of PCR-based polymorphic molecular markers that are instrumental for genotyping and molecular mapping approaches.
Complete Unique Genome Sequence, Expression Profile, and Salivary Gland Tissue Tropism of the Herpesvirus 7 Homolog in Pigtailed Macaques.

PubMed

Staheli, Jeannette P; Dyen, Michael R; Deutsch, Gail H; Basom, Ryan S; Fitzgibbon, Matthew P; Lewis, Patrick; Barcy, Serge

2016-08-01

Human herpesvirus 6A (HHV-6A), HHV-6B, and HHV-7 are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. While the pathogenic potential of HHV-7 is unclear, it can reactivate HHV-6 from latency and thus contributes to severe pathological conditions associated with HHV-6. Because of the ubiquitous nature of roseoloviruses, their roles in such interactions and the resulting pathological consequences have been difficult to study. Furthermore, the lack of a relevant animal model for HHV-7 infection has hindered a better understanding of its contribution to roseolovirus-associated diseases. Using next-generation sequencing analysis, we characterized the unique genome of an uncultured novel pigtailed macaque roseolovirus. Detailed genomic analysis revealed the presence of gene homologs to all 84 known HHV-7 open reading frames. Phylogenetic analysis confirmed that the virus is a macaque homolog of HHV-7, which we have provisionally named Macaca nemestrina herpesvirus 7 (MneHV7). Using high-throughput RNA sequencing, we observed that the salivary gland tissue samples from nine different macaques had distinct MneHV7 gene expression patterns and that the overall number of viral transcripts correlated with viral loads in parotid gland tissue and saliva. Immunohistochemistry staining confirmed that, like HHV-7, MneHV7 exhibits a natural tropism for salivary gland ductal cells. We also observed staining for MneHV7 in peripheral nerve ganglia present in salivary gland tissues, suggesting that HHV-7 may also have a tropism for the peripheral nervous system. Our data demonstrate that MneHV7-infected macaques represent a relevant animal model that may help clarify the causality between roseolovirus reactivation and diseases. Human herpesvirus 6A (HHV-6A), HHV-6B, and HHV-7 are classified as roseoloviruses. 
We have recently discovered that pigtailed macaques are naturally infected with viral homologs of HHV-6 and HHV-7, which we provisionally named MneHV6 and MneHV7, respectively. In this study, we confirm that MneHV7 is genetically and biologically similar to its human counterpart, HHV-7. We determined the complete unique MneHV7 genome sequence and provide a comprehensive annotation of all genes. We also characterized viral transcription profiles in salivary glands from naturally infected macaques. We show that broad transcriptional activity across most of the viral genome is associated with high viral loads in infected parotid glands and that late viral protein expression is detected in salivary duct cells and peripheral nerve ganglia. Our study provides new insights into the natural behavior of an extremely prevalent virus and establishes a basis for subsequent investigations of the mechanisms that cause HHV-7 reactivation and associated disease. Copyright © 2016 Staheli et al.
Complete Unique Genome Sequence, Expression Profile, and Salivary Gland Tissue Tropism of the Herpesvirus 7 Homolog in Pigtailed Macaques

PubMed Central

Staheli, Jeannette P.; Dyen, Michael R.; Deutsch, Gail H.; Basom, Ryan S.; Fitzgibbon, Matthew P.; Lewis, Patrick

2016-01-01

ABSTRACT Human herpesvirus 6A (HHV-6A), HHV-6B, and HHV-7 are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. While the pathogenic potential of HHV-7 is unclear, it can reactivate HHV-6 from latency and thus contributes to severe pathological conditions associated with HHV-6. Because of the ubiquitous nature of roseoloviruses, their roles in such interactions and the resulting pathological consequences have been difficult to study. Furthermore, the lack of a relevant animal model for HHV-7 infection has hindered a better understanding of its contribution to roseolovirus-associated diseases. Using next-generation sequencing analysis, we characterized the unique genome of an uncultured novel pigtailed macaque roseolovirus. Detailed genomic analysis revealed the presence of gene homologs to all 84 known HHV-7 open reading frames. Phylogenetic analysis confirmed that the virus is a macaque homolog of HHV-7, which we have provisionally named Macaca nemestrina herpesvirus 7 (MneHV7). Using high-throughput RNA sequencing, we observed that the salivary gland tissue samples from nine different macaques had distinct MneHV7 gene expression patterns and that the overall number of viral transcripts correlated with viral loads in parotid gland tissue and saliva. Immunohistochemistry staining confirmed that, like HHV-7, MneHV7 exhibits a natural tropism for salivary gland ductal cells. We also observed staining for MneHV7 in peripheral nerve ganglia present in salivary gland tissues, suggesting that HHV-7 may also have a tropism for the peripheral nervous system. Our data demonstrate that MneHV7-infected macaques represent a relevant animal model that may help clarify the causality between roseolovirus reactivation and diseases. IMPORTANCE Human herpesvirus 6A (HHV-6A), HHV-6B, and HHV-7 are classified as roseoloviruses. 
We have recently discovered that pigtailed macaques are naturally infected with viral homologs of HHV-6 and HHV-7, which we provisionally named MneHV6 and MneHV7, respectively. In this study, we confirm that MneHV7 is genetically and biologically similar to its human counterpart, HHV-7. We determined the complete unique MneHV7 genome sequence and provide a comprehensive annotation of all genes. We also characterized viral transcription profiles in salivary glands from naturally infected macaques. We show that broad transcriptional activity across most of the viral genome is associated with high viral loads in infected parotid glands and that late viral protein expression is detected in salivary duct cells and peripheral nerve ganglia. Our study provides new insights into the natural behavior of an extremely prevalent virus and establishes a basis for subsequent investigations of the mechanisms that cause HHV-7 reactivation and associated disease. PMID:27170755

Patterns of divergence across the geographic and genomic landscape of a butterfly hybrid zone associated with a climatic gradient

USDA-ARS's Scientific Manuscript database

The process of speciation is impacted by the interaction between the genomic architecture of diverging lineages and the environmental context they occupy. Yet, while climate can have a significant impact on this interaction, its role in determining the patterns of geographic and genomic divergence i...
Unique secreted–surface protein complex of Lactobacillus rhamnosus, identified by phage display

PubMed Central

Gagic, Dragana; Wen, Wesley; Collett, Michael A; Rakonjac, Jasna

2013-01-01

Proteins are the most diverse structures on bacterial surfaces; hence, they are candidates for species- and strain-specific interactions of bacteria with the host, environment, and other microorganisms. Genomics has decoded thousands of bacterial surface and secreted proteins, yet the function of most cannot be predicted because of the enormous variability and a lack of experimental data that would allow deduction of function through homology. Here, we used phage display to identify a pair of interacting extracellular proteins in the probiotic bacterium Lactobacillus rhamnosus HN001. A secreted protein, SpcA, containing two bacterial immunoglobulin-like domains type 3 (Big-3) and a domain distantly related to plant pathogen response domain 1 (PR-1-like) was identified by screening of an L. rhamnosus HN001 library using HN001 cells as bait. The SpcA-“docking” protein, SpcB, was in turn detected by another phage display library screening, using purified SpcA as bait. SpcB is a 3275-residue cell-surface protein that contains general features of large glycosylated Serine-rich adhesins/fibrils from gram-positive bacteria, including the hallmark signal sequence motif KxYKxGKxW. Both proteins are encoded by genes within a L. rhamnosus-unique gene cluster that distinguishes this species from other lactobacilli. To our knowledge, this is the first example of a secreted-docking protein pair identified in lactobacilli. PMID:23233310
Animal selection for whole genome sequencing by quantifying the unique contribution of homozygous haplotypes sequenced

USDA-ARS's Scientific Manuscript database

Major whole genome sequencing projects promise to identify rare and causal variants within livestock species; however, the efficient selection of animals for sequencing remains a major problem within these surveys. The goal of this project was to develop a library of high accuracy genetic variants f...
Duplicated genes evolve independently in allopolyploid cotton.

Treesearch

Richard C. Cronn; Randall L. Small; Jonathan F. Wendel

1999-01-01

Of the many processes that generate gene duplications, polyploidy is unique in that entire genomes are duplicated. This process has been important in the evolution of many eukaryotic groups, and it occurs with high frequency in plants. Recent evidence suggests that polyploidization may be accompanied by rapid genomic changes, but the evolutionary fate of discrete loci...
Evidence of evolutionary history and selective sweeps in the genome of Meishan pig reveals its genetic and phenotypic characterization

USDA-ARS's Scientific Manuscript database

Meishan is a famous Chinese indigenous pig breed known for its extremely high fecundity. To explore if Meishan has unique evolutionary process and genome characteristics differing from other pig breeds, we systematically analyzed its genetic divergence, and demographic history by large-scale reseque...
Development and validation of 697 novel polymorphic genomic and EST-SSR Markers in the American cranberry (Vaccinium macrocarpon Ait.)

USDA-ARS's Scientific Manuscript database

The American cranberry, Vaccinium macrocarpon Ait., is an economically important North American fruit crop that is consumed because of its unique flavor and potential health benefits. However, a lack of abundant, genome-wide molecular markers has limited the adoption of modern molecular assisted sel...
Discovering causal pathways linking genomic events to transcriptional states using Tied Diffusion Through Interacting Events (TieDIE).

PubMed

Paull, Evan O; Carlin, Daniel E; Niepel, Mario; Sorger, Peter K; Haussler, David; Stuart, Joshua M

2013-11-01

Identifying the cellular wiring that connects genomic perturbations to transcriptional changes in cancer is essential to gain a mechanistic understanding of disease initiation, progression and ultimately to predict drug response. We have developed a method called Tied Diffusion Through Interacting Events (TieDIE) that uses a network diffusion approach to connect genomic perturbations to gene expression changes characteristic of cancer subtypes. The method computes a subnetwork of protein-protein interactions, predicted transcription factor-to-target connections and curated interactions from literature that connects genomic and transcriptomic perturbations. Application of TieDIE to The Cancer Genome Atlas and a breast cancer cell line dataset identified key signaling pathways, with examples impinging on MYC activity. Interlinking genes are predicted to correspond to essential components of cancer signaling and may provide a mechanistic explanation of tumor character and suggest subtype-specific drug targets. Software is available from the Stuart lab's wiki: https://sysbiowiki.soe.ucsc.edu/tiedie. jstuart@ucsc.edu. Supplementary data are available at Bioinformatics online.
Identification of common, unique and polymorphic microsatellites among 73 cyanobacterial genomes.

PubMed

Kabra, Ritika; Kapil, Aditi; Attarwala, Kherunnisa; Rai, Piyush Kant; Shanker, Asheesh

2016-04-01

Microsatellites also known as Simple Sequence Repeats are short tandem repeats of 1-6 nucleotides. These repeats are found in coding as well as non-coding regions of both prokaryotic and eukaryotic genomes and play a significant role in the study of gene regulation, genetic mapping, DNA fingerprinting and evolutionary studies. The availability of 73 complete genome sequences of cyanobacteria enabled us to mine and statistically analyze microsatellites in these genomes. The cyanobacterial microsatellites identified through bioinformatics analysis were stored in a user-friendly database named CyanoSat, which is an efficient data representation and query system designed using ASP.net. The information in CyanoSat comprises of perfect, imperfect and compound microsatellites found in coding, non-coding and coding-non-coding regions. Moreover, it contains PCR primers with 200 nucleotides long flanking region. The mined cyanobacterial microsatellites can be freely accessed at www.compubio.in/CyanoSat/home.aspx. In addition to this 82 polymorphic, 13,866 unique and 2390 common microsatellites were also detected. These microsatellites will be useful in strain identification and genetic diversity studies of cyanobacteria.
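The definition above (perfect tandem repeats of a 1-6 nucleotide unit) can be illustrated with a short Python sketch. This is not the CyanoSat pipeline, whose mining tool and thresholds the abstract does not state; the function name `find_microsatellites` and the `min_repeats=3` cutoff are assumptions chosen for illustration.

```python
import re

def find_microsatellites(seq, min_repeats=3, max_unit=6):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    min_repeats is an illustrative threshold, not one taken from the abstract.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif,
            # e.g. "ACAC" is already reported as the dinucleotide "AC".
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)

if __name__ == "__main__":
    for start, unit, copies in find_microsatellites("TTACACACACCTATGATGATGCC"):
        print(start, unit, copies)
```

Only perfect repeats are reported here; imperfect and compound microsatellites, which the database also stores, would need a scoring scheme that tolerates mismatches.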
The Genomic Evolution of Prostate Cancer

DTIC Science & Technology

2014-10-01

Mutation characteristics. (a) Number of high-confidence somatic mutations across all foci. Non-silent, non-silent mutations; Unique, number of unique...genes harboring a non-silent mutation; Reported, gene reported to be mutated in references 9–12 and 14. (b) Spectrum of unique high confidence somatic...epigenetic and microRNA-mediated inactivation of LRP1B, a modulator of the extracellular environment of thyroid cancer cells. Oncogene 2011; 30
The pomegranate (Punica granatum L.) genome provides insights into fruit quality and ovule developmental biology.

PubMed

Yuan, Zhaohe; Fang, Yanming; Zhang, Taikui; Fei, Zhangjun; Han, Fengming; Liu, Cuiyu; Liu, Min; Xiao, Wei; Zhang, Wenjing; Wu, Shan; Zhang, Mengwei; Ju, Youhui; Xu, Huili; Dai, He; Liu, Yujun; Chen, Yanhui; Wang, Lili; Zhou, Jianqing; Guan, Dian; Yan, Ming; Xia, Yanhua; Huang, Xianbin; Liu, Dongyuan; Wei, Hongmin; Zheng, Hongkun

2017-12-22

Pomegranate (Punica granatum L.) has an ancient cultivation history and has become an emerging profitable fruit crop due to its attractive features such as the bright red appearance and the high abundance of medicinally valuable ellagitannin-based compounds in its peel and aril. However, the limited genomic resources have restricted further elucidation of genetics and evolution of these interesting traits. Here, we report a 274-Mb high-quality draft pomegranate genome sequence, which covers approximately 81.5% of the estimated 336-Mb genome, consists of 2177 scaffolds with an N50 size of 1.7 Mb and contains 30 903 genes. Phylogenomic analysis supported that pomegranate belongs to the Lythraceae family rather than the monogeneric Punicaceae family, and comparative analyses showed that pomegranate and Eucalyptus grandis share the paleotetraploidy event. Integrated genomic and transcriptomic analyses provided insights into the molecular mechanisms underlying the biosynthesis of ellagitannin-based compounds, the colour formation in both peels and arils during pomegranate fruit development, and the unique ovule development processes that are characteristic of pomegranate. This genome sequence provides an important resource to expand our understanding of some unique biological processes and to facilitate both comparative biology studies and crop breeding. © 2017 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
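The assembly statistics quoted above (coverage of the estimated genome and scaffold N50) follow standard definitions, sketched below in Python. The scaffold lengths in the N50 example are a toy list, not the pomegranate scaffolds, and the function name `n50` is an illustrative choice.

```python
def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")

# Figures quoted in the abstract.
assembled_mb = 274
estimated_mb = 336
coverage_pct = 100 * assembled_mb / estimated_mb

if __name__ == "__main__":
    print(f"coverage: {coverage_pct:.1f}%")   # agrees with the reported ~81.5%
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))       # toy scaffold lengths
```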
The Complete Mitochondrial Genome and Novel Gene Arrangement of the Unique-Headed Bug Stenopirates sp. (Hemiptera: Enicocephalidae)

PubMed Central

Li, Hu; Liu, Hui; Shi, Aimin; Štys, Pavel; Zhou, Xuguo; Cai, Wanzhi

2012-01-01

Many true bugs are important insect pests of cultivated crops and some are important vectors of human diseases, but few cladistic analyses have addressed relationships among the seven infraorders of Heteroptera. The Enicocephalomorpha and Nepomorpha are considered the basal groups of Heteroptera, but the basal-most lineage remains unresolved. Here we report the mitochondrial genome of the unique-headed bug Stenopirates sp., the first mitochondrial genome sequenced from Enicocephalomorpha. The Stenopirates sp. mitochondrial genome is a typical circular DNA molecule of 15,384 bp in length, and contains 37 genes and a large non-coding fragment. The gene order differs substantially from other known insect mitochondrial genomes, with rearrangements of both tRNA genes and protein-coding genes. The overall AT content (82.5%) of Stenopirates sp. is the highest among all the known heteropteran mitochondrial genomes. The strand bias is consistent with other true bugs with negative GC-skew and positive AT-skew for the J-strand. The heteropteran mitochondrial atp8 exhibits the highest evolutionary rate, whereas cox1 appears to have the lowest rate. Furthermore, a negative correlation was observed between the variation of nucleotide substitutions and the GC content of each protein-coding gene. A microsatellite was identified in the putative control region. Finally, phylogenetic reconstruction suggests that Enicocephalomorpha is the sister group to all the remaining Heteroptera. PMID:22235294
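AT content and strand skew, as reported above, are usually computed from base counts on one strand: AT-skew = (A - T)/(A + T) and GC-skew = (G - C)/(G + C). The Python sketch below applies these common definitions to a toy sequence; the abstract does not say which software the authors used, and the function names are illustrative.

```python
def at_content(seq):
    """Fraction of A and T bases in the sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)

def at_skew(seq):
    """(A - T) / (A + T) on the given strand."""
    seq = seq.upper()
    a, t = seq.count("A"), seq.count("T")
    return (a - t) / (a + t)

def gc_skew(seq):
    """(G - C) / (G + C) on the given strand."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c)

if __name__ == "__main__":
    toy = "AAATTGCC"  # toy strand, not the Stenopirates sequence
    print(at_content(toy), at_skew(toy), gc_skew(toy))
```

The toy strand gives a positive AT-skew and a negative GC-skew, the same sign pattern the abstract describes for the J-strand.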
Mapping of Micro-Tom BAC-End Sequences to the Reference Tomato Genome Reveals Possible Genome Rearrangements and Polymorphisms

PubMed Central

Asamizu, Erika; Shirasawa, Kenta; Hirakawa, Hideki; Sato, Shusei; Tabata, Satoshi; Yano, Kentaro; Ariizumi, Tohru; Shibata, Daisuke; Ezura, Hiroshi

2012-01-01

A total of 93,682 BAC-end sequences (BESs) were generated from a dwarf model tomato, cv. Micro-Tom. After removing repetitive sequences, the BESs were similarity searched against the reference tomato genome of a standard cultivar, “Heinz 1706.” By referring to the “Heinz 1706” physical map and by eliminating redundant or nonsignificant hits, 28,804 “unique pair ends” and 8,263 “unique ends” were selected to construct hypothetical BAC contigs. The total physical length of the BAC contigs was 495,833,423 bp, covering 65.3% of the entire genome. The average coverage of euchromatin and heterochromatin was 58.9% and 67.3%, respectively. From this analysis, two possible genome rearrangements were identified: one in chromosome 2 (inversion) and the other in chromosome 3 (inversion and translocation). Polymorphisms (SNPs and Indels) between the two cultivars were identified from the BLAST alignments. As a result, 171,792 polymorphisms were mapped on 12 chromosomes. Among these, 30,930 polymorphisms were found in euchromatin (1 per 3,565 bp) and 140,862 were found in heterochromatin (1 per 2,737 bp). The average polymorphism density in the genome was 1 polymorphism per 2,886 bp. To facilitate the use of these data in Micro-Tom research, the BAC contig and polymorphism information are available in the TOMATOMICS database. PMID:23227037
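The density figures above can be cross-checked with simple arithmetic. The Python sketch below assumes the genome-wide density was computed over the total BAC contig length, which the abstract does not state explicitly but which the numbers appear consistent with; the variable names are illustrative.

```python
# Figures quoted in the abstract.
total_polymorphisms = 171_792
euchromatin_hits = 30_930
heterochromatin_hits = 140_862
contig_length_bp = 495_833_423

def bp_per_polymorphism(length_bp, count):
    """Average spacing between polymorphisms, in base pairs."""
    return length_bp / count

genome_wide = bp_per_polymorphism(contig_length_bp, total_polymorphisms)

# Back-computed region lengths from the rounded per-region densities
# (approximate, because the published densities are rounded).
implied_eu_bp = euchromatin_hits * 3_565
implied_het_bp = heterochromatin_hits * 2_737

if __name__ == "__main__":
    print(euchromatin_hits + heterochromatin_hits == total_polymorphisms)
    print(f"about 1 polymorphism per {genome_wide:.0f} bp")
    print(implied_eu_bp + implied_het_bp, "vs", contig_length_bp)
```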
A Unique N-Terminal Sequence in the Carnation Italian ringspot virus p36 Replicase-Associated Protein Interacts with the Host Cell ESCRT-I Component Vps23

PubMed Central

Richardson, Lynn G. L.; Clendening, Eric A.; Sheen, Hyukho; Gidda, Satinder K.; White, K. Andrew

2014-01-01

ABSTRACT As with most positive-strand RNA viruses, infection by plant tombusviruses results in extensive rearrangement of specific host cell organelle membranes that serve as the sites of viral replication. The tombusvirus Tomato bushy stunt virus (TBSV) replicates within spherules derived from the peroxisomal boundary membrane, a process that involves the coordinated action of various viral and cellular factors, including constituents of the endosomal sorting complex required for transport (ESCRT). ESCRT is comprised of a series of protein subcomplexes (i.e., ESCRT-0, -I, -II, and -III) that normally participate in late endosome biogenesis and some of which are also hijacked by certain enveloped retroviruses (e.g., HIV) for viral budding from the plasma membrane. Here we show that the replication of Carnation Italian ringspot virus (CIRV), a tombusvirus that replicates at mitochondrial membranes, also relies on ESCRT. In plant cells, CIRV recruits the ESCRT-I protein, Vps23, to mitochondria through an interaction that involves a unique region in the N terminus of the p36 replicase-associated protein that is not conserved in TBSV or other peroxisome-targeted tombusviruses. The interaction between p36 and Vps23 also involves the Vps23 C-terminal steadiness box domain and not its N-terminal ubiquitin E2 variant domain, which in the case of TBSV (and enveloped retroviruses) mediates the interaction with ESCRT. Overall, these results provide evidence that CIRV uses a unique N-terminal sequence for the recruitment of Vps23 that is distinct from those used by TBSV and certain mammalian viruses for ESCRT recruitment. Characterization of this novel interaction with Vps23 contributes to our understanding of how CIRV may have evolved to exploit key differences in the plant ESCRT machinery. IMPORTANCE Positive-strand RNA viruses replicate their genomes in association with specific host cell membranes. 
To accomplish this, cellular components responsible for membrane biogenesis and modeling are appropriated by viral proteins and redirected to assemble membrane-bound viral replicase complexes. The diverse pathways leading to the formation of these replication structures are poorly understood. We have determined that the cellular ESCRT system that is normally responsible for mediating late endosome biogenesis is also involved in the replication of the tombusvirus Carnation Italian ringspot virus (CIRV) at mitochondria. Notably, CIRV recruits ESCRT to the mitochondrial outer membrane via an interaction between a unique motif in the viral protein p36 and the ESCRT component Vps23. Our findings provide new insights into tombusvirus replication and the virus-induced remodeling of plant intracellular membranes, as well as normal ESCRT assembly in plants. PMID:24672030
Azolla--a model organism for plant genomic studies.

PubMed

Qiu, Yin-Long; Yu, Jun

2003-02-01

The aquatic ferns of the genus Azolla are nitrogen-fixing plants that have great potential in agricultural production and environmental conservation. Azolla in many respects is qualified to serve as a model organism for genomic studies because of its importance in agriculture, its unique position in plant evolution, its symbiotic relationship with the N2-fixing cyanobacterium, Anabaena azollae, and its moderate-sized genome. The goals of this genome project are not only to understand the biology of the Azolla genome to promote its applications in biological research and agricultural practice but also to gain critical insights about evolution of plant genomes. Together with the strategic and technical improvement as well as cost reduction of DNA sequencing, the deciphering of their genetic code is imminent.
THE E1 PROTEINS

PubMed Central

Bergvall, Monika; Melendy, Thomas; Archambault, Jacques

2013-01-01

E1, an ATP-dependent DNA helicase, is the only enzyme encoded by papillomaviruses (PVs). It is essential for replication and amplification of the viral episome in the nucleus of infected cells. To do so, E1 assembles into a double-hexamer at the viral origin, unwinds DNA at the origin and ahead of the replication fork and interacts with cellular DNA replication factors. Biochemical and structural studies have revealed the assembly pathway of E1 at the origin and how the enzyme unwinds DNA using a spiral escalator mechanism. E1 is tightly regulated in vivo, in particular by post-translational modifications that restrict its accumulation in the nucleus. Here we review how different functional domains of E1 orchestrate viral DNA replication, with an emphasis on their interactions with substrate DNA, host DNA replication factors and modifying enzymes. These studies have made E1 one of the best characterized helicases and provided unique insights on how PVs usurp different host-cell machineries to replicate and amplify their genome in a tightly controlled manner. PMID:24029589
Nematode-Trapping Fungi.

PubMed

Jiang, Xiangzhi; Xiang, Meichun; Liu, Xingzhong

2017-01-01

Nematode-trapping fungi are a unique and intriguing group of carnivorous microorganisms that can trap and digest nematodes by means of specialized trapping structures. They can develop diverse trapping devices, such as adhesive hyphae, adhesive knobs, adhesive networks, constricting rings, and nonconstricting rings. Nematode-trapping fungi have been found in all regions of the world, from the tropics to Antarctica, from terrestrial to aquatic ecosystems. They play an important ecological role in regulating nematode dynamics in soil. Molecular phylogenetic studies have shown that the majority of nematode-trapping fungi belong to a monophyletic group in the order Orbiliales (Ascomycota). Nematode-trapping fungi serve as an excellent model system for understanding fungal evolution and interaction between fungi and nematodes. With the development of molecular techniques and genome sequencing, their evolutionary origins and divergence, and the mechanisms underlying fungus-nematode interactions have been well studied. In recent decades, an increasing concern about the environmental hazards of using chemical nematicides has led to the application of these biological control agents as a rapidly developing component of crop protection.
In situ structures of the genome and genome-delivery apparatus in a single-stranded RNA virus.

PubMed

Dai, Xinghong; Li, Zhihai; Lai, Mason; Shu, Sara; Du, Yushen; Zhou, Z Hong; Sun, Ren

2017-01-05

Packaging of the genome into a protein capsid and its subsequent delivery into a host cell are two fundamental processes in the life cycle of a virus. Unlike double-stranded DNA viruses, which pump their genome into a preformed capsid, single-stranded RNA (ssRNA) viruses, such as bacteriophage MS2, co-assemble their capsid with the genome; however, the structural basis of this co-assembly is poorly understood. MS2 infects Escherichia coli via the host 'sex pilus' (F-pilus); it was the first fully sequenced organism and is a model system for studies of translational gene regulation, RNA-protein interactions, and RNA virus assembly. Its positive-sense ssRNA genome of 3,569 bases is enclosed in a capsid with one maturation protein monomer and 89 coat protein dimers arranged in a T = 3 icosahedral lattice. The maturation protein is responsible for attaching the virus to an F-pilus and delivering the viral genome into the host during infection, but how the genome is organized and delivered is not known. Here we describe the MS2 structure at 3.6 Å resolution, determined by electron-counting cryo-electron microscopy (cryoEM) and asymmetric reconstruction. We traced approximately 80% of the backbone of the viral genome, built atomic models for 16 RNA stem-loops, and identified three conserved motifs of RNA-coat protein interactions among 15 of these stem-loops with diverse sequences. The stem-loop at the 3' end of the genome interacts extensively with the maturation protein, which, with just a six-helix bundle and a six-stranded β-sheet, forms a genome-delivery apparatus and joins 89 coat protein dimers to form a capsid. This atomic description of genome-capsid interactions in a spherical ssRNA virus provides insight into genome delivery via the host sex pilus and mechanisms underlying ssRNA-capsid co-assembly, and inspires speculation about the links between nucleoprotein complexes and the origins of viruses.
Complete nucleotide sequence and genome structure of a Japanese isolate of hibiscus latent Fort Pierce virus, a unique tobamovirus that contains an internal poly(A) region in its 3' end.

PubMed

Yoshida, Tetsuya; Kitazawa, Yugo; Komatsu, Ken; Neriya, Yutaro; Ishikawa, Kazuya; Fujita, Naoko; Hashimoto, Masayoshi; Maejima, Kensaku; Yamaji, Yasuyuki; Namba, Shigetou

2014-11-01

In this study, we detected a Japanese isolate of hibiscus latent Fort Pierce virus (HLFPV-J), a member of the genus Tobamovirus, in a hibiscus plant in Japan and determined the complete sequence and organization of its genome. HLFPV-J has four open reading frames (ORFs), each of which shares more than 98 % nucleotide sequence identity with those of other HLFPV isolates. Moreover, HLFPV-J contains a unique internal poly(A) region of variable length, ranging from 44 to 78 nucleotides, in its 3'-untranslated region (UTR), as is the case with hibiscus latent Singapore virus (HLSV), another hibiscus-infecting tobamovirus. The length of the HLFPV-J genome was 6431 nucleotides, including the shortest internal poly(A) region. The sequence identities of ORFs 1, 2, 3 and 4 of HLFPV-J to other tobamoviruses were 46.6-68.7, 49.9-70.8, 31.0-70.8 and 39.4-70.1 %, respectively, at the nucleotide level and 39.8-75.0, 43.6-77.8, 19.2-70.4 and 31.2-74.2 %, respectively, at the amino acid level. The 5'- and 3'-UTRs of HLFPV-J showed 24.3-58.6 and 13.0-79.8 % identity, respectively, to other tobamoviruses. In particular, when compared to other tobamoviruses, each ORF and UTR of HLFPV-J showed the highest sequence identity to those of HLSV. Phylogenetic analysis showed that HLFPV-J, other HLFPV isolates and HLSV constitute a malvaceous-plant-infecting tobamovirus cluster. These results indicate that the genomic structure of HLFPV-J has unique features similar to those of HLSV. To our knowledge, this is the first report of the complete genome sequence of HLFPV.
Accurate prediction of protein–protein interactions from sequence alignments using a Bayesian method

PubMed Central

Burger, Lukas; van Nimwegen, Erik

2008-01-01

Accurate and large-scale prediction of protein–protein interactions directly from amino-acid sequences is one of the great challenges in computational biology. Here we present a new Bayesian network method that predicts interaction partners using only multiple alignments of amino-acid sequences of interacting protein domains, without tunable parameters, and without the need for any training examples. We first apply the method to bacterial two-component systems and comprehensively reconstruct two-component signaling networks across all sequenced bacteria. Comparisons of our predictions with known interactions show that our method infers interaction partners genome-wide with high accuracy. To demonstrate the general applicability of our method we show that it also accurately predicts interaction partners in a recent dataset of polyketide synthases. Analysis of the predicted genome-wide two-component signaling networks shows that cognates (interacting kinase/regulator pairs, which lie adjacent on the genome) and orphans (which lie isolated) form two relatively independent components of the signaling network in each genome. In addition, while most genes are predicted to have only a small number of interaction partners, we find that 10% of orphans form a separate class of ‘hub’ nodes that distribute and integrate signals to and from up to tens of different interaction partners. PMID:18277381
Microbial ecology of deep-sea hypersaline anoxic basins.

PubMed

Merlino, Giuseppe; Barozzi, Alan; Michoud, Grégoire; Ngugi, David Kamanda; Daffonchio, Daniele

2018-07-01

Deep hypersaline anoxic basins (DHABs) are unique water bodies occurring within fractures at the bottom of the sea, where the dissolution of anciently buried evaporites created dense anoxic brines that are separated by a chemocline/pycnocline from the overlying oxygenated deep-seawater column. DHABs have been described in the Gulf of Mexico, the Mediterranean Sea, the Black Sea and the Red Sea. They are characterized by prolonged historical separation of the brines from the upper water column due to lack of mixing and by extreme conditions of salinity, anoxia, and relatively high hydrostatic pressure and temperatures. Due to these combined selection factors, unique microbial assemblages thrive in these polyextreme ecosystems. The topological localization of the different taxa in the brine-seawater transition zone coupled with the metabolic interactions and niche adaptations determine the metabolic functioning and biogeochemistry of DHABs. In particular, inherent metabolic strategies accompanied by genetic adaptations have provided insights on how prokaryotic communities can adapt to salt-saturated conditions. Here, we review the current knowledge of the diversity, genomics, metabolisms and ecology of prokaryotes in DHABs.

Genome of Horsepox Virus

PubMed Central

Tulman, E. R.; Delhon, G.; Afonso, C. L.; Lu, Z.; Zsak, L.; Sandybaev, N. T.; Kerembekova, U. Z.; Zaitsev, V. L.; Kutish, G. F.; Rock, D. L.

2006-01-01

Here we present the genomic sequence of horsepox virus (HSPV) isolate MNR-76, an orthopoxvirus (OPV) isolated in 1976 from diseased Mongolian horses. The 212-kbp genome contained 7.5-kbp inverted terminal repeats and lacked extensive terminal tandem repetition. HSPV contained 236 open reading frames (ORFs) with similarity to those in other OPVs, with those in the central 100-kbp region most conserved relative to other OPVs. Phylogenetic analysis of the conserved region indicated that HSPV is closely related to sequenced isolates of vaccinia virus (VACV) and rabbitpox virus, clearly grouping together these VACV-like viruses. Fifty-four HSPV ORFs likely represented fragments of 25 orthologous OPV genes, including in the central region the only known fragmented form of an OPV ribonucleotide reductase large subunit gene. In terminal genomic regions, HSPV lacked full-length homologues of genes variably fragmented in other VACV-like viruses but was unique in fragmentation of the homologue of VACV strain Copenhagen B6R, a gene intact in other known VACV-like viruses. Notably, HSPV contained in terminal genomic regions 17 kbp of OPV-like sequence absent in known VACV-like viruses, including fragments of genes intact in other OPVs and approximately 1.4 kb of sequence present only in cowpox virus (CPXV). HSPV also contained seven full-length genes fragmented or missing in other VACV-like viruses, including intact homologues of the CPXV strain GRI-90 D2L/I4R CrmB and D13L CD30-like tumor necrosis factor receptors, D3L/I3R and C1L ankyrin repeat proteins, B19R kelch-like protein, D7L BTB/POZ domain protein, and B22R variola virus B22R-like protein. These results indicated that HSPV contains unique genomic features likely contributing to a unique virulence/host range phenotype. They also indicated that while closely related to known VACV-like viruses, HSPV contains additional, potentially ancestral sequences absent in other VACV-like viruses. PMID:16940536
Complete Chloroplast Genome Sequences of Mongolia Medicine Artemisia frigida and Phylogenetic Relationships with Other Plants

PubMed Central

Liu, Yue; Huo, Naxin; Dong, Lingli; Wang, Yi; Zhang, Shuixian; Young, Hugh A.; Feng, Xiaoxiao; Gu, Yong Qiang

2013-01-01

Background Artemisia frigida Willd. is an important Mongolian traditional medicinal plant with pharmacological functions of stanch and detumescence. However, there is little sequence and genomic information available for Artemisia frigida, which makes phylogenetic identification, evolutionary studies, and genetic improvement of its value very difficult. We report the complete chloroplast genome sequence of Artemisia frigida based on 454 pyrosequencing. Methodology/Principal Findings The complete chloroplast genome of Artemisia frigida is 151,076 bp including a large single copy (LSC) region of 82,740 bp, a small single copy (SSC) region of 18,394 bp and a pair of inverted repeats (IRs) of 24,971 bp. The genome contains 114 unique genes and 18 duplicated genes. The chloroplast genome of Artemisia frigida contains a small 3.4 kb inversion within a large 23 kb inversion in the LSC region, a unique feature in Asteraceae. The gene order in the SSC region of Artemisia frigida is inverted compared with the other 6 Asteraceae species with the chloroplast genomes sequenced. This inversion is likely caused by an intramolecular recombination event that occurred only in Artemisia frigida. The existence of rich SSR loci in the Artemisia frigida chloroplast genome provides a rare opportunity to study population genetics of this Mongolian medicinal plant. Phylogenetic analysis demonstrates a sister relationship between Artemisia frigida and four other species in Asteraceae, including Ageratina adenophora, Helianthus annuus, Guizotia abyssinica and Lactuca sativa, based on 61 protein-coding sequences. Furthermore, Artemisia frigida was placed in the tribe Anthemideae in the subfamily Asteroideae (Asteraceae) based on ndhF and trnL-F sequence comparisons. Conclusion The chloroplast genome sequence of Artemisia frigida was assembled and analyzed in this study, representing the first plastid genome sequenced in the Anthemideae tribe. 
This complete chloroplast genome sequence will be useful for molecular ecology and molecular phylogeny studies within Artemisia species and also within the Asteraceae family. PMID:23460871
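The region sizes reported for the Artemisia frigida chloroplast genome can be checked against the total length, reading "a pair of inverted repeats (IRs) of 24,971 bp" as 24,971 bp per repeat copy. That reading is an assumption, and the Python sketch below shows that it reproduces the stated total; the variable names are illustrative.

```python
# Region sizes quoted in the abstract, in base pairs.
lsc_bp = 82_740      # large single copy region
ssc_bp = 18_394      # small single copy region
ir_bp = 24_971       # one inverted repeat copy (assumed per-copy length)
reported_total_bp = 151_076

# Quadripartite structure: LSC + SSC + two IR copies.
computed_total_bp = lsc_bp + ssc_bp + 2 * ir_bp

if __name__ == "__main__":
    print(computed_total_bp, computed_total_bp == reported_total_bp)
```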
Comparative chloroplast genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale – A wild ancestor of cultivated buckwheat

PubMed Central

Logacheva, Maria D; Samigullin, Tahir H; Dhingra, Amit; Penin, Aleksey A

2008-01-01

Background Chloroplast genome sequences are extremely informative about species-interrelationships owing to their non-meiotic and often uniparental inheritance over generations. The subject of our study, Fagopyrum esculentum, is a member of the family Polygonaceae belonging to the order Caryophyllales. An uncertainty remains regarding the affinity of Caryophyllales and the asterids that could be due to undersampling of the taxa. With that background, having access to the complete chloroplast genome sequence for Fagopyrum becomes quite pertinent. Results We report the complete chloroplast genome sequence of a wild ancestor of cultivated buckwheat, Fagopyrum esculentum ssp. ancestrale. The sequence was rapidly determined using a previously described approach that utilized a PCR-based method and employed universal primers, designed on the scaffold of multiple sequence alignment of chloroplast genomes. The gene content and order in the buckwheat chloroplast genome are similar to those of Spinacia oleracea. However, some unique structural differences exist: the presence of an intron in the rpl2 gene, a frameshift mutation in the rpl23 gene and extension of the inverted repeat region to include the ycf1 gene. Phylogenetic analysis of 61 protein-coding gene sequences from 44 complete plastid genomes provided strong support for the sister relationships of Caryophyllales (including Polygonaceae) to asterids. Further, our analysis also provided support for Amborella as sister to all other angiosperms, but interestingly, in the Bayesian phylogeny inference based on the first two codon positions Amborella united with Nymphaeales. Conclusion Comparative genomics analyses revealed that the Fagopyrum chloroplast genome harbors the characteristic gene content and organization as has been described for several other chloroplast genomes. However, it has some unique structural features distinct from previously reported complete chloroplast genome sequences. 
Phylogenetic analysis of the dataset, including this new sequence from non-core Caryophyllales supports the sister relationship between Caryophyllales and asterids. PMID:18492277
Exploiting genotyping by sequencing to characterize the genomic structure of the American cranberry through high-density linkage mapping.

PubMed

Covarrubias-Pazaran, Giovanny; Diaz-Garcia, Luis; Schlautman, Brandon; Deutsch, Joseph; Salazar, Walter; Hernandez-Ochoa, Miguel; Grygleski, Edward; Steffan, Shawn; Iorizzo, Massimo; Polashock, James; Vorsa, Nicholi; Zalapa, Juan

2016-06-13

The application of genotyping by sequencing (GBS) approaches, combined with data imputation methodologies, is narrowing the genetic knowledge gap between major and understudied, minor crops. GBS is an excellent tool to characterize the genomic structure of recently domesticated (~200 years) and understudied species, such as cranberry (Vaccinium macrocarpon Ait.), by generating large numbers of markers for genomic studies such as genetic mapping. We identified 10842 potentially mappable single nucleotide polymorphisms (SNPs) in a cranberry pseudo-testcross population wherein 5477 SNPs and 211 short sequence repeats (SSRs) were used to construct a high density linkage map in cranberry of which a total of 4849 markers were mapped. Recombination frequency, linkage disequilibrium (LD), and segregation distortion at the genomic level in the parental and integrated linkage maps were characterized for the first time in cranberry. SSR markers, used as the backbone in the map, revealed high collinearity with previously published linkage maps. The 4849 point map consisted of twelve linkage groups spanning 1112 cM, which anchored 2381 nuclear scaffolds accounting for ~13 Mb of the estimated 470 Mb cranberry genome. Bin mapping identified 592 and 672 unique bins in the parentals and a total of 1676 unique marker positions in the integrated map. Synteny analyses comparing the order of anchored cranberry scaffolds to their homologous positions in kiwifruit, grape, and coffee genomes provided initial evidence of homology between cranberry and closely related species. GBS data was used to rapidly saturate the cranberry genome with markers in a pseudo-testcross population. Collinearity between the present saturated genetic map and previous cranberry SSR maps suggests that the SNP locations represent accurate marker order and chromosome structure of the cranberry genome.
SNPs greatly improved current marker genome coverage, which allowed for genome-wide structure investigations such as segregation distortion, recombination, linkage disequilibrium, and synteny analyses. In the future, GBS can be used to accelerate cranberry molecular breeding through QTL mapping and genome-wide association studies (GWAS).
A Molecular Basis for Bifidobacterial Enrichment in the Infant Gastrointestinal Tract

PubMed Central

Garrido, Daniel; Barile, Daniela; Mills, David A.

2012-01-01

Bifidobacteria are commonly used as probiotics in dairy foods. Select bifidobacterial species are also early colonizers of the breast-fed infant colon; however, the mechanism for this enrichment is unclear. We previously showed that Bifidobacterium longum subsp. infantis is a prototypical bifidobacterial species that can readily utilize human milk oligosaccharides as the sole carbon source. MS-based glycoprofiling has revealed that numerous B. infantis strains preferentially consume small mass oligosaccharides, abundant in human milks. Genome sequencing revealed that B. infantis possesses a bias toward genes required to use mammalian-derived carbohydrates. Many of these genomic features encode enzymes that are active on milk oligosaccharides including a novel 40-kb region dedicated to oligosaccharide utilization. Biochemical and molecular characterization of the encoded glycosidases and transport proteins has further resolved the mechanism by which B. infantis selectively imports and catabolizes milk oligosaccharides. Expression studies indicate that many of these key functions are only induced during growth on milk oligosaccharides and not expressed during growth on other prebiotics. Analysis of numerous B. infantis isolates has confirmed that these genomic features are common among the B. infantis subspecies and likely constitute a competitive colonization strategy used by these unique bifidobacteria. By detailed characterization of the molecular mechanisms responsible, these studies provide a conceptual framework for bifidobacterial persistence and host interaction in the infant gastrointestinal tract mediated in part through consumption of human milk oligosaccharides. PMID:22585920
Cultivation of a human-associated TM7 phylotype reveals a reduced genome and epibiotic parasitic lifestyle.

PubMed

He, Xuesong; McLean, Jeffrey S; Edlund, Anna; Yooseph, Shibu; Hall, Adam P; Liu, Su-Yang; Dorrestein, Pieter C; Esquenazi, Eduardo; Hunter, Ryan C; Cheng, Genhong; Nelson, Karen E; Lux, Renate; Shi, Wenyuan

2015-01-06

The candidate phylum TM7 is globally distributed and often associated with human inflammatory mucosal diseases. Despite its prevalence, the TM7 phylum remains recalcitrant to cultivation, making it one of the most enigmatic phyla known. In this study, we cultivated a TM7 phylotype (TM7x) from the human oral cavity. This extremely small coccus (200-300 nm) has a distinctive lifestyle not previously observed in human-associated microbes. It is an obligate epibiont of an Actinomyces odontolyticus strain (XH001) yet also has a parasitic phase, thereby killing its host. This first completed genome (705 kb) for a human-associated TM7 phylotype revealed a complete lack of amino acid biosynthetic capacity. Comparative genomics analyses with uncultivated environmental TM7 assemblies show remarkable conserved gene synteny and only minimal gene loss/gain that may have occurred as TM7x adapted to conditions within the human host. Transcriptomic and metabolomic profiles provided the first indications, to our knowledge, that there is signaling interaction between TM7x and XH001. Furthermore, the induction of TNF-α production in macrophages by XH001 was repressed in the presence of TM7x, suggesting its potential immune suppression ability. Overall, our data provide intriguing insights into the uncultivability, pathogenicity, and unique lifestyle of this previously uncharacterized oral TM7 phylotype.
Cultivation of a human-associated TM7 phylotype reveals a reduced genome and epibiotic parasitic lifestyle

PubMed Central

He, Xuesong; McLean, Jeffrey S.; Edlund, Anna; Yooseph, Shibu; Hall, Adam P.; Liu, Su-Yang; Dorrestein, Pieter C.; Esquenazi, Eduardo; Hunter, Ryan C.; Cheng, Genhong; Nelson, Karen E.; Lux, Renate; Shi, Wenyuan

2015-01-01

The candidate phylum TM7 is globally distributed and often associated with human inflammatory mucosal diseases. Despite its prevalence, the TM7 phylum remains recalcitrant to cultivation, making it one of the most enigmatic phyla known. In this study, we cultivated a TM7 phylotype (TM7x) from the human oral cavity. This extremely small coccus (200–300 nm) has a distinctive lifestyle not previously observed in human-associated microbes. It is an obligate epibiont of an Actinomyces odontolyticus strain (XH001) yet also has a parasitic phase, thereby killing its host. This first completed genome (705 kb) for a human-associated TM7 phylotype revealed a complete lack of amino acid biosynthetic capacity. Comparative genomics analyses with uncultivated environmental TM7 assemblies show remarkable conserved gene synteny and only minimal gene loss/gain that may have occurred as TM7x adapted to conditions within the human host. Transcriptomic and metabolomic profiles provided the first indications, to our knowledge, that there is signaling interaction between TM7x and XH001. Furthermore, the induction of TNF-α production in macrophages by XH001 was repressed in the presence of TM7x, suggesting its potential immune suppression ability. Overall, our data provide intriguing insights into the uncultivability, pathogenicity, and unique lifestyle of this previously uncharacterized oral TM7 phylotype. PMID:25535390
Neanderthal behaviour, diet, and disease inferred from ancient DNA in dental calculus.

PubMed

Weyrich, Laura S; Duchene, Sebastian; Soubrier, Julien; Arriola, Luis; Llamas, Bastien; Breen, James; Morris, Alan G; Alt, Kurt W; Caramelli, David; Dresely, Veit; Farrell, Milly; Farrer, Andrew G; Francken, Michael; Gully, Neville; Haak, Wolfgang; Hardy, Karen; Harvati, Katerina; Held, Petra; Holmes, Edward C; Kaidonis, John; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Semal, Patrick; Soltysiak, Arkadiusz; Townsend, Grant; Usai, Donatella; Wahl, Joachim; Huson, Daniel H; Dobney, Keith; Cooper, Alan

2017-04-20

Recent genomic data have revealed multiple interactions between Neanderthals and modern humans, but there is currently little genetic evidence regarding Neanderthal behaviour, diet, or disease. Here we describe the shotgun-sequencing of ancient DNA from five specimens of Neanderthal calcified dental plaque (calculus) and the characterization of regional differences in Neanderthal ecology. At Spy cave, Belgium, Neanderthal diet was heavily meat based and included woolly rhinoceros and wild sheep (mouflon), characteristic of a steppe environment. In contrast, no meat was detected in the diet of Neanderthals from El Sidrón cave, Spain, and dietary components of mushrooms, pine nuts, and moss reflected forest gathering. Differences in diet were also linked to an overall shift in the oral bacterial community (microbiota) and suggested that meat consumption contributed to substantial variation within Neanderthal microbiota. Evidence for self-medication was detected in an El Sidrón Neanderthal with a dental abscess and a chronic gastrointestinal pathogen (Enterocytozoon bieneusi). Metagenomic data from this individual also contained a nearly complete genome of the archaeal commensal Methanobrevibacter oralis (10.2× depth of coverage), the oldest draft microbial genome generated to date, at around 48,000 years old. DNA preserved within dental calculus represents a notable source of information about the behaviour and health of ancient hominin specimens, as well as a unique system that is useful for the study of long-term microbial evolution.
Mosaic genome of endobacteria in arbuscular mycorrhizal fungi: Transkingdom gene transfer in an ancient mycoplasma-fungus association.

PubMed

Torres-Cortés, Gloria; Ghignone, Stefano; Bonfante, Paola; Schüßler, Arthur

2015-06-23

For more than 450 million years, arbuscular mycorrhizal fungi (AMF) have formed intimate, mutualistic symbioses with the vast majority of land plants and are major drivers in almost all terrestrial ecosystems. The obligate plant-symbiotic AMF host additional symbionts, so-called Mollicutes-related endobacteria (MRE). To uncover putative functional roles of these widespread but yet enigmatic MRE, we sequenced the genome of DhMRE living in the AMF Dentiscutata heterogama. Multilocus phylogenetic analyses showed that MRE form a previously unidentified lineage sister to the hominis group of Mycoplasma species. DhMRE possesses a strongly reduced metabolic capacity with 55% of the proteins having unknown function, which reflects unique adaptations to an intracellular lifestyle. We found evidence for transkingdom gene transfer between MRE and their AMF host. At least 27 annotated DhMRE proteins show similarities to nuclear-encoded proteins of the AMF Rhizophagus irregularis, which itself lacks MRE. Nuclear-encoded homologs could moreover be identified for another AMF, Gigaspora margarita, and surprisingly, also the non-AMF Mortierella verticillata. Our data indicate a possible origin of the MRE-fungus association in ancestors of the Glomeromycota and Mucoromycotina. The DhMRE genome encodes an arsenal of putative regulatory proteins with eukaryotic-like domains, some of them encoded in putative genomic islands. MRE are highly interesting candidates to study the evolution and interactions between an ancient, obligate endosymbiotic prokaryote with its obligate plant-symbiotic fungal host. Our data moreover may be used for further targeted searches for ancient effector-like proteins that may be key components in the regulation of the arbuscular mycorrhiza symbiosis.
Emerging Applications of Metabolomic and Genomic Profiling in Diabetic Clinical Medicine

PubMed Central

McKillop, Aine M.; Flatt, Peter R.

2011-01-01

Clinical and epidemiological metabolomics provides a unique opportunity to look at genotype-phenotype relationships as well as the body’s responses to environmental and lifestyle factors. Fundamentally, it provides information on the universal outcome of influencing factors on disease states and has great potential in the early diagnosis, therapy monitoring, and understanding of the pathogenesis of disease. Diseases, such as diabetes, with a complex set of interactions between genetic and environmental factors, produce changes in the body’s biochemical profile, thereby providing potential markers for diagnosis and initiation of therapies. There is clearly a need to discover new ways to aid diagnosis and assessment of glycemic status to help reduce diabetes complications and improve the quality of life. Many factors, including peptides, proteins, metabolites, nucleic acids, and polymorphisms, have been proposed as putative biomarkers for diabetes. Metabolomics is an approach used to identify and assess metabolic characteristics, changes, and phenotypes in response to influencing factors, such as environment, diet, lifestyle, and pathophysiological states. The specificity and sensitivity using metabolomics to identify biomarkers of disease have become increasingly feasible because of advances in analytical and information technologies. Likewise, the emergence of high-throughput genotyping technologies and genome-wide association studies has prompted the search for genetic markers of diabetes predisposition or susceptibility. In this review, we consider the application of key metabolomic and genomic methodologies in diabetes and summarize the established, new, and emerging metabolomic and genomic biomarkers for the disease. We conclude by summarizing future insights into the search for improved biomarkers for diabetes research and human diagnostics. PMID:22110171
Annealing of Complementary DNA Sequences During Double-Strand Break Repair in Drosophila Is Mediated by the Ortholog of SMARCAL1.

PubMed

Holsclaw, Julie Korda; Sekelsky, Jeff

2017-05-01

DNA double-strand breaks (DSBs) pose a serious threat to genomic integrity. If unrepaired, they can lead to chromosome fragmentation and cell death. If repaired incorrectly, they can cause mutations and chromosome rearrangements. DSBs are repaired using end-joining or homology-directed repair strategies, with the predominant form of homology-directed repair being synthesis-dependent strand annealing (SDSA). SDSA is the first defense against genomic rearrangements and information loss during DSB repair, making it a vital component of cell health and an attractive target for chemotherapeutic development. SDSA has also been proposed to be the primary mechanism for integration of large insertions during genome editing with CRISPR/Cas9. Despite the central role for SDSA in genome stability, little is known about the defining step: annealing. We hypothesized that annealing during SDSA is performed by the annealing helicase SMARCAL1, which can anneal RPA-coated single DNA strands during replication-associated DNA damage repair. We used unique genetic tools in Drosophila melanogaster to test whether the fly ortholog of SMARCAL1, Marcal1, mediates annealing during SDSA. Repair that requires annealing is significantly reduced in Marcal1 null mutants in both synthesis-dependent and synthesis-independent (single-strand annealing) assays. Elimination of the ATP-binding activity of Marcal1 also reduced annealing-dependent repair, suggesting that the annealing activity requires translocation along DNA. Unlike the null mutant, however, the ATP-binding defect mutant showed reduced end joining, shedding light on the interaction between SDSA and end-joining pathways. Copyright © 2017 by the Genetics Society of America.
Transformation-associated recombination (TAR) cloning for genomics studies and synthetic biology

PubMed Central

Kouprina, Natalay; Larionov, Vladimir

2016-01-01

Transformation-associated recombination (TAR) cloning represents a unique tool for isolation and manipulation of large DNA molecules. The technique exploits a high level of homologous recombination in the yeast Saccharomyces cerevisiae. So far, TAR cloning is the only method available to selectively recover chromosomal segments up to 300 kb in length from complex and simple genomes. In addition, TAR cloning allows the assembly and cloning of entire microbe genomes up to several Mb as well as engineering of large metabolic pathways. In this review, we summarize applications of TAR cloning for functional/structural genomics and synthetic biology. PMID:27116033
Brd4 Is Required for E2-Mediated Transcriptional Activation but Not Genome Partitioning of All Papillomaviruses

PubMed Central

McPhillips, M. G.; Oliveira, J. G.; Spindler, J. E.; Mitra, R.; McBride, A. A.

2006-01-01

Bromodomain protein 4 (Brd4) has been identified as the cellular binding target through which the E2 protein of bovine papillomavirus type 1 links the viral genome to mitotic chromosomes. This tethering ensures retention and efficient partitioning of genomes to daughter cells following cell division. E2 is also a regulator of viral gene expression and a replication factor, in association with the viral E1 protein. In this study, we show that E2 proteins from a wide range of papillomaviruses interact with Brd4, albeit with variations in efficiency. Moreover, disruption of the E2-Brd4 interaction abrogates the transactivation function of E2, indicating that Brd4 is required for E2-mediated transactivation of all papillomaviruses. However, the interaction of E2 and Brd4 is not required for genome partitioning of all papillomaviruses since a number of papillomavirus E2 proteins associate with mitotic chromosomes independently of Brd4 binding. Furthermore, mutations in E2 that disrupt the interaction with Brd4 do not affect the ability of these E2s to associate with chromosomes. Thus, while all papillomaviruses attach their genomes to cellular chromosomes to facilitate genome segregation, they target different cellular binding partners. In summary, the E2 proteins from many papillomaviruses, including the clinically important alpha genus human papillomaviruses, interact with Brd4 to mediate transcriptional activation function but not all depend on this interaction to efficiently associate with mitotic chromosomes. PMID:16973557
Unique patterns of organization and migration of FGF-expressing cells during Drosophila morphogenesis.

PubMed

Du, Lijuan; Zhou, Amy; Patel, Akshay; Rao, Mishal; Anderson, Kelsey; Roy, Sougata

2017-07-01

Fibroblast growth factors (FGF) are essential signaling proteins that regulate diverse cellular functions in developmental and metabolic processes. In Drosophila, the FGF homolog branchless (bnl) is expressed in a dynamic and spatiotemporally restricted pattern to induce branching morphogenesis of the trachea, which expresses the Bnl-receptor, breathless (btl). Here we have developed a new strategy to determine bnl-expressing cells and study their interactions with the btl-expressing cells in the range of tissue patterning during Drosophila development. To enable targeted gene expression specifically in the bnl-expressing cells, a new LexA-based bnl enhancer trap line was generated using CRISPR/Cas9-based genome editing. Analyses of the spatiotemporal expression of the reporter in various embryonic stages, larval or adult tissues and in metabolic hypoxia, confirmed its target specificity and versatility. With this tool, new bnl-expressing cells, their unique organization and functional interactions with the btl-expressing cells were uncovered in a larval tracheoblast niche in the leg imaginal discs, in larval photoreceptors of the developing retina, and in the embryonic central nervous system. The targeted expression system also facilitated live imaging of simultaneously labeled Bnl sources and tracheal cells, which revealed a unique morphogenetic movement of the embryonic bnl source. Migration of bnl-expressing cells may create a dynamic spatiotemporal pattern of the signal source necessary for the directional growth of the tracheal branch. The genetic tool and the comprehensive profile of expression, organization, and activity of various types of bnl-expressing cells described in this study provided us with an important foundation for future research investigating the mechanisms underlying Bnl signaling in tissue morphogenesis. Copyright © 2017 Elsevier Inc. All rights reserved.
Structural Insights into Helicobacter pylori Cag Protein Interactions with Host Cell Factors.

PubMed

Bergé, Célia; Terradot, Laurent

2017-01-01

The most virulent strains of Helicobacter pylori carry a genomic island (cagPAI) containing a set of 27-31 genes. The encoded proteins assemble a syringe-like apparatus to inject the cytotoxin-associated gene A (CagA) protein into gastric cells. This molecular device belongs to the type IV secretion system (T4SS) family albeit with unique characteristics. The cagPAI-encoded T4SS and its effector protein CagA have an intricate relationship with the host cell, with multiple interactions that only start to be deciphered from a structural point of view. On the one hand, the major roles of the interactions between CagL and CagA (and perhaps CagI and CagY) and host cell factors are to facilitate H. pylori adhesion and to mediate the injection of the CagA oncoprotein. On the other hand, CagA interactions with host cell partners interfere with cellular pathways to subvert cell defences and to promote H. pylori infection. Although a clear mechanism for CagA translocation is still lacking, the structural definition of CagA and CagL domains involved in interactions with signalling proteins are progressively coming to light. In this chapter, we will focus on the structural aspects of Cag protein interactions with host cell molecules, critical molecular events precluding H. pylori-mediated gastric cancer development.
Molecular pathological epidemiology: new developing frontiers of big data science to study etiologies and pathogenesis.

PubMed

Hamada, Tsuyoshi; Keum, NaNa; Nishihara, Reiko; Ogino, Shuji

2017-03-01

Molecular pathological epidemiology (MPE) is an integrative field that utilizes molecular pathology to incorporate interpersonal heterogeneity of a disease process into epidemiology. In each individual, the development and progression of a disease are determined by a unique combination of exogenous and endogenous factors, resulting in different molecular and pathological subtypes of the disease. Based on "the unique disease principle," the primary aim of MPE is to uncover an interactive relationship between a specific environmental exposure and disease subtypes in determining disease incidence and mortality. This MPE approach can provide etiologic and pathogenic insights, potentially contributing to precision medicine for personalized prevention and treatment. Although breast, prostate, lung, and colorectal cancers have been among the most commonly studied diseases, the MPE approach can be used to study any disease. In addition to molecular features, host immune status and microbiome profile likely affect a disease process, and thus serve as informative biomarkers. As such, further integration of several disciplines into MPE has been achieved (e.g., pharmaco-MPE, immuno-MPE, and microbial MPE), to provide novel insights into underlying etiologic mechanisms. With the advent of high-throughput sequencing technologies, available genomic and epigenomic data have expanded dramatically. The MPE approach can also provide a specific risk estimate for each disease subgroup, thereby enhancing the impact of genome-wide association studies on public health. In this article, we present recent progress of MPE, and discuss the importance of accounting for the disease heterogeneity in the era of big-data health science and precision medicine.
Novel, host-restricted genotypes of Bordetella bronchiseptica associated with phocine respiratory tract isolates.

PubMed

Register, Karen B; Ivanov, Yury V; Harvill, Eric T; Davison, Nick; Foster, Geoffrey

2015-03-01

During a succession of phocine morbillivirus outbreaks spanning the past 25 years, Bordetella bronchiseptica was identified as a frequent secondary invader and cause of death. The goal of this study was to evaluate genetic diversity and the molecular basis for host specificity among seal isolates from these outbreaks. MLST and PvuII ribotyping of 54 isolates from Scottish, English or Danish coasts of the Atlantic or North Sea revealed a single, host-restricted genotype. A single, novel genotype, unique from that of the Atlantic and North Sea isolates, was found in isolates from an outbreak in the Caspian Sea. Phylogenetic analysis based either on MLST sequence, ribotype patterns or genome-wide SNPs consistently placed both seal-specific genotypes within the same major clade but indicates a distinct evolutionary history for each. An additional isolate from the intestinal tract of a seal on the south-west coast of England has a genotype otherwise found in rabbit, guinea pig and pig isolates. To investigate the molecular basis for host specificity, DNA and predicted protein sequences of virulence genes that mediate host interactions were used in comparisons between a North Sea isolate, a Caspian Sea isolate and each of their closest relatives as inferred from genome-wide SNP analysis. Despite their phylogenetic divergence, fewer nucleotide and amino acid substitutions were found in comparisons of the two seal isolates than in comparisons with closely related strains. These data indicate isolates of B. bronchiseptica associated with respiratory disease in seals comprise unique, host-adapted and highly clonal populations. © 2015 The Authors.
Molecular pathological epidemiology: new developing frontiers of big data science to study etiologies and pathogenesis

PubMed Central

Hamada, Tsuyoshi; Keum, NaNa; Nishihara, Reiko; Ogino, Shuji

2016-01-01

Molecular pathological epidemiology (MPE) is an integrative field that utilizes molecular pathology to incorporate interpersonal heterogeneity of a disease process into epidemiology. In each individual, the development and progression of a disease are determined by a unique combination of exogenous and endogenous factors, resulting in different molecular and pathological subtypes of the disease. Based on “the unique disease principle,” the primary aim of MPE is to uncover an interactive relationship between a specific environmental exposure and disease subtypes in determining disease incidence and mortality. This MPE approach can provide etiologic and pathogenic insights, potentially contributing to precision medicine for personalized prevention and treatment. Although breast, prostate, lung, and colorectal cancers have been among the most commonly studied diseases, the MPE approach can be used to study any disease. In addition to molecular features, host immune status and microbiome profile likely affect a disease process, and thus serve as informative biomarkers. As such, further integration of several disciplines into MPE has been achieved (e.g., pharmaco-MPE, immuno-MPE, and microbial MPE), to provide novel insights into underlying etiologic mechanisms. With the advent of high-throughput sequencing technologies, available genomic and epigenomic data have expanded dramatically. The MPE approach can also provide a specific risk estimate for each disease subgroup, thereby enhancing the impact of genome-wide association studies on public health. In this article, we present recent progress of MPE, and discuss the importance of accounting for the disease heterogeneity in the era of big-data health science and precision medicine. PMID:27738762
GeneWiz browser: An Interactive Tool for Visualizing Sequenced Chromosomes.

PubMed

Hallin, Peter F; Stærfeldt, Hans-Henrik; Rotenberg, Eva; Binnewies, Tim T; Benham, Craig J; Ussery, David W

2009-09-25

We present an interactive web application for visualizing genomic data of prokaryotic chromosomes. The tool (GeneWiz browser) allows users to carry out various analyses such as mapping alignments of homologous genes to other genomes, mapping of short sequencing reads to a reference chromosome, and calculating DNA properties such as curvature or stacking energy along the chromosome. The GeneWiz browser produces an interactive graphic that enables zooming from a global scale down to single nucleotides, without changing the size of the plot. Its ability to disproportionally zoom provides optimal readability and increased functionality compared to other browsers. The tool allows the user to select the display of various genomic features, color setting and data ranges. Custom numerical data can be added to the plot allowing, for example, visualization of gene expression and regulation data. Further, standard atlases are pre-generated for all prokaryotic genomes available in GenBank, providing a fast overview of all available genomes, including recently deposited genome sequences. The tool is available online from http://www.cbs.dtu.dk/services/gwBrowser. Supplemental material including interactive atlases is available online at http://www.cbs.dtu.dk/services/gwBrowser/suppl/.
ChIP-chip.

PubMed

Kim, Tae Hoon; Dekker, Job

2018-05-01

ChIP-chip can be used to analyze protein-DNA interactions in a region-wide and genome-wide manner. DNA microarrays contain PCR products or oligonucleotide probes that are designed to represent genomic sequences. Identification of genomic sites that interact with a specific protein is based on competitive hybridization of the ChIP-enriched DNA and the input DNA to DNA microarrays. The ChIP-chip protocol can be divided into two main sections: Amplification of ChIP DNA and hybridization of ChIP DNA to arrays. A large amount of DNA is required to hybridize to DNA arrays, and hybridization to a set of multiple commercial arrays that represent the entire human genome requires two rounds of PCR amplifications. The relative hybridization intensity of ChIP DNA and that of the input DNA is used to determine whether the probe sequence is a potential site of protein-DNA interaction. Resolution of actual genomic sites bound by the protein is dependent on the size of the chromatin and on the genomic distance between the probes on the array. As with expression profiling using gene chips, ChIP-chip experiments require multiple replicates for reliable statistical measure of protein-DNA interactions. © 2018 Cold Spring Harbor Laboratory Press.
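The comparison of ChIP and input hybridization intensities described in this abstract can be illustrated with a minimal Python sketch. It is not the published protocol's analysis: the function name `enriched_probes`, the pseudocount, the 2-fold (log2 ratio of 1.0) cutoff, and the probe intensities are all assumptions made for illustration, and a real analysis would also combine replicates statistically, as the abstract notes.

```python
import math


def enriched_probes(chip, input_, min_log2_ratio=1.0, pseudocount=1.0):
    """Return (probe_id, log2 ratio) pairs for probes at or above the cutoff.

    chip and input_ map probe IDs to hybridization intensities.
    The pseudocount (an assumption) avoids division by zero.
    """
    hits = []
    for probe_id, chip_signal in chip.items():
        input_signal = input_[probe_id]
        ratio = math.log2((chip_signal + pseudocount) / (input_signal + pseudocount))
        if ratio >= min_log2_ratio:
            hits.append((probe_id, ratio))
    return hits


# Hypothetical intensities for three probes.
chip = {"probe_a": 799.0, "probe_b": 99.0, "probe_c": 49.0}
input_ = {"probe_a": 99.0, "probe_b": 99.0, "probe_c": 99.0}
print(enriched_probes(chip, input_))  # [('probe_a', 3.0)]
```

Only probe_a is reported here, because its ChIP signal is eight times the input signal; the other two probes are at or below the input level.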

Flavivirus and Filovirus EvoPrinters: New alignment tools for the comparative analysis of viral evolution.

PubMed

Brody, Thomas; Yavatkar, Amarendra S; Park, Dong Sun; Kuzin, Alexander; Ross, Jermaine; Odenwald, Ward F

2017-06-01

Flavivirus and Filovirus infections are serious epidemic threats to human populations. Multi-genome comparative analysis of these evolving pathogens affords a view of their essential, conserved sequence elements as well as progressive evolutionary changes. While phylogenetic analysis has yielded important insights, the growing number of available genomic sequences makes comparisons between hundreds of viral strains challenging. We report here a new approach for the comparative analysis of these hemorrhagic fever viruses that can superimpose an unlimited number of one-on-one alignments to identify important features within genomes of interest. We have adapted EvoPrinter alignment algorithms for the rapid comparative analysis of Flavivirus or Filovirus sequences including Zika and Ebola strains. The user can input a full genome or partial viral sequence and then view either individual comparisons or generate color-coded readouts that superimpose hundreds of one-on-one alignments to identify unique or shared identity SNPs that reveal ancestral relationships between strains. The user can also opt to select a database genome in order to access a library of pre-aligned genomes of either 1,094 Flaviviruses or 460 Filoviruses for rapid comparative analysis with all database entries or a select subset. Using EvoPrinter search and alignment programs, we show the following: 1) superimposing alignment data from many related strains identifies lineage identity SNPs, which enable the assessment of sublineage complexity within viral outbreaks; 2) whole-genome SNP profile screens uncover novel Dengue2 and Zika recombinant strains and their parental lineages; 3) differential SNP profiling identifies host cell A-to-I hyper-editing within Ebola and Marburg viruses, and 4) hundreds of superimposed one-on-one Ebola genome alignments highlight ultra-conserved regulatory sequences, invariant amino acid codons and evolutionarily variable protein-encoding domains within a single genome. 
EvoPrinter assesses lineage complexity within Flavivirus or Filovirus outbreaks, identifies recombinant strains, highlights sequences that have undergone host cell A-to-I editing, and pinpoints unique input and database SNPs within highly conserved sequences. EvoPrinter's ability to superimpose alignment data from hundreds of strains onto a single genome has allowed us to identify unique Zika virus sublineages that are currently spreading in South, Central and North America, the Caribbean, and in China. This new set of integrated alignment programs should serve as a useful addition to existing tools for the comparative analysis of these viruses.
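The idea of superimposing many one-on-one comparisons onto a single genome can be illustrated with a minimal Python sketch. This is not EvoPrinter's algorithm: the function name `superimpose`, the toy sequences, and the assumption that every strain is already aligned to the reference without gaps are all hypothetical simplifications.

```python
def superimpose(reference, aligned_strains):
    """Overlay pairwise comparisons onto the reference sequence.

    Assumption: each strain is already aligned to the reference, gap-free
    and equal in length. Real alignments contain indels.
    """
    if any(len(s) != len(reference) for s in aligned_strains):
        raise ValueError("all sequences must match the reference length")
    conserved, unique_to_reference = [], []
    for i, base in enumerate(reference):
        column = [s[i] for s in aligned_strains]
        if all(b == base for b in column):
            conserved.append(i)            # identical in every comparison
        elif all(b != base for b in column):
            unique_to_reference.append(i)  # reference base found in no other strain
    return conserved, unique_to_reference


# Toy data (invented for illustration only).
reference = "ATGGCTAAC"
strains = ["ATGACTAAC", "ATGACTAAT", "ATGTCTAAC"]
conserved, unique = superimpose(reference, strains)
print("conserved positions:", conserved)
print("SNPs unique to the reference:", unique)
```

Positions that differ in only some strains fall into neither list; those are the candidates for shared SNPs that group strains into sublineages.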
Plastome-Genome Interactions Affect Plastid Transmission in Oenothera

PubMed Central

Chiu, W. L.; Sears, B. B.

1993-01-01

Plastids of Oenothera, the evening primrose, can be transmitted to the progeny from both parents. In a constant nuclear background, the frequency of biparental plastid transmission is determined by the types of plastid genomes (plastomes) involved in the crosses. In this study, the impact of nuclear genomes on plastid inheritance was analyzed. In general, the transmission efficiency of each plastome correlated strongly with its compatibility with the nuclear genome of the progeny, suggesting that plastome-genome interactions can influence plastid transmission by affecting the efficiency of plastid multiplication after fertilization. Lower frequencies of plastid transmission from the paternal side were observed when the pollen had poor vigor due to an incompatible plastome-genome combination, indicating that plastome-genome interactions may also affect the input of plastids at fertilization. Parental traits that affect the process of fertilization can also have an impact on plastid transmission. Crosses using maternal parents with long styles or pollen with relatively low growth capacity resulted in reduced frequencies of paternal plastid transmission. These observations suggest that degeneration of pollen plastids may occur as the time interval between pollination and fertilization is lengthened. PMID:8462856
From genomics to chemical genomics: new developments in KEGG

PubMed Central

Kanehisa, Minoru; Goto, Susumu; Hattori, Masahiro; Aoki-Kinoshita, Kiyoko F.; Itoh, Masumi; Kawashima, Shuichi; Katayama, Toshiaki; Araki, Michihiro; Hirakawa, Mika

2006-01-01

The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES) and the chemical space (KEGG LIGAND), and wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY). A fourth component, KEGG BRITE, has been formally added to the KEGG suite of databases. This reflects our attempt to computerize functional interpretations as part of the pathway reconstruction process based on the hierarchically structured knowledge about the genomic, chemical and network spaces. In accordance with the new chemical genomics initiatives, the scope of KEGG LIGAND has been significantly expanded to cover both endogenous and exogenous molecules. Specifically, RPAIR contains curated chemical structure transformation patterns extracted from known enzymatic reactions, which would enable analysis of genome-environment interactions, such as the prediction of new reactions and new enzyme genes that would degrade new environmental compounds. Additionally, drug information is now stored separately and linked to new KEGG DRUG structure maps. PMID:16381885
Genome-Wide Protein Interaction Screens Reveal Functional Networks Involving Sm-Like Proteins

PubMed Central

Fromont-Racine, Micheline; Mayes, Andrew E.; Brunet-Simon, Adeline; Rain, Jean-Christophe; Colley, Alan; Dix, Ian; Decourty, Laurence; Joly, Nicolas; Ricard, Florence; Beggs, Jean D.

2000-01-01

A set of seven structurally related Sm proteins forms the core of the snRNP particles containing the spliceosomal U1, U2, U4 and U5 snRNAs. A search of the genomic sequence of Saccharomyces cerevisiae has identified a number of open reading frames that potentially encode structurally similar proteins termed Lsm (Like Sm) proteins. With the aim of analysing all possible interactions between the Lsm proteins and any protein encoded in the yeast genome, we performed exhaustive and iterative genomic two-hybrid screens, starting with the Lsm proteins as baits. Indeed, extensive interactions amongst eight Lsm proteins were found that suggest the existence of a Lsm complex or complexes. These Lsm interactions apparently involve the conserved Sm domain that also mediates interactions between the Sm proteins. The screens also reveal functionally significant interactions with splicing factors, in particular with Prp4 and Prp24, compatible with genetic studies and with the reported association of Lsm proteins with spliceosomal U6 and U4/U6 particles. In addition, interactions with proteins involved in mRNA turnover, such as Mrt1, Dcp1, Dcp2 and Xrn1, point to roles for Lsm complexes in distinct RNA metabolic processes, that are confirmed in independent functional studies. These results provide compelling evidence that two-hybrid screens yield functionally meaningful information about protein–protein interactions and can suggest functions for uncharacterized proteins, especially when they are performed on a genome-wide scale. PMID:10900456
Draft Genome Sequences of Mycobacterium setense Type Strain DSM-45070 and the Nonpathogenic Strain Manresensis, Isolated from the Bank of the Cardener River in Manresa, Catalonia, Spain

PubMed Central

Vilaplana, Cristina; Velasco, Juan; Pluvinet, Raquel; Santín, Sheila; Prat, Cristina; Julián, Esther; Alcaide, Fernando; Comas, Iñaki; Sumoy, Lauro; Cardona, Pere-Joan

2015-01-01

We present here the draft genome sequences of two Mycobacterium setense strains. One of them corresponds to the M. setense type strain DSM-45070, originally isolated from a patient with a posttraumatic chronic skin abscess. The other one corresponds to the nonpathogenic M. setense strain Manresensis, isolated from the Cardener River crossing Manresa, Catalonia, Spain. A comparative genomic analysis shows a smaller genome size and fewer genes in M. setense strain Manresensis relative to those of the type strain, and it shows the genome segments unique to each strain. PMID:25657273
Comparative genomics of Fructobacillus spp. and Leuconostoc spp. reveals niche-specific evolution of Fructobacillus spp.

DOE PAGES

Endo, Akihito; Tanizawa, Yasuhiro; Tanaka, Naoto; ...

2015-12-29

Fructobacillus spp. in fructose-rich niches belong to the family Leuconostocaceae. They were originally classified as Leuconostoc spp., but were later grouped into a novel genus, Fructobacillus, based on their phylogenetic position, morphology and specific biochemical characteristics. Their unique, so-called fructophilic characteristics had not been reported in the group of lactic acid bacteria, suggesting unique evolution at the genome level. Here we studied four draft genome sequences of Fructobacillus spp. and compared their metabolic properties against those of Leuconostoc spp. As a result, Fructobacillus species possess significantly fewer protein-coding sequences in their small genomes. The number of genes was significantly smaller in carbohydrate transport and metabolism. Several other metabolic pathways, including TCA cycle, ubiquinone and other terpenoid-quinone biosynthesis and phosphotransferase systems, were characterized as discriminative pathways between the two genera. The adhE gene for bifunctional acetaldehyde/alcohol dehydrogenase, and genes for subunits of the pyruvate dehydrogenase complex were absent in Fructobacillus spp. The two genera also show different levels of GC content, which are mainly due to the different GC contents at the third codon position. In conclusion, the present genome characteristics in Fructobacillus spp. suggest reductive evolution that took place to adapt to specific niches.
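GC content at the third codon position (often called GC3) is a simple quantity to compute from a coding sequence. The Python sketch below is illustrative only: the function names and the toy sequence are hypothetical, and the input is assumed to be an in-frame coding sequence starting at its first codon.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def gc3_fraction(cds):
    """GC fraction at third codon positions.

    Assumption: `cds` is in frame; trailing bases that do not complete a
    codon are ignored.
    """
    usable = len(cds) - len(cds) % 3
    return gc_fraction(cds[2:usable:3])


# Toy coding sequence (invented): ATG GCC AAG TTT GGA TAA
cds = "ATGGCCAAGTTTGGATAA"
print("overall GC:", round(gc_fraction(cds), 3))
print("GC3:", gc3_fraction(cds))
```

Because third positions are often synonymous, GC3 can differ between genomes much more than overall GC content does, which fits the abstract's observation about the two genera.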
Comprehensive Survey of Genetic Diversity in Chloroplast Genomes and 45S nrDNAs within Panax ginseng Species

PubMed Central

Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Hyun Oh; Joh, Ho Jun; Kim, Nam-Hoon; Park, Hyun-Seung; Yang, Tae-Jin

2015-01-01

We report complete sequences of the chloroplast (cp) genome and 45S nuclear ribosomal DNA (45S nrDNA) for 11 Panax ginseng cultivars. We have obtained complete sequences of cp and 45S nrDNA, the representative barcoding target sequences for cytoplasm and nuclear genome, respectively, based on low-coverage NGS sequence data for each cultivar. The cp genome sizes ranged from 156,241 to 156,425 bp and the major size variation was derived from differences in copy number of tandem repeats in the ycf1 gene and in the intergenic regions of rps16-trnUUG and rpl32-trnUAG. The complete 45S nrDNA unit sequences were 11,091 bp, representing a consensus single transcriptional unit with an intergenic spacer region. Comparative analysis of these sequences as well as those previously reported for three Chinese accessions identified very rare but unique polymorphisms in the cp genome within P. ginseng cultivars. There were 12 intra-species polymorphisms (six SNPs and six InDels) among 14 cultivars. We also identified five SNPs from 45S nrDNA of 11 Korean ginseng cultivars. From the 17 unique informative polymorphic sites, we developed six reliable markers for analysis of ginseng diversity and cultivar authentication. PMID:26061692
LOLAweb: a containerized web server for interactive genomic locus overlap enrichment analysis.

PubMed

Nagraj, V P; Magee, Neal E; Sheffield, Nathan C

2018-06-06

The past few years have seen an explosion of interest in understanding the role of regulatory DNA. This interest has driven large-scale production of functional genomics data and analytical methods. One popular analysis is to test for enrichment of overlaps between a query set of genomic regions and a database of region sets. In this way, new genomic data can be easily connected to annotations from external data sources. Here, we present an interactive interface for enrichment analysis of genomic locus overlaps using a web server called LOLAweb. LOLAweb accepts a set of genomic ranges from the user and tests it for enrichment against a database of region sets. LOLAweb renders results in an R Shiny application to provide interactive visualization features, enabling users to filter, sort, and explore enrichment results dynamically. LOLAweb is built and deployed in a Linux container, making it scalable to many concurrent users on our servers and also enabling users to download and run LOLAweb locally.
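A locus overlap enrichment test of this kind can be sketched in a few lines of Python. The sketch below is an illustration, not LOLAweb's code: it assumes BED-style half-open intervals, a user-supplied universe of regions that contains the query, and a one-sided Fisher's exact test computed from the hypergeometric distribution. All function names and the toy regions are hypothetical.

```python
from math import comb


def overlaps(region, others):
    """True if `region` overlaps any interval in `others` (half-open coordinates)."""
    chrom, start, end = region
    return any(c == chrom and s < end and start < e for c, s, e in others)


def fisher_enrichment(query, universe, db_set):
    """Return the 2x2 table (a, b, c, d) and a one-sided enrichment p-value.

    a/b: query regions that do / do not overlap the database set
    c/d: remaining universe regions that do / do not overlap it
    """
    query_set = set(query)
    a = sum(overlaps(r, db_set) for r in query_set)
    b = len(query_set) - a
    rest = [r for r in set(universe) if r not in query_set]
    c = sum(overlaps(r, db_set) for r in rest)
    d = len(rest) - c
    n, hits, draws = a + b + c + d, a + c, a + b
    # P(X >= a) when drawing `draws` regions from `n`, of which `hits` overlap.
    p = sum(comb(hits, k) * comb(n - hits, draws - k)
            for k in range(a, min(draws, hits) + 1)) / comb(n, draws)
    return (a, b, c, d), p


# Toy regions (invented for illustration only).
universe = [("chr1", s, s + 100) for s in range(0, 1200, 200)]
query = universe[:3]
db_set = [("chr1", 50, 250), ("chr1", 450, 460)]
table, p_value = fisher_enrichment(query, universe, db_set)
print(table, p_value)
```

The choice of universe matters: it defines which regions could have been in the query, so the same query can look enriched or not depending on the background it is compared against.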
A genome-wide 3C-method for characterizing the three-dimensional architectures of genomes.

PubMed

Duan, Zhijun; Andronescu, Mirela; Schutz, Kevin; Lee, Choli; Shendure, Jay; Fields, Stanley; Noble, William S; Blau, C Anthony

2012-11-01

Accumulating evidence demonstrates that the three-dimensional (3D) organization of chromosomes within the eukaryotic nucleus reflects and influences genomic activities, including transcription, DNA replication, recombination and DNA repair. In order to uncover structure-function relationships, it is necessary first to understand the principles underlying the folding and the 3D arrangement of chromosomes. Chromosome conformation capture (3C) provides a powerful tool for detecting interactions within and between chromosomes. A high throughput derivative of 3C, chromosome conformation capture on chip (4C), executes a genome-wide interrogation of interaction partners for a given locus. We recently developed a new method, a derivative of 3C and 4C, which, similar to Hi-C, is capable of comprehensively identifying long-range chromosome interactions throughout a genome in an unbiased fashion. Hence, our method can be applied to decipher the 3D architectures of genomes. Here, we provide a detailed protocol for this method. Published by Elsevier Inc.
Three-Dimensional Genome Organization and Function in Drosophila

PubMed Central

Schwartz, Yuri B.; Cavalli, Giacomo

2017-01-01

Understanding how the metazoan genome is used during development and cell differentiation is one of the major challenges in the postgenomic era. Early studies in Drosophila suggested that three-dimensional (3D) chromosome organization plays important regulatory roles in this process and recent technological advances started to reveal connections at the molecular level. Here we will consider general features of the architectural organization of the Drosophila genome, providing historical perspective and insights from recent work. We will compare the linear and spatial segmentation of the fly genome and focus on the two key regulators of genome architecture: insulator components and Polycomb group proteins. With its unique set of genetic tools and a compact, well annotated genome, Drosophila is poised to remain a model system of choice for rapid progress in understanding principles of genome organization and to serve as a proving ground for development of 3D genome-engineering techniques. PMID:28049701
Genome research elucidating environmental adaptation: Dark-fly project as a case study.

PubMed

Fuse, Naoyuki

2017-08-01

Organisms have the capacity to adapt to diverse environments, and environmental adaptation is a substantial driving force of evolution. Recent progress of genome science has addressed the genetic mechanisms underlying environmental adaptation. Whole genome sequencing has identified adaptive genes selected under particular environments. Genome editing technology enables us to directly test the role(s) of a gene in environmental adaptation. Genome science has also shed light on a unique organism, Dark-fly, which has been reared long-term in the dark. We determined the whole genome sequence of Dark-fly and reenacted environmental selections of the Dark-fly genome to identify the genes related to dark-adaptation. Here I will give an overview of current progress in genome science and summarize our study using Dark-fly, as a case study for environmental adaptation. Copyright © 2017 Elsevier Ltd. All rights reserved.
Complete genome sequence of Metallosphaera cuprina, a metal sulfide-oxidizing archaeon from a hot spring.

PubMed

Liu, Li-Jun; You, Xiao-Yan; Zheng, Huajun; Wang, Shengyue; Jiang, Cheng-Ying; Liu, Shuang-Jiang

2011-07-01

The genome of the metal sulfide-oxidizing, thermoacidophilic strain Metallosphaera cuprina Ar-4 has been completely sequenced and annotated. Originally isolated from a sulfuric hot spring, strain Ar-4 grows optimally at 65°C and a pH of 3.5. The M. cuprina genome consists of a 1,840,348-bp circular chromosome (2,029 open reading frames [ORFs]) and is 16% smaller than the previously sequenced Metallosphaera sedula genome. About 480 ORFs in the M. sedula genome have no counterparts in the M. cuprina genome, and 243 of these are annotated as hypothetical protein genes. Conversely, 233 ORFs occur only in M. cuprina. Genome annotation supports the view that M. cuprina lives a facultative life on CO2 and organics and obtains energy from oxidation of sulfidic ores and reduced inorganic sulfuric compounds.
Diverse circovirus-like genome architectures revealed by environmental metagenomics.

PubMed

Rosario, Karyna; Duffy, Siobain; Breitbart, Mya

2009-10-01

Single-stranded DNA (ssDNA) viruses with circular genomes are the smallest viruses known to infect eukaryotes. The present study identified 10 novel genomes similar to ssDNA circoviruses through data-mining of public viral metagenomes. The metagenomic libraries included samples from reclaimed water and three different marine environments (Chesapeake Bay, British Columbia coastal waters and Sargasso Sea). All the genomes have similarities to the replication (Rep) protein of circoviruses; however, only half have genomic features consistent with known circoviruses. Some of the genomes exhibit a mixture of genomic features associated with different families of ssDNA viruses (i.e. circoviruses, geminiviruses and parvoviruses). Unique genome architectures and phylogenetic analysis of the Rep protein suggest that these viruses belong to novel genera and/or families. Investigating the complex community of ssDNA viruses in the environment can lead to the discovery of divergent species and help elucidate evolutionary links between ssDNA viruses.
The complete mitochondrial genome sequence of the maned wolf (Chrysocyon brachyurus).

PubMed

Zhao, Chao; Yang, Xiufeng; Zhang, Honghai; Zhang, Jin; Chen, Lei; Sha, Weilai; Liu, Guangshuai

2016-01-01

In this study, the complete mitochondrial genome of the maned wolf (Chrysocyon brachyurus), the only species in the genus Chrysocyon, was sequenced and reported for the first time, using blood samples obtained from a female individual at Shanghai Zoo, China. Sequence analysis showed that the genome structure was consistent with that of other Canidae species: it contained a 12S rRNA gene, a 16S rRNA gene, 22 tRNA genes, 13 protein-coding genes and 1 control region.
Exploring the Presence of microDNAs in Prostate Cancer Cell Lines, Tissue, and Sera of Prostate Cancer Patients and its Possible Application as Biomarker

DTIC Science & Technology

2016-04-01

Sequence tags were mapped on the human reference genome using the Novoalign software. Only those...ends of the linear islands to create a novel junctional sequence that does not exist in the genome. Thus the PE-sequence of a fragment that breaks at...genome (Fig. 3b). Those PE-tags where one tag maps uniquely to an island and the other remains unmapped, but passes the sequence quality filter,
Opportunities and challenges of big data for the social sciences: The case of genomic data.

PubMed

Liu, Hexuan; Guo, Guang

2016-09-01

In this paper, we draw attention to one unique and valuable source of big data, genomic data, by demonstrating the opportunities they provide to social scientists. We discuss different types of large-scale genomic data and recent advances in statistical methods and computational infrastructure used to address challenges in managing and analyzing such data. We highlight how these data and methods can be used to benefit social science research. Copyright © 2016 Elsevier Inc. All rights reserved.
Architectural protein subclasses shape 3-D organization of genomes during lineage commitment

PubMed Central

Phillips-Cremins, Jennifer E.; Sauria, Michael E. G.; Sanyal, Amartya; Gerasimova, Tatiana I.; Lajoie, Bryan R.; Bell, Joshua S. K.; Ong, Chin-Tong; Hookway, Tracy A.; Guo, Changying; Sun, Yuhua; Bland, Michael J.; Wagstaff, William; Dalton, Stephen; McDevitt, Todd C.; Sen, Ranjan; Dekker, Job; Taylor, James; Corces, Victor G.

2013-01-01

Understanding the topological configurations of chromatin may reveal valuable insights into how the genome and epigenome act in concert to control cell fate during development. Here we generate high-resolution architecture maps across seven genomic loci in embryonic stem cells and neural progenitor cells. We observe a hierarchy of 3-D interactions that undergo marked reorganization at the sub-Mb scale during differentiation. Distinct combinations of CTCF, Mediator, and cohesin show widespread enrichment in looping interactions at different length scales. CTCF/cohesin anchor long-range constitutive interactions that form the topological basis for invariant sub-domains. Conversely, Mediator/cohesin together with pioneer factors bridge short-range enhancer-promoter interactions within and between larger sub-domains. Knockdown of Smc1 or Med12 in ES cells results in disruption of spatial architecture and down-regulation of genes found in cohesin-mediated interactions. We conclude that cell type-specific chromatin organization occurs at the sub-Mb scale and that architectural proteins shape the genome in hierarchical length scales. PMID:23706625
Promoter-enhancer interactions identified from Hi-C data using probabilistic models and hierarchical topological domains.

PubMed

Ron, Gil; Globerson, Yuval; Moran, Dror; Kaplan, Tommy

2017-12-21

Proximity-ligation methods such as Hi-C allow us to map physical DNA-DNA interactions along the genome, and reveal its organization into topologically associating domains (TADs). As the Hi-C data accumulate, computational methods were developed for identifying domain borders in multiple cell types and organisms. Here, we present PSYCHIC, a computational approach for analyzing Hi-C data and identifying promoter-enhancer interactions. We use a unified probabilistic model to segment the genome into domains, which we then merge hierarchically and fit using a local background model, allowing us to identify over-represented DNA-DNA interactions across the genome. By analyzing the published Hi-C data sets in human and mouse, we identify hundreds of thousands of putative enhancers and their target genes, and compile an extensive genome-wide catalog of gene regulation in human and mouse. As we show, our predictions are highly enriched for ChIP-seq and DNA accessibility data, evolutionary conservation, eQTLs and other DNA-DNA interaction data.
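Identifying over-represented contacts requires some background expectation to compare against. The Python sketch below shows a generic distance-based baseline (observed over expected), which is a common simplification and not PSYCHIC's domain-aware probabilistic model. The function names and the toy contact matrix are hypothetical.

```python
def expected_by_distance(matrix):
    """Mean contact count at each bin separation d (the d-th diagonal)."""
    n = len(matrix)
    return [sum(matrix[i][i + d] for i in range(n - d)) / (n - d)
            for d in range(n)]


def observed_over_expected(matrix):
    """Divide each contact by the mean for its separation (0.0 if that mean is 0)."""
    expected = expected_by_distance(matrix)
    n = len(matrix)
    return [[matrix[i][j] / expected[abs(i - j)] if expected[abs(i - j)] else 0.0
             for j in range(n)]
            for i in range(n)]


# Toy symmetric contact matrix for four genomic bins (invented values).
contacts = [
    [10, 4, 1, 0],
    [4, 10, 4, 5],
    [1, 4, 10, 4],
    [0, 5, 4, 10],
]
expected = expected_by_distance(contacts)
ratios = observed_over_expected(contacts)
print("expected by distance:", expected)
print("bins 1-3 observed/expected:", round(ratios[1][3], 3))
```

In this toy matrix, bins 1 and 3 interact more often than their separation alone would predict, which is the kind of signal a promoter-enhancer caller looks for. A domain-aware background would additionally account for whether the two bins lie in the same topological domain.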
Between Two Fern Genomes

PubMed Central

2014-01-01

Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves. PMID:25324969
