Sample records for genomic diversity reveal

  1. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    PubMed

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  2. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    PubMed Central

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms. PMID:28706512

  3. Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

    PubMed Central

    2010-01-01

    Background: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat. Results: Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed. Conclusions: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large chromosomal regions.

  4. Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

    USDA-ARS's Scientific Manuscript database

    Technical Abstract: A strategy for a genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into respective genomes. In this study, nucle...

  5. Landscape genomics reveals altered genome wide diversity within revegetated stands of Eucalyptus microcarpa (Grey Box).

    PubMed

    Jordan, Rebecca; Dillon, Shannon K; Prober, Suzanne M; Hoffmann, Ary A

    2016-12-01

    In order to contribute to evolutionary resilience and adaptive potential in highly modified landscapes, revegetated areas should ideally reflect levels of genetic diversity within and across natural stands. Landscape genomic analyses enable such diversity patterns to be characterized at genome and chromosomal levels. Landscape-wide patterns of genomic diversity were assessed in Eucalyptus microcarpa, a dominant tree species widely used in revegetation in Southeastern Australia. Trees from small and large patches within large remnants, small isolated remnants and revegetation sites were assessed across the now highly fragmented distribution of this species using the DArTseq genomic approach. Genomic diversity was similar within all three types of remnant patches analysed, although often significantly but only slightly lower in revegetation sites compared with natural remnants. Differences in diversity between stand types varied across chromosomes. Genomic differentiation was higher between small, isolated remnants, and among revegetated sites compared with natural stands. We conclude that small remnants and revegetated sites of our E. microcarpa samples largely but not completely capture patterns in genomic diversity across the landscape. Genomic approaches provide a powerful tool for assessing restoration efforts across the landscape. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  6. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    PubMed Central

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038

  7. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations.

    PubMed

    Mallick, Swapan; Li, Heng; Lipson, Mark; Mathieson, Iain; Gymrek, Melissa; Racimo, Fernando; Zhao, Mengyao; Chennagiri, Niru; Nordenfelt, Susanne; Tandon, Arti; Skoglund, Pontus; Lazaridis, Iosif; Sankararaman, Sriram; Fu, Qiaomei; Rohland, Nadin; Renaud, Gabriel; Erlich, Yaniv; Willems, Thomas; Gallo, Carla; Spence, Jeffrey P; Song, Yun S; Poletti, Giovanni; Balloux, Francois; van Driem, George; de Knijff, Peter; Romero, Irene Gallego; Jha, Aashish R; Behar, Doron M; Bravi, Claudio M; Capelli, Cristian; Hervig, Tor; Moreno-Estrada, Andres; Posukh, Olga L; Balanovska, Elena; Balanovsky, Oleg; Karachanak-Yankova, Sena; Sahakyan, Hovhannes; Toncheva, Draga; Yepiskoposyan, Levon; Tyler-Smith, Chris; Xue, Yali; Abdullah, M Syafiq; Ruiz-Linares, Andres; Beall, Cynthia M; Di Rienzo, Anna; Jeong, Choongwon; Starikovskaya, Elena B; Metspalu, Ene; Parik, Jüri; Villems, Richard; Henn, Brenna M; Hodoglugil, Ugur; Mahley, Robert; Sajantila, Antti; Stamatoyannopoulos, George; Wee, Joseph T S; Khusainova, Rita; Khusnutdinova, Elza; Litvinov, Sergey; Ayodo, George; Comas, David; Hammer, Michael F; Kivisild, Toomas; Klitz, William; Winkler, Cheryl A; Labuda, Damian; Bamshad, Michael; Jorde, Lynn B; Tishkoff, Sarah A; Watkins, W Scott; Metspalu, Mait; Dryomov, Stanislav; Sukernik, Rem; Singh, Lalji; Thangaraj, Kumarasamy; Pääbo, Svante; Kelso, Janet; Patterson, Nick; Reich, David

    2016-10-13

    Here we report the Simons Genome Diversity Project data set: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioural modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.

  8. Comparative Genomics Reveals the Diversity of Restriction-Modification Systems and DNA Methylation Sites in Listeria monocytogenes.

    PubMed

    Chen, Poyin; den Bakker, Henk C; Korlach, Jonas; Kong, Nguyet; Storey, Dylan B; Paxinos, Ellen E; Ashby, Meredith; Clark, Tyson; Luong, Khai; Wiedmann, Martin; Weimer, Bart C

    2017-02-01

    which manifests as gastroenteritis, meningoencephalitis, and abortion. Among Salmonella, Escherichia coli, Campylobacter, and Listeria, which cause the most prevalent foodborne illnesses, infection by L. monocytogenes carries the highest mortality rate. The ability of L. monocytogenes to regulate its response to various harsh environments enables its persistence and transmission. Small-scale comparisons of L. monocytogenes focusing solely on genome contents reveal a highly syntenic genome yet fail to address the observed diversity in phenotypic regulation. This study provides a large-scale comparison of 302 L. monocytogenes isolates, revealing the importance of the epigenome and restriction-modification systems as major determinants of L. monocytogenes phylogenetic grouping and subsequent phenotypic expression. Further examination of virulence genes of select outbreak strains reveals an unprecedented diversity in methylation statuses despite high degrees of genome conservation. Copyright © 2017 American Society for Microbiology.

  9. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations

    PubMed Central

    Mallick, Swapan; Li, Heng; Lipson, Mark; Mathieson, Iain; Gymrek, Melissa; Racimo, Fernando; Zhao, Mengyao; Chennagiri, Niru; Nordenfelt, Susanne; Tandon, Arti; Skoglund, Pontus; Lazaridis, Iosif; Sankararaman, Sriram; Fu, Qiaomei; Rohland, Nadin; Renaud, Gabriel; Erlich, Yaniv; Willems, Thomas; Gallo, Carla; Spence, Jeffrey P.; Song, Yun S.; Poletti, Giovanni; Balloux, Francois; van Driem, George; de Knijff, Peter; Romero, Irene Gallego; Jha, Aashish R.; Behar, Doron M.; Bravi, Claudio M.; Capelli, Cristian; Hervig, Tor; Moreno-Estrada, Andres; Posukh, Olga L.; Balanovska, Elena; Balanovsky, Oleg; Karachanak-Yankova, Sena; Sahakyan, Hovhannes; Toncheva, Draga; Yepiskoposyan, Levon; Tyler-Smith, Chris; Xue, Yali; Abdullah, M. Syafiq; Ruiz-Linares, Andres; Beall, Cynthia M.; Di Rienzo, Anna; Jeong, Choongwon; Starikovskaya, Elena B.; Metspalu, Ene; Parik, Jüri; Villems, Richard; Henn, Brenna M.; Hodoglugil, Ugur; Mahley, Robert; Sajantila, Antti; Stamatoyannopoulos, George; Wee, Joseph T. S.; Khusainova, Rita; Khusnutdinova, Elza; Litvinov, Sergey; Ayodo, George; Comas, David; Hammer, Michael; Kivisild, Toomas; Klitz, William; Winkler, Cheryl; Labuda, Damian; Bamshad, Michael; Jorde, Lynn B.; Tishkoff, Sarah A.; Watkins, W. Scott; Metspalu, Mait; Dryomov, Stanislav; Sukernik, Rem; Singh, Lalji; Thangaraj, Kumarasamy; Pääbo, Svante; Kelso, Janet; Patterson, Nick; Reich, David

    2016-01-01

    We report the Simons Genome Diversity Project (SGDP) dataset: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioral modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that in other non-Africans. PMID:27654912

  10. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity

    PubMed Central

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F; Abbazia, Patrick; Ababio, Amma; Adam, Naazneen

    2015-01-01

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. DOI: http://dx.doi.org/10.7554/eLife.06416.001 PMID:25919952

  11. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

    PubMed

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

    2015-04-28

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery.

  12. Unprecedented genomic diversity of RNA viruses in arthropods reveals the ancestry of negative-sense RNA viruses

    PubMed Central

    Li, Ci-Xiu; Shi, Mang; Tian, Jun-Hua; Lin, Xian-Dan; Kang, Yan-Jun; Chen, Liang-Jun; Qin, Xin-Cheng; Xu, Jianguo; Holmes, Edward C; Zhang, Yong-Zhen

    2015-01-01

    Although arthropods are important viral vectors, the biodiversity of arthropod viruses, as well as the role that arthropods have played in viral origins and evolution, is unclear. Through RNA sequencing of 70 arthropod species we discovered 112 novel viruses that appear to be ancestral to much of the documented genetic diversity of negative-sense RNA viruses, a number of which are also present as endogenous genomic copies. With this greatly enriched diversity we revealed that arthropods contain viruses that fall basal to major virus groups, including the vertebrate-specific arenaviruses, filoviruses, hantaviruses, influenza viruses, lyssaviruses, and paramyxoviruses. We similarly documented a remarkable diversity of genome structures in arthropod viruses, including a putative circular form, that sheds new light on the evolution of genome organization. Hence, arthropods are a major reservoir of viral genetic diversity and have likely been central to viral evolution. DOI: http://dx.doi.org/10.7554/eLife.05378.001 PMID:25633976

  13. Comparative genomics reveals high biological diversity and specific adaptations in the industrially and medically important fungal genus Aspergillus.

    PubMed

    de Vries, Ronald P; Riley, Robert; Wiebenga, Ad; Aguilar-Osorio, Guillermo; Amillis, Sotiris; Uchima, Cristiane Akemi; Anderluh, Gregor; Asadollahi, Mojtaba; Askin, Marion; Barry, Kerrie; Battaglia, Evy; Bayram, Özgür; Benocci, Tiziano; Braus-Stromeyer, Susanna A; Caldana, Camila; Cánovas, David; Cerqueira, Gustavo C; Chen, Fusheng; Chen, Wanping; Choi, Cindy; Clum, Alicia; Dos Santos, Renato Augusto Corrêa; Damásio, André Ricardo de Lima; Diallinas, George; Emri, Tamás; Fekete, Erzsébet; Flipphi, Michel; Freyberg, Susanne; Gallo, Antonia; Gournas, Christos; Habgood, Rob; Hainaut, Matthieu; Harispe, María Laura; Henrissat, Bernard; Hildén, Kristiina S; Hope, Ryan; Hossain, Abeer; Karabika, Eugenia; Karaffa, Levente; Karányi, Zsolt; Kraševec, Nada; Kuo, Alan; Kusch, Harald; LaButti, Kurt; Lagendijk, Ellen L; Lapidus, Alla; Levasseur, Anthony; Lindquist, Erika; Lipzen, Anna; Logrieco, Antonio F; MacCabe, Andrew; Mäkelä, Miia R; Malavazi, Iran; Melin, Petter; Meyer, Vera; Mielnichuk, Natalia; Miskei, Márton; Molnár, Ákos P; Mulé, Giuseppina; Ngan, Chew Yee; Orejas, Margarita; Orosz, Erzsébet; Ouedraogo, Jean Paul; Overkamp, Karin M; Park, Hee-Soo; Perrone, Giancarlo; Piumi, Francois; Punt, Peter J; Ram, Arthur F J; Ramón, Ana; Rauscher, Stefan; Record, Eric; Riaño-Pachón, Diego Mauricio; Robert, Vincent; Röhrig, Julian; Ruller, Roberto; Salamov, Asaf; Salih, Nadhira S; Samson, Rob A; Sándor, Erzsébet; Sanguinetti, Manuel; Schütze, Tabea; Sepčić, Kristina; Shelest, Ekaterina; Sherlock, Gavin; Sophianopoulou, Vicky; Squina, Fabio M; Sun, Hui; Susca, Antonia; Todd, Richard B; Tsang, Adrian; Unkles, Shiela E; van de Wiele, Nathalie; van Rossen-Uffink, Diana; Oliveira, Juliana Velasco de Castro; Vesth, Tammi C; Visser, Jaap; Yu, Jae-Hyuk; Zhou, Miaomiao; Andersen, Mikael R; Archer, David B; Baker, Scott E; Benoit, Isabelle; Brakhage, Axel A; Braus, Gerhard H; Fischer, Reinhard; Frisvad, Jens C; Goldman, Gustavo H; Houbraken, Jos; Oakley, Berl; Pócsi, István; Scazzocchio, Claudio; Seiboth, Bernhard; vanKuyk, Patricia A; Wortman, Jennifer; Dyer, Paul S; Grigoriev, Igor V

    2017-02-14

    The fungal genus Aspergillus is of critical importance to humankind. Species include those with industrial applications, important pathogens of humans, animals and crops, a source of potent carcinogenic contaminants of food, and an important genetic model. The genome sequences of eight aspergilli have already been explored to investigate aspects of fungal biology, raising questions about evolution and specialization within this genus. We have generated genome sequences for ten novel, highly diverse Aspergillus species and compared these in detail to sister and more distant genera. Comparative studies of key aspects of fungal biology, including primary and secondary metabolism, stress response, biomass degradation, and signal transduction, revealed both conservation and diversity among the species. Observed genomic differences were validated with experimental studies. This revealed several highlights, such as the potential for sex in asexual species, organic acid production genes being a key feature of black aspergilli, alternative approaches for degrading plant biomass, and indications for the genetic basis of stress response. A genome-wide phylogenetic analysis demonstrated in detail the relationship of the newly genome sequenced species with other aspergilli. Many aspects of biological differences between fungal species cannot be explained by current knowledge obtained from genome sequences. The comparative genomics and experimental study, presented here, allows for the first time a genus-wide view of the biological diversity of the aspergilli and in many, but not all, cases linked genome differences to phenotype. Insights gained could be exploited for biotechnological and medical applications of fungi.

  14. Comparative genomics of the marine bacterial genus Glaciecola reveals the high degree of genomic diversity and genomic characteristic for cold adaptation.

    PubMed

    Qin, Qi-Long; Xie, Bin-Bin; Yu, Yong; Shu, Yan-Li; Rong, Jin-Cheng; Zhang, Yan-Jiao; Zhao, Dian-Li; Chen, Xiu-Lan; Zhang, Xi-Ying; Chen, Bo; Zhou, Bai-Cheng; Zhang, Yu-Zhong

    2014-06-01

    To what extent the genomes of different species belonging to one genus can be diverse and the relationship between genomic differentiation and environmental factors remain unclear for oceanic bacteria. With many new bacterial genera and species being isolated from marine environments, this question warrants attention. In this study, we sequenced all the type strains of the published species of Glaciecola, a recently defined cold-adapted genus with species from diverse marine locations, to study the genomic diversity and cold-adaptation strategy in this genus. The genome size diverged widely from 3.08 to 5.96 Mb, which can be explained by massive gene gain and loss events. Horizontal gene transfer and new gene emergence contributed substantially to the genome size expansion. The genus Glaciecola had an open pan-genome. Comparative genomic research indicated that species of the genus Glaciecola had high diversity in genome size, gene content and genetic relatedness. This may be prevalent in marine bacterial genera considering the dynamic and complex environments of the ocean. Species of Glaciecola had some common genomic features related to cold adaptation, which enable them to thrive and play a role in biogeochemical cycles in cold marine environments.

  15. Genomic Diversity and Evolution of the Lyssaviruses

    PubMed Central

    Delmas, Olivier; Holmes, Edward C.; Talbi, Chiraz; Larrous, Florence; Dacheux, Laurent; Bouchier, Christiane; Bourhy, Hervé

    2008-01-01

    Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as ‘Lagos Bat’. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses. PMID:18446239

  16. Ethical aspects of genome diversity research: genome research into cultural diversity or cultural diversity in genome research?

    PubMed

    Ilkilic, Ilhan; Paul, Norbert W

    2009-03-01

    The goal of the Human Genome Diversity Project (HGDP) was to reconstruct the history of human evolution and the historical and geographical distribution of populations with the help of scientific research. Through this kind of research, the entire spectrum of genetic diversity to be found in the human species was to be explored with the hope of generating a better understanding of the history of humankind. An important part of this genome diversity research consists in taking blood and tissue samples from indigenous populations. For various reasons, it has not been possible to execute this project in the planned scope and form to date. Nevertheless, genomic diversity research addresses complex issues which prove to be highly relevant from the perspective of research ethics, transcultural medical ethics, and cultural philosophy. In the article at hand, we discuss these ethical issues as illustrated by the HGDP. This investigation focuses on the confrontation of culturally diverse images of humans and their cosmologies within the framework of genome diversity research and the ethical questions it raises. We argue that in addition to complex questions pertaining to research ethics such as informed consent and autonomy of probands, genome diversity research also has a cultural-philosophical, meta-ethical, and phenomenological dimension which must be taken into account in ethical discourses. Acknowledging this fact, we attempt to show the limits of current guidelines used in international genome diversity studies, following this up by a formulation of theses designed to facilitate an appropriate inquiry and ethical evaluation of intercultural dimensions of genome research.

  17. Scanning the landscape of genome architecture of non-O1 and non-O139 Vibrio cholerae by whole genome mapping reveals extensive population genetic diversity.

    PubMed

    Chapman, Carol; Henry, Matthew; Bishop-Lilly, Kimberly A; Awosika, Joy; Briska, Adam; Ptashkin, Ryan N; Wagner, Trevor; Rajanna, Chythanya; Tsang, Hsinyi; Johnson, Shannon L; Mokashi, Vishwesh P; Chain, Patrick S G; Sozhamannan, Shanmuga

    2015-01-01

    Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks.

  18. Scanning the landscape of genome architecture of non-O1 and non-O139 Vibrio cholerae by whole genome mapping reveals extensive population genetic diversity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chapman, Carol; Henry, Matthew; Bishop-Lilly, Kimberly A.

    Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks.

  19. Scanning the landscape of genome architecture of non-O1 and non-O139 Vibrio cholerae by whole genome mapping reveals extensive population genetic diversity

    DOE PAGES

    Chapman, Carol; Henry, Matthew; Bishop-Lilly, Kimberly A.; ...

    2015-03-20

    Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks.

  20. Genome size diversity in orchids: consequences and evolution

    PubMed Central

    Leitch, I. J.; Kahandawala, I.; Suda, J.; Hanson, L.; Ingrouille, M. J.; Chase, M. W.; Fay, M. F.

    2009-01-01

    Background The amount of DNA comprising the genome of an organism (its genome size) varies a remarkable 40 000-fold across eukaryotes, yet most groups are characterized by much narrower ranges (e.g. 14-fold in gymnosperms, 3- to 4-fold in mammals). Angiosperms stand out as one of the most variable groups with genome sizes varying nearly 2000-fold. Nevertheless within angiosperms the majority of families are characterized by genomes which are small and vary little. Species with large genomes are mostly restricted to a few monocot families including Orchidaceae. Scope A survey of the literature revealed that genome size data for Orchidaceae are comparatively rare representing just 327 species. Nevertheless they reveal that Orchidaceae are currently the most variable angiosperm family with genome sizes ranging 168-fold (1C = 0·33–55·4 pg). Analysing the data provided insights into the distribution, evolution and possible consequences to the plant of this genome size diversity. Conclusions Superimposing the data onto the increasingly robust phylogenetic tree of Orchidaceae revealed how different subfamilies were characterized by distinct genome size profiles. Epidendroideae possessed the greatest range of genome sizes, although the majority of species had small genomes. In contrast, the largest genomes were found in subfamilies Cypripedioideae and Vanilloideae. Genome size evolution within this subfamily was analysed as this is the only one with reasonable representation of data. This approach highlighted striking differences in genome size and karyotype evolution between the closely related Cypripedium, Paphiopedilum and Phragmipedium. As to the consequences of genome size diversity, various studies revealed that this has both practical (e.g. application of genetic fingerprinting techniques) and biological consequences (e.g. affecting where and when an orchid may grow) and emphasizes the importance of obtaining further genome size data given the considerable

  1. Ancient genomes reveal a high diversity of Mycobacterium leprae in medieval Europe

    PubMed Central

    Bonazzi, Marion; Reiter, Ella; Urban, Christian; Taylor, G. Michael; Velemínský, Petr; Marcsik, Antónia; Pálfi, György; Boldsen, Jesper L.; Nebel, Almut; Mays, Simon; Benjak, Andrej

    2018-01-01

    Studying ancient DNA allows us to retrace the evolutionary history of human pathogens, such as Mycobacterium leprae, the main causative agent of leprosy. Leprosy is one of the oldest recorded and most stigmatizing diseases in human history. The disease was prevalent in Europe until the 16th century and is still endemic in many countries with over 200,000 new cases reported annually. Previous worldwide studies on modern and European medieval M. leprae genomes revealed that they cluster into several distinct branches of which two were present in medieval Northwestern Europe. In this study, we analyzed 10 new medieval M. leprae genomes including the so far oldest M. leprae genome from one of the earliest known cases of leprosy in the United Kingdom, a skeleton from the Great Chesterford cemetery with a calibrated age of 415–545 C.E. This dataset provides a genetic time transect of M. leprae diversity in Europe over the past 1500 years. We find M. leprae strains from four distinct branches to be present in the Early Medieval Period, and strains from three different branches were detected within a single cemetery from the High Medieval Period. Altogether these findings suggest a higher genetic diversity of M. leprae strains in medieval Europe at various time points than previously assumed. The resulting more complex picture of the past phylogeography of leprosy in Europe impacts current phylogeographical models of M. leprae dissemination. It suggests alternative models for the past spread of leprosy such as a widespread prevalence of strains from different branches in Eurasia already in Antiquity or maybe even an origin in Western Eurasia. Furthermore, these results highlight how studying ancient M. leprae strains improves understanding of the history of leprosy worldwide. PMID:29746563

  2. Ancient genomes reveal a high diversity of Mycobacterium leprae in medieval Europe.

    PubMed

    Schuenemann, Verena J; Avanzi, Charlotte; Krause-Kyora, Ben; Seitz, Alexander; Herbig, Alexander; Inskip, Sarah; Bonazzi, Marion; Reiter, Ella; Urban, Christian; Dangvard Pedersen, Dorthe; Taylor, G Michael; Singh, Pushpendra; Stewart, Graham R; Velemínský, Petr; Likovsky, Jakub; Marcsik, Antónia; Molnár, Erika; Pálfi, György; Mariotti, Valentina; Riga, Alessandro; Belcastro, M Giovanna; Boldsen, Jesper L; Nebel, Almut; Mays, Simon; Donoghue, Helen D; Zakrzewski, Sonia; Benjak, Andrej; Nieselt, Kay; Cole, Stewart T; Krause, Johannes

    2018-05-01

    Studying ancient DNA allows us to retrace the evolutionary history of human pathogens, such as Mycobacterium leprae, the main causative agent of leprosy. Leprosy is one of the oldest recorded and most stigmatizing diseases in human history. The disease was prevalent in Europe until the 16th century and is still endemic in many countries with over 200,000 new cases reported annually. Previous worldwide studies on modern and European medieval M. leprae genomes revealed that they cluster into several distinct branches of which two were present in medieval Northwestern Europe. In this study, we analyzed 10 new medieval M. leprae genomes including the so far oldest M. leprae genome from one of the earliest known cases of leprosy in the United Kingdom, a skeleton from the Great Chesterford cemetery with a calibrated age of 415-545 C.E. This dataset provides a genetic time transect of M. leprae diversity in Europe over the past 1500 years. We find M. leprae strains from four distinct branches to be present in the Early Medieval Period, and strains from three different branches were detected within a single cemetery from the High Medieval Period. Altogether these findings suggest a higher genetic diversity of M. leprae strains in medieval Europe at various time points than previously assumed. The resulting more complex picture of the past phylogeography of leprosy in Europe impacts current phylogeographical models of M. leprae dissemination. It suggests alternative models for the past spread of leprosy such as a widespread prevalence of strains from different branches in Eurasia already in Antiquity or maybe even an origin in Western Eurasia. Furthermore, these results highlight how studying ancient M. leprae strains improves understanding of the history of leprosy worldwide.

  3. Single-cell genomics of multiple uncultured stramenopiles reveals underestimated functional diversity across oceans.

    PubMed

    Seeleuthner, Yoann; Mondy, Samuel; Lombard, Vincent; Carradec, Quentin; Pelletier, Eric; Wessner, Marc; Leconte, Jade; Mangot, Jean-François; Poulain, Julie; Labadie, Karine; Logares, Ramiro; Sunagawa, Shinichi; de Berardinis, Véronique; Salanoubat, Marcel; Dimier, Céline; Kandels-Lewis, Stefanie; Picheral, Marc; Searson, Sarah; Pesant, Stephane; Poulton, Nicole; Stepanauskas, Ramunas; Bork, Peer; Bowler, Chris; Hingamp, Pascal; Sullivan, Matthew B; Iudicone, Daniele; Massana, Ramon; Aury, Jean-Marc; Henrissat, Bernard; Karsenti, Eric; Jaillon, Olivier; Sieracki, Mike; de Vargas, Colomban; Wincker, Patrick

    2018-01-22

    Single-celled eukaryotes (protists) are critical players in global biogeochemical cycling of nutrients and energy in the oceans. While their roles as primary producers and grazers are well appreciated, other aspects of their life histories remain obscure due to challenges in culturing and sequencing their natural diversity. Here, we exploit single-cell genomics and metagenomics data from the circumglobal Tara Oceans expedition to analyze the genome content and apparent oceanic distribution of seven prevalent lineages of uncultured heterotrophic stramenopiles. Based on the available data, each sequenced genome or genotype appears to have a specific oceanic distribution, principally correlated with water temperature and depth. The genome content provides hypotheses for specialization in terms of cell motility, food spectra, and trophic stages, including the potential impact on their lifestyles of horizontal gene transfer from prokaryotes. Our results support the idea that prominent heterotrophic marine protists perform diverse functions in ocean ecology.

  4. Complete genomic sequences of Propionibacterium freudenreichii phages from Swiss cheese reveal greater diversity than Cutibacterium (formerly Propionibacterium) acnes phages.

    PubMed

    Cheng, Lucy; Marinelli, Laura J; Grosset, Noël; Fitz-Gibbon, Sorel T; Bowman, Charles A; Dang, Brian Q; Russell, Daniel A; Jacobs-Sera, Deborah; Shi, Baochen; Pellegrini, Matteo; Miller, Jeff F; Gautier, Michel; Hatfull, Graham F; Modlin, Robert L

    2018-03-01

    A remarkable exception to the large genetic diversity often observed for bacteriophages infecting a specific bacterial host was found for the Cutibacterium acnes (formerly Propionibacterium acnes) phages, which are highly homogeneous. Phages infecting the related species, which is also a member of the Propionibacteriaceae family, Propionibacterium freudenreichii, a bacterium used in production of Swiss-type cheeses, have also been described and are common contaminants of the cheese manufacturing process. However, little is known about their genetic composition and diversity. We obtained seven independently isolated bacteriophages that infect P. freudenreichii from Swiss-type cheese samples, and determined their complete genome sequences. These data revealed that all seven phage isolates are of similar genomic length and GC% content, but their genomes are highly diverse, including genes encoding the capsid, tape measure, and tail proteins. In contrast to C. acnes phages, all P. freudenreichii phage genomes encode a putative integrase protein, suggesting they are capable of lysogenic growth. This is supported by the finding of related prophages in some P. freudenreichii strains. The seven phages could further be distinguished as belonging to two distinct genomic types, or 'clusters', based on nucleotide sequences, and host range analyses conducted on a collection of P. freudenreichii strains show a higher degree of host specificity than is observed for the C. acnes phages. Overall, our data demonstrate P. freudenreichii bacteriophages are distinct from C. acnes phages, as evidenced by their higher genetic diversity, potential for lysogenic growth, and more restricted host ranges. This suggests substantial differences in the evolution of these related species from the Propionibacteriaceae family and their phages, which is potentially related to their distinct environmental niches.

  5. Comparative genomics reveals insights into avian genome evolution and adaptation

    PubMed Central

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  6. Open chromatin reveals the functional maize genome

    USDA-ARS's Scientific Manuscript database

    Every cellular process mediated through nuclear DNA must contend with chromatin. As results from ENCODE show, open chromatin assays can efficiently integrate across diverse regulatory elements, revealing functional non-coding genome. In this study, we use a MNase hypersensitivity assay to discover o...

  7. Bacillus subtilis genome diversity.

    PubMed

    Earl, Ashlee M; Losick, Richard; Kolter, Roberto

    2007-02-01

    Microarray-based comparative genomic hybridization (M-CGH) is a powerful method for rapidly identifying regions of genome diversity among closely related organisms. We used M-CGH to examine the genome diversity of 17 strains belonging to the nonpathogenic species Bacillus subtilis. Our M-CGH results indicate that there is considerable genetic heterogeneity among members of this species; nearly one-third of Bsu168-specific genes exhibited variability, as measured by the microarray hybridization intensities. The variable loci include those encoding proteins involved in antibiotic production, cell wall synthesis, sporulation, and germination. The diversity in these genes may reflect this organism's ability to survive in diverse natural settings.

  8. Comparative genomics reveals insights into avian genome evolution and adaptation.

    PubMed

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M; Lee, Chul; Storz, Jay F; Antunes, Agostinho; Greenwold, Matthew J; Meredith, Robert W; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S; Gatesy, John; Hoffmann, Federico G; Opazo, Juan C; Håstad, Olle; Sawyer, Roger H; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A; Green, Richard E; O'Brien, Stephen J; Griffin, Darren; Johnson, Warren E; Haussler, David; Ryder, Oliver A; Willerslev, Eske; Graves, Gary R; Alström, Per; Fjeldså, Jon; Mindell, David P; Edwards, Scott V; Braun, Edward L; Rahbek, Carsten; Burt, David W; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D; Gilbert, M Thomas P; Wang, Jun

    2014-12-12

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. Copyright © 2014, American Association for the Advancement of Science.

  9. Genomic and transcriptomic analyses reveal differential regulation of diverse terpenoid and polyketides secondary metabolites in Hericium erinaceus.

    PubMed

    Chen, Juan; Zeng, Xu; Yang, Yan Long; Xing, Yong Mei; Zhang, Qi; Li, Jia Mei; Ma, Ke; Liu, Hong Wei; Guo, Shun Xing

    2017-08-31

    The lion's mane mushroom Hericium erinaceus is a famous traditional medicinal fungus credited with anti-dementia activity and a producer of cyathane diterpenoid natural products (erinacines) useful against nervous system diseases. To date, few studies have explored the biosynthesis of these compounds, although their chemical synthesis is known. Here, we report the first genome and transcriptome sequence of the medicinal fungus H. erinaceus. The size of the genome is 39.35 Mb, containing 9895 gene models. The genome of H. erinaceus reveals diverse enzymes and a large family of cytochrome P450 (CYP) proteins involved in the biosynthesis of terpenoid backbones, diterpenoids, sesquiterpenes and polyketides. Three gene clusters related to terpene biosynthesis and one gene cluster for polyketide biosynthesis (PKS) were predicted. Genes involved in terpenoid biosynthesis were generally upregulated in mycelia, while the PKS gene was upregulated in the fruiting body. Comparative genome analysis of 42 fungal species of Basidiomycota revealed that most edible and medicinal mushrooms show many more gene clusters involved in terpenoid and polyketide biosynthesis compared to the pathogenic fungi. None of the gene clusters for terpenoid or polyketide biosynthesis were predicted in the poisonous mushroom Amanita muscaria. Our findings may facilitate future discovery and biosynthesis of bioactive secondary metabolites from H. erinaceus and provide fundamental information for exploring the secondary metabolites in other Basidiomycetes.

  10. Comparative Genomic Analysis Reveals a Diverse Repertoire of Genes Involved in Prokaryote-Eukaryote Interactions within the Pseudovibrio Genus

    PubMed Central

    Romano, Stefano; Fernàndez-Guerra, Antonio; Reen, F. Jerry; Glöckner, Frank O.; Crowley, Susan P.; O'Sullivan, Orla; Cotter, Paul D.; Adams, Claire; Dobson, Alan D. W.; O'Gara, Fergal

    2016-01-01

    Strains of the Pseudovibrio genus have been detected worldwide, mainly as part of bacterial communities associated with marine invertebrates, particularly sponges. This recurrent association has been considered as an indication of a symbiotic relationship between these microbes and their host. Until recently, the availability of only two genomes, belonging to closely related strains, has limited the knowledge on the genomic and physiological features of the genus to a single phylogenetic lineage. Here we present 10 newly sequenced genomes of Pseudovibrio strains isolated from marine sponges from the west coast of Ireland, and including the other two publicly available genomes we performed an extensive comparative genomic analysis. Homogeneity was apparent in terms of both the orthologous genes and the metabolic features shared amongst the 12 strains. At the genomic level, a key physiological difference observed amongst the isolates was the presence only in strain P. axinellae AD2 of genes encoding proteins involved in assimilatory nitrate reduction, which was then proved experimentally. We then focused on studying those systems known to be involved in the interactions with eukaryotic and prokaryotic cells. This analysis revealed that the genus harbors a large diversity of toxin-like proteins, secretion systems and their potential effectors. Their distribution in the genus was not always consistent with the phylogenetic relationship of the strains. Finally, our analyses identified new genomic islands encoding potential toxin-immunity systems, previously unknown in the genus. Our analyses shed new light on the Pseudovibrio genus, indicating a large diversity of both metabolic features and systems for interacting with the host. The diversity in both distribution and abundance of these systems amongst the strains underlines how metabolically and phylogenetically similar bacteria may use different strategies to interact with the host and find a niche within its

  11. Genome-wide diversity and selective pressure in the human rhinovirus

    PubMed Central

    Kistler, Amy L; Webster, Dale R; Rouskin, Silvi; Magrini, Vince; Credle, Joel J; Schnurr, David P; Boushey, Homer A; Mardis, Elaine R; Li, Hao; DeRisi, Joseph L

    2007-01-01

    Background The human rhinoviruses (HRV) are one of the most common and diverse respiratory pathogens of humans. Over 100 distinct HRV serotypes are known, yet only 6 genomes are available. Due to the paucity of HRV genome sequence, little is known about the genetic diversity within HRV or the forces driving this diversity. Previous comparative genome sequence analyses indicate that recombination drives diversification in multiple genera of the picornavirus family, yet it remains unclear if this holds for HRV. Results To resolve this and gain insight into the forces driving diversification in HRV, we generated a representative set of 34 fully sequenced HRVs. Analysis of these genomes shows consistent phylogenies across the genome, conserved non-coding elements, and only limited recombination. However, spikes of genetic diversity at both the nucleotide and amino acid level are detectable within every locus of the genome. Despite this, the HRV genome as a whole is under purifying selective pressure, with islands of diversifying pressure in the VP1, VP2, and VP3 structural genes and two non-structural genes, the 3C protease and 3D polymerase. Mapping diversifying residues in these factors onto available 3-dimensional structures revealed the diversifying capsid residues partition to the external surface of the viral particle in statistically significant proximity to antigenic sites. Diversifying pressure in the pleconaril binding site is confined to a single residue known to confer drug resistance (VP1 191). In contrast, diversifying pressure in the non-structural genes is less clear, mapping both nearby and beyond characterized functional domains of these factors. Conclusion This work provides a foundation for understanding HRV genetic diversity and insight into the underlying biology driving evolution in HRV. It expands our knowledge of the genome sequence space that HRV reference serotypes occupy and how the pattern of genetic diversity across HRV genomes differs

  12. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

    PubMed

    Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  13. Probing Genomic Aspects of the Multi-Host Pathogen Clostridium perfringens Reveals Significant Pangenome Diversity, and a Diverse Array of Virulence Factors.

    PubMed

    Kiu, Raymond; Caim, Shabhonam; Alexander, Sarah; Pachori, Purnima; Hall, Lindsay J

    2017-01-01

    Clostridium perfringens is an important cause of animal and human infections; however, information about the genetic makeup of this pathogenic bacterium is currently limited. In this study, we sought to understand and characterise the genomic variation, pangenomic diversity, and key virulence traits of 56 C. perfringens strains, which included 51 public and 5 newly sequenced and annotated genomes, using Whole Genome Sequencing. Our investigation revealed that C. perfringens has an "open" pangenome comprising 11667 genes and 12.6% of core genes, identified as the most divergent single-species Gram-positive bacterial pangenome currently reported. Our computational analyses also defined C. perfringens phylogeny (16S rRNA gene) in relation to some 25 Clostridium species, with C. baratii and C. sardiniense determined to be the closest relatives. Profiling virulence-associated factors confirmed presence of well-characterised C. perfringens-associated exotoxin genes including α-toxin (plc), enterotoxin (cpe), and Perfringolysin O (pfo or pfoA), although interestingly there did not appear to be a close correlation with encoded toxin type and disease phenotype. Furthermore, genomic analysis indicated significant horizontal gene transfer events as defined by presence of prophage genomes, and notably absence of CRISPR defence systems in >70% (40/56) of the strains. In relation to antimicrobial resistance mechanisms, tetracycline resistance genes (tet) and anti-defensin genes (mprF) were consistently detected in silico (tet: 75%; mprF: 100%). However, pre-antibiotic era strain genomes did not encode tet, thus implying antimicrobial selective pressures in C. perfringens evolutionary history over the past 80 years. This study provides new genomic understanding of this genetically divergent multi-host bacterium, and further expands our knowledge on this medically and veterinarily important pathogen.

  14. Genome-wide data reveal cryptic diversity and genetic introgression in an Oriental cynopterine fruit bat radiation.

    PubMed

    Chattopadhyay, Balaji; Garg, Kritika M; Kumar, A K Vinoth; Doss, D Paramanantha Swami; Rheindt, Frank E; Kandula, Sripathi; Ramakrishnan, Uma

    2016-02-18

    The Oriental fruit bat genus Cynopterus, with several geographically overlapping species, presents an interesting case study to evaluate the evolutionary significance of coexistence versus isolation. We examined the morphological and genetic variability of congeneric fruit bats Cynopterus sphinx and C. brachyotis using 405 samples from two natural contact zones and 17 allopatric locations in the Indian subcontinent; and investigated the population differentiation patterns, evolutionary history, and the possibility of cryptic diversity in this species pair. Analysis of microsatellites, cytochrome b gene sequences, and restriction digestion based genome-wide data revealed that C. sphinx and C. brachyotis do not hybridize in contact zones. However, cytochrome b gene sequences and genome-wide SNP data helped uncover a cryptic, hitherto unrecognized cynopterine lineage in northeastern India coexisting with C. sphinx. Further analyses of shared variation of SNPs using Patterson's D statistics suggest introgression between this lineage and C. sphinx. Multivariate analyses of morphology using genetically classified grouping confirmed substantial morphological overlap between C. sphinx and C. brachyotis, specifically in the high elevation contact zones in southern India. Our results uncover novel diversity and detect a pattern of genetic introgression in a cryptic radiation of bats, demonstrating the complicated nature of lineage diversification in this poorly understood taxonomic group. Our results highlight the importance of genome-wide data to study evolutionary processes of morphologically similar species pairs. Our approach represents a significant step forward in evolutionary research on young radiations of non-model species that may retain the ability of interspecific gene flow.
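
    The abstract above relies on Patterson's D statistic to detect introgression. As a minimal sketch, assuming the ABBA and BABA site-pattern counts have already been tallied from the SNP data, the standard form D = (ABBA - BABA) / (ABBA + BABA) can be computed as below. The function name and the counts are hypothetical and are not taken from the study.

```python
def patterson_d(n_abba: int, n_baba: int) -> float:
    """Return Patterson's D from ABBA and BABA site-pattern counts.

    D near 0 is consistent with no excess allele sharing; a clearly
    non-zero D suggests gene flow between two of the lineages.
    """
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (n_abba - n_baba) / total


# Hypothetical counts, for illustration only.
print(patterson_d(120, 80))  # 0.2
```

    In practice the significance of D is usually assessed with a block jackknife over the genome, which this sketch omits.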

  15. Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication

    PubMed Central

    2011-01-01

    Background: DNA target enrichment by micro-array capture combined with high throughput sequencing technologies provides the possibility to obtain large amounts of sequence data (e.g. whole mitochondrial DNA genomes) from multiple individuals at relatively low costs. Previously, whole mitochondrial genome data for domestic horses (Equus caballus) were limited to only a few specimens and only short parts of the mtDNA genome (especially the hypervariable region) were investigated for larger sample sets. Results: In this study we investigated whole mitochondrial genomes of 59 domestic horses from 44 breeds and a single Przewalski horse (Equus przewalski) using a recently described multiplex micro-array capture approach. We found 473 variable positions within the domestic horses, 292 of which are parsimony-informative, providing a well resolved phylogenetic tree. Our divergence time estimate suggests that the mitochondrial genomes of modern horse breeds shared a common ancestor around 93,000 years ago and no later than 38,000 years ago. A Bayesian skyline plot (BSP) reveals a significant population expansion beginning 6,000-8,000 years ago with an ongoing exponential growth until the present, similar to other domestic animal species. Our data further suggest that a large sample of wild horse diversity was incorporated into the domestic population; specifically, at least 46 of the mtDNA lineages observed in domestic horses (73%) already existed before the beginning of domestication about 5,000 years ago. Conclusions: Our study provides a window into the maternal origins of extant domestic horses and confirms that modern domestic breeds present a wide sample of the mtDNA diversity found in ancestral, now extinct, wild horse populations. The data obtained allow us to detect a population expansion event coinciding with the beginning of domestication and to estimate both the minimum number of female horses incorporated into the domestic gene pool and the time depth of the

  16. Genome Surfing As Driver of Microbial Genomic Diversity.

    PubMed

    Choudoir, Mallory J; Panke-Buisse, Kevin; Andam, Cheryl P; Buckley, Daniel H

    2017-08-01

    Historical changes in population size, such as those caused by demographic range expansions, can produce nonadaptive changes in genomic diversity through mechanisms such as gene surfing. We propose that demographic range expansion of a microbial population capable of horizontal gene exchange can result in genome surfing, a mechanism that can cause widespread increase in the pan-genome frequency of genes acquired by horizontal gene exchange. We explain that patterns of genetic diversity within Streptomyces are consistent with genome surfing, and we describe several predictions for testing this hypothesis both in Streptomyces and in other microorganisms. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Whole genome analysis reveals the diversity and evolutionary relationships between necrotic enteritis-causing strains of Clostridium perfringens.

    PubMed

    Lacey, Jake A; Allnutt, Theodore R; Vezina, Ben; Van, Thi Thu Hao; Stent, Thomas; Han, Xiaoyan; Rood, Julian I; Wade, Ben; Keyburn, Anthony L; Seemann, Torsten; Chen, Honglei; Haring, Volker; Johanesen, Priscilla A; Lyras, Dena; Moore, Robert J

    2018-05-22

    Clostridium perfringens causes a range of diseases in animals and humans including necrotic enteritis in chickens and food poisoning and gas gangrene in humans. Necrotic enteritis is of concern in commercial chicken production due to the cost of the implementation of infection control measures and to productivity losses. This study has focused on the genomic analysis of a range of chicken-derived C. perfringens isolates, from around the world and from different years. The genomes were sequenced and compared with 20 genomes available from public databases, which were from a diverse collection of isolates from chickens, other animals, and humans. We used a distance based phylogeny that was constructed based on gene content rather than sequence identity. Similarity between strains was defined as the number of genes that they have in common divided by their total number of genes. In this type of phylogenetic analysis, evolutionary distance can be interpreted in terms of evolutionary events such as acquisition and loss of genes, whereas the underlying properties (the gene content) can be interpreted in terms of function. We also compared these methods to the sequence-based phylogeny of the core genome. Distinct pathogenic clades of necrotic enteritis-causing C. perfringens were identified. They were characterised by variable regions encoded on the chromosome, with predicted roles in capsule production, adhesion, inhibition of related strains, phage integration, and metabolism. Some strains have almost identical genomes, even though they were isolated from different geographic regions at various times, while other highly distant genomes appear to result in similar outcomes with regard to virulence and pathogenesis. The high level of diversity in chicken isolates suggests there is no reliable factor that defines a chicken strain of C. perfringens, however, disease-causing strains can be defined by the presence of netB-encoding plasmids. This study reveals that horizontal
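
    The abstract above defines strain similarity as the number of shared genes divided by the total number of genes. A minimal sketch of one reading of that definition follows, assuming "total" means the union of the two gene sets (a Jaccard index) and that distance is taken as 1 minus similarity. Both are assumptions rather than the authors' stated method, and the gene identifiers are hypothetical.

```python
def gene_content_similarity(genes_a, genes_b) -> float:
    """Shared genes divided by the union of both gene sets (Jaccard index)."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        raise ValueError("both gene sets are empty")
    return len(a & b) / len(union)


def gene_content_distance(genes_a, genes_b) -> float:
    """Assumed distance: 1 - similarity, usable as input to a distance-based tree."""
    return 1.0 - gene_content_similarity(genes_a, genes_b)


# Hypothetical gene identifiers, for illustration only.
strain_1 = {"g1", "g2", "g3", "g4"}
strain_2 = {"g1", "g2", "g5"}
print(gene_content_similarity(strain_1, strain_2))  # 0.4
```

    A pairwise matrix of such distances could then be passed to any distance-based tree-building method, such as neighbour joining.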

  18. Probing Genomic Aspects of the Multi-Host Pathogen Clostridium perfringens Reveals Significant Pangenome Diversity, and a Diverse Array of Virulence Factors

    PubMed Central

    Kiu, Raymond; Caim, Shabhonam; Alexander, Sarah; Pachori, Purnima; Hall, Lindsay J.

    2017-01-01

    Clostridium perfringens is an important cause of animal and human infections, however information about the genetic makeup of this pathogenic bacterium is currently limited. In this study, we sought to understand and characterise the genomic variation, pangenomic diversity, and key virulence traits of 56 C. perfringens strains which included 51 public, and 5 newly sequenced and annotated genomes using Whole Genome Sequencing. Our investigation revealed that C. perfringens has an “open” pangenome comprising 11667 genes and 12.6% of core genes, identified as the most divergent single-species Gram-positive bacterial pangenome currently reported. Our computational analyses also defined C. perfringens phylogeny (16S rRNA gene) in relation to some 25 Clostridium species, with C. baratii and C. sardiniense determined to be the closest relatives. Profiling virulence-associated factors confirmed presence of well-characterised C. perfringens-associated exotoxins genes including α-toxin (plc), enterotoxin (cpe), and Perfringolysin O (pfo or pfoA), although interestingly there did not appear to be a close correlation with encoded toxin type and disease phenotype. Furthermore, genomic analysis indicated significant horizontal gene transfer events as defined by presence of prophage genomes, and notably absence of CRISPR defence systems in >70% (40/56) of the strains. In relation to antimicrobial resistance mechanisms, tetracycline resistance genes (tet) and anti-defensins genes (mprF) were consistently detected in silico (tet: 75%; mprF: 100%). However, pre-antibiotic era strain genomes did not encode for tet, thus implying antimicrobial selective pressures in C. perfringens evolutionary history over the past 80 years. This study provides new genomic understanding of this genetically divergent multi-host bacterium, and further expands our knowledge on this medically and veterinary important pathogen. PMID:29312194

  19. Endozoicomonas genomes reveal functional adaptation and plasticity in bacterial strains symbiotically associated with diverse marine hosts

    PubMed Central

    Neave, Matthew J.; Michell, Craig T.; Apprill, Amy; Voolstra, Christian R.

    2017-01-01

    Endozoicomonas bacteria are globally distributed and often abundantly associated with diverse marine hosts including reef-building corals, yet their function remains unknown. In this study we generated novel Endozoicomonas genomes from single cells and metagenomes obtained directly from the corals Stylophora pistillata, Pocillopora verrucosa, and Acropora humilis. We then compared these culture-independent genomes to existing genomes of bacterial isolates acquired from a sponge, sea slug, and coral to examine the functional landscape of this enigmatic genus. Sequencing and analysis of single cells and metagenomes resulted in four novel genomes with 60–76% and 81–90% genome completeness, respectively. These data also confirmed that Endozoicomonas genomes are large and are not streamlined for an obligate endosymbiotic lifestyle, implying that they have free-living stages. All genomes show an enrichment of genes associated with carbon sugar transport and utilization and protein secretion, potentially indicating that Endozoicomonas contribute to the cycling of carbohydrates and the provision of proteins to their respective hosts. Importantly, besides these commonalities, the genomes showed evidence for differential functional specificity and diversification, including genes for the production of amino acids. Given this metabolic diversity of Endozoicomonas we propose that different genotypes play disparate roles and have diversified in concert with their hosts. PMID:28094347

  20. Genetic and Genomic Diversity Studies of Acacia Symbionts in Senegal Reveal New Species of Mesorhizobium with a Putative Geographical Pattern

    PubMed Central

    Diouf, Fatou; Diouf, Diegane; Klonowska, Agnieszka; Le Queré, Antoine; Bakhoum, Niokhor; Fall, Dioumacor; Neyra, Marc; Parrinello, Hugues; Diouf, Mayecor; Ndoye, Ibrahima; Moulin, Lionel

    2015-01-01

    Acacia senegal (L) Willd. and Acacia seyal Del. are highly nitrogen-fixing and moderately salt tolerant species. In this study we focused on the genetic and genomic diversity of Acacia mesorhizobia symbionts from diverse origins in Senegal and investigated possible correlations between the genetic diversity of the strains, their soil of origin, and their tolerance to salinity. We first performed a multi-locus sequence analysis on five markers gene fragments on a collection of 47 mesorhizobia strains of A. senegal and A. seyal from 8 localities. Most of the strains (60%) clustered with the M. plurifarium type strain ORS 1032T, while the others form four new clades (MSP1 to MSP4). We sequenced and assembled seven draft genomes: four in the M. plurifarium clade (ORS3356, ORS3365, STM8773 and ORS1032T), one in MSP1 (STM8789), MSP2 (ORS3359) and MSP3 (ORS3324). The average nucleotide identities between these genomes together with the MLSA analysis reveal three new species of Mesorhizobium. A great variability of salt tolerance was found among the strains with a lack of correlation between the genetic diversity of mesorhizobia, their salt tolerance and the soils samples characteristics. A putative geographical pattern of A. senegal symbionts between the dryland north part and the center of Senegal was found, reflecting adaptations to specific local conditions such as the water regime. However, the presence of salt does not seem to be an important structuring factor of Mesorhizobium species. PMID:25658650

  1. Sequencing of Seven Haloarchaeal Genomes Reveals Patterns of Genomic Flux

    PubMed Central

    Lynch, Erin A.; Langille, Morgan G. I.; Darling, Aaron; Wilbanks, Elizabeth G.; Haltiner, Caitlin; Shao, Katie S. Y.; Starr, Michael O.; Teiling, Clotilde; Harkins, Timothy T.; Edwards, Robert A.; Eisen, Jonathan A.; Facciotti, Marc T.

    2012-01-01

    We report the sequencing of seven genomes from two haloarchaeal genera, Haloferax and Haloarcula. Ease of cultivation and the existence of well-developed genetic and biochemical tools for several diverse haloarchaeal species make haloarchaea a model group for the study of archaeal biology. The unique physiological properties of these organisms also make them good candidates for novel enzyme discovery for biotechnological applications. Seven genomes were sequenced to ∼20× coverage and assembled to an average of 50 contigs (range 5 scaffolds - 168 contigs). Comparisons of protein-coding gene complements revealed large-scale differences in COG functional group enrichment between these genera. Analysis of genes encoding machinery for DNA metabolism reveals genera-specific expansions of the general transcription factor TATA binding protein as well as a history of extensive duplication and horizontal transfer of the proliferating cell nuclear antigen. Insights gained from this study emphasize the importance of haloarchaea for investigation of archaeal biology. PMID:22848480

  2. Genomic landscape of human diversity across Madagascar

    PubMed Central

    Pierron, Denis; Heiske, Margit; Razafindrazaka, Harilanto; Rakoto, Ignace; Rabetokotany, Nelly; Ravololomanga, Bodo; Rakotozafy, Lucien M.-A.; Rakotomalala, Mireille Mialy; Razafiarivony, Michel; Rasoarifetra, Bako; Raharijesy, Miakabola Andriamampianina; Razafindralambo, Lolona; Ramilisonina; Fanony, Fulgence; Lejamble, Sendra; Thomas, Olivier; Mohamed Abdallah, Ahmed; Rocher, Christophe; Arachiche, Amal; Tonaso, Laure; Pereda-loth, Veronica; Schiavinato, Stéphanie; Brucato, Nicolas; Ricaut, Francois-Xavier; Kusuma, Pradiptajati; Sudoyo, Herawati; Ni, Shengyu; Boland, Anne; Deleuze, Jean-Francois; Beaujard, Philippe; Grange, Philippe; Adelaar, Sander; Stoneking, Mark; Rakotoarisoa, Jean-Aimé; Radimilahy, Chantal; Letellier, Thierry

    2017-01-01

    Although situated ∼400 km from the east coast of Africa, Madagascar exhibits cultural, linguistic, and genetic traits from both Southeast Asia and Eastern Africa. The settlement history remains contentious; we therefore used a grid-based approach to sample at high resolution the genomic diversity (including maternal lineages, paternal lineages, and genome-wide data) across 257 villages and 2,704 Malagasy individuals. We find a common Bantu and Austronesian descent for all Malagasy individuals with a limited paternal contribution from Europe and the Middle East. Admixture and demographic growth happened recently, suggesting a rapid settlement of Madagascar during the last millennium. However, the distribution of African and Asian ancestry across the island reveals that the admixture was sex biased and happened heterogeneously across Madagascar, suggesting independent colonization of Madagascar from Africa and Asia rather than settlement by an already admixed population. In addition, there are geographic influences on the present genomic diversity, independent of the admixture, showing that a few centuries is sufficient to produce detectable genetic structure in human populations. PMID:28716916

  3. Comparative Analysis of Genome Diversity in Bullmastiff Dogs

    PubMed Central

    Mortlock, Sally-Anne; Khatkar, Mehar S.; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579

  4. Phylogeny of a genomically diverse group of Elymus (Poaceae) allopolyploids reveals multiple levels of reticulation.

    PubMed

    Mason-Gamer, Roberta J

    2013-01-01

    The grass tribe Triticeae (=Hordeeae) comprises only about 300 species, but it is well known for the economically important crop plants wheat, barley, and rye. The group is also recognized as a fascinating example of evolutionary complexity, with a history shaped by numerous events of auto- and allopolyploidy and apparent introgression involving diploids and polyploids. The genus Elymus comprises a heterogeneous collection of allopolyploid genome combinations, all of which include at least one set of homoeologs, designated St, derived from Pseudoroegneria. The current analysis includes a geographically and genomically diverse collection of 21 tetraploid Elymus species, and a single hexaploid species. Diploid and polyploid relationships were estimated using four molecular data sets, including one that combines two regions of the chloroplast genome, and three from unlinked nuclear genes: phosphoenolpyruvate carboxylase, β-amylase, and granule-bound starch synthase I. Four gene trees were generated using maximum likelihood, and the phylogenetic placement of the polyploid sequences reveals extensive reticulation beyond allopolyploidy alone. The trees were interpreted with reference to numerous phenomena known to complicate allopolyploid phylogenies, and introgression was identified as a major factor in their history. The work illustrates the interpretation of complicated phylogenetic results through the sequential consideration of numerous possible explanations, and the results highlight the value of careful inspection of multiple independent molecular phylogenetic estimates, with particular focus on the differences among them.

  5. Phylogenetic Diversity and Genotypical Complexity of H9N2 Influenza A Viruses Revealed by Genomic Sequence Analysis

    PubMed Central

    Dong, Guoying; Luo, Jing; Zhang, Hong; Wang, Chengmin; Duan, Mingxing; Deliberto, Thomas Jude; Nolte, Dale Louis; Ji, Guangju; He, Hongxuan

    2011-01-01

    H9N2 influenza A viruses have become established worldwide in terrestrial poultry and wild birds, and are occasionally transmitted to mammals including humans and pigs. To comprehensively elucidate the genetic and evolutionary characteristics of H9N2 influenza viruses, we performed a large-scale sequence analysis of 571 viral genomes from the NCBI Influenza Virus Resource Database, representing the spectrum of H9N2 influenza viruses isolated from 1966 to 2009. Our study provides a panoramic framework for better understanding the genesis and evolution of H9N2 influenza viruses, and for describing the history of H9N2 viruses circulating in diverse hosts. Panorama phylogenetic analysis of the eight viral gene segments revealed the complexity and diversity of H9N2 influenza viruses. The 571 H9N2 viral genomes were classified into 74 separate lineages, which had marked host and geographical differences in phylogeny. Panorama genotypical analysis also revealed that H9N2 viruses include at least 98 genotypes, which were further divided according to their HA lineages into seven series (A–G). Phylogenetic analysis of the internal genes showed that H9N2 viruses are closely related to H3, H4, H5, H7, H10, and H14 subtype influenza viruses. Our results indicate that H9N2 viruses have undergone extensive reassortments to generate multiple reassortants and genotypes, suggesting that the continued circulation of multiple genotypical H9N2 viruses throughout the world in diverse hosts has the potential to cause future influenza outbreaks in poultry and epidemics in humans. We propose a nomenclature system for identifying and unifying all lineages and genotypes of H9N2 influenza viruses in order to facilitate international communication on the evolution, ecology and epidemiology of H9N2 influenza viruses. PMID:21386964

  6. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Riley, Robert; Salamov, Asaf; Otillar, Robert

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum with nearly half of those (22 percent) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has ligninolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  7. Genetic Diversity in Lens Species Revealed by EST and Genomic Simple Sequence Repeat Analysis

    PubMed Central

    Dikshit, Harsh Kumar; Singh, Akanksha; Singh, Dharmendra; Aski, Muraleedhar Sidaram; Prakash, Prapti; Jain, Neelu; Meena, Suresh; Kumar, Shiv; Sarker, Ashutosh

    2015-01-01

    Low productivity of pilosae type lentils grown in South Asia is attributed to the narrow genetic base of the released cultivars, which results in susceptibility to biotic and abiotic stresses. For enhancement of productivity and production, broadening of the genetic base is essentially required. The genetic base of released cultivars can be broadened by using diverse types including bold seeded and early maturing lentils from the Mediterranean region and related wild species. Genetic diversity in eighty-six accessions of three species of genus Lens was assessed based on twelve genomic and thirty-one EST-SSR markers. The evaluated set of genotypes included diverse lentil varieties and advanced breeding lines from the Indian programme, two early maturing ICARDA lines and five related wild subspecies/species endemic to the Mediterranean region. Genomic SSRs exhibited higher polymorphism in comparison to EST SSRs. GLLC 598 produced 5 alleles with the highest gene diversity value of 0.80. Among the studied subspecies/species, the 43 SSRs detected the maximum number of alleles in L. orientalis. Based on Nei’s genetic distance, cultivated lentil L. culinaris subsp. culinaris was found to be close to its wild progenitor L. culinaris subsp. orientalis. Pritchard’s STRUCTURE analysis of the 86 genotypes distinguished different subspecies/species. Higher variability was recorded among individuals within populations than among populations. PMID:26381889

  8. Genus-wide comparison of Pseudovibrio bacterial genomes reveal diverse adaptations to different marine invertebrate hosts.

    PubMed

    Alex, Anoop; Antunes, Agostinho

    2018-01-01

    Bacteria belonging to the genus Pseudovibrio have been frequently found in association with a wide variety of marine eukaryotic invertebrate hosts, indicative of their versatile and symbiotic lifestyle. A recent comparison of the sponge-associated Pseudovibrio genomes has shed light on the mechanisms influencing a successful symbiotic association with sponges. In contrast, the genomic architecture of Pseudovibrio bacteria associated with other marine hosts has received less attention. Here, we performed genus-wide comparative analyses of 18 Pseudovibrio isolated from sponges, coral, tunicates, flatworm, and seawater. The analyses revealed a certain degree of commonality among the majority of sponge- and coral-associated bacteria. Isolates from other marine invertebrate host, tunicates, exhibited a genetic repertoire for cold adaptation and specific metabolic abilities including mucin degradation in the Antarctic tunicate-associated bacterium Pseudovibrio sp. Tun.PHSC04_5.I4. Reductive genome evolution was simultaneously detected in the flatworm-associated bacteria and the sponge-associated bacterium P. axinellae AD2, through the loss of major secretion systems (type III/VI) and virulence/symbioses factors such as proteins involved in adhesion and attachment to the host. Our study also unraveled the presence of a CRISPR-Cas system in P. stylochi UST20140214-052 a flatworm-associated bacterium possibly suggesting the role of CRISPR-based adaptive immune system against the invading virus particles. Detection of mobile elements and genomic islands (GIs) in all bacterial members highlighted the role of horizontal gene transfer for the acquisition of novel genetic features, likely enhancing the bacterial ecological fitness. These findings are insightful to understand the role of genome diversity in Pseudovibrio as an evolutionary strategy to increase their colonizing success across a wide range of marine eukaryotic hosts.

  9. Genus-wide comparison of Pseudovibrio bacterial genomes reveal diverse adaptations to different marine invertebrate hosts

    PubMed Central

    Alex, Anoop

    2018-01-01

    Bacteria belonging to the genus Pseudovibrio have been frequently found in association with a wide variety of marine eukaryotic invertebrate hosts, indicative of their versatile and symbiotic lifestyle. A recent comparison of the sponge-associated Pseudovibrio genomes has shed light on the mechanisms influencing a successful symbiotic association with sponges. In contrast, the genomic architecture of Pseudovibrio bacteria associated with other marine hosts has received less attention. Here, we performed genus-wide comparative analyses of 18 Pseudovibrio isolated from sponges, coral, tunicates, flatworm, and seawater. The analyses revealed a certain degree of commonality among the majority of sponge- and coral-associated bacteria. Isolates from other marine invertebrate host, tunicates, exhibited a genetic repertoire for cold adaptation and specific metabolic abilities including mucin degradation in the Antarctic tunicate-associated bacterium Pseudovibrio sp. Tun.PHSC04_5.I4. Reductive genome evolution was simultaneously detected in the flatworm-associated bacteria and the sponge-associated bacterium P. axinellae AD2, through the loss of major secretion systems (type III/VI) and virulence/symbioses factors such as proteins involved in adhesion and attachment to the host. Our study also unraveled the presence of a CRISPR-Cas system in P. stylochi UST20140214-052 a flatworm-associated bacterium possibly suggesting the role of CRISPR-based adaptive immune system against the invading virus particles. Detection of mobile elements and genomic islands (GIs) in all bacterial members highlighted the role of horizontal gene transfer for the acquisition of novel genetic features, likely enhancing the bacterial ecological fitness. These findings are insightful to understand the role of genome diversity in Pseudovibrio as an evolutionary strategy to increase their colonizing success across a wide range of marine eukaryotic hosts. PMID:29775460

  10. Genomic and Genetic Diversity within the Pseudomonas fluorescens Complex

    PubMed Central

    Garrido-Sanz, Daniel; Meier-Kolthoff, Jan P.; Göker, Markus; Martín, Marta; Rivilla, Rafael; Redondo-Nieto, Miguel

    2016-01-01

    The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, MLSA and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs, show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a low number of strains were enough to explain most of the CDSs diversity at core and strain-specific genomic fractions. Finally, the identification and analysis of group-specific genome and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as PGPR. PMID:26915094

  11. Genome-wide divergence, haplotype distribution and population demographic histories for Gossypium hirsutum and Gossypium barbadense as revealed by genome-anchored SNPs

    PubMed Central

    Reddy, Umesh K.; Nimmakayala, Padma; Abburi, Venkata Lakshmi; Reddy, C. V. C. M.; Saminathan, Thangasamy; Percy, Richard G.; Yu, John Z.; Frelichowski, James; Udall, Joshua A.; Page, Justin T.; Zhang, Dong; Shehzad, Tariq; Paterson, Andrew H.

    2017-01-01

    Use of 10,129 singleton SNPs of known genomic location in tetraploid cotton provided unique opportunities to characterize genome-wide diversity among 440 Gossypium hirsutum and 219 G. barbadense cultivars and landrace accessions of widespread origin. Using the SNPs distributed genome-wide, we examined genetic diversity, haplotype distribution and linkage disequilibrium patterns in the G. hirsutum and G. barbadense genomes to clarify population demographic history. Diversity and identity-by-state analyses have revealed little sharing of alleles between the two cultivated allotetraploid genomes, with a few exceptions that indicated sporadic gene flow. We found a high number of new alleles, representing increased nucleotide diversity, on chromosomes 1 and 2 in cultivated G. hirsutum as compared with low nucleotide diversity on these chromosomes in landrace G. hirsutum. In contrast, G. barbadense chromosomes showed negative Tajima’s D on several chromosomes for both cultivated and landrace types, which indicates that speciation of G. barbadense itself might have occurred with relatively narrow genetic diversity. The presence of conserved linkage disequilibrium (LD) blocks and haplotypes between G. hirsutum and G. barbadense provides strong evidence for comparable patterns of evolution in their domestication processes. Our study illustrates the potential use of population genetic techniques to identify genomic regions for domestication. PMID:28128280

  12. Cyanobacterial life at low O2: community genomics and function reveal metabolic versatility and extremely low diversity in a Great Lakes sinkhole mat.

    PubMed

    Voorhies, A A; Biddanda, B A; Kendall, S T; Jain, S; Marcus, D N; Nold, S C; Sheldon, N D; Dick, G J

    2012-05-01

    Cyanobacteria are renowned as the mediators of Earth's oxygenation. However, little is known about the cyanobacterial communities that flourished under the low-O2 conditions that characterized most of their evolutionary history. Microbial mats in the submerged Middle Island Sinkhole of Lake Huron provide opportunities to investigate cyanobacteria under such persistent low-O2 conditions. Here, venting groundwater rich in sulfate and low in O2 supports a unique benthic ecosystem of purple-colored cyanobacterial mats. Beneath the mat is a layer of carbonate that is enriched in calcite and to a lesser extent dolomite. In situ benthic metabolism chambers revealed that the mats are net sinks for O2, suggesting primary production mechanisms other than oxygenic photosynthesis. Indeed, 14C-bicarbonate uptake studies of autotrophic production show variable contributions from oxygenic and anoxygenic photosynthesis and chemosynthesis, presumably because of supply of sulfide. These results suggest the presence of either facultatively anoxygenic cyanobacteria or a mix of oxygenic/anoxygenic types of cyanobacteria. Shotgun metagenomic sequencing revealed a remarkably low-diversity mat community dominated by just one genotype most closely related to the cyanobacterium Phormidium autumnale, for which an essentially complete genome was reconstructed. Also recovered were partial genomes from a second genotype of Phormidium and several Oscillatoria. Despite the taxonomic simplicity, diverse cyanobacterial genes putatively involved in sulfur oxidation were identified, suggesting a diversity of sulfide physiologies. The dominant Phormidium genome reflects versatile metabolism and physiology that is specialized for a communal lifestyle under fluctuating redox conditions and light availability. Overall, this study provides genomic and physiologic insights into low-O2 cyanobacterial mat ecosystems that played crucial geobiological roles over long stretches of Earth history.

  13. Genomic diversity of cercarial clones of Himasthla elongata (Trematoda, Echinostomatidae) determined with AFLP technique.

    PubMed

    Galaktionov, N K; Podgornaya, O I; Strelkov, P P; Galaktionov, K V

    2016-12-01

    The aim of this study was to reveal genomic diversity formed during parthenogenetic reproduction of rediae of the trematode Himasthla elongata in its molluskan host Littorina littorea. We applied amplified fragment length polymorphism (AFLP) to determine the genomic diversity of individual cercariae within the clone, that is, the infrapopulation of parthenogenetic progeny in a single molluskan host. The level of genomic diversity of particular cercariae isolates from a single clone, detected with the EcoRI/MseI AFLP reaction, was significantly lower than the variability of cercariae from different clones. The presence of intraclonal genomic diversity indicates a nonsexual shuffle of alleles during parthenogenesis in the rediae of H. elongata. The obtained polymorphic AFLP fragments were long enough to detect the sequences that may be responsible for clonal genomic variability. Based on this, AFLP can be recommended as a tool for the study of genetic mechanisms of this variability.

  14. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication.

    PubMed

    Avni, Raz; Nave, Moran; Barad, Omer; Baruch, Kobi; Twardziok, Sven O; Gundlach, Heidrun; Hale, Iago; Mascher, Martin; Spannagl, Manuel; Wiebe, Krystalee; Jordan, Katherine W; Golan, Guy; Deek, Jasline; Ben-Zvi, Batsheva; Ben-Zvi, Gil; Himmelbach, Axel; MacLachlan, Ron P; Sharpe, Andrew G; Fritz, Allan; Ben-David, Roi; Budak, Hikmet; Fahima, Tzion; Korol, Abraham; Faris, Justin D; Hernandez, Alvaro; Mikel, Mark A; Levy, Avraham A; Steffenson, Brian; Maccaferri, Marco; Tuberosa, Roberto; Cattivelli, Luigi; Faccioli, Primetta; Ceriotti, Aldo; Kashkush, Khalil; Pourkheirandish, Mohammad; Komatsuda, Takao; Eilam, Tamar; Sela, Hanan; Sharon, Amir; Ohad, Nir; Chamovitz, Daniel A; Mayer, Klaus F X; Stein, Nils; Ronen, Gil; Peleg, Zvi; Pozniak, Curtis J; Akhunov, Eduard D; Distelfeld, Assaf

    2017-07-07

    Wheat (Triticum spp.) is one of the founder crops that likely drove the Neolithic transition to sedentary agrarian societies in the Fertile Crescent more than 10,000 years ago. Identifying genetic modifications underlying wheat's domestication requires knowledge about the genome of its allo-tetraploid progenitor, wild emmer (T. turgidum ssp. dicoccoides). We report a 10.1-gigabase assembly of the 14 chromosomes of wild tetraploid wheat, as well as analyses of gene content, genome architecture, and genetic diversity. With this fully assembled polyploid wheat genome, we identified the causal mutations in Brittle Rachis 1 (TtBtr1) genes controlling shattering, a key domestication trait. A study of genomic diversity among wild and domesticated accessions revealed genomic regions bearing the signature of selection under domestication. This reference assembly will serve as a resource for accelerating the genome-assisted improvement of modern wheat varieties. Copyright © 2017, American Association for the Advancement of Science.

  15. Nomadic lifestyle of Lactobacillus plantarum revealed by comparative genomics of 54 strains isolated from different habitats.

    PubMed

    Martino, Maria Elena; Bayjanov, Jumamurat R; Caffrey, Brian E; Wels, Michiel; Joncour, Pauline; Hughes, Sandrine; Gillet, Benjamin; Kleerebezem, Michiel; van Hijum, Sacha A F T; Leulier, François

    2016-12-01

    The ability of bacteria to adapt to diverse environmental conditions is well-known. The process of bacterial adaptation to a niche has been linked to large changes in the genome content, showing that many bacterial genomes reflect the constraints imposed by their habitat. However, some highly versatile bacteria are found in diverse habitats that share almost nothing in common. Lactobacillus plantarum is a lactic acid bacterium that is found in a large variety of habitats. With the aim of unravelling the link between evolution and ecological versatility of L. plantarum, we analysed the genomes of 54 L. plantarum strains isolated from different environments. Comparative genome analysis identified a high level of genomic diversity and plasticity among the strains analysed. Phylogenomic and functional divergence studies coupled with gene-trait matching analyses revealed a mixed distribution of the strains, which was uncoupled from their environmental origin. Our findings revealed the absence of specific genomic signatures marking adaptations of L. plantarum towards the diverse habitats it is associated with. This suggests fundamentally similar trends of genome evolution in L. plantarum, which occur in a manner that is apparently uncoupled from ecological constraint and reflects the nomadic lifestyle of this species. © 2016 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

  16. Genomic and genetic analyses of diversity and plant interactions of Pseudomonas fluorescens

    PubMed Central

    Silby, Mark W; Cerdeño-Tárraga, Ana M; Vernikos, Georgios S; Giddens, Stephen R; Jackson, Robert W; Preston, Gail M; Zhang, Xue-Xian; Moon, Christina D; Gehrig, Stefanie M; Godfrey, Scott AC; Knight, Christopher G; Malone, Jacob G; Robinson, Zena; Spiers, Andrew J; Harris, Simon; Challis, Gregory L; Yaxley, Alice M; Harris, David; Seeger, Kathy; Murphy, Lee; Rutter, Simon; Squares, Rob; Quail, Michael A; Saunders, Elizabeth; Mavromatis, Konstantinos; Brettin, Thomas S; Bentley, Stephen D; Hothersall, Joanne; Stephens, Elton; Thomas, Christopher M; Parkhill, Julian; Levy, Stuart B; Rainey, Paul B; Thomson, Nicholas R

    2009-01-01

    Background: Pseudomonas fluorescens are common soil bacteria that can improve plant health through nutrient cycling, pathogen antagonism and induction of plant defenses. The genome sequences of strains SBW25 and Pf0-1 were determined and compared to each other and with P. fluorescens Pf-5. A functional genomic in vivo expression technology (IVET) screen provided insight into genes used by P. fluorescens in its natural environment and an improved understanding of the ecological significance of diversity within this species. Results: Comparisons of three P. fluorescens genomes (SBW25, Pf0-1, Pf-5) revealed considerable divergence: 61% of genes are shared, the majority located near the replication origin. Phylogenetic and average amino acid identity analyses showed a low overall relationship. A functional screen of SBW25 defined 125 plant-induced genes including a range of functions specific to the plant environment. Orthologues of 83 of these exist in Pf0-1 and Pf-5, with 73 shared by both strains. The P. fluorescens genomes carry numerous complex repetitive DNA sequences, some resembling Miniature Inverted-repeat Transposable Elements (MITEs). In SBW25, repeat density and distribution revealed 'repeat deserts' lacking repeats, covering approximately 40% of the genome. Conclusions: P. fluorescens genomes are highly diverse. Strain-specific regions around the replication terminus suggest genome compartmentalization. The genomic heterogeneity among the three strains is reminiscent of a species complex rather than a single species. That 42% of plant-inducible genes were not shared by all strains reinforces this conclusion and shows that ecological success requires specialized and core functions. The diversity also indicates the significant size of genetic information within the Pseudomonas pan genome. PMID:19432983

  17. Genomic diversity within the haloalkaliphilic genus Thioalkalivibrio

    DOE PAGES

    Ahn, Anne-Catherine; Meier-Kolthoff, Jan P.; Overmars, Lex; ...

    2017-03-10

    Thioalkalivibrio is a genus of obligate chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacteria. Their habitats are soda lakes, which are dual extreme environments with a pH range from 9.5 to 11 and salt concentrations up to saturation. More than 100 strains of this genus have been isolated from various soda lakes all over the world, but only ten species have been effectively described yet. Therefore, the assignment of the remaining strains to either existing or novel species is important and will further elucidate their genomic diversity as well as give a better general understanding of this genus. Recently, the genomes of 76 Thioalkalivibrio strains were sequenced. On these, we applied different methods including (i) 16S rRNA gene sequence analysis, (ii) Multilocus Sequence Analysis (MLSA) based on eight housekeeping genes, (iii) Average Nucleotide Identity based on BLAST (ANIb) and MUMmer (ANIm), (iv) Tetranucleotide frequency correlation coefficients (TETRA), (v) digital DNA:DNA hybridization (dDDH) as well as (vi) nucleotide- and amino acid-based Genome BLAST Distance Phylogeny (GBDP) analyses. We detected a high genomic diversity by revealing 15 new "genomic" species and 16 new "genomic" subspecies in addition to the ten already described species. Phylogenetic and phylogenomic analyses showed that the genus is not monophyletic, because four strains were clearly separated from the other Thioalkalivibrio by type strains from other genera. Therefore, it is recommended to classify the latter group as a novel genus. The biogeographic distribution of Thioalkalivibrio suggested that the different "genomic" species can be classified as candidate disjunct or candidate endemic species. This study is a detailed genome-based classification and identification of members within the genus Thioalkalivibrio. However, future phenotypical and chemotaxonomical studies will be needed for a full species description of this genus.

  18. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms

    PubMed Central

    Nimmakayala, Padma; Abburi, Venkata L.; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C. V. Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K.

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9–2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers. PMID:27857720

  19. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms.

    PubMed

    Nimmakayala, Padma; Abburi, Venkata L; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C V Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9-2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers.

  20. Transposable element evolution in Heliconius suggests genome diversity within Lepidoptera

    PubMed Central

    2013-01-01

    Background: Transposable elements (TEs) have the potential to impact genome structure, function and evolution in profound ways. In order to understand the contribution of TEs to Heliconius melpomene, we queried the H. melpomene draft sequence to identify repetitive sequences. Results: We determined that TEs comprise ~25% of the genome. The predominant class of TEs (~12% of the genome) was the non-long terminal repeat (non-LTR) retrotransposons, including a novel SINE family. However, this was only slightly higher than content derived from DNA transposons, which are diverse, with several families having mobilized in the recent past. Compared to the only other well-studied lepidopteran genome, Bombyx mori, H. melpomene exhibits a higher DNA transposon content and a distinct repertoire of retrotransposons. We also found that H. melpomene exhibits a high rate of TE turnover with few older elements accumulating in the genome. Conclusions: Our analysis represents the first complete, de novo characterization of TE content in a butterfly genome and suggests that, while TEs are able to invade and multiply, TEs have an overall deleterious effect and/or that maintaining a small genome is advantageous. Our results also hint that analysis of additional lepidopteran genomes will reveal substantial TE diversity within the group. PMID:24088337

  1. The genome of the Erwinia amylovora phage PhiEaH1 reveals greater diversity and broadens the applicability of phages for the treatment of fire blight.

    PubMed

    Meczker, Katalin; Dömötör, Dóra; Vass, János; Rákhely, Gábor; Schneider, György; Kovács, Tamás

    2014-01-01

    The enterobacterium Erwinia amylovora is the causal agent of fire blight. This study presents the analysis of the complete genome of phage PhiEaH1, isolated from the soil surrounding an E. amylovora-infected apple tree in Hungary. Its genome is 218 kb in size, containing 244 ORFs. PhiEaH1 is the second E. amylovora-infecting phage from the Siphoviridae family whose complete genome sequence was determined. Besides PhiEaH2, PhiEaH1 is the other active component of Erwiphage, the first bacteriophage-based pesticide on the market against E. amylovora. Comparative genome analysis in this study has revealed that PhiEaH1 differs not only from the 10 formerly sequenced E. amylovora bacteriophages belonging to other phage families, but also from PhiEaH2. Sequencing of more Siphoviridae phage genomes might reveal further diversity, providing opportunities for the development of even more effective biological control agents, phage cocktails against Erwinia fire blight disease of commercial fruit crops.

  2. Lactococcus lactis Diversity in Undefined Mixed Dairy Starter Cultures as Revealed by Comparative Genome Analyses and Targeted Amplicon Sequencing of epsD.

    PubMed

    Frantzen, Cyril A; Kleppen, Hans Petter; Holo, Helge

    2018-02-01

    Undefined mesophilic mixed (DL) starter cultures are used in the production of continental cheeses and contain unknown strain mixtures of Lactococcus lactis and leuconostocs. The choice of starter culture affects the taste, aroma, and quality of the final product. To gain insight into the diversity of Lactococcus lactis strains in starter cultures, we whole-genome sequenced 95 isolates from three different starter cultures. Pan-genomic analyses, which included 30 publicly available complete genomes, grouped the strains into 21 L. lactis subsp. lactis and 28 L. lactis subsp. cremoris lineages. Only one of the 95 isolates grouped with previously sequenced strains, and the three starter cultures showed no overlap in lineage distributions. The culture diversity was assessed by targeted amplicon sequencing using purR, a core gene, and epsD, present in 93 of the 95 starter culture isolates but absent in most of the reference strains. This enabled an unprecedented discrimination of starter culture Lactococcus lactis and revealed substantial differences between the three starter cultures and compositional shifts during the cultivation of cultures in milk. IMPORTANCE: In contemporary cheese production, standardized frozen seed stock starter cultures are used to ensure production stability, reproducibility, and quality control of the product. The dairy industry experiences significant disruptions of cheese production due to phage attacks, and one commonly used countermeasure to phage attack is to employ a starter rotation strategy, in which two or more starters with minimal overlap in phage sensitivity are used alternately. A culture-independent analysis of the lactococcal diversity in complex undefined starter cultures revealed large differences between the three starter cultures and temporal shifts in lactococcal composition during the production of bulk starters. A better understanding of the lactococcal diversity in starter cultures will enable the development of

  3. Tales of diversity: Genomic and morphological characteristics of forty-six Arthrobacter phages

    PubMed Central

    Adair, Tamarah L.; Afram, Patricia; Allen, Katherine G.; Archambault, Megan L.; Aziz, Rahat M.; Bagnasco, Filippa G.; Ball, Sarah L.; Barrett, Natalie A.; Benjamin, Robert C.; Blasi, Christopher J.; Borst, Katherine; Braun, Mary A.; Broomell, Haley; Brown, Conner B.; Brynell, Zachary S.; Bue, Ashley B.; Burke, Sydney O.; Casazza, William; Cautela, Julia A.; Chen, Kevin; Chimalakonda, Nitish S.; Chudoff, Dylan; Connor, Jade A.; Cross, Trevor S.; Curtis, Kyra N.; Dahlke, Jessica A.; Deaton, Bethany M.; Degroote, Sarah J.; DeNigris, Danielle M.; DeRuff, Katherine C.; Dolan, Milan; Dunbar, David; Egan, Marisa S.; Evans, Daniel R.; Fahnestock, Abby K.; Farooq, Amal; Finn, Garrett; Fratus, Christopher R.; Gaffney, Bobby L.; Garlena, Rebecca A.; Garrigan, Kelly E.; Gibbon, Bryan C.; Goedde, Michael A.; Guerrero Bustamante, Carlos A.; Harrison, Melinda; Hartwell, Megan C.; Heckman, Emily L.; Huang, Jennifer; Hughes, Lee E.; Hyduchak, Kathryn M.; Jacob, Aswathi E.; Kaku, Machika; Karstens, Allen W.; Kenna, Margaret A.; Khetarpal, Susheel; King, Rodney A.; Kobokovich, Amanda L.; Kolev, Hannah; Konde, Sai A.; Kriese, Elizabeth; Lamey, Morgan E.; Lantz, Carter N.; Lapin, Jonathan S.; Lawson, Temiloluwa O.; Lee, In Young; Lee, Scott M.; Lee-Soety, Julia Y.; Lehmann, Emily M.; London, Shawn C.; Lopez, A. Javier; Lynch, Kelly C.; Mageeney, Catherine M.; Martynyuk, Tetyana; Mathew, Kevin J.; Mavrich, Travis N.; McDaniel, Christopher M.; McDonald, Hannah; McManus, C. Joel; Medrano, Jessica E.; Mele, Francis E.; Menninger, Jennifer E.; Miller, Sierra N.; Minick, Josephine E.; Nabua, Courtney T.; Napoli, Caroline K.; Nkangabwa, Martha; Oates, Elizabeth A.; Ott, Cassandra T.; Pellerino, Sarah K.; Pinamont, William J.; Pirnie, Ross T.; Pizzorno, Marie C.; Plautz, Emilee J.; Pope, Welkin H.; Pruett, Katelyn M.; Rickstrew, Gabbi; Rimple, Patrick A.; Rinehart, Claire A.; Robinson, Kayla M.; Rose, Victoria A.; Russell, Daniel A.; Schick, Amelia M.; Schlossman, Julia; Schneider, Victoria M.; Sells, Chloe A.; Sieker, Jeremy W.; Silva, Morgan P.; Silvi, Marissa M.; Simon, Stephanie E.; Staples, Amanda K.; Steed, Isabelle L.; Stowe, Emily L.; Stueven, Noah A.; Swartz, Porter T.; Sweet, Emma A.; Sweetman, Abigail T.; Tender, Corrina; Terry, Katrina; Thomas, Chrystal; Thomas, Daniel S.; Thompson, Allison R.; Vanderveen, Lorianna; Varma, Rohan; Vaught, Hannah L.; Vo, Quynh D.; Vonberg, Zachary T.; Ware, Vassie C.; Warrad, Yasmene M.; Wathen, Kaitlyn E.; Weinstein, Jonathan L.; Wyper, Jacqueline F.; Yankauskas, Jakob R.; Zhang, Christine

    2017-01-01

    The vast bacteriophage population harbors an immense reservoir of genetic information. Almost 2000 phage genomes have been sequenced from phages infecting hosts in the phylum Actinobacteria, and analysis of these genomes reveals substantial diversity, pervasive mosaicism, and novel mechanisms for phage replication and lysogeny. Here, we describe the isolation and genomic characterization of 46 phages from environmental samples at various geographic locations in the U.S. infecting a single Arthrobacter sp. strain. These phages include representatives of all three virion morphologies, and Jasmine is the first sequenced podovirus of an actinobacterial host. The phages also span considerable sequence diversity, and can be grouped into 10 clusters according to their nucleotide diversity, and two singletons each with no close relatives. However, the clusters/singletons appear to be genomically well separated from each other, and relatively few genes are shared between clusters. Genome size varies from among the smallest of siphoviral phages (15,319 bp) to over 70 kbp, and G+C contents range from 45–68%, compared to 63.4% for the host genome. Although temperate phages are common among other actinobacterial hosts, these Arthrobacter phages are primarily lytic, and only the singleton Galaxy is likely temperate. PMID:28715480

  4. Genome size analyses of Pucciniales reveal the largest fungal genomes.

    PubMed

    Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T; Loureiro, João; Talhinhas, Pedro

    2014-01-01

    Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now of 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.

  5. Genome size analyses of Pucciniales reveal the largest fungal genomes

    PubMed Central

    Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G.; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T.; Loureiro, João; Talhinhas, Pedro

    2014-01-01

    Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now of 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research. PMID:25206357

  6. Genomic Diversity of Lactobacillus salivarius

    PubMed Central

    Raftis, Emma J.; Salvetti, Elisa; Torriani, Sandra; Felis, Giovanna E.; O'Toole, Paul W.

    2011-01-01

    Strains of Lactobacillus salivarius are increasingly employed as probiotic agents for humans or animals. Despite the diversity of environmental sources from which they have been isolated, the genomic diversity of L. salivarius has been poorly characterized, and the implications of this diversity for strain selection have not been examined. To tackle this, we applied comparative genomic hybridization (CGH) and multilocus sequence typing (MLST) to 33 strains derived from humans, animals, or food. The CGH, based on total genome content, including small plasmids, identified 18 major regions of genomic variation, or hot spots for variation. Three major divisions were thus identified, with only a subset of the human isolates constituting an ecologically discernible group. Omission of the small plasmids from the CGH or analysis by MLST provided broadly concordant fine divisions and separated human-derived and animal-derived strains more clearly. The two gene clusters for exopolysaccharide (EPS) biosynthesis corresponded to regions of significant genomic diversity. The CGH-based groupings of these regions did not correlate with levels of production of bound or released EPS. Furthermore, EPS production was significantly modulated by available carbohydrate. In addition to proving difficult to predict from the gene content, EPS production levels correlated inversely with production of biofilms, a trait considered desirable in probiotic commensals. L. salivarius displays a high level of genomic diversity, and while selection of L. salivarius strains for probiotic use can be informed by CGH or MLST, it also requires pragmatic experimental validation of desired phenotypic traits. PMID:21131523

  7. European Chlamydia abortus livestock isolate genomes reveal unusual stability and limited diversity, reflected in geographical signatures.

    PubMed

    Seth-Smith, H M B; Busó, Leonor Sánchez; Livingstone, M; Sait, M; Harris, S R; Aitchison, K D; Vretou, Evangelia; Siarkou, V I; Laroucau, K; Sachse, K; Longbottom, D; Thomson, N R

    2017-05-04

    Chlamydia abortus (formerly Chlamydophila abortus) is an economically important livestock pathogen, causing ovine enzootic abortion (OEA), and can also cause zoonotic infections in humans affecting pregnancy outcome. Large-scale genomic studies on other chlamydial species are giving insights into the biology of these organisms but have not yet been performed on C. abortus. Our aim was to investigate a broad collection of European isolates of C. abortus, using next generation sequencing methods, looking at diversity, geographic distribution and genome dynamics. Whole genome sequencing was performed on our collection of 57 C. abortus isolates originating primarily from the UK, Germany, France and Greece, but also from Tunisia, Namibia and the USA. Phylogenetic analysis of a total of 64 genomes shows a deep structural division within the C. abortus species with a major clade displaying limited diversity, in addition to a branch carrying two more distantly related Greek isolates, LLG and POS. Within the major clade, seven further phylogenetic groups can be identified, demonstrating geographical associations. The number of variable nucleotide positions across the sampled isolates is significantly lower than those published for C. trachomatis and C. psittaci. No recombination was identified within C. abortus, and no plasmid was found. Analysis of pseudogenes showed lineage specific loss of some functions, notably with several Pmp and TMH/Inc proteins predicted to be inactivated in many of the isolates studied. The diversity within C. abortus appears to be much lower compared to other species within the genus. There are strong geographical signatures within the phylogeny, indicating clonal expansion within areas of limited livestock transport. No recombination has been identified within this species, showing that different species of Chlamydia may demonstrate different evolutionary dynamics, and that the genome of C. abortus is highly stable.

  8. Diversity, genetic mapping, and signatures of domestication in the carrot (Daucus carota L.) genome, as revealed by Diversity Arrays Technology (DArT) markers

    USDA-ARS's Scientific Manuscript database

    Carrot is one of the most economically important vegetables worldwide; however, genetic and genomic resources supporting carrot breeding remain limited. We developed a Diversity Arrays Technology (DArT) platform for wild and cultivated carrot and used it to investigate genetic diversity and to devel...

  9. Global Genomic Diversity of Oryza sativa Varieties Revealed by Comparative Physical Mapping

    PubMed Central

    Wang, Xiaoming; Kudrna, David A.; Pan, Yonglong; Wang, Hao; Liu, Lin; Lin, Haiyan; Zhang, Jianwei; Song, Xiang; Goicoechea, Jose Luis; Wing, Rod A.; Zhang, Qifa; Luo, Meizhong

    2014-01-01

    Bacterial artificial chromosome (BAC) physical maps embedding a large number of BAC end sequences (BESs) were generated for Oryza sativa ssp. indica varieties Minghui 63 (MH63) and Zhenshan 97 (ZS97) and were compared with the genome sequences of O. sativa ssp. japonica cv. Nipponbare and O. sativa ssp. indica cv. 93-11. The comparisons exhibited substantial diversity in terms of large structural variations and small substitutions and indels. Genome-wide BAC-sized and contig-sized structural variations were detected, and the shared variations were analyzed. In the expansion regions of the Nipponbare reference sequence, in comparison to the MH63 and ZS97 physical maps, as well as to the previously constructed 93-11 physical map, the amounts and types of the repeat contents, and the outputs of gene ontology analysis, were significantly different from those of the whole genome. Using the physical maps of four wild Oryza species from OMAP (http://www.omap.org) as a control, we detected many conserved and divergent regions related to the evolution process of O. sativa. Between the BESs of MH63 and ZS97 and the two reference sequences, a total of 1532 polymorphic simple sequence repeats (SSRs), 71,383 SNPs, 1767 multiple nucleotide polymorphisms, 6340 insertions, and 9137 deletions were identified. This study provides independent whole-genome resources for intra- and intersubspecies comparisons and functional genomics studies in O. sativa. Both the comparative physical maps and the GBrowse, which integrated the QTL and molecular markers from GRAMENE (http://www.gramene.org) with our physical maps and analysis results, are open to the public through our Web site (http://gresource.hzau.edu.cn/resource/resource.html). PMID:24424778

  10. Genomic Diversity of Type B3 Bacteriophages of Caulobacter crescentus.

    PubMed

    Ash, Kurt T; Drake, Kristina M; Gibbs, Whitney S; Ely, Bert

    2017-07-01

    The genomes of the type B3 bacteriophages that infect Caulobacter crescentus are among the largest phage genomes thus far deposited into GenBank with sizes over 200 kb. In this study, we introduce six new bacteriophage genomes which were obtained from phage collected from various water systems in the southeastern United States and from tropical locations across the globe. A comparative analysis of the 12 available genomes revealed a "core genome" which accounts for roughly 1/3 of these bacteriophage genomes and is predominately localized to the head, tail, and lysis gene regions. Despite being isolated from geographically distinct locations, the genomes of these bacteriophages are highly conserved in both genome sequence and gene order. We also identified the insertions, deletions, translocations, and horizontal gene transfer events which are responsible for the genomic diversity of this group of bacteriophages and demonstrated that these changes are not consistent with the idea that modular reassortment of genomes occurs in this group of bacteriophages.

  11. Hidden diversity revealed by genome-resolved metagenomics of iron-oxidizing microbial mats from Lō'ihi Seamount, Hawai'i.

    PubMed

    Fullerton, Heather; Hager, Kevin W; McAllister, Sean M; Moyer, Craig L

    2017-08-01

    The Zetaproteobacteria are ubiquitous in marine environments, yet this class of Proteobacteria is only represented by a few closely-related cultured isolates. In high-iron environments, such as diffuse hydrothermal vents, the Zetaproteobacteria are important members of the community driving its structure. Biogeography of Zetaproteobacteria has shown two ubiquitous operational taxonomic units (OTUs), yet much is unknown about their genomic diversity. Genome-resolved metagenomics allows for the specific binning of microbial genomes based on genomic signatures present in composite metagenome assemblies. This resulted in the recovery of 93 genome bins, of which 34 were classified as Zetaproteobacteria. Form II ribulose 1,5-bisphosphate carboxylase genes were recovered from nearly all the Zetaproteobacteria genome bins. In addition, the Zetaproteobacteria genome bins contain genes for uptake and utilization of bioavailable nitrogen, detoxification of arsenic, and a terminal electron acceptor adapted for low oxygen concentration. Our results also support the hypothesis of a Cyc2-like protein as the site for iron oxidation, now detected across a majority of the Zetaproteobacteria genome bins. Whole genome comparisons showed a high genomic diversity across the Zetaproteobacteria OTUs and genome bins that were previously unidentified by SSU rRNA gene analysis. A single lineage of cosmopolitan Zetaproteobacteria (zOTU 2) was found to be monophyletic, based on cluster analysis of average nucleotide identity and average amino acid identity comparisons. From these data, we can begin to pinpoint genomic adaptations of the more ecologically ubiquitous Zetaproteobacteria, and further understand their environmental constraints and metabolic potential.

  12. Genetic Competence Drives Genome Diversity in Bacillus subtilis

    PubMed Central

    Chevreux, Bastien; Serra, Cláudia R; Schyns, Ghislain; Henriques, Adriano O

    2018-01-01

    Prokaryote genomes are the result of a dynamic flux of genes, with increases achieved via horizontal gene transfer and reductions occurring through gene loss. The ecological and selective forces that drive this genomic flexibility vary across species. Bacillus subtilis is a naturally competent bacterium that occupies various environments, including plant-associated, soil, and marine niches, and the gut of both invertebrates and vertebrates. Here, we quantify the genomic diversity of B. subtilis and infer the genome dynamics that explain the high genetic and phenotypic diversity observed. Phylogenomic and comparative genomic analyses of 42 B. subtilis genomes uncover a remarkable genome diversity that translates into a core genome of 1,659 genes and an asymptotic pangenome growth rate of 57 new genes per new genome added. This diversity is due to a large proportion of low-frequency genes that are acquired from closely related species. We find no gene-loss bias among wild isolates, which explains why the cloud genome, 43% of the species pangenome, represents only a small proportion of each genome. We show that B. subtilis can acquire xenologous copies of core genes that propagate laterally among strains within a niche. While not excluding the contributions of other mechanisms, our results strongly suggest a process of gene acquisition that is largely driven by competence, where the long-term maintenance of acquired genes depends on local and global fitness effects. This competence-driven genomic diversity provides B. subtilis with its generalist character, enabling it to occupy a wide range of ecological niches and cycle through them. PMID:29272410
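
    The core, pan- and cloud-genome terms used in the abstract above can be illustrated with a minimal sketch. The gene-family sets, the strain names and the 34% cloud cutoff below are hypothetical toy values chosen for illustration; they are not the data or the thresholds used by the authors.

```python
from collections import Counter

# Hypothetical toy data: genome name -> set of gene-family identifiers.
genomes = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
    "strain_C": {"g1", "g2", "g6"},
}


def pan_and_core(genomes):
    """Return (pangenome, core genome) as sets of gene families."""
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)          # families found in any genome
    core = set.intersection(*gene_sets)    # families found in every genome
    return pan, core


def cloud_genes(genomes, max_fraction=0.34):
    """Gene families present in at most max_fraction of genomes.

    The cutoff is an assumed, illustrative value.
    """
    counts = Counter(g for genes in genomes.values() for g in genes)
    n = len(genomes)
    return {g for g, c in counts.items() if c / n <= max_fraction}


pan, core = pan_and_core(genomes)
cloud = cloud_genes(genomes)
print(len(pan), sorted(core), sorted(cloud))
```

    In this toy set the cloud genes make up half of the pangenome while each genome carries only one of them, which mirrors the pattern the abstract describes: a large cloud fraction of the pangenome can still be a small share of any single genome.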

  13. Low diversity, activity, and density of transposable elements in five avian genomes.

    PubMed

    Gao, Bo; Wang, Saisai; Wang, Yali; Shen, Dan; Xue, Songlei; Chen, Cai; Cui, Hengmi; Song, Chengyi

    2017-07-01

    In this study, we analyzed the activity, diversity, and density of transposable elements (TEs) across five avian genomes (budgerigar, chicken, turkey, medium ground finch, and zebra finch) to explore potential reasons for the small genome sizes of birds. We found that these avian genomes exhibited a low density of TEs, at about 10% genome coverage, and low TE diversity, with TE landscapes dominated by CR1 and ERV elements; contrasting proliferation dynamics were observed both between TE types and between species across the five avian genomes. Phylogenetic analysis revealed that the CR1 clade was more diverse in family structure than the R2 clade in birds; avian ERVs were classified into four clades (alpha, beta, gamma, and ERV-L) and belonged to three classes of ERV, with an uneven distribution across these lineages. The activities of DNA and SINE TEs were very low over the evolutionary history of avian genomes; most LINEs and LTRs were ancient copies whose activity has decreased substantially in recent times, with only LTRs and LINEs in chicken and zebra finch exhibiting weak, very recent activity, and very few TEs were intact. However, recent activity may be underestimated in some species owing to the sequencing and assembly technologies used. Overall, this study demonstrates low diversity, activity, and density of TEs in the five avian species; highlights the differences in TEs among these lineages; and suggests that the current and recent activity of TEs in avian genomes is very limited, which may be one of the reasons for the small genome sizes of birds.

  14. Vibrio cholerae genomic diversity within and between patients

    PubMed Central

    Levade, Inès; Terrat, Yves; Leducq, Jean-Baptiste; Weil, Ana A.; Mayo-Smith, Leslie M.; Chowdhury, Fahima; Khan, Ashraful I.; Boncy, Jacques; Buteau, Josiane; Ivers, Louise C.; Ryan, Edward T.; Charles, Richelle C.; Calderwood, Stephen B.; Qadri, Firdausi; Harris, Jason B.; LaRocque, Regina C.

    2017-01-01

    Cholera is a severe, water-borne diarrhoeal disease caused by toxin-producing strains of the bacterium Vibrio cholerae. Comparative genomics has revealed ‘waves’ of cholera transmission and evolution, in which clones are successively replaced over decades and centuries. However, the extent of V. cholerae genetic diversity within an epidemic or even within an individual patient is poorly understood. Here, we characterized V. cholerae genomic diversity at a micro-epidemiological level within and between individual patients from Bangladesh and Haiti. To capture within-patient diversity, we isolated multiple (8 to 20) V. cholerae colonies from each of eight patients, sequenced their genomes and identified point mutations and gene gain/loss events. We found limited but detectable diversity at the level of point mutations within hosts (zero to three single nucleotide variants within each patient), and comparatively higher gene content variation within hosts (at least one gain/loss event per patient, and up to 103 events in one patient). Much of the gene content variation appeared to be due to gain and loss of phage and plasmids within the V. cholerae population, with occasional exchanges between V. cholerae and other members of the gut microbiota. We also show that certain intra-host variants have phenotypic consequences. For example, the acquisition of a Bacteroides plasmid and non-synonymous mutations in a sensor histidine kinase gene both reduced biofilm formation, an important trait for environmental survival. Together, our results show that V. cholerae is measurably evolving within patients, with possible implications for disease outcomes and transmission dynamics. PMID:29306353
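
    As a rough illustration of what counting single nucleotide variants within a patient involves, the sketch below finds the alignment columns at which colonies from one host differ. The sequences are hypothetical toy strings and the function name is ours; real pipelines (including, presumably, the one used in the study) work from read mapping and variant calling with quality filters rather than from a direct column comparison.

```python
def variable_sites(aligned_seqs):
    """Return 0-based alignment columns at which the isolates differ.

    Assumes equal-length, already aligned sequences; gaps and ambiguous
    bases are treated like any other character in this simple sketch.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i in range(length)
            if len({s[i] for s in aligned_seqs}) > 1]


# Hypothetical colonies isolated from a single patient.
colonies = ["ACGTAC", "ACGTAC", "ACTTAC", "ACGTAA"]
sites = variable_sites(colonies)
print(len(sites), sites)
```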

  15. Reduced representation approaches to interrogate genome diversity in large repetitive plant genomes.

    PubMed

    Hirsch, Cory D; Evans, Joseph; Buell, C Robin; Hirsch, Candice N

    2014-07-01

    Technology and software improvements in the last decade now provide methodologies to access the genome sequence of not only a single accession, but also multiple accessions of plant species. This provides a means to interrogate species diversity at the genome level. Ample diversity among accessions in a collection of species can be found, including single-nucleotide polymorphisms, insertions and deletions, copy number variation and presence/absence variation. For species with small, non-repetitive rich genomes, re-sequencing of query accessions is robust, highly informative, and economically feasible. However, for species with moderate to large sized repetitive-rich genomes, technical and economic barriers prevent en masse genome re-sequencing of accessions. Multiple approaches to access a focused subset of loci in species with larger genomes have been developed, including reduced representation sequencing, exome capture and transcriptome sequencing. Collectively, these approaches have enabled interrogation of diversity on a genome scale for large plant genomes, including crop species important to worldwide food security. © The Author 2014. Published by Oxford University Press.

  16. Comparison of 26 Sphingomonad Genomes Reveals Diverse Environmental Adaptations and Biodegradative Capabilities

    PubMed Central

    Aylward, Frank O.; McDonald, Bradon R.; Adams, Sandra M.; Valenzuela, Alejandra; Schmidt, Rebeccah A.; Goodwin, Lynne A.; Woyke, Tanja; Currie, Cameron R.; Suen, Garret

    2013-01-01

    Sphingomonads comprise a physiologically versatile group within the Alphaproteobacteria that includes strains of interest for biotechnology, human health, and environmental nutrient cycling. In this study, we compared 26 sphingomonad genome sequences to gain insight into their ecology, metabolic versatility, and environmental adaptations. Our multilocus phylogenetic and average amino acid identity (AAI) analyses confirm that Sphingomonas, Sphingobium, Sphingopyxis, and Novosphingobium are well-resolved monophyletic groups with the exception of Sphingomonas sp. strain SKA58, which we propose belongs to the genus Sphingobium. Our pan-genomic analysis of sphingomonads reveals numerous species-specific open reading frames (ORFs) but few signatures of genus-specific cores. The organization and coding potential of the sphingomonad genomes appear to be highly variable, and plasmid-mediated gene transfer and chromosome-plasmid recombination, together with prophage- and transposon-mediated rearrangements, appear to play prominent roles in the genome evolution of this group. We find that many of the sphingomonad genomes encode numerous oxygenases and glycoside hydrolases, which are likely responsible for their ability to degrade various recalcitrant aromatic compounds and polysaccharides, respectively. Many of these enzymes are encoded on megaplasmids, suggesting that they may be readily transferred between species. We also identified enzymes putatively used for the catabolism of sulfonate and nitroaromatic compounds in many of the genomes, suggesting that plant-based compounds or chemical contaminants may be sources of nitrogen and sulfur. Many of these sphingomonads appear to be adapted to oligotrophic environments, but several contain genomic features indicative of host associations. Our work provides a basis for understanding the ecological strategies employed by sphingomonads and their role in environmental nutrient cycling. PMID:23563954

  17. Diverse circovirus-like genome architectures revealed by environmental metagenomics.

    PubMed

    Rosario, Karyna; Duffy, Siobain; Breitbart, Mya

    2009-10-01

    Single-stranded DNA (ssDNA) viruses with circular genomes are the smallest viruses known to infect eukaryotes. The present study identified 10 novel genomes similar to ssDNA circoviruses through data-mining of public viral metagenomes. The metagenomic libraries included samples from reclaimed water and three different marine environments (Chesapeake Bay, British Columbia coastal waters and Sargasso Sea). All the genomes have similarities to the replication (Rep) protein of circoviruses; however, only half have genomic features consistent with known circoviruses. Some of the genomes exhibit a mixture of genomic features associated with different families of ssDNA viruses (i.e. circoviruses, geminiviruses and parvoviruses). Unique genome architectures and phylogenetic analysis of the Rep protein suggest that these viruses belong to novel genera and/or families. Investigating the complex community of ssDNA viruses in the environment can lead to the discovery of divergent species and help elucidate evolutionary links between ssDNA viruses.

  18. Fundamental differences in diversity and genomic population structure between Atlantic and Pacific Prochlorococcus.

    PubMed

    Kashtan, Nadav; Roggensack, Sara E; Berta-Thompson, Jessie W; Grinberg, Maor; Stepanauskas, Ramunas; Chisholm, Sallie W

    2017-09-01

    The Atlantic and Pacific Oceans represent different biogeochemical regimes in which the abundant marine cyanobacterium Prochlorococcus thrives. We have shown that Prochlorococcus populations in the Atlantic are composed of hundreds of genomically, and likely ecologically, distinct coexisting subpopulations with distinct genomic backbones. Here we ask if differences in the ecology and selection pressures between the Atlantic and Pacific are reflected in the diversity and genomic composition of their indigenous Prochlorococcus populations. We applied large-scale single-cell genomics and compared the cell-by-cell genomic composition of wild populations of co-occurring cells from samples from Station ALOHA off Hawaii, and from Bermuda Atlantic Time Series Station off Bermuda. We reveal fundamental differences in diversity and genomic structure of populations between the sites. The Pacific populations are more diverse than those in the Atlantic, composed of significantly more coexisting subpopulations and lacking dominant subpopulations. Prochlorococcus from the two sites seem to be composed of mostly non-overlapping distinct sets of subpopulations with different genomic backbones, likely reflecting different sets of ocean-specific micro-niches. Furthermore, phylogenetically closely related strains carry ocean-associated nutrient acquisition genes likely reflecting differences in major selection pressures between the oceans. This differential selection, along with geographic separation, clearly has a significant role in shaping these populations.

  19. Genomic patterns of species diversity and divergence in Eucalyptus.

    PubMed

    Hudson, Corey J; Freeman, Jules S; Myburg, Alexander A; Potts, Brad M; Vaillancourt, René E

    2015-06-01

    We examined genome-wide patterns of DNA sequence diversity and divergence among six species of the important tree genus Eucalyptus and investigated their relationship with genomic architecture. Using c. 90 range-wide individuals of each Eucalyptus species (E. grandis, E. urophylla, E. globulus, E. nitens, E. dunnii and E. camaldulensis), genetic diversity and divergence were estimated from 2840 polymorphic diversity arrays technology markers covering the 11 chromosomes. Species differentiating markers (SDMs) identified in each of 15 pairwise species comparisons, along with species diversity (HHW) and divergence (FST), were projected onto the E. grandis reference genome. Across all species comparisons, SDMs totalled 1.1-5.3% of markers and were widely distributed throughout the genome. Marker divergence (FST and SDMs) and diversity differed among and within chromosomes. Patterns of diversity and divergence were broadly conserved across species and significantly associated with genomic features, including the proximity of markers to genes, the relative number of clusters of tandem duplications, and gene density within or among chromosomes. These results suggest that genomic architecture influences patterns of species diversity and divergence in the genus. This influence is evident across the six species, encompassing diverse phylogenetic lineages, geography and ecology. © 2015 University of Tasmania New Phytologist © 2015 New Phytologist Trust.
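
    For readers unfamiliar with the divergence statistic named above, the following is a minimal sketch of Wright's fixation index for a single biallelic marker, computed as (H_T - H_S) / H_T from per-population allele frequencies. It assumes equally weighted populations and known allele frequencies, and the function name and example frequencies are hypothetical. The markers in the study are diversity arrays technology markers, so the authors' estimator very likely differs from this textbook form.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs: allele frequency of the same allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("need at least one population")
    # Mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    # Expected heterozygosity of the pooled population.
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is monomorphic overall; no divergence to measure
    return (h_t - h_s) / h_t


# Two hypothetical species with allele frequencies 0.2 and 0.8.
example = fst_biallelic([0.2, 0.8])
print(round(example, 2))
```

    With frequencies of 0.2 and 0.8 the within-population heterozygosity is 0.32 and the pooled heterozygosity is 0.5, giving a value of about 0.36; identical frequencies give 0, and fixed alternative alleles give 1.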

  20. Genetic variability of mutans streptococci revealed by wide whole-genome sequencing

    PubMed Central

    2013-01-01

    Background Mutans streptococci are a group of bacteria significantly contributing to tooth decay. Their genetic variability is, however, still not well understood. Results Genomes of 6 clinical S. mutans isolates of different origins, one isolate of S. sobrinus (DSM 20742) and one isolate of S. ratti (DSM 20564) were sequenced and comparatively analyzed. Genome alignment revealed a mosaic-like structure of genome arrangement. Genes related to pathogenicity were found to vary greatly among the strains, whereas genes for oxidative stress resistance are well conserved, indicating the importance of this trait in the dental biofilm community. Analysis of genome-scale metabolic networks revealed significant differences in 42 pathways. A striking dissimilarity is the unique presence of two lactate oxidases in S. sobrinus DSM 20742, probably indicating an unusual capability of this strain in producing H2O2 and expanding its ecological niche. In addition, lactate oxidases may form, together with other enzymes, a novel energetic pathway in S. sobrinus DSM 20742 that can remedy its deficiency in the citrate utilization pathway. Using 67 S. mutans genomes currently available, including the strains sequenced in this study, we estimated the theoretical core genome size of S. mutans and modeled the S. mutans pan-genome by applying different fitting models. An “open” pan-genome was inferred. Conclusions The comparative genome analyses revealed diversity in the mutans streptococci group, especially with respect to the virulence-related genes and metabolic pathways. The results are helpful for better understanding the evolution and adaptive mechanisms of these oral pathogens and for combating them. PMID:23805886

  1. Genomic diversity and introgression in O. sativa reveal the impact of domestication and breeding on the rice genome.

    PubMed

    Zhao, Keyan; Wright, Mark; Kimball, Jennifer; Eizenga, Georgia; McClung, Anna; Kovach, Michael; Tyagi, Wricha; Ali, Md Liakat; Tung, Chih-Wei; Reynolds, Andy; Bustamante, Carlos D; McCouch, Susan R

    2010-05-24

    The domestication of Asian rice (Oryza sativa) was a complex process punctuated by episodes of introgressive hybridization among and between subpopulations. Deep genetic divergence between the two main varietal groups (Indica and Japonica) suggests domestication from at least two distinct wild populations. However, genetic uniformity surrounding key domestication genes across divergent subpopulations suggests cultural exchange of genetic material among ancient farmers. In this study, we utilize a novel 1,536 SNP panel genotyped across 395 diverse accessions of O. sativa to study genome-wide patterns of polymorphism, to characterize population structure, and to infer the introgression history of domesticated Asian rice. Our population structure analyses support the existence of five major subpopulations (indica, aus, tropical japonica, temperate japonica and GroupV) consistent with previous analyses. Our introgression analysis shows that most accessions exhibit some degree of admixture, with many individuals within a population sharing the same introgressed segment due to artificial selection. Admixture mapping and association analysis of amylose content and grain length illustrate the potential for dissecting the genetic basis of complex traits in domesticated plant populations. Genes in these regions control a myriad of traits including plant stature, blast resistance, and amylose content. These analyses highlight the power of population genomics in agricultural systems to identify functionally important regions of the genome and to decipher the role of human-directed breeding in refashioning the genomes of a domesticated species.

  2. Genome Diversity and Evolution in the Budding Yeasts (Saccharomycotina)

    PubMed Central

    Dujon, Bernard A.; Louis, Edward J.

    2017-01-01

    Considerable progress in our understanding of yeast genomes and their evolution has been made over the last decade with the sequencing, analysis, and comparisons of numerous species, strains, or isolates of diverse origins. The role played by yeasts in natural environments as well as in artificial manufactures, combined with the importance of some species as model experimental systems, sustained this effort. At the same time, their enormous evolutionary diversity (there are yeast species in every subphylum of Dikarya) sparked curiosity but necessitated further efforts to obtain appropriate reference genomes. Today, yeast genomes have been very informative about basic mechanisms of evolution, speciation, hybridization, domestication, as well as about the molecular machineries underlying them. They are also irreplaceable to investigate in detail the complex relationship between genotypes and phenotypes with both theoretical and practical implications. This review examines these questions at two distinct levels offered by the broad evolutionary range of yeasts: inside the best-studied Saccharomyces species complex, and across the entire and diversified subphylum of Saccharomycotina. While obviously revealing evolutionary histories at different scales, data converge to a remarkably coherent picture in which one can estimate the relative importance of intrinsic genome dynamics, including gene birth and loss, vs. horizontal genetic accidents in the making of populations. The facility with which novel yeast genomes can now be studied, combined with the already numerous available reference genomes, offers privileged perspectives to further examine these fundamental biological questions using yeasts both as eukaryotic models and as fungi of practical importance. PMID:28592505

  3. Genomic comparison of multi-drug resistant invasive and colonizing Acinetobacter baumannii isolated from diverse human body sites reveals genomic plasticity.

    PubMed

    Sahl, Jason W; Johnson, J Kristie; Harris, Anthony D; Phillippy, Adam M; Hsiao, William W; Thom, Kerri A; Rasko, David A

    2011-06-04

    Acinetobacter baumannii has recently emerged as a significant global pathogen, with a surprisingly rapid acquisition of antibiotic resistance and spread within hospitals and health care institutions. This study examines the genomic content of three A. baumannii strains isolated from distinct body sites. Isolates from blood, peri-anal, and wound sources were examined in an attempt to identify genetic features that could be correlated to each isolation source. Pulsed-field gel electrophoresis, multi-locus sequence typing and antibiotic resistance profiles demonstrated genotypic and phenotypic variation. Each isolate was sequenced to high-quality draft status, which allowed for comparative genomic analyses with existing A. baumannii genomes. A high resolution, whole genome alignment method detailed the phylogenetic relationships of sequenced A. baumannii and found no correlation between phylogeny and body site of isolation. This method identified genomic regions unique to both those isolates found on the surface of the skin or in wounds, termed colonization isolates, and those identified from body fluids, termed invasive isolates; these regions may play a role in the pathogenesis and spread of this important pathogen. A PCR-based screen of 74 A. baumannii isolates demonstrated that these unique genes are not exclusive to either phenotype or isolation source; however, a conserved genomic region exclusive to all sequenced A. baumannii was identified and verified. The results of the comparative genome analysis and PCR assay show that A. baumannii is a diverse and genomically variable pathogen that appears to have the potential to cause a range of human disease regardless of the isolation source.

  4. Comparative genomics of wild type yeast strains unveils important genome diversity

    PubMed Central

    Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

    2008-01-01

    Background: Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results: In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion: We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlight the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome.

  5. A Near-Complete Haplotype-Phased Genome of the Dikaryotic Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici Reveals High Interhaplotype Diversity

    PubMed Central

    Sperschneider, Jana; Garnica, Diana P.; Miller, Marisa E.; Taylor, Jennifer M.; Dodds, Peter N.; Park, Robert F.

    2018-01-01

    A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and identify virulence genes involved in plant infection referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies. PMID:29463659

  6. Prophage Integrase Typing Is a Useful Indicator of Genomic Diversity in Salmonella enterica

    PubMed Central

    Colavecchio, Anna; D’Souza, Yasmin; Tompkins, Elizabeth; Jeukens, Julie; Freschi, Luca; Emond-Rheault, Jean-Guillaume; Kukavica-Ibrulj, Irena; Boyle, Brian; Bekal, Sadjia; Tamber, Sandeep; Levesque, Roger C.; Goodridge, Lawrence D.

    2017-01-01

    Salmonella enterica is a bacterial species that is a major cause of illness in humans and food-producing animals. S. enterica exhibits considerable inter-serovar diversity, as evidenced by the large number of host adapted serovars that have been identified. The development of methods to assess genome diversity in S. enterica will help to further define the limits of diversity in this foodborne pathogen. Thus, we evaluated a PCR assay, which targets prophage integrase genes, as a rapid method to investigate S. enterica genome diversity. To evaluate the PCR prophage integrase assay, 49 isolates of S. enterica were selected, including 19 clinical isolates from clonal serovars (Enteritidis and Heidelberg) that commonly cause human illness, and 30 isolates from food-associated Salmonella serovars that rarely cause human illness. The number of integrase genes identified by the PCR assay was compared to the number of integrase genes within intact prophages identified by whole genome sequencing and phage finding program PHASTER. The PCR assay identified a total of 147 prophage integrase genes within the 49 S. enterica genomes (79 integrase genes in the food-associated Salmonella isolates, 50 integrase genes in S. Enteritidis, and 18 integrase genes in S. Heidelberg). In comparison, whole genome sequencing and PHASTER identified a total of 75 prophage integrase genes within 102 intact prophages in the 49 S. enterica genomes (44 integrase genes in the food-associated Salmonella isolates, 21 integrase genes in S. Enteritidis, and 9 integrase genes in S. Heidelberg). Collectively, both the PCR assay and PHASTER identified the presence of a large diversity of prophage integrase genes in the food-associated isolates compared to the clinical isolates, thus indicating a high degree of diversity in the food-associated isolates, and confirming the clonal nature of S. Enteritidis and S. Heidelberg. Moreover, PHASTER revealed a diversity of 29 different types of prophages and 23

  7. Prophage Integrase Typing Is a Useful Indicator of Genomic Diversity in Salmonella enterica.

    PubMed

    Colavecchio, Anna; D'Souza, Yasmin; Tompkins, Elizabeth; Jeukens, Julie; Freschi, Luca; Emond-Rheault, Jean-Guillaume; Kukavica-Ibrulj, Irena; Boyle, Brian; Bekal, Sadjia; Tamber, Sandeep; Levesque, Roger C; Goodridge, Lawrence D

    2017-01-01

    Salmonella enterica is a bacterial species that is a major cause of illness in humans and food-producing animals. S. enterica exhibits considerable inter-serovar diversity, as evidenced by the large number of host adapted serovars that have been identified. The development of methods to assess genome diversity in S. enterica will help to further define the limits of diversity in this foodborne pathogen. Thus, we evaluated a PCR assay, which targets prophage integrase genes, as a rapid method to investigate S. enterica genome diversity. To evaluate the PCR prophage integrase assay, 49 isolates of S. enterica were selected, including 19 clinical isolates from clonal serovars (Enteritidis and Heidelberg) that commonly cause human illness, and 30 isolates from food-associated Salmonella serovars that rarely cause human illness. The number of integrase genes identified by the PCR assay was compared to the number of integrase genes within intact prophages identified by whole genome sequencing and phage finding program PHASTER. The PCR assay identified a total of 147 prophage integrase genes within the 49 S. enterica genomes (79 integrase genes in the food-associated Salmonella isolates, 50 integrase genes in S. Enteritidis, and 18 integrase genes in S. Heidelberg). In comparison, whole genome sequencing and PHASTER identified a total of 75 prophage integrase genes within 102 intact prophages in the 49 S. enterica genomes (44 integrase genes in the food-associated Salmonella isolates, 21 integrase genes in S. Enteritidis, and 9 integrase genes in S. Heidelberg). Collectively, both the PCR assay and PHASTER identified the presence of a large diversity of prophage integrase genes in the food-associated isolates compared to the clinical isolates, thus indicating a high degree of diversity in the food-associated isolates, and confirming the clonal nature of S. Enteritidis and S. Heidelberg. Moreover, PHASTER revealed a diversity of 29 different types of prophages and 23

  8. Genomic diversity of the human intestinal parasite Entamoeba histolytica

    PubMed Central

    2012-01-01

    Background: Entamoeba histolytica is a significant cause of disease worldwide. However, little is known about the genetic diversity of the parasite. We re-sequenced the genomes of ten laboratory cultured lines of the eukaryotic pathogen Entamoeba histolytica in order to develop a picture of genetic diversity across the genome. Results: The extreme nucleotide composition bias and repetitiveness of the E. histolytica genome provide a challenge for short-read mapping, yet we were able to define putative single nucleotide polymorphisms in a large portion of the genome. The results suggest a rather low level of single nucleotide diversity, although genes and gene families with putative roles in virulence are among the more polymorphic genes. We did observe large differences in coverage depth among genes, indicating differences in gene copy number between genomes. We found evidence indicating that recombination has occurred in the history of the sequenced genomes, suggesting that E. histolytica may reproduce sexually. Conclusions: E. histolytica displays a relatively low level of nucleotide diversity across its genome. However, large differences in gene family content and gene copy number are seen among the sequenced genomes. The pattern of polymorphism indicates that E. histolytica reproduces sexually, or has done so in the past, which has previously been suggested but not proven. PMID:22630046

  9. Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes.

    PubMed

    Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong

    2014-12-19

    Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity, fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, fastest exon shuffling in metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and the vertebrate evolution.

  10. Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia

    PubMed Central

    Capilla, Laia; Sánchez-Guillén, Rosa Ana; Farré, Marta; Paytuví-Gallart, Andreu; Malinverni, Roberto; Ventura, Jacint; Larkin, Denis M.

    2016-01-01

    Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to the dynamics of its composition, evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in other mammalian species here considered. We identified novel lineage and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction and pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. Second, that chromatin conformation, an aspect that has been often overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints. PMID:28175287

  11. Mammalian Comparative Genomics Reveals Genetic and Epigenetic Features Associated with Genome Reshuffling in Rodentia.

    PubMed

    Capilla, Laia; Sánchez-Guillén, Rosa Ana; Farré, Marta; Paytuví-Gallart, Andreu; Malinverni, Roberto; Ventura, Jacint; Larkin, Denis M; Ruiz-Herrera, Aurora

    2016-12-01

    Understanding how mammalian genomes have been reshuffled through structural changes is fundamental to the dynamics of its composition, evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in other mammalian species here considered. We identified novel lineage and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina genomic associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction and pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. Second, that chromatin conformation, an aspect that has been often overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints.

  12. Reduced representation genome sequencing reveals patterns of genetic diversity and selection in apple.

    PubMed

    Ma, Baiquan; Liao, Liao; Peng, Qian; Fang, Ting; Zhou, Hui; Korban, Schuyler S; Han, Yuepeng

    2017-03-01

    Identifying DNA sequence variations is a fundamental step towards deciphering the genetic basis of traits of interest. Here, a total of 20 cultivated and 10 wild apples were genotyped using specific-locus amplified fragment sequencing, and 39,635 single nucleotide polymorphisms with no missing genotypes and evenly distributed along the genome were selected to investigate patterns of genome-wide genetic variations between cultivated and wild apples. Overall, wild apples displayed higher levels of genetic diversity than cultivated apples. Linkage disequilibrium (LD) decayed quite rapidly in both cultivated and wild apples, with r² falling below 0.2 at 440 and 280 bp, respectively. Moreover, bidirectional gene flow and different distribution patterns of LD blocks were detected between domesticated and wild apples. Most LD blocks unique to cultivated apples were located within QTL regions controlling fruit quality, thus suggesting that fruit quality had probably undergone selection during apple domestication. The genome of the earliest cultivated apple in China, Nai, was highly similar to that of Malus sieversii, and contained a small portion of genetic material from other wild apple species. This suggested that introgression could have been an important driving force during initial domestication of apple. These findings will facilitate future breeding and genetic dissection of complex traits in apple. © 2017 Institute of Botany, Chinese Academy of Sciences.
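
    The LD statistic quoted in this abstract can be illustrated with a minimal sketch. It assumes phased haplotypes at two biallelic loci coded 0/1 and uses the standard definition r² = D² / (pA(1 - pA) pB(1 - pB)) with D = pAB - pA pB. The function name and the toy haplotypes are hypothetical and are not taken from the study, whose own LD pipeline is not described here.

```python
from typing import Sequence, Tuple


def ld_r2(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Return r^2 between two biallelic loci from phased haplotypes coded 0/1."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n  # 1-1 haplotype frequency
    d = p_ab - p_a * p_b                             # coefficient of disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical toy data: eight phased haplotypes (locus A allele, locus B allele).
toy_haplotypes = [(1, 1), (1, 1), (0, 0), (0, 0), (1, 0), (0, 1), (1, 1), (0, 0)]
print(ld_r2(toy_haplotypes))
```

    An LD decay curve such as the one summarized above is typically obtained by computing this quantity for many SNP pairs and relating it to the physical distance between them.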

  13. Gekko japonicus genome reveals evolution of adhesive toe pads and tail regeneration

    PubMed Central

    Liu, Yan; Zhou, Qian; Wang, Yongjun; Luo, Longhai; Yang, Jian; Yang, Linfeng; Liu, Mei; Li, Yingrui; Qian, Tianmei; Zheng, Yuan; Li, Meiyuan; Li, Jiang; Gu, Yun; Han, Zujing; Xu, Man; Wang, Yingjie; Zhu, Changlai; Yu, Bin; Yang, Yumin; Ding, Fei; Jiang, Jianping; Yang, Huanming; Gu, Xiaosong

    2015-01-01

    Reptiles are the most morphologically and physiologically diverse tetrapods, and have undergone 300 million years of adaptive evolution. Within the reptilian tetrapods, geckos possess several interesting features, including the ability to regenerate autotomized tails and to climb on smooth surfaces. Here we sequence the genome of Gekko japonicus (Schlegel's Japanese Gecko) and investigate genetic elements related to its physiology. We obtain a draft G. japonicus genome sequence of 2.55 Gb and annotated 22,487 genes. Comparative genomic analysis reveals specific gene family expansions or reductions that are associated with the formation of adhesive setae, nocturnal vision and tail regeneration, as well as the diversification of olfactory sensation. The obtained genomic data provide robust genetic evidence of adaptive evolution in reptiles. PMID:26598231

  14. Pantoea ananatis Genetic Diversity Analysis Reveals Limited Genomic Diversity as Well as Accessory Genes Correlated with Onion Pathogenicity.

    PubMed

    Stice, Shaun P; Stumpf, Spencer D; Gitaitis, Ron D; Kvitko, Brian H; Dutta, Bhabesh

    2018-01-01

    Pantoea ananatis is a member of the family Enterobacteriaceae and an enigmatic plant pathogen with a broad host range. Although P. ananatis strains can be aggressive on onion causing foliar necrosis and onion center rot, previous genomic analysis has shown that P. ananatis lacks the primary virulence secretion systems associated with other plant pathogens. We assessed a collection of fifty P. ananatis strains collected from Georgia over three decades to determine genetic factors that correlated with onion pathogenic potential. Previous genetic analysis studies have compared strains isolated from different hosts with varying disease potential and isolation sources. Strains varied greatly in their pathogenic potential and aggressiveness on different cultivated Allium species like onion, leek, shallot, and chive. Using multi-locus sequence analysis (MLSA) and repetitive extragenic palindrome repeat (rep)-PCR techniques, we did not observe any correlation between onion pathogenic potential and genetic diversity among strains. Whole genome sequencing and pan-genomic analysis of a sub-set of 10 strains aided in the identification of a novel series of genetic regions, likely plasmid borne, and correlating with onion pathogenicity observed on single contigs of the genetic assemblies. We named these loci Onion Virulence Regions (OVR) A-D. The OVR loci contain genes involved in redox regulation as well as pectate lyase and rhamnogalacturonase genes. Previous studies have not identified distinct genetic loci or plasmids correlating with onion foliar pathogenicity or pathogenicity on a single host pathosystem. The lack of focus on a single host system for this phytopathogenic disease necessitates the pan-genomic analysis performed in this study.

  15. Pantoea ananatis Genetic Diversity Analysis Reveals Limited Genomic Diversity as Well as Accessory Genes Correlated with Onion Pathogenicity

    PubMed Central

    Stice, Shaun P.; Stumpf, Spencer D.; Gitaitis, Ron D.; Kvitko, Brian H.; Dutta, Bhabesh

    2018-01-01

    Pantoea ananatis is a member of the family Enterobacteriaceae and an enigmatic plant pathogen with a broad host range. Although P. ananatis strains can be aggressive on onion causing foliar necrosis and onion center rot, previous genomic analysis has shown that P. ananatis lacks the primary virulence secretion systems associated with other plant pathogens. We assessed a collection of fifty P. ananatis strains collected from Georgia over three decades to determine genetic factors that correlated with onion pathogenic potential. Previous genetic analysis studies have compared strains isolated from different hosts with varying disease potential and isolation sources. Strains varied greatly in their pathogenic potential and aggressiveness on different cultivated Allium species like onion, leek, shallot, and chive. Using multi-locus sequence analysis (MLSA) and repetitive extragenic palindrome repeat (rep)-PCR techniques, we did not observe any correlation between onion pathogenic potential and genetic diversity among strains. Whole genome sequencing and pan-genomic analysis of a sub-set of 10 strains aided in the identification of a novel series of genetic regions, likely plasmid borne, and correlating with onion pathogenicity observed on single contigs of the genetic assemblies. We named these loci Onion Virulence Regions (OVR) A-D. The OVR loci contain genes involved in redox regulation as well as pectate lyase and rhamnogalacturonase genes. Previous studies have not identified distinct genetic loci or plasmids correlating with onion foliar pathogenicity or pathogenicity on a single host pathosystem. The lack of focus on a single host system for this phytopathogenic disease necessitates the pan-genomic analysis performed in this study. PMID:29491851

  16. Contrasting Patterns of Genomic Diversity Reveal Accelerated Genetic Drift but Reduced Directional Selection on X-Chromosome in Wild and Domestic Sheep Species.

    PubMed

    Chen, Ze-Hui; Zhang, Min; Lv, Feng-Hua; Ren, Xue; Li, Wen-Rong; Liu, Ming-Jun; Nam, Kiwoong; Bruford, Michael W; Li, Meng-Hua

    2018-04-01

    Analyses of genomic diversity along the X chromosome and of its correlation with autosomal diversity can facilitate understanding of evolutionary forces in shaping sex-linked genomic architecture. Strong selective sweeps and accelerated genetic drift on the X-chromosome have been inferred in primates and other model species, but no such insight has yet been gained in domestic animals compared with their wild relatives. Here, we analyzed X-chromosome variability in a large ovine data set, including a BeadChip array for 943 ewes from the world's sheep populations and 110 whole genomes of wild and domestic sheep. Analyzing whole-genome sequences, we observed a substantially reduced X-to-autosome diversity ratio (∼0.6) compared with the value expected under a neutral model (0.75). In particular, one large X-linked segment (43.05-79.25 Mb) was found to show extremely low diversity, most likely due to a high density of coding genes, featuring highly conserved regions. In general, we observed higher nucleotide diversity on the autosomes, but a flat diversity gradient in X-linked segments, as a function of increasing distance from the nearest genes, leading to a decreased X:autosome (X/A) diversity ratio and contrasting to the positive correlation detected in primates and other model animals. Our evidence suggests that accelerated genetic drift but reduced directional selection on X chromosome, as well as sex-biased demographic events, explain low X-chromosome diversity in sheep species. The distinct patterns of X-linked and X/A diversity we observed between Middle Eastern and non-Middle Eastern sheep populations can be explained by multiple migrations, selection, and admixture during the domestic sheep's recent postdomestication demographic expansion, coupled with natural selection for adaptation to new environments. In addition, we identify important novel genes involved in abnormal behavioral phenotypes, metabolism, and immunity, under selection on the sheep X-chromosome.
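
    The neutral expectation of 0.75 quoted in this abstract can be reproduced with a short sketch. It assumes the classical effective-population-size formulas for a population with separate sexes, Ne(autosome) = 4·Nm·Nf / (Nm + Nf) and Ne(X) = 9·Nm·Nf / (4·Nm + 2·Nf), and that neutral diversity scales with Ne at equal mutation rates. The function name and the example breeding numbers are hypothetical; this is not the authors' analysis.

```python
def neutral_x_to_autosome_ratio(n_males: float, n_females: float) -> float:
    """Expected X/A neutral diversity ratio from the ratio of effective sizes.

    Assumes equal mutation rates on X and autosomes, so the diversity ratio
    equals Ne(X) / Ne(autosome).
    """
    if n_males <= 0 or n_females <= 0:
        raise ValueError("breeding numbers must be positive")
    ne_autosome = 4 * n_males * n_females / (n_males + n_females)
    ne_x = 9 * n_males * n_females / (4 * n_males + 2 * n_females)
    return ne_x / ne_autosome


# Equal numbers of breeding males and females give the textbook value of 3/4.
print(neutral_x_to_autosome_ratio(100, 100))
# Hypothetical skewed breeding sex ratios shift the expectation in either direction.
print(neutral_x_to_autosome_ratio(10, 100))
print(neutral_x_to_autosome_ratio(100, 10))
```

    Under these assumptions the expectation ranges from 9/16 to 9/8 depending on the breeding sex ratio, which is one reason an observed ratio cannot be attributed to selection or drift on its own.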

  17. Pan-genome analysis of Aeromonas hydrophila, Aeromonas veronii and Aeromonas caviae indicates phylogenomic diversity and greater pathogenic potential for Aeromonas hydrophila.

    PubMed

    Ghatak, Sandeep; Blom, Jochen; Das, Samir; Sanjukta, Rajkumari; Puro, Kekungu; Mawlong, Michael; Shakuntala, Ingudam; Sen, Arnab; Goesmann, Alexander; Kumar, Ashok; Ngachan, S V

    2016-07-01

    Aeromonas species are important pathogens of fishes and aquatic animals capable of infecting humans and other animals via food. Due to the paucity of pan-genomic studies on aeromonads, the present study was undertaken to analyse the pan-genome of three clinically important Aeromonas species (A. hydrophila, A. veronii, A. caviae). Results of pan-genome analysis revealed an open pan-genome for all three species with pan-genome sizes of 9181, 7214 and 6884 genes for A. hydrophila, A. veronii and A. caviae, respectively. Core-genome: pan-genome ratio (RCP) indicated greater genomic diversity for A. hydrophila and interestingly RCP emerged as an effective indicator to gauge genomic diversity which could possibly be extended to other organisms too. Phylogenomic network analysis highlighted the influence of homologous recombination and lateral gene transfer in the evolution of Aeromonas spp. Prediction of virulence factors indicated no significant difference among the three species though analysis of pathogenic potential and acquired antimicrobial resistance genes revealed greater hazards from A. hydrophila. In conclusion, the present study highlighted the usefulness of whole genome analyses to infer evolutionary cues for Aeromonas species which indicated considerable phylogenomic diversity for A. hydrophila and hitherto unknown genomic evidence for pathogenic potential of A. hydrophila compared to A. veronii and A. caviae.
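
    The core-genome:pan-genome ratio mentioned in this abstract can be sketched as follows, assuming each genome has already been reduced to a set of gene-family identifiers. The function name, strain labels and gene identifiers are hypothetical; real pan-genome pipelines first cluster orthologous genes, and the exact RCP definition used in the study may differ in detail.

```python
from typing import Dict, Set


def core_pan_ratio(genomes: Dict[str, Set[str]]) -> float:
    """Return |core genome| / |pan-genome| for a mapping of strain -> gene families."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        raise ValueError("at least one genome is required")
    core = set.intersection(*gene_sets)  # families present in every strain
    pan = set.union(*gene_sets)          # families present in at least one strain
    if not pan:
        raise ValueError("genomes contain no gene families")
    return len(core) / len(pan)


# Hypothetical toy input: three strains described by gene-family labels.
toy_genomes = {
    "strain_1": {"a", "b", "c", "d"},
    "strain_2": {"a", "b", "c", "e"},
    "strain_3": {"a", "b", "f", "g"},
}
print(core_pan_ratio(toy_genomes))  # 2 core families out of 7 in the pan-genome
```

    With this definition, a smaller shared core relative to the total gene repertoire yields a lower ratio.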

  18. Genetic diversity and structure of elite cotton germplasm (Gossypium hirsutum L.) using genome-wide SNP data.

    PubMed

    Ai, XianTao; Liang, YaJun; Wang, JunDuo; Zheng, JuYun; Gong, ZhaoLong; Guo, JiangPing; Li, XueYuan; Qu, YanYing

    2017-10-01

    Cotton (Gossypium spp.) is the most important natural textile fiber crop, and Gossypium hirsutum L. is responsible for 90% of the annual cotton crop in the world. Information on cotton genetic diversity and population structure is essential for new breeding lines. In this study, we analyzed population structure and genetic diversity of 288 elite Gossypium hirsutum cultivar accessions collected from around the world, and especially from China, using genome-wide single nucleotide polymorphism (SNP) markers. The average polymorphism information content (PIC) was 0.25, indicating a relatively low degree of genetic diversity. Population structure analysis revealed extensive admixture and identified three subgroups. Phylogenetic analysis supported the subgroups identified by STRUCTURE. The results from both population structure and phylogenetic analysis were, for the most part, in agreement with pedigree information. Analysis of molecular variance revealed a larger amount of variation was due to diversity within the groups. Establishment of genetic diversity and population structure from this study could be useful for genetic and genomic analysis and systematic utilization of the standing genetic variation in upland cotton.
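
    For context on the PIC value reported in this abstract, the sketch below computes PIC for a single marker using the commonly cited definition PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². Whether the study used exactly this definition is an assumption, and the function name and example frequencies are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def polymorphism_information_content(allele_freqs: Sequence[float]) -> float:
    """PIC for one marker: 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


# A biallelic SNP reaches its maximum PIC when both alleles have frequency 0.5.
print(polymorphism_information_content([0.5, 0.5]))
# Hypothetical SNP with a minor allele frequency of 0.2.
print(polymorphism_information_content([0.8, 0.2]))
```

    Because a biallelic marker cannot exceed 0.375 under this definition, SNP-based PIC averages are naturally lower than those reported for multi-allelic markers.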

  19. Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes

    PubMed Central

    Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong

    2014-01-01

    Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity, fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, fastest exon shuffling in metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and the vertebrate evolution. PMID:25523484

  20. Genome-Wide Identification of Chromatin Transitional Regions Reveals Diverse Mechanisms Defining the Boundary of Facultative Heterochromatin

    PubMed Central

    Li, Guangyao; Zhou, Lei

    2013-01-01

    Due to the self-propagating nature of the heterochromatic modification H3K27me3, chromatin barrier activities are required to demarcate the boundary and prevent it from encroaching into euchromatic regions. Studies in Drosophila and vertebrate systems have revealed several important chromatin barrier elements and their respective binding factors. However, epigenomic data indicate that the binding of these factors is not exclusive to chromatin boundaries. To gain a comprehensive understanding of facultative heterochromatin boundaries, we developed a two-tiered method to identify the Chromatin Transitional Region (CTR), i.e. the nucleosomal region that shows the greatest transition rate of the H3K27me3 modification as revealed by ChIP-Seq. This approach was applied to identify CTRs in Drosophila S2 cells and human HeLa cells. Although many insulator proteins have been characterized in Drosophila, less than half of the CTRs in S2 cells are associated with known insulator proteins, indicating unknown mechanisms remain to be characterized. Our analysis also revealed that the peak binding of insulator proteins is usually 1–2 nucleosomes away from the CTR. Comparison of CTR-associated insulator protein binding sites vs. those in heterochromatic region revealed that boundary-associated binding sites are distinctively flanked by nucleosome destabilizing sequences, which correlates with significantly decreased nucleosome density and increased binding intensities of co-factors. Interestingly, several subgroups of boundaries have enhanced H3.3 incorporation but reduced nucleosome turnover rate. Our genome-wide study reveals that diverse mechanisms are employed to define the boundaries of facultative heterochromatin. In both Drosophila and mammalian systems, only a small fraction of insulator protein binding sites co-localize with H3K27me3 boundaries. However, boundary-associated insulator binding sites are distinctively flanked by nucleosome destabilizing sequences, which

  1. Gene-trait matching across the Bifidobacterium longum pan-genome reveals considerable diversity in carbohydrate catabolism among human infant strains.

    PubMed

    Arboleya, Silvia; Bottacini, Francesca; O'Connell-Motherway, Mary; Ryan, C Anthony; Ross, R Paul; van Sinderen, Douwe; Stanton, Catherine

    2018-01-08

    Bifidobacterium longum is a common member of the human gut microbiota and is frequently present at high numbers in the gut microbiota of humans throughout life, indicating a close symbiotic host-microbe relationship. Different mechanisms may be responsible for the high competitiveness of this taxon in its human host to allow stable establishment in the complex and dynamic intestinal microbiota environment. The objective of this study was to assess the genetic and metabolic diversity in a set of 20 B. longum strains, most of which had previously been isolated from infants, by performing whole genome sequencing and comparative analysis, and to analyse their carbohydrate utilization abilities using a gene-trait matching approach. We analysed their pan-genome and their phylogenetic relatedness. All strains clustered in the B. longum ssp. longum phylogenetic subgroup, except for one individual strain which was found to cluster in the B. longum ssp. suis phylogenetic group. The examined strains exhibited genomic diversity and also varied in their sugar utilization profiles. This allowed us to perform a gene-trait matching exercise enabling the identification of five gene clusters involved in the utilization of xylo-oligosaccharides, arabinan, arabinoxylan, galactan and fucosyllactose, the latter of which is an abundant human milk oligosaccharide (HMO). The results showed high diversity in terms of genes and predicted glycosyl-hydrolases, as well as the ability to metabolize a large range of sugars. Moreover, we corroborate the capability of B. longum ssp. longum to metabolise HMOs. Ultimately, their intraspecific genomic diversity and the ability to consume a wide assortment of carbohydrates, ranging from plant-derived carbohydrates to HMOs, may provide an explanation for the competitive advantage and persistence of B. longum in the human gut microbiome.

  2. Population genomic analysis reveals differential evolutionary histories and patterns of diversity across subgenomes and subpopulations of Brassica napus L.

    DOE PAGES

    Gazave, Elodie; Tassone, Erica E.; Ilut, Daniel C.; ...

    2016-04-21

    Here, the allotetraploid species Brassica napus L. is a global crop of major economic importance, providing canola oil (seed) and vegetables for human consumption and fodder and meal for livestock feed. Characterizing the genetic diversity present in the extant germplasm pool of B. napus is fundamental to better conserve, manage and utilize the genetic resources of this species. We used sequence-based genotyping to identify and genotype 30,881 SNPs in a diversity panel of 782 B. napus accessions, representing samples of winter and spring growth habits originating from 33 countries across Europe, Asia, and America. We detected strong population structure broadly concordant with growth habit and geography, and identified three major genetic groups: spring (SP), winter Europe (WE), and winter Asia (WA). Subpopulation-specific polymorphism patterns suggest enriched genetic diversity within the WA group and a smaller effective breeding population for the SP group compared to WE. Interestingly, the two subgenomes of B. napus appear to have different geographic origins, with phylogenetic analysis placing WE and WA as basal clades for the other subpopulations in the C and A subgenomes, respectively. Finally, we identified 16 genomic regions where the patterns of diversity differed markedly from the genome-wide average, several of which are suggestive of genomic inversions. The results obtained in this study constitute a valuable resource for worldwide breeding efforts and the genetic dissection and prediction of complex B. napus traits.

  3. Population genomic analysis reveals differential evolutionary histories and patterns of diversity across subgenomes and subpopulations of Brassica napus L.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gazave, Elodie; Tassone, Erica E.; Ilut, Daniel C.

    Here, the allotetraploid species Brassica napus L. is a global crop of major economic importance, providing canola oil (seed) and vegetables for human consumption and fodder and meal for livestock feed. Characterizing the genetic diversity present in the extant germplasm pool of B. napus is fundamental to better conserve, manage and utilize the genetic resources of this species. We used sequence-based genotyping to identify and genotype 30,881 SNPs in a diversity panel of 782 B. napus accessions, representing samples of winter and spring growth habits originating from 33 countries across Europe, Asia, and America. We detected strong population structure broadly concordant with growth habit and geography, and identified three major genetic groups: spring (SP), winter Europe (WE), and winter Asia (WA). Subpopulation-specific polymorphism patterns suggest enriched genetic diversity within the WA group and a smaller effective breeding population for the SP group compared to WE. Interestingly, the two subgenomes of B. napus appear to have different geographic origins, with phylogenetic analysis placing WE and WA as basal clades for the other subpopulations in the C and A subgenomes, respectively. Finally, we identified 16 genomic regions where the patterns of diversity differed markedly from the genome-wide average, several of which are suggestive of genomic inversions. The results obtained in this study constitute a valuable resource for worldwide breeding efforts and the genetic dissection and prediction of complex B. napus traits.

  4. Early Epstein-Barr Virus Genomic Diversity and Convergence toward the B95.8 Genome in Primary Infection.

    PubMed

    Weiss, Eric R; Lamers, Susanna L; Henderson, Jennifer L; Melnikov, Alexandre; Somasundaran, Mohan; Garber, Manuel; Selin, Liisa; Nusbaum, Chad; Luzuriaga, Katherine

    2018-01-15

    Over 90% of the world's population is persistently infected with Epstein-Barr virus. While EBV does not cause disease in most individuals, it is the common cause of acute infectious mononucleosis (AIM) and has been associated with several cancers and autoimmune diseases, highlighting a need for a preventive vaccine. At present, very few primary, circulating EBV genomes have been sequenced directly from infected individuals. While low levels of diversity and low viral evolution rates have been predicted for double-stranded DNA (dsDNA) viruses, recent studies have demonstrated appreciable diversity in common dsDNA pathogens (e.g., cytomegalovirus). Here, we report 40 full-length EBV genome sequences obtained from matched oral wash and B cell fractions from a cohort of 10 AIM patients. Both intra- and interpatient diversity were observed across the length of the entire viral genome. Diversity was most pronounced in viral genes required for establishing latent infection and persistence, with appreciable levels of diversity also detected in structural genes, including envelope glycoproteins. Interestingly, intrapatient diversity declined significantly over time (P < 0.01), and this was particularly evident on comparison of viral genomes sequenced from B cell fractions in early primary infection and convalescence (P < 0.001). B cell-associated viral genomes were observed to converge, becoming nearly identical to the B95.8 reference genome over time (Spearman rank-order correlation test; r = -0.5589, P = 0.0264). The reduction in diversity was most marked in the EBV latency genes. In summary, our data suggest independent convergence of diverse viral genome sequences toward a reference-like strain within a relatively short period following primary EBV infection. IMPORTANCE Identification of viral proteins with low variability and high immunogenicity is important for the development of a protective vaccine. Knowledge of genome diversity within circulating viral

  5. Genomic Signatures Reveal New Evidences for Selection of Important Traits in Domestic Cattle

    PubMed Central

    Xu, Lingyang; Bickhart, Derek M.; Cole, John B.; Schroeder, Steven G.; Song, Jiuzhou; Tassell, Curtis P. Van; Sonstegard, Tad S.; Liu, George E.

    2015-01-01

    We investigated diverse genomic selections using high-density single nucleotide polymorphism data of five distinct cattle breeds. Based on allele frequency differences, we detected hundreds of candidate regions under positive selection across Holstein, Angus, Charolais, Brahman, and N'Dama. In addition to well-known genes such as KIT, MC1R, ASIP, GHR, LCORL, NCAPG, WIF1, and ABCA12, we found evidence for a variety of novel and less-known genes under selection in cattle, such as LAP3, SAR1B, LRIG3, FGF5, and NUDCD3. Selective sweeps near LAP3 were then validated by next-generation sequencing. Genome-wide association analysis involving 26,362 Holsteins confirmed that LAP3 and SAR1B were related to milk production traits, suggesting that our candidate regions were likely functional. In addition, haplotype network analyses further revealed distinct selective pressures and evolution patterns across these five cattle breeds. Our results provided a glimpse into diverse genomic selection during cattle domestication, breed formation, and recent genetic improvement. These findings will facilitate genome-assisted breeding to improve animal production and health. PMID:25431480

  6. Genome surfing as driver of microbial genomic diversity

    USDA-ARS's Scientific Manuscript database

    Historical changes in population size, such as those caused by demographic range expansions, can produce nonadaptive changes in genomic diversity through mechanisms such as gene surfing. We propose that demographic range expansion of a microbial population capable of horizontal gene exchange can res...

  7. Genetic diversity and signatures of selection in various goat breeds revealed by genome-wide SNP markers.

    PubMed

    Brito, Luiz F; Kijas, James W; Ventura, Ricardo V; Sargolzaei, Mehdi; Porto-Neto, Laercio R; Cánovas, Angela; Feng, Zeny; Jafarikia, Mohsen; Schenkel, Flávio S

    2017-03-14

    The detection of signatures of selection has the potential to elucidate the identities of genes and mutations associated with phenotypic traits important for livestock species. It is also very relevant to investigate the levels of genetic diversity of a population, as genetic diversity represents the raw material essential for breeding and has practical implications for implementation of genomic selection. A total of 1151 animals from nine goat populations selected for different breeding goals and genotyped with the Illumina Goat 50K single nucleotide polymorphisms (SNP) Beadchip were included in this investigation. The proportion of polymorphic SNPs ranged from 0.902 (Nubian) to 0.995 (Rangeland). The overall mean H_O and H_E were 0.374 ± 0.021 and 0.369 ± 0.023, respectively. The average pairwise genetic distance (D) ranged from 0.263 (Toggenburg) to 0.323 (Rangeland). The overall averages for the inbreeding measures F_EH, F_VR, F_LEUT, F_ROH and F_PED were 0.129, -0.012, -0.010, 0.038 and 0.030, respectively. Several regions located on 19 chromosomes were potentially under selection in at least one of the goat breeds. The genomic population tree constructed using all SNPs differentiated breeds based on selection purpose, while the genomic population tree built using only SNPs in the most significant region showed a great differentiation between LaMancha and the other breeds. We hypothesized that this region is related to ear morphogenesis. Furthermore, we identified genes potentially related to reproduction traits, adult body mass, efficiency of food conversion, abdominal fat deposition, conformation traits, liver fat metabolism, milk fatty acids, somatic cells score, milk protein, thermo-tolerance and ear morphogenesis. In general, moderate to high levels of genetic variability were observed for all the breeds and a characterization of runs of homozygosity gave insights into the breeds' development history. The information reported here will be useful for

  8. The complete mitochondrial genome of Arctic Calanus hyperboreus (Copepoda, Calanoida) reveals characteristic patterns in calanoid mitochondrial genome.

    PubMed

    Kim, Sanghee; Lim, Byung-Jin; Min, Gi-Sik; Choi, Han-Gu

    2013-05-10

    Copepoda is the most diverse and abundant group of crustaceans, but its phylogenetic relationships are ambiguous. Mitochondrial (mt) genomes are useful for studying evolutionary history, but only six complete Copepoda mt genomes have been made available and these have extremely rearranged genome structures. This study determined the mt genome of Calanus hyperboreus, making it the first reported Arctic copepod mt genome and the first complete mt genome of a calanoid copepod. The mt genome of C. hyperboreus is 17,910 bp in length and it contains the entire set of 37 mt genes, including 13 protein-coding genes, 2 rRNAs, and 22 tRNAs. It has a very unusual gene structure, including the longest control region reported for a crustacean, a large tRNA gene cluster, and reversed GC skews in 11 out of 13 protein-coding genes (84.6%). Despite the unusual features, comparing this genome to published copepod genomes revealed retained pan-crustacean features, as well as a conserved calanoid-specific pattern. Our data provide a foundation for exploring the calanoid pattern and the mechanisms of mt gene rearrangement in the evolutionary history of the copepod mt genome. Copyright © 2012 Elsevier B.V. All rights reserved.
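    The GC skew mentioned in this abstract is conventionally defined as (G − C) / (G + C) over a sequence. The following is a minimal sketch of that calculation, not the authors' pipeline (the abstract does not say how skew was computed). The function name `gc_skew` and the two toy sequences are hypothetical and are not taken from the C. hyperboreus genome.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for a DNA sequence; 0.0 if it has no G or C."""
    s = seq.upper()
    g = s.count("G")
    c = s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


# Hypothetical toy sequences, used only to show a positive and a negative skew.
examples = {"gene_a": "GGGCATGG", "gene_b": "CCCGATCC"}
skews = {name: gc_skew(seq) for name, seq in examples.items()}

# The abstract's proportion: 11 of 13 protein-coding genes, as a percentage.
share_reversed = round(100 * 11 / 13, 1)
```

    A positive value means G is in excess on the strand being read and a negative value means C is in excess; a "reversed" skew is one whose sign is opposite to the usual pattern for that strand.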

  9. Sub-clonal analysis of the murine C1498 acute myeloid leukaemia cell line reveals genomic and immunogenic diversity.

    PubMed

    Driss, Virginie; Leprêtre, Frédéric; Briche, Isabelle; Mopin, Alexia; Villenet, Céline; Figeac, Martin; Quesnel, Bruno; Brinster, Carine

    2017-12-01

    In acute myeloid leukaemia (AML)-affected patients, the presence of heterogeneous sub-clones at diagnosis has been shown to be responsible for minimal residual disease and relapses. The role played by the immune system in this leukaemic sub-clonal hierarchy and maintenance remains unknown. As leukaemic sub-clone immunogenicity could not be evaluated in human AML xenograft models, we assessed the sub-clonal diversity of the murine C1498 AML cell line and the immunogenicity of its sub-clones in immune-competent syngeneic mice. The murine C1498 cell line was cultured in vitro and sub-clonal cells were generated after limiting dilution. The genomic profiles of 6 different sub-clones were analysed by comparative genomic hybridization arrays (CGH). The sub-clones were then injected into immune-deficient and -competent syngeneic mice. The immunogenicity of the sub-clones was evaluated through 1) assessment of mouse survival, 2) determination of leukaemic cell infiltration into organs by flow cytometry and the expression of a fluorescent reporter gene, 3) assessment of the CTL response ex vivo and 4) detection of residual leukaemic cells in the organs via amplification of the genomic reporter gene by real-time PCR (qPCR). Genomic analyses revealed heterogeneity among the parental cell line and its derived sub-clones. When injected individually into immune-deficient mice, all sub-clones induced cases of AML with different kinetics. However, when administered into immune-competent animals, some sub-clones triggered AML in which no mice survived, whereas others elicited reduced lethality rates. The AML-surviving mice presented efficient anti-leukaemia CTL activity ex vivo and eliminated the leukaemic cells in vivo. We showed that C1498 cell sub-clones presented genomic heterogeneity and differential immunogenicity resulting either in immune escape or elimination. Such findings could have potent implications for new immunotherapeutic strategies in patients with AML

  10. Genomic insights into the Acidobacteria reveal strategies for their success in terrestrial environments

    PubMed Central

    Trojan, Daniela; Roux, Simon; Herbold, Craig; Rattei, Thomas; Woebken, Dagmar

    2018-01-01

    Members of the phylum Acidobacteria are abundant and ubiquitous across soils. We performed a large‐scale comparative genome analysis spanning subdivisions 1, 3, 4, 6, 8 and 23 (n = 24) with the goal of identifying features that help explain their prevalence in soils and understanding their ecophysiology. Our analysis revealed that bacteriophage integration events along with transposable and mobile elements influenced the structure and plasticity of these genomes. Low‐ and high‐affinity respiratory oxygen reductases were detected in multiple genomes, suggesting the capacity for growing across different oxygen gradients. Among many genomes, the capacity to use a diverse collection of carbohydrates, as well as inorganic and organic nitrogen sources (such as via extracellular peptidases), was detected – both advantageous traits in environments with fluctuating nutrient availability. We also identified multiple soil acidobacteria with the potential to scavenge atmospheric concentrations of H2, now encompassing mesophilic soil strains within subdivisions 1 and 3, in addition to a previously identified thermophilic strain in subdivision 4. This large‐scale acidobacteria genome analysis reveals traits that provide genomic, physiological and metabolic versatility, presumably allowing flexibility in the challenging and fluctuating soil environment. PMID:29327410

  11. Genome-centric resolution of microbial diversity, metabolism and interactions in anaerobic digestion.

    PubMed

    Vanwonterghem, Inka; Jensen, Paul D; Rabaey, Korneel; Tyson, Gene W

    2016-09-01

    Our understanding of the complex interconnected processes performed by microbial communities is hindered by our inability to culture the vast majority of microorganisms. Metagenomics provides a way to bypass this cultivation bottleneck and recent advances in this field now allow us to recover a growing number of genomes representing previously uncultured populations from increasingly complex environments. In this study, a temporal genome-centric metagenomic analysis was performed of lab-scale anaerobic digesters that host complex microbial communities fulfilling a series of interlinked metabolic processes to enable the conversion of cellulose to methane. In total, 101 population genomes that were moderate to near-complete were recovered based primarily on differential coverage binning. These populations span 19 phyla, represent mostly novel species and expand the genomic coverage of several rare phyla. Classification into functional guilds based on their metabolic potential revealed metabolic networks with a high level of functional redundancy as well as niche specialization, and allowed us to identify potential roles such as hydrolytic specialists for several rare, uncultured populations. Genome-centric analyses of complex microbial communities across diverse environments provide the key to understanding the phylogenetic and metabolic diversity of these interactive communities. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.

  12. Contrasting Patterns of Genomic Diversity Reveal Accelerated Genetic Drift but Reduced Directional Selection on X-Chromosome in Wild and Domestic Sheep Species

    PubMed Central

    Chen, Ze-Hui; Zhang, Min; Lv, Feng-Hua; Ren, Xue; Li, Wen-Rong; Liu, Ming-Jun; Nam, Kiwoong; Bruford, Michael W; Li, Meng-Hua

    2018-01-01

    Analyses of genomic diversity along the X chromosome and of its correlation with autosomal diversity can facilitate understanding of evolutionary forces in shaping sex-linked genomic architecture. Strong selective sweeps and accelerated genetic drift on the X chromosome have been inferred in primates and other model species, but no such insight has yet been gained in domestic animals compared with their wild relatives. Here, we analyzed X-chromosome variability in a large ovine data set, including a BeadChip array for 943 ewes from the world’s sheep populations and 110 whole genomes of wild and domestic sheep. Analyzing whole-genome sequences, we observed a substantially reduced X-to-autosome diversity ratio (∼0.6) compared with the value expected under a neutral model (0.75). In particular, one large X-linked segment (43.05–79.25 Mb) was found to show extremely low diversity, most likely due to a high density of coding genes, featuring highly conserved regions. In general, we observed higher nucleotide diversity on the autosomes, but a flat diversity gradient in X-linked segments, as a function of increasing distance from the nearest genes, leading to a decreased X:autosome (X/A) diversity ratio and contrasting with the positive correlation detected in primates and other model animals. Our evidence suggests that accelerated genetic drift but reduced directional selection on the X chromosome, as well as sex-biased demographic events, explain low X-chromosome diversity in sheep species. The distinct patterns of X-linked and X/A diversity we observed between Middle Eastern and non-Middle Eastern sheep populations can be explained by multiple migrations, selection, and admixture during the domestic sheep’s recent postdomestication demographic expansion, coupled with natural selection for adaptation to new environments. In addition, we identify important novel genes involved in abnormal behavioral phenotypes, metabolism, and immunity, under selection on

  13. Chimpanzee genomic diversity reveals ancient admixture with bonobos

    PubMed Central

    de Manuel, Marc; Kuhlwilm, Martin; Frandsen, Peter; Sousa, Vitor C.; Desai, Tariq; Prado-Martinez, Javier; Hernandez-Rodriguez, Jessica; Dupanloup, Isabelle; Lao, Oscar; Hallast, Pille; Schmidt, Joshua M.; Heredia-Genestar, José María; Benazzo, Andrea; Barbujani, Guido; Peter, Benjamin M.; Kuderna, Lukas F.K.; Casals, Ferran; Angedakin, Samuel; Arandjelovic, Mimi; Boesch, Christophe; Kühl, Hjalmar; Vigilant, Linda; Langergraber, Kevin; Novembre, John; Gut, Marta; Gut, Ivo; Navarro, Arcadi; Carlsen, Frands; Andrés, Aida M.; Siegismund, Hans. R.; Scally, Aylwyn; Excoffier, Laurent; Tyler-Smith, Chris; Castellano, Sergi; Xue, Yali; Hvilsom, Christina; Marques-Bonet, Tomas

    2017-01-01

    Our closest living relatives, chimpanzees and bonobos, have a complex demographic history. We have analyzed the high-coverage whole genomes of 75 wild-born chimpanzees and bonobos from ten countries in Africa. We find that chimpanzee population sub-structure makes genetic information a good predictor of geographic origin at country and regional scales. Most strikingly, multiple lines of evidence suggest that gene flow occurred from bonobos into the ancestors of central and eastern chimpanzees between 200 and 550 thousand years ago (Kya), probably with subsequent spread into Nigeria-Cameroon chimpanzees. Together with another possibly more recent contact (after 200 Kya), bonobos contributed less than 1% to the central chimpanzee genomes. Admixture thus appears to have been widespread during hominid evolution. PMID:27789843

  14. Comparative Genomic Analyses of the Human NPHP1 Locus Reveal Complex Genomic Architecture and Its Regional Evolution in Primates

    PubMed Central

    Yuan, Bo; Liu, Pengfei; Gupta, Aditya; Beck, Christine R.; Tejomurtula, Anusha; Campbell, Ian M.; Gambin, Tomasz; Simmons, Alexandra D.; Withers, Marjorie A.; Harris, R. Alan; Rogers, Jeffrey; Schwartz, David C.; Lupski, James R.

    2015-01-01

    Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases—about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual’s susceptibility to acquiring disease-associated alleles. PMID:26641089

  15. Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium vivax from Colombia

    PubMed Central

    Winter, David J.; Pacheco, M. Andreína; Vallejo, Andres F.; Schwartz, Rachel S.; Arevalo-Herrera, Myriam; Herrera, Socrates

    2015-01-01

    Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Córdoba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information we developed 18 polymorphic and easy-to-score microsatellite loci that can be used in epidemiological investigations in South America. PMID:26709695

  16. Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium vivax from Colombia.

    PubMed

    Winter, David J; Pacheco, M Andreína; Vallejo, Andres F; Schwartz, Rachel S; Arevalo-Herrera, Myriam; Herrera, Socrates; Cartwright, Reed A; Escalante, Ananias A

    2015-12-01

    Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Córdoba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information we developed 18 polymorphic and easy-to-score microsatellite loci that can be used in epidemiological investigations in South America.

  17. Genetics, Genomics and Evolution of Ergot Alkaloid Diversity

    PubMed Central

    Young, Carolyn A.; Schardl, Christopher L.; Panaccione, Daniel G.; Florea, Simona; Takach, Johanna E.; Charlton, Nikki D.; Moore, Neil; Webb, Jennifer S.; Jaromczyk, Jolanta

    2015-01-01

    The ergot alkaloid biosynthesis system has become an excellent model to study evolutionary diversification of specialized (secondary) metabolites. This is a very diverse class of alkaloids with various neurotropic activities, produced by fungi in several orders of the phylum Ascomycota, including plant pathogens and protective plant symbionts in the family Clavicipitaceae. Results of comparative genomics and phylogenomic analyses reveal multiple examples of three evolutionary processes that have generated ergot-alkaloid diversity: gene gains, gene losses, and gene sequence changes that have led to altered substrates or product specificities of the enzymes that they encode (neofunctionalization). The chromosome ends appear to be particularly effective engines for gene gains, losses and rearrangements, but not necessarily for neofunctionalization. Changes in gene expression could lead to accumulation of various pathway intermediates and affect levels of different ergot alkaloids. Genetic alterations associated with interspecific hybrids of Epichloë species suggest that such variation is also selectively favored. The huge structural diversity of ergot alkaloids probably represents adaptations to a wide variety of ecological situations by affecting the biological spectra and mechanisms of defense against herbivores, as evidenced by the diverse pharmacological effects of ergot alkaloids used in medicine. PMID:25875294

  18. Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists.

    PubMed

    Sanitá Lima, Matheus; Smith, David Roy

    2017-11-06

    Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA, coding and noncoding, is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells. Copyright © 2017 Sanitá Lima and Smith.
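    The "transcripts covered ≥85% of the genome" figure in this abstract is a breadth-of-coverage statistic: the fraction of genome positions overlapped by at least one mapped read. The following is a minimal sketch of how such a fraction could be computed from mapped intervals. It is an illustration only; the abstract does not describe the authors' tooling, and the function name `breadth_of_coverage`, the half-open interval convention, and the 100 kb example genome are all assumptions.

```python
def breadth_of_coverage(intervals, genome_length):
    """Fraction of positions covered by at least one half-open [start, end) interval."""
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(intervals):
        # Clip to the genome and skip anything already counted.
        start = max(start, current_end, 0)
        end = min(end, genome_length)
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


# Hypothetical mapped-read intervals on a hypothetical 100 kb organelle genome.
mapped = [(0, 40_000), (30_000, 90_000), (95_000, 100_000)]
fraction = breadth_of_coverage(mapped, 100_000)
passes_threshold = fraction >= 0.85  # the abstract's reported lower bound
```

    Sorting first and tracking the rightmost counted position means overlapping reads are never double-counted, so the result stays between 0 and 1 regardless of read depth.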

  19. Comparison of environmental and isolate Sulfobacillus genomes reveals diverse carbon, sulfur, nitrogen, and hydrogen metabolisms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Justice, Nicholas B.; Norman, Anders; Brown, Christopher T.

    Bacteria of the genus Sulfobacillus are found worldwide as members of microbial communities that accelerate sulfide mineral dissolution in acid mine drainage environments (AMD), acid-rock drainage environments (ARD), as well as in industrial bioleaching operations. Despite their frequent identification in these environments, their role in biogeochemical cycling is poorly understood. Here we report draft genomes of five species of the Sulfobacillus genus (AMDSBA1-5) reconstructed by cultivation-independent sequencing of biofilms sampled from the Richmond Mine (Iron Mountain, CA). Three of these species (AMDSBA2, AMDSBA3, and AMDSBA4) have no cultured representatives while AMDSBA1 is a strain of S. benefaciens, and AMDSBA5 a strain of S. thermosulfidooxidans. We analyzed the diversity of energy conservation and central carbon metabolisms for these genomes and previously published Sulfobacillus genomes. Pathways of sulfur oxidation vary considerably across the genus, including the number and type of subunits of putative heterodisulfide reductase complexes likely involved in sulfur oxidation. The number and type of nickel-iron hydrogenase proteins vary across the genus, as does the presence of different central carbon pathways. Only the AMDSBA3 genome encodes a dissimilatory nitrate reductase and only the AMDSBA5 and S. thermosulfidooxidans genomes encode assimilatory nitrate reductases. Lastly, within the genus, AMDSBA4 is unusual in that its electron transport chain includes a cytochrome bc type complex, a unique cytochrome c oxidase, and two distinct succinate dehydrogenase complexes. Overall, the results significantly expand our understanding of carbon, sulfur, nitrogen, and hydrogen metabolism within the Sulfobacillus genus.

  20. Comparison of environmental and isolate Sulfobacillus genomes reveals diverse carbon, sulfur, nitrogen, and hydrogen metabolisms

    DOE PAGES

    Justice, Nicholas B.; Norman, Anders; Brown, Christopher T.; ...

    2014-12-15

    Bacteria of the genus Sulfobacillus are found worldwide as members of microbial communities that accelerate sulfide mineral dissolution in acid mine drainage environments (AMD), acid-rock drainage environments (ARD), as well as in industrial bioleaching operations. Despite their frequent identification in these environments, their role in biogeochemical cycling is poorly understood. Here we report draft genomes of five species of the Sulfobacillus genus (AMDSBA1-5) reconstructed by cultivation-independent sequencing of biofilms sampled from the Richmond Mine (Iron Mountain, CA). Three of these species (AMDSBA2, AMDSBA3, and AMDSBA4) have no cultured representatives while AMDSBA1 is a strain of S. benefaciens, and AMDSBA5 a strain of S. thermosulfidooxidans. We analyzed the diversity of energy conservation and central carbon metabolisms for these genomes and previously published Sulfobacillus genomes. Pathways of sulfur oxidation vary considerably across the genus, including the number and type of subunits of putative heterodisulfide reductase complexes likely involved in sulfur oxidation. The number and type of nickel-iron hydrogenase proteins vary across the genus, as does the presence of different central carbon pathways. Only the AMDSBA3 genome encodes a dissimilatory nitrate reductase and only the AMDSBA5 and S. thermosulfidooxidans genomes encode assimilatory nitrate reductases. Lastly, within the genus, AMDSBA4 is unusual in that its electron transport chain includes a cytochrome bc type complex, a unique cytochrome c oxidase, and two distinct succinate dehydrogenase complexes. Overall, the results significantly expand our understanding of carbon, sulfur, nitrogen, and hydrogen metabolism within the Sulfobacillus genus.

  1. Global biogeography of Prochlorococcus genome diversity in the surface ocean.

    PubMed

    Kent, Alyssa G; Dupont, Chris L; Yooseph, Shibu; Martiny, Adam C

    2016-08-01

    Prochlorococcus, the smallest known photosynthetic bacterium, is abundant in the ocean's surface layer despite large variation in environmental conditions. There are several genetically divergent lineages within Prochlorococcus and superimposed on this phylogenetic diversity is extensive gene gain and loss. The environmental role in shaping the global ocean distribution of genome diversity in Prochlorococcus is largely unknown, particularly in a framework that considers the vertical and lateral mechanisms of evolution. Here we show that Prochlorococcus field populations from a global circumnavigation harbor extensive genome diversity across the surface ocean, but this diversity is not randomly distributed. We observed a significant correspondence between phylogenetic and gene content diversity, including regional differences in both phylogenetic composition and gene content that were related to environmental factors. Several gene families were strongly associated with specific regions and environmental factors, including the identification of a set of genes related to lower nutrient and temperature regions. Metagenomic assemblies of natural Prochlorococcus genomes reinforced this association by providing linkage of genes across genomic backbones. Overall, our results show that the phylogeography in Prochlorococcus taxonomy is echoed in its genome content. Thus environmental variation shapes the functional capabilities and associated ecosystem role of the globally abundant Prochlorococcus.

  2. Genomic diversity and versatility of Lactobacillus plantarum, a natural metabolic engineer.

    PubMed

    Siezen, Roland J; van Hylckama Vlieg, Johan E T

    2011-08-30

    In the past decade it has become clear that the lactic acid bacterium Lactobacillus plantarum occupies a diverse range of environmental niches and has an enormous diversity in phenotypic properties, metabolic capacity and industrial applications. In this review, we describe how genome sequencing, comparative genome hybridization and comparative genomics have provided insight into the underlying genomic diversity and versatility of L. plantarum. One of the main features appears to be genomic life-style islands consisting of numerous functional gene cassettes, in particular for carbohydrate utilization, which can be acquired, shuffled, substituted or deleted in response to niche requirements. In this sense, L. plantarum can be considered a "natural metabolic engineer".

  3. Genome-wide-analyses of Listeria monocytogenes from food-processing plants reveal clonal diversity and date the emergence of persisting sequence types.

    PubMed

    Knudsen, Gitte M; Nielsen, Jesper Boye; Marvig, Rasmus L; Ng, Yin; Worning, Peder; Westh, Henrik; Gram, Lone

    2017-08-01

    Whole genome sequencing is increasingly used in epidemiology, e.g. for tracing outbreaks of food-borne diseases. This requires in-depth understanding of pathogen emergence, persistence and genomic diversity along the food production chain including in food processing plants. We sequenced the genomes of 80 isolates of Listeria monocytogenes sampled from Danish food processing plants over a time-period of 20 years, and analysed the sequences together with 10 publicly available reference genomes to advance our understanding of interplant and intraplant genomic diversity of L. monocytogenes. Except for three persisting sequence types (ST) based on Multi Locus Sequence Typing, namely ST7, ST8 and ST121, long-term persistence of clonal groups was limited, and new clones were introduced continuously, potentially from raw materials. No particular gene could be linked to the persistence phenotype. Using time-based phylogenetic analyses of the persistent STs, we estimate the L. monocytogenes evolutionary rate to be 0.18-0.35 single nucleotide polymorphisms/year, suggesting that the persistent STs emerged approximately 100 years ago, which correlates with the onset of industrialization and globalization of the food market. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.

  4. Genomic Diversity of Erwinia carotovora subsp. carotovora and Its Correlation with Virulence

    PubMed Central

    Yap, Mee-Ngan; Barak, Jeri D.; Charkowski, Amy O.

    2004-01-01

    We used genetic and biochemical methods to examine the genomic diversity of the enterobacterial plant pathogen Erwinia carotovora subsp. carotovora. The results obtained with each method showed that E. carotovora subsp. carotovora strains isolated from one ecological niche, potato plants, are surprisingly diverse compared to related pathogens. A comparison of 23 partial mdh sequences revealed a maximum pairwise difference of 10.49% and an average pairwise difference of 2.13%, values which are much greater than the maximum variation (1.81%) and average variation (0.75%) previously reported for Escherichia coli. Pulsed-field gel electrophoresis analysis of I-CeuI-digested genomic DNA revealed seven rrn operons in all E. carotovora subsp. carotovora strains examined except strain WPP17, which had only six copies. We identified 26 I-CeuI restriction fragment length polymorphism patterns and observed significant polymorphism in fragment sizes ranging from 100 to 450 kb for all strains. We detected large plasmids in two strains, including the model strain E. carotovora subsp. carotovora 71. The two least virulent strains had an unusual chromosomal structure, suggesting that a particular pulsotype is correlated with virulence. To compare chromosomal organization of multiple enterobacterial genomes, several genes were mapped onto I-CeuI fragments. We identified portions of the genome that appear to be conserved across enterobacteria and portions that have undergone genome rearrangements. We found that the least virulent strain, WPP17, failed to oxidize cellobiose and was missing several hrp and hrc genes. The unexpected variability among isolates obtained from clonal hosts in one region and in one season suggests that factors other than the host plant, potato, drive the evolution of this common environmental bacterium and key plant pathogen. PMID:15128563

  5. Whole Genome Sequencing of Danish Staphylococcus argenteus Reveals a Genetically Diverse Collection with Clear Separation from Staphylococcus aureus.

    PubMed

    Hansen, Thomas A; Bartels, Mette D; Høgh, Silje V; Dons, Lone E; Pedersen, Michael; Jensen, Thøger G; Kemp, Michael; Skov, Marianne N; Gumpert, Heidi; Worning, Peder; Westh, Henrik

    2017-01-01

    Staphylococcus argenteus (S. argenteus) is a newly identified Staphylococcus species that has been misidentified as Staphylococcus aureus (S. aureus) and is clinically relevant. We identified 25 S. argenteus genomes in our collection of whole genome sequenced S. aureus. These genomes were compared to publicly available genomes and a phylogeny revealed seven clusters corresponding to seven clonal complexes. The genome of S. argenteus was found to be different from the genome of S. aureus and a core genome analysis showed that ~33% of the total gene pool was shared between the two species, at 90% homology level. An assessment of mobile elements shows flow of SCCmec cassettes, plasmids, phages, and pathogenicity islands between S. argenteus and S. aureus. This dataset emphasizes that S. argenteus and S. aureus are two separate species that share genetic material.

  6. The use of genomic coancestry matrices in the optimisation of contributions to maintain genetic diversity at specific regions of the genome.

    PubMed

    Gómez-Romano, Fernando; Villanueva, Beatriz; Fernández, Jesús; Woolliams, John A; Pong-Wong, Ricardo

    2016-01-13

    Optimal contribution methods have proved to be very efficient for controlling the rates at which coancestry and inbreeding increase and therefore, for maintaining genetic diversity. These methods have usually relied on pedigree information for estimating genetic relationships between animals. However, with the large amount of genomic information now available such as high-density single nucleotide polymorphism (SNP) chips that contain thousands of SNPs, it becomes possible to calculate more accurate estimates of relationships and to target specific regions in the genome where there is a particular interest in maximising genetic diversity. The objective of this study was to investigate the effectiveness of using genomic coancestry matrices for: (1) minimising the loss of genetic variability at specific genomic regions while restricting the overall loss in the rest of the genome; or (2) maximising the overall genetic diversity while restricting the loss of diversity at specific genomic regions. Our study shows that the use of genomic coancestry was very successful at minimising the loss of diversity and outperformed the use of pedigree-based coancestry (genetic diversity even increased in some scenarios). The results also show that genomic information allows a targeted optimisation to maintain diversity at specific genomic regions, whether they are linked or not. The level of variability maintained increased when the targeted regions were closely linked. However, such targeted management leads to an important loss of diversity in the rest of the genome and, thus, it is necessary to take further actions to constrain this loss. Optimal contribution methods also proved to be effective at restricting the loss of diversity in the rest of the genome, although the resulting rate of coancestry was higher than the constraint imposed. The use of genomic matrices when optimising contributions permits the control of genetic diversity and inbreeding at specific regions of the

  7. Genomic diversity and versatility of Lactobacillus plantarum, a natural metabolic engineer

    PubMed Central

    2011-01-01

    In the past decade it has become clear that the lactic acid bacterium Lactobacillus plantarum occupies a diverse range of environmental niches and has an enormous diversity in phenotypic properties, metabolic capacity and industrial applications. In this review, we describe how genome sequencing, comparative genome hybridization and comparative genomics have provided insight into the underlying genomic diversity and versatility of L. plantarum. One of the main features appears to be genomic life-style islands consisting of numerous functional gene cassettes, in particular for carbohydrate utilization, which can be acquired, shuffled, substituted or deleted in response to niche requirements. In this sense, L. plantarum can be considered a “natural metabolic engineer”. PMID:21995294

  8. Intraspecies genomic diversity and natural population structure of the meat-borne lactic acid bacterium Lactobacillus sakei.

    PubMed

    Chaillou, Stéphane; Daty, Marie; Baraige, Fabienne; Dudez, Anne-Marie; Anglade, Patricia; Jones, Rhys; Alpert, Carl-Alfred; Champomier-Vergès, Marie-Christine; Zagorec, Monique

    2009-02-01

    Lactobacillus sakei is a food-borne bacterium naturally found in meat and fish products. A study was performed to examine the intraspecies diversity among 73 isolates sourced from laboratory collections in several different countries. Pulsed-field gel electrophoresis analysis demonstrated a 25% variation in genome size between isolates, ranging from 1,815 kb to 2,310 kb. The relatedness between isolates was then determined using a PCR-based method that detects the possession of 60 chromosomal genes belonging to the flexible gene pool. Ten different strain clusters were identified that had noticeable differences in their average genome size reflecting the natural population structure. The results show that many different genotypes may be isolated from similar types of meat products, suggesting a complex ecological habitat in which intraspecies diversity may be required for successful adaptation. Finally, proteomic analysis revealed a slight difference between the migration patterns of highly abundant GapA isoforms of the two prevailing L. sakei subspecies (sakei and carnosus). This analysis was used to affiliate the genotypic clusters with the corresponding subspecies. These findings reveal for the first time the extent of intraspecies genomic diversity in L. sakei. Consequently, identification of molecular subtypes may in the future prove valuable for a better understanding of microbial ecosystems in food products.

  9. Genome re-sequencing reveals the history of apple and supports a two-stage model for fruit enlargement.

    PubMed

    Duan, Naibin; Bai, Yang; Sun, Honghe; Wang, Nan; Ma, Yumin; Li, Mingjun; Wang, Xin; Jiao, Chen; Legall, Noah; Mao, Linyong; Wan, Sibao; Wang, Kun; He, Tianming; Feng, Shouqian; Zhang, Zongying; Mao, Zhiquan; Shen, Xiang; Chen, Xiaoliu; Jiang, Yuanmao; Wu, Shujing; Yin, Chengmiao; Ge, Shunfeng; Yang, Long; Jiang, Shenghui; Xu, Haifeng; Liu, Jingxuan; Wang, Deyun; Qu, Changzhi; Wang, Yicheng; Zuo, Weifang; Xiang, Li; Liu, Chang; Zhang, Daoyuan; Gao, Yuan; Xu, Yimin; Xu, Kenong; Chao, Thomas; Fazio, Gennaro; Shu, Huairui; Zhong, Gan-Yuan; Cheng, Lailiang; Fei, Zhangjun; Chen, Xuesen

    2017-08-15

    Human selection has reshaped crop genomes. Here we report an apple genome variation map generated through genome sequencing of 117 diverse accessions. A comprehensive model of apple speciation and domestication along the Silk Road is proposed based on evidence from diverse genomic analyses. Cultivated apples likely originate from Malus sieversii in Kazakhstan, followed by intensive introgressions from M. sylvestris. M. sieversii in Xinjiang of China turns out to be an "ancient" isolated ecotype not directly contributing to apple domestication. We have identified selective sweeps underlying quantitative trait loci/genes of important fruit quality traits including fruit texture and flavor, and provide evidence supporting a model of apple fruit size evolution comprising two major events with one occurring prior to domestication and the other during domestication. This study outlines the genetic basis of apple domestication and evolution, and provides valuable information for facilitating marker-assisted breeding and apple improvement. Apple is one of the most important fruit crops. Here, the authors perform deep genome resequencing of 117 diverse accessions and reveal comprehensive models of apple origin, speciation, domestication, and fruit size evolution as well as candidate genes associated with important agronomic traits.

  10. Population genomics of Fusarium graminearum reveals signatures of divergent evolution within a major cereal pathogen

    PubMed Central

    2018-01-01

    The cereal pathogen Fusarium graminearum is the primary cause of Fusarium head blight (FHB) and a significant threat to food safety and crop production. To elucidate population structure and identify genomic targets of selection within major FHB pathogen populations in North America we sequenced the genomes of 60 diverse F. graminearum isolates. We also assembled the first pan-genome for F. graminearum to clarify population-level differences in gene content potentially contributing to pathogen diversity. Bayesian and phylogenomic analyses revealed genetic structure associated with isolates that produce the novel NX-2 mycotoxin, suggesting a North American population that has remained genetically distinct from other endemic and introduced cereal-infecting populations. Genome scans uncovered distinct signatures of selection within populations, focused in high diversity, frequently recombining regions. These patterns suggested selection for genomic divergence at the trichothecene toxin gene cluster and thirteen additional regions containing genes potentially involved in pathogen specialization. Gene content differences further distinguished populations, in that 121 genes showed population-specific patterns of conservation. Genes that differentiated populations had predicted functions related to pathogenesis, secondary metabolism and antagonistic interactions, though a subset had unique roles in temperature and light sensitivity. Our results indicated that F. graminearum populations are distinguished by dozens of genes with signatures of selection and an array of dispensable accessory genes, suggesting that FHB pathogen populations may be equipped with different traits to exploit the agroecosystem. These findings provide insights into the evolutionary processes and genomic features contributing to population divergence in plant pathogens, and highlight candidate genes for future functional studies of pathogen specialization across evolutionarily and ecologically

  11. Evolution and Diversity of Transposable Elements in Vertebrate Genomes.

    PubMed

    Sotero-Caio, Cibele G; Platt, Roy N; Suh, Alexander; Ray, David A

    2017-01-01

    Transposable elements (TEs) are selfish genetic elements that mobilize in genomes via transposition or retrotransposition and often make up large fractions of vertebrate genomes. Here, we review the current understanding of vertebrate TE diversity and evolution in the context of recent advances in genome sequencing and assembly techniques. TEs make up 4-60% of assembled vertebrate genomes, and deeply branching lineages such as ray-finned fishes and amphibians generally exhibit a higher TE diversity than the more recent radiations of birds and mammals. Furthermore, the list of taxa with exceptional TE landscapes is growing. We emphasize that the current bottleneck in genome analyses lies in the proper annotation of TEs and provide examples where superficial analyses led to misleading conclusions about genome evolution. Finally, recent advances in long-read sequencing will soon permit access to TE-rich genomic regions that previously resisted assembly including the gigantic, TE-rich genomes of salamanders and lungfishes. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  12. Australian wild rice reveals pre-domestication origin of polymorphism deserts in rice genome.

    PubMed

    Krishnan S, Gopala; Waters, Daniel L E; Henry, Robert J

    2014-01-01

    Rice is a major source of human food with a predominantly Asian production base. Domestication involved selection of traits that are desirable for agriculture and to human consumers. Wild relatives of crop plants are a source of useful variation which is of immense value for crop improvement. Australian wild rices have been isolated from the impacts of domestication in Asia and represent a source of novel diversity for global rice improvement. Oryza rufipogon is a perennial wild progenitor of cultivated rice. Oryza meridionalis is a related annual species in Australia. We have examined the sequence of the genomes of AA genome wild rices from Australia that are close relatives of cultivated rice through whole genome re-sequencing. Assembly of the resequencing data to the O. sativa ssp. japonica cv. Nipponbare shows that Australian wild rices possess 2.5 times more single nucleotide polymorphisms than in the Asian wild rice and cultivated O. sativa ssp. indica. Analysis of the genome of domesticated rice reveals regions of low diversity that show very little variation (polymorphism deserts). Both the perennial and annual wild rice from Australia show a high degree of conservation of sequence with that found in cultivated rice in the same 4.58 Mbp region on chromosome 5, which suggests that some of the 'polymorphism deserts' in this and other parts of the rice genome may have originated prior to domestication due to natural selection. Analysis of genes in the 'polymorphism deserts' indicates that this selection may have been due to biotic or abiotic stress in the environment of early rice relatives. Despite having closely related sequences in these genome regions, the Australian wild populations represent an invaluable source of diversity supporting rice food security.

  13. Consequences of genomic diversity in Mycobacterium tuberculosis

    PubMed Central

    Coscolla, Mireia; Gagneux, Sebastien

    2014-01-01

    The causative agent of human tuberculosis, Mycobacterium tuberculosis complex (MTBC), comprises seven phylogenetically distinct lineages associated with different geographical regions. Here we review the latest findings on the nature and amount of genomic diversity within and between MTBC lineages. We then review recent evidence for the effect of this genomic diversity on mycobacterial phenotypes measured experimentally and in clinical settings. We conclude that overall, the most geographically widespread Lineage 2 (includes Beijing) and Lineage 4 (also known as Euro-American) are more virulent than other lineages that are more geographically restricted. This increased virulence is associated with delayed or reduced pro-inflammatory host immune responses, greater severity of disease, and enhanced transmission. Future work should focus on the interaction between MTBC and human genetic diversity, as well as on the environmental factors that modulate these interactions. PMID:25453224

  14. The Diversity Present in 5140 Human Mitochondrial Genomes

    PubMed Central

    Pereira, Luísa; Freitas, Fernando; Fernandes, Verónica; Pereira, Joana B.; Costa, Marta D.; Costa, Stephanie; Máximo, Valdemar; Macaulay, Vincent; Rocha, Ricardo; Samuels, David C.

    2009-01-01

    We analyzed the current status (as of the end of August 2008) of human mitochondrial genomes deposited in GenBank, amounting to 5140 complete or coding-region sequences, in order to present an overall picture of the diversity present in the mitochondrial DNA of the global human population. To perform this task, we developed mtDNA-GeneSyn, a computer tool that identifies and exhaustively classifies the diversity present in large genetic data sets. The diversity observed in the 5140 human mitochondrial genomes was compared with all possible transitions and transversions from the standard human mitochondrial reference genome. This comparison showed that tRNA and rRNA secondary structures have a large effect in limiting the diversity of the human mitochondrial sequences, whereas for the protein-coding genes there is a bias toward less variation at the second codon positions. The analysis of the observed amino acid variations showed a tolerance of variations that convert between the amino acids V, I, A, M, and T. This defines a group of amino acids with similar chemical properties that can interconvert by a single transition. PMID:19426953

  15. Dynamic Evolution of Pathogenicity Revealed by Sequencing and Comparative Genomics of 19 Pseudomonas syringae Isolates

    PubMed Central

    Romanchuk, Artur; Chang, Jeff H.; Mukhtar, M. Shahid; Cherkis, Karen; Roach, Jeff; Grant, Sarah R.; Jones, Corbin D.; Dangl, Jeffery L.

    2011-01-01

    Closely related pathogens may differ dramatically in host range, but the molecular, genetic, and evolutionary basis for these differences remains unclear. In many Gram-negative bacteria, including the phytopathogen Pseudomonas syringae, type III effectors (TTEs) are essential for pathogenicity, instrumental in structuring host range, and exhibit wide diversity between strains. To capture the dynamic nature of virulence gene repertoires across P. syringae, we screened 11 diverse strains for novel TTE families and coupled this nearly saturating screen with the sequencing and assembly of 14 phylogenetically diverse isolates from a broad collection of diseased host plants. TTE repertoires vary dramatically in size and content across all P. syringae clades; surprisingly few TTEs are conserved and present in all strains. Those that are likely provide basal requirements for pathogenicity. We demonstrate that functional divergence within one conserved locus, hopM1, leads to dramatic differences in pathogenicity, and we demonstrate that phylogenetics-informed mutagenesis can be used to identify functionally critical residues of TTEs. The dynamism of the TTE repertoire is mirrored by diversity in pathways affecting the synthesis of secreted phytotoxins, highlighting the likely role of both types of virulence factors in determination of host range. We used these 14 draft genome sequences, plus five additional genome sequences previously reported, to identify the core genome for P. syringae and we compared this core to that of two closely related non-pathogenic pseudomonad species. These data revealed the recent acquisition of a 1 Mb megaplasmid by a sub-clade of cucumber pathogens. This megaplasmid encodes a type IV secretion system and a diverse set of unknown proteins, which dramatically increases both the genomic content of these strains and the pan-genome of the species. PMID:21799664

  16. Genomic analysis of methanogenic archaea reveals a shift towards energy conservation

    DOE PAGES

    Gilmore, Sean P.; Henske, John K.; Sexton, Jessica A.; ...

    2017-08-21

    The metabolism of archaeal methanogens drives methane release into the environment and is critical to understanding global carbon cycling. Methanogenesis operates at a very low reducing potential compared to other forms of respiration and is therefore critical to many anaerobic environments. Harnessing or altering methanogen metabolism has the potential to mitigate global warming and even be utilized for energy applications. Here, we report draft genome sequences for the isolated methanogens Methanobacterium bryantii, Methanosarcina spelaei, Methanosphaera cuniculi, and Methanocorpusculum parvum. These anaerobic, methane-producing archaea represent a diverse set of isolates, capable of methylotrophic, acetoclastic, and hydrogenotrophic methanogenesis. Assembly and analysis of the genomes allowed for simple and rapid reconstruction of metabolism in the four methanogens. Comparison of the distribution of Clusters of Orthologous Groups (COG) proteins to a sample of genomes from the RefSeq database revealed a trend towards energy conservation in genome composition of all methanogens sequenced. Further analysis of the predicted membrane proteins and transporters distinguished differing energy conservation methods utilized during methanogenesis, such as chemiosmotic coupling in Msar. spelaei and electron bifurcation linked to chemiosmotic coupling in Mbac. bryantii and Msph. cuniculi. Methanogens occupy a unique ecological niche, acting as the terminal electron acceptors in anaerobic environments, and their genomes display a significant shift towards energy conservation. The genome-enabled reconstructed metabolisms reported here have significance to diverse anaerobic communities and have led to proposed substrate utilization not previously reported in isolation, such as formate and methanol metabolism in Mbac. bryantii and CO2 metabolism in Msph. cuniculi. The newly proposed substrates establish an important foundation with which to decipher how methanogens

  17. Genomic analysis of methanogenic archaea reveals a shift towards energy conservation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilmore, Sean P.; Henske, John K.; Sexton, Jessica A.

    The metabolism of archaeal methanogens drives methane release into the environment and is critical to understanding global carbon cycling. Methanogenesis operates at a very low reducing potential compared to other forms of respiration and is therefore critical to many anaerobic environments. Harnessing or altering methanogen metabolism has the potential to mitigate global warming and even be utilized for energy applications. Here, we report draft genome sequences for the isolated methanogens Methanobacterium bryantii, Methanosarcina spelaei, Methanosphaera cuniculi, and Methanocorpusculum parvum. These anaerobic, methane-producing archaea represent a diverse set of isolates, capable of methylotrophic, acetoclastic, and hydrogenotrophic methanogenesis. Assembly and analysis of the genomes allowed for simple and rapid reconstruction of metabolism in the four methanogens. Comparison of the distribution of Clusters of Orthologous Groups (COG) proteins to a sample of genomes from the RefSeq database revealed a trend towards energy conservation in genome composition of all methanogens sequenced. Further analysis of the predicted membrane proteins and transporters distinguished differing energy conservation methods utilized during methanogenesis, such as chemiosmotic coupling in Msar. spelaei and electron bifurcation linked to chemiosmotic coupling in Mbac. bryantii and Msph. cuniculi. Methanogens occupy a unique ecological niche, acting as the terminal electron acceptors in anaerobic environments, and their genomes display a significant shift towards energy conservation. The genome-enabled reconstructed metabolisms reported here have significance to diverse anaerobic communities and have led to proposed substrate utilization not previously reported in isolation, such as formate and methanol metabolism in Mbac. bryantii and CO2 metabolism in Msph. cuniculi. The newly proposed substrates establish an important foundation with which to decipher how methanogens

  18. Diversity and Relationships of Cocirculating Modern Human Rotaviruses Revealed Using Large-Scale Comparative Genomics

    PubMed Central

    McKell, Allison O.; Rippinger, Christine M.; McAllen, John K.; Akopov, Asmik; Kirkness, Ewen F.; Payne, Daniel C.; Edwards, Kathryn M.; Chappell, James D.; Patton, John T.

    2012-01-01

    Group A rotaviruses (RVs) are 11-segmented, double-stranded RNA viruses and are primary causes of gastroenteritis in young children. Despite their medical relevance, the genetic diversity of modern human RVs is poorly understood, and the impact of vaccine use on circulating strains remains unknown. In this study, we report the complete genome sequence analysis of 58 RVs isolated from children with severe diarrhea and/or vomiting at Vanderbilt University Medical Center (VUMC) in Nashville, TN, during the years spanning community vaccine implementation (2005 to 2009). The RVs analyzed include 36 G1P[8], 18 G3P[8], and 4 G12P[8] Wa-like genogroup 1 strains with VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6 genotype constellations of I1-R1-C1-M1-A1-N1-T1-E1-H1. By constructing phylogenetic trees, we identified 2 to 5 subgenotype alleles for each gene. The results show evidence of intragenogroup gene reassortment among the cocirculating strains. However, several isolates from different seasons maintained identical allele constellations, consistent with the notion that certain RV clades persisted in the community. By comparing the genes of VUMC RVs to those of other archival and contemporary RV strains for which sequences are available, we defined phylogenetic lineages and verified that the diversity of the strains analyzed in this study reflects that seen in other regions of the world. Importantly, the VP4 and VP7 proteins encoded by VUMC RVs and other contemporary strains show amino acid changes in or near neutralization domains, which might reflect antigenic drift of the virus. Thus, this large-scale, comparative genomic study of modern human RVs provides significant insight into how this pathogen evolves during its spread in the community. PMID:22696651

  19. 'Candidatus Phytoplasma phoenicium' associated with almond witches'-broom disease: from draft genome to genetic diversity among strain populations.

    PubMed

    Quaglino, Fabio; Kube, Michael; Jawhari, Maan; Abou-Jawdah, Yusuf; Siewert, Christin; Choueiri, Elia; Sobh, Hana; Casati, Paola; Tedeschi, Rosemarie; Lova, Marina Molino; Alma, Alberto; Bianco, Piero Attilio

    2015-07-30

    Almond witches'-broom (AlmWB), a devastating disease of almond, peach and nectarine in Lebanon, is associated with 'Candidatus Phytoplasma phoenicium'. In the present study, we generated a draft genome sequence of 'Ca. P. phoenicium' strain SA213, representative of phytoplasma strain populations from different host plants, and determined the genetic diversity among phytoplasma strain populations by phylogenetic analyses of 16S rRNA, groEL, tufB and inmp gene sequences. Sequence-based typing and phylogenetic analysis of the gene inmp, coding an integral membrane protein, distinguished AlmWB-associated phytoplasma strains originating from diverse host plants, whereas their 16S rRNA, tufB and groEL genes shared 100 % sequence identity. Moreover, dN/dS analysis indicated positive selection acting on inmp gene. Additionally, the analysis of 'Ca. P. phoenicium' draft genome revealed the presence of integral membrane proteins and effector-like proteins and potential candidates for interaction with hosts. One of the integral membrane proteins was predicted as BI-1, an inhibitor of apoptosis-promoting Bax factor. Bioinformatics analyses revealed the presence of putative BI-1 in draft and complete genomes of other 'Ca. Phytoplasma' species. The genetic diversity within 'Ca. P. phoenicium' strain populations in Lebanon suggested that AlmWB disease could be associated with phytoplasma strains derived from the adaptation of an original strain to diverse hosts. Moreover, the identification of a putative inhibitor of apoptosis-promoting Bax factor (BI-1) in 'Ca. P. phoenicium' draft genome and within genomes of other 'Ca. Phytoplasma' species suggested its potential role as a phytoplasma fitness-increasing factor by modification of the host-defense response.

  20. The house spider genome reveals an ancient whole-genome duplication during arachnid evolution.

    PubMed

    Schwager, Evelyn E; Sharma, Prashant P; Clarke, Thomas; Leite, Daniel J; Wierschin, Torsten; Pechmann, Matthias; Akiyama-Oda, Yasuko; Esposito, Lauren; Bechsgaard, Jesper; Bilde, Trine; Buffry, Alexandra D; Chao, Hsu; Dinh, Huyen; Doddapaneni, HarshaVardhan; Dugan, Shannon; Eibner, Cornelius; Extavour, Cassandra G; Funch, Peter; Garb, Jessica; Gonzalez, Luis B; Gonzalez, Vanessa L; Griffiths-Jones, Sam; Han, Yi; Hayashi, Cheryl; Hilbrant, Maarten; Hughes, Daniel S T; Janssen, Ralf; Lee, Sandra L; Maeso, Ignacio; Murali, Shwetha C; Muzny, Donna M; Nunes da Fonseca, Rodrigo; Paese, Christian L B; Qu, Jiaxin; Ronshaugen, Matthew; Schomburg, Christoph; Schönauer, Anna; Stollewerk, Angelika; Torres-Oliva, Montserrat; Turetzek, Natascha; Vanthournout, Bram; Werren, John H; Wolff, Carsten; Worley, Kim C; Bucher, Gregor; Gibbs, Richard A; Coddington, Jonathan; Oda, Hiroki; Stanke, Mario; Ayoub, Nadia A; Prpic, Nikola-Michael; Flot, Jean-François; Posnien, Nico; Richards, Stephen; McGregor, Alistair P

    2017-07-31

    The duplication of genes can occur through various mechanisms and is thought to make a major contribution to the evolutionary diversification of organisms. There is increasing evidence for a large-scale duplication of genes in some chelicerate lineages including two rounds of whole genome duplication (WGD) in horseshoe crabs. To investigate this further, we sequenced and analyzed the genome of the common house spider Parasteatoda tepidariorum. We found pervasive duplication of both coding and non-coding genes in this spider, including two clusters of Hox genes. Analysis of synteny conservation across the P. tepidariorum genome suggests that there has been an ancient WGD in spiders. Comparison with the genomes of other chelicerates, including that of the newly sequenced bark scorpion Centruroides sculpturatus, suggests that this event occurred in the common ancestor of spiders and scorpions, and is probably independent of the WGDs in horseshoe crabs. Furthermore, characterization of the sequence and expression of the Hox paralogs in P. tepidariorum suggests that many have been subject to neo-functionalization and/or sub-functionalization since their duplication. Our results reveal that spiders and scorpions are likely the descendants of a polyploid ancestor that lived more than 450 MYA. Given the extensive morphological diversity and ecological adaptations found among these animals, rivaling those of vertebrates, our study of the ancient WGD event in Arachnopulmonata provides a new comparative platform to explore common and divergent evolutionary outcomes of polyploidization events across eukaryotes.

  1. Intraspecies Genomic Diversity and Natural Population Structure of the Meat-Borne Lactic Acid Bacterium Lactobacillus sakei

    PubMed Central

    Chaillou, Stéphane; Daty, Marie; Baraige, Fabienne; Dudez, Anne-Marie; Anglade, Patricia; Jones, Rhys; Alpert, Carl-Alfred; Champomier-Vergès, Marie-Christine; Zagorec, Monique

    2009-01-01

    Lactobacillus sakei is a food-borne bacterium naturally found in meat and fish products. A study was performed to examine the intraspecies diversity among 73 isolates sourced from laboratory collections in several different countries. Pulsed-field gel electrophoresis analysis demonstrated a 25% variation in genome size between isolates, ranging from 1,815 kb to 2,310 kb. The relatedness between isolates was then determined using a PCR-based method that detects the possession of 60 chromosomal genes belonging to the flexible gene pool. Ten different strain clusters were identified that had noticeable differences in their average genome size reflecting the natural population structure. The results show that many different genotypes may be isolated from similar types of meat products, suggesting a complex ecological habitat in which intraspecies diversity may be required for successful adaptation. Finally, proteomic analysis revealed a slight difference between the migration patterns of highly abundant GapA isoforms of the two prevailing L. sakei subspecies (sakei and carnosus). This analysis was used to affiliate the genotypic clusters with the corresponding subspecies. These findings reveal for the first time the extent of intraspecies genomic diversity in L. sakei. Consequently, identification of molecular subtypes may in the future prove valuable for a better understanding of microbial ecosystems in food products. PMID:19114527

  2. Cancer Genomics: Diversity and Disparity Across Ethnicity and Geography.

    PubMed

    Tan, Daniel S W; Mok, Tony S K; Rebbeck, Timothy R

    2016-01-01

    Ethnic and geographic differences in cancer incidence, prognosis, and treatment outcomes can be attributed to diversity in the inherited (germline) and somatic genome. Although international large-scale sequencing efforts are beginning to unravel the genomic underpinnings of cancer traits, much remains to be known about the underlying mechanisms and determinants of genomic diversity. Carcinogenesis is a dynamic, complex phenomenon representing the interplay between genetic and environmental factors that results in divergent phenotypes across ethnicities and geography. For example, compared with whites, there is a higher incidence of prostate cancer among Africans and African Americans, and the disease is generally more aggressive and fatal. Genome-wide association studies have identified germline susceptibility loci that may account for differences between the African and non-African patients, but the lack of availability of appropriate cohorts for replication studies and the incomplete understanding of genomic architecture across populations pose major limitations. We further discuss the transformative potential of routine diagnostic evaluation for actionable somatic alterations, using lung cancer as an example, highlighting implications of population disparities, current hurdles in implementation, and the far-reaching potential of clinical genomics in enhancing cancer prevention, diagnosis, and treatment. As we enter the era of precision cancer medicine, a concerted multinational effort is key to addressing population and genomic diversity as well as overcoming barriers and geographical disparities in research and health care delivery. © 2015 by American Society of Clinical Oncology.

  3. Genotyping-by-Sequencing (GBS) Revealed Molecular Genetic Diversity of Iranian Wheat Landraces and Cultivars

    PubMed Central

    Alipour, Hadi; Bihamta, Mohammad R.; Mohammadi, Valiollah; Peyghambari, Seyed A.; Bai, Guihua; Zhang, Guorong

    2017-01-01

    Background: Genetic diversity is an essential resource for breeders to improve new cultivars with desirable characteristics. Recently, genotyping-by-sequencing (GBS), a next-generation sequencing (NGS) technology that can simplify complex genomes, has been used as a high-throughput and cost-effective molecular tool for routine breeding and screening in many crop species, including species with large genomes. Results: We genotyped a diversity panel of 369 Iranian hexaploid wheat accessions, including 270 landraces collected between 1931 and 1968 in different climate zones and 99 cultivars released between 1942 and 2014, using 16,506 GBS-based single nucleotide polymorphism (GBS-SNP) markers. The B genome had the highest number of mapped SNPs while the D genome had the lowest on both the Chinese Spring and W7984 references. Structure and cluster analyses divided the panel into three groups, two landrace groups and one cultivar group, suggesting a high differentiation between landraces and cultivars and between landraces. The cultivar group can be further divided into four subgroups, one of which was mostly derived from Iranian ancestor(s). Similarly, landrace groups can be further divided based on years of collection and climate zones where the accessions were collected. Molecular analysis of variance indicated that the genetic variation was larger between groups than within groups. Conclusion: Obvious genetic diversity in Iranian wheat was revealed by analysis of GBS-SNPs and thus breeders can select genetically distant parents for crossing in breeding. The diverse Iranian landraces provide rich genetic sources of tolerance to biotic and abiotic stresses, and they can be useful resources for the improvement of wheat production in Iran and other countries. PMID:28912785

  4. Toward Integration of Comparative Genetic, Physical, Diversity, and Cytomolecular Maps for Grasses and Grains, Using the Sorghum Genome as a Foundation

    PubMed Central

    Draye, Xavier; Lin, Yann-Rong; Qian, Xiao-yin; Bowers, John E.; Burow, Gloria B.; Morrell, Peter L.; Peterson, Daniel G.; Presting, Gernot G.; Ren, Shu-xin; Wing, Rod A.; Paterson, Andrew H.

    2001-01-01

    The small genome of sorghum (Sorghum bicolor L. Moench.) provides an important template for study of closely related large-genome crops such as maize (Zea mays) and sugarcane (Saccharum spp.), and is a logical complement to distantly related rice (Oryza sativa) as a “grass genome model.” Using a high-density RFLP map as a framework, a robust physical map of sorghum is being assembled by integrating hybridization and fingerprint data with comparative data from related taxa such as rice and using new methods to resolve genomic duplications into locus-specific groups. By taking advantage of allelic variation revealed by heterologous probes, the positions of corresponding loci on the wheat (Triticum aestivum), rice, maize, sugarcane, and Arabidopsis genomes are being interpolated on the sorghum physical map. Bacterial artificial chromosomes for the small genome of rice are shown to close several gaps in the sorghum contigs; the emerging rice physical map and assembled sequence will further accelerate progress. An important motivation for developing genomic tools is to relate molecular level variation to phenotypic diversity. “Diversity maps,” which depict the levels and patterns of variation in different gene pools, shed light on relationships of allelic diversity with chromosome organization, and suggest possible locations of genomic regions that are under selection due to major gene effects (some of which may be revealed by quantitative trait locus mapping). Both physical maps and diversity maps suggest interesting features that may be integrally related to the chromosomal context of DNA—progress in cytology promises to provide a means to elucidate such relationships. We seek to provide a detailed picture of the structure, function, and evolution of the genome of sorghum and its relatives, together with molecular tools such as locus-specific sequence-tagged site DNA markers and bacterial artificial chromosome contigs that will have enduring value for many

  5. Evolution and Diversity in Human Herpes Simplex Virus Genomes

    PubMed Central

    Gatherer, Derek; Ochoa, Alejandro; Greenbaum, Benjamin; Dolan, Aidan; Bowden, Rory J.; Enquist, Lynn W.; Legendre, Matthieu; Davison, Andrew J.

    2014-01-01

    Herpes simplex virus 1 (HSV-1) causes a chronic, lifelong infection in >60% of adults. Multiple recent vaccine trials have failed, with viral diversity likely contributing to these failures. To understand HSV-1 diversity better, we comprehensively compared 20 newly sequenced viral genomes from China, Japan, Kenya, and South Korea with six previously sequenced genomes from the United States, Europe, and Japan. In this diverse collection of passaged strains, we found that one-fifth of the newly sequenced members share a gene deletion and one-third exhibit homopolymeric frameshift mutations (HFMs). Individual strains exhibit genotypic and potential phenotypic variation via HFMs, deletions, short sequence repeats, and single-nucleotide polymorphisms, although the protein sequence identity between strains exceeds 90% on average. In the first genome-scale analysis of positive selection in HSV-1, we found signs of selection in specific proteins and residues, including the fusion protein glycoprotein H. We also confirmed previous results suggesting that recombination has occurred with high frequency throughout the HSV-1 genome. Despite this, the HSV-1 strains analyzed clustered by geographic origin during whole-genome distance analysis. These data shed light on likely routes of HSV-1 adaptation to changing environments and will aid in the selection of vaccine antigens that are invariant worldwide. PMID:24227835

  6. Within-Host Variations of Human Papillomavirus Reveal APOBEC Signature Mutagenesis in the Viral Genome.

    PubMed

    Hirose, Yusuke; Onuki, Mamiko; Tenjimbayashi, Yuri; Mori, Seiichiro; Ishii, Yoshiyuki; Takeuchi, Takamasa; Tasaka, Nobutaka; Satoh, Toyomi; Morisada, Tohru; Iwata, Takashi; Miyamoto, Shingo; Matsumoto, Koji; Sekizawa, Akihiko; Kukimoto, Iwao

    2018-06-15

    Persistent infection with oncogenic human papillomaviruses (HPVs) causes cervical cancer, accompanied by the accumulation of somatic mutations into the host genome. There are concomitant genetic changes in the HPV genome during viral infection; however, their relevance to cervical carcinogenesis is poorly understood. Here, we explored within-host genetic diversity of HPV by performing deep-sequencing analyses of viral whole-genome sequences in clinical specimens. The whole genomes of HPV types 16, 52, and 58 were amplified by type-specific PCR from total cellular DNA of cervical exfoliated cells collected from patients with cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC) and were deep sequenced. After constructing a reference viral genome sequence for each specimen, nucleotide positions showing changes with >0.5% frequencies compared to the reference sequence were determined for individual samples. In total, 1,052 positions of nucleotide variations were detected in HPV genomes from 151 samples (CIN1, n = 56; CIN2/3, n = 68; ICC, n = 27), with various numbers per sample. Overall, C-to-T and C-to-A substitutions were the dominant changes observed across all histological grades. While C-to-T transitions were predominantly detected in CIN1, their prevalence was decreased in CIN2/3 and fell below that of C-to-A transversions in ICC. Analysis of the trinucleotide context encompassing substituted bases revealed that TpCpN, a preferred target sequence for cellular APOBEC cytosine deaminases, was a primary site for C-to-T substitutions in the HPV genome. These results strongly imply that the APOBEC proteins are drivers of HPV genome mutation, particularly in CIN1 lesions. IMPORTANCE HPVs exhibit surprisingly high levels of genetic diversity, including a large repertoire of minor genomic variants in each viral genotype. Here, by conducting deep-sequencing analyses, we show for the first time a comprehensive snapshot of the within-host genetic

  7. OryzaGenome: Genome Diversity Database of Wild Oryza Species.

    PubMed

    Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi-Xuan; Han, Bin; Kurata, Nori

    2016-01-01

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
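    The VCF download mentioned above is a tab-delimited text format whose data lines begin with the columns CHROM, POS, ID, REF and ALT. A minimal sketch of pulling SNP records out of such a file follows; the function name `iter_snps` and the sample records are hypothetical illustrations, not data or code from OryzaGenome.

```python
def iter_snps(lines):
    """Yield (chrom, pos, ref, alt) for single-nucleotide variants in VCF text lines.

    Assumption: lines follow the standard VCF column order
    (CHROM, POS, ID, REF, ALT, ...). Header lines start with '#'.
    """
    bases = {"A", "C", "G", "T"}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and meta/header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        # ALT may list several alleles separated by commas
        for allele in alt.split(","):
            if ref in bases and allele in bases:
                yield chrom, int(pos), ref, allele


# Hypothetical records, for illustration only
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr01\t100\t.\tA\tG\t50\tPASS\t.",
    "chr01\t200\t.\tAT\tA\t50\tPASS\t.",   # deletion, not a SNP
    "chr02\t300\t.\tC\tT,G\t50\tPASS\t.",  # two alternate alleles
]
snps = list(iter_snps(example))
```

    Indels and missing alleles are skipped because only single-base REF and ALT values pass the check.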

  8. HLA Diversity in the 1000 Genomes Dataset

    PubMed Central

    Gourraud, Pierre-Antoine; Khankhanian, Pouya; Cereb, Nezih; Yang, Soo Young; Feolo, Michael; Maiers, Martin; Rioux, John D.; Hauser, Stephen; Oksenberg, Jorge

    2014-01-01

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation by sequencing at a level that should allow the genome-wide detection of most variants with frequencies as low as 1%. However, in the major histocompatibility complex (MHC), only the top 10 most frequent haplotypes are in the 1% frequency range whereas thousands of haplotypes are present at lower frequencies. Given the limitation of both the coverage and the read length of the sequences generated by the 1000 Genomes Project, the highly variable positions that define HLA alleles may be difficult to identify. We used classical Sanger sequencing techniques to type the HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1 genes in the available 1000 Genomes samples and combined the results with the 103,310 variants in the MHC region genotyped by the 1000 Genomes Project. Using pairwise identity-by-descent distances between individuals and principal component analysis, we established the relationship between ancestry and genetic diversity in the MHC region. As expected, both the MHC variants and the HLA phenotype can identify the major ancestry lineage, informed mainly by the most frequent HLA haplotypes. To some extent, regions of the genome with similar genetic or similar recombination rate have similar properties. An MHC-centric analysis underlines departures between the ancestral background of the MHC and the genome-wide picture. Our analysis of linkage disequilibrium (LD) decay in these samples suggests that overestimation of pairwise LD occurs due to a limited sampling of the MHC diversity. This collection of HLA-specific MHC variants, available on the dbMHC portal, is a valuable resource for future analyses of the role of MHC in population and disease studies. PMID:24988075
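    Pairwise LD of the kind discussed above is commonly summarized with the r² statistic. A minimal sketch, assuming phased haplotypes at two biallelic sites coded as 0/1; the function name `ld_r_squared` and the toy haplotypes are illustrative assumptions, not the method or data of the study.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic sites.

    haplotypes: list of (a, b) tuples, each allele coded 0 or 1
    (assumed phased; this coding is an assumption of the sketch).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom


# Toy haplotypes, for illustration only
complete_ld = ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)])
no_ld = ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])
```

    With only a handful of sampled haplotypes, r² can be large by chance, which is one way limited sampling can inflate LD estimates.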

  9. HLA diversity in the 1000 genomes dataset.

    PubMed

    Gourraud, Pierre-Antoine; Khankhanian, Pouya; Cereb, Nezih; Yang, Soo Young; Feolo, Michael; Maiers, Martin; Rioux, John D; Hauser, Stephen; Oksenberg, Jorge

    2014-01-01

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation by sequencing at a level that should allow the genome-wide detection of most variants with frequencies as low as 1%. However, in the major histocompatibility complex (MHC), only the top 10 most frequent haplotypes are in the 1% frequency range whereas thousands of haplotypes are present at lower frequencies. Given the limitation of both the coverage and the read length of the sequences generated by the 1000 Genomes Project, the highly variable positions that define HLA alleles may be difficult to identify. We used classical Sanger sequencing techniques to type the HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1 genes in the available 1000 Genomes samples and combined the results with the 103,310 variants in the MHC region genotyped by the 1000 Genomes Project. Using pairwise identity-by-descent distances between individuals and principal component analysis, we established the relationship between ancestry and genetic diversity in the MHC region. As expected, both the MHC variants and the HLA phenotype can identify the major ancestry lineage, informed mainly by the most frequent HLA haplotypes. To some extent, regions of the genome with similar genetic or similar recombination rate have similar properties. An MHC-centric analysis underlines departures between the ancestral background of the MHC and the genome-wide picture. Our analysis of linkage disequilibrium (LD) decay in these samples suggests that overestimation of pairwise LD occurs due to a limited sampling of the MHC diversity. This collection of HLA-specific MHC variants, available on the dbMHC portal, is a valuable resource for future analyses of the role of MHC in population and disease studies.

  10. Visualization of Genome Diversity in German Shepherd Dogs.

    PubMed

    Mortlock, Sally-Anne; Booth, Rachel; Mazrier, Hamutal; Khatkar, Mehar S; Williamson, Peter

    2015-01-01

    A loss of genetic diversity may lead to increased disease risks in subpopulations of dogs. The canine breed structure has contributed to relatively small effective population size in many breeds and can limit the options for selective breeding strategies to maintain diversity. With the completion of the canine genome sequencing project, and the subsequent reduction in the cost of genotyping on a genomic scale, evaluating diversity in dogs has become much more accurate and accessible. This provides a potential tool for advising dog breeders and developing breeding programs within a breed. A challenge in doing this is to present complex relationship data in a form that can be readily utilized. Here, we demonstrate the use of a pipeline, known as NetView, to visualize the network of relationships in a subpopulation of German Shepherd Dogs.

  11. Genomic distribution and estimation of nucleotide diversity in natural populations: perspectives from the collared flycatcher (Ficedula albicollis) genome.

    PubMed

    Dutoit, Ludovic; Burri, Reto; Nater, Alexander; Mugal, Carina F; Ellegren, Hans

    2017-07-01

    Properly estimating genetic diversity in populations of nonmodel species requires a basic understanding of how diversity is distributed across the genome and among individuals. To this end, we analysed whole-genome resequencing data from 20 collared flycatchers (genome size ≈1.1 Gb; 10.13 million single nucleotide polymorphisms detected). Genomewide nucleotide diversity was almost identical among individuals (mean = 0.00394, range = 0.00384-0.00401), but diversity levels varied extensively across the genome (95% confidence interval for 200-kb windows = 0.0013-0.0053). Diversity was related to selective constraint such that in comparison with intergenic DNA, diversity at fourfold degenerate sites was reduced to 85%, 3' UTRs to 82%, 5' UTRs to 70% and nondegenerate sites to 12%. There was a strong positive correlation between diversity and chromosome size, probably driven by a higher density of targets for selection on smaller chromosomes increasing the diversity-reducing effect of linked selection. Simulations exploring the ability of sequence data from a small number of genetic markers to capture the observed diversity clearly demonstrated that diversity estimation from finite sampling of such data is bound to be associated with large confidence intervals. Nevertheless, we show that precision in diversity estimation in large outbred populations benefits from increasing the number of loci rather than the number of individuals. Simulations mimicking RAD sequencing showed that this approach gives accurate estimates of genomewide diversity. Based on the patterns of observed diversity and the performed simulations, we provide broad recommendations for how genetic diversity should be estimated in natural populations. © 2016 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
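    Nucleotide diversity, as reported above genome-wide and per window, is usually defined as the average number of pairwise differences per site. A minimal sketch, assuming equal-length, gap-free aligned sequences; the function names and toy sequences are illustrative assumptions and do not reproduce the authors' pipeline.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site (pi) over aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diff = 0
    n_pairs = 0
    for a, b in combinations(sequences, 2):
        total_diff += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_diff / (n_pairs * length)


def windowed_diversity(sequences, window):
    """Pi in consecutive non-overlapping windows (a trailing partial window is dropped)."""
    length = len(sequences[0])
    return [
        nucleotide_diversity([s[i:i + window] for s in sequences])
        for i in range(0, length - window + 1, window)
    ]


# Toy alignment, for illustration only
toy = ["AAAA", "AAAT", "AATT"]
pi_all = nucleotide_diversity(toy)          # 4 differences / (3 pairs * 4 sites)
pi_windows = windowed_diversity(toy, 2)     # first window invariant, second variable
```

    The windowed values show how a single genome-wide mean can hide large differences between regions.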

  12. The complex hybrid origins of the root knot nematodes revealed through comparative genomics

    PubMed Central

    Kumar, Sujai; Koutsovoulos, Georgios; Blaxter, Mark L.

    2014-01-01

    Root knot nematodes (RKN) can infect most of the world’s agricultural crop species and are among the most important of all plant pathogens. As yet however we have little understanding of their origins or the genomic basis of their extreme polyphagy. The most damaging pathogens reproduce by obligatory mitotic parthenogenesis and it has been suggested that these species originated from interspecific hybridizations between unknown parental taxa. We have sequenced the genome of the diploid meiotic parthenogen Meloidogyne floridensis, and use a comparative genomic approach to test the hypothesis that this species was involved in the hybrid origin of the tropical mitotic parthenogen Meloidogyne incognita. Phylogenomic analysis of gene families from M. floridensis, M. incognita and an outgroup species Meloidogyne hapla was carried out to trace the evolutionary history of these species’ genomes, and we demonstrate that M. floridensis was one of the parental species in the hybrid origins of M. incognita. Analysis of the M. floridensis genome itself revealed many gene loci present in divergent copies, as they are in M. incognita, indicating that it too had a hybrid origin. The triploid M. incognita is shown to be a complex double-hybrid between M. floridensis and a third, unidentified, parent. The agriculturally important RKN have very complex origins involving the mixing of several parental genomes by hybridization and their extreme polyphagy and success in agricultural environments may be related to this hybridization, producing transgressive variation on which natural selection can act. It is now clear that studying RKN variation via individual marker loci may fail due to the species’ convoluted origins, and multi-species population genomics is essential to understand the hybrid diversity and adaptive variation of this important species complex. This comparative genomic analysis provides a compelling example of the importance and complexity of hybridization in

  13. Penicillium arizonense, a new, genome sequenced fungal species, reveals a high chemical diversity in secreted metabolites.

    PubMed

    Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica; Nielsen, Jens; Nielsen, Kristian Fog; Workman, Mhairi; Frisvad, Jens Christian

    2016-10-14

    A new soil-borne species belonging to the Penicillium section Canescentia is described, Penicillium arizonense sp. nov. (type strain CBS 141311T = IBT 12289T). The genome was sequenced and assembled into 33.7 Mb containing 12,502 predicted genes. A phylogenetic assessment based on marker genes confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted in the identification of 62 putative biosynthetic gene clusters. Extracts of P. arizonense were analysed for secondary metabolites and austalides, pyripyropenes, tryptoquivalines, fumagillin, pseurotin A, curvulinic acid and xanthoepocin were detected. A comparative analysis against known pathways enabled the proposal of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites, offers a broad range of potential industrial applications for the new species P. arizonense. The description and availability of the genome sequence of P. arizonense, further provides the basis for biotechnological exploitation of this species.

  14. Penicillium arizonense, a new, genome sequenced fungal species, reveals a high chemical diversity in secreted metabolites

    PubMed Central

    Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica; Nielsen, Jens; Nielsen, Kristian Fog; Workman, Mhairi; Frisvad, Jens Christian

    2016-01-01

    A new soil-borne species belonging to the Penicillium section Canescentia is described, Penicillium arizonense sp. nov. (type strain CBS 141311T = IBT 12289T). The genome was sequenced and assembled into 33.7 Mb containing 12,502 predicted genes. A phylogenetic assessment based on marker genes confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted in the identification of 62 putative biosynthetic gene clusters. Extracts of P. arizonense were analysed for secondary metabolites, and austalides, pyripyropenes, tryptoquivalines, fumagillin, pseurotin A, curvulinic acid and xanthoepocin were detected. A comparative analysis against known pathways enabled the proposal of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass-degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites offer a broad range of potential industrial applications for the new species P. arizonense. The description and availability of the genome sequence of P. arizonense further provide the basis for biotechnological exploitation of this species. PMID:27739446

  15. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    PubMed

    Macas, Jiří; Novák, Petr; Pellicer, Jaume; Čížková, Jana; Koblížková, Andrea; Neumann, Pavel; Fuková, Iva; Doležel, Jaroslav; Kelly, Laura J; Leitch, Ilia J

    2015-01-01

    The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  16. A Genome-Wide Association Study Reveals Genes Associated with Fusarium Ear Rot Resistance in a Maize Core Diversity Panel

    PubMed Central

    Zila, Charles T.; Samayoa, L. Fernando; Santiago, Rogelio; Butrón, Ana; Holland, James B.

    2013-01-01

    Fusarium ear rot is a common disease of maize that affects food and feed quality globally. Resistance to the disease is highly quantitative, and maize breeders have difficulty incorporating polygenic resistance alleles from unadapted donor sources into elite breeding populations without having a negative impact on agronomic performance. Identification of specific allele variants contributing to improved resistance may be useful to breeders by allowing selection of resistance alleles in coupling phase linkage with favorable agronomic characteristics. We report the results of a genome-wide association study to detect allele variants associated with increased resistance to Fusarium ear rot in a maize core diversity panel of 267 inbred lines evaluated in two sets of environments. We performed association tests with 47,445 single-nucleotide polymorphisms (SNPs) while controlling for background genomic relationships with a mixed model and identified three marker loci significantly associated with disease resistance in at least one subset of environments. Each associated SNP locus had a relatively small additive effect on disease resistance (±1.1% on a 0–100% scale) but was nevertheless associated with 3 to 12% of the genotypic variation within or across environment subsets. Two of the three identified SNPs colocalized with genes that have been implicated in programmed cell death. An analysis of associated allele frequencies within the major maize subpopulations revealed enrichment for resistance alleles in the tropical/subtropical and popcorn subpopulations compared with other temperate breeding pools. PMID:24048647

  17. Close Encounters of the Third Domain: The Emerging Genomic View of Archaeal Diversity and Evolution

    PubMed Central

    Spang, Anja; Saw, Jimmy H.; Lind, Anders E.; Ettema, Thijs J. G.

    2013-01-01

    The Archaea represent the so-called Third Domain of life, which has evolved in parallel with the Bacteria and which is thought to have played a pivotal role in the emergence of the eukaryotic domain of life. Recent progress in genomic sequencing technologies and cultivation-independent methods has started to unearth a plethora of data on novel, uncultivated archaeal lineages. Here, we review how the availability of such genomic data has revealed several important insights into the diversity, ecological relevance, metabolic capacity, and the origin and evolution of the archaeal domain of life. PMID:24348093

  18. Genomic Evidence Reveals the Extreme Diversity and Wide Distribution of the Arsenic-Related Genes in Burkholderiales

    PubMed Central

    Li, Xiangyang; Zhang, Linshuang; Wang, Gejiao

    2014-01-01

    So far, numerous genes have been found to be associated with various strategies to resist and transform the toxic metalloid arsenic (here, we denote these genes as “arsenic-related genes”). However, our knowledge of the distribution, redundancies and organization of these genes in bacteria is still limited. In this study, we analyzed 188 Burkholderiales genomes and found that 95% of genomes harbored arsenic-related genes, with an average of 6.6 genes per genome. The results indicated: a) compared to a low frequency of distribution for aio (arsenite oxidase) (12 strains), arr (arsenate respiratory reductase) (1 strain) and arsM (arsenite methyltransferase)-like genes (4 strains), the ars (arsenic resistance system)-like genes were identified in 174 strains, comprising 1,051 genes; b) two-thirds of the ars-like genes were clustered as ars operons and displayed a high diversity of gene organizations (68 forms), which may suggest rapid movement and evolution of ars-like genes in bacterial genomes; c) the arsenite efflux system was dominated by the ACR3 form rather than ArsB in Burkholderiales; d) only a few arsM and arrAB genes were found, indicating that neither As(III) biomethylation nor As(V) respiration is the primary mechanism in Burkholderiales members; e) the aio-like gene is mostly flanked by ars-like genes and a phosphate transport system, implying a close functional relatedness between arsenic and phosphorus metabolism. On average, the number of arsenic-related genes per genome of strains isolated from arsenic-rich environments is more than four times higher than that of strains from other environments. Compared with human, plant and animal pathogens, the environmental strains possess a larger average number of arsenic-related genes, which indicates that habitat is likely a key driver for bacterial arsenic resistance. PMID:24632831

  19. Employing genome-wide SNP discovery and genotyping strategy to extrapolate the natural allelic diversity and domestication patterns in chickpea

    PubMed Central

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    The genome-wide discovery and high-throughput genotyping of SNPs in chickpea natural germplasm lines is indispensable to extrapolate their natural allelic diversity, domestication, and linkage disequilibrium (LD) patterns leading to the genetic enhancement of this vital legume crop. We discovered 44,844 high-quality SNPs by sequencing of 93 diverse cultivated desi, kabuli, and wild chickpea accessions using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays that were physically mapped across eight chromosomes of desi and kabuli. Of these, 22,542 SNPs were structurally annotated in different coding and non-coding sequence components of genes. Genes with 3296 non-synonymous and 269 regulatory SNPs could functionally differentiate accessions based on their contrasting agronomic traits. A high experimental validation success rate (92%) and reproducibility (100%) along with strong sensitivity (93–96%) and specificity (99%) of GBS-based SNPs were observed. This implies the robustness of GBS as a high-throughput assay for rapid large-scale mining and genotyping of genome-wide SNPs in chickpea with sub-optimal use of resources. With 23,798 genome-wide SNPs, a relatively high intra-specific polymorphic potential (49.5%) and broader molecular diversity (13–89%)/functional allelic diversity (18–77%) were apparent among 93 chickpea accessions, suggesting their tremendous applicability in rapid selection of desirable diverse accessions/inter-specific hybrids in chickpea crossbred varietal improvement programs. The genome-wide SNPs revealed a complex admixed domestication pattern, extensive LD estimates (0.54–0.68) and extended LD decay (400–500 kb) in a structured population inclusive of 93 accessions. These findings reflect the utility of our identified SNPs for subsequent genome-wide association study (GWAS) and selective sweep-based domestication trait dissection analysis to identify potential genomic loci (gene-associated targets) specifically

  20. Genetic diversity and trait genomic prediction in a pea diversity panel.

    PubMed

    Burstin, Judith; Salloignon, Pauline; Chabert-Martinello, Marianne; Magnin-Robert, Jean-Bernard; Siol, Mathieu; Jacquin, Françoise; Chauveau, Aurélie; Pont, Caroline; Aubert, Grégoire; Delaitre, Catherine; Truntzer, Caroline; Duc, Gérard

    2015-02-21

    Pea (Pisum sativum L.), a major pulse crop grown for its protein-rich seeds, is an important component of agroecological cropping systems in diverse regions of the world. New breeding challenges imposed by global climate change and new regulations urge pea breeders to adopt more efficient methods of selection and to take better advantage of the large genetic diversity present in the Pisum sativum genepool. Diversity studies conducted so far in pea used Simple Sequence Repeat (SSR) and Retrotransposon Based Insertion Polymorphism (RBIP) markers. Recently, SNP marker panels have been developed that will be useful for genetic diversity assessment and marker-assisted selection. A collection of diverse pea accessions, including landraces and cultivars of garden, field or fodder peas as well as wild peas, was characterised at the molecular level using newly developed SNP markers, as well as SSR markers and RBIP markers. The three types of markers were used to describe the structure of the collection and revealed different pictures of the genetic diversity among the collection. SSR showed the fastest rate of evolution and RBIP the slowest rate of evolution, pointing to their contrasting modes of evolution. SNP markers were then used to predict phenotypes recorded for the collection: the date of flowering (BegFlo), the number of seeds per plant (Nseed) and thousand seed weight (TSW). Different statistical methods were tested, including the LASSO (Least Absolute Shrinkage and Selection Operator), PLS (Partial Least Squares), SPLS (Sparse Partial Least Squares), Bayes A, Bayes B and GBLUP (Genomic Best Linear Unbiased Prediction) methods, and the structure of the collection was taken into account in the prediction. Despite a limited number of 331 markers used for prediction, TSW was reliably predicted. The development of marker assisted selection has not reached its full potential in pea until now. This paper shows that the high-throughput SNP arrays that are being

  1. Family-wide Structural Characterization and Genomic Comparisons Decode the Diversity-oriented Biosynthesis of Thalassospiramides by Marine Proteobacteria

    PubMed Central

    Zhang, Weipeng; Lu, Liang; Lai, Qiliang; Zhu, Beika; Li, Zhongrui; Xu, Ying; Shao, Zongze; Herrup, Karl; Moore, Bradley S.; Ross, Avena C.; Qian, Pei-Yuan

    2016-01-01

    The thalassospiramide lipopeptides have great potential for therapeutic applications; however, their structural and functional diversity and biosynthesis are poorly understood. Here, by cultivating 130 Rhodospirillaceae strains sampled from oceans worldwide, we discovered 21 new thalassospiramide analogues and demonstrated their neuroprotective effects. To investigate the diversity of biosynthetic gene cluster (BGC) architectures, we sequenced the draft genomes of 28 Rhodospirillaceae strains. Our family-wide genomic analysis revealed three types of dysfunctional BGCs and four functional BGCs whose architectures correspond to four production patterns. This correlation allowed us to reassess the “diversity-oriented biosynthesis” proposed for the microbial production of thalassospiramides, which involves iteration of several key modules. Preliminary evolutionary investigation suggested that the functional BGCs could have arisen through module/domain loss, whereas the dysfunctional BGCs arose through horizontal gene transfer. Further comparative genomics indicated that thalassospiramide production is likely to be attendant on particular genes/pathways for amino acid metabolism, signal transduction, and compound efflux. Our findings provide a systematic understanding of thalassospiramide production and new insights into the underlying mechanism. PMID:27875306

  2. Ethiopian Population Dermatoglyphic Study Reveals Linguistic Stratification of Diversity

    PubMed Central

    2015-01-01

    Ethnic, blood type, & gender-wise population variations in Dermatoglyphic manifestations are of interest for assessing intra-group diversity and differentiation. The present study reports on the analysis of qualitative and quantitative finger Dermatoglyphic traits of 382 individuals cross-sectionally sampled from an administrative region of Ethiopia, consisting of five ethnic cohorts from the Afro-Asiatic & Nilo-Saharan affiliations. These Dermatoglyphic parameters were then applied in the assessment of diversity & differentiation, including Heterozygosity, Fixation, Panmixia, Wahlund’s variance, Nei’s measure of genetic diversity, and thumb & finger pattern genotypes, which were in turn used in homology inferences as summarized by a Neighbour-Joining tree constructed from Nei’s standard genetic distance. Results revealed significant correlation between Dermatoglyphics & population parameters that were further found to be in concordance with the historical accounts of the ethnic groups. Such inductions as the ancient north-eastern presence and subsequent admixture events of the Oromos (PII= 15.01), the high diversity of the Amharas (H= 0.1978, F= 0.6453, and P= 0.4144), and the Nilo-Saharan origin of the Berta group (PII= 10.66) are evidence of this. The study has further tested the possibility of applying Dermatoglyphics in population genetic & anthropologic research, highlighting the prospect of developing a method to trace back population origins & ancient movement patterns. Additionally, linguistic clustering was deemed significant for the Ethiopian population, coinciding with recent genome-wide studies that have found linguistic clustering to be more crucial than geographical patterning in the Ethiopian context. Finally, Dermatoglyphic markers have been proven to be endowed with a strong potential as non-invasive preliminary tools applicable prior to genetic studies to analyze ethnically sub

  3. Comparative genomics of closely related Salmonella enterica serovar Typhi strains reveals genome dynamics and the acquisition of novel pathogenic elements.

    PubMed

    Yap, Kien-Pong; Gan, Han Ming; Teh, Cindy Shuan Ju; Chai, Lay Ching; Thong, Kwai Lin

    2014-11-20

    Typhoid fever is an infectious disease of global importance that is caused by Salmonella enterica subsp. enterica serovar Typhi (S. Typhi). This disease causes an estimated 200,000 deaths per year and remains a serious global health threat. S. Typhi is strictly a human pathogen, and some recovered individuals become long-term carriers who continue to shed the bacteria in their faeces, thus becoming main reservoirs of infection. A comparative genomics analysis combined with a phylogenomic analysis revealed that the strains from the outbreak and carrier were closely related with microvariations and possibly derived from a common ancestor. Additionally, the comparative genomics analysis with all of the other completely sequenced S. Typhi genomes revealed that strains BL196 and CR0044 exhibit unusual genomic variations despite S. Typhi being generally regarded as highly clonal. The two genomes shared distinct chromosomal architectures and uncommon genome features; notably, the presence of a ~10 kb novel genomic island containing uncharacterised virulence-related genes, and zot in particular. Variations were also detected in the T6SS system and genes that were related to SPI-10, insertion sequences, CRISPRs and nsSNPs among the studied genomes. Interestingly, the carrier strain CR0044 harboured far more genetic polymorphisms (83% mutant nsSNPs) compared with the closely related BL196 outbreak strain. Notably, the two highly related virulence-determinant genes, rpoS and tviE, were mutated in strains BL196 and CR0044, respectively, which revealed that the mutation in rpoS is stabilising, while that in tviE is destabilising. These microvariations provide novel insight into the optimisation of genes by the pathogens. However, the sporadic strain was found to be far more conserved compared with the others. The uncommon genomic variations in the two closely related BL196 and CR0044 strains suggest that S. Typhi is more diverse than previously thought. Our study has

  4. The genome of Mesobuthus martensii reveals a unique adaptation model of arthropods

    PubMed Central

    Cao, Zhijian; Yu, Yao; Wu, Yingliang; Hao, Pei; Di, Zhiyong; He, Yawen; Chen, Zongyun; Yang, Weishan; Shen, Zhiyong; He, Xiaohua; Sheng, Jia; Xu, Xiaobo; Pan, Bohu; Feng, Jing; Yang, Xiaojuan; Hong, Wei; Zhao, Wenjuan; Li, Zhongjie; Huang, Kai; Li, Tian; Kong, Yimeng; Liu, Hui; Jiang, Dahe; Zhang, Binyan; Hu, Jun; Hu, Youtian; Wang, Bin; Dai, Jianliang; Yuan, Bifeng; Feng, Yuqi; Huang, Wei; Xing, Xiaojing; Zhao, Guoping; Li, Xuan; Li, Yixue; Li, Wenxin

    2013-01-01

    Representing a basal branch of arachnids, scorpions are known as ‘living fossils’ that maintain an ancient anatomy and are adapted to have survived extreme climate changes. Here we report the genome sequence of Mesobuthus martensii, containing 32,016 protein-coding genes, the most among sequenced arthropods. Although M. martensii appears to evolve conservatively, it has a greater gene family turnover than the insects that have undergone diverse morphological and physiological changes, suggesting the decoupling of the molecular and morphological evolution in scorpions. Underlying the long-term adaptation of scorpions is the expansion of the gene families enriched in basic metabolic pathways, signalling pathways, neurotoxins and cytochrome P450, and the different dynamics of expansion between the shared and the scorpion lineage-specific gene families. Genomic and transcriptomic analyses further illustrate the important genetic features associated with prey, nocturnal behaviour, feeding and detoxification. The M. martensii genome reveals a unique adaptation model of arthropods, offering new insights into the genetic bases of the living fossils. PMID:24129506

  5. Genomic Analysis of Hospital Plumbing Reveals Diverse Reservoir of Bacterial Plasmids Conferring Carbapenem Resistance.

    PubMed

    Weingarten, Rebecca A; Johnson, Ryan C; Conlan, Sean; Ramsburg, Amanda M; Dekker, John P; Lau, Anna F; Khil, Pavel; Odom, Robin T; Deming, Clay; Park, Morgan; Thomas, Pamela J; Henderson, David K; Palmore, Tara N; Segre, Julia A; Frank, Karen M

    2018-02-06

    The hospital environment is a potential reservoir of bacteria with plasmids conferring carbapenem resistance. Our Hospital Epidemiology Service routinely performs extensive sampling of high-touch surfaces, sinks, and other locations in the hospital. Over a 2-year period, additional sampling was conducted at a broader range of locations, including housekeeping closets, wastewater from hospital internal pipes, and external manholes. We compared these data with previously collected information from 5 years of patient clinical and surveillance isolates. Whole-genome sequencing and analysis of 108 isolates provided comprehensive characterization of blaKPC/blaNDM-positive isolates, enabling an in-depth genetic comparison. Strikingly, despite a very low prevalence of patient infections with blaKPC-positive organisms, all samples from the intensive care unit pipe wastewater and external manholes contained carbapenemase-producing organisms (CPOs), suggesting a vast, resilient reservoir. We observed a diverse set of species and plasmids, and we noted species and susceptibility profile differences between environmental and patient populations of CPOs. However, there were plasmid backbones common to both populations, highlighting a potential environmental reservoir of mobile elements that may contribute to the spread of resistance genes. Clear associations between patient and environmental isolates were uncommon based on sequence analysis and epidemiology, suggesting reasonable infection control compliance at our institution. Nonetheless, a probable nosocomial transmission of Leclercia sp. from the housekeeping environment to a patient was detected by this extensive surveillance. These data and analyses further our understanding of CPOs in the hospital environment and are broadly relevant to the design of infection control strategies in many infrastructure settings. IMPORTANCE Carbapenemase-producing organisms (CPOs) are a global concern because of the morbidity and

  6. Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants.

    PubMed

    van Baren, Marijke J; Bachy, Charles; Reistetter, Emily Nahas; Purvine, Samuel O; Grimwood, Jane; Sudek, Sebastian; Yu, Hang; Poirier, Camille; Deerinck, Thomas J; Kuo, Alan; Grigoriev, Igor V; Wong, Chee-Hong; Smith, Richard D; Callister, Stephen J; Wei, Chia-Lin; Schmutz, Jeremy; Worden, Alexandra Z

    2016-03-31

    Prasinophytes are widespread marine green algae that are related to plants. Cellular abundance of the prasinophyte Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these unicellular eukaryotes are important for marine ecology and for understanding Viridiplantae evolution and diversification. We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb genome of Micromonas commoda (RCC299; named herein) shows they share ≤8,141 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percentage (26%) of GC splice donors. Micromonas has more genus-specific protein families (19%) than other genome-sequenced prasinophytes (11%). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Surprisingly, multiple vascular plants also have the PG pathway, except the Penicillin-Binding Protein, and share a unique bi-domain protein potentially associated with the pathway. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages. Extensive differences in gene loss and architecture between related prasinophytes underscore

  7. The Global Invertebrate Genomics Alliance (GIGA): Developing Community Resources to Study Diverse Invertebrate Genomes

    PubMed Central

    2014-01-01

    Over 95% of all metazoan (animal) species comprise the “invertebrates,” but very few genomes from these organisms have been sequenced. We have, therefore, formed a “Global Invertebrate Genomics Alliance” (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site (http://giga.nova.edu) has been launched to facilitate this collaborative venture. PMID:24336862

  8. The Global Invertebrate Genomics Alliance (GIGA): developing community resources to study diverse invertebrate genomes.

    PubMed

    Bracken-Grissom, Heather; Collins, Allen G; Collins, Timothy; Crandall, Keith; Distel, Daniel; Dunn, Casey; Giribet, Gonzalo; Haddock, Steven; Knowlton, Nancy; Martindale, Mark; Medina, Mónica; Messing, Charles; O'Brien, Stephen J; Paulay, Gustav; Putnam, Nicolas; Ravasi, Timothy; Rouse, Greg W; Ryan, Joseph F; Schulze, Anja; Wörheide, Gert; Adamska, Maja; Bailly, Xavier; Breinholt, Jesse; Browne, William E; Diaz, M Christina; Evans, Nathaniel; Flot, Jean-François; Fogarty, Nicole; Johnston, Matthew; Kamel, Bishoy; Kawahara, Akito Y; Laberge, Tammy; Lavrov, Dennis; Michonneau, François; Moroz, Leonid L; Oakley, Todd; Osborne, Karen; Pomponi, Shirley A; Rhodes, Adelaide; Santos, Scott R; Satoh, Nori; Thacker, Robert W; Van de Peer, Yves; Voolstra, Christian R; Welch, David Mark; Winston, Judith; Zhou, Xin

    2014-01-01

    Over 95% of all metazoan (animal) species comprise the "invertebrates," but very few genomes from these organisms have been sequenced. We have, therefore, formed a "Global Invertebrate Genomics Alliance" (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site (http://giga.nova.edu) has been launched to facilitate this collaborative venture.

  9. Comparative Genomics Reveals the Core Gene Toolbox for the Fungus-Insect Symbiosis

    PubMed Central

    Stata, Matt; Wang, Wei; White, Merlin M.; Moncalvo, Jean-Marc

    2018-01-01

    Modern genomics has shed light on many entomopathogenic fungi and expanded our knowledge widely; however, little is known about the genomic features of the insect-commensal fungi. Harpellales are obligate commensals living in the digestive tracts of disease-bearing insects (black flies, midges, and mosquitoes). In this study, we produced and annotated whole-genome sequences of nine Harpellales taxa and conducted the first comparative analyses to infer the genomic diversity within the members of the Harpellales. The genomes of the insect gut fungi feature low (26% to 37%) GC content and large genome size variations (25 to 102 Mb). Further comparisons with insect-pathogenic fungi (from both Ascomycota and Zoopagomycota), as well as with free-living relatives (as negative controls), helped to identify a gene toolbox that is essential to the fungus-insect symbiosis. The results not only narrow the genomic scope of fungus-insect interactions from several thousands to eight core players but also distinguish host invasion strategies employed by insect pathogens and commensals. The genomic content suggests that insect commensal fungi rely mostly on adhesion protein anchors that target the digestive system, while entomopathogenic fungi have higher numbers of transmembrane helices, signal peptides, and pathogen-host interaction (PHI) genes across the whole genome and enrich genes as well as functional domains to inactivate the host inflammation system and suppress the host defense. Phylogenomic analyses have revealed that genome sizes of Harpellales fungi vary among lineages with an integer-multiple pattern, which implies that ancient genome duplications may have occurred within the gut of insects. PMID:29764946

  10. Unraveling Mycobacterium tuberculosis genomic diversity and evolution in Lisbon, Portugal, a highly drug resistant setting.

    PubMed

    Perdigão, João; Silva, Hugo; Machado, Diana; Macedo, Rita; Maltez, Fernando; Silva, Carla; Jordao, Luisa; Couto, Isabel; Mallard, Kim; Coll, Francesc; Hill-Cawthorne, Grant A; McNerney, Ruth; Pain, Arnab; Clark, Taane G; Viveiros, Miguel; Portugal, Isabel

    2014-11-18

    Multidrug- (MDR) and extensively drug resistant (XDR) tuberculosis (TB) presents a challenge to disease control and elimination goals. In Lisbon, Portugal, specific and successful XDR-TB strains have been found in circulation for almost two decades. In the present study we have genotyped and sequenced the genomes of 56 Mycobacterium tuberculosis isolates recovered mostly from Lisbon. The genotyping data revealed three major clusters associated with MDR-TB, two of which are associated with XDR-TB. The genomic data helped elucidate the phylogenetic positioning of circulating MDR-TB strains, showing a high predominance of a single SNP cluster group, group 5. Furthermore, a genome-wide phylogeny analysis of these strains, together with 19 publicly available genomes of Mycobacterium tuberculosis clinical isolates, revealed two major clades responsible for M/XDR-TB in the region: Lisboa3 and Q1 (LAM). The data presented by this study yielded insights on microevolution and identified novel compensatory mutations associated with rifampicin resistance in rpoB and rpoC. The screening for other structural variations revealed putative clade-defining variants. One deletion in PPE41, found among Lisboa3 isolates, is proposed to contribute to immune evasion and to act as a selective advantage. Insertion sequence (IS) mapping has also demonstrated the role of IS6110 as a major driver in mycobacterial evolution by affecting gene integrity and regulation. Globally, this study contributes novel genome-wide phylogenetic data and has led to the identification of new genomic variants that support the notion of a growing genomic diversity facing both setting and host adaptation.

  11. Diversity and Genome Analysis of Australian and Global Oilseed Brassica napus L. Germplasm Using Transcriptomics and Whole Genome Re-sequencing.

    PubMed

    Malmberg, M Michelle; Shi, Fan; Spangenberg, German C; Daetwyler, Hans D; Cogan, Noel O I

    2018-01-01

    Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.

  12. Infectious diseases of marine molluscs and host responses as revealed by genomic tools

    PubMed Central

    Ford, Susan E.

    2016-01-01

    More and more infectious diseases affect marine molluscs. Some diseases have impacted commercial species, including MSX and Dermo of the eastern oyster, QPX of hard clams, withering syndrome of abalone and ostreid herpesvirus 1 (OsHV-1) infections of many molluscs. Although the exact transmission mechanisms are not well understood, human activities and associated environmental changes often correlate with increased disease prevalence. For instance, hatcheries and large-scale aquaculture create high host densities, which, along with increasing ocean temperature, might have contributed to OsHV-1 epizootics in scallops and oysters. A key to understanding linkages between the environment and disease is to understand how the environment affects the host immune system. Although we might be tempted to downplay the role of immunity in invertebrates, recent advances in genomics have provided insights into host and parasite genomes and revealed surprisingly sophisticated innate immune systems in molluscs. All major innate immune pathways are found in molluscs, with many immune receptors, regulators and effectors expanded. The expanded gene families provide great diversity and complexity in the innate immune response, which may be key to molluscs' defence against diverse pathogens in the absence of adaptive immunity. Further advances in host and parasite genomics should improve our understanding of genetic variation in parasite virulence and host disease resistance. PMID:26880838

  13. Expanding the Diversity of Mycobacteriophages: Insights into Genome Architecture and Evolution

    PubMed Central

    Pope, Welkin H.; Jacobs-Sera, Deborah; Russell, Daniel A.; Peebles, Craig L.; Al-Atrache, Zein; Alcoser, Turi A.; Alexander, Lisa M.; Alfano, Matthew B.; Alford, Samantha T.; Amy, Nichols E.; Anderson, Marie D.; Anderson, Alexander G.; Ang, Andrew A. S.; Ares, Manuel; Barber, Amanda J.; Barker, Lucia P.; Barrett, Jonathan M.; Barshop, William D.; Bauerle, Cynthia M.; Bayles, Ian M.; Belfield, Katherine L.; Best, Aaron A.; Borjon, Agustin; Bowman, Charles A.; Boyer, Christine A.; Bradley, Kevin W.; Bradley, Victoria A.; Broadway, Lauren N.; Budwal, Keshav; Busby, Kayla N.; Campbell, Ian W.; Campbell, Anne M.; Carey, Alyssa; Caruso, Steven M.; Chew, Rebekah D.; Cockburn, Chelsea L.; Cohen, Lianne B.; Corajod, Jeffrey M.; Cresawn, Steven G.; Davis, Kimberly R.; Deng, Lisa; Denver, Dee R.; Dixon, Breyon R.; Ekram, Sahrish; Elgin, Sarah C. R.; Engelsen, Angela E.; English, Belle E. V.; Erb, Marcella L.; Estrada, Crystal; Filliger, Laura Z.; Findley, Ann M.; Forbes, Lauren; Forsyth, Mark H.; Fox, Tyler M.; Fritz, Melissa J.; Garcia, Roberto; George, Zindzi D.; Georges, Anne E.; Gissendanner, Christopher R.; Goff, Shannon; Goldstein, Rebecca; Gordon, Kobie C.; Green, Russell D.; Guerra, Stephanie L.; Guiney-Olsen, Krysta R.; Guiza, Bridget G.; Haghighat, Leila; Hagopian, Garrett V.; Harmon, Catherine J.; Harmson, Jeremy S.; Hartzog, Grant A.; Harvey, Samuel E.; He, Siping; He, Kevin J.; Healy, Kaitlin E.; Higinbotham, Ellen R.; Hildebrandt, Erin N.; Ho, Jason H.; Hogan, Gina M.; Hohenstein, Victoria G.; Holz, Nathan A.; Huang, Vincent J.; Hufford, Ericka L.; Hynes, Peter M.; Jackson, Arrykka S.; Jansen, Erica C.; Jarvik, Jonathan; Jasinto, Paul G.; Jordan, Tuajuanda C.; Kasza, Tomas; Katelyn, Murray A.; Kelsey, Jessica S.; Kerrigan, Larisa A.; Khaw, Daryl; Kim, Junghee; Knutter, Justin Z.; Ko, Ching-Chung; Larkin, Gail V.; Laroche, Jennifer R.; Latif, Asma; Leuba, Kohana D.; Leuba, Sequoia I.; Lewis, Lynn O.; Loesser-Casey, Kathryn E.; Long, Courtney A.; Lopez, A. Javier; Lowery, Nicholas; Lu, Tina Q.; Mac, Victor; Masters, Isaac R.; McCloud, Jazmyn J.; McDonough, Molly J.; Medenbach, Andrew J.; Menon, Anjali; Miller, Rachel; Morgan, Brandon K.; Ng, Patrick C.; Nguyen, Elvis; Nguyen, Katrina T.; Nguyen, Emilie T.; Nicholson, Kaylee M.; Parnell, Lindsay A.; Peirce, Caitlin E.; Perz, Allison M.; Peterson, Luke J.; Pferdehirt, Rachel E.; Philip, Seegren V.; Pogliano, Kit; Pogliano, Joe; Polley, Tamsen; Puopolo, Erica J.; Rabinowitz, Hannah S.; Resiss, Michael J.; Rhyan, Corwin N.; Robinson, Yetta M.; Rodriguez, Lauren L.; Rose, Andrew C.; Rubin, Jeffrey D.; Ruby, Jessica A.; Saha, Margaret S.; Sandoz, James W.; Savitskaya, Judith; Schipper, Dale J.; Schnitzler, Christine E.; Schott, Amanda R.; Segal, J. Bradley; Shaffer, Christopher D.; Sheldon, Kathryn E.; Shepard, Erica M.; Shepardson, Jonathan W.; Shroff, Madav K.; Simmons, Jessica M.; Simms, Erika F.; Simpson, Brandy M.; Sinclair, Kathryn M.; Sjoholm, Robert L.; Slette, Ingrid J.; Spaulding, Blaire C.; Straub, Clark L.; Stukey, Joseph; Sughrue, Trevor; Tang, Tin-Yun; Tatyana, Lyons M.; Taylor, Stephen B.; Taylor, Barbara J.; Temple, Louise M.; Thompson, Jasper V.; Tokarz, Michael P.; Trapani, Stephanie E.; Troum, Alexander P.; Tsay, Jonathan; Tubbs, Anthony T.; Walton, Jillian M.; Wang, Danielle H.; Wang, Hannah; Warner, John R.; Weisser, Emilie G.; Wendler, Samantha C.; Weston-Hafer, Kathleen A.; Whelan, Hilary M.; Williamson, Kurt E.; Willis, Angelica N.; Wirtshafter, Hannah S.; Wong, Theresa W.; Wu, Phillip; Yang, Yun jeong; Yee, Brandon C.; Zaidins, David A.; Zhang, Bo; Zúniga, Melina Y.; Hendrix, Roger W.; Hatfull, Graham F.

    2011-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts such as Mycobacterium smegmatis and Mycobacterium tuberculosis. All mycobacteriophages characterized to date are dsDNA tailed phages, and have either siphoviral or myoviral morphotypes. However, their genetic diversity is considerable, and although sixty-two genomes have been sequenced and comparatively analyzed, these likely represent only a small portion of the diversity of the mycobacteriophage population at large. Here we report the isolation, sequencing and comparative genomic analysis of 18 new mycobacteriophages isolated from geographically distinct locations within the United States. Although no clear correlation between location and genome type can be discerned, these genomes expand our knowledge of mycobacteriophage diversity and enhance our understanding of the roles of mobile elements in viral evolution. Expansion of the number of mycobacteriophages grouped within Cluster A provides insights into the basis of immune specificity in these temperate phages, and we also describe a novel example of apparent immunity theft. The isolation and genomic analysis of bacteriophages by freshman college students provides an example of an authentic research experience for novice scientists. PMID:21298013

  14. Expanding the diversity of mycobacteriophages: insights into genome architecture and evolution.

    PubMed

    Pope, Welkin H; Jacobs-Sera, Deborah; Russell, Daniel A; Peebles, Craig L; Al-Atrache, Zein; Alcoser, Turi A; Alexander, Lisa M; Alfano, Matthew B; Alford, Samantha T; Amy, Nichols E; Anderson, Marie D; Anderson, Alexander G; Ang, Andrew A S; Ares, Manuel; Barber, Amanda J; Barker, Lucia P; Barrett, Jonathan M; Barshop, William D; Bauerle, Cynthia M; Bayles, Ian M; Belfield, Katherine L; Best, Aaron A; Borjon, Agustin; Bowman, Charles A; Boyer, Christine A; Bradley, Kevin W; Bradley, Victoria A; Broadway, Lauren N; Budwal, Keshav; Busby, Kayla N; Campbell, Ian W; Campbell, Anne M; Carey, Alyssa; Caruso, Steven M; Chew, Rebekah D; Cockburn, Chelsea L; Cohen, Lianne B; Corajod, Jeffrey M; Cresawn, Steven G; Davis, Kimberly R; Deng, Lisa; Denver, Dee R; Dixon, Breyon R; Ekram, Sahrish; Elgin, Sarah C R; Engelsen, Angela E; English, Belle E V; Erb, Marcella L; Estrada, Crystal; Filliger, Laura Z; Findley, Ann M; Forbes, Lauren; Forsyth, Mark H; Fox, Tyler M; Fritz, Melissa J; Garcia, Roberto; George, Zindzi D; Georges, Anne E; Gissendanner, Christopher R; Goff, Shannon; Goldstein, Rebecca; Gordon, Kobie C; Green, Russell D; Guerra, Stephanie L; Guiney-Olsen, Krysta R; Guiza, Bridget G; Haghighat, Leila; Hagopian, Garrett V; Harmon, Catherine J; Harmson, Jeremy S; Hartzog, Grant A; Harvey, Samuel E; He, Siping; He, Kevin J; Healy, Kaitlin E; Higinbotham, Ellen R; Hildebrandt, Erin N; Ho, Jason H; Hogan, Gina M; Hohenstein, Victoria G; Holz, Nathan A; Huang, Vincent J; Hufford, Ericka L; Hynes, Peter M; Jackson, Arrykka S; Jansen, Erica C; Jarvik, Jonathan; Jasinto, Paul G; Jordan, Tuajuanda C; Kasza, Tomas; Katelyn, Murray A; Kelsey, Jessica S; Kerrigan, Larisa A; Khaw, Daryl; Kim, Junghee; Knutter, Justin Z; Ko, Ching-Chung; Larkin, Gail V; Laroche, Jennifer R; Latif, Asma; Leuba, Kohana D; Leuba, Sequoia I; Lewis, Lynn O; Loesser-Casey, Kathryn E; Long, Courtney A; Lopez, A Javier; Lowery, Nicholas; Lu, Tina Q; Mac, Victor; Masters, Isaac R; McCloud, Jazmyn J; 
McDonough, Molly J; Medenbach, Andrew J; Menon, Anjali; Miller, Rachel; Morgan, Brandon K; Ng, Patrick C; Nguyen, Elvis; Nguyen, Katrina T; Nguyen, Emilie T; Nicholson, Kaylee M; Parnell, Lindsay A; Peirce, Caitlin E; Perz, Allison M; Peterson, Luke J; Pferdehirt, Rachel E; Philip, Seegren V; Pogliano, Kit; Pogliano, Joe; Polley, Tamsen; Puopolo, Erica J; Rabinowitz, Hannah S; Resiss, Michael J; Rhyan, Corwin N; Robinson, Yetta M; Rodriguez, Lauren L; Rose, Andrew C; Rubin, Jeffrey D; Ruby, Jessica A; Saha, Margaret S; Sandoz, James W; Savitskaya, Judith; Schipper, Dale J; Schnitzler, Christine E; Schott, Amanda R; Segal, J Bradley; Shaffer, Christopher D; Sheldon, Kathryn E; Shepard, Erica M; Shepardson, Jonathan W; Shroff, Madav K; Simmons, Jessica M; Simms, Erika F; Simpson, Brandy M; Sinclair, Kathryn M; Sjoholm, Robert L; Slette, Ingrid J; Spaulding, Blaire C; Straub, Clark L; Stukey, Joseph; Sughrue, Trevor; Tang, Tin-Yun; Tatyana, Lyons M; Taylor, Stephen B; Taylor, Barbara J; Temple, Louise M; Thompson, Jasper V; Tokarz, Michael P; Trapani, Stephanie E; Troum, Alexander P; Tsay, Jonathan; Tubbs, Anthony T; Walton, Jillian M; Wang, Danielle H; Wang, Hannah; Warner, John R; Weisser, Emilie G; Wendler, Samantha C; Weston-Hafer, Kathleen A; Whelan, Hilary M; Williamson, Kurt E; Willis, Angelica N; Wirtshafter, Hannah S; Wong, Theresa W; Wu, Phillip; Yang, Yun jeong; Yee, Brandon C; Zaidins, David A; Zhang, Bo; Zúniga, Melina Y; Hendrix, Roger W; Hatfull, Graham F

    2011-01-27

    Mycobacteriophages are viruses that infect mycobacterial hosts such as Mycobacterium smegmatis and Mycobacterium tuberculosis. All mycobacteriophages characterized to date are dsDNA tailed phages, and have either siphoviral or myoviral morphotypes. However, their genetic diversity is considerable, and although sixty-two genomes have been sequenced and comparatively analyzed, these likely represent only a small portion of the diversity of the mycobacteriophage population at large. Here we report the isolation, sequencing and comparative genomic analysis of 18 new mycobacteriophages isolated from geographically distinct locations within the United States. Although no clear correlation between location and genome type can be discerned, these genomes expand our knowledge of mycobacteriophage diversity and enhance our understanding of the roles of mobile elements in viral evolution. Expansion of the number of mycobacteriophages grouped within Cluster A provides insights into the basis of immune specificity in these temperate phages, and we also describe a novel example of apparent immunity theft. The isolation and genomic analysis of bacteriophages by freshman college students provides an example of an authentic research experience for novice scientists.

  15. Mitochondrial population genomic analyses reveal population structure and demography of Indian Plasmodium falciparum.

    PubMed

    Tyagi, Suchi; Das, Aparup

    2015-09-01

    Inference on the genetic diversity of Plasmodium falciparum populations could help in better management of malaria. A recent study of mitochondrial (mt) genomes in global P. falciparum revealed interesting evolutionary genetic patterns of Indian isolates in comparison to global ones. However, no population genetic study using whole mt genome sequences of P. falciparum isolates collected across the entire distribution range in India has yet been performed. Here we analyzed 85 whole mt genomes (48 already published and 37 entirely new) sampled from eight differentially endemic Indian locations to estimate genetic diversity and infer the population structure and historical demography of Indian P. falciparum. We found 19 novel Indian-specific Single Nucleotide Polymorphisms (SNPs) and 22 novel haplotypes segregating in Indian P. falciparum. Accordingly, high haplotype and nucleotide diversities were detected in Indian P. falciparum in comparison to many other global isolates. Indian P. falciparum populations were found to be moderately sub-structured, with four different genetic clusters. Interestingly, groups of local populations aggregate to form each cluster; while samples from Jharkhand and Odisha formed a single cluster, P. falciparum isolates from Asom formed an independent one. Similarly, Surat, Bilaspur and Betul formed a single cluster and Goa and Mangalore formed another. Interestingly, P. falciparum isolates from the two latter populations were significantly genetically differentiated from isolates collected in the other six Indian locations. A signature of historical population expansion was evident in five population samples, and the onset of the expansion event was found to be very similar to that of African P. falciparum. In agreement with the previous finding, the estimated Time to Most Recent Common Ancestor (TMRCA) and the effective population size were high in Indian P. falciparum. All these genetic features of Indian P. falciparum with high mt genome

  16. The Microbial Genomes Atlas (MiGA) webserver: taxonomic and gene diversity analysis of Archaea and Bacteria at the whole genome level.

    PubMed

    Rodriguez-R, Luis M; Gunturu, Santosh; Harvey, William T; Rosselló-Mora, Ramon; Tiedje, James M; Cole, James R; Konstantinidis, Konstantinos T

    2018-06-14

    The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called 'Clade Project') to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at http://microbial-genomes.org/.
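
    The ANI concept mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes that one genome has already been cut into fragments and aligned against the other, and that each hit is given as a pair (percent identity, aligned length). The function name, the 1,000 bp fragment length and both thresholds are hypothetical choices for illustration; they are not taken from MiGA, whose actual pipeline is not described here.

```python
from typing import Iterable, Tuple


def average_nucleotide_identity(
    hits: Iterable[Tuple[float, int]],
    fragment_length: int = 1000,          # assumed fragment size (illustrative)
    min_identity: float = 70.0,           # assumed identity cutoff, in percent
    min_aligned_fraction: float = 0.7,    # assumed minimum alignable fraction
) -> float:
    """Mean percent identity over fragment hits that pass both filters.

    Each hit is (percent_identity, aligned_length_in_bp) for one fragment
    of the query genome aligned against the reference genome.
    """
    kept = [
        identity
        for identity, aligned_length in hits
        if identity >= min_identity
        and aligned_length / fragment_length >= min_aligned_fraction
    ]
    if not kept:
        raise ValueError("no fragment hit passed the filters")
    return sum(kept) / len(kept)


# Toy input: two good hits, one low-identity hit, one short alignment.
toy_hits = [(98.0, 1000), (96.0, 950), (60.0, 1000), (99.0, 200)]
ani = average_nucleotide_identity(toy_hits)
```

    In this toy input only the first two hits pass the filters, so the value is the mean of their identities. AAI would follow the same pattern with amino-acid identities of protein hits in place of nucleotide fragments.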

  17. Genomic analysis reveals extensive gene duplication within the bovine TRB locus

    PubMed Central

    Connelley, Timothy; Aerts, Jan; Law, Andy; Morrison, W Ivan

    2009-01-01

    Background: Diverse TR and IG repertoires are generated by V(D)J somatic recombination. Genomic studies have been pivotal in cataloguing the V, D, J and C genes present in the various TR/IG loci and describing how duplication events have expanded the number of these genes. Such studies have also provided insights into the evolution of these loci and the complex mechanisms that regulate TR/IG expression. In this study we analyze the sequence of the third bovine genome assembly to characterize the germline repertoire of bovine TRB genes and compare the organization, evolution and regulatory structure of the bovine TRB locus with that of humans and mice. Results: The TRB locus in the third bovine genome assembly is distributed over 5 scaffolds, extending to ~730 Kb. The available sequence contains 134 TRBV genes, assigned to 24 subgroups, and 3 clusters of DJC genes, each comprising a single TRBD gene, 5–7 TRBJ genes and a single TRBC gene. Seventy-nine of the TRBV genes are predicted to be functional. Comparison with the human and murine TRB loci shows that the gene order, as well as the sequences of non-coding elements that regulate TRB expression, are highly conserved in the bovine. Dot-plot analyses demonstrate that expansion of the genomic TRBV repertoire has occurred via a complex and extensive series of duplications, predominantly involving DNA blocks containing multiple genes. These duplication events have resulted in massive expansion of several TRBV subgroups, most notably TRBV6, 9 and 21, which contain 40, 35 and 16 members respectively. Similarly, duplication has led to the generation of a third DJC cluster. Analysis of cDNA data confirms the diversity of the TRBV genes and, in addition, identifies a substantial number of TRBV genes, predominantly from the larger subgroups, which are still absent from the genome assembly. The observed gene duplication within the bovine TRB locus has created a repertoire of phylogenetically diverse functional TRBV genes

  18. Phenotypic Heterogeneity of Genomically-Diverse Isolates of Streptococcus mutans

    PubMed Central

    Palmer, Sara R.; Miller, James H.; Abranches, Jacqueline; Zeng, Lin; Lefebure, Tristan; Richards, Vincent P.; Lemos, José A.; Stanhope, Michael J.; Burne, Robert A.

    2013-01-01

    High coverage, whole genome shotgun (WGS) sequencing of 57 geographically- and genetically-diverse isolates of Streptococcus mutans from individuals of known dental caries status was recently completed. Of the 57 sequenced strains, fifteen isolates were selected based primarily on differences in gene content and phenotypic characteristics known to affect virulence, and compared with the reference strain UA159. A high degree of variability in these properties was observed between strains, with a broad spectrum of sensitivities to low pH, oxidative stress (air and paraquat) and exposure to competence stimulating peptide (CSP). Significant differences in autolytic behavior and in biofilm development in glucose or sucrose were also observed. Natural genetic competence varied among isolates, and this was correlated with the presence or absence of competence genes, comCDE and comX, and with bacteriocins. In general, strains that lacked the ability to become competent possessed fewer genes for bacteriocins and immunity proteins or contained polymorphic variants of these genes. WGS sequence analysis of the pan-genome revealed, for the first time, components of a Type VII secretion system in several S. mutans strains, as well as two putative ORFs that encode possible collagen binding proteins located upstream of the cnm gene, which is associated with host cell invasiveness. The virulence of these particular strains was assessed in a wax-worm model. This is the first study to combine a comprehensive analysis of key virulence-related phenotypes with extensive genomic analysis of a pathogen that evolved closely with humans. Our analysis highlights the phenotypic diversity of S. mutans isolates and indicates that the species has evolved a variety of adaptive strategies to persist in the human oral cavity and, when conditions are favorable, to initiate disease. PMID:23613838

  19. Genome sequencing and comparative genomics of honey bee microsporidia, Nosema apis reveal novel insights into host-parasite interactions.

    PubMed

    Chen, Yan ping; Pettis, Jeffery S; Zhao, Yan; Liu, Xinyue; Tallon, Luke J; Sadzewicz, Lisa D; Li, Renhua; Zheng, Huoqing; Huang, Shaokang; Zhang, Xuan; Hamilton, Michele C; Pernal, Stephen F; Melathopoulos, Andony P; Yan, Xianghe; Evans, Jay D

    2013-07-05

    The microsporidian parasite Nosema contributes to the steep global decline of honey bees, which are critical pollinators of food crops. Two species of Nosema have been found to infect honey bees, Nosema apis and N. ceranae. Genome sequencing of N. apis and comparative genome analysis with N. ceranae, a fully sequenced microsporidian species, reveal novel insights into host-parasite interactions underlying the parasite infections. We applied the whole-genome shotgun sequencing approach to sequence and assemble the genome of N. apis, which has an estimated size of 8.5 Mbp. We predicted 2,771 protein-coding genes and predicted the function of each putative protein using the Gene Ontology. The comparative genomic analysis led to the identification of 1,356 orthologs that are conserved between the two Nosema species and of genes that are unique to the individual species, thereby providing a list of virulence factors and new genetic tools for studying host-parasite interactions. We also identified a highly abundant motif in the upstream promoter regions of N. apis genes. This motif is also conserved in N. ceranae and other microsporidian species and likely plays a role in gene regulation across the microsporidia. The availability of the N. apis genome sequence is a significant addition to the rapidly expanding body of microsporidian genomic data, which has been improving our understanding of eukaryotic genome diversity and evolution in a broad sense. The predicted virulence genes and transcriptional regulatory elements are potential targets for innovative therapeutics to break down the life cycle of the parasite.

  20. Complete genome analysis of 33 ecologically and biologically diverse Rift Valley fever virus strains reveals widespread virus movement and low genetic diversity due to recent common ancestry.

    PubMed

    Bird, Brian H; Khristova, Marina L; Rollin, Pierre E; Ksiazek, Thomas G; Nichol, Stuart T

    2007-03-01

    Rift Valley fever (RVF) virus is a mosquito-borne RNA virus responsible for large explosive outbreaks of acute febrile disease in humans and livestock in Africa with significant mortality and economic impact. The successful high-throughput generation of the complete genome sequence was achieved for 33 diverse RVF virus strains collected from throughout Africa and Saudi Arabia from 1944 to 2000, including strains differing in pathogenicity in disease models. While several distinct virus genetic lineages were determined, which approximately correlate with geographic origin, multiple exceptions indicative of long-distance virus movement have been found. Virus strains isolated within an epidemic (e.g., Mauritania, 1987, or Egypt, 1977 to 1978) exhibit little diversity, while those in enzootic settings (e.g., 1970s Zimbabwe) can be highly diverse. In addition, the large Saudi Arabian RVF outbreak in 2000 appears to have involved virus introduction from East Africa, based on the close ancestral relationship of a 1998 East African virus. Virus genetic diversity was low (approximately 5%) and primarily involved accumulation of mutations at an average of 2.9 × 10^-4 substitutions/site/year, although some evidence of RNA segment reassortment was found. Bayesian analysis of current RVF virus genetic diversity places the most recent common ancestor of these viruses in the late 1800s, the colonial period in Africa, a time of dramatic changes in agricultural practices and introduction of nonindigenous livestock breeds. In addition to insights into the evolution and ecology of RVF virus, these genomic data also provide a foundation for the design of molecular detection assays and prototype vaccines useful in combating this important disease.

  1. Microsatellite genotyping and genome-wide single nucleotide polymorphism-based indices of Plasmodium falciparum diversity within clinical infections.

    PubMed

    Murray, Lee; Mobegi, Victor A; Duffy, Craig W; Assefa, Samuel A; Kwiatkowski, Dominic P; Laman, Eugene; Loua, Kovana M; Conway, David J

    2016-05-12

    In regions where malaria is endemic, individuals are often infected with multiple distinct parasite genotypes, a situation that may impact on evolution of parasite virulence and drug resistance. Most approaches to studying genotypic diversity have involved analysis of a modest number of polymorphic loci, although whole genome sequencing enables a broader characterisation of samples. PCR-based microsatellite typing of a panel of ten loci was performed on Plasmodium falciparum in 95 clinical isolates from a highly endemic area in the Republic of Guinea, to characterize within-isolate genetic diversity. Separately, single nucleotide polymorphism (SNP) data from genome-wide short-read sequences of the same samples were used to derive within-isolate fixation indices (Fws), an inverse measure of diversity within each isolate compared to overall local genetic diversity. The latter indices were compared with the microsatellite results, and also with indices derived by randomly sampling modest numbers of SNPs. As expected, the number of microsatellite loci with more than one allele in each isolate was highly significantly inversely correlated with the genome-wide Fws fixation index (r = -0.88, P < 0.001). However, the microsatellite analysis revealed that most isolates contained mixed genotypes, even those that had no detectable genome sequence heterogeneity. Random sampling of different numbers of SNPs showed that an Fws index derived from ten or more SNPs with minor allele frequencies of >10% had high correlation (r > 0.90) with the index derived using all SNPs. Different types of data give highly correlated indices of within-infection diversity, although PCR-based analysis detects low-level minority genotypes not apparent in bulk sequence analysis. When whole-genome data are not obtainable, quantitative assay of ten or more SNPs can yield a reasonably accurate estimate of the within-infection fixation index (Fws).
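
    As a rough illustration of how an inverse, within-isolate measure of this kind behaves, the Python sketch below uses the commonly cited form Fws = 1 - Hw/Hs, where Hw is the mean heterozygosity within one isolate and Hs is the mean heterozygosity in the local population, both computed at the same biallelic SNPs. This is a simplified assumption for illustration: the abstract does not give the formula, published estimators are generally more involved than a plain average over SNPs, and the function names and toy allele frequencies are hypothetical.

```python
from typing import Sequence


def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fws(within_freqs: Sequence[float], population_freqs: Sequence[float]) -> float:
    """Simplified within-isolate fixation index, 1 - Hw/Hs.

    within_freqs:     allele frequency at each SNP within one isolate
                      (e.g. the fraction of reads carrying the allele).
    population_freqs: allele frequency at the same SNPs in the local population.
    """
    if len(within_freqs) != len(population_freqs) or not within_freqs:
        raise ValueError("need the same non-zero number of SNPs in both inputs")
    hw = sum(heterozygosity(p) for p in within_freqs) / len(within_freqs)
    hs = sum(heterozygosity(p) for p in population_freqs) / len(population_freqs)
    if hs == 0.0:
        raise ValueError("population heterozygosity is zero; Fws is undefined")
    return 1.0 - hw / hs


population = [0.2, 0.5, 0.3]                      # toy population frequencies
clonal_isolate = fws([0.0, 1.0, 0.0], population)  # single genotype: no within-isolate diversity
mixed_isolate = fws([0.2, 0.5, 0.3], population)   # as diverse as the population itself
```

    Under these assumptions a single-genotype isolate scores 1 and an isolate as diverse as the whole population scores 0, which is the inverse relationship to diversity that the abstract describes.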

  2. Comparative genomic analysis reveals 2-oxoacid dehydrogenase complex lipoylation correlation with aerobiosis in archaea.

    PubMed

    Borziak, Kirill; Posner, Mareike G; Upadhyay, Abhishek; Danson, Michael J; Bagby, Stefan; Dorus, Steve

    2014-01-01

    , the extension of comparative genomic pathway profiling to broader metabolic and homeostasis networks should be useful in revealing characteristics from metagenomic datasets related to adaptations to diverse environments.

  3. Comparative genomic and functional analysis reveal conservation of plant growth promoting traits in Paenibacillus polymyxa and its closely related species

    PubMed Central

    Xie, Jianbo; Shi, Haowen; Du, Zhenglin; Wang, Tianshu; Liu, Xiaomeng; Chen, Sanfeng

    2016-01-01

    Paenibacillus polymyxa has been widely studied as a model of plant-growth promoting rhizobacteria (PGPR). Here, the genome sequences of 9 P. polymyxa strains, together with 26 other sequenced Paenibacillus spp., were comparatively studied. Phylogenetic analysis of the concatenated 244 single-copy core genes suggests that the 9 P. polymyxa strains and 5 other Paenibacillus spp., isolated from diverse geographic regions and ecological niches, form a closely related clade (here called the Poly-clade). Analysis of single nucleotide polymorphisms (SNPs) reveals local diversification of the 14 Poly-clade genomes. SNPs were not evenly distributed throughout the 14 genomes, and the regions with high SNP density contain genes related to secondary metabolism, including genes coding for polyketide. Recombination played an important role in the genetic diversity of this clade, although the rate of recombination was clearly lower than that of mutation. Some genes relevant to plant-growth promoting traits, i.e. phosphate solubilization and IAA production, are well conserved, while some genes relevant to nitrogen fixation and antibiotic synthesis have diversified within the Poly-clade. This study reveals that both P. polymyxa and its closely related species have plant growth promoting traits and have great potential uses in agriculture and horticulture as PGPR. PMID:26856413

  4. Culture independent genomic comparisons reveal environmental adaptations for Altiarchaeales

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bird, Jordan T.; Baker, Brett J.; Probst, Alexander J.

    The recently proposed candidatus order Altiarchaeales remains an uncultured archaeal lineage composed of genetically diverse, globally widespread organisms frequently observed in anoxic subsurface environments. In spite of 15 years of studies on the psychrophilic biofilm-producing Candidatus Altiarchaeum hamiconexum and its close relatives, very little is known about the phylogenetic and functional diversity of the widespread free-living marine members of this taxon. From methanogenic sediments in the White Oak River Estuary, NC, USA, we sequenced a single cell amplified genome (SAG), WOR_SM1_SCG, and used it to identify and refine two high-quality genomes from metagenomes, WOR_SM1_79 and WOR_SM1_86-2, from the same site. These three genomic reconstructions form a monophyletic group, which also includes three previously published genomes from metagenomes from terrestrial springs and a SAG from Sakinaw Lake in a group previously designated as pMC2A384. A synapomorphic mutation in the Altiarchaeales tRNA synthetase β subunit, pheT, caused the protein to be encoded as two subunits at non-adjacent loci. Consistent with the terrestrial spring clades, our estuarine genomes contained a near-complete autotrophic metabolism, H2 or CO as potential electron donors, a reductive acetyl-CoA pathway for carbon fixation, and methylotroph-like NADP(H)-dependent dehydrogenase. Phylogenies based on 16S rRNA genes and concatenated conserved proteins identified two distinct sub-clades of Altiarchaeales, Alti-1 populated by organisms from actively flowing springs, and Alti-2 which was more widespread, diverse, and not associated with visible mats. The core Alti-1 genome suggested Alti-1 is adapted for the stream environment with lipopolysaccharide production capacity and extracellular hami structures. The core Alti-2 genome suggested members of this clade are free-living with distinct mechanisms for energy maintenance, motility, osmoregulation, and sulfur redox reactions. These

  5. Culture Independent Genomic Comparisons Reveal Environmental Adaptations for Altiarchaeales

    PubMed Central

    Baker, Brett J.; Probst, Alexander J.; Podar, Mircea; Lloyd, Karen G.

    2016-01-01

    The recently proposed candidatus order Altiarchaeales remains an uncultured archaeal lineage composed of genetically diverse, globally widespread organisms frequently observed in anoxic subsurface environments. In spite of 15 years of studies on the psychrophilic biofilm-producing Candidatus Altiarchaeum hamiconexum and its close relatives, very little is known about the phylogenetic and functional diversity of the widespread free-living marine members of this taxon. From methanogenic sediments in the White Oak River Estuary, NC, USA, we sequenced a single cell amplified genome (SAG), WOR_SM1_SCG, and used it to identify and refine two high-quality genomes from metagenomes, WOR_SM1_79 and WOR_SM1_86-2, from the same site. These three genomic reconstructions form a monophyletic group, which also includes three previously published genomes from metagenomes from terrestrial springs and a SAG from Sakinaw Lake in a group previously designated as pMC2A384. A synapomorphic mutation in the Altiarchaeales tRNA synthetase β subunit, pheT, caused the protein to be encoded as two subunits at non-adjacent loci. Consistent with the terrestrial spring clades, our estuarine genomes contained a near-complete autotrophic metabolism, H2 or CO as potential electron donors, a reductive acetyl-CoA pathway for carbon fixation, and methylotroph-like NADP(H)-dependent dehydrogenase. Phylogenies based on 16S rRNA genes and concatenated conserved proteins identified two distinct sub-clades of Altiarchaeales, Alti-1 populated by organisms from actively flowing springs, and Alti-2 which was more widespread, diverse, and not associated with visible mats. The core Alti-1 genome suggested Alti-1 is adapted for the stream environment with lipopolysaccharide production capacity and extracellular hami structures. The core Alti-2 genome suggested members of this clade are free-living with distinct mechanisms for energy maintenance, motility, osmoregulation, and sulfur redox reactions. These data

  6. Culture independent genomic comparisons reveal environmental adaptations for Altiarchaeales

    DOE PAGES

    Bird, Jordan T.; Baker, Brett J.; Probst, Alexander J.; ...

    2016-08-05

    The recently proposed candidatus order Altiarchaeales remains an uncultured archaeal lineage composed of genetically diverse, globally widespread organisms frequently observed in anoxic subsurface environments. In spite of 15 years of studies on the psychrophilic biofilm-producing Candidatus Altiarchaeum hamiconexum and its close relatives, very little is known about the phylogenetic and functional diversity of the widespread free-living marine members of this taxon. From methanogenic sediments in the White Oak River Estuary, NC, USA, we sequenced a single cell amplified genome (SAG), WOR_SM1_SCG, and used it to identify and refine two high-quality genomes from metagenomes, WOR_SM1_79 and WOR_SM1_86-2, from the same site. These three genomic reconstructions form a monophyletic group, which also includes three previously published genomes from metagenomes from terrestrial springs and a SAG from Sakinaw Lake in a group previously designated as pMC2A384. A synapomorphic mutation in the Altiarchaeales tRNA synthetase β subunit, pheT, caused the protein to be encoded as two subunits at non-adjacent loci. Consistent with the terrestrial spring clades, our estuarine genomes contained a near-complete autotrophic metabolism, H2 or CO as potential electron donors, a reductive acetyl-CoA pathway for carbon fixation, and methylotroph-like NADP(H)-dependent dehydrogenase. Phylogenies based on 16S rRNA genes and concatenated conserved proteins identified two distinct sub-clades of Altiarchaeales, Alti-1 populated by organisms from actively flowing springs, and Alti-2 which was more widespread, diverse, and not associated with visible mats. The core Alti-1 genome suggested Alti-1 is adapted for the stream environment with lipopolysaccharide production capacity and extracellular hami structures. The core Alti-2 genome suggested members of this clade are free-living with distinct mechanisms for energy maintenance, motility, osmoregulation, and sulfur redox reactions. These

  7. Trends in genome dynamics among major orders of insects revealed through variations in protein families.

    PubMed

    Rappoport, Nadav; Linial, Michal

    2015-08-07

    Insects belong to a class that accounts for the majority of animals on earth. With over one million identified species, insects display a huge diversity and occupy extreme environments. At present, there are dozens of fully sequenced insect genomes that cover a range of habitats, social behavior and morphologies. In view of such a diverse collection of genomes, revealing evolutionary trends and charting functional relationships of proteins remain challenging. We analyzed the relatedness of 17 complete proteomes representative of proteomes from insects including louse, bee, beetle, ants, flies and mosquitoes, as well as an out-group from the crustaceans. The analyzed proteomes mostly represented the orders of Hymenoptera and Diptera. The 287,405 protein sequences from the 18 proteomes were automatically clustered into 20,933 families, including 799 singletons. A comprehensive analysis based on statistical considerations identified the families that were significantly expanded or reduced in any of the studied organisms. Among all the tested species, ants are characterized by an exceptionally high rate of family gain and loss. By assigning annotations to hundreds of species-specific families, the functional diversity among species and between the major clades (Diptera and Hymenoptera) is revealed. We found that many species-specific families are associated with receptor signaling, stress-related functions and proteases. The highest variability among insects associates with the function of transposition and nucleic acids processes (collectively coined TNAP). Specifically, the wasp and ants have an order of magnitude more TNAP families and proteins relative to species that belong to Diptera (mosquitoes and flies). An unsupervised clustering methodology combined with a comparative functional analysis unveiled proteomic signatures in the major clades of winged insects. We propose that the expansion of TNAP families in Hymenoptera potentially contributes to the accelerated

  8. MBGD update 2013: the microbial genome database for exploring the diversity of microbial world.

    PubMed

    Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

    2013-01-01

    The microbial genome database for comparative analysis (MBGD, available at http://mbgd.genome.ad.jp/) is a platform for microbial genome comparison based on orthology analysis. As its unique feature, MBGD allows users to conduct orthology analysis among any specified set of organisms; this flexibility allows MBGD to adapt to a variety of microbial genomic studies. Reflecting the huge diversity of the microbial world, the number of microbial genome projects has now reached several thousand. To efficiently explore the diversity of the entire microbial genomic data, MBGD now provides summary pages for pre-calculated ortholog tables among various taxonomic groups. For some closely related taxa, MBGD also provides the conserved synteny information (core genome alignment) pre-calculated using the CoreAligner program. In addition, an efficient incremental updating procedure can create an extended ortholog table by adding additional genomes to the default ortholog table generated from the representative set of genomes. Combined with the dynamic orthology calculation for any specified set of organisms, MBGD is an efficient and flexible tool for exploring microbial genome diversity.
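
    As a rough illustration of how an ortholog table supports this kind of comparison, the sketch below derives a pan-genome and a core genome for a chosen set of organisms. It is not MBGD's own procedure or API: the table layout (cluster ID mapped to the set of genomes holding a member), the function name pan_and_core, and the toy data are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple

def pan_and_core(ortholog_table: Dict[str, Set[str]],
                 genomes: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (pan, core) ortholog cluster IDs for the selected genomes.

    ortholog_table is a hypothetical layout: cluster ID -> set of genome
    names that contain at least one member of that cluster.
    """
    # Pan-genome: clusters present in at least one selected genome.
    pan = {cid for cid, present in ortholog_table.items() if present & genomes}
    # Core genome: clusters present in every selected genome.
    core = {cid for cid, present in ortholog_table.items() if genomes <= present}
    return pan, core

# Toy table with three genomes (A, B, C); values are illustrative only.
table = {
    "OG1": {"A", "B", "C"},
    "OG2": {"A", "B"},
    "OG3": {"C"},
    "OG4": {"A", "B", "C"},
}

pan_all, core_all = pan_and_core(table, {"A", "B", "C"})
pan_ab, core_ab = pan_and_core(table, {"A", "B"})
```

    Restricting the genome set changes both results, which mirrors the point in the abstract that orthology analysis over a user-specified set of organisms gives a different view than a fixed, pre-calculated table.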

  9. Whole genome analysis of porcine astroviruses detected in Japanese pigs reveals genetic diversity and possible intra-genotypic recombination.

    PubMed

    Ito, Mika; Kuroda, Moegi; Masuda, Tsuneyuki; Akagami, Masataka; Haga, Kei; Tsuchiaka, Shinobu; Kishimoto, Mai; Naoi, Yuki; Sano, Kaori; Omatsu, Tsutomu; Katayama, Yukie; Oba, Mami; Aoki, Hiroshi; Ichimaru, Toru; Mukono, Itsuro; Ouchi, Yoshinao; Yamasato, Hiroshi; Shirai, Junsuke; Katayama, Kazuhiko; Mizutani, Tetsuya; Nagai, Makoto

    2017-06-01

    Porcine astroviruses (PoAstVs) are ubiquitous enteric viruses of pigs that are distributed in several countries throughout the world. Since PoAstVs are detected in apparently healthy pigs, the clinical significance of infection is unknown. However, AstVs have recently been associated with a severe neurological disorder in animals, including humans, and zoonotic potential has been suggested. To date, little is known about the epidemiology of PoAstVs among the pig population in Japan. In this report, we present an analysis of nearly complete genomes of 36 PoAstVs detected by a metagenomics approach in the feces of Japanese pigs. Based on a phylogenetic analysis and pairwise sequence comparison, 10, 5, 15, and 6 sequences were classified as PoAstV2, PoAstV3, PoAstV4, and PoAstV5, respectively. Co-infection with two or three strains was found in individual fecal samples from eight pigs. The phylogenetic trees of ORF1a, ORF1b, and ORF2 of PoAstV2 and PoAstV4 showed differences in their topologies. The PoAstV3 and PoAstV5 strains shared high sequence identities within each genotype in all ORFs; however, one PoAstV3 strain and one PoAstV5 strain showed considerable sequence divergence from the other PoAstV3 and PoAstV5 strains, respectively, in ORF2. Recombination analysis using whole genomes revealed evidence of multiple possible intra-genotype recombination events in PoAstV2 and PoAstV4, suggesting that recombination might have contributed to the genetic diversity and played an important role in the evolution of Japanese PoAstVs. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. A first exploration of genome size diversity in sponges.

    PubMed

    Jeffery, Nicholas W; Jardine, Catherine B; Gregory, T Ryan

    2013-08-01

    The phyla known as early-branching lineages of animals have become the subject of increasing interest from the perspectives of genomics and evolutionary biology. Unfortunately, data on even the most fundamental properties of their genomes, such as genome size, remain very scarce. In this study, genome size estimates are reported for 75 species of sponges (phylum Porifera) representing 33 families and 12 orders, marking the first large survey of genome size diversity for an early-branching phylum. Sponge genome sizes averaged around 0.2 pg but exhibited a 17-fold range overall (0.04-0.63 pg). In addition, the results of comparisons of two methods of genome size quantification (flow cytometry and Feulgen image analysis densitometry) are presented, thereby facilitating future work on these animals. Some particularly promising avenues for future investigation are highlighted.

  11. Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure

    PubMed Central

    Basu, Analabha; Sarkar-Roy, Neeta; Majumder, Partha P.

    2016-01-01

    India, occupying the center stage of Paleolithic and Neolithic migrations, has been underrepresented in genome-wide studies of variation. Systematic analysis of genome-wide data, using multiple robust statistical methods, on (i) 367 unrelated individuals drawn from 18 mainland and 2 island (Andaman and Nicobar Islands) populations selected to represent geographic, linguistic, and ethnic diversities, and (ii) individuals from populations represented in the Human Genome Diversity Panel (HGDP), reveals four major ancestries in mainland India. This contrasts with an earlier inference of two ancestries based on limited population sampling. A distinct ancestry of the populations of the Andaman archipelago was identified and found to be coancestral to Oceanic populations. Analysis of ancestral haplotype blocks revealed that extant mainland populations (i) admixed widely irrespective of ancestry, although admixture between populations was not always symmetric, and (ii) this practice was rapidly replaced by endogamy about 70 generations ago, among upper castes and Indo-European speakers predominantly. This estimated time coincides with the historical period of formulation and adoption of sociocultural norms restricting intermarriage in large social strata. A similar replacement observed among tribal populations was temporally less uniform. PMID:26811443

  12. Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    van Baren, Marijke J.; Bachy, Charles; Reistetter, Emily Nahas

    Prasinophytes are widespread marine green algae that are related to plants. Abundance of the genus Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these organisms are important for marine ecology and understanding Viridiplantae evolution and diversification. We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb Micromonas commoda (RCC299) shows they share ≤ 8,142 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percent (26%) GC splice donors. Micromonas has more genus-specific protein families (19%) than other genome sequenced prasinophytes (11%). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and most plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Multiple vascular plants that share a unique bi-domain protein also have the pathway, except the Penicillin-Binding-Protein. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in the PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages. Extensive differences in gene loss and architecture between related prasinophytes underscore their extensive divergence. PG biosynthesis genes from the cyanobacterial endosymbiont that became the

  13. Evidence-based green algal genomics reveals marine diversity and ancestral characteristics of land plants

    DOE PAGES

    van Baren, Marijke J.; Bachy, Charles; Reistetter, Emily Nahas; ...

    2016-03-31

    Prasinophytes are widespread marine green algae that are related to plants. Abundance of the genus Micromonas has reportedly increased in the Arctic due to climate-induced changes. Thus, studies of these organisms are important for marine ecology and understanding Viridiplantae evolution and diversification. We generated evidence-based Micromonas gene models using proteomics and RNA-Seq to improve prasinophyte genomic resources. First, sequences of four chromosomes in the 22 Mb Micromonas pusilla (CCMP1545) genome were finished. Comparison with the finished 21 Mb Micromonas commoda (RCC299) shows they share ≤ 8,142 of ~10,000 protein-encoding genes, depending on the analysis method. Unlike RCC299 and other sequenced eukaryotes, CCMP1545 has two abundant repetitive intron types and a high percent (26%) GC splice donors. Micromonas has more genus-specific protein families (19%) than other genome sequenced prasinophytes (11%). Comparative analyses using predicted proteomes from other prasinophytes reveal proteins likely related to scale formation and ancestral photosynthesis. Our studies also indicate that peptidoglycan (PG) biosynthesis enzymes have been lost in multiple independent events in select prasinophytes and most plants. However, CCMP1545, polar Micromonas CCMP2099 and prasinophytes from other classes retain the entire PG pathway, like moss and glaucophyte algae. Multiple vascular plants that share a unique bi-domain protein also have the pathway, except the Penicillin-Binding-Protein. Alongside Micromonas experiments using antibiotics that halt bacterial PG biosynthesis, the findings highlight unrecognized phylogenetic complexity in the PG-pathway retention and implicate a role in chloroplast structure or division in several extant Viridiplantae lineages. Extensive differences in gene loss and architecture between related prasinophytes underscore their extensive divergence. PG biosynthesis genes from the cyanobacterial endosymbiont that became the

  14. Comparative Analysis of the Peanut Witches'-Broom Phytoplasma Genome Reveals Horizontal Transfer of Potential Mobile Units and Effectors

    PubMed Central

    Lo, Wen-Sui; Lin, Chan-Pin; Kuo, Chih-Horng

    2013-01-01

    Phytoplasmas are a group of bacteria that are associated with hundreds of plant diseases. Due to their economic importance and the difficulties involved in the experimental study of these obligate pathogens, genome sequencing and comparative analysis have been utilized as powerful tools to understand phytoplasma biology. To date four complete phytoplasma genome sequences have been published. However, these four strains represent limited phylogenetic diversity. In this study, we report the shotgun sequencing and evolutionary analysis of a peanut witches'-broom (PnWB) phytoplasma genome. The availability of this genome provides the first representative of the 16SrII group and substantially improves the taxon sampling to investigate genome evolution. The draft genome assembly contains 13 chromosomal contigs with a total size of 562,473 bp, covering ∼90% of the chromosome. Additionally, a complete plasmid sequence is included. Comparisons among the five available phytoplasma genomes reveal differences in gene content and metabolic capacity. Notably, phylogenetic inferences of the potential mobile units (PMUs) in these genomes indicate that horizontal transfer may have occurred between divergent phytoplasma lineages. Because many effectors are associated with PMUs, the horizontal transfer of these transposon-like elements can contribute to the adaptation and diversification of these pathogens. In summary, the findings from this study highlight the importance of improving taxon sampling when investigating genome evolution. Moreover, the currently available sequences are inadequate to fully characterize the pan-genome of phytoplasmas. Future genome sequencing efforts to expand phylogenetic diversity are essential in improving our understanding of phytoplasma evolution. PMID:23626855

  15. A genomic scale map of genetic diversity in Trypanosoma cruzi

    PubMed Central

    2012-01-01

    Background: Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale. Results: Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments we have identified 288,957 high quality single nucleotide polymorphisms and 1,480 indels. In a reduced re-sequencing study we were able to validate ~ 97% of high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show statistically significant lower nucleotide diversity values. Conclusions: This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. The analysis covers an estimated ~ 60% of the genetic diversity present in the population, providing an

  16. Comparative Genomics of Flatworms (Platyhelminthes) Reveals Shared Genomic Features of Ecto- and Endoparasitic Neodermata

    PubMed Central

    Hahn, Christoph; Fromm, Bastian; Bachmann, Lutz

    2014-01-01

    The ectoparasitic Monogenea comprise a major part of the obligate parasitic flatworm diversity. Although genomic adaptations to parasitism have been studied in the endoparasitic tapeworms (Cestoda) and flukes (Trematoda), no representative of the Monogenea has been investigated yet. We present the high-quality draft genome of Gyrodactylus salaris, an economically important monogenean ectoparasite of wild Atlantic salmon (Salmo salar). A total of 15,488 gene models were identified, of which 7,102 were functionally annotated. The controversial phylogenetic relationships within the obligate parasitic Neodermata were resolved in a phylogenomic analysis using 1,719 gene models (alignment length of >500,000 amino acids) for a set of 16 metazoan taxa. The Monogenea were found basal to the Cestoda and Trematoda, which implies ectoparasitism being plesiomorphic within the Neodermata and strongly supports a common origin of complex life cycles. Comparative analysis of seven parasitic flatworm genomes identified shared genomic features for the ecto- and endoparasitic lineages, such as a substantial reduction of the core bilaterian gene complement, including the homeodomain-containing genes, and a loss of the piwi and vasa genes, which are considered essential for animal development. Furthermore, the shared loss of functional fatty acid biosynthesis pathways and the absence of peroxisomes, the latter organelles presumed ubiquitous in eukaryotes except for parasitic protozoans, were inferred. The draft genome of G. salaris opens the way for future in-depth analyses of pathogenicity and host specificity of poorly characterized G. salaris strains, and will enhance studies addressing the genomics of host–parasite interactions and speciation in the highly diverse monogenean flatworms. PMID:24732282

  17. Diversity and Evolution of Mycobacterium tuberculosis: Moving to Whole-Genome-Based Approaches

    PubMed Central

    Niemann, Stefan; Supply, Philip

    2014-01-01

    Genotyping of clinical Mycobacterium tuberculosis complex (MTBC) strains has become a standard tool for epidemiological tracing and for the investigation of the local and global strain population structure. Of special importance is the analysis of the expansion of multidrug (MDR) and extensively drug-resistant (XDR) strains. Classical genotyping and, more recently, whole-genome sequencing have revealed that the strains of the MTBC are more diverse than previously anticipated. Globally, several phylogenetic lineages can be distinguished whose geographical distribution is markedly variable. Strains of particular (sub)lineages, such as Beijing, seem to be more virulent and associated with enhanced resistance levels and fitness, likely fueling their spread in certain world regions. The upcoming generalization of whole-genome sequencing approaches will expectedly provide more comprehensive insights into the molecular and epidemiological mechanisms involved and lead to better diagnostic and therapeutic tools. PMID:25190252

  18. Comparative Pan-Genome Analysis of Piscirickettsia salmonis Reveals Genomic Divergences within Genogroups.

    PubMed

    Nourdin-Galindo, Guillermo; Sánchez, Patricio; Molina, Cristian F; Espinoza-Rojas, Daniela A; Oliver, Cristian; Ruiz, Pamela; Vargas-Chacoff, Luis; Cárcamo, Juan G; Figueroa, Jaime E; Mancilla, Marcos; Maracaja-Coutinho, Vinicius; Yañez, Alejandro J

    2017-01-01

    Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis, functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remains lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established. The obtained data suggest that these genes could be

  19. Comparative Pan-Genome Analysis of Piscirickettsia salmonis Reveals Genomic Divergences within Genogroups

    PubMed Central

    Nourdin-Galindo, Guillermo; Sánchez, Patricio; Molina, Cristian F.; Espinoza-Rojas, Daniela A.; Oliver, Cristian; Ruiz, Pamela; Vargas-Chacoff, Luis; Cárcamo, Juan G.; Figueroa, Jaime E.; Mancilla, Marcos; Maracaja-Coutinho, Vinicius; Yañez, Alejandro J.

    2017-01-01

    Piscirickettsia salmonis is the etiological agent of salmonid rickettsial septicemia, a disease that seriously affects the salmonid industry. Despite efforts to genomically characterize P. salmonis, functional information on the life cycle, pathogenesis mechanisms, diagnosis, treatment, and control of this fish pathogen remains lacking. To address this knowledge gap, the present study conducted an in silico pan-genome analysis of 19 P. salmonis strains from distinct geographic locations and genogroups. Results revealed an expected open pan-genome of 3,463 genes and a core-genome of 1,732 genes. Two marked genogroups were identified, as confirmed by phylogenetic and phylogenomic relationships to the LF-89 and EM-90 reference strains, as well as by assessments of genomic structures. Different structural configurations were found for the six identified copies of the ribosomal operon in the P. salmonis genome, indicating translocation throughout the genetic material. Chromosomal divergences in genomic localization and quantity of genetic cassettes were also found for the Dot/Icm type IVB secretion system. To determine divergences between core-genomes, additional pan-genome descriptions were compiled for the so-termed LF and EM genogroups. Open pan-genomes composed of 2,924 and 2,778 genes and core-genomes composed of 2,170 and 2,228 genes were respectively found for the LF and EM genogroups. The core-genomes were functionally annotated using the Gene Ontology, KEGG, and Virulence Factor databases, revealing the presence of several shared groups of genes related to basic function of intracellular survival and bacterial pathogenesis. Additionally, the specific pan-genomes for the LF and EM genogroups were defined, resulting in the identification of 148 and 273 exclusive proteins, respectively. Notably, specific virulence factors linked to adherence, colonization, invasion factors, and endotoxins were established. The obtained data suggest that these genes could be

  20. Genomic diversity of bacteriophages infecting the fish pathogen Flavobacterium psychrophilum.

    PubMed

    Castillo, Daniel; Middelboe, Mathias

    2016-12-01

    Bacteriophages infecting the fish pathogen Flavobacterium psychrophilum can potentially be used to prevent and control outbreaks of this bacterium in salmonid aquaculture. However, the application of bacteriophages in disease control requires detailed knowledge of their genetic composition. To explore the diversity of F. psychrophilum bacteriophages, we have analyzed the complete genome sequences of 17 phages isolated from two distant geographic areas (Denmark and Chile), including the previously characterized temperate bacteriophage 6H. Phage genome size ranged from 39 302 to 89 010 bp with a G+C content of 27%-32%. None of the bacteriophages isolated in Denmark contained genes associated with lysogeny, whereas the Chilean isolates were all putative temperate phages and similar to bacteriophage 6H. Comparative genome analysis showed that phages grouped in three different genetic clusters based on genetic composition and gene content, indicating a limited genetic diversity of F. psychrophilum-specific bacteriophages. However, amino acid sequence dissimilarity (25%) was found in putative structural proteins, which could be related to the host specificity determinants. This study represents the first analysis of genomic diversity and composition among bacteriophages infecting the fish pathogen F. psychrophilum and discusses the implications for the application of phages in disease control. © FEMS 2016. All rights reserved.

  1. Genome diversity in Brachypodium distachyon: deep sequencing of highly diverse inbred lines

    USDA-ARS's Scientific Manuscript database

    Natural variation provides a powerful opportunity to study the genetic basis of biological traits. Brachypodium distachyon is a broadly distributed diploid model grass with a small genome and a large collection of diverse inbred lines. As a step towards understanding the genetic basis of the natura...

  2. Indigenous peoples and the morality of the Human Genome Diversity Project.

    PubMed

    Dodson, M; Williamson, R

    1999-04-01

    In addition to the aim of mapping and sequencing one human's genome, the Human Genome Project also intends to characterise the genetic diversity of the world's peoples. The Human Genome Diversity Project raises political, economic and ethical issues. These intersect clearly when the genomes under study are those of indigenous peoples who are already subject to serious economic, legal and/or social disadvantage and discrimination. The fact that some individuals associated with the project have made dismissive comments about indigenous peoples has confused rather than illuminated the deeper issues involved, as well as causing much antagonism among indigenous peoples. There are more serious ethical issues raised by the project for all geneticists, including those who are sympathetic to the problems of indigenous peoples. With particular attention to the history and attitudes of Australian indigenous peoples, we argue that the Human Genome Diversity Project can only proceed if those who further its objectives simultaneously: respect the cultural beliefs of indigenous peoples; publicly support the efforts of indigenous peoples to achieve respect and equality; express respect by a rigorous understanding of the meaning of equitable negotiation of consent, and ensure that both immediate and long term economic benefits from the research flow back to the groups taking part.

  3. Genomes of Salmonella with diverse patterns of antibiotic resistance (AR) revealed the dynamics of AR gene organization and detected resistance gene families found in Salmonella

    USDA-ARS's Scientific Manuscript database

    We produced and assembled high-quality draft genomes (~100X coverage) for 305 Salmonella from a diverse group of over 100 serovars and diverse sources. Of these isolates, 119 were selected to capture a wide variety of different AR patterns. In our subsequent analyses we included 285 additional pub...

  4. Lampreys as Diverse Model Organisms in the Genomics Era.

    PubMed

    McCauley, David W; Docker, Margaret F; Whyard, Steve; Li, Weiming

    2015-11-01

    Lampreys, one of the two surviving groups of ancient vertebrates, have become important models for study in diverse fields of biology. Lampreys (of which there are approximately 40 species) are being studied, for example, (a) to control pest sea lamprey in the North American Great Lakes and to restore declining populations of native species elsewhere; (b) in biomedical research, focusing particularly on the regenerative capability of lampreys; and (c) by developmental biologists studying the evolution of key vertebrate characters. Although a lack of genetic resources has hindered research on the mechanisms regulating many aspects of lamprey life history and development, formerly intractable questions are now amenable to investigation following the recent publication of the sea lamprey genome. Here, we provide an overview of the ways in which genomic tools are currently being deployed to tackle diverse research questions and suggest several areas that may benefit from the availability of the sea lamprey genome.

  5. Lampreys as Diverse Model Organisms in the Genomics Era

    PubMed Central

    McCauley, David W.; Docker, Margaret F.; Whyard, Steve; Li, Weiming

    2015-01-01

    Lampreys, one of the two surviving groups of ancient vertebrates, have become important models for study in diverse fields of biology. Lampreys (of which there are approximately 40 species) are being studied, for example, (a) to control pest sea lamprey in the North American Great Lakes and to restore declining populations of native species elsewhere; (b) in biomedical research, focusing particularly on the regenerative capability of lampreys; and (c) by developmental biologists studying the evolution of key vertebrate characters. Although a lack of genetic resources has hindered research on the mechanisms regulating many aspects of lamprey life history and development, formerly intractable questions are now amenable to investigation following the recent publication of the sea lamprey genome. Here, we provide an overview of the ways in which genomic tools are currently being deployed to tackle diverse research questions and suggest several areas that may benefit from the availability of the sea lamprey genome. PMID:26951616

  6. Reconstruction of the lipid metabolism for the microalga Monoraphidium neglectum from its genome sequence reveals characteristics suitable for biofuel production

    PubMed Central

    2013-01-01

    Background: Microalgae are gaining importance as sustainable production hosts in the fields of biotechnology and bioenergy. A robust biomass accumulating strain of the genus Monoraphidium (SAG 48.87) was investigated in this work as a potential feedstock for biofuel production. The genome was sequenced, annotated, and key enzymes for triacylglycerol formation were elucidated. Results: Monoraphidium neglectum was identified as an oleaginous species with favourable growth characteristics as well as a high potential for crude oil production, based on neutral lipid contents of approximately 21% (dry weight) under nitrogen starvation, composed of predominantly C18:1 and C16:0 fatty acids. Further characterization revealed growth in a relatively wide pH range and salt concentrations of up to 1.0% NaCl, in which the cells exhibited larger structures. This first full genome sequencing of a member of the Selenastraceae revealed a diploid, approximately 68 Mbp genome with a G + C content of 64.7%. The circular chloroplast genome was assembled to a 135,362 bp single contig, containing 67 protein-coding genes. The assembly of the mitochondrial genome resulted in two contigs with an approximate total size of 94 kb, the largest known mitochondrial genome within algae. 16,761 protein-coding genes were assigned to the nuclear genome. Comparison of gene sets with respect to functional categories revealed a higher gene number assigned to the category “carbohydrate metabolic process” and in “fatty acid biosynthetic process” in M. neglectum when compared to Chlamydomonas reinhardtii and Nannochloropsis gaditana, indicating a higher metabolic diversity for applications in carbohydrate conversions of biotechnological relevance. Conclusions: The genome of M. neglectum, as well as the metabolic reconstruction of crucial lipid pathways, provides new insights into the diversity of the lipid metabolism in microalgae. The results of this work provide a platform to encourage the

  7. Ethiopian Genetic Diversity Reveals Linguistic Stratification and Complex Influences on the Ethiopian Gene Pool

    PubMed Central

    Pagani, Luca; Kivisild, Toomas; Tarekegn, Ayele; Ekong, Rosemary; Plaster, Chris; Gallego Romero, Irene; Ayub, Qasim; Mehdi, S. Qasim; Thomas, Mark G.; Luiselli, Donata; Bekele, Endashaw; Bradman, Neil; Balding, David J.; Tyler-Smith, Chris

    2012-01-01

    Humans and their ancestors have traversed the Ethiopian landscape for millions of years, and present-day Ethiopians show great cultural, linguistic, and historical diversity, which makes them essential for understanding African variability and human origins. We genotyped 235 individuals from ten Ethiopian and two neighboring (South Sudanese and Somali) populations on an Illumina Omni 1M chip. Genotypes were compared with published data from several African and non-African populations. Principal-component and STRUCTURE-like analyses confirmed substantial genetic diversity both within and between populations, and revealed a match between genetic data and linguistic affiliation. Using comparisons with African and non-African reference samples in 40-SNP genomic windows, we identified “African” and “non-African” haplotypic components for each Ethiopian individual. The non-African component, which includes the SLC24A5 allele associated with light skin pigmentation in Europeans, may represent gene flow into Africa, which we estimate to have occurred ∼3 thousand years ago (kya). The non-African component was found to be more similar to populations inhabiting the Levant rather than the Arabian Peninsula, but the principal route for the expansion out of Africa ∼60 kya remains unresolved. Linkage-disequilibrium decay with genomic distance was less rapid in both the whole genome and the African component than in southern African samples, suggesting a less ancient history for Ethiopian populations. PMID:22726845

  8. Comparative genomics reveals phylogenetic distribution patterns of secondary metabolites in Amycolatopsis species.

    PubMed

    Adamek, Martina; Alanjary, Mohammad; Sales-Ortells, Helena; Goodfellow, Michael; Bull, Alan T; Winkler, Anika; Wibberg, Daniel; Kalinowski, Jörn; Ziemert, Nadine

    2018-06-01

    Genome mining tools have enabled us to predict biosynthetic gene clusters that might encode compounds with valuable functions for industrial and medical applications. With the continuously increasing number of genomes sequenced, we are confronted with an overwhelming number of predicted clusters. In order to guide the effective prioritization of biosynthetic gene clusters towards finding the most promising compounds, knowledge about diversity, phylogenetic relationships and distribution patterns of biosynthetic gene clusters is necessary. Here, we provide a comprehensive analysis of the model actinobacterial genus Amycolatopsis and its potential for the production of secondary metabolites. A phylogenetic characterization, together with a pan-genome analysis, showed that within this highly diverse genus, four major lineages could be distinguished which differed in their potential to produce secondary metabolites. Furthermore, we were able to distinguish gene cluster families whose distribution correlated with phylogeny, indicating that vertical gene transfer plays a major role in the evolution of secondary metabolite gene clusters. Still, the vast majority of the diverse biosynthetic gene clusters were derived from clusters unique to the genus, and also unique in comparison to a database of known compounds. Our study on the locations of biosynthetic gene clusters in the genomes of Amycolatopsis strains showed that clusters acquired by horizontal gene transfer tend to be incorporated into non-conserved regions of the genome, thereby allowing us to distinguish core and hypervariable regions in Amycolatopsis genomes. Using a comparative genomics approach, it was possible to determine the potential of the genus Amycolatopsis to produce a huge diversity of secondary metabolites. Furthermore, the analysis demonstrates that horizontal and vertical gene transfer play an important role in the acquisition and maintenance of valuable secondary metabolites. Our results cast

  9. Whole-Genome Sequencing of Pharmacogenetic Drug Response in Racially Diverse Children with Asthma.

    PubMed

    Mak, Angel C Y; White, Marquitta J; Eckalbar, Walter L; Szpiech, Zachary A; Oh, Sam S; Pino-Yanes, Maria; Hu, Donglei; Goddard, Pagé; Huntsman, Scott; Galanter, Joshua; Wu, Ann Chen; Himes, Blanca E; Germer, Soren; Vogel, Julia M; Bunting, Karen L; Eng, Celeste; Salazar, Sandra; Keys, Kevin L; Liberto, Jennifer; Nuckton, Thomas J; Nguyen, Thomas A; Torgerson, Dara G; Kwok, Pui-Yan; Levin, Albert M; Celedón, Juan C; Forno, Erick; Hakonarson, Hakon; Sleiman, Patrick M; Dahlin, Amber; Tantisira, Kelan G; Weiss, Scott T; Serebrisky, Denise; Brigino-Buenaventura, Emerita; Farber, Harold J; Meade, Kelley; Lenoir, Michael A; Avila, Pedro C; Sen, Saunak; Thyne, Shannon M; Rodriguez-Cintron, William; Winkler, Cheryl A; Moreno-Estrada, Andrés; Sandoval, Karla; Rodriguez-Santana, Jose R; Kumar, Rajesh; Williams, L Keoki; Ahituv, Nadav; Ziv, Elad; Seibold, Max A; Darnell, Robert B; Zaitlen, Noah; Hernandez, Ryan D; Burchard, Esteban G

    2018-06-15

    Albuterol, a bronchodilator medication, is the first-line therapy for asthma worldwide. There are significant racial/ethnic differences in albuterol drug response. We aimed to identify genetic variants important for bronchodilator drug response (BDR) in racially diverse children. We performed the first whole-genome sequencing pharmacogenetics study from 1,441 children with asthma from the tails of the BDR distribution to identify genetic association with BDR. We identified population-specific and shared genetic variants associated with BDR, including genome-wide significant (P < 3.53 × 10⁻⁷) and suggestive (P < 7.06 × 10⁻⁶) loci near genes previously associated with lung capacity (DNAH5), immunity (NFKB1 and PLCB1), and β-adrenergic signaling (ADAMTS3 and COX18). Functional analyses of the BDR-associated SNP in NFKB1 revealed potential regulatory function in bronchial smooth muscle cells. The SNP is also an expression quantitative trait locus for a neighboring gene, SLC39A8. The lack of other asthma study populations with BDR and whole-genome sequencing data on minority children makes it impossible to perform replication of our rare variant associations. Minority underrepresentation also poses significant challenges to identify age-matched and population-matched cohorts of sufficient sample size for replication of our common variant findings. The lack of minority data, despite a collaboration of eight universities and 13 individual laboratories, highlights the urgent need for a dedicated national effort to prioritize diversity in research. Our study expands the understanding of pharmacogenetic analyses in racially/ethnically diverse populations and advances the foundation for precision medicine in at-risk and understudied minority populations.

  10. Genome re-sequencing reveals the history of apple and supports a two-stage model for fruit enlargement

    USDA-ARS's Scientific Manuscript database

    Human selection has reshaped crop genomes. Here we report an apple genome variation map generated through genome sequencing of 117 diverse accessions. A comprehensive model of apple speciation and domestication along the Silk Road was proposed based on evidence from diverse genomic analyses. Cultiva...

  11. Genomic characterisation of Wongabel virus reveals novel genes within the Rhabdoviridae.

    PubMed

    Gubala, Aneta J; Proll, David F; Barnard, Ross T; Cowled, Chris J; Crameri, Sandra G; Hyatt, Alex D; Boyle, David B

    2008-06-20

    Viruses belonging to the family Rhabdoviridae infect a variety of different hosts, including insects, vertebrates and plants. Currently, there are approximately 200 ICTV-recognised rhabdoviruses isolated around the world. However, the majority remain poorly characterised and only a fraction have been definitively assigned to genera. The genomic and transcriptional complexity displayed by several of the characterised rhabdoviruses indicates large diversity and complexity within this family. To enable an improved taxonomic understanding of this family, it is necessary to gain further information about the poorly characterised members of this family. Here we present the complete genome sequence and predicted transcription strategy of Wongabel virus (WONV), a previously uncharacterised rhabdovirus isolated from biting midges (Culicoides austropalpalis) collected in northern Queensland, Australia. The 13,196 nucleotide genome of WONV encodes five typical rhabdovirus genes N, P, M, G and L. In addition, the WONV genome contains three genes located between the P and M genes (U1, U2, U3) and two open reading frames overlapping with the N and G genes (U4, U5). These five additional genes and their putative protein products appear to be novel, and their functions are unknown. Predictive analysis of the U5 gene product revealed characteristics typical of viroporins, and indicated structural similarities with the alpha-1 protein (putative viroporin) of viruses in the genus Ephemerovirus. Phylogenetic analyses of the N and G proteins of WONV indicated closest similarity with the avian-associated Flanders virus; however, the genomes of these two viruses are significantly diverged. WONV displays a novel and unique genome structure that has not previously been described for any animal rhabdovirus.

  12. Whole-Genome Comparison Reveals Novel Genetic Elements That Characterize the Genome of Industrial Strains of Saccharomyces cerevisiae

    PubMed Central

    Borneman, Anthony R.; Desany, Brian A.; Riches, David; Affourtit, Jason P.; Forgan, Angus H.; Pretorius, Isak S.; Egholm, Michael; Chambers, Paul J.

    2011-01-01

    Human intervention has subjected the yeast Saccharomyces cerevisiae to multiple rounds of independent domestication and thousands of generations of artificial selection. As a result, this species comprises a genetically diverse collection of natural isolates as well as domesticated strains that are used in specific industrial applications. However the scope of genetic diversity that was captured during the domesticated evolution of the industrial representatives of this important organism remains to be determined. To begin to address this, we have produced whole-genome assemblies of six commercial strains of S. cerevisiae (four wine and two brewing strains). These represent the first genome assemblies produced from S. cerevisiae strains in their industrially-used forms and the first high-quality assemblies for S. cerevisiae strains used in brewing. By comparing these sequences to six existing high-coverage S. cerevisiae genome assemblies, clear signatures were found that defined each industrial class of yeast. This genetic variation was comprised of both single nucleotide polymorphisms and large-scale insertions and deletions, with the latter often being associated with ORF heterogeneity between strains. This included the discovery of more than twenty probable genes that had not been identified previously in the S. cerevisiae genome. Comparison of this large number of S. cerevisiae strains also enabled the characterization of a cluster of five ORFs that have integrated into the genomes of the wine and bioethanol strains on multiple occasions and at diverse genomic locations via what appears to involve the resolution of a circular DNA intermediate. This work suggests that, despite the scrutiny that has been directed at the yeast genome, there remains a significant reservoir of ORFs and novel modes of genetic transmission that may have significant phenotypic impact in this important model and industrial species. PMID:21304888

  13. The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes

    PubMed Central

    Liu, Shengyi; Liu, Yumei; Yang, Xinhua; Tong, Chaobo; Edwards, David; Parkin, Isobel A. P.; Zhao, Meixia; Ma, Jianxin; Yu, Jingyin; Huang, Shunmou; Wang, Xiyin; Wang, Junyi; Lu, Kun; Fang, Zhiyuan; Bancroft, Ian; Yang, Tae-Jin; Hu, Qiong; Wang, Xinfa; Yue, Zhen; Li, Haojie; Yang, Linfeng; Wu, Jian; Zhou, Qing; Wang, Wanxin; King, Graham J; Pires, J. Chris; Lu, Changxin; Wu, Zhangyan; Sampath, Perumal; Wang, Zhuo; Guo, Hui; Pan, Shengkai; Yang, Limei; Min, Jiumeng; Zhang, Dong; Jin, Dianchuan; Li, Wanshun; Belcram, Harry; Tu, Jinxing; Guan, Mei; Qi, Cunkou; Du, Dezhi; Li, Jiana; Jiang, Liangcai; Batley, Jacqueline; Sharpe, Andrew G; Park, Beom-Seok; Ruperao, Pradeep; Cheng, Feng; Waminal, Nomar Espinosa; Huang, Yin; Dong, Caihua; Wang, Li; Li, Jingping; Hu, Zhiyong; Zhuang, Mu; Huang, Yi; Huang, Junyan; Shi, Jiaqin; Mei, Desheng; Liu, Jing; Lee, Tae-Ho; Wang, Jinpeng; Jin, Huizhe; Li, Zaiyun; Li, Xun; Zhang, Jiefu; Xiao, Lu; Zhou, Yongming; Liu, Zhongsong; Liu, Xuequn; Qin, Rui; Tang, Xu; Liu, Wenbin; Wang, Yupeng; Zhang, Yangyong; Lee, Jonghoon; Kim, Hyun Hee; Denoeud, France; Xu, Xun; Liang, Xinming; Hua, Wei; Wang, Xiaowu; Wang, Jun; Chalhoub, Boulos; Paterson, Andrew H

    2014-01-01

    Polyploidization has provided much genetic variation for plant adaptive evolution, but the mechanisms by which the molecular evolution of polyploid genomes establishes genetic architecture underlying species differentiation are unclear. Brassica is an ideal model to increase knowledge of polyploid evolution. Here we describe a draft genome sequence of Brassica oleracea, comparing it with that of its sister species B. rapa to reveal numerous chromosome rearrangements and asymmetrical gene loss in duplicated genomic blocks, asymmetrical amplification of transposable elements, differential gene co-retention for specific pathways and variation in gene expression, including alternative splicing, among a large number of paralogous and orthologous genes. Genes related to the production of anticancer phytochemicals and morphological variations illustrate consequences of genome duplication and gene divergence, imparting biochemical and morphological variation to B. oleracea. This study provides insights into Brassica genome evolution and will underpin research into the many important crops in this genus. PMID:24852848

  14. Whole genome sequencing of Brucella melitensis isolated from 57 patients in Germany reveals high diversity in strains from Middle East

    PubMed Central

    Georgi, Enrico; Walter, Mathias C.; Pfalzgraf, Marie-Theres; Northoff, Bernd H.; Holdt, Lesca M.; Scholz, Holger C.; Zoeller, Lothar

    2017-01-01

    Brucellosis, a worldwide common bacterial zoonotic disease, has become quite rare in Northern and Western Europe. However, since 2014 a significant increase of imported infections caused by Brucella (B.) melitensis has been noticed in Germany. Patients predominantly originated from the Middle East, including Turkey and Syria. These circumstances afforded an opportunity to gain insights into the population structure of Brucella strains. Brucella isolates were recovered from 57 patients with culture-confirmed brucellosis between January 2014 and June 2016 by the National Consultant Laboratory for Brucella. Their whole genome sequences were generated using the Illumina MiSeq platform. A whole genome-based SNP typing assay was developed in order to resolve geographically attributed genetic clusters. Results were compared to MLVA typing results, the current gold-standard of Brucella typing. In addition, sequences were examined for possible genetic variation within target regions of molecular diagnostic assays. Phylogenetic analyses revealed spatial clustering and distinguished strains from different patients in either case, whereas multiple isolates from a single patient or technical replicates showed identical SNP and MLVA profiles. By including WGS data from the NCBI database, five major genotypes were identified. Notably, strains originating from Turkey showed a high diversity and grouped into seven subclusters of genotype II. MLVA analysis congruently clustered all isolates and predominantly matched the East Mediterranean genetic clade. This study confirms whole-genome based SNP-analysis as a powerful tool for accurate typing of B. melitensis. Furthermore, it allows spatial allocation and therefore provides useful information on the geographic origin for trace-back analysis. However, the lack of reliable metadata in public databases often prevents a resolution below geographic regions or country levels and corresponding precise trace-back analysis. Once this obstacle is

  15. Whole genome sequencing of Brucella melitensis isolated from 57 patients in Germany reveals high diversity in strains from Middle East.

    PubMed

    Georgi, Enrico; Walter, Mathias C; Pfalzgraf, Marie-Theres; Northoff, Bernd H; Holdt, Lesca M; Scholz, Holger C; Zoeller, Lothar; Zange, Sabine; Antwerpen, Markus H

    2017-01-01

    Brucellosis, a worldwide common bacterial zoonotic disease, has become quite rare in Northern and Western Europe. However, since 2014 a significant increase of imported infections caused by Brucella (B.) melitensis has been noticed in Germany. Patients predominantly originated from the Middle East, including Turkey and Syria. These circumstances afforded an opportunity to gain insights into the population structure of Brucella strains. Brucella isolates were recovered from 57 patients with culture-confirmed brucellosis between January 2014 and June 2016 by the National Consultant Laboratory for Brucella. Their whole genome sequences were generated using the Illumina MiSeq platform. A whole genome-based SNP typing assay was developed in order to resolve geographically attributed genetic clusters. Results were compared to MLVA typing results, the current gold-standard of Brucella typing. In addition, sequences were examined for possible genetic variation within target regions of molecular diagnostic assays. Phylogenetic analyses revealed spatial clustering and distinguished strains from different patients in either case, whereas multiple isolates from a single patient or technical replicates showed identical SNP and MLVA profiles. By including WGS data from the NCBI database, five major genotypes were identified. Notably, strains originating from Turkey showed a high diversity and grouped into seven subclusters of genotype II. MLVA analysis congruently clustered all isolates and predominantly matched the East Mediterranean genetic clade. This study confirms whole-genome based SNP-analysis as a powerful tool for accurate typing of B. melitensis. Furthermore, it allows spatial allocation and therefore provides useful information on the geographic origin for trace-back analysis. However, the lack of reliable metadata in public databases often prevents a resolution below geographic regions or country levels and corresponding precise trace-back analysis. Once this obstacle is

  16. Genome mining of ascomycetous fungi reveals their genetic potential for ergot alkaloid production.

    PubMed

    Gerhards, Nina; Matuschek, Marco; Wallwey, Christiane; Li, Shu-Ming

    2015-06-01

    Ergot alkaloids are important as mycotoxins or as drugs. Naturally occurring ergot alkaloids as well as their semisynthetic derivatives have been used as pharmaceuticals in modern medicine for decades. We identified 196 putative ergot alkaloid biosynthetic genes belonging to at least 31 putative gene clusters in 31 fungal species by genome mining of the 360 available genome sequences of ascomycetous fungi with known proteins. Detailed analysis showed that these fungi belong to the families Aspergillaceae, Clavicipitaceae, Arthrodermataceae, Helotiaceae and Thermoascaceae. Within the identified families, only a small number of taxa are represented. Literature search revealed a large diversity of ergot alkaloid structures in different fungi of the phylum Ascomycota. However, ergot alkaloid accumulation was only observed in 15 of the sequenced species. Therefore, this study provides genetic basis for further study on ergot alkaloid production in the sequenced strains.

  17. Evolution of genomic diversity and sex at extreme environments: Fungal life under hypersaline Dead Sea stress

    PubMed Central

    Kis-Papo, Tamar; Kirzhner, Valery; Wasser, Solomon P.; Nevo, Eviatar

    2003-01-01

    We have found that genomic diversity is generally positively correlated with abiotic and biotic stress levels (1–3). However, beyond a high-threshold level of stress, the diversity declines to a few adapted genotypes. The Dead Sea is the harshest planetary hypersaline environment (340 g·liter⁻¹ total dissolved salts, ≈10 times sea water). Hence, the Dead Sea is an excellent natural laboratory for testing the “rise and fall” pattern of genetic diversity with stress proposed in this article. Here, we examined genomic diversity of the ascomycete fungus Aspergillus versicolor from saline, nonsaline, and hypersaline Dead Sea environments. We screened the coding and noncoding genomes of A. versicolor isolates by using >600 AFLP (amplified fragment length polymorphism) markers (equal to loci). Genomic diversity was positively correlated with stress, culminating in the Dead Sea surface but dropped drastically in 50- to 280-m-deep seawater. The genomic diversity pattern paralleled the pattern of sexual reproduction of fungal species across the same southward gradient of increasing stress in Israel. This parallel may suggest that diversity and sex are intertwined intimately according to the rise and fall pattern and adaptively selected by natural selection in fungal genome evolution. Future large-scale verification in micromycetes will define further the trajectories of diversity and sex in the rise and fall pattern. PMID:14645702

  18. Conifer genomics and adaptation: at the crossroads of genetic diversity and genome function.

    PubMed

    Prunier, Julien; Verta, Jukka-Pekka; MacKay, John J

    2016-01-01

    Conifers have been understudied at the genomic level despite their worldwide ecological and economic importance but the situation is rapidly changing with the development of next generation sequencing (NGS) technologies. With NGS, genomics research has simultaneously gained in speed, magnitude and scope. In just a few years, genomes of 20-24 gigabases have been sequenced for several conifers, with several others expected in the near future. Biological insights have resulted from recent sequencing initiatives as well as genetic mapping, gene expression profiling and gene discovery research over nearly two decades. We review the knowledge arising from conifer genomics research emphasizing genome evolution and the genomic basis of adaptation, and outline emerging questions and knowledge gaps. We discuss future directions in three areas with potential inputs from NGS technologies: the evolutionary impacts of adaptation in conifers based on the adaptation-by-speciation model; the contributions of genetic variability of gene expression in adaptation; and the development of a broader understanding of genetic diversity and its impacts on genome function. These research directions promise to sustain research aimed at addressing the emerging challenges of adaptation that face conifer trees.

  19. Genetic Diversity of the Q Fever Agent, Coxiella burnetii, Assessed by Microarray-Based Whole-Genome Comparisons

    PubMed Central

    Beare, Paul A.; Samuel, James E.; Howe, Dale; Virtaneva, Kimmo; Porcella, Stephen F.; Heinzen, Robert A.

    2006-01-01

    Coxiella burnetii, a gram-negative obligate intracellular bacterium, causes human Q fever and is considered a potential agent of bioterrorism. Distinct genomic groups of C. burnetii are revealed by restriction fragment-length polymorphisms (RFLP). Here we comprehensively define the genetic diversity of C. burnetii by hybridizing the genomes of 20 RFLP-grouped and four ungrouped isolates from disparate sources to a high-density custom Affymetrix GeneChip containing all open reading frames (ORFs) of the Nine Mile phase I (NMI) reference isolate. We confirmed the relatedness of RFLP-grouped isolates and showed that two ungrouped isolates represent distinct genomic groups. Isolates contained up to 20 genomic polymorphisms consisting of 1 to 18 ORFs each. These were mostly complete ORF deletions, although partial deletions, point mutations, and insertions were also identified. A total of 139 chromosomal and plasmid ORFs were polymorphic among all C. burnetii isolates, representing ca. 7% of the NMI coding capacity. Approximately 67% of all deleted ORFs were hypothetical, while 9% were annotated in NMI as nonfunctional (e.g., frameshifted). The remaining deleted ORFs were associated with diverse cellular functions. The only deletions associated with isogenic NMI variants of attenuated virulence were previously described large deletions containing genes involved in lipopolysaccharide (LPS) biosynthesis, suggesting that these polymorphisms alone are responsible for the lower virulence of these variants. Interestingly, a variant of the Australia QD isolate producing truncated LPS had no detectable deletions, indicating LPS truncation can occur via small genetic changes. Our results provide new insight into the genetic diversity and virulence potential of Coxiella species. PMID:16547017

  20. Population Genomics of Sub-Saharan Drosophila melanogaster: African Diversity and Non-African Admixture

    PubMed Central

    Pool, John E.; Corbett-Detig, Russell B.; Sugino, Ryuichi P.; Stevens, Kristian A.; Cardeno, Charis M.; Crepeau, Marc W.; Duchen, Pablo; Emerson, J. J.; Saelao, Perot; Begun, David J.; Langley, Charles H.

    2012-01-01

    Drosophila melanogaster has played a pivotal role in the development of modern population genetics. However, many basic questions regarding the demographic and adaptive history of this species remain unresolved. We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection within an African population, between African populations, and between European and African populations. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation. 
Outliers for Europe-Africa FST were found to be enriched in genomic regions of locally elevated cosmopolitan

  1. Genomes of diverse isolates of the marine cyanobacterium Prochlorococcus

    PubMed Central

    Biller, Steven J.; Berube, Paul M.; Berta-Thompson, Jessie W.; Kelly, Libusha; Roggensack, Sara E.; Awad, Lana; Roache-Johnson, Kathryn H.; Ding, Huiming; Giovannoni, Stephen J.; Rocap, Gabrielle; Moore, Lisa R.; Chisholm, Sallie W.

    2014-01-01

    The marine cyanobacterium Prochlorococcus is the numerically dominant photosynthetic organism in the oligotrophic oceans, and a model system in marine microbial ecology. Here we report 27 new whole genome sequences (2 complete and closed; 25 of draft quality) of cultured isolates, representing five major phylogenetic clades of Prochlorococcus. The sequenced strains were isolated from diverse regions of the oceans, facilitating studies of the drivers of microbial diversity—both in the lab and in the field. To improve the utility of these genomes for comparative genomics, we also define pre-computed clusters of orthologous groups of proteins (COGs), indicating how genes are distributed among these and other publicly available Prochlorococcus genomes. These data represent a significant expansion of Prochlorococcus reference genomes that are useful for numerous applications in microbial ecology, evolution and oceanography. PMID:25977791

  2. The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution.

    PubMed

    Verde, Ignazio; Abbott, Albert G; Scalabrin, Simone; Jung, Sook; Shu, Shengqiang; Marroni, Fabio; Zhebentyayeva, Tatyana; Dettori, Maria Teresa; Grimwood, Jane; Cattonaro, Federica; Zuccolo, Andrea; Rossini, Laura; Jenkins, Jerry; Vendramin, Elisa; Meisel, Lee A; Decroocq, Veronique; Sosinski, Bryon; Prochnik, Simon; Mitros, Therese; Policriti, Alberto; Cipriani, Guido; Dondini, Luca; Ficklin, Stephen; Goodstein, David M; Xuan, Pengfei; Del Fabbro, Cristian; Aramini, Valeria; Copetti, Dario; Gonzalez, Susana; Horner, David S; Falchi, Rachele; Lucas, Susan; Mica, Erica; Maldonado, Jonathan; Lazzari, Barbara; Bielenberg, Douglas; Pirona, Raul; Miculan, Mara; Barakat, Abdelali; Testolin, Raffaele; Stella, Alessandra; Tartarini, Stefano; Tonutti, Pietro; Arús, Pere; Orellana, Ariel; Wells, Christina; Main, Dorrie; Vizzotto, Giannina; Silva, Herman; Salamini, Francesco; Schmutz, Jeremy; Morgante, Michele; Rokhsar, Daniel S

    2013-05-01

    Rosaceae is the most important fruit-producing clade, and its key commercially relevant genera (Fragaria, Rosa, Rubus and Prunus) show broadly diverse growth habits, fruit types and compact diploid genomes. Peach, a diploid Prunus species, is one of the best genetically characterized deciduous trees. Here we describe the high-quality genome sequence of peach obtained from a completely homozygous genotype. We obtained a complete chromosome-scale assembly using Sanger whole-genome shotgun methods. We predicted 27,852 protein-coding genes, as well as noncoding RNAs. We investigated the path of peach domestication through whole-genome resequencing of 14 Prunus accessions. The analyses suggest major genetic bottlenecks that have substantially shaped peach genome diversity. Furthermore, comparative analyses showed that peach has not undergone recent whole-genome duplication, and even though the ancestral triplicated blocks in peach are fragmentary compared to those in grape, all seven paleosets of paralogs from the putative paleoancestor are detectable.

  3. Absence of genome reduction in diverse, facultative endohyphal bacteria

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baltrus, David A.; Dougherty, Kevin; Arendt, Kayla R.

    Fungi interact closely with bacteria, both on the surfaces of the hyphae and within their living tissues (i.e. endohyphal bacteria, EHB). These EHB can be obligate or facultative symbionts and can mediate diverse phenotypic traits in their hosts. Although EHB have been observed in many lineages of fungi, it remains unclear how widespread and general these associations are, and whether unifying ecological and genomic features can be found across EHB strains as a whole. We cultured 11 bacterial strains after they emerged from the hyphae of diverse Ascomycota that were isolated as foliar endophytes of cupressaceous trees, and generated nearly complete genome sequences for all. Unlike the genomes of largely obligate EHB, the genomes of these facultative EHB resembled those of closely related strains isolated from environmental sources. Although all analysed genomes encoded structures that could be used to interact with eukaryotic hosts, pathways previously implicated in maintenance and establishment of EHB symbiosis were not universally present across all strains. Independent isolation of two nearly identical pairs of strains from different classes of fungi, coupled with recent experimental evidence, suggests horizontal transfer of EHB across endophytic hosts. Given the potential for EHB to influence fungal phenotypes, these genomes could shed light on the mechanisms of plant growth promotion or stress mitigation by fungal endophytes during the symbiotic phase, as well as degradation of plant material during the saprotrophic phase. As such, these findings contribute to the illumination of a new dimension of functional biodiversity in fungi.

  4. Absence of genome reduction in diverse, facultative endohyphal bacteria

    DOE PAGES

    Baltrus, David A.; Dougherty, Kevin; Arendt, Kayla R.; ...

    2017-02-28

    Fungi interact closely with bacteria, both on the surfaces of the hyphae and within their living tissues (i.e. endohyphal bacteria, EHB). These EHB can be obligate or facultative symbionts and can mediate diverse phenotypic traits in their hosts. Although EHB have been observed in many lineages of fungi, it remains unclear how widespread and general these associations are, and whether unifying ecological and genomic features can be found across EHB strains as a whole. We cultured 11 bacterial strains after they emerged from the hyphae of diverse Ascomycota that were isolated as foliar endophytes of cupressaceous trees, and generated nearly complete genome sequences for all. Unlike the genomes of largely obligate EHB, the genomes of these facultative EHB resembled those of closely related strains isolated from environmental sources. Although all analysed genomes encoded structures that could be used to interact with eukaryotic hosts, pathways previously implicated in maintenance and establishment of EHB symbiosis were not universally present across all strains. Independent isolation of two nearly identical pairs of strains from different classes of fungi, coupled with recent experimental evidence, suggests horizontal transfer of EHB across endophytic hosts. Given the potential for EHB to influence fungal phenotypes, these genomes could shed light on the mechanisms of plant growth promotion or stress mitigation by fungal endophytes during the symbiotic phase, as well as degradation of plant material during the saprotrophic phase. As such, these findings contribute to the illumination of a new dimension of functional biodiversity in fungi.

  5. Annotated Draft Genome Assemblies for the Northern Bobwhite (Colinus virginianus) and the Scaled Quail (Callipepla squamata) Reveal Disparate Estimates of Modern Genome Diversity and Historic Effective Population Size.

    PubMed

    Oldeschulte, David L; Halley, Yvette A; Wilson, Miranda L; Bhattarai, Eric K; Brashear, Wesley; Hill, Joshua; Metz, Richard P; Johnson, Charles D; Rollins, Dale; Peterson, Markus J; Bickhart, Derek M; Decker, Jared E; Sewell, John F; Seabury, Christopher M

    2017-09-07

    Northern bobwhite (Colinus virginianus; hereafter bobwhite) and scaled quail (Callipepla squamata) populations have suffered precipitous declines across most of their US ranges. Illumina-based first- (v1.0) and second- (v2.0) generation draft genome assemblies for the scaled quail and the bobwhite produced N50 scaffold sizes of 1.035 and 2.042 Mb, thereby producing a 45-fold improvement in contiguity over the existing bobwhite assembly, and ≥90% of the assembled genomes were captured within 1313 and 8990 scaffolds, respectively. The scaled quail assembly (v1.0 = 1.045 Gb) was ∼20% smaller than the bobwhite (v2.0 = 1.254 Gb), which was supported by kmer-based estimates of genome size. Nevertheless, estimates of GC content (41.72%; 42.66%), genome-wide repetitive content (10.40%; 10.43%), and MAKER-predicted protein coding genes (17,131; 17,165) were similar for the scaled quail (v1.0) and bobwhite (v2.0) assemblies, respectively. BUSCO analyses utilizing 3023 single-copy orthologs revealed a high level of assembly completeness for the scaled quail (v1.0; 84.8%) and the bobwhite (v2.0; 82.5%), as verified by comparison with well-established avian genomes. We also detected 273 putative segmental duplications in the scaled quail genome (v1.0), and 711 in the bobwhite genome (v2.0), including some that were shared among both species. Autosomal variant prediction revealed ∼2.48 and 4.17 heterozygous variants per kilobase within the scaled quail (v1.0) and bobwhite (v2.0) genomes, respectively, and estimates of historic effective population size were uniformly higher for the bobwhite across all time points in a coalescent model. However, large-scale declines were predicted for both species beginning ∼15-20 KYA. Copyright © 2017 Oldeschulte et al.
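
    The contiguity figures quoted in this abstract (N50 scaffold size, and the number of scaffolds capturing at least 90% of the assembly) can be illustrated with a minimal sketch. The function names and the toy scaffold lengths below are hypothetical and are not taken from the study; the definition used is the common one, in which N50 is the length of the scaffold at which the running total of the largest scaffolds first reaches half of the assembled bases.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def scaffolds_to_cover(lengths, fraction):
    """Return how many of the largest scaffolds are needed to reach
    the given fraction (0-1) of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return count
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths (hypothetical, arbitrary units).
toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_scaffolds)
toy_count_90 = scaffolds_to_cover(toy_scaffolds, 0.90)
```

    On the toy input the two largest scaffolds already hold half of the 32 units, so the N50 is 8, and seven scaffolds are needed to reach 90% of the total.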

  6. Multiple genome sequences reveal adaptations of a phototrophic bacterium to sediment microenvironments.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oda, Yasuhiro; Larimer, Frank W; Chain, Patrick S. G.

    The bacterial genus Rhodopseudomonas is comprised of photosynthetic bacteria found widely distributed in aquatic sediments. Members of the genus catalyze hydrogen gas production, carbon dioxide sequestration, and biomass turnover. The genome sequence of Rhodopseudomonas palustris CGA009 revealed a surprising richness of metabolic versatility that would seem to explain its ability to live in a heterogeneous environment like sediment. However, there is considerable genotypic diversity among Rhodopseudomonas isolates. Here we report the complete genome sequences of four additional members of the genus isolated from a restricted geographical area. The sequences confirm that the isolates belong to a coherent taxonomic unit, but they also have significant differences. Whole genome alignments show that the circular chromosomes of the isolates consist of a collinear backbone with a moderate number of genomic rearrangements that impact local gene order and orientation. There are 3,319 genes, 70% of the genes in each genome, shared by four or more strains. Between 10% and 18% of the genes in each genome are strain specific. Some of these genes suggest specialized physiological traits, which we verified experimentally, that include expanded light harvesting, oxygen respiration, and nitrogen fixation capabilities, as well as anaerobic fermentation. Strain-specific adaptations include traits that may be useful in bioenergy applications. This work suggests that against a backdrop of metabolic versatility that is a defining characteristic of Rhodopseudomonas, different ecotypes have evolved to take advantage of physical and chemical conditions in sediment microenvironments that are too small for human observation.

  7. The genome of the endophytic bacterium H. frisingense GSF30(T) identifies diverse strategies in the Herbaspirillum genus to interact with plants.

    PubMed

    Straub, Daniel; Rothballer, Michael; Hartmann, Anton; Ludewig, Uwe

    2013-01-01

    The diazotrophic, bacterial endophyte Herbaspirillum frisingense GSF30(T) has been identified in biomass grasses grown in temperate climate, including the highly nitrogen-efficient grass Miscanthus. Its genome was annotated and compared with related Herbaspirillum species from diverse habitats, including H. seropedicae, and further well-characterized endophytes. The analysis revealed that Herbaspirillum frisingense lacks a type III secretion system that is present in some related Herbaspirillum grass endophytes. Together with the lack of components of the type II secretion system, the genomic inventory indicates distinct interaction scenarios of endophytic Herbaspirillum strains with plants. Differences in respiration, carbon, nitrogen and cell wall metabolism among Herbaspirillum isolates partially correlate with their different habitats. Herbaspirillum frisingense is closely related to strains isolated from the rhizosphere of phragmites and from well water, but these lack nitrogen fixation and metabolism genes. Within grass endophytes, the high diversity in their genomic inventory suggests that even individual plant species provide distinct, highly diverse metabolic niches for successful endophyte-plant associations.

  8. The genome of the endophytic bacterium H. frisingense GSF30T identifies diverse strategies in the Herbaspirillum genus to interact with plants

    PubMed Central

    Straub, Daniel; Rothballer, Michael; Hartmann, Anton; Ludewig, Uwe

    2013-01-01

    The diazotrophic, bacterial endophyte Herbaspirillum frisingense GSF30T has been identified in biomass grasses grown in temperate climate, including the highly nitrogen-efficient grass Miscanthus. Its genome was annotated and compared with related Herbaspirillum species from diverse habitats, including H. seropedicae, and further well-characterized endophytes. The analysis revealed that Herbaspirillum frisingense lacks a type III secretion system that is present in some related Herbaspirillum grass endophytes. Together with the lack of components of the type II secretion system, the genomic inventory indicates distinct interaction scenarios of endophytic Herbaspirillum strains with plants. Differences in respiration, carbon, nitrogen and cell wall metabolism among Herbaspirillum isolates partially correlate with their different habitats. Herbaspirillum frisingense is closely related to strains isolated from the rhizosphere of phragmites and from well water, but these lack nitrogen fixation and metabolism genes. Within grass endophytes, the high diversity in their genomic inventory suggests that even individual plant species provide distinct, highly diverse metabolic niches for successful endophyte-plant associations. PMID:23825472

  9. Genome-wide nucleotide diversity of hatchery-reared Atlantic and Mediterranean strains of brown trout Salmo trutta compared to wild Mediterranean populations.

    PubMed

    Leitwein, M; Gagnaire, P-A; Desmarais, E; Guendouz, S; Rohmer, M; Berrebi, P; Guinand, B

    2016-12-01

    A genome-wide assessment of diversity is provided for wild Mediterranean brown trout Salmo trutta populations from headwater tributaries of the Orb River and from Atlantic and Mediterranean hatchery-reared strains that have been used for stocking. Double-digest restriction-site-associated DNA sequencing (dd-RADseq) was performed and the efficiency of de novo and reference-mapping approaches to obtain individual genotypes was compared. Large numbers of single nucleotide polymorphism (SNP) markers with similar genome-wide distributions were discovered using both approaches (196 639 v. 121 016 SNPs, respectively), with c. 80% of the loci detected de novo being also found with reference mapping, using the Atlantic salmon Salmo salar genome as a reference. Lower mapping density but larger nucleotide diversity (π) was generally observed near extremities of linkage groups, consistent with regions of residual tetrasomic inheritance observed in salmonids. Genome-wide diversity estimates revealed reduced polymorphism in hatchery strains (π = 0·0040 and π = 0·0029 in Atlantic and Mediterranean strains, respectively) compared to wild populations (π = 0·0049), a pattern that was congruent with allelic richness estimated from microsatellite markers. Finally, pronounced heterozygote deficiency was found in hatchery strains (Atlantic F IS = 0·18; Mediterranean F IS = 0·42), indicating that stocking practices may affect the genetic diversity in wild populations. These new genomic resources will provide important tools to define better conservation strategies in S. trutta. © 2016 The Fisheries Society of the British Isles.
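
    The two summary statistics reported in this abstract, nucleotide diversity (π) and the inbreeding coefficient F IS, can be sketched as follows. This is only an illustration under simplifying assumptions: π is computed as the mean number of pairwise differences per site over a set of aligned, equal-length sequences, and F IS as 1 - Ho/He at a single biallelic locus. The function names and toy inputs are hypothetical, and the study's own estimators (applied to dd-RADseq genotypes) may differ.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site (pi) for a list of
    aligned sequences of identical, non-zero length."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned and non-empty")
    pairs = list(combinations(sequences, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def f_is(n_hom_ref, n_het, n_hom_alt):
    """Inbreeding coefficient F_IS = 1 - Ho/He at one biallelic locus,
    from counts of the three genotype classes."""
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_hom_ref + n_het) / (2 * n)   # reference allele frequency
    expected_het = 2 * p * (1 - p)          # He under Hardy-Weinberg
    observed_het = n_het / n                # Ho
    if expected_het == 0:
        raise ValueError("locus is monomorphic; F_IS is undefined")
    return 1 - observed_het / expected_het


# Toy data (hypothetical): three aligned 4-bp sequences, one locus.
toy_pi = nucleotide_diversity(["ACGT", "ACGA", "ACGT"])
toy_fis = f_is(n_hom_ref=40, n_het=20, n_hom_alt=40)
```

    A positive F IS, as in the toy locus, indicates fewer heterozygotes than expected under random mating, which is the sense in which the abstract reports heterozygote deficiency in the hatchery strains.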

  10. Chloroplast genomes: diversity, evolution, and applications in genetic engineering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daniell, Henry; Lin, Choun -Sea; Yu, Ming

    Chloroplasts play a crucial role in sustaining life on earth. The availability of over 800 sequenced chloroplast genomes from a variety of land plants has enhanced our understanding of chloroplast biology, intracellular gene transfer, conservation, diversity, and the genetic basis by which chloroplast transgenes can be engineered to enhance plant agronomic traits or to produce high-value agricultural or biomedical products. In this review, we discuss the impact of chloroplast genome sequences on understanding the origins of economically important cultivated species and changes that have taken place during domestication. Here, we also discuss the potential biotechnological applications of chloroplast genomes.

  11. Chloroplast genomes: diversity, evolution, and applications in genetic engineering

    DOE PAGES

    Daniell, Henry; Lin, Choun -Sea; Yu, Ming; ...

    2016-06-23

    Chloroplasts play a crucial role in sustaining life on earth. The availability of over 800 sequenced chloroplast genomes from a variety of land plants has enhanced our understanding of chloroplast biology, intracellular gene transfer, conservation, diversity, and the genetic basis by which chloroplast transgenes can be engineered to enhance plant agronomic traits or to produce high-value agricultural or biomedical products. In this review, we discuss the impact of chloroplast genome sequences on understanding the origins of economically important cultivated species and changes that have taken place during domestication. Here, we also discuss the potential biotechnological applications of chloroplast genomes.

  12. Genomic data reveal a loss of diversity in two species of tuco-tucos (genus Ctenomys) following a volcanic eruption.

    PubMed

    Hsu, Jeremy L; Crawford, Jeremy Chase; Tammone, Mauro N; Ramakrishnan, Uma; Lacey, Eileen A; Hadly, Elizabeth A

    2017-11-24

    Marked reductions in population size can trigger corresponding declines in genetic variation. Understanding the precise genetic consequences of such reductions, however, is often challenging due to the absence of robust pre- and post-reduction datasets. Here, we use heterochronous genomic data from samples obtained before and immediately after the 2011 eruption of the Puyehue-Cordón Caulle volcanic complex in Patagonia to explore the genetic impacts of this event on two parapatric species of rodents, the colonial tuco-tuco (Ctenomys sociabilis) and the Patagonian tuco-tuco (C. haigi). Previous analyses using microsatellites revealed no post-eruption changes in genetic variation in C. haigi, but an unexpected increase in variation in C. sociabilis. To explore this outcome further, we used targeted gene capture to sequence over 2,000 putatively neutral regions for both species. Our data revealed that, contrary to the microsatellite analyses, the eruption was associated with a small but significant decrease in genetic variation in both species. We suggest that genome-level analyses provide greater power than traditional molecular markers to detect the genetic consequences of population size changes, particularly changes that are recent, short-term, or modest in size. Consequently, genomic analyses promise to generate important new insights into the effects of specific environmental events on demography and genetic variation.

  13. Novel Phage Group Infecting Lactobacillus delbrueckii subsp. lactis, as Revealed by Genomic and Proteomic Analysis of Bacteriophage Ldl1

    PubMed Central

    Casey, Eoghan; Mahony, Jennifer; Neve, Horst; Noben, Jean-Paul; Dal Bello, Fabio

    2014-01-01

    Ldl1 is a virulent phage infecting the dairy starter Lactobacillus delbrueckii subsp. lactis LdlS. Electron microscopy analysis revealed that this phage exhibits a large head and a long tail and bears little resemblance to other characterized phages infecting Lactobacillus delbrueckii. In vitro propagation of this phage revealed a latent period of 30 to 40 min and a burst size of 59.9 ± 1.9 phage particles. Comparative genomic and proteomic analyses showed remarkable similarity between the genome of Ldl1 and that of Lactobacillus plantarum phage ATCC 8014-B2. The genomic and proteomic characteristics of Ldl1 demonstrate that this phage does not belong to any of the four previously recognized L. delbrueckii phage groups, necessitating the creation of a new group, called group e, thus adding to the knowledge on the diversity of phages targeting strains of this industrially important lactic acid bacterial species. PMID:25501478

  14. Novel phage group infecting Lactobacillus delbrueckii subsp. lactis, as revealed by genomic and proteomic analysis of bacteriophage Ldl1.

    PubMed

    Casey, Eoghan; Mahony, Jennifer; Neve, Horst; Noben, Jean-Paul; Dal Bello, Fabio; van Sinderen, Douwe

    2015-02-01

    Ldl1 is a virulent phage infecting the dairy starter Lactobacillus delbrueckii subsp. lactis LdlS. Electron microscopy analysis revealed that this phage exhibits a large head and a long tail and bears little resemblance to other characterized phages infecting Lactobacillus delbrueckii. In vitro propagation of this phage revealed a latent period of 30 to 40 min and a burst size of 59.9 +/- 1.9 phage particles. Comparative genomic and proteomic analyses showed remarkable similarity between the genome of Ldl1 and that of Lactobacillus plantarum phage ATCC 8014-B2. The genomic and proteomic characteristics of Ldl1 demonstrate that this phage does not belong to any of the four previously recognized L. delbrueckii phage groups, necessitating the creation of a new group, called group e, thus adding to the knowledge on the diversity of phages targeting strains of this industrially important lactic acid bacterial species.

  15. Comparative genomics of four closely related Clostridium perfringens bacteriophages reveals variable evolution among core genes with therapeutic potential

    PubMed Central

    2011-01-01

    Background: Because biotechnological uses of bacteriophage gene products as alternatives to conventional antibiotics will require a thorough understanding of their genomic context, we sequenced and analyzed the genomes of four closely related phages isolated from Clostridium perfringens, an important agricultural and human pathogen. Results: Phage whole-genome tetra-nucleotide signatures and proteomic tree topologies correlated closely with host phylogeny. Comparisons of our phage genomes to 26 others revealed three shared COGs; of particular interest within this core genome was an endolysin (PF01520, an N-acetylmuramoyl-L-alanine amidase) and a holin (PF04531). Comparative analyses of the evolutionary history and genomic context of these common phage proteins revealed two important results: 1) strongly significant host-specific sequence variation within the endolysin, and 2) a protein domain architecture apparently unique to our phage genomes in which the endolysin is located upstream of its associated holin. Endolysin sequences from our phages were one of two very distinct genotypes distinguished by variability within the putative enzymatically-active domain. The shared or core genome was comprised of genes with multiple sequence types belonging to five pfam families, and genes belonging to 12 pfam families, including the holin genes, which were nearly identical. Conclusions: Significant genomic diversity exists even among closely-related bacteriophages. Holins and endolysins represent conserved functions across divergent phage genomes and, as we demonstrate here, endolysins can have significant variability and host-specificity even among closely-related genomes. Endolysins in our phage genomes may be subject to different selective pressures than the rest of the genome. These findings may have important implications for potential biotechnological applications of phage gene products. PMID:21631945

  16. Analysis of ATP6 sequence diversity in the Triticum-Aegilops group of species reveals the crucial role of rearrangement in mitochondrial genome evolution

    USDA-ARS's Scientific Manuscript database

    Mutation and chromosomal rearrangements are the two main forces of increasing genetic diversity for natural selection to act upon, and ultimately drive the evolutionary process. Although genome evolution is a function of both forces, simultaneously, the ratio of each can be varied among different ge...

  17. A LDA-based approach to promoting ranking diversity for genomics information retrieval.

    PubMed

    Chen, Yan; Yin, Xiaoshi; Li, Zhoujun; Hu, Xiaohua; Huang, Jimmy Xiangji

    2012-06-11

    In the biomedical domain, there is an immense amount of data and a tremendous increase in genomics and biomedically relevant publications. The wealth of information has led to an increasing amount of interest in and need for applying information retrieval techniques to access the scientific literature in genomics and related biomedical disciplines. In many cases, the desired information for a query asked by biologists is a list of a certain type of entities covering different aspects that are related to the question, such as cells, genes, diseases, proteins, mutations, etc. Hence, it is important for a biomedical IR system to be able to provide relevant and diverse answers to fulfill biologists' information needs. However, traditional IR models are concerned only with the relevance between retrieved documents and the user query, and do not take redundancy between retrieved documents into account. This leads to high redundancy and low diversity in the retrieved ranked lists. In this paper, we propose an approach which employs a topic generative model called Latent Dirichlet Allocation (LDA) to promote ranking diversity for biomedical information retrieval. Different from other approaches or models which consider aspects on the word level, our approach assumes that aspects should be identified by the topics of retrieved documents. We present an LDA model to discover the topic distribution of retrieved passages and the word distribution of each topic dimension, and then re-rank retrieval results with topic distribution similarity between passages based on an N-size sliding window. We apply our approach to the TREC 2007 Genomics collection and two distinctive IR baseline runs, which can achieve 8% improvement over the highest Aspect MAP reported in TREC 2007 Genomics track. The proposed method is the first study of adopting topic model to genomics information retrieval, and demonstrates its effectiveness in promoting ranking diversity as well as in improving relevance of ranked lists of genomics search

  18. Who Are the Okinawans? Ancestry, Genome Diversity, and Implications for the Genetic Study of Human Longevity From a Geographically Isolated Population

    PubMed Central

    Hsueh, Wen-Chi; He, Qimei; Willcox, D. Craig; Nievergelt, Caroline M.; Donlon, Timothy A.; Kwok, Pui-Yan; Suzuki, Makoto; Willcox, Bradley J.

    2014-01-01

    Isolated populations have advantages for genetic studies of longevity from decreased haplotype diversity and long-range linkage disequilibrium. This permits smaller sample sizes without loss of power, among other utilities. Little is known about the genome of the Okinawans, a potential population isolate, recognized for longevity. Therefore, we assessed genetic diversity, structure, and admixture in Okinawans, and compared this with Caucasians, Chinese, Japanese, and Africans from HapMap II, genotyped on the same Affymetrix GeneChip Human Mapping 500K array. Principal component analysis, haplotype coverage, and linkage disequilibrium decay revealed a distinct Okinawan genome—more homogeneity, less haplotype diversity, and longer range linkage disequilibrium. Population structure and admixture analyses utilizing 52 global reference populations from the Human Genome Diversity Cell Line Panel demonstrated that Okinawans clustered almost exclusively with East Asians. Sibling relative risk (λs) analysis revealed that siblings of Okinawan centenarians have 3.11 times (females) and 3.77 times (males) more likelihood of centenarianism. These findings suggest that Okinawans are genetically distinct and share several characteristics of a population isolate, which are prone to develop extreme phenotypes (eg, longevity) from genetic drift, natural selection, and population bottlenecks. These data support further exploration of genetic influence on longevity in the Okinawans. PMID:24444611

  19. Genetic Diversity in the Modern Horse Illustrated from Genome-Wide SNP Data

    PubMed Central

    Petersen, Jessica L.; Mickelson, James R.; Cothran, E. Gus; Andersson, Lisa S.; Axelsson, Jeanette; Bailey, Ernie; Bannasch, Danika; Binns, Matthew M.; Borges, Alexandre S.; Brama, Pieter; da Câmara Machado, Artur; Distl, Ottmar; Felicetti, Michela; Fox-Clipsham, Laura; Graves, Kathryn T.; Guérin, Gérard; Haase, Bianca; Hasegawa, Telhisa; Hemmann, Karin; Hill, Emmeline W.; Leeb, Tosso; Lindgren, Gabriella; Lohi, Hannes; Lopes, Maria Susana; McGivney, Beatrice A.; Mikko, Sofia; Orr, Nicholas; Penedo, M. Cecilia T; Piercy, Richard J.; Raekallio, Marja; Rieder, Stefan; Røed, Knut H.; Silvestrelli, Maurizio; Swinburne, June; Tozaki, Teruaki; Vaudin, Mark; M. Wade, Claire; McCue, Molly E.

    2013-01-01

    Horses were domesticated from the Eurasian steppes 5,000–6,000 years ago. Since then, the use of horses for transportation, warfare, and agriculture, as well as selection for desired traits and fitness, has resulted in diverse populations distributed across the world, many of which have become or are in the process of becoming formally organized into closed, breeding populations (breeds). This report describes the use of a genome-wide set of autosomal SNPs and 814 horses from 36 breeds to provide the first detailed description of equine breed diversity. FST calculations, parsimony, and distance analysis demonstrated relationships among the breeds that largely reflect geographic origins and known breed histories. Low levels of population divergence were observed between breeds that are relatively early on in the process of breed development, and between those with high levels of within-breed diversity, whether due to large population size, ongoing outcrossing, or large within-breed phenotypic diversity. Populations with low within-breed diversity included those which have experienced population bottlenecks, have been under intense selective pressure, or are closed populations with long breed histories. These results provide new insights into the relationships among and the diversity within breeds of horses. In addition these results will facilitate future genome-wide association studies and investigations into genomic targets of selection. PMID:23383025
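
    The abstract above reports FST calculations among breeds but does not say which estimator was used. As a rough illustration only, the sketch below computes Wright's FST for a single biallelic SNP from per-population allele frequencies, weighting populations equally. The function name and the example frequencies are hypothetical and are not taken from the study.

```python
def fst_from_frequencies(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs: allele frequencies of the same allele in each population
    (hypothetical input; populations are weighted equally, an assumption).
    """
    if not freqs:
        raise ValueError("need at least one population")
    n = len(freqs)
    # H_S: mean expected heterozygosity within populations
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    # H_T: expected heterozygosity of the pooled population
    p_bar = sum(freqs) / n
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # locus is monomorphic everywhere; no differentiation to measure
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Two hypothetical breeds with allele frequencies 0.2 and 0.8
    print(round(fst_from_frequencies([0.2, 0.8]), 3))
```

    Genome-wide studies usually combine many loci and correct for sample size (for example with the Weir and Cockerham estimator), so this single-locus form is best read as a definition rather than a production method.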

  20. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters.

    PubMed

    Schorn, Michelle A; Alanjary, Mohammad M; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R; Ziemert, Nadine; Moore, Bradley S

    2016-12-01

    Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites.

  1. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters

    PubMed Central

    Schorn, Michelle A.; Alanjary, Mohammad M.; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R.; Ziemert, Nadine

    2016-01-01

    Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites. PMID:27902408

  2. Genome-wide genetic diversity, population structure and admixture analysis in African and Asian cattle breeds.

    PubMed

    Edea, Z; Bhuiyan, M S A; Dessie, T; Rothschild, M F; Dadi, H; Kim, K S

    2015-02-01

    Knowledge about genetic diversity and population structure is useful for designing effective strategies to improve the production, management and conservation of farm animal genetic resources. Here, we present a comprehensive genome-wide analysis of genetic diversity, population structure and admixture based on 244 animals sampled from 10 cattle populations in Asia and Africa and genotyped for 69,903 autosomal single-nucleotide polymorphisms (SNPs) mainly derived from the indicine breed. Principal component analysis, STRUCTURE and distance analysis from high-density SNP data clearly revealed that the largest genetic difference occurred between the two domestic lineages (taurine and indicine), whereas Ethiopian cattle populations represent a mosaic of the humped zebu and taurine. Estimation of the genetic influence of zebu and taurine revealed that Ethiopian cattle were characterized by considerable levels of introgression from South Asian zebu, whereas Bangladeshi populations shared very low taurine ancestry. The relationships among Ethiopian cattle populations reflect their history of origin and admixture rather than phenotype-based distinctions. The high within-individual genetic variability observed in Ethiopian cattle represents an untapped opportunity for adaptation to changing environments and for implementation of within-breed genetic improvement schemes. Our results provide a basis for future applications of genome-wide SNP data to exploit the unique genetic makeup of indigenous cattle breeds and to facilitate their improvement and conservation.

  3. Natural Product Biosynthetic Diversity and Comparative Genomics of the Cyanobacteria.

    PubMed

    Dittmann, Elke; Gugger, Muriel; Sivonen, Kaarina; Fewer, David P

    2015-10-01

    Cyanobacteria are an ancient lineage of slow-growing photosynthetic bacteria and a prolific source of natural products with intricate chemical structures and potent biological activities. The bulk of these natural products are known from just a handful of genera. Recent efforts have elucidated the mechanisms underpinning the biosynthesis of a diverse array of natural products from cyanobacteria. Many of the biosynthetic mechanisms are unique to cyanobacteria or rarely described from other organisms. Advances in genome sequence technology have precipitated a deluge of genome sequences for cyanobacteria. This makes it possible to link known natural products to biosynthetic gene clusters but also accelerates the discovery of new natural products through genome mining. These studies demonstrate that cyanobacteria encode a huge variety of cryptic gene clusters for the production of natural products, and the known chemical diversity is likely to be just a fraction of the true biosynthetic capabilities of this fascinating and ancient group of organisms. Copyright © 2015. Published by Elsevier Ltd.

  4. Genome-wide characterization of genetic diversity and population structure in Secale

    PubMed Central

    2014-01-01

    Background: Numerous rye accessions are stored in ex situ genebanks worldwide. Little is known about the extent of genetic diversity contained in any of them and its relation to contemporary varieties, since to date rye genetic diversity studies have had a very limited scope, analyzing few loci and/or few accessions. Development of high-throughput genotyping methods for rye opened the possibility for genome-wide characterization of large accession sets. In this study we used 1054 Diversity Array Technology (DArT) markers with defined chromosomal location to characterize genetic diversity and population structure in a collection of 379 rye accessions including wild species, landraces, cultivated materials, and historical and contemporary rye varieties. Results: Average genetic similarity (GS) coefficients and average polymorphic information content (PIC) values varied among chromosomes. Comparison of chromosome-specific average GS within and between germplasm sub-groups indicated regions of chromosomes 1R and 4R as being targeted by selection in current breeding programs. Bayesian clustering, principal coordinate analysis and Neighbor Joining clustering demonstrated that source and improvement status contributed significantly to the structure observed in the analyzed set of Secale germplasm. We revealed a relatively limited diversity in improved rye accessions, both historical and contemporary, as well as a lack of correlation between clustering of improved accessions and geographic origin, suggesting a common genetic background of rye accessions from diverse geographic regions and extensive germplasm exchange. Moreover, contemporary varieties were distinct from the remaining accessions. Conclusions: Our results point to an influence of reproduction methods on the observed diversity patterns and indicate the potential of ex situ collections for broadening the genetic diversity in rye breeding programs. Obtained data show that DArT markers provide a realistic picture of the genetic

  5. Targeted activation of diverse CRISPR-Cas systems for mammalian genome editing via proximal CRISPR targeting.

    PubMed

    Chen, Fuqiang; Ding, Xiao; Feng, Yongmei; Seebeck, Timothy; Jiang, Yanfang; Davis, Gregory D

    2017-04-07

    Bacterial CRISPR-Cas systems comprise diverse effector endonucleases with different targeting ranges, specificities and enzymatic properties, but many of them are inactive in mammalian cells and are thus precluded from genome-editing applications. Here we show that the type II-B FnCas9 from Francisella novicida possesses novel properties, but its nuclease function is frequently inhibited at many genomic loci in living human cells. Moreover, we develop a proximal CRISPR (termed proxy-CRISPR) targeting method that restores FnCas9 nuclease activity in a target-specific manner. We further demonstrate that this proxy-CRISPR strategy is applicable to diverse CRISPR-Cas systems, including type II-C Cas9 and type V Cpf1 systems, and can facilitate precise gene editing even between identical genomic sites within the same genome. Our findings provide a novel strategy to enable use of diverse otherwise inactive CRISPR-Cas systems for genome-editing applications and a potential path to modulate the impact of chromatin microenvironments on genome modification.

  6. Targeted activation of diverse CRISPR-Cas systems for mammalian genome editing via proximal CRISPR targeting

    PubMed Central

    Chen, Fuqiang; Ding, Xiao; Feng, Yongmei; Seebeck, Timothy; Jiang, Yanfang; Davis, Gregory D.

    2017-01-01

    Bacterial CRISPR–Cas systems comprise diverse effector endonucleases with different targeting ranges, specificities and enzymatic properties, but many of them are inactive in mammalian cells and are thus precluded from genome-editing applications. Here we show that the type II-B FnCas9 from Francisella novicida possesses novel properties, but its nuclease function is frequently inhibited at many genomic loci in living human cells. Moreover, we develop a proximal CRISPR (termed proxy-CRISPR) targeting method that restores FnCas9 nuclease activity in a target-specific manner. We further demonstrate that this proxy-CRISPR strategy is applicable to diverse CRISPR–Cas systems, including type II-C Cas9 and type V Cpf1 systems, and can facilitate precise gene editing even between identical genomic sites within the same genome. Our findings provide a novel strategy to enable use of diverse otherwise inactive CRISPR–Cas systems for genome-editing applications and a potential path to modulate the impact of chromatin microenvironments on genome modification. PMID:28387220

  7. Assessing Genetic Diversity among Brettanomyces Yeasts by DNA Fingerprinting and Whole-Genome Sequencing

    PubMed Central

    Crauwels, Sam; Zhu, Bo; Steensels, Jan; Busschaert, Pieter; De Samblanx, Gorik; Marchal, Kathleen; Willems, Kris A.

    2014-01-01

    Brettanomyces yeasts, with the species Brettanomyces (Dekkera) bruxellensis being the most important one, are generally reported to be spoilage yeasts in the beer and wine industry due to the production of phenolic off flavors. However, B. bruxellensis is also known to be a beneficial contributor in certain fermentation processes, such as the production of certain specialty beers. Nevertheless, despite its economic importance, Brettanomyces yeasts remain poorly understood at the genetic and genomic levels. In this study, the genetic relationship between more than 50 Brettanomyces strains from all presently known species and from several sources was studied using a combination of DNA fingerprinting techniques. This revealed an intriguing correlation between the B. bruxellensis fingerprints and the respective isolation source. To further explore this relationship, we sequenced a (beneficial) beer isolate of B. bruxellensis (VIB X9085; ST05.12/22) and compared its genome sequence with the genome sequences of two wine spoilage strains (AWRI 1499 and CBS 2499). ST05.12/22 was found to be substantially different from both wine strains, especially at the level of single nucleotide polymorphisms (SNPs). In addition, there were major differences in the genome structures between the strains investigated, including the presence of large duplications and deletions. Gene content analysis revealed the presence of 20 genes which were present in both wine strains but absent in the beer strain, including many genes involved in carbon and nitrogen metabolism, and vice versa, no genes that were missing in both AWRI 1499 and CBS 2499 were found in ST05.12/22. Together, this study provides tools to discriminate Brettanomyces strains and provides a first glimpse at the genetic diversity and genome plasticity of B. bruxellensis. PMID:24814796

  8. Assessing genetic diversity among Brettanomyces yeasts by DNA fingerprinting and whole-genome sequencing.

    PubMed

    Crauwels, Sam; Zhu, Bo; Steensels, Jan; Busschaert, Pieter; De Samblanx, Gorik; Marchal, Kathleen; Willems, Kris A; Verstrepen, Kevin J; Lievens, Bart

    2014-07-01

    Brettanomyces yeasts, with the species Brettanomyces (Dekkera) bruxellensis being the most important one, are generally reported to be spoilage yeasts in the beer and wine industry due to the production of phenolic off flavors. However, B. bruxellensis is also known to be a beneficial contributor in certain fermentation processes, such as the production of certain specialty beers. Nevertheless, despite its economic importance, Brettanomyces yeasts remain poorly understood at the genetic and genomic levels. In this study, the genetic relationship between more than 50 Brettanomyces strains from all presently known species and from several sources was studied using a combination of DNA fingerprinting techniques. This revealed an intriguing correlation between the B. bruxellensis fingerprints and the respective isolation source. To further explore this relationship, we sequenced a (beneficial) beer isolate of B. bruxellensis (VIB X9085; ST05.12/22) and compared its genome sequence with the genome sequences of two wine spoilage strains (AWRI 1499 and CBS 2499). ST05.12/22 was found to be substantially different from both wine strains, especially at the level of single nucleotide polymorphisms (SNPs). In addition, there were major differences in the genome structures between the strains investigated, including the presence of large duplications and deletions. Gene content analysis revealed the presence of 20 genes which were present in both wine strains but absent in the beer strain, including many genes involved in carbon and nitrogen metabolism, and vice versa, no genes that were missing in both AWRI 1499 and CBS 2499 were found in ST05.12/22. Together, this study provides tools to discriminate Brettanomyces strains and provides a first glimpse at the genetic diversity and genome plasticity of B. bruxellensis. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  9. Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity

    NASA Astrophysics Data System (ADS)

    Corcoran, Martin M.; Phad, Ganesh E.; Bernat, Néstor Vázquez; Stahl-Hennig, Christiane; Sumida, Noriyuki; Persson, Mats A. A.; Martin, Marcel; Hedestam, Gunilla B. Karlsson

    2016-12-01

    Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, results in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross library comparisons and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of the Indian and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between Balb/c and C57BL6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool.

  10. Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity

    PubMed Central

    Corcoran, Martin M.; Phad, Ganesh E.; Bernat, Néstor Vázquez; Stahl-Hennig, Christiane; Sumida, Noriyuki; Persson, Mats A.A.; Martin, Marcel; Hedestam, Gunilla B. Karlsson

    2016-01-01

    Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, results in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross library comparisons and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of the Indian and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between Balb/c and C57BL6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool. PMID:27995928

  11. Population Genomics Reveals Speciation and Introgression between Brown Norway Rats and Their Sibling Species

    PubMed Central

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Cai, Wanshi; Lu, Liang; Zhao, Fangqing; Sun, Zhongsheng; Zhang, Jianxu

    2017-01-01

    Murine rodents are excellent models for the study of adaptive radiations and speciation. Brown Norway rats (Rattus norvegicus) are successful global colonizers and the contributions of their domesticated laboratory strains to biomedical research are well established. To identify nucleotide-based speciation timing of the rat and genomic information contributing to its colonization capabilities, we analyzed 51 whole-genome sequences of wild-derived Brown Norway rats and their sibling species, R. nitidus, and identified over 20 million genetic variants in the wild Brown Norway rats that were absent in the laboratory strains, which substantially expand the reservoir of rat genetic diversity. We showed that divergence of the rat and its siblings coincided with drastic climatic changes that occurred during the Middle Pleistocene. Further, we revealed that there was a geographically widespread influx of genes between Brown Norway rats and the sibling species following the divergence, resulting in numerous introgressed regions in the genomes of admixed Brown Norway rats. Intriguingly, genes related to chemical communications among these introgressed regions appeared to contribute to the population-specific adaptations of the admixed Brown Norway rats. Our data reveal the evolutionary history of the Brown Norway rat, and offer new insights into the role of climatic changes in speciation of animals and the effect of interspecies introgression on animal adaptation. PMID:28482038

  12. Genetic Diversity and Linkage Disequilibrium in Chinese Bread Wheat (Triticum aestivum L.) Revealed by SSR Markers

    PubMed Central

    Hao, Chenyang; Wang, Lanfen; Ge, Hongmei; Dong, Yuchen; Zhang, Xueyong

    2011-01-01

    Two hundred and fifty bread wheat lines, mainly Chinese mini core accessions, were assayed for polymorphism and linkage disequilibrium (LD) based on 512 whole-genome microsatellite loci representing a mean marker density of 5.1 cM. A total of 6,724 alleles ranging from 1 to 49 per locus were identified in all collections. The mean PIC value was 0.650, ranging from 0 to 0.965. Population structure and principal coordinate analysis revealed that landraces and modern varieties were two relatively independent genetic sub-groups. Landraces had a higher allelic diversity than modern varieties with respect to both genomes and chromosomes in terms of total number of alleles and allelic richness. 3,833 (57.0%) and 2,788 (41.5%) rare alleles with frequencies of <5% were found in the landrace and modern variety gene pools, respectively, indicating greater numbers of rare variants, or likely new alleles, in landraces. Analysis of molecular variance (AMOVA) showed that the A genome had the largest genetic differentiation and the D genome the lowest. In contrast to genetic diversity, modern varieties displayed a wider average LD decay across the whole genome for locus pairs with r² > 0.05 (P < 0.001) than the landraces. Mean LD decay distance for the landraces at the whole genome level was <5 cM, while modern varieties showed a higher LD decay distance of 5–10 cM. LD decay distances were also somewhat different for each of the 21 chromosomes, being higher for most of the chromosomes in modern varieties (<5 to 25 cM) compared to landraces (<5 to 15 cM), presumably indicating the influences of domestication and breeding. This study facilitates predicting the marker density required to effectively associate genotypes with traits in Chinese wheat genetic resources. PMID:21365016
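
    The PIC values quoted above (mean 0.650, range 0 to 0.965) are per-locus summaries of allele frequencies. The abstract does not give the formula; the sketch below assumes the widely used Botstein et al. (1980) definition, PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 p_i^2 p_j^2. The function name and example frequencies are hypothetical and are not data from the study.

```python
from itertools import combinations


def pic(freqs, tol=1e-6):
    """Polymorphism information content of one locus.

    freqs: allele frequencies at the locus (must sum to 1).
    Assumes the Botstein et al. (1980) definition.
    """
    if abs(sum(freqs) - 1.0) > tol:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical
    homozygosity = sum(p * p for p in freqs)
    # Correction term for uninformative matings between identical heterozygotes
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    print(pic([1.0]))        # monomorphic locus
    print(pic([0.5, 0.5]))   # two equally frequent alleles
```

    A monomorphic locus gives a PIC of 0, which matches the lower bound reported in the abstract, and values approach 1 only for loci with many alleles at similar frequencies.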

  13. Genome-Wide Analysis of the World's Sheep Breeds Reveals High Levels of Historic Mixture and Strong Recent Selection

    PubMed Central

    Kijas, James W.; Lenstra, Johannes A.; Hayes, Ben; Boitard, Simon; Porto Neto, Laercio R.; San Cristobal, Magali; Servin, Bertrand; McCulloch, Russell; Whan, Vicki; Gietzen, Kimberly; Paiva, Samuel; Barendse, William; Ciani, Elena; Raadsma, Herman; McEwan, John; Dalrymple, Brian

    2012-01-01

    Through their domestication and subsequent selection, sheep have been adapted to thrive in a diverse range of environments. To characterise the genetic consequence of both domestication and selection, we genotyped 49,034 SNP in 2,819 animals from a diverse collection of 74 sheep breeds. We find the majority of sheep populations contain high SNP diversity and have retained an effective population size much higher than most cattle or dog breeds, suggesting domestication occurred from a broad genetic base. Extensive haplotype sharing and generally low divergence time between breeds reveal frequent genetic exchange has occurred during the development of modern breeds. A scan of the genome for selection signals revealed 31 regions containing genes for coat pigmentation, skeletal morphology, body size, growth, and reproduction. We demonstrate the strongest selection signal has occurred in response to breeding for the absence of horns. The high density map of genetic variability provides an in-depth view of the genetic history for this important livestock species. PMID:22346734

  14. Genomic diversity and macroecology of the crop wild relatives of domesticated pea.

    PubMed

    Smýkal, Petr; Hradilová, Iveta; Trněný, Oldřich; Brus, Jan; Rathore, Abhishek; Bariotakis, Michael; Das, Roma Rani; Bhattacharyya, Debjyoti; Richards, Christopher; Coyne, Clarice J; Pirintsos, Stergios

    2017-12-12

    There is growing interest in the conservation and utilization of crop wild relatives (CWR) in international food security policy and research. Legumes play an important role in human health, sustainable food production, global food security, and the resilience of current agricultural systems. Pea belongs to the ancient set of cultivated plants of the Near East domestication center and remains an important crop today. Based on genome-wide analysis, P. fulvum was identified as a well-supported species, while the diversity of wild P. sativum subsp. elatius was structured into 5 partly geographically positioned clusters. We explored the spatial and environmental patterns of two progenitor species of domesticated pea in the Mediterranean Basin and in the Fertile Crescent in relation to the past and current climate. This study revealed that isolation by distance does not explain the genetic structure of P. sativum subsp. elatius in its westward expansion from its center of origin. The genetic diversity of wild pea may be driven by Miocene-Pliocene events, while the phylogenetic diversity centers may reflect Pleisto-Holocene climatic changes. These findings help set research and discussion priorities and provide geographical and ecological information for germplasm-collecting missions, as well as for the preservation of extant diversity in ex-situ collections.

  15. Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates.

    PubMed

    Nakatani, Yoichiro; Takeda, Hiroyuki; Kohara, Yuji; Morishita, Shinichi

    2007-09-01

    Although several vertebrate genomes have been sequenced, little is known about the genome evolution of early vertebrates and how large-scale genomic changes such as the two rounds of whole-genome duplications (2R WGD) affected evolutionary complexity and novelty in vertebrates. Reconstructing the ancestral vertebrate genome is highly nontrivial because of the difficulty in identifying traces originating from the 2R WGD. To resolve this problem, we developed a novel method capable of pinning down remains of the 2R WGD in the human and medaka fish genomes using invertebrate tunicate and sea urchin genes to define ohnologs, i.e., paralogs produced by the 2R WGD. We validated the reconstruction using the chicken genome, which was not considered in the reconstruction step, and observed that many ancestral proto-chromosomes were retained in the chicken genome and had one-to-one correspondence to chicken microchromosomes, thereby confirming the reconstructed ancestral genomes. Our reconstruction revealed a contrast between the slow karyotype evolution after the second WGD and the rapid, lineage-specific genome reorganizations that occurred in the ancestral lineages of major taxonomic groups such as teleost fishes, amphibians, reptiles, and marsupials.

  16. Impacts of Genome-Wide Analyses on Our Understanding of Human Herpesvirus Diversity and Evolution.

    PubMed

    Renner, Daniel W; Szpara, Moriah L

    2018-01-01

    Until fairly recently, genome-wide evolutionary dynamics and within-host diversity were more commonly examined in the context of small viruses than in the context of large double-stranded DNA viruses such as herpesviruses. The high mutation rates and more compact genomes of RNA viruses have inspired the investigation of population dynamics for these species, and recent data now suggest that herpesviruses might also be considered candidates for population modeling. High-throughput sequencing (HTS) and bioinformatics have expanded our understanding of herpesviruses through genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures. Here we discuss recent data on the mechanisms that generate herpesvirus genomic diversity and underlie the evolution of these virus families. We focus on human herpesviruses, with key insights drawn from veterinary herpesviruses and other large DNA virus families. We consider the impacts of cell culture on herpesvirus genomes and how to accurately describe the viral populations under study. The need for a strong foundation of high-quality genomes is also discussed, since it underlies all secondary genomic analyses such as RNA sequencing (RNA-Seq), chromatin immunoprecipitation, and ribosome profiling. Areas where we foresee future progress, such as the linking of viral genetic differences to phenotypic or clinical outcomes, are highlighted as well. Copyright © 2017 Renner and Szpara.

  17. N-Terminal Protease Gene Phylogeny Reveals the Potential for Novel Cyanobactin Diversity in Cyanobacteria

    PubMed Central

    Martins, Joana; Leão, Pedro N.; Ramos, Vitor; Vasconcelos, Vitor

    2013-01-01

    Cyanobactins are a recently recognized group of ribosomal cyclic peptides produced by cyanobacteria, which have been studied because of their interesting biological activities. Here, we have used a PCR-based approach to detect the N-terminal protease (A) gene from cyanobactin synthetase gene clusters, in a set of diverse cyanobacteria from our culture collection (Laboratory of Ecotoxicology, Genomics and Evolution (LEGE) CC). Homologues of this gene were found in Microcystis and Rivularia strains, and for the first time in Cuspidothrix, Phormidium and Sphaerospermopsis strains. Phylogenetic relationships inferred from available A-gene sequences, including those obtained in this work, revealed two new groups of phylotypes, harboring Phormidium, Sphaerospermopsis and Rivularia LEGE isolates. Thus, this study shows that, using underexplored cyanobacterial strains, it is still possible to expand the known genetic diversity of genes involved in cyanobactin biosynthesis. PMID:24351973

  18. Genome-wide diversity and differentiation in New World populations of the human malaria parasite Plasmodium vivax

    PubMed Central

    de Oliveira, Thais C.; Rodrigues, Priscila T.; Menezes, Maria José; Gonçalves-Lopes, Raquel M.; Bastos, Melissa S.; Lima, Nathália F.; Barbosa, Susana; Gerber, Alexandra L.; Loss de Morais, Guilherme; Berná, Luisa; Phelan, Jody; Robello, Carlos; de Vasconcelos, Ana Tereza R.

    2017-01-01

    Background The Americas were the last continent colonized by humans carrying malaria parasites. Plasmodium falciparum from the New World shows very little genetic diversity and greater linkage disequilibrium, compared with its African counterparts, and is clearly subdivided into local, highly divergent populations. However, limited available data have revealed extensive genetic diversity in American populations of another major human malaria parasite, P. vivax. Methods We used an improved sample preparation strategy and next-generation sequencing to characterize 9 high-quality P. vivax genome sequences from northwestern Brazil. These new data were compared with publicly available sequences from recently sampled clinical P. vivax isolates from Brazil (BRA, total n = 11 sequences), Peru (PER, n = 23), Colombia (COL, n = 31), and Mexico (MEX, n = 19). Principal findings/Conclusions We found that New World populations of P. vivax are as diverse (nucleotide diversity π between 5.2 × 10⁻⁴ and 6.2 × 10⁻⁴) as P. vivax populations from Southeast Asia, where malaria transmission is substantially more intense. They display several non-synonymous nucleotide substitutions (some of them previously undescribed) in genes known or suspected to be involved in antimalarial drug resistance, such as dhfr, dhps, mdr1, mrp1, and mrp-2, but not in the chloroquine resistance transporter ortholog (crt-o) gene. Moreover, P. vivax in the Americas is much less geographically substructured than local P. falciparum populations, with relatively little between-population genome-wide differentiation (pairwise FST values ranging between 0.025 and 0.092). Finally, P. vivax populations show a rapid decline in linkage disequilibrium with increasing distance between pairs of polymorphic sites, consistent with very frequent outcrossing. We hypothesize that the high diversity of present-day P. vivax lineages in the Americas originated from successive migratory waves and subsequent admixture between

  19. Comparative genome analysis of PHB gene family reveals deep evolutionary origins and diverse gene function.

    PubMed

    Di, Chao; Xu, Wenying; Su, Zhen; Yuan, Joshua S

    2010-10-07

    The PHB (Prohibitin) gene family is involved in a variety of functions important for different biological processes. PHB genes are ubiquitously present in divergent species from prokaryotes to eukaryotes. Human PHB genes have been found to be associated with various diseases. Recent studies by our group and others have shown diverse functions of PHB genes in plants, including development, senescence, and defence. Despite the importance of the PHB gene family, no comprehensive gene family analysis has been carried out to evaluate the relatedness of PHB genes across different species. In order to better guide the gene function analysis and understand the evolution of the PHB gene family, we therefore carried out a comparative genome analysis of the PHB genes across different kingdoms. The relatedness, motif distribution, and intron/exon distribution all indicated that the PHB genes form a relatively conserved gene family. The PHB genes can be classified into 5 classes, and each class has a very deep evolutionary origin. The PHB genes within each class maintained the same motif patterns during evolution. With Arabidopsis as the model species, we found that PHB gene intron/exon structure and domains are also conserved during evolution. Despite being a conserved gene family, various gene duplication events led to the expansion of the PHB genes. Both segmental and tandem gene duplication were involved in Arabidopsis PHB gene family expansion. However, segmental duplication is predominant in Arabidopsis. Moreover, most of the duplicated genes experienced neofunctionalization. The results highlighted that PHB genes might be involved in important functions, so that the duplicated genes are under evolutionary pressure to derive new functions. The PHB gene family is conserved and accounts for diverse but important biological functions based on similar molecular mechanisms. The highly diverse biological functions indicate that more research needs to be carried out

  20. Expression Quantitative Trait Locus Mapping across Water Availability Environments Reveals Contrasting Associations with Genomic Features in Arabidopsis

    PubMed Central

    Lowry, David B.; Logan, Tierney L.; Santuari, Luca; Hardtke, Christian S.; Richards, James H.; DeRose-Wilson, Leah J.; McKay, John K.; Sen, Saunak; Juenger, Thomas E.

    2013-01-01

    The regulation of gene expression is crucial for an organism’s development and response to stress, and an understanding of the evolution of gene expression is of fundamental importance to basic and applied biology. To improve this understanding, we conducted expression quantitative trait locus (eQTL) mapping in the Tsu-1 (Tsushima, Japan) × Kas-1 (Kashmir, India) recombinant inbred line population of Arabidopsis thaliana across soil drying treatments. We then used genome resequencing data to evaluate whether genomic features (promoter polymorphism, recombination rate, gene length, and gene density) are associated with genes responding to the environment (E) or with genes with genetic variation (G) in gene expression in the form of eQTLs. We identified thousands of genes that responded to soil drying and hundreds of main-effect eQTLs. However, we identified very few statistically significant eQTLs that interacted with the soil drying treatment (GxE eQTL). Analysis of genome resequencing data revealed associations of several genomic features with G and E genes. In general, E genes had lower promoter diversity and local recombination rates. By contrast, genes with eQTLs (G) had significantly greater promoter diversity and were located in genomic regions with higher recombination. These results suggest that genomic architecture may play an important role in the evolution of gene expression. PMID:24045022

  1. Whole genome sequencing of diverse Shiga toxin-producing and non-producing Escherichia coli strains reveals a variety of virulence and novel antibiotic resistance plasmids

    USDA-ARS's Scientific Manuscript database

    The genomes of a diverse set of Shiga toxin-producing E. coli strains and the presence of 38 plasmids among all the isolates were determined. Among the novel plasmids found, there were eight that encoded resistance genes to antibiotics, including aminoglycosides, carbapenems, penicillins, cephalosp...

  2. Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

    PubMed Central

    Jackson, Christopher J; Norman, John E; Schnare, Murray N; Gray, Michael W; Keeling, Patrick J; Waller, Ross F

    2007-01-01

    Background Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. Most notable are the expansion of gene copy numbers and their arrangements within the genome, RNA

  3. Genome-Wide Analyses Reveal Genes Subject to Positive Selection in Pasteurella multocida

    PubMed Central

    Cao, Peili; Guo, Dongchun; Liu, Jiasen; Jiang, Qian; Xu, Zhuofei; Qu, Liandong

    2017-01-01

    Pasteurella multocida, a Gram-negative opportunistic pathogen, causes a broad range of diseases in mammals and birds, including fowl cholera in poultry, pneumonia and atrophic rhinitis in swine and rabbits, hemorrhagic septicemia in cattle, and bite infections in humans. In order to better interpret the genetic diversity and adaptive evolution of this pathogen, seven genomes of P. multocida strains isolated from fowl, rabbits and pigs were determined using a high-throughput sequencing approach. Together with publicly available P. multocida genomes, evolutionary features were systematically analyzed in this study. Clustering of 70,565 protein-coding genes showed that the pangenome of 33 P. multocida strains was composed of 1,602 core genes, 1,364 dispensable genes, and 1,070 strain-specific genes. Of these, we identified a full spectrum of genes related to virulence factors and revealed genetic diversity of these potential virulence markers across P. multocida strains, e.g., bcbAB, fcbC, lipA, bexDCA, ctrCD, lgtA, lgtC, lic2A involved in biogenesis of surface polysaccharides, hsf encoding autotransporter adhesin, and fhaB encoding filamentous haemagglutinin. Furthermore, based on genome-wide positive selection scanning, a total of 35 genes were subject to strong selection pressure. Extensive analyses of protein subcellular location indicated that membrane-associated genes were highly abundant among all positively selected genes. The detected amino acid sites undergoing adaptive selection were preferably located in extracellular space, perhaps associated with bacterial evasion of host immune responses. Our findings shed more light on conservation and distribution of virulence-associated genes across P. multocida strains. Meanwhile, this study provides a genetic context for future research on the mechanism of adaptive evolution in P. multocida. PMID:28611758
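
    The split of a pangenome into core, dispensable, and strain-specific genes described above can be illustrated with a minimal sketch. It assumes that gene clustering has already been done and that each strain is represented by a set of gene-cluster identifiers; the function name, the input layout, and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def partition_pangenome(strain_clusters):
    """Split gene clusters into core, dispensable, and strain-specific sets.

    strain_clusters maps a strain name to the set of gene-cluster IDs it
    carries (hypothetical input layout, assumed for illustration only).
    """
    n_strains = len(strain_clusters)
    # Count in how many strains each cluster occurs (once per strain).
    occurrence = Counter(
        cluster
        for clusters in strain_clusters.values()
        for cluster in set(clusters)
    )
    # Core: present in every strain.
    core = {c for c, k in occurrence.items() if k == n_strains}
    # Strain-specific: present in exactly one strain (needs >1 strain).
    specific = {c for c, k in occurrence.items() if k == 1 and n_strains > 1}
    # Dispensable: everything in between.
    dispensable = set(occurrence) - core - specific
    return core, dispensable, specific


# Toy data, invented for illustration.
toy = {
    "strain1": {"a", "b", "c"},
    "strain2": {"a", "b", "d"},
    "strain3": {"a", "e"},
}
core, dispensable, specific = partition_pangenome(toy)
print(len(core), len(dispensable), len(specific))
```

    Under this definition the three categories are disjoint and together cover the pangenome, so the counts reported in the abstract (1,602 + 1,364 + 1,070) add up to 4,036 gene clusters.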

  4. Phylogeny of Banana Streak Virus reveals recent and repetitive endogenization in the genome of its banana host (Musa sp.).

    PubMed

    Gayral, Philippe; Iskra-Caruana, Marie-Line

    2009-07-01

    Banana streak virus (BSV) is a plant dsDNA pararetrovirus (family Caulimoviridae, genus badnavirus). Although integration is not an essential step in the BSV replication cycle, the nuclear genome of banana (Musa sp.) contains BSV endogenous pararetrovirus sequences (BSV EPRVs). Some BSV EPRVs are infectious by reconstituting a functional viral genome. Recent studies revealed a large molecular diversity of episomal BSV viruses (i.e., nonintegrated) while others focused on BSV EPRV sequences only. In this study, the evolutionary history of badnavirus integration in banana was inferred from phylogenetic relationships between BSV and BSV EPRVs. The relative evolution rates and selective pressures (d(N)/d(S) ratio) were also compared between endogenous and episomal viral sequences. At least 27 recent independent integration events occurred after the divergence of three banana species, indicating that viral integration is a recent and frequent phenomenon. Relaxation of selective pressure on badnaviral sequences that experienced neutral evolution after integration in the plant genome was recorded. Additionally, a significant decrease (35%) in the EPRV evolution rate was observed compared to BSV, reflecting the difference in the evolution rate between episomal dsDNA viruses and plant genome. The comparison of our results with the evolution rate of the Musa genome and other reverse-transcribing viruses suggests that EPRVs play an active role in episomal BSV diversity and evolution.

  5. Correction: Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi

    PubMed Central

    2014-01-01

    Abstract The version of this article published in BMC Genomics 2013, 14: 274, contains 9 unpublished genomes (Botryobasidium botryosum, Gymnopus luxurians, Hypholoma sublateritium, Jaapia argillacea, Hebeloma cylindrosporum, Conidiobolus coronatus, Laccaria amethystina, Paxillus involutus, and P. rubicundulus) downloaded from JGI website. In this correction, we removed these genomes after discussion with editors and data producers whom we should have contacted before downloading these genomes. Removing these data did not alter the principal results and conclusions of our original work. The relevant Figures 1, 2, 3, 4 and 6; and Table 1 have been revised. Additional files 1, 3, 4, and 5 were also revised. We would like to apologize for any confusion or inconvenience this may have caused. Background Fungi produce a variety of carbohydrate-active enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes or infection mechanisms may provide information for better understanding of their life styles and infection models. To date, hundreds of fungal genomes are publicly available. However, a systematic comparative analysis of fungal CAZymes across the entire fungal kingdom has not been reported. Results In this study, we systematically identified glycoside hydrolases (GHs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), and glycosyltransferases (GTs) as well as carbohydrate-binding modules (CBMs) in the predicted proteomes of 94 representative fungi from Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota. Comparative analysis of these CAZymes that play major roles in plant polysaccharide degradation revealed that fungi exhibit tremendous diversity in the number and variety of CAZymes. Among them, some families of GHs and CEs are the most prevalent CAZymes that are distributed in all of the fungi analyzed

  6. Metatranscriptomic analysis of diverse microbial communities reveals core metabolic pathways and microbiome-specific functionality.

    PubMed

    Jiang, Yue; Xiong, Xuejian; Danska, Jayne; Parkinson, John

    2016-01-12

    Metatranscriptomics is emerging as a powerful technology for the functional characterization of complex microbial communities (microbiomes). Use of unbiased RNA-sequencing can reveal both the taxonomic composition and active biochemical functions of a complex microbial community. However, the lack of established reference genomes, computational tools and pipelines makes analysis and interpretation of these datasets challenging. Systematic studies that compare data across microbiomes are needed to demonstrate the ability of such pipelines to deliver biologically meaningful insights on microbiome function. Here, we apply a standardized analytical pipeline to perform a comparative analysis of metatranscriptomic data from diverse microbial communities derived from mouse large intestine, cow rumen, kimchi culture, deep-sea thermal vent and permafrost. Sequence similarity searches allowed annotation of 19 to 76% of putative messenger RNA (mRNA) reads, with the highest frequency in the kimchi dataset due to its relatively low complexity and availability of closely related reference genomes. Metatranscriptomic datasets exhibited distinct taxonomic and functional signatures. From a metabolic perspective, we identified a common core of enzymes involved in amino acid, energy and nucleotide metabolism and also identified microbiome-specific pathways such as phosphonate metabolism (deep sea) and glycan degradation pathways (cow rumen). Integrating taxonomic and functional annotations within a novel visualization framework revealed the contribution of different taxa to metabolic pathways, allowing the identification of taxa that contribute unique functions. The application of a single, standard pipeline confirms that the rich taxonomic and functional diversity observed across microbiomes is not simply an artefact of different analysis pipelines but instead reflects distinct environmental influences. At the same time, our findings show how microbiome complexity and availability of

  7. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus.

    PubMed

    Li, Fagen; Zhou, Changpin; Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

    2015-01-01

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa.

  8. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus

    PubMed Central

    Weng, Qijie; Li, Mei; Yu, Xiaoli; Guo, Yong; Wang, Yu; Zhang, Xiaohong; Gan, Siming

    2015-01-01

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10–56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa. PMID:26695430

  9. Comparative Genomics Reveals Accelerated Evolution in Conserved Pathways during the Diversification of Anole Lizards

    PubMed Central

    Tollis, Marc; Hutchins, Elizabeth D; Stapley, Jessica; Rupp, Shawn M; Eckalbar, Walter L; Maayan, Inbar; Lasku, Eris; Infante, Carlos R; Dennis, Stuart R; Robertson, Joel A; May, Catherine M; Bermingham, Eldredge; DeNardo, Dale F; Hsieh, Shi-Tong Tonia; Kulathinal, Rob J; McMillan, William Owen; Menke, Douglas B; Pratt, Stephen C; Rawls, Jeffery Alan; Sanjur, Oris; Wilson-Rawls, Jeanne; Wilson Sayres, Melissa A; Fisher, Rebecca E

    2018-01-01

    Abstract Squamates include all lizards and snakes, and display some of the most diverse and extreme morphological adaptations among vertebrates. However, compared with birds and mammals, relatively few resources exist for comparative genomic analyses of squamates, hampering efforts to understand the molecular bases of phenotypic diversification in such a speciose clade. In particular, the ∼400 species of anole lizard represent an extensive squamate radiation. Here, we sequence and assemble the draft genomes of three anole species—Anolis frenatus, Anolis auratus, and Anolis apletophallus—for comparison with the available reference genome of Anolis carolinensis. Comparative analyses reveal a rapid background rate of molecular evolution consistent with a model of punctuated equilibrium, and strong purifying selection on functional genomic elements in anoles. We find evidence for accelerated evolution in genes involved in behavior, sensory perception, and reproduction, as well as in genes regulating limb bud development and hindlimb specification. Morphometric analyses of anole fore and hindlimbs corroborated these findings. We detect signatures of positive selection across several genes related to the development and regulation of the forebrain, hormones, and the iguanian lizard dewlap, suggesting molecular changes underlying behavioral adaptations known to reinforce species boundaries were a key component in the diversification of anole lizards. PMID:29360978

  10. Diversity Arrays Technology (DArT) for whole-genome profiling of barley

    PubMed Central

    Wenzl, Peter; Carling, Jason; Kudrna, David; Jaccoud, Damian; Huttner, Eric; Kleinhofs, Andris; Kilian, Andrzej

    2004-01-01

    Diversity Arrays Technology (DArT) can detect and type DNA variation at several hundred genomic loci in parallel without relying on sequence information. Here we show that it can be effectively applied to genetic mapping and diversity analyses of barley, a species with a 5,000-Mbp genome. We tested several complexity reduction methods and selected two that generated the most polymorphic genomic representations. Arrays containing individual fragments from these representations generated DArT fingerprints with a genotype call rate of 98.0% and a scoring reproducibility of at least 99.8%. The fingerprints grouped barley lines according to known genetic relationships. To validate the Mendelian behavior of DArT markers, we constructed a genetic map for a cross between cultivars Steptoe and Morex. Nearly all polymorphic array features could be incorporated into one of seven linkage groups (98.8%). The resulting map comprised ≈385 unique DArT markers and spanned 1,137 centimorgans. A comparison with the restriction fragment length polymorphism-based framework map indicated that the quality of the DArT map was equivalent, if not superior, to that of the framework map. These results highlight the potential of DArT as a generic technique for genome profiling in the context of molecular breeding and genomics. PMID:15192146

  11. The family Rhabdoviridae: Mono- and bipartite negative-sense RNA viruses with diverse genome organization and common evolutionary origins

    USGS Publications Warehouse

    Dietzgen, Ralf G.; Kondo, Hideki; Goodin, Michael M.; Kurath, Gael; Vasilakis, Nikos

    2017-01-01

    The family Rhabdoviridae consists of mostly enveloped, bullet-shaped or bacilliform viruses with a negative-sense, single-stranded RNA genome that infect vertebrates, invertebrates or plants. This ecological diversity is reflected by the diversity and complexity of their genomes. Five canonical structural protein genes are conserved in all rhabdoviruses, but may be overprinted, overlapped or interspersed with several novel and diverse accessory genes. This review gives an overview of the characteristics and diversity of rhabdoviruses, their taxonomic classification, replication mechanism, properties of classical rhabdoviruses such as rabies virus and rhabdoviruses with complex genomes, rhabdoviruses infecting aquatic species, and plant rhabdoviruses with both mono- and bipartite genomes.

  12. The Capsaspora genome reveals a complex unicellular prehistory of animals.

    PubMed

    Suga, Hiroshi; Chen, Zehua; de Mendoza, Alex; Sebé-Pedrós, Arnau; Brown, Matthew W; Kramer, Eric; Carr, Martin; Kerner, Pierre; Vervoort, Michel; Sánchez-Pons, Núria; Torruella, Guifré; Derelle, Romain; Manning, Gerard; Lang, B Franz; Russ, Carsten; Haas, Brian J; Roger, Andrew J; Nusbaum, Chad; Ruiz-Trillo, Iñaki

    2013-01-01

    To reconstruct the evolutionary origin of multicellular animals from their unicellular ancestors, the genome sequences of diverse unicellular relatives are essential. However, only the genome of the choanoflagellate Monosiga brevicollis has been reported to date. Here we completely sequence the genome of the filasterean Capsaspora owczarzaki, the closest known unicellular relative of metazoans besides choanoflagellates. Analyses of this genome alter our understanding of the molecular complexity of metazoans' unicellular ancestors showing that they had a richer repertoire of proteins involved in cell adhesion and transcriptional regulation than previously inferred only with the choanoflagellate genome. Some of these proteins were secondarily lost in choanoflagellates. In contrast, most intercellular signalling systems controlling development evolved later concomitant with the emergence of the first metazoans. We propose that the acquisition of these metazoan-specific developmental systems and the co-option of pre-existing genes drove the evolutionary transition from unicellular protists to metazoans.

  13. Comparative and genetic analysis of the four sequenced Paenibacillus polymyxa genomes reveals a diverse metabolism and conservation of genes relevant to plant-growth promotion and competitiveness.

    PubMed

    Eastman, Alexander W; Heinrichs, David E; Yuan, Ze-Chun

    2014-10-03

    Members of the genus Paenibacillus are important plant growth-promoting rhizobacteria that can serve as bio-reactors. Paenibacillus polymyxa promotes the growth of a variety of economically important crops. Our lab recently completed the genome sequence of Paenibacillus polymyxa CR1. As of January 2014, four P. polymyxa genomes have been completely sequenced but no comparative genomic analyses have been reported. Here we report the comparative and genetic analyses of four sequenced P. polymyxa genomes, which revealed a significantly conserved core genome. Complex metabolic pathways and regulatory networks were highly conserved and allow P. polymyxa to rapidly respond to dynamic environmental cues. Genes responsible for phytohormone synthesis, phosphate solubilization, iron acquisition, transcriptional regulation, σ-factors, stress responses, transporters and biomass degradation were well conserved, indicating an intimate association with plant hosts and the rhizosphere niche. In addition, genes responsible for antimicrobial resistance and non-ribosomal peptide/polyketide synthesis are present in both the core and accessory genome of each strain. Comparative analyses also reveal variations in the accessory genome, including large plasmids present in strains M1 and SC2. Furthermore, a considerable number of strain-specific genes and genomic islands are irregularly distributed throughout each genome. Although a variety of plant-growth promoting traits are encoded by all strains, only P. polymyxa CR1 encodes the unique nitrogen fixation cluster found in other Paenibacillus sp. Our study revealed that genomic loci relevant to host interaction and ecological fitness are highly conserved within the P. polymyxa genomes analysed, despite variations in the accessory genome. This work suggests that plant-growth promotion by P. polymyxa is mediated largely through phytohormone production, increased nutrient availability and bio-control mechanisms. This study provides an in

  14. Genome-wide patterns of recombination, linkage disequilibrium and nucleotide diversity from pooled resequencing and single nucleotide polymorphism genotyping unlock the evolutionary history of Eucalyptus grandis.

    PubMed

    Silva-Junior, Orzenil B; Grattapaglia, Dario

    2015-11-01

    We used high-density single nucleotide polymorphism (SNP) data and whole-genome pooled resequencing to examine the landscape of population recombination (ρ) and nucleotide diversity (θw), assess the extent of linkage disequilibrium (r²) and build the highest density linkage maps for Eucalyptus. At the genome-wide level, linkage disequilibrium (LD) decayed within c. 4-6 kb, slower than previously reported from candidate gene studies, but showing considerable variation from absence to complete LD up to 50 kb. A sharp decrease in the estimate of ρ was seen when going from short to genome-wide inter-SNP distances, highlighting the dependence of this parameter on the scale of observation adopted. Recombination was correlated with nucleotide diversity, gene density and distance from the centromere, with hotspots of recombination enriched for genes involved in chemical reactions and pathways of the normal metabolic processes. The high nucleotide diversity (θw = 0.022) of E. grandis revealed that mutation is more important than recombination in shaping its genomic diversity (ρ/θw = 0.645). Chromosome-wide ancestral recombination graphs allowed us to date the split of E. grandis (1.7-4.8 million yr ago) and identify a scenario for the recent demographic history of the species. Our results have considerable practical importance to Genome Wide Association Studies (GWAS), while indicating bright prospects for genomic prediction of complex phenotypes in eucalypt breeding. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
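
    The Watterson estimator of nucleotide diversity referred to above can be illustrated with a minimal sketch. It assumes individually sequenced, aligned haplotypes and the classical per-site formula, the number of segregating sites S divided by (a_n × L), where a_n is the sum of 1/i for i = 1 to n − 1 and L is the alignment length. The estimators adapted to pooled resequencing in the study differ in detail and are not reproduced here; the function and parameter names are hypothetical.

```python
def watterson_theta(num_segregating_sites, num_sequences, sequence_length):
    """Classical per-site Watterson estimator: S / (a_n * L)."""
    if num_sequences < 2:
        raise ValueError("at least two sequences are required")
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    # Harmonic-number correction for sample size: a_n = sum_{i=1}^{n-1} 1/i.
    a_n = sum(1.0 / i for i in range(1, num_sequences))
    return num_segregating_sites / (a_n * sequence_length)


# Toy numbers, invented for illustration: 11 segregating sites among
# 4 sequences of length 100.
theta_toy = watterson_theta(11, 4, 100)

# Using only the two values reported in the abstract, the implied
# population recombination rate is rho = (rho / theta_w) * theta_w.
rho_implied = 0.645 * 0.022
print(theta_toy, rho_implied)
```

    With the reported values, the implied ρ is about 0.014 per site, smaller than θw, which is the comparison behind the statement that mutation outweighs recombination.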

  15. Evolution and Diversity of the Human Hepatitis D Virus Genome

    PubMed Central

    Huang, Chi-Ruei; Lo, Szecheng J.

    2010-01-01

    Human hepatitis delta virus (HDV) is the RNA virus with the smallest genome. The HDV genome is divided into a viroid-like sequence and a protein-coding sequence, which could have originated from different sources; the genome was eventually constituted through RNA recombination. The genome subsequently diversified through accumulation of mutations selected by interactions between the mutated RNA and proteins with host factors to successfully form the infectious virions. Therefore, we propose that the conservation of the HDV nucleotide sequence is highly related to its functionality. Genome analysis of known HDV isolates shows that the C-terminal coding sequences of the large delta antigen (LDAg) are more diverse than other regions of the protein-coding sequence, yet variants that retain the biological functionality to interact with the heavy chain of clathrin can be selected and maintained. Since viruses interact with many host factors, including those involved in escaping the host immune response, designing a program to predict RNA genome evolution remains very challenging. PMID:20204073

  16. Extended diversity analysis of cultivated grapevine Vitis vinifera with 10K genome-wide SNPs.

    PubMed

    Laucou, Valérie; Launay, Amandine; Bacilieri, Roberto; Lacombe, Thierry; Adam-Blondon, Anne-Françoise; Bérard, Aurélie; Chauveau, Aurélie; de Andrés, Maria Teresa; Hausmann, Ludger; Ibáñez, Javier; Le Paslier, Marie-Christine; Maghradze, David; Martinez-Zapater, José Miguel; Maul, Erika; Ponnaiah, Maharajah; Töpfer, Reinhard; Péros, Jean-Pierre; Boursiquot, Jean-Michel

    2018-01-01

    Grapevine is a very important crop species that is mainly cultivated worldwide for fruits, wine and juice. Identification of the genetic bases of performance traits through association mapping studies requires a precise knowledge of the available diversity and how this diversity is structured and varies across the whole genome. An 18k SNP genotyping array was evaluated on a panel of Vitis vinifera cultivars and we obtained a data set with no missing values for a total of 10207 SNPs and 783 different genotypes. The average inter-SNP spacing was ~47 kbp, the mean minor allele frequency (MAF) was 0.23 and the genetic diversity in the sample was high (He = 0.32). Fourteen SNPs, chosen from those with the highest MAF values, were sufficient to identify each genotype in the sample. Parentage analysis revealed 118 full parentages and 490 parent-offspring duos, thus confirming the close pedigree relationships within the cultivated grapevine. Structure analyses also confirmed the main divisions due to an eastern-western gradient and human usage (table vs. wine). Using a multivariate approach, we refined the structure and identified a total of eight clusters. Both the genetic diversity (He, 0.26-0.32) and linkage disequilibrium (LD, 28.8-58.2 kbp) varied between clusters. Despite the short span LD, we also identified some non-recombining haplotype blocks that may complicate association mapping. Finally, we performed a genome-wide association study that confirmed previous works and also identified new regions for important performance traits such as acidity. Taken together, all the results contribute to a better knowledge of the genetics of the cultivated grapevine.
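The MAF and He figures in this abstract follow from simple allele-frequency arithmetic. The sketch below assumes biallelic SNPs coded as alternate-allele counts (0, 1 or 2) per diploid genotype with no missing values; the function names are hypothetical and this is not the authors' pipeline.

```python
def minor_allele_frequency(genotypes):
    """MAF for one biallelic SNP from 0/1/2 alternate-allele counts."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1.0 - alt_freq)


def expected_heterozygosity(maf):
    """He for a biallelic locus: 1 - p^2 - q^2 = 2pq."""
    return 2.0 * maf * (1.0 - maf)


def mean_expected_heterozygosity(mafs):
    """Average He over SNPs (averaging per-SNP He, not He of the mean MAF)."""
    return sum(expected_heterozygosity(m) for m in mafs) / len(mafs)


# He evaluated at the reported mean MAF of 0.23.
he_at_mean_maf = expected_heterozygosity(0.23)
```

He evaluated at the mean MAF (about 0.35) is an upper bound on the mean of per-SNP He, because 2p(1-p) is concave in p. That is consistent with the lower reported value of He = 0.32.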

  17. Extended diversity analysis of cultivated grapevine Vitis vinifera with 10K genome-wide SNPs

    PubMed Central

    Launay, Amandine; Bacilieri, Roberto; Lacombe, Thierry; Adam-Blondon, Anne-Françoise; Bérard, Aurélie; Chauveau, Aurélie; de Andrés, Maria Teresa; Maghradze, David; Maul, Erika; Ponnaiah, Maharajah; Töpfer, Reinhard; Péros, Jean-Pierre; Boursiquot, Jean-Michel

    2018-01-01

    Grapevine is a very important crop species that is mainly cultivated worldwide for fruits, wine and juice. Identification of the genetic bases of performance traits through association mapping studies requires a precise knowledge of the available diversity and how this diversity is structured and varies across the whole genome. An 18k SNP genotyping array was evaluated on a panel of Vitis vinifera cultivars and we obtained a data set with no missing values for a total of 10207 SNPs and 783 different genotypes. The average inter-SNP spacing was ~47 kbp, the mean minor allele frequency (MAF) was 0.23 and the genetic diversity in the sample was high (He = 0.32). Fourteen SNPs, chosen from those with the highest MAF values, were sufficient to identify each genotype in the sample. Parentage analysis revealed 118 full parentages and 490 parent-offspring duos, thus confirming the close pedigree relationships within the cultivated grapevine. Structure analyses also confirmed the main divisions due to an eastern-western gradient and human usage (table vs. wine). Using a multivariate approach, we refined the structure and identified a total of eight clusters. Both the genetic diversity (He, 0.26–0.32) and linkage disequilibrium (LD, 28.8–58.2 kbp) varied between clusters. Despite the short span LD, we also identified some non-recombining haplotype blocks that may complicate association mapping. Finally, we performed a genome-wide association study that confirmed previous works and also identified new regions for important performance traits such as acidity. Taken together, all the results contribute to a better knowledge of the genetics of the cultivated grapevine. PMID:29420602

  18. The genomic signature of sexual selection in the genetic diversity of the sex chromosomes and autosomes.

    PubMed

    Corl, Ammon; Ellegren, Hans

    2012-07-01

    Genomic levels of variation can help reveal the selective and demographic forces that have affected a species during its history. The relative amount of genetic diversity observed on the sex chromosomes as compared to the autosomes is predicted to differ among monogamous and polygynous species. Many species show departures from the expectation for monogamy, but it can be difficult to conclude that this pattern results from variation in mating system because forces other than sexual selection can act upon sex chromosome genetic diversity. As a critical test of the role of mating system, we compared levels of genetic diversity on the Z chromosome and autosomes of phylogenetically independent pairs of shorebirds that differed in their mating systems. We found general support for sexual selection shaping sex chromosome diversity because most polygynous species showed relatively reduced genetic variation on their Z chromosomes as compared to monogamous species. Differences in levels of genetic diversity between the sex chromosomes and autosomes may therefore contribute to understanding the long-term history of sexual selection experienced by a species. © 2012 The Author(s).
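The prediction tested here can be made concrete with the standard neutral expectations for effective population size in a ZW system. The sketch below is not the authors' analysis: `nm` and `nf` stand for the effective numbers of breeding males and females, and the function names are illustrative.

```python
def ne_autosome(nm, nf):
    """Effective size for autosomes with nm breeding males and nf breeding females."""
    return 4.0 * nm * nf / (nm + nf)


def ne_z(nm, nf):
    """Effective size for the Z chromosome (males ZZ, females ZW)."""
    return 9.0 * nm * nf / (4.0 * nf + 2.0 * nm)


def z_to_autosome_ratio(nm, nf):
    """Expected ratio of neutral diversity on the Z relative to autosomes."""
    return ne_z(nm, nf) / ne_autosome(nm, nf)


monogamy = z_to_autosome_ratio(50, 50)   # equal numbers of breeding males and females
polygyny = z_to_autosome_ratio(10, 90)   # few males father most offspring
```

With equal numbers of breeding males and females the ratio is 3/4. It falls towards 9/16 as fewer males contribute, which is the direction of the reduction in Z diversity reported for polygynous species.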

  19. Exploration of the Genomic Diversity and Core Genome of the Bifidobacterium adolescentis Phylogenetic Group by Means of a Polyphasic Approach

    PubMed Central

    Duranti, Sabrina; Turroni, Francesca; Milani, Christian; Foroni, Elena; Bottacini, Francesca; Dal Bello, Fabio; Ferrarini, Alberto; Delledonne, Massimo; van Sinderen, Douwe

    2013-01-01

    In the current work, we describe genome diversity and core genome sequences among representatives of three bifidobacterial species, i.e., Bifidobacterium adolescentis, Bifidobacterium catenulatum, and Bifidobacterium pseudocatenulatum, by employing a polyphasic approach involving analysis of 16S rRNA gene and 16S-23S internal transcribed spacer (ITS) sequences, pulsed-field gel electrophoresis (PFGE), and comparative genomic hybridization (CGH) assays. PMID:23064340

  20. Population Genomics Reveals Speciation and Introgression between Brown Norway Rats and Their Sibling Species.

    PubMed

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Cai, Wanshi; Lu, Liang; Zhao, Fangqing; Sun, Zhongsheng; Zhang, Jianxu

    2017-09-01

    Murine rodents are excellent models for study of adaptive radiations and speciation. Brown Norway rats (Rattus norvegicus) are successful global colonizers and the contributions of their domesticated laboratory strains to biomedical research are well established. To identify nucleotide-based speciation timing of the rat and genomic information contributing to its colonization capabilities, we analyzed 51 whole-genome sequences of wild-derived Brown Norway rats and their sibling species, R. nitidus, and identified over 20 million genetic variants in the wild Brown Norway rats that were absent in the laboratory strains, which substantially expand the reservoir of rat genetic diversity. We showed that divergence of the rat and its siblings coincided with drastic climatic changes that occurred during the Middle Pleistocene. Further, we revealed that there was a geographically widespread influx of genes between Brown Norway rats and the sibling species following the divergence, resulting in numerous introgressed regions in the genomes of admixed Brown Norway rats. Intriguingly, genes related to chemical communications among these introgressed regions appeared to contribute to the population-specific adaptations of the admixed Brown Norway rats. Our data reveal the evolutionary history of the Brown Norway rat, and offer new insights into the role of climatic changes in speciation of animals and the effect of interspecies introgression on animal adaptation. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  1. Hybridization Reveals the Evolving Genomic Architecture of Speciation

    PubMed Central

    Kronforst, Marcus R.; Hansen, Matthew E.B.; Crawford, Nicholas G.; Gallant, Jason R.; Zhang, Wei; Kulathinal, Rob J.; Kapan, Durrell D.; Mullen, Sean P.

    2014-01-01

    SUMMARY The rate at which genomes diverge during speciation is unknown, as are the physical dynamics of the process. Here, we compare full genome sequences of 32 butterflies, representing five species from a hybridizing Heliconius butterfly community, to examine genome-wide patterns of introgression and infer how divergence evolves during the speciation process. Our analyses reveal that initial divergence is restricted to a small fraction of the genome, largely clustered around known wing-patterning genes. Over time, divergence evolves rapidly, due primarily to the origin of new divergent regions. Furthermore, divergent genomic regions display signatures of both selection and adaptive introgression, demonstrating the link between microevolutionary processes acting within species and the origin of species across macroevolutionary timescales. Our results provide a uniquely comprehensive portrait of the evolving species boundary due to the role that hybridization plays in reducing the background accumulation of divergence at neutral sites. PMID:24183670

  2. Genome sequence analysis of five Canadian isolates of strawberry mottle virus reveals extensive intra-species diversity and a longer RNA2 with increased coding capacity compared to a previously characterized European isolate.

    PubMed

    Bhagwat, Basdeo; Dickison, Virginia; Ding, Xinlun; Walker, Melanie; Bernardy, Michael; Bouthillier, Michel; Creelman, Alexa; DeYoung, Robyn; Li, Yinzi; Nie, Xianzhou; Wang, Aiming; Xiang, Yu; Sanfaçon, Hélène

    2016-06-01

    In this study, we report the genome sequence of five isolates of strawberry mottle virus (family Secoviridae, order Picornavirales) from strawberry field samples with decline symptoms collected in Eastern Canada. The Canadian isolates differed from the previously characterized European isolate 1134 in that they had a longer RNA2, resulting in a 239-amino-acid extension of the C-terminal region of the polyprotein. Sequence analysis suggests that reassortment and recombination occurred among the isolates. Phylogenetic analysis revealed that the Canadian isolates are diverse, grouping in two separate branches along with isolates from Europe and the Americas.

  3. The family Rhabdoviridae: mono- and bipartite negative-sense RNA viruses with diverse genome organization and common evolutionary origins.

    PubMed

    Dietzgen, Ralf G; Kondo, Hideki; Goodin, Michael M; Kurath, Gael; Vasilakis, Nikos

    2017-01-02

    The family Rhabdoviridae consists of mostly enveloped, bullet-shaped or bacilliform viruses with a negative-sense, single-stranded RNA genome that infect vertebrates, invertebrates or plants. This ecological diversity is reflected by the diversity and complexity of their genomes. Five canonical structural protein genes are conserved in all rhabdoviruses, but may be overprinted, overlapped or interspersed with several novel and diverse accessory genes. This review gives an overview of the characteristics and diversity of rhabdoviruses, their taxonomic classification, replication mechanism, properties of classical rhabdoviruses such as rabies virus and rhabdoviruses with complex genomes, rhabdoviruses infecting aquatic species, and plant rhabdoviruses with both mono- and bipartite genomes. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Analysis of the Saccharomyces cerevisiae pan-genome reveals a pool of copy number variants distributed in diverse yeast strains from differing industrial environments.

    PubMed

    Dunn, Barbara; Richter, Chandra; Kvitek, Daniel J; Pugh, Tom; Sherlock, Gavin

    2012-05-01

    Although the budding yeast Saccharomyces cerevisiae is arguably one of the most well-studied organisms on earth, the genome-wide variation within this species (i.e., its "pan-genome") has been less explored. We created a multispecies microarray platform containing probes covering the genomes of several Saccharomyces species: S. cerevisiae, including regions not found in the standard laboratory S288c strain, as well as the mitochondrial and 2-μm circle genomes, plus S. paradoxus, S. mikatae, S. kudriavzevii, S. uvarum, S. kluyveri, and S. castellii. We performed array-Comparative Genomic Hybridization (aCGH) on 83 different S. cerevisiae strains collected across a wide range of habitats; of these, 69 were commercial wine strains, while the remaining 14 were from a diverse set of other industrial and natural environments. We observed interspecific hybridization events, introgression events, and pervasive copy number variation (CNV) in all but a few of the strains. These CNVs were distributed throughout the strains such that they did not produce any clear phylogeny, suggesting extensive mating in both industrial and wild strains. To validate our results and to determine whether apparently similar introgressions and CNVs were identical by descent or recurrent, we also performed whole-genome sequencing on nine of these strains. These data may help pinpoint genomic regions involved in adaptation to different industrial milieus, as well as shed light on the course of domestication of S. cerevisiae.

  5. Comparative Genomic Analysis Reveals Habitat-Specific Genes and Regulatory Hubs within the Genus Novosphingobium

    PubMed Central

    Kumar, Roshan; Verma, Helianthous; Haider, Shazia; Bajaj, Abhay; Sood, Utkarsh; Ponnusamy, Kalaiarasan; Nagar, Shekhar; Shakarad, Mallikarjun N.; Negi, Ram Krishan; Singh, Yogendra; Khurana, J. P.; Gilbert, Jack A.

    2017-01-01

    ABSTRACT Species belonging to the genus Novosphingobium are found in many different habitats and have been identified as metabolically versatile. Through comparative genomic analysis, we identified habitat-specific genes and regulatory hubs that could determine habitat selection for Novosphingobium spp. Genomes from 27 Novosphingobium strains isolated from diverse habitats such as rhizosphere soil, plant surfaces, heavily contaminated soils, and marine and freshwater environments were analyzed. Genome size and coding potential were widely variable, differing significantly between habitats. Phylogenetic relationships between strains were less likely to describe functional genotype similarity than the habitat from which they were isolated. In this study, strains (19 out of 27) with a recorded habitat of isolation, and at least 3 representative strains per habitat, comprised four ecological groups—rhizosphere, contaminated soil, marine, and freshwater. Sulfur acquisition and metabolism were the only core genomic traits to differ significantly in proportion between these ecological groups; for example, alkane sulfonate (ssuABCD) assimilation was found exclusively in all of the rhizospheric isolates. When we examined osmolytic regulation in Novosphingobium spp. through ectoine biosynthesis, which was assumed to be marine habitat specific, we found that it was also present in isolates from contaminated soil, suggesting its relevance beyond the marine system. Novosphingobium strains were also found to harbor a wide variety of mono- and dioxygenases, responsible for the metabolism of several aromatic compounds, suggesting their potential to act as degraders of a variety of xenobiotic compounds. Protein-protein interaction analysis revealed β-barrel outer membrane proteins as habitat-specific hubs in each of the four habitats—freshwater (Saro_1868), marine water (PP1Y_AT17644), rhizosphere (PMI02_00367), and soil (V474_17210). These outer membrane proteins could play a

  6. Demographic history, selection and functional diversity of the canine genome.

    PubMed

    Ostrander, Elaine A; Wayne, Robert K; Freedman, Adam H; Davis, Brian W

    2017-12-01

    The domestic dog represents one of the most dramatic long-term evolutionary experiments undertaken by humans. From a large wolf-like progenitor, unparalleled diversity in phenotype and behaviour has developed in dogs, providing a model for understanding the developmental and genomic mechanisms of diversification. We discuss pattern and process in domestication, beginning with general findings about early domestication and problems in documenting selection at the genomic level. Furthermore, we summarize genotype-phenotype studies based first on single nucleotide polymorphism (SNP) genotyping and then with whole-genome data and show how an understanding of evolution informs topics as different as human history, adaptive and deleterious variation, morphological development, ageing, cancer and behaviour.

  7. Ecology and genomics of Bacillus subtilis.

    PubMed

    Earl, Ashlee M; Losick, Richard; Kolter, Roberto

    2008-06-01

    Bacillus subtilis is a remarkably diverse bacterial species that is capable of growth within many environments. Recent microarray-based comparative genomic analyses have revealed that members of this species also exhibit considerable genomic diversity. The identification of strain-specific genes might explain how B. subtilis has become so broadly adapted. The goal of identifying ecologically adaptive genes could soon be realized with the imminent release of several new B. subtilis genome sequences. As we embark upon this exciting new era of B. subtilis comparative genomics we review what is currently known about the ecology and evolution of this species.

  8. Comparative Genomics Reveals the Core Gene Toolbox for the Fungus-Insect Symbiosis.

    PubMed

    Wang, Yan; Stata, Matt; Wang, Wei; Stajich, Jason E; White, Merlin M; Moncalvo, Jean-Marc

    2018-05-15

    Modern genomics has shed light on many entomopathogenic fungi and expanded our knowledge widely; however, little is known about the genomic features of the insect-commensal fungi. Harpellales are obligate commensals living in the digestive tracts of disease-bearing insects (black flies, midges, and mosquitoes). In this study, we produced and annotated whole-genome sequences of nine Harpellales taxa and conducted the first comparative analyses to infer the genomic diversity within the members of the Harpellales. The genomes of the insect gut fungi feature low (26% to 37%) GC content and large genome size variations (25 to 102 Mb). Further comparisons with insect-pathogenic fungi (from both Ascomycota and Zoopagomycota), as well as with free-living relatives (as negative controls), helped to identify a gene toolbox that is essential to the fungus-insect symbiosis. The results not only narrow the genomic scope of fungus-insect interactions from several thousands to eight core players but also distinguish host invasion strategies employed by insect pathogens and commensals. The genomic content suggests that insect commensal fungi rely mostly on adhesion protein anchors that target the digestive system, while entomopathogenic fungi have higher numbers of transmembrane helices, signal peptides, and pathogen-host interaction (PHI) genes across the whole genome and enrich genes as well as functional domains to inactivate the host inflammation system and suppress the host defense. Phylogenomic analyses have revealed that genome sizes of Harpellales fungi vary among lineages with an integer-multiple pattern, which implies that ancient genome duplications may have occurred within the gut of insects. IMPORTANCE Insect guts harbor various microbes that are important for host digestion, immune response, and disease dispersal in certain cases. Bacteria, which are among the primary endosymbionts, have been studied extensively. However, fungi, which are also frequently encountered

  9. Diversity and Strain Specificity of Plant Cell Wall Degrading Enzymes Revealed by the Draft Genome of Ruminococcus flavefaciens FD-1

    PubMed Central

    Berg Miller, Margret E.; Antonopoulos, Dionysios A.; Rincon, Marco T.; Band, Mark; Bari, Albert; Akraiko, Tatsiana; Hernandez, Alvaro; Thimmapuram, Jyothi; Henrissat, Bernard; Coutinho, Pedro M.; Borovok, Ilya; Jindou, Sadanari; Lamed, Raphael; Flint, Harry J.; Bayer, Edward A.; White, Bryan A.

    2009-01-01

    Background Ruminococcus flavefaciens is a predominant cellulolytic rumen bacterium, which forms a multi-enzyme cellulosome complex that could play an integral role in the ability of this bacterium to degrade plant cell wall polysaccharides. Identifying the major enzyme types involved in plant cell wall degradation is essential for gaining a better understanding of the cellulolytic capabilities of this organism as well as highlighting potential enzymes for application in improvement of livestock nutrition and for conversion of cellulosic biomass to liquid fuels. Methodology/Principal Findings The R. flavefaciens FD-1 genome was sequenced to 29x-coverage, based on pulsed-field gel electrophoresis estimates (4.4 Mb), and assembled into 119 contigs providing 4,576,399 bp of unique sequence. As much as 87.1% of the genome encodes ORFs, tRNA, rRNAs, or repeats. The GC content was calculated at 45%. A total of 4,339 ORFs was detected with an average gene length of 918 bp. The cellulosome model for R. flavefaciens was further refined by sequence analysis, with at least 225 dockerin-containing ORFs, including previously characterized cohesin-containing scaffoldin molecules. These dockerin-containing ORFs encode a variety of catalytic modules including glycoside hydrolases (GHs), polysaccharide lyases, and carbohydrate esterases. Additionally, 56 ORFs encode proteins that contain carbohydrate-binding modules (CBMs). Functional microarray analysis of the genome revealed that 56 of the cellulosome-associated ORFs were up-regulated, 14 were down-regulated, 135 were unaffected, when R. flavefaciens FD-1 was grown on cellulose versus cellobiose. Three multi-modular xylanases (ORF01222, ORF03896, and ORF01315) exhibited the highest levels of up-regulation. Conclusions/Significance The genomic evidence indicates that R. flavefaciens FD-1 has the largest known number of fiber-degrading enzymes likely to be arranged in a cellulosome architecture. 
    Functional analysis of the genome has

  10. Diversity and strain specificity of plant cell wall degrading enzymes revealed by the draft genome of Ruminococcus flavefaciens FD-1.

    PubMed

    Berg Miller, Margret E; Antonopoulos, Dionysios A; Rincon, Marco T; Band, Mark; Bari, Albert; Akraiko, Tatsiana; Hernandez, Alvaro; Thimmapuram, Jyothi; Henrissat, Bernard; Coutinho, Pedro M; Borovok, Ilya; Jindou, Sadanari; Lamed, Raphael; Flint, Harry J; Bayer, Edward A; White, Bryan A

    2009-08-14

    Ruminococcus flavefaciens is a predominant cellulolytic rumen bacterium, which forms a multi-enzyme cellulosome complex that could play an integral role in the ability of this bacterium to degrade plant cell wall polysaccharides. Identifying the major enzyme types involved in plant cell wall degradation is essential for gaining a better understanding of the cellulolytic capabilities of this organism as well as highlighting potential enzymes for application in improvement of livestock nutrition and for conversion of cellulosic biomass to liquid fuels. The R. flavefaciens FD-1 genome was sequenced to 29x-coverage, based on pulsed-field gel electrophoresis estimates (4.4 Mb), and assembled into 119 contigs providing 4,576,399 bp of unique sequence. As much as 87.1% of the genome encodes ORFs, tRNA, rRNAs, or repeats. The GC content was calculated at 45%. A total of 4,339 ORFs was detected with an average gene length of 918 bp. The cellulosome model for R. flavefaciens was further refined by sequence analysis, with at least 225 dockerin-containing ORFs, including previously characterized cohesin-containing scaffoldin molecules. These dockerin-containing ORFs encode a variety of catalytic modules including glycoside hydrolases (GHs), polysaccharide lyases, and carbohydrate esterases. Additionally, 56 ORFs encode proteins that contain carbohydrate-binding modules (CBMs). Functional microarray analysis of the genome revealed that 56 of the cellulosome-associated ORFs were up-regulated, 14 were down-regulated, 135 were unaffected, when R. flavefaciens FD-1 was grown on cellulose versus cellobiose. Three multi-modular xylanases (ORF01222, ORF03896, and ORF01315) exhibited the highest levels of up-regulation. The genomic evidence indicates that R. flavefaciens FD-1 has the largest known number of fiber-degrading enzymes likely to be arranged in a cellulosome architecture. Functional analysis of the genome has revealed that the growth substrate drives expression of enzymes
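The assembly figures in this abstract can be cross-checked against each other with basic arithmetic. The sketch below uses only numbers quoted in the abstract; the function and constant names are illustrative, and the ORF calculation ignores any overlap between ORFs.

```python
# Figures quoted in the abstract.
ASSEMBLY_BP = 4_576_399       # unique sequence in the assembly
PFGE_ESTIMATE_BP = 4_400_000  # genome size estimate used for the coverage figure
COVERAGE = 29
N_CONTIGS = 119
N_ORFS = 4_339
MEAN_ORF_BP = 918


def orf_fraction(n_orfs, mean_orf_bp, assembly_bp):
    """Share of the assembly covered by ORFs, assuming no overlaps."""
    return n_orfs * mean_orf_bp / assembly_bp


def mean_contig_length(assembly_bp, n_contigs):
    """Average contig length in bp."""
    return assembly_bp / n_contigs


def approx_sequenced_bases(coverage, genome_bp):
    """Rough total of sequenced bases implied by the coverage figure."""
    return coverage * genome_bp


orf_share = orf_fraction(N_ORFS, MEAN_ORF_BP, ASSEMBLY_BP)
contig_mean = mean_contig_length(ASSEMBLY_BP, N_CONTIGS)
raw_bases = approx_sequenced_bases(COVERAGE, PFGE_ESTIMATE_BP)
```

ORFs alone account for roughly 87.0% of the assembly under these assumptions, which sits just under the reported 87.1% for ORFs, tRNA, rRNAs and repeats combined. The mean contig length comes out at about 38.5 kb.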

  11. Diversity and Evolution in the Genome of Clostridium difficile

    PubMed Central

    Knight, Daniel R.; Elliott, Briony; Chang, Barbara J.; Perkins, Timothy T.

    2015-01-01

    SUMMARY Clostridium difficile infection (CDI) is the leading cause of antimicrobial and health care-associated diarrhea in humans, presenting a significant burden to global health care systems. In the last 2 decades, PCR- and sequence-based techniques, particularly whole-genome sequencing (WGS), have significantly furthered our knowledge of the genetic diversity, evolution, epidemiology, and pathogenicity of this once enigmatic pathogen. C. difficile is taxonomically distinct from many other well-known clostridia, with a diverse population structure comprising hundreds of strain types spread across at least 6 phylogenetic clades. The C. difficile species is defined by a large diverse pangenome with extreme levels of evolutionary plasticity that has been shaped over long time periods by gene flux and recombination, often between divergent lineages. These evolutionary events are in response to environmental and anthropogenic activities and have led to the rapid emergence and worldwide dissemination of virulent clonal lineages. Moreover, genome analysis of large clinically relevant data sets has improved our understanding of CDI outbreaks, transmission, and recurrence. The epidemiology of CDI has changed dramatically over the last 15 years, and CDI may have a foodborne or zoonotic etiology. The WGS era promises to continue to redefine our view of this significant pathogen. PMID:26085550

  12. The mitochondrial genomes of Amphiascoides atopus and Schizopera knabeni (Harpacticoida: Miraciidae) reveal similarities between the copepod orders Harpacticoida and Poecilostomatoida.

    PubMed

    Easton, Erin E; Darrow, Emily M; Spears, Trisha; Thistle, David

    2014-03-15

    Members of subclass Copepoda are abundant, diverse, and, as a result of their variety of ecological roles in marine and freshwater environments, important, but their phylogenetic interrelationships are unclear. Recent studies of arthropods have used gene arrangements in the mitochondrial (mt) genome to infer phylogenies, but for copepods, only seven complete mt genomes have been published. These data revealed several within-order and few among-order similarities. To increase the data available for comparisons, we sequenced the complete mt genome (13,831 base pairs) of Amphiascoides atopus and 10,649 base pairs of the mt genome of Schizopera knabeni (both in the family Miraciidae of the order Harpacticoida). Comparison of our data to those for Tigriopus japonicus (family Harpacticidae, order Harpacticoida) revealed similarities in gene arrangement among these three species that were consistent with those found within and among families of other copepod orders. Comparison of the mt genomes of our species with those known from other copepod orders revealed the arrangement of mt genes of our Harpacticoida species to be more similar to that of Sinergasilus polycolpus (order Poecilostomatoida) than to that of T. japonicus. The similarities between S. polycolpus and our species are the first to be noted across the boundaries of copepod orders and support the possibility that mt-gene arrangement might be used to infer copepod phylogenies. We also found that our two species had extremely truncated transfer RNAs and that gene overlaps occurred much more frequently than has been reported for other copepod mt genomes. Published by Elsevier B.V.

  13. Patterns of genome size diversity in bats (order Chiroptera).

    PubMed

    Smith, Jillian D L; Bickham, John W; Gregory, T Ryan

    2013-08-01

    Despite being a group of particular interest in considering relationships between genome size and metabolic parameters, bats have not been well studied from this perspective. This study presents new estimates for 121 "microbat" species from 12 families and complements a previous study on members of the family Pteropodidae ("megabats"). The results confirm that diversity in genome size in bats is very limited even compared with other mammals, varying approximately 2-fold from 1.63 pg in Lophostoma carrikeri to 3.17 pg in Rhinopoma hardwickii and averaging only 2.35 pg ± 0.02 SE (versus 3.5 pg overall for mammals). However, contrary to some other vertebrate groups, and perhaps owing to the narrow range observed, genome size correlations were not apparent with any chromosomal, physiological, flight-related, developmental, or ecological characteristics within the order Chiroptera. Genome size is positively correlated with measures of body size in bats, though the strength of the relationships differs between pteropodids ("megabats") and nonpteropodids ("microbats").
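The genome sizes above are C-values in picograms. A short sketch shows how they translate into base pairs and how narrow the reported range is. It assumes the commonly used conversion of 1 pg of DNA to about 978 Mbp; the function names are illustrative.

```python
MBP_PER_PG = 978.0  # commonly used conversion: 1 pg of DNA is about 978 Mbp


def pg_to_mbp(picograms):
    """Convert a genome size in picograms to megabase pairs."""
    return picograms * MBP_PER_PG


def fold_range(smallest, largest):
    """Ratio of the largest to the smallest genome size."""
    return largest / smallest


# Values quoted in the abstract.
smallest_mbp = pg_to_mbp(1.63)   # Lophostoma carrikeri
largest_mbp = pg_to_mbp(3.17)    # Rhinopoma hardwickii
mean_mbp = pg_to_mbp(2.35)       # reported mean for bats
spread = fold_range(1.63, 3.17)
```

The spread works out to about 1.94, matching the "approximately 2-fold" variation stated in the abstract, and the mean corresponds to roughly 2.3 Gbp.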

  14. Across language families: Genome diversity mirrors linguistic variation within Europe

    PubMed Central

    Longobardi, Giuseppe; Ghirotto, Silvia; Guardiano, Cristina; Tassi, Francesca; Benazzo, Andrea; Ceolin, Andrea

    2015-01-01

    ABSTRACT Objectives: The notion that patterns of linguistic and biological variation may cast light on each other and on population histories dates back to Darwin's times; yet, turning this intuition into a proper research program has met with serious methodological difficulties, especially affecting language comparisons. This article takes advantage of two new tools of comparative linguistics: a refined list of Indo‐European cognate words, and a novel method of language comparison estimating linguistic diversity from a universal inventory of grammatical polymorphisms, and hence enabling comparison even across different families. We corroborated the method and used it to compare patterns of linguistic and genomic variation in Europe. Materials and Methods: Two sets of linguistic distances, lexical and syntactic, were inferred from these data and compared with measures of geographic and genomic distance through a series of matrix correlation tests. Linguistic and genomic trees were also estimated and compared. A method (Treemix) was used to infer migration episodes after the main population splits. Results: We observed significant correlations between genomic and linguistic diversity, the latter inferred from data on both Indo‐European and non‐Indo‐European languages. Contrary to previous observations, on the European scale, language proved a better predictor of genomic differences than geography. Inferred episodes of genetic admixture following the main population splits found convincing correlates also in the linguistic realm. Discussion: These results pave the ground for previously unfeasible cross‐disciplinary analyses at the worldwide scale, encompassing populations of distant language families. Am J Phys Anthropol 157:630–640, 2015. © 2015 Wiley Periodicals, Inc. PMID:26059462

  15. Analysis of the full genome of human group C rotaviruses reveals lineage diversification and reassortment.

    PubMed

    Medici, Maria Cristina; Tummolo, Fabio; Martella, Vito; Arcangeletti, Maria Cristina; De Conto, Flora; Chezzi, Carlo; Fehér, Enikő; Marton, Szilvia; Calderaro, Adriana; Bányai, Krisztián

    2016-08-01

    Group C rotaviruses (RVC) are enteric pathogens of humans and animals. Whole-genome sequences are available only for few RVCs, leaving gaps in our knowledge about their genetic diversity. We determined the full-length genome sequence of two human RVCs (PR2593/2004 and PR713/2012), detected in Italy from hospital-based surveillance for rotavirus infection in 2004 and 2012. In the 11 RNA genomic segments, the two Italian RVCs segregated within separate intra-genotypic lineages showed variation ranging from 1.9 % (VP6) to 15.9 % (VP3) at the nucleotide level. Comprehensive analysis of human RVC sequences available in the databases allowed us to reveal the existence of at least two major genome configurations, defined as type I and type II. Human RVCs of type I were all associated with the M3 VP3 genotype, including the Italian strain PR2593/2004. Conversely, human RVCs of type II were all associated with the M2 VP3 genotype, including the Italian strain PR713/2012. Reassortant RVC strains between these major genome configurations were identified. Although only a few full-genome sequences of human RVCs, mostly of Asian origin, are available, the analysis of human RVC sequences retrieved from the databases indicates that at least two intra-genotypic RVC lineages circulate in European countries. Gathering more sequence data is necessary to develop a standardized genotype and intra-genotypic lineage classification system useful for epidemiological investigations and avoiding confusion in the literature.

  16. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding

    PubMed Central

    Ayub, Qasim; Szpak, Michal; Frandsen, Peter; Chen, Yuan; Yngvadottir, Bryndis; Cooper, David N.; de Manuel, Marc; Hernandez-Rodriguez, Jessica; Lobon, Irene; Siegismund, Hans R.; Pagani, Luca; Quail, Michael A.; Hvilsom, Christina; Mudakikwa, Antoine; Eichler, Evan E.; Cranfield, Michael R.; Marques-Bonet, Tomas; Tyler-Smith, Chris; Scally, Aylwyn

    2015-01-01

    Mountain gorillas are an endangered great ape subspecies and a prominent focus for conservation, yet we know little about their genomic diversity and evolutionary past. We sequenced whole genomes from multiple wild individuals and compared the genomes of all four Gorilla subspecies. We found that the two eastern subspecies have experienced a prolonged population decline over the past 100,000 years, resulting in very low genetic diversity and an increased overall burden of deleterious variation. A further recent decline in the mountain gorilla population has led to extensive inbreeding, such that individuals are typically homozygous at 34% of their sequence, leading to the purging of severely deleterious recessive mutations from the population. We discuss the causes of their decline and the consequences for their future survival. PMID:25859046

  17. Comparative transcriptomics of two environmentally relevant cyanobacteria reveals unexpected transcriptome diversity

    PubMed Central

    Voigt, Karsten; Sharma, Cynthia M; Mitschke, Jan; Joke Lambrecht, S; Voß, Björn; Hess, Wolfgang R; Steglich, Claudia

    2014-01-01

    Prochlorococcus is a genus of abundant and ecologically important marine cyanobacteria. Here, we present a comprehensive comparison of the structure and composition of the transcriptomes of two Prochlorococcus strains, which, despite their similarities, have adapted their gene pool to specific environmental constraints. We present genome-wide maps of transcriptional start sites (TSS) for both organisms, which are representatives of the two most diverse clades within the two major ecotypes adapted to high- and low-light conditions, respectively. Our data suggest antisense transcription for three-quarters of all genes, which is substantially more than that observed in other bacteria. We discovered hundreds of TSS within genes, most notably within 16 of the 29 prochlorosin genes, in strain MIT9313. A direct comparison revealed very little conservation in the location of TSS and the nature of non-coding transcripts between both strains. We detected extremely short 5′ untranslated regions with a median length of only 27 and 29 nt for MED4 and MIT9313, respectively, and for 8% of all protein-coding genes the median distance to the start codon is only 10 nt or even shorter. These findings and the absence of an obvious Shine–Dalgarno motif suggest that leaderless translation and ribosomal protein S1-dependent translation constitute alternative mechanisms for translation initiation in Prochlorococcus. We conclude that genome-wide antisense transcription is a major component of the transcriptional output from these relatively small genomes and that a hitherto unrecognized high degree of complexity and variability of gene expression exists in their transcriptional architecture. PMID:24739626

  18. Ecology of uncultured Prochlorococcus clades revealed through single-cell genomics and biogeographic analysis

    PubMed Central

    Malmstrom, Rex R; Rodrigue, Sébastien; Huang, Katherine H; Kelly, Libusha; Kern, Suzanne E; Thompson, Anne; Roggensack, Sara; Berube, Paul M; Henn, Matthew R; Chisholm, Sallie W

    2013-01-01

    Prochlorococcus is the numerically dominant photosynthetic organism throughout much of the world's oceans, yet little is known about the ecology and genetic diversity of populations inhabiting tropical waters. To help close this gap, we examined natural Prochlorococcus communities in the tropical Pacific Ocean using a single-cell whole-genome amplification and sequencing. Analysis of the gene content of just 10 single cells from these waters added 394 new genes to the Prochlorococcus pan-genome—that is, genes never before seen in a Prochlorococcus cell. Analysis of marker genes, including the ribosomal internal transcribed sequence, from dozens of individual cells revealed several representatives from two uncultivated clades of Prochlorococcus previously identified as HNLC1 and HNLC2. While the HNLC clades can dominate Prochlorococcus communities under certain conditions, their overall geographic distribution was highly restricted compared with other clades of Prochlorococcus. In the Atlantic and Pacific oceans, these clades were only found in warm waters with low Fe and high inorganic P levels. Genomic analysis suggests that at least one of these clades thrives in low Fe environments by scavenging organic-bound Fe, a process previously unknown in Prochlorococcus. Furthermore, the capacity to utilize organic-bound Fe appears to have been acquired horizontally and may be exchanged among other clades of Prochlorococcus. Finally, one of the single Prochlorococcus cells sequenced contained a partial genome of what appears to be a prophage integrated into the genome. PMID:22895163

  19. Molecular Diversity and Population Structure of a Worldwide Collection of Cultivated Tetraploid Alfalfa (Medicago sativa subsp. sativa L.) Germplasm as Revealed by Microsatellite Markers.

    PubMed

    Qiang, Haiping; Chen, Zhihong; Zhang, Zhengli; Wang, Xuemin; Gao, Hongwen; Wang, Zan

    2015-01-01

    Information on genetic diversity and population structure of a tetraploid alfalfa collection might be valuable in effective use of the genetic resources. A set of 336 worldwide genotypes of tetraploid alfalfa (Medicago sativa subsp. sativa L.) was genotyped using 85 genome-wide distributed SSR markers to reveal the genetic diversity and population structure in the alfalfa. Genetic diversity analysis identified a total of 1056 alleles across 85 marker loci. The average expected heterozygosity and polymorphism information content values were 0.677 and 0.638, respectively, showing high levels of genetic diversity in the cultivated tetraploid alfalfa germplasm. Comparison of genetic characteristics across chromosomes indicated regions of chromosomes 2 and 3 had the highest genetic diversity. A higher genetic diversity was detected in alfalfa landraces than in wild materials and cultivars. Two populations were identified by the model-based population structure, principal coordinate and neighbor-joining analyses, corresponding to China and other parts of the world. However, lack of strict correlation between clustering and geographic origins suggested extensive exchanges of alfalfa germplasm across diverse geographic regions. The quantitative analysis of the genetic diversity and population structure in this study could be useful for genetic and genomic analysis and utilization of the genetic variation in alfalfa breeding.

  20. Molecular Diversity and Population Structure of a Worldwide Collection of Cultivated Tetraploid Alfalfa (Medicago sativa subsp. sativa L.) Germplasm as Revealed by Microsatellite Markers

    PubMed Central

    Qiang, Haiping; Chen, Zhihong; Zhang, Zhengli; Wang, Xuemin; Gao, Hongwen; Wang, Zan

    2015-01-01

    Information on genetic diversity and population structure of a tetraploid alfalfa collection might be valuable in effective use of the genetic resources. A set of 336 worldwide genotypes of tetraploid alfalfa (Medicago sativa subsp. sativa L.) was genotyped using 85 genome-wide distributed SSR markers to reveal the genetic diversity and population structure in the alfalfa. Genetic diversity analysis identified a total of 1056 alleles across 85 marker loci. The average expected heterozygosity and polymorphism information content values were 0.677 and 0.638, respectively, showing high levels of genetic diversity in the cultivated tetraploid alfalfa germplasm. Comparison of genetic characteristics across chromosomes indicated regions of chromosomes 2 and 3 had the highest genetic diversity. A higher genetic diversity was detected in alfalfa landraces than in wild materials and cultivars. Two populations were identified by the model-based population structure, principal coordinate and neighbor-joining analyses, corresponding to China and other parts of the world. However, lack of strict correlation between clustering and geographic origins suggested extensive exchanges of alfalfa germplasm across diverse geographic regions. The quantitative analysis of the genetic diversity and population structure in this study could be useful for genetic and genomic analysis and utilization of the genetic variation in alfalfa breeding. PMID:25901573

  1. Comparative Genomics Reveals Accelerated Evolution in Conserved Pathways during the Diversification of Anole Lizards.

    PubMed

    Tollis, Marc; Hutchins, Elizabeth D; Stapley, Jessica; Rupp, Shawn M; Eckalbar, Walter L; Maayan, Inbar; Lasku, Eris; Infante, Carlos R; Dennis, Stuart R; Robertson, Joel A; May, Catherine M; Crusoe, Michael R; Bermingham, Eldredge; DeNardo, Dale F; Hsieh, Shi-Tong Tonia; Kulathinal, Rob J; McMillan, William Owen; Menke, Douglas B; Pratt, Stephen C; Rawls, Jeffery Alan; Sanjur, Oris; Wilson-Rawls, Jeanne; Wilson Sayres, Melissa A; Fisher, Rebecca E; Kusumi, Kenro

    2018-02-01

    Squamates include all lizards and snakes, and display some of the most diverse and extreme morphological adaptations among vertebrates. However, compared with birds and mammals, relatively few resources exist for comparative genomic analyses of squamates, hampering efforts to understand the molecular bases of phenotypic diversification in such a speciose clade. In particular, the ∼400 species of anole lizard represent an extensive squamate radiation. Here, we sequence and assemble the draft genomes of three anole species-Anolis frenatus, Anolis auratus, and Anolis apletophallus-for comparison with the available reference genome of Anolis carolinensis. Comparative analyses reveal a rapid background rate of molecular evolution consistent with a model of punctuated equilibrium, and strong purifying selection on functional genomic elements in anoles. We find evidence for accelerated evolution in genes involved in behavior, sensory perception, and reproduction, as well as in genes regulating limb bud development and hindlimb specification. Morphometric analyses of anole fore and hindlimbs corroborated these findings. We detect signatures of positive selection across several genes related to the development and regulation of the forebrain, hormones, and the iguanian lizard dewlap, suggesting molecular changes underlying behavioral adaptations known to reinforce species boundaries were a key component in the diversification of anole lizards. © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  2. Cattle genome-wide analysis reveals genetic signatures in trypanotolerant N'Dama.

    PubMed

    Kim, Soo-Jin; Ka, Sojeong; Ha, Jung-Woo; Kim, Jaemin; Yoo, DongAhn; Kim, Kwondo; Lee, Hak-Kyo; Lim, Dajeong; Cho, Seoae; Hanotte, Olivier; Mwai, Okeyo Ally; Dessie, Tadelle; Kemp, Stephen; Oh, Sung Jong; Kim, Heebal

    2017-05-12

    Indigenous cattle in Africa have adapted to various local environments to acquire superior phenotypes that enhance their survival under harsh conditions. While many studies investigated the adaptation of overall African cattle, genetic characteristics of each breed have been poorly studied. We performed the comparative genome-wide analysis to assess evidence for subspeciation within species at the genetic level in trypanotolerant N'Dama cattle. We analysed genetic variation patterns in N'Dama from the genomes of 101 cattle breeds including 48 samples of five indigenous African cattle breeds and 53 samples of various commercial breeds. Analysis of SNP variances between cattle breeds using wMI, XP-CLR, and XP-EHH detected genes containing N'Dama-specific genetic variants and their potential associations. Functional annotation analysis revealed that these genes are associated with ossification, neurological and immune system. Particularly, the genes involved in bone formation indicate that local adaptation of N'Dama may engage in skeletal growth as well as immune systems. Our results imply that N'Dama might have acquired distinct genotypes associated with growth and regulation of regional diseases including trypanosomiasis. Moreover, this study offers significant insights into identifying genetic signatures for natural and artificial selection of diverse African cattle breeds.

  3. Clonal evolution in breast cancer revealed by single nucleus genome sequencing.

    PubMed

    Wang, Yong; Waters, Jill; Leung, Marco L; Unruh, Anna; Roh, Whijae; Shi, Xiuqing; Chen, Ken; Scheet, Paul; Vattathil, Selina; Liang, Han; Multani, Asha; Zhang, Hong; Zhao, Rui; Michor, Franziska; Meric-Bernstam, Funda; Navin, Nicholas E

    2014-08-14

    Sequencing studies of breast tumour cohorts have identified many prevalent mutations, but provide limited insight into the genomic diversity within tumours. Here we developed a whole-genome and exome single cell sequencing approach called nuc-seq that uses G2/M nuclei to achieve 91% mean coverage breadth. We applied this method to sequence single normal and tumour nuclei from an oestrogen-receptor-positive (ER(+)) breast cancer and a triple-negative ductal carcinoma. In parallel, we performed single nuclei copy number profiling. Our data show that aneuploid rearrangements occurred early in tumour evolution and remained highly stable as the tumour masses clonally expanded. In contrast, point mutations evolved gradually, generating extensive clonal diversity. Using targeted single-molecule sequencing, many of the diverse mutations were shown to occur at low frequencies (<10%) in the tumour mass. Using mathematical modelling we found that the triple-negative tumour cells had an increased mutation rate (13.3×), whereas the ER(+) tumour cells did not. These findings have important implications for the diagnosis, therapeutic treatment and evolution of chemoresistance in breast cancer.

  4. Genomic diversity and evolution of the fish pathogen Flavobacterium psychrophilum

    USDA-ARS's Scientific Manuscript database

    Flavobacterium psychrophilum, the etiological agent of rainbow trout fry syndrome and bacterial cold-water disease in salmonid fish, is currently one of the main bacterial pathogens hampering the productivity of salmonid farming worldwide. In this study, the genomic diversity of the F. psychrophilum...

  5. Investigating the Genome Diversity of B. cereus and Evolutionary Aspects of B. anthracis Emergence

    PubMed Central

    Papazisi, Leka; Rasko, David A.; Ratnayake, Shashikala; Bock, Geoff R.; Remortel, Brian G.; Appalla, Lakshmi; Liu, Jia; Dracheva, Tatiana; Braisted, John C.; Shallom, Shamira; Jarrahi, Benham; Snesrud, Erik; Ahn, Susie; Sun, Qiang; Rilstone, Jenifer; Økstad, Ole Andreas; Kolstø, Anne-Brit; Fleischmann, Robert D.; Peterson, Scott N.

    2011-01-01

    Here we report the use of a multi-genome DNA microarray to investigate the genome diversity of Bacillus cereus group members and elucidate the events associated with the emergence of B. anthracis, the causative agent of anthrax, a lethal zoonotic disease. We initially performed directed genome sequencing of seven diverse B. cereus strains to identify novel sequences encoded in those genomes. The novel genes identified, combined with those publicly available, allowed the design of a “species” DNA microarray. Comparative genomic hybridization analyses of 41 strains indicate that substantial heterogeneity exists with respect to the genes comprising functional role categories. While the acquisition of the plasmid-encoded pathogenicity island (pXO1) and capsule genes (pXO2) represents a crucial landmark dictating the emergence of B. anthracis, the evolution of this species and its close relatives was associated with an overall shift in the fraction of genes devoted to energy metabolism, cellular processes, transport, as well as virulence. PMID:21447378

  6. Development of a Custom-Designed, Pan Genomic DNA Microarray to Characterize Strain-Level Diversity among Cronobacter spp.

    PubMed Central

    Tall, Ben Davies; Gangiredla, Jayanthi; Gopinath, Gopal R.; Yan, Qiongqiong; Chase, Hannah R.; Lee, Boram; Hwang, Seongeun; Trach, Larisa; Park, Eunbi; Yoo, YeonJoo; Chung, TaeJung; Jackson, Scott A.; Patel, Isha R.; Sathyamoorthy, Venugopal; Pava-Ripoll, Monica; Kotewicz, Michael L.; Carter, Laurenda; Iversen, Carol; Pagotto, Franco; Stephan, Roger; Lehner, Angelika; Fanning, Séamus; Grim, Christopher J.

    2015-01-01

    Cronobacter species cause infections in all age groups; however, neonates are at highest risk and remain the most susceptible age group for life-threatening invasive disease. The genus contains seven species: Cronobacter sakazakii, Cronobacter malonaticus, Cronobacter turicensis, Cronobacter muytjensii, Cronobacter dublinensis, Cronobacter universalis, and Cronobacter condimenti. Despite an abundance of published genomes of these species, genomics-based epidemiology of the genus is not well established. The gene content of a diverse group of 126 unique Cronobacter and taxonomically related isolates was determined using a pan genomic-based DNA microarray as a genotyping tool and as a means to identify outbreak isolates for food safety, environmental, and clinical surveillance purposes. The microarray constitutes 19,287 independent genes representing 15 Cronobacter genomes and 18 plasmids and 2,371 virulence factor genes of phylogenetically related Gram-negative bacteria. The Cronobacter microarray was able to distinguish the seven Cronobacter species from one another and from non-Cronobacter species; and within each species, strains grouped into distinct clusters based on their genomic diversity. These results also support the phylogenic divergence of the genus and clearly highlight the genomic diversity among each member of the genus. The current study establishes a powerful platform for further genomics research of this diverse genus, an important prerequisite toward the development of future countermeasures against this foodborne pathogen in the food safety and clinical arenas. PMID:25984509

  7. Whole genome SNP discovery and analysis of genetic diversity in Turkey (Meleagris gallopavo)

    PubMed Central

    2012-01-01

    Background The turkey (Meleagris gallopavo) is an important agricultural species and the second largest contributor to the world’s poultry meat production. Genetic improvement is attributed largely to selective breeding programs that rely on highly heritable phenotypic traits, such as body size and breast muscle development. Commercial breeding with small effective population sizes and epistasis can result in loss of genetic diversity, which in turn can lead to reduced individual fitness and reduced response to selection. The presence of genomic diversity in domestic livestock species is therefore of great importance and a prerequisite for rapid and accurate genetic improvement of selected breeds in various environments, as well as to facilitate rapid adaptation to potential changes in breeding goals. Genomic selection requires a large number of genetic markers such as single nucleotide polymorphisms (SNPs), the most abundant source of genetic variation within the genome. Results Alignment of next generation sequencing data of 32 individual turkeys from different populations was used for the discovery of 5.49 million SNPs, which subsequently were used for the analysis of genetic diversity among the different populations. All of the commercial lines branched from a single node relative to the heritage varieties and the South Mexican turkey population. Heterozygosity of all individuals from the different turkey populations ranged from 0.17-2.73 SNPs/Kb, while heterozygosity of populations ranged from 0.73-1.64 SNPs/Kb. The average frequency of heterozygous SNPs in individual turkeys was 1.07 SNPs/Kb. Five genomic regions with very low nucleotide variation were identified in domestic turkeys that showed state of fixation towards alleles different than wild alleles. Conclusion The turkey genome is much less diverse with a relatively low frequency of heterozygous SNPs as compared to other livestock species like chicken and pig. The whole genome SNP discovery

  8. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding.

    PubMed

    Xue, Yali; Prado-Martinez, Javier; Sudmant, Peter H; Narasimhan, Vagheesh; Ayub, Qasim; Szpak, Michal; Frandsen, Peter; Chen, Yuan; Yngvadottir, Bryndis; Cooper, David N; de Manuel, Marc; Hernandez-Rodriguez, Jessica; Lobon, Irene; Siegismund, Hans R; Pagani, Luca; Quail, Michael A; Hvilsom, Christina; Mudakikwa, Antoine; Eichler, Evan E; Cranfield, Michael R; Marques-Bonet, Tomas; Tyler-Smith, Chris; Scally, Aylwyn

    2015-04-10

    Mountain gorillas are an endangered great ape subspecies and a prominent focus for conservation, yet we know little about their genomic diversity and evolutionary past. We sequenced whole genomes from multiple wild individuals and compared the genomes of all four Gorilla subspecies. We found that the two eastern subspecies have experienced a prolonged population decline over the past 100,000 years, resulting in very low genetic diversity and an increased overall burden of deleterious variation. A further recent decline in the mountain gorilla population has led to extensive inbreeding, such that individuals are typically homozygous at 34% of their sequence, leading to the purging of severely deleterious recessive mutations from the population. We discuss the causes of their decline and the consequences for their future survival. Copyright © 2015, American Association for the Advancement of Science.

  9. Genome-wide Diversity and Association Mapping for Capsaicinoids and Fruit Weight in Capsicum annuum L

    PubMed Central

    Nimmakayala, Padma; Abburi, Venkata L.; Saminathan, Thangasamy; Alaparthi, Suresh B.; Almeida, Aldo; Davenport, Brittany; Nadimi, Marjan; Davidson, Joshua; Tonapi, Krittika; Yadav, Lav; Malkaram, Sridhar; Vajja, Gopinath; Hankins, Gerald; Harris, Robert; Park, Minkyu; Choi, Doil; Stommel, John; Reddy, Umesh K.

    2016-01-01

    Accumulated capsaicinoid content and increased fruit size are traits resulting from Capsicum annuum domestication. In this study, we used a diverse collection of C. annuum to generate 66,960 SNPs using genotyping by sequencing. The study identified 1189 haplotypes containing 3413 SNPs. Length of individual linkage disequilibrium (LD) blocks varied along chromosomes, with regions of high and low LD interspersed with an average LD of 139 kb. Principal component analysis (PCA), Bayesian model based population structure analysis and an Euclidean tree built based on identity by state (IBS) indices revealed that the clustering pattern of diverse accessions are in agreement with capsaicin content (CA) and fruit weight (FW) classifications indicating the importance of these traits in shaping modern pepper genome. PCA and IBS were used in a mixed linear model of capsaicin and dihydrocapsaicin content and fruit weight to reduce spurious associations because of confounding effects of subpopulations in genome-wide association study (GWAS). Our GWAS results showed SNPs in Ankyrin-like protein, IKI3 family protein, ABC transporter G family and pentatricopeptide repeat protein are the major markers for capsaicinoids and of 16 SNPs strongly associated with FW in both years of the study, 7 are located in known fruit weight controlling genes. PMID:27901114

  10. Genome-wide Diversity and Association Mapping for Capsaicinoids and Fruit Weight in Capsicum annuum L.

    PubMed

    Nimmakayala, Padma; Abburi, Venkata L; Saminathan, Thangasamy; Alaparthi, Suresh B; Almeida, Aldo; Davenport, Brittany; Nadimi, Marjan; Davidson, Joshua; Tonapi, Krittika; Yadav, Lav; Malkaram, Sridhar; Vajja, Gopinath; Hankins, Gerald; Harris, Robert; Park, Minkyu; Choi, Doil; Stommel, John; Reddy, Umesh K

    2016-11-30

    Accumulated capsaicinoid content and increased fruit size are traits resulting from Capsicum annuum domestication. In this study, we used a diverse collection of C. annuum to generate 66,960 SNPs using genotyping by sequencing. The study identified 1189 haplotypes containing 3413 SNPs. Length of individual linkage disequilibrium (LD) blocks varied along chromosomes, with regions of high and low LD interspersed with an average LD of 139 kb. Principal component analysis (PCA), Bayesian model based population structure analysis and an Euclidean tree built based on identity by state (IBS) indices revealed that the clustering pattern of diverse accessions are in agreement with capsaicin content (CA) and fruit weight (FW) classifications indicating the importance of these traits in shaping modern pepper genome. PCA and IBS were used in a mixed linear model of capsaicin and dihydrocapsaicin content and fruit weight to reduce spurious associations because of confounding effects of subpopulations in genome-wide association study (GWAS). Our GWAS results showed SNPs in Ankyrin-like protein, IKI3 family protein, ABC transporter G family and pentatricopeptide repeat protein are the major markers for capsaicinoids and of 16 SNPs strongly associated with FW in both years of the study, 7 are located in known fruit weight controlling genes.

  11. Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ohm, Robin A.; Feau, Nicolas; Henrissat, Bernard

    2013-03-05

    The class of Dothideomycetes is one of the largest and most diverse groups of fungi. Many are plant pathogens and pose a serious threat to agricultural crops that are grown for biofuel, food or feed. Most Dothideomycetes have only a single host plant, and related species can have very diverse hosts. Eighteen genomes of Dothideomycetes have currently been sequenced by the Joint Genome Institute and other sequencing centers. Here we describe the results of comparative analyses of the fungi in this group.

  12. Comparative genomics of maize ear rot pathogens reveals expansion of carbohydrate-active enzymes and secondary metabolism backbone genes in Stenocarpella maydis.

    PubMed

    Zaccaron, Alex Z; Woloshuk, Charles P; Bluhm, Burton H

    2017-11-01

    Stenocarpella maydis is a plant pathogenic fungus that causes Diplodia ear rot, one of the most destructive diseases of maize. To date, little information is available regarding the molecular basis of pathogenesis in this organism, in part due to limited genomic resources. In this study, a 54.8 Mb draft genome assembly of S. maydis was obtained with Illumina and PacBio sequencing technologies, and analyzed. Comparative genomic analyses with the predominant maize ear rot pathogens Aspergillus flavus, Fusarium verticillioides, and Fusarium graminearum revealed an expanded set of carbohydrate-active enzymes for cellulose and hemicellulose degradation in S. maydis. Analyses of predicted genes involved in starch degradation revealed six putative α-amylases, four extracellular and two intracellular, and two putative γ-amylases, one of which appears to have been acquired from bacteria via horizontal transfer. Additionally, 87 backbone genes involved in secondary metabolism were identified, which represents one of the largest known assemblages among Pezizomycotina species. Numerous secondary metabolite gene clusters were identified, including two clusters likely involved in the biosynthesis of diplodiatoxin and chaetoglobosins. The draft genome of S. maydis presented here will serve as a useful resource for molecular genetics, functional genomics, and analyses of population diversity in this organism. Copyright © 2017 British Mycological Society. Published by Elsevier Ltd. All rights reserved.

  13. Whole genome sequencing of the monomorphic pathogen Mycobacterium bovis reveals local differentiation of cattle clinical isolates.

    PubMed

    Lasserre, Moira; Fresia, Pablo; Greif, Gonzalo; Iraola, Gregorio; Castro-Ramos, Miguel; Juambeltz, Arturo; Nuñez, Álvaro; Naya, Hugo; Robello, Carlos; Berná, Luisa

    2018-01-02

    Bovine tuberculosis (bTB) poses serious risks to animal welfare and economy, as well as to public health as a zoonosis. Its etiological agent, Mycobacterium bovis, belongs to the Mycobacterium tuberculosis complex (MTBC), a group of genetically monomorphic organisms featured by a remarkably high overall nucleotide identity (99.9%). Indeed, this characteristic is of major concern for correct typing and determination of strain-specific traits based on sequence diversity. Due to its historical economic dependence on cattle production, Uruguay is deeply affected by the prevailing incidence of Mycobacterium bovis. With the world's highest number of cattle per human, and its intensive cattle production, Uruguay represents a particularly suited setting to evaluate genomic variability among isolates, and the diversity traits associated to this pathogen. We compared 186 genomes from MTBC strains isolated worldwide, and found a highly structured population in M. bovis. The analysis of 23 new M. bovis genomes, belonging to strains isolated in Uruguay evidenced three groups present in the country. Despite presenting an expected highly conserved genomic structure and sequence, these strains segregate into a clustered manner within the worldwide phylogeny. Analysis of the non-pe/ppe differential areas against a reference genome defined four main sources of variability, namely: regions of difference (RD), variable genes, duplications and novel genes. RDs and variant analysis segregated the strains into clusters that are concordant with their spoligotype identities. Due to its high homoplasy rate, spoligotyping failed to reflect the true genomic diversity among worldwide representative strains, however, it remains a good indicator for closely related populations. This study introduces a comprehensive population structure analysis of worldwide M. bovis isolates. The incorporation and analysis of 23 novel Uruguayan M. bovis genomes, sheds light onto the genomic diversity of this

  14. The genome of Eucalyptus grandis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Myburg, Alexander A.; Grattapaglia, Dario; Tuskan, Gerald A.

    Eucalypts are the world's most widely planted hardwood trees. Their broad adaptability, rich species diversity, fast growth and superior multipurpose wood have made them a global renewable resource of fiber and energy that mitigates human pressures on natural forests. We sequenced and assembled >94% of the 640 Mbp genome of Eucalyptus grandis into its 11 chromosomes. A set of 36,376 protein coding genes were predicted, revealing that 34% occur in tandem duplications, the largest proportion found thus far in any plant genome. Eucalypts also show the highest diversity of genes for plant specialized metabolism that act as chemical defence against biotic agents and provide unique pharmaceutical oils. Resequencing of a set of inbred tree genomes revealed regions of strongly conserved heterozygosity, likely hotspots of inbreeding depression. The resequenced genome of the sister species E. globulus underscored the high inter-specific genome colinearity despite substantial genome size variation in the genus. The genome of E. grandis is the first reference for the early diverging Rosid order Myrtales and is placed here basal to the Eurosids. This resource expands knowledge on the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.

  15. Genomic basis for natural product biosynthetic diversity in the actinomycetes†

    PubMed Central

    Nett, Markus; Ikeda, Haruo; Moore, Bradley S.

    2010-01-01

    The phylum Actinobacteria hosts diverse high G + C, Gram-positive bacteria that have evolved a complex chemical language of natural product chemistry to help navigate their fascinatingly varied lifestyles. To date, 71 Actinobacteria genomes have been completed and annotated, with the vast majority representing the Actinomycetales, which are the source of numerous antibiotics and other drugs from genera such as Streptomyces, Saccharopolyspora and Salinispora. These genomic analyses have illuminated the secondary metabolic proficiency of these microbes – underappreciated for years based on conventional isolation programs – and have helped set the foundation for a new natural product discovery paradigm based on genome mining. Trends in the secondary metabolomes of natural product-rich actinomycetes are highlighted in this review article, which contains 199 references. PMID:19844637

  16. Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication

    PubMed Central

    vonHoldt, Bridgett M.; Pollinger, John P.; Lohmueller, Kirk E.; Han, Eunjung; Parker, Heidi G.; Quignon, Pascale; Degenhardt, Jeremiah D.; Boyko, Adam R.; Earl, Dent A.; Auton, Adam; Reynolds, Andy; Bryc, Kasia; Brisbin, Abra; Knowles, James C.; Mosher, Dana S.; Spady, Tyrone C.; Elkahloun, Abdel; Geffen, Eli; Pilot, Malgorzata; Jedrzejewski, Wlodzimierz; Greco, Claudia; Randi, Ettore; Bannasch, Danika; Wilton, Alan; Shearman, Jeremy; Musiani, Marco; Cargill, Michelle; Jones, Paul G.; Qian, Zuwei; Huang, Wei; Ding, Zhao-Li; Zhang, Ya-ping; Bustamante, Carlos D.; Ostrander, Elaine A.; Novembre, John; Wayne, Robert K.

    2010-01-01

    Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data. Furthermore, we find a surprising correspondence between genetic and phenotypic/functional breed groupings but there are exceptions that suggest phenotypic diversification depended in part on the repeated crossing of individuals with novel phenotypes. Our results show that Middle Eastern wolves were a critical source of genome diversity, although interbreeding with local wolf populations clearly occurred elsewhere in the early history of specific lineages. More recently, the evolution of modern dog breeds seems to have been an iterative process that drew on a limited genetic toolkit to create remarkable phenotypic diversity. PMID:20237475

  17. Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication.

    PubMed

    Vonholdt, Bridgett M; Pollinger, John P; Lohmueller, Kirk E; Han, Eunjung; Parker, Heidi G; Quignon, Pascale; Degenhardt, Jeremiah D; Boyko, Adam R; Earl, Dent A; Auton, Adam; Reynolds, Andy; Bryc, Kasia; Brisbin, Abra; Knowles, James C; Mosher, Dana S; Spady, Tyrone C; Elkahloun, Abdel; Geffen, Eli; Pilot, Malgorzata; Jedrzejewski, Wlodzimierz; Greco, Claudia; Randi, Ettore; Bannasch, Danika; Wilton, Alan; Shearman, Jeremy; Musiani, Marco; Cargill, Michelle; Jones, Paul G; Qian, Zuwei; Huang, Wei; Ding, Zhao-Li; Zhang, Ya-Ping; Bustamante, Carlos D; Ostrander, Elaine A; Novembre, John; Wayne, Robert K

    2010-04-08

    Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data. Furthermore, we find a surprising correspondence between genetic and phenotypic/functional breed groupings but there are exceptions that suggest phenotypic diversification depended in part on the repeated crossing of individuals with novel phenotypes. Our results show that Middle Eastern wolves were a critical source of genome diversity, although interbreeding with local wolf populations clearly occurred elsewhere in the early history of specific lineages. More recently, the evolution of modern dog breeds seems to have been an iterative process that drew on a limited genetic toolkit to create remarkable phenotypic diversity.

  18. DNA Editing of LTR Retrotransposons Reveals the Impact of APOBECs on Vertebrate Genomes

    PubMed Central

    Knisbacher, Binyamin A.; Levanon, Erez Y.

    2016-01-01

    Long terminal repeat (LTR) retrotransposons are widespread in vertebrates and their dynamism facilitates genome evolution. However, these endogenous retroviruses (ERVs) must be restricted to maintain genomic stability. The APOBECs, a protein family that can edit C-to-U in DNA, do so by interfering with reverse transcription and hypermutating retrotransposon DNA. In some cases, a retrotransposon may integrate into the genome despite being hypermutated. Such an event introduces a unique sequence into the genome, increasing retrotransposon diversity and the probability of developing new function at the locus of insertion. The prevalence of this phenomenon and its effects on vertebrate genomes are still unclear. In this study, we screened ERV sequences in the genomes of 123 diverse species and identified hundreds of thousands of edited sites in multiple vertebrate lineages, including placental mammals, marsupials, and birds. Numerous edited ERVs carry high mutation loads, some with greater than 350 edited sites, profoundly damaging their open-reading frames. For many of the species studied, this is the first evidence that APOBECs are active players in their innate immune system. Unexpectedly, some birds, especially zebra finch and medium ground-finch (one of Darwin’s finches), are exceptionally enriched in DNA editing. We demonstrate that edited retrotransposons may be preferentially retained in active genomic regions, as reflected by their enrichment in genes, exons, promoters, and transcription start sites, thereby raising the probability of their exaptation for novel function. In conclusion, DNA editing of retrotransposons by APOBECs has a substantial role in vertebrate innate immunity and may boost genome evolution. PMID:26541172

  19. The Human Genome Diversity Project

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cavalli-Sforza, L.

    1994-12-31

    The Human Genome Diversity Project (HGD Project) is an international anthropology project that seeks to study the genetic richness of the entire human species. This kind of genetic information can add a unique thread to the tapestry of knowledge of humanity. Culture, environment, history, and other factors are often more important, but humanity's genetic heritage, when analyzed with recent technology, brings another type of evidence for understanding the species' past and present. The Project will deepen the understanding of this genetic richness and show both humanity's diversity and its deep and underlying unity. The HGD Project is still largely in its planning stages, seeking the best ways to reach its goals. The continuing discussions of the Project, throughout the world, should improve the plans for the Project and their implementation. The Project is as global as humanity itself; its implementation will require the kinds of partnerships among different nations and cultures that make the involvement of UNESCO and other international organizations particularly appropriate. The author will briefly discuss the Project's history, describe the Project, set out the core principles of the Project, and demonstrate how the Project will help combat the scourge of racism.

  20. Genomes by design

    PubMed Central

    Haimovich, Adrian D.; Muir, Paul; Isaacs, Farren J.

    2016-01-01

    Next-generation DNA sequencing has revealed the complete genome sequences of numerous organisms, establishing a fundamental and growing understanding of genetic variation and phenotypic diversity. Engineering at the gene, network and whole-genome scale aims to introduce targeted genetic changes both to explore emergent phenotypes and to introduce new functionalities. Expansion of these approaches into massively parallel platforms establishes the ability to generate targeted genome modifications, elucidating causal links between genotype and phenotype, as well as the ability to design and reprogramme organisms. In this Review, we explore techniques and applications in genome engineering, outlining key advances and defining challenges. PMID:26260262

  1. Diverse patterns of genomic targeting by transcriptional regulators in Drosophila melanogaster.

    PubMed

    Slattery, Matthew; Ma, Lijia; Spokony, Rebecca F; Arthur, Robert K; Kheradpour, Pouya; Kundaje, Anshul; Nègre, Nicolas; Crofts, Alex; Ptashkin, Ryan; Zieba, Jennifer; Ostapenko, Alexander; Suchy, Sarah; Victorsen, Alec; Jameel, Nader; Grundstad, A Jason; Gao, Wenxuan; Moran, Jennifer R; Rehm, E Jay; Grossman, Robert L; Kellis, Manolis; White, Kevin P

    2014-07-01

    Annotation of regulatory elements and identification of the transcription-related factors (TRFs) targeting these elements are key steps in understanding how cells interpret their genetic blueprint and their environment during development, and how that process goes awry in the case of disease. One goal of the modENCODE (model organism ENCyclopedia of DNA Elements) Project is to survey a diverse sampling of TRFs, both DNA-binding and non-DNA-binding factors, to provide a framework for the subsequent study of the mechanisms by which transcriptional regulators target the genome. Here we provide an updated map of the Drosophila melanogaster regulatory genome based on the location of 84 TRFs at various stages of development. This regulatory map reveals a variety of genomic targeting patterns, including factors with strong preferences toward proximal promoter binding, factors that target intergenic and intronic DNA, and factors with distinct chromatin state preferences. The data also highlight the stringency of the Polycomb regulatory network, and show association of the Trithorax-like (Trl) protein with hotspots of DNA binding throughout development. Furthermore, the data identify more than 5800 instances in which TRFs target DNA regions with demonstrated enhancer activity. Regions of high TRF co-occupancy are more likely to be associated with open enhancers used across cell types, while lower TRF occupancy regions are associated with complex enhancers that are also regulated at the epigenetic level. Together these data serve as a resource for the research community in the continued effort to dissect transcriptional regulatory mechanisms directing Drosophila development. © 2014 Slattery et al.; Published by Cold Spring Harbor Laboratory Press.

  2. Integrated genomics of Mucorales reveals novel therapeutic targets

    USDA-ARS's Scientific Manuscript database

    Mucormycosis is a life-threatening infection caused by Mucorales fungi. We sequenced 30 fungal genomes and performed transcriptomics with three representative Rhizopus and Mucor strains with human airway epithelial cells during fungal invasion to reveal key host and fungal determinants contributing ...

  3. Genomic diversity in switchgrass (Panicum virgatum): from the continental scale to a dune landscape

    PubMed Central

    Morris, Geoffrey P.; Grabowski, Paul; Borevitz, Justin O.

    2011-01-01

    Connecting broad-scale patterns of genetic variation and population structure to genetic diversity on a landscape is a key step towards understanding historical processes of migration and adaptation. New genomic approaches can be used to increase the resolution of phylogeographic studies while reducing locus sampling effects and circumventing ascertainment bias. Here, we use a novel approach based on high-throughput sequencing to characterize genetic diversity in complete chloroplast genomes and >10,000 nuclear loci in switchgrass, across a continental and landscape scale. Switchgrass is a North American tallgrass species, which is widely used in conservation and perennial biomass production, and shows strong ecotypic adaptation and population structure across the continental range. We sequenced 40.9 billion base pairs from 24 individuals from across the species’ range and 20 individuals from the Indiana Dunes. Analysis of plastome sequence revealed 203 variable SNP sites that define eight haplogroups, which are differentiated by 4 to 127 SNPs and confirmed by patterns of indel variation. These include three deeply divergent haplogroups, which correspond to the previously described lowland-upland ecotypic split and a novel upland haplogroup split that dates to the mid-Pleistocene. Most of the plastome haplogroup diversity present in the northern switchgrass range, including in the Indiana Dunes, originated in the mid- or upper-Pleistocene prior to the most recent postglacial recolonization. Furthermore, a recently colonized landscape feature (~150 ya) in the Indiana Dunes contains several deeply divergent upland haplogroups. Nuclear markers also support a deep lowland-upland split, followed by limited gene flow, and show extensive gene flow in the local population of the Indiana Dunes. PMID:22060816

  4. Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

    DOE PAGES

    Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.; ...

    2018-01-09

    The fungal genus Aspergillus is highly interesting, containing everything from industrial cell factories and model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus, suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.

  5. Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.

    The fungal genus Aspergillus is highly interesting, containing everything from industrial cell factories and model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus, suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.

  6. The genomes and comparative genomics of Lactobacillus delbrueckii phages.

    PubMed

    Riipinen, Katja-Anneli; Forsman, Päivi; Alatossava, Tapani

    2011-07-01

    Lactobacillus delbrueckii phages are a great source of genetic diversity. Here, the genome sequences of Lb. delbrueckii phages LL-Ku, c5 and JCL1032 were analyzed in detail, and the genetic diversity of Lb. delbrueckii phages belonging to different taxonomic groups was explored. The lytic isometric group b phages LL-Ku (31,080 bp) and c5 (31,841 bp) showed a minimum nucleotide sequence identity of 90% over about three-fourths of their genomes. The genomic locations of their lysis modules were unique, and the genomes featured several putative overlapping transcription units of genes. LL-Ku and c5 virions displayed peptidoglycan hydrolytic activity associated with a ~36-kDa protein similar in size to the endolysin. Unexpectedly, the 49,433-bp genome of the prolate phage JCL1032 (temperate, group c) revealed a conserved gene order within its structural genes. Lb. delbrueckii phages representing groups a (a phage LL-H), b and c possessed only limited protein sequence homology. Genomic comparison of LL-Ku and c5 suggested that diversification of Lb. delbrueckii phages is mainly due to insertions, deletions and recombination. For the first time, the complete genome sequences of group b and c Lb. delbrueckii phages are reported.

  7. Analysis of the Saccharomyces cerevisiae pan-genome reveals a pool of copy number variants distributed in diverse yeast strains from differing industrial environments

    PubMed Central

    Dunn, Barbara; Richter, Chandra; Kvitek, Daniel J.; Pugh, Tom; Sherlock, Gavin

    2012-01-01

    Although the budding yeast Saccharomyces cerevisiae is arguably one of the most well-studied organisms on earth, the genome-wide variation within this species—i.e., its “pan-genome”—has been less explored. We created a multispecies microarray platform containing probes covering the genomes of several Saccharomyces species: S. cerevisiae, including regions not found in the standard laboratory S288c strain, as well as the mitochondrial and 2-μm circle genomes–plus S. paradoxus, S. mikatae, S. kudriavzevii, S. uvarum, S. kluyveri, and S. castellii. We performed array-Comparative Genomic Hybridization (aCGH) on 83 different S. cerevisiae strains collected across a wide range of habitats; of these, 69 were commercial wine strains, while the remaining 14 were from a diverse set of other industrial and natural environments. We observed interspecific hybridization events, introgression events, and pervasive copy number variation (CNV) in all but a few of the strains. These CNVs were distributed throughout the strains such that they did not produce any clear phylogeny, suggesting extensive mating in both industrial and wild strains. To validate our results and to determine whether apparently similar introgressions and CNVs were identical by descent or recurrent, we also performed whole-genome sequencing on nine of these strains. These data may help pinpoint genomic regions involved in adaptation to different industrial milieus, as well as shed light on the course of domestication of S. cerevisiae. PMID:22369888

  8. Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

    PubMed

    Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

    2011-01-01

    Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog, in contrast to elephant and armadillo, which showed high centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.

  9. Phylum-wide comparative genomics unravel the diversity of secondary metabolism in Cyanobacteria

    DOE PAGES

    Calteau, Alexandra; Fewer, David P.; Latifi, Amel; ...

    2014-11-18

    Cyanobacteria are an ancient lineage of photosynthetic bacteria from which hundreds of natural products have been described, including many notorious toxins but also potent natural products of interest to the pharmaceutical and biotechnological industries. Many of these compounds are the products of non-ribosomal peptide synthetase (NRPS) or polyketide synthase (PKS) pathways. However, current understanding of the diversification of these pathways is largely based on the chemical structure of the bioactive compounds, while the evolutionary forces driving their remarkable chemical diversity are poorly understood. We carried out a phylum-wide investigation of genetic diversification of the cyanobacterial NRPS and PKS pathways for the production of bioactive compounds. 452 NRPS and PKS gene clusters were identified from 89 cyanobacterial genomes, revealing a clear burst in late-branching lineages. Our genomic analysis further grouped the clusters into 286 highly diversified cluster families (CF) of pathways. Some CFs appeared vertically inherited, while others presented a more complex evolutionary history. Only a few horizontal gene transfers were evidenced amongst strongly conserved CFs in the phylum, while several others have undergone drastic gene shuffling events, which could result in the observed diversification of the pathways. In addition to toxin production, several NRPS and PKS gene clusters are devoted to important cellular processes of these bacteria such as nitrogen fixation and iron uptake. The majority of the biosynthetic clusters identified here have unknown end products, highlighting the power of genome mining for the discovery of new natural products.

  10. Phylum-wide comparative genomics unravel the diversity of secondary metabolism in Cyanobacteria

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Calteau, Alexandra; Fewer, David P.; Latifi, Amel

    Cyanobacteria are an ancient lineage of photosynthetic bacteria from which hundreds of natural products have been described, including many notorious toxins but also potent natural products of interest to the pharmaceutical and biotechnological industries. Many of these compounds are the products of non-ribosomal peptide synthetase (NRPS) or polyketide synthase (PKS) pathways. However, current understanding of the diversification of these pathways is largely based on the chemical structure of the bioactive compounds, while the evolutionary forces driving their remarkable chemical diversity are poorly understood. We carried out a phylum-wide investigation of genetic diversification of the cyanobacterial NRPS and PKS pathways for the production of bioactive compounds. 452 NRPS and PKS gene clusters were identified from 89 cyanobacterial genomes, revealing a clear burst in late-branching lineages. Our genomic analysis further grouped the clusters into 286 highly diversified cluster families (CF) of pathways. Some CFs appeared vertically inherited, while others presented a more complex evolutionary history. Only a few horizontal gene transfers were evidenced amongst strongly conserved CFs in the phylum, while several others have undergone drastic gene shuffling events, which could result in the observed diversification of the pathways. In addition to toxin production, several NRPS and PKS gene clusters are devoted to important cellular processes of these bacteria such as nitrogen fixation and iron uptake. The majority of the biosynthetic clusters identified here have unknown end products, highlighting the power of genome mining for the discovery of new natural products.

  11. Genome-wide linkage disequilibrium and genetic diversity in five populations of Australian domestic sheep.

    PubMed

    Al-Mamun, Hawlader Abdullah; Clark, Samuel A; Kwan, Paul; Gondro, Cedric

    2015-11-24

    Knowledge of the genetic structure and overall diversity of livestock species is important to maximise the potential of genome-wide association studies and genomic prediction. Commonly used measures such as linkage disequilibrium (LD), effective population size (Ne), heterozygosity, fixation index (FST) and runs of homozygosity (ROH) are widely used and help to improve our knowledge about genetic diversity in animal populations. The development of high-density single nucleotide polymorphism (SNP) arrays and the subsequent genotyping of large numbers of animals have greatly increased the accuracy of these population-based estimates. In this study, we used the Illumina OvineSNP50 BeadChip array to estimate and compare LD (measured by r² and D'), Ne, heterozygosity, FST and ROH in five Australian sheep populations: three pure breeds, i.e., Merino (MER), Border Leicester (BL), Poll Dorset (PD) and two crossbred populations, i.e., F1 crosses of Merino and Border Leicester (MxB) and MxB crossed to Poll Dorset (MxBxP). Compared to other livestock species, the sheep populations that were analysed in this study had low levels of LD and high levels of genetic diversity. The rate of LD decay was greater in Merino than in the other pure breeds. Over short distances (<10 kb), the levels of LD were higher in BL and PD than in MER. Similarly, BL and PD had comparatively smaller Ne than MER. Observed heterozygosity in the pure breeds ranged from 0.3 in BL to 0.38 in MER. Genetic distances between breeds were modest compared to other livestock species (highest FST = 0.063) but the genetic diversity within breeds was high. Based on ROH, two chromosomal regions showed evidence of strong recent selection. This study shows that there is a large range of genome diversity in Australian sheep breeds, especially in Merino sheep. The observed range of diversity will influence the design of genome-wide association studies and the results that can be obtained from them. This
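
    The two LD measures named in the record above, r² and D', can be illustrated with a minimal sketch. It assumes two biallelic loci with known haplotype and allele frequencies; the function name `ld_stats` and the example frequencies are hypothetical and are not taken from the study, which estimated LD from SNP array genotypes.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    All inputs are illustrative assumptions, not data from the abstract.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")

    # Raw disequilibrium coefficient: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b

    # The largest |D| attainable for these allele frequencies depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))

    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies for two SNPs.
    print(ld_stats(p_ab=0.40, p_a=0.50, p_b=0.50))
```

    With complete association (p_ab = p_a = p_b = 0.5) both measures equal 1, and with independent loci (p_ab = p_a * p_b) both equal 0.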

  12. Bacteriophages of Gordonia spp. Display a Spectrum of Diversity and Genetic Relationships.

    PubMed

    Pope, Welkin H; Mavrich, Travis N; Garlena, Rebecca A; Guerrero-Bustamante, Carlos A; Jacobs-Sera, Deborah; Montgomery, Matthew T; Russell, Daniel A; Warner, Marcie H; Hatfull, Graham F

    2017-08-15

    The global bacteriophage population is large, dynamic, old, and highly diverse genetically. Many phages are tailed and contain double-stranded DNA, but these remain poorly characterized genomically. A collection of over 1,000 phages infecting Mycobacterium smegmatis reveals the diversity of phages of a common bacterial host, but their relationships to phages of phylogenetically proximal hosts are not known. Comparative sequence analysis of 79 phages isolated on Gordonia shows these also to be diverse and that the phages can be grouped into 14 clusters of related genomes, with an additional 14 phages that are "singletons" with no closely related genomes. One group of six phages is closely related to Cluster A mycobacteriophages, but the other Gordonia phages are distant relatives and share only 10% of their genes with the mycobacteriophages. The Gordonia phage genomes vary in genome length (17.1 to 103.4 kb), percentage of GC content (47 to 68.8%), and genome architecture and contain a variety of features not seen in other phage genomes. Like the mycobacteriophages, the highly mosaic Gordonia phages demonstrate a spectrum of genetic relationships. We show this is a general property of bacteriophages and suggest that any barriers to genetic exchange are soft and readily violable. IMPORTANCE Despite the numerical dominance of bacteriophages in the biosphere, there is a dearth of complete genomic sequences. Current genomic information reveals that phages are highly diverse genomically and have mosaic architectures formed by extensive horizontal genetic exchange. Comparative analysis of 79 phages of Gordonia shows them to not only be highly diverse, but to present a spectrum of relatedness. Most are distantly related to phages of the phylogenetically proximal host Mycobacterium smegmatis, although one group of Gordonia phages is more closely related to mycobacteriophages than to the other Gordonia phages. Phage genome sequence space remains largely unexplored, but
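
    The percentage of GC content reported in the record above is a simple per-genome statistic. The following is a minimal sketch of how it could be computed from a nucleotide string; the function name `gc_percent` and the toy sequence are hypothetical, and ambiguous bases such as N are assumed to be excluded from the denominator.

```python
def gc_percent(seq):
    """Return the GC content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; anything else
    (for example N) is ignored. This convention is an assumption of the sketch.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real phage genome.
    print(gc_percent("ATGCGCGTTAACCGG"))
```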

  13. Phylogenomics reveals an extensive history of genome duplication in diatoms (Bacillariophyta).

    PubMed

    Parks, Matthew B; Nakov, Teofil; Ruck, Elizabeth C; Wickett, Norman J; Alverson, Andrew J

    2018-03-01

    Diatoms are one of the most species-rich lineages of microbial eukaryotes. Similarities in clade age, species richness, and primary productivity motivate comparisons to angiosperms, whose genomes have been inordinately shaped by whole-genome duplication (WGD). WGDs have been linked to speciation, increased rates of lineage diversification, and identified as a principal driver of angiosperm evolution. We synthesized a large but scattered body of evidence that suggests polyploidy may be common in diatoms as well. We used gene counts, gene trees, and distributions of synonymous divergence to carry out a phylogenomic analysis of WGD across a diverse set of 37 diatom species. Several methods identified WGDs of varying age across diatoms. Determining the occurrence, exact number, and placement of events was greatly impacted by uncertainty in gene trees. WGDs inferred from synonymous divergence of paralogs varied depending on how redundancy in transcriptomes was assessed, gene families were assembled, and synonymous distances (Ks) were calculated. Our results highlighted a need for systematic evaluation of key methodological aspects of Ks-based approaches to WGD inference. Gene tree reconciliations supported allopolyploidy as the predominant mode of polyploid formation, with strong evidence for ancient allopolyploid events in the thalassiosiroid and pennate diatom clades. Our results suggest that WGD has played a major role in the evolution of diatom genomes. We outline challenges in reconstructing paleopolyploid events in diatoms that, together with these results, offer a framework for understanding the impact of genome duplication in a group that likely harbors substantial genomic diversity. © 2018 The Authors. American Journal of Botany is published by Wiley Periodicals, Inc. on behalf of the Botanical Society of America.

  14. Genomic and Resistance Gene Homolog Diversity of the Dominant Tallgrass Prairie Species across the U.S. Great Plains Precipitation Gradient

    PubMed Central

    Rouse, Matthew N.; Saleh, Amgad A.; Seck, Amadou; Keeler, Kathleen H.; Travers, Steven E.; Hulbert, Scot H.; Garrett, Karen A.

    2011-01-01

    Background Environmental variables such as moisture availability are often important in determining species prevalence and intraspecific diversity. The population genetic structure of dominant plant species in response to a cline of these variables has rarely been addressed. We evaluated the spatial genetic structure and diversity of Andropogon gerardii populations across the U.S. Great Plains precipitation gradient, ranging from approximately 48 cm/year to 105 cm/year. Methodology/Principal Findings Genomic diversity was evaluated with AFLP markers and diversity of a disease resistance gene homolog was evaluated by PCR-amplification and digestion with restriction enzymes. We determined the degree of spatial genetic structure using Mantel tests. Genomic and resistance gene homolog diversity were evaluated across prairies using Shannon's index and by averaging haplotype dissimilarity. Trends in diversity across prairies were determined using linear regression of diversity on average precipitation for each prairie. We identified significant spatial genetic structure, with genomic similarity decreasing as a function of distance between samples. However, our data indicated that genome-wide diversity did not vary consistently across the precipitation gradient. In contrast, we found that disease resistance gene homolog diversity was positively correlated with precipitation. Significance Prairie remnants differ in the genetic resources they maintain. Selection and evolution in this disease resistance homolog is environmentally dependent. Overall, we found that, though this environmental gradient may not predict genomic diversity, individual traits such as disease resistance genes may vary significantly. PMID:21532756
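
    The abstract above evaluates diversity with Shannon's index. As an illustration only, a minimal sketch of that index, H = -sum(p_i * ln p_i) over haplotype frequencies, could look like the following. The function name and the haplotype labels are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def shannon_index(haplotypes):
    """Return Shannon's index H = -sum(p_i * ln p_i) for a list of haplotype labels."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    # Each p_i is the relative frequency of one distinct haplotype.
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical haplotype calls for two samples (not data from the study).
even_sample = ["A", "B", "C", "D"]    # four haplotypes at equal frequency
skewed_sample = ["A", "A", "A", "B"]  # one dominant haplotype

print(shannon_index(even_sample))
print(shannon_index(skewed_sample))
```

    An even sample gives the maximum value for its number of haplotypes (ln 4 for four equally frequent haplotypes), while a sample dominated by one haplotype scores lower.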

  15. Platyhelminth Venom Allergen-Like (VAL) proteins: revealing structural diversity, class-specific features and biological associations across the phylum

    PubMed Central

    CHALMERS, IAIN W.; HOFFMANN, KARL F.

    2012-01-01

    SUMMARY During platyhelminth infection, a cocktail of proteins is released by the parasite to aid invasion, initiate feeding, facilitate adaptation and mediate modulation of the host immune response. Included amongst these proteins is the Venom Allergen-Like (VAL) family, part of the larger sperm coating protein/Tpx-1/Ag5/PR-1/Sc7 (SCP/TAPS) superfamily. To explore the significance of this protein family during Platyhelminthes development and host interactions, we systematically summarize all published proteomic, genomic and immunological investigations of the VAL protein family to date. By conducting new genomic and transcriptomic interrogations to identify over 200 VAL proteins (228) from species in all 4 traditional taxonomic classes (Trematoda, Cestoda, Monogenea and Turbellaria), we further expand our knowledge related to platyhelminth VAL diversity across the phylum. Subsequent phylogenetic and tertiary structural analyses reveal several class-specific VAL features, which likely indicate a range of roles mediated by this protein family. Our comprehensive analysis of platyhelminth VALs represents a unifying synopsis for understanding diversity within this protein family and a firm context in which to initiate future functional characterization of these enigmatic members. PMID:22717097

  16. Genetic diversity analysis of two commercial breeds of pigs using genomic and pedigree data.

    PubMed

    Zanella, Ricardo; Peixoto, Jane O; Cardoso, Fernando F; Cardoso, Leandro L; Biegelmeyer, Patrícia; Cantão, Maurício E; Otaviano, Antonio; Freitas, Marcelo S; Caetano, Alexandre R; Ledur, Mônica C

    2016-03-30

    Genetic improvement in livestock populations can be achieved without significantly affecting genetic diversity if mating systems and selection decisions take genetic relationships among individuals into consideration. The objective of this study was to examine the genetic diversity of two commercial breeds of pigs. Genotypes from 1168 Landrace (LA) and 1094 Large White (LW) animals from a commercial breeding program in Brazil were obtained using the Illumina PorcineSNP60 Beadchip. Inbreeding estimates based on pedigree (F_x) and genomic information using runs of homozygosity (F_ROH) and the single nucleotide polymorphisms (SNP) by SNP inbreeding coefficient (F_SNP) were obtained. Linkage disequilibrium (LD), correlation of linkage phase (r) and effective population size (N_e) were also estimated. Estimates of inbreeding obtained with pedigree information were lower than those obtained with genomic data in both breeds. We observed that the extent of LD was slightly larger at shorter distances between SNPs in the LW population than in the LA population, which indicates that the LW population was derived from a smaller N_e. Estimates of N_e based on genomic data were equal to 53 and 40 for the current populations of LA and LW, respectively. The correlation of linkage phase between the two breeds was equal to 0.77 at distances up to 50 kb, which suggests that genome-wide association and selection should be performed within breed. Although selection intensities have been stronger in the LA breed than in the LW breed, levels of genomic and pedigree inbreeding were lower for the LA than for the LW breed. The use of genomic data to evaluate population diversity in livestock animals can provide new and more precise insights about the effects of intense selection for production traits. Resulting information and knowledge can be used to effectively increase response to selection by appropriately managing the rate of inbreeding, minimizing negative effects of inbreeding
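
    The genomic inbreeding coefficient based on runs of homozygosity is commonly defined as the fraction of the covered genome that lies inside such runs. The sketch below illustrates that definition under stated assumptions: the segment coordinates and genome length are hypothetical, and the record does not say how the authors detected runs or which genome length they used.

```python
def f_roh(roh_segments, genome_length_bp):
    """Return the fraction of the genome covered by runs of homozygosity.

    roh_segments: list of (start_bp, end_bp) tuples, assumed non-overlapping.
    genome_length_bp: length of the genome covered by markers, in base pairs.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    total_roh = sum(end - start for start, end in roh_segments)
    return total_roh / genome_length_bp


# Hypothetical animal: two runs totalling 3 Mb on a 30 Mb toy genome.
segments = [(0, 1_000_000), (5_000_000, 7_000_000)]
print(f_roh(segments, 30_000_000))
```

    Because the estimate depends only on the animal's own genotypes, it can capture inbreeding that an incomplete pedigree misses, which is consistent with the pedigree estimates being lower than the genomic ones in this study.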

  17. Camelid genomes reveal evolution and adaptation to desert environments.

    PubMed

    Wu, Huiguang; Guang, Xuanmin; Al-Fageeh, Mohamed B; Cao, Junwei; Pan, Shengkai; Zhou, Huanmin; Zhang, Li; Abutarboush, Mohammed H; Xing, Yanping; Xie, Zhiyuan; Alshanqeeti, Ali S; Zhang, Yanru; Yao, Qiulin; Al-Shomrani, Badr M; Zhang, Dong; Li, Jiang; Manee, Manee M; Yang, Zili; Yang, Linfeng; Liu, Yiyi; Zhang, Jilin; Altammami, Musaad A; Wang, Shenyuan; Yu, Lili; Zhang, Wenbin; Liu, Sanyang; Ba, La; Liu, Chunxia; Yang, Xukui; Meng, Fanhua; Wang, Shaowei; Li, Lu; Li, Erli; Li, Xueqiong; Wu, Kaifeng; Zhang, Shu; Wang, Junyi; Yin, Ye; Yang, Huanming; Al-Swailem, Abdulaziz M; Wang, Jun

    2014-10-21

    Bactrian camel (Camelus bactrianus), dromedary (Camelus dromedarius) and alpaca (Vicugna pacos) are economically important livestock. Although the Bactrian camel and dromedary are large, typically arid-desert-adapted mammals, alpacas are adapted to plateaus. Here we present high-quality genome sequences of these three species. Our analysis reveals the demographic history of these species since the Tortonian Stage of the Miocene and uncovers a striking correlation between large fluctuations in population size and geological time boundaries. Comparative genomic analysis reveals complex features related to desert adaptations, including fat and water metabolism, stress responses to heat, aridity, intense ultraviolet radiation and choking dust. Transcriptomic analysis of Bactrian camels further reveals unique osmoregulation, osmoprotection and compensatory mechanisms for water reservation underpinned by high blood glucose levels. We hypothesize that these physiological mechanisms represent kidney evolutionary adaptations to the desert environment. This study advances our understanding of camelid evolution and the adaptation of camels to arid-desert environments.

  18. The genome landscape of indigenous African cattle.

    PubMed

    Kim, Jaemin; Hanotte, Olivier; Mwai, Okeyo Ally; Dessie, Tadelle; Bashir, Salim; Diallo, Boubacar; Agaba, Morris; Kim, Kwondo; Kwak, Woori; Sung, Samsun; Seo, Minseok; Jeong, Hyeonsoo; Kwon, Taehyung; Taye, Mengistie; Song, Ki-Duk; Lim, Dajeong; Cho, Seoae; Lee, Hyun-Jeong; Yoon, Duhak; Oh, Sung Jong; Kemp, Stephen; Lee, Hak-Kyo; Kim, Heebal

    2017-02-20

    The history of African indigenous cattle and their adaptation to environmental and human selection pressure is at the root of their remarkable diversity. Characterization of this diversity is an essential step towards understanding the genomic basis of productivity and adaptation to survival under African farming systems. We analyze patterns of African cattle genetic variation by sequencing 48 genomes from five indigenous populations and comparing them to the genomes of 53 commercial taurine breeds. We find the highest genetic diversity among African zebu and sanga cattle. Our search for genomic regions under selection reveals signatures of selection for environmental adaptive traits. In particular, we identify signatures of selection including genes and/or pathways controlling anemia and feeding behavior in the trypanotolerant N'Dama, coat color and horn development in Ankole, and heat tolerance and tick resistance across African cattle especially in zebu breeds. Our findings unravel at the genome-wide level, the unique adaptive diversity of African cattle while emphasizing the opportunities for sustainable improvement of livestock productivity on the continent.

  19. Evolution of floral diversity: genomics, genes and gamma

    PubMed Central

    Berger, Brent A.; Howarth, Dianella G.; Soltis, Douglas E.

    2017-01-01

    A salient feature of flowering plant diversification is the emergence of a novel suite of floral features coinciding with the origin of the most species-rich lineage, Pentapetalae. Advances in phylogenetics, developmental genetics and genomics, including new analyses presented here, are helping to reconstruct the specific evolutionary steps involved in the evolution of this clade. The enormous floral diversity among Pentapetalae appears to be built on a highly conserved ground plan of five-parted (pentamerous) flowers with whorled phyllotaxis. By contrast, lability in the number and arrangement of component parts of the flower characterize the early-diverging eudicot lineages subtending Pentapetalae. The diversification of Pentapetalae also coincides closely with ancient hexaploidy, referred to as the gamma whole-genome triplication, for which the phylogenetic timing, mechanistic details and molecular evolutionary consequences are as yet not fully resolved. Transcription factors regulating floral development often persist in duplicate or triplicate in gamma-derived genomes, and both individual genes and whole transcriptional programmes exhibit a shift from broadly overlapping to tightly defined expression domains in Pentapetalae flowers. Investigations of these changes associated with the origin of Pentapetalae can lead to a more comprehensive understanding of what is arguably one of the most important evolutionary diversification events within terrestrial plants. This article is part of the themed issue ‘Evo-devo in the genomics era, and the origins of morphological diversity’. PMID:27994132

  20. Transcriptome sequencing reveals genome-wide variation in molecular evolutionary rate among ferns.

    PubMed

    Grusz, Amanda L; Rothfels, Carl J; Schuettpelz, Eric

    2016-08-30

    Transcriptomics in non-model plant systems has recently reached a point where the examination of nuclear genome-wide patterns in understudied groups is an achievable reality. This progress is especially notable in evolutionary studies of ferns, for which molecular resources to date have been derived primarily from the plastid genome. Here, we utilize transcriptome data in the first genome-wide comparative study of molecular evolutionary rate in ferns. We focus on the ecologically diverse family Pteridaceae, which comprises about 10% of fern diversity and includes the enigmatic vittarioid ferns, an epiphytic, tropical lineage known for dramatically reduced morphologies and radically elongated phylogenetic branch lengths. Using expressed sequence data for 2091 loci, we perform pairwise comparisons of molecular evolutionary rate among 12 species spanning the three largest clades in the family and ask whether previously documented heterogeneity in plastid substitution rates is reflected in their nuclear genomes. We then inquire whether variation in evolutionary rate is being shaped by genes belonging to specific functional categories and test for differential patterns of selection. We find significant, genome-wide differences in evolutionary rate for vittarioid ferns relative to all other lineages within the Pteridaceae, but we recover few significant correlations between faster/slower vittarioid loci and known functional gene categories. We demonstrate that the faster rates characteristic of the vittarioid ferns are likely not driven by positive selection, nor are they unique to any particular type of nucleotide substitution. Our results reinforce recently reviewed mechanisms hypothesized to shape molecular evolutionary rates in vittarioid ferns and provide novel insight into substitution rate variation both within and among fern nuclear genomes.

  1. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds.

    PubMed

    Xu, Yao; Jiang, Yu; Shi, Tao; Cai, Hanfang; Lan, Xianyong; Zhao, Xin; Plath, Martin; Chen, Hong

    2017-01-01

    Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for the cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study, we for the first time sequenced the genomes of four Nanyang and four Qinchuan cattle at 10- to 12-fold depth, covering on average 97.86% and 98.98% of the genome, respectively. Comparison with the Bos_taurus_UMD_3.1 reference assembly yielded 9,010,096 SNPs for Nanyang, and 6,965,062 for Qinchuan cattle, 51% and 29% of which were novel SNPs, respectively. A total of 154,934 and 115,032 small indels (1 to 3 bp) were found in the Nanyang and Qinchuan genomes, respectively. The SNP and indel distribution revealed that Nanyang showed higher genetic diversity than Qinchuan cattle. Furthermore, a total of 2,907 putative cases of copy number variation (CNV) were identified by aligning the Nanyang to the Qinchuan genome, 783 of which (27%) encompassed the coding regions of 495 functional genes. The gene ontology (GO) analysis revealed that many CNV genes were enriched in the immune system and environment adaptability. Among several CNV genes related to lipid transport and fat metabolism, the Leptin receptor gene (LEPR) overlapping with CNV_1815 showed remarkably higher copy number in Qinchuan than Nanyang (log2 (ratio) = -2.34988; P value = 1.53E-102). Further qPCR and association analysis showed that the copy number of the LEPR gene presented positive correlations with transcriptional expression and phenotypic traits, suggesting the LEPR CNV may contribute to the higher fat deposition in muscles of Qinchuan cattle. Our findings provide evidence that the distinct phenotypes of Nanyang and Qinchuan breeds may be due to the different genetic variations including SNPs, indels

  2. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds

    PubMed Central

    Jiang, Yu; Shi, Tao; Cai, Hanfang; Lan, Xianyong; Zhao, Xin; Plath, Martin; Chen, Hong

    2017-01-01

    Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for the cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study, we for the first time sequenced the genomes of four Nanyang and four Qinchuan cattle at 10- to 12-fold depth, covering on average 97.86% and 98.98% of the genome, respectively. Comparison with the Bos_taurus_UMD_3.1 reference assembly yielded 9,010,096 SNPs for Nanyang, and 6,965,062 for Qinchuan cattle, 51% and 29% of which were novel SNPs, respectively. A total of 154,934 and 115,032 small indels (1 to 3 bp) were found in the Nanyang and Qinchuan genomes, respectively. The SNP and indel distribution revealed that Nanyang showed higher genetic diversity than Qinchuan cattle. Furthermore, a total of 2,907 putative cases of copy number variation (CNV) were identified by aligning the Nanyang to the Qinchuan genome, 783 of which (27%) encompassed the coding regions of 495 functional genes. The gene ontology (GO) analysis revealed that many CNV genes were enriched in the immune system and environment adaptability. Among several CNV genes related to lipid transport and fat metabolism, the Leptin receptor gene (LEPR) overlapping with CNV_1815 showed remarkably higher copy number in Qinchuan than Nanyang (log2 (ratio) = -2.34988; P value = 1.53E-102). Further qPCR and association analysis showed that the copy number of the LEPR gene presented positive correlations with transcriptional expression and phenotypic traits, suggesting the LEPR CNV may contribute to the higher fat deposition in muscles of Qinchuan cattle. Our findings provide evidence that the distinct phenotypes of Nanyang and Qinchuan breeds may be due to the different genetic variations including SNPs, indels
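
    The log2 ratio reported for LEPR can be converted into a fold difference in copy number. The short calculation below is a sketch that assumes the ratio is Nanyang over Qinchuan. The abstract does not state the orientation, but that reading is consistent with the higher copy number it reports for Qinchuan.

```python
# log2(ratio) reported in the abstract for the LEPR region (CNV_1815).
log2_ratio = -2.34988

# Assumed orientation: Nanyang signal divided by Qinchuan signal.
nanyang_over_qinchuan = 2 ** log2_ratio
qinchuan_over_nanyang = 1 / nanyang_over_qinchuan

print(f"Nanyang / Qinchuan: {nanyang_over_qinchuan:.3f}")
print(f"Qinchuan / Nanyang: {qinchuan_over_nanyang:.2f}")
```

    Under that assumption, the Nanyang signal is roughly one fifth of the Qinchuan signal, which corresponds to about a five-fold higher copy number in Qinchuan.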

  3. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

    PubMed Central

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-01-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which have previously been shown to be laterally transferable among different species, such as genes coding for restriction-modification systems. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141

  4. Revealing the vectors of cellular identity with single-cell genomics

    PubMed Central

    Wagner, Allon; Regev, Aviv; Yosef, Nir

    2017-01-01

    Single-cell genomics has now made it possible to create a comprehensive atlas of human cells. At the same time, it has reopened definitions of a cell’s identity and type and of the ways in which they are regulated by the cell’s molecular circuitry. Emerging computational analysis methods, especially in single-cell RNA sequencing (scRNA-seq), have already begun to reveal, in a data-driven way, the diverse simultaneous facets of a cell’s identity, from a taxonomy of discrete cell types to continuous dynamic transitions and spatial locations. These developments will eventually allow a cell to be represented as a superposition of ‘basis vectors’, each determining a different (but possibly dependent) aspect of cellular organization and function. However, computational methods must also overcome considerable challenges—from handling technical noise and data scale to forming new abstractions of biology. As the scale of single-cell experiments continues to increase, new computational approaches will be essential for constructing and characterizing a reference map of cell identities. PMID:27824854

  5. Small Traditional Human Communities Sustain Genomic Diversity over Microgeographic Scales despite Linguistic Isolation

    PubMed Central

    Cox, Murray P.; Hudjashov, Georgi; Sim, Andre; Savina, Olga; Karafet, Tatiana M.; Sudoyo, Herawati; Lansing, J. Stephen

    2016-01-01

    At least since the Neolithic, humans have largely lived in networks of small, traditional communities. Often socially isolated, these groups evolved distinct languages and cultures over microgeographic scales of just tens of kilometers. Population genetic theory tells us that genetic drift should act quickly in such isolated groups, thus raising the question: do networks of small human communities maintain levels of genetic diversity over microgeographic scales? This question can no longer be asked in most parts of the world, which have been heavily impacted by historical events that make traditional society structures the exception. However, such studies remain possible in parts of Island Southeast Asia and Oceania, where traditional ways of life are still practiced. We captured genome-wide genetic data, together with linguistic records, for a case–study system—eight villages distributed across Sumba, a small, remote island in eastern Indonesia. More than 4,000 years after these communities were established during the Neolithic period, most speak different languages and can be distinguished genetically. Yet their nuclear diversity is not reduced, instead being comparable to other, even much larger, regional groups. Modeling reveals a separation of time scales: while languages and culture can evolve quickly, creating social barriers, sporadic migration averaged over many generations is sufficient to keep villages linked genetically. This loosely-connected network structure, once the global norm and still extant on Sumba today, provides a living proxy to explore fine-scale genome dynamics in the sort of small traditional communities within which the most recent episodes of human evolution occurred. PMID:27274003

  6. The genome of Eucalyptus grandis.

    PubMed

    Myburg, Alexander A; Grattapaglia, Dario; Tuskan, Gerald A; Hellsten, Uffe; Hayes, Richard D; Grimwood, Jane; Jenkins, Jerry; Lindquist, Erika; Tice, Hope; Bauer, Diane; Goodstein, David M; Dubchak, Inna; Poliakov, Alexandre; Mizrachi, Eshchar; Kullan, Anand R K; Hussey, Steven G; Pinard, Desre; van der Merwe, Karen; Singh, Pooja; van Jaarsveld, Ida; Silva-Junior, Orzenil B; Togawa, Roberto C; Pappas, Marilia R; Faria, Danielle A; Sansaloni, Carolina P; Petroli, Cesar D; Yang, Xiaohan; Ranjan, Priya; Tschaplinski, Timothy J; Ye, Chu-Yu; Li, Ting; Sterck, Lieven; Vanneste, Kevin; Murat, Florent; Soler, Marçal; Clemente, Hélène San; Saidi, Naijib; Cassan-Wang, Hua; Dunand, Christophe; Hefer, Charles A; Bornberg-Bauer, Erich; Kersting, Anna R; Vining, Kelly; Amarasinghe, Vindhya; Ranik, Martin; Naithani, Sushma; Elser, Justin; Boyd, Alexander E; Liston, Aaron; Spatafora, Joseph W; Dharmwardhana, Palitha; Raja, Rajani; Sullivan, Christopher; Romanel, Elisson; Alves-Ferreira, Marcio; Külheim, Carsten; Foley, William; Carocha, Victor; Paiva, Jorge; Kudrna, David; Brommonschenkel, Sergio H; Pasquali, Giancarlo; Byrne, Margaret; Rigault, Philippe; Tibbits, Josquin; Spokevicius, Antanas; Jones, Rebecca C; Steane, Dorothy A; Vaillancourt, René E; Potts, Brad M; Joubert, Fourie; Barry, Kerrie; Pappas, Georgios J; Strauss, Steven H; Jaiswal, Pankaj; Grima-Pettenati, Jacqueline; Salse, Jérôme; Van de Peer, Yves; Rokhsar, Daniel S; Schmutz, Jeremy

    2014-06-19

    Eucalypts are the world's most widely planted hardwood trees. Their outstanding diversity, adaptability and growth have made them a global renewable resource of fibre and energy. We sequenced and assembled >94% of the 640-megabase genome of Eucalyptus grandis. Of 36,376 predicted protein-coding genes, 34% occur in tandem duplications, the largest proportion thus far in plant genomes. Eucalyptus also shows the highest diversity of genes for specialized metabolites such as terpenes that act as chemical defence and provide unique pharmaceutical oils. Genome sequencing of the E. grandis sister species E. globulus and a set of inbred E. grandis tree genomes reveals dynamic genome evolution and hotspots of inbreeding depression. The E. grandis genome is the first reference for the eudicot order Myrtales and is placed here sister to the eurosids. This resource expands our understanding of the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.

  7. Analysis of genotype diversity and evolution of Dengue virus serotype 2 using complete genomes

    PubMed Central

    Waman, Vaishali P.; Kolekar, Pandurang; Ramtirthkar, Mukund R.; Kale, Mohan M.

    2016-01-01

    Background Dengue is one of the most common arboviral diseases prevalent worldwide and is caused by Dengue viruses (genus Flavivirus, family Flaviviridae). There are four serotypes of Dengue Virus (DENV-1 to DENV-4), each of which is further subdivided into distinct genotypes. DENV-2 is frequently associated with severe dengue infections and epidemics. DENV-2 consists of six genotypes: Asian/American, Asian I, Asian II, Cosmopolitan, American and sylvatic. A comparative genomic study was carried out to infer the population structure of DENV-2 and to analyze the role of evolutionary and spatiotemporal factors in the emergence of diversifying lineages. Methods Complete genome sequences of 990 strains of DENV-2 were analyzed using Bayesian-based population genetics and phylogenetic approaches to infer genetically distinct lineages. The role of spatiotemporal factors, genetic recombination and selection pressure in the evolution of DENV-2 was examined using sequence-based bioinformatics approaches. Results DENV-2 genetic structure is complex and consists of fifteen subpopulations/lineages. The Asian/American genotype is observed to be diversified into seven lineages. The Asian I, Cosmopolitan and sylvatic genotypes were each found to be subdivided into two lineages. The populations of American and Asian II genotypes were observed to be homogeneous. Significant evidence of episodic positive selection was observed in all the genes, except NS4A. Positive selection operating on a few codons in the envelope gene confers antigenic and lineage diversity in the American strains of the Asian/American genotype. Selection on codons of non-structural genes was observed to impact diversification of lineages in Asian I, Cosmopolitan and sylvatic genotypes. Evidence of intra/inter-genotype recombination was obtained and the uncertainty in classification of recombinant strains was resolved using the population genetics approach. Discussion Complete genome-based analysis revealed that the

  8. Genetic Diversity and Demographic History of Cajanus spp. Illustrated from Genome-Wide SNPs

    PubMed Central

    Saxena, Rachit K.; von Wettberg, Eric; Upadhyaya, Hari D.; Sanchez, Vanessa; Songok, Serah; Saxena, Kulbhushan; Kimurto, Paul; Varshney, Rajeev K.

    2014-01-01

    Understanding the genetic structure of Cajanus spp. is essential for achieving genetic improvement by quantitative trait loci (QTL) mapping or association studies and use of selected markers through genomic assisted breeding and genomic selection. After developing a comprehensive set of 1,616 single nucleotide polymorphisms (SNPs) and their conversion into cost effective KASPar assays for pigeonpea (Cajanus cajan), we studied levels of genetic variability both within and between a diverse set of Cajanus lines including 56 breeding lines, 21 landraces and 107 accessions from 18 wild species. These results revealed a high frequency of polymorphic SNPs and a relatively high level of cross-species transferability. Indeed, 75.8% of successful SNP assays revealed polymorphism, and more than 95% of these assays could be successfully transferred to related wild species. To show regional patterns of variation, we used STRUCTURE and Analysis of Molecular Variance (AMOVA) to partition variance among hierarchical sets of landraces and wild species at either the continental scale or within India. STRUCTURE separated most of the domesticated germplasm from wild ecotypes, and separated Australian and Asian wild species as has been found previously. Among Indian regions and states within regions, we found 36% of the variation between regions, and 64% within landraces or wilds within states. The highest level of polymorphism in wild relatives and landraces was found in Madhya Pradesh and Andhra Pradesh provinces of India representing the centre of origin and domestication of pigeonpea respectively. PMID:24533111

  9. A Gene-Oriented Haplotype Comparison Reveals Recently Selected Genomic Regions in Temperate and Tropical Maize Germplasm

    PubMed Central

    Zhang, Jie; Li, Yongxiang; Zheng, Jun; Zhang, Hongwei; Yang, Xiaohong; Wang, Jianhua; Wang, Guoying

    2017-01-01

    The extensive genetic variation present in maize (Zea mays) germplasm makes it possible to detect signatures of positive artificial selection that occurred during temperate and tropical maize improvement. Here we report an analysis of 532,815 polymorphisms from a maize association panel consisting of 368 diverse temperate and tropical inbred lines. We developed a gene-oriented approach adapting exonic polymorphisms to identify recently selected alleles by comparing haplotypes across the maize genome. This analysis revealed evidence of selection for more than 1100 genomic regions during recent improvement, and included regulatory genes and key genes with visible mutant phenotypes. We find that selected candidate target genes in temperate maize are enriched in biosynthetic processes, and further examination of these candidates highlights two cases, sucrose flux and oil storage, in which multiple genes in a common pathway can be cooperatively selected. Finally, based on available parallel gene expression data, we hypothesize that some genes were selected for regulatory variations, resulting in altered gene expression. PMID:28099470

  10. Genomic Analysis of Hospital Plumbing Reveals Diverse Reservoir of Bacterial Plasmids Conferring Carbapenem Resistance

    PubMed Central

    2018-01-01

    ABSTRACT The hospital environment is a potential reservoir of bacteria with plasmids conferring carbapenem resistance. Our Hospital Epidemiology Service routinely performs extensive sampling of high-touch surfaces, sinks, and other locations in the hospital. Over a 2-year period, additional sampling was conducted at a broader range of locations, including housekeeping closets, wastewater from hospital internal pipes, and external manholes. We compared these data with previously collected information from 5 years of patient clinical and surveillance isolates. Whole-genome sequencing and analysis of 108 isolates provided comprehensive characterization of blaKPC/blaNDM-positive isolates, enabling an in-depth genetic comparison. Strikingly, despite a very low prevalence of patient infections with blaKPC-positive organisms, all samples from the intensive care unit pipe wastewater and external manholes contained carbapenemase-producing organisms (CPOs), suggesting a vast, resilient reservoir. We observed a diverse set of species and plasmids, and we noted species and susceptibility profile differences between environmental and patient populations of CPOs. However, there were plasmid backbones common to both populations, highlighting a potential environmental reservoir of mobile elements that may contribute to the spread of resistance genes. Clear associations between patient and environmental isolates were uncommon based on sequence analysis and epidemiology, suggesting reasonable infection control compliance at our institution. Nonetheless, a probable nosocomial transmission of Leclercia sp. from the housekeeping environment to a patient was detected by this extensive surveillance. These data and analyses further our understanding of CPOs in the hospital environment and are broadly relevant to the design of infection control strategies in many infrastructure settings. PMID:29437920

  11. Diversity and evolution of centromere repeats in the maize genome.

    PubMed

    Bilinski, Paul; Distor, Kevin; Gutierrez-Lopez, Jose; Mendoza, Gabriela Mendoza; Shi, Jinghua; Dawe, R Kelly; Ross-Ibarra, Jeffrey

    2015-03-01

    Centromere repeats are found in most eukaryotes and play a critical role in kinetochore formation. Though centromere repeats exhibit considerable diversity both within and among species, little is understood about the mechanisms that drive centromere repeat evolution. Here, we use maize as a model to investigate how a complex history involving polyploidy, fractionation, and recent domestication has impacted the diversity of the maize centromeric repeat CentC. We first validate the existence of long tandem arrays of repeats in maize and other taxa in the genus Zea. Although we find considerable sequence diversity among CentC copies genome-wide, genetic similarity among repeats is highest within these arrays, suggesting that tandem duplications are the primary mechanism for the generation of new copies. Nonetheless, clustering analyses identify similar sequences among distant repeats, and simulations suggest that this pattern may be due to homoplasious mutation. Although the two ancestral subgenomes of maize have contributed nearly equal numbers of centromeres, our analysis shows that the majority of all CentC repeats derive from one of the parental genomes, with an even stronger bias when examining the largest assembled contiguous clusters. Finally, by comparing maize with its wild progenitor teosinte, we find that the abundance of CentC likely decreased after domestication, while the pericentromeric repeat Cent4 has drastically increased.

  12. Comparative genomic de-convolution of the cotton genome revealed a decaploid ancestor and widespread chromosomal fractionation.

    PubMed

    Wang, Xiyin; Guo, Hui; Wang, Jinpeng; Lei, Tianyu; Liu, Tao; Wang, Zhenyi; Li, Yuxian; Lee, Tae-Ho; Li, Jingping; Tang, Haibao; Jin, Dianchuan; Paterson, Andrew H

    2016-02-01

    The 'apparently' simple genomes of many angiosperms mask complex evolutionary histories. The reference genome sequence for cotton (Gossypium spp.) revealed a ploidy change of a complexity unprecedented to date, indeed that could not be distinguished as to its exact dosage. Herein, by developing several comparative, computational and statistical approaches, we revealed a 5× multiplication in the cotton lineage of an ancestral genome common to cotton and cacao, and proposed evolutionary models to show how such a decaploid ancestor formed. The c. 70% gene loss necessary to bring the ancestral decaploid to its current gene count appears to fit an approximate geometrical model; that is, although many genes may be lost by single-gene deletion events, some may be lost in groups of consecutive genes. Gene loss following cotton decaploidy has largely just reduced gene copy numbers of some homologous groups. We designed a novel approach to deconvolute layers of chromosome homology, providing definitive information on gene orthology and paralogy across broad evolutionary distances, both of fundamental value and serving as an important platform to support further studies in and beyond cotton and genomics communities. No claim to original US government works. New Phytologist © 2015 New Phytologist Trust.

  13. Diverse Class 2 CRISPR-Cas Effector Proteins for Genome Engineering Applications.

    PubMed

    Pyzocha, Neena K; Chen, Sidi

    2018-02-16

    CRISPR-Cas genome editing technologies have revolutionized modern molecular biology by making targeted DNA edits simple and scalable. These technologies are developed by domesticating naturally occurring microbial adaptive immune systems that display wide diversity of functionality for targeted nucleic acid cleavage. Several CRISPR-Cas single effector enzymes have been characterized and engineered for use in mammalian cells. The unique properties of the single effector enzymes can make a critical difference in experimental use or targeting specificity. This review describes known single effector enzymes and discusses their use in genome engineering applications.

  14. Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome.

    PubMed

    Zhang, Wei; Zhang, Mingyi; Zhu, Xianwen; Cao, Yaping; Sun, Qing; Ma, Guojia; Chao, Shiaoman; Yan, Changhui; Xu, Steven S; Cai, Xiwen

    2018-02-01

    This work pinpointed the goatgrass chromosomal segment in the wheat B genome using modern cytogenetic and genomic technologies, and provided novel insights into the origin of the wheat B genome. Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat B genome. However, the relationship of the Ae. speltoides S genome with the wheat B genome remains largely obscure. The present study assessed the homology of the B and S genomes using an integrative cytogenetic and genomic approach, and revealed the contribution of Ae. speltoides to the origin of the wheat B genome. We discovered noticeable homology between wheat chromosome 1B and Ae. speltoides chromosome 1S, but not between other chromosomes in the B and S genomes. An Ae. speltoides-originated segment spanning a genomic region of approximately 10.46 Mb was detected on the long arm of wheat chromosome 1B (1BL). The Ae. speltoides-originated segment on 1BL was found to co-evolve with the rest of the B genome. Evidently, Ae. speltoides had been involved in the origin of the wheat B genome, but should not be considered an exclusive donor of this genome. The wheat B genome might have a polyphyletic origin with multiple ancestors involved, including Ae. speltoides. These novel findings will facilitate genome studies in wheat and other polyploids.

  15. Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication

    PubMed Central

    2014-01-01

    Background: Horseshoe crabs are marine arthropods with a fossil record extending back approximately 450 million years. They exhibit remarkable morphological stability over their long evolutionary history, retaining a number of ancestral arthropod traits, and are often cited as examples of “living fossils.” As arthropods, they belong to the Ecdysozoa, an ancient super-phylum whose sequenced genomes (including insects and nematodes) have thus far shown more divergence from the ancestral pattern of eumetazoan genome organization than cnidarians, deuterostomes and lophotrochozoans. However, much of ecdysozoan diversity remains unrepresented in comparative genomic analyses. Results: Here we apply a new strategy of combined de novo assembly and genetic mapping to examine the chromosome-scale genome organization of the Atlantic horseshoe crab, Limulus polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their parents at a mean redundancy of 1.1x per sample. The map includes 84,307 sequence markers grouped into 1,876 distinct genetic intervals and 5,775 candidate conserved protein coding genes. Conclusions: Comparison with other metazoan genomes shows that the L. polyphemus genome preserves ancestral bilaterian linkage groups, and that a common ancestor of modern horseshoe crabs underwent one or more ancient whole genome duplications 300 million years ago, followed by extensive chromosome fusion. These results provide a counter-example to the often noted correlation between whole genome duplication and evolutionary radiations. The new, low-cost genetic mapping method for obtaining a chromosome-scale view of non-model organism genomes that we demonstrate here does not require laboratory culture, and is potentially applicable to a broad range of other species. PMID:24987520

  16. Analysis of the genetic diversity and structure across a wide range of germplasm reveals prominent gene flow in apple at the European level.

    PubMed

    Urrestarazu, Jorge; Denancé, Caroline; Ravon, Elisa; Guyader, Arnaud; Guisnel, Rémi; Feugey, Laurence; Poncet, Charles; Lateur, Marc; Houben, Patrick; Ordidge, Matthew; Fernandez-Fernandez, Felicidad; Evans, Kate M; Paprstein, Frantisek; Sedlak, Jiri; Nybom, Hilde; Garkava-Gustavsson, Larisa; Miranda, Carlos; Gassmann, Jennifer; Kellerhals, Markus; Suprun, Ivan; Pikunova, Anna V; Krasova, Nina G; Torutaeva, Elnura; Dondini, Luca; Tartarini, Stefano; Laurens, François; Durel, Charles-Eric

    2016-06-08

    The amount and structure of genetic diversity in dessert apple germplasm conserved at a European level is mostly unknown, since all diversity studies conducted in Europe until now have been performed on regional or national collections. Here, we applied a common set of 16 SSR markers to genotype more than 2,400 accessions across 14 collections representing three broad European geographic regions (North + East, West and South) with the aim to analyze the extent, distribution and structure of variation in the apple genetic resources in Europe. A Bayesian model-based clustering approach showed that diversity was organized in three groups, although these were only moderately differentiated (FST = 0.031). A nested Bayesian clustering approach allowed identification of subgroups which revealed internal patterns of substructure within the groups, allowing a finer delineation of the variation into eight subgroups (FST = 0.044). The first level of stratification revealed an asymmetric division of the germplasm among the three groups, and a clear association was found with the geographical regions of origin of the cultivars. The substructure revealed clear partitioning of genetic groups among countries, but also interesting associations between subgroups and breeding purposes of recent cultivars or particular usage such as cider production. Additional parentage analyses allowed us to identify both putative parents of more than 40 old and/or local cultivars giving interesting insights in the pedigree of some emblematic cultivars. The variation found at group and subgroup levels may reflect a combination of historical processes of migration/selection and adaptive factors to diverse agricultural environments that, together with genetic drift, have resulted in extensive genetic variation but limited population structure. The European dessert apple germplasm represents an important source of genetic diversity with a strong historical and patrimonial value. The present

  17. Genome analysis of crude oil degrading Franconibacter pulveris strain DJ34 revealed its genetic basis for hydrocarbon degradation and survival in oil contaminated environment.

    PubMed

    Pal, Siddhartha; Kundu, Anirban; Banerjee, Tirtha Das; Mohapatra, Balaram; Roy, Ajoy; Manna, Riddha; Sar, Pinaki; Kazy, Sufia K

    2017-10-01

    Franconibacter pulveris strain DJ34, isolated from Duliajan oil fields, Assam, was characterized in terms of its taxonomic, metabolic and genomic properties. The bacterium showed utilization of diverse petroleum hydrocarbons and electron acceptors, metal resistance, and biosurfactant production. The genome (4,856,096bp) of this strain contained different genes related to the degradation of various petroleum hydrocarbons, metal transport and resistance, dissimilatory nitrate, nitrite and sulfite reduction, chemotaxy, biosurfactant synthesis, etc. Genomic comparison with other Franconibacter spp. revealed higher abundance of genes for cell motility, lipid transport and metabolism, transcription and translation in DJ34 genome. Detailed COG analysis provides deeper insights into the genomic potential of this organism for degradation and survival in oil-contaminated complex habitat. This is the first report on ecophysiology and genomic inventory of Franconibacter sp. inhabiting crude oil rich environment, which might be useful for designing the strategy for bioremediation of oil contaminated environment. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. SCRaMbLE generates designed combinatorial stochastic diversity in synthetic chromosomes.

    PubMed

    Shen, Yue; Stracquadanio, Giovanni; Wang, Yun; Yang, Kun; Mitchell, Leslie A; Xue, Yaxin; Cai, Yizhi; Chen, Tai; Dymond, Jessica S; Kang, Kang; Gong, Jianhui; Zeng, Xiaofan; Zhang, Yongfen; Li, Yingrui; Feng, Qiang; Xu, Xun; Wang, Jun; Wang, Jian; Yang, Huanming; Boeke, Jef D; Bader, Joel S

    2016-01-01

    Synthetic chromosome rearrangement and modification by loxP-mediated evolution (SCRaMbLE) generates combinatorial genomic diversity through rearrangements at designed recombinase sites. We applied SCRaMbLE to yeast synthetic chromosome arm synIXR (43 recombinase sites) and then used a computational pipeline to infer or unscramble the sequence of recombinations that created the observed genomes. Deep sequencing of 64 synIXR SCRaMbLE strains revealed 156 deletions, 89 inversions, 94 duplications, and 55 additional complex rearrangements; several duplications are consistent with a double rolling circle mechanism. Every SCRaMbLE strain was unique, validating the capability of SCRaMbLE to explore a diverse space of genomes. Rearrangements occurred exclusively at designed loxPsym sites, with no significant evidence for ectopic rearrangements or mutations involving synthetic regions, the 99% nonsynthetic nuclear genome, or the mitochondrial genome. Deletion frequencies identified genes required for viability or fast growth. Replacement of 3' UTR by non-UTR sequence had surprisingly little effect on fitness. SCRaMbLE generates genome diversity in designated regions, reveals fitness constraints, and should scale to simultaneous evolution of multiple synthetic chromosomes. © 2016 Shen et al.; Published by Cold Spring Harbor Laboratory Press.

  19. SCRaMbLE generates designed combinatorial stochastic diversity in synthetic chromosomes

    PubMed Central

    Shen, Yue; Stracquadanio, Giovanni; Wang, Yun; Yang, Kun; Mitchell, Leslie A.; Xue, Yaxin; Cai, Yizhi; Chen, Tai; Dymond, Jessica S.; Kang, Kang; Gong, Jianhui; Zeng, Xiaofan; Zhang, Yongfen; Li, Yingrui; Feng, Qiang; Xu, Xun; Wang, Jun; Wang, Jian; Yang, Huanming; Boeke, Jef D.; Bader, Joel S.

    2016-01-01

    Synthetic chromosome rearrangement and modification by loxP-mediated evolution (SCRaMbLE) generates combinatorial genomic diversity through rearrangements at designed recombinase sites. We applied SCRaMbLE to yeast synthetic chromosome arm synIXR (43 recombinase sites) and then used a computational pipeline to infer or unscramble the sequence of recombinations that created the observed genomes. Deep sequencing of 64 synIXR SCRaMbLE strains revealed 156 deletions, 89 inversions, 94 duplications, and 55 additional complex rearrangements; several duplications are consistent with a double rolling circle mechanism. Every SCRaMbLE strain was unique, validating the capability of SCRaMbLE to explore a diverse space of genomes. Rearrangements occurred exclusively at designed loxPsym sites, with no significant evidence for ectopic rearrangements or mutations involving synthetic regions, the 99% nonsynthetic nuclear genome, or the mitochondrial genome. Deletion frequencies identified genes required for viability or fast growth. Replacement of 3′ UTR by non-UTR sequence had surprisingly little effect on fitness. SCRaMbLE generates genome diversity in designated regions, reveals fitness constraints, and should scale to simultaneous evolution of multiple synthetic chromosomes. PMID:26566658

  20. Whole-genome sequencing reveals clonal expansion of multiresistant Staphylococcus haemolyticus in European hospitals

    PubMed Central

    Cavanagh, Jorunn Pauline; Hjerde, Erik; Holden, Matthew T. G.; Kahlke, Tim; Klingenberg, Claus; Flægstad, Trond; Parkhill, Julian; Bentley, Stephen D.; Sollid, Johanna U. Ericson

    2014-01-01

    Objectives: Staphylococcus haemolyticus is an emerging cause of nosocomial infections, primarily affecting immunocompromised patients. A comparative genomic analysis was performed on clinical S. haemolyticus isolates to investigate their genetic relationship and explore the coding sequences with respect to antimicrobial resistance determinants and putative hospital adaptation. Methods: Whole-genome sequencing was performed on 134 isolates of S. haemolyticus from geographically diverse origins (Belgium, 2; Germany, 10; Japan, 13; Norway, 54; Spain, 2; Switzerland, 43; UK, 9; USA, 1). Each genome was individually assembled. Protein coding sequences (CDSs) were predicted and homologous genes were categorized into three types: Type I, core genes, homologues present in all strains; Type II, unique core genes, homologues shared by only a subgroup of strains; and Type III, unique genes, strain-specific CDSs. The phylogenetic relationship between the isolates was built from variable sites in the form of single nucleotide polymorphisms (SNPs) in the core genome and used to construct a maximum likelihood phylogeny. Results: SNPs in the genome core regions divided the isolates into one major group of 126 isolates and one minor group of isolates with highly diverse genomes. The major group was further subdivided into seven clades (A–G), of which four (A–D) encompassed isolates only from Europe. Antimicrobial multiresistance was observed in 77.7% of the collection. High levels of homologous recombination were detected in genes involved in adherence, staphylococcal host adaptation and bacterial cell communication. Conclusions: The presence of several successful and highly resistant clones underlines the adaptive potential of this opportunistic pathogen. PMID:25038069

  1. Phylogenomics of nonavian reptiles and the structure of the ancestral amniote genome

    PubMed Central

    Shedlock, Andrew M.; Botka, Christopher W.; Zhao, Shaying; Shetty, Jyoti; Zhang, Tingting; Liu, Jun S.; Deschavanne, Patrick J.; Edwards, Scott V.

    2007-01-01

    We report results of a megabase-scale phylogenomic analysis of the Reptilia, the sister group of mammals. Large-scale end-sequence scanning of genomic clones of a turtle, alligator, and lizard reveals diverse, mammal-like landscapes of retroelements and simple sequence repeats (SSRs) not found in the chicken. Several global genomic traits, including distinctive phylogenetic lineages of CR1-like long interspersed elements (LINEs) and a paucity of A-T rich SSRs, characterize turtles and archosaur genomes, whereas higher frequencies of tandem repeats and a lower global GC content reveal mammal-like features in Anolis. Nonavian reptile genomes also possess a high frequency of diverse and novel 50-bp unit tandem duplications not found in chicken or mammals. The frequency distributions of ≈65,000 8-mer oligonucleotides suggest that rates of DNA-word frequency change are an order of magnitude slower in reptiles than in mammals. These results suggest a diverse array of interspersed and SSRs in the common ancestor of amniotes and a genomic conservatism and gradual loss of retroelements in reptiles that culminated in the minimalist chicken genome. PMID:17307883

  2. Extensive retroviral diversity in shark.

    PubMed

    Han, Guan-Zhu

    2015-04-28

    Retroviruses infect a wide range of vertebrates. However, little is known about the diversity of retroviruses in basal vertebrates. Endogenous retroviruses (ERVs) provide a valuable resource to study the ecology and evolution of retroviruses. I performed a genome-scale screening for ERVs in the elephant shark (Callorhinchus milii) and identified three complete or nearly complete ERVs and many short ERV fragments. I designate these retroviral elements "C. milii ERVs" (CmiERVs). Phylogenetic analysis shows that the CmiERVs form three distinct lineages. The genome invasions by these retroviruses are estimated to have taken place more than 50 million years ago. My results reveal the extensive retroviral diversity in the elephant shark. Diverse retroviruses appear to have been associated with cartilaginous fishes for millions of years. These findings have important implications in understanding the diversity and evolution of retroviruses.

  3. Comparative genomics of Clostridium bolteae and Clostridium clostridioforme reveals species-specific genomic properties and numerous putative antibiotic resistance determinants.

    PubMed

    Dehoux, Pierre; Marvaud, Jean Christophe; Abouelleil, Amr; Earl, Ashlee M; Lambert, Thierry; Dauga, Catherine

    2016-10-21

    Clostridium bolteae and Clostridium clostridioforme, previously included in the complex C. clostridioforme in the group Clostridium XIVa, remain difficult to distinguish by phenotypic methods. These bacteria, prevailing in the human intestinal microbiota, are opportunistic pathogens with various drug susceptibility patterns. In order to better characterize the two species and to obtain information on their antibiotic resistance genes, we analyzed the genomes of six strains of C. bolteae and six strains of C. clostridioforme, isolated from human infection. The genome length of C. bolteae varied from 6159 to 6398 kb, and 5719 to 6059 CDSs were detected. The genomes of C. clostridioforme were smaller, between 5467 and 5927 kb, and contained 5231 to 5916 CDSs. The two species display different metabolic pathways. The genomes of C. bolteae contained lactose operons involving PTS system and complex regulation, which contribute to phenotypic differentiation from C. clostridioforme. The Acetyl-CoA pathway, similar to that of Faecalibacterium prausnitzii, a major butyrate producer in the human gut, was only found in C. clostridioforme. The two species have also developed diverse flagella mobility systems contributing to gut colonization. Their genomes harboured many CDSs involved in resistance to beta-lactams, glycopeptides, macrolides, chloramphenicol, lincosamides, rifampin, linezolid, bacitracin, aminoglycosides and tetracyclines. Overall antimicrobial resistance genes were similar within a species, but strain-specific resistance genes were found. We discovered a new group of genes coding for rifampin resistance in C. bolteae. C. bolteae 90B3 was resistant to phenicols and linezolid by producing a 23S rRNA methyltransferase. C. clostridioforme 90A8 contained the VanB-type Tn1549 operon conferring vancomycin resistance. We also detected numerous genes encoding proteins related to efflux pump systems. Genomic comparison of C. bolteae and C. clostridioforme revealed

  4. Genome Comparisons Reveal a Dominant Mechanism of Chromosome Number Reduction in Grasses and Accelerated Genome Evolution in Triticeae

    USDA-ARS's Scientific Manuscript database

    Single nucleotide polymorphism was employed in the construction of a high-resolution, expressed sequence tag (EST) map of Aegilops tauschii, the diploid source of the wheat D genome. Comparison of the map with the rice and sorghum genome sequences revealed 50 inversions and translocations; 2, 8, and...

  5. Structural and sequence diversity of the transposon Galileo in the Drosophila willistoni genome.

    PubMed

    Gonçalves, Juliana W; Valiati, Victor Hugo; Delprat, Alejandra; Valente, Vera L S; Ruiz, Alfredo

    2014-09-13

    Galileo is one of three members of the P superfamily of DNA transposons. It was originally discovered in Drosophila buzzatii, in which three segregating chromosomal inversions were shown to have been generated by ectopic recombination between Galileo copies. Subsequently, Galileo was identified in six of 12 sequenced Drosophila genomes, indicating its widespread distribution within this genus. Galileo is strikingly abundant in Drosophila willistoni, a neotropical species that is highly polymorphic for chromosomal inversions, suggesting a role for this transposon in the evolution of its genome. We carried out a detailed characterization of all Galileo copies present in the D. willistoni genome. A total of 191 copies, including 133 with two terminal inverted repeats (TIRs), were classified according to structure in six groups. The TIRs exhibited remarkable variation in their length and structure compared to the most complete copy. Three copies showed extended TIRs due to internal tandem repeats, the insertion of other transposable elements (TEs), or the incorporation of non-TIR sequences into the TIRs. Phylogenetic analyses of the transposase (TPase)-encoding and TIR segments yielded two divergent clades, which we termed Galileo subfamilies V and W. Target-site duplications (TSDs) in D. willistoni Galileo copies were 7- or 8-bp in length, with the consensus sequence GTATTAC. Analysis of the region around the TSDs revealed a target site motif (TSM) with a 15-bp palindrome that may give rise to a stem-loop secondary structure. There is a remarkable abundance and diversity of Galileo copies in the D. willistoni genome, although no functional copies were found. The TIRs in particular have a dynamic structure and extend in different ways, but their ends (required for transposition) are more conserved than the rest of the element. The D. willistoni genome harbors two Galileo subfamilies (V and W) that diverged ~9 million years ago and may have descended from an ancestral

  6. Elaeis oleifera Genomic-SSR Markers: Exploitation in Oil Palm Germplasm Diversity and Cross-Amplification in Arecaceae

    PubMed Central

    Zaki, Noorhariza Mohd; Singh, Rajinder; Rosli, Rozana; Ismail, Ismanizan

    2012-01-01

    Species-specific simple sequence repeat (SSR) markers are favored for genetic studies and marker-assisted selection (MAS) breeding for oil palm genetic improvement. This report characterizes 20 SSR markers from an Elaeis oleifera genomic library (gSSR). Characterization of the repeat type in 2000 sequences revealed a high percentage of di-nucleotides (63.6%), followed by tri-nucleotides (24.2%). Primer pairs were successfully designed for 394 of the E. oleifera gSSRs. Subsequent analysis showed the ability of the 20 selected E. oleifera gSSR markers to reveal genetic diversity in the genus Elaeis. The average Polymorphism Information Content (PIC) value for the SSRs was 0.402, with the tri-repeats showing the highest average PIC (0.626). Low values of observed heterozygosity (Ho) (0.164) and highly positive fixation indices (Fis) in the E. oleifera germplasm collection, compared to the E. guineensis, indicated an excess of homozygosity in E. oleifera. The transferability of the markers to closely related palms, Elaeis guineensis, Cocos nucifera and ornamental palms is also reported. Sequencing the amplicons of three selected E. oleifera gSSRs across both species and palm taxa revealed variations in the repeat-units. The study showed the potential of E. oleifera gSSR markers to reveal genetic diversity in the genus Elaeis. The markers are also a valuable genetic resource for studying E. oleifera and other genus in the Arecaceae family. PMID:22605966

  7. Population-based 3D genome structure analysis reveals driving forces in spatial genome organization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tjong, Harianto; Li, Wenyuan; Kalhor, Reza

    Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Here, our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization.

  8. Population-based 3D genome structure analysis reveals driving forces in spatial genome organization

    DOE PAGES

    Tjong, Harianto; Li, Wenyuan; Kalhor, Reza; ...

    2016-03-07

    Conformation capture technologies (e.g., Hi-C) chart physical interactions between chromatin regions on a genome-wide scale. However, the structural variability of the genome between cells poses a great challenge to interpreting ensemble-averaged Hi-C data, particularly for long-range and interchromosomal interactions. Here, we present a probabilistic approach for deconvoluting Hi-C data into a model population of distinct diploid 3D genome structures, which facilitates the detection of chromatin interactions likely to co-occur in individual cells. Here, our approach incorporates the stochastic nature of chromosome conformations and allows a detailed analysis of alternative chromatin structure states. For example, we predict and experimentally confirm the presence of large centromere clusters with distinct chromosome compositions varying between individual cells. The stability of these clusters varies greatly with their chromosome identities. We show that these chromosome-specific clusters can play a key role in the overall chromosome positioning in the nucleus and stabilizing specific chromatin interactions. By explicitly considering genome structural variability, our population-based method provides an important tool for revealing novel insights into the key factors shaping the spatial genome organization.

  9. Nile Tilapia Infectivity by Genomically Diverse Streptococcus agalactiae Isolates from Multiple Hosts

    USDA-ARS's Scientific Manuscript database

    Streptococcus agalactiae, Lancefield group B Streptococcus (GBS), is recognized for causing cattle mastitis, human neonatal meningitis, and fish meningo-encephalitis. We investigated the genomic diversity of GBS isolates from different phylogenetic hosts and geographical regions using serological t...

  10. Full genome analysis of rotavirus G9P[8] strains identified in acute gastroenteritis cases reveals genetic diversity: Pune, western India.

    PubMed

    Tatte, Vaishali S; Chaphekar, Deepa; Gopalkrishna, Varanasi

    2017-08-01

    Group A rotaviruses (RVA) are the major enteric etiological agents of severe acute gastroenteritis among children globally. As G9 RVA now represents one of the major human RVA genotypes, studies on the full genome of this particular genotype are being carried out worldwide. So far, no such studies on G9P[8] RVAs have been reported from Pune, western India. In view of this, the study was undertaken to understand the degree of genetic diversity of the commonly circulating G9P[8] RVA strains. Rotavirus surveillance studies carried out earlier during the years 2009-2011 showed an increase in the prevalence of G9P[8] RVAs. Representative G9P[8] RVA strains from the years 2009, 2010, and 2011 were selected for the study. In general, all the G9 RVA strains showed clustering in the globally circulating sublineage of the VP7 gene and showed nucleotide/amino acid identities of 96.8-99.7%/96.9-99.8% with global G9 RV strains. Full genome analysis of all three RVAs in this study indicated the Wa-like genotype constellation G9-P[8]-I1-R1-C1-M1-A1-N1-T1-E1-H1. Within the strains, nucleotide/amino acid divergence of 0.1-3.4%/0.0-4.1% was noted in all the RVA structural and non-structural genes. In conclusion, the present study highlights intra-genotypic variations throughout the RVA genome. The study further emphasizes the need for surveillance and analysis of the whole genomic constellation of the commonly circulating RVA strains of other regions in the country, to better understand the impact of the rotavirus vaccination recently introduced in India. © 2017 Wiley Periodicals, Inc.

  11. Whole genome sequences of Japanese porcine species C rotaviruses reveal a high diversity of genotypes of individual genes and will contribute to a comprehensive, generally accepted classification system.

    PubMed

    Niira, Kazutaka; Ito, Mika; Masuda, Tsuneyuki; Saitou, Toshiya; Abe, Tadatsugu; Komoto, Satoshi; Sato, Mitsuo; Yamasato, Hiroshi; Kishimoto, Mai; Naoi, Yuki; Sano, Kaori; Tuchiaka, Shinobu; Okada, Takashi; Omatsu, Tsutomu; Furuya, Tetsuya; Aoki, Hiroshi; Katayama, Yukie; Oba, Mami; Shirai, Junsuke; Taniguchi, Koki; Mizutani, Tetsuya; Nagai, Makoto

    2016-10-01

    Porcine rotavirus C (RVC) is distributed throughout the world and is thought to be a pathogenic agent of diarrhea in piglets. Although the VP7, VP4, and VP6 gene sequences of Japanese porcine RVCs are currently available, there is no whole-genome sequence data of Japanese RVC. Furthermore, only one to three sequences are available for porcine RVC VP1-VP3 and NSP1-NSP3 genes. Therefore, we determined nearly full-length whole-genome sequences of nine Japanese porcine RVCs from seven piglets with diarrhea and two healthy pigs and compared them with published RVC sequences from a database. The VP7 genes of two Japanese RVCs from healthy pigs were highly divergent from other known RVC strains and were provisionally classified as G12 and G13 based on the 86% nucleotide identity cut-off value. Pairwise sequence identity calculations and phylogenetic analyses identified candidate novel genotypes of Japanese porcine RVC in the NSP1, NSP2 and NSP3 encoding genes. Furthermore, VP3 of Japanese porcine RVCs was shown to be closely related to human RVCs, suggesting a gene reassortment event between porcine and human RVCs and past interspecies transmission. The present study demonstrated that porcine RVCs show greater genetic diversity among strains than human and bovine RVCs. Copyright © 2016 Elsevier B.V. All rights reserved.
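
    The identity cut-off mentioned in this record can be sketched as a pairwise comparison of two pre-aligned sequences. This is an illustration only: the gap handling (gapped columns are skipped), the function names, and the toy sequences are assumptions, and the record does not describe how the authors aligned sequences or computed identities. Only the 86% VP7 threshold comes from the abstract.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned, ungapped columns of two sequences.

    Both inputs must already be aligned to the same length; columns with a
    gap ('-') in either sequence are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def same_genotype(seq_a, seq_b, cutoff=86.0):
    """True when identity meets the cut-off (86% for VP7 in the record)."""
    return percent_identity(seq_a, seq_b) >= cutoff


# Toy sequences, not real VP7 data.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(same_genotype("AAAAAAAAAA", "AAAAAAATTT"))
```

    Under this sketch, a strain whose identity to every reference genotype falls below the cut-off would be a candidate for a new provisional genotype, which mirrors the G12/G13 assignment described above.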

  12. CyanoGEBA: A Better Understanding of Cyanobacterial Diversity through Large-Scale Genomics (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shih, Patrick

    2012-03-22

    Patrick Shih, representing both the University of California, Berkeley and JGI, gives a talk titled "CyanoGEBA: A Better Understanding of Cyanobacterial Diversity through Large-scale Genomics" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  13. CyanoGEBA: A Better Understanding of Cyanobacterial Diversity through Large-Scale Genomics (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    ScienceCinema

    Shih, Patrick

    2018-01-10

    Patrick Shih, representing both the University of California, Berkeley and JGI, gives a talk titled "CyanoGEBA: A Better Understanding of Cyanobacterial Diversity through Large-scale Genomics" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  14. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome.

    PubMed

    Collins, Ryan L; Brand, Harrison; Redin, Claire E; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon-Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A; Lucente, Diane; Levy, Brynn; Sanders, Stephan J; Wapner, Ronald J; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E

    2017-03-06

    Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. We sequenced 689 participants with autism spectrum disorder (ASD) and other developmental abnormalities to construct a genome-wide map of large SV. Using long-insert jumping libraries at 105X mean physical coverage and linked-read whole-genome sequencing from 10X Genomics, we document seven major SV classes at ~5 kb SV resolution. Our results encompass 11,735 distinct large SV sites, 38.1% of which are novel and 16.8% of which are balanced or complex. We characterize 16 recurrent subclasses of complex SV (cxSV), revealing that: (1) cxSV are larger and rarer than canonical SV; (2) each genome harbors 14 large cxSV on average; (3) 84.4% of large cxSVs involve inversion; and (4) most large cxSV (93.8%) have not been delineated in previous studies. Rare SVs are more likely to disrupt coding and regulatory non-coding loci, particularly when truncating constrained and disease-associated genes. We also identify multiple cases of catastrophic chromosomal rearrangements known as chromoanagenesis, including somatic chromoanasynthesis, and extreme balanced germline chromothripsis events involving up to 65 breakpoints and 60.6 Mb across four chromosomes, further defining rare categories of extreme cxSV. These data provide a foundational map of large SV in the morbid human genome and demonstrate a previously underappreciated abundance and diversity of cxSV that should be considered in genomic studies of human disease.

  15. Genetic Diversity and Reassortment of Hantaan Virus Tripartite RNA Genomes in Nature, the Republic of Korea

    PubMed Central

    Kim, Jeong-Ah; Kim, Won-keun; No, Jin Sun; Lee, Seung-Ho; Lee, Sook-Young; Kim, Ji Hye; Kho, Jeong Hoon; Lee, Daesang; Song, Dong Hyun; Gu, Se Hun; Jeong, Seong Tae; Park, Man-Seong; Kim, Heung-Chul; Klein, Terry A.; Song, Jin-Won

    2016-01-01

    Background Hantaan virus (HTNV), a negative sense tripartite RNA virus of the Family Bunyaviridae, is the most prevalent hantavirus in the Republic of Korea (ROK). It is the causative agent of Hemorrhagic Fever with Renal Syndrome (HFRS) in humans and maintained in the striped field mouse, Apodemus agrarius, the primary zoonotic host. Clinical HFRS cases have been reported commonly in HFRS-endemic areas of Gyeonggi province. Recently, the death of a member of the ROK military from Gangwon province due to HFRS prompted an investigation of the epidemiology and distribution of hantaviruses in Gangwon and Gyeonggi provinces that border the demilitarized zone separating North and South Korea. Methodology and Principal Findings To elucidate the geographic distribution and molecular diversity of HTNV, whole genome sequences of HTNV Large (L), Medium (M), and Small (S) segments were acquired from lung tissues of A. agrarius captured from 2003–2014. Consistent with the clinical incidence of HFRS established by the Korea Centers for Disease Control & Prevention (KCDC), the prevalence of HTNV in naturally infected mice in Gangwon province was lower than for Gyeonggi province. Whole genomic sequences of 34 HTNV strains were identified and a phylogenetic analysis showed geographic diversity of the virus in the limited areas. Reassortment analysis first suggested an occurrence of genetic exchange of HTNV genomes in nature, ROK. Conclusion/Significance This study is the first report to demonstrate the molecular prevalence of HTNV in Gangwon province. Whole genome sequencing of HTNV showed well-supported geographic lineages and the molecular diversity in the northern region of ROK due to a natural reassortment of HTNV genomes. These observations contribute to a better understanding of the genetic diversity and molecular evolution of hantaviruses. Also, the full-length of HTNV tripartite genomes will provide a database for phylogeographic analysis of spatial and temporal

  16. Reassessment of the Listeria monocytogenes pan-genome reveals dynamic integration hotspots and mobile genetic elements as major components of the accessory genome.

    PubMed

    Kuenne, Carsten; Billion, André; Mraheil, Mobarak Abu; Strittmatter, Axel; Daniel, Rolf; Goesmann, Alexander; Barbuddhe, Sukhadeo; Hain, Torsten; Chakraborty, Trinad

    2013-01-22

    Listeria monocytogenes is an important food-borne pathogen and model organism for host-pathogen interaction, thus representing an invaluable target considering research on the forces governing the evolution of such microbes. The diversity of this species has not been exhaustively explored yet, as previous efforts have focused on analyses of serotypes primarily implicated in human listeriosis. We conducted complete genome sequencing of 11 strains employing 454 GS FLX technology, thereby achieving full coverage of all serotypes including the first complete strains of serotypes 1/2b, 3c, 3b, 4c, 4d, and 4e. These were comparatively analyzed in conjunction with publicly available data and assessed for pathogenicity in the Galleria mellonella insect model. The species pan-genome of L. monocytogenes is highly stable but open, suggesting an ability to adapt to new niches by generating or including new genetic information. The majority of gene-scale differences represented by the accessory genome resulted from nine hyper variable hotspots, a similar number of different prophages, three transposons (Tn916, Tn554, IS3-like), and two mobilizable islands. Only a subset of strains showed CRISPR/Cas bacteriophage resistance systems of different subtypes, suggesting a supplementary function in maintenance of chromosomal stability. Multiple phylogenetic branches of the genus Listeria imply long common histories of strains of each lineage as revealed by a SNP-based core genome tree highlighting the impact of small mutations for the evolution of species L. monocytogenes. Frequent loss or truncation of genes described to be vital for virulence or pathogenicity was confirmed as a recurring pattern, especially for strains belonging to lineages III and II. New candidate genes implicated in virulence function were predicted based on functional domains and phylogenetic distribution. A comparative analysis of small regulatory RNA candidates supports observations of a differential

  17. Complete Genome Analysis of Three Novel Picornaviruses from Diverse Bat Species

    PubMed Central

    Lau, Susanna K. P.; Woo, Patrick C. Y.; Lai, Kenneth K. Y.; Huang, Yi; Yip, Cyril C. Y.; Shek, Chung-Tong; Lee, Paul; Lam, Carol S. F.; Chan, Kwok-Hung; Yuen, Kwok-Yung

    2011-01-01

    Although bats are important reservoirs of diverse viruses that can cause human epidemics, little is known about the presence of picornaviruses in these flying mammals. Among 1,108 bats of 18 species studied, three novel picornaviruses (groups 1, 2, and 3) were identified from alimentary specimens of 12 bats from five species and four genera. Two complete genomes, each from the three picornaviruses, were sequenced. Phylogenetic analysis showed that they fell into three distinct clusters in the Picornaviridae family, with low homologies to known picornaviruses, especially in leader and 2A proteins. Moreover, group 1 and 2 viruses are more closely related to each other than to group 3 viruses, which exhibit genome features distinct from those of the former two virus groups. In particular, the group 3 virus genome contains the shortest leader protein within Picornaviridae, a putative type I internal ribosome entry site (IRES) in the 5′-untranslated region instead of the type IV IRES found in group 1 and 2 viruses, one instead of two GXCG motifs in 2A, an L→V substitution in the DDLXQ motif in 2C helicase, and a conserved GXH motif in 3C protease. Group 1 and 2 viruses are unique among picornaviruses in having AMH instead of the GXH motif in 3Cpro. These findings suggest that the three picornaviruses belong to two novel genera in the Picornaviridae family. This report describes the discovery and complete genome analysis of three picornaviruses in bats, and their presence in diverse bat genera/species suggests the ability to cross the species barrier. PMID:21697464

  18. Hemimetabolous genomes reveal molecular basis of termite eusociality.

    PubMed

    Harrison, Mark C; Jongepier, Evelien; Robertson, Hugh M; Arning, Nicolas; Bitard-Feildel, Tristan; Chao, Hsu; Childers, Christopher P; Dinh, Huyen; Doddapaneni, Harshavardhan; Dugan, Shannon; Gowin, Johannes; Greiner, Carolin; Han, Yi; Hu, Haofu; Hughes, Daniel S T; Huylmans, Ann-Kathrin; Kemena, Carsten; Kremer, Lukas P M; Lee, Sandra L; Lopez-Ezquerra, Alberto; Mallet, Ludovic; Monroy-Kuhn, Jose M; Moser, Annabell; Murali, Shwetha C; Muzny, Donna M; Otani, Saria; Piulachs, Maria-Dolors; Poelchau, Monica; Qu, Jiaxin; Schaub, Florentine; Wada-Katsumata, Ayako; Worley, Kim C; Xie, Qiaolin; Ylla, Guillem; Poulsen, Michael; Gibbs, Richard A; Schal, Coby; Richards, Stephen; Belles, Xavier; Korb, Judith; Bornberg-Bauer, Erich

    2018-03-01

    Around 150 million years ago, eusocial termites evolved from within the cockroaches, 50 million years before eusocial Hymenoptera, such as bees and ants, appeared. Here, we report the 2-Gb genome of the German cockroach, Blattella germanica, and the 1.3-Gb genome of the drywood termite Cryptotermes secundus. We show evolutionary signatures of termite eusociality by comparing the genomes and transcriptomes of three termites and the cockroach against the background of 16 other eusocial and non-eusocial insects. Dramatic adaptive changes in genes underlying the production and perception of pheromones confirm the importance of chemical communication in the termites. These are accompanied by major changes in gene regulation and the molecular evolution of caste determination. Many of these results parallel molecular mechanisms of eusocial evolution in Hymenoptera. However, the specific solutions are remarkably different, thus revealing a striking case of convergence in one of the major evolutionary transitions in biological complexity.

  19. Evolution of sociality in spiders leads to depleted genomic diversity at both population and species levels.

    PubMed

    Settepani, V; Schou, M F; Greve, M; Grinsted, L; Bechsgaard, J; Bilde, T

    2017-08-01

    Across several animal taxa, the evolution of sociality involves a suite of characteristics, a "social syndrome," that includes cooperative breeding, reproductive skew, primary female-biased sex ratio, and the transition from outcrossing to inbreeding mating system, factors that are expected to reduce effective population size (Ne). This social syndrome may be favoured by short-term benefits but come with long-term costs, because the reduction in Ne amplifies loss of genetic diversity by genetic drift, ultimately restricting the potential of populations to respond to environmental change. To investigate the consequences of this social life form on genetic diversity, we used a comparative RAD-sequencing approach to estimate genomewide diversity in spider species that differ in level of sociality, reproductive skew and mating system. We analysed multiple populations of three independent sister-species pairs of social inbreeding and subsocial outcrossing Stegodyphus spiders, and a subsocial outgroup. Heterozygosity and within-population diversity were sixfold to 10-fold lower in social compared to subsocial species, and demographic modelling revealed a tenfold reduction in Ne of social populations. Species-wide genetic diversity depends on population divergence and the viability of genetic lineages. Population genomic patterns were consistent with high lineage turnover, which homogenizes the genetic structure that builds up between inbreeding populations, ultimately depleting genetic diversity at the species level. Indeed, species-wide genetic diversity of social species was 5-8 times lower than that of subsocial species. The repeated evolution of species with this social syndrome is associated with severe loss of genomewide diversity, likely to limit their evolutionary potential. © 2017 John Wiley & Sons Ltd.

  20. Genome-wide analysis of the Dof transcription factor gene family reveals soybean-specific duplicable and functional characteristics.

    PubMed

    Guo, Yong; Qiu, Li-Juan

    2013-01-01

    The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.

  1. De novo assembly of soybean wild relatives for pan-genome analysis of diversity and agronomic traits.

    PubMed

    Li, Ying-hui; Zhou, Guangyu; Ma, Jianxin; Jiang, Wenkai; Jin, Long-guo; Zhang, Zhouhao; Guo, Yong; Zhang, Jinbo; Sui, Yi; Zheng, Liangtao; Zhang, Shan-shan; Zuo, Qiyang; Shi, Xue-hui; Li, Yan-fei; Zhang, Wan-ke; Hu, Yiyao; Kong, Guanyi; Hong, Hui-long; Tan, Bing; Song, Jian; Liu, Zhang-xiong; Wang, Yaoshen; Ruan, Hang; Yeung, Carol K L; Liu, Jian; Wang, Hailong; Zhang, Li-juan; Guan, Rong-xia; Wang, Ke-jing; Li, Wen-bin; Chen, Shou-yi; Chang, Ru-zhen; Jiang, Zhi; Jackson, Scott A; Li, Ruiqiang; Qiu, Li-juan

    2014-10-01

    Wild relatives of crops are an important source of genetic diversity for agriculture, but their gene repertoire remains largely unexplored. We report the establishment and analysis of a pan-genome of Glycine soja, the wild relative of cultivated soybean Glycine max, by sequencing and de novo assembly of seven phylogenetically and geographically representative accessions. Intergenomic comparisons identified lineage-specific genes and genes with copy number variation or large-effect mutations, some of which show evidence of positive selection and may contribute to variation of agronomic traits such as biotic resistance, seed composition, flowering and maturity time, organ size and final biomass. Approximately 80% of the pan-genome was present in all seven accessions (core), whereas the rest was dispensable and exhibited greater variation than the core genome, perhaps reflecting a role in adaptation to diverse environments. This work will facilitate the harnessing of untapped genetic diversity from wild soybean for enhancement of elite cultivars.
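
    The core/dispensable split described in this record can be illustrated from a gene presence/absence table. This is a minimal sketch under stated assumptions: the table layout (gene family mapped to the set of accessions carrying it), the function name, and the toy data are hypothetical, and the record does not describe how the authors defined gene families or called presence.

```python
def split_pan_genome(presence, accessions):
    """Split gene families into core (present in every accession) and
    dispensable (missing from at least one accession).

    presence:   dict mapping gene-family id -> set of accessions carrying it
    accessions: set of all accessions in the panel
    """
    if not presence:
        raise ValueError("presence table is empty")
    core = {gene for gene, found_in in presence.items() if found_in >= accessions}
    dispensable = set(presence) - core
    core_fraction = len(core) / len(presence)
    return core, dispensable, core_fraction


# Toy panel of three accessions and five gene families (not real data).
panel = {"A", "B", "C"}
table = {
    "g1": {"A", "B", "C"},
    "g2": {"A", "B"},
    "g3": {"A", "B", "C"},
    "g4": {"C"},
    "g5": {"A", "B", "C"},
}
core, dispensable, fraction = split_pan_genome(table, panel)
print(sorted(core), sorted(dispensable), fraction)
```

    With seven accessions, as in the record, a family counts as core only if all seven carry it; the reported figure of roughly 80% corresponds to `core_fraction` in this sketch.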

  2. Comprehensive Genomic Profiling of Esthesioneuroblastoma Reveals Additional Treatment Options.

    PubMed

    Gay, Laurie M; Kim, Sungeun; Fedorchak, Kyle; Kundranda, Madappa; Odia, Yazmin; Nangia, Chaitali; Battiste, James; Colon-Otero, Gerardo; Powell, Steven; Russell, Jeffery; Elvin, Julia A; Vergilio, Jo-Anne; Suh, James; Ali, Siraj M; Stephens, Philip J; Miller, Vincent A; Ross, Jeffrey S

    2017-07-01

    Esthesioneuroblastoma (ENB), also known as olfactory neuroblastoma, is a rare malignant neoplasm of the olfactory mucosa. Despite surgical resection combined with radiotherapy and adjuvant chemotherapy, ENB often relapses with rapid progression. Current multimodality, nontargeted therapy for relapsed ENB is of limited clinical benefit. We queried whether comprehensive genomic profiling (CGP) of relapsed or refractory ENB can uncover genomic alterations (GA) that could identify potential targeted therapies for these patients. CGP was performed on formalin-fixed, paraffin-embedded sections from 41 consecutive clinical cases of ENBs using a hybrid-capture, adaptor ligation based next-generation sequencing assay to a mean coverage depth of 593X. The results were analyzed for base substitutions, insertions and deletions, select rearrangements, and copy number changes (amplifications and homozygous deletions). Clinically relevant GA (CRGA) were defined as GA linked to drugs on the market or under evaluation in clinical trials. A total of 28 ENBs harbored GA, with a mean of 1.5 GA per sample. Approximately half of the ENBs (21, 51%) featured at least one CRGA, with an average of 1 CRGA per sample. The most commonly altered gene was TP53 (17%), with GA in PIK3CA, NF1, CDKN2A, and CDKN2C occurring in 7% of samples. We report comprehensive genomic profiles for 41 ENB tumors. CGP revealed potential new therapeutic targets, including targetable GA in the mTOR, CDK and growth factor signaling pathways, highlighting the clinical value of genomic profiling in ENB. Comprehensive genomic profiling of 41 relapsed or refractory ENBs reveals recurrent alterations or classes of mutation, including amplification of tyrosine kinases encoded on chromosome 5q and mutations affecting genes in the mTOR/PI3K pathway. Approximately half of the ENBs (21, 51%) featured at least one clinically relevant genomic alteration (CRGA), with an average of 1 CRGA per sample. The most commonly altered

  3. Whole-genome sequencing reveals clonal expansion of multiresistant Staphylococcus haemolyticus in European hospitals.

    PubMed

    Cavanagh, Jorunn Pauline; Hjerde, Erik; Holden, Matthew T G; Kahlke, Tim; Klingenberg, Claus; Flægstad, Trond; Parkhill, Julian; Bentley, Stephen D; Sollid, Johanna U Ericson

    2014-11-01

    Staphylococcus haemolyticus is an emerging cause of nosocomial infections, primarily affecting immunocompromised patients. A comparative genomic analysis was performed on clinical S. haemolyticus isolates to investigate their genetic relationship and explore the coding sequences with respect to antimicrobial resistance determinants and putative hospital adaptation. Whole-genome sequencing was performed on 134 isolates of S. haemolyticus from geographically diverse origins (Belgium, 2; Germany, 10; Japan, 13; Norway, 54; Spain, 2; Switzerland, 43; UK, 9; USA, 1). Each genome was individually assembled. Protein coding sequences (CDSs) were predicted and homologous genes were categorized into three types: Type I, core genes, homologues present in all strains; Type II, unique core genes, homologues shared by only a subgroup of strains; and Type III, unique genes, strain-specific CDSs. The phylogenetic relationship between the isolates was built from variable sites in the form of single nucleotide polymorphisms (SNPs) in the core genome and used to construct a maximum likelihood phylogeny. SNPs in the genome core regions divided the isolates into one major group of 126 isolates and one minor group of isolates with highly diverse genomes. The major group was further subdivided into seven clades (A-G), of which four (A-D) encompassed isolates only from Europe. Antimicrobial multiresistance was observed in 77.7% of the collection. High levels of homologous recombination were detected in genes involved in adherence, staphylococcal host adaptation and bacterial cell communication. The presence of several successful and highly resistant clones underlines the adaptive potential of this opportunistic pathogen. © The Author 2014. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy.

  4. Genomic Diversity of Phages Infecting Probiotic Strains of Lactobacillus paracasei

    PubMed Central

    Rousseau, Geneviève M.; Capra, María L.; Quiberoni, Andrea; Tremblay, Denise M.; Labrie, Simon J.

    2015-01-01

    Strains of the Lactobacillus casei group have been extensively studied because some are used as probiotics in foods. Conversely, their phages have received much less attention. We analyzed the complete genome sequences of five L. paracasei temperate phages: CL1, CL2, iLp84, iLp1308, and iA2. Only phage iA2 could not replicate in an indicator strain. The genome lengths ranged from 34,155 bp (iA2) to 39,474 bp (CL1). Phages iA2 and iLp1308 (34,176 bp) possess the smallest genomes reported, thus far, for phages of the L. casei group. The GC contents of the five phage genomes ranged from 44.8 to 45.6%. As observed with many other phages, their genomes were organized as follows: genes coding for DNA packaging, morphogenesis, lysis, lysogeny, and replication. Phages CL1, CL2, and iLp1308 are highly related to each other. Phage iLp84 was also related to these three phages, but the similarities were limited to gene products involved in DNA packaging and structural proteins. Genomic fragments of phages CL1, CL2, iLp1308, and iLp84 were found in several genomes of L. casei strains. Prophage iA2 is unrelated to these four phages, but almost all of its genome was found in at least four L. casei strains. Overall, these phages are distinct from previously characterized Lactobacillus phages. Our results highlight the diversity of L. casei phages and indicate frequent DNA exchanges between phages and their hosts. PMID:26475105

  5. Two Antarctic penguin genomes reveal insights into their evolutionary history and molecular changes related to the Antarctic environment.

    PubMed

    Li, Cai; Zhang, Yong; Li, Jianwen; Kong, Lesheng; Hu, Haofu; Pan, Hailin; Xu, Luohao; Deng, Yuan; Li, Qiye; Jin, Lijun; Yu, Hao; Chen, Yan; Liu, Binghang; Yang, Linfeng; Liu, Shiping; Zhang, Yan; Lang, Yongshan; Xia, Jinquan; He, Weiming; Shi, Qiong; Subramanian, Sankar; Millar, Craig D; Meader, Stephen; Rands, Chris M; Fujita, Matthew K; Greenwold, Matthew J; Castoe, Todd A; Pollock, David D; Gu, Wanjun; Nam, Kiwoong; Ellegren, Hans; Ho, Simon Yw; Burt, David W; Ponting, Chris P; Jarvis, Erich D; Gilbert, M Thomas P; Yang, Huanming; Wang, Jian; Lambert, David M; Wang, Jun; Zhang, Guojie

    2014-01-01

    Penguins are flightless aquatic birds widely distributed in the Southern Hemisphere. The distinctive morphological and physiological features of penguins allow them to live an aquatic life, and some of them have successfully adapted to the hostile environments in Antarctica. To study the phylogenetic and population history of penguins and the molecular basis of their adaptations to Antarctica, we sequenced the genomes of the two Antarctic dwelling penguin species, the Adélie penguin [Pygoscelis adeliae] and emperor penguin [Aptenodytes forsteri]. Phylogenetic dating suggests that early penguins arose ~60 million years ago, coinciding with a period of global warming. Analysis of effective population sizes reveals that the two penguin species experienced population expansions from ~1 million years ago to ~100 thousand years ago, but responded differently to the climatic cooling of the last glacial period. Comparative genomic analyses with other available avian genomes identified molecular changes in genes related to epidermal structure, phototransduction, lipid metabolism, and forelimb morphology. Our sequencing and initial analyses of the first two penguin genomes provide insights into the timing of penguin origin, fluctuations in effective population sizes of the two penguin species over the past 10 million years, and the potential associations between these biological patterns and global climate change. The molecular changes compared with other avian genomes reflect both shared and diverse adaptations of the two penguin species to the Antarctic environment.

  6. ATAC-see reveals the accessible genome by transposase-mediated imaging and sequencing.

    PubMed

    Chen, Xingqi; Shen, Ying; Draper, Will; Buenrostro, Jason D; Litzenburger, Ulrike; Cho, Seung Woo; Satpathy, Ansuman T; Carter, Ava C; Ghosh, Rajarshi P; East-Seletsky, Alexandra; Doudna, Jennifer A; Greenleaf, William J; Liphardt, Jan T; Chang, Howard Y

    2016-12-01

    Spatial organization of the genome plays a central role in gene expression, DNA replication, and repair. But current epigenomic approaches largely map DNA regulatory elements outside of the native context of the nucleus. Here we report assay of transposase-accessible chromatin with visualization (ATAC-see), a transposase-mediated imaging technology that employs direct imaging of the accessible genome in situ, cell sorting, and deep sequencing to reveal the identity of the imaged elements. ATAC-see revealed the cell-type-specific spatial organization of the accessible genome and the coordinated process of neutrophil chromatin extrusion, termed NETosis. Integration of ATAC-see with flow cytometry enables automated quantitation and prospective cell isolation as a function of chromatin accessibility, and it reveals a cell-cycle dependence of chromatin accessibility that is especially dynamic in G1 phase. The integration of imaging and epigenomics provides a general and scalable approach for deciphering the spatiotemporal architecture of gene control.

  7. The streamlined genome of Phytomonas spp. relative to human pathogenic kinetoplastids reveals a parasite tailored for plants.

    PubMed

    Porcel, Betina M; Denoeud, France; Opperdoes, Fred; Noel, Benjamin; Madoui, Mohammed-Amine; Hammarton, Tansy C; Field, Mark C; Da Silva, Corinne; Couloux, Arnaud; Poulain, Julie; Katinka, Michael; Jabbari, Kamel; Aury, Jean-Marc; Campbell, David A; Cintron, Roxana; Dickens, Nicholas J; Docampo, Roberto; Sturm, Nancy R; Koumandou, V Lila; Fabre, Sandrine; Flegontov, Pavel; Lukeš, Julius; Michaeli, Shulamit; Mottram, Jeremy C; Szöőr, Balázs; Zilberstein, Dan; Bringaud, Frédéric; Wincker, Patrick; Dollet, Michel

    2014-02-01

    Members of the family Trypanosomatidae infect many organisms, including animals, plants and humans. Plant-infecting trypanosomes are grouped under the single genus Phytomonas, failing to reflect the wide biological and pathological diversity of these protists. While some Phytomonas spp. multiply in the latex of plants, or in fruit or seeds without apparent pathogenicity, others colonize the phloem sap and afflict plants of substantial economic value, including the coffee tree, coconut and oil palms. Plant trypanosomes have not been studied extensively at the genome level, a major gap in understanding and controlling pathogenesis. We describe the genome sequences of two plant trypanosomatids, one pathogenic isolate from a Guianan coconut and one non-symptomatic isolate from Euphorbia collected in France. Although these parasites have extremely distinct pathogenic impacts, very few genes are unique to either, with the vast majority of genes shared by both isolates. Significantly, both Phytomonas spp. genomes consist essentially of single copy genes for the bulk of their metabolic enzymes, whereas other trypanosomatids e.g. Leishmania and Trypanosoma possess multiple paralogous genes or families. Indeed, comparison with other trypanosomatid genomes revealed a highly streamlined genome, encoding for a minimized metabolic system while conserving the major pathways, and with retention of a full complement of endomembrane organelles, but with no evidence for functional complexity. Identification of the metabolic genes of Phytomonas provides opportunities for establishing in vitro culturing of these fastidious parasites and new tools for the control of agricultural plant disease.

  8. The Streamlined Genome of Phytomonas spp. Relative to Human Pathogenic Kinetoplastids Reveals a Parasite Tailored for Plants

    PubMed Central

    Porcel, Betina M.; Denoeud, France; Opperdoes, Fred; Noel, Benjamin; Madoui, Mohammed-Amine; Hammarton, Tansy C.; Field, Mark C.; Da Silva, Corinne; Couloux, Arnaud; Poulain, Julie; Katinka, Michael; Jabbari, Kamel; Aury, Jean-Marc; Campbell, David A.; Cintron, Roxana; Dickens, Nicholas J.; Docampo, Roberto; Sturm, Nancy R.; Koumandou, V. Lila; Fabre, Sandrine; Flegontov, Pavel; Lukeš, Julius; Michaeli, Shulamit; Mottram, Jeremy C.; Szöőr, Balázs; Zilberstein, Dan; Bringaud, Frédéric; Wincker, Patrick; Dollet, Michel

    2014-01-01

    Members of the family Trypanosomatidae infect many organisms, including animals, plants and humans. Plant-infecting trypanosomes are grouped under the single genus Phytomonas, failing to reflect the wide biological and pathological diversity of these protists. While some Phytomonas spp. multiply in the latex of plants, or in fruit or seeds without apparent pathogenicity, others colonize the phloem sap and afflict plants of substantial economic value, including the coffee tree, coconut and oil palms. Plant trypanosomes have not been studied extensively at the genome level, a major gap in understanding and controlling pathogenesis. We describe the genome sequences of two plant trypanosomatids, one pathogenic isolate from a Guianan coconut and one non-symptomatic isolate from Euphorbia collected in France. Although these parasites have extremely distinct pathogenic impacts, very few genes are unique to either, with the vast majority of genes shared by both isolates. Significantly, both Phytomonas spp. genomes consist essentially of single copy genes for the bulk of their metabolic enzymes, whereas other trypanosomatids e.g. Leishmania and Trypanosoma possess multiple paralogous genes or families. Indeed, comparison with other trypanosomatid genomes revealed a highly streamlined genome, encoding for a minimized metabolic system while conserving the major pathways, and with retention of a full complement of endomembrane organelles, but with no evidence for functional complexity. Identification of the metabolic genes of Phytomonas provides opportunities for establishing in vitro culturing of these fastidious parasites and new tools for the control of agricultural plant disease. PMID:24516393

  9. Twenty-one genome sequences from Pseudomonas species and 19 genome sequences from diverse bacteria isolated from the rhizosphere and endosphere of Populus deltoides.

    PubMed

    Brown, Steven D; Utturkar, Sagar M; Klingeman, Dawn M; Johnson, Courtney M; Martin, Stanton L; Land, Miriam L; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A

    2012-11-01

    To aid in the investigation of the Populus deltoides microbiome, we generated draft genome sequences for 21 Pseudomonas strains and 19 other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium, and Variovorax were generated.

  10. Long-read sequencing of the coffee bean transcriptome reveals the diversity of full-length transcripts

    PubMed Central

    Cheng, Bing; Furtado, Agnelo

    2017-01-01

    Polyploidization contributes to the complexity of gene expression, resulting in numerous related but different transcripts. This study explored the transcriptome diversity and complexity of the tetraploid Arabica coffee (Coffea arabica) bean. Long-read sequencing (LRS) by PacBio Isoform sequencing (Iso-seq) was used to obtain full-length transcripts without the difficulty and uncertainty of assembly required for reads from short-read technologies. The tetraploid transcriptome was annotated and compared with data from the sub-genome progenitors. Caffeine and sucrose genes were targeted for case analysis. An isoform-level tetraploid coffee bean reference transcriptome with 95 995 distinct transcripts (average 3236 bp) was obtained. A total of 88 715 sequences (92.42%) were annotated with BLASTx against NCBI non-redundant plant proteins, including 34 719 high-quality annotations. Further BLASTn analysis against NCBI non-redundant nucleotide sequences, Coffea canephora coding sequences with UTR, C. arabica ESTs, and Rfam resulted in 1213 sequences without hits, which were potential novel genes in coffee. Longer UTRs were captured, especially in the 5′ UTRs, facilitating the identification of upstream open reading frames. The LRS also revealed more and longer transcript variants in key caffeine and sucrose metabolism genes from this polyploid genome. Long sequences (>10 kb) were poorly annotated. LRS technology shows the limitation of previous studies. It provides an important tool to produce a reference transcriptome including more of the diversity of full-length transcripts to help understand the biology and support the genetic improvement of polyploid species such as coffee. PMID:29048540

  11. Prokaryotic Caspase Homologs: Phylogenetic Patterns and Functional Characteristics Reveal Considerable Diversity

    PubMed Central

    Asplund-Samuelsson, Johannes; Bergman, Birgitta; Larsson, John

    2012-01-01

    Caspases accomplish initiation and execution of apoptosis, a programmed cell death process specific to metazoans. The existence of prokaryotic caspase homologs, termed metacaspases, has been known for slightly more than a decade. Despite their potential connection to the evolution of programmed cell death in eukaryotes, the phylogenetic distribution and functions of these prokaryotic metacaspase sequences are largely uncharted, while a few experiments imply involvement in programmed cell death. Aiming at providing a more detailed picture of prokaryotic caspase homologs, we applied a computational approach based on Hidden Markov Model search profiles to identify and functionally characterize putative metacaspases in bacterial and archaeal genomes. Out of the total of 1463 analyzed genomes, merely 267 (18%) were identified to contain putative metacaspases, but their taxonomic distribution included most prokaryotic phyla and a few archaea (Euryarchaeota). Metacaspases were particularly abundant in Alphaproteobacteria, Deltaproteobacteria and Cyanobacteria, which harbor many morphologically and developmentally complex organisms, and a distinct correlation was found between abundance and phenotypic complexity in Cyanobacteria. Notably, Bacillus subtilis and Escherichia coli, known to undergo genetically regulated autolysis, lacked metacaspases. Pfam domain architecture analysis combined with operon identification revealed rich and varied configurations among the metacaspase sequences. These imply roles in programmed cell death, but also e.g. in signaling, various enzymatic activities and protein modification. Together our data show a wide and scattered distribution of caspase homologs in prokaryotes with structurally and functionally diverse sub-groups, and with a potentially intriguing evolutionary role. These features will help delineate future characterizations of death pathways in prokaryotes. PMID:23185476

  12. Ecological and evolutionary significance of genomic GC content diversity in monocots

    PubMed Central

    Šmarda, Petr; Bureš, Petr; Horová, Lucie; Leitch, Ilia J.; Mucina, Ladislav; Pacini, Ettore; Tichý, Lubomír; Grulich, Vít; Rotreklová, Olga

    2014-01-01

    Genomic DNA base composition (GC content) is predicted to significantly affect genome functioning and species ecology. Although several hypotheses have been put forward to address the biological impact of GC content variation in microbial and vertebrate organisms, the biological significance of GC content diversity in plants remains unclear because of a lack of sufficiently robust genomic data. Using flow cytometry, we report genomic GC contents for 239 species representing 70 of 78 monocot families and compare them with genomic characters, a suite of life history traits and climatic niche data using phylogeny-based statistics. GC content of monocots varied between 33.6% and 48.9%, with several groups exceeding the GC content known for any other vascular plant group, highlighting their unusual genome architecture and organization. GC content showed a quadratic relationship with genome size, with the decreases in GC content in larger genomes possibly being a consequence of the higher biochemical costs of GC base synthesis. Dramatic decreases in GC content were observed in species with holocentric chromosomes, whereas increased GC content was documented in species able to grow in seasonally cold and/or dry climates, possibly indicating an advantage of GC-rich DNA during cell freezing and desiccation. We also show that genomic adaptations associated with changing GC content might have played a significant role in the evolution of the Earth’s contemporary biota, such as the rise of grass-dominated biomes during the mid-Tertiary. One of the major selective advantages of GC-rich DNA is hypothesized to be facilitating more complex gene regulation. PMID:25225383

  13. Twenty-One Genome Sequences from Pseudomonas Species and 19 Genome Sequences from Diverse Bacteria Isolated from the Rhizosphere and Endosphere of Populus deltoides

    PubMed Central

    Utturkar, Sagar M.; Klingeman, Dawn M.; Johnson, Courtney M.; Martin, Stanton L.; Land, Miriam L.; Lu, Tse-Yuan S.; Schadt, Christopher W.; Doktycz, Mitchel J.

    2012-01-01

    To aid in the investigation of the Populus deltoides microbiome, we generated draft genome sequences for 21 Pseudomonas strains and 19 other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium, and Variovorax were generated. PMID:23045501

  14. Twenty-One Genome Sequences from Pseudomonas Species and 19 Genome Sequences from Diverse Bacteria Isolated from the Rhizosphere and Endosphere of Populus deltoides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, Steven D; Utturkar, Sagar M; Klingeman, Dawn Marie

    To aid in the investigation of the Populus deltoides microbiome we generated draft genome sequences for twenty one Pseudomonas and twenty one other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Burkholderia, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium and Variovorax were generated.

  15. Functional Genomics of Novel Secondary Metabolites from Diverse Cyanobacteria Using Untargeted Metabolomics

    PubMed Central

    Baran, Richard; Ivanova, Natalia N.; Jose, Nick; Garcia-Pichel, Ferran; Kyrpides, Nikos C.; Gugger, Muriel; Northen, Trent R.

    2013-01-01

    Mass spectrometry-based metabolomics has become a powerful tool for the detection of metabolites in complex biological systems and for the identification of novel metabolites. We previously identified a number of unexpected metabolites in the cyanobacterium Synechococcus sp. PCC 7002, such as histidine betaine, its derivatives and several unusual oligosaccharides. To test for the presence of these compounds and to assess the diversity of small polar metabolites in other cyanobacteria, we profiled cell extracts of nine strains representing much of the morphological and evolutionary diversification of this phylum. Spectral features in raw metabolite profiles obtained by normal phase liquid chromatography coupled to mass spectrometry (MS) were manually curated so that chemical formulae of metabolites could be assigned. For putative identification, retention times and MS/MS spectra were cross-referenced with those of standards or available spectral library records. Overall, we detected 264 distinct metabolites. These indeed included different betaines and oligosaccharides, as well as additional unidentified metabolites with chemical formulae not present in databases of metabolism. Some of these metabolites were detected only in a single strain, but some were present in more than one. Genomic interrogation of the strains revealed that generally, presence of a given metabolite corresponded well with the presence of its biosynthetic genes, if known. Our results show the potential of combining metabolite profiling and genomics for the identification of novel biosynthetic genes. PMID:24084783

  16. Genome Sequencing and Analysis of Geographically Diverse Clinical Isolates of Herpes Simplex Virus 2

    PubMed Central

    Lamers, Susanna L.; Weiner, Brian; Ray, Stuart C.; Colgrove, Robert C.; Diaz, Fernando; Jing, Lichen; Wang, Kening; Saif, Sakina; Young, Sarah; Henn, Matthew; Laeyendecker, Oliver; Tobian, Aaron A. R.; Cohen, Jeffrey I.; Koelle, David M.; Quinn, Thomas C.; Knipe, David M.

    2015-01-01

    ABSTRACT Herpes simplex virus 2 (HSV-2), the principal causative agent of recurrent genital herpes, is a highly prevalent viral infection worldwide. Limited information is available on the amount of genomic DNA variation between HSV-2 strains because only two genomes have been determined, the HG52 laboratory strain and the newly sequenced SD90e low-passage-number clinical isolate strain, each from a different geographical area. In this study, we report the nearly complete genome sequences of 34 HSV-2 low-passage-number and laboratory strains, 14 of which were collected in Uganda, 1 in South Africa, 11 in the United States, and 8 in Japan. Our analyses of these genomes demonstrated remarkable sequence conservation, regardless of geographic origin, with the maximum nucleotide divergence between strains being 0.4% across the genome. In contrast, prior studies indicated that HSV-1 genomes exhibit more sequence diversity, as well as geographical clustering. Additionally, unlike HSV-1, little viral recombination between HSV-2 strains could be substantiated. These results are interpreted in light of HSV-2 evolution, epidemiology, and pathogenesis. Finally, the newly generated sequences more closely resemble the low-passage-number SD90e than HG52, supporting the use of the former as the new reference genome of HSV-2. IMPORTANCE Herpes simplex virus 2 (HSV-2) is a causative agent of genital and neonatal herpes. Therefore, knowledge of its DNA genome and genetic variability is central to preventing and treating genital herpes. However, only two full-length HSV-2 genomes have been reported. In this study, we sequenced 34 additional HSV-2 low-passage-number and laboratory viral genomes and initiated analysis of the genetic diversity of HSV-2 strains from around the world. The analysis of these genomes will facilitate research aimed at vaccine development, diagnosis, and the evaluation of clinical manifestations and transmission of HSV-2. This information will also contribute

  17. Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes

    PubMed Central

    Thybert, David; Roller, Maša; Navarro, Fábio C.P.; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C.; Laukaitis, Christina M.; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A.; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J.; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M.; Odom, Duncan T.; Flicek, Paul

    2018-01-01

    Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. PMID:29563166

  18. Genome and Transcriptome Sequences Reveal the Specific Parasitism of the Nematophagous Purpureocillium lilacinum 36-1

    PubMed Central

    Xie, Jialian; Li, Shaojun; Mo, Chenmi; Xiao, Xueqiong; Peng, Deliang; Wang, Gaofeng; Xiao, Yannong

    2016-01-01

    Purpureocillium lilacinum is a promising nematophagous ascomycete able to adapt to diverse environments and it is also an opportunistic fungus that infects humans. A microbial inoculant of P. lilacinum has been registered to control plant parasitic nematodes. However, the molecular mechanism of the toxicological processes is still unclear because of the relatively few reports on the subject. In this study, using Illumina paired-end sequencing, the draft genome sequence and the transcriptome of P. lilacinum strain 36-1 infecting nematode-eggs were determined. Whole genome alignment indicated that P. lilacinum 36-1 possessed a more dynamic genome in comparison with P. lilacinum India strain. Moreover, a phylogenetic analysis showed that the P. lilacinum 36-1 had a closer relation to entomophagous fungi. The protein-coding genes in P. lilacinum 36-1 occurred much more frequently than they did in other fungi, which was a result of the depletion of repeat-induced point mutations (RIP). Comparative genome and transcriptome analyses revealed the genes that were involved in pathogenicity, particularly in the recognition, adhesion of nematode-eggs, downstream signal transduction pathways and hydrolase genes. By contrast, certain numbers of cellulose and xylan degradation genes and a lack of polysaccharide lyase genes showed the potential of P. lilacinum 36-1 as an endophyte. Notably, the expression of appressorium-formation and antioxidants-related genes exhibited similar infection patterns in P. lilacinum strain 36-1 to those of the model entomophagous fungi Metarhizium spp. These results uncovered the specific parasitism of P. lilacinum and presented the genes responsible for the infection of nematode-eggs. PMID:27486440

  19. Silage Collected from Dairy Farms Harbors an Abundance of Listeriaphages with Considerable Host Range and Genome Size Diversity

    PubMed Central

    Vongkamjan, Kitiya; Switt, Andrea Moreno; den Bakker, Henk C.; Fortes, Esther D.

    2012-01-01

    Since the food-borne pathogen Listeria monocytogenes is common in dairy farm environments, it is likely that phages infecting this bacterium (“listeriaphages”) are abundant on dairy farms. To better understand the ecology and diversity of listeriaphages on dairy farms and to develop a diverse phage collection for further studies, silage samples collected on two dairy farms were screened for L. monocytogenes and listeriaphages. While only 4.5% of silage samples tested positive for L. monocytogenes, 47.8% of samples were positive for listeriaphages, containing up to >1.5 × 10^4 PFU/g. Host range characterization of the 114 phage isolates obtained, with a reference set of 13 L. monocytogenes strains representing the nine major serotypes and four lineages, revealed considerable host range diversity; phage isolates were classified into nine lysis groups. While one serotype 3c strain was not lysed by any phage isolates, serotype 4 strains were highly susceptible to phages and were lysed by 63.2 to 88.6% of phages tested. Overall, 12.3% of phage isolates showed a narrow host range (lysing 1 to 5 strains), while 28.9% of phages represented broad host range (lysing ≥11 strains). Genome sizes of the phage isolates were estimated to range from approximately 26 to 140 kb. The extensive host range and genomic diversity of phages observed here suggest an important role of phages in the ecology of L. monocytogenes on dairy farms. In addition, the phage collection developed here has the potential to facilitate further development of phage-based biocontrol strategies (e.g., in silage) and other phage-based tools. PMID:23042180

  20. Wheat Landrace Genome Diversity

    PubMed Central

    Wingen, Luzie U.; West, Claire; Leverington-Waite, Michelle; Collier, Sarah; Orford, Simon; Goram, Richard; Yang, Cai-Yun; King, Julie; Allen, Alexandra M.; Burridge, Amanda; Edwards, Keith J.; Griffiths, Simon

    2017-01-01

    Understanding the genomic complexity of bread wheat (Triticum aestivum L.) is a cornerstone in the quest to unravel the processes of domestication and the following adaptation of domesticated wheat to a wide variety of environments across the globe. Additionally, it is of importance for future improvement of the crop, particularly in the light of climate change. Focusing on the adaptation after domestication, a nested association mapping (NAM) panel of 60 segregating biparental populations was developed, mainly involving landrace accessions from the core set of the Watkins hexaploid wheat collection optimized for genetic diversity. A modern spring elite variety, “Paragon,” was used as common reference parent. Genetic maps were constructed following identical rules to make them comparable. In total, 1611 linkage groups were identified, based on recombination from an estimated 126,300 crossover events over the whole NAM panel. A consensus map, named landrace consensus map (LRC), was constructed and contained 2498 genetic loci. These newly developed genetics tools were used to investigate the rules underlying genome fluidity or rigidity, e.g., by comparing marker distances and marker orders. In general, marker order was highly correlated, which provides support for strong synteny between bread wheat accessions. However, many exceptional cases of incongruent linkage groups and increased marker distances were also found. Segregation distortion was detected for many markers, sometimes as hot spots present in different populations. Furthermore, evidence for translocations in at least 36 of the maps was found. These translocations fell, in general, into many different translocation classes, but a few translocation classes were found in several accessions, the most frequent one being the well-known T5B:7B translocation. Loci involved in recombination rate, which is an interesting trait for plant breeding, were identified by QTL analyses using the crossover counts as a

  1. Whole genome amplification approach reveals novel polyhydroxyalkanoate synthases (PhaCs) from Japan Trench and Nankai Trough seawater.

    PubMed

    Foong, Choon Pin; Lau, Nyok-Sean; Deguchi, Shigeru; Toyofuku, Takashi; Taylor, Todd D; Sudesh, Kumar; Matsui, Minami

    2014-12-24

    Special features of the Japanese ocean include its ranges of latitude and depth. This study is the first to examine the diversity of Class I and II PHA synthases (PhaC) in DNA samples from pelagic seawater taken from the Japan Trench and Nankai Trough from a range of depths from 24 m to 5373 m. PhaC is the key enzyme in microorganisms that determines the types of monomer units that are polymerized into polyhydroxyalkanoate (PHA) and thus affects the physicochemical properties of this thermoplastic polymer. Complete putative PhaC sequences were determined via genome walking, and the activities of newly discovered PhaCs were evaluated in a heterologous host. A total of 76 putative phaC PCR fragments were amplified from the whole genome amplified seawater DNA. Of these, 55 clones contained conserved PhaC domains and were classified into 20 genetic groups depending on their sequence similarity. Eleven genetic groups have undisclosed PhaC activity based on their distinct phylogenetic lineages from known PHA producers. Three complete DNA coding sequences were determined by IAN-PCR, and one PhaC was able to produce poly(3-hydroxybutyrate) in recombinant Cupriavidus necator PHB-4 (PHB-negative mutant). A new functional PhaC that has close identity to Marinobacter sp. was discovered in this study. Phylogenetic classification for all the phaC genes isolated from uncultured bacteria has revealed that seawater and other environmental resources harbor a great diversity of PhaCs with activities that have not yet been investigated. Functional evaluation of these in silico-based PhaCs via genome walking has provided new insights into the polymerizing ability of these enzymes.

  2. Increasing the involvement of diverse populations in genomics-based health care-lessons from haemoglobinopathies.

    PubMed

    Robinson, Helen M

    2017-10-01

    Integrating genomic medicine into health care delivery poses significant challenges to health professionals. To draw clinical benefit from genomic information, there is a need to build an evidence-based relationship between genotype and the physical expression of that genomic information. The work presented here uses preliminary work in the field of haemoglobinopathies to address two important challenges: to ensure that health care professionals in low- and middle-income countries are actively involved in the processes that will support genomic medicine, and that equity and diversity concerns are met so that clinical services can have relevance across all population and sub-population groups. Haemoglobinopathies provide an opportunity for gaining a better understanding of how long-standing genetic knowledge can be leveraged to determine if genomic-based services can be beneficial in low-resource settings. The Global Globin 2020 Challenge (GG2020) is an international initiative that uses haemoglobinopathies as an entry point to achieving growth in the quality and quantity of curated inputs into internationally recognised databases, harmonising the sharing of variant information within and between countries for better health care delivery and ensuring that storing, curation and sharing of variant information become an integral part of health care. Early findings from GG2020 indicate that paying attention to population diversity is an integral part of prevention and control of haemoglobinopathies.

  3. Streamlining genomes: toward the generation of simplified and stabilized microbial systems.

    PubMed

    Leprince, Audrey; van Passel, Mark W J; dos Santos, Vitor A P Martins

    2012-10-01

    At the junction between systems and synthetic biology, genome streamlining provides a solid foundation both for increased understanding of cellular circuitry, and for the tailoring of microbial chassis towards innovative biotechnological applications. Iterative genomic deletions (targeted and random) help to generate simplified, stabilized and predictable genomes, whereas multiplexing genome engineering reveals a broad functional genetic diversity. The decrease in oligo and gene synthesis costs promises effective combinatorial tools for the generation of chassis based on streamlined and tractable genomes. Here we review recent progress in streamlining genomes through recombineering techniques aiming to generate insights into cellular mechanisms and responses towards the design and assembly of streamlined genome chassis together with new cellular modules in diverse biotechnological applications.

  4. Reference-guided assembly of four diverse Arabidopsis thaliana genomes

    PubMed Central

    Schneeberger, Korbinian; Ossowski, Stephan; Ott, Felix; Klein, Juliane D.; Wang, Xi; Lanz, Christa; Smith, Lisa M.; Cao, Jun; Fitz, Joffrey; Warthmann, Norman; Henz, Stefan R.; Huson, Daniel H.; Weigel, Detlef

    2011-01-01

    We present whole-genome assemblies of four divergent Arabidopsis thaliana strains that complement the 125-Mb reference genome sequence released a decade ago. Using a newly developed reference-guided approach, we assembled large contigs from 9 to 42 Gb of Illumina short-read data from the Landsberg erecta (Ler-1), C24, Bur-0, and Kro-0 strains, which have been sequenced as part of the 1,001 Genomes Project for this species. Using alignments against the reference sequence, we first reduced the complexity of the de novo assembly and later integrated reads without similarity to the reference sequence. As an example, half of the noncentromeric C24 genome was covered by scaffolds that are longer than 260 kb, with a maximum of 2.2 Mb. Moreover, over 96% of the reference genome was covered by the reference-guided assembly, compared with only 87% with a complete de novo assembly. Comparisons with 2 Mb of dideoxy sequence reveal that the per-base error rate of the reference-guided assemblies was below 1 in 10,000. Our assemblies provide a detailed, genomewide picture of large-scale differences between A. thaliana individuals, most of which are difficult to access with alignment-consensus methods only. We demonstrate their practical relevance in studying the expression differences of polymorphic genes and show how the analysis of sRNA sequencing data can lead to erroneous conclusions if aligned against the reference genome alone. Genome assemblies, raw reads, and further information are accessible through http://1001genomes.org/projects/assemblies.html. PMID:21646520

  5. Exceptional diversity, non-random distribution, and rapid evolution of retroelements in the B73 maize genome.

    PubMed

    Baucom, Regina S; Estill, James C; Chaparro, Cristian; Upshaw, Naadira; Jogi, Ansuya; Deragon, Jean-Marc; Westerman, Richard P; Sanmiguel, Phillip J; Bennetzen, Jeffrey L

    2009-11-01

    Recent comprehensive sequence analysis of the maize genome now permits detailed discovery and description of all transposable elements (TEs) in this complex nuclear environment. Reiteratively optimized structural and homology criteria were used in the computer-assisted search for retroelements, TEs that transpose by reverse transcription of an RNA intermediate, with the final results verified by manual inspection. Retroelements were found to occupy the majority (>75%) of the nuclear genome in maize inbred B73. Unprecedented genetic diversity was discovered in the long terminal repeat (LTR) retrotransposon class of retroelements, with >400 families (>350 newly discovered) contributing >31,000 intact elements. The two other classes of retroelements, SINEs (four families) and LINEs (at least 30 families), were observed to contribute 1,991 and approximately 35,000 copies, respectively, or a combined approximately 1% of the B73 nuclear genome. With regard to fully intact elements, the median copy number for all retroelement families in maize was 2 because >250 LTR retrotransposon families contained only one or two intact members that could be detected in the B73 draft sequence. The majority, perhaps all, of the investigated retroelement families exhibited non-random dispersal across the maize genome, with LINEs, SINEs, and many low-copy-number LTR retrotransposons exhibiting a bias for accumulation in gene-rich regions. In contrast, most (but not all) medium- and high-copy-number LTR retrotransposons were found to preferentially accumulate in gene-poor regions like pericentromeric heterochromatin, while a few high-copy-number families exhibited the opposite bias. Regions of the genome with the highest LTR retrotransposon density contained the lowest LTR retrotransposon diversity. These results indicate that the maize genome provides a great number of different niches for the survival and procreation of a great variety of retroelements that have evolved to differentially

  6. Salmonella enterica Prophage Sequence Profiles Reflect Genome Diversity and Can Be Used for High Discrimination Subtyping.

    PubMed

    Mottawea, Walid; Duceppe, Marc-Olivier; Dupras, Andrée A; Usongo, Valentine; Jeukens, Julie; Freschi, Luca; Emond-Rheault, Jean-Guillaume; Hamel, Jeremie; Kukavica-Ibrulj, Irena; Boyle, Brian; Gill, Alexander; Burnett, Elton; Franz, Eelco; Arya, Gitanjali; Weadge, Joel T; Gruenheid, Samantha; Wiedmann, Martin; Huang, Hongsheng; Daigle, France; Moineau, Sylvain; Bekal, Sadjia; Levesque, Roger C; Goodridge, Lawrence D; Ogunremi, Dele

    2018-01-01

    Non-typhoidal Salmonella is a leading cause of foodborne illness worldwide. Prompt and accurate identification of the sources of Salmonella responsible for disease outbreaks is crucial to minimize infections and eliminate ongoing sources of contamination. Current subtyping tools including single nucleotide polymorphism (SNP) typing may be inadequate, in some instances, to provide the required discrimination among epidemiologically unrelated Salmonella strains. Prophage genes represent the majority of the accessory genes in bacterial genomes and have potential to be used as high discrimination markers in Salmonella. In this study, the prophage sequence diversity in different Salmonella serovars and genetically related strains was investigated. Using whole genome sequences of 1,760 isolates of S. enterica representing 151 Salmonella serovars and 66 closely related bacteria, prophage sequences were identified from assembled contigs using PHASTER. We detected 154 different prophages in S. enterica genomes. Prophage sequences were highly variable among S. enterica serovars with a median ± interquartile range (IQR) of 5 ± 3 prophage regions per genome. While some prophage sequences were highly conserved among the strains of specific serovars, few regions were lineage specific. Therefore, strains belonging to each serovar could be clustered separately based on their prophage content. Analysis of S. Enteritidis isolates from seven outbreaks generated distinct prophage profiles for each outbreak. Taken together, the diversity of the prophage sequences correlates with genome diversity. Prophage repertoires provide an additional marker for differentiating S. enterica subtypes during foodborne outbreaks.

  7. Genomic Diversity of Biocontrol Strains of Pseudomonas spp. Isolated from Aerial or Root Surfaces of Plants

    USDA-ARS's Scientific Manuscript database

    The striking ecological, metabolic, and biochemical diversity of Pseudomonas has intrigued microbiologists for many decades. To explore the genomic diversity of biocontrol strains of Pseudomonas spp., we derived high quality draft sequences of seven strains known to suppress plant disease. The str...

  8. Whole genome sequences of the USMARC sheep diversity panel v 2.4 aligned to the ovine reference genome assembly

    USDA-ARS's Scientific Manuscript database

    A searchable and publicly viewable set of mapped genomes from 96 rams from 9 US sheep breeds was created. The nine pure breeds were selected to represent genetic diversity for traits such as fertility, prolificacy, maternal ability, growth rate, carcass leanness, wool quality, mature weight, and lo...

  9. Phylogenetic and Evolutionary Patterns in Microbial Carotenoid Biosynthesis Are Revealed by Comparative Genomics

    PubMed Central

    Klassen, Jonathan L.

    2010-01-01

    Background: Carotenoids are multifunctional, taxonomically widespread and biotechnologically important pigments. Their biosynthesis serves as a model system for understanding the evolution of secondary metabolism. Microbial carotenoid diversity and evolution have hitherto been analyzed primarily from structural and biosynthetic perspectives, with the few phylogenetic analyses of microbial carotenoid biosynthetic proteins either using limited datasets or lacking methodological rigor. Given the recent accumulation of microbial genome sequences, a reappraisal of microbial carotenoid biosynthetic diversity and evolution from the perspective of comparative genomics is warranted to validate and complement models of microbial carotenoid diversity and evolution based upon structural and biosynthetic data. Methodology/Principal Findings: Comparative genomics were used to identify and analyze in silico microbial carotenoid biosynthetic pathways. Four major phylogenetic lineages of carotenoid biosynthesis are suggested composed of: (i) Proteobacteria; (ii) Firmicutes; (iii) Chlorobi, Cyanobacteria and photosynthetic eukaryotes; and (iv) Archaea, Bacteroidetes and two separate sub-lineages of Actinobacteria. Using this phylogenetic framework, specific evolutionary mechanisms are proposed for carotenoid desaturase CrtI-family enzymes and carotenoid cyclases. Several phylogenetic lineage-specific evolutionary mechanisms are also suggested, including: (i) horizontal gene transfer; (ii) gene acquisition followed by differential gene loss; (iii) co-evolution with other biochemical structures such as proteorhodopsins; and (iv) positive selection. Conclusions/Significance: Comparative genomics analyses of microbial carotenoid biosynthetic proteins indicate a much greater taxonomic diversity than that identified based on structural and biosynthetic data, and divide microbial carotenoid biosynthesis into several, well-supported phylogenetic lineages not evident previously. This

  10. Proteomics and comparative genomics of Nitrososphaera viennensis reveal the core genome and adaptations of archaeal ammonia oxidizers

    PubMed Central

    Kerou, Melina; Offre, Pierre; Valledor, Luis; Abby, Sophie S.; Melcher, Michael; Nagler, Matthias; Weckwerth, Wolfram; Schleper, Christa

    2016-01-01

    Ammonia-oxidizing archaea (AOA) are among the most abundant microorganisms and key players in the global nitrogen and carbon cycles. They share a common energy metabolism but represent a heterogeneous group with respect to their environmental distribution and adaptations, growth requirements, and genome contents. We report here the genome and proteome of Nitrososphaera viennensis EN76, the type species of the archaeal class Nitrososphaeria of the phylum Thaumarchaeota encompassing all known AOA. N. viennensis is a soil organism with a 2.52-Mb genome and 3,123 predicted protein-coding genes. Proteomic analysis revealed that nearly 50% of the predicted genes were translated under standard laboratory growth conditions. Comparison with genomes of closely related species of the predominantly terrestrial Nitrososphaerales as well as the more streamlined marine Nitrosopumilales [Candidatus (Ca.) order] and the acidophile “Ca. Nitrosotalea devanaterra” revealed a core genome of AOA comprising 860 genes, which allowed for the reconstruction of central metabolic pathways common to all known AOA and expressed in the N. viennensis and “Ca. Nitrosopelagicus brevis” proteomes. Concomitantly, we were able to identify candidate proteins for as yet unidentified crucial steps in central metabolisms. In addition to unraveling aspects of core AOA metabolism, we identified specific metabolic innovations associated with the Nitrososphaerales mediating growth and survival in the soil milieu, including the capacity for biofilm formation, cell surface modifications and cell adhesion, and carbohydrate conversions as well as detoxification of aromatic compounds and drugs. PMID:27864514

  11. Co-invading symbiotic mutualists of Medicago polymorpha retain high ancestral diversity and contain diverse accessory genomes.

    PubMed

    Porter, Stephanie S; Faber-Hammond, Joshua J; Friesen, Maren L

    2018-01-01

    Exotic, invasive plants and animals can wreak havoc on ecosystems by displacing natives and altering environmental conditions. However, much less is known about the identities or evolutionary dynamics of the symbiotic microbes that accompany invasive species. Most leguminous plants rely upon symbiotic rhizobium bacteria to fix nitrogen and are incapable of colonizing areas devoid of compatible rhizobia. We compare the genomes of symbiotic rhizobia in a portion of the legume's invaded range with those of the rhizobium symbionts from across the legume's native range. We show that in an area of California the legume Medicago polymorpha has invaded, its Ensifer medicae symbionts: (i) exhibit genome-wide patterns of relatedness that together with historical evidence support host-symbiont co-invasion from Europe into California, (ii) exhibit population genomic patterns consistent with the introduction of the majority of deep diversity from the native range, rather than a genetic bottleneck during colonization of California and (iii) harbor a large set of accessory genes uniquely enriched in binding functions, which could play a role in habitat invasion. Examining microbial symbiont genome dynamics during biological invasions is critical for assessing host-symbiont co-invasions whereby microbial symbiont range expansion underlies plant and animal invasions. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  12. Genomic Analyses Reveal the Influence of Geographic Origin, Migration, and Hybridization on Modern Dog Breed Development.

    PubMed

    Parker, Heidi G; Dreger, Dayna L; Rimbault, Maud; Davis, Brian W; Mullen, Alexandra B; Carpintero-Ramirez, Gretchen; Ostrander, Elaine A

    2017-04-25

    There are nearly 400 modern domestic dog breeds with unique histories and genetic profiles. To track the genetic signatures of breed development, we have assembled the most diverse dataset of dog breeds, reflecting their extensive phenotypic variation and heritage. Combining genetic distance, migration, and genome-wide haplotype sharing analyses, we uncover geographic patterns of development and independent origins of common traits. Our analyses reveal the hybrid history of breeds and elucidate the effects of immigration, revealing for the first time a suggestion of New World dog within some modern breeds. Finally, we used cladistics and haplotype sharing to show that some common traits have arisen more than once in the history of the dog. These analyses characterize the complexities of breed development, resolving longstanding questions regarding individual breed origination, the effect of migration on geographically distinct breeds, and, by inference, transfer of trait and disease alleles among dog breeds. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  13. Diversity and Activity of Alternative Nitrogenases in Sequenced Genomes and Coastal Environments

    PubMed Central

    McRose, Darcy L.; Zhang, Xinning; Kraepiel, Anne M. L.; Morel, François M. M.

    2017-01-01

    The nitrogenase enzyme, which catalyzes the reduction of N2 gas to NH4+, occurs as three separate isozymes that use Mo, Fe-only, or V. The majority of global nitrogen fixation is attributed to the more efficient ‘canonical’ Mo-nitrogenase, whereas Fe-only and V-(‘alternative’) nitrogenases are often considered ‘backup’ enzymes, used when Mo is limiting. Yet, the environmental distribution and diversity of alternative nitrogenases remains largely unknown. We searched for alternative nitrogenase genes in sequenced genomes and used PacBio sequencing to explore the diversity of canonical (nifD) and alternative (anfD and vnfD) nitrogenase amplicons in two coastal environments: the Florida Everglades and Sippewissett Marsh (MA). Genome-based searches identified an additional 25 species and 10 genera not previously known to encode alternative nitrogenases. Alternative nitrogenase amplicons were found in both Sippewissett Marsh and the Florida Everglades and their activity was further confirmed using newly developed isotopic techniques. Conserved amino acid sequences corresponding to cofactor ligands were also analyzed in anfD and vnfD amplicons, offering insight into environmental variants of these motifs. This study increases the number of available anfD and vnfD sequences ∼20-fold and allows for the first comparisons of environmental Mo-, Fe-only, and V-nitrogenase diversity. Our results suggest that alternative nitrogenases are maintained across a range of organisms and environments and that they can make important contributions to nitrogenase diversity and nitrogen fixation. PMID:28293220

  14. Diversity and Activity of Alternative Nitrogenases in Sequenced Genomes and Coastal Environments.

    PubMed

    McRose, Darcy L; Zhang, Xinning; Kraepiel, Anne M L; Morel, François M M

    2017-01-01

    The nitrogenase enzyme, which catalyzes the reduction of N2 gas to NH4+, occurs as three separate isozymes that use Mo, Fe-only, or V. The majority of global nitrogen fixation is attributed to the more efficient 'canonical' Mo-nitrogenase, whereas Fe-only and V-('alternative') nitrogenases are often considered 'backup' enzymes, used when Mo is limiting. Yet, the environmental distribution and diversity of alternative nitrogenases remains largely unknown. We searched for alternative nitrogenase genes in sequenced genomes and used PacBio sequencing to explore the diversity of canonical (nifD) and alternative (anfD and vnfD) nitrogenase amplicons in two coastal environments: the Florida Everglades and Sippewissett Marsh (MA). Genome-based searches identified an additional 25 species and 10 genera not previously known to encode alternative nitrogenases. Alternative nitrogenase amplicons were found in both Sippewissett Marsh and the Florida Everglades and their activity was further confirmed using newly developed isotopic techniques. Conserved amino acid sequences corresponding to cofactor ligands were also analyzed in anfD and vnfD amplicons, offering insight into environmental variants of these motifs. This study increases the number of available anfD and vnfD sequences ∼20-fold and allows for the first comparisons of environmental Mo-, Fe-only, and V-nitrogenase diversity. Our results suggest that alternative nitrogenases are maintained across a range of organisms and environments and that they can make important contributions to nitrogenase diversity and nitrogen fixation.

  15. Genomic variation in Plasmodium vivax malaria reveals regions under selective pressure.

    PubMed

    Diez Benavente, Ernest; Ward, Zoe; Chan, Wilson; Mohareb, Fady R; Sutherland, Colin J; Roper, Cally; Campino, Susana; Clark, Taane G

    2017-01-01

    Although Plasmodium vivax contributes to almost half of all malaria cases outside Africa, it has been relatively neglected compared to the more deadly P. falciparum. It is known that P. vivax populations possess high genetic diversity, differing geographically potentially due to different vector species, host genetics and environmental factors. We analysed the high-quality genomic data for 46 P. vivax isolates spanning 10 countries across 4 continents. Using population genetic methods we identified hotspots of selection pressure, including the previously reported MRP1 and DHPS genes, both putative drug resistance loci. Extra copies and deletions in the promoter region of another drug resistance candidate, MDR1 gene, and duplications in the Duffy binding protein gene (PvDBP) potentially involved in erythrocyte invasion, were also identified. For surveillance applications, continental-informative markers were found in putative drug resistance loci, and we show that organellar polymorphisms could classify P. vivax populations across continents and differentiate between Plasmodia spp. This study has shown that genomic diversity that lies within and between P. vivax populations can be used to elucidate potential drug resistance and invasion mechanisms, as well as facilitate the molecular barcoding of the parasite for surveillance applications.

  16. Analysis of Complete Genomes of Propionibacterium acnes Reveals a Novel Plasmid and Increased Pseudogenes in an Acne Associated Strain

    PubMed Central

    Fitz-Gibbon, Sorel; Tomida, Shuta; Li, Huiying

    2013-01-01

    The human skin harbors a diverse community of bacteria, including the Gram-positive, anaerobic bacterium Propionibacterium acnes. P. acnes has historically been linked to the pathogenesis of acne vulgaris, a common skin disease affecting over 80% of all adolescents in the US. To gain insight into potential P. acnes pathogenic mechanisms, we previously sequenced the complete genome of a P. acnes strain HL096PA1 that is highly associated with acne. In this study, we compared its genome to the first published complete genome KPA171202. HL096PA1 harbors a linear plasmid, pIMPLE-HL096PA1. This is the first described P. acnes plasmid. We also observed a five-fold increase of pseudogenes in HL096PA1, several of which encode proteins in carbohydrate transport and metabolism. In addition, our analysis revealed a few island-like genomic regions that are unique to HL096PA1 and a large genomic inversion spanning the ribosomal operons. Together, these findings offer a basis for understanding P. acnes virulent properties, host adaptation mechanisms, and its potential role in acne pathogenesis at the strain level. Furthermore, the plasmid identified in HL096PA1 may potentially provide a new opportunity for P. acnes genetic manipulation and targeted therapy against specific disease-associated strains. PMID:23762865

  17. Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes.

    PubMed

    Thybert, David; Roller, Maša; Navarro, Fábio C P; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janoušek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C; Laukaitis, Christina M; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M; Odom, Duncan T; Flicek, Paul

    2018-04-01

    Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology. © 2018 Thybert et al.; Published by Cold Spring Harbor Laboratory Press.

  18. Genetic diversity and genome-wide association analysis of cooking time in dry bean (Phaseolus vulgaris L.).

    PubMed

    Cichy, Karen A; Wiesinger, Jason A; Mendoza, Fernando A

    2015-08-01

    Fivefold diversity for cooking time found in a panel of 206 Phaseolus vulgaris accessions. Fastest accession cooks nearly 20 min faster than average. SNPs associated with cooking time on Pv02, 03, and 06. Dry beans (Phaseolus vulgaris L.) are a nutrient dense food and a dietary staple in parts of Africa and Latin America. One of the major factors that limits greater utilization of beans is their long cooking times compared to other foods. Cooking time is an important trait with implications for gender equity, nutritional value of diets, and energy utilization. Very little is known about the genetic diversity and genomic regions involved in determining cooking time. The objective of this research was to assess cooking time on a panel of 206 P. vulgaris accessions, use genome-wide association analysis (GWAS) to identify genomic regions influencing this trait, and to test the ability to predict cooking time by raw seed characteristics. In this study 5.5-fold variation for cooking time was found and five bean accessions were identified which cook in less than 27 min across 2 years, where the average cooking time was 37 min. One accession, ADP0367, cooked nearly 20 min faster than average. Four of these five accessions showed close phylogenetic relationship based on a NJ tree developed with ~5000 SNP markers, suggesting a potentially similar underlying genetic mechanism. GWAS revealed regions on chromosomes Pv02, Pv03, and Pv06 associated with cooking time. Vis/NIR scanning of raw seed explained 68% of the phenotypic variation for cooking time, suggesting that, with additional experimentation, it may be possible to use this spectroscopy method to non-destructively identify fast cooking lines as part of a breeding program.

  19. The Nature and Evolution of Genomic Diversity in the Mycobacterium tuberculosis Complex.

    PubMed

    Brites, Daniela; Gagneux, Sebastien

    2017-01-01

    The Mycobacterium tuberculosis Complex (MTBC) consists of a clonal group of several mycobacterial lineages pathogenic to a range of different mammalian hosts. In this chapter, we discuss the origins and the evolutionary forces shaping the genomic diversity of the human-adapted MTBC. Advances in whole-genome sequencing have brought invaluable insights into the macro-evolution of the MTBC, the biogeographical distribution of the different MTBC lineages, and the phylogenetic relationships between these lineages. Moreover, micro-evolutionary processes start to be better understood, including those influencing bacterial mutation rates and those governing the fate of new mutations emerging within patients during treatment. Current genomic and epidemiological evidence reflects the fact that, through ecological specialization, the MTBC affecting humans became an obligate and extremely well-adapted human pathogen. Identifying the adaptive traits of human-adapted MTBC and unraveling the bacterial loci that interact with human genomic variation might help identify new targets for developing better vaccines and designing more effective treatments.

  20. Sequencing of Australian wild rice genomes reveals ancestral relationships with domesticated rice.

    PubMed

    Brozynska, Marta; Copetti, Dario; Furtado, Agnelo; Wing, Rod A; Crayn, Darren; Fox, Glen; Ishikawa, Ryuji; Henry, Robert J

    2017-06-01

    The related A genome species of the Oryza genus are the effective gene pool for rice. Here, we report draft genomes for two Australian wild A genome taxa: an O. rufipogon-like population, referred to as Taxon A, and an O. meridionalis-like population, referred to as Taxon B. These two taxa were sequenced and assembled by integration of short- and long-read next-generation sequencing (NGS) data to create a genomic platform for a wider rice gene pool. Here, we report that, despite the distinct chloroplast genome, the nuclear genome of the Australian Taxon A has a sequence that is much closer to that of domesticated rice (O. sativa) than to the other Australian wild populations. Analysis of 4643 genes in the A genome clade showed that the Australian annual, O. meridionalis, and related perennial taxa have the most divergent (around 3 million years) genome sequences relative to domesticated rice. A test for admixture showed possible introgression into the Australian Taxon A (diverged around 1.6 million years ago) especially from the wild indica/O. nivara clade in Asia. These results demonstrate that northern Australia may be the centre of diversity of the A genome Oryza and suggest the possibility that this might also be the centre of origin of this group and represent an important resource for rice improvement. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  1. Genome sequence diversity and clues to the evolution of variola (smallpox) virus.

    PubMed

    Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M

    2006-08-11

    Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.

  2. Flexibility and symmetry of prokaryotic genome rearrangement reveal lineage-associated core-gene-defined genome organizational frameworks.

    PubMed

    Kang, Yu; Gu, Chaohao; Yuan, Lina; Wang, Yue; Zhu, Yanmin; Li, Xinna; Luo, Qibin; Xiao, Jingfa; Jiang, Daquan; Qian, Minping; Ahmed Khan, Aftab; Chen, Fei; Zhang, Zhang; Yu, Jun

    2014-11-25

    The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, albeit assumed to be stable under selection in general, is frequently interrupted by horizontal gene transfer and rearrangement, but how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Based on data from 30 species, including 425 genomes from six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates. A subset of the core genes, often species specific and lineage associated, formed a core-gene-defined genome organizational framework (cGOF). Such cGOFs are either single segmental (one-third of the species analyzed) or multisegmental (the rest). Multisegment cGOFs were further classified into symmetric or asymmetric according to segment orientations toward the origin-terminus axis. The cGOFs in Gram-positive species are exclusively symmetric and often reversible in orientation, as opposed to those of the Gram-negative bacteria, which are all asymmetric and irreversible. Meanwhile, all species showing strong strand-biased gene distribution contain symmetric cGOFs and often specific DnaE (α subunit of DNA polymerase III) isoforms. Furthermore, functional evaluations revealed that cGOF genes are hub associated with regard to cellular activities, and the stability of cGOF provides efficient indexes for scaffold orientation as demonstrated by assembling virtual and empirical genome drafts. cGOFs show species specificity, and the symmetry of multisegmental cGOFs is conserved among taxa and constrained by DNA polymerase-centric strand-biased gene distribution. The definition of species-specific cGOFs provides powerful guidance for genome assembly and other structure-based analysis. Prokaryotic genomes are frequently interrupted by horizontal gene transfer (HGT) and rearrangement. To know whether there is a set of genes not only conserved in position

  3. Comparative Genomics of the Ectomycorrhizal Sister Species Rhizopogon vinicolor and Rhizopogon vesiculosus (Basidiomycota: Boletales) Reveals a Divergence of the Mating Type B Locus

    PubMed Central

    Mujic, Alija Bajro; Kuo, Alan; Tritt, Andrew; Lipzen, Anna; Chen, Cindy; Johnson, Jenifer; Sharma, Aditi; Barry, Kerrie; Grigoriev, Igor V.; Spatafora, Joseph W.

    2017-01-01

    Divergence of breeding system plays an important role in fungal speciation. Ectomycorrhizal fungi, however, pose a challenge for the study of reproductive biology because most cannot be mated under laboratory conditions. To overcome this barrier, we sequenced the draft genomes of the ectomycorrhizal sister species Rhizopogon vinicolor Smith and Zeller and R. vesiculosus Smith and Zeller (Basidiomycota, Boletales)—the first genomes available for Basidiomycota truffles—and characterized gene content and organization surrounding their mating type loci. Both species possess a pair of homeodomain transcription factor homologs at the mating type A-locus as well as pheromone receptor and pheromone precursor homologs at the mating type B-locus. Comparison of Rhizopogon genomes with genomes from Boletales, Agaricales, and Polyporales revealed synteny of the A-locus region within Boletales, but several genomic rearrangements across orders. Our findings suggest correlation between gene content at the B-locus region and breeding system in Boletales with tetrapolar species possessing more diverse gene content than bipolar species. Rhizopogon vinicolor possesses a greater number of B-locus pheromone receptor and precursor genes than R. vesiculosus, as well as a pair of isoprenyl cysteine methyltransferase genes flanking the B-locus compared to a single copy in R. vesiculosus. Examination of dikaryotic single nucleotide polymorphisms within genomes revealed greater heterozygosity in R. vinicolor, consistent with increased rates of outcrossing. Both species possess the components of a heterothallic breeding system with R. vinicolor possessing a B-locus region structure consistent with tetrapolar Boletales and R. vesiculosus possessing a B-locus region structure intermediate between bipolar and tetrapolar Boletales. PMID:28450370

  4. High-throughput analysis of the satellitome revealed enormous diversity of satellite DNAs in the neo-Y chromosome of the cricket Eneoptera surinamensis.

    PubMed

    Palacios-Gimenez, Octavio Manuel; Dias, Guilherme Borges; de Lima, Leonardo Gomes; Kuhn, Gustavo Campos E Silva; Ramos, Érica; Martins, Cesar; Cabral-de-Mello, Diogo Cavalcanti

    2017-07-25

    Satellite DNAs (satDNAs) constitute a large portion of eukaryote genomes, comprising tandemly repeated non-protein-coding sequences. They are mostly found in heterochromatic regions of chromosomes, such as around centromeres or near telomeres, in intercalary heterochromatin, and often in non-recombining segments of sex chromosomes. We examined the satellitome in the cricket Eneoptera surinamensis (2n = 9, neo-X1X2Y, males) to characterize the molecular evolution of its neo-sex chromosomes. To achieve this, we analyzed Illumina reads using graph-based clustering and complementary analyses. We found an unusually high number of satDNA families (45), ranging from 4 bp to 517 bp, accounting for about 14% of the genome and showing different modular structures and high diversity of arrays. FISH mapping revealed that satDNAs are located mostly in C-positive pericentromeric regions of the chromosomes. SatDNA enrichment was also observed in the neo-sex chromosomes in comparison to autosomes. A particularly striking accumulation of satDNA loci was found in the highly differentiated neo-Y, including 39 satDNAs over-represented in this chromosome, which is the greatest satDNA diversity yet reported for sex chromosomes. Our results suggest possible involvement of satDNAs in genome size increase and in molecular differentiation of the neo-sex chromosomes in this species, contributing to the understanding of sex chromosome composition and evolution in Orthoptera.

  5. Unique transposon landscapes are pervasive across Drosophila melanogaster genomes

    PubMed Central

    Rahman, Reazur; Chirn, Gung-wei; Kanodia, Abhay; Sytnikova, Yuliya A.; Brembs, Björn; Bergman, Casey M.; Lau, Nelson C.

    2015-01-01

    To understand how transposon landscapes (TLs) vary across animal genomes, we describe a new method called the Transposon Insertion and Depletion AnaLyzer (TIDAL) and a database of >300 TLs in Drosophila melanogaster (TIDAL-Fly). Our analysis reveals pervasive TL diversity across cell lines and fly strains, even for identically named sub-strains from different laboratories such as the ISO1 strain used for the reference genome sequence. On average, >500 novel insertions exist in every lab strain, inbred strains of the Drosophila Genetic Reference Panel (DGRP), and fly isolates in the Drosophila Genome Nexus (DGN). A minority (<25%) of transposon families comprise the majority (>70%) of TL diversity across fly strains. A sharp contrast between insertion and depletion patterns indicates that many transposons are unique to the ISO1 reference genome sequence. Although TL diversity from fly strains reaches asymptotic limits with increasing sequencing depth, rampant TL diversity causes unsaturated detection of TLs in pools of flies. Finally, we show novel transposon insertions negatively correlate with Piwi-interacting RNA (piRNA) levels for most transposon families, except for the highly-abundant roo retrotransposon. Our study provides a useful resource for Drosophila geneticists to understand how transposons create extensive genomic diversity in fly cell lines and strains. PMID:26578579

  6. Comparative genome analysis reveals a conserved family of actin-like proteins in apicomplexan parasites

    PubMed Central

    Gordon, Jennifer L; Sibley, L David

    2005-01-01

    Background The phylum Apicomplexa is an early-branching eukaryotic lineage that contains a number of important human and animal pathogens. Their complex life cycles and unique cytoskeletal features distinguish them from other model eukaryotes. Apicomplexans rely on actin-based motility for cell invasion, yet the regulation of this system remains largely unknown. Consequently, we focused our efforts on identifying actin-related proteins in the recently completed genomes of Toxoplasma gondii, Plasmodium spp., Cryptosporidium spp., and Theileria spp. Results Comparative genomic and phylogenetic studies of apicomplexan genomes reveal that most contain only a single conventional actin and yet they each have 8–10 additional actin-related proteins. Among these are a highly conserved Arp1 protein (likely part of a conserved dynactin complex), and Arp4 and Arp6 homologues (subunits of the chromatin-remodeling machinery). In contrast, apicomplexans lack canonical Arp2 or Arp3 proteins, suggesting they lost the Arp2/3 actin polymerization complex on their evolutionary path towards intracellular parasitism. Seven of these actin-like proteins (ALPs) are novel to apicomplexans. They show no phylogenetic associations to the known Arp groups and likely serve functions specific to this important group of intracellular parasites. Conclusion The large diversity of actin-like proteins in apicomplexans suggests that the actin protein family has diverged to fulfill various roles in the unique biology of intracellular parasites. Conserved Arps likely participate in vesicular transport and gene expression, while apicomplexan-specific ALPs may control unique biological traits such as actin-based gliding motility. PMID:16343347

  7. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat.

    PubMed

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-07-07

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches. Copyright © 2016 Teng et al.

  8. Single-cell genomics reveals complex carbohydrate degradation patterns in poribacterial symbionts of marine sponges

    PubMed Central

    Kamke, Janine; Sczyrba, Alexander; Ivanova, Natalia; Schwientek, Patrick; Rinke, Christian; Mavromatis, Kostas; Woyke, Tanja; Hentschel, Ute

    2013-01-01

    Many marine sponges are hosts to dense and phylogenetically diverse microbial communities that are located in the extracellular matrix of the animal. The candidate phylum Poribacteria is a predominant member of the sponge microbiome and its representatives are nearly exclusively found in sponges. Here we used single-cell genomics to obtain comprehensive insights into the metabolic potential of individual poribacterial cells representing three distinct phylogenetic groups within Poribacteria. Genome sizes were up to 5.4 Mbp and genome coverage was as high as 98.5%. Common features of the poribacterial genomes indicated that heterotrophy is likely to be of importance for this bacterial candidate phylum. Carbohydrate-active enzyme database screening and further detailed analysis of carbohydrate metabolism suggested the ability to degrade diverse carbohydrate sources likely originating from seawater and from the host itself. The presence of uronic acid degradation pathways as well as several specific sulfatases provides strong support that Poribacteria degrade glycosaminoglycan chains of proteoglycans, which are important components of the sponge host matrix. Dominant glycoside hydrolase families further suggest degradation of other glycoproteins in the host matrix. We therefore propose that Poribacteria are well adapted to an existence in the sponge extracellular matrix. Poribacteria may be viewed as efficient scavengers and recyclers of a particular suite of carbon compounds that are unique to sponges as microbial ecosystems. PMID:23842652

  9. Evolutionary genomics and population structure of Entamoeba histolytica

    PubMed Central

    Das, Koushik; Ganguly, Sandipan

    2014-01-01

    Amoebiasis caused by the gastrointestinal parasite Entamoeba histolytica has diverse disease outcomes. Study of genome and evolution of this fascinating parasite will help us to understand the basis of its virulence and explain why, when and how it causes diseases. In this review, we have summarized current knowledge regarding evolutionary genomics of E. histolytica and discussed their association with parasite phenotypes and its differential pathogenic behavior. How genetic diversity reveals parasite population structure has also been discussed. Queries concerning their evolution and population structure which were required to be addressed have also been highlighted. This significantly large amount of genomic data will improve our knowledge about this pathogenic species of Entamoeba. PMID:25505504

  10. GENOMIC DIVERSITY AND THE MICROENVIRONMENT AS DRIVERS OF PROGRESSION IN DCIS

    DTIC Science & Technology

    2017-10-01

    stains, including quantitative analysis, 7) Identification of upstaged DCIS cases for the radiology aim, 8) Development of image analysis methods for...goals of the project? Aim 1. Determine whether genetic diversity of DCIS is greater in DCIS with adjacent invasive disease compared to DCIS without... compared to DCIS without IDC. Since genomics is not the sole driver of tumor behavior, we will phenotypically characterize DCIS and its

  11. Genotyping-by-sequencing (GBS) revealed molecular genetic diversity of Iranian wheat landraces and cultivars

    USDA-ARS's Scientific Manuscript database

    Genetic diversity is an essential resource for breeders to improve new cultivars with desirable characteristics. Recently genotyping-by-sequencing (GBS), a next generation sequencing (NGS) based technology that can simplify complex genomes, has been used as a high-throughput and cost-effective molec...

  12. Comparative analysis of fungal genomes reveals different plant cell wall degrading capacity in fungi

    PubMed Central

    2013-01-01

    Background Fungi produce a variety of carbohydrate-active enzymes (CAZymes) for the degradation of plant polysaccharide materials to facilitate infection and/or gain nutrition. Identifying and comparing CAZymes from fungi with different nutritional modes or infection mechanisms may provide information for better understanding of their life styles and infection models. To date, hundreds of fungal genomes are publicly available. However, a systematic comparative analysis of fungal CAZymes across the entire fungal kingdom has not been reported. Results In this study, we systematically identified glycoside hydrolases (GHs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), and glycosyltransferases (GTs) as well as carbohydrate-binding modules (CBMs) in the predicted proteomes of 103 representative fungi from Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota. Comparative analysis of these CAZymes that play major roles in plant polysaccharide degradation revealed that fungi exhibit tremendous diversity in the number and variety of CAZymes. Among them, some families of GHs and CEs are the most prevalent CAZymes that are distributed in all of the fungi analyzed. Importantly, cellulases of some GH families are present in fungi that are not known to have cellulose-degrading ability. In addition, our results also showed that in general, plant pathogenic fungi have the highest number of CAZymes. Biotrophic fungi tend to have fewer CAZymes than necrotrophic and hemibiotrophic fungi. Pathogens of dicots often contain more pectinases than fungi infecting monocots. Interestingly, besides yeasts, many saprophytic fungi that are highly active in degrading plant biomass contain fewer CAZymes than plant pathogenic fungi. Furthermore, analysis of the gene expression profile of the wheat scab fungus Fusarium graminearum revealed that most of the CAZyme genes related to cell wall degradation were up-regulated during plant infection. Phylogenetic analysis also

  13. When COI barcodes deceive: complete genomes reveal introgression in hairstreaks

    PubMed Central

    Shen, Jinhui; Borek, Dominika; Robbins, Robert K.; Opler, Paul A.; Otwinowski, Zbyszek; Grishin, Nick V.

    2017-01-01

    Two species of hairstreak butterflies from the genus Calycopis are known in the United States: C. cecrops and C. isobeon. Analysis of mitochondrial COI barcodes of Calycopis revealed cecrops-like specimens from the eastern US with atypical barcodes that were 2.6% different from either USA species, but similar to Central American Calycopis species. To address the possibility that the specimens with atypical barcodes represent an undescribed cryptic species, we sequenced complete genomes of 27 Calycopis specimens of four species: C. cecrops, C. isobeon, C. quintana and C. bactra. Some of these specimens were collected up to 60 years ago and preserved dry in museum collections, but nonetheless produced genomes as complete as fresh samples. Phylogenetic trees reconstructed using the whole mitochondrial and nuclear genomes were incongruent. While USA Calycopis with atypical barcodes grouped with Central American species C. quintana by mitochondria, nuclear genome trees placed them within typical USA C. cecrops in agreement with morphology, suggesting mitochondrial introgression. Nuclear genomes also show introgression, especially between C. cecrops and C. isobeon. About 2.3% of each C. cecrops genome has probably (p-value < 0.01, FDR < 0.1) introgressed from C. isobeon and about 3.4% of each C. isobeon genome may have come from C. cecrops. The introgressed regions are enriched in genes encoding transmembrane proteins, mitochondria-targeting proteins and components of the larval cuticle. This study provides the first example of mitochondrial introgression in Lepidoptera supported by complete genome sequencing. Our results caution about relying solely on COI barcodes and mitochondrial DNA for species identification or discovery. PMID:28179510

  14. Population diversity of Diaphorina citri (Hemiptera: Liviidae) in China based on whole mitochondrial genome sequences.

    PubMed

    Wu, Fengnian; Jiang, Hongyan; Beattie, G Andrew C; Holford, Paul; Chen, Jianchi; Wallis, Christopher M; Zheng, Zheng; Deng, Xiaoling; Cen, Yijing

    2018-04-24

    Diaphorina citri (Asian citrus psyllid; ACP) transmits 'Candidatus Liberibacter asiaticus' associated with citrus Huanglongbing (HLB). ACP has been reported in 11 provinces/regions in China, yet its population diversity remains unclear. In this study, we evaluated ACP population diversity in China using representative whole mitochondrial genome (mitogenome) sequences. Additional mitogenome sequences outside China were also acquired and evaluated. The sizes of the 27 ACP mitogenome sequences ranged from 14 986 to 15 030 bp. Along with three previously published mitogenome sequences, the 30 sequences formed three major mitochondrial groups (MGs): MG1, present in southwestern China and occurring at elevations above 1000 m; MG2, present in southeastern China and Southeast Asia (Cambodia, Indonesia, Malaysia, and Vietnam) and occurring at elevations below 180 m; and MG3, present in the USA and Pakistan. Single nucleotide polymorphisms in five genes (cox2, atp8, nad3, nad1 and rrnL) contributed most to ACP diversity. Among these genes, rrnL had the most variation. Mitogenome sequence analyses revealed two major phylogenetic groups of ACP present in China as well as a possible unique group present currently in Pakistan and the USA. The information could have significant implications for current ACP control and HLB management. © 2018 Society of Chemical Industry.

  15. Genomic variation in Plasmodium vivax malaria reveals regions under selective pressure

    PubMed Central

    Diez Benavente, Ernest; Ward, Zoe; Chan, Wilson; Mohareb, Fady R.; Sutherland, Colin J.; Roper, Cally; Campino, Susana

    2017-01-01

    Background Although Plasmodium vivax contributes to almost half of all malaria cases outside Africa, it has been relatively neglected compared to the more deadly P. falciparum. It is known that P. vivax populations possess high genetic diversity, differing geographically potentially due to different vector species, host genetics and environmental factors. Results We analysed the high-quality genomic data for 46 P. vivax isolates spanning 10 countries across 4 continents. Using population genetic methods we identified hotspots of selection pressure, including the previously reported MRP1 and DHPS genes, both putative drug resistance loci. Extra copies and deletions in the promoter region of another drug resistance candidate, MDR1 gene, and duplications in the Duffy binding protein gene (PvDBP) potentially involved in erythrocyte invasion, were also identified. For surveillance applications, continental-informative markers were found in putative drug resistance loci, and we show that organellar polymorphisms could classify P. vivax populations across continents and differentiate between Plasmodia spp. Conclusions This study has shown that genomic diversity that lies within and between P. vivax populations can be used to elucidate potential drug resistance and invasion mechanisms, as well as facilitate the molecular barcoding of the parasite for surveillance applications. PMID:28493919

  16. Genomic diversity and population structure of three autochthonous Greek sheep breeds assessed with genome-wide DNA arrays.

    PubMed

    Michailidou, S; Tsangaris, G; Fthenakis, G C; Tzora, A; Skoufos, I; Karkabounas, S C; Banos, G; Argiriou, A; Arsenos, G

    2018-06-01

    In the present study, genome-wide genotyping was applied to characterize the genetic diversity and population structure of three autochthonous Greek breeds: Boutsko, Karagouniko and Chios. Dairy sheep are among the most significant livestock species in Greece numbering approximately 9 million animals which are characterized by large phenotypic variation and reared under various farming systems. A total of 96 animals were genotyped with Illumina's OvineSNP50K microarray beadchip, to study the population structure of the breeds and develop a specialized panel of single-nucleotide polymorphisms (SNPs), which could distinguish one breed from the others. Quality control on the dataset resulted in 46,125 SNPs, which were used to evaluate the genetic structure of the breeds. Population structure was assessed through principal component analysis (PCA) and admixture analysis, whereas inbreeding was estimated based on runs of homozygosity (ROHs) coefficients, genomic relationship matrix inbreeding coefficients (F_GRM) and patterns of linkage disequilibrium (LD). Associations between SNPs and breeds were analyzed with different inheritance models, to identify SNPs that distinguish among the breeds. Results showed high levels of genetic heterogeneity in the three breeds. Genetic distances among breeds were modest, despite their different ancestries. Chios and Karagouniko breeds were more genetically related to each other compared to Boutsko. Analysis revealed 3802 candidate SNPs that can be used to identify two-breed crosses and purebred animals. The present study provides, for the first time, data on the genetic background of three Greek indigenous dairy sheep breeds as well as a specialized marker panel that can be applied for traceability purposes as well as targeted genetic improvement schemes and conservation programs.

  17. Genome-wide high-throughput SNP discovery and genotyping for understanding natural (functional) allelic diversity and domestication patterns in wild chickpea

    PubMed Central

    Bajaj, Deepak; Das, Shouvik; Badoni, Saurabh; Kumar, Vinod; Singh, Mohar; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    We identified 82489 high-quality genome-wide SNPs from 93 wild and cultivated Cicer accessions through integrated reference genome- and de novo-based GBS assays. High intra- and inter-specific polymorphic potential (66–85%) and broader natural allelic diversity (6–64%) detected by genome-wide SNPs among accessions signify their efficacy for monitoring introgression and transferring target trait-regulating genomic (gene) regions/allelic variants from wild to cultivated Cicer gene pools for genetic improvement. The population-specific assignment of wild Cicer accessions pertaining to the primary gene pool are more influenced by geographical origin/phenotypic characteristics than species/gene-pools of origination. The functional significance of allelic variants (non-synonymous and regulatory SNPs) scanned from transcription factors and stress-responsive genes in differentiating wild accessions (with potential known sources of yield-contributing and stress tolerance traits) from cultivated desi and kabuli accessions, fine-mapping/map-based cloning of QTLs and determination of LD patterns across wild and cultivated gene-pools are suitably elucidated. The correlation between phenotypic (agromorphological traits) and molecular diversity-based admixed domestication patterns within six structured populations of wild and cultivated accessions via genome-wide SNPs was apparent. This suggests utility of whole genome SNPs as a potential resource for identifying naturally selected trait-regulating genomic targets/functional allelic variants adaptive to diverse agroclimatic regions for genetic enhancement of cultivated gene-pools. PMID:26208313

  18. Cnidarian Cell Type Diversity and Regulation Revealed by Whole-Organism Single-Cell RNA-Seq.

    PubMed

    Sebé-Pedrós, Arnau; Saudemont, Baptiste; Chomsky, Elad; Plessier, Flora; Mailhé, Marie-Pierre; Renno, Justine; Loe-Mie, Yann; Lifshitz, Aviezer; Mukamel, Zohar; Schmutz, Sandrine; Novault, Sophie; Steinmetz, Patrick R H; Spitz, François; Tanay, Amos; Marlow, Heather

    2018-05-31

    The emergence and diversification of cell types is a leading factor in animal evolution. So far, systematic characterization of the gene regulatory programs associated with cell type specificity was limited to few cell types and few species. Here, we perform whole-organism single-cell transcriptomics to map adult and larval cell types in the cnidarian Nematostella vectensis, a non-bilaterian animal with complex tissue-level body-plan organization. We uncover eight broad cell classes in Nematostella, including neurons, cnidocytes, and digestive cells. Each class comprises different subtypes defined by the expression of multiple specific markers. In particular, we characterize a surprisingly diverse repertoire of neurons, which comparative analysis suggests are the result of lineage-specific diversification. By integrating transcription factor expression, chromatin profiling, and sequence motif analysis, we identify the regulatory codes that underlie Nematostella cell-specific expression. Our study reveals cnidarian cell type complexity and provides insights into the evolution of animal cell-specific genomic regulation. Copyright © 2018 Elsevier Inc. All rights reserved.

  19. DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP.

    PubMed

    Mitra, Sneha; Biswas, Anushua; Narlikar, Leelavati

    2018-04-01

    Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding-activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites, but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins, respectively. Characterizing all five different motifs and the corresponding sets is important to interpret the ChIP experiment and ultimately, the role of X in regulation. We present DIVERSITY, which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. DIVERSITY uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. This is in contrast to standard motif finders, which report motifs individually enriched in the data, but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by DIVERSITY give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data.
Webserver at http://diversity.ncl.res.in/; standalone (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY

  20. Comparative genomics of plant-associated Pseudomonas spp.: Insights into diversity and inheritance of traits involved in multitrophic interactions

    USDA-ARS's Scientific Manuscript database

    We provide here a comparative genome analysis of the Pseudomonas fluorescens group, including seven new genomic sequences for plant-associated strains. These strains exhibit a diverse spectrum of traits involved in biological control and other multitrophic interactions with plants, microbes, and ins...

  1. Genome resolved analysis of a premature infant gut microbial community reveals a Varibaculum cambriense genome and a shift towards fermentation-based metabolism during the third week of life.

    PubMed

    Brown, Christopher T; Sharon, Itai; Thomas, Brian C; Castelle, Cindy J; Morowitz, Michael J; Banfield, Jillian F

    2013-12-17

    The premature infant gut has low individual but high inter-individual microbial diversity compared with adults. Based on prior 16S rRNA gene surveys, many species from this environment are expected to be similar to those previously detected in the human microbiota. However, the level of genomic novelty and metabolic variation of strains found in the infant gut remains relatively unexplored. To study the stability and function of early microbial colonizers of the premature infant gut, nine stool samples were taken during the third week of life of a premature male infant delivered via Caesarean section. Metagenomic sequences were assembled and binned into near-complete and partial genomes, enabling strain-level genomic analysis of the microbial community. We reconstructed eleven near-complete and six partial bacterial genomes representative of the key members of the microbial community. Twelve of these genomes share >90% putative ortholog amino acid identity with reference genomes. Manual curation of the assembly of one particularly novel genome resulted in the first essentially complete genome sequence (in three pieces, the order of which could not be determined due to a repeat) for Varibaculum cambriense (strain Dora), a medically relevant species that has been implicated in abscess formation. During the period studied, the microbial community undergoes a compositional shift, in which obligate anaerobes (fermenters) overtake Escherichia coli as the most abundant species. Other species remain stable, probably due to their ability to either respire anaerobically or grow by fermentation, and their capacity to tolerate fluctuating levels of oxygen. Metabolic predictions for V. cambriense suggest that, like other members of the microbial community, this organism is able to process various sugar substrates and make use of multiple different electron acceptors during anaerobic respiration. Genome comparisons within the family Actinomycetaceae reveal important differences

  2. Genome resolved analysis of a premature infant gut microbial community reveals a Varibaculum cambriense genome and a shift towards fermentation-based metabolism during the third week of life

    PubMed Central

    2013-01-01

    Background The premature infant gut has low individual but high inter-individual microbial diversity compared with adults. Based on prior 16S rRNA gene surveys, many species from this environment are expected to be similar to those previously detected in the human microbiota. However, the level of genomic novelty and metabolic variation of strains found in the infant gut remains relatively unexplored. Results To study the stability and function of early microbial colonizers of the premature infant gut, nine stool samples were taken during the third week of life of a premature male infant delivered via Caesarean section. Metagenomic sequences were assembled and binned into near-complete and partial genomes, enabling strain-level genomic analysis of the microbial community. We reconstructed eleven near-complete and six partial bacterial genomes representative of the key members of the microbial community. Twelve of these genomes share >90% putative ortholog amino acid identity with reference genomes. Manual curation of the assembly of one particularly novel genome resulted in the first essentially complete genome sequence (in three pieces, the order of which could not be determined due to a repeat) for Varibaculum cambriense (strain Dora), a medically relevant species that has been implicated in abscess formation. During the period studied, the microbial community undergoes a compositional shift, in which obligate anaerobes (fermenters) overtake Escherichia coli as the most abundant species. Other species remain stable, probably due to their ability to either respire anaerobically or grow by fermentation, and their capacity to tolerate fluctuating levels of oxygen. Metabolic predictions for V. cambriense suggest that, like other members of the microbial community, this organism is able to process various sugar substrates and make use of multiple different electron acceptors during anaerobic respiration. Genome comparisons within the family Actinomycetaceae reveal

  3. The genome diversity and karyotype evolution of mammals

    PubMed Central

    2011-01-01

    The past decade has witnessed an explosion of genome sequencing and mapping in evolutionarily diverse species. While full genome sequencing of mammals is rapidly progressing, it is still not possible to assemble and align orthologous whole chromosome regions from more than a few species. The intense focus on building comparative maps for companion (dog and cat), laboratory (mouse and rat) and agricultural (cattle, pig, and horse) animals has traditionally been used as a means to understand the underlying basis of disease-related or economically important phenotypes. However, these maps also provide an unprecedented opportunity to use multispecies analysis as a tool for inferring karyotype evolution. Comparative chromosome painting and related techniques are now considered to be the most powerful approaches in comparative genome studies. Homologies can be identified with high accuracy using molecularly defined DNA probes for fluorescence in situ hybridization (FISH) on chromosomes of different species. Chromosome painting data are now available for members of nearly all mammalian orders. In most orders, there are species with rates of chromosome evolution that can be considered as 'default' rates. The number of rearrangements that have become fixed in evolutionary history seems comparatively low, bearing in mind the 180 million years of the mammalian radiation. Comparative chromosome maps record the history of karyotype changes that have occurred during evolution. The aim of this review is to provide an overview of these recent advances in our endeavor to decipher the karyotype evolution of mammals by integrating the published results together with some of our latest unpublished results. PMID:21992653

  4. Neolithic and medieval virus genomes reveal complex evolution of hepatitis B

    PubMed Central

    Key, Felix M; Kühnert, Denise; Bosse, Esther; Immel, Alexander; Rinne, Christoph; Kornell, Sabin-Christin; Yepes, Diego; Franzenburg, Sören; Heyne, Henrike O; Meier, Thomas; Lösch, Sandra; Meller, Harald; Friederich, Susanne; Nicklisch, Nicole; Alt, Kurt W; Schreiber, Stefan; Tholey, Andreas; Herbig, Alexander; Nebel, Almut

    2018-01-01

    The hepatitis B virus (HBV) is one of the most widespread human pathogens known today, yet its origin and evolutionary history are still unclear and controversial. Here, we report the analysis of three ancient HBV genomes recovered from human skeletons found at three different archaeological sites in Germany. We reconstructed two Neolithic and one medieval HBV genome by de novo assembly from shotgun DNA sequencing data. Additionally, we observed HBV-specific peptides using paleo-proteomics. Our results demonstrate that HBV has circulated in the European population for at least 7000 years. The Neolithic HBV genomes show a high genomic similarity to each other. In a phylogenetic network, they do not group with any human-associated HBV genome and are most closely related to those infecting African non-human primates. The ancient viruses appear to represent distinct lineages that have no close relatives today and possibly went extinct. Our results reveal the great potential of ancient DNA from human skeletons for studying the long-term evolution of blood-borne viruses. PMID:29745896

  5. Neolithic and Medieval virus genomes reveal complex evolution of Hepatitis B.

    PubMed

    Krause-Kyora, Ben; Susat, Julian; Key, Felix M; Kühnert, Denise; Bosse, Esther; Immel, Alexander; Rinne, Christoph; Kornell, Sabin-Christin; Yepes, Diego; Franzenburg, Sören; Heyne, Henrike O; Meier, Thomas; Lösch, Sandra; Meller, Harald; Friederich, Susanne; Nicklisch, Nicole; Alt, Kurt W; Schreiber, Stefan; Tholey, Andreas; Herbig, Alexander; Nebel, Almut; Krause, Johannes

    2018-05-10

    The hepatitis B virus (HBV) is one of the most widespread human pathogens known today, yet its origin and evolutionary history are still unclear and controversial. Here, we report the analysis of three ancient HBV genomes recovered from human skeletons found at three different archaeological sites in Germany. We reconstructed two Neolithic and one medieval HBV genomes by de novo assembly from shotgun DNA sequencing data. Additionally, we observed HBV-specific peptides using paleo-proteomics. Our results show that HBV has circulated in the European population for at least 7000 years. The Neolithic HBV genomes show a high genomic similarity to each other. In a phylogenetic network, they do not group with any human-associated HBV genome and are most closely related to those infecting African non-human primates. These ancient virus forms appear to represent distinct lineages that have no close relatives today and possibly went extinct. Our results reveal the great potential of ancient DNA from human skeletons for studying the long-term evolution of blood-borne viruses. © 2018, Krause-Kyora et al.

  6. The Variable Regions of Lactobacillus rhamnosus Genomes Reveal the Dynamic Evolution of Metabolic and Host-Adaptation Repertoires

    PubMed Central

    Ceapa, Corina; Davids, Mark; Ritari, Jarmo; Lambert, Jolanda; Wels, Michiel; Douillard, François P.; Smokvina, Tamara; de Vos, Willem M.; Knol, Jan; Kleerebezem, Michiel

    2016-01-01

    Lactobacillus rhamnosus is a diverse Gram-positive species with strains isolated from different ecological niches. Here, we report the genome sequence analysis of 40 diverse strains of L. rhamnosus and their genomic comparison, with a focus on the variable genome. Genomic comparison of 40 L. rhamnosus strains discriminated the conserved genes (core genome) and regions of plasticity involving frequent rearrangements and horizontal transfer (variome). The L. rhamnosus core genome encompasses 2,164 genes, out of 4,711 genes in total (the pan-genome). The accessory genome is dominated by genes encoding carbohydrate transport and metabolism, extracellular polysaccharides (EPS) biosynthesis, bacteriocin production, pili production, the cas system, and the associated clustered regularly interspaced short palindromic repeat (CRISPR) loci, and more than 100 transporter functions and mobile genetic elements like phages, plasmid genes, and transposons. A clade distribution based on amino acid differences between core (shared) proteins matched with the clade distribution obtained from the presence–absence of variable genes. The phylogenetic and variome tree overlap indicated that frequent events of gene acquisition and loss dominated the evolutionary segregation of the strains within this species, which is paralleled by evolutionary diversification of core gene functions. The CRISPR-Cas system could have contributed to this evolutionary segregation. Lactobacillus rhamnosus strains contain the genetic and metabolic machinery with strain-specific gene functions required to adapt to a large range of environments. A remarkable congruency of the evolutionary relatedness of the strains’ core and variome functions, possibly favoring interspecies genetic exchanges, underlines the importance of gene-acquisition and loss within the L. rhamnosus strain diversification. PMID:27358423
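
The core/pan-genome split described in this abstract can be illustrated with a small presence-absence computation. The following is a minimal sketch, not the authors' pipeline: the function name `pan_and_core`, the strain labels, and the gene names are hypothetical and chosen only for illustration. The last lines reuse the two gene counts reported in the abstract (2,164 core genes out of 4,711 in the pan-genome).

```python
from functools import reduce


def pan_and_core(strain_genes):
    """Return (pan_genome, core_genome) as sets of gene-family IDs.

    strain_genes maps a strain name to the set of gene families it carries.
    The pan-genome is the union over strains; the core genome is the
    intersection (genes present in every strain).
    """
    gene_sets = list(strain_genes.values())
    pan = set().union(*gene_sets)
    core = reduce(set.intersection, gene_sets) if gene_sets else set()
    return pan, core


# Hypothetical toy data (not from the study).
toy_strains = {
    "strain_A": {"gyrB", "recA", "pilA", "cas9"},
    "strain_B": {"gyrB", "recA", "epsC"},
    "strain_C": {"gyrB", "recA", "pilA"},
}

pan, core = pan_and_core(toy_strains)
accessory = pan - core  # variable genes, the "variome" in the abstract's terms

# Core share of the pan-genome using the counts reported in the abstract.
core_fraction = 2164 / 4711
print(f"core genes are {core_fraction:.1%} of the pan-genome")
```

With the reported counts, slightly under half of the pan-genome is shared by all 40 strains; the remainder is the variable fraction the abstract focuses on.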

  7. Living with Genome Instability: the Adaptation of Phytoplasmas to Diverse Environments of Their Insect and Plant Hosts

    PubMed Central

    Bai, Xiaodong; Zhang, Jianhua; Ewing, Adam; Miller, Sally A.; Jancso Radek, Agnes; Shevchenko, Dmitriy V.; Tsukerman, Kiryl; Walunas, Theresa; Lapidus, Alla; Campbell, John W.; Hogenhout, Saskia A.

    2006-01-01

    Phytoplasmas (“Candidatus Phytoplasma,” class Mollicutes) cause disease in hundreds of economically important plants and are obligately transmitted by sap-feeding insects of the order Hemiptera, mainly leafhoppers and psyllids. The 706,569-bp chromosome and four plasmids of aster yellows phytoplasma strain witches' broom (AY-WB) were sequenced and compared to the onion yellows phytoplasma strain M (OY-M) genome. The phytoplasmas have small repeat-rich genomes. This comparative analysis revealed that the repeated DNAs are organized into large clusters of potential mobile units (PMUs), which contain tra5 insertion sequences (ISs) and genes for specialized sigma factors and membrane proteins. So far, these PMUs appear to be unique to phytoplasmas. Compared to mycoplasmas, phytoplasmas lack several recombination and DNA modification functions, and therefore, phytoplasmas may use different mechanisms of recombination, likely involving PMUs, for the creation of variability, allowing phytoplasmas to adjust to the diverse environments of plants and insects. The irregular GC skews and the presence of ISs and large repeated sequences in the AY-WB and OY-M genomes are indicative of high genomic plasticity. Nevertheless, segments of ∼250 kb located between the lplA and glnQ genes are syntenic between the two phytoplasmas and contain the majority of the metabolic genes and no ISs. AY-WB appears to be further along in the reductive evolution process than OY-M. The AY-WB genome is ∼154 kb smaller than the OY-M genome, primarily as a result of fewer multicopy sequences, including PMUs. Furthermore, AY-WB lacks genes that are truncated and are part of incomplete pathways in OY-M. PMID:16672622

  8. Genome-wide analysis reveals signatures of selection for important traits in domestic sheep from different ecoregions.

    PubMed

    Liu, Zhaohua; Ji, Zhibin; Wang, Guizhi; Chao, Tianle; Hou, Lei; Wang, Jianmin

    2016-11-03

    Throughout a long period of adaptation and selection, sheep have thrived in a diverse range of ecological environments. Mongolian sheep is the common ancestor of the Chinese short fat-tailed sheep. Migration to different ecoregions leads to changes in selection pressures and results in microevolution. Mongolian sheep and its subspecies differ in a number of important traits, especially reproductive traits. Genome-wide intraspecific variation is required to dissect the genetic basis of these traits. This research resequenced 3 short fat-tailed sheep breeds with a 43.2-fold coverage of the sheep genome. We report more than 17 million single nucleotide polymorphisms and 2.9 million indels and identify 143 genomic regions with reduced pooled heterozygosity or increased genetic distance to each other breed that represent likely targets for selection during the migration. These regions harbor genes related to developmental processes, cellular processes, multicellular organismal processes, biological regulation, metabolic processes, reproduction, localization, growth and various components of the stress responses. Furthermore, we examined the haplotype diversity of 3 genomic regions involved in reproduction and found significant differences in TSHR and PRL gene regions among 8 sheep breeds. Our results provide useful genomic information for identifying genes or causal mutations associated with important economic traits in sheep and for understanding the genetic basis of adaptation to different ecological environments.
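
The abstract does not give the formula behind "reduced pooled heterozygosity". A commonly used definition for pooled resequencing data sums major- and minor-allele read counts over all SNPs in a window and computes Hp = 2 * sum(nMAJ) * sum(nMIN) / (sum(nMAJ) + sum(nMIN))^2. The sketch below assumes that definition; whether the study used exactly this form is an assumption, and the function name and read counts are hypothetical.

```python
def pooled_heterozygosity(allele_counts):
    """Pooled heterozygosity (Hp) for one genomic window.

    allele_counts is a sequence of (count_a, count_b) read counts, one pair
    per SNP in the window. For each SNP the larger count is treated as the
    major allele. Values near 0 indicate a window close to fixation.
    """
    counts = list(allele_counts)
    sum_major = sum(max(a, b) for a, b in counts)
    sum_minor = sum(min(a, b) for a, b in counts)
    total = sum_major + sum_minor
    if total == 0:
        raise ValueError("window contains no reads")
    return 2 * sum_major * sum_minor / total ** 2


# Hypothetical windows: one balanced, one nearly fixed.
balanced_window = [(10, 10), (12, 12)]
swept_window = [(40, 0), (39, 1)]

print(pooled_heterozygosity(balanced_window))
print(pooled_heterozygosity(swept_window))
```

Windows whose Hp falls well below the genome-wide distribution are the kind of region the abstract reports as likely targets of selection.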

  9. Patterns of genomic and phenomic diversity in wine and table grapes

    PubMed Central

    Migicovsky, Zoë; Sawler, Jason; Gardner, Kyle M; Aradhya, Mallikarjuna K; Prins, Bernard H; Schwaninger, Heidi R; Bustamante, Carlos D; Buckler, Edward S; Zhong, Gan-Yuan; Brown, Patrick J; Myles, Sean

    2017-01-01

    Grapes are one of the most economically and culturally important crops worldwide, and they have been bred for both winemaking and fresh consumption. Here we evaluate patterns of diversity across 33 phenotypes collected over a 17-year period from 580 table and wine grape accessions that belong to one of the world’s largest grape gene banks, the grape germplasm collection of the United States Department of Agriculture. We find that phenological events throughout the growing season are correlated, and quantify the marked difference in size between table and wine grapes. By pairing publicly available historical phenotype data with genome-wide polymorphism data, we identify large effect loci controlling traits that have been targeted during domestication and breeding, including hermaphroditism, lighter skin pigmentation and muscat aroma. Breeding for larger berries in table grapes was traditionally concentrated in geographic regions where Islam predominates and alcohol was prohibited, whereas wine grapes retained the ancestral smaller size that is more desirable for winemaking in predominantly Christian regions. We uncover a novel locus with a suggestive association with berry size that harbors a signature of positive selection for larger berries. Our results suggest that religious rules concerning alcohol consumption have had a marked impact on patterns of phenomic and genomic diversity in grapes. PMID:28791127

  10. Patterns of genomic and phenomic diversity in wine and table grapes.

    PubMed

    Migicovsky, Zoë; Sawler, Jason; Gardner, Kyle M; Aradhya, Mallikarjuna K; Prins, Bernard H; Schwaninger, Heidi R; Bustamante, Carlos D; Buckler, Edward S; Zhong, Gan-Yuan; Brown, Patrick J; Myles, Sean

    2017-01-01

    Grapes are one of the most economically and culturally important crops worldwide, and they have been bred for both winemaking and fresh consumption. Here we evaluate patterns of diversity across 33 phenotypes collected over a 17-year period from 580 table and wine grape accessions that belong to one of the world's largest grape gene banks, the grape germplasm collection of the United States Department of Agriculture. We find that phenological events throughout the growing season are correlated, and quantify the marked difference in size between table and wine grapes. By pairing publicly available historical phenotype data with genome-wide polymorphism data, we identify large effect loci controlling traits that have been targeted during domestication and breeding, including hermaphroditism, lighter skin pigmentation and muscat aroma. Breeding for larger berries in table grapes was traditionally concentrated in geographic regions where Islam predominates and alcohol was prohibited, whereas wine grapes retained the ancestral smaller size that is more desirable for winemaking in predominantly Christian regions. We uncover a novel locus with a suggestive association with berry size that harbors a signature of positive selection for larger berries. Our results suggest that religious rules concerning alcohol consumption have had a marked impact on patterns of phenomic and genomic diversity in grapes.

  11. In silico and experimental methods revealed highly diverse bacteria with quorum sensing and aromatics biodegradation systems--a potential broad application on bioremediation.

    PubMed

    Huang, Yili; Zeng, Yanhua; Yu, Zhiliang; Zhang, Jing; Feng, Hao; Lin, Xiuchun

    2013-11-01

    Phylogenetic overlaps between aromatics-degrading bacteria and acyl-homoserine-lactone (AHL) or autoinducer (AI) based quorum-sensing (QS) bacteria were evident in the literature; however, the diversity of bacteria with both activities had never been finely described. In silico searches of the NCBI genome database revealed that more than 11% of the investigated population harbored both an aromatic ring-hydroxylating dioxygenase (RHD) gene and an AHL/AI-synthetase gene. These bacteria were distributed in 10 orders, 15 families, 42 genera and 78 species. Horizontal transfers of both genes were common among them. Using enrichment and culture-dependent methods, 6 Sphingomonadales and 4 Rhizobiales with phenanthrene- or pyrene-degrading ability and AHL production were isolated from marine, wetland and soil samples. Thin-layer chromatography and gas chromatography-mass spectrometry revealed that these Sphingomonads produced various AHL molecules. This is the first report of highly diverse bacteria that harbor both aromatics-degrading and QS systems. QS regulation may have broad impacts on aromatics biodegradation, and would be a new angle for developing bioremediation technology. Copyright © 2013 Elsevier Ltd. All rights reserved.

  12. Characterization of full genome sequences of chicken anemia viruses circulating in Egypt reveals distinct genetic diversity and evidence of recombination.

    PubMed

    Erfan, Ahmed M; Selim, Abdullah A; Naguib, Mahmoud M

    2018-06-02

    Chicken anemia virus (CAV) causes one of the commercially important diseases of poultry worldwide. In Egypt, CAV has been reported to be a potential threat to the commercial poultry sectors. Hence, this study was aimed at isolation and full genomic analysis of CAVs circulating in chicken populations in different geographical locations in Egypt. A total of 42 samples were collected from broiler chicken flocks aged 12 to 42 days in 9 governorates in Egypt. The mortality rate observed among chickens ranged from 3% to 22%. Nineteen out of 42 farms were found positive for the CAV genome by polymerase chain reaction (PCR). Full genome sequencing was conducted for 18 positive samples. Genetic analysis revealed a high similarity of >99% between 11 viruses and the vaccine strain Del-Ros, while the other seven samples shared close similarity with CAV field strains isolated from China, Taiwan, and Brazil. The data also indicated Q139 and Q144 amino acid substitutions in the VP1 of Egyptian field strains, which are known to be important in virus replication and spread. Phylogenetic analysis of the sequenced viruses (n = 18) based on either the full gene nucleotide sequence or VP1 coding sequence suggested the circulation of four distinct genotypes in Egypt, designated as groups A, B, C and D. Moreover, evidence of recombination was detected among four Egyptian CAVs located within group A. The findings of this study elucidate the epidemiological and genetic features of CAVs circulating in Egypt and underscore the importance of CAV surveillance. Copyright © 2018 Elsevier B.V. All rights reserved.

  13. Shigella Phages Isolated during a Dysentery Outbreak Reveal Uncommon Structures and Broad Species Diversity.

    PubMed

    Doore, Sarah M; Schrad, Jason R; Dean, William F; Dover, John A; Parent, Kristin N

    2018-04-15

    In 2016, Michigan experienced the largest outbreak of shigellosis, a type of bacillary dysentery caused by Shigella spp., since 1988. Following this outbreak, we isolated 16 novel Shigella-infecting bacteriophages (viruses that infect bacteria) from environmental water sources. Most well-known bacteriophages infect the common laboratory species Escherichia coli and Salmonella enterica, and these phages have built the foundation of molecular and bacteriophage biology. Until now, comparatively few bacteriophages were known to infect Shigella spp., which are close relatives of E. coli. We present a comprehensive analysis of these phages' host ranges, genomes, and structures, revealing genome sizes and capsid properties that are shared by very few previously described phages. After sequencing, a majority of the Shigella phages were found to have genomes of an uncommon size, shared by only 2% of all reported phage genomes. To investigate the structural implications of this unusual genome size, we used cryo-electron microscopy to resolve their capsid structures. We determined that these bacteriophage capsids have similarly uncommon geometry. Only two other viruses with this capsid structure have been described. Since most well-known bacteriophages infect Escherichia or Salmonella, our understanding of bacteriophages has been limited to a subset of well-described systems. Continuing to isolate phages using nontraditional strains of bacteria can fill gaps that currently exist in bacteriophage biology. In addition, the prevalence of Shigella phages during a shigellosis outbreak may suggest a potential impact of human health epidemics on local microbial communities. IMPORTANCE Shigella spp. bacteria are causative agents of dysentery and affect more than 164 million people worldwide every year. Despite the need to combat antibiotic-resistant Shigella strains, relatively few Shigella-infecting bacteriophages have been described. By specifically looking for Shigella

  14. A-to-I RNA Editing Contributes to Proteomic Diversity in Cancer.

    Cancer.gov

    Adenosine (A) to inosine (I) RNA editing introduces many nucleotide changes in cancer transcriptomes. However, due to the complexity of post-transcriptional regulation, the contribution of RNA editing to proteomic diversity in human cancers remains unclear. Here, we performed an integrated analysis of TCGA genomic data and CPTAC proteomic data. Despite limited site diversity, we demonstrate that A-to-I RNA editing contributes to proteomic diversity in breast cancer through changes in amino acid sequences. We validate the presence of editing events at both RNA and protein levels.

  15. Environmental survey meta-analysis reveals hidden diversity among unicellular opisthokonts.

    PubMed

    del Campo, Javier; Ruiz-Trillo, Iñaki

    2013-04-01

    The Opisthokonta clade includes Metazoa, Fungi, and several unicellular lineages, such as choanoflagellates, filastereans, ichthyosporeans, and nucleariids. To date, studies of the evolutionary diversity of opisthokonts have focused exclusively on metazoans, fungi, and, very recently, choanoflagellates. Thus, very little is known about diversity among the filastereans, ichthyosporeans, and nucleariids. To better understand the evolutionary diversity and ecology of the opisthokonts, here we analyze published environmental data from nonfungal unicellular opisthokonts and report 18S ribosomal DNA phylogenetic analyses. Our data reveal extensive diversity among all unicellular opisthokonts, except for the filastereans. We identify several clades that consist exclusively of environmental sequences, especially among ichthyosporeans and choanoflagellates. Moreover, we show that the ichthyosporeans represent a significant percentage of overall unicellular opisthokont diversity, with a greater ecological role in marine environments than previously believed. Our results provide a useful phylogenetic framework for future ecological and evolutionary studies of these poorly known lineages.

  16. Extremely Low Genomic Diversity of Rickettsia japonica Distributed in Japan.

    PubMed

    Akter, Arzuba; Ooka, Tadasuke; Gotoh, Yasuhiro; Yamamoto, Seigo; Fujita, Hiromi; Terasoma, Fumio; Kida, Kouji; Taira, Masakatsu; Nakadouzono, Fumiko; Gokuden, Mutsuyo; Hirano, Manabu; Miyashiro, Mamoru; Inari, Kouichi; Shimazu, Yukie; Tabara, Kenji; Toyoda, Atsushi; Yoshimura, Dai; Itoh, Takehiko; Kitano, Tomokazu; Sato, Mitsuhiko P; Katsura, Keisuke; Mondal, Shakhinur Islam; Ogura, Yoshitoshi; Ando, Shuji; Hayashi, Tetsuya

    2017-01-01

    Rickettsiae are obligate intracellular bacteria that have small genomes as a result of reductive evolution. Many Rickettsia species of the spotted fever group (SFG) cause tick-borne diseases known as "spotted fevers". The life cycle of SFG rickettsiae is closely associated with that of the tick, which is generally thought to act as a bacterial vector and reservoir that maintains the bacterium through transstadial and transovarial transmission. Each SFG member is thought to have adapted to a specific tick species, thus restricting the bacterial distribution to a relatively limited geographic region. These unique features of SFG rickettsiae allow investigation of how the genomes of such biologically and ecologically specialized bacteria evolve after genome reduction and the types of population structures that are generated. Here, we performed a nationwide, high-resolution phylogenetic analysis of Rickettsia japonica, an etiological agent of Japanese spotted fever that is distributed in Japan and Korea. The comparison of complete or nearly complete sequences obtained from 31 R. japonica strains isolated from various sources in Japan over the past 30 years demonstrated an extremely low level of genomic diversity. In particular, only 34 single nucleotide polymorphisms were identified among the 27 strains of the major lineage containing all clinical isolates and tick isolates from the three tick species. Our data provide novel insights into the biology and genome evolution of R. japonica, including the possibilities of recent clonal expansion and a long generation time in nature due to the long dormant phase associated with tick life cycles. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. Whole genome sequences in pulse crops: a global community resource to expedite translational genomics and knowledge-based crop improvement.

    PubMed

    Bohra, Abhishek; Singh, Narendra P

    2015-08-01

    Unprecedented developments in legume genomics over the last decade have resulted in the acquisition of a wide range of modern genomic resources to underpin genetic improvement of grain legumes. The genome-enabled insights direct investigators in various ways that primarily include unearthing novel structural variations, retrieving lost genetic diversity, introducing novel/exotic alleles from wider gene pools, finely resolving complex quantitative traits and so forth. To this end, ready availability of cost-efficient and high-density genotyping assays allows genome-wide prediction to be increasingly recognized as the key selection criterion in crop breeding. Further, the high-dimensional measurements of agronomically significant phenotypes obtained by using new-generation screening techniques will empower reference-based resequencing as well as allele mining and trait mapping methods to comprehensively associate genome diversity with phenome-scale variation. Besides stimulating forward genetic systems, accessibility to precisely delineated genomic segments reveals novel candidates for reverse genetic techniques like targeted genome editing. The shifting paradigm in plant genomics in turn necessitates optimization of crop breeding strategies to enable the most efficient integration of advanced omics knowledge and tools. We anticipate that crop improvement schemes will be bolstered remarkably with rational deployment of these genome-guided approaches, ultimately resulting in expanded plant breeding capacities and improved crop performance.

  18. Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent.

    PubMed

    Gallego Llorente, M; Jones, E R; Eriksson, A; Siska, V; Arthur, K W; Arthur, J W; Curtis, M C; Stock, J T; Coltorti, M; Pieruccini, P; Stretton, S; Brock, F; Higham, T; Park, Y; Hofreiter, M; Bradley, D G; Bhak, J; Pinhasi, R; Manica, A

    2015-11-13

    Characterizing genetic diversity in Africa is a crucial step for most analyses reconstructing the evolutionary history of anatomically modern humans. However, historic migrations from Eurasia into Africa have affected many contemporary populations, confounding inferences. Here, we present a 12.5× coverage ancient genome of an Ethiopian male ("Mota") who lived approximately 4500 years ago. We use this genome to demonstrate that the Eurasian backflow into Africa came from a population closely related to Early Neolithic farmers, who had colonized Europe 4000 years earlier. The extent of this backflow was much greater than previously reported, reaching all the way to Central, West, and Southern Africa, affecting even populations such as Yoruba and Mbuti, previously thought to be relatively unadmixed, who harbor 6 to 7% Eurasian ancestry. Copyright © 2015, American Association for the Advancement of Science.

  19. Comparative genome analysis of Pseudogymnoascus spp. reveals primarily clonal evolution with small genome fragments exchanged between lineages.

    PubMed

    Leushkin, Evgeny V; Logacheva, Maria D; Penin, Aleksey A; Sutormin, Roman A; Gerasimov, Evgeny S; Kochkina, Galina A; Ivanushkina, Natalia E; Vasilenko, Oleg V; Kondrashov, Alexey S; Ozerskaya, Svetlana M

    2015-05-21

    Pseudogymnoascus spp. form a broad group of fungal lineages in the family Pseudeurotiaceae, including P. destructans, an aggressive pathogen of bats. Although several lineages of P. spp. were shown to produce ascospores in culture, the vast majority of P. spp. show no evidence of sexual reproduction. P. spp. can tolerate a wide range of temperatures and salinities and can survive even in the permafrost layer. Adaptability of P. spp. to different environments is accompanied by extremely variable morphology and physiology. We sequenced genotypes of 14 strains of P. spp., 5 of which were extracted from permafrost, 1 from a cryopeg (a layer of unfrozen ground in permafrost), and 8 from temperate surface environments. All sequenced genotypes are haploid. Nucleotide diversity among these genomes is very high, with a typical evolutionary distance at synonymous sites dS ≈ 0.5, suggesting that the last common ancestor of these strains lived >50 Mya. The strains extracted from permafrost do not form a separate clade. Instead, each permafrost strain has close relatives from temperate environments. We observed a strictly clonal population structure with no conflicting topologies for ~99% of genome sequences. However, there are a number of short (~100-10,000 nt) genomic segments, with a total length of 67.6 kb, which possess phylogenetic patterns strikingly different from the rest of the genome. The most remarkable case is the MAT-locus, which has 2 distinct alleles interspersed along the whole-genome phylogenetic tree. The predominantly clonal structure of genome sequences is consistent with the observations that sexual reproduction is rare in P. spp. The small number of regions with noncanonical phylogenies seems to arise from recombination events between derived lineages of P. spp., with the MAT-locus being transferred on multiple occasions. All sequenced strains have a heterothallic configuration of the MAT-locus.
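
The step from dS ≈ 0.5 to an ancestor older than 50 Mya follows the usual molecular-clock relation T = dS / (2r), where r is the synonymous substitution rate per site per year and the factor 2 accounts for both diverging lineages. The abstract does not state the rate that was used, so the sketch below only works out the rate implied by the two reported numbers; the function name and the rate value are illustrative assumptions.

```python
def divergence_time_years(d_s, rate_per_site_per_year):
    """Time since the last common ancestor under a strict molecular clock.

    d_s is the pairwise synonymous distance (substitutions per synonymous
    site); the rate applies to a single lineage, hence the factor of 2.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return d_s / (2.0 * rate_per_site_per_year)


# Rate implied by the abstract's numbers: dS = 0.5 and T = 50 million years
# give r = 0.5 / (2 * 50e6) = 5e-9 substitutions per site per year.
implied_rate = 0.5 / (2.0 * 50e6)

# Any slower rate (an assumption, not a measured value) yields T > 50 Mya.
print(divergence_time_years(0.5, implied_rate) / 1e6, "million years")
```

Under this reading, the ">50 Mya" estimate corresponds to a per-lineage synonymous rate of at most about 5 × 10^-9 per site per year.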

  20. Genome-Wide Analysis Reveals Novel Regulators of Growth in Drosophila melanogaster

    PubMed Central

    Vonesch, Sibylle Chantal; Lamparter, David; Mackay, Trudy F. C.; Bergmann, Sven; Hafen, Ernst

    2016-01-01

    Organismal size depends on the interplay between genetic and environmental factors. Genome-wide association (GWA) analyses in humans have implicated many genes in the control of height but suffer from the inability to control the environment. Genetic analyses in Drosophila have identified conserved signaling pathways controlling size; however, how these pathways control phenotypic diversity is unclear. We performed GWA of size traits using the Drosophila Genetic Reference Panel of inbred, sequenced lines. We find that the top associated variants differ between traits and sexes; do not map to canonical growth pathway genes, but can be linked to these by epistasis analysis; and are enriched for genes and putative enhancers. Performing GWA on well-studied developmental traits under controlled conditions expands our understanding of developmental processes underlying phenotypic diversity. PMID:26751788

  1. Transposon diversity in Arabidopsis thaliana

    PubMed Central

    Le, Quang Hien; Wright, Stephen; Yu, Zhihui; Bureau, Thomas

    2000-01-01

    Recent availability of extensive genome sequence information offers new opportunities to analyze genome organization, including transposon diversity and accumulation, at a level of resolution that was previously unattainable. In this report, we used sequence similarity search and analysis protocols to perform a fine-scale analysis of a large sample (≈17.2 Mb) of the Arabidopsis thaliana (Columbia) genome for transposons. Consistent with previous studies, we report that the A. thaliana genome harbors diverse representatives of most known superfamilies of transposons. However, our survey reveals a higher density of transposons of which over one-fourth could be classified into a single novel transposon family designated as Basho, which appears unrelated to any previously known superfamily. We have also identified putative transposase-coding ORFs for miniature inverted-repeat transposable elements (MITEs), providing clues into the mechanism of mobility and origins of the most abundant transposons associated with plant genes. In addition, we provide evidence that most mined transposons have a clear distribution preference for A + T-rich sequences and show that structural variation for many mined transposons is partly due to interelement recombination. Taken together, these findings further underscore the complexity of transposons within the compact genome of A. thaliana. PMID:10861007

  2. Pyrosequencing of the northern red oak (Quercus rubra L.) chloroplast genome reveals high quality polymorphisms for population management

    Treesearch

    Lisa W. Alexander; Keith E. Woeste

    2014-01-01

    Given the low intraspecific chloroplast diversity detected in northern red oak (Quercus rubra L.), more powerful genetic tools are necessary to accurately characterize Q. rubra chloroplast diversity and structure. We report the sequencing, assembly, and annotation of the chloroplast genome of northern red oak via pyrosequencing and...

  3. Discovery and Characterization of Chromatin States for Systematic Annotation of the Human Genome

    NASA Astrophysics Data System (ADS)

    Ernst, Jason; Kellis, Manolis

    A plethora of epigenetic modifications have been described in the human genome and shown to play diverse roles in gene regulation, cellular differentiation and the onset of disease. Although individual modifications have been linked to the activity levels of various genetic functional elements, their combinatorial patterns are still unresolved and their potential for systematic de novo genome annotation remains untapped. Here, we use a multivariate Hidden Markov Model to reveal chromatin states in human T cells, based on recurrent and spatially coherent combinations of chromatin marks. We define 51 distinct chromatin states, including promoter-associated, transcription-associated, active intergenic, large-scale repressed and repeat-associated states. Each chromatin state shows specific enrichments in functional annotations, sequence motifs and specific experimentally observed characteristics, suggesting distinct biological roles. This approach provides a complementary functional annotation of the human genome that reveals the genome-wide locations of diverse classes of epigenetic function.

  4. Next-Generation Sequencing of Two Mitochondrial Genomes from Family Pompilidae (Hymenoptera: Vespoidea) Reveal Novel Patterns of Gene Arrangement

    PubMed Central

    Chen, Peng-Yan; Zheng, Bo-Ying; Liu, Jing-Xian; Wei, Shu-Jun

    2016-01-01

    Animal mitochondrial genomes have provided large and diverse datasets for evolutionary studies. Here, the first two representative mitochondrial genomes from the family Pompilidae (Hymenoptera: Vespoidea) were determined using next-generation sequencing. The sequenced region of these two mitochondrial genomes from the species Auplopus sp. and Agenioideus sp. was 16,746 bp long with an A + T content of 83.12% and 16,596 bp long with an A + T content of 78.64%, respectively. In both species, all of the 37 typical mitochondrial genes were determined. The secondary structures of tRNA genes and rRNA genes were predicted and compared with those of other insects. Atypical trnS1 using abnormal anticodons TCT and lacking D-stem pairings was identified. There were 49 helices belonging to six domains in rrnL and 30 helices belonging to three domains in rrnS present. Compared with the ancestral organization, four and two tRNA genes were rearranged in mitochondrial genomes of Auplopus and Agenioideus, respectively. In both species, trnM was shuffled upstream of the trnI-trnQ-trnM cluster, and trnA was translocated from the cluster trnA-trnR-trnN-trnS1-trnE-trnF to the region between nad1 and trnL1, which is novel to the Vespoidea. In Auplopus, the tRNA cluster trnW-trnC-trnY was shuffled to trnW-trnY-trnC. Phylogenetic analysis within Vespoidea revealed that Pompilidae and Mutillidae formed a sister lineage, and then sistered Formicidae. The genomes presented in this study have enriched the knowledge base of molecular markers, which is valuable in respect to studies about the gene rearrangement mechanism, genomic evolutionary processes and phylogeny of Hymenoptera. PMID:27727175

  5. Comparative genome and methylome analysis reveals restriction/modification system diversity in the gut commensal Bifidobacterium breve

    PubMed Central

    Bottacini, Francesca; Morrissey, Ruth; Roberts, Richard John; James, Kieran; van Breen, Justin; Egan, Muireann; Lambert, Jolanda; van Limpt, Kees; Knol, Jan; Motherway, Mary O’Connell; van Sinderen, Douwe

    2018-01-01

    Abstract Bifidobacterium breve represents one of the most abundant bifidobacterial species in the gastro-intestinal tract of breast-fed infants, where their presence is believed to exert beneficial effects. In the present study whole genome sequencing, employing the PacBio Single Molecule, Real-Time (SMRT) sequencing platform, combined with comparative genome analysis allowed the most extensive genetic investigation of this taxon. Our findings demonstrate that genes encoding Restriction/Modification (R/M) systems constitute a substantial part of the B. breve variable gene content (or variome). Using the methylome data generated by SMRT sequencing, combined with targeted Illumina bisulfite sequencing (BS-seq) and comparative genome analysis, we were able to detect methylation recognition motifs and assign these to identified B. breve R/M systems, where in several cases such assignments were confirmed by restriction analysis. Furthermore, we show that R/M systems typically impose a very significant barrier to genetic accessibility of B. breve strains, and that cloning of a methyltransferase-encoding gene may overcome such a barrier, thus allowing future functional investigations of members of this species. PMID:29294107

  6. Sensitive Next-Generation Sequencing Method Reveals Deep Genetic Diversity of HIV-1 in the Democratic Republic of the Congo.

    PubMed

    Rodgers, Mary A; Wilkinson, Eduan; Vallari, Ana; McArthur, Carole; Sthreshley, Larry; Brennan, Catherine A; Cloherty, Gavin; de Oliveira, Tulio

    2017-03-15

    As the epidemiological epicenter of the human immunodeficiency virus (HIV) pandemic, the Democratic Republic of the Congo (DRC) is a reservoir of circulating HIV strains exhibiting high levels of diversity and recombination. In this study, we characterized HIV specimens collected in two rural areas of the DRC between 2001 and 2003 to identify rare strains of HIV. The env gp41 region was sequenced and characterized for 172 HIV-positive specimens. The env sequences were predominantly subtype A (43.02%), but 7 other subtypes (33.14%), 20 circulating recombinant forms (CRFs; 11.63%), and 20 unclassified (11.63%) sequences were also found. Of the rare and unclassified subtypes, 18 specimens were selected for next-generation sequencing (NGS) by a modified HIV-switching mechanism at the 5' end of the RNA template (SMART) method to obtain full-genome sequences. NGS produced 14 new complete genomes, which included pure subtype C (n = 2), D (n = 1), F1 (n = 1), H (n = 3), and J (n = 1) genomes. The two subtype C genomes and one of the subtype H genomes branched basal to their respective subtype branches but had no evidence of recombination. The remaining 6 genomes were complex recombinants of 2 or more subtypes, including subtypes A1, F, G, H, J, and K and unclassified fragments, including one subtype CRF25 isolate, which branched basal to all CRF25 references. Notably, all recombinant subtype H fragments branched basal to the H clade. Spatial-geographical analysis indicated that the diverse sequences identified here did not expand globally. The full-genome and subgenomic sequences identified in our study population significantly increase the documented diversity of the strains involved in the continually evolving HIV-1 pandemic. IMPORTANCE Very little is known about the ancestral HIV-1 strains that founded the global pandemic, and very few complete genome sequences are available from patients in the Congo Basin, where HIV-1 expanded early in the global pandemic. By

  7. The Human Genome Diversity (HGD) Project. Summary document

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1993-12-31

    In 1991 a group of human geneticists and molecular biologists proposed to the scientific community that a world wide survey be undertaken of variation in the human genome. To aid their considerations, the committee therefore decided to hold a small series of international workshops to explore the major scientific issues involved. The intention was to define a framework for the project which could provide a basis for much wider and more detailed discussion and planning--it was recognized that the successful implementation of the proposed project, which has come to be known as the Human Genome Diversity (HGD) Project, would not only involve scientists but also various national and international non-scientific groups all of which should contribute to the project's development. The international HGD workshop held in Sardinia in September 1993 was the last in the initial series of planning workshops. As such it not only explored new ground but also pulled together into a more coherent form much of the formal and informal discussion that had taken place in the preceding two years. This report presents the deliberations of the Sardinia workshop within a consideration of the overall development of the HGD Project to date.

  8. Revealing genome-scale transcriptional regulatory landscape of OmpR highlights its expanded regulatory roles under osmotic stress in Escherichia coli K-12 MG1655.

    PubMed

    Seo, Sang Woo; Gao, Ye; Kim, Donghyuk; Szubin, Richard; Yang, Jina; Cho, Byung-Kwan; Palsson, Bernhard O

    2017-05-19

    A transcription factor (TF), OmpR, plays a critical role in transcriptional regulation of the osmotic stress response in bacteria. Here, we reveal a genome-scale OmpR regulon in Escherichia coli K-12 MG1655. Integrative data analysis reveals that a total of 37 genes in 24 transcription units (TUs) belong to OmpR regulon. Among them, 26 genes show more than two-fold changes in expression level in an OmpR knock-out strain. Specifically, we find that: 1) OmpR regulates mostly membrane-located gene products involved in diverse fundamental biological processes, such as narU (encoding nitrate/nitrite transporter), ompX (encoding outer membrane protein X), and nuoN (encoding NADH:ubiquinone oxidoreductase); 2) by investigating co-regulation of entire sets of genes regulated by other stress-response TFs, stresses are surprisingly independently regulated among each other; and, 3) a detailed investigation of the physiological roles of the newly discovered OmpR regulon genes reveals that activation of narU represents a novel strategy to significantly improve osmotic stress tolerance of E. coli. Thus, the genome-scale approach to elucidating regulons comprehensively identifies regulated genes and leads to fundamental discoveries related to stress responses.

  9. Genomic Diversity of Hepatitis B Virus Infection Associated With Fulminant Hepatitis B Development.

    PubMed

    Mina, Thomas; Amini Bavil Olyaee, Samad; Tacke, Frank; Maes, Piet; Van Ranst, Marc; Pourkarim, Mahmoud Reza

    2015-06-01

    After five decades of Hepatitis B Virus (HBV) vaccine discovery, HBV is still a major public health problem. Due to the high genetic diversity of HBV and selective pressure of the host immune system, intra-host evolution of this virus in different clinical manifestations is a hot topic of research. HBV infection causes a range of clinical manifestations from acute to chronic infection, cirrhosis and hepatocellular carcinoma. Among all forms of HBV infection manifestations, fulminant hepatitis B infection possesses the highest fatality rate. Almost 1% of the acutely infected patients develop fulminant hepatitis B, in which the mortality rate is around 70%. All published papers deposited in Genbank, on the topic of fulminant hepatitis were reviewed and their virological aspects were investigated. In this review, we highlight the genomic diversity of HBV reported from patients with fulminant HBV infection. The most commonly detected diversities affect regulatory motifs of HBV in the core and S region, indicating that these alterations may convert the virus to an aggressive strain. Moreover, mutations at T-cell and B-cell epitopes located in pre-S1 and pre-S2 proteins may lead to an immune evasion of the virus, likely favoring a more severe clinical course of infection. Furthermore, point and frame shift mutations in the core region increase the viral replication of HBV and help virus to evade from immune system and guarantee its persistence. Fulminant hepatitis B is associated with distinct mutational patterns of HBV, underlining that genomic diversity of the virus is an important factor determining its pathogenicity.

  10. A fast indirect method to compute functions of genomic relationships concerning genotyped and ungenotyped individuals, for diversity management.

    PubMed

    Colleau, Jean-Jacques; Palhière, Isabelle; Rodríguez-Ramilo, Silvia T; Legarra, Andres

    2017-12-01

    Pedigree-based management of genetic diversity in populations, e.g., using optimal contributions, involves computation of the [Formula: see text] type yielding elements (relationships) or functions (usually averages) of relationship matrices. For pedigree-based relationships [Formula: see text], a very efficient method exists. When all the individuals of interest are genotyped, genomic management can be addressed using the genomic relationship matrix [Formula: see text]; however, to date, the computational problem of efficiently computing [Formula: see text] has not been well studied. When some individuals of interest are not genotyped, genomic management should consider the relationship matrix [Formula: see text] that combines genotyped and ungenotyped individuals; however, direct computation of [Formula: see text] is computationally very demanding, because construction of a possibly huge matrix is required. Our work presents efficient ways of computing [Formula: see text] and [Formula: see text], with applications on real data from dairy sheep and dairy goat breeding schemes. For genomic relationships, an efficient indirect computation with quadratic instead of cubic cost is [Formula: see text], where Z is a matrix relating animals to genotypes. For the relationship matrix [Formula: see text], we propose an indirect method based on the difference between vectors [Formula: see text], which involves computation of [Formula: see text] and of products such as [Formula: see text] and [Formula: see text], where [Formula: see text] is a working vector derived from [Formula: see text]. The latter computation is the most demanding but can be done using sparse Cholesky decompositions of matrix [Formula: see text], which allows handling very large genomic and pedigree data files. Studies based on simulations reported in the literature show that the trends of average relationships in [Formula: see text] and [Formula: see text] differ as genomic selection proceeds. When

  11. Genome-wide distribution of genetic diversity and linkage disequilibrium in a mass-selected population of maritime pine

    PubMed Central

    2014-01-01

    Background The accessibility of high-throughput genotyping technologies has contributed greatly to the development of genomic resources in non-model organisms. High-density genotyping arrays have only recently been developed for some economically important species such as conifers. The potential for using genomic technologies in association mapping and breeding depends largely on the genome wide patterns of diversity and linkage disequilibrium in current breeding populations. This study aims to deepen our knowledge regarding these issues in maritime pine, the first species used for reforestation in south western Europe. Results Using a new map merging algorithm, we first established a 1,712 cM composite linkage map (comprising 1,838 SNP markers in 12 linkage groups) by bringing together three already available genetic maps. Using rigorous statistical testing based on kernel density estimation and resampling we identified cold and hot spots of recombination. In parallel, 186 unrelated trees of a mass-selected population were genotyped using a 12k-SNP array. A total of 2,600 informative SNPs allowed to describe historical recombination, genetic diversity and genetic structure of this recently domesticated breeding pool that forms the basis of much of the current and future breeding of this species. We observe very low levels of population genetic structure and find no evidence that artificial selection has caused a reduction in genetic diversity. By combining these two pieces of information, we provided the map position of 1,671 SNPs corresponding to 1,192 different loci. This made it possible to analyze the spatial pattern of genetic diversity (H e ) and long distance linkage disequilibrium (LD) along the chromosomes. We found no particular pattern in the empirical variogram of H e across the 12 linkage groups and, as expected for an outcrossing species with large effective population size, we observed an almost complete lack of long distance LD. Conclusions These

  12. Population Stratification and Underrepresentation of Indian Subcontinent Genetic Diversity in the 1000 Genomes Project Dataset

    PubMed Central

    Sengupta, Dhriti; Choudhury, Ananyo; Basu, Analabha; Ramsay, Michèle

    2016-01-01

    Genomic variation in Indian populations is of great interest due to the diversity of ancestral components, social stratification, endogamy and complex admixture patterns. With an expanding population of 1.2 billion, India is also a treasure trove to catalogue innocuous as well as clinically relevant rare mutations. Recent studies have revealed four dominant ancestries in populations from mainland India: Ancestral North-Indian (ANI), Ancestral South-Indian (ASI), Ancestral Tibeto–Burman (ATB) and Ancestral Austro-Asiatic (AAA). The 1000 Genomes Project (KGP) Phase-3 data include about 500 genomes from five linguistically defined Indian-Subcontinent (IS) populations (Punjabi, Gujrati, Bengali, Telugu and Tamil) some of whom are recent migrants to USA or UK. Comparative analyses show that despite the distinct geographic origins of the KGP-IS populations, the ANI component is predominantly represented in this dataset. Previous studies demonstrated population substructure in the HapMap Gujrati population, and we found evidence for additional substructure in the Punjabi and Telugu populations. These substructured populations have characteristic/significant differences in heterozygosity and inbreeding coefficients. Moreover, we demonstrate that the substructure is better explained by factors like differences in proportion of ancestral components, and endogamy driven social structure rather than invoking a novel ancestral component to explain it. Therefore, using language and/or geography as a proxy for an ethnic unit is inadequate for many of the IS populations. This highlights the necessity for more nuanced sampling strategies or corrective statistical approaches, particularly for biomedical and population genetics research in India. PMID:27797945

  13. Commonalities in Development of Pure Breeds and Population Isolates Revealed in the Genome of the Sardinian Fonni's Dog.

    PubMed

    Dreger, Dayna L; Davis, Brian W; Cocco, Raffaella; Sechi, Sara; Di Cerbo, Alessandro; Parker, Heidi G; Polli, Michele; Marelli, Stefano P; Crepaldi, Paola; Ostrander, Elaine A

    2016-10-01

    The island inhabitants of Sardinia have long been a focus for studies of complex human traits due to their unique ancestral background and population isolation reflecting geographic and cultural restriction. Population isolates share decreased genomic diversity, increased linkage disequilibrium, and increased inbreeding coefficients. In many regions, dogs and humans have been exposed to the same natural and artificial forces of environment, growth, and migration. Distinct dog breeds have arisen through human-driven selection of characteristics to meet an ideal standard of appearance and function. The Fonni's Dog, an endemic dog population on Sardinia, has not been subjected to an intensive system of artificial selection, but rather has developed alongside the human population of Sardinia, influenced by geographic isolation and unregulated selection based on its environmental adaptation and aptitude for owner-desired behaviors. Through analysis of 28 dog breeds, represented with whole-genome sequences from 13 dogs and ∼170,000 genome-wide single nucleotide variants from 155 dogs, we have produced a genomic illustration of the Fonni's Dog. Genomic patterns confirm within-breed similarity, while population and demographic analyses provide spatial identity of Fonni's Dog to other Mediterranean breeds. Investigation of admixture and fixation indices reveals insights into the involvement of Fonni's Dogs in breed development throughout the Mediterranean. We describe how characteristics of population isolates are reflected in dog breeds that have undergone artificial selection, and are mirrored in the Fonni's Dog through traditional isolating factors that affect human populations. Lastly, we show that the genetic history of Fonni's Dog parallels demographic events in local human populations. Copyright © 2016 by the Genetics Society of America.

  14. Commonalities in Development of Pure Breeds and Population Isolates Revealed in the Genome of the Sardinian Fonni's Dog

    PubMed Central

    Dreger, Dayna L.; Davis, Brian W.; Cocco, Raffaella; Sechi, Sara; Di Cerbo, Alessandro; Parker, Heidi G.; Polli, Michele; Marelli, Stefano P.; Crepaldi, Paola; Ostrander, Elaine A.

    2016-01-01

    The island inhabitants of Sardinia have long been a focus for studies of complex human traits due to their unique ancestral background and population isolation reflecting geographic and cultural restriction. Population isolates share decreased genomic diversity, increased linkage disequilibrium, and increased inbreeding coefficients. In many regions, dogs and humans have been exposed to the same natural and artificial forces of environment, growth, and migration. Distinct dog breeds have arisen through human-driven selection of characteristics to meet an ideal standard of appearance and function. The Fonni’s Dog, an endemic dog population on Sardinia, has not been subjected to an intensive system of artificial selection, but rather has developed alongside the human population of Sardinia, influenced by geographic isolation and unregulated selection based on its environmental adaptation and aptitude for owner-desired behaviors. Through analysis of 28 dog breeds, represented with whole-genome sequences from 13 dogs and ∼170,000 genome-wide single nucleotide variants from 155 dogs, we have produced a genomic illustration of the Fonni’s Dog. Genomic patterns confirm within-breed similarity, while population and demographic analyses provide spatial identity of Fonni’s Dog to other Mediterranean breeds. Investigation of admixture and fixation indices reveals insights into the involvement of Fonni’s Dogs in breed development throughout the Mediterranean. We describe how characteristics of population isolates are reflected in dog breeds that have undergone artificial selection, and are mirrored in the Fonni’s Dog through traditional isolating factors that affect human populations. Lastly, we show that the genetic history of Fonni’s Dog parallels demographic events in local human populations. PMID:27519604

  15. Comparative Genome Analyses of Serratia marcescens FS14 Reveals Its High Antagonistic Potential

    PubMed Central

    Li, Pengpeng; Kwok, Amy H. Y.; Jiang, Jingwei; Ran, Tingting; Xu, Dongqing; Wang, Weiwu; Leung, Frederick C.

    2015-01-01

    S. marcescens FS14 was isolated from an Atractylodes macrocephala Koidz plant that was infected by Fusarium oxysporum and showed symptoms of root rot. With the completion of the genome sequence of FS14, the first comprehensive comparative-genomic analysis of the Serratia genus was performed. Pan-genome and COG analyses showed that the majority of the conserved core genes are involved in basic cellular functions, while genomic factors such as prophages contribute considerably to genome diversity. Additionally, a Type I restriction-modification system, a Type III secretion system and tellurium resistance genes are found in only some Serratia species. Comparative analysis further identified that S. marcescens FS14 possesses multiple mechanisms for antagonism against other microorganisms, including the production of prodigiosin, bacteriocins, and multi-antibiotic resistant determinants as well as chitinases. The presence of two evolutionarily distinct Type VI secretion systems (T6SSs) in FS14 may provide further competitive advantages for FS14 against other microbes. To our knowledge, this is the first report of comparative analysis on T6SSs in the genus, which identifies four types of T6SSs in Serratia spp. Competition bioassays of FS14 against the vital plant pathogenic bacterium Ralstonia solanacearum and fungi Fusarium oxysporum and Sclerotinia sclerotiorum were performed to support our genomic analyses, in which FS14 demonstrated high antagonistic activities against both bacterial and fungal phytopathogens. PMID:25856195

  16. Metagenomics Reveals Pervasive Bacterial Populations and Reduced Community Diversity across the Alaska Tundra Ecosystem

    DOE PAGES

    Johnston, Eric R.; Rodriguez-R, Luis M.; Luo, Chengwei; ...

    2016-04-25

    How soil microbial communities contrast with respect to taxonomic and functional composition within and between ecosystems remains an unresolved question that is central to predicting how global anthropogenic change will affect soil functioning and services. In particular, it remains unclear how small-scale observations of soil communities based on the typical volume sampled (1-2 g) are generalizable to ecosystem-scale responses and processes. This is especially relevant for remote, northern latitude soils, which are challenging to sample and are also thought to be more vulnerable to climate change compared to temperate soils. Here, we employed well-replicated shotgun metagenome and 16S rRNA gene amplicon sequencing to characterize community composition and metabolic potential in Alaskan tundra soils, combining our own datasets with those publically available from distant tundra and temperate grassland and agriculture habitats. We found that the abundance of many taxa and metabolic functions differed substantially between tundra soil metagenomes relative to those from temperate soils, and that a high degree of OTU-sharing exists between tundra locations. Tundra soils were an order of magnitude less complex than their temperate counterparts, allowing for near-complete coverage of microbial community richness (~92% breadth) by sequencing, and the recovery of 27 high-quality, almost complete (>80% completeness) population bins. These population bins, collectively, made up to ~10% of the metagenomic datasets, and represented diverse taxonomic groups and metabolic lifestyles tuned toward sulfur cycling, hydrogen metabolism, methanotrophy, and organic matter oxidation. Several population bins, including members of Acidobacteria, Actinobacteria, and Proteobacteria, were also present in geographically distant (~100-530 km apart) tundra habitats (full genome representation and up to 99.6% genome-derived average nucleotide identity). Collectively, our results

  17. Metagenomics Reveals Pervasive Bacterial Populations and Reduced Community Diversity across the Alaska Tundra Ecosystem.

    PubMed

    Johnston, Eric R; Rodriguez-R, Luis M; Luo, Chengwei; Yuan, Mengting M; Wu, Liyou; He, Zhili; Schuur, Edward A G; Luo, Yiqi; Tiedje, James M; Zhou, Jizhong; Konstantinidis, Konstantinos T

    2016-01-01

    How soil microbial communities contrast with respect to taxonomic and functional composition within and between ecosystems remains an unresolved question that is central to predicting how global anthropogenic change will affect soil functioning and services. In particular, it remains unclear how small-scale observations of soil communities based on the typical volume sampled (1-2 g) are generalizable to ecosystem-scale responses and processes. This is especially relevant for remote, northern latitude soils, which are challenging to sample and are also thought to be more vulnerable to climate change compared to temperate soils. Here, we employed well-replicated shotgun metagenome and 16S rRNA gene amplicon sequencing to characterize community composition and metabolic potential in Alaskan tundra soils, combining our own datasets with those publically available from distant tundra and temperate grassland and agriculture habitats. We found that the abundance of many taxa and metabolic functions differed substantially between tundra soil metagenomes relative to those from temperate soils, and that a high degree of OTU-sharing exists between tundra locations. Tundra soils were an order of magnitude less complex than their temperate counterparts, allowing for near-complete coverage of microbial community richness (~92% breadth) by sequencing, and the recovery of 27 high-quality, almost complete (>80% completeness) population bins. These population bins, collectively, made up to ~10% of the metagenomic datasets, and represented diverse taxonomic groups and metabolic lifestyles tuned toward sulfur cycling, hydrogen metabolism, methanotrophy, and organic matter oxidation. Several population bins, including members of Acidobacteria, Actinobacteria, and Proteobacteria, were also present in geographically distant (~100-530 km apart) tundra habitats (full genome representation and up to 99.6% genome-derived average nucleotide identity). Collectively, our results revealed that

  18. Metagenomics Reveals Pervasive Bacterial Populations and Reduced Community Diversity across the Alaska Tundra Ecosystem

    PubMed Central

    Johnston, Eric R.; Rodriguez-R, Luis M.; Luo, Chengwei; Yuan, Mengting M.; Wu, Liyou; He, Zhili; Schuur, Edward A. G.; Luo, Yiqi; Tiedje, James M.; Zhou, Jizhong; Konstantinidis, Konstantinos T.

    2016-01-01

    How soil microbial communities contrast with respect to taxonomic and functional composition within and between ecosystems remains an unresolved question that is central to predicting how global anthropogenic change will affect soil functioning and services. In particular, it remains unclear how small-scale observations of soil communities based on the typical volume sampled (1–2 g) are generalizable to ecosystem-scale responses and processes. This is especially relevant for remote, northern latitude soils, which are challenging to sample and are also thought to be more vulnerable to climate change compared to temperate soils. Here, we employed well-replicated shotgun metagenome and 16S rRNA gene amplicon sequencing to characterize community composition and metabolic potential in Alaskan tundra soils, combining our own datasets with those publically available from distant tundra and temperate grassland and agriculture habitats. We found that the abundance of many taxa and metabolic functions differed substantially between tundra soil metagenomes relative to those from temperate soils, and that a high degree of OTU-sharing exists between tundra locations. Tundra soils were an order of magnitude less complex than their temperate counterparts, allowing for near-complete coverage of microbial community richness (~92% breadth) by sequencing, and the recovery of 27 high-quality, almost complete (>80% completeness) population bins. These population bins, collectively, made up to ~10% of the metagenomic datasets, and represented diverse taxonomic groups and metabolic lifestyles tuned toward sulfur cycling, hydrogen metabolism, methanotrophy, and organic matter oxidation. Several population bins, including members of Acidobacteria, Actinobacteria, and Proteobacteria, were also present in geographically distant (~100–530 km apart) tundra habitats (full genome representation and up to 99.6% genome-derived average nucleotide identity). Collectively, our results revealed

  19. PGG.Population: a database for understanding the genomic diversity and genetic ancestry of human populations

    PubMed Central

    Zhang, Chao; Gao, Yang; Liu, Jiaojiao; Xue, Zhe; Lu, Yan; Deng, Lian; Tian, Lei; Feng, Qidi

    2018-01-01

    There are a growing number of studies focusing on delineating genetic variations that are associated with complex human traits and diseases due to recent advances in next-generation sequencing technologies. However, identifying and prioritizing disease-associated causal variants relies on understanding the distribution of genetic variations within and among populations. The PGG.Population database documents 7122 genomes representing 356 global populations from 107 countries and provides essential information for researchers to understand human genomic diversity and genetic ancestry. These data and information can facilitate the design of research studies and the interpretation of results of both evolutionary and medical studies involving human populations. The database is carefully maintained and constantly updated when new data are available. We included miscellaneous functions and a user-friendly graphical interface for visualization of genomic diversity, population relationships (genetic affinity), ancestral makeup, footprints of natural selection, population history, etc. Moreover, PGG.Population provides a useful feature for users to analyze data and visualize results in a dynamic style via online illustration. The long-term ambition of PGG.Population, together with the joint efforts from other researchers who contribute their data to our database, is to create a comprehensive depository of geographic and ethnic variation of the human genome, as well as a platform bringing influence on future practitioners of medicine and clinical investigators. PGG.Population is available at https://www.pggpopulation.org. PMID:29112749

  20. Whole-Genome Genetic Diversity in a Sample of Australians with Deep Aboriginal Ancestry

    PubMed Central

    McEvoy, Brian P.; Lind, Joanne M.; Wang, Eric T.; Moyzis, Robert K.; Visscher, Peter M.; van Holst Pellekaan, Sheila M.; Wilton, Alan N.

    2010-01-01

    Australia was probably settled soon after modern humans left Africa, but details of this ancient migration are not well understood. Debate centers on whether the Pleistocene Sahul continent (composed of New Guinea, Australia, and Tasmania) was first settled by a single wave followed by regional divergence into Aboriginal Australian and New Guinean populations (common origin) or whether different parts of the continent were initially populated independently. Australia has been the subject of relatively few DNA studies even though understanding regional variation in genomic structure and diversity will be important if disease-association mapping methods are to be successfully evaluated and applied across populations. We report on a genome-wide investigation of Australian Aboriginal SNP diversity in a sample of participants from the Riverine region. The phylogenetic relationship of these Aboriginal Australians to a range of other global populations demonstrates a deep common origin with Papuan New Guineans and Melanesians, with little evidence of substantial later migration until the very recent arrival of European colonists. The study provides valuable and robust insights into an early and important phase of human colonization of the globe. A broader survey of Australia, including diverse geographic sample populations, will be required to fully appreciate the continent's unique population history and consequent genetic heritage, as well as the importance of both to the understanding of health issues. PMID:20691402

  1. Genome resequencing and transcriptome profiling reveal structural diversity and expression patterns of constitutive disease resistance genes in Huanglongbing-tolerant Poncirus trifoliata and its hybrids

    PubMed Central

    Rawat, Nidhi; Kumar, Brajendra; Albrecht, Ute; Du, Dongliang; Huang, Ming; Yu, Qibin; Zhang, Yi; Duan, Yong-Ping; Bowman, Kim D; Gmitter, Fred G; Deng, Zhanao

    2017-01-01

    Huanglongbing (HLB) is the most destructive bacterial disease of citrus worldwide. While most citrus varieties are susceptible to HLB, Poncirus trifoliata, a close relative of Citrus, and some of its hybrids with Citrus are tolerant to HLB. No specific HLB tolerance genes have been identified in P. trifoliata but recent studies have shown that constitutive disease resistance (CDR) genes were expressed at much higher levels in HLB-tolerant Poncirus hybrids and the expression of CDR genes was modulated by Candidatus Liberibacter asiaticus (CLas), the pathogen of HLB. The current study was undertaken to mine and characterize the CDR gene family in Citrus and Poncirus and to understand its association with HLB tolerance in Poncirus. We identified 17 CDR genes in two citrus genomes, deduced their structures, and investigated their phylogenetic relationships. We revealed that the expansion of the CDR family in Citrus seems to be due to segmental and tandem duplication events. Through genome resequencing and transcriptome sequencing, we identified eight CDR genes in the Poncirus genome (PtCDR1-PtCDR8). The number of SNPs was the highest in PtCDR2 and the lowest in PtCDR7. Most of the deletion and insertion events were observed in the UTR regions of Citrus and Poncirus CDR genes. PtCDR2 and PtCDR8 were in abundance in the leaf transcriptomes of two HLB-tolerant Poncirus genotypes and were also upregulated in HLB-tolerant Poncirus hybrids as revealed by real-time PCR analysis. These two CDR genes seem to be good candidate genes for future studies of their role in citrus-CLas interactions. PMID:29152310

  2. High intraspecific genome diversity in the model arbuscular mycorrhizal symbiont Rhizophagus irregularis.

    PubMed

    Chen, Eric C H; Morin, Emmanuelle; Beaudet, Denis; Noel, Jessica; Yildirir, Gokalp; Ndikumana, Steve; Charron, Philippe; St-Onge, Camille; Giorgi, John; Krüger, Manuela; Marton, Timea; Ropars, Jeanne; Grigoriev, Igor V; Hainaut, Matthieu; Henrissat, Bernard; Roux, Christophe; Martin, Francis; Corradi, Nicolas

    2018-01-22

    Arbuscular mycorrhizal fungi (AMF) are known to improve plant fitness through the establishment of mycorrhizal symbioses. Genetic and phenotypic variations among closely related AMF isolates can significantly affect plant growth, but the genomic changes underlying this variability are unclear. To address this issue, we improved the genome assembly and gene annotation of the model strain Rhizophagus irregularis DAOM197198, and compared its gene content with five isolates of R. irregularis sampled in the same field. All isolates harbor striking genome variations, with large numbers of isolate-specific genes, gene family expansions, and evidence of interisolate genetic exchange. The observed variability affects all gene ontology terms and PFAM protein domains, as well as putative mycorrhiza-induced small secreted effector-like proteins and other symbiosis differentially expressed genes. High variability is also found in active transposable elements. Overall, these findings indicate a substantial divergence in the functioning capacity of isolates harvested from the same field, and thus their genetic potential for adaptation to biotic and abiotic changes. Our data also provide a first glimpse into the genome diversity that resides within natural populations of these symbionts, and open avenues for future analyses of plant-AMF interactions that link AMF genome variation with plant phenotype and fitness. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.

  3. Morphologic and Genomic Analyses of New Isolates Reveal a Second Lineage of Cedratviruses.

    PubMed

    Rodrigues, Rodrigo Araújo Lima; Andreani, Julien; Andrade, Ana Cláudia Dos Santos Pereira; Machado, Talita Bastos; Abdi, Souhila; Levasseur, Anthony; Abrahão, Jônatas Santos; La Scola, Bernard

    2018-07-01

    Giant viruses have been isolated and characterized in different environments, expanding our knowledge about the biology of these unique microorganisms. In the last 2 years, a new group was discovered, the cedratviruses, currently composed of only two isolates and members of a putative new family, "Pithoviridae," along with previously known pithoviruses. Here we report the isolation and biological and genomic characterization of two novel cedratviruses isolated from samples collected in France and Brazil. Both viruses were isolated using Acanthamoeba castellanii as a host cell and exhibit ovoid particles with corks at either extremity of the particle. Curiously, the Brazilian cedratvirus is ∼20% smaller and presents a shorter genome of 460,038 bp, coding for fewer proteins than other cedratviruses. In addition, it has a completely asyntenic genome and presents a lower amino acid identity of orthologous genes (∼73%). Pangenome analysis comprising the four cedratviruses revealed an increase in the pangenome concomitant with a decrease in the core genome with the addition of the two novel viruses. Finally, phylogenetic analyses clustered the Brazilian virus in a separate branch within the group of cedratviruses, while the French isolate is closer to the previously reported Cedratvirus lausannensis. Taken together, we propose the existence of a second lineage of this emerging viral genus and provide new insights into the biodiversity and ubiquity of these giant viruses. IMPORTANCE Various giant viruses have been described in recent years, revealing a unique part of the virosphere. A new group among the giant viruses has recently been described, the cedratviruses, which is currently composed of only two isolates. In this paper, we describe two novel cedratviruses isolated from French and Brazilian samples. Biological and genomic analyses showed viruses with different particle sizes, genome lengths, and architecture, revealing the existence of a second lineage of

  4. Diversity arrays technology: a generic genome profiling technology on open platforms.

    PubMed

    Kilian, Andrzej; Wenzl, Peter; Huttner, Eric; Carling, Jason; Xia, Ling; Blois, Hélène; Caig, Vanessa; Heller-Uszynska, Katarzyna; Jaccoud, Damian; Hopper, Colleen; Aschenbrenner-Kilian, Malgorzata; Evers, Margaret; Peng, Kaiman; Cayla, Cyril; Hok, Puthick; Uszynski, Grzegorz

    2012-01-01

    In the last 20 years, we have observed an exponential growth of DNA sequence data and a similar increase in the volume of DNA polymorphism data generated by numerous molecular marker technologies. Most of the investment, and therefore progress, concentrated on the human genome and the genomes of selected model species. Diversity Arrays Technology (DArT), developed over a decade ago, was among the first "democratizing" genotyping technologies, as its performance was primarily driven by the level of DNA sequence variation in the species rather than by the level of financial investment. DArT also proved more robust to genome size and ploidy-level differences among approximately 60 organisms for which DArT was developed to date compared to other high-throughput genotyping technologies. The success of DArT in a number of organisms, including a wide range of "orphan crops," can be attributed to the simplicity of underlying concepts: DArT combines genome complexity reduction methods enriching for genic regions with a highly parallel assay readout on a number of "open-access" microarray platforms. The quantitative nature of the assay enabled a number of applications in which allelic frequencies can be estimated from DArT arrays. A typical DArT assay tests tens of thousands of genomic loci for polymorphism, with the final number of markers reported (hundreds to thousands) reflecting the level of DNA sequence variation in the tested loci. Detailed DArT methods, protocols, and a range of their application examples as well as DArT's evolution path are presented.

  5. Genome-wide resequencing of KRICE_CORE reveals their potential for future breeding, as well as functional and evolutionary studies in the post-genomic era.

    PubMed

    Kim, Tae-Sung; He, Qiang; Kim, Kyu-Won; Yoon, Min-Young; Ra, Won-Hee; Li, Feng Peng; Tong, Wei; Yu, Jie; Oo, Win Htet; Choi, Buung; Heo, Eun-Beom; Yun, Byoung-Kook; Kwon, Soon-Jae; Kwon, Soon-Wook; Cho, Yoo-Hyun; Lee, Chang-Yong; Park, Beom-Seok; Park, Yong-Jin

    2016-05-26

    Rice germplasm collections continue to grow in number and size around the world. Since maintaining and screening such massive resources remains challenging, it is important to establish practical methods to manage them. A core collection, by definition, refers to a subset of the entire population that preserves the majority of genetic diversity, enhancing the efficiency of germplasm utilization. Here, we report whole-genome resequencing of the 137 accessions of the rice mini core collection, or Korean rice core set (KRICE_CORE), that represents 25,604 rice germplasms deposited in the Korean genebank of the Rural Development Administration (RDA). We used the Illumina HiSeq 2000 and 2500 platforms to produce short reads and then assembled those at 9.8× depth using Nipponbare as a reference. Comparisons of the sequences with the reference genome yielded more than 15 million (M) single nucleotide polymorphisms (SNPs) and 1.3 M INDELs. Phylogenetic and population analyses using 2,046,529 high-quality SNPs successfully assigned rice accessions to the relevant rice subgroups, suggesting that these SNPs capture evolutionary signatures that have accumulated in rice subpopulations. Furthermore, genome-wide association studies (GWAS) for four exemplary agronomic traits in KRICE_CORE demonstrate its utility; that is, identifying previously defined genes or novel genetic factors that potentially regulate important phenotypes. This study provides strong evidence that KRICE_CORE is small in size but contains high genetic and functional diversity across the genome. Thus, our resequencing results will be useful for future breeding, as well as functional and evolutionary studies, in the post-genomic era.

  6. Integrative Genomics Reveals Novel Molecular Pathways and Gene Networks for Coronary Artery Disease

    PubMed Central

    Mäkinen, Ville-Petteri; Civelek, Mete; Meng, Qingying; Zhang, Bin; Zhu, Jun; Levian, Candace; Huan, Tianxiao; Segrè, Ayellet V.; Ghosh, Sujoy; Vivar, Juan; Nikpay, Majid; Stewart, Alexandre F. R.; Nelson, Christopher P.; Willenborg, Christina; Erdmann, Jeanette; Blakenberg, Stefan; O'Donnell, Christopher J.; März, Winfried; Laaksonen, Reijo; Epstein, Stephen E.; Kathiresan, Sekar; Shah, Svati H.; Hazen, Stanley L.; Reilly, Muredach P.; Lusis, Aldons J.; Samani, Nilesh J.; Schunkert, Heribert; Quertermous, Thomas; McPherson, Ruth; Yang, Xia; Assimes, Themistocles L.

    2014-01-01

    The majority of the heritability of coronary artery disease (CAD) remains unexplained, despite recent successes of genome-wide association studies (GWAS) in identifying novel susceptibility loci. Integrating functional genomic data from a variety of sources with a large-scale meta-analysis of CAD GWAS may facilitate the identification of novel biological processes and genes involved in CAD, as well as clarify the causal relationships of established processes. Towards this end, we integrated 14 GWAS from the CARDIoGRAM Consortium and two additional GWAS from the Ottawa Heart Institute (25,491 cases and 66,819 controls) with 1) genetics of gene expression studies of CAD-relevant tissues in humans, 2) metabolic and signaling pathways from public databases, and 3) data-driven, tissue-specific gene networks from a multitude of human and mouse experiments. We not only detected CAD-associated gene networks of lipid metabolism, coagulation, immunity, and additional networks with no clear functional annotation, but also revealed key driver genes for each CAD network based on the topology of the gene regulatory networks. In particular, we found a gene network involved in antigen processing to be strongly associated with CAD. The key driver genes of this network included glyoxalase I (GLO1) and peptidylprolyl isomerase I (PPIL1), which we verified as regulatory by siRNA experiments in human aortic endothelial cells. Our results suggest genetic influences on a diverse set of both known and novel biological processes that contribute to CAD risk. The key driver genes for these networks highlight potential novel targets for further mechanistic studies and therapeutic interventions. PMID:25033284

  7. Whole-genome sequencing reveals the extent of heterozygosity in a preferentially self-fertilizing hermaphroditic vertebrate.

    PubMed

    Lins, Luana S F; Trojahn, Shawn; Sockell, Alexandra; Yee, Muh-Ching; Tatarenkov, Andrey; Bustamante, Carlos D; Earley, Ryan L; Kelley, Joanna L

    2018-04-01

    The mangrove rivulus, Kryptolebias marmoratus, is one of only two self-fertilizing hermaphroditic fish species and inhabits mangrove forests. While selfing can be advantageous, it reduces heterozygosity and decreases genetic diversity. Studies using microsatellites found that there are variable levels of selfing among populations of K. marmoratus, but overall, there is a low rate of outcrossing and, therefore, low heterozygosity. In this study, we used whole-genome data to assess the levels of heterozygosity in different lineages of the mangrove rivulus and infer the phylogenetic relationships among those lineages. We sequenced whole genomes from 15 lineages that were completely homozygous at microsatellite loci and used single nucleotide polymorphisms (SNPs) to determine heterozygosity levels. More variation was uncovered than in studies using microsatellite data because of the resolution of full genome sequencing data. Moreover, missense polymorphisms were found most often in genes associated with immune function and reproduction. Inferred phylogenetic relationships suggest that lineages largely group by their geographic distribution. The use of whole-genome data provided further insight into genetic diversity in this unique species. Although this study was limited by the number of lineages that were available, these data suggest that there is previously undescribed variation within lineages of K. marmoratus that could have functional consequences and (or) inform us about the limits to selfing (e.g., genetic load, accumulation of deleterious mutations) and selection that might favor the maintenance of heterozygosity. These results highlight the need to sequence additional individuals within and among lineages.

  8. Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes

    USDA-ARS's Scientific Manuscript database

    In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approxima...

  9. Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

    PubMed Central

    Huang, Yongjie; Mrázek, Jan

    2014-01-01

    Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877

  10. Assessment of genome origins and genetic diversity in the genus Eleusine with DNA markers.

    PubMed

    Salimath, S S; de Oliveira, A C; Godwin, I D; Bennetzen, J L

    1995-08-01

    Finger millet (Eleusine coracana), an allotetraploid cereal, is widely cultivated in the arid and semiarid regions of the world. Three DNA marker techniques, restriction fragment length polymorphism (RFLP), randomly amplified polymorphic DNA (RAPD), and inter simple sequence repeat amplification (ISSR), were employed to analyze 22 accessions belonging to 5 species of Eleusine. An 8 probe–3 enzyme RFLP combination, 18 RAPD primers, and 6 ISSR primers, respectively, revealed 14, 10, and 26% polymorphism in 17 accessions of E. coracana from Africa and Asia. These results indicated a very low level of DNA sequence variability in the finger millets but did allow each line to be distinguished. The different Eleusine species could be easily identified by DNA marker technology and the 16% intraspecific polymorphism exhibited by the two analyzed accessions of E. floccifolia suggested a much higher level of diversity in this species than in E. coracana. Between species, E. coracana and E. indica shared the most markers, while E. indica and E. tristachya shared a considerable number of markers, indicating that these three species form a close genetic assemblage within the Eleusine. Eleusine floccifolia and E. compressa were found to be the most divergent among the species examined. Comparison of RFLP, RAPD, and ISSR technologies, in terms of the quantity and quality of data output, indicated that ISSRs are particularly promising for the analysis of plant genome diversity.

  11. Comparative genomics reveals adaptations of a halotolerant thaumarchaeon in the interfaces of brine pools in the Red Sea.

    PubMed

    Kamanda Ngugi, David; Blom, Jochen; Alam, Intikhab; Rashid, Mamoon; Ba-Alawi, Wail; Zhang, Guishan; Hikmawan, Tyas; Guan, Yue; Antunes, Andre; Siam, Rania; El Dorry, Hamza; Bajic, Vladimir; Stingl, Ulrich

    2015-02-01

    The bottom of the Red Sea harbors over 25 deep hypersaline anoxic basins that are geochemically distinct and characterized by vertical gradients of extreme physicochemical conditions. Because of strong changes in density, particulate and microbial debris get entrapped in the brine-seawater interface (BSI), resulting in increased dissolved organic carbon, reduced dissolved oxygen toward the brines and enhanced microbial activities in the BSI. These features coupled with the deep-sea prevalence of ammonia-oxidizing archaea (AOA) in the global ocean make the BSI a suitable environment for studying the osmotic adaptations and ecology of these important players in the marine nitrogen cycle. Using phylogenomic-based approaches, we show that the local archaeal community of five different BSI habitats (with up to 18.2% salinity) is composed mostly of a single, highly abundant Nitrosopumilus-like phylotype that is phylogenetically distinct from the bathypelagic thaumarchaea; ammonia-oxidizing bacteria were absent. The composite genome of this novel Nitrosopumilus-like subpopulation (RSA3) co-assembled from multiple single-cell amplified genomes (SAGs) from one such BSI habitat further revealed that it shares ∼54% of its predicted genomic inventory with sequenced Nitrosopumilus species. RSA3 also carries several, albeit variable gene sets that further illuminate the phylogenetic diversity and metabolic plasticity of this genus. Specifically, it encodes for a putative proline-glutamate 'switch' with a potential role in osmotolerance and indirect impact on carbon and energy flows. Metagenomic fragment recruitment analyses against the composite RSA3 genome, Nitrosopumilus maritimus, and SAGs of mesopelagic thaumarchaea also reiterate the divergence of the BSI genotypes from other AOA.

  12. A genomic perspective on the generation and maintenance of genetic diversity in herbivorous insects

    PubMed Central

    Gloss, Andrew D.; Groen, Simon C.; Whiteman, Noah K.

    2017-01-01

    Understanding the processes that generate and maintain genetic variation within populations is a central goal in evolutionary biology. Theory predicts that some of this variation is maintained as a consequence of adapting to variable habitats. Studies in herbivorous insects have played a key role in confirming this prediction. Here, we highlight theoretical and conceptual models for the maintenance of genetic diversity in herbivorous insects, empirical genomic studies testing these models, and pressing questions within the realm of evolutionary and functional genomic studies. To address key gaps, we propose an integrative approach combining population genomic scans for adaptation, genome-wide characterization of targets of selection through experimental manipulations, mapping the genetic architecture of traits influencing fitness, and functional studies. We also stress the importance of studying the maintenance of genetic variation across biological scales—from variation within populations to divergence among populations—to form a comprehensive view of adaptation in herbivorous insects. PMID:28736510

  13. Karyotype diversity and genome size variation in Neotropical Maxillariinae orchids.

    PubMed

    Moraes, A P; Koehler, S; Cabral, J S; Gomes, S S L; Viccini, L F; Barros, F; Felix, L P; Guerra, M; Forni-Martins, E R

    2017-03-01

    Orchidaceae is a widely distributed plant family with very diverse vegetative and floral morphology, and such variability is also reflected in their karyotypes. However, since only a low proportion of Orchidaceae has been analysed for chromosome data, greater diversity may yet be unveiled. Here we analyse both genome size (GS) and karyotype in two subtribes recently included in the broadened Maxillariinae to detect how much chromosome and GS variation there is in these groups and to evaluate which genome rearrangements are involved in the species evolution. To do so, the GS (14 species) and the karyotype - based on chromosome number, heterochromatic banding and 5S and 45S rDNA localisation (18 species) - were characterised and analysed along with published data using phylogenetic approaches. The GS presented a high phylogenetic correlation and it was related to morphological groups in Bifrenaria (larger plants - higher GS). The two largest GS found among genera were caused by different mechanisms: polyploidy in Bifrenaria tyrianthina and accumulation of repetitive DNA in Scuticaria hadwenii. The chromosome number variability was caused mainly through descending dysploidy, and x=20 was estimated as the base chromosome number. Combining GS and karyotype data with molecular phylogeny, our data provide a more complete scenario of the karyotype evolution in Maxillariinae orchids, allowing us to suggest that, besides dysploidy, inversions and transposable elements are two mechanisms involved in the karyotype evolution. Such karyotype modifications could be associated with niche changes that occurred during species evolution. © 2016 German Botanical Society and The Royal Botanical Society of the Netherlands.

  14. Microbial diversity and activity in the Nematostella vectensis holobiont: insights from 16S rRNA gene sequencing, isolate genomes, and a pilot-scale survey of gene expression.

    PubMed

    Har, Jia Y; Helbig, Tim; Lim, Ju H; Fernando, Samodha C; Reitzel, Adam M; Penn, Kevin; Thompson, Janelle R

    2015-01-01

    We have characterized the molecular and genomic diversity of the microbiota of the starlet sea anemone Nematostella vectensis, a cnidarian model for comparative developmental and functional biology and a year-round inhabitant of temperate salt marshes. Molecular phylogenetic analysis of 16S rRNA gene clone libraries revealed four ribotypes associated with N. vectensis at multiple locations and times. These associates include two novel ribotypes within the ε-Proteobacterial order Campylobacterales and the Spirochetes, respectively, each sharing <85% identity with cultivated strains, and two γ-Proteobacterial ribotypes sharing >99% 16S rRNA identity with Endozoicomonas elysicola and Pseudomonas oleovorans, respectively. Species-specific PCR revealed that these populations persisted in N. vectensis asexually propagated under laboratory conditions. cDNA indicated expression of the Campylobacterales and Endozoicomonas 16S rRNA in anemones from Sippewissett Marsh, MA. A collection of bacteria from laboratory-raised N. vectensis was dominated by isolates from P. oleovorans and Rhizobium radiobacter. Isolates from field-collected anemones revealed an association with Limnobacter and Stappia isolates. Genomic DNA sequencing was carried out on 10 cultured bacterial isolates representing field- and laboratory-associates, i.e., Limnobacter spp., Stappia spp., P. oleovorans and R. radiobacter. Genomes contained multiple genes identified as virulence (host-association) factors while S. stellulata and L. thiooxidans genomes revealed pathways for mixotrophic sulfur oxidation. A pilot metatranscriptome of laboratory-raised N. vectensis was compared to the isolate genomes and indicated expression of ORFs from L. thiooxidans with predicted functions of motility, nutrient scavenging (Fe and P), polyhydroxyalkanoate synthesis for carbon storage, and selective permeability (porins). We hypothesize that such activities may mediate acclimation and persistence of bacteria in a N

  15. Microbial diversity and activity in the Nematostella vectensis holobiont: insights from 16S rRNA gene sequencing, isolate genomes, and a pilot-scale survey of gene expression

    PubMed Central

    Har, Jia Y.; Helbig, Tim; Lim, Ju H.; Fernando, Samodha C.; Reitzel, Adam M.; Penn, Kevin; Thompson, Janelle R.

    2015-01-01

    We have characterized the molecular and genomic diversity of the microbiota of the starlet sea anemone Nematostella vectensis, a cnidarian model for comparative developmental and functional biology and a year-round inhabitant of temperate salt marshes. Molecular phylogenetic analysis of 16S rRNA gene clone libraries revealed four ribotypes associated with N. vectensis at multiple locations and times. These associates include two novel ribotypes within the ε-Proteobacterial order Campylobacterales and the Spirochetes, respectively, each sharing <85% identity with cultivated strains, and two γ-Proteobacterial ribotypes sharing >99% 16S rRNA identity with Endozoicomonas elysicola and Pseudomonas oleovorans, respectively. Species-specific PCR revealed that these populations persisted in N. vectensis asexually propagated under laboratory conditions. cDNA indicated expression of the Campylobacterales and Endozoicomonas 16S rRNA in anemones from Sippewissett Marsh, MA. A collection of bacteria from laboratory-raised N. vectensis was dominated by isolates from P. oleovorans and Rhizobium radiobacter. Isolates from field-collected anemones revealed an association with Limnobacter and Stappia isolates. Genomic DNA sequencing was carried out on 10 cultured bacterial isolates representing field- and laboratory-associates, i.e., Limnobacter spp., Stappia spp., P. oleovorans and R. radiobacter. Genomes contained multiple genes identified as virulence (host-association) factors while S. stellulata and L. thiooxidans genomes revealed pathways for mixotrophic sulfur oxidation. A pilot metatranscriptome of laboratory-raised N. vectensis was compared to the isolate genomes and indicated expression of ORFs from L. thiooxidans with predicted functions of motility, nutrient scavenging (Fe and P), polyhydroxyalkanoate synthesis for carbon storage, and selective permeability (porins). We hypothesize that such activities may mediate acclimation and persistence of bacteria in a N

  16. Genomic diversity of necrotic enteritis-associated strains of Clostridium perfringens: a review.

    PubMed

    Lacey, Jake A; Johanesen, Priscilla A; Lyras, Dena; Moore, Robert J

    2016-06-01

    The investigation of genomic variation between Clostridium perfringens isolates from poultry has been an important tool to enhance our understanding of the genetic basis of strain pathogenicity and the epidemiology of virulent and avirulent strains within the context of necrotic enteritis (NE). The earliest studies used whole genome profiling techniques such as pulsed-field gel electrophoresis to differentiate isolates and determine their relative levels of relatedness. DNA sequencing has been used to investigate genetic variation in (a) individual genes, such as those encoding the alpha and NetB toxins; (b) panels of housekeeping genes for multi-locus sequence typing; and (c) most recently, whole genomes, to build a more complete picture of genomic differences between isolates. Conclusions drawn from these studies include: differential carriage of large conjugative plasmids accounts for a large proportion of inter-strain differences; plasmid-encoded genes are more highly conserved than chromosomal genes, perhaps indicating a relatively recent origin for the plasmids; isolates from NE-affected birds fall into three distinct sequence-based clades while non-pathogenic isolates from healthy birds tend to be more genomically diverse. Overall, the NE-causing strains are closely related to C. perfringens isolates from other birds and other diseases whereas the non-pathogenic poultry strains are generally more remotely related to either the pathogenic strains or the strains from other birds. Genomic analysis has indicated that genes in addition to netB are associated with NE pathogenic isolates. Collectively, this work has resulted in a deeper understanding of the pathogenesis of this important poultry disease.

  17. Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea

    PubMed Central

    2014-01-01

    Background: Brassica oleracea is a valuable vegetable species that has contributed to human health and nutrition for hundreds of years and comprises multiple distinct cultivar groups with diverse morphological and phytochemical attributes. In addition to this phenotypic wealth, B. oleracea offers unique insights into polyploid evolution, as it results from multiple ancestral polyploidy events and a final Brassiceae-specific triplication event. Further, B. oleracea represents one of the diploid genomes that formed the economically important allopolyploid oilseed, Brassica napus. A deeper understanding of B. oleracea genome architecture provides a foundation for crop improvement strategies throughout the Brassica genus. Results: We generate an assembly representing 75% of the predicted B. oleracea genome using a hybrid Illumina/Roche 454 approach. Two dense genetic maps are generated to anchor almost 92% of the assembled scaffolds to nine pseudo-chromosomes. Over 50,000 genes are annotated and 40% of the genome predicted to be repetitive, thus contributing to the increased genome size of B. oleracea compared to its close relative B. rapa. A snapshot of both the leaf transcriptome and methylome allows comparisons to be made across the triplicated sub-genomes, which resulted from the most recent Brassiceae-specific polyploidy event. Conclusions: Differential expression of the triplicated syntelogs and cytosine methylation levels across the sub-genomes suggest residual marks of the genome dominance that led to the current genome architecture. Although cytosine methylation does not correlate with individual gene dominance, the independent methylation patterns of triplicated copies suggest epigenetic mechanisms play a role in the functional diversification of duplicate genes. PMID:24916971

  18. Genomic Reconstruction of the History of Native Sheep Reveals the Peopling Patterns of Nomads and the Expansion of Early Pastoralism in East Asia

    PubMed Central

    Zhao, Yong-Xin; Yang, Ji; Lv, Feng-Hua; Hu, Xiao-Ju; Xie, Xing-Long; Zhang, Min; Li, Wen-Rong; Liu, Ming-Jun; Wang, Yu-Tao; Li, Jin-Quan; Liu, Yong-Gang; Ren, Yan-Ling; Wang, Feng; Hehua, EEr; Kantanen, Juha; Arjen Lenstra, Johannes; Han, Jian-Lin; Li, Meng-Hua

    2017-01-01

    China has a rich resource of native sheep (Ovis aries) breeds associated with historical movements of several nomadic societies. However, the history of sheep and the associated nomadic societies in ancient China remains poorly understood. Here, we studied the genomic diversity of Chinese sheep using genome-wide SNPs, mitochondrial and Y-chromosomal variations in >1,000 modern samples. Population genomic analyses combined with archeological records and historical ethnic demographics data revealed genetic signatures of the origins, secondary expansions and admixtures of Chinese sheep, thereby revealing the peopling patterns of nomads and the expansion of early pastoralism in East Asia. Originating from the Mongolian Plateau ∼5,000‒5,700 years ago, Chinese sheep were inferred to spread in the upper and middle reaches of the Yellow River ∼3,000‒5,000 years ago following the expansions of the Di-Qiang people. Sheep were then inferred to reach the Qinghai-Tibetan and Yunnan-Kweichow plateaus ∼2,000‒2,600 years ago by following the north-to-southwest routes of the Di-Qiang migration. We also unveiled two subsequent waves of migrations of fat-tailed sheep into northern China, which were largely commensurate with the migrations of ancestors of Hui Muslims eastward and Mongols southward during the 12th‒13th centuries. Furthermore, we revealed signs of argali introgression into domestic sheep, extensive historical mixtures among domestic populations and strong artificial selection for tail type and other traits, reflecting various breeding strategies by nomadic societies in ancient China. PMID:28645168

  19. First genome sequences of Achromobacter phages reveal new members of the N4 family.

    PubMed

    Wittmann, Johannes; Dreiseikelmann, Brigitte; Rohde, Manfred; Meier-Kolthoff, Jan P; Bunk, Boyke; Rohde, Christine

    2014-01-27

    Multi-resistant Achromobacter xylosoxidans has been recognized in recent years as an emerging pathogen causing nosocomially acquired infections. Phages as natural opponents could be an alternative to fight such infections. Bacteriophages against this opportunistic pathogen were isolated in a recent study. This study shows a molecular analysis of two podoviruses and provides the first insights into the genomic structure of Achromobacter phages. Growth curve experiments and adsorption kinetics were performed for both phages. Adsorption and propagation in cells were visualized by electron microscopy. Both phage genomes were sequenced with the PacBio RS II system based on single molecule, real-time (SMRT) technology and annotated with several bioinformatic tools. To further elucidate the evolutionary relationships between the phage genomes, a phylogenomic analysis was conducted using the Genome BLAST Distance Phylogeny (GBDP) approach. In this study, we present the first detailed analysis of genome sequences of two Achromobacter phages. Phages JWAlpha and JWDelta were isolated from two different waste water treatment plants in Germany. Both phages belong to the Podoviridae and contain linear, double-stranded DNA with a length of 72,329 bp and 73,659 bp, respectively. 92 and 89 putative open reading frames were identified for JWAlpha and JWDelta, respectively, by bioinformatic analysis with several tools. The genomes have nearly the same organization and could be divided into different clusters for transcription, replication, host interaction, head and tail structure and lysis. Detailed annotation via protein comparisons with BLASTP revealed strong similarities to N4-like phages. Analysis of the genomes of Achromobacter phages JWAlpha and JWDelta and comparisons of different gene clusters with other phages revealed that they might be strongly related to other N4-like phages, especially of the Escherichia group. Although all these phages show a highly

  20. Comparative genomics analyses revealed two virulent Listeria monocytogenes strains isolated from ready-to-eat food.

    PubMed

    Lim, Shu Yong; Yap, Kien-Pong; Thong, Kwai Lin

    2016-01-01

    Listeria monocytogenes is an important foodborne pathogen that causes considerable morbidity in humans with high mortality rates. In this study, we have sequenced the genomes and performed comparative genomics analyses on two strains, LM115 and LM41, isolated from ready-to-eat food in Malaysia. The genome sizes of LM115 and LM41 were 2,959,041 and 2,963,111 bp, respectively. These two strains shared approximately 90% homologous genes. Comparative genomics and phylogenomic analyses revealed that LM115 and LM41 were more closely related to the reference strains F2365 and EGD-e, respectively. Our virulence profiling indicated a total of 31 virulence genes shared by both analysed strains. These shared genes included those that encode internalins and L. monocytogenes pathogenicity island 1 (LIPI-1). Both Malaysian L. monocytogenes strains also harboured several genes associated with stress tolerance to counter adverse conditions. Seven antibiotic and efflux pump related genes, which may confer resistance against lincomycin, erythromycin, fosfomycin, quinolone, tetracycline, penicillin, and macrolides, were identified in the genomes of both strains. Whole genome sequencing and comparative genomics analyses revealed two virulent L. monocytogenes strains isolated from ready-to-eat foods in Malaysia. The identification of strains with pathogenic, persistent, and antibiotic-resistant potential from minimally processed food warrants close attention from both the healthcare and food industries.