Science.gov

Sample records for haplotype-specific genomic diversity

  1. The diversity of fungal genome.

    PubMed

    Mohanta, Tapan Kumar; Bae, Hanhong

    2015-01-01

    The genome size of an organism varies from species to species. The C-value paradox is a very complex puzzle with regard to the vast diversity in genome sizes in eukaryotes. Here we report detailed genomic information for 172 fungal species and find that fungal genomes are very diverse in nature. In fungi, genome size varies from 8.97 Mb to 177.57 Mb. The average genome sizes of Ascomycota and Basidiomycota fungi are 36.91 and 46.48 Mb, respectively. A higher average genome size is observed in Oomycota species (74.85 Mb), a lineage of fungus-like eukaryotic microorganisms. The average number of coding genes in Oomycota species is almost double that of Ascomycota and Basidiomycota fungi. PMID:25866485

  2. Genome diversity of Shigella boydii.

    PubMed

    Kania, Dane A; Hazen, Tracy H; Hossain, Anowar; Nataro, James P; Rasko, David A

    2016-06-01

    Shigella boydii is one of the four Shigella species that causes disease worldwide; however, there are few published studies that examine the genomic variation of this species. This study compares genomes of 72 total isolates; 28 S. boydii from Bangladesh and The Gambia that were recently isolated as part of the Global Enteric Multicenter Study (GEMS), 14 historical S. boydii genomes in the public domain and 30 Escherichia coli and Shigella reference genomes that represent the genomic diversity of these pathogens. This comparative analysis of these 72 genomes identified that the S. boydii isolates separate into three phylogenomic clades, each with specific gene content. Each of the clades contains S. boydii isolates from geographic and temporally distant sources, indicating that the S. boydii isolates from the GEMS are representative of S. boydii. This study describes the genome sequences of a collection of novel S. boydii isolates and provides insight into the diversity of this species in comparison to the E. coli and other Shigella species. PMID:27056949

  3. Human Genome Diversity workshop 1

    SciTech Connect

    1992-12-31

    The Human Genome Diversity Project (HGD) is an international interdisciplinary program whose goal is to reveal as much as possible about the current state of genetic diversity among humans and the processes that were responsible for that diversity. Classical premolecular techniques have already proved that a significant component of human genetic variability lies within populations rather than among them. New molecular techniques will permit a dramatic increase in the resolving power of genetic analysis at the population level. Recent social changes in many parts of the world threaten the identity of a number of populations that may be extremely important for understanding human evolutionary history. It is therefore urgent to conduct research on human variation in these areas, while there is still time. The plan is to identify the most representative descendants of ancestral human populations worldwide and then to preserve genetic records of these populations. This is a report of the Population Genetics Workshop (Workshop 1), the first of three to be held to plan HGD, which was focused on sampling strategies and analytic methods from population genetics. The topics discussed were sampling and population structure; analysis of populations; drift versus natural selection; modeling migration and population subdivision; and population structure and subdivision.

  4. The Human Genome Diversity Project

    SciTech Connect

    Cavalli-Sforza, L.

    1994-12-31

    The Human Genome Diversity Project (HGD Project) is an international anthropology project that seeks to study the genetic richness of the entire human species. This kind of genetic information can add a unique thread to the tapestry of knowledge of humanity. Culture, environment, history, and other factors are often more important, but humanity's genetic heritage, when analyzed with recent technology, brings another type of evidence for understanding the species' past and present. The Project will deepen the understanding of this genetic richness and show both humanity's diversity and its deep and underlying unity. The HGD Project is still largely in its planning stages, seeking the best ways to reach its goals. The continuing discussions of the Project, throughout the world, should improve the plans for the Project and their implementation. The Project is as global as humanity itself; its implementation will require the kinds of partnerships among different nations and cultures that make the involvement of UNESCO and other international organizations particularly appropriate. The author will briefly discuss the Project's history, describe the Project, set out the core principles of the Project, and demonstrate how the Project will help combat the scourge of racism.

  5. Genomic architecture of human neuroanatomical diversity.

    PubMed

    Toro, R; Poline, J-B; Huguet, G; Loth, E; Frouin, V; Banaschewski, T; Barker, G J; Bokde, A; Büchel, C; Carvalho, F M; Conrod, P; Fauth-Bühler, M; Flor, H; Gallinat, J; Garavan, H; Gowland, P; Heinz, A; Ittermann, B; Lawrence, C; Lemaître, H; Mann, K; Nees, F; Paus, T; Pausova, Z; Rietschel, M; Robbins, T; Smolka, M N; Ströhle, A; Schumann, G; Bourgeron, T

    2015-08-01

    Human brain anatomy is strikingly diverse and highly inheritable: genetic factors may explain up to 80% of its variability. Prior studies have tried to detect genetic variants with a large effect on neuroanatomical diversity, but those currently identified account for <5% of the variance. Here, based on our analyses of neuroimaging and whole-genome genotyping data from 1765 subjects, we show that up to 54% of this heritability is captured by large numbers of single-nucleotide polymorphisms of small-effect spread throughout the genome, especially within genes and close regulatory regions. The genetic bases of neuroanatomical diversity appear to be relatively independent of those of body size (height), but shared with those of verbal intelligence scores. The study of this genomic architecture should help us better understand brain evolution and disease. PMID:25224261

  6. Consequences of genomic diversity in Mycobacterium tuberculosis.

    PubMed

    Coscolla, Mireia; Gagneux, Sebastien

    2014-12-01

    The causative agent of human tuberculosis, Mycobacterium tuberculosis complex (MTBC), comprises seven phylogenetically distinct lineages associated with different geographical regions. Here we review the latest findings on the nature and amount of genomic diversity within and between MTBC lineages. We then review recent evidence for the effect of this genomic diversity on mycobacterial phenotypes measured experimentally and in clinical settings. We conclude that overall, the most geographically widespread Lineage 2 (includes Beijing) and Lineage 4 (also known as Euro-American) are more virulent than other lineages that are more geographically restricted. This increased virulence is associated with delayed or reduced pro-inflammatory host immune responses, greater severity of disease, and enhanced transmission. Future work should focus on the interaction between MTBC and human genetic diversity, as well as on the environmental factors that modulate these interactions. PMID:25453224

  7. Does M. tuberculosis genomic diversity explain disease diversity?

    PubMed Central

    Coscolla, Mireilla; Gagneux, Sebastien

    2010-01-01

    The outcome of tuberculosis infection and disease is highly variable. This variation has been attributed primarily to host and environmental factors, but better understanding of the global genomic diversity in the M. tuberculosis complex (MTBC) suggests that bacterial factors could also be involved. Review of nearly 100 published reports shows that MTBC strains differ in their virulence and immunogenicity in experimental models, but whether this phenotypic variation plays a role in human disease remains unclear. Given the complex interactions between the host, the pathogen and the environment, linking MTBC genotypic diversity to experimental and clinical phenotypes requires an integrated systems epidemiology approach embedded in a robust evolutionary framework. PMID:21076640

  8. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    PubMed Central

    Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi-Xuan; Han, Bin; Kurata, Nori

    2016-01-01

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype–phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/. PMID:26578696

  9. Genomes to Life Diversity Initiative

    SciTech Connect

    McClure, Thomas

    2010-03-15

    This was a collaborative initiative between Western Carolina University, Furman University and the University of North Carolina-Asheville. At each of the institutions, funds from the grant award were used for the acquisition of mostly microscopy laboratory equipment, supporting supplies and necessary training as appropriate. The funds were distributed as follows: $495,000 to Western Carolina University; $130,000 to Furman University; and $100,000 to the University of North Carolina-Asheville, for a total award of $725,000 from DOE. Western Carolina University purchased significant instrumentation with funds from this award, including, among others, fermenters, a confocal microscope, and an automated sequencer. The fermenters have been used in research and courses and to prepare biochemical materials for both. The confocal microscope has provided Western students and faculty with unique imaging opportunities not generally available except in medical schools. Unlike regular optical microscopy, confocal microscopy offers a three-dimensional image that can be viewed from different angles. In addition, the device has been set up to be controlled from remote locations, providing high school and higher education students across Western North Carolina with the opportunity to use state-of-the-art instrumentation from their location. One of the goals of this collaboration was to get more high school students interested in science. The automated sequencer has become a very significant instructional and research tool. It has been widely used for characterizing the oak genome, which has very significant implications for Western North Carolina. More recently, it has been used for groundbreaking forensic science research. This device has been used to create a database to identify unidentified persons. The instrument has also been used in several undergraduate and graduate courses, where students learn the principles and operation of this very important instrument.

  10. Genomic Diversity of Escherichia Isolates from Diverse Habitats

    PubMed Central

    Yoder-Himes, Deborah R.; Tiedje, James M.; Konstantinidis, Konstantinos T.

    2012-01-01

    Our understanding of the Escherichia genus is heavily biased toward pathogenic or commensal isolates from human or animal hosts. Recent studies have recovered Escherichia isolates that persist, and even grow, outside these hosts. Although the environmental isolates are typically phylogenetically distinct, they are highly related to and phenotypically indistinguishable from their human counterparts, including for the coliform test. To gain insights into the genomic diversity of Escherichia isolates from diverse habitats, including freshwater, soil, animal, and human sources, we carried out comparative DNA-DNA hybridizations using a multi-genome E. coli DNA microarray. The microarray was validated based on hybridizations with selected strains whose genome sequences were available and used to assess the frequency of microarray false positive and negative signals. Our results showed that human fecal isolates share two sets of genes (n>90) that are rarely found among environmental isolates, including genes presumably important for evading host immune mechanisms (e.g., a multi-drug transporter for acids and antimicrobials) and adhering to epithelial cells (e.g., hemolysin E and fimbrial-like adhesin protein). These results imply that environmental isolates are characterized by decreased ability to colonize host cells relative to human isolates. Our study also provides gene markers that can distinguish human isolates from those of warm-blooded animal and environmental origins, and thus can be used to more reliably assess fecal contamination in natural ecosystems. PMID:23056556

  11. Galaxy tools to study genome diversity

    PubMed Central

    2013-01-01

    Background Intra-species genetic variation can be used to investigate population structure, selection, and gene flow in non-model vertebrates; and due to the plummeting costs for genome sequencing, it is now possible for small labs to obtain full-genome variation data from their species of interest. However, those labs may not have easy access to, and familiarity with, computational tools to analyze those data. Results We have created a suite of tools for the Galaxy web server aimed at handling nucleotide and amino-acid polymorphisms discovered by full-genome sequencing of several individuals of the same species, or using a SNP genotyping microarray. In addition to providing user-friendly tools, a main goal is to make published analyses reproducible. While most of the examples discussed in this paper deal with nuclear-genome diversity in non-human vertebrates, we also illustrate the application of the tools to fungal genomes, human biomedical data, and mitochondrial sequences. Conclusions This project illustrates that a small group can design, implement, test, document, and distribute a Galaxy tool collection to meet the needs of a particular community of biologists. PMID:24377391

  12. PRDM9 drives evolutionary erosion of hotspots in Mus musculus through haplotype-specific initiation of meiotic recombination.

    PubMed

    Baker, Christopher L; Kajita, Shimpei; Walker, Michael; Saxl, Ruth L; Raghupathy, Narayanan; Choi, Kwangbom; Petkov, Petko M; Paigen, Kenneth

    2015-01-01

    Meiotic recombination generates new genetic variation and assures the proper segregation of chromosomes in gametes. PRDM9, a zinc finger protein with histone methyltransferase activity, initiates meiotic recombination by binding DNA at recombination hotspots and directing the position of DNA double-strand breaks (DSB). The DSB repair mechanism suggests that hotspots should eventually self-destruct, yet genome-wide recombination levels remain constant, a conundrum known as the hotspot paradox. To test if PRDM9 drives this evolutionary erosion, we measured activity of the Prdm9Cst allele in two Mus musculus subspecies, M.m. castaneus, in which Prdm9Cst arose, and M.m. domesticus, into which Prdm9Cst was introduced experimentally. Comparing these two strains, we find that haplotype differences at hotspots lead to qualitative and quantitative changes in PRDM9 binding and activity. Using Mus spretus as an outlier, we found most variants affecting PRDM9Cst binding arose and were fixed in M.m. castaneus, suppressing hotspot activity. Furthermore, M.m. castaneus×M.m. domesticus F1 hybrids exhibit novel hotspots, with large haplotype biases in both PRDM9 binding and chromatin modification. These novel hotspots represent sites of historic evolutionary erosion that become activated in hybrids due to crosstalk between one parent's Prdm9 allele and the opposite parent's chromosome. Together these data support a model where haplotype-specific PRDM9 binding directs biased gene conversion at hotspots, ultimately leading to hotspot erosion. PMID:25568937

  13. An approach to mapping haplotype-specific recombination sites in human MHC class III

    SciTech Connect

    Levo, A.; Westman, P.; Partanen, J.

    1996-12-31

    Studies of the major histocompatibility complex (MHC) in mouse indicate that the recombination sites are not randomly distributed and their occurrence is haplotype-dependent. No data concerning haplotype-specific recombination sites in human are available due to the low number of informative families. To investigate haplotype-specific recombination sites in human MHC, we describe an approach based on identification of recombinant haplotypes derived from one conserved haplotype at the population level. The recombination sites were mapped by comparing polymorphic markers between the recombinant and assumed original haplotypes. We tested this approach on the extended haplotype HLA A3; B47; Bf*F; C4A*1; C4B*Q0; DR7, which is most suitable for this analysis. First, it carries a number of rare markers, and second, the haplotype, albeit rare in the general population, is frequent in patients with 21-hydroxylase (21OH) defect. We observed recombinants derived from this haplotype in patients with 21OH defect. All these haplotypes had the centromeric part (from Bf to DR) identical to the original haplotype, but they differed in HLA A and B. We therefore assumed that they underwent recombinations in the segment that separates the Bf and HLA B genes. Polymorphic markers indicated that all break points mapped to two segments near the TNF locus. This approach makes possible the mapping of preferential recombination sites in different haplotypes. 20 refs., 1 fig., 1 tab.

  14. Retinal degeneration slow (rds) in mouse results from simple insertion of a t haplotype-specific element into protein-coding exon II

    SciTech Connect

    Ma, J.; Norton, J.C.; Allen, A.C.; Burns, J.L.; Travis, G.H.

    1995-07-20

    Retinal degeneration slow (rds) is a semidominant mutation of mice that causes dysplasia and degeneration of rod and cone photoreceptors. Mutations in RDS, the human ortholog of the rds gene, are responsible for several inherited retinal dystrophies including a subset of retinitis pigmentosa. The normal rds locus encodes rds/peripherin, an integral membrane glycoprotein present in outer segment discs. Genomic libraries from wild-type and rds/rds mice were screened with an rds cDNA, and phage λ clones that span the normal and mutant loci were mapped. We show that in mice, rds is caused by the insertion into exon II of a 9.2-kb repetitive genomic element that is very similar to the t haplotype-specific element in the H-2 complex. The entire element is included in the RNA products of the mutant locus. We present evidence that rds in mice represents a null allele. 40 refs., 4 figs.

  15. Integrated Genetic and Epigenetic Analysis Identifies Haplotype-Specific Methylation in the FTO Type 2 Diabetes and Obesity Susceptibility Locus

    PubMed Central

    Wilson, Gareth A.; Rakyan, Vardhman K.; Teschendorff, Andrew E.; Akan, Pelin; Stupka, Elia; Down, Thomas A.; Prokopenko, Inga; Morison, Ian M.; Mill, Jonathan; Pidsley, Ruth; Deloukas, Panos; Frayling, Timothy M.; Hattersley, Andrew T.; McCarthy, Mark I.; Beck, Stephan; Hitman, Graham A.

    2010-01-01

    Recent multi-dimensional approaches to the study of complex disease have revealed powerful insights into how genetic and epigenetic factors may underlie their aetiopathogenesis. We examined genotype-epigenotype interactions in the context of Type 2 Diabetes (T2D), focussing on known regions of genomic susceptibility. We assayed DNA methylation in 60 females, stratified according to disease susceptibility haplotype using previously identified association loci. CpG methylation was assessed using methylated DNA immunoprecipitation on a targeted array (MeDIP-chip) and absolute methylation values were estimated using a Bayesian algorithm (BATMAN). Absolute methylation levels were quantified across LD blocks, and we identified increased DNA methylation on the FTO obesity susceptibility haplotype, tagged by the rs8050136 risk allele A (p = 9.40 × 10^-4, permutation p = 1.0 × 10^-3). Further analysis across the 46 kb LD block using sliding windows localised the most significant difference to be within a 7.7 kb region (p = 1.13 × 10^-7). Sequence level analysis, followed by pyrosequencing validation, revealed that the methylation difference was driven by the co-ordinated phase of CpG-creating SNPs across the risk haplotype. This 7.7 kb region of haplotype-specific methylation (HSM) encapsulates a Highly Conserved Non-Coding Element (HCNE) that has previously been validated as a long-range enhancer, supported by the histone H3K4me1 enhancer signature. This study demonstrates that integration of Genome-Wide Association (GWA) SNP and epigenomic DNA methylation data can identify potential novel genotype-epigenotype interactions within disease-associated loci, thus providing a novel route to aid unravelling common complex diseases. PMID:21124985

  16. Limits and patterns of cytomegalovirus genomic diversity in humans

    PubMed Central

    Renzette, Nicholas; Pokalyuk, Cornelia; Gibson, Laura; Bhattacharjee, Bornali; Schleiss, Mark R.; Hamprecht, Klaus; Yamamoto, Aparecida Y.; Mussi-Pinhata, Marisa M.; Britt, William J.; Jensen, Jeffrey D.; Kowalik, Timothy F.

    2015-01-01

    Human cytomegalovirus (HCMV) exhibits surprisingly high genomic diversity during natural infection although little is known about the limits or patterns of HCMV diversity among humans. To address this deficiency, we analyzed genomic diversity among congenitally infected infants. We show that there is an upper limit to HCMV genomic diversity in these patient samples, with ∼25% of the genome being devoid of polymorphisms. These low diversity regions were distributed across 26 loci that were preferentially located in DNA-processing genes. Furthermore, by developing, to our knowledge, the first genome-wide mutation and recombination rate maps for HCMV, we show that genomic diversity is positively correlated with these two rates. In contrast, median levels of viral genomic diversity did not vary between putatively single or mixed strain infections. We also provide evidence that HCMV populations isolated from vascular compartments of hosts from different continents are genetically similar and that polymorphisms in glycoproteins and regulatory proteins are enriched in these viral populations. This analysis provides the most highly detailed map of HCMV genomic diversity in human hosts to date and informs our understanding of the distribution of HCMV genomic diversity within human hosts. PMID:26150505

  17. Characterization of genomic sequence showing strong association with polyembryony among diverse Citrus species and cultivars, and its synteny with Vitis and Populus.

    PubMed

    Nakano, Michiharu; Shimada, Takehiko; Endo, Tomoko; Fujii, Hiroshi; Nesumi, Hirohisa; Kita, Masayuki; Ebina, Masumi; Shimizu, Tokurou; Omura, Mitsuo

    2012-02-01

    Polyembryony, in which multiple somatic nucellar cell-derived embryos develop in addition to the zygotic embryo in a seed, is common in the genus Citrus. Previous genetic studies indicated polyembryony is mainly determined by a single locus, but the underlying molecular mechanism is still unclear. As a step towards identification and characterization of the gene or genes responsible for nucellar embryogenesis in Citrus, haplotype-specific physical maps around the polyembryony locus were constructed. By sequencing three BAC clones aligned on the polyembryony haplotype, a single contiguous draft sequence consisting of 380 kb containing 70 predicted open reading frames (ORFs) was reconstructed. Single nucleotide polymorphism genotypes detected in the sequenced genomic region showed strong association with embryo type in Citrus, indicating a common polyembryony locus is shared among widely diverse Citrus cultivars and species. The arrangement of the predicted ORFs in the characterized genomic region showed high collinearity to the genomic sequence of chromosome 4 of Vitis vinifera and linkage group VI of Populus trichocarpa, suggesting that the syntenic relationship among these species is conserved even though V. vinifera and P. trichocarpa are non-apomictic species. This is the first study to characterize in detail the genomic structure of an apomixis locus determining adventitious embryony. PMID:22195586

  18. Genome Diversity of Spore-Forming Firmicutes

    PubMed Central

    Galperin, Michael Y.

    2015-01-01

    Summary: Formation of heat-resistant endospores is a specific property of the members of the phylum Firmicutes (low-G+C Gram-positive bacteria). It is found in representatives of four different classes of Firmicutes: Bacilli, Clostridia, Erysipelotrichia, and Negativicutes, which all encode similar sets of core sporulation proteins. Each of these classes also includes non-spore-forming organisms that sometimes belong to the same genus or even species as their spore-forming relatives. This chapter reviews the diversity of the members of phylum Firmicutes, its current taxonomy, and the status of genome sequencing projects for various subgroups within the phylum. It also discusses the evolution of the Firmicutes from their apparently spore-forming common ancestor and the independent loss of sporulation genes in several different lineages (staphylococci, streptococci, listeria, lactobacilli, ruminococci) in the course of their adaptation to the saprophytic lifestyle in nutrient-rich environment. It argues that systematics of Firmicutes is a rapidly developing area of research that benefits from the evolutionary approaches to the ever-increasing amount of genomic and phenotypic data and allows arranging these data into a common framework. "Later the Bacillus filaments begin to prepare for spore formation. In their homogenous contents strongly refracting bodies appear. From each of these bodies develops an oblong or shortly cylindrical, strongly refracting, dark-rimmed spore." Ferdinand Cohn. 1876. Untersuchungen über Bacterien. IV. Beiträge zur Biologie der Bacillen. Beiträge zur Biologie der Pflanzen, vol. 2, pp. 249–276. (Studies on the biology of the bacilli. In: Milestones in Microbiology: 1546 to 1940. Translated and edited by Thomas D. Brock. Prentice-Hall, Englewood Cliffs, NJ, 1961, pp. 49–56). PMID:26184964

  19. Genomic patterns of species diversity and divergence in Eucalyptus.

    PubMed

    Hudson, Corey J; Freeman, Jules S; Myburg, Alexander A; Potts, Brad M; Vaillancourt, René E

    2015-06-01

    We examined genome-wide patterns of DNA sequence diversity and divergence among six species of the important tree genus Eucalyptus and investigated their relationship with genomic architecture. Using c. 90 range-wide individuals of each Eucalyptus species (E. grandis, E. urophylla, E. globulus, E. nitens, E. dunnii and E. camaldulensis), genetic diversity and divergence were estimated from 2840 polymorphic diversity arrays technology markers covering the 11 chromosomes. Species differentiating markers (SDMs) identified in each of 15 pairwise species comparisons, along with species diversity (H_HW) and divergence (F_ST), were projected onto the E. grandis reference genome. Across all species comparisons, SDMs totalled 1.1-5.3% of markers and were widely distributed throughout the genome. Marker divergence (F_ST and SDMs) and diversity differed among and within chromosomes. Patterns of diversity and divergence were broadly conserved across species and significantly associated with genomic features, including the proximity of markers to genes, the relative number of clusters of tandem duplications, and gene density within or among chromosomes. These results suggest that genomic architecture influences patterns of species diversity and divergence in the genus. This influence is evident across the six species, encompassing diverse phylogenetic lineages, geography and ecology. PMID:25678438
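
    As a quick check on the counts quoted in this abstract, the short Python sketch below enumerates the unordered pairs among the six named species and converts the reported SDM percentages into approximate marker counts. The variable names are illustrative and do not come from the study, and the marker counts are plain arithmetic on the rounded percentages, so they are only approximate.

```python
from itertools import combinations

# Species names exactly as listed in the abstract.
species = [
    "E. grandis",
    "E. urophylla",
    "E. globulus",
    "E. nitens",
    "E. dunnii",
    "E. camaldulensis",
]

# Every unordered pair of distinct species is one pairwise comparison.
pairs = list(combinations(species, 2))
n_pairs = len(pairs)

# Approximate number of species differentiating markers (SDMs) implied by
# the reported 1.1-5.3% range of the 2840 polymorphic markers.
TOTAL_MARKERS = 2840
sdm_low = round(0.011 * TOTAL_MARKERS)
sdm_high = round(0.053 * TOTAL_MARKERS)

print(f"pairwise comparisons: {n_pairs}")
print(f"approximate SDMs per comparison: {sdm_low} to {sdm_high}")
```

    Six species give 15 unordered pairs, which matches the 15 pairwise comparisons reported.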

  20. Nucleotide diversity maps reveal variation in diversity among wheat genomes and chromosomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Technical Abstract: A strategy for a genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into respective genomes. In this study, nucle...

  1. Genome size diversity in orchids: consequences and evolution

    PubMed Central

    Leitch, I. J.; Kahandawala, I.; Suda, J.; Hanson, L.; Ingrouille, M. J.; Chase, M. W.; Fay, M. F.

    2009-01-01

    Background The amount of DNA comprising the genome of an organism (its genome size) varies a remarkable 40 000-fold across eukaryotes, yet most groups are characterized by much narrower ranges (e.g. 14-fold in gymnosperms, 3- to 4-fold in mammals). Angiosperms stand out as one of the most variable groups with genome sizes varying nearly 2000-fold. Nevertheless within angiosperms the majority of families are characterized by genomes which are small and vary little. Species with large genomes are mostly restricted to a few monocot families including Orchidaceae. Scope A survey of the literature revealed that genome size data for Orchidaceae are comparatively rare, representing just 327 species. Nevertheless they reveal that Orchidaceae are currently the most variable angiosperm family with genome sizes ranging 168-fold (1C = 0·33–55·4 pg). Analysing the data provided insights into the distribution, evolution and possible consequences to the plant of this genome size diversity. Conclusions Superimposing the data onto the increasingly robust phylogenetic tree of Orchidaceae revealed how different subfamilies were characterized by distinct genome size profiles. Epidendroideae possessed the greatest range of genome sizes, although the majority of species had small genomes. In contrast, the largest genomes were found in subfamilies Cypripedioideae and Vanilloideae. Genome size evolution within this subfamily was analysed as this is the only one with reasonable representation of data. This approach highlighted striking differences in genome size and karyotype evolution between the closely related Cypripedium, Paphiopedilum and Phragmipedium. As to the consequences of genome size diversity, various studies revealed that this has both practical (e.g. application of genetic fingerprinting techniques) and biological consequences (e.g. affecting where and when an orchid may grow) and emphasizes the importance of obtaining further genome size data given the considerable

  2. The B73 maize genome: complexity, diversity, dynamics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We report the nucleotide sequence of the maize (Zea mays cv. B73) genome, the largest and most structurally diverse of plants to be sequenced. ~32,540 genes are predicted, 99.8% of which are placed on chromosomes assembled from integrated physical, genetic and optical maps. Nearly 85% of the genome...

  3. Genomic and Genetic Diversity within the Pseudomonas fluorescens Complex

    PubMed Central

    Garrido-Sanz, Daniel; Meier-Kolthoff, Jan P.; Göker, Markus; Martín, Marta; Rivilla, Rafael; Redondo-Nieto, Miguel

    2016-01-01

    The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, MLSA and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs, show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a low number of strains were enough to explain most of the CDSs diversity at core and strain-specific genomic fractions. Finally, the identification and analysis of group-specific genome and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as PGPR. 
PMID:26915094
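
    The core genome and pan-genome figures quoted in this record (1,334 and 30,848 CDSs) are, respectively, the gene families shared by every strain and the gene families found in at least one strain. A minimal sketch of that set arithmetic follows. It assumes the CDSs have already been clustered into gene families by some orthology step, which is not shown and is not taken from the paper; the strain names and gene labels are hypothetical toy data.

```python
from typing import Dict, Set, Tuple


def core_and_pan_genome(gene_sets: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan) gene-family sets for a collection of strains.

    core = families present in every strain (set intersection)
    pan  = families present in at least one strain (set union)
    """
    if not gene_sets:
        return set(), set()
    families = list(gene_sets.values())
    core = set.intersection(*families)
    pan = set.union(*families)
    return core, pan


# Hypothetical toy input: strain name -> gene families detected in that strain.
strains = {
    "strain_A": {"gyrB", "rpoD", "phlD", "hcnA"},
    "strain_B": {"gyrB", "rpoD", "hcnA"},
    "strain_C": {"gyrB", "rpoD", "pvdS"},
}

core, pan = core_and_pan_genome(strains)
print(sorted(core))  # families shared by all three toy strains
print(len(pan))      # number of distinct families across the toy strains
```

    A small core next to a large pan-genome, as reported above, means the intersection shrinks quickly while the union keeps growing as strains are added.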

  4. Genomic and Genetic Diversity within the Pseudomonas fluorescens Complex.

    PubMed

    Garrido-Sanz, Daniel; Meier-Kolthoff, Jan P; Göker, Markus; Martín, Marta; Rivilla, Rafael; Redondo-Nieto, Miguel

    2016-01-01

    The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, MLSA and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs, show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a low number of strains were enough to explain most of the CDSs diversity at core and strain-specific genomic fractions. Finally, the identification and analysis of group-specific genome and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as PGPR. 
PMID:26915094

  5. Global biogeography of Prochlorococcus genome diversity in the surface ocean.

    PubMed

    Kent, Alyssa G; Dupont, Chris L; Yooseph, Shibu; Martiny, Adam C

    2016-08-01

    Prochlorococcus, the smallest known photosynthetic bacterium, is abundant in the ocean's surface layer despite large variation in environmental conditions. There are several genetically divergent lineages within Prochlorococcus and superimposed on this phylogenetic diversity is extensive gene gain and loss. The environmental role in shaping the global ocean distribution of genome diversity in Prochlorococcus is largely unknown, particularly in a framework that considers the vertical and lateral mechanisms of evolution. Here we show that Prochlorococcus field populations from a global circumnavigation harbor extensive genome diversity across the surface ocean, but this diversity is not randomly distributed. We observed a significant correspondence between phylogenetic and gene content diversity, including regional differences in both phylogenetic composition and gene content that were related to environmental factors. Several gene families were strongly associated with specific regions and environmental factors, including the identification of a set of genes related to lower nutrient and temperature regions. Metagenomic assemblies of natural Prochlorococcus genomes reinforced this association by providing linkage of genes across genomic backbones. Overall, our results show that the phylogeography in Prochlorococcus taxonomy is echoed in its genome content. Thus environmental variation shapes the functional capabilities and associated ecosystem role of the globally abundant Prochlorococcus. PMID:26836261

  6. Retrotransposon evolution in diverse plant genomes.

    PubMed Central

    Langdon, T; Seago, C; Mende, M; Leggett, M; Thomas, H; Forster, J W; Jones, R N; Jenkins, G

    2000-01-01

    Retrotransposon or retrotransposon-like sequences have been reported to be conserved components of cereal centromeres. Here we show that the published sequences are derived from a single conventional Ty3-gypsy family or a nonautonomous derivative. Both autonomous and nonautonomous elements are likely to have colonized Poaceae centromeres at the time of a common ancestor but have been maintained since by active retrotransposition. The retrotransposon family is also present at a lower copy number in the Arabidopsis genome, where it shows less pronounced localization. The history of the family in the two types of genome provides an interesting contrast between "boom and bust" and persistent evolutionary patterns. PMID:10978295

  7. Transposable element evolution in Heliconius suggests genome diversity within Lepidoptera

    PubMed Central

    2013-01-01

    Background Transposable elements (TEs) have the potential to impact genome structure, function and evolution in profound ways. In order to understand the contribution of transposable elements (TEs) to Heliconius melpomene, we queried the H. melpomene draft sequence to identify repetitive sequences. Results We determined that TEs comprise ~25% of the genome. The predominant class of TEs (~12% of the genome) was the non-long terminal repeat (non-LTR) retrotransposons, including a novel SINE family. However, this was only slightly higher than content derived from DNA transposons, which are diverse, with several families having mobilized in the recent past. Compared to the only other well-studied lepidopteran genome, Bombyx mori, H. melpomene exhibits a higher DNA transposon content and a distinct repertoire of retrotransposons. We also found that H. melpomene exhibits a high rate of TE turnover with few older elements accumulating in the genome. Conclusions Our analysis represents the first complete, de novo characterization of TE content in a butterfly genome and suggests that, while TEs are able to invade and multiply, TEs have an overall deleterious effect and/or that maintaining a small genome is advantageous. Our results also hint that analysis of additional lepidopteran genomes will reveal substantial TE diversity within the group. PMID:24088337

  8. The Genomic and Phenotypic Diversity of Schizosaccharomyces pombe

    PubMed Central

    Jeffares, Daniel C.; Rallis, Charalampos; Rieux, Adrien; Speed, Doug; Převorovský, Martin; Mourier, Tobias; Marsellach, Francesc X.; Iqbal, Zamin; Lau, Winston; Cheng, Tammy M.K.; Pracana, Rodrigo; Mülleder, Michael; Lawson, Jonathan L.D.; Chessel, Anatole; Bala, Sendu; Hellenthal, Garrett; O’Fallon, Brendan; Keane, Thomas; Simpson, Jared T.; Bischof, Leanne; Tomiczek, Bartlomiej; Bitton, Danny A.; Sideri, Theodora; Codlin, Sandra; Hellberg, Josephine E.E.U.; van Trigt, Laurent; Jeffery, Linda; Li, Juan-Juan; Atkinson, Sophie; Thodberg, Malte; Febrer, Melanie; McLay, Kirsten; Drou, Nizar; Brown, William; Hayles, Jacqueline; Carazo Salas, Rafael E.; Ralser, Markus; Maniatis, Nikolas; Balding, David J.; Balloux, Francois; Durbin, Richard; Bähler, Jürg

    2015-01-01

    Natural variation within species reveals aspects of genome evolution and function. The fission yeast Schizosaccharomyces pombe is an important model for eukaryotic biology, but researchers typically use one standard laboratory strain. To extend the utility of this model, we surveyed the genomic and phenotypic variation in 161 natural isolates. We sequenced the genomes of all strains, revealing moderate genetic diversity (π = 3 × 10⁻³) and weak global population structure. We estimate that dispersal of S. pombe began within human antiquity (~340 BCE), and ancestors of these strains reached the Americas at ~1623 CE. We quantified 74 traits, revealing substantial heritable phenotypic diversity. We conducted 223 genome-wide association studies, with 89 traits showing at least one association. The most significant variant for each trait explained 22% of variance on average, with indels having higher effects than SNPs. This analysis presents a rich resource to examine genotype-phenotype relationships in a tractable model. PMID:25665008
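
    The diversity statistic π reported in this record is nucleotide diversity: the average number of differences per site between two sequences drawn from the sample. The sketch below shows the textbook pairwise calculation on a toy alignment. It is an illustration under simplifying assumptions (equal-length, gap-free aligned sequences; hypothetical data) and is not the pipeline used by the authors.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Average pairwise differences per site among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and equal in length")

    total_differences = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        # Count mismatching positions for this pair of sequences.
        total_differences += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_differences / (n_pairs * length)


# Hypothetical toy alignment of three 10-bp sequences.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy_alignment)
print(pi)
```

    On real data, missing calls and indels need explicit handling, and the denominator should count only sites that were callable in both sequences of a pair.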

  9. Genomic diversity of Bombyx mori nucleopolyhedrovirus strains.

    PubMed

    Xu, Yi-Peng; Cheng, Ruo-Lin; Xi, Yu; Zhang, Chuan-Xi

    2013-07-01

    Bombyx mori nucleopolyhedrovirus (BmNPV) is a baculovirus that selectively infects the domestic silkworm. In this study, six BmNPV strains were compared at the whole genome level. We found that the number of bro genes and the composition of the homologous regions (hrs) are the two primary areas of divergence within these genomes. When we compared the ORFs of these BmNPV variants, we noticed a high degree of sequence divergence in the ORFs that are not baculovirus core genes. This result is consistent with the results derived from phylogenetic trees and evolutionary pressure analyses of these ORFs, indicating that ORFs that are not core genes likely play important roles in the evolution of BmNPV strains. The evolutionary relationships of these BmNPV strains might be explained by their geographic origins or those of their hosts. In addition, the total number of hr palindromes seems to affect viral DNA replication in Bm5 cells. PMID:23639478

  10. Castor Bean Organelle Genome Sequencing and Worldwide Genetic Diversity Analysis

    PubMed Central

    Chan, Agnes P.; Williams, Amber L.; Rice, Danny W.; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M. J.; Khouri, Hoda M.; Beckstrom-Sternberg, Stephen M.; Allan, Gerard J.; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D.

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade. PMID:21750729

  11. Genetic Diversity of A-Genome Cotton.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Since Upland cotton (Gossypium hirsutum L.) is known to have relatively low levels of genetic diversity or variation in genetic makeup among individuals, a better understanding of this variation and relationships among possible sources of novel genes would be valuable. Therefore, analysis of genetic...

  12. Genomic Diversity within the Enterobacter cloacae Complex

    PubMed Central

    Paauw, Armand; Caspers, Martien P. M.; Schuren, Frank H. J.; Leverstein-van Hall, Maurine A.; Delétoile, Alexis; Montijn, Roy C.; Verhoef, Jan; Fluit, Ad C.

    2008-01-01

    Background Isolates of the Enterobacter cloacae complex have been increasingly isolated as nosocomial pathogens, but phenotypic identification of the E. cloacae complex is unreliable and irreproducible. Identification of species based on currently available genotyping tools is already superior to phenotypic identification, but the taxonomy of isolates belonging to this complex is cumbersome. Methodology/Principal Findings This study shows that multilocus sequence analysis and comparative genomic hybridization based on a mixed genome array is a powerful method for studying species assignment within the E. cloacae complex. The E. cloacae complex is shown to be evolutionarily divided into two clades that are genetically distinct from each other. The younger first clade is genetically more homogeneous, contains the Enterobacter hormaechei species and is the most frequently cultured Enterobacter species in hospitals. The second and older clade consists of several (sub)species that are genetically more heterogeneous. Genetic markers were identified that could discriminate between the two clades and cluster 1. Conclusions/Significance Based on genomic differences it is concluded that some previously defined (clonal and heterogenic) (sub)species of the E. cloacae complex have to be redefined because of disagreements with known or proposed nomenclature. However, further improved identification of the redefined species will be possible based on novel markers presented here. PMID:18716657

  13. Low genome content diversity of marine planktonic Thaumarchaeota.

    PubMed

    Luo, Haiwei; Sun, Ying; Hollibaugh, James T; Moran, Mary Ann

    2016-08-01

    Members of Thaumarchaeota are responsible for much of the ammonia oxidation occurring in the ocean. Recent studies showed that marine Thaumarchaeota have versatile metabolic capabilities, but sequencing additional genomes has not significantly increased the gene content ascribed to this group. We used the assembly-free dN pipeline software in combination with phylogenetic analyses to interrogate shotgun metagenomic data sets to gain a better understanding of the genomic diversity of Thaumarchaeota populations. The program confidently assigned ∼3,000 paired-end reads to Thaumarchaeota, independent of homologies to any known Thaumarchaeota genome sequence. Only 2% of these reads potentially harbor new genes that were absent from the genome of 'Candidatus Nitrosopumilus maritimus' str. SCM1, even though this strain was isolated from a marine aquarium rather than directly from the ocean. One of these novel genes encodes a protein associated with the CRISPR/Cas system, Cas1, suggesting that phage defense through CRISPR may also be present in planktonic Thaumarchaeota lineages. Our results suggest that marine Thaumarchaeota populations have very low diversity in genome content, which is corroborated using computer simulation analyses of two bacterial lineages with known genome content diversity. PMID:27120311

  14. Comparative Analysis of Genome Diversity in Bullmastiff Dogs

    PubMed Central

    Mortlock, Sally-Anne; Khatkar, Mehar S.; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579
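
    The record's closing point, that a small effective population size implies a noticeable inbreeding rate, can be made concrete with the standard population-genetics relationship ΔF = 1/(2Ne). The sketch below applies it to the two effective population sizes quoted above (41 from pedigrees, 29.1 from SNP data). This is a textbook approximation offered for illustration; the abstract does not say the authors computed the rate this way, and the function names are hypothetical.

```python
def inbreeding_rate_per_generation(effective_size: float) -> float:
    """Expected increase in inbreeding per generation, dF = 1 / (2 * Ne)."""
    if effective_size <= 0:
        raise ValueError("effective population size must be positive")
    return 1.0 / (2.0 * effective_size)


def expected_inbreeding(effective_size: float, generations: int) -> float:
    """Expected inbreeding after t generations: F_t = 1 - (1 - dF) ** t."""
    delta_f = inbreeding_rate_per_generation(effective_size)
    return 1.0 - (1.0 - delta_f) ** generations


# Effective population sizes quoted in the abstract.
for label, ne in [("pedigree", 41.0), ("SNP", 29.1)]:
    rate = inbreeding_rate_per_generation(ne)
    print(f"{label}: Ne = {ne}, dF per generation = {rate:.4f}")
```

    Under this approximation both estimates give a rate above 1% per generation, which is consistent with the abstract's suggestion that a strategy to reduce the inbreeding rate may be beneficial.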

  15. Comparative Analysis of Genome Diversity in Bullmastiff Dogs.

    PubMed

    Mortlock, Sally-Anne; Khatkar, Mehar S; Williamson, Peter

    2016-01-01

    Management and preservation of genomic diversity in dog breeds is a major objective for maintaining health. The present study was undertaken to characterise genomic diversity in Bullmastiff dogs using both genealogical and molecular analysis. Genealogical analysis of diversity was conducted using a database consisting of 16,378 Bullmastiff pedigrees from year 1980 to 2013. Additionally, a total of 188 Bullmastiff dogs were genotyped using the 170,000 SNP Illumina CanineHD Beadchip. Genealogical parameters revealed a mean inbreeding coefficient of 0.047; 142 total founders (f); an effective number of founders (fe) of 79; an effective number of ancestors (fa) of 62; and an effective population size of the reference population of 41. Genetic diversity and the degree of genome-wide homogeneity within the breed were also investigated using molecular data. Multiple-locus heterozygosity (MLH) was equal to 0.206; runs of homozygosity (ROH) as proportion of the genome, averaged 16.44%; effective population size was 29.1, with an average inbreeding coefficient of 0.035, all estimated using SNP Data. Fine-scale population structure was analysed using NETVIEW, a population analysis pipeline. Visualisation of the high definition network captured relationships among individuals within and between subpopulations. Effects of unequal founder use, and ancestral inbreeding and selection, were evident. While current levels of Bullmastiff heterozygosity, inbreeding and homozygosity are not unusual, a relatively small effective population size indicates that a breeding strategy to reduce the inbreeding rate may be beneficial. PMID:26824579

  16. Lampreys as Diverse Model Organisms in the Genomics Era

    PubMed Central

    McCauley, David W.; Docker, Margaret F.; Whyard, Steve; Li, Weiming

    2015-01-01

    Lampreys, one of the two surviving groups of ancient vertebrates, have become important models for study in diverse fields of biology. Lampreys (of which there are approximately 40 species) are being studied, for example, (a) to control pest sea lamprey in the North American Great Lakes and to restore declining populations of native species elsewhere; (b) in biomedical research, focusing particularly on the regenerative capability of lampreys; and (c) by developmental biologists studying the evolution of key vertebrate characters. Although a lack of genetic resources has hindered research on the mechanisms regulating many aspects of lamprey life history and development, formerly intractable questions are now amenable to investigation following the recent publication of the sea lamprey genome. Here, we provide an overview of the ways in which genomic tools are currently being deployed to tackle diverse research questions and suggest several areas that may benefit from the availability of the sea lamprey genome. PMID:26951616

  17. Whole mitochondrial genome genetic diversity in an Estonian population sample.

    PubMed

    Stoljarova, Monika; King, Jonathan L; Takahashi, Maiko; Aaspõllu, Anu; Budowle, Bruce

    2016-01-01

    Mitochondrial DNA is a useful marker for population studies, human identification, and forensic analysis. Commonly used hypervariable regions I and II (HVI/HVII) were reported to contain as little as 25% of mitochondrial DNA variants and therefore the majority of power of discrimination of mitochondrial DNA resides in the coding region. Massively parallel sequencing technology enables entire mitochondrial genome sequencing. In this study, buccal swabs were collected from 114 unrelated Estonians and whole mitochondrial genome sequences were generated using the Illumina MiSeq system. The results are concordant with previous mtDNA control region reports of high haplogroup HV and U frequencies (47.4 and 23.7% in this study, respectively) in the Estonian population. One sample with the Northern Asian haplogroup D was detected. The genetic diversity of the Estonian population sample was estimated to be 99.67 and 95.85%, for mtGenome and HVI/HVII data, respectively. The random match probability for mtGenome data was 1.20 versus 4.99% for HVI/HVII. The nucleotide mean pairwise difference was 27 ± 11 for mtGenome and 7 ± 3 for HVI/HVII data. These data describe the genetic diversity of the Estonian population sample and emphasize the power of discrimination of the entire mitochondrial genome over the hypervariable regions. PMID:26289416
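
    The genetic diversity and random match probability (RMP) figures in this record are linked by a simple formula in common forensic use: RMP is the sum of squared haplotype frequencies, and genetic diversity is n/(n − 1) × (1 − RMP). The abstract does not state which estimator was used, so the sketch below is an assumption; it is, however, numerically consistent with the reported values for n = 114 (1.20% RMP gives about 99.67%, and 4.99% gives about 95.85%). The haplotype labels in the toy sample are hypothetical.

```python
from collections import Counter
from typing import Sequence


def random_match_probability(haplotypes: Sequence[str]) -> float:
    """Sum of squared haplotype frequencies in the sample."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("empty sample")
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


def diversity_from_rmp(rmp: float, n: int) -> float:
    """Unbiased genetic diversity: n / (n - 1) * (1 - RMP)."""
    if n < 2:
        raise ValueError("need at least two samples")
    return n / (n - 1) * (1.0 - rmp)


# Hypothetical toy sample of four haplotype labels.
toy = ["H1", "H1", "U5", "D4"]
toy_rmp = random_match_probability(toy)
toy_diversity = diversity_from_rmp(toy_rmp, len(toy))
print(toy_rmp, toy_diversity)

# Consistency check against the values quoted in the abstract (n = 114).
print(round(diversity_from_rmp(0.0120, 114), 4))  # mtGenome
print(round(diversity_from_rmp(0.0499, 114), 4))  # HVI/HVII
```

    The lower RMP for whole mitochondrial genomes reflects the larger number of distinct haplotypes resolved once the coding region is included.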

  18. Visualization of Genome Diversity in German Shepherd Dogs

    PubMed Central

    Mortlock, Sally-Anne; Booth, Rachel; Mazrier, Hamutal; Khatkar, Mehar S.; Williamson, Peter

    2015-01-01

    A loss of genetic diversity may lead to increased disease risks in subpopulations of dogs. The canine breed structure has contributed to relatively small effective population size in many breeds and can limit the options for selective breeding strategies to maintain diversity. With the completion of the canine genome sequencing project, and the subsequent reduction in the cost of genotyping on a genomic scale, evaluating diversity in dogs has become much more accurate and accessible. This provides a potential tool for advising dog breeders and developing breeding programs within a breed. A challenge in doing this is to present complex relationship data in a form that can be readily utilized. Here, we demonstrate the use of a pipeline, known as NetView, to visualize the network of relationships in a subpopulation of German Shepherd Dogs. PMID:26884680

  19. Remarkable Diversity of Endogenous Viruses in a Crustacean Genome

    PubMed Central

    Thézé, Julien; Leclercq, Sébastien; Moumen, Bouziane; Cordaux, Richard; Gilbert, Clément

    2014-01-01

    Recent studies in paleovirology have uncovered myriads of endogenous viral elements (EVEs) integrated in the genome of their eukaryotic hosts. These fragments result from endogenization, that is, integration of the viral genome into the host germline genome followed by vertical inheritance. So far, most studies have used a virus-centered approach, whereby endogenous copies of a particular group of viruses were searched in all available sequenced genomes. Here, we follow a host-centered approach whereby the genome of a given species is comprehensively screened for the presence of EVEs using all available complete viral genomes as queries. Our analyses revealed that 54 EVEs corresponding to 10 different viral lineages belonging to 5 viral families (Bunyaviridae, Circoviridae, Parvoviridae, and Totiviridae) and one viral order (Mononegavirales) became endogenized in the genome of the isopod crustacean Armadillidium vulgare. We show that viral endogenization occurred recurrently during the evolution of isopods and that A. vulgare viral lineages were involved in multiple host switches that took place between widely divergent taxa. Furthermore, 30 A. vulgare EVEs have uninterrupted open reading frames, suggesting they result from recent endogenization of viruses likely to be currently infecting isopod populations. Overall, our work shows that isopods have been and are still infected by a large variety of viruses. It also extends the host range of several families of viruses and brings new insights into their evolution. More generally, our results underline the power of paleovirology in characterizing the viral diversity currently infecting eukaryotic taxa. PMID:25084787

  20. Remarkable diversity of endogenous viruses in a crustacean genome.

    PubMed

    Thézé, Julien; Leclercq, Sébastien; Moumen, Bouziane; Cordaux, Richard; Gilbert, Clément

    2014-08-01

    Recent studies in paleovirology have uncovered myriads of endogenous viral elements (EVEs) integrated in the genome of their eukaryotic hosts. These fragments result from endogenization, that is, integration of the viral genome into the host germline genome followed by vertical inheritance. So far, most studies have used a virus-centered approach, whereby endogenous copies of a particular group of viruses were searched in all available sequenced genomes. Here, we follow a host-centered approach whereby the genome of a given species is comprehensively screened for the presence of EVEs using all available complete viral genomes as queries. Our analyses revealed that 54 EVEs corresponding to 10 different viral lineages belonging to 5 viral families (Bunyaviridae, Circoviridae, Parvoviridae, and Totiviridae) and one viral order (Mononegavirales) became endogenized in the genome of the isopod crustacean Armadillidium vulgare. We show that viral endogenization occurred recurrently during the evolution of isopods and that A. vulgare viral lineages were involved in multiple host switches that took place between widely divergent taxa. Furthermore, 30 A. vulgare EVEs have uninterrupted open reading frames, suggesting they result from recent endogenization of viruses likely to be currently infecting isopod populations. Overall, our work shows that isopods have been and are still infected by a large variety of viruses. It also extends the host range of several families of viruses and brings new insights into their evolution. More generally, our results underline the power of paleovirology in characterizing the viral diversity currently infecting eukaryotic taxa. PMID:25084787

  1. Report of the second Human Genome Diversity workshop

    SciTech Connect

    1992-12-31

    The Second Human Genome Diversity Workshop was successfully held at Penn State University from October 29-31, 1992. The Workshop was essentially organized around 7 groups, each comprising approximately 10 participants, representing the sampling issues in different regions of the world. These groups worked independently, using a common format provided by the organizers; this was adjusted as needed by the individual groups. The Workshop began with a presentation of the mandate to the participants, and of the procedures to be followed during the workshop. Dr. Feldman presented a summary of the results from the First Workshop. He and the other organizers also presented brief comments giving their perspective on the objectives of the Second Workshop. Dr. Julia Bodmer discussed the study of European genetic diversity, especially in the context of the HLA experience there, and of plans to extend such studies in the coming years. She also discussed surveys of world HLA laboratories in regard to resources related to Human Genome Diversity. Dr. Mark Weiss discussed the relevance of nonhuman primate studies for understanding how demographic processes, such as mate exchange between local groups, affected the local dispersion of genetic variation. Primate population geneticists have some relevant experience in interpreting variation at this local level, in particular, with various DNA fingerprinting methods. This experience may be relevant to the Human Genome Diversity Project, in terms of practical and statistical issues.

  2. Comparative genomics of wild type yeast strains unveils important genome diversity

    PubMed Central

    Carreto, Laura; Eiriz, Maria F; Gomes, Ana C; Pereira, Patrícia M; Schuller, Dorit; Santos, Manuel AS

    2008-01-01

    Background Genome variability generates phenotypic heterogeneity and is of relevance for adaptation to environmental change, but the extent of such variability in natural populations is still poorly understood. For example, selected Saccharomyces cerevisiae strains are variable at the ploidy level, have gene amplifications, changes in chromosome copy number, and gross chromosomal rearrangements. This suggests that genome plasticity provides important genetic diversity upon which natural selection mechanisms can operate. Results In this study, we have used wild-type S. cerevisiae (yeast) strains to investigate genome variation in natural and artificial environments. We have used comparative genome hybridization on array (aCGH) to characterize the genome variability of 16 yeast strains, of laboratory and commercial origin, isolated from vineyards and wine cellars, and from opportunistic human infections. Interestingly, sub-telomeric instability was associated with the clinical phenotype, while Ty element insertion regions determined genomic differences of natural wine fermentation strains. Copy number depletion of ASP3 and YRF1 genes was found in all wild-type strains. Other gene families involved in transmembrane transport, sugar and alcohol metabolism or drug resistance had copy number changes, which also distinguished wine from clinical isolates. Conclusion We have isolated and genotyped more than 1000 yeast strains from natural environments and carried out an aCGH analysis of 16 strains representative of distinct genotype clusters. Important genomic variability was identified between these strains, in particular in sub-telomeric regions and in Ty-element insertion sites, suggesting that this type of genome variability is the main source of genetic diversity in natural populations of yeast. The data highlights the usefulness of yeast as a model system to unravel intraspecific natural genome diversity and to elucidate how natural selection shapes the yeast genome

  3. Genome-wide association studies in diverse populations

    PubMed Central

    Rosenberg, Noah A; Huang, Lucy; Jewett, Ethan M; Szpiech, Zachary A; Jankovic, Ivana; Boehnke, Michael

    2011-01-01

    Genome-wide association (GWA) studies have identified a large number of single-nucleotide polymorphisms (SNPs) associated with disease phenotypes. As most GWA studies have been performed primarily in populations of European descent, this review examines the issues involved in extending consideration of GWA studies to diverse worldwide populations. Although challenges exist with such issues as imputation, admixture, and replication, investigation of diverse populations in GWA studies has significant potential to advance the project of mapping the genetic determinants of complex diseases for the human population as a whole. PMID:20395969

  4. Genomic diversity of colorectal cancer: Changing landscape and emerging targets

    PubMed Central

    Ahn, Daniel H; Ciombor, Kristen K; Mikhail, Sameh; Bekaii-Saab, Tanios

    2016-01-01

    Improvements in screening and preventive measures have led to increased detection of early-stage colorectal cancers (CRC), for which patients undergo treatment with curative intent. Despite these efforts, a high proportion of patients are diagnosed with advanced-stage disease that is associated with poor outcomes, and CRC remains one of the leading causes of cancer-related deaths in the world. The development of next-generation sequencing and collaborative multi-institutional efforts to characterize the cancer genome have afforded a comprehensive assessment of the genomic makeup of CRC. This knowledge has translated into an understanding of the prognostic role of various tumor somatic variants in this disease. Additionally, awareness of the genomic alterations present in CRC has improved patient outcomes, largely through better selection of personalized therapies based on an individual’s tumor genomic makeup. The benefit of various treatments is often limited, however: recent studies assessing the genomic diversity of CRC have identified the development of secondary tumor somatic variants that likely contribute to acquired treatment resistance. These studies have begun to alter the treatment landscape for CRC, which now includes investigating novel targeted therapies, assessing the role of immunotherapy, and prospectively and dynamically assessing the changes in tumor genomic alterations that occur during treatment. PMID:27433082

  5. Genomic diversity of colorectal cancer: Changing landscape and emerging targets.

    PubMed

    Ahn, Daniel H; Ciombor, Kristen K; Mikhail, Sameh; Bekaii-Saab, Tanios

    2016-07-01

    Improvements in screening and preventive measures have led to increased detection of early-stage colorectal cancers (CRC), for which patients undergo treatment with curative intent. Despite these efforts, a high proportion of patients are diagnosed with advanced-stage disease that is associated with poor outcomes, and CRC remains one of the leading causes of cancer-related deaths in the world. The development of next-generation sequencing and collaborative multi-institutional efforts to characterize the cancer genome have afforded a comprehensive assessment of the genomic makeup of CRC. This knowledge has translated into an understanding of the prognostic role of various tumor somatic variants in this disease. Additionally, awareness of the genomic alterations present in CRC has improved patient outcomes, largely through better selection of personalized therapies based on an individual's tumor genomic makeup. The benefit of various treatments is often limited, however: recent studies assessing the genomic diversity of CRC have identified the development of secondary tumor somatic variants that likely contribute to acquired treatment resistance. These studies have begun to alter the treatment landscape for CRC, which now includes investigating novel targeted therapies, assessing the role of immunotherapy, and prospectively and dynamically assessing the changes in tumor genomic alterations that occur during treatment. PMID:27433082

  6. Evolution and Diversity of the Human Hepatitis D Virus Genome

    PubMed Central

    Huang, Chi-Ruei; Lo, Szecheng J.

    2010-01-01

    Human hepatitis delta virus (HDV) is the RNA virus with the smallest genome. The HDV genome is divided into a viroid-like sequence and a protein-coding sequence, which could have originated from different sources, with the HDV genome eventually being constituted through RNA recombination. The genome subsequently diversified through the accumulation of mutations, selected through interactions of the mutated RNA and proteins with host factors that allow infectious virions to form successfully. We therefore propose that conservation of the HDV nucleotide sequence is closely related to its functionality. Genome analysis of known HDV isolates shows that the C-terminal coding sequences of the large delta antigen (LDAg) are more diverse than other regions of the protein-coding sequences, but variants that retain the biological function of interacting with the clathrin heavy chain can still be selected and maintained. Since viruses interact with many host factors, including in escaping the host immune response, designing a program to predict RNA genome evolution remains a great challenge. PMID:20204073

  7. Genome diversity in Brachypodium distachyon: deep sequencing of highly diverse inbred lines

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Natural variation provides a powerful opportunity to study the genetic basis of biological traits. Brachypodium distachyon is a broadly distributed diploid model grass with a small genome and a large collection of diverse inbred lines. As a step towards understanding the genetic basis of the natura...

  8. Genomes, diversity and resistance gene analogues in Musa species.

    PubMed

    Azhar, M; Heslop-Harrison, J S

    2008-01-01

    Resistance genes (R genes) in plants are abundant and may represent more than 1% of all the genes. Their diversity is critical to the recognition and response to attack from diverse pathogens. Like many other crops, banana and plantain face attacks from potentially devastating fungal and bacterial diseases, increased by a combination of worldwide spread of pathogens, exploitation of a small number of varieties, new pathogen mutations, and the lack of effective, benign and cheap chemical control. The challenge for plant breeders is to identify and exploit genetic resistances to diseases, which is particularly difficult in banana and plantain where the valuable cultivars are sterile, parthenocarpic and mostly triploid so conventional genetic analysis and breeding is impossible. In this paper, we review the nature of R genes and the key motifs, particularly in the Nucleotide Binding Sites (NBS), Leucine Rich Repeat (LRR) gene class. We present data about identity, nature and evolutionary diversity of the NBS domains of Musa R genes in diploid wild species with the Musa acuminata (A), M. balbisiana (B), M. schizocarpa (S), M. textilis (T), M. velutina and M. ornata genomes, and from various cultivated hybrid and triploid accessions, using PCR primers to isolate the domains from genomic DNA. Of 135 new sequences, 75% of the sequenced clones had uninterrupted open reading frames (ORFs), and phylogenetic UPGMA tree construction showed four clusters, one from Musa ornata, one largely from the B and T genomes, one from A and M. velutina, and the largest with A, B, T and S genomes. Only genes of the coiled-coil (non-TIR) class were found, typical of the grasses and presumably monocotyledons. The analysis of R genes in cultivated banana and plantain, and their wild relatives, has implications for identification and selection of resistance genes within the genus which may be useful for plant selection and breeding and also for defining relationships and genome evolution

  9. Discovery of biological networks from diverse functional genomic data

    PubMed Central

    Myers, Chad L; Robson, Drew; Wible, Adam; Hibbs, Matthew A; Chiriac, Camelia; Theesfeld, Chandra L; Dolinski, Kara; Troyanskaya, Olga G

    2005-01-01

    We have developed a general probabilistic system for query-based discovery of pathway-specific networks through integration of diverse genome-wide data. This framework was validated by accurately recovering known networks for 31 biological processes in Saccharomyces cerevisiae and experimentally verifying predictions for the process of chromosomal segregation. Our system, bioPIXIE, a public, comprehensive system for integration, analysis, and visualization of biological network predictions for S. cerevisiae, is freely accessible over the worldwide web. PMID:16420673

  10. Genomic basis for natural product biosynthetic diversity in the actinomycetes

    PubMed Central

    Nett, Markus; Ikeda, Haruo; Moore, Bradley S.

    2010-01-01

    The phylum Actinobacteria hosts diverse high G + C, Gram-positive bacteria that have evolved a complex chemical language of natural product chemistry to help navigate their fascinatingly varied lifestyles. To date, 71 Actinobacteria genomes have been completed and annotated, with the vast majority representing the Actinomycetales, which are the source of numerous antibiotics and other drugs from genera such as Streptomyces, Saccharopolyspora and Salinispora. These genomic analyses have illuminated the secondary metabolic proficiency of these microbes – underappreciated for years based on conventional isolation programs – and have helped set the foundation for a new natural product discovery paradigm based on genome mining. Trends in the secondary metabolomes of natural product-rich actinomycetes are highlighted in this review article, which contains 199 references. PMID:19844637

  11. Genomic Diversity of Phages Infecting Probiotic Strains of Lactobacillus paracasei.

    PubMed

    Mercanti, Diego J; Rousseau, Geneviève M; Capra, María L; Quiberoni, Andrea; Tremblay, Denise M; Labrie, Simon J; Moineau, Sylvain

    2016-01-01

    Strains of the Lactobacillus casei group have been extensively studied because some are used as probiotics in foods. Conversely, their phages have received much less attention. We analyzed the complete genome sequences of five L. paracasei temperate phages: CL1, CL2, iLp84, iLp1308, and iA2. Only phage iA2 could not replicate in an indicator strain. The genome lengths ranged from 34,155 bp (iA2) to 39,474 bp (CL1). Phages iA2 and iLp1308 (34,176 bp) possess the smallest genomes reported, thus far, for phages of the L. casei group. The GC contents of the five phage genomes ranged from 44.8 to 45.6%. As observed with many other phages, their genomes were organized as follows: genes coding for DNA packaging, morphogenesis, lysis, lysogeny, and replication. Phages CL1, CL2, and iLp1308 are highly related to each other. Phage iLp84 was also related to these three phages, but the similarities were limited to gene products involved in DNA packaging and structural proteins. Genomic fragments of phages CL1, CL2, iLp1308, and iLp84 were found in several genomes of L. casei strains. Prophage iA2 is unrelated to these four phages, but almost all of its genome was found in at least four L. casei strains. Overall, these phages are distinct from previously characterized Lactobacillus phages. Our results highlight the diversity of L. casei phages and indicate frequent DNA exchanges between phages and their hosts. PMID:26475105

  12. Genomic Diversity of Phages Infecting Probiotic Strains of Lactobacillus paracasei

    PubMed Central

    Rousseau, Geneviève M.; Capra, María L.; Quiberoni, Andrea; Tremblay, Denise M.; Labrie, Simon J.

    2015-01-01

    Strains of the Lactobacillus casei group have been extensively studied because some are used as probiotics in foods. Conversely, their phages have received much less attention. We analyzed the complete genome sequences of five L. paracasei temperate phages: CL1, CL2, iLp84, iLp1308, and iA2. Only phage iA2 could not replicate in an indicator strain. The genome lengths ranged from 34,155 bp (iA2) to 39,474 bp (CL1). Phages iA2 and iLp1308 (34,176 bp) possess the smallest genomes reported, thus far, for phages of the L. casei group. The GC contents of the five phage genomes ranged from 44.8 to 45.6%. As observed with many other phages, their genomes were organized as follows: genes coding for DNA packaging, morphogenesis, lysis, lysogeny, and replication. Phages CL1, CL2, and iLp1308 are highly related to each other. Phage iLp84 was also related to these three phages, but the similarities were limited to gene products involved in DNA packaging and structural proteins. Genomic fragments of phages CL1, CL2, iLp1308, and iLp84 were found in several genomes of L. casei strains. Prophage iA2 is unrelated to these four phages, but almost all of its genome was found in at least four L. casei strains. Overall, these phages are distinct from previously characterized Lactobacillus phages. Our results highlight the diversity of L. casei phages and indicate frequent DNA exchanges between phages and their hosts. PMID:26475105

  13. The genomic and phenotypic diversity of Schizosaccharomyces pombe.

    PubMed

    Jeffares, Daniel C; Rallis, Charalampos; Rieux, Adrien; Speed, Doug; Převorovský, Martin; Mourier, Tobias; Marsellach, Francesc X; Iqbal, Zamin; Lau, Winston; Cheng, Tammy M K; Pracana, Rodrigo; Mülleder, Michael; Lawson, Jonathan L D; Chessel, Anatole; Bala, Sendu; Hellenthal, Garrett; O'Fallon, Brendan; Keane, Thomas; Simpson, Jared T; Bischof, Leanne; Tomiczek, Bartlomiej; Bitton, Danny A; Sideri, Theodora; Codlin, Sandra; Hellberg, Josephine E E U; van Trigt, Laurent; Jeffery, Linda; Li, Juan-Juan; Atkinson, Sophie; Thodberg, Malte; Febrer, Melanie; McLay, Kirsten; Drou, Nizar; Brown, William; Hayles, Jacqueline; Carazo Salas, Rafael E; Ralser, Markus; Maniatis, Nikolas; Balding, David J; Balloux, Francois; Durbin, Richard; Bähler, Jürg

    2015-03-01

    Natural variation within species reveals aspects of genome evolution and function. The fission yeast Schizosaccharomyces pombe is an important model for eukaryotic biology, but researchers typically use one standard laboratory strain. To extend the usefulness of this model, we surveyed the genomic and phenotypic variation in 161 natural isolates. We sequenced the genomes of all strains, finding moderate genetic diversity (π = 3 × 10⁻³ substitutions/site) and weak global population structure. We estimate that dispersal of S. pombe began during human antiquity (∼340 BCE), and ancestors of these strains reached the Americas at ∼1623 CE. We quantified 74 traits, finding substantial heritable phenotypic diversity. We conducted 223 genome-wide association studies, with 89 traits showing at least one association. The most significant variant for each trait explained 22% of the phenotypic variance on average, with indels having larger effects than SNPs. This analysis represents a rich resource to examine genotype-phenotype relationships in a tractable model. PMID:25665008

  14. Genomic diversity of 2010 Haitian cholera outbreak strains.

    PubMed

    Hasan, Nur A; Choi, Seon Young; Eppinger, Mark; Clark, Philip W; Chen, Arlene; Alam, Munirul; Haley, Bradd J; Taviani, Elisa; Hine, Erin; Su, Qi; Tallon, Luke J; Prosper, Joseph B; Furth, Keziah; Hoq, M M; Li, Huai; Fraser-Liggett, Claire M; Cravioto, Alejandro; Huq, Anwar; Ravel, Jacques; Cebula, Thomas A; Colwell, Rita R

    2012-07-17

    The millions of deaths from cholera during the past 200 y, coupled with the morbidity and mortality of cholera in Haiti since October 2010, are grim reminders that Vibrio cholerae, the etiologic agent of cholera, remains a scourge. We report the isolation of both V. cholerae O1 and non-O1/O139 early in the Haiti cholera epidemic from samples collected from victims in 18 towns across eight Arrondissements of Haiti. The results showed two distinct populations of V. cholerae coexisted in Haiti early in the epidemic. As non-O1/O139 V. cholerae was the sole pathogen isolated from 21% of the clinical specimens, its role in this epidemic, either alone or in concert with V. cholerae O1, cannot be dismissed. A genomic approach was used to examine similarities and differences among the Haitian V. cholerae O1 and V. cholerae non-O1/O139 strains. A total of 47 V. cholerae O1 and 29 V. cholerae non-O1/O139 isolates from patients and the environment were sequenced. Comparative genome analyses of the 76 genomes and eight reference strains of V. cholerae isolated in concurrent epidemics outside Haiti and 27 V. cholerae genomes available in the public database demonstrated substantial diversity of V. cholerae and ongoing flux within its genome. PMID:22711841

  15. The Global Invertebrate Genomics Alliance (GIGA): Developing Community Resources to Study Diverse Invertebrate Genomes

    PubMed Central

    2014-01-01

    Over 95% of all metazoan (animal) species comprise the “invertebrates,” but very few genomes from these organisms have been sequenced. We have, therefore, formed a “Global Invertebrate Genomics Alliance” (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site (http://giga.nova.edu) has been launched to facilitate this collaborative venture. PMID:24336862

  16. The Global Invertebrate Genomics Alliance (GIGA): developing community resources to study diverse invertebrate genomes.

    PubMed

    Bracken-Grissom, Heather; Collins, Allen G; Collins, Timothy; Crandall, Keith; Distel, Daniel; Dunn, Casey; Giribet, Gonzalo; Haddock, Steven; Knowlton, Nancy; Martindale, Mark; Medina, Mónica; Messing, Charles; O'Brien, Stephen J; Paulay, Gustav; Putnam, Nicolas; Ravasi, Timothy; Rouse, Greg W; Ryan, Joseph F; Schulze, Anja; Wörheide, Gert; Adamska, Maja; Bailly, Xavier; Breinholt, Jesse; Browne, William E; Diaz, M Christina; Evans, Nathaniel; Flot, Jean-François; Fogarty, Nicole; Johnston, Matthew; Kamel, Bishoy; Kawahara, Akito Y; Laberge, Tammy; Lavrov, Dennis; Michonneau, François; Moroz, Leonid L; Oakley, Todd; Osborne, Karen; Pomponi, Shirley A; Rhodes, Adelaide; Santos, Scott R; Satoh, Nori; Thacker, Robert W; Van de Peer, Yves; Voolstra, Christian R; Welch, David Mark; Winston, Judith; Zhou, Xin

    2014-01-01

    Over 95% of all metazoan (animal) species comprise the "invertebrates," but very few genomes from these organisms have been sequenced. We have, therefore, formed a "Global Invertebrate Genomics Alliance" (GIGA). Our intent is to build a collaborative network of diverse scientists to tackle major challenges (e.g., species selection, sample collection and storage, sequence assembly, annotation, analytical tools) associated with genome/transcriptome sequencing across a large taxonomic spectrum. We aim to promote standards that will facilitate comparative approaches to invertebrate genomics and collaborations across the international scientific community. Candidate study taxa include species from Porifera, Ctenophora, Cnidaria, Placozoa, Mollusca, Arthropoda, Echinodermata, Annelida, Bryozoa, and Platyhelminthes, among others. GIGA will target 7000 noninsect/nonnematode species, with an emphasis on marine taxa because of the unrivaled phyletic diversity in the oceans. Priorities for selecting invertebrates for sequencing will include, but are not restricted to, their phylogenetic placement; relevance to organismal, ecological, and conservation research; and their importance to fisheries and human health. We highlight benefits of sequencing both whole genomes (DNA) and transcriptomes and also suggest policies for genomic-level data access and sharing based on transparency and inclusiveness. The GIGA Web site (http://giga.nova.edu) has been launched to facilitate this collaborative venture. PMID:24336862

  17. Diversity and evolution of centromere repeats in the maize genome.

    PubMed

    Bilinski, Paul; Distor, Kevin; Gutierrez-Lopez, Jose; Mendoza, Gabriela Mendoza; Shi, Jinghua; Dawe, R Kelly; Ross-Ibarra, Jeffrey

    2015-03-01

    Centromere repeats are found in most eukaryotes and play a critical role in kinetochore formation. Though centromere repeats exhibit considerable diversity both within and among species, little is understood about the mechanisms that drive centromere repeat evolution. Here, we use maize as a model to investigate how a complex history involving polyploidy, fractionation, and recent domestication has impacted the diversity of the maize centromeric repeat CentC. We first validate the existence of long tandem arrays of repeats in maize and other taxa in the genus Zea. Although we find considerable sequence diversity among CentC copies genome-wide, genetic similarity among repeats is highest within these arrays, suggesting that tandem duplications are the primary mechanism for the generation of new copies. Nonetheless, clustering analyses identify similar sequences among distant repeats, and simulations suggest that this pattern may be due to homoplasious mutation. Although the two ancestral subgenomes of maize have contributed nearly equal numbers of centromeres, our analysis shows that the majority of all CentC repeats derive from one of the parental genomes, with an even stronger bias when examining the largest assembled contiguous clusters. Finally, by comparing maize with its wild progenitor teosinte, we find that the abundance of CentC likely decreased after domestication, while the pericentromeric repeat Cent4 has drastically increased. PMID:25190528

  18. A genome-wide map of diversity in Plasmodium falciparum.

    PubMed

    Volkman, Sarah K; Sabeti, Pardis C; DeCaprio, David; Neafsey, Daniel E; Schaffner, Stephen F; Milner, Danny A; Daily, Johanna P; Sarr, Ousmane; Ndiaye, Daouda; Ndir, Omar; Mboup, Soulyemane; Duraisingh, Manoj T; Lukens, Amanda; Derr, Alan; Stange-Thomann, Nicole; Waggoner, Skye; Onofrio, Robert; Ziaugra, Liuda; Mauceli, Evan; Gnerre, Sante; Jaffe, David B; Zainoun, Joanne; Wiegand, Roger C; Birren, Bruce W; Hartl, Daniel L; Galagan, James E; Lander, Eric S; Wirth, Dyann F

    2007-01-01

    Genetic variation allows the malaria parasite Plasmodium falciparum to overcome chemotherapeutic agents, vaccines and vector control strategies and remain a leading cause of global morbidity and mortality. Here we describe an initial survey of genetic variation across the P. falciparum genome. We performed extensive sequencing of 16 geographically diverse parasites and identified 46,937 SNPs, demonstrating rich diversity among P. falciparum parasites (π = 1.16 × 10⁻³) and strong correlation with gene function. We identified multiple regions with signatures of selective sweeps in drug-resistant parasites, including a previously unidentified 160-kb region with extremely low polymorphism in pyrimethamine-resistant parasites. We further characterized 54 worldwide isolates by genotyping SNPs across 20 genomic regions. These data begin to define population structure among African, Asian and American groups and illustrate the degree of linkage disequilibrium, which extends over relatively short distances in African parasites but over longer distances in Asian parasites. We provide an initial map of genetic diversity in P. falciparum and demonstrate its potential utility in identifying genes subject to recent natural selection and in understanding the population genetics of this parasite. PMID:17159979

  19. Diversity and Evolution in the Genome of Clostridium difficile

    PubMed Central

    Knight, Daniel R.; Elliott, Briony; Chang, Barbara J.; Perkins, Timothy T.

    2015-01-01

    SUMMARY Clostridium difficile infection (CDI) is the leading cause of antimicrobial and health care-associated diarrhea in humans, presenting a significant burden to global health care systems. In the last 2 decades, PCR- and sequence-based techniques, particularly whole-genome sequencing (WGS), have significantly furthered our knowledge of the genetic diversity, evolution, epidemiology, and pathogenicity of this once enigmatic pathogen. C. difficile is taxonomically distinct from many other well-known clostridia, with a diverse population structure comprising hundreds of strain types spread across at least 6 phylogenetic clades. The C. difficile species is defined by a large diverse pangenome with extreme levels of evolutionary plasticity that has been shaped over long time periods by gene flux and recombination, often between divergent lineages. These evolutionary events are in response to environmental and anthropogenic activities and have led to the rapid emergence and worldwide dissemination of virulent clonal lineages. Moreover, genome analysis of large clinically relevant data sets has improved our understanding of CDI outbreaks, transmission, and recurrence. The epidemiology of CDI has changed dramatically over the last 15 years, and CDI may have a foodborne or zoonotic etiology. The WGS era promises to continue to redefine our view of this significant pathogen. PMID:26085550

  20. Comparative genomics reveals diversity among xanthomonads infecting tomato and pepper

    PubMed Central

    2011-01-01

    Background Bacterial spot of tomato and pepper is caused by four Xanthomonas species and is a major plant disease in warm humid climates. The four species are distinct from each other based on physiological and molecular characteristics. The genome sequence of strain 85-10, a member of one of the species, Xanthomonas euvesicatoria (Xcv) has been previously reported. To determine the relationship of the four species at the genome level and to investigate the molecular basis of their virulence and differing host ranges, draft genomic sequences of members of the other three species were determined and compared to strain 85-10. Results We sequenced the genomes of X. vesicatoria (Xv) strain 1111 (ATCC 35937), X. perforans (Xp) strain 91-118 and X. gardneri (Xg) strain 101 (ATCC 19865). The genomes were compared with each other and with the previously sequenced Xcv strain 85-10. In addition, the molecular features were predicted that may be required for pathogenicity including the type III secretion apparatus, type III effectors, other secretion systems, quorum sensing systems, adhesins, extracellular polysaccharide, and lipopolysaccharide determinants. Several novel type III effectors from Xg strain 101 and Xv strain 1111 genomes were computationally identified and their translocation was validated using a reporter gene assay. A homolog to Ax21, the elicitor of XA21-mediated resistance in rice, and a functional Ax21 sulfation system were identified in Xcv. Genes encoding proteins with functions mediated by type II and type IV secretion systems have also been compared, including enzymes involved in cell wall deconstruction, as contributors to pathogenicity. Conclusions Comparative genomic analyses revealed considerable diversity among bacterial spot pathogens, providing new insights into differences and similarities that may explain the diverse nature of these strains. Genes specific to pepper pathogens, such as the O-antigen of the lipopolysaccharide cluster, and genes

  1. Genomic Diversity of Enterotoxigenic Strains of Bacteroides fragilis.

    PubMed

    Pierce, Jessica V; Bernstein, Harris D

    2016-01-01

    Enterotoxigenic (ETBF) strains of Bacteroides fragilis are the subset of strains that secrete a toxin called fragilysin (Bft). Although ETBF strains are known to cause diarrheal disease and have recently been associated with colorectal cancer, they have not been well characterized. By sequencing the complete genome of four ETBF strains, we found that these strains exhibit considerable variation at the genomic level. Only a small number of genes that are located primarily in the Bft pathogenicity island (BFT PAI) and the flanking CTn86 conjugative transposon are conserved in all four strains and a fifth strain whose genome was previously sequenced. Interestingly, phylogenetic analysis strongly suggests that the BFT PAI was acquired by non-toxigenic (NTBF) strains multiple times during the course of evolution. At the phenotypic level, we found that the ETBF strains were less fit than the NTBF strain NCTC 9343 and were susceptible to a growth-inhibitory protein that it produces. The ETBF strains also showed a greater tendency to form biofilms, which may promote tumor formation, than NTBF strains. Although the genomic diversity of ETBF strains raises the possibility that they vary in their pathogenicity, our experimental results also suggest that they share common properties that are conferred by different combinations of non-universal genetic elements. PMID:27348220

  2. Genomic Diversity of Enterotoxigenic Strains of Bacteroides fragilis

    PubMed Central

    Pierce, Jessica V.; Bernstein, Harris D.

    2016-01-01

    Enterotoxigenic (ETBF) strains of Bacteroides fragilis are the subset of strains that secrete a toxin called fragilysin (Bft). Although ETBF strains are known to cause diarrheal disease and have recently been associated with colorectal cancer, they have not been well characterized. By sequencing the complete genome of four ETBF strains, we found that these strains exhibit considerable variation at the genomic level. Only a small number of genes that are located primarily in the Bft pathogenicity island (BFT PAI) and the flanking CTn86 conjugative transposon are conserved in all four strains and a fifth strain whose genome was previously sequenced. Interestingly, phylogenetic analysis strongly suggests that the BFT PAI was acquired by non-toxigenic (NTBF) strains multiple times during the course of evolution. At the phenotypic level, we found that the ETBF strains were less fit than the NTBF strain NCTC 9343 and were susceptible to a growth-inhibitory protein that it produces. The ETBF strains also showed a greater tendency to form biofilms, which may promote tumor formation, than NTBF strains. Although the genomic diversity of ETBF strains raises the possibility that they vary in their pathogenicity, our experimental results also suggest that they share common properties that are conferred by different combinations of non-universal genetic elements. PMID:27348220

  3. Integrons in Xanthomonas: A source of species genome diversity

    PubMed Central

    Gillings, Michael R.; Holley, Marita P.; Stokes, H. W.; Holmes, Andrew J.

    2005-01-01

    Integrons are best known for assembling antibiotic resistance genes in clinical bacteria. They capture genes by using integrase-mediated site-specific recombination of mobile gene cassettes. Integrons also occur in the chromosomes of many bacteria, notably β- and γ-Proteobacteria. In a survey of Xanthomonas, integrons were found in all 32 strains representing 12 pathovars of two species. Their chromosomal location was downstream from the acid dehydratase gene, ilvD, suggesting that an integron was present at this site in the ancestral xanthomonad. There was considerable sequence and structural diversity among the extant integrons. The majority of integrase genes were predicted to be inactivated by frameshifts, stop codons, or large deletions, suggesting that the associated gene cassettes can no longer be mobilized. In support, groups of strains with the same deletions or stop codons/frameshifts in their integrase gene usually contained identical arrays of gene cassettes. In general, strains within individual pathovars had identical cassettes, and these exhibited no similarity to cassettes detected in other pathovars. The variety and characteristics of contemporary gene cassettes suggests that the ancestral integron had access to a diverse pool of these mobile elements, and that their genes originated outside the Xanthomonas genome. Subsequent inactivation of the integrase gene in particular lineages has largely fixed the gene cassette arrays in particular pathovars during their differentiation and specialization into ecological niches. The acquisition of diverse gene cassettes by different lineages within Xanthomonas has contributed to the species-genome diversity of the genus. The role of gene cassettes in survival on plant surfaces is currently unknown. PMID:15755815

  4. Limitations and benefits of ARISA intra-genomic diversity fingerprinting.

    PubMed

    Popa, Radu; Popa, Rodica; Mashall, Matthew J; Nguyen, Hien; Tebo, Bradley M; Brauer, Suzanna

    2009-08-01

    Monitoring diversity changes and contamination in mixed cultures and simple microcosms is challenged by fast community structure dynamics, and the need for means allowing fast, cost-efficient and accurate identification of microorganisms at high phylogenetic resolution. The method we explored is a variant of Automated rRNA Intergenic Spacer Analysis based on Intra-Genomic Diversity Fingerprinting (ARISA-IGDF), and identifies phylotypes with multiple 16S-23S rRNA gene Intergenic Transcribed Spacers. We verified the effect of PCR conditions (annealing temperature, duration of final extension, number of cycles, group-specific primers and formamide) on ARISA-IGD fingerprints of 44 strains of Shewanella. We present a digitization algorithm and data analysis procedures needed to determine confidence in strain identification. Though using stringent PCR conditions and group-specific primers allows reasonably accurate identification of strains with three ARISA-IGD amplicons within the 82-1000 bp size range, ARISA-IGDF is best for phylotypes with ≥4 unambiguously different amplicons. This method allows monitoring the occurrence of culturable microbes and can be implemented in applications requiring high phylogenetic resolution, reproducibility, low cost and high throughput such as identifying contamination and monitoring the evolution of diversity in mixed cultures and low diversity microcosms and periodic screening of small microbial culture libraries. PMID:19538993

  5. Genetics, Genomics and Evolution of Ergot Alkaloid Diversity

    PubMed Central

    Young, Carolyn A.; Schardl, Christopher L.; Panaccione, Daniel G.; Florea, Simona; Takach, Johanna E.; Charlton, Nikki D.; Moore, Neil; Webb, Jennifer S.; Jaromczyk, Jolanta

    2015-01-01

    The ergot alkaloid biosynthesis system has become an excellent model to study evolutionary diversification of specialized (secondary) metabolites. This is a very diverse class of alkaloids with various neurotropic activities, produced by fungi in several orders of the phylum Ascomycota, including plant pathogens and protective plant symbionts in the family Clavicipitaceae. Results of comparative genomics and phylogenomic analyses reveal multiple examples of three evolutionary processes that have generated ergot-alkaloid diversity: gene gains, gene losses, and gene sequence changes that have led to altered substrates or product specificities of the enzymes that they encode (neofunctionalization). The chromosome ends appear to be particularly effective engines for gene gains, losses and rearrangements, but not necessarily for neofunctionalization. Changes in gene expression could lead to accumulation of various pathway intermediates and affect levels of different ergot alkaloids. Genetic alterations associated with interspecific hybrids of Epichloë species suggest that such variation is also selectively favored. The huge structural diversity of ergot alkaloids probably represents adaptations to a wide variety of ecological situations by affecting the biological spectra and mechanisms of defense against herbivores, as evidenced by the diverse pharmacological effects of ergot alkaloids used in medicine. PMID:25875294

  6. Genetics, genomics and evolution of ergot alkaloid diversity.

    PubMed

    Young, Carolyn A; Schardl, Christopher L; Panaccione, Daniel G; Florea, Simona; Takach, Johanna E; Charlton, Nikki D; Moore, Neil; Webb, Jennifer S; Jaromczyk, Jolanta

    2015-04-01

    The ergot alkaloid biosynthesis system has become an excellent model to study evolutionary diversification of specialized (secondary) metabolites. This is a very diverse class of alkaloids with various neurotropic activities, produced by fungi in several orders of the phylum Ascomycota, including plant pathogens and protective plant symbionts in the family Clavicipitaceae. Results of comparative genomics and phylogenomic analyses reveal multiple examples of three evolutionary processes that have generated ergot-alkaloid diversity: gene gains, gene losses, and gene sequence changes that have led to altered substrates or product specificities of the enzymes that they encode (neofunctionalization). The chromosome ends appear to be particularly effective engines for gene gains, losses and rearrangements, but not necessarily for neofunctionalization. Changes in gene expression could lead to accumulation of various pathway intermediates and affect levels of different ergot alkaloids. Genetic alterations associated with interspecific hybrids of Epichloë species suggest that such variation is also selectively favored. The huge structural diversity of ergot alkaloids probably represents adaptations to a wide variety of ecological situations by affecting the biological spectra and mechanisms of defense against herbivores, as evidenced by the diverse pharmacological effects of ergot alkaloids used in medicine. PMID:25875294

  7. Phenotypic heterogeneity of genomically-diverse isolates of Streptococcus mutans.

    PubMed

    Palmer, Sara R; Miller, James H; Abranches, Jacqueline; Zeng, Lin; Lefebure, Tristan; Richards, Vincent P; Lemos, José A; Stanhope, Michael J; Burne, Robert A

    2013-01-01

    High coverage, whole genome shotgun (WGS) sequencing of 57 geographically- and genetically-diverse isolates of Streptococcus mutans from individuals of known dental caries status was recently completed. Of the 57 sequenced strains, fifteen isolates were selected based primarily on differences in gene content and phenotypic characteristics known to affect virulence and compared with the reference strain UA159. A high degree of variability in these properties was observed between strains, with a broad spectrum of sensitivities to low pH, oxidative stress (air and paraquat) and exposure to competence stimulating peptide (CSP). Significant differences in autolytic behavior and in biofilm development in glucose or sucrose were also observed. Natural genetic competence varied among isolates, and this was correlated to the presence or absence of competence genes, comCDE and comX, and to bacteriocins. In general, strains that lacked the ability to become competent possessed fewer genes for bacteriocins and immunity proteins or contained polymorphic variants of these genes. WGS sequence analysis of the pan-genome revealed, for the first time, components of a Type VII secretion system in several S. mutans strains, as well as two putative ORFs that encode possible collagen binding proteins located upstream of the cnm gene, which is associated with host cell invasiveness. The virulence of these particular strains was assessed in a wax-worm model. This is the first study to combine a comprehensive analysis of key virulence-related phenotypes with extensive genomic analysis of a pathogen that evolved closely with humans. Our analysis highlights the phenotypic diversity of S. mutans isolates and indicates that the species has evolved a variety of adaptive strategies to persist in the human oral cavity and, when conditions are favorable, to initiate disease. PMID:23613838

  8. Phenotypic Heterogeneity of Genomically-Diverse Isolates of Streptococcus mutans

    PubMed Central

    Palmer, Sara R.; Miller, James H.; Abranches, Jacqueline; Zeng, Lin; Lefebure, Tristan; Richards, Vincent P.; Lemos, José A.; Stanhope, Michael J.; Burne, Robert A.

    2013-01-01

    High coverage, whole genome shotgun (WGS) sequencing of 57 geographically- and genetically-diverse isolates of Streptococcus mutans from individuals of known dental caries status was recently completed. Of the 57 sequenced strains, fifteen isolates were selected based primarily on differences in gene content and phenotypic characteristics known to affect virulence and compared with the reference strain UA159. A high degree of variability in these properties was observed between strains, with a broad spectrum of sensitivities to low pH, oxidative stress (air and paraquat) and exposure to competence stimulating peptide (CSP). Significant differences in autolytic behavior and in biofilm development in glucose or sucrose were also observed. Natural genetic competence varied among isolates, and this was correlated to the presence or absence of competence genes, comCDE and comX, and to bacteriocins. In general, strains that lacked the ability to become competent possessed fewer genes for bacteriocins and immunity proteins or contained polymorphic variants of these genes. WGS sequence analysis of the pan-genome revealed, for the first time, components of a Type VII secretion system in several S. mutans strains, as well as two putative ORFs that encode possible collagen binding proteins located upstream of the cnm gene, which is associated with host cell invasiveness. The virulence of these particular strains was assessed in a wax-worm model. This is the first study to combine a comprehensive analysis of key virulence-related phenotypes with extensive genomic analysis of a pathogen that evolved closely with humans. Our analysis highlights the phenotypic diversity of S. mutans isolates and indicates that the species has evolved a variety of adaptive strategies to persist in the human oral cavity and, when conditions are favorable, to initiate disease. PMID:23613838

  9. Global Genomic Diversity of Human Papillomavirus 6 Based on 724 Isolates and 190 Complete Genome Sequences

    PubMed Central

    Jelen, Mateja M.; Chen, Zigui; Kocjan, Boštjan J.; Burt, Felicity J.; Chan, Paul K. S.; Chouhy, Diego; Combrinck, Catharina E.; Coutlée, François; Estrade, Christine; Ferenczy, Alex; Fiander, Alison; Franco, Eduardo L.; Garland, Suzanne M.; Giri, Adriana A.; González, Joaquín Víctor; Gröning, Arndt; Heidrich, Kerstin; Hibbitts, Sam; Hošnjak, Lea; Luk, Tommy N. M.; Marinic, Karina; Matsukura, Toshihiko; Neumann, Anna; Oštrbenk, Anja; Picconi, Maria Alejandra; Richardson, Harriet; Sagadin, Martin; Sahli, Roland; Seedat, Riaz Y.; Seme, Katja; Severini, Alberto; Sinchi, Jessica L.; Smahelova, Jana; Tabrizi, Sepehr N.; Tachezy, Ruth; Tohme, Sarah; Uloza, Virgilijus; Vitkauskiene, Astra; Wong, Yong Wee; Židovec Lepej, Snježana; Burk, Robert D.

    2014-01-01

    ABSTRACT Human papillomavirus type 6 (HPV6) is the major etiological agent of anogenital warts and laryngeal papillomas and has been included in both the quadrivalent and nonavalent prophylactic HPV vaccines. This study investigated the global genomic diversity of HPV6, using 724 isolates and 190 complete genomes from six continents, and the association of HPV6 genomic variants with geographical location, anatomical site of infection/disease, and gender. Initially, a 2,800-bp E5a-E5b-L1-LCR fragment was sequenced from 492/530 (92.8%) HPV6-positive samples collected for this study. Among them, 130 exhibited at least one single nucleotide polymorphism (SNP), indel, or amino acid change in the E5a-E5b-L1-LCR fragment and were sequenced in full. A global alignment and maximum likelihood tree of 190 complete HPV6 genomes (130 fully sequenced in this study and 60 obtained from sequence repositories) revealed two variant lineages, A and B, and five B sublineages: B1, B2, B3, B4, and B5. HPV6 (sub)lineage-specific SNPs and a 960-bp representative region for whole-genome-based phylogenetic clustering within the L2 open reading frame were identified. Multivariate logistic regression analysis revealed that lineage B predominated globally. Sublineage B3 was more common in Africa and North and South America, and lineage A was more common in Asia. Sublineages B1 and B3 were associated with anogenital infections, indicating a potential lesion-specific predilection of some HPV6 sublineages. Females had higher odds for infection with sublineage B3 than males. In conclusion, a global HPV6 phylogenetic analysis revealed the existence of two variant lineages and five sublineages, showing some degree of ethnogeographic, gender, and/or disease predilection in their distribution. IMPORTANCE This study established the largest database of globally circulating HPV6 genomic variants and contributed a total of 130 new, complete HPV6 genome sequences to available sequence repositories. 
Two HPV

  10. Nearly finished genomes produced using gel microdroplet culturing reveal substantial intraspecies genomic diversity within the human microbiome

    PubMed Central

    Fitzsimons, Michael S.; Novotny, Mark; Lo, Chien-Chi; Dichosa, Armand E.K.; Yee-Greenbaum, Joyclyn L.; Snook, Jeremy P.; Gu, Wei; Chertkov, Olga; Davenport, Karen W.; McMurry, Kim; Reitenga, Krista G.; Daughton, Ashlynn R.; He, Jian; Johnson, Shannon L.; Gleasner, Cheryl D.; Wills, Patti L.; Parson-Quintana, Beverly; Chain, Patrick S.; Detter, John C.; Lasken, Roger S.; Han, Cliff S.

    2013-01-01

    The majority of microbial genomic diversity remains unexplored. This is largely due to our inability to culture most microorganisms in isolation, which is a prerequisite for traditional genome sequencing. Single-cell sequencing has allowed researchers to circumvent this limitation. DNA is amplified directly from a single cell using the whole-genome amplification technique of multiple displacement amplification (MDA). However, MDA from a single chromosome copy suffers from amplification bias and a large loss of specificity from even very small amounts of DNA contamination, which makes assembling a genome difficult and completely finishing a genome impossible except in extraordinary circumstances. Gel microdrop cultivation allows culturing of a diverse microbial community and provides hundreds to thousands of genetically identical cells as input for an MDA reaction. We demonstrate the utility of this approach by comparing sequencing results of gel microdroplets and single cells following MDA. Bias is reduced in the MDA reaction and genome sequencing, and assembly is greatly improved when using gel microdroplets. We acquired multiple near-complete genomes for two bacterial species from human oral and stool microbiome samples. A significant amount of genome diversity, including single nucleotide polymorphisms and genome recombination, is discovered. Gel microdroplets offer a powerful and high-throughput technology for assembling whole genomes from complex samples and for probing the pan-genome of naturally occurring populations. PMID:23493677

  11. The Human Genome Diversity (HGD) Project. Summary document

    SciTech Connect

    1993-12-31

    In 1991 a group of human geneticists and molecular biologists proposed to the scientific community that a worldwide survey be undertaken of variation in the human genome. To aid their considerations, the committee therefore decided to hold a small series of international workshops to explore the major scientific issues involved. The intention was to define a framework for the project which could provide a basis for much wider and more detailed discussion and planning--it was recognized that the successful implementation of the proposed project, which has come to be known as the Human Genome Diversity (HGD) Project, would not only involve scientists but also various national and international non-scientific groups all of which should contribute to the project's development. The international HGD workshop held in Sardinia in September 1993 was the last in the initial series of planning workshops. As such it not only explored new ground but also pulled together into a more coherent form much of the formal and informal discussion that had taken place in the preceding two years. This report presents the deliberations of the Sardinia workshop within a consideration of the overall development of the HGD Project to date.

  12. The genome diversity and karyotype evolution of mammals

    PubMed Central

    2011-01-01

    The past decade has witnessed an explosion of genome sequencing and mapping in evolutionarily diverse species. While full genome sequencing of mammals is rapidly progressing, assembling and aligning orthologous whole chromosome regions from more than a few species is still not possible. The intense focus on building of comparative maps for companion (dog and cat), laboratory (mouse and rat) and agricultural (cattle, pig, and horse) animals has traditionally been used as a means to understand the underlying basis of disease-related or economically important phenotypes. However, these maps also provide an unprecedented opportunity to use multispecies analysis as a tool for inferring karyotype evolution. Comparative chromosome painting and related techniques are now considered to be the most powerful approaches in comparative genome studies. Homologies can be identified with high accuracy using molecularly defined DNA probes for fluorescence in situ hybridization (FISH) on chromosomes of different species. Chromosome painting data are now available for members of nearly all mammalian orders. In most orders, there are species with rates of chromosome evolution that can be considered as 'default' rates. The number of rearrangements that have become fixed in evolutionary history seems comparatively low, bearing in mind the 180 million years of the mammalian radiation. Comparative chromosome maps record the history of karyotype changes that have occurred during evolution. The aim of this review is to provide an overview of these recent advances in our endeavor to decipher the karyotype evolution of mammals by integrating the published results together with some of our latest unpublished results. PMID:21992653

  13. The HLA genomic loci map: expression, interaction, diversity and disease.

    PubMed

    Shiina, Takashi; Hosomichi, Kazuyoshi; Inoko, Hidetoshi; Kulski, Jerzy K

    2009-01-01

    The human leukocyte antigen (HLA) super-locus is a genomic region in the chromosomal position 6p21 that encodes the six classical transplantation HLA genes and at least 132 protein coding genes that have important roles in the regulation of the immune system as well as some other fundamental molecular and cellular processes. This small segment of the human genome has been associated with more than 100 different diseases, including common diseases, such as diabetes, rheumatoid arthritis, psoriasis, asthma and various other autoimmune disorders. The first complete and continuous HLA 3.6 Mb genomic sequence was reported in 1999 with the annotation of 224 gene loci, including coding and non-coding genes that were reviewed extensively in 2004. In this review, we present (1) an updated list of all the HLA gene symbols, gene names, expression status, Online Mendelian Inheritance in Man (OMIM) numbers, including new genes, and latest changes to gene names and symbols, (2) a regional analysis of the extended class I, class I, class III, class II and extended class II subregions, (3) a summary of the interspersed repeats (retrotransposons and transposons), (4) examples of the sequence diversity between different HLA haplotypes, (5) intra- and extra-HLA gene interactions and (6) some of the HLA gene expression profiles and HLA genes associated with autoimmune and infectious diseases. Overall, the degrees and types of HLA super-locus coordinated gene expression profiles and gene variations have yet to be fully elucidated, integrated and defined for the processes involved with normal cellular and tissue physiology, inflammatory and immune responses, and autoimmune and infectious diseases. PMID:19158813

  14. Diversity-generating Retroelements in Phage and Bacterial Genomes.

    PubMed

    Guo, Huatao; Arambula, Diego; Ghosh, Partho; Miller, Jeff F

    2014-12-01

    Diversity-generating retroelements (DGRs) are DNA diversification machines found in diverse bacterial and bacteriophage genomes that accelerate the evolution of ligand-receptor interactions. Diversification results from a unidirectional transfer of sequence information from an invariant template repeat (TR) to a variable repeat (VR) located in a protein-encoding gene. Information transfer is coupled to site-specific mutagenesis in a process called mutagenic homing, which occurs through an RNA intermediate and is catalyzed by a unique, DGR-encoded reverse transcriptase that converts adenine residues in the TR into random nucleotides in the VR. In the prototype DGR found in the Bordetella bacteriophage BPP-1, the variable protein Mtd is responsible for phage receptor recognition. VR diversification enables progeny phage to switch tropism, accelerating their adaptation to changes in sequence or availability of host cell-surface molecules for infection. Since their discovery, hundreds of DGRs have been identified, and their functions are just beginning to be understood. VR-encoded residues of many DGR-diversified proteins are displayed in the context of a C-type lectin fold, although other scaffolds, including the immunoglobulin fold, may also be used. DGR homing is postulated to occur through a specialized target DNA-primed reverse transcription mechanism that allows repeated rounds of diversification and selection, and the ability to engineer DGRs to target heterologous genes suggests applications for bioengineering. This chapter provides a comprehensive review of our current understanding of this newly discovered family of beneficial retroelements. PMID:26104433

  15. Impact of marker ascertainment bias on genomic selection accuracy and estimates of genetic diversity

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genome-wide molecular markers are readily being applied to evaluate genetic diversity in germplasm collections and for making genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorp...

  16. Genome Sequence of a Diverse Goose Circovirus Recovered from Greylag Goose

    PubMed Central

    Stenzel, Tomasz; Farkas, Kata

    2015-01-01

    A diverse goose circovirus (GoCV) genome was recovered from a wild hunted greylag goose (Anser anser) in Poland. The genome shares 83% pairwise identity with other GoCV genomes recovered from various geese from China, Germany, and Taiwan. PMID:26227589

  17. The Great Migration and African-American Genomic Diversity

    PubMed Central

    Barakatt, Maxime; Gignoux, Christopher R.; Errington, Jacob; Blot, William J.; Bustamante, Carlos D.; Kenny, Eimear E.; Williams, Scott M.; Aldrich, Melinda C.; Gravel, Simon

    2016-01-01

    We present a comprehensive assessment of genomic diversity in the African-American population by studying three genotyped cohorts comprising 3,726 African-Americans from across the United States that provide a representative description of the population across all US states and socioeconomic status. An estimated 82.1% of ancestors to African-Americans lived in Africa prior to the advent of transatlantic travel, 16.7% in Europe, and 1.2% in the Americas, with increased African ancestry in the southern United States compared to the North and West. Combining demographic models of ancestry and those of relatedness suggests that admixture occurred predominantly in the South prior to the Civil War and that ancestry-biased migration is responsible for regional differences in ancestry. We find that recent migrations also caused a strong increase in genetic relatedness among geographically distant African-Americans. Long-range relatedness among African-Americans and between African-Americans and European-Americans thus track north- and west-bound migration routes followed during the Great Migration of the twentieth century. By contrast, short-range relatedness patterns suggest comparable mobility of ∼15–16km per generation for African-Americans and European-Americans, as estimated using a novel analytical model of isolation-by-distance. PMID:27232753

  18. The Great Migration and African-American Genomic Diversity.

    PubMed

    Baharian, Soheil; Barakatt, Maxime; Gignoux, Christopher R; Shringarpure, Suyash; Errington, Jacob; Blot, William J; Bustamante, Carlos D; Kenny, Eimear E; Williams, Scott M; Aldrich, Melinda C; Gravel, Simon

    2016-05-01

    We present a comprehensive assessment of genomic diversity in the African-American population by studying three genotyped cohorts comprising 3,726 African-Americans from across the United States that provide a representative description of the population across all US states and socioeconomic status. An estimated 82.1% of ancestors to African-Americans lived in Africa prior to the advent of transatlantic travel, 16.7% in Europe, and 1.2% in the Americas, with increased African ancestry in the southern United States compared to the North and West. Combining demographic models of ancestry and those of relatedness suggests that admixture occurred predominantly in the South prior to the Civil War and that ancestry-biased migration is responsible for regional differences in ancestry. We find that recent migrations also caused a strong increase in genetic relatedness among geographically distant African-Americans. Long-range relatedness among African-Americans and between African-Americans and European-Americans thus track north- and west-bound migration routes followed during the Great Migration of the twentieth century. By contrast, short-range relatedness patterns suggest comparable mobility of ∼15-16km per generation for African-Americans and European-Americans, as estimated using a novel analytical model of isolation-by-distance. PMID:27232753

  19. Dissecting genomic diversity, one cell at a time

    PubMed Central

    Blainey, Paul C; Quake, Stephen R

    2014-01-01

    Emerging technologies are bringing single-cell genome sequencing into the mainstream; this field has already yielded insights into the genetic architecture and variability between cells that highlight the dynamic nature of the genome. PMID:24524132

  20. Understanding and utilizing crop genome diversity via high-resolution genotyping.

    PubMed

    Voss-Fels, Kai; Snowdon, Rod J

    2016-04-01

    High-resolution genome analysis technologies provide an unprecedented level of insight into structural diversity across crop genomes. Low-cost discovery of sequence variation has become accessible for all crops since the development of next-generation DNA sequencing technologies, using diverse methods ranging from genome-scale resequencing or skim sequencing, reduced-representation genotyping-by-sequencing, transcriptome sequencing or sequence capture approaches. High-density, high-throughput genotyping arrays generated using the resulting sequence data are today available for the assessment of genomewide single nucleotide polymorphisms in all major crop species. Besides their application in genetic mapping or genomewide association studies for dissection of complex agronomic traits, high-density genotyping arrays are highly suitable for genomic selection strategies. They also enable description of crop diversity at an unprecedented chromosome-scale resolution. Application of population genetics parameters to genomewide diversity data sets enables dissection of linkage disequilibrium to characterize loci underlying selective sweeps. High-throughput genotyping platforms simultaneously open the way for targeted diversity enrichment, allowing rejuvenation of low-diversity chromosome regions in strongly selected breeding pools to potentially reverse the influence of linkage drag. Numerous recent examples are presented which demonstrate the power of next-generation genomics for high-resolution analysis of crop diversity on a subgenomic and chromosomal scale. Such studies give deep insight into the history of crop evolution and selection, while simultaneously identifying novel diversity to improve yield and heterosis. PMID:27003869

  1. Genomic diversity and versatility of Lactobacillus plantarum, a natural metabolic engineer

    PubMed Central

    2011-01-01

    In the past decade it has become clear that the lactic acid bacterium Lactobacillus plantarum occupies a diverse range of environmental niches and has an enormous diversity in phenotypic properties, metabolic capacity and industrial applications. In this review, we describe how genome sequencing, comparative genome hybridization and comparative genomics have provided insight into the underlying genomic diversity and versatility of L. plantarum. One of the main features appears to be genomic life-style islands consisting of numerous functional gene cassettes, in particular for carbohydrate utilization, which can be acquired, shuffled, substituted or deleted in response to niche requirements. In this sense, L. plantarum can be considered a “natural metabolic engineer”. PMID:21995294

  2. Nile Tilapia Infectivity by Genomically Diverse Streptococcus agalactiae Isolates from Multiple Hosts

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Streptococcus agalactiae, Lancefield group B Streptococcus (GBS), is recognized for causing cattle mastitis, human neonatal meningitis, and fish meningo-encephalitis. We investigated the genomic diversity of GBS isolates from different phylogenetic hosts and geographical regions using serological t...

  3. Comparative assessment of genetic diversity in cytoplasmic and nuclear genome of upland cotton.

    PubMed

    Egamberdiev, Sharof S; Saha, Sukumar; Salakhutdinov, Ilkhom; Jenkins, Johnie N; Deng, Dewayne; Y Abdurakhmonov, Ibrokhim

    2016-06-01

    The importance of the cytoplasmic genome for many economically important traits is well documented in several crop species, including cotton. There is no report on application of cotton chloroplast specific SSR markers as a diagnostic tool to study genetic diversity among improved Upland cotton lines. The complete plastome sequence information in GenBank provided us an opportunity to report on 17 chloroplast specific SSR markers using a cost-effective data mining strategy. Here we report the comparative analysis of genetic diversity among a set of 42 improved Upland cotton lines using SSR markers specific to chloroplast and nuclear genome, respectively. Our results revealed that low to moderate level of genetic diversity existed in both nuclear and cytoplasm genome among this set of cotton lines. However, the specific estimation suggested that genetic diversity is lower in cytoplasmic genome compared to the nuclear genome among this set of Upland cotton lines. In summary, this research is important from several perspectives. We detected a set of cytoplasm genome specific SSR primer pairs by using a cost-effective data mining strategy. We reported for the first time the genetic diversity in the cytoplasmic genome within a set of improved Upland cotton accessions. Results revealed that the genetic diversity in cytoplasmic genome is narrow, compared to the nuclear genome within this set of Upland cotton accessions. Our results suggested that most of these polymorphic chloroplast SSRs would be a valuable complementary tool in addition to the nuclear SSR in the study of evolution, gene flow and genetic diversity in Upland cotton. PMID:27155886

  4. Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes Fungi

    SciTech Connect

    Ohm, Robin A.; Feau, Nicolas; Henrissat, Bernard; Schoch, Conrad L.; Horwitz, Benjamin A.; Barry, Kerrie W.; Condon, Bradford J.; Copeland, Alex C.; Dhillon, Braham; Glaser, Fabien; Hesse, Cedar N.; Kosti, Idit; LaButti, Kurt; Lindquist, Erika A.; Lucas, Susan; Salamov, Asaf A.; Bradshaw, Rosie E.; Ciuffetti, Lynda; Hamelin, Richard C.; Kema, Gert H. J.; Lawrence, Christopher; Scott, James A.; Spatafora, Joseph W.; Turgeon, B. Gillian; de Wit, Pierre J. G. M.; Zhong, Shaobin; Goodwin, Stephen B.; Grigoriev, Igor V.

    2012-03-13

    The class of Dothideomycetes is one of the largest and most diverse groups of fungi. Many are plant pathogens and pose a serious threat to agricultural crops grown for biofuel, food or feed. Most Dothideomycetes have only a single host and related species can have very diverse host plants. Eighteen genomes of Dothideomycetes have currently been sequenced by the Joint Genome Institute and other sequencing centers. Here we describe the results of comparative analyses of the fungi in this group.

  5. Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes

    SciTech Connect

    Ohm, Robin A.; Feau, Nicolas; Henrissat, Bernard; Schoch, Conrad L.; Horwitz, Benjamin A.; Barry, Kerrie W.; Condon, Bradford J.; Copeland, Alex C.; Dhillon, Braham; Glaser, Fabian; Hesse, Cedar N.; Kosti, Idit; LaButti, Kurt; Lindquist, Erika A.; Lucas, Susan; Salamov, Asaf A.; Bradshaw, Rosie E.; Ciuffetti, Lynda; Hamelin, Richard C.; Kema, Gert H. J.; Lawrence, Christopher; Scott, James A.; Spatafora, Joseph W.; Turgeon, B. Gillian; de Wit, Pierre J. G. M.; Zhong, Shaobin; Goodwin, Stephen B.; Grigoriev, Igor V.

    2013-03-05

    The class of Dothideomycetes is one of the largest and most diverse groups of fungi. Many are plant pathogens and pose a serious threat to agricultural crops that are grown for biofuel, food or feed. Most Dothideomycetes have only a single host plant, and related species can have very diverse hosts. Eighteen genomes of Dothideomycetes have currently been sequenced by the Joint Genome Institute and other sequencing centers. Here we describe the results of comparative analyses of the fungi in this group.

  6. Diversity through duplication: whole-genome sequencing reveals novel gene retrocopies in the human population.

    PubMed

    Richardson, Sandra R; Salvador-Palomeque, Carmen; Faulkner, Geoffrey J

    2014-05-01

    Gene retrocopies are generated by reverse transcription and genomic integration of mRNA. As such, retrocopies present an important exception to the central dogma of molecular biology, and have substantially impacted the functional landscape of the metazoan genome. While an estimated 8,000-17,000 retrocopies exist in the human genome reference sequence, the extent of variation between individuals in terms of retrocopy content has remained largely unexplored. Three recent studies by Abyzov et al., Ewing et al. and Schrider et al. have exploited 1000 Genomes Project Consortium data, as well as other sources of whole-genome sequencing data, to uncover novel gene retrocopies. Here, we compare the methods and results of these three studies, highlight the impact of retrocopies in human diversity and genome evolution, and speculate on the potential for somatic gene retrocopies to impact cancer etiology and genetic diversity among individual neurons in the mammalian brain. PMID:24615986

  7. Diverse lifestyles and strategies of plant pathogenesis encoded in the genomes of eighteen Dothideomycetes fungi

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The class Dothideomycetes is one of the largest groups of fungi with a high level of ecological diversity including many plant pathogens infecting a broad range of hosts. Here for the first time we compare the sequenced genomes of 18 Dothideomycetes to analyze their evolution, genome organization, a...

  8. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    PubMed Central

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 streptomycete strains downloaded from the NCBI. On the phylogenetic tree, the thirty-one streptomycete strains clearly group into a marine-derived subgroup and a multiple-source subgroup. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038

  9. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

    PubMed

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 streptomycete strains downloaded from the NCBI. On the phylogenetic tree, the thirty-one streptomycete strains clearly group into a marine-derived subgroup and a multiple-source subgroup. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea's genetic data sources. PMID:27446038

  10. Genomic diversity of Pseudomonas spp. isolated from aerial or root surfaces of plants

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Among the diverse strains of Pseudomonas fluorescens and Pseudomonas chlororaphis inhabiting plant surfaces are those that protect plants from infection by pathogens. To explore the diversity of these bacteria, we derived genomic sequences of seven strains that suppress plant disease. Along with t...

  11. Genomic Diversity of Biocontrol Strains of Pseudomonas spp. Isolated from Aerial or Root Surfaces of Plants

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The striking ecological, metabolic, and biochemical diversity of Pseudomonas has intrigued microbiologists for many decades. To explore the genomic diversity of biocontrol strains of Pseudomonas spp., we derived high quality draft sequences of seven strains known to suppress plant disease. The str...

  12. Diverse circovirus-like genome architectures revealed by environmental metagenomics.

    PubMed

    Rosario, Karyna; Duffy, Siobain; Breitbart, Mya

    2009-10-01

    Single-stranded DNA (ssDNA) viruses with circular genomes are the smallest viruses known to infect eukaryotes. The present study identified 10 novel genomes similar to ssDNA circoviruses through data-mining of public viral metagenomes. The metagenomic libraries included samples from reclaimed water and three different marine environments (Chesapeake Bay, British Columbia coastal waters and Sargasso Sea). All the genomes have similarities to the replication (Rep) protein of circoviruses; however, only half have genomic features consistent with known circoviruses. Some of the genomes exhibit a mixture of genomic features associated with different families of ssDNA viruses (i.e. circoviruses, geminiviruses and parvoviruses). Unique genome architectures and phylogenetic analysis of the Rep protein suggest that these viruses belong to novel genera and/or families. Investigating the complex community of ssDNA viruses in the environment can lead to the discovery of divergent species and help elucidate evolutionary links between ssDNA viruses. PMID:19570956

  13. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

    PubMed

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

    2015-01-01

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. PMID:25919952

  14. Whole genome sequences of the USMARC sheep diversity panel v 2.4 aligned to the ovine reference genome assembly

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A searchable and publicly viewable set of mapped genomes from 96 rams from 9 US sheep breeds was created. The nine pure breeds were selected to represent genetic diversity for traits such as fertility, prolificacy, maternal ability, growth rate, carcass leanness, wool quality, mature weight, and lo...

  15. Comparative Genomics Reveal Extensive Transposon-Mediated Genomic Plasticity and Diversity among Potential Effector Proteins within the Genus Coxiella

    PubMed Central

    Beare, Paul A.; Unsworth, Nathan; Andoh, Masako; Voth, Daniel E.; Omsland, Anders; Gilk, Stacey D.; Williams, Kelly P.; Sobral, Bruno W.; Kupko, John J.; Porcella, Stephen F.; Samuel, James E.; Heinzen, Robert A.

    2009-01-01

    Genetically distinct isolates of Coxiella burnetii, the cause of human Q fever, display different phenotypes with respect to in vitro infectivity/cytopathology and pathogenicity for laboratory animals. Moreover, correlations between C. burnetii genomic groups and human disease presentation (acute versus chronic) have been described, suggesting that isolates have distinct virulence characteristics. To provide a more-complete understanding of C. burnetii's genetic diversity, evolution, and pathogenic potential, we deciphered the whole-genome sequences of the K (Q154) and G (Q212) human chronic endocarditis isolates and the naturally attenuated Dugway (5J108-111) rodent isolate. Cross-genome comparisons that included the previously sequenced Nine Mile (NM) reference isolate (RSA493) revealed both novel gene content and disparate collections of pseudogenes that may contribute to isolate virulence and other phenotypes. While C. burnetii genomes are highly syntenous, recombination between abundant insertion sequence (IS) elements has resulted in genome plasticity manifested as chromosomal rearrangement of syntenic blocks and DNA insertions/deletions. The numerous IS elements, genomic rearrangements, and pseudogenes of C. burnetii isolates are consistent with genome structures of other bacterial pathogens that have recently emerged from nonpathogens with expanded niches. The observation that the attenuated Dugway isolate has the largest genome with the fewest pseudogenes and IS elements suggests that this isolate's lineage is at an earlier stage of pathoadaptation than the NM, K, and G lineages. PMID:19047403

  16. Landscape of genomic diversity and trait discovery in soybean

    PubMed Central

    Valliyodan, Babu; Dan Qiu; Patil, Gunvant; Zeng, Peng; Huang, Jiaying; Dai, Lu; Chen, Chengxuan; Li, Yanjun; Joshi, Trupti; Song, Li; Vuong, Tri D.; Musket, Theresa A.; Xu, Dong; Shannon, J. Grover; Shifeng, Cheng; Liu, Xin; Nguyen, Henry T.

    2016-01-01

    Cultivated soybean [Glycine max (L.) Merr.] is a primary source of vegetable oil and protein. We report a landscape analysis of genome-wide genetic variation and an association study of major domestication and agronomic traits in soybean. A total of 106 soybean genomes representing wild, landraces, and elite lines were re-sequenced at an average of 17x depth with a 97.5% coverage. Over 10 million high-quality SNPs were discovered, and 35.34% of these have not been previously reported. Additionally, 159 putative domestication sweeps were identified, which includes 54.34 Mbp (4.9%) and 4,414 genes; 146 regions were involved in artificial selection during domestication. A genome-wide association study of major traits including oil and protein content, salinity, and domestication traits resulted in the discovery of novel alleles. Genomic information from this study provides a valuable resource for understanding soybean genome structure and evolution, and can also facilitate trait dissection leading to sequencing-based molecular breeding. PMID:27029319

  17. Landscape of genomic diversity and trait discovery in soybean.

    PubMed

    Valliyodan, Babu; Dan Qiu; Patil, Gunvant; Zeng, Peng; Huang, Jiaying; Dai, Lu; Chen, Chengxuan; Li, Yanjun; Joshi, Trupti; Song, Li; Vuong, Tri D; Musket, Theresa A; Xu, Dong; Shannon, J Grover; Shifeng, Cheng; Liu, Xin; Nguyen, Henry T

    2016-01-01

    Cultivated soybean [Glycine max (L.) Merr.] is a primary source of vegetable oil and protein. We report a landscape analysis of genome-wide genetic variation and an association study of major domestication and agronomic traits in soybean. A total of 106 soybean genomes representing wild, landraces, and elite lines were re-sequenced at an average of 17x depth with a 97.5% coverage. Over 10 million high-quality SNPs were discovered, and 35.34% of these have not been previously reported. Additionally, 159 putative domestication sweeps were identified, which includes 54.34 Mbp (4.9%) and 4,414 genes; 146 regions were involved in artificial selection during domestication. A genome-wide association study of major traits including oil and protein content, salinity, and domestication traits resulted in the discovery of novel alleles. Genomic information from this study provides a valuable resource for understanding soybean genome structure and evolution, and can also facilitate trait dissection leading to sequencing-based molecular breeding. PMID:27029319

  18. Comparative Genomics Provides Insight into the Diversity of the Attaching and Effacing Escherichia coli Virulence Plasmids

    PubMed Central

    Hazen, Tracy H.; Kaper, James B.; Nataro, James P.

    2015-01-01

    Attaching and effacing Escherichia coli (AEEC) strains are a genomically diverse group of diarrheagenic E. coli strains that are characterized by the presence of the locus of enterocyte effacement (LEE) genomic island, which encodes a type III secretion system that is essential to virulence. AEEC strains can be further classified as either enterohemorrhagic E. coli (EHEC), typical enteropathogenic E. coli (EPEC), or atypical EPEC, depending on the presence or absence of the Shiga toxin genes or bundle-forming pilus (BFP) genes. Recent AEEC genomic studies have focused on the diversity of the core genome, and less is known regarding the genetic diversity and relatedness of AEEC plasmids. Comparative genomic analyses in this study demonstrated genetic similarity among AEEC plasmid genes involved in plasmid replication conjugative transfer and maintenance, while the remainder of the plasmids had sequence variability. Investigation of the EPEC adherence factor (EAF) plasmids, which carry the BFP genes, demonstrated significant plasmid diversity even among isolates within the same phylogenomic lineage, suggesting that these EAF-like plasmids have undergone genetic modifications or have been lost and acquired multiple times. Global transcriptional analyses of the EPEC prototype isolate E2348/69 and two EAF plasmid mutants of this isolate demonstrated that the plasmid genes influence the expression of a number of chromosomal genes in addition to the LEE. This suggests that the genetic diversity of the EAF plasmids could contribute to differences in the global virulence regulons of EPEC isolates. PMID:26238712

  19. Diversity of Genome Structure in Salmonella enterica Serovar Typhi Populations

    PubMed Central

    Kothapalli, Sushma; Nair, Satheesh; Alokam, Suneetha; Pang, Tikki; Khakhria, Rasik; Woodward, David; Johnson, Wendy; Stocker, Bruce A. D.; Sanderson, Kenneth E.; Liu, Shu-Lin

    2005-01-01

    The genomes of most strains of Salmonella and Escherichia coli are highly conserved. In contrast, all 136 wild-type strains of Salmonella enterica serovar Typhi analyzed by partial digestion with I-CeuI (an endonuclease which cuts within the rrn operons) and pulsed-field gel electrophoresis and by PCR have rearrangements due to homologous recombination between the rrn operons leading to inversions and translocations. Recombination between rrn operons in culture is known to be equally frequent in S. enterica serovar Typhi and S. enterica serovar Typhimurium; thus, the recombinants in S. enterica serovar Typhi, but not those in S. enterica serovar Typhimurium, are able to survive in nature. However, even in S. enterica serovar Typhi the need for genome balance and the need for gene dosage impose limits on rearrangements. Of 100 strains of genome types 1 to 6, 72 were only 25.5 kb off genome balance (the relative lengths of the replichores during bidirectional replication from oriC to the termination of replication [Ter]), while 28 strains were less balanced (41 kb off balance), indicating that the survival of the best-balanced strains was greater. In addition, the need for appropriate gene dosage apparently selected against rearrangements which moved genes from their accustomed distance from oriC. Although rearrangements involving the seven rrn operons are very common in S. enterica serovar Typhi, other duplicated regions, such as the 25 IS200 elements, are very rarely involved in rearrangements. Large deletions and insertions in the genome are uncommon, except for deletions of Salmonella pathogenicity island 7 (usually 134 kb) from fragment I-CeuI-G and 40-kb insertions, possibly a prophage, in fragment I-CeuI-E. The phage types were determined, and the origins of the phage types appeared to be independent of the origins of the genome types. PMID:15805510
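    The genome-balance figures in the abstract above can be illustrated with a small calculation. The following is a minimal sketch, not the authors' method: it assumes that "off balance" means half the difference between the two replichore lengths (the distance Ter would have to shift for both arms to be equal), and the genome length and oriC/Ter coordinates are hypothetical values chosen only for illustration.

```python
def replichore_lengths(genome_len_kb: float, ori_kb: float, ter_kb: float):
    """Return the lengths (kb) of the two replichores of a circular chromosome.

    One replichore runs clockwise from oriC to Ter; the other is the
    remainder of the circle.
    """
    clockwise = (ter_kb - ori_kb) % genome_len_kb
    return clockwise, genome_len_kb - clockwise


def off_balance_kb(genome_len_kb: float, ori_kb: float, ter_kb: float) -> float:
    """Assumed definition: half the difference between replichore lengths."""
    cw, ccw = replichore_lengths(genome_len_kb, ori_kb, ter_kb)
    return abs(cw - ccw) / 2


# Hypothetical 4,800-kb chromosome with oriC at 0 kb and Ter at 2,425.5 kb.
print(off_balance_kb(4800.0, 0.0, 2425.5))
```

    Under this assumed definition, a rearrangement that moves Ter relative to oriC changes the value directly, which is why inversions between rrn operons can be ranked by how far they push a strain from balance.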

  20. Genomic diversity of human papillomavirus genotype 53 in an ethnogeographically closed cohort of white European women.

    PubMed

    Kocjan, Bostjan J; Seme, Katja; Mocilnik, Tina; Jancar, Nina; Vrtacnik-Bokal, Eda; Poljak, Mario

    2007-04-01

    Human papillomavirus (HPV) genotype 53 is classified taxonomically in alpha HPV genus-species 6, together with HPV-30, HPV-56, and HPV-66 and is considered to be one of three "probable high-risk" HPV genotypes. Recent worldwide comparison of 44 isolates of HPV-53 showed the existence of nine long control region (LCR) genomic variants, which formed a phylogenetic tree with two deep dichotomic branches. In order to investigate further the genomic diversity of HPV-53, a total of 94 isolates of HPV-53 obtained from an ethnogeographically closed cohort of 70 white European women was analyzed. The identification and characterization of HPV-53 genomic variants was based on analysis of three different HPV genomic regions: LCR, E6 and E7. A higher genomic diversity of HPV-53 was identified in the ethnogeographically closed cohort of white European women than has been reported previously on isolates collected worldwide. Altogether, 19 HPV-53 genomic variants, composed of 13 LCR, 13 E6, and 5 E7 genomic variants, were identified. Eleven out of 13 LCR, all E6, and four out of five E7 genomic variants were described for the first time. The present study confirmed dichotomic phylogeny of HPV-53 described previously and, in addition, showed for the first time that after a dichotomic split, both groups of HPV-53 genomic variants formed star-like phylogenetic clusters. In women with persistent HPV-53 infection, HPV-53 genomic variants remained unchanged for up to 51 months. In rare cases, infection with multiple HPV-53 genomic variants is possible. Taking into account the results of this and previous studies, at least 26 different HPV-53 genomic variants exist today. PMID:17311338

  1. Whole-Genome Yersinia sp. Assemblies from 10 Diverse Strains.

    PubMed

    Daligault, H E; Davenport, K W; Minogue, T D; Bishop-Lilly, K A; Broomall, S M; Bruce, D C; Chain, P S; Coyne, S R; Frey, K G; Gibbons, H S; Jaissle, J; Koroleva, G I; Ladner, J T; Lo, C-C; Munk, C; Palacios, G F; Redden, C L; Rosenzweig, C N; Scholz, M B; Johnson, S L

    2014-01-01

    Yersinia spp. are animal pathogens, some of which cause human disease. We sequenced 10 Yersinia isolates (from six species: Yersinia enterocolitica, Y. frederiksenii, Y. kristensenii, Y. pestis, Y. pseudotuberculosis, and Y. ruckeri) to high-quality draft or complete status. The genomes range in size from 3.77 to 4.94 Mbp. PMID:25342679

  2. Whole-Genome Yersinia sp. Assemblies from 10 Diverse Strains

    PubMed Central

    Daligault, H. E.; Davenport, K. W.; Minogue, T. D.; Bishop-Lilly, K. A.; Broomall, S. M.; Bruce, D. C.; Chain, P. S.; Coyne, S. R.; Frey, K. G.; Gibbons, H. S.; Jaissle, J.; Koroleva, G. I.; Ladner, J. T.; Lo, C.-C.; Munk, C.; Palacios, G. F.; Redden, C. L.; Rosenzweig, C. N.; Scholz, M. B.; Johnson, S. L.

    2014-01-01

    Yersinia spp. are animal pathogens, some of which cause human disease. We sequenced 10 Yersinia isolates (from six species: Yersinia enterocolitica, Y. frederiksenii, Y. kristensenii, Y. pestis, Y. pseudotuberculosis, and Y. ruckeri) to high-quality draft or complete status. The genomes range in size from 3.77 to 4.94 Mbp. PMID:25342679

  3. Genome diversity of Pseudomonas aeruginosa PAO1 laboratory strains.

    PubMed

    Klockgether, Jens; Munder, Antje; Neugebauer, Jens; Davenport, Colin F; Stanke, Frauke; Larbig, Karen D; Heeb, Stephan; Schöck, Ulrike; Pohl, Thomas M; Wiehlmann, Lutz; Tümmler, Burkhard

    2010-02-01

    Pseudomonas aeruginosa PAO1 is the most commonly used strain for research on this ubiquitous and metabolically versatile opportunistic pathogen. Strain PAO1, a derivative of the original Australian PAO isolate, has been distributed worldwide to laboratories and strain collections. Over decades discordant phenotypes of PAO1 sublines have emerged. Taking the existing PAO1-UW genome sequence (named after the University of Washington, which led the sequencing project) as a blueprint, the genome sequences of reference strains MPAO1 and PAO1-DSM (stored at the German Collection for Microorganisms and Cell Cultures [DSMZ]) were resolved by physical mapping and deep short read sequencing-by-synthesis. MPAO1 has been the source of near-saturation libraries of transposon insertion mutants, and PAO1-DSM is identical in its SpeI-DpnI restriction map with the original isolate. The major genomic differences of MPAO1 and PAO1-DSM in comparison to PAO1-UW are the lack of a large inversion, a duplication of a mobile 12-kb prophage region carrying a distinct integrase and protein phosphatases or kinases, deletions of 3 to 1,006 bp in size, and at least 39 single-nucleotide substitutions, 17 of which affect protein sequences. The PAO1 sublines differed in their ability to cope with nutrient limitation and their virulence in an acute murine airway infection model. Subline PAO1-DSM outnumbered the two other sublines in late stationary growth phase. In conclusion, P. aeruginosa PAO1 shows an ongoing microevolution of genotype and phenotype that jeopardizes the reproducibility of research. High-throughput genome resequencing will resolve more cases and could become a proper quality control for strain collections. PMID:20023018

  4. Verticillium comparative genomics--understanding pathogenicity and diversity.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Verticillium dahliae is the primary causal agent of Verticillium wilt that causes billions of dollars in annual losses worldwide. This soil-borne fungal pathogen exhibits extraordinary genetic plasticity, capable of colonizing a broad range of hosts in diverse ecological niches. Moreover, V. dahlia...

  5. First genomic survey of human skin fungal diversity

    Cancer.gov

    Fungal infections of the skin affect 29 million people in the United States. In the first study of human fungal skin diversity, National Institutes of Health researchers sequenced the DNA of fungi that thrive at different skin sites of healthy adults to d

  6. Strong links between genomic and anatomical diversity in both mammalian olfactory chemosensory systems.

    PubMed

    Garrett, Eva C; Steiper, Michael E

    2014-05-22

    Mammalian olfaction comprises two chemosensory systems: the odorant-detecting main olfactory system (MOS) and the pheromone-detecting vomeronasal system (VNS). Mammals are diverse in their anatomical and genomic emphases on olfactory chemosensation, including the loss or reduction of these systems in some orders. Despite qualitative evidence linking the genomic evolution of the olfactory systems to specific functions and phenotypes, little work has quantitatively tested whether the genomic aspects of the mammalian olfactory chemosensory systems are correlated to anatomical diversity. We show that the genomic and anatomical variation in these systems is tightly linked in both the VNS and the MOS, though the signature of selection is different in each system. Specifically, the MOS appears to vary based on absolute organ and gene family size while the VNS appears to vary according to the relative proportion of functional genes and relative anatomical size and complexity. Furthermore, there is little evidence that these two systems are evolving in a linked fashion. The relationships between genomic and anatomical diversity strongly support a role for natural selection in shaping both the anatomical and genomic evolution of the olfactory chemosensory systems in mammals. PMID:24718758

  7. Diversity of 5S rRNA genes within individual prokaryotic genomes

    PubMed Central

    Pei, Anna; Li, Hongru; Oberdorf, William E; Alekseyenko, Alexander V.; Parsons, Tamasha; Yang, Liying; Gerz, Erika A.; Lee, Peng; Xiang, Charlie; Nossa, Carlos W.; Pei, Zhiheng

    2012-01-01

    We examined intragenomic variation of paralogous 5S rRNA genes to evaluate the concept of ribosomal constraints. In a dataset containing 1168 genomes from 779 unique species, 96 species exhibited >3% diversity. Twenty-seven species with >10% diversity contained a total of 421 mismatches between all pairs of the most dissimilar copies of 5S rRNA genes. The large majority (401 of 421) of the diversified positions were conserved at the secondary structure level. The high diversity was associated with partial rRNA operon, split operon, or spacer length-related divergence. In total, these findings indicated that there were tight ribosomal constraints on paralogous 5S rRNA genes in a genome despite the high degree of diversity at the primary structure level. There is supplementary material. PMID:22765222
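    The percent-diversity thresholds quoted above (>3%, >10%) refer to differences between paralogous gene copies within one genome. A minimal sketch of one way such a figure could be computed follows. It assumes the copies are already aligned to equal length and takes diversity as the percentage of mismatched positions between the most dissimilar pair; the abstract does not state the exact procedure, and the sequences below are made up for illustration.

```python
from itertools import combinations


def percent_mismatch(a: str, b: str) -> float:
    """Percentage of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * mismatches / len(a)


def max_intragenomic_diversity(copies: list[str]) -> float:
    """Largest pairwise percent mismatch among gene copies from one genome."""
    if len(copies) < 2:
        return 0.0  # a single copy has nothing to diverge from
    return max(percent_mismatch(a, b) for a, b in combinations(copies, 2))


# Three made-up, pre-aligned copies; the third differs at two of ten positions.
copies = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTTC"]
print(max_intragenomic_diversity(copies))  # 20.0
```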

  8. Genetic diversity in the modern horse illustrated from genome-wide SNP data.

    PubMed

    Petersen, Jessica L; Mickelson, James R; Cothran, E Gus; Andersson, Lisa S; Axelsson, Jeanette; Bailey, Ernie; Bannasch, Danika; Binns, Matthew M; Borges, Alexandre S; Brama, Pieter; da Câmara Machado, Artur; Distl, Ottmar; Felicetti, Michela; Fox-Clipsham, Laura; Graves, Kathryn T; Guérin, Gérard; Haase, Bianca; Hasegawa, Telhisa; Hemmann, Karin; Hill, Emmeline W; Leeb, Tosso; Lindgren, Gabriella; Lohi, Hannes; Lopes, Maria Susana; McGivney, Beatrice A; Mikko, Sofia; Orr, Nicholas; Penedo, M Cecilia T; Piercy, Richard J; Raekallio, Marja; Rieder, Stefan; Røed, Knut H; Silvestrelli, Maurizio; Swinburne, June; Tozaki, Teruaki; Vaudin, Mark; Wade, Claire M; McCue, Molly E

    2013-01-01

    Horses were domesticated from the Eurasian steppes 5,000-6,000 years ago. Since then, the use of horses for transportation, warfare, and agriculture, as well as selection for desired traits and fitness, has resulted in diverse populations distributed across the world, many of which have become or are in the process of becoming formally organized into closed, breeding populations (breeds). This report describes the use of a genome-wide set of autosomal SNPs and 814 horses from 36 breeds to provide the first detailed description of equine breed diversity. F(ST) calculations, parsimony, and distance analysis demonstrated relationships among the breeds that largely reflect geographic origins and known breed histories. Low levels of population divergence were observed between breeds that are relatively early on in the process of breed development, and between those with high levels of within-breed diversity, whether due to large population size, ongoing outcrossing, or large within-breed phenotypic diversity. Populations with low within-breed diversity included those which have experienced population bottlenecks, have been under intense selective pressure, or are closed populations with long breed histories. These results provide new insights into the relationships among and the diversity within breeds of horses. In addition these results will facilitate future genome-wide association studies and investigations into genomic targets of selection. PMID:23383025
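    The F(ST) calculations mentioned above measure how much of the total genetic variation lies between populations. As a rough illustration only, the sketch below computes Wright's F(ST) for a single biallelic SNP as (H_T - H_S) / H_T from per-breed allele frequencies, assuming equally weighted populations. The abstract does not say which estimator the study used, and the frequencies shown are hypothetical.

```python
def fst_biallelic(freqs: list[float]) -> float:
    """Wright's F_ST for one biallelic SNP, populations weighted equally.

    freqs holds the frequency of one allele in each population.
    H_T is the expected heterozygosity of the pooled population and
    H_S is the mean expected heterozygosity within populations.
    """
    if not freqs:
        raise ValueError("at least one population is required")
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # allele fixed everywhere: no variation to partition
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two breeds.
print(fst_biallelic([0.5, 0.5]))  # identical breeds
print(fst_biallelic([0.0, 1.0]))  # breeds fixed for different alleles
```

    Values near 0 correspond to the low divergence described for recently formed or outcrossed breeds, and values approaching 1 to breeds fixed for different alleles.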

  9. Genetic Diversity in the Modern Horse Illustrated from Genome-Wide SNP Data

    PubMed Central

    Petersen, Jessica L.; Mickelson, James R.; Cothran, E. Gus; Andersson, Lisa S.; Axelsson, Jeanette; Bailey, Ernie; Bannasch, Danika; Binns, Matthew M.; Borges, Alexandre S.; Brama, Pieter; da Câmara Machado, Artur; Distl, Ottmar; Felicetti, Michela; Fox-Clipsham, Laura; Graves, Kathryn T.; Guérin, Gérard; Haase, Bianca; Hasegawa, Telhisa; Hemmann, Karin; Hill, Emmeline W.; Leeb, Tosso; Lindgren, Gabriella; Lohi, Hannes; Lopes, Maria Susana; McGivney, Beatrice A.; Mikko, Sofia; Orr, Nicholas; Penedo, M. Cecilia T.; Piercy, Richard J.; Raekallio, Marja; Rieder, Stefan; Røed, Knut H.; Silvestrelli, Maurizio; Swinburne, June; Tozaki, Teruaki; Vaudin, Mark; Wade, Claire M.; McCue, Molly E.

    2013-01-01

    Horses were domesticated from the Eurasian steppes 5,000–6,000 years ago. Since then, the use of horses for transportation, warfare, and agriculture, as well as selection for desired traits and fitness, has resulted in diverse populations distributed across the world, many of which have become or are in the process of becoming formally organized into closed, breeding populations (breeds). This report describes the use of a genome-wide set of autosomal SNPs and 814 horses from 36 breeds to provide the first detailed description of equine breed diversity. FST calculations, parsimony, and distance analysis demonstrated relationships among the breeds that largely reflect geographic origins and known breed histories. Low levels of population divergence were observed between breeds that are relatively early on in the process of breed development, and between those with high levels of within-breed diversity, whether due to large population size, ongoing outcrossing, or large within-breed phenotypic diversity. Populations with low within-breed diversity included those which have experienced population bottlenecks, have been under intense selective pressure, or are closed populations with long breed histories. These results provide new insights into the relationships among and the diversity within breeds of horses. In addition these results will facilitate future genome-wide association studies and investigations into genomic targets of selection. PMID:23383025
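The F_ST calculations mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single biallelic SNP, two equally weighted populations, and Wright's heterozygosity-based definition F_ST = (H_T - H_S) / H_T; the record does not state which estimator the authors used, and the function name `fst_two_populations` is hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic SNP.

    p1, p2 are the frequencies of the same allele in two populations
    (assumption: equal population weights, no sample-size correction).
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity of the pooled population.
    h_t = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_t == 0:
        # The SNP is monomorphic overall, so there is no divergence to measure.
        return 0.0
    return (h_t - h_s) / h_t
```

Identical allele frequencies give 0 and fixed differences give 1; genome-wide values are usually summarized across many SNPs rather than read from a single locus.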

  10. Complete genome sequencing and comparative genomic analysis of functionally diverse Lysinibacillus sphaericus III(3)7.

    PubMed

    Rey, Andrés; Silva-Quintero, Laura; Dussán, Jenny

    2016-09-01

    Lysinibacillus sphaericus III(3)7 is a native Colombian strain, the first one isolated from soil samples. This strain has shown high levels of pathogenic activity against Culex quinquefasciatus larvae in laboratory assays compared to other members of the same species. Using Pacific Biosciences sequencing technology we sequenced, annotated (de novo) and described the genome of strain III(3)7, achieving a complete genome sequence status. We then performed a comparative analysis between the newly sequenced genome and the ones previously reported for Colombian isolates L. sphaericus OT4b.31, CBAM5 and OT4b.25, with the inclusion of L. sphaericus C3-41 that has been used as a reference genome for most previous genome sequencing projects. We concluded that L. sphaericus III(3)7 is highly similar to strain OT4b.25 and shares high levels of synteny with isolates CBAM5 and C3-41. PMID:27419068

  11. Tomato Fruits Show Wide Phenomic Diversity but Fruit Developmental Genes Show Low Genomic Diversity

    PubMed Central

    Mohan, Vijee; Gupta, Soni; Thomas, Sherinmol; Mickey, Hanjabam; Charakana, Chaitanya; Chauhan, Vineeta Singh; Sharma, Kapil; Kumar, Rakesh; Tyagi, Kamal; Sarma, Supriya; Gupta, Suresh Kumar; Kilambi, Himabindu Vasuki; Nongmaithem, Sapana; Kumari, Alka; Gupta, Prateek; Sreelakshmi, Yellamaraju; Sharma, Rameshwar

    2016-01-01

    Domestication of tomato has resulted in large diversity in fruit phenotypes. An intensive phenotyping of 127 tomato accessions from 20 countries revealed extensive morphological diversity in fruit traits. The diversity in fruit traits clustered the accessions into nine classes and identified certain promising lines having desirable traits pertaining to total soluble salts (TSS), carotenoids, ripening index, weight and shape. Factor analysis of the morphometric data from Tomato Analyzer showed that the fruit shape is a complex trait shared by several factors. The 100% variance between round and flat fruit shapes was explained by one discriminant function having a canonical correlation of 0.874 by stepwise discriminant analysis. A set of 10 genes (ACS2, COP1, CYC-B, RIN, MSH2, NAC-NOR, PHOT1, PHYA, PHYB and PSY1) involved in various plant developmental processes were screened for SNP polymorphism by EcoTILLING. The genetic diversity in these genes revealed a total of 36 non-synonymous and 18 synonymous changes leading to the identification of 28 haplotypes. The average frequency of polymorphism across the genes was 0.038/Kb. A significantly negative Tajima's D statistic in two of the genes, ACS2 and PHOT1, indicated the presence of rare alleles at low frequency. Our study indicates that while there is low polymorphic diversity in the genes regulating plant development, the population shows wider phenotype diversity. Nonetheless, the morphological and genetic diversity of the present collection can be further exploited as a potential resource in future. PMID:27077652
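The Tajima's D statistic cited in this abstract compares two estimates of diversity: mean pairwise differences and the number of segregating sites. Below is a minimal sketch of the standard formula (Tajima 1989), assuming aligned, equal-length haplotype strings with no missing data. The function name `tajimas_d` and the input layout are hypothetical and not taken from the study.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs: list[str]) -> float:
    """Tajima's D for aligned haplotype sequences (assumption: no gaps)."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) sites.
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        return float("nan")  # no polymorphism, D is undefined

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))
```

An excess of rare variants (for example, every polymorphic site being a singleton) pulls the pairwise term below S/a1 and makes D negative, which matches the interpretation given for ACS2 and PHOT1.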

  12. Artificial selection with traditional or genomic relationships: consequences in coancestry and genetic diversity

    PubMed Central

    Rodríguez-Ramilo, Silvia Teresa; García-Cortés, Luis Alberto; de Cara, María Ángeles Rodríguez

    2015-01-01

    Estimated breeding values (EBVs) are traditionally obtained from pedigree information. However, EBVs from high-density genotypes can have higher accuracy than EBVs from pedigree information. At the same time, it has been shown that EBVs from genomic data lead to lower increases in inbreeding compared with traditional selection based on genealogies. Here we compare the performance of BLUP selection based on genealogical coancestry with that of three different genome-based coancestry estimates: (1) an estimate based on shared segments of homozygosity, (2) an approach based on SNP-by-SNP count corrected by allelic frequencies, and (3) the identity by state methodology. We evaluate the effect of different population sizes, different numbers of genomic markers, and several heritability values for a quantitative trait. The performance of the different measures of coancestry in BLUP is evaluated in the true breeding values after truncation selection and also in terms of coancestry and diversity maintained. Accordingly, cross-performances were also carried out, that is, how prediction based on genealogical records impacts the three other measures of coancestry and inbreeding, and vice versa. Our results show that the genetic gains are very similar for all four coancestries, but the genomic-based methods are superior to using genealogical coancestries in terms of maintaining diversity measured as observed heterozygosity. Furthermore, the measure of coancestry based on shared segments of the genome seems to provide slightly better results in some scenarios, and the increase in inbreeding and loss in diversity is only slightly larger than the other genomic selection methods in those scenarios. Our results shed light on genomic selection vs. traditional genealogical-based BLUP and make the case to manage the population variability using genomic information to preserve the future success of selection programmes. PMID:25904933
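As an illustration of the identity-by-state approach listed as estimate (3), the sketch below computes a molecular coancestry between two individuals. It assumes biallelic SNP genotypes coded as 0, 1 or 2 copies of a reference allele and uses one common definition: the probability that two alleles drawn at random, one from each individual, are identical by state, averaged over loci. The abstract does not give the exact formula used, and `ibs_coancestry` is a hypothetical name.

```python
def ibs_coancestry(geno_i: list[int], geno_j: list[int]) -> float:
    """Identity-by-state coancestry between two individuals.

    Genotypes are counts (0, 1, 2) of the reference allele at each SNP.
    """
    if not geno_i or len(geno_i) != len(geno_j):
        raise ValueError("genotype vectors must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(geno_i, geno_j):
        p, q = a / 2, b / 2  # within-individual reference allele frequencies
        # Probability that both sampled alleles are reference, or both alternate.
        total += p * q + (1 - p) * (1 - q)
    return total / len(geno_i)
```

Passing the same genotype vector twice gives the self-coancestry, which is 1.0 for a fully homozygous individual and 0.5 for a fully heterozygous one.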

  13. Lactobacillus paracasei comparative genomics: towards species pan-genome definition and exploitation of diversity.

    PubMed

    Smokvina, Tamara; Wels, Michiel; Polka, Justyna; Chervaux, Christian; Brisse, Sylvain; Boekhorst, Jos; van Hylckama Vlieg, Johan E T; Siezen, Roland J

    2013-01-01

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its "pan-genome". We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800-3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25 and 53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison were used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to link

  14. Expanding the Diversity of Mycobacteriophages: Insights into Genome Architecture and Evolution

    PubMed Central

    Pope, Welkin H.; Jacobs-Sera, Deborah; Russell, Daniel A.; Peebles, Craig L.; Al-Atrache, Zein; Alcoser, Turi A.; Alexander, Lisa M.; Alfano, Matthew B.; Alford, Samantha T.; Amy, Nichols E.; Anderson, Marie D.; Anderson, Alexander G.; Ang, Andrew A. S.; Ares, Manuel; Barber, Amanda J.; Barker, Lucia P.; Barrett, Jonathan M.; Barshop, William D.; Bauerle, Cynthia M.; Bayles, Ian M.; Belfield, Katherine L.; Best, Aaron A.; Borjon, Agustin; Bowman, Charles A.; Boyer, Christine A.; Bradley, Kevin W.; Bradley, Victoria A.; Broadway, Lauren N.; Budwal, Keshav; Busby, Kayla N.; Campbell, Ian W.; Campbell, Anne M.; Carey, Alyssa; Caruso, Steven M.; Chew, Rebekah D.; Cockburn, Chelsea L.; Cohen, Lianne B.; Corajod, Jeffrey M.; Cresawn, Steven G.; Davis, Kimberly R.; Deng, Lisa; Denver, Dee R.; Dixon, Breyon R.; Ekram, Sahrish; Elgin, Sarah C. R.; Engelsen, Angela E.; English, Belle E. V.; Erb, Marcella L.; Estrada, Crystal; Filliger, Laura Z.; Findley, Ann M.; Forbes, Lauren; Forsyth, Mark H.; Fox, Tyler M.; Fritz, Melissa J.; Garcia, Roberto; George, Zindzi D.; Georges, Anne E.; Gissendanner, Christopher R.; Goff, Shannon; Goldstein, Rebecca; Gordon, Kobie C.; Green, Russell D.; Guerra, Stephanie L.; Guiney-Olsen, Krysta R.; Guiza, Bridget G.; Haghighat, Leila; Hagopian, Garrett V.; Harmon, Catherine J.; Harmson, Jeremy S.; Hartzog, Grant A.; Harvey, Samuel E.; He, Siping; He, Kevin J.; Healy, Kaitlin E.; Higinbotham, Ellen R.; Hildebrandt, Erin N.; Ho, Jason H.; Hogan, Gina M.; Hohenstein, Victoria G.; Holz, Nathan A.; Huang, Vincent J.; Hufford, Ericka L.; Hynes, Peter M.; Jackson, Arrykka S.; Jansen, Erica C.; Jarvik, Jonathan; Jasinto, Paul G.; Jordan, Tuajuanda C.; Kasza, Tomas; Katelyn, Murray A.; Kelsey, Jessica S.; Kerrigan, Larisa A.; Khaw, Daryl; Kim, Junghee; Knutter, Justin Z.; Ko, Ching-Chung; Larkin, Gail V.; Laroche, Jennifer R.; Latif, Asma; Leuba, Kohana D.; Leuba, Sequoia I.; Lewis, Lynn O.; Loesser-Casey, Kathryn E.; Long, Courtney A.; Lopez, A. 
Javier; Lowery, Nicholas; Lu, Tina Q.; Mac, Victor; Masters, Isaac R.; McCloud, Jazmyn J.; McDonough, Molly J.; Medenbach, Andrew J.; Menon, Anjali; Miller, Rachel; Morgan, Brandon K.; Ng, Patrick C.; Nguyen, Elvis; Nguyen, Katrina T.; Nguyen, Emilie T.; Nicholson, Kaylee M.; Parnell, Lindsay A.; Peirce, Caitlin E.; Perz, Allison M.; Peterson, Luke J.; Pferdehirt, Rachel E.; Philip, Seegren V.; Pogliano, Kit; Pogliano, Joe; Polley, Tamsen; Puopolo, Erica J.; Rabinowitz, Hannah S.; Resiss, Michael J.; Rhyan, Corwin N.; Robinson, Yetta M.; Rodriguez, Lauren L.; Rose, Andrew C.; Rubin, Jeffrey D.; Ruby, Jessica A.; Saha, Margaret S.; Sandoz, James W.; Savitskaya, Judith; Schipper, Dale J.; Schnitzler, Christine E.; Schott, Amanda R.; Segal, J. Bradley; Shaffer, Christopher D.; Sheldon, Kathryn E.; Shepard, Erica M.; Shepardson, Jonathan W.; Shroff, Madav K.; Simmons, Jessica M.; Simms, Erika F.; Simpson, Brandy M.; Sinclair, Kathryn M.; Sjoholm, Robert L.; Slette, Ingrid J.; Spaulding, Blaire C.; Straub, Clark L.; Stukey, Joseph; Sughrue, Trevor; Tang, Tin-Yun; Tatyana, Lyons M.; Taylor, Stephen B.; Taylor, Barbara J.; Temple, Louise M.; Thompson, Jasper V.; Tokarz, Michael P.; Trapani, Stephanie E.; Troum, Alexander P.; Tsay, Jonathan; Tubbs, Anthony T.; Walton, Jillian M.; Wang, Danielle H.; Wang, Hannah; Warner, John R.; Weisser, Emilie G.; Wendler, Samantha C.; Weston-Hafer, Kathleen A.; Whelan, Hilary M.; Williamson, Kurt E.; Willis, Angelica N.; Wirtshafter, Hannah S.; Wong, Theresa W.; Wu, Phillip; Yang, Yun jeong; Yee, Brandon C.; Zaidins, David A.; Zhang, Bo; Zúniga, Melina Y.; Hendrix, Roger W.; Hatfull, Graham F.

    2011-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts such as Mycobacterium smegmatis and Mycobacterium tuberculosis. All mycobacteriophages characterized to date are dsDNA tailed phages, and have either siphoviral or myoviral morphotypes. However, their genetic diversity is considerable, and although sixty-two genomes have been sequenced and comparatively analyzed, these likely represent only a small portion of the diversity of the mycobacteriophage population at large. Here we report the isolation, sequencing and comparative genomic analysis of 18 new mycobacteriophages isolated from geographically distinct locations within the United States. Although no clear correlation between location and genome type can be discerned, these genomes expand our knowledge of mycobacteriophage diversity and enhance our understanding of the roles of mobile elements in viral evolution. Expansion of the number of mycobacteriophages grouped within Cluster A provides insights into the basis of immune specificity in these temperate phages, and we also describe a novel example of apparent immunity theft. The isolation and genomic analysis of bacteriophages by freshman college students provides an example of an authentic research experience for novice scientists. PMID:21298013

  15. A genome-wide SNP panel for genetic diversity, mapping and breeding studies in rice

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A genome-wide SNP resource was developed for rice using the GoldenGate assay and used to genotype 400 landrace accessions of O. sativa. SNPs were originally discovered using Perlegen re-sequencing technology in 20 diverse landraces of O. sativa as part of OryzaSNP project (http://irfgc.irri.org). An...

  16. GENOMIC DIVERSITY OF STREPTOCOCCUS AGALACTIAE ISOLATES FROM MULTIPLE HOSTS AND THEIR INFECTIVITY IN NILE TILAPIA

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Our laboratory has conducted multiple studies to investigate the genomic diversity of GBS isolates from different phylogenetic hosts and geographical regions. We have examined fish and dolphin GBS strains using phenotypic, serological typing and multilocus sequence typing (MLST) techniques and comp...

  17. Draft Genome Sequences of a Phylogenetically Diverse Suite of Pseudomonas syringae Strains from Multiple Source Populations

    PubMed Central

    Yourstone, Scott; Lind, Abigail; Guilbaud, Caroline; Sands, David C.; Jones, Corbin D.; Morris, Cindy E.; Dangl, Jeffrey L.

    2014-01-01

    Here, we report the draft genome sequences for 7 phylogenetically diverse isolates of Pseudomonas syringae, obtained from numerous environmental sources and geographically proximate crop species. Overall, these sequences provide a wealth of information about the differences (or lack thereof) between isolates from disease outbreaks and those from other sources. PMID:24459267

  18. The landscape of genomic imprinting across diverse adult human tissues.

    PubMed

    Baran, Yael; Subramaniam, Meena; Biton, Anne; Tukiainen, Taru; Tsang, Emily K; Rivas, Manuel A; Pirinen, Matti; Gutierrez-Arcelus, Maria; Smith, Kevin S; Kukurba, Kim R; Zhang, Rui; Eng, Celeste; Torgerson, Dara G; Urbanek, Cydney; Li, Jin Billy; Rodriguez-Santana, Jose R; Burchard, Esteban G; Seibold, Max A; MacArthur, Daniel G; Montgomery, Stephen B; Zaitlen, Noah A; Lappalainen, Tuuli

    2015-07-01

    Genomic imprinting is an important regulatory mechanism that silences one of the parental copies of a gene. To systematically characterize this phenomenon, we analyze tissue specificity of imprinting from allelic expression data in 1582 primary tissue samples from 178 individuals from the Genotype-Tissue Expression (GTEx) project. We characterize imprinting in 42 genes, including both novel and previously identified genes. Tissue specificity of imprinting is widespread, and gender-specific effects are revealed in a small number of genes in muscle with stronger imprinting in males. IGF2 shows maternal expression in the brain instead of the canonical paternal expression elsewhere. Imprinting appears to have only a subtle impact on tissue-specific expression levels, with genes lacking a systematic expression difference between tissues with imprinted and biallelic expression. In summary, our systematic characterization of imprinting in adult tissues highlights variation in imprinting between genes, individuals, and tissues. PMID:25953952

  19. The landscape of genomic imprinting across diverse adult human tissues

    PubMed Central

    Baran, Yael; Subramaniam, Meena; Biton, Anne; Tukiainen, Taru; Tsang, Emily K.; Rivas, Manuel A.; Pirinen, Matti; Gutierrez-Arcelus, Maria; Smith, Kevin S.; Kukurba, Kim R.; Zhang, Rui; Eng, Celeste; Torgerson, Dara G.; Urbanek, Cydney; Li, Jin Billy; Rodriguez-Santana, Jose R.; Burchard, Esteban G.; Seibold, Max A.; MacArthur, Daniel G.; Montgomery, Stephen B.; Zaitlen, Noah A.; Lappalainen, Tuuli

    2015-01-01

    Genomic imprinting is an important regulatory mechanism that silences one of the parental copies of a gene. To systematically characterize this phenomenon, we analyze tissue specificity of imprinting from allelic expression data in 1582 primary tissue samples from 178 individuals from the Genotype-Tissue Expression (GTEx) project. We characterize imprinting in 42 genes, including both novel and previously identified genes. Tissue specificity of imprinting is widespread, and gender-specific effects are revealed in a small number of genes in muscle with stronger imprinting in males. IGF2 shows maternal expression in the brain instead of the canonical paternal expression elsewhere. Imprinting appears to have only a subtle impact on tissue-specific expression levels, with genes lacking a systematic expression difference between tissues with imprinted and biallelic expression. In summary, our systematic characterization of imprinting in adult tissues highlights variation in imprinting between genes, individuals, and tissues. PMID:25953952

  20. Intraclonal genome diversity of the major Pseudomonas aeruginosa clones C and PA14.

    PubMed

    Fischer, Sebastian; Klockgether, Jens; Morán Losada, Patricia; Chouvarine, Philippe; Cramer, Nina; Davenport, Colin F; Dethlefsen, Sarah; Dorda, Marie; Goesmann, Alexander; Hilker, Rolf; Mielke, Samira; Schönfelder, Torben; Suerbaum, Sebastian; Türk, Oliver; Woltemate, Sabrina; Wiehlmann, Lutz; Tümmler, Burkhard

    2016-04-01

    Bacterial populations differentiate at the subspecies level into clonal complexes. Intraclonal genome diversity was studied in 100 isolates of the two dominant Pseudomonas aeruginosa clones C and PA14 collected from the inanimate environment, acute and chronic infections. The core genome was highly conserved among clone members with a median pairwise within-clone single nucleotide sequence diversity of 8 × 10^-6 for clone C and 2 × 10^-5 for clone PA14. The composition of the accessory genome was, on the other hand, as variable within the clone as between unrelated clones. Each strain carried a large cargo of unique genes. The two dominant worldwide distributed P. aeruginosa clones combine an almost invariant core with the flexible gain and loss of genetic elements that spread by horizontal transfer. PMID:26711897
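The median pairwise within-clone single nucleotide sequence diversity reported in this abstract can be sketched as follows. The example assumes the core genomes of one clone are already aligned to equal length with no gaps and takes diversity as mismatches per aligned site; the authors' actual pipeline is not described in this record, and `median_pairwise_diversity` is a hypothetical name.

```python
from itertools import combinations
from statistics import median


def median_pairwise_diversity(seqs: list[str]) -> float:
    """Median per-site nucleotide difference over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    # Fraction of mismatching sites for every unordered pair of isolates.
    rates = [
        sum(a != b for a, b in zip(x, y)) / length
        for x, y in combinations(seqs, 2)
    ]
    return median(rates)
```

For scale, a value of 8 × 10^-6 corresponds to 8 differing sites per million aligned core-genome positions between a typical pair of isolates.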

  1. Intraclonal genome diversity of the major Pseudomonas aeruginosa clones C and PA14

    PubMed Central

    Fischer, Sebastian; Klockgether, Jens; Morán Losada, Patricia; Chouvarine, Philippe; Cramer, Nina; Davenport, Colin F.; Dethlefsen, Sarah; Dorda, Marie; Goesmann, Alexander; Hilker, Rolf; Mielke, Samira; Schönfelder, Torben; Suerbaum, Sebastian; Türk, Oliver; Woltemate, Sabrina; Wiehlmann, Lutz

    2016-01-01

    Summary: Bacterial populations differentiate at the subspecies level into clonal complexes. Intraclonal genome diversity was studied in 100 isolates of the two dominant Pseudomonas aeruginosa clones C and PA14 collected from the inanimate environment, acute and chronic infections. The core genome was highly conserved among clone members with a median pairwise within-clone single nucleotide sequence diversity of 8 × 10^-6 for clone C and 2 × 10^-5 for clone PA14. The composition of the accessory genome was, on the other hand, as variable within the clone as between unrelated clones. Each strain carried a large cargo of unique genes. The two dominant worldwide distributed P. aeruginosa clones combine an almost invariant core with the flexible gain and loss of genetic elements that spread by horizontal transfer. PMID:26711897

  2. Lactobacillus paracasei Comparative Genomics: Towards Species Pan-Genome Definition and Exploitation of Diversity

    PubMed Central

    Smokvina, Tamara; Wels, Michiel; Polka, Justyna; Chervaux, Christian; Brisse, Sylvain; Boekhorst, Jos; Vlieg, Johan E. T. van Hylckama; Siezen, Roland J.

    2013-01-01

    Lactobacillus paracasei is a member of the normal human and animal gut microbiota and is used extensively in the food industry in starter cultures for dairy products or as probiotics. With the development of low-cost, high-throughput sequencing techniques it has become feasible to sequence many different strains of one species and to determine its “pan-genome”. We have sequenced the genomes of 34 different L. paracasei strains, and performed a comparative genomics analysis. We analysed genome synteny and content, focussing on the pan-genome, core genome and variable genome. Each genome was shown to contain around 2800–3100 protein-coding genes, and comparative analysis identified over 4200 ortholog groups that comprise the pan-genome of this species, of which about 1800 ortholog groups make up the conserved core. Several factors previously associated with host-microbe interactions such as pili, cell-envelope proteinase, hydrolases p40 and p75 or the capacity to produce short branched-chain fatty acids (bkd operon) are part of the L. paracasei core genome present in all analysed strains. The variome consists mainly of hypothetical proteins, phages, plasmids, transposon/conjugative elements, and known functions such as sugar metabolism, cell-surface proteins, transporters, CRISPR-associated proteins, and EPS biosynthesis proteins. An enormous variety and variability of sugar utilization gene cassettes were identified, with each strain harbouring between 25 and 53 cassettes, reflecting the high adaptability of L. paracasei to different niches. A phylogenomic tree was constructed based on total genome contents, and together with an analysis of horizontal gene transfer events we conclude that evolution of these L. paracasei strains is complex and not always related to niche adaptation. The results of this genome content comparison were used, together with high-throughput growth experiments on various carbohydrates, to perform gene-trait matching analysis, in order to

  3. Genetic Diversity and Reassortment of Hantaan Virus Tripartite RNA Genomes in Nature, the Republic of Korea

    PubMed Central

    Kim, Jeong-Ah; Kim, Won-keun; No, Jin Sun; Lee, Seung-Ho; Lee, Sook-Young; Kim, Ji Hye; Kho, Jeong Hoon; Lee, Daesang; Song, Dong Hyun; Gu, Se Hun; Jeong, Seong Tae; Park, Man-Seong; Kim, Heung-Chul; Klein, Terry A.; Song, Jin-Won

    2016-01-01

    Background: Hantaan virus (HTNV), a negative sense tripartite RNA virus of the Family Bunyaviridae, is the most prevalent hantavirus in the Republic of Korea (ROK). It is the causative agent of Hemorrhagic Fever with Renal Syndrome (HFRS) in humans and is maintained in the striped field mouse, Apodemus agrarius, the primary zoonotic host. Clinical HFRS cases have been reported commonly in HFRS-endemic areas of Gyeonggi province. Recently, the death of a member of the ROK military from Gangwon province due to HFRS prompted an investigation of the epidemiology and distribution of hantaviruses in Gangwon and Gyeonggi provinces that border the demilitarized zone separating North and South Korea. Methodology and Principal Findings: To elucidate the geographic distribution and molecular diversity of HTNV, whole genome sequences of HTNV Large (L), Medium (M), and Small (S) segments were acquired from lung tissues of A. agrarius captured from 2003 to 2014. Consistent with the clinical incidence of HFRS established by the Korea Centers for Disease Control & Prevention (KCDC), the prevalence of HTNV in naturally infected mice in Gangwon province was lower than for Gyeonggi province. Whole genomic sequences of 34 HTNV strains were identified and a phylogenetic analysis showed geographic diversity of the virus in the limited areas. Reassortment analysis first suggested an occurrence of genetic exchange of HTNV genomes in nature, ROK. Conclusion/Significance: This study is the first report to demonstrate the molecular prevalence of HTNV in Gangwon province. Whole genome sequencing of HTNV showed well-supported geographic lineages and the molecular diversity in the northern region of ROK due to a natural reassortment of HTNV genomes. These observations contribute to a better understanding of the genetic diversity and molecular evolution of hantaviruses. Also, the full-length of HTNV tripartite genomes will provide a database for phylogeographic analysis of spatial and temporal

  4. Genome-wide distribution of genetic diversity and linkage disequilibrium in elite sugar beet germplasm

    PubMed Central

    2011-01-01

    Background: Characterization of population structure and genetic diversity of germplasm is essential for the efficient organization and utilization of breeding material. The objectives of this study were to (i) explore the patterns of population structure in the pollen parent heterotic pool using different methods, (ii) investigate the genome-wide distribution of genetic diversity, and (iii) assess the extent and genome-wide distribution of linkage disequilibrium (LD) in elite sugar beet germplasm. Results: A total of 264 and 238 inbred lines from the yield type and sugar type inbreds of the pollen parent heterotic gene pools, respectively, which had been genotyped with 328 SNP markers, were used in this study. Two distinct subgroups were detected based on different statistical methods within the elite sugar beet germplasm set, which was in accordance with its breeding history. MCLUST based on principal components, principal coordinates, or lapvectors had high correspondence with the germplasm type information as well as the assignment by STRUCTURE, which indicated that these methods might be alternatives to STRUCTURE for population structure analysis. Gene diversity and modified Rogers' distance between the examined germplasm types varied considerably across the genome, which might be due to artificial selection. This observation indicates that population genetic approaches could be used to identify candidate genes for the traits under selection. Because r² > 0.8 is required to detect marker-phenotype associations explaining less than 1% of the phenotypic variance, our observation of a low proportion of SNP loci pairs showing such levels of LD suggests that the number of markers has to be dramatically increased for powerful genome-wide association mapping. Conclusions: We provided a genome-wide distribution map of genetic diversity and linkage disequilibrium for the elite sugar beet germplasm, which is useful for the application of genome-wide association

  5. Genomic diversity in myeloproliferative neoplasms: focus on myelofibrosis

    PubMed Central

    2015-01-01

    The classical myeloproliferative neoplasms (MPNs) are a group of clonal diseases comprising essential thrombocythaemia (ET), polycythaemia vera (PV) and primary myelofibrosis (PMF). PMF is the rarest disease subtype and has been challenging to address due to the lack of a specific genetic marker, inadequate risk identification models and a highly variable clinical course. Continuous efforts have, over time, seen the inclusion of cytogenetic information in prognostic scoring models, which has resulted in improved risk stratification models providing further rationale for therapeutic management. Technological advances using single nucleotide polymorphism arrays increased the detection of known and novel MPN-related changes, and variant detection by massively parallel sequencing provided a large-scale screening tool for the multitude of somatic gene mutations that have more recently been described in MPN. Some of these mutations show an association with specific cytogenetic changes or phenotypes. While PMF occurs mainly in adults, it has also been described in paediatric cases and shows distinct histopathological, genetic and clinical features in comparison. This review provides an overview of the genomics landscape of PMF and current developments in MPN therapy. PMID:26835366

  6. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

    PubMed Central

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-01-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which have previously been shown to be laterally transferable among different species, such as restriction-modification system-coding genes. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from the genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of the order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141

  7. Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity.

    PubMed

    Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

    2016-08-01

    Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species, some of which have previously been shown to be laterally transferable among different species, such as restriction-modification system-coding genes. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from the genus Arthrospira unanimously contained the highest rate of repetitive sequence compared with the other species of the order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141

  8. Comparative genomics of Campylobacter concisus isolates reveals genetic diversity and provides insights into disease association

    PubMed Central

    2013-01-01

    Background: In spite of its association with gastroenteritis and inflammatory bowel diseases, the isolation of Campylobacter concisus from both diseased and healthy individuals has led to controversy regarding its role as an intestinal pathogen. One proposed reason for this is the presence of high genetic diversity among the genomes of C. concisus strains. Results: In this study the genomes of six C. concisus strains were sequenced, assembled and annotated including two strains isolated from Crohn’s disease patients (UNSW2 and UNSW3), three from gastroenteritis patients (UNSW1, UNSWCS and ATCC 51562) and one from a healthy individual (ATCC 51561). The genomes of C. concisus BAA-1457 and UNSWCD, available from NCBI, were included in subsequent comparative genomic analyses. The Pan and Core genomes for the sequenced C. concisus strains consisted of 3254 and 1556 protein coding genes, respectively. Conclusion: Genes were identified with specific conservation in C. concisus strains grouped by phenotypes such as invasiveness, adherence, motility and diseased states. Phylogenetic trees based on ribosomal RNA sequences and concatenated host-related pathways for the eight C. concisus strains were generated using the neighbor-joining method, of which the 16S rRNA gene and peptidoglycan biosynthesis grouped the C. concisus strains according to their pathogenic phenotypes. Furthermore, 25 non-synonymous amino acid changes with 14 affecting functional domains, were identified within proteins of conserved host-related pathways, which had possible associations with the pathogenic potential of C. concisus strains. Finally, the genomes of the eight C. concisus strains were compared to the nine available genomes of the well-established pathogen Campylobacter jejuni, which identified several important differences in the respiration pathways of these two species. Our findings indicate that C. concisus strains are genetically diverse, and suggest the genomes of this bacterium contain

  9. A Genomic Encyclopedia of the Root Nodule Bacteria: assessing genetic diversity through a systematic biogeographic survey

    PubMed Central

    2015-01-01

    Root nodule bacteria are free-living soil bacteria, belonging to diverse genera within the Alphaproteobacteria and Betaproteobacteria, that have the capacity to form nitrogen-fixing symbioses with legumes. The symbiosis is specific and is governed by signaling molecules produced from both host and bacteria. Sequencing of several model RNB genomes has provided valuable insights into the genetic basis of symbiosis. However, the small number of sequenced RNB genomes available does not currently reflect the phylogenetic diversity of RNB, or the variety of mechanisms that lead to symbiosis in different legume hosts. This prevents a broad understanding of symbiotic interactions and the factors that govern the biogeography of host-microbe symbioses. Here, we outline a proposal to expand the number of sequenced RNB strains, which aims to capture this phylogenetic and biogeographic diversity. Through the Vavilov centers of diversity (Proposal ID: 231) and GEBA-RNB (Proposal ID: 882) projects we will sequence 107 RNB strains, isolated from diverse legume hosts in various geographic locations around the world. The nominated strains belong to nine of the 16 currently validly described RNB genera. They include 13 type strains, as well as elite inoculant strains of high commercial importance. These projects will strongly support systematic sequence-based studies of RNB and contribute to our understanding of the effects of biogeography on the evolution of different species of RNB, as well as the mechanisms that determine the specificity and effectiveness of nodulation and symbiotic nitrogen fixation by RNB with diverse legume hosts. PMID:25685260

  10. Expanding our view of genomic diversity in Candidatus Accumulibacter clades.

    PubMed

    Skennerton, Connor T; Barr, Jeremy J; Slater, Frances R; Bond, Philip L; Tyson, Gene W

    2015-05-01

    Enhanced biological phosphorus removal (EBPR) is an important industrial wastewater treatment process mediated by polyphosphate-accumulating organisms (PAOs). Members of the genus Candidatus Accumulibacter are among the most extensively studied PAOs as they are commonly enriched in lab-scale EBPR reactors. Members of different Accumulibacter clades are often enriched through changes in reactor process conditions; however, the two currently sequenced Accumulibacter genomes show extensive metabolic similarity. Here, we expand our understanding of Accumulibacter genomic diversity through recovery of eight population genomes using deep metagenomics, including seven from phylogenetic clades with no previously sequenced representative. Comparative genomic analysis revealed a core of shared genes involved primarily in carbon and phosphorus metabolism; however, each Accumulibacter genome also encoded a substantial number of unique genes (> 700 genes). A major difference between the Accumulibacter clades was the type of nitrate reductase encoded and the capacity to perform subsequent steps in denitrification. The Accumulibacter clade IIF genomes also contained acetaldehyde dehydrogenase that may allow ethanol to be used as a carbon source. These differences in metabolism between Accumulibacter genomes provide a molecular basis for niche differentiation observed in lab-scale reactors and may offer new opportunities for process optimization. PMID:25088527

  11. Genome diversity and evidence of recombination and reassortment in nanoviruses from Europe.

    PubMed

    Grigoras, Ioana; Ginzo, Ana Isabel del Cueto; Martin, Darren P; Varsani, Arvind; Romero, Javier; Mammadov, Alamdar Ch; Huseynova, Irada M; Aliyev, Jalal A; Kheyr-Pour, Ahmed; Huss, Herbert; Ziebell, Heiko; Timchenko, Tatiana; Vetten, Heinrich-Josef; Gronenborn, Bruno

    2014-05-01

    The recent identification of a new nanovirus, pea necrotic yellow dwarf virus, from pea in Germany prompted us to survey wild and cultivated legumes for nanovirus infections in several European countries. This led to the identification of two new nanoviruses: black medic leaf roll virus (BMLRV) and pea yellow stunt virus (PYSV), each considered a putative new species. The complete genomes of a PYSV isolate from Austria and three BMLRV isolates from Austria, Azerbaijan and Sweden were sequenced. In addition, the genomes of five isolates of faba bean necrotic yellows virus (FBNYV) from Azerbaijan and Spain and those of four faba bean necrotic stunt virus (FBNSV) isolates from Azerbaijan were completely sequenced, leading to the first identification of FBNSV occurring in Europe. Sequence analyses uncovered evolutionary relationships, extensive reassortment and potential remnants of mixed nanovirus infections, as well as intra- and intercomponent recombination events within the nanovirus genomes. In some virus isolates, diverse types of the same genome component (paralogues) were observed, a type of genome complexity not described previously for any member of the family Nanoviridae. Moreover, infectious and aphid-transmissible nanoviruses from cloned genomic DNAs of FBNYV and BMLRV were reconstituted that, for the first time, allow experimental reassortments for studying the genome functions and evolution of these nanoviruses. PMID:24515973

  12. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity

    PubMed Central

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F; Abbazia, Patrick; Ababio, Amma; Adam, Naazneen

    2015-01-01

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. DOI: http://dx.doi.org/10.7554/eLife.06416.001 PMID:25919952

  13. How Archiving by Freezing Affects the Genome-Scale Diversity of Escherichia coli Populations.

    PubMed

    Sprouffske, Kathleen; Aguilar-Rodríguez, José; Wagner, Andreas

    2016-01-01

    In the experimental evolution of microbes such as Escherichia coli, many replicate populations are evolved from a common ancestor. Freezing a population sample supplemented with the cryoprotectant glycerol permits later analysis or restarting of an evolution experiment. Typically, each evolving population, and thus each sample archived in this way, consists of many unique genotypes and phenotypes. The effect of archiving on such a heterogeneous population is unknown. Here, we identified optimal archiving conditions for E. coli. We also used genome sequencing of archived samples to study the effects that archiving has on genomic population diversity. We observed no allele substitutions and mostly small changes in allele frequency. Nevertheless, principal component analysis of genome-scale allelic diversity shows that archiving affects diversity across many loci. We showed that this change in diversity is due to selection rather than drift. In addition, ∼1% of rare alleles that occurred at low frequencies were lost after treatment. Our observations imply that archived populations may be used to conduct fitness or other phenotypic assays of populations, in which the loss of a rare allele may have negligible effects. However, caution is appropriate when sequencing populations restarted from glycerol stocks, as well as when using glycerol stocks to restart or replay evolution. This is because the loss of rare alleles can alter the future evolutionary trajectory of a population if the lost alleles were strongly beneficial. PMID:26988250

  14. How Archiving by Freezing Affects the Genome-Scale Diversity of Escherichia coli Populations

    PubMed Central

    Sprouffske, Kathleen; Aguilar-Rodríguez, José; Wagner, Andreas

    2016-01-01

    In the experimental evolution of microbes such as Escherichia coli, many replicate populations are evolved from a common ancestor. Freezing a population sample supplemented with the cryoprotectant glycerol permits later analysis or restarting of an evolution experiment. Typically, each evolving population, and thus each sample archived in this way, consists of many unique genotypes and phenotypes. The effect of archiving on such a heterogeneous population is unknown. Here, we identified optimal archiving conditions for E. coli. We also used genome sequencing of archived samples to study the effects that archiving has on genomic population diversity. We observed no allele substitutions and mostly small changes in allele frequency. Nevertheless, principal component analysis of genome-scale allelic diversity shows that archiving affects diversity across many loci. We showed that this change in diversity is due to selection rather than drift. In addition, ∼1% of rare alleles that occurred at low frequencies were lost after treatment. Our observations imply that archived populations may be used to conduct fitness or other phenotypic assays of populations, in which the loss of a rare allele may have negligible effects. However, caution is appropriate when sequencing populations restarted from glycerol stocks, as well as when using glycerol stocks to restart or replay evolution. This is because the loss of rare alleles can alter the future evolutionary trajectory of a population if the lost alleles were strongly beneficial. PMID:26988250

  15. HGD-Chn: The Database of Genome Diversity and Variation for Chinese Populations.

    PubMed

    Hong-Sheng, Gui; Peng, Zhou; Cheng-Bo, Yang; Sheng-Bin, Li

    2009-04-01

    The Database of Genome Diversity and Variation for Chinese Populations aims at a more efficient utilization and sharing of the valuable yet diminishing genetic resources in China (including sample information of healthy populations, healthy pedigrees, disease populations and disease pedigrees; genomic diversity data; and disease-related allelic and haplotype data). The database is organized into two parts: (1) Genetic resources of healthy people. A variety of genetic markers (VNTR, STR, SNP, HLA, enzyme markers, etc.) are chosen for their diversity among populations, and their distribution among different ethnic groups in China is stored in the form of allelic frequencies. This also makes possible further analysis and an overall description of the genetic structure of the Chinese population. (2) Disease genetic resources. Four categories are covered: chromosomal diseases, monogenic diseases, polygenic diseases, and birth defects. For each kind of disease, a basic introduction and description, sample information, and allelic data of related genes are provided. Aside from research-oriented information, introductory courses for the general public covering genomic diversity and variation, the related experimental techniques, and standards and specifications can also be accessed on our website. Furthermore, a flexible query and submission system with user-friendly interfaces is integrated into our website to simplify user queries and administrators' database maintenance work. Online data analysis and management tools are developed using bioinformatics algorithms and programming languages for a better interpretation of the biological data. PMID:19342283

  16. Phylogenetic and genomic diversity in isolates from the globally distributed Acinetobacter baumannii ST25 lineage

    PubMed Central

    Sahl, Jason W.; Del Franco, Mariateresa; Pournaras, Spyros; Colman, Rebecca E.; Karah, Nabil; Dijkshoorn, Lenie; Zarrilli, Raffaele

    2015-01-01

    Acinetobacter baumannii is a globally distributed nosocomial pathogen that has gained interest due to its resistance to most currently used antimicrobials. Whole genome sequencing (WGS) and phylogenetics have begun to reveal the global genetic diversity of this pathogen. The evolution of A. baumannii has largely been defined by recombination, punctuated by the emergence and proliferation of defined clonal lineages. In this study we sequenced seven genomes from the sequence type (ST)25 lineage and compared them to 12 ST25 genomes deposited in public databases. A recombination analysis identified multiple genomic regions that are homoplasious in the ST25 phylogeny, indicating active or historical recombination. Genes associated with antimicrobial resistance were differentially distributed between ST25 genomes, which matched our laboratory-based antimicrobial susceptibility typing. Differences were also observed in biofilm formation between ST25 isolates, which were demonstrated to produce significantly more extensive biofilm than an isolate from the ST1 clonal lineage. These results demonstrate that within A. baumannii, even a fairly recently derived monophyletic lineage can still exhibit significant genotypic and phenotypic diversity. These results have implications for associating outbreaks with sequence typing as well as understanding mechanisms behind the global propagation of successful A. baumannii lineages. PMID:26462752

  17. Whole genome resequencing of Botrytis cinerea isolates identifies high levels of standing diversity.

    PubMed

    Atwell, Susanna; Corwin, Jason A; Soltis, Nicole E; Subedy, Anushryia; Denby, Katherine J; Kliebenstein, Daniel J

    2015-01-01

    How standing genetic variation within a pathogen contributes to diversity in host/pathogen interactions is poorly understood, partly because most studied pathogens are host-specific, clonally reproducing organisms, which complicates genetic analysis. In contrast, Botrytis cinerea is a sexually reproducing, true haploid ascomycete that can infect a wide range of diverse plant hosts. While previous work had shown significant genomic variation between two isolates, we proceeded to assess the level and frequency of standing variation in a population of B. cinerea. To begin measuring standing genetic variation in B. cinerea, we re-sequenced the genomes of 13 different isolates and aligned them to the previously sequenced T4 reference genome. In addition, one of these isolates was resequenced from four independently repeated cultures. A high level of genetic diversity was found within the 13 isolates. Within this variation, we could identify clusters of genes with major effect polymorphisms, i.e., polymorphisms that lead to a predicted functional knockout, that surrounded genes involved in controlling vegetative incompatibility. The genotype at these loci was able to partially predict the interaction of these isolates in vegetative fusion assays, showing that these loci control vegetative incompatibility. This suggests that the vegetative incompatibility loci within B. cinerea are associated with regions of increased genetic diversity. The genome re-sequencing of four clones from the one isolate (Grape) that had been independently propagated over 10 years showed no detectable spontaneous mutation. This suggests that B. cinerea does not display an elevated spontaneous mutation rate. Future work will allow us to test if, and how, this diversity may be contributing to the pathogen's broad host range. PMID:26441923

  18. Whole genome resequencing of Botrytis cinerea isolates identifies high levels of standing diversity

    PubMed Central

    Atwell, Susanna; Corwin, Jason A.; Soltis, Nicole E.; Subedy, Anushryia; Denby, Katherine J.; Kliebenstein, Daniel J.

    2015-01-01

    How standing genetic variation within a pathogen contributes to diversity in host/pathogen interactions is poorly understood, partly because most studied pathogens are host-specific, clonally reproducing organisms, which complicates genetic analysis. In contrast, Botrytis cinerea is a sexually reproducing, true haploid ascomycete that can infect a wide range of diverse plant hosts. While previous work had shown significant genomic variation between two isolates, we proceeded to assess the level and frequency of standing variation in a population of B. cinerea. To begin measuring standing genetic variation in B. cinerea, we re-sequenced the genomes of 13 different isolates and aligned them to the previously sequenced T4 reference genome. In addition, one of these isolates was resequenced from four independently repeated cultures. A high level of genetic diversity was found within the 13 isolates. Within this variation, we could identify clusters of genes with major effect polymorphisms, i.e., polymorphisms that lead to a predicted functional knockout, that surrounded genes involved in controlling vegetative incompatibility. The genotype at these loci was able to partially predict the interaction of these isolates in vegetative fusion assays, showing that these loci control vegetative incompatibility. This suggests that the vegetative incompatibility loci within B. cinerea are associated with regions of increased genetic diversity. The genome re-sequencing of four clones from the one isolate (Grape) that had been independently propagated over 10 years showed no detectable spontaneous mutation. This suggests that B. cinerea does not display an elevated spontaneous mutation rate. Future work will allow us to test if, and how, this diversity may be contributing to the pathogen's broad host range. PMID:26441923

  19. Close Encounters of the Third Domain: The Emerging Genomic View of Archaeal Diversity and Evolution

    PubMed Central

    Spang, Anja; Saw, Jimmy H.; Lind, Anders E.; Ettema, Thijs J. G.

    2013-01-01

    The Archaea represent the so-called Third Domain of life, which has evolved in parallel with the Bacteria and which is thought to have played a pivotal role in the emergence of the eukaryotic domain of life. Recent progress in genomic sequencing technologies and cultivation-independent methods has started to unearth a plethora of data on novel, uncultivated archaeal lineages. Here, we review how the availability of such genomic data has revealed several important insights into the diversity, ecological relevance, metabolic capacity, and the origin and evolution of the archaeal domain of life. PMID:24348093

  20. Scanning the landscape of genome architecture of non-O1 and non-O139 Vibrio cholerae by whole genome mapping reveals extensive population genetic diversity

    DOE PAGESBeta

    Chapman, Carol; Henry, Matthew; Bishop-Lilly, Kimberly A.; Awosika, Joy; Briska, Adam; Ptashkin, Ryan N.; Wagner, Trevor; Rajanna, Chythanya; Tsang, Hsinyi; Johnson, Shannon L.; et al

    2015-03-20

    Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks.

  1. Scanning the Landscape of Genome Architecture of Non-O1 and Non-O139 Vibrio cholerae by Whole Genome Mapping Reveals Extensive Population Genetic Diversity

    PubMed Central

    Awosika, Joy; Briska, Adam; Ptashkin, Ryan N.; Wagner, Trevor; Rajanna, Chythanya; Tsang, Hsinyi; Johnson, Shannon L.; Mokashi, Vishwesh P.; Chain, Patrick S. G.; Sozhamannan, Shanmuga

    2015-01-01

    Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks. PMID:25794000

  2. Ecological and evolutionary significance of genomic GC content diversity in monocots

    PubMed Central

    Šmarda, Petr; Bureš, Petr; Horová, Lucie; Leitch, Ilia J.; Mucina, Ladislav; Pacini, Ettore; Tichý, Lubomír; Grulich, Vít; Rotreklová, Olga

    2014-01-01

    Genomic DNA base composition (GC content) is predicted to significantly affect genome functioning and species ecology. Although several hypotheses have been put forward to address the biological impact of GC content variation in microbial and vertebrate organisms, the biological significance of GC content diversity in plants remains unclear because of a lack of sufficiently robust genomic data. Using flow cytometry, we report genomic GC contents for 239 species representing 70 of 78 monocot families and compare them with genomic characters, a suite of life history traits and climatic niche data using phylogeny-based statistics. GC content of monocots varied between 33.6% and 48.9%, with several groups exceeding the GC content known for any other vascular plant group, highlighting their unusual genome architecture and organization. GC content showed a quadratic relationship with genome size, with the decreases in GC content in larger genomes possibly being a consequence of the higher biochemical costs of GC base synthesis. Dramatic decreases in GC content were observed in species with holocentric chromosomes, whereas increased GC content was documented in species able to grow in seasonally cold and/or dry climates, possibly indicating an advantage of GC-rich DNA during cell freezing and desiccation. We also show that genomic adaptations associated with changing GC content might have played a significant role in the evolution of the Earth’s contemporary biota, such as the rise of grass-dominated biomes during the mid-Tertiary. One of the major selective advantages of GC-rich DNA is hypothesized to be facilitating more complex gene regulation. PMID:25225383

  3. Diversity, genetic mapping, and signatures of domestication in the carrot (Daucus carota L.) genome, as revealed by Diversity Arrays Technology (DArT) markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Carrot is one of the most economically important vegetables worldwide, however, genetic and genomic resources supporting carrot breeding remain limited. We developed a Diversity Arrays Technology (DArT) platform for wild and cultivated carrot and used it to investigate genetic diversity and to devel...

  4. Extensive Genomic Diversity among Bovine-Adapted Staphylococcus aureus: Evidence for a Genomic Rearrangement within CC97

    PubMed Central

    Budd, Kathleen E.; McCoy, Finola; Monecke, Stefan; Cormican, Paul; Mitchell, Jennifer; Keane, Orla M.

    2015-01-01

    Staphylococcus aureus is an important pathogen associated with both human and veterinary disease and is a common cause of bovine mastitis. Genomic heterogeneity exists between S. aureus strains and has been implicated in the adaptation of specific strains to colonise particular mammalian hosts. Knowledge of the factors required for host specificity and virulence is important for understanding the pathogenesis and management of S. aureus mastitis. In this study, a panel of mastitis-associated S. aureus isolates (n = 126) was tested for resistance to antibiotics commonly used to treat mastitis. Over half of the isolates (52%) demonstrated resistance to penicillin and ampicillin but all were susceptible to the other antibiotics tested. S. aureus isolates were further examined for their clonal diversity by Multi-Locus Sequence Typing (MLST). In total, 18 different sequence types (STs) were identified and eBURST analysis demonstrated that the majority of isolates grouped into clonal complexes CC97, CC151 or sequence type (ST) 136. Analysis of the role of recombination events in determining S. aureus population structure determined that ST diversification through nucleotide substitutions was more likely to be due to recombination than to point mutation, with regions of the genome possibly acting as recombination hotspots. DNA microarray analysis revealed a large number of differences amongst S. aureus STs in their variable genome content, including genes associated with capsule and biofilm formation and adhesion factors. Finally, evidence for a genomic rearrangement was observed within isolates from CC97, with the ST71-like subgroup showing evidence of an IS431 insertion element having replaced approximately 30 kb of DNA including the ica operon and histidine biosynthesis genes, resulting in histidine auxotrophy. This genomic rearrangement may be responsible for the diversification of ST71 into an emerging bovine adapted subgroup. PMID:26317849

  5. The expanded diversity of methylophilaceae from Lake Washington through cultivation and genomic sequencing of novel ecotypes.

    PubMed

    Beck, David A C; McTaggart, Tami L; Setboonsarng, Usanisa; Vorobev, Alexey; Kalyuzhnaya, Marina G; Ivanova, Natalia; Goodwin, Lynne; Woyke, Tanja; Lidstrom, Mary E; Chistoserdova, Ludmila

    2014-01-01

    We describe five novel Methylophilaceae ecotypes from a single ecological niche in Lake Washington, USA, and compare them to three previously described ecotypes, in terms of their phenotype and genome sequence divergence. Two of the ecotypes appear to represent novel genera within the Methylophilaceae. Genome-based metabolic reconstruction highlights metabolic versatility of Methylophilaceae with respect to methylotrophy and nitrogen metabolism, different ecotypes possessing different combinations of primary substrate oxidation systems (MxaFI-type methanol dehydrogenase versus XoxF-type methanol dehydrogenase; methylamine dehydrogenase versus N-methylglutamate pathway) and different potentials for denitrification (assimilatory versus respiratory nitrate reduction). By comparing pairs of closely related genomes, we uncover that site-specific recombination is the main means of genomic evolution and strain divergence, including lateral transfers of genes from both closely- and distantly related taxa. The new ecotypes and the new genomes contribute significantly to our understanding of the extent of genomic and metabolic diversity among organisms of the same family inhabiting the same ecological niche. These organisms also provide novel experimental models for studying the complexity and the function of the microbial communities active in methylotrophy. PMID:25058595

  6. The Expanded Diversity of Methylophilaceae from Lake Washington through Cultivation and Genomic Sequencing of Novel Ecotypes

    PubMed Central

    Beck, David A. C.; McTaggart, Tami L.; Setboonsarng, Usanisa; Vorobev, Alexey; Kalyuzhnaya, Marina G.; Ivanova, Natalia; Goodwin, Lynne; Woyke, Tanja; Lidstrom, Mary E.; Chistoserdova, Ludmila

    2014-01-01

    We describe five novel Methylophilaceae ecotypes from a single ecological niche in Lake Washington, USA, and compare them to three previously described ecotypes, in terms of their phenotype and genome sequence divergence. Two of the ecotypes appear to represent novel genera within the Methylophilaceae. Genome-based metabolic reconstruction highlights metabolic versatility of Methylophilaceae with respect to methylotrophy and nitrogen metabolism, different ecotypes possessing different combinations of primary substrate oxidation systems (MxaFI-type methanol dehydrogenase versus XoxF-type methanol dehydrogenase; methylamine dehydrogenase versus N-methylglutamate pathway) and different potentials for denitrification (assimilatory versus respiratory nitrate reduction). By comparing pairs of closely related genomes, we uncover that site-specific recombination is the main means of genomic evolution and strain divergence, including lateral transfers of genes from both closely- and distantly related taxa. The new ecotypes and the new genomes contribute significantly to our understanding of the extent of genomic and metabolic diversity among organisms of the same family inhabiting the same ecological niche. These organisms also provide novel experimental models for studying the complexity and the function of the microbial communities active in methylotrophy. PMID:25058595

  7. Twenty-One Genome Sequences from Pseudomonas Species and 19 Genome Sequences from Diverse Bacteria Isolated from the Rhizosphere and Endosphere of Populus deltoides

    SciTech Connect

    Brown, Steven D; Utturkar, Sagar M; Klingeman, Dawn Marie; Johnson, Courtney M; Martin, Stanton; Land, Miriam L; Lu, Tse-Yuan; Schadt, Christopher Warren; Doktycz, Mitchel John; Pelletier, Dale A

    2012-01-01

    To aid in the investigation of the Populus deltoides microbiome we generated draft genome sequences for twenty one Pseudomonas and twenty one other diverse bacteria isolated from Populus deltoides roots. Genome sequences for isolates similar to Acidovorax, Bradyrhizobium, Brevibacillus, Burkholderia, Caulobacter, Chryseobacterium, Flavobacterium, Herbaspirillum, Novosphingobium, Pantoea, Phyllobacterium, Polaromonas, Rhizobium, Sphingobium and Variovorax were generated.

  8. CyanoGEBA: A Better Understanding of Cynobacterial Diversity through Large-scale Genomics (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    ScienceCinema

    Shih, Patrick [Kerfeld Lab, UC Berkeley and JGI]

    2013-01-22

    Patrick Shih, representing both the University of California, Berkeley and JGI, gives a talk titled "CyanoGEBA: A Better Understanding of Cynobacterial Diversity through Large-scale Genomics" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  9. CyanoGEBA: A Better Understanding of Cynobacterial Diversity through Large-scale Genomics (JGI Seventh Annual User Meeting 2012: Genomics of Energy and Environment)

    SciTech Connect

    Shih, Patrick

    2012-03-22

    Patrick Shih, representing both the University of California, Berkeley and JGI, gives a talk titled "CyanoGEBA: A Better Understanding of Cynobacterial Diversity through Large-scale Genomics" at the JGI 7th Annual Users Meeting: Genomics of Energy & Environment Meeting on March 22, 2012 in Walnut Creek, California.

  10. A LDA-based approach to promoting ranking diversity for genomics information retrieval

    PubMed Central

    2012-01-01

    Background: The biomedical domain holds immense amounts of data, and the number of genomics and biomedical publications is growing rapidly. This wealth of information has led to increasing interest in, and need for, information retrieval (IR) techniques for accessing the scientific literature in genomics and related biomedical disciplines. In many cases, the information a biologist's query seeks is a list of entities of a certain type covering different aspects related to the question, such as cells, genes, diseases, proteins, mutations, etc. Hence, it is important for a biomedical IR system to provide relevant and diverse answers that fulfill biologists' information needs. However, traditional IR models consider only the relevance between retrieved documents and the user query, and do not take redundancy between retrieved documents into account. This leads to high redundancy and low diversity in the ranked retrieval lists. Results: In this paper, we propose an approach that employs a generative topic model, Latent Dirichlet Allocation (LDA), to promote ranking diversity for biomedical information retrieval. Unlike other approaches or models that consider aspects at the word level, our approach assumes that aspects should be identified by the topics of retrieved documents. We use an LDA model to discover the topic distribution of retrieved passages and the word distribution of each topic dimension, and then re-rank the retrieval results by the topic distribution similarity between passages, based on an N-size sliding window. We evaluate our approach on the TREC 2007 Genomics collection and two distinct IR baseline runs, and achieve an 8% improvement over the highest Aspect MAP reported in the TREC 2007 Genomics track. Conclusions: The proposed method is the first study to adopt a topic model for genomics information retrieval, and it demonstrates effectiveness in promoting ranking diversity as well as in improving relevance of ranked
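
    The re-ranking idea in the record above (penalize a passage whose topic distribution resembles passages ranked just before it) can be illustrated with a small sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the function names `cosine` and `rerank`, the cosine similarity measure, the `penalty` weight, and the greedy selection. Topic distributions are assumed to have been computed beforehand by an LDA model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic distributions (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rerank(passages, window=3, penalty=0.5):
    """Greedy diversity re-ranking (hypothetical scheme, not the paper's).

    passages: list of (passage_id, relevance, topic_distribution) tuples
              in baseline order.
    window:   how many of the most recently ranked passages to compare with.
    penalty:  weight of the similarity penalty (illustrative value).
    """
    remaining = list(passages)
    ranked = []
    while remaining:
        recent = ranked[-window:]  # sliding window over the ranked list

        def adjusted(passage):
            _, relevance, topics = passage
            max_sim = max((cosine(topics, r[2]) for r in recent), default=0.0)
            return relevance - penalty * max_sim

        best = max(remaining, key=adjusted)
        remaining.remove(best)
        ranked.append(best)
    return [passage_id for passage_id, _, _ in ranked]


baseline = [
    ("p1", 0.90, [0.9, 0.1]),
    ("p2", 0.85, [0.9, 0.1]),  # same topic mix as p1, so it is demoted
    ("p3", 0.80, [0.1, 0.9]),
]
print(rerank(baseline))
```

    In this toy input the second passage repeats the topic mix of the first, so the passage covering a different topic is promoted ahead of it.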

  11. From diversity to delivery: the case of the Indian Genome Variation initiative.

    PubMed

    Hardy, Billie-Jo; Séguin, Béatrice; Singer, Peter A; Mukerji, Mitali; Brahmachari, Samir K; Daar, Abdallah S

    2008-10-01

    India currently has the world's second-largest population along with a fast-growing economy and significant economic disparity. It also continues to experience a high rate of infectious disease and increasingly higher rates of chronic diseases. However, India cannot afford to import expensive technologies and therapeutics nor can it, as an emerging economy, emulate the health-delivery systems of the developed world. Instead, to address these challenges it is looking to biotechnology-based innovation in the field of genomics. The Indian Genome Variation (IGV) consortium, a government-funded collaborative network among seven local institutions, is a reflection of these efforts. The IGV has recently developed the first large-scale database of genomic diversity in the Indian population that will facilitate research on disease predisposition, adverse drug reactions and population migration. PMID:18802420

  12. Linkage disequilibrium and diversity for three genomic regions in Azoreans and mainland Portuguese

    PubMed Central

    2009-01-01

    Studies on linkage disequilibrium (LD) across the genome and populations have been used in recent years with the main objective of improving gene mapping of complex traits. Here, we characterize the patterns of genetic diversity of HLA loci and evaluate LD (D') extent in three genomic regions: Xq13.3, NRY and HLA. In addition, we examine the distribution of DXS1225-DXS8082 haplotype diversity in Azoreans and mainland Portuguese. Allele distribution has demonstrated that the São Miguel population is genetically very diverse; haplotype analysis revealed 100% discriminatory power for X- and Y-markers and 94.3% for HLA markers. Standardized multiallelic D' in these three genomic regions shows values lower than 0.33, thereby suggesting there is no extensive LD in the São Miguel population. Data regarding the distribution of DXS1225-DXS8082 haplotypes indicate that there are no significant differences among all the populations studied, (Azorean geographical groups, the Azores archipelago and mainland Portugal). Moreover, in these as well as in other European populations, the most frequent DXS1225-DXS8082 haplotype is 210-219. Even though São Miguel islanders and Azoreans do not constitute isolated populations and show LD for only very short physical distances, certain characteristics, such as the absence of genetic structure, the same environment and the possibility of constructing extensive pedigrees through church and civil records, offer an opportunity for dissecting the genetic background of complex diseases in these populations. PMID:21637671
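
    For readers unfamiliar with the D' statistic used in the record above, the following is a minimal sketch of the standard two-locus, two-allele case. The function name `d_prime` and the example frequencies are illustrative assumptions; the multiallelic standardized D' reported in the abstract generalizes this calculation across allele pairs and is not reproduced here.

```python
def d_prime(p_ab, p_a, p_b):
    """Normalized linkage disequilibrium D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # D' keeps the sign of D here; many reports quote its absolute value.
    return 0.0 if d_max == 0 else d / d_max


# Illustrative frequencies only (not taken from the study).
print(d_prime(0.50, 0.5, 0.5))  # complete association
print(d_prime(0.25, 0.5, 0.5))  # independence
```

    A value near 0 means the alleles at the two loci combine almost at random, which is the sense in which values below 0.33 are read as an absence of extensive LD.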

  13. Analysis of retrotransposon structural diversity uncovers properties and propensities in angiosperm genome evolution

    PubMed Central

    Vitte, Clémentine; Bennetzen, Jeffrey L.

    2006-01-01

    Analysis of LTR retrotransposon structures in five diploid angiosperm genomes uncovered very different relative levels of different types of genomic diversity. All species exhibited recent LTR retrotransposon mobility and also high rates of DNA removal by unequal homologous recombination and illegitimate recombination. The larger plant genomes contained many LTR retrotransposon families with >10,000 copies per haploid genome, whereas the smaller genomes contained few or no LTR retrotransposon families with >1,000 copies, suggesting that this differential potential for retroelement amplification is a primary factor in angiosperm genome size variation. The average ratios of transition to transversion mutations (Ts/Tv) in diverging LTRs were >1.5 for each species studied, suggesting that these elements are mostly 5-methylated at cytosines in an epigenetically silenced state. However, the diploid wheat Triticum monococcum and barley have unusually low Ts/Tv values (respectively, 1.9 and 1.6) compared with maize (3.9), medicago (3.6), and lotus (2.5), suggesting that this silencing is less complete in the two Triticeae. Such characteristics as the ratios of point mutations to indels (insertions and deletions) and the relative efficiencies of DNA removal by unequal homologous recombination compared with illegitimate recombination were highly variable between species. These latter variations did not correlate with genome size or phylogenetic relatedness, indicating that they frequently change during the evolutionary descent of plant lineages. In sum, the results indicate that the different sizes, contents, and structures of angiosperm genomes are outcomes of the same suite of mechanistic processes, but acting with different relative efficiencies in different plant lineages. PMID:17101966
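
    The Ts/Tv ratio discussed in the record above can be computed from a pair of aligned sequences, for example the two LTRs of a single element. The sketch below is only an illustration under that assumption: the function name `ts_tv_ratio` and the toy sequences are hypothetical, and the alignment is taken as given. Transitions are purine-to-purine or pyrimidine-to-pyrimidine changes; deamination of methylated cytosine produces C-to-T transitions, which is why an elevated ratio is read as a sign of methylation.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_ratio(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Returns (transitions, transversions, ratio). Positions with gaps or
    ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical site, gap, or ambiguity code
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    ratio = transitions / transversions if transversions else float("inf")
    return transitions, transversions, ratio


# Toy aligned pair: A/G and C/T are transitions, T/A is a transversion.
print(ts_tv_ratio("AAGCTT", "GAGTTA"))
```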

  14. Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes Fungi

    PubMed Central

    Ohm, Robin A.; Feau, Nicolas; Henrissat, Bernard; Schoch, Conrad L.; Horwitz, Benjamin A.; Barry, Kerrie W.; Condon, Bradford J.; Copeland, Alex C.; Dhillon, Braham; Glaser, Fabian; Hesse, Cedar N.; Kosti, Idit; LaButti, Kurt; Lindquist, Erika A.; Lucas, Susan; Salamov, Asaf A.; Bradshaw, Rosie E.; Ciuffetti, Lynda; Hamelin, Richard C.; Kema, Gert H. J.; Lawrence, Christopher; Scott, James A.; Spatafora, Joseph W.; Turgeon, B. Gillian; de Wit, Pierre J. G. M.; Zhong, Shaobin; Goodwin, Stephen B.; Grigoriev, Igor V.

    2012-01-01

    The class Dothideomycetes is one of the largest groups of fungi with a high level of ecological diversity including many plant pathogens infecting a broad range of hosts. Here, we compare genome features of 18 members of this class, including 6 necrotrophs, 9 (hemi)biotrophs and 3 saprotrophs, to analyze genome structure, evolution, and the diverse strategies of pathogenesis. The Dothideomycetes most likely evolved from a common ancestor more than 280 million years ago. The 18 genome sequences differ dramatically in size due to variation in repetitive content, but show much less variation in number of (core) genes. Gene order appears to have been rearranged mostly within chromosomal boundaries by multiple inversions, in extant genomes frequently demarcated by adjacent simple repeats. Several Dothideomycetes contain one or more gene-poor, transposable element (TE)-rich putatively dispensable chromosomes of unknown function. The 18 Dothideomycetes offer an extensive catalogue of genes involved in cellulose degradation, proteolysis, secondary metabolism, and cysteine-rich small secreted proteins. Ancestors of the two major orders of plant pathogens in the Dothideomycetes, the Capnodiales and Pleosporales, may have had different modes of pathogenesis, with the former having fewer of these genes than the latter. Many of these genes are enriched in proximity to transposable elements, suggesting faster evolution because of the effects of repeat induced point (RIP) mutations. A syntenic block of genes, including oxidoreductases, is conserved in most Dothideomycetes and upregulated during infection in L. maculans, suggesting a possible function in response to oxidative stress. PMID:23236275

  15. Diverse Lifestyles and Strategies of Plant Pathogenesis Encoded in the Genomes of Eighteen Dothideomycetes Fungi

    SciTech Connect

    Ohm, Robin A.; Feau, Nicolas; Henrissat, Bernard; Schoch, Conrad L.; Horwitz, Benjamin A.; Barry, Kerrie W.; Condon, Bradford J.; Copeland, Alex C.; Dhillon, Braham; Glaser, Fabian; Hesse, Cedar N.; Kosti, Idit; LaButti, Kurt; Lindquist, Erika A.; Lucas, Susan; Salamov, Asaf A.; Bradshaw, Rosie E.; Ciuffetti, Lynda; Hamelin, Richard C.; Kema, Gert H. J.; Lawrence, Christopher; Scott, James A.; Spatafora, Joseph W.; Turgeon, B. Gillian; Wit, Pierre J. G. M. de; Zhong, Shaobin; Goodwin, Stephen B.; Grigoriev, Igor V.

    2012-02-29

    The class Dothideomycetes is one of the largest groups of fungi with a high level of ecological diversity including many plant pathogens infecting a broad range of hosts. Here, we compare genome features of 18 members of this class, including 6 necrotrophs, 9 (hemi)biotrophs and 3 saprotrophs, to analyze genome structure, evolution, and the diverse strategies of pathogenesis. The Dothideomycetes most likely evolved from a common ancestor more than 280 million years ago. The 18 genome sequences differ dramatically in size due to variation in repetitive content, but show much less variation in number of (core) genes. Gene order appears to have been rearranged mostly within chromosomal boundaries by multiple inversions, in extant genomes frequently demarcated by adjacent simple repeats. Several Dothideomycetes contain one or more gene-poor, transposable element (TE)-rich putatively dispensable chromosomes of unknown function. The 18 Dothideomycetes offer an extensive catalogue of genes involved in cellulose degradation, proteolysis, secondary metabolism, and cysteine-rich small secreted proteins. Ancestors of the two major orders of plant pathogens in the Dothideomycetes, the Capnodiales and Pleosporales, may have had different modes of pathogenesis, with the former having fewer of these genes than the latter. Many of these genes are enriched in proximity to transposable elements, suggesting faster evolution because of the effects of repeat induced point (RIP) mutations. A syntenic block of genes, including oxidoreductases, is conserved in most Dothideomycetes and upregulated during infection in L. maculans, suggesting a possible function in response to oxidative stress.

  16. Comparative Analysis of 35 Basidiomycete Genomes Reveals Diversity and Uniqueness of the Phylum

    SciTech Connect

    Riley, Robert; Salamov, Asaf; Otillar, Robert; Fagnan, Kirsten; Boussau, Bastien; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Held, Benjamin; Nagy, Laszlo; Floudas, Dimitris; Morin, Emmanuelle; Manning, Gerard; Baker, Scott; Martin, Francis; Blanchette, Robert; Hibbett, David; Grigoriev, Igor V.

    2013-03-11

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprobes including wood decaying fungi. To better understand the diversity of this phylum we compared the genomes of 35 basidiomycete fungi including 6 newly sequenced genomes. The genomes of basidiomycetes span extremes of genome size, gene number, and repeat content. A phylogenetic tree of Basidiomycota was generated using the Phyldog software, which uses all available protein sequence data to simultaneously infer gene and species trees. Analysis of core genes reveals that some 48 percent of basidiomycete proteins are unique to the phylum with nearly half of those (22 percent) comprising proteins found in only one organism. Phylogenetic patterns of plant biomass-degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay among the members of the Agaricomycotina subphylum. There is a correlation of the profile of certain gene families to nutritional mode in Agaricomycotina. Based on phylogenetically-informed PCA analysis of such profiles, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has ligninolytic class II fungal peroxidases. Furthermore, we find that both fungi exhibit wood decay with white rot-like characteristics in growth assays. Analysis of the rate of discovery of proteins with no or few homologs suggests the high value of continued sequencing of basidiomycete fungi.

  17. Neutral Theory Predicts the Relative Abundance and Diversity of Genetic Elements in a Broad Array of Eukaryotic Genomes

    PubMed Central

    Serra, François; Becher, Verónica; Dopazo, Hernán

    2013-01-01

    It is universally true in ecological communities, terrestrial or aquatic, temperate or tropical, that some species are very abundant, others are moderately common, and the majority are rare. Likewise, eukaryotic genomes also contain classes or “species” of genetic elements that vary greatly in abundance: DNA transposons, retrotransposons, satellite sequences, simple repeats and their less abundant functional sequences such as RNA or genes. Are the patterns of relative species abundance and diversity similar among ecological communities and genomes? Previous dynamical models of genomic diversity have focused on the selective forces shaping the abundance and diversity of transposable elements (TEs). However, ideally, models of genome dynamics should consider not only TEs, but also the diversity of all genetic classes or “species” populating eukaryotic genomes. Here, in an analysis of the diversity and abundance of genetic elements in >500 eukaryotic chromosomes, we show that the patterns are consistent with a neutral hypothesis of genome assembly in virtually all chromosomes tested. The distributions of relative abundance of genetic elements are quite precisely predicted by the dynamics of an ecological model for which the principle of functional equivalence is the main assumption. We hypothesize that at large temporal scales an overarching neutral or nearly neutral process governs the evolution of abundance and diversity of genetic elements in eukaryotic genomes. PMID:23798991

  18. Genetic Diversity in Lens Species Revealed by EST and Genomic Simple Sequence Repeat Analysis.

    PubMed

    Dikshit, Harsh Kumar; Singh, Akanksha; Singh, Dharmendra; Aski, Muraleedhar Sidaram; Prakash, Prapti; Jain, Neelu; Meena, Suresh; Kumar, Shiv; Sarker, Ashutosh

    2015-01-01

    Low productivity of pilosae type lentils grown in South Asia is attributed to the narrow genetic base of the released cultivars, which results in susceptibility to biotic and abiotic stresses. To enhance productivity and production, the genetic base must be broadened. The genetic base of released cultivars can be broadened by using diverse types, including bold-seeded and early-maturing lentils from the Mediterranean region and related wild species. Genetic diversity in eighty-six accessions of three species of the genus Lens was assessed based on twelve genomic and thirty-one EST-SSR markers. The evaluated set of genotypes included diverse lentil varieties and advanced breeding lines from the Indian programme, two early-maturing ICARDA lines and five related wild subspecies/species endemic to the Mediterranean region. Genomic SSRs exhibited higher polymorphism than EST-SSRs. GLLC 598 produced 5 alleles with the highest gene diversity value of 0.80. Among the studied subspecies/species, the 43 SSRs detected the maximum number of alleles in L. orientalis. Based on Nei's genetic distance, cultivated lentil L. culinaris subsp. culinaris was found to be close to its wild progenitor L. culinaris subsp. orientalis. Pritchard's STRUCTURE analysis of the 86 genotypes distinguished the different subspecies/species. Higher variability was recorded among individuals within populations than among populations. PMID:26381889
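
    The gene diversity value quoted for GLLC 598 can be related to allele frequencies with a short sketch. It assumes the common uncorrected definition, one minus the sum of squared allele frequencies; the abstract does not state which estimator was used or what the actual frequencies were, so the function name `gene_diversity` and the frequency vectors below are purely illustrative.

```python
def gene_diversity(allele_freqs):
    """Gene diversity (expected heterozygosity): 1 - sum of squared frequencies.

    allele_freqs: allele frequencies at one locus, which must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical frequency sets for a locus with five alleles.
even = [0.2, 0.2, 0.2, 0.2, 0.2]
skewed = [0.6, 0.1, 0.1, 0.1, 0.1]
print(round(gene_diversity(even), 2))
print(round(gene_diversity(skewed), 2))
```

    Under this definition a locus with k alleles cannot exceed 1 - 1/k, so five alleles top out at 0.80 when all are equally frequent.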

  19. Genetic Diversity in Lens Species Revealed by EST and Genomic Simple Sequence Repeat Analysis

    PubMed Central

    Dikshit, Harsh Kumar; Singh, Akanksha; Singh, Dharmendra; Aski, Muraleedhar Sidaram; Prakash, Prapti; Jain, Neelu; Meena, Suresh; Kumar, Shiv; Sarker, Ashutosh

    2015-01-01

    Low productivity of pilosae type lentils grown in South Asia is attributed to the narrow genetic base of the released cultivars, which results in susceptibility to biotic and abiotic stresses. To enhance productivity and production, the genetic base must be broadened. The genetic base of released cultivars can be broadened by using diverse types, including bold-seeded and early-maturing lentils from the Mediterranean region and related wild species. Genetic diversity in eighty-six accessions of three species of the genus Lens was assessed based on twelve genomic and thirty-one EST-SSR markers. The evaluated set of genotypes included diverse lentil varieties and advanced breeding lines from the Indian programme, two early-maturing ICARDA lines and five related wild subspecies/species endemic to the Mediterranean region. Genomic SSRs exhibited higher polymorphism than EST-SSRs. GLLC 598 produced 5 alleles with the highest gene diversity value of 0.80. Among the studied subspecies/species, the 43 SSRs detected the maximum number of alleles in L. orientalis. Based on Nei’s genetic distance, cultivated lentil L. culinaris subsp. culinaris was found to be close to its wild progenitor L. culinaris subsp. orientalis. Pritchard’s STRUCTURE analysis of the 86 genotypes distinguished the different subspecies/species. Higher variability was recorded among individuals within populations than among populations. PMID:26381889

  20. Comparison of environmental and isolate Sulfobacillus genomes reveals diverse carbon, sulfur, nitrogen, and hydrogen metabolisms

    SciTech Connect

    Justice, Nicholas B.; Norman, Anders; Brown, Christopher T.; Singh, Andrea; Thomas, Brian C.; Banfield, Jillian F.

    2014-12-15

    Bacteria of the genus Sulfobacillus are found worldwide as members of microbial communities that accelerate sulfide mineral dissolution in acid mine drainage environments (AMD), acid-rock drainage environments (ARD), as well as in industrial bioleaching operations. Despite their frequent identification in these environments, their role in biogeochemical cycling is poorly understood. Here we report draft genomes of five species of the Sulfobacillus genus (AMDSBA1-5) reconstructed by cultivation-independent sequencing of biofilms sampled from the Richmond Mine (Iron Mountain, CA). Three of these species (AMDSBA2, AMDSBA3, and AMDSBA4) have no cultured representatives while AMDSBA1 is a strain of S. benefaciens, and AMDSBA5 a strain of S. thermosulfidooxidans. We analyzed the diversity of energy conservation and central carbon metabolisms for these genomes and previously published Sulfobacillus genomes. Pathways of sulfur oxidation vary considerably across the genus, including the number and type of subunits of putative heterodisulfide reductase complexes likely involved in sulfur oxidation. The number and type of nickel-iron hydrogenase proteins vary across the genus, as does the presence of different central carbon pathways. Only the AMDSBA3 genome encodes a dissimilatory nitrate reductase and only the AMDSBA5 and S. thermosulfidooxidans genomes encode assimilatory nitrate reductases. Lastly, within the genus, AMDSBA4 is unusual in that its electron transport chain includes a cytochrome bc type complex, a unique cytochrome c oxidase, and two distinct succinate dehydrogenase complexes. Overall, the results significantly expand our understanding of carbon, sulfur, nitrogen, and hydrogen metabolism within the Sulfobacillus genus.

  1. Microsporidian genomes harbor a diverse array of transposable elements that demonstrate an ancestry of horizontal exchange with metazoans.

    PubMed

    Parisot, Nicolas; Pelin, Adrian; Gasc, Cyrielle; Polonais, Valérie; Belkorchia, Abdel; Panek, Johan; El Alaoui, Hicham; Biron, David G; Brasset, Emilie; Vaury, Chantal; Peyret, Pierre; Corradi, Nicolas; Peyretaillade, Éric; Lerat, Emmanuelle

    2014-09-01

    Microsporidian genomes are the leading models to understand the streamlining in response to a pathogenic lifestyle; they are gene-poor and often possess small genomes. In this study, we show a feature of microsporidian genomes that contrasts this pattern of genome reduction. Specifically, genome investigations targeted at Anncaliia algerae, a human pathogen with a genome size of 23 Mb, revealed the presence of a hitherto undetected diversity in transposable elements (TEs). A total of 240 TE families per genome were identified, exceeding that found in many free-living fungi, and searches of microsporidian species revealed that these mobile elements represent a significant portion of their coding repertoire. Their phylogenetic analysis revealed that many cases of ancestry involve recent and bidirectional horizontal transfers with metazoans. The abundance and horizontal transfer origin of microsporidian TEs highlight a novel dimension of genome evolution in these intracellular pathogens, demonstrating that factors beyond reduction are at play in their diversification. PMID:25172905

  2. Microsporidian Genomes Harbor a Diverse Array of Transposable Elements that Demonstrate an Ancestry of Horizontal Exchange with Metazoans

    PubMed Central

    Gasc, Cyrielle; Polonais, Valérie; Belkorchia, Abdel; Panek, Johan; El Alaoui, Hicham; Biron, David G.; Brasset, Émilie; Vaury, Chantal; Peyret, Pierre; Corradi, Nicolas; Peyretaillade, Éric; Lerat, Emmanuelle

    2014-01-01

    Microsporidian genomes are the leading models to understand the streamlining in response to a pathogenic lifestyle; they are gene-poor and often possess small genomes. In this study, we show a feature of microsporidian genomes that contrasts this pattern of genome reduction. Specifically, genome investigations targeted at Anncaliia algerae, a human pathogen with a genome size of 23 Mb, revealed the presence of a hitherto undetected diversity in transposable elements (TEs). A total of 240 TE families per genome were identified, exceeding that found in many free-living fungi, and searches of microsporidian species revealed that these mobile elements represent a significant portion of their coding repertoire. Their phylogenetic analysis revealed that many cases of ancestry involve recent and bidirectional horizontal transfers with metazoans. The abundance and horizontal transfer origin of microsporidian TEs highlight a novel dimension of genome evolution in these intracellular pathogens, demonstrating that factors beyond reduction are at play in their diversification. PMID:25172905

  3. Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication

    PubMed Central

    2011-01-01

    Background: DNA target enrichment by micro-array capture combined with high throughput sequencing technologies provides the possibility to obtain large amounts of sequence data (e.g. whole mitochondrial DNA genomes) from multiple individuals at relatively low costs. Previously, whole mitochondrial genome data for domestic horses (Equus caballus) were limited to only a few specimens and only short parts of the mtDNA genome (especially the hypervariable region) were investigated for larger sample sets. Results: In this study we investigated whole mitochondrial genomes of 59 domestic horses from 44 breeds and a single Przewalski horse (Equus przewalski) using a recently described multiplex micro-array capture approach. We found 473 variable positions within the domestic horses, 292 of which are parsimony-informative, providing a well resolved phylogenetic tree. Our divergence time estimate suggests that the mitochondrial genomes of modern horse breeds shared a common ancestor around 93,000 years ago and no later than 38,000 years ago. A Bayesian skyline plot (BSP) reveals a significant population expansion beginning 6,000-8,000 years ago with an ongoing exponential growth until the present, similar to other domestic animal species. Our data further suggest that a large sample of wild horse diversity was incorporated into the domestic population; specifically, at least 46 of the mtDNA lineages observed in domestic horses (73%) already existed before the beginning of domestication about 5,000 years ago. Conclusions: Our study provides a window into the maternal origins of extant domestic horses and confirms that modern domestic breeds present a wide sample of the mtDNA diversity found in ancestral, now extinct, wild horse populations. The data obtained allow us to detect a population expansion event coinciding with the beginning of domestication and to estimate both the minimum number of female horses incorporated into the domestic gene pool and the time depth of the

  4. Genome-scale phylogenetic function annotation of large and diverse protein families

    PubMed Central

    Engelhardt, Barbara E.; Jordan, Michael I.; Srouji, John R.; Brenner, Steven E.

    2011-01-01

    The Statistical Inference of Function Through Evolutionary Relationships (SIFTER) framework uses a statistical graphical model that applies phylogenetic principles to automate precise protein function prediction. Here we present a revised approach (SIFTER version 2.0) that enables annotations on a genomic scale. SIFTER 2.0 produces equivalently precise predictions compared to the earlier version on a carefully studied family and on a collection of 100 protein families. We have added an approximation method to SIFTER 2.0 and show a 500-fold improvement in speed with minimal impact on prediction results in the functionally diverse sulfotransferase protein family. On the Nudix protein family, previously inaccessible to the SIFTER framework because of the 66 possible molecular functions, SIFTER achieved 47.4% accuracy on experimental data (where BLAST achieved 34.0%). Finally, we used SIFTER to annotate all of the Schizosaccharomyces pombe proteins with experimental functional characterizations, based on annotations from proteins in 46 fungal genomes. SIFTER precisely predicted molecular function for 45.5% of the characterized proteins in this genome, as compared with four current function prediction methods that precisely predicted function for 62.6%, 30.6%, 6.0%, and 5.7% of these proteins. We use both precision-recall curves and ROC analyses to compare these genome-scale predictions across the different methods and to assess performance on different types of applications. SIFTER 2.0 is capable of predicting protein molecular function for large and functionally diverse protein families using an approximate statistical model, enabling phylogenetics-based protein function prediction for genome-wide analyses. The code for SIFTER and protein family data are available at http://sifter.berkeley.edu. PMID:21784873

  5. Comparative genomics of plant-associated Pseudomonas spp.: Insights into diversity and inheritance of traits involved in multitrophic interactions

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We provide here a comparative genome analysis of the Pseudomonas fluorescens group, including seven new genomic sequences for plant-associated strains. These strains exhibit a diverse spectrum of traits involved in biological control and other multitrophic interactions with plants, microbes, and ins...

  6. The diversity of shell matrix proteins: genome-wide investigation of the pearl oyster, Pinctada fucata.

    PubMed

    Miyamoto, Hiroshi; Endo, Hirotoshi; Hashimoto, Naoki; Limura, Kurin; Isowa, Yukinobu; Kinoshita, Shigeharu; Kotaki, Tomohiro; Masaoka, Tetsuji; Miki, Takumi; Nakayama, Seiji; Nogawa, Chihiro; Notazawa, Atsuto; Ohmori, Fumito; Sarashina, Isao; Suzuki, Michio; Takagi, Ryousuke; Takahashi, Jun; Takeuchi, Takeshi; Yokoo, Naoki; Satoh, Nori; Toyohara, Haruhiko; Miyashita, Tomoyuki; Wada, Hiroshi; Samata, Tetsuro; Endo, Kazuyoshi; Nagasawa, Hiromichi; Asakawa, Shuichi; Watabe, Shugo

    2013-10-01

    In molluscs, shell matrix proteins are associated with biomineralization, a biologically controlled process that involves nucleation and growth of calcium carbonate crystals. Identification and characterization of shell matrix proteins are important for better understanding of the adaptive radiation of a large variety of molluscs. We searched the draft genome sequence of the pearl oyster Pinctada fucata and annotated 30 different kinds of shell matrix proteins. Of these, we identified Perlucin, ependymin-related protein and SPARC as common genes shared by bivalves and gastropods; however, most gastropod shell matrix proteins were not found in the P. fucata genome. Glycine-rich proteins were conserved in the genus Pinctada. Another important finding with regard to these annotated genes was that numerous shell matrix proteins are encoded by more than one gene; e.g., three ACCBP-like proteins, three CaLPs, five chitin synthase-like proteins, two N16 proteins (pearlins), 10 N19 proteins, two nacreins, four Pifs, nine shematrins, two prismalin-14 proteins, and 21 tyrosinases. This diversity of shell matrix proteins may be implicated in the morphological diversity of mollusc shells. The annotated genes reported here can be searched in P. fucata gene models version 1.1 and genome assembly version 1.0 ( http://marinegenomics.oist.jp/pinctada_fucata ). These genes should provide a useful resource for studies of the genetic basis of biomineralization and evaluation of the role of shell matrix proteins as an evolutionary toolkit among the molluscs. PMID:24125645

  7. Genomic diversity and multiple origins of tetraploid (2n = 78, 80) Glycine tomentella.

    PubMed

    Kollipara, K P; Singh, R J; Hymowitz, T

    1994-06-01

    Among 15 wild perennial species of the genus Glycine Willd. subgenus Glycine, G. tomentella is exceptional. It is composed of four cytotypes (2n = 38, 40, 78, 80), is diverse in morphological features, and covers a wide geographical area. The objectives of this study were to uncover the genomic diversity in 78- and 80-chromosome cytotypes through a multidisciplinary approach, using cytogenetic, biochemical, and molecular methods, to verify previously identified isozyme groupings and to determine their possible origins. The cytogenetic observations, total seed protein and protease inhibitor profile comparisons, and the phylogenetic analysis of restriction fragment length polymorphisms identified three distinct groups (T1, T5, T6) among aneutetraploid (2n = 78) and four distinct groups (T2, T3, T4, T7) among tetraploid (2n = 80) G. tomentella accessions. The groupings were congruent with those of isozyme analysis. Tetraploid accessions from Indonesia were assigned to a new group, T7, based on the present study. Morphology, cytology, and seed protein banding patterns of synthetic tetraploids indicated that the T1 and T5 group aneutetraploids were composed of D3D3EE and AAEE genomes, respectively, and the T2 group tetraploid accessions consisted of AAD3D3 genomes. Various groups within the 78- and 80-chromosome G. tomentella were suggested to have originated in Australia by allopolyploidization, most likely through multiple independent events. PMID:18470090

  8. Trans-ethnic genome-wide association studies: advantages and challenges of mapping in diverse populations.

    PubMed

    Li, Yun R; Keating, Brendan J

    2014-01-01

    Genome-wide association studies (GWASs) are the method most often used by geneticists to interrogate the human genome, and they provide a cost-effective way to identify the genetic variants underpinning complex traits and diseases. Most initial GWASs have focused on genetically homogeneous cohorts from European populations given the limited availability of ethnic minority samples and so as to limit population stratification effects. Transethnic studies have been invaluable in explaining the heritability of common quantitative traits, such as height, and in examining the genetic architecture of complex diseases, such as type 2 diabetes. They provide an opportunity for large-scale signal replication in independent populations and for cross-population meta-analyses to boost statistical power. In addition, transethnic GWASs enable prioritization of candidate genes, fine-mapping of functional variants, and potentially identification of SNPs associated with disease risk in admixed populations, by taking advantage of natural differences in genomic linkage disequilibrium across ethnically diverse populations. Recent efforts to assess the biological function of variants identified by GWAS have highlighted the need for large-scale replication, meta-analyses and fine-mapping across worldwide populations of ethnically diverse genetic ancestries. Here, we review recent advances and new approaches that are important to consider when performing, designing or interpreting transethnic GWASs, and we highlight existing challenges, such as the limited ability to handle heterogeneity in linkage disequilibrium across populations and limitations in dissecting complex architectures, such as those found in recently admixed populations. PMID:25473427

  9. Anaplasma marginale: Diversity, Virulence, and Vaccine Landscape through a Genomics Approach.

    PubMed

    Quiroz-Castañeda, Rosa Estela; Amaro-Estrada, Itzel; Rodríguez-Camarillo, Sergio Darío

    2016-01-01

    Several efforts have been made around the world to understand the genetic diversity of A. marginale. This rickettsia affects a significant number of ruminants, causing bovine anaplasmosis, so its virulence and mode of transmission have drawn interest from a molecular point of view; recently, genomics research has also been performed to elucidate genes and proteins with potential as antigens. Unfortunately, so far, we still do not have a recombinant anaplasmosis vaccine. In this review, we present a landscape of the multiple approaches carried out from the genomic perspective to generate valuable information that could be used in a holistic way to finally develop an anaplasmosis vaccine. These approaches include the analysis of the genetic diversity of A. marginale and how this affects control measures for the disease. Anaplasmosis vaccine development is also reviewed, from conventional vaccinomics to genome-based vaccinology approaches built on the reported proteomics, metabolomics, and transcriptomics analyses. The use of these new omics approaches will undoubtedly reveal new targets of interest in the near future, comprising information on potential antigens and the immunogenic effect of A. marginale proteins. PMID:27610385

  10. Anaplasma marginale: Diversity, Virulence, and Vaccine Landscape through a Genomics Approach

    PubMed Central

    Amaro-Estrada, Itzel; Rodríguez-Camarillo, Sergio Darío

    2016-01-01

    Several efforts have been made around the world to understand the genetic diversity of A. marginale. This rickettsia affects a significant number of ruminants, causing bovine anaplasmosis, so its virulence and mode of transmission have drawn interest from a molecular point of view; recently, genomics research has also been performed to elucidate genes and proteins with potential as antigens. Unfortunately, so far, we still do not have a recombinant anaplasmosis vaccine. In this review, we present a landscape of the multiple approaches carried out from the genomic perspective to generate valuable information that could be used in a holistic way to finally develop an anaplasmosis vaccine. These approaches include the analysis of the genetic diversity of A. marginale and how this affects control measures for the disease. Anaplasmosis vaccine development is also reviewed, from conventional vaccinomics to genome-based vaccinology approaches built on the reported proteomics, metabolomics, and transcriptomics analyses. The use of these new omics approaches will undoubtedly reveal new targets of interest in the near future, comprising information on potential antigens and the immunogenic effect of A. marginale proteins. PMID:27610385

  11. Challenges of metagenomics and single-cell genomics approaches for exploring cyanobacterial diversity.

    PubMed

    Davison, Michelle; Hall, Eric; Zare, Richard; Bhaya, Devaki

    2015-10-01

    Cyanobacteria have played a crucial role in the history of early earth and continue to be instrumental in shaping our planet, yet applications of cutting edge technology have not yet been widely used to explore cyanobacterial diversity. To provide adequate background, we briefly review current sequencing technologies and their innovative uses in genomics and metagenomics. Next, we focus on current cell capture technologies and the challenges of using them with cyanobacteria. We illustrate the utility in coupling breakthroughs in DNA amplification with cell capture platforms, with an example of microfluidic isolation and subsequent targeted amplicon sequencing from individual terrestrial thermophilic cyanobacteria. Single cells of thermophilic, unicellular Synechococcus sp. JA-2-3-B'a(2-13) (Syn OS-B') were sorted in a microfluidic device, lysed, and subjected to whole genome amplification by multiple displacement amplification. We amplified regions from specific CRISPR spacer arrays, which are known to be highly diverse, contain semi-palindromic repeats which form secondary structure, and can be difficult to amplify. Cell capture, lysis, and genome amplification on a microfluidic device have been optimized, setting a stage for further investigations of individual cyanobacterial cells isolated directly from natural populations. PMID:25515769

  12. Genomic diversity of EPEC associated with clinical presentations of differing severity.

    PubMed

    Hazen, Tracy H; Donnenberg, Michael S; Panchalingam, Sandra; Antonio, Martin; Hossain, Anowar; Mandomando, Inacio; Ochieng, John Benjamin; Ramamurthy, Thandavarayan; Tamboura, Boubou; Qureshi, Shahida; Quadri, Farheen; Zaidi, Anita; Kotloff, Karen L; Levine, Myron M; Barry, Eileen M; Kaper, James B; Rasko, David A; Nataro, James P

    2016-01-01

    Enteropathogenic Escherichia coli (EPEC) are diarrhoeagenic E. coli, and are a significant cause of gastrointestinal illness among young children in developing countries. Typical EPEC are identified by the presence of the bundle-forming pilus encoded by a virulence plasmid, which has been linked to an increased severity of illness, while atypical EPEC lack this feature. Comparative genomics of 70 total EPEC from lethal (LI), non-lethal symptomatic (NSI) or asymptomatic (AI) cases of diarrhoeal illness in children enrolled in the Global Enteric Multicenter Study was used to investigate the genomic differences in EPEC isolates obtained from individuals with various clinical outcomes. A comparison of the genomes of isolates from different clinical outcomes identified genes that were significantly more prevalent in EPEC isolates of symptomatic and lethal outcomes than in EPEC isolates of asymptomatic outcomes. These EPEC isolates exhibited previously unappreciated phylogenomic diversity and combinations of virulence factors. These comparative results highlight the diversity of the pathogen, as well as the complexity of the EPEC virulence factor repertoire. PMID:27571975

  13. Whole-Genome Genetic Diversity in a Sample of Australians with Deep Aboriginal Ancestry

    PubMed Central

    McEvoy, Brian P.; Lind, Joanne M.; Wang, Eric T.; Moyzis, Robert K.; Visscher, Peter M.; van Holst Pellekaan, Sheila M.; Wilton, Alan N.

    2010-01-01

    Australia was probably settled soon after modern humans left Africa, but details of this ancient migration are not well understood. Debate centers on whether the Pleistocene Sahul continent (composed of New Guinea, Australia, and Tasmania) was first settled by a single wave followed by regional divergence into Aboriginal Australian and New Guinean populations (common origin) or whether different parts of the continent were initially populated independently. Australia has been the subject of relatively few DNA studies even though understanding regional variation in genomic structure and diversity will be important if disease-association mapping methods are to be successfully evaluated and applied across populations. We report on a genome-wide investigation of Australian Aboriginal SNP diversity in a sample of participants from the Riverine region. The phylogenetic relationship of these Aboriginal Australians to a range of other global populations demonstrates a deep common origin with Papuan New Guineans and Melanesians, with little evidence of substantial later migration until the very recent arrival of European colonists. The study provides valuable and robust insights into an early and important phase of human colonization of the globe. A broader survey of Australia, including diverse geographic sample populations, will be required to fully appreciate the continent's unique population history and consequent genetic heritage, as well as the importance of both to the understanding of health issues. PMID:20691402

  14. A Comparative Genomic Analysis of Diverse Clonal Types of Enterotoxigenic Escherichia coli Reveals Pathovar-Specific Conservation

    PubMed Central

    Sahl, Jason W.; Steinsland, Hans; Redman, Julia C.; Angiuoli, Samuel V.; Nataro, James P.; Sommerfelt, Halvor; Rasko, David A.

    2011-01-01

    Enterotoxigenic Escherichia coli (ETEC) is a major cause of diarrheal illness in children less than 5 years of age in low- and middle-income nations, whereas it is an emerging enteric pathogen in industrialized nations. Despite being an important cause of diarrhea, little is known about the genomic composition of ETEC. To address this, we sequenced the genomes of five ETEC isolates obtained from children in Guinea-Bissau with diarrhea. These five isolates represent distinct and globally dominant ETEC clonal groups. Comparative genomic analyses utilizing a gene-independent whole-genome alignment method demonstrated that sequenced ETEC strains share approximately 2.7 million bases of genomic sequence. Phylogenetic analysis of this “core genome” confirmed the diverse history of the ETEC pathovar and provides a finer resolution of the E. coli relationships than multilocus sequence typing. No identified genomic regions were conserved exclusively in all ETEC genomes; however, we identified more genomic content conserved among ETEC genomes than among non-ETEC E. coli genomes, suggesting that ETEC isolates share a genomic core. Comparisons of known virulence and of surface-exposed and colonization factor genes across all sequenced ETEC genomes not only identified variability but also indicated that some antigens are restricted to the ETEC pathovar. Overall, the generation of these five genome sequences, in addition to the two previously generated ETEC genomes, highlights the genomic diversity of ETEC. These studies increase our understanding of ETEC evolution, as well as provide insight into virulence factors and conserved proteins, which may be targets for vaccine development. PMID:21078854

  15. Genomic instability in B-cells and diversity of recombinations that activate c-myc.

    PubMed

    Janz, S; Jones, G M; Müller, J R; Potter, M

    1995-01-01

    Genetic rearrangements activating the proto-oncogene c-myc comprise a mandatory oncogenic step in plasma cell tumor development in BALB/cAnPt mice. In the majority of plasmacytomas, c-myc activating rearrangements take the form of reciprocal chromosomal translocations t(12;15) that juxtapose c-myc to the immunoglobulin heavy chain alpha locus (IgH alpha) in particular the switch alpha region (S alpha). The genetic basis for the prevalence of S alpha/c-myc recombinations in BALB/cAnPt plasmacytomas is not known but may be related to a hypothetical regional genomic instability of the c-myc and IgH alpha loci in BALB/cAnPt mice. We wished to test whether the genomic instability of both loci might be revealed by the diversity of genetic recombinations that can be observed in IgH alpha and c-myc. We employed PCR methods to detect new recombinations of c-myc and IgH alpha in the preneoplastic stage of plasma cell tumor development and found that c-myc can be joined to more genes or genomic regions than known before. This is indicative but does not formally prove a particular genomic instability of c-myc and IgH alpha in BALB/cAnPt B cells. Since defective DNA repair provides a mechanistic explanation for genomic instability, we measured the efficiency of repair in IgH alpha and c-myc using an assay that quantitates the removal of UV-induced pyrimidine dimers within specific genomic regions. We used plasmacytoma XRPC 24 as a model system and found that both IgH alpha and c-myc were poorly repaired, whereas c-abl, a proto-oncogene not related to conventional pristane-induced plasmacytoma-genesis, was efficiently repaired. PMID:7895512

  16. Genomic diversity of necrotic enteritis-associated strains of Clostridium perfringens: a review.

    PubMed

    Lacey, Jake A; Johanesen, Priscilla A; Lyras, Dena; Moore, Robert J

    2016-06-01

    The investigation of genomic variation between Clostridium perfringens isolates from poultry has been an important tool to enhance our understanding of the genetic basis of strain pathogenicity and the epidemiology of virulent and avirulent strains within the context of necrotic enteritis (NE). The earliest studies used whole genome profiling techniques such as pulsed-field gel electrophoresis to differentiate isolates and determine their relative levels of relatedness. DNA sequencing has been used to investigate genetic variation in (a) individual genes, such as those encoding the alpha and NetB toxins; (b) panels of housekeeping genes for multi-locus sequence typing and (c) most recently whole genome sequencing to build a more complete picture of genomic differences between isolates. Conclusions drawn from these studies include: differential carriage of large conjugative plasmids accounts for a large proportion of inter-strain differences; plasmid-encoded genes are more highly conserved than chromosomal genes, perhaps indicating a relatively recent origin for the plasmids; isolates from NE-affected birds fall into three distinct sequence-based clades while non-pathogenic isolates from healthy birds tend to be more genomically diverse. Overall, the NE causing strains are closely related to C. perfringens isolates from other birds and other diseases whereas the non-pathogenic poultry strains are generally more remotely related to either the pathogenic strains or the strains from other birds. Genomic analysis has indicated that genes in addition to netB are associated with NE pathogenic isolates. Collectively, this work has resulted in a deeper understanding of the pathogenesis of this important poultry disease. PMID:26949841

  17. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.

    PubMed

    Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C

    2015-10-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. PMID:26253660
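    The counts reported in this abstract can be cross-checked with a few lines of arithmetic. This is a minimal sketch: the abstract does not state which denominator was used for the 14.4% figure, so the code assumes it is all 229 screened genomes (224 bacteria plus 5 archaea); the variable names are illustrative.

```python
# Gene cluster counts per bacteriocin class, as reported in the abstract.
clusters_by_class = {
    "lanthipeptide": 20,
    "sactipeptide": 11,
    "class II bacteriocin": 7,
    "class III bacteriocin": 8,
}
total_clusters = sum(clusters_by_class.values())

# Assumption: the frequency is computed over every genome screened
# (224 ruminal bacteria + 5 ruminal archaea), not over bacteria alone.
genomes_screened = 224 + 5
strains_with_clusters = 33
frequency_pct = round(100 * strains_with_clusters / genomes_screened, 1)

print(total_clusters)  # 46
print(frequency_pct)   # 14.4
```

    Under that assumption both reported figures are reproduced: the four classes sum to the stated 46 clusters, and 33 of 229 genomes gives 14.4%. Dividing by the 224 bacterial genomes alone would instead give about 14.7%.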

  18. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes

    PubMed Central

    Azevedo, Analice C.; Bento, Cláudia B. P.; Ruiz, Jeronimo C.; Queiroz, Marisa V.

    2015-01-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. PMID:26253660

  19. Genomic and Metabolic Diversity of Marine Group I Thaumarchaeota in the Mesopelagic of Two Subtropical Gyres

    PubMed Central

    Swan, Brandon K.; Chaffin, Mark D.; Martinez-Garcia, Manuel; Morrison, Hilary G.; Field, Erin K.; Poulton, Nicole J.; Masland, E. Dashiell P.; Harris, Christopher C.; Sczyrba, Alexander; Chain, Patrick S. G.; Koren, Sergey; Woyke, Tanja; Stepanauskas, Ramunas

    2014-01-01

    Marine Group I (MGI) Thaumarchaeota are one of the most abundant and cosmopolitan chemoautotrophs within the global dark ocean. To date, no representatives of this archaeal group retrieved from the dark ocean have been successfully cultured. We used single cell genomics to investigate the genomic and metabolic diversity of thaumarchaea within the mesopelagic of the subtropical North Pacific and South Atlantic Ocean. Phylogenetic and metagenomic recruitment analysis revealed that MGI single amplified genomes (SAGs) are genetically and biogeographically distinct from existing thaumarchaea cultures obtained from surface waters. Confirming prior studies, we found genes encoding proteins for aerobic ammonia oxidation and the hydrolysis of urea, which may be used for energy production, as well as genes involved in 3-hydroxypropionate/4-hydroxybutyrate and oxidative tricarboxylic acid pathways. A large proportion of protein sequences identified in MGI SAGs were absent in the marine cultures Cenarchaeum symbiosum and Nitrosopumilus maritimus, thus expanding the predicted protein space for this archaeal group. Identifiable genes located on genomic islands with low metagenome recruitment capacity were enriched in cellular defense functions, likely in response to viral infections or grazing. We show that MGI Thaumarchaeota in the dark ocean may have more flexibility in potential energy sources and adaptations to biotic interactions than the existing, surface-ocean cultures. PMID:24743558

  20. Characterizing neutral genomic diversity and selection signatures in indigenous populations of Moroccan goats (Capra hircus) using WGS data

    PubMed Central

    Benjelloun, Badr; Alberto, Florian J.; Streeter, Ian; Boyer, Frédéric; Coissac, Eric; Stucki, Sylvie; BenBati, Mohammed; Ibnelbachyr, Mustapha; Chentouf, Mouad; Bechchari, Abdelmajid; Leempoel, Kevin; Alberti, Adriana; Engelen, Stefan; Chikhi, Abdelkader; Clarke, Laura; Flicek, Paul; Joost, Stéphane; Taberlet, Pierre; Pompanon, François

    2015-01-01

    Since the time of their domestication, goats (Capra hircus) have evolved in a large variety of locally adapted populations in response to different human and environmental pressures. In the present era, many indigenous populations are threatened with extinction due to their substitution by cosmopolitan breeds, while they might represent highly valuable genomic resources. It is thus crucial to characterize the neutral and adaptive genetic diversity of indigenous populations. A fine characterization of whole genome variation in farm animals is now possible by using new sequencing technologies. We sequenced the complete genome at 12× coverage of 44 goats geographically representative of the three phenotypically distinct indigenous populations in Morocco. The study of mitochondrial genomes showed a high diversity exclusively restricted to the haplogroup A. The 44 nuclear genomes showed a very high diversity (24 million variants) associated with low linkage disequilibrium. The overall genetic diversity was weakly structured according to geography and phenotypes. When looking for signals of positive selection in each population we identified many candidate genes, several of which gave insights into the metabolic pathways or biological processes involved in the adaptation to local conditions (e.g., panting in warm/desert conditions). This study highlights the interest of WGS data to characterize livestock genomic diversity. It illustrates the valuable genetic richness present in indigenous populations that have to be sustainably managed and may represent valuable genetic resources for the long-term preservation of the species. PMID:25904931

  1. Characterizing neutral genomic diversity and selection signatures in indigenous populations of Moroccan goats (Capra hircus) using WGS data.

    PubMed

    Benjelloun, Badr; Alberto, Florian J; Streeter, Ian; Boyer, Frédéric; Coissac, Eric; Stucki, Sylvie; BenBati, Mohammed; Ibnelbachyr, Mustapha; Chentouf, Mouad; Bechchari, Abdelmajid; Leempoel, Kevin; Alberti, Adriana; Engelen, Stefan; Chikhi, Abdelkader; Clarke, Laura; Flicek, Paul; Joost, Stéphane; Taberlet, Pierre; Pompanon, François

    2015-01-01

    Since the time of their domestication, goats (Capra hircus) have evolved in a large variety of locally adapted populations in response to different human and environmental pressures. In the present era, many indigenous populations are threatened with extinction due to their substitution by cosmopolitan breeds, while they might represent highly valuable genomic resources. It is thus crucial to characterize the neutral and adaptive genetic diversity of indigenous populations. A fine characterization of whole genome variation in farm animals is now possible by using new sequencing technologies. We sequenced the complete genome at 12× coverage of 44 goats geographically representative of the three phenotypically distinct indigenous populations in Morocco. The study of mitochondrial genomes showed a high diversity exclusively restricted to the haplogroup A. The 44 nuclear genomes showed a very high diversity (24 million variants) associated with low linkage disequilibrium. The overall genetic diversity was weakly structured according to geography and phenotypes. When looking for signals of positive selection in each population we identified many candidate genes, several of which gave insights into the metabolic pathways or biological processes involved in the adaptation to local conditions (e.g., panting in warm/desert conditions). This study highlights the interest of WGS data to characterize livestock genomic diversity. It illustrates the valuable genetic richness present in indigenous populations that have to be sustainably managed and may represent valuable genetic resources for the long-term preservation of the species. PMID:25904931

  2. Genome Sequencing of Mycobacterium abscessus Isolates from Patients in the United States and Comparisons to Globally Diverse Clinical Strains

    PubMed Central

    Davidson, Rebecca M.; Hasan, Nabeeh A.; Reynolds, Paul R.; Totten, Sarah; Garcia, Benjamin; Levin, Adrah; Ramamoorthy, Preveen; Heifets, Leonid; Daley, Charles L.

    2014-01-01

    Nontuberculous mycobacterial infections caused by Mycobacterium abscessus are responsible for a range of disease manifestations from pulmonary to skin infections and are notoriously difficult to treat, due to innate resistance to many antibiotics. Previous population studies of clinical M. abscessus isolates utilized multilocus sequence typing or pulsed-field gel electrophoresis, but high-resolution examinations of genetic diversity at the whole-genome level have not been well characterized, particularly among clinical isolates derived in the United States. We performed whole-genome sequencing of 11 clinical M. abscessus isolates derived from eight U.S. patients with pulmonary nontuberculous mycobacterial infections, compared them to 30 globally diverse clinical isolates, and investigated intrapatient genomic diversity and evolution. Phylogenomic analyses revealed a cluster of closely related U.S. and Western European M. abscessus subsp. abscessus isolates that are genetically distinct from other European isolates and all Asian isolates. Large-scale variation analyses suggested genome content differences of 0.3 to 8.3%, relative to the reference strain ATCC 19977T. Longitudinally sampled isolates showed very few single-nucleotide polymorphisms and correlated genomic deletion patterns, suggesting homogeneous infection populations. Our study explores the genomic diversity of clinical M. abscessus strains from multiple continents and provides insight into the genome plasticity of an opportunistic pathogen. PMID:25056330

  3. Development of a Custom-Designed, Pan Genomic DNA Microarray to Characterize Strain-Level Diversity among Cronobacter spp.

    PubMed Central

    Tall, Ben Davies; Gangiredla, Jayanthi; Gopinath, Gopal R.; Yan, Qiongqiong; Chase, Hannah R.; Lee, Boram; Hwang, Seongeun; Trach, Larisa; Park, Eunbi; Yoo, YeonJoo; Chung, TaeJung; Jackson, Scott A.; Patel, Isha R.; Sathyamoorthy, Venugopal; Pava-Ripoll, Monica; Kotewicz, Michael L.; Carter, Laurenda; Iversen, Carol; Pagotto, Franco; Stephan, Roger; Lehner, Angelika; Fanning, Séamus; Grim, Christopher J.

    2015-01-01

    Cronobacter species cause infections in all age groups; however, neonates are at highest risk and remain the most susceptible age group for life-threatening invasive disease. The genus contains seven species: Cronobacter sakazakii, Cronobacter malonaticus, Cronobacter turicensis, Cronobacter muytjensii, Cronobacter dublinensis, Cronobacter universalis, and Cronobacter condimenti. Despite an abundance of published genomes of these species, genomics-based epidemiology of the genus is not well established. The gene content of a diverse group of 126 unique Cronobacter and taxonomically related isolates was determined using a pan genomic-based DNA microarray as a genotyping tool and as a means to identify outbreak isolates for food safety, environmental, and clinical surveillance purposes. The microarray constitutes 19,287 independent genes representing 15 Cronobacter genomes and 18 plasmids and 2,371 virulence factor genes of phylogenetically related Gram-negative bacteria. The Cronobacter microarray was able to distinguish the seven Cronobacter species from one another and from non-Cronobacter species; and within each species, strains grouped into distinct clusters based on their genomic diversity. These results also support the phylogenic divergence of the genus and clearly highlight the genomic diversity among each member of the genus. The current study establishes a powerful platform for further genomics research of this diverse genus, an important prerequisite toward the development of future countermeasures against this foodborne pathogen in the food safety and clinical arenas. PMID:25984509

  4. Genomic analysis of the immune gene repertoire of amphioxus reveals extraordinary innate complexity and diversity

    PubMed Central

    Huang, Shengfeng; Yuan, Shaochun; Guo, Lei; Yu, Yanhong; Li, Jun; Wu, Tao; Liu, Tong; Yang, Manyi; Wu, Kui; Liu, Huiling; Ge, Jin; Yu, Yingcai; Huang, Huiqing; Dong, Meiling; Yu, Cuiling; Chen, Shangwu; Xu, Anlong

    2008-01-01

    It has been speculated that before vertebrates evolved somatic diversity-based adaptive immunity, the germline-encoded diversity of innate immunity may have been more developed. Amphioxus occupies the basal position of the chordate phylum and hence is an important reference to the evolution of vertebrate immunity. Here we report the first comprehensive genomic survey of the immune gene repertoire of the amphioxus Branchiostoma floridae. It has been reported that the purple sea urchin has a vastly expanded innate receptor repertoire not previously seen in other species, which includes 222 toll-like receptors (TLRs), 203 NOD/NALP-like receptors (NLRs), and 218 scavenger receptors (SRs). We discovered that the amphioxus genome contains comparable expansion with 71 TLR gene models, 118 NLR models, and 270 SR models. Amphioxus also expands other receptor-like families, including 1215 C-type lectin models, 240 LRR and IGcam-containing models, 1363 other LRR-containing models, 75 C1q-like models, 98 ficolin-like models, and hundreds of models containing complement-related domains. The expansion is not restricted to receptors but is likely to extend to intermediate signal transducers because there are 58 TIR adapter-like models, 36 TRAF models, 44 initiator caspase models, and 541 death-fold domain-containing models in the genome. Amphioxus also has a sophisticated TNF system and a complicated complement system not previously seen in other invertebrates. Besides the increase of gene number, domain combinations of immune proteins are also increased. Altogether, this survey suggests that the amphioxus, a species without vertebrate-type adaptive immunity, holds extraordinary innate complexity and diversity. PMID:18562681

  5. Genomic analysis of the immune gene repertoire of amphioxus reveals extraordinary innate complexity and diversity.

    PubMed

    Huang, Shengfeng; Yuan, Shaochun; Guo, Lei; Yu, Yanhong; Li, Jun; Wu, Tao; Liu, Tong; Yang, Manyi; Wu, Kui; Liu, Huiling; Ge, Jin; Yu, Yingcai; Huang, Huiqing; Dong, Meiling; Yu, Cuiling; Chen, Shangwu; Xu, Anlong

    2008-07-01

    It has been speculated that before vertebrates evolved somatic diversity-based adaptive immunity, the germline-encoded diversity of innate immunity may have been more developed. Amphioxus occupies the basal position of the chordate phylum and hence is an important reference to the evolution of vertebrate immunity. Here we report the first comprehensive genomic survey of the immune gene repertoire of the amphioxus Branchiostoma floridae. It has been reported that the purple sea urchin has a vastly expanded innate receptor repertoire not previously seen in other species, which includes 222 toll-like receptors (TLRs), 203 NOD/NALP-like receptors (NLRs), and 218 scavenger receptors (SRs). We discovered that the amphioxus genome contains comparable expansion with 71 TLR gene models, 118 NLR models, and 270 SR models. Amphioxus also expands other receptor-like families, including 1215 C-type lectin models, 240 LRR and IGcam-containing models, 1363 other LRR-containing models, 75 C1q-like models, 98 ficolin-like models, and hundreds of models containing complement-related domains. The expansion is not restricted to receptors but is likely to extend to intermediate signal transducers because there are 58 TIR adapter-like models, 36 TRAF models, 44 initiator caspase models, and 541 death-fold domain-containing models in the genome. Amphioxus also has a sophisticated TNF system and a complicated complement system not previously seen in other invertebrates. Besides the increase of gene number, domain combinations of immune proteins are also increased. Altogether, this survey suggests that the amphioxus, a species without vertebrate-type adaptive immunity, holds extraordinary innate complexity and diversity. PMID:18562681

  6. Genomic and Metagenomic Analysis of Diversity-Generating Retroelements Associated with Treponema denticola

    PubMed Central

    Nimkulrat, Sutichot; Lee, Heewook; Doak, Thomas G.; Ye, Yuzhen

    2016-01-01

    Diversity-generating retroelements (DGRs) are genetic cassettes that can produce massive protein sequence variation in prokaryotes. Presumably DGRs confer selective advantages to their hosts (bacteria or viruses) by generating variants of target genes—typically resulting in target proteins with altered ligand-binding specificity—through a specialized error-prone reverse transcription process. The only extensively studied DGR system is from the Bordetella phage BPP-1, although DGRs are predicted to exist in other species. Using bioinformatics analysis, we discovered that the DGR system associated with the Treponema denticola species (a human oral-associated periopathogen) is dynamic (with gains/losses of the system found in the isolates) and diverse (with multiple types found in isolated genomes and the human microbiota). The T. denticola DGR is found in only nine of the 17 sequenced T. denticola strains. Analysis of the DGR-associated template regions and reverse transcriptase gene sequences revealed two types of DGR systems in T. denticola: the ATCC35405-type shared by seven isolates including ATCC35405; and the SP32-type shared by two isolates (SP32 and SP33), suggesting multiple DGR acquisitions. We detected additional variants of the T. denticola DGR systems in the human microbiomes, and found that the SP32-type DGR is more abundant than the ATCC35405-type in the healthy human oral microbiome, although the latter is found in more sequenced isolates. This is the first comprehensive study to characterize the DGRs associated with T. denticola in individual genomes as well as human microbiomes, demonstrating the importance of utilizing both individual genomes and metagenomes for characterizing the elements, and for analyzing their diversity and distribution in human populations. PMID:27375574

  7. Sex-Biased Evolutionary Forces Shape Genomic Patterns of Human Diversity

    PubMed Central

    Hammer, Michael F.; Mendez, Fernando L.; Cox, Murray P.; Woerner, August E.; Wall, Jeffrey D.

    2008-01-01

    Comparisons of levels of variability on the autosomes and X chromosome can be used to test hypotheses about factors influencing patterns of genomic variation. While a tremendous amount of nucleotide sequence data from across the genome is now available for multiple human populations, there has been no systematic effort to examine relative levels of neutral polymorphism on the X chromosome versus autosomes. We analyzed ∼210 kb of DNA sequencing data representing 40 independent noncoding regions on the autosomes and X chromosome from each of 90 humans from six geographically diverse populations. We correct for differences in mutation rates between males and females by considering the ratio of within-human diversity to human-orangutan divergence. We find that relative levels of genetic variation are higher than expected on the X chromosome in all six human populations. We test a number of alternative hypotheses to explain the excess polymorphism on the X chromosome, including models of background selection, changes in population size, and sex-specific migration in a structured population. While each of these processes may have a small effect on the relative ratio of X-linked to autosomal diversity, our results point to a systematic difference between the sexes in the variance in reproductive success; namely, the widespread effects of polygyny in human populations. We conclude that factors leading to a lower male versus female effective population size must be considered as important demographic variables in efforts to construct models of human demographic history and for understanding the forces shaping patterns of human genomic variability. PMID:18818765

  8. Substantial interindividual and limited intraindividual genomic diversity among tumors from men with metastatic prostate cancer.

    PubMed

    Kumar, Akash; Coleman, Ilsa; Morrissey, Colm; Zhang, Xiaotun; True, Lawrence D; Gulati, Roman; Etzioni, Ruth; Bolouri, Hamid; Montgomery, Bruce; White, Thomas; Lucas, Jared M; Brown, Lisha G; Dumpit, Ruth F; DeSarkar, Navonil; Higano, Celestia; Yu, Evan Y; Coleman, Roger; Schultz, Nikolaus; Fang, Min; Lange, Paul H; Shendure, Jay; Vessella, Robert L; Nelson, Peter S

    2016-04-01

    Tumor heterogeneity may reduce the efficacy of molecularly guided systemic therapy for cancers that have metastasized. To determine whether the genomic alterations in a single metastasis provide a reasonable assessment of the major oncogenic drivers of other dispersed metastases in an individual, we analyzed multiple tumors from men with disseminated prostate cancer through whole-exome sequencing, array comparative genomic hybridization (CGH) and RNA transcript profiling, and we compared the genomic diversity within and between individuals. In contrast to the substantial heterogeneity between men, there was limited diversity among metastases within an individual. The number of somatic mutations, the burden of genomic copy number alterations and aberrations in known oncogenic drivers were all highly concordant, as were metrics of androgen receptor (AR) activity and cell cycle activity. AR activity was inversely associated with cell proliferation, whereas the expression of Fanconi anemia (FA)-complex genes was correlated with elevated cell cycle progression, expression of the E2F transcription factor 1 (E2F1) and loss of retinoblastoma 1 (RB1). Men with somatic aberrations in FA-complex genes or in ATM serine/threonine kinase (ATM) exhibited significantly longer treatment-response durations to carboplatin than did men without defects in genes encoding DNA-repair proteins. Collectively, these data indicate that although exceptions exist, evaluating a single metastasis provides a reasonable assessment of the major oncogenic driver alterations that are present in disseminated tumors within an individual, and thus may be useful for selecting treatments on the basis of predicted molecular vulnerabilities. PMID:26928463

  9. Combining genomic sequencing methods to explore viral diversity and reveal potential virus-host interactions

    PubMed Central

    Chow, Cheryl-Emiliane T.; Winget, Danielle M.; White, Richard A.; Hallam, Steven J.; Suttle, Curtis A.

    2015-01-01

    Viral diversity and virus-host interactions in oxygen-starved regions of the ocean, also known as oxygen minimum zones (OMZs), remain relatively unexplored. Microbial community metabolism in OMZs alters nutrient and energy flow through marine food webs, resulting in biological nitrogen loss and greenhouse gas production. Thus, viruses infecting OMZ microbes have the potential to modulate community metabolism with resulting feedback on ecosystem function. Here, we describe viral communities inhabiting oxic surface (10 m) and oxygen-starved basin (200 m) waters of Saanich Inlet, a seasonally anoxic fjord on the coast of Vancouver Island, British Columbia using viral metagenomics and complete viral fosmid sequencing on samples collected between April 2007 and April 2010. Of 6459 open reading frames (ORFs) predicted across all 34 viral fosmids, 77.6% (n = 5010) had no homology to reference viral genomes. These fosmids recruited a higher proportion of viral metagenomic sequences from Saanich Inlet than from nearby northeastern subarctic Pacific Ocean (Line P) waters, indicating differences in the viral communities between coastal and open ocean locations. While functional annotations of fosmid ORFs were limited, recruitment to NCBI's non-redundant “nr” database and publicly available single-cell genomes identified putative viruses infecting marine thaumarchaeal and SUP05 proteobacteria to provide potential host linkages with relevance to coupled biogeochemical cycling processes in OMZ waters. Taken together, these results highlight the power of coupled analyses of multiple sequence data types, such as viral metagenomic and fosmid sequence data with prokaryotic single cell genomes, to chart viral diversity, elucidate genomic and ecological contexts for previously unclassifiable viral sequences, and identify novel host interactions in natural and engineered ecosystems. PMID:25914678

  10. Intraspecies Genomic Diversity and Long-Term Persistence of Bifidobacterium longum

    PubMed Central

    Chaplin, Andrei V.; Efimov, Boris A.; Smeianov, Vladimir V.; Kafarskaia, Lyudmila I.; Pikina, Alla P.; Shkoporov, Andrei N.

    2015-01-01

    Members of the genus Bifidobacterium are Gram-positive bacteria, representing a large part of the human infant microbiota and moderately common in adults. However, our knowledge about their diversity, intraspecific phylogeny and long-term persistence in humans is still limited. Bifidobacterium longum is generally considered to be the most common and prevalent species in the intestinal microbiota. In this work we studied whole genome sequences of 28 strains of B. longum, including 8 sequences described in this paper. Part of these strains were isolated from healthy children during a long observation period (up to 10 years between isolation from the same patient). The three known subspecies (longum, infantis and suis) could be clearly divided using sequence-based phylogenetic methods, gene content and the average nucleotide identity. The profiles of glycoside hydrolase genes reflected the different ecological specializations of these three subspecies. A high impact of horizontal gene transfer on genomic diversity was observed, which is possibly due to a large number of prophages and rapidly spreading plasmids. The pan-genome characteristics of the subspecies longum corresponded to the open pan-genome model. While the major part of the strain-specific genetic loci represented transposons and phage-derived regions, a large number of cell envelope synthesis genes were also observed within this category, representing high variability of cell surface molecules. We observed cases of isolation of highly genetically similar strains of B. longum from the same patients after long periods of time; however, we did not succeed in isolating genetically identical bacteria, a fact reflecting the high plasticity of microbiota in children. PMID:26275230

  11. Combining genomic sequencing methods to explore viral diversity and reveal potential virus-host interactions.

    PubMed

    Chow, Cheryl-Emiliane T; Winget, Danielle M; White, Richard A; Hallam, Steven J; Suttle, Curtis A

    2015-01-01

    Viral diversity and virus-host interactions in oxygen-starved regions of the ocean, also known as oxygen minimum zones (OMZs), remain relatively unexplored. Microbial community metabolism in OMZs alters nutrient and energy flow through marine food webs, resulting in biological nitrogen loss and greenhouse gas production. Thus, viruses infecting OMZ microbes have the potential to modulate community metabolism with resulting feedback on ecosystem function. Here, we describe viral communities inhabiting oxic surface (10 m) and oxygen-starved basin (200 m) waters of Saanich Inlet, a seasonally anoxic fjord on the coast of Vancouver Island, British Columbia using viral metagenomics and complete viral fosmid sequencing on samples collected between April 2007 and April 2010. Of 6459 open reading frames (ORFs) predicted across all 34 viral fosmids, 77.6% (n = 5010) had no homology to reference viral genomes. These fosmids recruited a higher proportion of viral metagenomic sequences from Saanich Inlet than from nearby northeastern subarctic Pacific Ocean (Line P) waters, indicating differences in the viral communities between coastal and open ocean locations. While functional annotations of fosmid ORFs were limited, recruitment to NCBI's non-redundant "nr" database and publicly available single-cell genomes identified putative viruses infecting marine thaumarchaeal and SUP05 proteobacteria to provide potential host linkages with relevance to coupled biogeochemical cycling processes in OMZ waters. Taken together, these results highlight the power of coupled analyses of multiple sequence data types, such as viral metagenomic and fosmid sequence data with prokaryotic single cell genomes, to chart viral diversity, elucidate genomic and ecological contexts for previously unclassifiable viral sequences, and identify novel host interactions in natural and engineered ecosystems. PMID:25914678

  12. Unprecedented genomic diversity of RNA viruses in arthropods reveals the ancestry of negative-sense RNA viruses

    PubMed Central

    Li, Ci-Xiu; Shi, Mang; Tian, Jun-Hua; Lin, Xian-Dan; Kang, Yan-Jun; Chen, Liang-Jun; Qin, Xin-Cheng; Xu, Jianguo; Holmes, Edward C; Zhang, Yong-Zhen

    2015-01-01

    Although arthropods are important viral vectors, the biodiversity of arthropod viruses, as well as the role that arthropods have played in viral origins and evolution, is unclear. Through RNA sequencing of 70 arthropod species we discovered 112 novel viruses that appear to be ancestral to much of the documented genetic diversity of negative-sense RNA viruses, a number of which are also present as endogenous genomic copies. With this greatly enriched diversity we revealed that arthropods contain viruses that fall basal to major virus groups, including the vertebrate-specific arenaviruses, filoviruses, hantaviruses, influenza viruses, lyssaviruses, and paramyxoviruses. We similarly documented a remarkable diversity of genome structures in arthropod viruses, including a putative circular form, that sheds new light on the evolution of genome organization. Hence, arthropods are a major reservoir of viral genetic diversity and have likely been central to viral evolution. DOI: http://dx.doi.org/10.7554/eLife.05378.001 PMID:25633976

  13. Genomic Diversity of Hepatitis B Virus Infection Associated With Fulminant Hepatitis B Development

    PubMed Central

    Mina, Thomas; Amini-Bavil-Olyaee, Samad; Tacke, Frank; Maes, Piet; Van Ranst, Marc; Pourkarim, Mahmoud Reza

    2015-01-01

    Context: Five decades after the discovery of the Hepatitis B Virus (HBV) vaccine, HBV is still a major public health problem. Due to the high genetic diversity of HBV and selective pressure of the host immune system, intra-host evolution of this virus in different clinical manifestations is a hot topic of research. HBV infection causes a range of clinical manifestations from acute to chronic infection, cirrhosis and hepatocellular carcinoma. Among all forms of HBV infection manifestations, fulminant hepatitis B infection possesses the highest fatality rate. Almost 1% of the acutely infected patients develop fulminant hepatitis B, in which the mortality rate is around 70%. Evidence Acquisition: All published papers deposited in Genbank, on the topic of fulminant hepatitis were reviewed and their virological aspects were investigated. In this review, we highlight the genomic diversity of HBV reported from patients with fulminant HBV infection. Results: The most commonly detected diversities affect regulatory motifs of HBV in the core and S region, indicating that these alterations may convert the virus to an aggressive strain. Moreover, mutations at T-cell and B-cell epitopes located in pre-S1 and pre-S2 proteins may lead to an immune evasion of the virus, likely favoring a more severe clinical course of infection. Furthermore, point and frame shift mutations in the core region increase the viral replication of HBV, help the virus evade the immune system, and guarantee its persistence. Conclusions: Fulminant hepatitis B is associated with distinct mutational patterns of HBV, underlining that genomic diversity of the virus is an important factor determining its pathogenicity. PMID:26288637

  14. Genome-Wide Diversity and Phylogeography of Mycobacterium avium subsp. paratuberculosis in Canadian Dairy Cattle.

    PubMed

    Ahlstrom, Christina; Barkema, Herman W; Stevenson, Karen; Zadoks, Ruth N; Biek, Roman; Kao, Rowland; Trewby, Hannah; Haupstein, Deb; Kelton, David F; Fecteau, Gilles; Labrecque, Olivia; Keefe, Greg P; McKenna, Shawn L B; Tahlan, Kapil; De Buck, Jeroen

    2016-01-01

    Mycobacterium avium subsp. paratuberculosis (MAP) is the causative bacterium of Johne's disease (JD) in ruminants. The control of JD in the dairy industry is challenging, but can be improved with a better understanding of the diversity and distribution of MAP subtypes. Previously established molecular typing techniques used to differentiate MAP have not been sufficiently discriminatory and/or reliable to accurately assess the population structure. In this study, the genetic diversity of 182 MAP isolates representing all Canadian provinces was compared to the known global diversity, using single nucleotide polymorphisms identified through whole genome sequencing. MAP isolates from Canada represented a subset of the known global diversity, as there were global isolates intermingled with Canadian isolates, as well as multiple global subtypes that were not found in Canada. One Type III and six "Bison type" isolates were found in Canada as well as one Type II subtype that represented 86% of all Canadian isolates. Rarefaction estimated larger subtype richness in Québec than in other Canadian provinces using a strict definition of MAP subtypes and lower subtype richness in the Atlantic region using a relaxed definition. Significant phylogeographic clustering was observed at the inter-provincial but not at the intra-provincial level, although most major clades were found in all provinces. The large number of shared subtypes among provinces suggests that cattle movement is a major driver of MAP transmission at the herd level, which is further supported by the lack of spatial clustering on an intra-provincial scale. PMID:26871723

  15. The Impact of Spatial Structure on Viral Genomic Diversity Generated during Adaptation to Thermal Stress

    PubMed Central

    Ally, Dilara; Wiss, Valorie R.; Deckert, Gail E.; Green, Danielle; Roychoudhury, Pavitra; Wichman, Holly A.; Brown, Celeste J.; Krone, Stephen M.

    2014-01-01

    Background: Most clinical and natural microbial communities live and evolve in spatially structured environments. When changes in environmental conditions trigger evolutionary responses, spatial structure can impact the types of adaptive response and the extent to which they spread. In particular, localized competition in a spatial landscape can lead to the emergence of a larger number of different adaptive trajectories than would be found in well-mixed populations. Our goal was to determine how two levels of spatial structure affect genomic diversity in a population and how this diversity is manifested spatially. Methodology/Principal Findings: We serially transferred bacteriophage populations growing at high temperatures (40°C) on agar plates for 550 generations at two levels of spatial structure. The level of spatial structure was determined by whether the physical locations of the phage subsamples were preserved or disrupted at each passage to fresh bacterial host populations. When spatial structure of the phage populations was preserved, there was significantly greater diversity on a global scale with restricted and patchy distribution. When spatial structure was disrupted with passaging to fresh hosts, beneficial mutants were spread across the entire plate. This resulted in reduced diversity, possibly due to clonal interference as the most fit mutants entered into competition on a global scale. Almost all substitutions present at the end of the adaptation in the populations with disrupted spatial structure were also present in the populations with structure preserved. Conclusions/Significance: Our results are consistent with the patchy nature of the spread of adaptive mutants in a spatial landscape. Spatial structure enhances diversity and slows fixation of beneficial mutants. This added diversity could be beneficial in fluctuating environments. We also connect observed substitutions and their effects on fitness to aspects of phage biology, and we provide

  16. Assessing genetic diversity among Brettanomyces yeasts by DNA fingerprinting and whole-genome sequencing.

    PubMed

    Crauwels, Sam; Zhu, Bo; Steensels, Jan; Busschaert, Pieter; De Samblanx, Gorik; Marchal, Kathleen; Willems, Kris A; Verstrepen, Kevin J; Lievens, Bart

    2014-07-01

    Brettanomyces yeasts, with the species Brettanomyces (Dekkera) bruxellensis being the most important one, are generally reported to be spoilage yeasts in the beer and wine industry due to the production of phenolic off flavors. However, B. bruxellensis is also known to be a beneficial contributor in certain fermentation processes, such as the production of certain specialty beers. Nevertheless, despite its economic importance, Brettanomyces yeasts remain poorly understood at the genetic and genomic levels. In this study, the genetic relationship between more than 50 Brettanomyces strains from all presently known species and from several sources was studied using a combination of DNA fingerprinting techniques. This revealed an intriguing correlation between the B. bruxellensis fingerprints and the respective isolation source. To further explore this relationship, we sequenced a (beneficial) beer isolate of B. bruxellensis (VIB X9085; ST05.12/22) and compared its genome sequence with the genome sequences of two wine spoilage strains (AWRI 1499 and CBS 2499). ST05.12/22 was found to be substantially different from both wine strains, especially at the level of single nucleotide polymorphisms (SNPs). In addition, there were major differences in the genome structures between the strains investigated, including the presence of large duplications and deletions. Gene content analysis revealed the presence of 20 genes which were present in both wine strains but absent in the beer strain, including many genes involved in carbon and nitrogen metabolism, and vice versa, no genes that were missing in both AWRI 1499 and CBS 2499 were found in ST05.12/22. Together, this study provides tools to discriminate Brettanomyces strains and provides a first glimpse at the genetic diversity and genome plasticity of B. bruxellensis. PMID:24814796

  17. Comparison of environmental and isolate Sulfobacillus genomes reveals diverse carbon, sulfur, nitrogen, and hydrogen metabolisms

    DOE PAGESBeta

    Justice, Nicholas B.; Norman, Anders; Brown, Christopher T.; Singh, Andrea; Thomas, Brian C.; Banfield, Jillian F.

    2014-12-15

    Bacteria of the genus Sulfobacillus are found worldwide as members of microbial communities that accelerate sulfide mineral dissolution in acid mine drainage environments (AMD), acid-rock drainage environments (ARD), as well as in industrial bioleaching operations. Despite their frequent identification in these environments, their role in biogeochemical cycling is poorly understood. Here we report draft genomes of five species of the Sulfobacillus genus (AMDSBA1-5) reconstructed by cultivation-independent sequencing of biofilms sampled from the Richmond Mine (Iron Mountain, CA). Three of these species (AMDSBA2, AMDSBA3, and AMDSBA4) have no cultured representatives while AMDSBA1 is a strain of S. benefaciens, and AMDSBA5 a strain of S. thermosulfidooxidans. We analyzed the diversity of energy conservation and central carbon metabolisms for these genomes and previously published Sulfobacillus genomes. Pathways of sulfur oxidation vary considerably across the genus, including the number and type of subunits of putative heterodisulfide reductase complexes likely involved in sulfur oxidation. The number and type of nickel-iron hydrogenase proteins varied across the genus, as does the presence of different central carbon pathways. Only the AMDSBA3 genome encodes a dissimilatory nitrate reductase and only the AMDSBA5 and S. thermosulfidooxidans genomes encode assimilatory nitrate reductases. Lastly, within the genus, AMDSBA4 is unusual in that its electron transport chain includes a cytochrome bc type complex, a unique cytochrome c oxidase, and two distinct succinate dehydrogenase complexes. Overall, the results significantly expand our understanding of carbon, sulfur, nitrogen, and hydrogen metabolism within the Sulfobacillus genus.

  18. Global genomic diversity of Oryza sativa varieties revealed by comparative physical mapping.

    PubMed

    Wang, Xiaoming; Kudrna, David A; Pan, Yonglong; Wang, Hao; Liu, Lin; Lin, Haiyan; Zhang, Jianwei; Song, Xiang; Goicoechea, Jose Luis; Wing, Rod A; Zhang, Qifa; Luo, Meizhong

    2014-04-01

    Bacterial artificial chromosome (BAC) physical maps embedding a large number of BAC end sequences (BESs) were generated for Oryza sativa ssp. indica varieties Minghui 63 (MH63) and Zhenshan 97 (ZS97) and were compared with the genome sequences of O. sativa ssp. japonica cv. Nipponbare and O. sativa ssp. indica cv. 93-11. The comparisons exhibited substantial diversities in terms of large structural variations and small substitutions and indels. Genome-wide BAC-sized and contig-sized structural variations were detected, and the shared variations were analyzed. In the expansion regions of the Nipponbare reference sequence, in comparison to the MH63 and ZS97 physical maps, as well as to the previously constructed 93-11 physical map, the amounts and types of the repeat contents, and the outputs of gene ontology analysis, were significantly different from those of the whole genome. Using the physical maps of four wild Oryza species from OMAP (http://www.omap.org) as a control, we detected many conserved and divergent regions related to the evolution process of O. sativa. Between the BESs of MH63 and ZS97 and the two reference sequences, a total of 1532 polymorphic simple sequence repeats (SSRs), 71,383 SNPs, 1767 multiple nucleotide polymorphisms, 6340 insertions, and 9137 deletions were identified. This study provides independent whole-genome resources for intra- and intersubspecies comparisons and functional genomics studies in O. sativa. Both the comparative physical maps and the GBrowse, which integrated the QTL and molecular markers from GRAMENE (http://www.gramene.org) with our physical maps and analysis results, are open to the public through our Web site (http://gresource.hzau.edu.cn/resource/resource.html). PMID:24424778

  19. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma

    PubMed Central

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-01-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) is largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  20. Distribution and diversity of cytotypes in Dianthus broteri as evidenced by genome size variations

    PubMed Central

    Balao, Francisco; Casimiro-Soriguer, Ramón; Talavera, María; Herrera, Javier; Talavera, Salvador

    2009-01-01

    Background and Aims: Studying the spatial distribution of cytotypes and genome size in plants can provide valuable information about the evolution of polyploid complexes. Here, the spatial distribution of cytological races and the amount of DNA in Dianthus broteri, an Iberian carnation with several ploidy levels, are investigated. Methods: Sample chromosome counts and flow cytometry (using propidium iodide) were used to determine overall genome size (2C value) and ploidy level in 244 individuals of 25 populations. Both fresh and dried samples were investigated. Differences in 2C and 1Cx values among ploidy levels within biogeographical provinces were tested using ANOVA. Geographical correlations of genome size were also explored. Key Results: Extensive variation in chromosome numbers (2n = 2x = 30, 2n = 4x = 60, 2n = 6x = 90 and 2n = 12x = 180) was detected, and the dodecaploid cytotype is reported for the first time in this genus. As regards cytotype distribution, six populations were diploid, 11 were tetraploid, three were hexaploid and five were dodecaploid. Except for one diploid population containing some triploid plants (2n = 45), the remaining populations showed a single cytotype. Diploids appeared in two disjunct areas (south-east and south-west), and so did tetraploids (although with a considerably wider geographic range). Dehydrated leaf samples provided reliable measurements of DNA content. Genome size varied significantly among some cytotypes, and also extensively within diploid (up to 1·17-fold) and tetraploid (1·22-fold) populations. Nevertheless, variations were not straightforwardly congruent with ecology and geographical distribution. Conclusions: Dianthus broteri shows the highest diversity of cytotypes known to date in the genus Dianthus. Moreover, some cytotypes present remarkable internal genome size variation. The evolution of the complex is discussed in terms of autopolyploidy, with primary and secondary contact zones. PMID:19633312
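
    The 2C and 1Cx quantities compared in the record above are related by simple arithmetic: the monoploid genome size 1Cx is the 2C DNA content divided by the ploidy level x, and an "n-fold" spread is the largest measurement divided by the smallest. The sketch below illustrates only that arithmetic. The function names and the 2C values are hypothetical and are not taken from the study.

```python
def monoploid_genome_size(two_c_pg: float, ploidy: int) -> float:
    """Return 1Cx (pg): the 2C DNA content divided by the ploidy level x."""
    if ploidy <= 0:
        raise ValueError("ploidy must be a positive integer")
    return two_c_pg / ploidy


def fold_variation(values: list[float]) -> float:
    """Return max/min, the 'n-fold' spread of a set of genome size values."""
    if not values or min(values) <= 0:
        raise ValueError("values must be non-empty and strictly positive")
    return max(values) / min(values)


# Hypothetical 2C values in pg, keyed by ploidy level (illustration only,
# NOT measurements from the study).
example_2c = {2: 1.20, 4: 2.30, 6: 3.40, 12: 6.50}

for ploidy, two_c in example_2c.items():
    one_cx = monoploid_genome_size(two_c, ploidy)
    print(f"{ploidy}x: 2C = {two_c:.2f} pg, 1Cx = {one_cx:.3f} pg")

# Spread among hypothetical diploid individuals.
print(f"fold variation: {fold_variation([1.10, 1.20, 1.25]):.2f}")
```

    Comparing 1Cx rather than 2C across cytotypes removes the trivial effect of chromosome-set number, so any remaining difference reflects change in the size of the basic genome.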

  1. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    PubMed

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) is largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  2. Assessing Genetic Diversity among Brettanomyces Yeasts by DNA Fingerprinting and Whole-Genome Sequencing

    PubMed Central

    Crauwels, Sam; Zhu, Bo; Steensels, Jan; Busschaert, Pieter; De Samblanx, Gorik; Marchal, Kathleen; Willems, Kris A.

    2014-01-01

    Brettanomyces yeasts, with the species Brettanomyces (Dekkera) bruxellensis being the most important one, are generally reported to be spoilage yeasts in the beer and wine industry due to the production of phenolic off flavors. However, B. bruxellensis is also known to be a beneficial contributor in certain fermentation processes, such as the production of certain specialty beers. Nevertheless, despite its economic importance, Brettanomyces yeasts remain poorly understood at the genetic and genomic levels. In this study, the genetic relationship between more than 50 Brettanomyces strains from all presently known species and from several sources was studied using a combination of DNA fingerprinting techniques. This revealed an intriguing correlation between the B. bruxellensis fingerprints and the respective isolation source. To further explore this relationship, we sequenced a (beneficial) beer isolate of B. bruxellensis (VIB X9085; ST05.12/22) and compared its genome sequence with the genome sequences of two wine spoilage strains (AWRI 1499 and CBS 2499). ST05.12/22 was found to be substantially different from both wine strains, especially at the level of single nucleotide polymorphisms (SNPs). In addition, there were major differences in the genome structures between the strains investigated, including the presence of large duplications and deletions. Gene content analysis revealed the presence of 20 genes which were present in both wine strains but absent in the beer strain, including many genes involved in carbon and nitrogen metabolism, and vice versa, no genes that were missing in both AWRI 1499 and CBS 2499 were found in ST05.12/22. Together, this study provides tools to discriminate Brettanomyces strains and provides a first glimpse at the genetic diversity and genome plasticity of B. bruxellensis. PMID:24814796

  3. Unravelling genomic diversity of Zygosaccharomyces rouxii complex with a link to its life cycle.

    PubMed

    Solieri, Lisa; Dakal, Tikam Chand; Croce, Maria Antonietta; Giudici, Paolo

    2013-05-01

    Zygosaccharomyces rouxii and the related species Zygosaccharomyces sapae (hereafter referred to as Z. rouxii complex) are protoploid hemiascomycete yeasts relevant in the elaboration and spoilage of foodstuff. Divergence of Z. rouxii complex before whole genome duplication, leading to the genus Saccharomyces, makes these yeasts very attractive for genome evolution study. Relatively little is known, however, about the diversity in this branch at the genetic and physiological levels. In this work, we investigated Z. rouxii complex, encompassing strains that in other works have been studied separately and comparing them in a comprehensive way. We showed that the majority of strains are unusually heterogeneous in their ribosomal DNA, a signal of relaxation of concerted evolution. Further analysis showed that they have hypervariable karyotypes, different levels of ploidy, and that housekeeping markers vary both in copy number and sequence. Overall, the results provide compelling evidence that the strains considered in this study are a complex of haploid, aneuploid and diploid mosaic lineages. The reproductive mode and life cycle of Zygosaccharomyces could lead to this unsuspected diversity. PMID:23279556

  4. Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium vivax from Colombia.

    PubMed

    Winter, David J; Pacheco, M Andreína; Vallejo, Andres F; Schwartz, Rachel S; Arevalo-Herrera, Myriam; Herrera, Socrates; Cartwright, Reed A; Escalante, Ananias A

    2015-12-01

    Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Córdoba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first-line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information we developed 18 polymorphic and easy-to-score microsatellite loci that can be used in epidemiological investigations in South America. PMID:26709695

  5. Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium vivax from Colombia

    PubMed Central

    Winter, David J.; Pacheco, M. Andreína; Vallejo, Andres F.; Schwartz, Rachel S.; Arevalo-Herrera, Myriam; Herrera, Socrates

    2015-01-01

    Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Córdoba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first-line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information we developed 18 polymorphic and easy-to-score microsatellite loci that can be used in epidemiological investigations in South America. PMID:26709695

  6. Analysis of genotype diversity and evolution of Dengue virus serotype 2 using complete genomes

    PubMed Central

    Waman, Vaishali P.; Kolekar, Pandurang; Ramtirthkar, Mukund R.; Kale, Mohan M.

    2016-01-01

    Background: Dengue is one of the most common arboviral diseases prevalent worldwide and is caused by Dengue viruses (genus Flavivirus, family Flaviviridae). There are four serotypes of Dengue virus (DENV-1 to DENV-4), each of which is further subdivided into distinct genotypes. DENV-2 is frequently associated with severe dengue infections and epidemics. DENV-2 consists of six genotypes: Asian/American, Asian I, Asian II, Cosmopolitan, American and sylvatic. A comparative genomic study was carried out to infer the population structure of DENV-2 and to analyze the role of evolutionary and spatiotemporal factors in the emergence of diversifying lineages. Methods: Complete genome sequences of 990 strains of DENV-2 were analyzed using Bayesian-based population genetics and phylogenetic approaches to infer genetically distinct lineages. The role of spatiotemporal factors, genetic recombination and selection pressure in the evolution of DENV-2 was examined using sequence-based bioinformatics approaches. Results: DENV-2 genetic structure is complex and consists of fifteen subpopulations/lineages. The Asian/American genotype is observed to be diversified into seven lineages. The Asian I, Cosmopolitan and sylvatic genotypes were found to be subdivided into two lineages each. The populations of American and Asian II genotypes were observed to be homogeneous. Significant evidence of episodic positive selection was observed in all the genes, except NS4A. Positive selection operating on a few codons in the envelope gene confers antigenic and lineage diversity in the American strains of the Asian/American genotype. Selection on codons of non-structural genes was observed to impact diversification of lineages in the Asian I, Cosmopolitan and sylvatic genotypes. Evidence of intra/inter-genotype recombination was obtained and the uncertainty in classification of recombinant strains was resolved using the population genetics approach. Discussion: Complete genome-based analysis revealed that the

  7. Diversity and genomic insights into the uncultured Chloroflexi from the human microbiota

    PubMed Central

    Campbell, Alisha G.; Schwientek, Patrick; Vishnivetskaya, Tatiana; Woyke, Tanja; Levy, Shawn; Beall, Clifford J.; Griffen, Ann; Leys, Eugene; Podar, Mircea

    2014-01-01

    SUMMARY Many microbial phyla that are widely distributed in open environments have few or no representatives within animal-associated microbiota. Among them, the Chloroflexi comprises taxonomically and physiologically diverse lineages adapted to a wide range of aquatic and terrestrial habitats. A distinct group of uncultured chloroflexi related to free-living anaerobic Anaerolineae inhabits the mammalian gastrointestinal tract and includes low-abundance human oral bacteria that appear to proliferate in periodontitis. Using a single-cell genomics approach we obtained the first draft genomic reconstruction for these organisms and compared their inferred metabolic potential with free-living chloroflexi. Genomic data suggest that oral chloroflexi are anaerobic heterotrophs, encoding abundant carbohydrate transport and metabolism functionalities, similar to those seen in environmental Anaerolineae isolates. The presence of genes for a unique phosphotransferase system and N-acetylglucosamine metabolism suggests an important ecological niche for oral chloroflexi in scavenging material from lysed bacterial cells and the human tissue. The inferred ability to produce sialic acid for cell membrane decoration may enable them to evade the host defense system and colonize the subgingival space. As with other low-abundance but persistent members of the microbiota, discerning community and host factors that influence the proliferation of oral chloroflexi may help understand the emergence of oral pathogens and the microbiota dynamics in health and disease states. PMID:24738594

  8. Genome-wide SNP analysis explains coral diversity and recovery in the Ryukyu Archipelago

    PubMed Central

    Shinzato, Chuya; Mungpakdee, Sutada; Arakaki, Nana; Satoh, Noriyuki

    2015-01-01

    Following a global coral bleaching event in 1998, Acropora corals surrounding most of Okinawa island (OI) were devastated, although they are now gradually recovering. In contrast, the Kerama Islands (KIs) only 30 km west of OI, have continuously hosted a great variety of healthy corals. Taking advantage of the decoded Acropora digitifera genome and using genome-wide SNP analyses, we clarified Acropora population structure in the southern Ryukyu Archipelago (sRA). Despite small genetic distances, we identified distinct clusters corresponding to specific island groups, suggesting infrequent long-distance dispersal within the sRA. Although the KIs were believed to supply coral larvae to OI, admixture analyses showed that such dispersal is much more limited than previously realized, indicating independent recovery of OI coral populations and the necessity of local conservation efforts for each region. We detected strong historical migration from the Yaeyama Islands (YIs) to OI, and suggest that the YIs are the original source of OI corals. In addition, migration edges to the KIs suggest that they are a historical sink population in the sRA, resulting in high diversity. This population genomics study provides the highest resolution data to date regarding coral population structure and history. PMID:26656261

  9. Genome-wide SNP analysis explains coral diversity and recovery in the Ryukyu Archipelago.

    PubMed

    Shinzato, Chuya; Mungpakdee, Sutada; Arakaki, Nana; Satoh, Noriyuki

    2015-01-01

    Following a global coral bleaching event in 1998, Acropora corals surrounding most of Okinawa island (OI) were devastated, although they are now gradually recovering. In contrast, the Kerama Islands (KIs) only 30 km west of OI, have continuously hosted a great variety of healthy corals. Taking advantage of the decoded Acropora digitifera genome and using genome-wide SNP analyses, we clarified Acropora population structure in the southern Ryukyu Archipelago (sRA). Despite small genetic distances, we identified distinct clusters corresponding to specific island groups, suggesting infrequent long-distance dispersal within the sRA. Although the KIs were believed to supply coral larvae to OI, admixture analyses showed that such dispersal is much more limited than previously realized, indicating independent recovery of OI coral populations and the necessity of local conservation efforts for each region. We detected strong historical migration from the Yaeyama Islands (YIs) to OI, and suggest that the YIs are the original source of OI corals. In addition, migration edges to the KIs suggest that they are a historical sink population in the sRA, resulting in high diversity. This population genomics study provides the highest resolution data to date regarding coral population structure and history. PMID:26656261

  10. Gene Arrangement Convergence, Diverse Intron Content, and Genetic Code Modifications in Mitochondrial Genomes of Sphaeropleales (Chlorophyta)

    PubMed Central

    Fučíková, Karolina; Lewis, Paul O.; González-Halphen, Diego; Lewis, Louise A.

    2014-01-01

    The majority of our knowledge about mitochondrial genomes of Viridiplantae comes from land plants, but much less is known about their green algal relatives. In the green algal order Sphaeropleales (Chlorophyta), only one representative mitochondrial genome is currently available, that of Acutodesmus obliquus. Our study adds nine completely sequenced and three partially sequenced mitochondrial genomes spanning the phylogenetic diversity of Sphaeropleales. We show not only a size range of 25–53 kb and variation in intron content (0–11) and gene order but also conservation of 13 core respiratory genes and fragmented ribosomal RNA genes. We also report an unusual case of gene arrangement convergence in Neochloris aquatica, where the two rns fragments were secondarily placed in close proximity. Finally, we report the unprecedented usage of UCG as a stop codon in Pseudomuriella schumacherensis. In addition, phylogenetic analyses of the mitochondrial protein-coding genes yield a fully resolved, well-supported phylogeny, showing promise for addressing systematic challenges in green algae. PMID:25106621

  11. Assessing diversity of DNA structure-related sequence features in prokaryotic genomes.

    PubMed

    Huang, Yongjie; Mrázek, Jan

    2014-06-01

    Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877

  12. Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

    PubMed Central

    Huang, Yongjie; Mrázek, Jan

    2014-01-01

    Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877

  13. Phylum-wide comparative genomics unravel the diversity of secondary metabolism in Cyanobacteria

    SciTech Connect

    Calteau, Alexandra; Fewer, David P.; Latifi, Amel; Coursin, Thérèse; Laurent, Thierry; Jokela, Jouni; Kerfeld, Cheryl A.; Sivonen, Kaarina; Piel, Jörn; Gugger, Muriel

    2014-11-18

    Cyanobacteria are an ancient lineage of photosynthetic bacteria from which hundreds of natural products have been described, including many notorious toxins but also potent natural products of interest to the pharmaceutical and biotechnological industries. Many of these compounds are the products of non-ribosomal peptide synthetase (NRPS) or polyketide synthase (PKS) pathways. However, current understanding of the diversification of these pathways is largely based on the chemical structure of the bioactive compounds, while the evolutionary forces driving their remarkable chemical diversity are poorly understood. We carried out a phylum-wide investigation of genetic diversification of the cyanobacterial NRPS and PKS pathways for the production of bioactive compounds. 452 NRPS and PKS gene clusters were identified from 89 cyanobacterial genomes, revealing a clear burst in late-branching lineages. Our genomic analysis further grouped the clusters into 286 highly diversified cluster families (CF) of pathways. Some CFs appeared vertically inherited, while others presented a more complex evolutionary history. Only a few horizontal gene transfers were evidenced amongst strongly conserved CFs in the phylum, while several others have undergone drastic gene shuffling events, which could result in the observed diversification of the pathways. In addition to toxin production, several NRPS and PKS gene clusters are devoted to important cellular processes of these bacteria such as nitrogen fixation and iron uptake. The majority of the biosynthetic clusters identified here have unknown end products, highlighting the power of genome mining for the discovery of new natural products.

  14. Phylum-wide comparative genomics unravel the diversity of secondary metabolism in Cyanobacteria

    DOE PAGESBeta

    Calteau, Alexandra; Fewer, David P.; Latifi, Amel; Coursin, Thérèse; Laurent, Thierry; Jokela, Jouni; Kerfeld, Cheryl A.; Sivonen, Kaarina; Piel, Jörn; Gugger, Muriel

    2014-11-18

    Cyanobacteria are an ancient lineage of photosynthetic bacteria from which hundreds of natural products have been described, including many notorious toxins but also potent natural products of interest to the pharmaceutical and biotechnological industries. Many of these compounds are the products of non-ribosomal peptide synthetase (NRPS) or polyketide synthase (PKS) pathways. However, current understanding of the diversification of these pathways is largely based on the chemical structure of the bioactive compounds, while the evolutionary forces driving their remarkable chemical diversity are poorly understood. We carried out a phylum-wide investigation of genetic diversification of the cyanobacterial NRPS and PKS pathways for the production of bioactive compounds. 452 NRPS and PKS gene clusters were identified from 89 cyanobacterial genomes, revealing a clear burst in late-branching lineages. Our genomic analysis further grouped the clusters into 286 highly diversified cluster families (CF) of pathways. Some CFs appeared vertically inherited, while others presented a more complex evolutionary history. Only a few horizontal gene transfers were evidenced amongst strongly conserved CFs in the phylum, while several others have undergone drastic gene shuffling events, which could result in the observed diversification of the pathways. In addition to toxin production, several NRPS and PKS gene clusters are devoted to important cellular processes of these bacteria such as nitrogen fixation and iron uptake. The majority of the biosynthetic clusters identified here have unknown end products, highlighting the power of genome mining for the discovery of new natural products.

  15. Reduced representation genome sequencing suggests low diversity on the sex chromosomes of tonkean macaque monkeys.

    PubMed

    Evans, Ben J; Zeng, Kai; Esselstyn, Jacob A; Charlesworth, Brian; Melnick, Don J

    2014-09-01

    In species with separate sexes, social systems can differ in the relative variances of male versus female reproductive success. Papionin monkeys (macaques, mangabeys, mandrills, drills, baboons, and geladas) exhibit hallmarks of a high variance in male reproductive success, including a female-biased adult sex ratio and prominent sexual dimorphism. To explore the potential genomic consequences of such sex differences, we used a reduced representation genome sequencing approach to quantifying polymorphism at sites on autosomes and sex chromosomes of the tonkean macaque (Macaca tonkeana), a species endemic to the Indonesian island of Sulawesi. The ratio of nucleotide diversity of the X chromosome to that of the autosomes was less than the value (0.75) expected with a 1:1 sex ratio and no sex differences in the variance in reproductive success. However, the significance of this difference was dependent on which outgroup was used to standardize diversity levels. Using a new model that includes the effects of varying population size, sex differences in mutation rate between the autosomes and X chromosome, and GC-biased gene conversion (gBGC) or selection on GC content, we found that the maximum-likelihood estimate of the ratio of effective population size of the X chromosome to that of the autosomes was 0.68, which did not differ significantly from 0.75. We also found evidence for 1) a higher level of purifying selection on genic than nongenic regions, 2) gBGC or natural selection favoring increased GC content, 3) a dynamic demography characterized by population growth and contraction, 4) a higher mutation rate in males than females, and 5) a very low polymorphism level on the Y chromosome. These findings shed light on the population genomic consequences of sex differences in the variance in reproductive success, which appear to be modest in the tonkean macaque; they also suggest the occurrence of hitchhiking on the Y chromosome. PMID:24987106
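
    The expected value of 0.75 quoted in the record above follows from the classical effective-size formulas for autosomes and the X chromosome, N_eA = 4·Nm·Nf/(Nm + Nf) and N_eX = 9·Nm·Nf/(4·Nm + 2·Nf), where Nm and Nf are the numbers of breeding males and females. The sketch below evaluates only these textbook formulas, which assume Poisson-distributed offspring number within each sex and a constant population size. It is not the model fitted in the study, which additionally accounts for varying population size, mutation rate differences and GC-biased gene conversion. The function names are hypothetical.

```python
def ne_autosome(n_males: float, n_females: float) -> float:
    """Classical autosomal effective size: 4*Nm*Nf / (Nm + Nf)."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def ne_x_chromosome(n_males: float, n_females: float) -> float:
    """Classical X-linked effective size: 9*Nm*Nf / (4*Nm + 2*Nf)."""
    return 9.0 * n_males * n_females / (4.0 * n_males + 2.0 * n_females)


def x_to_autosome_ratio(n_males: float, n_females: float) -> float:
    """Ratio N_eX / N_eA, which simplifies to 9(Nm + Nf) / (8(2Nm + Nf))."""
    return ne_x_chromosome(n_males, n_females) / ne_autosome(n_males, n_females)


# Equal numbers of breeding males and females give the 0.75 baseline.
print(x_to_autosome_ratio(100, 100))

# Illustrative (hypothetical) skews in the number of breeders of each sex.
for n_m, n_f in [(50, 150), (10, 190), (150, 50)]:
    print(n_m, n_f, round(x_to_autosome_ratio(n_m, n_f), 3))
```

    Under these assumptions the ratio ranges from 9/16 (breeding males greatly outnumber females) to 9/8 (breeding females greatly outnumber males), so an excess of breeding females alone pushes the expectation above 0.75, not below it.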

  16. Genomic Diversity in Pig (Sus scrofa) and its Comparison with Human and other Livestock

    PubMed Central

    Zhang, Chunyan; Plastow, Graham

    2011-01-01

    We have reviewed the current pig (Sus scrofa) genomic diversity within and between sites and compared them with human and other livestock. The current Porcine 60K single nucleotide polymorphism (SNP) panel has an average SNP distance in a range of 30 - 40 kb. Most of the genetic variation was distributed within populations, and only a small proportion of it existed between populations. The average heterozygosity was lower in pig than in human and other livestock. Genetic inbreeding coefficient (FIS), population differentiation (FST), and Nei’s genetic distance between populations were much larger in pig than in human and other livestock. Higher average genetic distance existed between European and Asian populations than between European or between Asian populations. Asian breeds harboured much larger variability and higher average heterozygosity than European breeds. The samples of wild boar that have been analyzed displayed more extensive genetic variation than domestic breeds. The average linkage disequilibrium (LD) in improved pig breeds extended to 1 - 3 cM, much larger than that in human (~ 30 kb) and cattle (~ 100 kb), but smaller than that in sheep (~ 10 cM). European breeds showed greater LD that decayed more slowly than Asian breeds. We briefly discuss some processes for maintaining genomic diversity in pig, including migration, introgression, selection, and drift. We conclude that, due to the long time of domestication, the pig possesses lower heterozygosity, higher FIS, and larger LD compared with human and cattle. This implies that a smaller effective population size and less informative markers are needed in pig for genome wide association studies. PMID:21966252
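
    The review compares heterozygosity and FST across species without showing how such statistics are obtained. As a minimal sketch, assuming a single biallelic SNP, equally weighted populations, and known allele frequencies (none of these choices come from the review), expected heterozygosity and Wright's FST = (HT - HS) / HT might be computed as follows. The function names and toy frequencies are hypothetical.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic locus."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


def fst(allele_freqs: list[float]) -> float:
    """Wright's FST for one biallelic SNP, populations weighted equally."""
    if not allele_freqs:
        raise ValueError("need at least one population")
    # HS: mean within-population expected heterozygosity.
    hs = sum(expected_heterozygosity(p) for p in allele_freqs) / len(allele_freqs)
    # HT: expected heterozygosity of the pooled population.
    p_bar = sum(allele_freqs) / len(allele_freqs)
    ht = expected_heterozygosity(p_bar)
    if ht == 0.0:
        return 0.0  # locus is monomorphic overall; differentiation undefined
    return (ht - hs) / ht


if __name__ == "__main__":
    print(fst([0.2, 0.8]))  # two populations with divergent frequencies
```

    Identical frequencies give FST = 0 and fixed differences give FST = 1; genome-wide estimates in practice use many loci and sample-size corrections that this sketch omits.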

  17. Attenuation of virulence in an apicomplexan hemoparasite results in reduced genome diversity at the population level

    PubMed Central

    2011-01-01

    Background Virulence acquisition and loss is a dynamic adaptation of pathogens to thrive in changing milieus. We investigated the mechanisms of virulence loss at the whole genome level using Babesia bovis as a model apicomplexan in which genetically related attenuated parasites can be reliably derived from virulent parental strains in the natural host. We expected virulence loss to be accompanied by consistent changes at the gene level, and that such changes would be shared among attenuated parasites of diverse geographic and genetic background. Results Surprisingly, while single nucleotide polymorphisms in 14 genes distinguished all attenuated parasites from their virulent parental strains, all non-synonymous changes resulted in no deleterious amino acid modification that could consistently be associated with attenuation (or virulence) in this hemoparasite. Interestingly, however, attenuation significantly reduced the overall population's genome diversity with 81% of base pairs shared among attenuated strains, compared to only 60% of base pairs common among virulent parental parasites. There were significantly fewer genes that were unique to their geographical origins among the attenuated parasites, resulting in a simplified population structure among the attenuated strains. Conclusions This simplified structure includes reduced diversity of the variant erythrocyte surface 1 (ves) multigene family repertoire among attenuated parasites when compared to virulent parental strains, possibly suggesting that overall variance in large protein families such as Variant Erythrocyte Surface Antigens has a critical role in expression of the virulence phenotype. In addition, the results suggest that virulence (or attenuation) mechanisms may not be shared among all populations of parasites at the gene level, but instead may reflect expansion or contraction of the population structure in response to shifting milieus. PMID:21838895

  18. Bacterial origin of a diverse family of UDP-glycosyltransferase genes in the Tetranychus urticae genome.

    PubMed

    Ahn, Seung-Joon; Dermauw, Wannes; Wybouw, Nicky; Heckel, David G; Van Leeuwen, Thomas

    2014-07-01

    UDP-glycosyltransferases (UGTs) catalyze the conjugation of a variety of small lipophilic molecules with uridine diphosphate (UDP) sugars, altering them into more water-soluble metabolites. Thereby, UGTs play an important role in the detoxification of xenobiotics and in the regulation of endobiotics. Recently, the genome sequence was reported for the two-spotted spider mite, Tetranychus urticae, a polyphagous herbivore damaging a number of agricultural crops. Although various gene families implicated in xenobiotic metabolism have been documented in T. urticae, UGTs so far have not. We identified 80 UGT genes in the T. urticae genome, the largest number of UGT genes in a metazoan species reported so far. Phylogenetic analysis revealed that lineage-specific gene expansions increased the diversity of the T. urticae UGT repertoire. Genomic distribution, intron-exon structure and structural motifs in the T. urticae UGTs were also described. In addition, expression profiling after host-plant shifts and in acaricide resistant lines supported an important role for UGT genes in xenobiotic metabolism. Expanded searches of UGTs in other arachnid species (Subphylum Chelicerata), including a spider, a scorpion, two ticks and two predatory mites, unexpectedly revealed the complete absence of UGT genes. However, a centipede (Subphylum Myriapoda) and a water flea and a crayfish (Subphylum Crustacea) contain UGT genes in their genomes similar to insect UGTs, suggesting that the UGT gene family might have been lost early in the Chelicerata lineage and subsequently re-gained in the tetranychid mites. Sequence similarity of T. urticae UGTs and bacterial UGTs and their phylogenetic reconstruction suggest that spider mites acquired UGT genes from bacteria by horizontal gene transfer. Our findings show a unique evolutionary history of the T. urticae UGT gene family among other arthropods and provide important clues to its functions in relation to detoxification and thereby host

  19. Using Whole Genome Analysis to Examine Recombination across Diverse Sequence Types of Staphylococcus aureus.

    PubMed

    Driebe, Elizabeth M; Sahl, Jason W; Roe, Chandler; Bowers, Jolene R; Schupp, James M; Gillece, John D; Kelley, Erin; Price, Lance B; Pearson, Talima R; Hepp, Crystal M; Brzoska, Pius M; Cummings, Craig A; Furtado, Manohar R; Andersen, Paal S; Stegger, Marc; Engelthaler, David M; Keim, Paul S

    2015-01-01

    Staphylococcus aureus is an important clinical pathogen worldwide and understanding this organism's phylogeny and, in particular, the role of recombination, is important both to understand the overall spread of virulent lineages and to characterize outbreaks. To further elucidate the phylogeny of S. aureus, 35 diverse strains were sequenced using whole genome sequencing. In addition, 29 publicly available whole genome sequences were included to create a single nucleotide polymorphism (SNP)-based phylogenetic tree encompassing 11 distinct lineages. All strains of a particular sequence type fell into the same clade with clear groupings of the major clonal complexes of CC8, CC5, CC30, CC45 and CC1. Using a novel analysis method, we plotted the homoplasy density and SNP density across the whole genome and found evidence of recombination throughout the entire chromosome, but when we examined individual clonal lineages we found very little recombination. However, when we analyzed three branches of multiple lineages, we saw intermediate and differing levels of recombination between them. These data demonstrate that in S. aureus, recombination occurs across major lineages that subsequently expand in a clonal manner. Estimated mutation rates for the CC8 and CC5 lineages were different from each other. While the CC8 lineage rate was similar to previous studies, the CC5 lineage was 100-fold greater. Fifty known virulence genes were screened in all genomes in silico to determine their distribution across major clades. Thirty-three genes were present variably across clades, most of which were not constrained by ancestry, indicating horizontal gene transfer or gene loss. PMID:26161978
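
    The abstract describes plotting SNP density and homoplasy density along the chromosome but does not give the method. The following is a generic windowed-density sketch, not the authors' procedure: it counts sites per non-overlapping window and reports them per kilobase. The function name, the 1-based position convention, and the window size are assumptions made for illustration.

```python
from collections import Counter


def site_density(positions, genome_length, window):
    """Sites per kb in consecutive non-overlapping windows.

    positions: 1-based genome coordinates (e.g. SNPs, or only those SNPs
    judged homoplasious on a tree); out-of-range positions are ignored.
    """
    if genome_length <= 0 or window <= 0:
        raise ValueError("genome_length and window must be positive")
    n_windows = -(-genome_length // window)  # ceiling division
    counts = Counter((p - 1) // window for p in positions if 1 <= p <= genome_length)
    densities = []
    for i in range(n_windows):
        # The last window may be shorter than the others.
        width = min(window, genome_length - i * window)
        densities.append(counts.get(i, 0) * 1000 / width)
    return densities


if __name__ == "__main__":
    print(site_density([1, 500, 1000, 1001, 2500], genome_length=3000, window=1000))
```

    Running the same function once on all SNP positions and once on homoplasious positions would give two tracks that can be compared window by window.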

  20. Using Whole Genome Analysis to Examine Recombination across Diverse Sequence Types of Staphylococcus aureus

    PubMed Central

    Driebe, Elizabeth M.; Sahl, Jason W.; Roe, Chandler; Bowers, Jolene R.; Schupp, James M.; Gillece, John D.; Kelley, Erin; Price, Lance B.; Pearson, Talima R.; Hepp, Crystal M.; Brzoska, Pius M.; Cummings, Craig A.; Furtado, Manohar R.; Andersen, Paal S.; Stegger, Marc; Engelthaler, David M.; Keim, Paul S.

    2015-01-01

    Staphylococcus aureus is an important clinical pathogen worldwide and understanding this organism's phylogeny and, in particular, the role of recombination, is important both to understand the overall spread of virulent lineages and to characterize outbreaks. To further elucidate the phylogeny of S. aureus, 35 diverse strains were sequenced using whole genome sequencing. In addition, 29 publicly available whole genome sequences were included to create a single nucleotide polymorphism (SNP)-based phylogenetic tree encompassing 11 distinct lineages. All strains of a particular sequence type fell into the same clade with clear groupings of the major clonal complexes of CC8, CC5, CC30, CC45 and CC1. Using a novel analysis method, we plotted the homoplasy density and SNP density across the whole genome and found evidence of recombination throughout the entire chromosome, but when we examined individual clonal lineages we found very little recombination. However, when we analyzed three branches of multiple lineages, we saw intermediate and differing levels of recombination between them. These data demonstrate that in S. aureus, recombination occurs across major lineages that subsequently expand in a clonal manner. Estimated mutation rates for the CC8 and CC5 lineages were different from each other. While the CC8 lineage rate was similar to previous studies, the CC5 lineage was 100-fold greater. Fifty known virulence genes were screened in all genomes in silico to determine their distribution across major clades. Thirty-three genes were present variably across clades, most of which were not constrained by ancestry, indicating horizontal gene transfer or gene loss. PMID:26161978

  1. DNA variation of the mammalian major histocompatibility complex reflects genomic diversity and population history.

    PubMed Central

    Yuhki, N; O'Brien, S J

    1990-01-01

    The major histocompatibility complex (MHC) is a multigene complex of tightly linked homologous genes that encode cell surface antigens that play a key role in immune regulation and response to foreign antigens. In most species, MHC gene products display extreme antigenic polymorphism, and their variability has been interpreted to reflect an adaptive strategy for accommodating rapidly evolving infectious agents that periodically afflict natural populations. Determination of the extent of MHC variation has been limited to populations in which skin grafting is feasible or for which serological reagents have been developed. We present here a quantitative analysis of restriction fragment length polymorphism of MHC class I genes in several mammalian species (cats, rodents, humans) known to have very different levels of genetic diversity based on functional MHC assays and on allozyme surveys. When homologous class I probes were employed, a notable concordance was observed between the extent of MHC restriction fragment variation and functional MHC variation detected by skin grafts or genome-wide diversity estimated by allozyme screens. These results confirm the genetically depauperate character of the African cheetah, Acinonyx jubatus, and the Asiatic lion, Panthera leo persica; further, they support the use of class I MHC molecular reagents in estimating the extent and character of genetic diversity in natural populations. PMID:1967831

  2. DNA variation of the mammalian major histocompatibility complex reflects genomic diversity and population history

    SciTech Connect

    Yuhki, Naoya; O'Brien, S.J.

    1990-01-01

    The major histocompatibility complex (MHC) is a multigene complex of tightly linked homologous genes that encode cell surface antigens that play a key role in immune regulation and response to foreign antigens. In most species, MHC gene products display extreme antigenic polymorphism, and their variability has been interpreted to reflect an adaptive strategy for accommodating rapidly evolving infectious agents that periodically afflict natural populations. Determination of the extent of MHC variation has been limited to populations in which skin grafting is feasible or for which serological reagents have been developed. The authors present here a quantitative analysis of restriction fragment length polymorphism of MHC class I genes in several mammalian species (cats, rodents, humans) known to have very different levels of genetic diversity based on functional MHC assays and on allozyme surveys. When homologous class I probes were employed, a notable concordance was observed between the extent of MHC restriction fragment variation and functional MHC variation detected by skin grafts or genome-wide diversity estimated by allozyme screens. These results confirm the genetically depauperate character of the African cheetah, Acinonyx jubatus, and the Asiatic lion, Panthera leo persica; further, they support the use of class I MHC molecular reagents in estimating the extent and character of genetic diversity in natural populations.

  3. De novo assembly of soybean wild relatives for pan-genome analysis of diversity and agronomic traits.

    PubMed

    Li, Ying-hui; Zhou, Guangyu; Ma, Jianxin; Jiang, Wenkai; Jin, Long-guo; Zhang, Zhouhao; Guo, Yong; Zhang, Jinbo; Sui, Yi; Zheng, Liangtao; Zhang, Shan-shan; Zuo, Qiyang; Shi, Xue-hui; Li, Yan-fei; Zhang, Wan-ke; Hu, Yiyao; Kong, Guanyi; Hong, Hui-long; Tan, Bing; Song, Jian; Liu, Zhang-xiong; Wang, Yaoshen; Ruan, Hang; Yeung, Carol K L; Liu, Jian; Wang, Hailong; Zhang, Li-juan; Guan, Rong-xia; Wang, Ke-jing; Li, Wen-bin; Chen, Shou-yi; Chang, Ru-zhen; Jiang, Zhi; Jackson, Scott A; Li, Ruiqiang; Qiu, Li-juan

    2014-10-01

    Wild relatives of crops are an important source of genetic diversity for agriculture, but their gene repertoire remains largely unexplored. We report the establishment and analysis of a pan-genome of Glycine soja, the wild relative of cultivated soybean Glycine max, by sequencing and de novo assembly of seven phylogenetically and geographically representative accessions. Intergenomic comparisons identified lineage-specific genes and genes with copy number variation or large-effect mutations, some of which show evidence of positive selection and may contribute to variation of agronomic traits such as biotic resistance, seed composition, flowering and maturity time, organ size and final biomass. Approximately 80% of the pan-genome was present in all seven accessions (core), whereas the rest was dispensable and exhibited greater variation than the core genome, perhaps reflecting a role in adaptation to diverse environments. This work will facilitate the harnessing of untapped genetic diversity from wild soybean for enhancement of elite cultivars. PMID:25218520
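
    The core/dispensable split mentioned above can be illustrated with simple set operations. This is a toy sketch, assuming each accession has already been reduced to a set of gene-family identifiers; the accession names, gene IDs, and function name are hypothetical and do not come from the study.

```python
def partition_pan_genome(gene_sets):
    """Split a pan-genome into core and dispensable gene families.

    gene_sets: mapping of accession name -> iterable of gene-family IDs.
    Returns (pan, core, dispensable) as sets.
    """
    if not gene_sets:
        raise ValueError("need at least one accession")
    sets = [set(genes) for genes in gene_sets.values()]
    pan = set().union(*sets)          # present in at least one accession
    core = set.intersection(*sets)    # present in every accession
    dispensable = pan - core          # missing from at least one accession
    return pan, core, dispensable


if __name__ == "__main__":
    toy = {
        "acc_A": {"g1", "g2", "g3", "g4"},
        "acc_B": {"g1", "g2", "g3", "g5"},
        "acc_C": {"g1", "g2", "g3"},
    }
    pan, core, dispensable = partition_pan_genome(toy)
    print(len(core) / len(pan), sorted(dispensable))
```

    The core fraction reported in the abstract (approximately 80%) corresponds to len(core) / len(pan) in this notation, although the study's gene clustering step is not reproduced here.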

  4. Selection for silage yield and composition did not affect genomic diversity within the Wisconsin Quality Synthetic maize population.

    PubMed

    Lorenz, Aaron J; Beissinger, Timothy M; Silva, Renato Rodrigues; de Leon, Natalia

    2015-04-01

    Maize silage is forage of high quality and yield, and represents the second most important use of maize in the United States. The Wisconsin Quality Synthetic (WQS) maize population has undergone five cycles of recurrent selection for silage yield and composition, resulting in a genetically improved population. The application of high-density molecular markers allows breeders and geneticists to identify important loci through association analysis and selection mapping, as well as to monitor changes in the distribution of genetic diversity across the genome. The objectives of this study were to identify loci controlling variation for maize silage traits through association analysis and the assessment of selection signatures and to describe changes in the genomic distribution of gene diversity through selection and genetic drift in the WQS recurrent selection program. We failed to find any significant marker-trait associations using the historical phenotypic data from WQS breeding trials combined with 17,719 high-quality, informative single nucleotide polymorphisms. Likewise, no strong genomic signatures were left by selection on silage yield and quality in the WQS despite genetic gain for these traits. These results could be due to the genetic complexity underlying these traits, or the role of selection on standing genetic variation. Variation in loss of diversity through drift was observed across the genome. Some large regions experienced much greater loss in diversity than what is expected, suggesting limited recombination combined with small populations in recurrent selection programs could easily lead to fixation of large swaths of the genome. PMID:25645532

  5. Selection for Silage Yield and Composition Did Not Affect Genomic Diversity Within the Wisconsin Quality Synthetic Maize Population

    PubMed Central

    Lorenz, Aaron J.; Beissinger, Timothy M.; Silva, Renato Rodrigues; de Leon, Natalia

    2015-01-01

    Maize silage is forage of high quality and yield, and represents the second most important use of maize in the United States. The Wisconsin Quality Synthetic (WQS) maize population has undergone five cycles of recurrent selection for silage yield and composition, resulting in a genetically improved population. The application of high-density molecular markers allows breeders and geneticists to identify important loci through association analysis and selection mapping, as well as to monitor changes in the distribution of genetic diversity across the genome. The objectives of this study were to identify loci controlling variation for maize silage traits through association analysis and the assessment of selection signatures and to describe changes in the genomic distribution of gene diversity through selection and genetic drift in the WQS recurrent selection program. We failed to find any significant marker-trait associations using the historical phenotypic data from WQS breeding trials combined with 17,719 high-quality, informative single nucleotide polymorphisms. Likewise, no strong genomic signatures were left by selection on silage yield and quality in the WQS despite genetic gain for these traits. These results could be due to the genetic complexity underlying these traits, or the role of selection on standing genetic variation. Variation in loss of diversity through drift was observed across the genome. Some large regions experienced much greater loss in diversity than what is expected, suggesting limited recombination combined with small populations in recurrent selection programs could easily lead to fixation of large swaths of the genome. PMID:25645532

  6. Genome-Wide Diversity and Phylogeography of Mycobacterium avium subsp. paratuberculosis in Canadian Dairy Cattle

    PubMed Central

    Ahlstrom, Christina; Barkema, Herman W.; Stevenson, Karen; Zadoks, Ruth N.; Biek, Roman; Kao, Rowland; Trewby, Hannah; Haupstein, Deb; Kelton, David F.; Fecteau, Gilles; Labrecque, Olivia; Keefe, Greg P.; McKenna, Shawn L. B.; Tahlan, Kapil; De Buck, Jeroen

    2016-01-01

    Mycobacterium avium subsp. paratuberculosis (MAP) is the causative bacterium of Johne’s disease (JD) in ruminants. The control of JD in the dairy industry is challenging, but can be improved with a better understanding of the diversity and distribution of MAP subtypes. Previously established molecular typing techniques used to differentiate MAP have not been sufficiently discriminatory and/or reliable to accurately assess the population structure. In this study, the genetic diversity of 182 MAP isolates representing all Canadian provinces was compared to the known global diversity, using single nucleotide polymorphisms identified through whole genome sequencing. MAP isolates from Canada represented a subset of the known global diversity, as there were global isolates intermingled with Canadian isolates, as well as multiple global subtypes that were not found in Canada. One Type III and six “Bison type” isolates were found in Canada as well as one Type II subtype that represented 86% of all Canadian isolates. Rarefaction estimated larger subtype richness in Québec than in other Canadian provinces using a strict definition of MAP subtypes and lower subtype richness in the Atlantic region using a relaxed definition. Significant phylogeographic clustering was observed at the inter-provincial but not at the intra-provincial level, although most major clades were found in all provinces. The large number of shared subtypes among provinces suggests that cattle movement is a major driver of MAP transmission at the herd level, which is further supported by the lack of spatial clustering on an intra-provincial scale. PMID:26871723
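
    Rarefaction, as used above to compare subtype richness between regions with different numbers of isolates, can be sketched with the classical analytic formula for the expected number of subtypes in a random subsample of n isolates: E[S_n] = sum over subtypes of 1 - C(N - N_i, n) / C(N, n). This is offered as an illustration only; the abstract does not state which rarefaction procedure was used, and the function name and counts below are hypothetical.

```python
from math import comb


def rarefied_richness(subtype_counts, sample_size):
    """Expected number of subtypes in a random subsample without replacement.

    subtype_counts: number of isolates observed for each subtype.
    sample_size: size n of the hypothetical subsample (0 <= n <= total).
    """
    total = sum(subtype_counts)
    if not 0 <= sample_size <= total:
        raise ValueError("sample_size must be between 0 and the total count")
    denom = comb(total, sample_size)
    # comb(total - c, n) / denom is the chance a subtype with c isolates is missed.
    return sum(1 - comb(total - c, sample_size) / denom for c in subtype_counts)


if __name__ == "__main__":
    counts = [5, 3, 2]  # toy data: three subtypes among ten isolates
    for n in (1, 2, 10):
        print(n, round(rarefied_richness(counts, n), 4))
```

    Evaluating two regions at the same subsample size n makes their richness estimates comparable despite unequal sampling effort.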

  7. Whole genome analysis of diverse Chlamydia trachomatis strains identifies phylogenetic relationships masked by current clinical typing

    PubMed Central

    Harris, Simon R.; Clarke, Ian N.; Seth-Smith, Helena M. B.; Solomon, Anthony W.; Cutcliffe, Lesley T.; Marsh, Peter; Skilton, Rachel J.; Holland, Martin J.; Mabey, David; Peeling, Rosanna W.; Lewis, David A.; Spratt, Brian G.; Unemo, Magnus; Persson, Kenneth; Bjartling, Carina; Brunham, Robert; de Vries, Henry J.C.; Morré, Servaas A.; Speksnijder, Arjen; Bébéar, Cécile M.; Clerc, Maïté; de Barbeyrac, Bertille; Parkhill, Julian; Thomson, Nicholas R.

    2012-01-01

    Chlamydia trachomatis is responsible for both trachoma and sexually transmitted infections causing substantial morbidity and economic cost globally. Despite this, our knowledge of its population and evolutionary genetics is limited. Here we present a detailed whole genome phylogeny from representative strains of both trachoma and lymphogranuloma venereum (LGV) biovars from temporally and geographically diverse sources. Our analysis demonstrates that predicting phylogenetic structure using the ompA gene, traditionally used to classify Chlamydia, is misleading because extensive recombination in this region masks true relationships. We show that in many instances ompA is a chimera that can be exchanged in part or whole, both within and between biovars. We also provide evidence for exchange of, and recombination within, the cryptic plasmid, another important diagnostic target. We have used our phylogenetic framework to show how genetic exchange has manifested itself in ocular, urogenital and LGV C. trachomatis strains, including the epidemic LGV serotype L2b. PMID:22406642

  8. MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data.

    PubMed

    Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

    2015-01-01

    The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where the users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to the next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing the users to incorporate the latest genomic data available into an analysis. Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information. PMID:25398900

  9. Small Traditional Human Communities Sustain Genomic Diversity over Microgeographic Scales despite Linguistic Isolation

    PubMed Central

    Cox, Murray P.; Hudjashov, Georgi; Sim, Andre; Savina, Olga; Karafet, Tatiana M.; Sudoyo, Herawati; Lansing, J. Stephen

    2016-01-01

    At least since the Neolithic, humans have largely lived in networks of small, traditional communities. Often socially isolated, these groups evolved distinct languages and cultures over microgeographic scales of just tens of kilometers. Population genetic theory tells us that genetic drift should act quickly in such isolated groups, thus raising the question: do networks of small human communities maintain levels of genetic diversity over microgeographic scales? This question can no longer be asked in most parts of the world, which have been heavily impacted by historical events that make traditional society structures the exception. However, such studies remain possible in parts of Island Southeast Asia and Oceania, where traditional ways of life are still practiced. We captured genome-wide genetic data, together with linguistic records, for a case–study system—eight villages distributed across Sumba, a small, remote island in eastern Indonesia. More than 4,000 years after these communities were established during the Neolithic period, most speak different languages and can be distinguished genetically. Yet their nuclear diversity is not reduced, instead being comparable to other, even much larger, regional groups. Modeling reveals a separation of time scales: while languages and culture can evolve quickly, creating social barriers, sporadic migration averaged over many generations is sufficient to keep villages linked genetically. This loosely-connected network structure, once the global norm and still extant on Sumba today, provides a living proxy to explore fine-scale genome dynamics in the sort of small traditional communities within which the most recent episodes of human evolution occurred. PMID:27274003

  10. Genome-Wide and Paternal Diversity Reveal a Recent Origin of Human Populations in North Africa

    PubMed Central

    Martínez-Cruz, Begoña; Zalloua, Pierre; Benammar Elgaaied, Amel; Comas, David

    2013-01-01

    The geostrategic location of North Africa as a crossroad between three continents and as a stepping-stone outside Africa has evoked anthropological and genetic interest in this region. Numerous studies have described the genetic landscape of the human population in North Africa employing paternal, maternal, and biparental molecular markers. However, information from these markers, which have different inheritance patterns, has been mostly assessed independently, resulting in an incomplete description of the region. In this study, we analyze uniparental and genome-wide markers examining similarities or contrasts in the results and consequently provide a comprehensive description of the evolutionary history of North African populations. Our results show that both males and females in North Africa underwent a similar admixture history with slight differences in the proportions of admixture components. Consequently, genome-wide diversity shows similar patterns, with admixture tests suggesting North Africans are a mixture of ancestral populations related to current Africans and Eurasians with more affinity towards the out-of-Africa populations than to sub-Saharan Africans. We estimate from the paternal lineages that most North Africans emerged ∼15,000 years ago during the last glacial warming and that population splits started after the desiccation of the Sahara. Although most North Africans share a common admixture history, the Tunisian Berbers show long periods of genetic isolation and appear to have diverged from surrounding populations without subsequent mixture. On the other hand, continuous gene flow from the Middle East made Egyptians genetically closer to Eurasians than to other North Africans. We show that genetic diversity of today's North Africans mostly captures patterns from migrations post Last Glacial Maximum and therefore may be insufficient to inform on the initial population of the region during the Middle Paleolithic period. PMID:24312208

  11. Small Traditional Human Communities Sustain Genomic Diversity over Microgeographic Scales despite Linguistic Isolation.

    PubMed

    Cox, Murray P; Hudjashov, Georgi; Sim, Andre; Savina, Olga; Karafet, Tatiana M; Sudoyo, Herawati; Lansing, J Stephen

    2016-09-01

    At least since the Neolithic, humans have largely lived in networks of small, traditional communities. Often socially isolated, these groups evolved distinct languages and cultures over microgeographic scales of just tens of kilometers. Population genetic theory tells us that genetic drift should act quickly in such isolated groups, thus raising the question: do networks of small human communities maintain levels of genetic diversity over microgeographic scales? This question can no longer be asked in most parts of the world, which have been heavily impacted by historical events that make traditional society structures the exception. However, such studies remain possible in parts of Island Southeast Asia and Oceania, where traditional ways of life are still practiced. We captured genome-wide genetic data, together with linguistic records, for a case-study system: eight villages distributed across Sumba, a small, remote island in eastern Indonesia. More than 4,000 years after these communities were established during the Neolithic period, most speak different languages and can be distinguished genetically. Yet their nuclear diversity is not reduced, instead being comparable to other, even much larger, regional groups. Modeling reveals a separation of time scales: while languages and culture can evolve quickly, creating social barriers, sporadic migration averaged over many generations is sufficient to keep villages linked genetically. This loosely-connected network structure, once the global norm and still extant on Sumba today, provides a living proxy to explore fine-scale genome dynamics in the sort of small traditional communities within which the most recent episodes of human evolution occurred. PMID:27274003

  12. Diversity and relationships of cocirculating modern human rotaviruses revealed using large-scale comparative genomics.

    PubMed

    McDonald, Sarah M; McKell, Allison O; Rippinger, Christine M; McAllen, John K; Akopov, Asmik; Kirkness, Ewen F; Payne, Daniel C; Edwards, Kathryn M; Chappell, James D; Patton, John T

    2012-09-01

    Group A rotaviruses (RVs) are 11-segmented, double-stranded RNA viruses and are primary causes of gastroenteritis in young children. Despite their medical relevance, the genetic diversity of modern human RVs is poorly understood, and the impact of vaccine use on circulating strains remains unknown. In this study, we report the complete genome sequence analysis of 58 RVs isolated from children with severe diarrhea and/or vomiting at Vanderbilt University Medical Center (VUMC) in Nashville, TN, during the years spanning community vaccine implementation (2005 to 2009). The RVs analyzed include 36 G1P[8], 18 G3P[8], and 4 G12P[8] Wa-like genogroup 1 strains with VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6 genotype constellations of I1-R1-C1-M1-A1-N1-T1-E1-H1. By constructing phylogenetic trees, we identified 2 to 5 subgenotype alleles for each gene. The results show evidence of intragenogroup gene reassortment among the cocirculating strains. However, several isolates from different seasons maintained identical allele constellations, consistent with the notion that certain RV clades persisted in the community. By comparing the genes of VUMC RVs to those of other archival and contemporary RV strains for which sequences are available, we defined phylogenetic lineages and verified that the diversity of the strains analyzed in this study reflects that seen in other regions of the world. Importantly, the VP4 and VP7 proteins encoded by VUMC RVs and other contemporary strains show amino acid changes in or near neutralization domains, which might reflect antigenic drift of the virus. Thus, this large-scale, comparative genomic study of modern human RVs provides significant insight into how this pathogen evolves during its spread in the community. PMID:22696651

  13. Diversity and Relationships of Cocirculating Modern Human Rotaviruses Revealed Using Large-Scale Comparative Genomics

    PubMed Central

    McKell, Allison O.; Rippinger, Christine M.; McAllen, John K.; Akopov, Asmik; Kirkness, Ewen F.; Payne, Daniel C.; Edwards, Kathryn M.; Chappell, James D.; Patton, John T.

    2012-01-01

    Group A rotaviruses (RVs) are 11-segmented, double-stranded RNA viruses and are primary causes of gastroenteritis in young children. Despite their medical relevance, the genetic diversity of modern human RVs is poorly understood, and the impact of vaccine use on circulating strains remains unknown. In this study, we report the complete genome sequence analysis of 58 RVs isolated from children with severe diarrhea and/or vomiting at Vanderbilt University Medical Center (VUMC) in Nashville, TN, during the years spanning community vaccine implementation (2005 to 2009). The RVs analyzed include 36 G1P[8], 18 G3P[8], and 4 G12P[8] Wa-like genogroup 1 strains with VP6-VP1-VP2-VP3-NSP1-NSP2-NSP3-NSP4-NSP5/6 genotype constellations of I1-R1-C1-M1-A1-N1-T1-E1-H1. By constructing phylogenetic trees, we identified 2 to 5 subgenotype alleles for each gene. The results show evidence of intragenogroup gene reassortment among the cocirculating strains. However, several isolates from different seasons maintained identical allele constellations, consistent with the notion that certain RV clades persisted in the community. By comparing the genes of VUMC RVs to those of other archival and contemporary RV strains for which sequences are available, we defined phylogenetic lineages and verified that the diversity of the strains analyzed in this study reflects that seen in other regions of the world. Importantly, the VP4 and VP7 proteins encoded by VUMC RVs and other contemporary strains show amino acid changes in or near neutralization domains, which might reflect antigenic drift of the virus. Thus, this large-scale, comparative genomic study of modern human RVs provides significant insight into how this pathogen evolves during its spread in the community. PMID:22696651

  14. Employing genome-wide SNP discovery and genotyping strategy to extrapolate the natural allelic diversity and domestication patterns in chickpea

    PubMed Central

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    The genome-wide discovery and high-throughput genotyping of SNPs in chickpea natural germplasm lines is indispensable to extrapolate their natural allelic diversity, domestication, and linkage disequilibrium (LD) patterns leading to the genetic enhancement of this vital legume crop. We discovered 44,844 high-quality SNPs by sequencing of 93 diverse cultivated desi, kabuli, and wild chickpea accessions using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays that were physically mapped across eight chromosomes of desi and kabuli. Of these, 22,542 SNPs were structurally annotated in different coding and non-coding sequence components of genes. Genes with 3296 non-synonymous and 269 regulatory SNPs could functionally differentiate accessions based on their contrasting agronomic traits. A high experimental validation success rate (92%) and reproducibility (100%) along with strong sensitivity (93–96%) and specificity (99%) of GBS-based SNPs were observed. This implies the robustness of GBS as a high-throughput assay for rapid large-scale mining and genotyping of genome-wide SNPs in chickpea with sub-optimal use of resources. With 23,798 genome-wide SNPs, a relatively high intra-specific polymorphic potential (49.5%) and broader molecular diversity (13–89%)/functional allelic diversity (18–77%) were apparent among 93 chickpea accessions, suggesting their tremendous applicability in rapid selection of desirable diverse accessions/inter-specific hybrids in chickpea crossbred varietal improvement programs. The genome-wide SNPs revealed a complex admixed domestication pattern, extensive LD estimates (0.54–0.68) and extended LD decay (400–500 kb) in a structured population inclusive of 93 accessions. These findings reflect the utility of our identified SNPs for subsequent genome-wide association study (GWAS) and selective sweep-based domestication trait dissection analysis to identify potential genomic loci (gene-associated targets) specifically

  15. Whole-Genome Sequencing of Kaposi's Sarcoma-Associated Herpesvirus from Zambian Kaposi's Sarcoma Biopsy Specimens Reveals Unique Viral Diversity

    PubMed Central

    Olp, Landon N.; Jeanniard, Adrien; Marimo, Clemence; West, John T.

    2015-01-01

    Kaposi's sarcoma-associated herpesvirus (KSHV) is the etiological agent for Kaposi's sarcoma (KS). Both KSHV and KS are endemic in sub-Saharan Africa where approximately 84% of global KS cases occur. Nevertheless, whole-genome sequencing of KSHV has only been completed using isolates from Western countries, where KS is not endemic. The lack of whole-genome KSHV sequence data from the most clinically important geographical region, sub-Saharan Africa, represents an important gap since it remains unclear whether genomic diversity plays a role in KSHV pathogenesis. We hypothesized that distinct KSHV genotypes might be present in sub-Saharan Africa compared to Western countries. Using a KSHV-targeted enrichment protocol followed by Illumina deep-sequencing, we generated and analyzed 16 unique Zambian, KS-derived, KSHV genomes. We enriched KSHV DNA over cellular DNA 1,851- to 18,235-fold. Enrichment provided coverage levels up to 24,740-fold, thereby supporting highly confident polymorphism analysis. Multiple alignment of the 16 newly sequenced KSHV genomes showed low-level variability across the entire central conserved region. This variability resulted in distinct phylogenetic clustering between Zambian KSHV genomic sequences and those derived from Western countries. Importantly, the phylogenetic segregation of Zambian from Western sequences occurred irrespective of inclusion of the highly variable genes K1 and K15. We also show that four genes within the more conserved region of the KSHV genome contained polymorphisms that partially, but not fully, contributed to the unique Zambian KSHV whole-genome phylogenetic structure. Taken together, our data suggest that the whole KSHV genome should be taken into consideration for accurate viral characterization. Importance: Our results represent the largest number of KSHV whole-genomic sequences published to date and the first time that multiple genomes have been sequenced from sub-Saharan Africa, a geographic area

  16. Functional Genomics of Novel Secondary Metabolites from Diverse Cyanobacteria Using Untargeted Metabolomics

    PubMed Central

    Baran, Richard; Ivanova, Natalia N.; Jose, Nick; Garcia-Pichel, Ferran; Kyrpides, Nikos C.; Gugger, Muriel; Northen, Trent R.

    2013-01-01

    Mass spectrometry-based metabolomics has become a powerful tool for the detection of metabolites in complex biological systems and for the identification of novel metabolites. We previously identified a number of unexpected metabolites in the cyanobacterium Synechococcus sp. PCC 7002, such as histidine betaine, its derivatives and several unusual oligosaccharides. To test for the presence of these compounds and to assess the diversity of small polar metabolites in other cyanobacteria, we profiled cell extracts of nine strains representing much of the morphological and evolutionary diversification of this phylum. Spectral features in raw metabolite profiles obtained by normal phase liquid chromatography coupled to mass spectrometry (MS) were manually curated so that chemical formulae of metabolites could be assigned. For putative identification, retention times and MS/MS spectra were cross-referenced with those of standards or available spectral library records. Overall, we detected 264 distinct metabolites. These indeed included different betaines and oligosaccharides, as well as additional unidentified metabolites with chemical formulae not present in databases of metabolism. Some of these metabolites were detected only in a single strain, but some were present in more than one. Genomic interrogation of the strains revealed that generally, presence of a given metabolite corresponded well with the presence of its biosynthetic genes, if known. Our results show the potential of combining metabolite profiling and genomics for the identification of novel biosynthetic genes. PMID:24084783

  17. Genome scale transcriptional response diversity among ten ecotypes of Arabidopsis thaliana during heat stress

    PubMed Central

    Barah, Pankaj; Jayavelu, Naresh D.; Mundy, John; Bones, Atle M.

    2013-01-01

    In the scenario of global warming and climate change, heat stress is a serious threat to crop production worldwide. Being sessile, plants cannot escape from heat. Plants have developed various adaptive mechanisms to survive heat stress. Several studies have focused on diversity of heat tolerance levels in divergent Arabidopsis thaliana (A. thaliana) ecotypes, but comprehensive genome scale understanding of heat stress response in plants is still lacking. Here we report the genome scale transcript responses to heat stress of 10 A. thaliana ecotypes (Col, Ler, C24, Cvi, Kas1, An1, Sha, Kyo2, Eri, and Kond) originating from different geographical locations. During the experiment, A. thaliana plants were subjected to heat stress (38°C) and transcript responses were monitored using Arabidopsis NimbleGen ATH6 microarrays. The responses of A. thaliana ecotypes exhibited considerable variation in the transcript abundance levels. In total, 3644 transcripts were significantly heat regulated (p < 0.01) in the 10 ecotypes, including 244 transcription factors and 203 transposable elements. By employing a systems genetics approach, Network Component Analysis (NCA), we have constructed an in silico transcript regulatory network model for 35 heat responsive transcription factors during cellular responses to heat stress in A. thaliana. The computed activities of the 35 transcription factors showed ecotype specific responses to the heat treatment. PMID:24409190

  18. PCR-based positive hybridization to detect genomic diversity associated with bacterial secondary metabolism

    PubMed Central

    Pomati, Francesco; Neilan, Brett A.

    2004-01-01

    A PCR-based positive hybridization (PPH) method was developed to explore toxic-specific genes in common between toxigenic strains of Anabaena circinalis, a cyanobacterium able to produce saxitoxin (STX). The PPH technique is based on the same principles as suppression subtractive hybridization (SSH), although with the former no driver DNA is required and two tester genomic DNAs are hybridized at high stringency. The aim was to obtain genes associated with cyanobacterial STX production. The genetic diversity within phylogenetically similar strains of A. circinalis was investigated by comparing the results of the standard SSH protocol to the PPH approach by DNA-microarray analysis. SSH allowed the recovery of DNA libraries that were mainly specific for each of the two STX-producing strains used. Several candidate sequences were found by PPH to be in common between both the STX-producing testers. The PPH technique performed using unsubtracted genomic libraries proved to be a powerful tool to identify DNA sequences possibly transferred laterally between two cyanobacterial strains that may be candidate(s) in STX biosynthesis. The approach presented in this study represents a novel and valid tool to study the genetic basis for secondary metabolite production in microorganisms. PMID:14718552

  19. Contrasting Genomic Diversity in Two Closely Related Postharvest Pathogens: Penicillium digitatum and Penicillium expansum

    PubMed Central

    Julca, Irene; Droby, Samir; Sela, Noa; Marcet-Houben, Marina; Gabaldón, Toni

    2016-01-01

    Penicillium digitatum and Penicillium expansum are two closely related fungal plant pathogens causing green and blue mold in harvested fruit, respectively. The two species differ in their host specificity, with P. digitatum restricted to citrus fruits and P. expansum able to infect a wide range of fruits after harvest. Although host-specific Penicillium species have been found to have a smaller gene content, it is so far unclear whether these different host specificities impact genome variation at the intraspecific level. Here we assessed genome variation across four P. digitatum and seven P. expansum isolates from geographically distant regions. Our results show very high similarity (average 0.06 SNPs [single nucleotide polymorphisms] per kb) between globally distributed isolates of P. digitatum, pointing to a recent expansion of a single lineage. This low level of genetic variation found in our samples contrasts with the higher genetic variability observed in the similarly distributed P. expansum isolates (2.44 SNPs per kb). Patterns of polymorphism in P. expansum indicate that recombination exists between genetically diverged strains. Consistent with the existence of sexual recombination and heterothallism, which was unknown for this species, we identified the two alternative mating types in different P. expansum isolates. Patterns of polymorphism in P. digitatum indicate a recent clonal population expansion of a single lineage that has reached worldwide distribution. We suggest that the contrasting patterns of genomic variation between the two species reflect underlying differences in population dynamics related to host specificities and related agricultural practices. It should be noted, however, that these results should be confirmed with a larger sampling of strains, as new strains may broaden the diversity so far found in P. digitatum. PMID:26672008

  20. Probing the diversity of chloromethane-degrading bacteria by comparative genomics and isotopic fractionation

    PubMed Central

    Nadalig, Thierry; Greule, Markus; Bringel, Françoise; Keppler, Frank; Vuilleumier, Stéphane

    2014-01-01

    Chloromethane (CH3Cl) is produced on earth by a variety of abiotic and biological processes. It is the most important halogenated trace gas in the atmosphere, where it contributes to ozone destruction. Current estimates of the global CH3Cl budget are uncertain and suggest that microorganisms might play a more important role in degrading atmospheric CH3Cl than previously thought. Its degradation by bacteria has been demonstrated in marine, terrestrial, and phyllospheric environments. Improving our knowledge of these degradation processes and their magnitude is thus highly relevant for a better understanding of the global budget of CH3Cl. The cmu pathway, for chloromethane utilisation, is the only microbial pathway for CH3Cl degradation elucidated so far, and was characterized in detail in aerobic methylotrophic Alphaproteobacteria. Here, we reveal the potential of using a two-pronged approach involving a combination of comparative genomics and isotopic fractionation during CH3Cl degradation to newly address the question of the diversity of chloromethane-degrading bacteria in the environment. Analysis of available bacterial genome sequences reveals that several bacteria not yet known to degrade CH3Cl contain part or all of the complement of cmu genes required for CH3Cl degradation. These organisms, unlike bacteria shown to grow with CH3Cl using the cmu pathway, are obligate anaerobes. On the other hand, analysis of the complete genome of the chloromethane-degrading bacterium Leisingera methylohalidivorans MB2 showed that this bacterium does not contain cmu genes. Isotope fractionation experiments with L. methylohalidivorans MB2 suggest that the unknown pathway used by this bacterium for growth with CH3Cl can be differentiated from the cmu pathway. This result opens the prospect that contributions from bacteria with the cmu and Leisingera-type pathways to the atmospheric CH3Cl budget may be teased apart in the future. PMID:25360131

  1. Contrasting Genomic Diversity in Two Closely Related Postharvest Pathogens: Penicillium digitatum and Penicillium expansum.

    PubMed

    Julca, Irene; Droby, Samir; Sela, Noa; Marcet-Houben, Marina; Gabaldón, Toni

    2016-01-01

    Penicillium digitatum and Penicillium expansum are two closely related fungal plant pathogens causing green and blue mold in harvested fruit, respectively. The two species differ in their host specificity, with P. digitatum restricted to citrus fruits and P. expansum able to infect a wide range of fruits after harvest. Although host-specific Penicillium species have been found to have a smaller gene content, it is so far unclear whether these different host specificities impact genome variation at the intraspecific level. Here we assessed genome variation across four P. digitatum and seven P. expansum isolates from geographically distant regions. Our results show very high similarity (average 0.06 SNPs [single nucleotide polymorphisms] per kb) between globally distributed isolates of P. digitatum, pointing to a recent expansion of a single lineage. This low level of genetic variation found in our samples contrasts with the higher genetic variability observed in the similarly distributed P. expansum isolates (2.44 SNPs per kb). Patterns of polymorphism in P. expansum indicate that recombination exists between genetically diverged strains. Consistent with the existence of sexual recombination and heterothallism, which was unknown for this species, we identified the two alternative mating types in different P. expansum isolates. Patterns of polymorphism in P. digitatum indicate a recent clonal population expansion of a single lineage that has reached worldwide distribution. We suggest that the contrasting patterns of genomic variation between the two species reflect underlying differences in population dynamics related to host specificities and related agricultural practices. It should be noted, however, that these results should be confirmed with a larger sampling of strains, as new strains may broaden the diversity so far found in P. digitatum. PMID:26672008

  2. Diverse patterns of genomic targeting by transcriptional regulators in Drosophila melanogaster

    PubMed Central

    Slattery, Matthew; Ma, Lijia; Spokony, Rebecca F.; Arthur, Robert K.; Kheradpour, Pouya; Kundaje, Anshul; Nègre, Nicolas; Crofts, Alex; Ptashkin, Ryan; Zieba, Jennifer; Ostapenko, Alexander; Suchy, Sarah; Victorsen, Alec; Jameel, Nader; Grundstad, A. Jason; Gao, Wenxuan; Moran, Jennifer R.; Rehm, E. Jay; Grossman, Robert L.; Kellis, Manolis; White, Kevin P.

    2014-01-01

    Annotation of regulatory elements and identification of the transcription-related factors (TRFs) targeting these elements are key steps in understanding how cells interpret their genetic blueprint and their environment during development, and how that process goes awry in the case of disease. One goal of the modENCODE (model organism ENCyclopedia of DNA Elements) Project is to survey a diverse sampling of TRFs, both DNA-binding and non-DNA-binding factors, to provide a framework for the subsequent study of the mechanisms by which transcriptional regulators target the genome. Here we provide an updated map of the Drosophila melanogaster regulatory genome based on the location of 84 TRFs at various stages of development. This regulatory map reveals a variety of genomic targeting patterns, including factors with strong preferences toward proximal promoter binding, factors that target intergenic and intronic DNA, and factors with distinct chromatin state preferences. The data also highlight the stringency of the Polycomb regulatory network, and show association of the Trithorax-like (Trl) protein with hotspots of DNA binding throughout development. Furthermore, the data identify more than 5800 instances in which TRFs target DNA regions with demonstrated enhancer activity. Regions of high TRF co-occupancy are more likely to be associated with open enhancers used across cell types, while lower TRF occupancy regions are associated with complex enhancers that are also regulated at the epigenetic level. Together these data serve as a resource for the research community in the continued effort to dissect transcriptional regulatory mechanisms directing Drosophila development. PMID:24985916

  3. Obesity genomics: assessing the transferability of susceptibility loci across diverse populations

    PubMed Central

    2013-01-01

    The prevalence of obesity has nearly doubled worldwide over the past three decades, but substantial differences exist between nations. Although these differences are partly due to the degree of westernization, genetic factors also contribute. To date, little is known about whether the same genes contribute to obesity-susceptibility in populations of different ancestry. We review the transferability of obesity-susceptibility loci (identified by genome-wide association studies) using both single nucleotide polymorphism (SNP) and locus-wide comparisons. SNPs in FTO and near MC4R, obesity-susceptibility loci first identified in Europeans, replicate widely across other ancestries. SNP-to-SNP comparisons suggest that more than half of the 36 body mass index-associated loci are shared across European and East Asian ancestry populations, whereas locus-wide analyses suggest that the transferability might be even more extensive. Furthermore, by taking advantage of differences in haplotype structure, populations of different ancestries can help to narrow down loci, thereby pinpointing causal genes for functional follow-up. Larger-scale genetic association studies in ancestrally diverse populations will be needed for in-depth and locus-wide analyses aimed at determining, with greater confidence, the transferability of loci and allowing fine-mapping. Understanding similarities and differences in genetic susceptibility across populations of diverse ancestries might eventually contribute to a more targeted prevention and customized treatment of obesity. PMID:23806069

  4. Obesity genomics: assessing the transferability of susceptibility loci across diverse populations.

    PubMed

    Lu, Yingchang; Loos, Ruth Jf

    2013-01-01

    The prevalence of obesity has nearly doubled worldwide over the past three decades, but substantial differences exist between nations. Although these differences are partly due to the degree of westernization, genetic factors also contribute. To date, little is known about whether the same genes contribute to obesity-susceptibility in populations of different ancestry. We review the transferability of obesity-susceptibility loci (identified by genome-wide association studies) using both single nucleotide polymorphism (SNP) and locus-wide comparisons. SNPs in FTO and near MC4R, obesity-susceptibility loci first identified in Europeans, replicate widely across other ancestries. SNP-to-SNP comparisons suggest that more than half of the 36 body mass index-associated loci are shared across European and East Asian ancestry populations, whereas locus-wide analyses suggest that the transferability might be even more extensive. Furthermore, by taking advantage of differences in haplotype structure, populations of different ancestries can help to narrow down loci, thereby pinpointing causal genes for functional follow-up. Larger-scale genetic association studies in ancestrally diverse populations will be needed for in-depth and locus-wide analyses aimed at determining, with greater confidence, the transferability of loci and allowing fine-mapping. Understanding similarities and differences in genetic susceptibility across populations of diverse ancestries might eventually contribute to a more targeted prevention and customized treatment of obesity. PMID:23806069

  5. Genomic diversity and differentiation of a managed island wild boar population.

    PubMed

    Iacolina, L; Scandura, M; Goedbloed, D J; Alexandri, P; Crooijmans, R P M A; Larson, G; Archibald, A; Apollonio, M; Schook, L B; Groenen, M A M; Megens, H-J

    2016-01-01

    The evolution of island populations in natural systems is driven by local adaptation and genetic drift. However, evolutionary pathways may be altered by humans in several ways. The wild boar (WB) (Sus scrofa) is an iconic game species occurring in several islands, where it has been strongly managed since prehistoric times. We examined genomic diversity at 49 803 single-nucleotide polymorphisms in 99 Sardinian WBs and compared them with 196 wild specimens from mainland Europe and 105 domestic pigs (DP; 11 breeds). High levels of genetic variation were observed in Sardinia (80.9% of the total number of polymorphisms), which can be only in part associated with recent genetic introgression. Both Principal Component Analysis and a Bayesian clustering approach revealed that the Sardinian WB population is highly differentiated from the other European populations (FST=0.126-0.138), and from DP (FST=0.169). This evidence was mostly unaffected by an uneven sample size, although clustering results in reference populations changed when the number of individuals was standardized. The pattern and distribution of runs of homozygosity (ROHs) in Sardinian WB are consistent with a past expansion following a bottleneck (small ROHs) and recent population substructuring (highly homozygous individuals). The observed effect of a non-random selection of Sardinian individuals on diversity, FST and ROH estimates stressed the importance of sampling design in the study of structured or introgressed populations. Our results support the heterogeneity and distinctiveness of the Sardinian population and prompt further investigations on its origins and conservation status. PMID:26243137

  6. Genomic resolution of an aggressive, widespread, diverse and expanding meningococcal serogroup B, C and W lineage

    PubMed Central

    Lucidarme, Jay; Hill, Dorothea M.C.; Bratcher, Holly B.; Gray, Steve J.; du Plessis, Mignon; Tsang, Raymond S.W.; Vazquez, Julio A.; Taha, Muhamed-Kheir; Ceyhan, Mehmet; Efron, Adriana M.; Gorla, Maria C.; Findlow, Jamie; Jolley, Keith A.; Maiden, Martin C.J.; Borrow, Ray

    2015-01-01

    Objectives: Neisseria meningitidis is a leading cause of meningitis and septicaemia. The hyperinvasive ST-11 clonal complex (cc11) caused serogroup C (MenC) outbreaks in the US military in the 1960s and UK universities in the 1990s, a global Hajj-associated serogroup W (MenW) outbreak in 2000–2001, and subsequent MenW epidemics in sub-Saharan Africa. More recently, endemic MenW disease has expanded in South Africa, South America and the UK, and MenC cases have been reported among European and North American men who have sex with men (MSM). Routine typing schemes poorly resolve cc11 so we established the population structure at genomic resolution. Methods: Representatives of these episodes and other geo-temporally diverse cc11 meningococci (n = 750) were compared across 1546 core genes and visualised on phylogenetic networks. Results: MenW isolates were confined to a distal portion of one of two main lineages with MenB and MenC isolates interspersed elsewhere. An expanding South American/UK MenW strain was distinct from the ‘Hajj outbreak’ strain and a closely related endemic South African strain. Recent MenC isolates from MSM in France and the UK were closely related but distinct. Conclusions: High resolution ‘genomic’ multilocus sequence typing is necessary to resolve and monitor the spread of diverse cc11 lineages globally. PMID:26226598

  7. Comparative Genomics Reveals the Origins and Diversity of Arthropod Immune Systems.

    PubMed

    Palmer, William J; Jiggins, Francis M

    2015-08-01

    Insects are an important model for the study of innate immune systems, but remarkably little is known about the immune system of other arthropod groups despite their importance as disease vectors, pests, and components of biological diversity. Using comparative genomics, we have characterized the immune system of all the major groups of arthropods beyond insects for the first time--studying five chelicerates, a myriapod, and a crustacean. We found clear traces of an ancient origin of innate immunity, with some arthropods having Toll-like receptors and C3-complement factors that are more closely related in sequence or structure to vertebrates than other arthropods. Across the arthropods some components of the immune system, such as the Toll signaling pathway, are highly conserved. However, there is also remarkable diversity. The chelicerates apparently lack the Imd signaling pathway and beta-1,3 glucan binding proteins--a key class of pathogen recognition receptors. Many genes have large copy number variation across species, and this may sometimes be accompanied by changes in function. For example, we find that peptidoglycan recognition proteins have frequently lost their catalytic activity and switch between secreted and intracellular forms. We also find that there has been widespread and extensive duplication of the cellular immune receptor Dscam (Down syndrome cell adhesion molecule), which may be an alternative way to generate the high diversity produced by alternative splicing in insects. In the antiviral short interfering RNAi pathway Argonaute 2 evolves rapidly and is frequently duplicated, with a highly variable copy number. Our results provide a detailed analysis of the immune systems of several important groups of animals for the first time and lay the foundations for functional work on these groups. PMID:25908671

  8. Comparative Genomics Reveals the Origins and Diversity of Arthropod Immune Systems

    PubMed Central

    Palmer, William J.; Jiggins, Francis M.

    2015-01-01

    Insects are an important model for the study of innate immune systems, but remarkably little is known about the immune system of other arthropod groups despite their importance as disease vectors, pests, and components of biological diversity. Using comparative genomics, we have characterized the immune system of all the major groups of arthropods beyond insects for the first time—studying five chelicerates, a myriapod, and a crustacean. We found clear traces of an ancient origin of innate immunity, with some arthropods having Toll-like receptors and C3-complement factors that are more closely related in sequence or structure to vertebrates than other arthropods. Across the arthropods some components of the immune system, such as the Toll signaling pathway, are highly conserved. However, there is also remarkable diversity. The chelicerates apparently lack the Imd signaling pathway and beta-1,3 glucan binding proteins—a key class of pathogen recognition receptors. Many genes have large copy number variation across species, and this may sometimes be accompanied by changes in function. For example, we find that peptidoglycan recognition proteins have frequently lost their catalytic activity and switch between secreted and intracellular forms. We also find that there has been widespread and extensive duplication of the cellular immune receptor Dscam (Down syndrome cell adhesion molecule), which may be an alternative way to generate the high diversity produced by alternative splicing in insects. In the antiviral short interfering RNAi pathway Argonaute 2 evolves rapidly and is frequently duplicated, with a highly variable copy number. Our results provide a detailed analysis of the immune systems of several important groups of animals for the first time and lay the foundations for functional work on these groups. PMID:25908671

  9. A searchable, whole genome resource designed for protein variant analysis in diverse lineages of U.S. beef cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A key feature of a gene's function is the variety of protein isoforms it encodes in a population. However, the genetic diversity in bovine whole genome databases tends to be underrepresented because these databases contain an abundance of sequence from the most influential sires. Our first aim was ...

  10. Exploring the diversity of Arcobacter spp. in cattle in the UK using MLST and whole genome sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Arcobacter butzleri is considered to be an emerging human foodborne pathogen. The completion of an A. butzleri genome sequence along with microarray analysis of 13 isolates in 2007 revealed a surprising amount of diversity amongst A. butzleri isolates from humans, animals and food. In order to furth...