Science.gov

Sample records for bacterioplankton genomes inferred

  1. Excess of non-conservative amino acid changes in marine bacterioplankton lineages with reduced genomes.

    PubMed

    Luo, Haiwei; Huang, Yongjie; Stepanauskas, Ramunas; Tang, Jijun

    2017-06-12

    Surface ocean waters are dominated by planktonic bacterial lineages with highly reduced genomes. The best examples are the cyanobacterial genus Prochlorococcus, the alphaproteobacterial clade SAR11 and the gammaproteobacterial clade SAR86, which together represent over 50% of the cells in surface oceans. Several studies have identified signatures of selection on these lineages in today's ocean and have postulated selection as the primary force throughout their evolutionary history. However, massive loss of genomic DNA in these lineages often occurred in the distant past, and the selective pressures underlying these ancient events have not been assessed. Here, we probe ancient selective pressures by computing %GC-corrected rates of conservative and radical nonsynonymous nucleotide substitutions. Surprisingly, we found an excess of radical changes in several of these lineages in comparison to their relatives with larger genomes. Furthermore, analyses of allelic genome sequences of several populations within these lineages consistently supported that radical replacements are more likely to be deleterious than conservative changes. Our results suggest coincidence of massive genomic DNA losses and increased power of genetic drift, but we also suggest that additional evidence independent of the nucleotide substitution analyses is needed to support a primary role of genetic drift driving ancient genome reduction of marine bacterioplankton lineages.
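
As a rough illustration of the conservative/radical distinction used in this abstract, the sketch below classifies amino acid replacements by charge class. The charge grouping (R, H, K positive; D, E negative; all others neutral) is an assumed scheme chosen for illustration, and the function names are hypothetical. The abstract does not state which classification the authors used, and the %GC correction is not reproduced here.

```python
from collections import Counter

# Assumed charge-based grouping; other schemes (polarity, volume) exist.
POSITIVE = set("RHK")
NEGATIVE = set("DE")


def charge_class(aa):
    """Return the charge class of a one-letter amino acid code."""
    if aa in POSITIVE:
        return "positive"
    if aa in NEGATIVE:
        return "negative"
    return "neutral"  # every other residue is treated as neutral


def classify_replacement(aa_from, aa_to):
    """Label a replacement as conservative (same class) or radical."""
    if aa_from == aa_to:
        raise ValueError("identical residues are not a replacement")
    same = charge_class(aa_from) == charge_class(aa_to)
    return "conservative" if same else "radical"


def count_replacements(pairs):
    """Count conservative and radical changes, skipping identical pairs."""
    counts = Counter()
    for aa_from, aa_to in pairs:
        if aa_from != aa_to:
            counts[classify_replacement(aa_from, aa_to)] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_replacements([("K", "R"), ("D", "K"), ("A", "A"), ("L", "V")]))
```

Comparing such counts between lineages would still need normalization by the number of conservative and radical sites, which this sketch leaves out.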

  2. Comparative analysis of deep-sea bacterioplankton OMICS revealed the occurrence of habitat-specific genomic attributes.

    PubMed

    Smedile, Francesco; Messina, Enzo; La Cono, Violetta; Yakimov, Michail M

    2014-10-01

    Bathyal aphotic ocean represents the largest biotope on our planet, which sustains highly diverse but low-density microbial communities, with yet untapped genomic attributes, potentially useful for discovery of new biomolecules, industrial enzymes and pathways. In the last two decades, culture-independent approaches of high-throughput sequencing have provided new insights into structure and function of marine bacterioplankton, leading to unprecedented opportunities to accurately characterize microbial communities and their interactions with the environments. In the present review we focused on the analysis of relatively few deep-sea OMICS studies, completed thus far, to find the specific genomic patterns determining the lifeway and adaptation mechanisms of prokaryotes thriving in the dark deep ocean below the depth of 1000m. Phylogenomic and omic studies provided clear evidence that the bathyal microbial communities are distinct from the epipelagic counterparts and, along with generally larger genomes, possess their own habitat-specific genomic attributes. The high abundance in the deep ocean OMICS of the systems for environmental sensing, signal transduction and metabolic versatility as compared to the epipelagic counterparts is thought to enable the deep-sea bacterioplankton to rapidly adapt to changing environmental conditions associated with resource scarcity and high diversity of energy and carbon substrates in the bathyal biotopes. Together with a versatile heterotrophy, mixotrophy and anaplerosis are thought to enable the deep-sea bacterioplankton to cope with these environmental conditions.

  3. Single-cell genomics-based analysis of virus–host interactions in marine surface bacterioplankton

    PubMed Central

    Labonté, Jessica M; Swan, Brandon K; Poulos, Bonnie; Luo, Haiwei; Koren, Sergey; Hallam, Steven J; Sullivan, Matthew B; Woyke, Tanja; Wommack, K Eric; Stepanauskas, Ramunas

    2015-01-01

    Viral infections dynamically alter the composition and metabolic potential of marine microbial communities and the evolutionary trajectories of host populations with resulting feedback on biogeochemical cycles. It is quite possible that all microbial populations in the ocean are impacted by viral infections. Our knowledge of virus–host relationships, however, has been limited to a minute fraction of cultivated host groups. Here, we utilized single-cell sequencing to obtain genomic blueprints of viruses inside or attached to individual bacterial and archaeal cells captured in their native environment, circumventing the need for host and virus cultivation. A combination of comparative genomics, metagenomic fragment recruitment, sequence anomalies and irregularities in sequence coverage depth and genome recovery were utilized to detect viruses and to decipher modes of virus–host interactions. Members of all three tailed phage families were identified in 20 out of 58 phylogenetically and geographically diverse single amplified genomes (SAGs) of marine bacteria and archaea. At least four phage–host interactions had the characteristics of late lytic infections, all of which were found in metabolically active cells. One virus had genetic potential for lysogeny. Our findings include first known viruses of Thaumarchaeota, Marinimicrobia, Verrucomicrobia and Gammaproteobacteria clusters SAR86 and SAR92. Viruses were also found in SAGs of Alphaproteobacteria and Bacteroidetes. A high fragment recruitment of viral metagenomic reads confirmed that most of the SAG-associated viruses are abundant in the ocean. Our study demonstrates that single-cell genomics, in conjunction with sequence-based computational tools, enable in situ, cultivation-independent insights into host–virus interactions in complex microbial communities. PMID:25848873

  4. Single-cell genomics-based analysis of virus–host interactions in marine surface bacterioplankton

    DOE PAGES

    Labonté, Jessica M.; Swan, Brandon K.; Poulos, Bonnie; ...

    2015-04-07

    Viral infections dynamically alter the composition and metabolic potential of marine microbial communities and the evolutionary trajectories of host populations with resulting feedback on biogeochemical cycles. It is quite possible that all microbial populations in the ocean are impacted by viral infections. Our knowledge of virus–host relationships, however, has been limited to a minute fraction of cultivated host groups. Here, we utilized single-cell sequencing to obtain genomic blueprints of viruses inside or attached to individual bacterial and archaeal cells captured in their native environment, circumventing the need for host and virus cultivation. Furthermore, a combination of comparative genomics, metagenomic fragment recruitment, sequence anomalies and irregularities in sequence coverage depth and genome recovery were utilized to detect viruses and to decipher modes of virus–host interactions. Members of all three tailed phage families were identified in 20 out of 58 phylogenetically and geographically diverse single amplified genomes (SAGs) of marine bacteria and archaea. At least four phage–host interactions had the characteristics of late lytic infections, all of which were found in metabolically active cells. One virus had genetic potential for lysogeny. Our findings include first known viruses of Thaumarchaeota, Marinimicrobia, Verrucomicrobia and Gammaproteobacteria clusters SAR86 and SAR92. Viruses were also found in SAGs of Alphaproteobacteria and Bacteroidetes. A high fragment recruitment of viral metagenomic reads confirmed that most of the SAG-associated viruses are abundant in the ocean. This study demonstrates that single-cell genomics, in conjunction with sequence-based computational tools, enable in situ, cultivation-independent insights into host–virus interactions in complex microbial communities.

  5. Single-cell genomics-based analysis of virus–host interactions in marine surface bacterioplankton

    SciTech Connect

    Labonté, Jessica M.; Swan, Brandon K.; Poulos, Bonnie; Luo, Haiwei; Koren, Sergey; Hallam, Steven J.; Sullivan, Matthew B.; Woyke, Tanja; Wommack, K. Eric; Stepanauskas, Ramunas

    2015-04-07

    Viral infections dynamically alter the composition and metabolic potential of marine microbial communities and the evolutionary trajectories of host populations with resulting feedback on biogeochemical cycles. It is quite possible that all microbial populations in the ocean are impacted by viral infections. Our knowledge of virus–host relationships, however, has been limited to a minute fraction of cultivated host groups. Here, we utilized single-cell sequencing to obtain genomic blueprints of viruses inside or attached to individual bacterial and archaeal cells captured in their native environment, circumventing the need for host and virus cultivation. Furthermore, a combination of comparative genomics, metagenomic fragment recruitment, sequence anomalies and irregularities in sequence coverage depth and genome recovery were utilized to detect viruses and to decipher modes of virus–host interactions. Members of all three tailed phage families were identified in 20 out of 58 phylogenetically and geographically diverse single amplified genomes (SAGs) of marine bacteria and archaea. At least four phage–host interactions had the characteristics of late lytic infections, all of which were found in metabolically active cells. One virus had genetic potential for lysogeny. Our findings include first known viruses of Thaumarchaeota, Marinimicrobia, Verrucomicrobia and Gammaproteobacteria clusters SAR86 and SAR92. Viruses were also found in SAGs of Alphaproteobacteria and Bacteroidetes. A high fragment recruitment of viral metagenomic reads confirmed that most of the SAG-associated viruses are abundant in the ocean. This study demonstrates that single-cell genomics, in conjunction with sequence-based computational tools, enable in situ, cultivation-independent insights into host–virus interactions in complex microbial communities.

  6. Freshwater bacterial lifestyles inferred from comparative genomics.

    PubMed

    Livermore, Joshua A; Emrich, Scott J; Tan, John; Jones, Stuart E

    2014-03-01

    While micro-organisms actively mediate and participate in freshwater ecosystem services, we know little about freshwater microbial genetic diversity. Genome sequences are available for many bacteria from the human microbiome and the ocean (over 800 and 200, respectively), but only two freshwater genomes are currently available: the streamlined genomes of Polynucleobacter necessarius ssp. asymbioticus and the Actinobacterium AcI-B1. Here, we sequenced and analysed draft genomes of eight phylogenetically diverse freshwater bacteria exhibiting a range of lifestyle characteristics. Comparative genomics of these bacteria reveals putative freshwater bacterial lifestyles based on differences in predicted growth rate, capability to respond to environmental stimuli and diversity of useable carbon substrates. Our conceptual model based on these genomic characteristics provides a foundation on which further ecophysiological and genomic studies can be built. In addition, these genomes greatly expand the diversity of existing genomic context for future studies on the ecology and genetics of freshwater bacteria.

  7. Inferring parental genomic ancestries using pooled semi-Markov processes.

    PubMed

    Zou, James Y; Halperin, Eran; Burchard, Esteban; Sankararaman, Sriram

    2015-06-15

    A basic problem of broad public and scientific interest is to use the DNA of an individual to infer the genomic ancestries of the parents. In particular, we are often interested in the fraction of each parent's genome that comes from specific ancestries (e.g. European, African, Native American, etc). This has many applications ranging from understanding the inheritance of ancestry-related risks and traits to quantifying human assortative mating patterns. We model the problem of parental genomic ancestry inference as a pooled semi-Markov process. We develop a general mathematical framework for pooled semi-Markov processes and construct efficient inference algorithms for these models. Applying our inference algorithm to genotype data from 231 Mexican trios and 258 Puerto Rican trios where we have the true genomic ancestry of each parent, we demonstrate that our method accurately infers parameters of the semi-Markov processes and parents' genomic ancestries. We additionally validated the method on simulations. Our model of pooled semi-Markov process and inference algorithms may be of independent interest in other settings in genomics and machine learning. © The Author 2015. Published by Oxford University Press.

  8. Use of Whole Genome Sequence Data To Infer Baculovirus Phylogeny

    PubMed Central

    Herniou, Elisabeth A.; Luque, Teresa; Chen, Xinwen; Vlak, Just M.; Winstanley, Doreen; Cory, Jennifer S.; O'Reilly, David R.

    2001-01-01

    Several phylogenetic methods based on whole genome sequence data were evaluated using data from nine complete baculovirus genomes. The utility of three independent character sets was assessed. The first data set comprised the sequences of the 63 genes common to these viruses. The second set of characters was based on gene order, and phylogenies were inferred using both breakpoint distance analysis and a novel method developed here, termed neighbor pair analysis. The third set recorded gene content by scoring gene presence or absence in each genome. All three data sets yielded phylogenies supporting the separation of the Nucleopolyhedrovirus (NPV) and Granulovirus (GV) genera, the division of the NPVs into groups I and II, and species relationships within group I NPVs. Generation of phylogenies based on the combined sequences of all 63 shared genes proved to be the most effective approach to resolving the relationships among the group II NPVs and the GVs. The history of gene acquisitions and losses that have accompanied baculovirus diversification was visualized by mapping the gene content data onto the phylogenetic tree. This analysis highlighted the fluid nature of baculovirus genomes, with evidence of frequent genome rearrangements and multiple gene content changes during their evolution. Of more than 416 genes identified in the genomes analyzed, only 63 are present in all nine genomes, and 200 genes are found only in a single genome. Despite this fluidity, the whole genome-based methods we describe are sufficiently powerful to recover the underlying phylogeny of the viruses. PMID:11483757
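
The gene content character set described in this abstract (scoring gene presence or absence in each genome) can be sketched as follows. This is a minimal illustration on made-up data: the genome names, gene identifiers and the function name are hypothetical, and the sketch only builds the presence/absence matrix and counts shared and single-genome genes. It does not infer a phylogeny.

```python
from collections import Counter


def gene_content_summary(genomes):
    """Summarize gene content.

    genomes: dict mapping genome name -> set of gene identifiers.
    Returns the ordered gene list, a 0/1 presence matrix per genome,
    the genes present in every genome, and the genes found in only one.
    """
    all_genes = sorted(set().union(*genomes.values()))
    # One row per genome, one 0/1 column per gene (1 = present).
    matrix = {
        name: [int(gene in genes) for gene in all_genes]
        for name, genes in genomes.items()
    }
    # Number of genomes carrying each gene.
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    core = {gene for gene, n in counts.items() if n == len(genomes)}
    singletons = {gene for gene, n in counts.items() if n == 1}
    return all_genes, matrix, core, singletons


# Toy input, not real baculovirus data.
toy = {
    "genomeA": {"g1", "g2", "g3", "g4"},
    "genomeB": {"g1", "g2", "g3", "g5"},
    "genomeC": {"g1", "g2", "g6"},
}

if __name__ == "__main__":
    genes, matrix, core, singletons = gene_content_summary(toy)
    print(genes)
    for name, row in matrix.items():
        print(name, row)
    print("shared by all:", sorted(core))
    print("single-genome:", sorted(singletons))
```

On the nine genomes in the study, the same two summaries correspond to the 63 genes present in all genomes and the 200 genes found in only one.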

  9. Inference of self-regulated transcriptional networks by comparative genomics.

    PubMed

    Cornish, Joseph P; Matthews, Fialelei; Thomas, Julien R; Erill, Ivan

    2012-01-01

    The assumption of basic properties, like self-regulation, in simple transcriptional regulatory networks can be exploited to infer regulatory motifs from the growing amounts of genomic and meta-genomic data. These motifs can in principle be used to elucidate the nature and scope of transcriptional networks through comparative genomics. Here we assess the feasibility of this approach using the SOS regulatory network of Gram-positive bacteria as a test case. Using experimentally validated data, we show that the known regulatory motif can be inferred through the assumption of self-regulation. Furthermore, the inferred motif provides a more robust search pattern for comparative genomics than the experimental motifs defined in reference organisms. We take advantage of this robustness to generate a functional map of the SOS response in Gram-positive bacteria. Our results reveal definite differences in the composition of the LexA regulon between Firmicutes and Actinobacteria, and confirm that regulation of cell-division inhibition is a widespread characteristic of this network among Gram-positive bacteria.

  10. Minimal-assumption inference from population-genomic data

    PubMed Central

    Weissman, Daniel B; Hallatschek, Oskar

    2017-01-01

    Samples of multiple complete genome sequences contain vast amounts of information about the evolutionary history of populations, much of it in the associations among polymorphisms at different loci. We introduce a method, Minimal-Assumption Genomic Inference of Coalescence (MAGIC), that reconstructs key features of the evolutionary history, including the distribution of coalescence times, by integrating information across genomic length scales without using an explicit model of coalescence or recombination, allowing it to analyze arbitrarily large samples without phasing while making no assumptions about ancestral structure, linked selection, or gene conversion. Using simulated data, we show that the performance of MAGIC is comparable to that of PSMC’ even on single diploid samples generated with standard coalescent and recombination models. Applying MAGIC to a sample of human genomes reveals evidence of non-demographic factors driving coalescence. DOI: http://dx.doi.org/10.7554/eLife.24836.001 PMID:28671549

  11. SOP for pathway inference in Integrated Microbial Genomes (IMG).

    PubMed

    Anderson, Iain; Chen, Amy; Markowitz, Victor; Kyrpides, Nikos; Ivanova, Natalia

    2011-12-31

    One of the most important aspects of genomic analysis is the prediction of which pathways, both metabolic and non-metabolic, are present in an organism. In IMG, this is carried out by the assignment of IMG terms, which are organized into IMG pathways. Based on manual and automatic assignment of IMG terms, the presence or absence of IMG pathways is automatically inferred. The three categories of pathway assertion are asserted (likely present), not asserted (likely absent), and unknown. In the unknown category, at least one term necessary for the pathway is missing, but an ortholog in another organism has the corresponding term assigned to it. Automatic pathway inference is an important initial step in genome analysis.
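
The three assertion categories in this abstract can be sketched as a small decision rule. The version below is a simplification under stated assumptions: a pathway is treated as a flat set of required terms that must all be present, and "unknown" is returned only when every missing term is supported by an ortholog in another organism. The function and parameter names are hypothetical and are not part of IMG.

```python
def classify_pathway(required_terms, assigned_terms, ortholog_terms):
    """Classify a pathway as asserted, unknown, or not asserted.

    required_terms: terms the pathway needs (assumed: all are mandatory).
    assigned_terms: terms assigned to genes in the genome of interest.
    ortholog_terms: terms assigned to orthologs in other organisms.
    """
    missing = set(required_terms) - set(assigned_terms)
    if not missing:
        return "asserted"  # likely present
    # Assumption: every missing term needs ortholog support for "unknown".
    if missing <= set(ortholog_terms):
        return "unknown"
    return "not asserted"  # likely absent


if __name__ == "__main__":
    pathway = {"term1", "term2"}
    print(classify_pathway(pathway, {"term1", "term2"}, set()))
    print(classify_pathway(pathway, {"term1"}, {"term2"}))
    print(classify_pathway(pathway, {"term1"}, set()))
```

Real pathway definitions may allow alternative steps, which a flat set of required terms cannot express.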

  12. GARLIC: Genomic Autozygosity Regions Likelihood-based Inference and Classification.

    PubMed

    Szpiech, Zachary A; Blant, Alexandra; Pemberton, Trevor J

    2017-07-01

    Runs of homozygosity (ROH) are important genomic features that manifest when identical-by-descent haplotypes are inherited from parents. Their length distributions and genomic locations are informative about population history and they are useful for mapping recessive loci contributing to both Mendelian and complex disease risk. Here, we present software implementing a model-based method (Pemberton et al., 2012) for inferring ROH in genome-wide SNP datasets that incorporates population-specific parameters and a genotyping error rate as well as provides a length-based classification module to identify biologically interesting classes of ROH. Using simulations, we evaluate the performance of this method. GARLIC is written in C++. Source code and pre-compiled binaries (Windows, OSX and Linux) are hosted on GitHub (https://github.com/szpiech/garlic) under the GNU General Public License version 3. Contact: zachary.szpiech@ucsf.edu. Supplementary data are available at Bioinformatics online.

  13. Genome-Wide Inference of Ancestral Recombination Graphs

    PubMed Central

    Rasmussen, Matthew D.; Hubisz, Melissa J.; Gronau, Ilan; Siepel, Adam

    2014-01-01

    The complex correlation structure of a collection of orthologous DNA sequences is uniquely captured by the “ancestral recombination graph” (ARG), a complete record of coalescence and recombination events in the history of the sample. However, existing methods for ARG inference are computationally intensive, highly approximate, or limited to small numbers of sequences, and, as a consequence, explicit ARG inference is rarely used in applied population genomics. Here, we introduce a new algorithm for ARG inference that is efficient enough to apply to dozens of complete mammalian genomes. The key idea of our approach is to sample an ARG of n chromosomes conditional on an ARG of n − 1 chromosomes, an operation we call “threading.” Using techniques based on hidden Markov models, we can perform this threading operation exactly, up to the assumptions of the sequentially Markov coalescent and a discretization of time. An extension allows for threading of subtrees instead of individual sequences. Repeated application of these threading operations results in highly efficient Markov chain Monte Carlo samplers for ARGs. We have implemented these methods in a computer program called ARGweaver. Experiments with simulated data indicate that ARGweaver converges rapidly to the posterior distribution over ARGs and is effective in recovering various features of the ARG for dozens of sequences generated under realistic parameters for human populations. In applications of ARGweaver to 54 human genome sequences from Complete Genomics, we find clear signatures of natural selection, including regions of unusually ancient ancestry associated with balancing selection and reductions in allele age in sites under directional selection. The patterns we observe near protein-coding genes are consistent with a primary influence from background selection rather than hitchhiking, although we cannot rule out a contribution from recurrent selective sweeps. PMID:24831947

  14. Inferring Heterozygosity from Ancient and Low Coverage Genomes

    PubMed Central

    Kousathanas, Athanasios; Leuenberger, Christoph; Link, Vivian; Sell, Christian; Burger, Joachim; Wegmann, Daniel

    2017-01-01

    While genetic diversity can be quantified accurately from high coverage sequencing data, it is often desirable to obtain such estimates from data with low coverage, either to save costs or because of low DNA quality, as is observed for ancient samples. Here, we introduce a method to accurately infer heterozygosity probabilistically from sequences with average coverage <1× of a single individual. The method relaxes the infinite sites assumption of previous methods, does not require a reference sequence, except for the initial alignment of the sequencing data, and takes into account both variable sequencing errors and potential postmortem damage. It is thus also applicable to nonmodel organisms and ancient genomes. Since error rates as reported by sequencing machines are generally distorted and require recalibration, we also introduce a method to accurately infer recalibration parameters in the presence of postmortem damage. This method does not require knowledge about the underlying genome sequence, but instead works with haploid data (e.g., from the X-chromosome from mammalian males) and integrates over the unknown genotypes. Using extensive simulations we show that a few megabasepairs of haploid data are sufficient for accurate recalibration, even at average coverages as low as 1×. At similar coverages, our method also produces very accurate estimates of heterozygosity down to 10^-4 within windows of about 1 Mbp. We further illustrate the usefulness of our approach by inferring genome-wide patterns of diversity for several ancient human samples, and we found that 3000–5000-year-old samples showed diversity patterns comparable to those of modern humans. In contrast, two European hunter-gatherer samples exhibited not only considerably lower levels of diversity than modern samples, but also highly distinct distributions of diversity along their genomes. Interestingly, these distributions were also very different between the two samples, supporting earlier

  15. Inference of distant genetic relations in humans using "1000 genomes".

    PubMed

    Al-Khudhair, Ahmed; Qiu, Shuhao; Wyse, Meghan; Chowdhury, Shilpi; Cheng, Xi; Bekbolsynov, Dulat; Saha-Mandal, Arnab; Dutta, Rajib; Fedorova, Larisa; Fedorov, Alexei

    2015-01-07

    Nucleotide sequence differences on the whole-genome scale have been computed for 1,092 people from 14 populations publicly available by the 1000 Genomes Project. Total number of differences in genetic variants between 96,464 human pairs has been calculated. The distributions of these differences for individuals within European, Asian, or African origin were characterized by narrow unimodal peaks with mean values of 3.8, 3.5, and 5.1 million, respectively, and standard deviations of 0.1-0.03 million. The total numbers of genomic differences between pairs of all known relatives were found to be significantly lower than their respective population means and in reverse proportion to the distance of their consanguinity. By counting the total number of genomic differences it is possible to infer familial relations for people that share down to 6% of common loci identical-by-descent. Detection of familial relations can be radically improved when only very rare genetic variants are taken into account. Counting of total number of shared very rare single nucleotide polymorphisms (SNPs) from whole-genome sequences allows establishing distant familial relations for persons with eighth and ninth degrees of relationship. Using this analysis we predicted 271 distant familial pairwise relations among 1,092 individuals that have not been declared by 1000 Genomes Project. Particularly, among 89 British and 97 Chinese individuals we found three British-Chinese pairs with distant genetic relationships. Individuals from these pairs share identical-by-descent DNA fragments that represent 0.001%, 0.004%, and 0.01% of their genomes. With affordable whole-genome sequencing techniques, very rare SNPs should become important genetic markers for familial relationships and population stratification. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
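
The two counting ideas in this abstract (total genomic differences per pair, and shared very rare variants per pair) can be sketched on toy genotype data. The encoding (0, 1 or 2 alternate alleles per site), the definition of a difference as any unequal genotype, the `max_carriers` threshold for "very rare", and the function names are all assumptions made for illustration. The abstract does not specify how the authors defined them.

```python
from itertools import combinations


def pairwise_differences(genotypes):
    """Count sites where two individuals have different genotypes.

    genotypes: dict mapping name -> list of alternate-allele counts per site.
    """
    diffs = {}
    for a, b in combinations(sorted(genotypes), 2):
        diffs[(a, b)] = sum(x != y for x, y in zip(genotypes[a], genotypes[b]))
    return diffs


def shared_rare_variants(genotypes, max_carriers=2):
    """Count rare variants shared by each pair.

    A site counts as rare here when at most max_carriers individuals
    carry the alternate allele (hypothetical threshold).
    """
    names = sorted(genotypes)
    n_sites = len(genotypes[names[0]])
    shared = {pair: 0 for pair in combinations(names, 2)}
    for site in range(n_sites):
        carriers = [n for n in names if genotypes[n][site] > 0]
        if 2 <= len(carriers) <= max_carriers:
            for pair in combinations(carriers, 2):
                shared[pair] += 1
    return shared


# Toy data: three individuals, five sites.
toy_genotypes = {
    "p1": [0, 1, 2, 0, 1],
    "p2": [0, 1, 0, 0, 1],
    "p3": [1, 0, 0, 0, 0],
}

if __name__ == "__main__":
    print(pairwise_differences(toy_genotypes))
    print(shared_rare_variants(toy_genotypes))
```

In this toy set, p1 and p2 differ at the fewest sites and are the only pair sharing rare variants, which is the pattern the abstract associates with relatedness.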

  16. A human genome-wide library of local phylogeny predictions for whole-genome inference problems

    PubMed Central

    Sridhar, Srinath; Schwartz, Russell

    2008-01-01

    Background: Many common inference problems in computational genetics depend on inferring aspects of the evolutionary history of a data set given a set of observed modern sequences. Detailed predictions of the full phylogenies are therefore of value in improving our ability to make further inferences about population history and sources of genetic variation. Making phylogenetic predictions on the scale needed for whole-genome analysis is, however, extremely computationally demanding. Results: In order to facilitate phylogeny-based predictions on a genomic scale, we develop a library of maximum parsimony phylogenies within local regions spanning all autosomal human chromosomes based on Haplotype Map variation data. We demonstrate the utility of this library for population genetic inferences by examining a tree statistic we call 'imperfection,' which measures the reuse of variant sites within a phylogeny. This statistic is significantly predictive of recombination rate, shows additional regional and population-specific conservation, and allows us to identify outlier genes likely to have experienced unusual amounts of variation in recent human history. Conclusion: Recent theoretical advances in algorithms for phylogenetic tree reconstruction have made it possible to perform large-scale inferences of local maximum parsimony phylogenies from single nucleotide polymorphism (SNP) data. As results from the imperfection statistic demonstrate, phylogeny predictions encode substantial information useful for detecting genomic features and population history. This data set should serve as a platform for many kinds of inferences one may wish to make about human population history and genetic variation. PMID:18710563

  17. Population genetic inference from personal genome data: impact of ancestry and admixture on human genomic variation.

    PubMed

    Kidd, Jeffrey M; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F; Peckham, Heather E; Omberg, Larsson; Bormann Chung, Christina A; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G; Russell, Archie; Reynolds, Andy; Clark, Andrew G; Reese, Martin G; Lincoln, Stephen E; Butte, Atul J; De La Vega, Francisco M; Bustamante, Carlos D

    2012-10-05

    Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago.

  18. Population Genetic Inference from Personal Genome Data: Impact of Ancestry and Admixture on Human Genomic Variation

    PubMed Central

    Kidd, Jeffrey M.; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D.; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F.; Peckham, Heather E.; Omberg, Larsson; Bormann Chung, Christina A.; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G.; Russell, Archie; Reynolds, Andy; Clark, Andrew G.; Reese, Martin G.; Lincoln, Stephen E.; Butte, Atul J.; De La Vega, Francisco M.; Bustamante, Carlos D.

    2012-01-01

    Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas—70% of the European ancestry in today’s African Americans dates back to European gene flow happening only 7–8 generations ago. PMID:23040495

  19. Genomic inference of the metabolism of cosmopolitan subsurface Archaea, Hadesarchaea.

    PubMed

    Baker, Brett J; Saw, Jimmy H; Lind, Anders E; Lazar, Cassandre Sara; Hinrichs, Kai-Uwe; Teske, Andreas P; Ettema, Thijs J G

    2016-02-15

    The subsurface biosphere is largely unexplored and contains a broad diversity of uncultured microbes(1). Despite being one of the few prokaryotic lineages that is cosmopolitan in both the terrestrial and marine subsurface(2-4), the physiological and ecological roles of SAGMEG (South-African Gold Mine Miscellaneous Euryarchaeal Group) Archaea are unknown. Here, we report the metabolic capabilities of this enigmatic group as inferred from genomic reconstructions. Four high-quality (63-90% complete) genomes were obtained from White Oak River estuary and Yellowstone National Park hot spring sediment metagenomes. Phylogenomic analyses place SAGMEG Archaea as a deeply rooting sister clade of the Thermococci, leading us to propose the name Hadesarchaea for this new Archaeal class. With an estimated genome size of around 1.5 Mbp, the genomes of Hadesarchaea are distinctly streamlined, yet metabolically versatile. They share several physiological mechanisms with strict anaerobic Euryarchaeota. Several metabolic characteristics make them successful in the subsurface, including genes involved in CO and H2 oxidation (or H2 production), with potential coupling to nitrite reduction to ammonia (DNRA). This first glimpse into the metabolic capabilities of these cosmopolitan Archaea suggests they are mediating key geochemical processes and are specialized for survival in the subsurface biosphere.

  20. The aggregate site frequency spectrum for comparative population genomic inference.

    PubMed

    Xue, Alexander T; Hickerson, Michael J

    2015-12-01

    Understanding how assemblages of species responded to past climate change is a central goal of comparative phylogeography and comparative population genomics, an endeavour that has increasing potential to integrate with community ecology. New sequencing technology now provides the potential to perform complex demographic inference at unprecedented resolution across assemblages of nonmodel species. To this end, we introduce the aggregate site frequency spectrum (aSFS), an expansion of the site frequency spectrum to use single nucleotide polymorphism (SNP) data sets collected from multiple, co-distributed species for assemblage-level demographic inference. We describe how the aSFS is constructed over an arbitrary number of independent population samples and then demonstrate how the aSFS can differentiate various multispecies demographic histories under a wide range of sampling configurations while allowing effective population sizes and expansion magnitudes to vary independently. We subsequently couple the aSFS with a hierarchical approximate Bayesian computation (hABC) framework to estimate degree of temporal synchronicity in expansion times across taxa, including an empirical demonstration with a data set consisting of five populations of the threespine stickleback (Gasterosteus aculeatus). Corroborating what is generally understood about the recent postglacial origins of these populations, the joint aSFS/hABC analysis strongly suggests that the stickleback data are most consistent with synchronous expansion after the Last Glacial Maximum (posterior probability = 0.99). The aSFS will have general application for multilevel statistical frameworks to test models involving assemblages and/or communities, and as large-scale SNP data from nonmodel species become routine, the aSFS expands the potential for powerful next-generation comparative population genomic inference.
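    For readers unfamiliar with the building block that the aSFS extends, the sketch below computes an ordinary (unfolded) site frequency spectrum for a single population from per-SNP derived-allele counts. It is only a minimal illustration: the abstract does not describe how spectra from several species are combined, so no aggregation step is shown, and the function name and input layout are assumptions.

```python
from collections import Counter
from typing import List


def site_frequency_spectrum(derived_counts: List[int], n_chromosomes: int) -> List[float]:
    """Proportion of segregating sites in each derived-allele count class 1..n-1.

    derived_counts holds, for each SNP, how many of the n sampled chromosomes
    carry the derived allele. Sites that are fixed (0 or n copies) are ignored.
    """
    counts = Counter(c for c in derived_counts if 0 < c < n_chromosomes)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * (n_chromosomes - 1)
    return [counts.get(i, 0) / total for i in range(1, n_chromosomes)]


# Hypothetical toy input: seven sites scored in four chromosomes.
example_sfs = site_frequency_spectrum([1, 1, 2, 3, 1, 0, 4], n_chromosomes=4)
print(example_sfs)
```

    In the toy input, five sites are segregating, three of them singletons, so the first class holds 3/5 of the sites.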

  1. Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences

    PubMed Central

    Auch, Alexander F; Henz, Stefan R; Holland, Barbara R; Göker, Markus

    2006-01-01

    Background: Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX, or combinations of both are used to locate high-scoring segment pairs (HSPs) between two sequences from which pairwise similarities and distances are computed in different ways resulting in a total of 96 GBDP variants. The suitability of these distance formulae for phylogeny reconstruction is directly estimated by computing a recently described measure of "treelikeness", the so-called δ value, from the respective distance matrices. Additionally, we compare the trees inferred from these matrices using UPGMA, NJ, BIONJ, FastME, or STC, respectively, with the NCBI taxonomy tree of the taxa under study. Results: Our results indicate that, at this taxonomic level, plastid genomes are much more valuable for inferring phylogenies than are mitochondrial genomes, and that distances based on breakpoints are of little use. Distances based on the proportion of "matched" HSP length to average genome length were best for tree estimation. Additionally we found that using TBLASTX instead of BLASTN and, particularly, combining TBLASTX and BLASTN leads to a small but significant increase in accuracy. Other factors do not significantly affect the phylogenetic outcome. The BIONJ algorithm results in phylogenies most in accordance with the current NCBI taxonomy, with NJ and FastME performing insignificantly worse, and STC performing as well if applied to high quality distance matrices. δ values are found to be a reliable predictor of phylogenetic accuracy. Conclusion: Using the most treelike distance matrices, as judged by their δ values
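    The distance family that this abstract reports as performing best relates the "matched" HSP length to the average genome length. A minimal Python sketch of that idea follows. The exact GBDP formulae are not given in the abstract, so the "1 minus ratio" form, the clamping at zero, the choice to measure coverage on a single genome, and all names here are assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates on one genome


def covered_length(hsps: List[Interval]) -> int:
    """Number of positions covered by at least one HSP (overlaps counted once)."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(hsps):
        if current_end is None or start > current_end:
            # Close the previous merged interval and open a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent HSP: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def matched_length_distance(hsps: List[Interval], len_a: int, len_b: int) -> float:
    """Hypothetical distance: 1 - matched HSP length / average genome length."""
    average_length = (len_a + len_b) / 2
    return max(0.0, 1.0 - covered_length(hsps) / average_length)


# Hypothetical HSP coordinates on a 500 bp genome compared with another 500 bp genome.
example_distance = matched_length_distance([(0, 100), (50, 150), (300, 400)], 500, 500)
print(example_distance)
```

    Merging overlapping HSPs before summing keeps a region hit several times from being counted more than once.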

  2. Catchment-scale biogeography of riverine bacterioplankton

    PubMed Central

    Read, Daniel S; Gweon, Hyun S; Bowes, Michael J; Newbold, Lindsay K; Field, Dawn; Bailey, Mark J; Griffiths, Robert I

    2015-01-01

    Lotic ecosystems such as rivers and streams are unique in that they represent a continuum of both space and time during the transition from headwaters to the river mouth. As microbes have very different controls over their ecology, distribution and dispersion compared with macrobiota, we wished to explore biogeographical patterns within a river catchment and uncover the major drivers structuring bacterioplankton communities. Water samples collected across the River Thames Basin, UK, covering the transition from headwater tributaries to the lower reaches of the main river channel were characterised using 16S rRNA gene pyrosequencing. This approach revealed an ecological succession in the bacterial community composition along the river continuum, moving from a community dominated by Bacteroidetes in the headwaters to Actinobacteria-dominated downstream. Location of the sampling point in the river network (measured as the cumulative water channel distance upstream) was found to be the most predictive spatial feature; inferring that ecological processes pertaining to temporal community succession are of prime importance in driving the assemblages of riverine bacterioplankton communities. A decrease in bacterial activity rates and an increase in the abundance of low nucleic acid bacteria relative to high nucleic acid bacteria were found to correspond with these downstream changes in community structure, suggesting corresponding functional changes. Our findings show that bacterial communities across the Thames basin exhibit an ecological succession along the river continuum, and that this is primarily driven by water residence time rather than the physico-chemical status of the river. PMID:25238398

  3. Catchment-scale biogeography of riverine bacterioplankton.

    PubMed

    Read, Daniel S; Gweon, Hyun S; Bowes, Michael J; Newbold, Lindsay K; Field, Dawn; Bailey, Mark J; Griffiths, Robert I

    2015-02-01

    Lotic ecosystems such as rivers and streams are unique in that they represent a continuum of both space and time during the transition from headwaters to the river mouth. As microbes have very different controls over their ecology, distribution and dispersion compared with macrobiota, we wished to explore biogeographical patterns within a river catchment and uncover the major drivers structuring bacterioplankton communities. Water samples collected across the River Thames Basin, UK, covering the transition from headwater tributaries to the lower reaches of the main river channel were characterised using 16S rRNA gene pyrosequencing. This approach revealed an ecological succession in the bacterial community composition along the river continuum, moving from a community dominated by Bacteroidetes in the headwaters to Actinobacteria-dominated downstream. Location of the sampling point in the river network (measured as the cumulative water channel distance upstream) was found to be the most predictive spatial feature; inferring that ecological processes pertaining to temporal community succession are of prime importance in driving the assemblages of riverine bacterioplankton communities. A decrease in bacterial activity rates and an increase in the abundance of low nucleic acid bacteria relative to high nucleic acid bacteria were found to correspond with these downstream changes in community structure, suggesting corresponding functional changes. Our findings show that bacterial communities across the Thames basin exhibit an ecological succession along the river continuum, and that this is primarily driven by water residence time rather than the physico-chemical status of the river.

  4. Baboon phylogeny as inferred from complete mitochondrial genomes

    PubMed Central

    Zinner, Dietmar; Wertheimer, Jenny; Liedigk, Rasmus; Groeneveld, Linn F; Roos, Christian

    2013-01-01

    Baboons (genus Papio) are an interesting phylogeographical primate model for the evolution of savanna species during the Pleistocene. Earlier studies, based on partial mitochondrial sequence information, revealed seven major haplogroups indicating multiple para- and polyphylies among the six baboon species. The most basal splits among baboon lineages remained unresolved and the credibility intervals for divergence time estimates were rather large. Assuming that genetic variation within the two studied mitochondrial loci so far was insufficient to infer the apparently rapid early radiation of baboons we used complete mitochondrial sequence information of ten specimens, representing all major baboon lineages, to reconstruct a baboon phylogeny and to re-estimate divergence times. Our data confirmed the earlier tree topology including the para- and polyphyletic relationships of most baboon species; divergence time estimates are slightly younger and credibility intervals narrowed substantially, thus making the estimates more precise. However, the most basal relationships could not be resolved and it remains open whether (1) the most southern population of baboons diverged first or (2) a major split occurred between southern and northern clades. Our study shows that complete mitochondrial genome sequences are more effective to reconstruct robust phylogenies and to narrow down estimated divergence time intervals than only short portions of the mitochondrial genome, although there are also limitations in resolving phylogenetic relationships. Am J Phys Anthropol, 2013. © 2012 Wiley Periodicals, Inc. PMID:23180628

  5. Inferring divergence of context-dependent substitution rates in Drosophila genomes with applications to comparative genomics.

    PubMed

    Chachick, Ran; Tanay, Amos

    2012-07-01

    Nucleotide substitution is a major evolutionary driving force that can incrementally and stochastically give rise to broad divergence patterns among species. The substitution process at each genomic position is frequently modeled independently of the other positions, although complex interactions between nearby bases are known to significantly affect mutation rates. Here, we study the evolution of 12 fly genomes using new algorithms for accurate inference of parameter-rich substitution models. By comparing models between lineages, we reveal the evolutionary histories of substitution rates at different flanking nucleotide contexts. We demonstrate these driving forces of molecular evolution to be constantly changing, suggesting that neutral drift of mutation rates is an important factor in the evolution of genomes and their sequence composition. This observation is used to develop a scalable approach for parameter-rich comparative genomics. By screening short DNA sequences, we demonstrate how homeoboxes and other transcription factor binding motifs are highly conserved based on our parameter-rich models but not according to standard conservation assays. With the increasing availability of genome sequences, rich substitution models become an attractive and practical approach for evolutionary analysis in general and comparative genomics in particular.

  6. Inferring patterns of folktale diffusion using genomic data.

    PubMed

    Bortolini, Eugenio; Pagani, Luca; Crema, Enrico R; Sarno, Stefania; Barbieri, Chiara; Boattini, Alessio; Sazzini, Marco; da Silva, Sara Graça; Martini, Gessica; Metspalu, Mait; Pettener, Davide; Luiselli, Donata; Tehrani, Jamshid J

    2017-08-22

    Observable patterns of cultural variation are consistently intertwined with demic movements, cultural diffusion, and adaptation to different ecological contexts [Cavalli-Sforza and Feldman (1981) Cultural Transmission and Evolution: A Quantitative Approach; Boyd and Richerson (1985) Culture and the Evolutionary Process]. The quantitative study of gene-culture coevolution has focused in particular on the mechanisms responsible for change in frequency and attributes of cultural traits, the spread of cultural information through demic and cultural diffusion, and detecting relationships between genetic and cultural lineages. Here, we make use of worldwide whole-genome sequences [Pagani et al. (2016) Nature 538:238-242] to assess the impact of processes involving population movement and replacement on cultural diversity, focusing on the variability observed in folktale traditions (n = 596) [Uther (2004) The Types of International Folktales: A Classification and Bibliography. Based on the System of Antti Aarne and Stith Thompson] in Eurasia. We find that a model of cultural diffusion predicted by isolation-by-distance alone is not sufficient to explain the observed patterns, especially at small spatial scales (up to ~4,000 km). We also provide an empirical approach to infer presence and impact of ethnolinguistic barriers preventing the unbiased transmission of both genetic and cultural information. After correcting for the effect of ethnolinguistic boundaries, we show that, of the alternative models that we propose, the one entailing cultural diffusion biased by linguistic differences is the most plausible. Additionally, we identify 15 tales that are more likely to be predominantly transmitted through population movement and replacement and locate putative focal areas for a set of tales that are spread worldwide.

  7. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions

    PubMed Central

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828

  8. Robust and scalable inference of population history from hundreds of unphased whole genomes.

    PubMed

    Terhorst, Jonathan; Kamm, John A; Song, Yun S

    2017-02-01

    It has recently been demonstrated that inference methods based on genealogical processes with recombination can uncover past population history in unprecedented detail. However, these methods scale poorly with sample size, limiting resolution in the recent past, and they require phased genomes, which contain switch errors that can catastrophically distort the inferred history. Here we present SMC++, a new statistical tool capable of analyzing orders of magnitude more samples than existing methods while requiring only unphased genomes (its results are independent of phasing). SMC++ can jointly infer population size histories and split times in diverged populations, and it employs a novel spline regularization scheme that greatly reduces estimation error. We apply SMC++ to analyze sequence data from over a thousand human genomes in Africa and Eurasia, hundreds of genomes from a Drosophila melanogaster population in Africa, and tens of genomes from zebra finch and long-tailed finch populations in Australia.

  9. Coherent dynamics and association networks among lake bacterioplankton taxa

    PubMed Central

    Eiler, Alexander; Heinrich, Friederike; Bertilsson, Stefan

    2012-01-01

    Bacteria have important roles in freshwater food webs and in the cycling of elements in the ecosystem. Yet specific ecological features of individual phylogenetic groups and interactions among these are largely unknown. We used 454 pyrosequencing of 16S rRNA genes to study associations of different bacterioplankton groups to environmental characteristics and their co-occurrence patterns over an annual cycle in a dimictic lake. Clear seasonal succession of the bacterioplankton community was observed. After binning of sequences into previously described and highly resolved phylogenetic groups (tribes), their temporal dynamics revealed extensive synchrony and associations with seasonal events such as ice coverage, ice-off, mixing and phytoplankton blooms. Coupling between closely and distantly related tribes was resolved by time-dependent rank correlations, suggesting ecological coherence that was often dependent on taxonomic relatedness. Association networks with the abundant freshwater Actinobacteria and Proteobacteria in focus revealed complex interdependencies within bacterioplankton communities and contrasting linkages to environmental conditions. Accordingly, unique ecological features can be inferred for each tribe and reveal the natural history of abundant cultured and uncultured freshwater bacteria. PMID:21881616

  10. Genomic inferences from Afrotheria and the evolution of elephants.

    PubMed

    Roca, Alfred L; O'Brien, Stephen J

    2005-12-01

    Recent genetic studies have established that African forest and savanna elephants are distinct species with dissociated cytonuclear genomic patterns, and have identified Asian elephants from Borneo and Sumatra as conservation priorities. Representative of Afrotheria, a superordinal clade encompassing six eutherian orders, the African savanna elephant was among the first mammals chosen for whole-genome sequencing to provide a comparative understanding of the human genome. Elephants have large and complex brains and display advanced levels of social structure, communication, learning and intelligence. The elephant genome sequence might prove useful for comparative genomic studies of these advanced traits, which have appeared independently in only three mammalian orders: primates, cetaceans and proboscideans.

  11. AD-LIBS: inferring ancestry across hybrid genomes using low-coverage sequence data.

    PubMed

    Schaefer, Nathan K; Shapiro, Beth; Green, Richard E

    2017-04-04

    Inferring the ancestry of each region of admixed individuals' genomes is useful in studies ranging from disease gene mapping to speciation genetics. Current methods require high-coverage genotype data and phased reference panels, and are therefore inappropriate for many data sets. We present a software application, AD-LIBS, that uses a hidden Markov model to infer ancestry across hybrid genomes without requiring variant calling or phasing. This approach is useful for non-model organisms and in cases of low-coverage data, such as ancient DNA. We demonstrate the utility of AD-LIBS with synthetic data. We then use AD-LIBS to infer ancestry in two published data sets: European human genomes with Neanderthal ancestry and brown bear genomes with polar bear ancestry. AD-LIBS correctly infers 87-91% of ancestry in simulations and produces ancestry maps that agree with published results and global ancestry estimates in humans. In brown bears, we find more polar bear ancestry than has been published previously, using both AD-LIBS and an existing software application for local ancestry inference, HAPMIX. We validate AD-LIBS polar bear ancestry maps by recovering a geographic signal within bears that mirrors what is seen in SNP data. Finally, we demonstrate that AD-LIBS is more effective than HAPMIX at inferring ancestry when preexisting phased reference data are unavailable and genomes are sequenced to low coverage. AD-LIBS is an effective tool for ancestry inference that can be used even when few individuals are available for comparison or when genomes are sequenced to low coverage. AD-LIBS is therefore likely to be useful in studies of non-model or ancient organisms that lack large amounts of genomic DNA. AD-LIBS can therefore expand the range of studies in which admixture mapping is a viable tool.

  12. Systems Biology and Ecology of Streamlined Bacterioplankton

    NASA Astrophysics Data System (ADS)

    Giovannoni, S. J.

    2014-12-01

    The salient feature of streamlined cells is their small genome size, but "streamlining" refers more generally to selection that favors minimization of cell size and complexity. The essence of streamlining theory is that selection is most efficient in organisms that have large effective population sizes, and, in nutrient-limited systems, favors cell architecture that minimizes resources required for replication. Regardless of the cause of genome reduction, lost coding potential eventually dictates loss of function, raising the questions, what genome features are expendable, and how do cells become highly successful with a minimal genomic repertoire? One consequence of reductive evolution in streamlined organisms is atypical patterns of prototrophy, for example the recent discovery of a requirement for the thiamin precursor 4-amino-5-hydroxymethyl-2-methylpyrimidine in some plankton taxa. Examples such as this fit within the framework of the Black Queen Hypothesis, which describes genome reduction that results in reliance on community goods and increased community connectivity. Other examples of genome reduction include losses of regulatory functions, or replacement with simpler regulatory systems, and increased metabolic integration. In one such case, in the order Pelagibacterales, the PII system for regulating responses to N limitation has been replaced with a simpler system composed of fewer genes. Both the absence of common regulatory systems and atypical patterns of prototrophy have been linked to difficulty in culturing Pelagibacterales, lending credibility to the idea that streamlining might broadly explain the phenomenon of the uncultured microbial majority. The success of streamlined osmotrophic bacterioplankton suggests that they successfully compete for labile organic matter and capture a large share of this resource, but an alternative theory postulates they are not good resource competitors and instead prosper by avoiding predation. The answers to these

  13. Ecophysiology of Freshwater Verrucomicrobia Inferred from Metagenome-Assembled Genomes

    PubMed Central

    He, Shaomei; Stevens, Sarah L. R.; Chan, Leong-Keat; Bertilsson, Stefan; Glavina del Rio, Tijana; Tringe, Susannah G.; Malmstrom, Rex R.

    2017-01-01

    ABSTRACT: Microbes are critical in carbon and nutrient cycling in freshwater ecosystems. Members of the Verrucomicrobia are ubiquitous in such systems, and yet their roles and ecophysiology are not well understood. In this study, we recovered 19 Verrucomicrobia draft genomes by sequencing 184 time-series metagenomes from a eutrophic lake and a humic bog that differ in carbon source and nutrient availabilities. These genomes span four of the seven previously defined Verrucomicrobia subdivisions and greatly expand knowledge of the genomic diversity of freshwater Verrucomicrobia. Genome analysis revealed their potential role as (poly)saccharide degraders in freshwater, uncovered interesting genomic features for this lifestyle, and suggested their adaptation to nutrient availabilities in their environments. Verrucomicrobia populations differ significantly between the two lakes in glycoside hydrolase gene abundance and functional profiles, reflecting the autochthonous and terrestrially derived allochthonous carbon sources of the two ecosystems, respectively. Interestingly, a number of genomes recovered from the bog contained gene clusters that potentially encode a novel porin-multiheme cytochrome c complex and might be involved in extracellular electron transfer in the anoxic humus-rich environment. Notably, most epilimnion genomes have large numbers of so-called “Planctomycete-specific” cytochrome c-encoding genes, which exhibited distribution patterns nearly opposite to those seen with glycoside hydrolase genes, probably associated with the different levels of environmental oxygen availability and carbohydrate complexity between lakes/layers. Overall, the recovered genomes represent a major step toward understanding the role, ecophysiology, and distribution of Verrucomicrobia in freshwater. IMPORTANCE: Freshwater Verrucomicrobia spp. are cosmopolitan in lakes and rivers, and yet their roles and ecophysiology are not well understood, as cultured freshwater

  14. Inferring Ancestral Recombination Graphs from Bacterial Genomic Data.

    PubMed

    Vaughan, Timothy G; Welch, David; Drummond, Alexei J; Biggs, Patrick J; George, Tessy; French, Nigel P

    2017-02-01

    Homologous recombination is a central feature of bacterial evolution, yet it confounds traditional phylogenetic methods. While a number of methods specific to bacterial evolution have been developed, none of these permit joint inference of a bacterial recombination graph and associated parameters. In this article, we present a new method which addresses this shortcoming. Our method uses a novel Markov chain Monte Carlo algorithm to perform phylogenetic inference under the ClonalOrigin model. We demonstrate the utility of our method by applying it to ribosomal multilocus sequence typing data sequenced from pathogenic and nonpathogenic Escherichia coli serotype O157 and O26 isolates collected in rural New Zealand. The method is implemented as an open source BEAST 2 package, Bacter, which is available via the project web page at http://tgvaughan.github.io/bacter. Copyright © 2017 Vaughan et al.

  15. Inferring Ancestral Recombination Graphs from Bacterial Genomic Data

    PubMed Central

    Vaughan, Timothy G.; Welch, David; Drummond, Alexei J.; Biggs, Patrick J.; George, Tessy; French, Nigel P.

    2017-01-01

    Homologous recombination is a central feature of bacterial evolution, yet it confounds traditional phylogenetic methods. While a number of methods specific to bacterial evolution have been developed, none of these permit joint inference of a bacterial recombination graph and associated parameters. In this article, we present a new method which addresses this shortcoming. Our method uses a novel Markov chain Monte Carlo algorithm to perform phylogenetic inference under the ClonalOrigin model. We demonstrate the utility of our method by applying it to ribosomal multilocus sequence typing data sequenced from pathogenic and nonpathogenic Escherichia coli serotype O157 and O26 isolates collected in rural New Zealand. The method is implemented as an open source BEAST 2 package, Bacter, which is available via the project web page at http://tgvaughan.github.io/bacter. PMID:28007885

  16. OMA 2011: orthology inference among 1000 complete genomes.

    PubMed

    Altenhoff, Adrian M; Schneider, Adrian; Gonnet, Gaston H; Dessimoz, Christophe

    2011-01-01

    OMA (Orthologous MAtrix) is a database that identifies orthologs among publicly available, complete genomes. Initiated in 2004, the project is at its 11th release. It now includes 1000 genomes, making it one of the largest resources of its kind. Here, we describe recent developments in terms of species covered; the algorithmic pipeline--in particular regarding the treatment of alternative splicing, and new features of the web (OMA Browser) and programming interface (SOAP API). In the second part, we review the various representations provided by OMA and their typical applications. The database is publicly accessible at http://omabrowser.org.

  17. Using Genetic Distance to Infer the Accuracy of Genomic Prediction

    PubMed Central

    Scutari, Marco; Mackay, Ian

    2016-01-01

    The prediction of phenotypic traits using high-density genomic data has many applications such as the selection of plants and animals of commercial interest; and it is expected to play an increasing role in medical diagnostics. Statistical models used for this task are usually tested using cross-validation, which implicitly assumes that new individuals (whose phenotypes we would like to predict) originate from the same population the genomic prediction model is trained on. In this paper we propose an approach based on clustering and resampling to investigate the effect of increasing genetic distance between training and target populations when predicting quantitative traits. This is important for plant and animal genetics, where genomic selection programs rely on the precision of predictions in future rounds of breeding. Therefore, estimating how quickly predictive accuracy decays is important in deciding which training population to use and how often the model has to be recalibrated. We find that the correlation between true and predicted values decays approximately linearly with respect to either FST or mean kinship between the training and the target populations. We illustrate this relationship using simulations and a collection of data sets from mice, wheat and human genetics. PMID:27589268

  18. EMu: probabilistic inference of mutational processes and their localization in the cancer genome

    PubMed Central

    2013-01-01

    The spectrum of mutations discovered in cancer genomes can be explained by the activity of a few elementary mutational processes. We present a novel probabilistic method, EMu, to infer the mutational signatures of these processes from a collection of sequenced tumors. EMu naturally incorporates the tumor-specific opportunity for different mutation types according to sequence composition. Applying EMu to breast cancer data, we derive detailed maps of the activity of each process, both genome-wide and within specific local regions of the genome. Our work provides new opportunities to study the mutational processes underlying cancer development. EMu is available at http://www.sanger.ac.uk/resources/software/emu/. PMID:23628380

  19. A molecular phylogeny of Hemiptera inferred from mitochondrial genome sequences.

    PubMed

    Song, Nan; Liang, Ai-Ping; Bu, Cui-Ping

    2012-01-01

    Classically, Hemiptera is comprised of two suborders: Homoptera and Heteroptera. Homoptera includes Cicadomorpha, Fulgoromorpha and Sternorrhyncha. However, according to previous molecular phylogenetic studies based on 18S rDNA, Fulgoromorpha has a closer relationship to Heteroptera than to other hemipterans, leaving Homoptera as paraphyletic. Therefore, the position of Fulgoromorpha is important for studying phylogenetic structure of Hemiptera. We inferred the evolutionary affiliations of twenty-five superfamilies of Hemiptera using mitochondrial protein-coding genes and rRNAs. We sequenced three mitogenomes, from Pyrops candelaria, Lycorma delicatula and Ricania marginalis, representing two additional families in Fulgoromorpha. Pyrops and Lycorma are representatives of an additional major family Fulgoridae in Fulgoromorpha, whereas Ricania is a second representative of the highly derived clade Ricaniidae. The organization and size of these mitogenomes are similar to those of the sequenced fulgoroid species. Our consensus phylogeny of Hemiptera largely supported the relationships (((Fulgoromorpha,Sternorrhyncha),Cicadomorpha),Heteroptera), and thus supported the classic phylogeny of Hemiptera. Selection of optimal evolutionary models (exclusion and inclusion of two rRNA genes or of third codon positions of protein-coding genes) demonstrated that rapidly evolving and saturated sites should be removed from the analyses.

  20. A Molecular Phylogeny of Hemiptera Inferred from Mitochondrial Genome Sequences

    PubMed Central

    Song, Nan; Liang, Ai-Ping; Bu, Cui-Ping

    2012-01-01

    Classically, Hemiptera is comprised of two suborders: Homoptera and Heteroptera. Homoptera includes Cicadomorpha, Fulgoromorpha and Sternorrhyncha. However, according to previous molecular phylogenetic studies based on 18S rDNA, Fulgoromorpha has a closer relationship to Heteroptera than to other hemipterans, leaving Homoptera as paraphyletic. Therefore, the position of Fulgoromorpha is important for studying phylogenetic structure of Hemiptera. We inferred the evolutionary affiliations of twenty-five superfamilies of Hemiptera using mitochondrial protein-coding genes and rRNAs. We sequenced three mitogenomes, from Pyrops candelaria, Lycorma delicatula and Ricania marginalis, representing two additional families in Fulgoromorpha. Pyrops and Lycorma are representatives of an additional major family Fulgoridae in Fulgoromorpha, whereas Ricania is a second representative of the highly derived clade Ricaniidae. The organization and size of these mitogenomes are similar to those of the sequenced fulgoroid species. Our consensus phylogeny of Hemiptera largely supported the relationships (((Fulgoromorpha,Sternorrhyncha),Cicadomorpha),Heteroptera), and thus supported the classic phylogeny of Hemiptera. Selection of optimal evolutionary models (exclusion and inclusion of two rRNA genes or of third codon positions of protein-coding genes) demonstrated that rapidly evolving and saturated sites should be removed from the analyses. PMID:23144967

  1. The History of Slavs Inferred from Complete Mitochondrial Genome Sequences

    PubMed Central

    Mielnik-Sikorska, Marta; Daca, Patrycja; Malyarchuk, Boris; Derenko, Miroslava; Skonieczna, Katarzyna; Perkova, Maria; Dobosz, Tadeusz; Grzybowski, Tomasz

    2013-01-01

    To shed more light on the processes leading to crystallization of a Slavic identity, we investigated variability of complete mitochondrial genomes belonging to haplogroups H5 and H6 (63 mtDNA genomes) from the populations of Eastern and Western Slavs, including new samples of Poles, Ukrainians and Czechs presented here. Molecular dating implies formation of H5 approximately 11.5–16 thousand years ago (kya) in the areas of southern Europe. Within ancient haplogroup H6, dated at around 15–28 kya, there is a subhaplogroup H6c, which probably survived the last glaciation in Europe and has undergone expansion only 3–4 kya, together with the ancestors of some European groups, including the Slavs, because H6c has been detected in Czechs, Poles and Slovaks. Detailed analysis of complete mtDNAs allowed us to identify a number of lineages that seem specific for Central and Eastern Europe (H5a1f, H5a2, H5a1r, H5a1s, H5b4, H5e1a, H5u1, some subbranches of H5a1a and H6a1a9). Some of them could possibly be traced back to at least ∼4 kya, which indicates that some of the ancestors of today's Slavs (Poles, Czechs, Slovaks, Ukrainians and Russians) inhabited areas of Central and Eastern Europe much earlier than it was estimated on the basis of archaeological and historical data. We also sequenced entire mitochondrial genomes of several non-European lineages (A, C, D, G, L) found in contemporary populations of Poland and Ukraine. The analysis of these haplogroups confirms the presence of Siberian (C5c1, A8a1) and Ashkenazi-specific (L2a1l2a) mtDNA lineages in Slavic populations. Moreover, we were able to pinpoint some lineages which could possibly reflect the relatively recent contacts of Slavs with nomadic Altaic peoples (C4a1a, G2a, D5a2a1a1). PMID:23342138

  2. The history of Slavs inferred from complete mitochondrial genome sequences.

    PubMed

    Mielnik-Sikorska, Marta; Daca, Patrycja; Malyarchuk, Boris; Derenko, Miroslava; Skonieczna, Katarzyna; Perkova, Maria; Dobosz, Tadeusz; Grzybowski, Tomasz

    2013-01-01

    To shed more light on the processes leading to crystallization of a Slavic identity, we investigated variability of complete mitochondrial genomes belonging to haplogroups H5 and H6 (63 mtDNA genomes) from the populations of Eastern and Western Slavs, including new samples of Poles, Ukrainians and Czechs presented here. Molecular dating implies formation of H5 approximately 11.5-16 thousand years ago (kya) in the areas of southern Europe. Within ancient haplogroup H6, dated at around 15-28 kya, there is a subhaplogroup H6c, which probably survived the last glaciation in Europe and has undergone expansion only 3-4 kya, together with the ancestors of some European groups, including the Slavs, because H6c has been detected in Czechs, Poles and Slovaks. Detailed analysis of complete mtDNAs allowed us to identify a number of lineages that seem specific for Central and Eastern Europe (H5a1f, H5a2, H5a1r, H5a1s, H5b4, H5e1a, H5u1, some subbranches of H5a1a and H6a1a9). Some of them could possibly be traced back to at least ∼4 kya, which indicates that some of the ancestors of today's Slavs (Poles, Czechs, Slovaks, Ukrainians and Russians) inhabited areas of Central and Eastern Europe much earlier than it was estimated on the basis of archaeological and historical data. We also sequenced entire mitochondrial genomes of several non-European lineages (A, C, D, G, L) found in contemporary populations of Poland and Ukraine. The analysis of these haplogroups confirms the presence of Siberian (C5c1, A8a1) and Ashkenazi-specific (L2a1l2a) mtDNA lineages in Slavic populations. Moreover, we were able to pinpoint some lineages which could possibly reflect the relatively recent contacts of Slavs with nomadic Altaic peoples (C4a1a, G2a, D5a2a1a1).

  3. Evolutionary History of Chimpanzees Inferred from Complete Mitochondrial Genomes

    PubMed Central

    Bjork, Adam; Liu, Weimin; Wertheim, Joel O.; Hahn, Beatrice H.; Worobey, Michael

    2011-01-01

    Investigations into the evolutionary history of the common chimpanzee, Pan troglodytes, have produced inconsistent results due to differences in the types of molecular data considered, the model assumptions employed, and the quantity and geographical range of samples used. We amplified and sequenced 24 complete P. troglodytes mitochondrial genomes from fecal samples collected at multiple study sites throughout sub-Saharan Africa. Using a “relaxed molecular clock,” fossil calibrations, and 12 additional complete primate mitochondrial genomes, we analyzed the pattern and timing of primate diversification in a Bayesian framework. Our results support the recognition of four chimpanzee subspecies. Within P. troglodytes, we report a mean (95% highest posterior density [HPD]) time since most recent common ancestor (tMRCA) of 1.026 (0.811–1.263) Ma for the four proposed subspecies, with two major lineages. One of these lineages (tMRCA = 0.510 [0.387–0.650] Ma) contains P. t. verus (tMRCA = 0.155 [0.101–0.213] Ma) and P. t. ellioti (formerly P. t. vellerosus; tMRCA = 0.157 [0.102–0.215] Ma), both of which are monophyletic. The other major lineage contains P. t. schweinfurthii (tMRCA = 0.111 [0.077–0.146] Ma), a monophyletic clade nested within the P. t. troglodytes lineage (tMRCA = 0.380 [0.296–0.476] Ma). We utilized two analysis techniques that may be of widespread interest. First, we implemented a Yule speciation prior across the entire primate tree with separate coalescent priors on each of the chimpanzee subspecies. The validity of this approach was confirmed by estimates based on more traditional techniques. We also suggest that accurate tMRCA estimates from large computationally difficult sequence alignments may be obtained by implementing our novel method of bootstrapping smaller randomly subsampled alignments. PMID:20802239

  4. Inference of homologous recombination in bacteria using whole-genome sequences.

    PubMed

    Didelot, Xavier; Lawson, Daniel; Darling, Aaron; Falush, Daniel

    2010-12-01

    Bacteria and archaea reproduce clonally, but sporadically import DNA into their chromosomes from other organisms. In many of these events, the imported DNA replaces an homologous segment in the recipient genome. Here we present a new method to reconstruct the history of recombination events that affected a given sample of bacterial genomes. We introduce a mathematical model that represents both the donor and the recipient of each DNA import as an ancestor of the genomes in the sample. The model represents a simplification of the previously described coalescent with gene conversion. We implement a Monte Carlo Markov chain algorithm to perform inference under this model from sequence data alignments and show that inference is feasible for whole-genome alignments through parallelization. Using simulated data, we demonstrate accurate and reliable identification of individual recombination events and global recombination rate parameters. We applied our approach to an alignment of 13 whole genomes from the Bacillus cereus group. We find, as expected from laboratory experiments, that the recombination rate is higher between closely related organisms and also that the genome contains several broad regions of elevated levels of recombination. Application of the method to the genomic data sets that are becoming available should reveal the evolutionary history and private lives of populations of bacteria and archaea. The methods described in this article have been implemented in a computer software package, ClonalOrigin, which is freely available from http://code.google.com/p/clonalorigin/.

  5. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    SciTech Connect

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    2014-09-29

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.

  6. Insights and inferences about integron evolution from genomic data

    PubMed Central

    Nemergut, Diana R; Robeson, Michael S; Kysela, Robert F; Martin, Andrew P; Schmidt, Steven K; Knight, Rob

    2008-01-01

    Background: Integrons are mechanisms that facilitate horizontal gene transfer, allowing bacteria to integrate and express foreign DNA. These are important in the exchange of antibiotic resistance determinants, but can also transfer a diverse suite of genes unrelated to pathogenicity. Here, we provide a systematic analysis of the distribution and diversity of integron intI genes and integron-containing bacteria. Results: We found integrons in 103 different pathogenic and non-pathogenic bacteria, in six major phyla. Integrons were widely scattered, and their presence was not confined to specific clades within bacterial orders. Nearly 1/3 of the intI genes that we identified were pseudogenes, containing either an internal stop codon or a frameshift mutation that would render the protein product non-functional. Additionally, 20% of bacteria contained more than one integrase gene. dN/dS ratios revealed mutational hotspots in clades of Vibrio and Shewanella intI genes. Finally, we characterized the gene cassettes associated with integrons in Methylobacillus flagellatus KT and Dechloromonas aromatica RCB, and found a heavy metal efflux gene as well as genes involved in protein folding and stability. Conclusion: Our analysis suggests that the present distribution of integrons is due to multiple losses and gene transfer events. While, in some cases, the ability to integrate and excise foreign DNA may be selectively advantageous, the gain, loss, or rearrangement of gene cassettes could also be deleterious, selecting against functional integrases. Thus, such a high fraction of pseudogenes may suggest that the selective impact of integrons on genomes is variable, oscillating between beneficial and deleterious, possibly depending on environmental conditions. PMID:18513439

  7. Inferring Strain Mixture within Clinical Plasmodium falciparum Isolates from Genomic Sequence Data

    PubMed Central

    O’Brien, John D.; Amenga-Etego, Lucas

    2016-01-01

    We present a rigorous statistical model that infers the structure of P. falciparum mixtures—including the number of strains present, their proportion within the samples, and the amount of unexplained mixture—using whole genome sequence (WGS) data. Applied to simulation data, artificial laboratory mixtures, and field samples, the model provides reasonable inference with as few as 10 reads or 50 SNPs and works efficiently even with much larger data sets. Source code and example data for the model are provided in an open source fashion. We discuss the possible uses of this model as a window into within-host selection for clinical and epidemiological studies. PMID:27362949

  8. Untangling statistical and biological models to understand network inference: the need for a genomics network ontology.

    PubMed

    Emmert-Streib, Frank; Dehmer, Matthias; Haibe-Kains, Benjamin

    2014-01-01

    In this paper, we shed light on approaches that are currently used to infer networks from gene expression data with respect to their biological meaning. As we will show, the biological interpretation of these networks depends on the chosen theoretical perspective. For this reason, we distinguish a statistical perspective from a mathematical modeling perspective and elaborate their differences and implications. Our results indicate the imperative need for a genomic network ontology in order to avoid increasing confusion about the biological interpretation of inferred networks, a confusion that can be further amplified by approaches that integrate multiple data sets or data types.

  9. Improved genome inference in the MHC using a population reference graph.

    PubMed

    Dilthey, Alexander; Cox, Charles; Iqbal, Zamin; Nelson, Matthew R; McVean, Gil

    2015-06-01

    Although much is known about human genetic variation, such information is typically ignored in assembling new genomes. Instead, reads are mapped to a single reference, which can lead to poor characterization of regions of high sequence or structural diversity. We introduce a population reference graph, which combines multiple reference sequences and catalogs of variation. The genomes of new samples are reconstructed as paths through the graph using an efficient hidden Markov model, allowing for recombination between different haplotypes and additional variants. By applying the method to the 4.5-Mb extended MHC region on human chromosome 6, combining 8 assembled haplotypes, the sequences of known classical HLA alleles and 87,640 SNP variants from the 1000 Genomes Project, we demonstrate using simulations, SNP genotyping, and short-read and long-read data how the method improves the accuracy of genome inference and identified regions where the current set of reference sequences is substantially incomplete.

  10. Streamlining and Large Ancestral Genomes in Archaea Inferred with a Phylogenetic Birth-and-Death Model

    PubMed Central

    Miklós, István

    2009-01-01

    Homologous genes originate from a common ancestor through vertical inheritance, duplication, or horizontal gene transfer. Entire homolog families spawned by a single ancestral gene can be identified across multiple genomes based on protein sequence similarity. The sequences, however, do not always reveal conclusively the history of large families. To study the evolution of complete gene repertoires, we propose here a mathematical framework that does not rely on resolved gene family histories. We show that so-called phylogenetic profiles, formed by family sizes across multiple genomes, are sufficient to infer principal evolutionary trends. The main novelty in our approach is an efficient algorithm to compute the likelihood of a phylogenetic profile in a model of birth-and-death processes acting on a phylogeny. We examine known gene families in 28 archaeal genomes using a probabilistic model that involves lineage- and family-specific components of gene acquisition, duplication, and loss. The model enables us to consider all possible histories when inferring statistics about archaeal evolution. According to our reconstruction, most lineages are characterized by a net loss of gene families. Major increases in gene repertoire have occurred only a few times. Our reconstruction underlines the importance of persistent streamlining processes in shaping genome composition in Archaea. It also suggests that early archaeal genomes were as complex as typical modern ones, and even show signs, in the case of the methanogenic ancestor, of an extremely large gene repertoire. PMID:19570746

  11. Inferring chromatin-bound protein complexes from genome-wide binding assays

    PubMed Central

    Giannopoulou, Eugenia G.; Elemento, Olivier

    2013-01-01

    Genome-wide binding assays can determine where individual transcription factors bind in the genome. However, these factors rarely bind chromatin alone, but instead frequently bind to cis-regulatory elements (CREs) together with other factors thus forming protein complexes. Currently there are no integrative analytical approaches that can predict which complexes are formed on chromatin. Here, we describe a computational methodology to systematically capture protein complexes and infer their impact on gene expression. We applied our method to three human cell types, identified thousands of CREs, inferred known and undescribed complexes recruited to these CREs, and determined the role of the complexes as activators or repressors. Importantly, we found that the predicted complexes have a higher number of physical interactions between their members than expected by chance. Our work provides a mechanism for developing hypotheses about gene regulation via binding partners, and deciphering the interplay between combinatorial binding and gene expression. PMID:23554462

  12. How to Infer Relative Fitness from a Sample of Genomic Sequences

    PubMed Central

    Dayarian, Adel; Shraiman, Boris I.

    2014-01-01

    Mounting evidence suggests that natural populations can harbor extensive fitness diversity with numerous genomic loci under selection. It is also known that genealogical trees for populations under selection are quantifiably different from those expected under neutral evolution and described statistically by Kingman’s coalescent. While differences in the statistical structure of genealogies have long been used as a test for the presence of selection, the full extent of the information that they contain has not been exploited. Here we demonstrate that the shape of the reconstructed genealogical tree for a moderately large number of random genomic samples taken from a fitness diverse, but otherwise unstructured, asexual population can be used to predict the relative fitness of individuals within the sample. To achieve this we define a heuristic algorithm, which we test in silico, using simulations of a Wright–Fisher model for a realistic range of mutation rates and selection strength. Our inferred fitness ranking is based on a linear discriminator that identifies rapidly coalescing lineages in the reconstructed tree. Inferred fitness ranking correlates strongly with actual fitness, with a genome in the top 10% ranked being in the top 20% fittest with false discovery rate of 0.1–0.3, depending on the mutation/selection parameters. The ranking also enables us to predict the genotypes that future populations inherit from the present one. While the inference accuracy increases monotonically with sample size, samples of 200 nearly saturate the performance. We propose that our approach can be used for inferring relative fitness of genomes obtained in single-cell sequencing of tumors and in monitoring viral outbreaks. PMID:24770330
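
The Wright-Fisher simulations mentioned in this abstract can be illustrated with a single-generation update for a haploid asexual population under selection. This is a minimal sketch under stated assumptions, not the authors' simulation code: the function name `wright_fisher_step`, the multiplicative mutation scheme, and all parameter values are hypothetical.

```python
import random


def wright_fisher_step(fitness, mutation_rate, effect, rng):
    """Advance a haploid asexual population by one generation.

    fitness: list of non-negative fitness values, one per individual.
    Each offspring picks its parent with probability proportional to the
    parent's fitness. With probability mutation_rate the offspring's
    fitness is then multiplied by (1 + effect) or (1 - effect)
    (an assumed, simplified mutation scheme).
    Returns (offspring_fitness, parent_indices); the parent indices
    record the genealogy for this generation.
    """
    n = len(fitness)
    # Fitness-weighted sampling with replacement keeps population size fixed.
    parents = rng.choices(range(n), weights=fitness, k=n)
    offspring = []
    for p in parents:
        w = fitness[p]
        if rng.random() < mutation_rate:
            w *= 1.0 + rng.choice([-effect, effect])
        offspring.append(w)
    return offspring, parents


rng = random.Random(0)          # fixed seed so runs are repeatable
population = [1.0] * 200        # hypothetical starting population
for _ in range(50):             # hypothetical number of generations
    population, parents = wright_fisher_step(population, 0.01, 0.05, rng)
```

Storing `parents` at every generation would allow the genealogical tree of a sample to be traced backwards, which is the kind of object the fitness ranking described above operates on.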

  13. How to infer relative fitness from a sample of genomic sequences.

    PubMed

    Dayarian, Adel; Shraiman, Boris I

    2014-07-01

    Mounting evidence suggests that natural populations can harbor extensive fitness diversity with numerous genomic loci under selection. It is also known that genealogical trees for populations under selection are quantifiably different from those expected under neutral evolution and described statistically by Kingman's coalescent. While differences in the statistical structure of genealogies have long been used as a test for the presence of selection, the full extent of the information that they contain has not been exploited. Here we demonstrate that the shape of the reconstructed genealogical tree for a moderately large number of random genomic samples taken from a fitness diverse, but otherwise unstructured, asexual population can be used to predict the relative fitness of individuals within the sample. To achieve this we define a heuristic algorithm, which we test in silico, using simulations of a Wright-Fisher model for a realistic range of mutation rates and selection strength. Our inferred fitness ranking is based on a linear discriminator that identifies rapidly coalescing lineages in the reconstructed tree. Inferred fitness ranking correlates strongly with actual fitness, with a genome in the top 10% ranked being in the top 20% fittest with false discovery rate of 0.1-0.3, depending on the mutation/selection parameters. The ranking also enables us to predict the genotypes that future populations inherit from the present one. While the inference accuracy increases monotonically with sample size, samples of 200 nearly saturate the performance. We propose that our approach can be used for inferring relative fitness of genomes obtained in single-cell sequencing of tumors and in monitoring viral outbreaks.

  14. Higher-level phylogeny of paraneopteran insects inferred from mitochondrial genome sequences

    PubMed Central

    Li, Hu; Shao, Renfu; Song, Nan; Song, Fan; Jiang, Pei; Li, Zhihong; Cai, Wanzhi

    2015-01-01

    Mitochondrial (mt) genome data have been proven to be informative for animal phylogenetic studies but may also suffer from systematic errors, due to the effects of accelerated substitution rate and compositional heterogeneity. We analyzed the mt genomes of 25 insect species from the four paraneopteran orders, aiming to better understand how accelerated substitution rate and compositional heterogeneity affect the inferences of the higher-level phylogeny of this diverse group of hemimetabolous insects. We found substantial heterogeneity in base composition and contrasting rates in nucleotide substitution among these paraneopteran insects, which complicate the inference of higher-level phylogeny. The phylogenies inferred with concatenated sequences of mt genes using maximum likelihood and Bayesian methods and homogeneous models failed to recover Psocodea and Hemiptera as monophyletic groups but grouped, instead, the taxa that had accelerated substitution rates together, including Sternorrhyncha (a suborder of Hemiptera), Thysanoptera, Phthiraptera and Liposcelididae (a family of Psocoptera). Bayesian inference with nucleotide sequences and heterogeneous models (CAT and CAT + GTR), however, recovered Psocodea, Thysanoptera and Hemiptera each as a monophyletic group. Within Psocodea, Liposcelididae is more closely related to Phthiraptera than to other species of Psocoptera. Furthermore, Thysanoptera was recovered as the sister group to Hemiptera. PMID:25704094

  15. Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories.

    PubMed

    Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas

    2016-09-19

    Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.

  16. Inference of gorilla demographic and selective history from whole-genome sequence data.

    PubMed

    McManus, Kimberly F; Kelley, Joanna L; Song, Shiya; Veeramah, Krishna R; Woerner, August E; Stevison, Laurie S; Ryder, Oliver A; Great Ape Genome Project; Kidd, Jeffrey M; Wall, Jeffrey D; Bustamante, Carlos D; Hammer, Michael F

    2015-03-01

    Although population-level genomic sequence data have been gathered extensively for humans, similar data from our closest living relatives are just beginning to emerge. Examination of genomic variation within great apes offers many opportunities to increase our understanding of the forces that have differentially shaped the evolutionary history of hominid taxa. Here, we expand upon the work of the Great Ape Genome Project by analyzing medium to high coverage whole-genome sequences from 14 western lowland gorillas (Gorilla gorilla gorilla), 2 eastern lowland gorillas (G. beringei graueri), and a single Cross River individual (G. gorilla diehli). We infer that the ancestors of western and eastern lowland gorillas diverged from a common ancestor approximately 261 ka, and that the ancestors of the Cross River population diverged from the western lowland gorilla lineage approximately 68 ka. Using a diffusion approximation approach to model the genome-wide site frequency spectrum, we infer a history of western lowland gorillas that includes an ancestral population expansion of 1.4-fold around 970 ka and a recent 5.6-fold contraction in population size 23 ka. The latter may correspond to a major reduction in African equatorial forests around the Last Glacial Maximum. We also analyze patterns of variation among western lowland gorillas to identify several genomic regions with strong signatures of recent selective sweeps. We find that processes related to taste, pancreatic and saliva secretion, sodium ion transmembrane transport, and cardiac muscle function are overrepresented in genomic regions predicted to have experienced recent positive selection. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. Inference of Gorilla Demographic and Selective History from Whole-Genome Sequence Data

    PubMed Central

    McManus, Kimberly F.; Kelley, Joanna L.; Song, Shiya; Veeramah, Krishna R.; Woerner, August E.; Stevison, Laurie S.; Ryder, Oliver A.; Great Ape Genome Project; Kidd, Jeffrey M.; Wall, Jeffrey D.; Bustamante, Carlos D.; Hammer, Michael F.

    2015-01-01

    Although population-level genomic sequence data have been gathered extensively for humans, similar data from our closest living relatives are just beginning to emerge. Examination of genomic variation within great apes offers many opportunities to increase our understanding of the forces that have differentially shaped the evolutionary history of hominid taxa. Here, we expand upon the work of the Great Ape Genome Project by analyzing medium to high coverage whole-genome sequences from 14 western lowland gorillas (Gorilla gorilla gorilla), 2 eastern lowland gorillas (G. beringei graueri), and a single Cross River individual (G. gorilla diehli). We infer that the ancestors of western and eastern lowland gorillas diverged from a common ancestor approximately 261 ka, and that the ancestors of the Cross River population diverged from the western lowland gorilla lineage approximately 68 ka. Using a diffusion approximation approach to model the genome-wide site frequency spectrum, we infer a history of western lowland gorillas that includes an ancestral population expansion of 1.4-fold around 970 ka and a recent 5.6-fold contraction in population size 23 ka. The latter may correspond to a major reduction in African equatorial forests around the Last Glacial Maximum. We also analyze patterns of variation among western lowland gorillas to identify several genomic regions with strong signatures of recent selective sweeps. We find that processes related to taste, pancreatic and saliva secretion, sodium ion transmembrane transport, and cardiac muscle function are overrepresented in genomic regions predicted to have experienced recent positive selection. PMID:25534031

  18. Robust Inference of Genetic Exchange Communities from Microbial Genomes Using TF-IDF

    PubMed Central

    Cong, Yingnan; Chan, Yao-ban; Phillips, Charles A.; Langston, Michael A.; Ragan, Mark A.

    2017-01-01

    Bacteria and archaea can exchange genetic material across lineages through processes of lateral genetic transfer (LGT). Collectively, these exchange relationships can be modeled as a network and analyzed using concepts from graph theory. In particular, densely connected regions within an LGT network have been defined as genetic exchange communities (GECs). However, it has been problematic to construct networks in which edges solely represent LGT. Here we apply term frequency-inverse document frequency (TF-IDF), an alignment-free method originating from document analysis, to infer regions of lateral origin in bacterial genomes. We examine four empirical datasets of different size (number of genomes) and phyletic breadth, varying a key parameter (word length k) within bounds established in previous work. We map the inferred lateral regions to genes in recipient genomes, and construct networks in which the nodes are groups of genomes, and the edges natively represent LGT. We then extract maximum and maximal cliques (i.e., GECs) from these graphs, and identify nodes that belong to GECs across a wide range of k. Most surviving lateral transfer has happened within these GECs. Using Gene Ontology enrichment tests we demonstrate that biological processes associated with metabolism, regulation and transport are often over-represented among the genes affected by LGT within these communities. These enrichments are largely robust to change of k. PMID:28154557
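
The core idea of applying TF-IDF to genomes can be sketched by treating each genome as a document and each word of length k (k-mer) as a term. The code below is a generic textbook TF-IDF weighting, not the authors' pipeline for detecting lateral regions; the function names, genome names, and toy sequences are hypothetical.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping words of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf(genomes, k):
    """Compute TF-IDF weights for k-mers, one document per genome.

    genomes: dict mapping a genome name to its sequence string.
    Returns a dict: name -> {kmer: weight}, where
    weight = (count / total k-mers in genome) * log(N / genomes containing kmer).
    """
    counts = {name: Counter(kmers(seq, k)) for name, seq in genomes.items()}
    n_docs = len(genomes)
    # Document frequency: number of genomes in which each k-mer occurs.
    df = Counter()
    for c in counts.values():
        df.update(c.keys())
    weights = {}
    for name, c in counts.items():
        total = sum(c.values())
        weights[name] = {
            w: (n / total) * math.log(n_docs / df[w]) for w, n in c.items()
        }
    return weights


# Toy sequences for illustration only.
toy = {"a": "ACGTACGT", "b": "ACGTTTTT", "c": "GGGGACGT"}
w = tfidf(toy, 4)
```

In this toy input the word ACGT occurs in every genome, so its weight is zero everywhere, while words confined to a single genome receive the largest weights. Varying `k` changes which words are shared and which are distinctive, which is one way to see why word length is a key parameter.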

  19. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    SciTech Connect

    Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  20. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center.

    PubMed

    Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-01

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  1. ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes

    PubMed Central

    Didelot, Xavier; Wilson, Daniel J.

    2015-01-01

    Recombination is an important evolutionary force in bacteria, but it remains challenging to reconstruct the imports that occurred in the ancestry of a genomic sample. Here we present ClonalFrameML, which uses maximum likelihood inference to simultaneously detect recombination in bacterial genomes and account for it in phylogenetic reconstruction. ClonalFrameML can analyse hundreds of genomes in a matter of hours, and we demonstrate its usefulness on simulated and real datasets. We find evidence for recombination hotspots associated with mobile elements in Clostridium difficile ST6 and a previously undescribed 310kb chromosomal replacement in Staphylococcus aureus ST582. ClonalFrameML is freely available at http://clonalframeml.googlecode.com/. PMID:25675341

  2. Genomic inferences of domestication events are corroborated by written records in Brassica rapa.

    PubMed

    Qi, Xinshuai; An, Hong; Ragsdale, Aaron P; Hall, Tara E; Gutenkunst, Ryan N; Chris Pires, J; Barker, Michael S

    2017-07-01

    Demographic modelling is often used with population genomic data to infer the relationships and ages among populations. However, relatively few analyses are able to validate these inferences with independent data. Here, we leverage written records that describe distinct Brassica rapa crops to corroborate demographic models of domestication. Brassica rapa crops are renowned for their outstanding morphological diversity, but the relationships and order of domestication remain unclear. We generated genomewide SNPs from 126 accessions collected globally using high-throughput transcriptome data. Analyses of more than 31,000 SNPs across the B. rapa genome revealed evidence for five distinct genetic groups and supported a European-Central Asian origin of B. rapa crops. Our results supported the traditionally recognized South Asian and East Asian B. rapa groups with evidence that pak choi, Chinese cabbage and yellow sarson are likely monophyletic groups. In contrast, the oil-type B. rapa subsp. oleifera and brown sarson were polyphyletic. We also found no evidence to support the contention that rapini is the wild type or the earliest domesticated subspecies of B. rapa. Demographic analyses suggested that B. rapa was introduced to Asia 2,400-4,100 years ago, and that Chinese cabbage originated 1,200-2,100 years ago via admixture of pak choi and European-Central Asian B. rapa. We also inferred significantly different levels of founder effect among the B. rapa subspecies. Written records from antiquity that document these crops are consistent with these inferences. The concordance between our age estimates of domestication events with historical records provides unique support for our demographic inferences. © 2017 John Wiley & Sons Ltd.

  3. Extensive Error in the Number of Genes Inferred from Draft Genome Assemblies

    PubMed Central

    Denton, James F.; Lugo-Martinez, Jose; Tucker, Abraham E.; Schrider, Daniel R.; Warren, Wesley C.; Hahn, Matthew W.

    2014-01-01

    Current sequencing methods produce large amounts of data, but genome assemblies based on these data are often woefully incomplete. These incomplete and error-filled assemblies result in many annotation errors, especially in the number of genes present in a genome. In this paper we investigate the magnitude of the problem, both in terms of total gene number and the number of copies of genes in specific families. To do this, we compare multiple draft assemblies against higher-quality versions of the same genomes, using several new assemblies of the chicken genome based on both traditional and next-generation sequencing technologies, as well as published draft assemblies of chimpanzee. We find that upwards of 40% of all gene families are inferred to have the wrong number of genes in draft assemblies, and that these incorrect assemblies both add and subtract genes. Using simulated genome assemblies of Drosophila melanogaster, we find that the major cause of increased gene numbers in draft genomes is the fragmentation of genes onto multiple individual contigs. Finally, we demonstrate the usefulness of RNA-Seq in improving the gene annotation of draft assemblies, largely by connecting genes that have been fragmented in the assembly process. PMID:25474019

  4. Inferring network structure in non-normal and mixed discrete-continuous genomic data.

    PubMed

    Bhadra, Anindya; Rao, Arvind; Baladandayuthapani, Veerabhadran

    2017-04-24

    Inferring dependence structure through undirected graphs is crucial for uncovering the major modes of multivariate interaction among high-dimensional genomic markers that are potentially associated with cancer. Traditionally, conditional independence has been studied using sparse Gaussian graphical models for continuous data and sparse Ising models for discrete data. However, there are two clear situations when these approaches are inadequate. The first occurs when the data are continuous but display non-normal marginal behavior such as heavy tails or skewness, rendering an assumption of normality inappropriate. The second occurs when a part of the data is ordinal or discrete (e.g., presence or absence of a mutation) and the other part is continuous (e.g., expression levels of genes or proteins). In this case, the existing Bayesian approaches typically employ a latent variable framework for the discrete part that precludes inferring conditional independence among the data that are actually observed. The current article overcomes these two challenges in a unified framework using Gaussian scale mixtures. Our framework is able to handle continuous data that are not normal and data that are of mixed continuous and discrete nature, while still being able to infer a sparse conditional sign independence structure among the observed data. Extensive performance comparison in simulations with alternative techniques and an analysis of a real cancer genomics data set demonstrate the effectiveness of the proposed approach. © 2017, The International Biometric Society.

  5. The feasibility of genome-scale biological network inference using Graphics Processing Units.

    PubMed

    Thiagarajan, Raghuram; Alavi, Amir; Podichetty, Jagdeep T; Bazil, Jason N; Beard, Daniel A

    2017-01-01

    Systems research spanning fields from biology to finance involves the identification of models to represent the underpinnings of complex systems. Formal approaches for data-driven identification of network interactions include statistical inference-based approaches and methods to identify dynamical systems models that are capable of fitting multivariate data. Availability of large data sets and so-called 'big data' applications in biology present great opportunities as well as major challenges for systems identification/reverse engineering applications. For example, both inverse identification and forward simulations of genome-scale gene regulatory network models pose compute-intensive problems. This issue is addressed here by combining the processing power of Graphics Processing Units (GPUs) and a parallel reverse engineering algorithm for inference of regulatory networks. It is shown that, given an appropriate data set, information on genome-scale networks (systems of 1000 or more state variables) can be inferred using a reverse-engineering algorithm in a matter of days on a small-scale modern GPU cluster.

  6. Inferring drug-disease associations from integration of chemical, genomic and phenotype data using network propagation

    PubMed Central

    2013-01-01

    Background: During the last few years, knowledge of drugs, disease phenotypes and proteins has accumulated rapidly, and more and more scientists have turned their attention to inferring drug-disease associations by computational methods. Developing an integrated approach for systematically discovering drug-disease associations from these data is an important issue. Methods: We combine three different networks of drug, genomic and disease phenotype data and assign weights to the edges from available experimental data and knowledge. Given a specific disease, we use our network propagation approach to infer the drug-disease associations. Results: We use prostate cancer and colorectal cancer as our test data, and the manually curated drug-disease associations from the Comparative Toxicogenomics Database as our benchmark. The ranked results show that our proposed method obtains higher specificity and sensitivity and clearly outperforms previous methods. Our results also show that our method with off-target information achieves higher performance than that with only primary drug targets on both test data sets. Conclusions: We clearly demonstrate the feasibility and benefits of using network-based analyses of chemical, genomic and phenotype data to reveal drug-disease associations. The potential associations inferred by our method provide new perspectives for toxicogenomics and drug repositioning evaluation. PMID:24565337
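
    The abstract does not state the propagation rule, so the sketch below swaps in a common choice, random walk with restart, to show how scores can spread from a disease node across a weighted network. The graph, node names, edge weights and restart probability are all hypothetical.

```python
def propagate(adj, seeds, restart=0.5, iters=100):
    """Random walk with restart on an undirected weighted graph.

    adj: dict node -> dict neighbour -> weight (stored symmetrically).
    seeds: set of start nodes. Returns dict node -> score.
    """
    nodes = list(adj)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    out_weight = {v: sum(adj[v].values()) for v in nodes}
    for _ in range(iters):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                continue
            # Otherwise it follows an edge in proportion to its weight.
            for v, wgt in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * wgt / out_weight[u]
        p = nxt
    return p

# Hypothetical toy network: one disease, two genes, two drugs.
graph = {
    "disease": {"gene1": 1.0, "gene2": 0.2},
    "gene1": {"disease": 1.0, "drugA": 1.0},
    "gene2": {"disease": 0.2, "drugB": 1.0},
    "drugA": {"gene1": 1.0},
    "drugB": {"gene2": 1.0},
}
scores = propagate(graph, seeds={"disease"})
ranked_drugs = sorted(["drugA", "drugB"], key=scores.get, reverse=True)
```

    In this toy network drugA ranks above drugB because it is reached through the more strongly weighted disease-gene edge.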

  7. Genetic diversity in Sargasso Sea bacterioplankton.

    PubMed

    Giovannoni, S J; Britschgi, T B; Moyer, C L; Field, K G

    1990-05-03

    Bacterioplankton are recognized as important agents of biogeochemical change in marine ecosystems, yet relatively little is known about the species that make up these communities. Uncertainties about the genetic structure and diversity of natural bacterioplankton populations stem from the traditional difficulties associated with microbial cultivation techniques. Discrepancies between direct counts and plate counts are typically several orders of magnitude, raising doubts as to whether cultivated marine bacteria are actually representative of dominant planktonic species. We have phylogenetically analysed clone libraries of eubacterial 16S ribosomal RNA genes amplified from natural populations of Sargasso Sea picoplankton by the polymerase chain reaction. The analysis indicates the presence of a novel microbial group, the SAR11 cluster, which appears to be a significant component of this oligotrophic bacterioplankton community. A second cluster of lineages related to the oxygenic phototrophs--cyanobacteria, prochlorophytes and chloroplasts--was also observed. However, none of the genes matched the small subunit rRNA sequences of cultivated marine cyanobacteria from similar habitats. The diversity of 16S rRNA genes observed within the clusters suggests that these bacterioplankton may be consortia of independent lineages sharing surprisingly distant common ancestors.

  8. Gene network inference via structural equation modeling in genetical genomics experiments.

    PubMed

    Liu, Bing; de la Fuente, Alberto; Hoeschele, Ina

    2008-03-01

    Our goal is gene network inference in genetical genomics or systems genetics experiments. For species where sequence information is available, we first perform expression quantitative trait locus (eQTL) mapping by jointly utilizing cis-, cis-trans-, and trans-regulation. After using local structural models to identify regulator-target pairs for each eQTL, we construct an encompassing directed network (EDN) by assembling all retained regulator-target relationships. The EDN has nodes corresponding to expressed genes and eQTL and directed edges from eQTL to cis-regulated target genes, from cis-regulated genes to cis-trans-regulated target genes, from trans-regulator genes to target genes, and from trans-eQTL to target genes. For network inference within the strongly constrained search space defined by the EDN, we propose structural equation modeling (SEM), because it can model cyclic networks and the EDN indeed contains feedback relationships. On the basis of a factorization of the likelihood and the constrained search space, our SEM algorithm infers networks involving several hundred genes and eQTL. Structure inference is based on a penalized likelihood ratio and an adaptation of Occam's window model selection. The SEM algorithm was evaluated using data simulated with nonlinear ordinary differential equations and known cyclic network topologies and was applied to a real yeast data set.

  9. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    DOE PAGES

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    2014-09-29

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.

  10. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    PubMed Central

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    2015-01-01

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. The algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org. Contact: panc@ornl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25266224

  11. Inferring genome-wide patterns of admixture in Qataris using fifty-five ancestral populations.

    PubMed

    Omberg, Larsson; Salit, Jacqueline; Hackett, Neil; Fuller, Jennifer; Matthew, Rebecca; Chouchane, Lotfi; Rodriguez-Flores, Juan L; Bustamante, Carlos; Crystal, Ronald G; Mezey, Jason G

    2012-06-26

    Populations of the Arabian Peninsula have a complex genetic structure that reflects waves of migrations including the earliest human migrations from Africa and eastern Asia, migrations along ancient civilization trading routes and colonization history of recent centuries. Here, we present a study of genome-wide admixture in this region, using 156 genotyped individuals from Qatar, a country located at the crossroads of these migration patterns. Since haplotypes of these individuals could have originated from many different populations across the world, we have developed a machine learning method "SupportMix" to infer loci-specific genomic ancestry when simultaneously analyzing many possible ancestral populations. Simulations show that SupportMix is not only more accurate than other popular admixture discovery tools but is the first admixture inference method that can efficiently scale for simultaneous analysis of 50-100 putative ancestral populations while being independent of prior demographic information. By simultaneously using the 55 world populations from the Human Genome Diversity Panel, SupportMix was able to extract the fine-scale ancestry of the Qatar population, providing many new observations concerning the ancestry of the region. For example, as well as recapitulating the three major sub-populations in Qatar, composed of mainly Arabic, Persian, and African ancestry, SupportMix additionally identifies the specific ancestry of the Persian group to populations sampled in Greater Persia rather than from China and the ancestry of the African group to sub-Saharan origin and not Southern African Bantu origin as previously thought.

  12. Inferring Bottlenecks from Genome-Wide Samples of Short Sequence Blocks

    PubMed Central

    Bunnefeld, Lynsey; Frantz, Laurent A. F.; Lohse, Konrad

    2015-01-01

    The advent of the genomic era has necessitated the development of methods capable of analyzing large volumes of genomic data efficiently. Being able to reliably identify bottlenecks—extreme population size changes of short duration—not only is interesting in the context of speciation and extinction but also matters (as a null model) when inferring selection. Bottlenecks can be detected in polymorphism data via their distorting effect on the shape of the underlying genealogy. Here, we use the generating function of genealogies to derive the probability of mutational configurations in short sequence blocks under a simple bottleneck model. Given a large number of nonrecombining blocks, we can compute maximum-likelihood estimates of the time and strength of the bottleneck. Our method relies on a simple summary of the joint distribution of polymorphic sites. We extend the site frequency spectrum by counting mutations in frequency classes in short sequence blocks. Using linkage information over short distances in this way gives greater power to detect bottlenecks than the site frequency spectrum and potentially opens up a wide range of demographic histories to blockwise inference. Finally, we apply our method to genomic data from a species of pig (Sus cebifrons) endemic to islands in the center and west of the Philippines to estimate whether a bottleneck occurred upon island colonization and compare our scheme to Li and Durbin’s pairwise sequentially Markovian coalescent (PSMC) both for the pig data and using simulations. PMID:26341659
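
    The blockwise summary described in this abstract, counting mutations in frequency classes within short sequence blocks, can be sketched as follows. This is only an illustration of the counting step, not the authors' likelihood method; the function name, the 0/1 encoding of ancestral and derived alleles, and the sample data are assumptions.

```python
from collections import Counter

def blockwise_sfs(haplotypes, block_len):
    """Tally per-block site-frequency configurations.

    haplotypes: equal-length strings of '0' (ancestral) / '1' (derived),
    one per sampled sequence. For each block the key is a tuple
    (k_1, ..., k_{n-1}), where k_i is the number of sites in that block
    at which exactly i sequences carry the derived allele.
    """
    n = len(haplotypes)
    length = len(haplotypes[0])
    tally = Counter()
    for start in range(0, length - block_len + 1, block_len):
        config = [0] * (n - 1)
        for pos in range(start, start + block_len):
            derived = sum(h[pos] == "1" for h in haplotypes)
            if 0 < derived < n:  # skip monomorphic sites
                config[derived - 1] += 1
        tally[tuple(config)] += 1
    return tally

# Hypothetical sample: 3 sequences, 8 sites, blocks of 4 sites.
sample = ["01000010",
          "01001010",
          "00000000"]
result = blockwise_sfs(sample, block_len=4)
```

    Pooling all sites would give the ordinary site frequency spectrum (one singleton, two doubletons here); keeping the counts per block retains which mutations occur together over short distances.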

  13. RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

    SciTech Connect

    Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

    2010-05-26

    RegPredict web server is designed to provide comparative genomics tools for reconstruction and analysis of microbial regulons using comparative genomics approach. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences, ortholog and operon predictions from the MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.

  14. Hybrid Origins of Citrus Varieties Inferred from DNA Marker Analysis of Nuclear and Organelle Genomes

    PubMed Central

    Kitajima, Akira; Nonaka, Keisuke; Yoshioka, Terutaka; Ohta, Satoshi; Goto, Shingo; Toyoda, Atsushi; Fujiyama, Asao; Mochizuki, Takako; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2016-01-01

    Most indigenous citrus varieties are assumed to be natural hybrids, but their parentage has so far been determined in only a few cases because of their wide genetic diversity and the low transferability of DNA markers. Here we infer the parentage of indigenous citrus varieties using simple sequence repeat and indel markers developed from various citrus genome sequence resources. Parentage tests with 122 known hybrids using the selected DNA markers certify their transferability among those hybrids. Identity tests confirm that most variant strains are selected mutants, but we find four types of kunenbo (Citrus nobilis) and three types of tachibana (Citrus tachibana) for which we suggest different origins. Structure analysis with DNA markers that are in Hardy–Weinberg equilibrium deduce three basic taxa coinciding with the current understanding of citrus ancestors. Genotyping analysis of 101 indigenous citrus varieties with 123 selected DNA markers infers the parentages of 22 indigenous citrus varieties including Satsuma, Temple, and iyo, and single parents of 45 indigenous citrus varieties, including kunenbo, C. ichangensis, and Ichang lemon by allele-sharing and parentage tests. Genotyping analysis of chloroplast and mitochondrial genomes using 11 DNA markers classifies their cytoplasmic genotypes into 18 categories and deduces the combination of seed and pollen parents. Likelihood ratio analysis verifies the inferred parentages with significant scores. The reconstructed genealogy identifies 12 types of varieties consisting of Kishu, kunenbo, yuzu, koji, sour orange, dancy, kobeni mikan, sweet orange, tachibana, Cleopatra, willowleaf mandarin, and pummelo, which have played pivotal roles in the occurrence of these indigenous varieties. The inferred parentage of the indigenous varieties confirms their hybrid origins, as found by recent studies. PMID:27902727

  15. Hybrid Origins of Citrus Varieties Inferred from DNA Marker Analysis of Nuclear and Organelle Genomes.

    PubMed

    Shimizu, Tokurou; Kitajima, Akira; Nonaka, Keisuke; Yoshioka, Terutaka; Ohta, Satoshi; Goto, Shingo; Toyoda, Atsushi; Fujiyama, Asao; Mochizuki, Takako; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2016-01-01

    Most indigenous citrus varieties are assumed to be natural hybrids, but their parentage has so far been determined in only a few cases because of their wide genetic diversity and the low transferability of DNA markers. Here we infer the parentage of indigenous citrus varieties using simple sequence repeat and indel markers developed from various citrus genome sequence resources. Parentage tests with 122 known hybrids using the selected DNA markers certify their transferability among those hybrids. Identity tests confirm that most variant strains are selected mutants, but we find four types of kunenbo (Citrus nobilis) and three types of tachibana (Citrus tachibana) for which we suggest different origins. Structure analysis with DNA markers that are in Hardy-Weinberg equilibrium deduce three basic taxa coinciding with the current understanding of citrus ancestors. Genotyping analysis of 101 indigenous citrus varieties with 123 selected DNA markers infers the parentages of 22 indigenous citrus varieties including Satsuma, Temple, and iyo, and single parents of 45 indigenous citrus varieties, including kunenbo, C. ichangensis, and Ichang lemon by allele-sharing and parentage tests. Genotyping analysis of chloroplast and mitochondrial genomes using 11 DNA markers classifies their cytoplasmic genotypes into 18 categories and deduces the combination of seed and pollen parents. Likelihood ratio analysis verifies the inferred parentages with significant scores. The reconstructed genealogy identifies 12 types of varieties consisting of Kishu, kunenbo, yuzu, koji, sour orange, dancy, kobeni mikan, sweet orange, tachibana, Cleopatra, willowleaf mandarin, and pummelo, which have played pivotal roles in the occurrence of these indigenous varieties. The inferred parentage of the indigenous varieties confirms their hybrid origins, as found by recent studies.

  16. Inferring Where and When Replication Initiates from Genome-Wide Replication Timing Data

    NASA Astrophysics Data System (ADS)

    Baker, A.; Audit, B.; Yang, S. C.-H.; Bechhoefer, J.; Arneodo, A.

    2012-06-01

    Based on an analogy between DNA replication and one dimensional nucleation-and-growth processes, various attempts to infer the local initiation rate I(x,t) of DNA replication origins from replication timing data have been developed in the framework of phase transition kinetics theories. These works have all used curve-fit strategies to estimate I(x,t) from genome-wide replication timing data. Here, we show how to invert analytically the Kolmogorov-Johnson-Mehl-Avrami model and extract I(x,t) directly. Tests on both simulated and experimental budding-yeast data confirm the location and firing-time distribution of replication origins.

  17. Inferring human population size and separation history from multiple genome sequences

    PubMed Central

    Schiffels, Stephan; Durbin, Richard

    2014-01-01

    The availability of complete human genome sequences from populations across the world has given rise to new population genetic inference methods that explicitly model their ancestral relationship under recombination and mutation. So far, application of these methods to evolutionary history more recent than 20-30 thousand years ago and to population separations has been limited. Here we present a new method that overcomes these shortcomings. The Multiple Sequentially Markovian Coalescent (MSMC) analyses the observed pattern of mutations in multiple individuals, focusing on the first coalescence between any two individuals. Results from applying MSMC to genome sequences from nine populations across the world suggest that the genetic separation of non-African ancestors from African Yoruban ancestors started long before 50,000 years ago, and give information about human population history as recently as 2,000 years ago, including the bottleneck in the peopling of the Americas, and separations within Africa, East Asia and Europe. PMID:24952747

  18. The root of the mammalian tree inferred from whole mitochondrial genomes.

    PubMed

    Phillips, Matthew J; Penny, David

    2003-08-01

    Morphological and molecular data are currently contradictory over the position of monotremes with respect to marsupial and placental mammals. As part of a re-evaluation of both forms of data we examine complete mitochondrial genomes in more detail. There is a particularly large discrepancy in the frequencies of thymine and cytosine (T-C) between mitochondrial genomes that appears to affect some deep divergences in the mammalian tree. We report that recoding nucleotides to RY-characters, and partitioning maximum-likelihood analyses among subsets of data reduces such biases, and improves the fit of models to the data, respectively. RY-coding also increases the signal on the internal branches relative to external, and thus increases the phylogenetic signal. In contrast to previous analyses of mitochondrial data, our analyses favor Theria (marsupials plus placentals) over Marsupionta (monotremes plus marsupials). However, a short therian stem lineage is inferred, which is at variance with the traditionally deep placement of monotremes on morphological data.
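
    RY-coding, as mentioned in this abstract, replaces purines (A, G) with R and pyrimidines (C, T) with Y, so that T-C frequency differences no longer appear in the recoded data. A minimal sketch follows; the function name and the handling of gaps and ambiguity codes (left unchanged) are assumptions for this example.

```python
# Translation table: purines -> R, pyrimidines -> Y (U included for RNA).
RY = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})

def ry_code(seq):
    """Recode purines (A, G) as R and pyrimidines (C, T/U) as Y.

    Any other character (gaps, N, ambiguity codes) is left unchanged.
    """
    return seq.upper().translate(RY)

# Two sequences differing only by T/C transitions become identical.
same_after_recoding = ry_code("ATTC") == ry_code("ACCT")
```

    Because transitions within each class become invisible, only transversions contribute signal after recoding, which is why the approach can reduce compositional bias at the cost of discarding some information.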

  19. Inference of gene regulatory networks from genome-wide knockout fitness data

    PubMed Central

    Wang, Liming; Wang, Xiaodong; Arkin, Adam P.; Samoilov, Michael S.

    2013-01-01

    Motivation: Genome-wide fitness is an emerging type of high-throughput biological data generated for individual organisms by creating libraries of knockouts, subjecting them to broad ranges of environmental conditions, and measuring the resulting clone-specific fitnesses. Since fitness is an organism-scale measure of gene regulatory network behaviour, it may offer certain advantages when insights into such phenotypical and functional features are of primary interest over individual gene expression. Previous works have shown that genome-wide fitness data can be used to uncover novel gene regulatory interactions, when compared with results of more conventional gene expression analysis. Yet, to date, few algorithms have been proposed for systematically using genome-wide mutant fitness data for gene regulatory network inference. Results: In this article, we describe a model and propose an inference algorithm for using fitness data from knockout libraries to identify underlying gene regulatory networks. Unlike most prior methods, the presented approach captures not only the structural, but also the dynamical and non-linear nature of the biomolecular systems involved. A state–space model with non-linear basis is used for dynamically describing gene regulatory networks. Network structure is then elucidated by estimating unknown model parameters. An unscented Kalman filter is used to cope with the non-linearities introduced in the model, which also enables the algorithm to run in on-line mode for practical use. Here, we demonstrate that the algorithm provides satisfying results for both synthetic data and empirical measurements of the GAL network in the yeast Saccharomyces cerevisiae and the TyrR–LiuR network in the bacterium Shewanella oneidensis. Availability: MATLAB code and datasets are available to download at http://www.duke.edu/~lw174/Fitness.zip and http://genomics.lbl.gov/supplemental/fitness-bioinf/ Contact: wangx@ee.columbia.edu or mssamoilov@lbl.gov Supplementary information

  20. Proteomics-inferred genome typing (PIGT) demonstrates inter-population recombination as a strategy for environmental adaptation

    SciTech Connect

    Denef, Vincent; Verberkmoes, Nathan C; Shah, Manesh B; Abraham, Paul E; Lefsrud, Mark G; Hettich, Robert (Bob) L; Banfield, Jillian F.

    2009-01-01

    Analyses of ecological and evolutionary processes that shape microbial consortia are facilitated by comprehensive studies of ecosystems with low species richness. In the current study we evaluated the role of recombination in altering the fitness of chemoautotrophic bacteria in their natural environment. Proteomics-inferred genome typing (PIGT) was used to determine the genomic make-up of Leptospirillum group II populations in 27 biofilms sampled from six locations in the Richmond Mine acid mine drainage system (Iron Mountain, CA) over a four-year period. We observed six distinct genotypes that are recombinants comprised of segments from two parental genotypes. Community genomic analyses revealed additional low abundance recombinant variants. The dominance of some genotypes despite a larger available genome pool, and patterns of spatiotemporal distribution within the ecosystem, indicate selection for distinct recombinants. Genes involved in motility, signal transduction and transport were overrepresented in the tens to hundreds of kilobase recombinant blocks, whereas core metabolic functions were significantly underrepresented. Our findings demonstrate the power of PIGT and reveal that recombination is a mechanism for fine-scale adaptation in this system.

  1. Phylogenetics and biogeography of the dung beetle genus Onthophagus inferred from mitochondrial genomes.

    PubMed

    Breeschoten, Thijmen; Doorenweerd, Camiel; Tarasov, Sergei; Vogler, Alfried P

    2016-12-01

    Phylogenetic relationships of dung beetles in the tribe Onthophagini, including the species-rich, cosmopolitan genus Onthophagus, were inferred using whole mitochondrial genomes. Data were generated by shotgun sequencing of mixed genomic DNA from >100 individuals on 50% of an Illumina MiSeq flow cell. Genome assembly of the mixed reads produced contigs of 74 (nearly) complete mitogenomes. The final dataset included representatives of Onthophagus from all biogeographic regions, closely related genera of Onthophagini, and the related tribes Onitini and Oniticellini. The analysis defined four major clades of Onthophagini, which was paraphyletic for Oniticellini, with Onitini as sister group to all others. Several (sub)genera considered as members of Onthophagus in the older literature formed separate deep lineages. All New World species of Onthophagus formed a monophyletic group, and the Australian taxa are confined to a single or two closely related clades, one of which forms the sister group of the New World species. Dating the tree by constraining the basal splits with existing calibrations of Scarabaeoidea suggests an origin of Onthophagini sensu lato in the Eocene and a rapid spread from an African ancestral stock into the Oriental region, and secondarily to Australia and the Americas at about 20-24 Mya. The successful assembly of mitogenomes and the well-supported tree obtained from these sequences demonstrates the power of shotgun sequencing from total genomic DNA of species pools as an efficient tool in genus-level phylogenetics.

  2. BPhyOG: an interactive server for genome-wide inference of bacterial phylogenies based on overlapping genes.

    PubMed

    Luo, Yingqin; Fu, Cong; Zhang, Da-Yong; Lin, Kui

    2007-07-25

    Overlapping genes (OGs) in bacterial genomes are pairs of adjacent genes of which the coding sequences overlap partly or entirely. With the rapid accumulation of sequence data, many OGs in bacterial genomes have now been identified. Indeed, these might prove a consistent feature across all microbial genomes. Our previous work suggests that OGs can be considered as robust markers at the whole genome level for the construction of phylogenies. An online, interactive web server for inferring phylogenies is needed for biologists to analyze phylogenetic relationships among a set of bacterial genomes of interest. BPhyOG is an online interactive server for reconstructing the phylogenies of completely sequenced bacterial genomes on the basis of their shared overlapping genes. It provides two tree-reconstruction methods: Neighbor Joining (NJ) and Unweighted Pair-Group Method using Arithmetic averages (UPGMA). Users can apply the desired method to generate phylogenetic trees, which are based on an evolutionary distance matrix for the selected genomes. The distance between two genomes is defined by the normalized number of their shared OG pairs. BPhyOG also allows users to browse the OGs that were used to infer the phylogenetic relationships. It provides detailed annotation for each OG pair and the features of the component genes through hyperlinks. Users can also retrieve each of the homologous OG pairs that have been determined among 177 genomes. It is a useful tool for analyzing the tree of life and overlapping genes from a genomic standpoint. BPhyOG is a useful interactive web server for genome-wide inference of any potential evolutionary relationship among the genomes selected by users. It currently includes 177 completely sequenced bacterial genomes containing 79,855 OG pairs, the annotation and homologous OG pairs of which are integrated comprehensively. 
    The reliability of phylogenies complemented by annotations makes BPhyOG a powerful web server for genomic and genetic
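
    The distance-based workflow described in this record (a distance from shared OG pairs, then UPGMA clustering) can be illustrated with a minimal sketch. It is not BPhyOG's code: the abstract says only that the distance is "the normalized number of shared OG pairs", so the normalization in `og_distance` (one minus the shared count divided by the size of the smaller set) is an assumption, and the function names and toy genomes are hypothetical.

```python
from itertools import combinations


def og_distance(ogs_a, ogs_b):
    """Assumed normalization: 1 - shared OG pairs / size of the smaller OG set."""
    shared = len(ogs_a & ogs_b)
    smaller = min(len(ogs_a), len(ogs_b))
    return 1.0 if smaller == 0 else 1.0 - shared / smaller


def upgma(names, dist):
    """UPGMA clustering. dist maps frozenset({a, b}) -> distance.

    Returns the tree topology as nested tuples (branch lengths omitted).
    """
    clusters = {name: 1 for name in names}  # cluster label -> number of leaves
    dist = dict(dist)
    while len(clusters) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: dist[frozenset(p)])
        merged = (a, b)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        for other in clusters:
            # Size-weighted mean equals the unweighted mean over all leaf pairs.
            dist[frozenset((merged, other))] = (
                dist[frozenset((a, other))] * size_a
                + dist[frozenset((b, other))] * size_b
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Toy data: each genome is a set of hypothetical OG-pair identifiers.
genomes = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}, "C": {1, 6, 7, 8}}
distances = {
    frozenset(pair): og_distance(genomes[pair[0]], genomes[pair[1]])
    for pair in combinations(genomes, 2)
}
tree = upgma(list(genomes), distances)
```

    In the toy data, genomes A and B share three of four OG pairs and are therefore joined first; Neighbor Joining would consume the same distance matrix but does not assume a constant evolutionary rate.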

  3. Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags

    SciTech Connect

    Xu, Y.; Mural, R.; Uberbacher, E.

    1997-02-01

    Computational methods for gene identification in genomic sequences typically have two phases: coding region prediction and gene parsing. While there are many effective methods for predicting coding regions (exons), parsing the predicted exons into proper gene structures, to a large extent, remains an unsolved problem. This paper presents an algorithm for inferring gene structures from predicted exon candidates, based on Expressed Sequence Tags (ESTs) and biological intuition/rules. The algorithm first finds all the related ESTs in the EST database (dbEST) for each predicted exon, and infers the boundaries of one or a series of genes based on the available EST information and biological rules. Then it constructs gene models within each pair of gene boundaries, that are most consistent with the EST information. By exploiting EST information and biological rules, the algorithm can (1) model complicated multiple gene structures, including embedded genes, (2) identify falsely-predicted exons and locate missed exons, and (3) make more accurate exon boundary predictions. The algorithm has been implemented and tested on long genomic sequences with a number of genes. Test results show that very accurate (predicted) gene models can be expected when related ESTs exist for the predicted exons.

  4. Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags.

    PubMed

    Xu, Y; Mural, R J; Uberbacher, E C

    1997-01-01

    Computational methods for gene identification in genomic sequences typically have two phases: coding region prediction and gene parsing. While there are many effective methods for predicting coding regions (exons), parsing the predicted exons into proper gene structures, to a large extent, remains an unsolved problem. This paper presents an algorithm for inferring gene structures from predicted exon candidates, based on Expressed Sequence Tags (ESTs) and biological intuition/rules. The algorithm first finds all the related ESTs in the EST database (dbEST) for each predicted exon, and infers the boundaries of one or a series of genes based on the available EST information and biological rules. Then it constructs gene models within each pair of gene boundaries, that are most consistent with the EST information. By exploiting EST information and biological rules, the algorithm can (1) model complicated multiple gene structures, including embedded genes, (2) identify falsely-predicted exons and locate missed exons, and (3) make more accurate exon boundary predictions. The algorithm has been implemented and tested on long genomic sequences with a number of genes. Test results show that very accurate (predicted) gene models can be expected when related ESTs exist for the predicted exons.

  5. Occurrence and expression of gene transfer agent genes in marine bacterioplankton.

    PubMed

    Biers, Erin J; Wang, Kui; Pennington, Catherine; Belas, Robert; Chen, Feng; Moran, Mary Ann

    2008-05-01

    Genes with homology to the transduction-like gene transfer agent (GTA) were observed in genome sequences of three cultured members of the marine Roseobacter clade. A broader search for homologs for this host-controlled virus-like gene transfer system identified likely GTA systems in cultured Alphaproteobacteria, and particularly in marine bacterioplankton representatives. Expression of GTA genes and extracellular release of GTA particles (approximately 50 to 70 nm) was demonstrated experimentally for the Roseobacter clade member Silicibacter pomeroyi DSS-3, and intraspecific gene transfer was documented. GTA homologs are surprisingly infrequent in marine metagenomic sequence data, however, and the role of this lateral gene transfer mechanism in ocean bacterioplankton communities remains unclear.

  6. From algae to angiosperms-inferring the phylogeny of green plants (Viridiplantae) from 360 plastid genomes.

    PubMed

    Ruhfel, Brad R; Gitzendanner, Matthew A; Soltis, Pamela S; Soltis, Douglas E; Burleigh, J Gordon

    2014-02-17

    Next-generation sequencing has provided a wealth of plastid genome sequence data from an increasingly diverse set of green plants (Viridiplantae). Although these data have helped resolve the phylogeny of numerous clades (e.g., green algae, angiosperms, and gymnosperms), their utility for inferring relationships across all green plants is uncertain. Viridiplantae originated 700-1500 million years ago and may comprise as many as 500,000 species. This clade represents a major source of photosynthetic carbon and contains an immense diversity of life forms, including some of the smallest and largest eukaryotes. Here we explore the limits and challenges of inferring a comprehensive green plant phylogeny from available complete or nearly complete plastid genome sequence data. We assembled protein-coding sequence data for 78 genes from 360 diverse green plant taxa with complete or nearly complete plastid genome sequences available from GenBank. Phylogenetic analyses of the plastid data recovered well-supported backbone relationships and strong support for relationships that were not observed in previous analyses of major subclades within Viridiplantae. However, there also is evidence of systematic error in some analyses. In several instances we obtained strongly supported but conflicting topologies from analyses of nucleotides versus amino acid characters, and the considerable variation in GC content among lineages and within single genomes affected the phylogenetic placement of several taxa. Analyses of the plastid sequence data recovered a strongly supported framework of relationships for green plants. This framework includes: i) the placement of Zygnematophyceae as sister to land plants (Embryophyta), ii) a clade of extant gymnosperms (Acrogymnospermae) with cycads + Ginkgo sister to remaining extant gymnosperms and with gnetophytes (Gnetophyta) sister to non-Pinaceae conifers (Gnecup trees), and iii) within the monilophyte clade (Monilophyta), Equisetales

  7. From algae to angiosperms–inferring the phylogeny of green plants (Viridiplantae) from 360 plastid genomes

    PubMed Central

    2014-01-01

    Background Next-generation sequencing has provided a wealth of plastid genome sequence data from an increasingly diverse set of green plants (Viridiplantae). Although these data have helped resolve the phylogeny of numerous clades (e.g., green algae, angiosperms, and gymnosperms), their utility for inferring relationships across all green plants is uncertain. Viridiplantae originated 700-1500 million years ago and may comprise as many as 500,000 species. This clade represents a major source of photosynthetic carbon and contains an immense diversity of life forms, including some of the smallest and largest eukaryotes. Here we explore the limits and challenges of inferring a comprehensive green plant phylogeny from available complete or nearly complete plastid genome sequence data. Results We assembled protein-coding sequence data for 78 genes from 360 diverse green plant taxa with complete or nearly complete plastid genome sequences available from GenBank. Phylogenetic analyses of the plastid data recovered well-supported backbone relationships and strong support for relationships that were not observed in previous analyses of major subclades within Viridiplantae. However, there also is evidence of systematic error in some analyses. In several instances we obtained strongly supported but conflicting topologies from analyses of nucleotides versus amino acid characters, and the considerable variation in GC content among lineages and within single genomes affected the phylogenetic placement of several taxa. Conclusions Analyses of the plastid sequence data recovered a strongly supported framework of relationships for green plants. This framework includes: i) the placement of Zygnematophyceae as sister to land plants (Embryophyta), ii) a clade of extant gymnosperms (Acrogymnospermae) with cycads + Ginkgo sister to remaining extant gymnosperms and with gnetophytes (Gnetophyta) sister to non-Pinaceae conifers (Gnecup trees), and iii) within the monilophyte clade

  8. The aggregate site frequency spectrum (aSFS) for comparative population genomic inference

    PubMed Central

    Xue, Alexander T.; Hickerson, Michael J.

    2015-01-01

    Understanding how assemblages of species responded to past climate change is a central goal of comparative phylogeography and comparative population genomics, an endeavor that has increasing potential to integrate with community ecology. New sequencing technology now provides the potential to perform complex demographic inference at unprecedented resolution across assemblages of non-model species. To this end, we introduce the aggregate site frequency spectrum (aSFS), an expansion of the site frequency spectrum to use single nucleotide polymorphism (SNP) datasets collected from multiple, co-distributed species for assemblage-level demographic inference. We describe how the aSFS is constructed over an arbitrary number of independent population samples and then demonstrate how the aSFS can differentiate various multi-species demographic histories under a wide range of sampling configurations while allowing effective population sizes and expansion magnitudes to vary independently. We subsequently couple the aSFS with a hierarchical approximate Bayesian computation (hABC) framework to estimate degree of temporal synchronicity in expansion times across taxa, including an empirical demonstration with a dataset consisting of five populations of the threespine stickleback (Gasterosteus aculeatus). Corroborating what is generally understood about the recent post-glacial origins of these populations, the joint aSFS/hABC analysis strongly suggests that the stickleback data are most consistent with synchronous expansion after the Last Glacial Maximum (posterior probability = 0.99). The aSFS will have general application for multi-level statistical frameworks to test models involving assemblages and/or communities and as large-scale SNP data from non-model species become routine, the aSFS expands the potential for powerful next-generation comparative population genomic inference. PMID:26769405
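
    As a rough illustration of the data structure named in this record, the sketch below builds a per-population site frequency spectrum and then combines several spectra. The abstract does not spell out how the aSFS is constructed; the sketch assumes that, within each allele-frequency class, the per-population proportions are sorted in descending order so that the result does not depend on population labels. The function names and toy counts are hypothetical.

```python
from collections import Counter


def sfs(derived_counts, n_chromosomes):
    """Unfolded SFS as proportions of SNPs with derived-allele count 1..n-1."""
    counts = Counter(c for c in derived_counts if 0 < c < n_chromosomes)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * (n_chromosomes - 1)
    return [counts.get(k, 0) / total for k in range(1, n_chromosomes)]


def aggregate_sfs(spectra):
    """Assumed aggregation: per frequency class, sort proportions across
    populations in descending order (all spectra must share one sample size)."""
    return [sorted(column, reverse=True) for column in zip(*spectra)]


# Toy data: derived-allele counts per SNP, four sampled chromosomes each.
pop1 = sfs([1, 1, 2, 3], n_chromosomes=4)
pop2 = sfs([1, 2, 2, 2], n_chromosomes=4)
asfs = aggregate_sfs([pop1, pop2])
```

    Each inner list of `asfs` corresponds to one frequency class, so swapping `pop1` and `pop2` leaves the result unchanged.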

  9. High-Accuracy HLA Type Inference from Whole-Genome Sequencing Data Using Population Reference Graphs

    PubMed Central

    Dilthey, Alexander T.; Gourraud, Pierre-Antoine; McVean, Gil

    2016-01-01

    Genetic variation at the Human Leucocyte Antigen (HLA) genes is associated with many autoimmune and infectious disease phenotypes, is an important element of the immunological distinction between self and non-self, and shapes immune epitope repertoires. Determining the allelic state of the HLA genes (HLA typing) as a by-product of standard whole-genome sequencing data would therefore be highly desirable and enable the immunogenetic characterization of samples in currently ongoing population sequencing projects. Extensive hyperpolymorphism and sequence similarity between the HLA genes, however, pose problems for accurate read mapping and make HLA type inference from whole-genome sequencing data a challenging problem. We describe how to address these challenges in a Population Reference Graph (PRG) framework. First, we construct a PRG for 46 (mostly HLA) genes and pseudogenes, their genomic context and their characterized sequence variants, integrating a database of over 10,000 known allele sequences. Second, we present a sequence-to-PRG paired-end read mapping algorithm that enables accurate read mapping for the HLA genes. Third, we infer the most likely pair of underlying alleles at G group resolution from the IMGT/HLA database at each locus, employing a simple likelihood framework. We show that HLA*PRG, our algorithm, outperforms existing methods by a wide margin. We evaluate HLA*PRG on six classical class I and class II HLA genes (HLA-A, -B, -C, -DQA1, -DQB1, -DRB1) and on a set of 14 samples (3 samples with 2 x 100bp, 11 samples with 2 x 250bp Illumina HiSeq data). Of 158 alleles tested, we correctly infer 157 alleles (99.4%). We also identify and re-type two erroneous alleles in the original validation data. We conclude that HLA*PRG for the first time achieves accuracies comparable to gold-standard reference methods from standard whole-genome sequencing data, though high computational demands (currently ~30–250 CPU hours per sample) remain a significant

  10. High-Accuracy HLA Type Inference from Whole-Genome Sequencing Data Using Population Reference Graphs.

    PubMed

    Dilthey, Alexander T; Gourraud, Pierre-Antoine; Mentzer, Alexander J; Cereb, Nezih; Iqbal, Zamin; McVean, Gil

    2016-10-01

    Genetic variation at the Human Leucocyte Antigen (HLA) genes is associated with many autoimmune and infectious disease phenotypes, is an important element of the immunological distinction between self and non-self, and shapes immune epitope repertoires. Determining the allelic state of the HLA genes (HLA typing) as a by-product of standard whole-genome sequencing data would therefore be highly desirable and enable the immunogenetic characterization of samples in currently ongoing population sequencing projects. Extensive hyperpolymorphism and sequence similarity between the HLA genes, however, pose problems for accurate read mapping and make HLA type inference from whole-genome sequencing data a challenging problem. We describe how to address these challenges in a Population Reference Graph (PRG) framework. First, we construct a PRG for 46 (mostly HLA) genes and pseudogenes, their genomic context and their characterized sequence variants, integrating a database of over 10,000 known allele sequences. Second, we present a sequence-to-PRG paired-end read mapping algorithm that enables accurate read mapping for the HLA genes. Third, we infer the most likely pair of underlying alleles at G group resolution from the IMGT/HLA database at each locus, employing a simple likelihood framework. We show that HLA*PRG, our algorithm, outperforms existing methods by a wide margin. We evaluate HLA*PRG on six classical class I and class II HLA genes (HLA-A, -B, -C, -DQA1, -DQB1, -DRB1) and on a set of 14 samples (3 samples with 2 x 100bp, 11 samples with 2 x 250bp Illumina HiSeq data). Of 158 alleles tested, we correctly infer 157 alleles (99.4%). We also identify and re-type two erroneous alleles in the original validation data. We conclude that HLA*PRG for the first time achieves accuracies comparable to gold-standard reference methods from standard whole-genome sequencing data, though high computational demands (currently ~30-250 CPU hours per sample) remain a significant

  11. Inferring causal genomic alterations in breast cancer using gene expression data

    PubMed Central

    2011-01-01

    Background One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development. Many valuable studies lack genomic data to detect CNV; therefore, methods that are able to infer CNVs from gene expression data would help maximize the value of these studies. Results We developed a framework for identifying recurrent regions of CNV and distinguishing the cancer driver genes from the passenger genes in the regions. By inferring CNV regions across many datasets we were able to identify 109 recurrent amplified/deleted CNV regions. Many of these regions are enriched for genes involved in important processes associated with tumorigenesis and cancer progression. Genes in these recurrent CNV regions were then examined in the context of gene regulatory networks to prioritize putative cancer driver genes. The cancer driver genes uncovered by the framework include not only well-known oncogenes but also a number of novel cancer susceptibility genes validated via siRNA experiments. Conclusions To our knowledge, this is the first effort to systematically identify and validate drivers for expression-based CNV regions in breast cancer. The framework, in which wavelet analysis of expression-based copy number alteration is coupled with gene regulatory network analysis, provides a blueprint for leveraging genomic data to identify key regulatory components and gene targets. This integrative approach can be applied to many other large-scale gene expression studies and other novel types of cancer data such as next-generation sequencing based expression (RNA-Seq) as well as CNV data. PMID:21806811

  12. Inferring genome-wide patterns of admixture in Qataris using fifty-five ancestral populations

    PubMed Central

    2012-01-01

    Background Populations of the Arabian Peninsula have a complex genetic structure that reflects waves of migrations including the earliest human migrations from Africa and eastern Asia, migrations along ancient civilization trading routes and colonization history of recent centuries. Results Here, we present a study of genome-wide admixture in this region, using 156 genotyped individuals from Qatar, a country located at the crossroads of these migration patterns. Since haplotypes of these individuals could have originated from many different populations across the world, we have developed a machine learning method "SupportMix" to infer loci-specific genomic ancestry when simultaneously analyzing many possible ancestral populations. Simulations show that SupportMix is not only more accurate than other popular admixture discovery tools but is the first admixture inference method that can efficiently scale for simultaneous analysis of 50-100 putative ancestral populations while being independent of prior demographic information. Conclusions By simultaneously using the 55 world populations from the Human Genome Diversity Panel, SupportMix was able to extract the fine-scale ancestry of the Qatar population, providing many new observations concerning the ancestry of the region. For example, as well as recapitulating the three major sub-populations in Qatar, composed of mainly Arabic, Persian, and African ancestry, SupportMix additionally identifies the specific ancestry of the Persian group to populations sampled in Greater Persia rather than from China and the ancestry of the African group to sub-Saharan origin and not Southern African Bantu origin as previously thought. PMID:22734698

  13. Circumpolar synchrony in big river bacterioplankton

    PubMed Central

    Crump, Byron C.; Peterson, Bruce J.; Raymond, Peter A.; Amon, Rainer M. W.; Rinehart, Amanda; McClelland, James W.; Holmes, Robert M.

    2009-01-01

    Natural bacterial communities are extremely diverse and highly dynamic, but evidence is mounting that the compositions of these communities follow predictable temporal patterns. We investigated these patterns with a 3-year, circumpolar study of bacterioplankton communities in the six largest rivers of the pan-arctic watershed (Ob', Yenisey, Lena, Kolyma, Yukon, and Mackenzie), five of which are among Earth's 25 largest rivers. Communities in the six rivers shifted synchronously over time, correlating with seasonal shifts in hydrology and biogeochemistry and clustering into three groups: winter/spring, spring freshet, and summer/fall. This synchrony indicates that hemisphere-scale variation in seasonal climate sets the pace of variation in microbial diversity. Moreover, these seasonal communities reassembled each year in all six rivers, suggesting a long-term, predictable succession in the composition of big river bacterioplankton communities. PMID:19940248

  14. Adaptive evolution of chloroplast genome structure inferred using a parametric bootstrap approach

    PubMed Central

    Cui, Liying; Leebens-Mack, Jim; Wang, Li-San; Tang, Jijun; Rymarquis, Linda; Stern, David B; dePamphilis, Claude W

    2006-01-01

    Background Genome rearrangements influence gene order and configuration of gene clusters in all genomes. Most land plant chloroplast DNAs (cpDNAs) share a highly conserved gene content and with notable exceptions, a largely co-linear gene order. Conserved gene orders may reflect a slow intrinsic rate of neutral chromosomal rearrangements, or selective constraint. It is unknown to what extent observed changes in gene order are random or adaptive. We investigate the influence of natural selection on gene order in association with increased rate of chromosomal rearrangement. We use a novel parametric bootstrap approach to test if directional selection is responsible for the clustering of functionally related genes observed in the highly rearranged chloroplast genome of the unicellular green alga Chlamydomonas reinhardtii, relative to ancestral chloroplast genomes. Results Ancestral gene orders were inferred and then subjected to simulated rearrangement events under the random breakage model with varying ratios of inversions and transpositions. We found that adjacent chloroplast genes in C. reinhardtii were located on the same strand much more frequently than in simulated genomes that were generated under a random rearrangement process (increased sidedness; p < 0.0001). In addition, functionally related genes were found to be more clustered than those evolved under random rearrangements (p < 0.0001). We report evidence of co-transcription of neighboring genes, which may be responsible for the observed gene clusters in C. reinhardtii cpDNA. Conclusion Simulations and experimental evidence suggest that both selective maintenance and directional selection for gene clusters are determinants of chloroplast gene order. PMID:16469102
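
    The logic of a parametric bootstrap test for sidedness can be sketched as follows. This is a simplified illustration, not the authors' procedure: it simulates inversions only (the study also varied the ratio of transpositions), treats the genome as linear rather than circular, and uses hypothetical function names and parameters.

```python
import random


def sidedness(genome):
    """Count adjacent gene pairs on the same strand (sign = strand)."""
    strands = [g > 0 for g in genome]
    return sum(1 for a, b in zip(strands, strands[1:]) if a == b)


def random_inversion(genome, rng):
    """Invert a random segment in place: reverse gene order and flip strands."""
    i, j = sorted(rng.sample(range(len(genome) + 1), 2))
    genome[i:j] = [-g for g in reversed(genome[i:j])]


def bootstrap_p_value(ancestral, observed, n_events, n_sims, seed=0):
    """Fraction of simulated genomes at least as 'sided' as the observed one."""
    rng = random.Random(seed)
    observed_stat = sidedness(observed)
    hits = 0
    for _ in range(n_sims):
        genome = list(ancestral)
        for _ in range(n_events):
            random_inversion(genome, rng)
        if sidedness(genome) >= observed_stat:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_sims + 1)


# Toy data: signed gene orders; positive and negative integers mark the two strands.
ancestral = [1, -2, 3, -4, 5, -6, 7, -8, 9, -10]
observed = [1, 2, 3, 4, 5, -6, -7, -8, -9, -10]
p = bootstrap_p_value(ancestral, observed, n_events=5, n_sims=500)
```

    A small `p` means that random rearrangement of the ancestral order rarely produces as many same-strand neighbours as the observed genome, which is the pattern the record reports for C. reinhardtii.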

  15. Impact of Sample Type and DNA Isolation Procedure on Genomic Inference of Microbiome Composition

    PubMed Central

    Munk, Patrick; Lukjancenko, Oksana; Priemé, Anders; Aarestrup, Frank M.

    2016-01-01

    ABSTRACT Explorations of complex microbiomes using genomics greatly enhance our understanding about their diversity, biogeography, and function. The isolation of DNA from microbiome specimens is a key prerequisite for such examinations, but challenges remain in obtaining sufficient DNA quantities required for certain sequencing approaches, achieving accurate genomic inference of microbiome composition, and facilitating comparability of findings across specimen types and sequencing projects. These aspects are particularly relevant for the genomics-based global surveillance of infectious agents and antimicrobial resistance from different reservoirs. Here, we compare in a stepwise approach a total of eight commercially available DNA extraction kits and 16 procedures based on these for three specimen types (human feces, pig feces, and hospital sewage). We assess DNA extraction using spike-in controls and different types of beads for bead beating, facilitating cell lysis. We evaluate DNA concentration, purity, and stability and microbial community composition using 16S rRNA gene sequencing and for selected samples using shotgun metagenomic sequencing. Our results suggest that inferred community composition was dependent on inherent specimen properties as well as DNA extraction method. We further show that bead beating or enzymatic treatment can increase the extraction of DNA from Gram-positive bacteria. Final DNA quantities could be increased by isolating DNA from a larger volume of cell lysate than that in standard protocols. Based on this insight, we designed an improved DNA isolation procedure optimized for microbiome genomics that can be used for the three examined specimen types and potentially also for other biological specimens. A standard operating procedure is available from https://dx.doi.org/10.6084/m9.figshare.3475406. 
    IMPORTANCE Sequencing-based analyses of microbiomes may lead to a breakthrough in our understanding of the microbial worlds associated with

  16. Impact of Sample Type and DNA Isolation Procedure on Genomic Inference of Microbiome Composition.

    PubMed

    Knudsen, Berith E; Bergmark, Lasse; Munk, Patrick; Lukjancenko, Oksana; Priemé, Anders; Aarestrup, Frank M; Pamp, Sünje J

    2016-01-01

    Explorations of complex microbiomes using genomics greatly enhance our understanding about their diversity, biogeography, and function. The isolation of DNA from microbiome specimens is a key prerequisite for such examinations, but challenges remain in obtaining sufficient DNA quantities required for certain sequencing approaches, achieving accurate genomic inference of microbiome composition, and facilitating comparability of findings across specimen types and sequencing projects. These aspects are particularly relevant for the genomics-based global surveillance of infectious agents and antimicrobial resistance from different reservoirs. Here, we compare in a stepwise approach a total of eight commercially available DNA extraction kits and 16 procedures based on these for three specimen types (human feces, pig feces, and hospital sewage). We assess DNA extraction using spike-in controls and different types of beads for bead beating, facilitating cell lysis. We evaluate DNA concentration, purity, and stability and microbial community composition using 16S rRNA gene sequencing and for selected samples using shotgun metagenomic sequencing. Our results suggest that inferred community composition was dependent on inherent specimen properties as well as DNA extraction method. We further show that bead beating or enzymatic treatment can increase the extraction of DNA from Gram-positive bacteria. Final DNA quantities could be increased by isolating DNA from a larger volume of cell lysate than that in standard protocols. Based on this insight, we designed an improved DNA isolation procedure optimized for microbiome genomics that can be used for the three examined specimen types and potentially also for other biological specimens. A standard operating procedure is available from https://dx.doi.org/10.6084/m9.figshare.3475406. IMPORTANCE Sequencing-based analyses of microbiomes may lead to a breakthrough in our understanding of the microbial worlds associated with humans

  17. Co-occurrence Analysis of Microbial Taxa in the Atlantic Ocean Reveals High Connectivity in the Free-Living Bacterioplankton

    PubMed Central

    Milici, Mathias; Deng, Zhi-Luo; Tomasch, Jürgen; Decelle, Johan; Wos-Oxley, Melissa L.; Wang, Hui; Jáuregui, Ruy; Plumeier, Iris; Giebel, Helge-Ansgar; Badewien, Thomas H.; Wurst, Mascha; Pieper, Dietmar H.; Simon, Meinhard; Wagner-Döbler, Irene

    2016-01-01

    We determined the taxonomic composition of the bacterioplankton of the epipelagic zone of the Atlantic Ocean along a latitudinal transect (51°S–47°N) using Illumina sequencing of the V5-V6 region of the 16S rRNA gene and inferred co-occurrence networks. Bacterioplankton community composition was distinct for Longhurstian provinces and water depth. Free-living microbial communities (between 0.22 and 3 μm) were dominated by highly abundant and ubiquitous taxa with streamlined genomes (e.g., SAR11, SAR86, OM1, Prochlorococcus) and could clearly be separated from particle-associated communities, which were dominated by Bacteroidetes, Planctomycetes, Verrucomicrobia, and Roseobacters. From a total of 369 different communities we then inferred co-occurrence networks for each size fraction and depth layer of the plankton between bacteria and between bacteria and phototrophic micro-eukaryotes. The inferred networks showed a reduction of edges in the deepest layer of the photic zone. Networks composed of free-living bacteria had a larger number of connections per OTU than those of the particle-associated communities throughout the water column. Negative correlations accounted for roughly one third of the total edges in the free-living communities at all depths, while they decreased with depth in the particle-associated communities, where they accounted for roughly 10% of the total in the last part of the epipelagic zone. Co-occurrence networks of bacteria with phototrophic micro-eukaryotes were not taxon-specific and were dominated by mutual exclusion (~60%). The data show a high degree of specialization to micro-environments in the water column and highlight the importance of interdependencies, particularly between free-living bacteria in the upper layers of the epipelagic zone. PMID:27199970
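    The abstract above does not state how the co-occurrence networks were inferred. As a minimal sketch, assuming a simple pairwise Pearson correlation with an arbitrary cutoff (both are assumptions, not the authors' method), edges between taxa and the share of negative edges could be computed as follows. The abundance values are made-up toy data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cooccurrence_edges(abundance, threshold=0.8):
    """Return (taxon_a, taxon_b, r) for every pair with |r| >= threshold.

    The threshold is a hypothetical choice made for illustration only.
    """
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        r = pearson(abundance[a], abundance[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Toy abundance table: taxon -> counts across four samples (invented values).
toy = {
    "SAR11": [10, 20, 30, 40],
    "SAR86": [12, 19, 33, 41],
    "Bacteroidetes": [40, 30, 20, 10],
}
edges = cooccurrence_edges(toy)
# Share of edges that are negative correlations (mutual exclusion).
negative_fraction = sum(1 for _, _, r in edges if r < 0) / len(edges)
```

    Real analyses of sequencing counts usually need compositionality-aware measures and multiple-testing control, which this sketch leaves out.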

  18. Co-occurrence Analysis of Microbial Taxa in the Atlantic Ocean Reveals High Connectivity in the Free-Living Bacterioplankton.

    PubMed

    Milici, Mathias; Deng, Zhi-Luo; Tomasch, Jürgen; Decelle, Johan; Wos-Oxley, Melissa L; Wang, Hui; Jáuregui, Ruy; Plumeier, Iris; Giebel, Helge-Ansgar; Badewien, Thomas H; Wurst, Mascha; Pieper, Dietmar H; Simon, Meinhard; Wagner-Döbler, Irene

    2016-01-01

    We determined the taxonomic composition of the bacterioplankton of the epipelagic zone of the Atlantic Ocean along a latitudinal transect (51°S-47°N) using Illumina sequencing of the V5-V6 region of the 16S rRNA gene and inferred co-occurrence networks. Bacterioplankton community composition was distinct for Longhurstian provinces and water depth. Free-living microbial communities (between 0.22 and 3 μm) were dominated by highly abundant and ubiquitous taxa with streamlined genomes (e.g., SAR11, SAR86, OM1, Prochlorococcus) and could clearly be separated from particle-associated communities, which were dominated by Bacteroidetes, Planctomycetes, Verrucomicrobia, and Roseobacters. From a total of 369 different communities we then inferred co-occurrence networks for each size fraction and depth layer of the plankton between bacteria and between bacteria and phototrophic micro-eukaryotes. The inferred networks showed a reduction of edges in the deepest layer of the photic zone. Networks composed of free-living bacteria had a larger number of connections per OTU than those of the particle-associated communities throughout the water column. Negative correlations accounted for roughly one third of the total edges in the free-living communities at all depths, while they decreased with depth in the particle-associated communities, where they accounted for roughly 10% of the total in the last part of the epipelagic zone. Co-occurrence networks of bacteria with phototrophic micro-eukaryotes were not taxon-specific and were dominated by mutual exclusion (~60%). The data show a high degree of specialization to micro-environments in the water column and highlight the importance of interdependencies, particularly between free-living bacteria in the upper layers of the epipelagic zone.

  19. Comparative genome analyses of Arabidopsis spp.: Inferring chromosomal rearrangement events in the evolutionary history of A. thaliana

    PubMed Central

    Yogeeswaran, Krithika; Frary, Amy; York, Thomas L.; Amenta, Alison; Lesser, Andrew H.; Nasrallah, June B.; Tanksley, Steven D.; Nasrallah, Mikhail E.

    2005-01-01

    Comparative genome analysis is a powerful tool that can facilitate the reconstruction of the evolutionary history of the genomes of modern-day species. The model plant Arabidopsis thaliana with its n = 5 genome is thought to be derived from an ancestral n = 8 genome. Pairwise comparative genome analyses of A. thaliana with polyploid and diploid Brassicaceae species have suggested that rapid genome evolution, manifested by chromosomal rearrangements and duplications, characterizes the polyploid, but not the diploid, lineages of this family. In this study, we constructed a low-density genetic linkage map of Arabidopsis lyrata ssp. lyrata (A. l. lyrata; n = 8, diploid), the closest known relative of A. thaliana (MRCA ∼5 Mya), using A. thaliana-specific markers that resolve into the expected eight linkage groups. We then performed comparative Bayesian analyses using raw mapping data from this study and from a Capsella study to infer the number and nature of rearrangements that distinguish the n = 8 genomes of A. l. lyrata and Capsella from the n = 5 genome of A. thaliana. We conclude that there is strong statistical support in favor of the parsimony scenarios of 10 major chromosomal rearrangements separating these n = 8 genomes from A. thaliana. These chromosomal rearrangement events contribute to a rate of chromosomal evolution higher than previously reported in this lineage. We infer that at least seven of these events, common to both sets of data, are responsible for the change in karyotype and underlie genome reduction in A. thaliana. PMID:15805492

  20. The Phylogeny and Evolutionary Timescale of Muscoidea (Diptera: Brachycera: Calyptratae) Inferred from Mitochondrial Genomes

    PubMed Central

    Wang, Ning; Cameron, Stephen L.; Mao, Meng; Wang, Yuyu; Xi, Yuqiang; Yang, Ding

    2015-01-01

    Muscoidea is a significant dipteran clade that includes house flies (Family Muscidae), latrine flies (F. Fanniidae), dung flies (F. Scathophagidae) and root maggot flies (F. Anthomyiidae). It comprises approximately 7000 described species. The monophyly of the Muscoidea and the precise relationships of muscoids to the closest superfamily, the Oestroidea (blow flies, flesh flies etc.), are both unresolved. Until now mitochondrial (mt) genomes were available for only two of the four muscoid families, precluding a thorough test of phylogenetic relationships using this data source. Here we present the first two mt genomes for the families Fanniidae (Euryomma sp.) and Anthomyiidae (Delia platura (Meigen, 1826)). We also conducted phylogenetic analyses of these newly sequenced mt genomes plus 15 other species representative of dipteran diversity to address the internal relationships of Muscoidea and its systematic position. Both maximum-likelihood and Bayesian analyses suggested that Muscoidea was not a monophyletic group, with the relationship (Fanniidae + Muscidae) + ((Anthomyiidae + Scathophagidae) + (Calliphoridae + Sarcophagidae)) supported by the majority of analysed datasets. This also implies that Oestroidea was paraphyletic in the majority of analyses. Divergence time estimation suggested that the earliest split within the Calyptratae, separating (Tachinidae + Oestridae) from the remaining families, occurred in the Early Eocene. The main divergence within the paraphyletic Muscoidea grade was between Fanniidae + Muscidae and the lineage ((Anthomyiidae + Scathophagidae) + (Calliphoridae + Sarcophagidae)), which occurred in the Late Eocene. PMID:26225760

  1. The influence of genomic context on mutation patterns in the human genome inferred from rare variants

    PubMed Central

    Schaibley, Valerie M.; Zawistowski, Matthew; Wegmann, Daniel; Ehm, Margaret G.; Nelson, Matthew R.; St. Jean, Pamela L.; Abecasis, Gonçalo R.; Novembre, John; Zöllner, Sebastian; Li, Jun Z.

    2013-01-01

    Understanding patterns of spontaneous mutations is of fundamental interest in studies of human genome evolution and genetic disease. Here, we used extremely rare variants in humans to model the molecular spectrum of single-nucleotide mutations. Compared to common variants in humans and human–chimpanzee fixed differences (substitutions), rare variants, on average, arose more recently in the human lineage and are less affected by the potentially confounding effects of natural selection, population demographic history, and biased gene conversion. We analyzed variants obtained from a population-based sequencing study of 202 genes in >14,000 individuals. We observed considerable variability in the per-gene mutation rate, which was correlated with local GC content, but not recombination rate. Using >20,000 variants with a derived allele frequency ≤10−4, we examined the effect of local GC content and recombination rate on individual variant subtypes and performed comparisons with common variants and substitutions. The influence of local GC content on rare variants differed from that on common variants or substitutions, and the differences varied by variant subtype. Furthermore, recombination rate and recombination hotspots have little effect on rare variants of any subtype, yet both have a relatively strong impact on multiple variant subtypes in common variants and substitutions. This observation is consistent with the effect of biased gene conversion or selection-dependent processes. Our results highlight the distinct biases inherent in the initial mutation patterns and subsequent evolutionary processes that affect segregating variants. PMID:23990608

  2. The influence of genomic context on mutation patterns in the human genome inferred from rare variants.

    PubMed

    Schaibley, Valerie M; Zawistowski, Matthew; Wegmann, Daniel; Ehm, Margaret G; Nelson, Matthew R; St Jean, Pamela L; Abecasis, Gonçalo R; Novembre, John; Zöllner, Sebastian; Li, Jun Z

    2013-12-01

    Understanding patterns of spontaneous mutations is of fundamental interest in studies of human genome evolution and genetic disease. Here, we used extremely rare variants in humans to model the molecular spectrum of single-nucleotide mutations. Compared to common variants in humans and human-chimpanzee fixed differences (substitutions), rare variants, on average, arose more recently in the human lineage and are less affected by the potentially confounding effects of natural selection, population demographic history, and biased gene conversion. We analyzed variants obtained from a population-based sequencing study of 202 genes in >14,000 individuals. We observed considerable variability in the per-gene mutation rate, which was correlated with local GC content, but not recombination rate. Using >20,000 variants with a derived allele frequency ≤ 10(-4), we examined the effect of local GC content and recombination rate on individual variant subtypes and performed comparisons with common variants and substitutions. The influence of local GC content on rare variants differed from that on common variants or substitutions, and the differences varied by variant subtype. Furthermore, recombination rate and recombination hotspots have little effect on rare variants of any subtype, yet both have a relatively strong impact on multiple variant subtypes in common variants and substitutions. This observation is consistent with the effect of biased gene conversion or selection-dependent processes. Our results highlight the distinct biases inherent in the initial mutation patterns and subsequent evolutionary processes that affect segregating variants.

  3. Chiropteran types I and II interferon genes inferred from genome sequencing traces by a statistical gene-family assembler

    PubMed Central

    2010-01-01

    Background: The rate of emergence of human pathogens is steadily increasing; most of these novel agents originate in wildlife. Bats, remarkably, are the natural reservoirs of many of the most pathogenic viruses in humans. There are two bat genome projects currently underway, a circumstance that promises to speed the discovery of host factors important in the coevolution of bats with their viruses. These genomes, however, are not yet assembled and one of them will provide only low coverage, making the inference of most genes of immunological interest error-prone. Many more wildlife genome projects are underway and intend to provide only shallow coverage. Results: We have developed a statistical method for the assembly of gene families from partial genomes. The method takes full advantage of the quality scores generated by base-calling software, incorporating them into a complete probabilistic error model, to overcome the limitation inherent in the inference of gene family members from partial sequence information. We validated the method by inferring the human IFNA genes from the genome trace archives, and used it to infer 61 type-I interferon genes, and single type-II interferon genes in the bats Pteropus vampyrus and Myotis lucifugus. We confirmed our inferences by direct cloning and sequencing of IFNA, IFNB, IFND, and IFNK in P. vampyrus, and by demonstrating transcription of some of the inferred genes by known interferon-inducing stimuli. Conclusion: The statistical trace assembler described here provides a reliable method for extracting information from the many available and forthcoming partial or shallow genome sequencing projects, thereby facilitating the study of a wider variety of organisms with ecological and biomedical significance to humans than would otherwise be possible. PMID:20663124

  4. Chiropteran types I and II interferon genes inferred from genome sequencing traces by a statistical gene-family assembler.

    PubMed

    Kepler, Thomas B; Sample, Christopher; Hudak, Kathryn; Roach, Jeffrey; Haines, Albert; Walsh, Allyson; Ramsburg, Elizabeth A

    2010-07-21

    The rate of emergence of human pathogens is steadily increasing; most of these novel agents originate in wildlife. Bats, remarkably, are the natural reservoirs of many of the most pathogenic viruses in humans. There are two bat genome projects currently underway, a circumstance that promises to speed the discovery of host factors important in the coevolution of bats with their viruses. These genomes, however, are not yet assembled and one of them will provide only low coverage, making the inference of most genes of immunological interest error-prone. Many more wildlife genome projects are underway and intend to provide only shallow coverage. We have developed a statistical method for the assembly of gene families from partial genomes. The method takes full advantage of the quality scores generated by base-calling software, incorporating them into a complete probabilistic error model, to overcome the limitation inherent in the inference of gene family members from partial sequence information. We validated the method by inferring the human IFNA genes from the genome trace archives, and used it to infer 61 type-I interferon genes, and single type-II interferon genes in the bats Pteropus vampyrus and Myotis lucifugus. We confirmed our inferences by direct cloning and sequencing of IFNA, IFNB, IFND, and IFNK in P. vampyrus, and by demonstrating transcription of some of the inferred genes by known interferon-inducing stimuli. The statistical trace assembler described here provides a reliable method for extracting information from the many available and forthcoming partial or shallow genome sequencing projects, thereby facilitating the study of a wider variety of organisms with ecological and biomedical significance to humans than would otherwise be possible.

  5. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations

    PubMed Central

    Zhou, Meng; Cheng, Liang; Yang, Haixiu; Wang, Jing; Sun, Jie; Wang, Zhenzhen

    2016-01-01

    MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, individual sources of genomic data tend to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated on experimentally verified miRNA-disease associations, achieving an area under the ROC curve (AUC) of 0.834 in 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective on molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD. PMID:26849207
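    The evaluation metric quoted above, AUC under 5-fold cross-validation, can be illustrated with a short sketch. It assumes the pairwise-ranking definition of AUC and a simple round-robin fold assignment. The scores and labels are invented, and nothing here reproduces CHNmiRD itself.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_items, k=5):
    """Assign item indices to k held-out folds in round-robin order."""
    return [list(range(i, n_items, k)) for i in range(k)]


# Invented prediction scores with known labels (1 = verified association).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

folds = k_fold_indices(len(scores), k=5)
pooled_auc = auc(scores, labels)
```

    In a real cross-validation each fold's associations would be hidden during training and scored afterwards. Here the scores are simply treated as if they were such held-out predictions.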

  6. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations.

    PubMed

    Shi, Hongbo; Zhang, Guangde; Zhou, Meng; Cheng, Liang; Yang, Haixiu; Wang, Jing; Sun, Jie; Wang, Zhenzhen

    2016-01-01

    MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, individual sources of genomic data tend to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated on experimentally verified miRNA-disease associations, achieving an area under the ROC curve (AUC) of 0.834 in 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective on molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD.

  7. EFICAz: a comprehensive approach for accurate genome-scale enzyme function inference

    PubMed Central

    Tian, Weidong; Arakaki, Adrian K.; Skolnick, Jeffrey

    2004-01-01

    EFICAz (Enzyme Function Inference by Combined Approach) is an automatic engine for large-scale enzyme function inference that combines predictions from four different methods developed and optimized to achieve high prediction accuracy: (i) recognition of functionally discriminating residues (FDRs) in enzyme families obtained by a Conservation-controlled HMM Iterative procedure for Enzyme Family classification (CHIEFc), (ii) pairwise sequence comparison using a family specific Sequence Identity Threshold, (iii) recognition of FDRs in Multiple Pfam enzyme families, and (iv) recognition of multiple Prosite patterns of high specificity. For FDR (i.e. conserved positions in an enzyme family that discriminate between true and false members of the family) identification, we have developed an Evolutionary Footprinting method that uses evolutionary information from homofunctional and heterofunctional multiple sequence alignments associated with an enzyme family. The FDRs show a significant correlation with annotated active site residues. In a jackknife test, EFICAz shows high accuracy (92%) and sensitivity (82%) for predicting four EC digits in testing sequences that are <40% identical to any member of the corresponding training set. Applied to the Escherichia coli genome, EFICAz assigns more detailed enzymatic function than KEGG, and generates numerous novel predictions. PMID:15576349

  8. EFICAz: a comprehensive approach for accurate genome-scale enzyme function inference.

    PubMed

    Tian, Weidong; Arakaki, Adrian K; Skolnick, Jeffrey

    2004-01-01

    EFICAz (Enzyme Function Inference by Combined Approach) is an automatic engine for large-scale enzyme function inference that combines predictions from four different methods developed and optimized to achieve high prediction accuracy: (i) recognition of functionally discriminating residues (FDRs) in enzyme families obtained by a Conservation-controlled HMM Iterative procedure for Enzyme Family classification (CHIEFc), (ii) pairwise sequence comparison using a family specific Sequence Identity Threshold, (iii) recognition of FDRs in Multiple Pfam enzyme families, and (iv) recognition of multiple Prosite patterns of high specificity. For FDR (i.e. conserved positions in an enzyme family that discriminate between true and false members of the family) identification, we have developed an Evolutionary Footprinting method that uses evolutionary information from homofunctional and heterofunctional multiple sequence alignments associated with an enzyme family. The FDRs show a significant correlation with annotated active site residues. In a jackknife test, EFICAz shows high accuracy (92%) and sensitivity (82%) for predicting four EC digits in testing sequences that are <40% identical to any member of the corresponding training set. Applied to the Escherichia coli genome, EFICAz assigns more detailed enzymatic function than KEGG, and generates numerous novel predictions.

  9. Paleolithic Contingent in Modern Japanese: Estimation and Inference using Genome-wide Data

    PubMed Central

    He, Yungang; Wang, Wei R.; Xu, Shuhua; Jin, Li; SNP Consortium, Pan-Asia

    2012-01-01

    The genetic origins of Japanese populations have been controversial. Upper Paleolithic Japanese, i.e. Jomon, developed independently in the Japanese islands for more than 10,000 years until the isolation was ended by influxes of continental immigrants about 2,000 years ago. However, knowledge of the origin of Jomon and its contribution to the gene pool of contemporary Japanese is still limited, despite extensive studies using mtDNA and Y chromosomes. In this report, we aimed to infer the origin of Jomon and to estimate its contribution to Japanese by fitting an admixture model with missing data from Jomon to genome-wide data from 94 worldwide populations. Our results showed that the genetic contributions of Jomon, the Paleolithic contingent in Japanese, are 54.3∼62.3% in Ryukyuans and 23.1∼39.5% in mainland Japanese, respectively. Utilizing inferred allele frequencies of the Jomon population, we further showed the Paleolithic contingent in Japanese had a Northeast Asia origin. PMID:22482036

  10. ABC inference of multi-population divergence with admixture from unphased population genomic data.

    PubMed

    Robinson, John D; Bunnefeld, Lynsey; Hearn, Jack; Stone, Graham N; Hickerson, Michael J

    2014-09-01

    Rapidly developing sequencing technologies and declining costs have made it possible to collect genome-scale data from population-level samples in nonmodel systems. Inferential tools for historical demography given these data sets are, at present, underdeveloped. In particular, approximate Bayesian computation (ABC) has yet to be widely embraced by researchers generating these data. Here, we demonstrate the promise of ABC for analysis of the large data sets that are now attainable from nonmodel taxa through current genomic sequencing technologies. We develop and test an ABC framework for model selection and parameter estimation, given histories of three-population divergence with admixture. We then explore different sampling regimes to illustrate how sampling more loci, longer loci or more individuals affects the quality of model selection and parameter estimation in this ABC framework. Our results show that inferences improved substantially with increases in the number and/or length of sequenced loci, while less benefit was gained by sampling large numbers of individuals. Optimal sampling strategies given our inferential models included at least 2000 loci, each approximately 2 kb in length, sampled from five diploid individuals per population, although specific strategies are model and question dependent. We tested our ABC approach through simulation-based cross-validations and illustrate its application using previously analysed data from the oak gall wasp, Biorhiza pallida. © 2014 The Authors. Molecular Ecology published by John Wiley & Sons Ltd.
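    For readers unfamiliar with approximate Bayesian computation, the basic rejection scheme can be sketched on a toy problem. This is not the authors' three-population divergence model. It assumes a single unknown probability, a uniform prior, a binomial simulator and an arbitrary tolerance, all chosen only for illustration.

```python
import random


def simulate(theta, n_trials, rng):
    """Toy simulator: number of successes in n_trials with probability theta."""
    return sum(rng.random() < theta for _ in range(n_trials))


def abc_rejection(observed, n_trials, n_draws, tolerance, seed=0):
    """Keep prior draws whose simulated summary lies within tolerance of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # draw from a uniform(0, 1) prior
        if abs(simulate(theta, n_trials, rng) - observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Hypothetical data: 30 successes observed in 100 trials.
posterior = abc_rejection(observed=30, n_trials=100, n_draws=5000, tolerance=3)
posterior_mean = sum(posterior) / len(posterior)
```

    The accepted draws approximate the posterior. Tightening the tolerance improves the approximation at the cost of fewer accepted samples, the same kind of trade-off the abstract explores when it varies sampling effort.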

  11. GWIS: Genome-Wide Inferred Statistics for Functions of Multiple Phenotypes.

    PubMed

    Nieuwboer, Harold A; Pool, René; Dolan, Conor V; Boomsma, Dorret I; Nivard, Michel G

    2016-10-06

    Here we present a method of genome-wide inferred study (GWIS) that provides an approximation of genome-wide association study (GWAS) summary statistics for a variable that is a function of phenotypes for which GWAS summary statistics, phenotypic means, and covariances are available. A GWIS can be performed regardless of sample overlap between the GWAS of the phenotypes on which the function depends. Because a GWIS provides association estimates and their standard errors for each SNP, a GWIS can form the basis for polygenic risk scoring, LD score regression, Mendelian randomization studies, biological annotation, and other analyses. GWISs can also be used to boost power of a GWAS meta-analysis where cohorts have not measured all constituent phenotypes in the function. We demonstrate the accuracy of a BMI GWIS by performing power simulations and type I error simulations under varying circumstances, and we apply a GWIS by reconstructing a body mass index (BMI) GWAS based on a weight GWAS and a height GWAS. Furthermore, we apply a GWIS to further our understanding of the underlying genetic structure of bipolar disorder and schizophrenia and their relation to educational attainment. Our analyses suggest that the previously reported genetic correlation between schizophrenia and educational attainment is probably induced by the observed genetic correlation between schizophrenia and bipolar disorder and the previously reported genetic correlation between bipolar disorder and educational attainment. Copyright © 2016 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
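    One way to picture how summary statistics for a function of phenotypes might be approximated is a first-order Taylor expansion around the phenotypic means. The sketch below applies that idea to BMI = weight / height². It is an assumption for illustration and may differ from the estimator the authors actually use. The effect sizes and means are invented, and standard errors (which would also need the phenotypic covariances) are omitted.

```python
def bmi(weight_kg, height_m):
    """Body mass index as a function of weight and height."""
    return weight_kg / height_m ** 2


def approx_bmi_effect(beta_weight, beta_height, mean_weight, mean_height):
    """First-order approximation of a SNP effect on BMI.

    Uses the partial derivatives of weight / height**2 evaluated at the
    phenotypic means (a hypothetical linearisation).
    """
    d_weight = 1.0 / mean_height ** 2
    d_height = -2.0 * mean_weight / mean_height ** 3
    return d_weight * beta_weight + d_height * beta_height


# Invented per-allele effects: +0.5 kg on weight, +0.002 m on height.
approx = approx_bmi_effect(0.5, 0.002, mean_weight=70.0, mean_height=1.75)
# Direct shift in BMI for an individual at the means, for comparison.
exact_shift = bmi(70.0 + 0.5, 1.75 + 0.002) - bmi(70.0, 1.75)
```

    For small per-allele effects the linear approximation and the direct shift agree closely, which is why only the means and the two sets of effect estimates are needed in this sketch.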

  12. ABC inference of multi-population divergence with admixture from unphased population genomic data

    PubMed Central

    Robinson, John D; Bunnefeld, Lynsey; Hearn, Jack; Stone, Graham N; Hickerson, Michael J

    2014-01-01

    Rapidly developing sequencing technologies and declining costs have made it possible to collect genome-scale data from population-level samples in nonmodel systems. Inferential tools for historical demography given these data sets are, at present, underdeveloped. In particular, approximate Bayesian computation (ABC) has yet to be widely embraced by researchers generating these data. Here, we demonstrate the promise of ABC for analysis of the large data sets that are now attainable from nonmodel taxa through current genomic sequencing technologies. We develop and test an ABC framework for model selection and parameter estimation, given histories of three-population divergence with admixture. We then explore different sampling regimes to illustrate how sampling more loci, longer loci or more individuals affects the quality of model selection and parameter estimation in this ABC framework. Our results show that inferences improved substantially with increases in the number and/or length of sequenced loci, while less benefit was gained by sampling large numbers of individuals. Optimal sampling strategies given our inferential models included at least 2000 loci, each approximately 2 kb in length, sampled from five diploid individuals per population, although specific strategies are model and question dependent. We tested our ABC approach through simulation-based cross-validations and illustrate its application using previously analysed data from the oak gall wasp, Biorhiza pallida. PMID:25113024

  13. Inferring Quantitative Trait Pathways Associated with Bull Fertility from a Genome-Wide Association Study

    PubMed Central

    Peñagaricano, Francisco; Weigel, Kent A.; Rosa, Guilherme J. M.; Khatib, Hasan

    2013-01-01

    Whole-genome association studies typically focus on genetic markers with the strongest evidence of association. However, single markers often explain only a small component of the genetic variance and hence offer a limited understanding of the trait under study. As such, the objective of this study was to perform a pathway-based association analysis in Holstein dairy cattle in order to identify relevant pathways involved in bull fertility. The results of a single-marker association analysis, using 1,755 bulls with sire conception rate data and genotypes for 38,650 single nucleotide polymorphisms (SNPs), were used in this study. A total of 16,819 annotated genes, including 2,767 significantly associated with bull fertility, were used to interrogate a total of 662 Gene Ontology (GO) terms and 248 InterPro (IP) entries using a test of proportions based on the cumulative hypergeometric distribution. After multiple-testing correction, 20 GO categories and one IP entry showed significant overrepresentation of genes statistically associated with bull fertility. Several of these functional categories, such as small GTPase-mediated signal transduction, neurogenesis, calcium ion binding, and cytoskeleton, are known to be involved in biological processes closely related to male fertility. These results could provide insight into the genetic architecture of this complex trait in dairy cattle. In addition, this study shows that quantitative trait pathways inferred from single-marker analyses could enhance our interpretations of the results of genome-wide association studies. PMID:23335935
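    The overrepresentation test named above can be sketched as an upper-tail hypergeometric probability. The gene totals (16,819 annotated, 2,767 significant) come from the abstract. The term size and hit count in the example are hypothetical, and the multiple-testing correction mentioned in the abstract is not shown.

```python
from math import comb


def overrepresentation_pvalue(hits, total_genes, significant_genes, term_size):
    """P(X >= hits) when term_size genes are drawn without replacement.

    X counts how many genes in a functional term are among the
    significant genes, under the cumulative hypergeometric distribution.
    """
    upper = min(term_size, significant_genes)
    numerator = sum(
        comb(significant_genes, i)
        * comb(total_genes - significant_genes, term_size - i)
        for i in range(hits, upper + 1)
    )
    return numerator / comb(total_genes, term_size)


# Hypothetical GO term with 40 genes, 15 of them significantly associated.
p_value = overrepresentation_pvalue(
    hits=15, total_genes=16819, significant_genes=2767, term_size=40
)
```

    Exact integer binomial coefficients are used so that the tail sum does not lose precision before the final division.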

  14. Species Delimitation and Interspecific Relationships of the Genus Orychophragmus (Brassicaceae) Inferred from Whole Chloroplast Genomes

    PubMed Central

    Hu, Huan; Hu, Quanjun; Al-Shehbaz, Ihsan A.; Luo, Xin; Zeng, Tingting; Guo, Xinyi; Liu, Jianquan

    2016-01-01

    Genetic variations from few chloroplast DNA fragments show lower discriminatory power in the delimitation of closely related species and less resolution ability in discerning interspecific relationships than from nrITS. Here we use Orychophragmus (Brassicaceae) as a model system to test the hypothesis that the whole chloroplast genomes (plastomes), with accumulation of more variations despite the slow evolution, can overcome these weaknesses. We used Illumina sequencing technology via a reference-guided assembly to construct complete plastomes of 17 individuals from six putatively assumed species in the genus. All plastomes are highly conserved in genome structure, gene order, and orientation, and they are around 153 kb in length and contain 113 unique genes. However, nucleotide variations are quite substantial to support the delimitation of all sampled species and to resolve interspecific relationships with high statistical supports. As expected, the estimated divergences between major clades and species are lower than those estimated from nrITS probably due to the slow substitution rate of the plastomes. However, the plastome and nrITS phylogenies were contradictory in the placements of most species, thus suggesting that these species may have experienced complex non-bifurcating evolutions with incomplete lineage sorting and/or hybrid introgressions. Overall, our case study highlights the importance of using plastomes to examine species boundaries and establish an independent phylogeny to infer the speciation history of plants. PMID:27999584

  15. LASSIM-A network inference toolbox for genome-wide mechanistic modeling.

    PubMed

    Magnusson, Rasmus; Mariotti, Guido Pio; Köpsén, Mattias; Lövfors, William; Gawel, Danuta R; Jörnsten, Rebecka; Linde, Jörg; Nordling, Torbjörn E M; Nyman, Elin; Schulze, Sylvie; Nestor, Colm E; Zhang, Huan; Cedersund, Gunnar; Benson, Mikael; Tjärnberg, Andreas; Gustafsson, Mika

    2017-06-01

    systems-level data. We demonstrate the power of this approach by inferring a mechanistically motivated, genome-wide model of the Th2 transcription regulatory system, which plays an important role in several immune related diseases.

  16. LASSIM—A network inference toolbox for genome-wide mechanistic modeling

    PubMed Central

    Mariotti, Guido Pio; Lövfors, William; Gawel, Danuta R.; Jörnsten, Rebecka; Linde, Jörg; Schulze, Sylvie; Nestor, Colm E.; Zhang, Huan; Cedersund, Gunnar; Benson, Mikael

    2017-01-01

    systems-level data. We demonstrate the power of this approach by inferring a mechanistically motivated, genome-wide model of the Th2 transcription regulatory system, which plays an important role in several immune related diseases. PMID:28640810

  17. Orthology Inference in Nonmodel Organisms Using Transcriptomes and Low-Coverage Genomes: Improving Accuracy and Matrix Occupancy for Phylogenomics

    PubMed Central

    Yang, Ya; Smith, Stephen A.

    2014-01-01

    Orthology inference is central to phylogenomic analyses. Phylogenomic data sets commonly include transcriptomes and low-coverage genomes that are incomplete and contain errors and isoforms. These properties can severely violate the underlying assumptions of orthology inference with existing heuristics. We present a procedure that uses phylogenies for both homology and orthology assignment. The procedure first uses similarity scores to infer putative homologs that are then aligned, constructed into phylogenies, and pruned of spurious branches caused by deep paralogs, misassembly, frameshifts, or recombination. These final homologs are then used to identify orthologs. We explore four alternative tree-based orthology inference approaches, of which two are new. These accommodate gene and genome duplications as well as gene tree discordance. We demonstrate these methods in three published data sets including the grape family, Hymenoptera, and millipedes with divergence times ranging from approximately 100 to over 400 Ma. The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. To explicitly evaluate sources of conflicting phylogenetic signals, we applied serial jackknife analyses of gene regions keeping each locus intact. The methods described here can scale to over 100 taxa. They have been implemented in python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines. All scripts are available from https://bitbucket.org/yangya/phylogenomic_dataset_construction. PMID:25158799

  18. PICARA, an analytical pipeline providing probabilistic inference about a priori candidates genes underlying genome-wide association QTL in plants

    USDA-ARS's Scientific Manuscript database

    PICARA is an analytical pipeline designed to systematically summarize observed SNP/trait associations identified by genome wide association studies (GWAS) and to identify candidate genes involved in the regulation of complex trait variation. The pipeline provides probabilistic inference about a prio...

  19. Genomic analysis of circulating cell-free DNA infers breast cancer dormancy

    PubMed Central

    Shaw, Jacqueline A.; Page, Karen; Blighe, Kevin; Hava, Natasha; Guttery, David; Ward, Becky; Brown, James; Ruangpratheep, Chetana; Stebbing, Justin; Payne, Rachel; Palmieri, Carlo; Cleator, Suzy; Walker, Rosemary A.; Coombes, R. Charles

    2012-01-01

    Biomarkers in breast cancer to monitor minimal residual disease have remained elusive. We hypothesized that genomic analysis of circulating free DNA (cfDNA) isolated from plasma may form the basis for a means of detecting and monitoring breast cancer. We profiled 251 genomes using Affymetrix SNP 6.0 arrays to determine copy number variations (CNVs) and loss of heterozygosity (LOH), comparing 138 cfDNA samples with matched primary tumor and normal leukocyte DNA in 65 breast cancer patients and eight healthy female controls. Concordance of SNP genotype calls in paired cfDNA and leukocyte DNA samples distinguished between breast cancer patients and healthy female controls (P < 0.0001) and between preoperative patients and patients on follow-up who had surgery and treatment (P = 0.0016). Principal component analyses of cfDNA SNP/copy number results also separated presurgical breast cancer patients from the healthy controls, suggesting specific CNVs in cfDNA have clinical significance. We identified focal high-level DNA amplification in paired tumor and cfDNA clustered in a number of chromosome arms, some of which harbor genes with oncogenic potential, including USP17L2 (DUB3), BRF1, MTA1, and JAG2. Remarkably, in 50 patients on follow-up, specific CNVs were detected in cfDNA, mirroring the primary tumor, up to 12 yr after diagnosis despite no other evidence of disease. These data demonstrate the potential of SNP/CNV analysis of cfDNA to distinguish between patients with breast cancer and healthy controls during routine follow-up. The genomic profiles of cfDNA infer dormancy/minimal residual disease in the majority of patients on follow-up. PMID:21990379

  20. Genomic analysis of circulating cell-free DNA infers breast cancer dormancy.

    PubMed

    Shaw, Jacqueline A; Page, Karen; Blighe, Kevin; Hava, Natasha; Guttery, David; Ward, Becky; Brown, James; Ruangpratheep, Chetana; Stebbing, Justin; Payne, Rachel; Palmieri, Carlo; Cleator, Suzy; Walker, Rosemary A; Coombes, R Charles

    2012-02-01

    Biomarkers in breast cancer to monitor minimal residual disease have remained elusive. We hypothesized that genomic analysis of circulating free DNA (cfDNA) isolated from plasma may form the basis for a means of detecting and monitoring breast cancer. We profiled 251 genomes using Affymetrix SNP 6.0 arrays to determine copy number variations (CNVs) and loss of heterozygosity (LOH), comparing 138 cfDNA samples with matched primary tumor and normal leukocyte DNA in 65 breast cancer patients and eight healthy female controls. Concordance of SNP genotype calls in paired cfDNA and leukocyte DNA samples distinguished between breast cancer patients and healthy female controls (P < 0.0001) and between preoperative patients and patients on follow-up who had surgery and treatment (P = 0.0016). Principal component analyses of cfDNA SNP/copy number results also separated presurgical breast cancer patients from the healthy controls, suggesting specific CNVs in cfDNA have clinical significance. We identified focal high-level DNA amplification in paired tumor and cfDNA clustered in a number of chromosome arms, some of which harbor genes with oncogenic potential, including USP17L2 (DUB3), BRF1, MTA1, and JAG2. Remarkably, in 50 patients on follow-up, specific CNVs were detected in cfDNA, mirroring the primary tumor, up to 12 yr after diagnosis despite no other evidence of disease. These data demonstrate the potential of SNP/CNV analysis of cfDNA to distinguish between patients with breast cancer and healthy controls during routine follow-up. The genomic profiles of cfDNA infer dormancy/minimal residual disease in the majority of patients on follow-up.

  1. The evolutionary history of termites as inferred from 66 mitochondrial genomes.

    PubMed

    Bourguignon, Thomas; Lo, Nathan; Cameron, Stephen L; Šobotník, Jan; Hayashi, Yoshinobu; Shigenobu, Shuji; Watanabe, Dai; Roisin, Yves; Miura, Toru; Evans, Theodore A

    2015-02-01

    Termites have colonized many habitats and are among the most abundant animals in tropical ecosystems, which they modify considerably through their actions. The timing of their rise in abundance and of the dispersal events that gave rise to modern termite lineages is not well understood. To shed light on termite origins and diversification, we sequenced the mitochondrial genome of 48 termite species and combined them with 18 previously sequenced termite mitochondrial genomes for phylogenetic and molecular clock analyses using multiple fossil calibrations. The 66 genomes represent most major clades of termites. Unlike previous phylogenetic studies based on fewer molecular data, our phylogenetic tree is fully resolved for the lower termites. The phylogenetic positions of Macrotermitinae and Apicotermitinae are also resolved as the basal groups in the higher termites, but in the crown termitid groups, including Termitinae + Syntermitinae + Nasutitermitinae + Cubitermitinae, the position of some nodes remains uncertain. Our molecular clock tree indicates that the lineages leading to termites and Cryptocercus roaches diverged 170 Ma (153-196 Ma 95% confidence interval [CI]), that modern Termitidae arose 54 Ma (46-66 Ma 95% CI), and that the crown termitid group arose 40 Ma (35-49 Ma 95% CI). This indicates that the distribution of basal termite clades was influenced by the final stages of the breakup of Pangaea. Our inference of ancestral geographic ranges shows that the Termitidae, which includes more than 75% of extant termite species, most likely originated in Africa or Asia, and acquired their pantropical distribution after a series of dispersal and subsequent diversification events.

  2. Covariance Between Genotypic Effects and its Use for Genomic Inference in Half-Sib Families

    PubMed Central

    Wittenburg, Dörte; Teuscher, Friedrich; Klosa, Jan; Reinsch, Norbert

    2016-01-01

    In livestock, current statistical approaches utilize extensive molecular data, e.g., single nucleotide polymorphisms (SNPs), to improve the genetic evaluation of individuals. The number of model parameters increases with the number of SNPs, so the multicollinearity between covariates can affect the results obtained using whole genome regression methods. In this study, dependencies between SNPs due to linkage and linkage disequilibrium among the chromosome segments were explicitly considered in methods used to estimate the effects of SNPs. The population structure affects the extent of such dependencies, so the covariance among SNP genotypes was derived for half-sib families, which are typical in livestock populations. Conditional on the SNP haplotypes of the common parent (sire), the theoretical covariance was determined using the haplotype frequencies of the population from which the individual parent (dam) was derived. The resulting covariance matrix was included in a statistical model for a trait of interest, and this covariance matrix was then used to specify prior assumptions for SNP effects in a Bayesian framework. The approach was applied to one family in simulated scenarios (few and many quantitative trait loci) and using semireal data obtained from dairy cattle to identify genome segments that affect performance traits, as well as to investigate the impact on predictive ability. Compared with a method that does not explicitly consider any of the relationship among predictor variables, the accuracy of genetic value prediction was improved by 10–22%. The results show that the inclusion of dependence is particularly important for genomic inference based on small sample sizes. PMID:27402363

  3. Effect of sampling on the extent and accuracy of the inferred genetic history of recombining genome.

    PubMed

    Platt, Daniel E; Utro, Filippo; Parida, Laxmi

    2014-06-01

    Accessible biotechnology is enabling the cataloging of genetic variants in individuals in populations at unprecedented scales. The use of phylogeny of the individuals within populations allows a model-based approach to studying these variations, which is important in understanding relationships between and across populations. For the somatic genome, however, the phylogeny must take recombinations (and other genetic mixing events) into account. Hence the resulting topology is more complex than a tree. Unlike a tree topology, it is not as apparent which events are visible from the extant samples. An earlier work presented a mathematical model (called the minimal descriptor) for teasing apart the inherent visible information from that which any specific algorithm might see. We use this framework to study the effect of sampling sizes on the overall inferred genetic history. In this paper, we seek to understand the extent, characteristics (in terms of recent versus ancient genetic events) and reliability of what was resolvable within field samples drawn from modern populations. We observed that most of the visible ancient events are recoverable from relatively small sample sizes. However, without identification of this relatively small minority of ancient genetic events, most of the signal will appear to reflect modern events and admixtures. We also found that the more ancient events are likely to be reproduced with higher fidelity between multiple samplings, and that the identified older events are less likely to yield false positive discrimination between populations. We conclude that a recombinant phylogenetic reconstruction is necessary to identify which markers are most likely to discriminate ancient events, and to discriminate between populations with lower risk of false positives. Secondly, on a broader note, this study also provides a general methodology for a critical assessment of the inferred common genetic history of populations (say, in plant cultivars or

  4. Photoheterotrophy of bacterioplankton is ubiquitous in the surface oligotrophic ocean

    NASA Astrophysics Data System (ADS)

    Evans, Claire; Gómez-Pereira, Paola R.; Martin, Adrian P.; Scanlan, David J.; Zubkov, Mikhail V.

    2015-06-01

    Accurate measurements in the Southern Hemisphere were obtained to test a hypothesis of the ubiquity of photoheterotrophy in the oligotrophic ocean. We present experimental results of light-enhanced uptake of methionine, leucine and ATP by bacterioplankton during two large-scale transects of the South Atlantic. Light increased the uptake of substrates by both dominant bacterioplankton groups, Prochlorococcus and SAR11, as well as for the bulk microbial community. Our consistent experimental evidence strongly indicates that photoheterotrophy is characteristic of dominant bacterioplankton populations in the global oligotrophic ocean.

  5. Comparative Analysis of Mitochondrial Genomes in Diplura (Hexapoda, Arthropoda): Taxon Sampling Is Crucial for Phylogenetic Inferences

    PubMed Central

    Chen, Wan-Jun; Koch, Markus; Mallatt, Jon M.; Luan, Yun-Xia

    2014-01-01

    Two-pronged bristletails (Diplura) are traditionally classified into three major superfamilies: Campodeoidea, Projapygoidea, and Japygoidea. The interrelationships of these three superfamilies and the monophyly of Diplura have been much debated. Few previous studies included Projapygoidea in their phylogenetic considerations, and its position within Diplura still is a puzzle from both morphological and molecular points of view. Until now, no mitochondrial genome has been sequenced for any projapygoid species. To fill in this gap, we determined and annotated the complete mitochondrial genome of Octostigma sinensis (Octostigmatidae, Projapygoidea), and of three more dipluran species, one each from the Campodeidae, Parajapygidae, and Japygidae. All four newly sequenced dipluran mtDNAs encode the same set of genes in the same gene order as shared by most crustaceans and hexapods. Secondary structure truncations have occurred in trnR, trnC, trnS1, and trnS2, and the reduction of transfer RNA D-arms was found to be taxonomically correlated, with Campodeoidea having experienced the most reduction. Partitioned phylogenetic analyses, based on both amino acids and nucleotides of the protein-coding genes plus the ribosomal RNA genes, retrieve significant support for a monophyletic Diplura within Pancrustacea, with Projapygoidea more closely related to Campodeoidea than to Japygoidea. Another key finding is that monophyly of Diplura cannot be recovered unless Projapygoidea is included in the phylogenetic analyses; this explains the dipluran polyphyly found by past mitogenomic studies. Including Projapygoidea increased the sampling density within Diplura and probably helped by breaking up a long-branch-attraction artifact. This finding provides an example of how proper sampling is significant for phylogenetic inference. PMID:24391151

  6. Genome-wide copy number analysis using copy number inferring tool (CNIT) and DNA pooling.

    PubMed

    Lin, Chien-hsing; Huang, Mei-chu; Li, Ling-hui; Wu, Jer-yuarn; Chen, Yuan-tsong; Fann, Cathy S J

    2008-08-01

    Copy number variation (CNV) has become an important genomic structure element in the human population, and some CNVs are related to specific traits and diseases. Moreover, analysis of human genomes has been potentiated by the use of high-resolution microarrays that assess single nucleotide polymorphisms (SNPs). Although many programs have been designed to analyze data from Affymetrix SNP microarrays, they all have high false-positive rates (FPRs) in copy number (CN) analyses. Copy number analysis tool (CNAT) 4.0 is a recently developed program that offers improved CN estimation, but small amplifications and deletions are lost when using the smoothing procedure. Here, we propose a copy number inferring tool (CNIT) algorithm for the 100K SNP microarray to investigate CNVs at 29.6-kb resolution. CNIT estimated SNP allelic and total CN with reliable P values based on intensity data. In addition, the hidden Markov model (HMM) method was applied to predict regions having altered CN by considering contiguous SNPs. Based on a CN analysis of 23 unrelated Taiwanese and 30 HapMap Centre d'Etude du Polymorphisme Humain (CEPH) trios, CNIT showed higher accuracy and power than other programs. The FPR and false-negative rate (FNR) of CNIT were 0.1% and 0.16%, respectively. CNIT also showed better sensitivity for detecting small amplifications and deletions. Furthermore, DNA pools of 10 and 30 normal unrelated individuals, respectively, were applied to the 100K SNP microarray, and 12 common CN-variable regions were identified, suggesting that DNA pooling can be applied to discover common CNVs.

  7. Comparative analysis of mitochondrial genomes in Diplura (hexapoda, arthropoda): taxon sampling is crucial for phylogenetic inferences.

    PubMed

    Chen, Wan-Jun; Koch, Markus; Mallatt, Jon M; Luan, Yun-Xia

    2014-01-01

    Two-pronged bristletails (Diplura) are traditionally classified into three major superfamilies: Campodeoidea, Projapygoidea, and Japygoidea. The interrelationships of these three superfamilies and the monophyly of Diplura have been much debated. Few previous studies included Projapygoidea in their phylogenetic considerations, and its position within Diplura still is a puzzle from both morphological and molecular points of view. Until now, no mitochondrial genome has been sequenced for any projapygoid species. To fill in this gap, we determined and annotated the complete mitochondrial genome of Octostigma sinensis (Octostigmatidae, Projapygoidea), and of three more dipluran species, one each from the Campodeidae, Parajapygidae, and Japygidae. All four newly sequenced dipluran mtDNAs encode the same set of genes in the same gene order as shared by most crustaceans and hexapods. Secondary structure truncations have occurred in trnR, trnC, trnS1, and trnS2, and the reduction of transfer RNA D-arms was found to be taxonomically correlated, with Campodeoidea having experienced the most reduction. Partitioned phylogenetic analyses, based on both amino acids and nucleotides of the protein-coding genes plus the ribosomal RNA genes, retrieve significant support for a monophyletic Diplura within Pancrustacea, with Projapygoidea more closely related to Campodeoidea than to Japygoidea. Another key finding is that monophyly of Diplura cannot be recovered unless Projapygoidea is included in the phylogenetic analyses; this explains the dipluran polyphyly found by past mitogenomic studies. Including Projapygoidea increased the sampling density within Diplura and probably helped by breaking up a long-branch-attraction artifact. This finding provides an example of how proper sampling is significant for phylogenetic inference.

  8. Genomic inference accurately predicts the timing and severity of a recent bottleneck in a non-model insect population

    PubMed Central

    McCoy, Rajiv C.; Garud, Nandita R.; Kelley, Joanna L.; Boggs, Carol L.; Petrov, Dmitri A.

    2015-01-01

    The analysis of molecular data from natural populations has allowed researchers to answer diverse ecological questions that were previously intractable. In particular, ecologists are often interested in the demographic history of populations, information that is rarely available from historical records. Methods have been developed to infer demographic parameters from genomic data, but it is not well understood how inferred parameters compare to true population history or depend on aspects of experimental design. Here we present and evaluate a method of SNP discovery using RNA-sequencing and demographic inference using the program δaδi, which uses a diffusion approximation to the allele frequency spectrum to fit demographic models. We test these methods in a population of the checkerspot butterfly Euphydryas gillettii. This population was intentionally introduced to Gothic, Colorado in 1977 and has since experienced extreme fluctuations including bottlenecks of fewer than 25 adults, as documented by nearly annual field surveys. Using RNA-sequencing of eight individuals from Colorado and eight individuals from a native population in Wyoming, we generate the first genomic resources for this system. While demographic inference is commonly used to examine ancient demography, our study demonstrates that our inexpensive, all-in-one approach to marker discovery and genotyping provides sufficient data to accurately infer the timing of a recent bottleneck. This demographic scenario is relevant for many species of conservation concern, few of which have sequenced genomes. Our results are remarkably insensitive to sample size or number of genomic markers, which has important implications for applying this method to other non-model systems. PMID:24237665

  9. epiG: statistical inference and profiling of DNA methylation from whole-genome bisulfite sequencing data.

    PubMed

    Vincent, Martin; Mundbjerg, Kamilla; Skou Pedersen, Jakob; Liang, Gangning; Jones, Peter A; Ørntoft, Torben Falck; Dalsgaard Sørensen, Karina; Wiuf, Carsten

    2017-02-21

    The study of epigenetic heterogeneity at the level of individual cells and in whole populations is the key to understanding cellular differentiation, organismal development, and the evolution of cancer. We develop a statistical method, epiG, to infer and differentiate between different epi-allelic haplotypes, annotated with CpG methylation status and DNA polymorphisms, from whole-genome bisulfite sequencing data, and nucleosome occupancy from NOMe-seq data. We demonstrate the capabilities of the method by inferring allele-specific methylation and nucleosome occupancy in cell lines, and colon and tumor samples, and by benchmarking the method against independent experimental data.

  10. An application of collaborative targeted maximum likelihood estimation in causal inference and genomics.

    PubMed

    Gruber, Susan; van der Laan, Mark J

    2010-01-01

    A concrete example of the collaborative double-robust targeted likelihood estimator (C-TMLE) introduced in a companion article in this issue is presented, and applied to the estimation of causal effects and variable importance parameters in genomic data. The focus is on non-parametric estimation in a point treatment data structure. Simulations illustrate the performance of C-TMLE relative to current competitors such as the augmented inverse probability of treatment weighted estimator that relies on an external non-collaborative estimator of the treatment mechanism, and inefficient estimation procedures including propensity score matching and standard inverse probability of treatment weighting. C-TMLE is also applied to the estimation of the covariate-adjusted marginal effect of individual HIV mutations on resistance to the anti-retroviral drug lopinavir. The influence curve of the C-TMLE is used to establish asymptotically valid statistical inference. The list of mutations found to have a statistically significant association with resistance is in excellent agreement with mutation scores provided by the Stanford HIVdb mutation scores database.

  11. GIGA: a simple, efficient algorithm for gene tree inference in the genomic age

    PubMed Central

    2010-01-01

    Background: Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. Results: We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. Conclusions: GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. We compared trees produced by GIGA to those in

  12. GIGA: a simple, efficient algorithm for gene tree inference in the genomic age.

    PubMed

    Thomas, Paul D

    2010-06-09

    Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. We compared trees produced by GIGA to those in the TreeFam database, and they

  13. BACTERIOPLANKTON DYNAMICS IN A SUBTROPICAL ESTUARY: EVIDENCE FOR SUBSTRATE LIMITATION

    EPA Science Inventory

    Bacterioplankton abundance and metabolic characteristics were measured along a transect in Pensacola Bay, Florida, USA, to examine the factors that control microbial water column processes in this subtropical estuary. The microbial measures included 3H-L-leucine incorporation, e...

  15. Low-Pass Genome-Wide Sequencing and Variant Inference Using Identity-by-Descent in an Isolated Human Population

    PubMed Central

    Gusev, A.; Shah, M. J.; Kenny, E. E.; Ramachandran, A.; Lowe, J. K.; Salit, J.; Lee, C. C.; Levandowsky, E. C.; Weaver, T. N.; Doan, Q. C.; Peckham, H. E.; McLaughlin, S. F.; Lyons, M. R.; Sheth, V. N.; Stoffel, M.; De La Vega, F. M.; Friedman, J. M.; Breslow, J. L.

    2012-01-01

    Whole-genome sequencing in an isolated population with few founders directly ascertains variants from the population bottleneck that may be rare elsewhere. In such populations, shared haplotypes allow imputation of variants in unsequenced samples without resorting to complex statistical methods as in studies of outbred cohorts. We focus on an isolated population cohort from the Pacific Island of Kosrae, Micronesia, where we previously collected SNP array and rich phenotype data for the majority of the population. We report identification of long regions with haplotypes co-inherited between pairs of individuals and methodology to leverage such shared genetic content for imputation. Our estimates show that sequencing as few as 40 personal genomes allows for inference in up to 60% of the 3000-person cohort at the average locus. We ascertained a pilot data set of whole-genome sequences from seven Kosraean individuals, with average 5× coverage. This assay identified 5,735,306 unique sites of which 1,212,831 were previously unknown. Additionally, these variants are unusually enriched for alleles that are rare in other populations when compared to geographic neighbors (published Korean genome SJK). We used the presence of shared haplotypes between the seven Kosraean individuals to estimate expected imputation accuracy of known and novel homozygous variants at 99.6% and 97.3%, respectively. This study presents whole-genome analysis of a homogenous isolate population with emphasis on optimal rare variant inference. PMID:22135348

  16. Diazotrophic bacterioplankton in a coral reef lagoon: phylogeny, diel nitrogenase expression and response to phosphate enrichment.

    PubMed

    Hewson, Ian; Moisander, Pia H; Morrison, Amanda E; Zehr, Jonathan P

    2007-05-01

    We investigated diazotrophic bacterioplankton assemblage composition in the Heron Reef lagoon (Great Barrier Reef, Australia) using culture-independent techniques targeting the nifH fragment of the nitrogenase gene. Seawater was collected at 3 h intervals over a period of 72 h (i.e. over diel as well as tidal cycles). An incubation experiment was also conducted to assess the impact of phosphate (PO₄³⁻) availability on nifH expression patterns. DNA-based nifH libraries contained primarily sequences that were most similar to nifH from sediment, microbial mat and surface-associated microorganisms, with a few sequences that clustered with typical open ocean phylotypes. In contrast to genomic DNA sequences, libraries prepared from gene transcripts (mRNA amplified by reverse transcription-polymerase chain reaction) were entirely cyanobacterial and contained phylotypes similar to those observed in open ocean plankton. The abundance of Trichodesmium and two uncultured cyanobacterial phylotypes from previous studies (group A and group B) were studied by quantitative-polymerase chain reaction in the lagoon samples. These were detected as transcripts, but were not detected in genomic DNA. The gene transcript abundance of these phylotypes demonstrated variability over several diel cycles. The PO₄³⁻ enrichment experiment had a clearer pattern of gene expression over diel cycles than the lagoon sampling, however PO₄³⁻ additions did not result in enhanced transcript abundance relative to control incubations. The results suggest that a number of diazotrophs in bacterioplankton of the reef lagoon may originate from sediment, coral or beachrock surfaces, sloughing into plankton with the flooding tide. The presence of typical open ocean phylotype transcripts in lagoon bacterioplankton may indicate that they are an important component of the N cycle of the coral reef.

  17. Genome Size Variation and Species Relationships in Hieracium Sub-genus Pilosella (Asteraceae) as Inferred by Flow Cytometry

    PubMed Central

    Suda, Jan; Krahulcová, Anna; Trávníček, Pavel; Rosenbaumová, Radka; Peckert, Tomáš; Krahulec, František

    2007-01-01

    Background and Aims Hieracium sub-genus Pilosella (hawkweeds) is a taxonomically complicated group of vascular plants, the structure of which is substantially influenced by frequent interspecific hybridization and polyploidization. Two kinds of species, ‘basic’ and ‘intermediate’ (i.e. hybridogenous), are usually recognized. In this study, genome size variation was investigated in a representative set of Central European hawkweeds in order to assess the value of such a data set for species delineation and inference of evolutionary relationships. Methods Holoploid and monoploid genome sizes (C- and Cx-values) were determined using propidium iodide flow cytometry for 376 homogeneously cultivated individuals of Hieracium sub-genus Pilosella, including 24 species (271 individuals), five recent natural hybrids (seven individuals) and experimental F1 hybrids from four parental combinations (98 individuals). Chromosome counts were available for more than half of the plant accessions. Base composition (proportion of AT/GC bases) was cytometrically estimated in 73 individuals. Key Results Seven different ploidy levels (2x–8x) were detected, with intraspecific ploidy polymorphism (up to four different cytotypes) occurring in 11 wild species. Mean 2C-values varied approx. 4·3-fold from 3·53 pg in diploid H. hoppeanum to 15·30 pg in octoploid H. brachiatum. 1Cx-values ranged from 1·72 pg in H. pilosella to 2·16 pg in H. echioides (1·26-fold). The DNA content of (high) polyploids was usually proportional to the DNA values of their diploid/low polyploid counterparts, indicating lack of processes altering genome size (i.e. genome down-sizing). Most species showed constant nuclear DNA amounts, exceptions being three hybridogenous taxa, in which introgressive hybridization was suggested as a presumable trigger for genome size variation. Monoploid genome sizes of hybridogenous species were always between the corresponding values of their putative parents. In addition
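
The genome-size figures quoted in the abstract above can be cross-checked with a little arithmetic. The sketch below assumes the standard definition of the monoploid genome size, 1Cx = 2C / ploidy level; the function names are hypothetical, and the only data used are the 2C- and 1Cx-values stated in the abstract.

```python
def monoploid_size(two_c_pg: float, ploidy: int) -> float:
    """Monoploid genome size (1Cx, in pg), assuming 1Cx = 2C / ploidy level."""
    if ploidy <= 0:
        raise ValueError("ploidy level must be positive")
    return two_c_pg / ploidy


def fold_change(smallest: float, largest: float) -> float:
    """How many times larger the largest value is than the smallest."""
    return largest / smallest


if __name__ == "__main__":
    # 2C-values from the abstract: diploid H. hoppeanum, octoploid H. brachiatum.
    print(monoploid_size(3.53, 2))    # 1Cx of the diploid
    print(monoploid_size(15.30, 8))   # 1Cx of the octoploid
    print(fold_change(3.53, 15.30))   # variation in mean 2C-values
    print(fold_change(1.72, 2.16))    # variation in 1Cx-values
```

Both derived 1Cx-values fall inside the 1.72 to 2.16 pg range reported for the group, and the two fold changes round to the 4.3-fold and 1.26-fold figures given in the abstract.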

  18. Modulated Modularity Clustering as an Exploratory Tool for Functional Genomic Inference

    PubMed Central

    Stone, Eric A.; Ayroles, Julien F.

    2009-01-01

    In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and to reduce scale and improve inference, clustering genes is common. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation. PMID:19424432
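
To make the modularity objective mentioned above concrete, here is a small Python sketch that scores a given partition of a weighted, undirected graph. It assumes the commonly used Newman form, Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(c_i, c_j), which the abstract does not spell out, and it does not perform the edge-weight modulation or the maximization that define MMC. The function name `modularity` and the toy graph are illustrative only.

```python
def modularity(adj: list[list[float]], labels: list[int]) -> float:
    """Modularity Q of a partition of a weighted, undirected graph.

    adj    -- symmetric matrix of edge weights (assumed input format)
    labels -- community label of each vertex
    """
    n = len(adj)
    strength = [sum(row) for row in adj]  # weighted degree k_i of each vertex
    two_m = sum(strength)                 # twice the total edge weight
    if two_m == 0:
        raise ValueError("graph has no edges")
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # Observed weight minus the weight expected by chance.
                q += adj[i][j] - strength[i] * strength[j] / two_m
    return q / two_m


if __name__ == "__main__":
    # Toy graph: two disconnected edges, 0-1 and 2-3.
    adj = [
        [0, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0],
    ]
    print(modularity(adj, [0, 0, 1, 1]))  # each edge in its own community
    print(modularity(adj, [0, 0, 0, 0]))  # everything lumped together
```

A clustering method built on this score would search over partitions (and, in MMC, over edge-weight modulations) for the highest Q; the sketch only evaluates one candidate partition.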

  19. Genome Alignment Spanning Major Poaceae Lineages Reveals Heterogeneous Evolutionary Rates and Alters Inferred Dates for Key Evolutionary Events.

    PubMed

    Wang, Xiyin; Wang, Jingpeng; Jin, Dianchuan; Guo, Hui; Lee, Tae-Ho; Liu, Tao; Paterson, Andrew H

    2015-06-01

    Multiple comparisons among genomes can clarify their evolution, speciation, and functional innovations. To date, the genome sequences of eight grasses representing the most economically important Poaceae (grass) clades have been published, and their genomic-level comparison is an essential foundation for evolutionary, functional, and translational research. Using a formal and conservative approach, we aligned these genomes. Direct comparison of paralogous gene pairs all duplicated simultaneously reveals striking variation in evolutionary rates among whole genomes, with nucleotide substitution slowest in rice and up to 48% faster in other grasses, adding a new dimension to the value of rice as a grass model. We reconstructed ancestral genome contents for major evolutionary nodes, potentially contributing to understanding the divergence and speciation of grasses. Recent fossil evidence suggests revisions of the estimated dates of key evolutionary events, implying that the pan-grass polyploidization occurred ∼96 million years ago and could not be related to the Cretaceous-Tertiary mass extinction as previously inferred. Adjusted dating to reflect both updated fossil evidence and lineage-specific evolutionary rates suggested that maize subgenome divergence and maize-sorghum divergence were virtually simultaneous, a coincidence that would be explained if polyploidization directly contributed to speciation. This work lays a solid foundation for Poaceae translational genomics. Copyright © 2015 The Author. Published by Elsevier Inc. All rights reserved.

  20. Assignment of homoeologs to parental genomes in allopolyploids for species tree inference, with an example from Fumaria (Papaveraceae).

    PubMed

    Bertrand, Yann J K; Scheen, Anne-Cathrine; Marcussen, Thomas; Pfeil, Bernard E; de Sousa, Filipe; Oxelman, Bengt

    2015-05-01

    There is a rising awareness that species trees are best inferred from multiple loci while taking into account processes affecting individual gene trees, such as substitution model error (failure of the model to account for the complexity of the data) and coalescent stochasticity (presence of incomplete lineage sorting [ILS]). Although most studies have been carried out in the context of dichotomous species trees, these processes operate also in more complex evolutionary histories involving multiple hybridizations and polyploidy. Recently, methods have been developed that accurately handle ILS in allopolyploids, but they are thus far restricted to networks of diploids and tetraploids. We propose a procedure that improves on this limitation by designing a workflow that assigns homoeologs to hypothetical diploid ancestral genomes prior to genome tree construction. Conflicting assignment hypotheses are evaluated against substitution model error and coalescent stochasticity. Incongruence that cannot be explained by stochastic mechanisms needs to be explained by other processes (e.g., homoploid hybridization or paralogy). The data can then be filtered to build multilabeled genome phylogenies using inference methods that can recover species trees, either in the face of substitution model error and coalescent stochasticity alone, or while simultaneously accounting for hybridization. Methods are already available for folding the resulting multilabeled genome phylogeny into a network. We apply the workflow to the reconstruction of the reticulate phylogeny of the plant genus Fumaria (Papaveraceae) with ploidal levels ranging from 2x to 14x. We describe the challenges in recovering nuclear NRPB2 homoeologs in high ploidy species while combining in vivo cloning and direct sequencing techniques. Using parametric bootstrapping simulations we assign nuclear homoeologs and chloroplast sequences (four concatenated loci) to their common

  1. Systematic Inference of Copy-Number Genotypes from Personal Genome Sequencing Data Reveals Extensive Olfactory Receptor Gene Content Diversity

    PubMed Central

    Waszak, Sebastian M.; Hasin, Yehudit; Zichner, Thomas; Olender, Tsviya; Keydar, Ifat; Khen, Miriam; Stütz, Adrian M.; Schlattl, Andreas; Lancet, Doron; Korbel, Jan O.

    2010-01-01

    Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low-coverage. The assessed copy-number genotypes were highly concordant with our performed qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95–99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ∼15% and ∼20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high
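
The core intuition behind depth-of-coverage genotyping in the abstract above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not CopySeq: it ignores the statistical framework, GC and mappability effects, and the paired-end and breakpoint evidence the abstract mentions. It assumes that read depth scales linearly with copy number and that the average depth over two-copy (diploid) regions is known; `copy_number` and both parameter names are hypothetical.

```python
def copy_number(locus_depth: float, diploid_depth: float) -> int:
    """Nearest integer copy number implied by read depth at a locus.

    locus_depth   -- mean read depth observed at the locus
    diploid_depth -- mean read depth over regions assumed to have two copies
    """
    if diploid_depth <= 0:
        raise ValueError("diploid_depth must be positive")
    if locus_depth < 0:
        raise ValueError("locus_depth cannot be negative")
    # Two copies correspond to diploid_depth, so scale and round half up.
    return int(2 * locus_depth / diploid_depth + 0.5)


if __name__ == "__main__":
    # Made-up depths for a genome sequenced so that two-copy regions average 30x.
    for depth in (0, 15, 30, 46):
        print(depth, copy_number(depth, 30))
```

At the low coverage described in the abstract, a single locus depth is noisy, which is one reason a statistical model is needed in practice rather than plain rounding.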

  2. A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes

    PubMed Central

    Csuros, Miklos; Rogozin, Igor B.; Koonin, Eugene V.

    2011-01-01

    Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing. PMID:21935348

  3. A detailed history of intron-rich eukaryotic ancestors inferred from a global survey of 100 complete genomes.

    PubMed

    Csuros, Miklos; Rogozin, Igor B; Koonin, Eugene V

    2011-09-01

    Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6-7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing.

  4. Annual patterns in bacterioplankton community variability in a humic lake.

    PubMed

    Kent, A D; Jones, S E; Yannarell, A C; Graham, J M; Lauster, G H; Kratz, T K; Triplett, E W

    2004-11-01

    Bacterioplankton community composition (BCC) was monitored in a shallow humic lake in northern Wisconsin, USA, over 3 years using automated ribosomal intergenic spacer analysis (ARISA). Comparison of ARISA profiles of bacterial communities over time indicated that BCC was highly variable on a seasonal and annual scale. Nonmetric multidimensional scaling (MDS) analysis indicated little similarity in BCC from year to year. Nevertheless, annual patterns in bacterioplankton community diversity were observed. Trends in bacterioplankton community diversity were correlated to annual patterns in community succession observed for phytoplankton and zooplankton populations, consistent with the notion that food web interactions affect bacterioplankton community structure in this humic lake. Bacterioplankton communities experience a dramatic drop in richness and abundance each year in early summer, concurrent with an increase in the abundance of both mixotrophic and heterotrophic flagellates. A second drop in richness, but not abundance, is observed each year in late summer, coinciding with an intense bloom of the nonphagotrophic dinoflagellate Peridinium limbatum. A relationship between bacterial community composition, size, and abundance and the population dynamics of Daphnia was also observed. The noted synchrony between these major population and species shifts suggests that linkages across trophic levels play a role in determining the annual time course of events for the microbial and metazoan components of the plankton.

  5. Biogeography of bacterioplankton in the tropical seawaters of Singapore.

    PubMed

    Lau, Stanley C K; Zhang, Rui; Brodie, Eoin L; Piceno, Yvette M; Andersen, Gary; Liu, Wen-Tso

    2013-05-01

    Knowledge about the biogeography of marine bacterioplankton on the global scale in general and in Southeast Asia in particular has been scarce. This study investigated the biogeography of the bacterioplankton community in Singapore seawaters. Twelve stations around Singapore island were sampled on different schedules over 1 year. Using PCR-DNA fingerprinting, DNA cloning and sequencing, and microarray hybridization of the 16S rRNA genes, we observed clear spatial variations of bacterioplankton diversity within the small area of the Singapore seas. Water samples collected from the Singapore Strait (south) throughout the year were dominated by DNA sequences affiliated with Cyanobacteria and Alphaproteobacteria that were believed to be associated with the influx of water from the open seas in Southeast Asia. In contrast, water samples from the relatively polluted Johor Strait (north) were dominated by Betaproteobacteria, Gammaproteobacteria, and Bacteroidetes that were presumably associated with river discharge and the relatively eutrophic conditions of the waterway. Bacterioplankton diversity was temporally stable, except for the episodic surge of Pseudoalteromonas, associated with algal blooms. Overall, these results provide valuable insights into the diversity of bacterioplankton communities in Singapore seas and the possible influences of hydrological conditions and anthropogenic activities on the dynamics of the communities.

  6. Comparative population genomics: power and principles for the inference of functionality

    PubMed Central

    Lawrie, David S.; Petrov, Dmitri A.

    2014-01-01

    The availability of sequenced genomes from multiple related organisms allows the detection and localization of functional genomic elements based on the idea that such elements evolve more slowly than neutral sequences. Although such comparative genomics methods have proven useful in discovering functional elements and ascertaining levels of functional constraint in the genome as a whole, here we outline limitations intrinsic to this approach that cannot be overcome by sequencing more species. We argue that it is essential to supplement comparative genomics with ultra-deep sampling of populations from closely related species to enable substantially more powerful genomic scans for functional elements. The convergence of sequencing technology and population genetics theory has made such projects feasible and has exciting implications for functional genomics. PMID:24656563

  7. Comparative population genomics: power and principles for the inference of functionality.

    PubMed

    Lawrie, David S; Petrov, Dmitri A

    2014-04-01

    The availability of sequenced genomes from multiple related organisms allows the detection and localization of functional genomic elements based on the idea that such elements evolve more slowly than neutral sequences. Although such comparative genomics methods have proven useful in discovering functional elements and ascertaining levels of functional constraint in the genome as a whole, here we outline limitations intrinsic to this approach that cannot be overcome by sequencing more species. We argue that it is essential to supplement comparative genomics with ultra-deep sampling of populations from closely related species to enable substantially more powerful genomic scans for functional elements. The convergence of sequencing technology and population genetics theory has made such projects feasible and has exciting implications for functional genomics. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. The phylogenomic position of the grey nurse shark Carcharias taurus Rafinesque, 1810 (Lamniformes, Odontaspididae) inferred from the mitochondrial genome.

    PubMed

    Bowden, Deborah L; Vargas-Caro, Carolina; Ovenden, Jennifer R; Bennett, Michael B; Bustamante, Carlos

    2016-11-01

    The complete mitochondrial genome of the grey nurse shark Carcharias taurus is described from 25 963 828 sequences obtained using Illumina NGS technology. Total length of the mitogenome is 16 715 bp, consisting of 2 rRNAs, 13 protein-coding regions, 22 tRNA and 2 non-coding regions thus updating the previously published mitogenome for this species. The phylogenomic reconstruction inferred from the mitogenome of 15 species of Lamniform and Carcharhiniform sharks supports the inclusion of C. taurus in a clade with the Lamnidae and Cetorhinidae. This complete mitogenome contributes to ongoing investigation into the monophyly of the Family Odontaspididae.

  9. Stream Hydrological Fragmentation Drives Bacterioplankton Community Composition

    PubMed Central

    Fazi, Stefano; Vázquez, Eusebi; Casamayor, Emilio O.; Amalfitano, Stefano; Butturini, Andrea

    2013-01-01

    In Mediterranean intermittent streams, the hydrological fragmentation in summer and the successive water flow re-convergence in autumn allow exploring how local processes shape the microbial community within the same habitat. The objectives of this study were to determine how bacterial community composition responded to hydrological fragmentation in summer, and to evaluate whether the seasonal shifts in community composition predominate over the effects of episodic habitat fragmentation. The bacterial community was assessed along the intermittent stream Fuirosos (Spain), at different levels of phylogenetic resolution by in situ hybridization, fingerprinting, and 16S rRNA gene sequencing. The hydrological fragmentation of the stream network strongly altered the biogeochemical conditions with the depletion of oxidized solutes and caused changes in dissolved organic carbon characteristics. In the isolated ponds, beta-Proteobacteria and Actinobacteria increased their abundance with a gradual reduction of the alpha-diversity as pond isolation time increased. Moreover, fingerprinting analysis clearly showed a shift in community composition between summer and autumn. In the context of a seasonal shift, the temporary stream fragmentation simultaneously reduced the microbial dispersion and affected local environmental conditions (shift in redox regime and quality of the dissolved organic matter) tightly shaping the bacterioplankton community composition. PMID:23741302

  10. Stream hydrological fragmentation drives bacterioplankton community composition.

    PubMed

    Fazi, Stefano; Vázquez, Eusebi; Casamayor, Emilio O; Amalfitano, Stefano; Butturini, Andrea

    2013-01-01

    In Mediterranean intermittent streams, the hydrological fragmentation in summer and the successive water flow re-convergence in autumn allow exploring how local processes shape the microbial community within the same habitat. The objectives of this study were to determine how bacterial community composition responded to hydrological fragmentation in summer, and to evaluate whether the seasonal shifts in community composition predominate over the effects of episodic habitat fragmentation. The bacterial community was assessed along the intermittent stream Fuirosos (Spain), at different levels of phylogenetic resolution by in situ hybridization, fingerprinting, and 16S rRNA gene sequencing. The hydrological fragmentation of the stream network strongly altered the biogeochemical conditions with the depletion of oxidized solutes and caused changes in dissolved organic carbon characteristics. In the isolated ponds, beta-Proteobacteria and Actinobacteria increased their abundance with a gradual reduction of the alpha-diversity as pond isolation time increased. Moreover, fingerprinting analysis clearly showed a shift in community composition between summer and autumn. In the context of a seasonal shift, the temporary stream fragmentation simultaneously reduced the microbial dispersion and affected local environmental conditions (shift in redox regime and quality of the dissolved organic matter) tightly shaping the bacterioplankton community composition.

  11. Demographic Divergence History of Pied Flycatcher and Collared Flycatcher Inferred from Whole-Genome Re-sequencing Data

    PubMed Central

    Nadachowska-Brzyska, Krystyna; Burri, Reto; Olason, Pall I.; Kawakami, Takeshi; Smeds, Linnéa; Ellegren, Hans

    2013-01-01

    Profound knowledge of demographic history is a prerequisite for the understanding and inference of processes involved in the evolution of population differentiation and speciation. Together with new coalescent-based methods, the recent availability of genome-wide data enables investigation of differentiation and divergence processes at unprecedented depth. We combined two powerful approaches, full Approximate Bayesian Computation analysis (ABC) and pairwise sequentially Markovian coalescent modeling (PSMC), to reconstruct the demographic history of the split between two avian speciation model species, the pied flycatcher and collared flycatcher. Using whole-genome re-sequencing data from 20 individuals, we investigated 15 demographic models including different levels and patterns of gene flow, and changes in effective population size over time. ABC provided high support for recent (mode 0.3 my, range <0.7 my) species divergence, declines in effective population size of both species since their initial divergence, and unidirectional recent gene flow from pied flycatcher into collared flycatcher. The estimated divergence time and population size changes, supported by PSMC results, suggest that the ancestral species persisted through one of the glacial periods of middle Pleistocene and then split into two large populations that first increased in size before going through severe bottlenecks and expanding into their current ranges. Secondary contact appears to have been established after the last glacial maximum. The severity of the bottlenecks at the last glacial maximum is indicated by the discrepancy between current effective population sizes (20,000–80,000) and census sizes (5–50 million birds) of the two species. The recent divergence time challenges the supposition that avian speciation is a relatively slow process with extended times for intrinsic postzygotic reproductive barriers to evolve. Our study emphasizes the importance of using genome-wide data to

  12. Inferring Selective Constraint from Population Genomic Data Suggests Recent Regulatory Turnover in the Human Brain

    PubMed Central

    Schrider, Daniel R.; Kern, Andrew D.

    2015-01-01

    The comparative genomics revolution of the past decade has enabled the discovery of functional elements in the human genome via sequence comparison. While that is so, an important class of elements, those specific to humans, is entirely missed by searching for sequence conservation across species. Here we present an analysis based on variation data among human genomes that utilizes a supervised machine learning approach for the identification of human-specific purifying selection in the genome. Using only allele frequency information from the complete low-coverage 1000 Genomes Project data set in conjunction with a support vector machine trained from known functional and nonfunctional portions of the genome, we are able to accurately identify portions of the genome constrained by purifying selection. Our method identifies previously known human-specific gains or losses of function and uncovers many novel candidates. Candidate targets for gain and loss of function along the human lineage include numerous putative regulatory regions of genes essential for normal development of the central nervous system, including a significant enrichment of gain of function events near neurotransmitter receptor genes. These results are consistent with regulatory turnover being a key mechanism in the evolution of human-specific characteristics of brain development. Finally, we show that the majority of the genome is unconstrained by natural selection currently, in agreement with what has been estimated from phylogenetic methods but in sharp contrast to estimates based on transcriptomics or other high-throughput functional methods. PMID:26590212

  13. Reference set of regulons in Desulfovibrionales inferred by comparative genomics approach

    SciTech Connect

    Kazakov, A.E.; Rodionov, D.A.; Price, M.N.; Arkin, A.P.; Dubchak, I.; Novichkov, P.S.

    2010-11-15

    In this study, we carried out a large-scale comparative genomics analysis of regulatory interactions in Desulfovibrio vulgaris and 12 related genomes from the order Desulfovibrionales using our recently developed web server RegPredict (http://regpredict.lbl.gov). An overall reference collection of 26 Desulfovibrionales regulogs can be accessed through the RegPrecise database (http://regprecise.lbl.gov).

  14. Predicting Protein Function by Genomic Context: Quantitative Evaluation and Qualitative Inferences

    PubMed Central

    Huynen, Martijn; Snel, Berend; Lathe, Warren; Bork, Peer

    2000-01-01

    Various new methods have been proposed to predict functional interactions between proteins based on the genomic context of their genes. The types of genomic context that they use are Type I: the fusion of genes; Type II: the conservation of gene-order or co-occurrence of genes in potential operons; and Type III: the co-occurrence of genes across genomes (phylogenetic profiles). Here we compare these types for their coverage, their correlations with various types of functional interaction, and their overlap with homology-based function assignment. We apply the methods to Mycoplasma genitalium, the standard benchmarking genome in computational and experimental genomics. Quantitatively, conservation of gene order is the technique with the highest coverage, applying to 37% of the genes. By combining gene order conservation with gene fusion (6%), the co-occurrence of genes in operons in absence of gene order conservation (8%), and the co-occurrence of genes across genomes (11%), significant context information can be obtained for 50% of the genes (the categories overlap). Qualitatively, we observe that the functional interactions between genes are stronger as the requirements for physical neighborhood on the genome are more stringent, while the fraction of potential false positives decreases. Moreover, only in cases in which gene order is conserved in a substantial fraction of the genomes, in this case six out of twenty-five, does a single type of functional interaction (physical interaction) clearly dominate (>80%). In other cases, complementary function information from homology searches, which is available for most of the genes with significant genomic context, is essential to predict the type of interaction. Using a combination of genomic context and homology searches, new functional features can be predicted for 10% of M. genitalium genes. PMID:10958638
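
    The note that the categories overlap explains why the per-method figures (37%, 6%, 8% and 11%) do not add up to the reported 50%. A minimal sketch of that point in Python follows; the gene identifiers and set contents are invented for illustration and are not data from the study. Combined coverage is the size of the union of the gene sets, not the sum of their sizes.

```python
# Hypothetical toy data: genes with significant context under each method.
# These sets are invented for illustration; they are not from the study.
context = {
    "gene_order": {"g1", "g2", "g3", "g4"},
    "gene_fusion": {"g2", "g5"},
    "operon_cooccurrence": {"g4", "g6"},
    "phylogenetic_profile": {"g1", "g6", "g7"},
}
total_genes = 10  # assumed genome size for the toy example


def coverage(gene_sets, n_genes):
    """Fraction of genes covered by at least one method (set union)."""
    covered = set().union(*gene_sets)
    return len(covered) / n_genes


# Adding per-method coverage double-counts genes found by several methods.
naive_sum = sum(len(s) for s in context.values()) / total_genes
combined = coverage(context.values(), total_genes)

print(f"sum of per-method coverage: {naive_sum:.0%}")
print(f"combined coverage (union):  {combined:.0%}")
```

    In this toy case the naive sum exceeds the union because g1, g2, g4 and g6 are each detected by two methods.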

  15. High-level phylogeny of the Coleoptera inferred with mitochondrial genome sequences.

    PubMed

    Yuan, Ming-Long; Zhang, Qi-Lin; Zhang, Li; Guo, Zhong-Long; Liu, Yong-Jian; Shen, Yu-Ying; Shao, Renfu

    2016-11-01

    The Coleoptera (beetles) exhibit tremendous morphological, ecological, and behavioral diversity. To better understand the phylogenetics and evolution of beetles, we sequenced three complete mitogenomes from two families (Cleridae and Meloidae), which share conserved mitogenomic features with other completely sequenced beetles. We assessed the influence of six datasets and three inference methods on topology and nodal support within the Coleoptera. We found that both Bayesian inference and maximum likelihood with homogeneous-site models were greatly affected by nucleotide compositional heterogeneity, while the heterogeneous-site mixture model in PhyloBayes could provide better phylogenetic signals for the Coleoptera. The amino acid dataset generated more reliable tree topology at the higher taxonomic levels (i.e. suborders and series), whereas the inclusion of rRNA genes and the third positions of protein-coding genes improved phylogenetic inference at the superfamily level, especially under a heterogeneous-site model. We recovered the suborder relationships as (Archostemata+Adephaga)+(Myxophaga+Polyphaga). The series relationships within Polyphaga were recovered as (Scirtiformia+(Elateriformia+((Bostrichiformia+Scarabaeiformia+Staphyliniformia)+Cucujiformia))). All superfamilies within Cucujiformia were recovered as monophyletic. We obtained a cucujiform phylogeny of (Cleroidea+(Coccinelloidea+((Lymexyloidea+Tenebrionoidea)+(Cucujoidea+(Chrysomeloidea+Curculionoidea))))). This study showed that although tree topologies were sensitive to data types and inference methods, mitogenomic data could provide useful information for resolving the Coleoptera phylogeny at various taxonomic levels by using suitable datasets and heterogeneous-site models.

  16. Inferring Speciation Processes from Patterns of Natural Variation in Microbial Genomes

    PubMed Central

    Krause, David J.; Whitaker, Rachel J.

    2015-01-01

    Microbial species concepts have long been the focus of contentious debate, fueled by technological limitations to the genetic resolution of species, by the daunting task of investigating phenotypic variation among individual microscopic organisms, and by a lack of understanding of gene flow in reproductively asexual organisms that are prone to promiscuous horizontal gene transfer. Population genomics, the emerging approach of analyzing the complete genomes of a multitude of closely related organisms, is poised to overcome these limitations by providing a window into patterns of genome variation revealing the evolutionary processes through which species diverge. This new approach is more than just an extension of previous multilocus sequencing technologies, in that it provides a comprehensive view of interacting evolutionary processes. Here we argue that the application of population genomic tools in a rigorous population genetic framework will help to identify the processes of microbial speciation and ultimately lead to a general species concept based on the unique biology and ecology of microorganisms. PMID:26316424

  17. Demographic History of the Genus Pan Inferred from Whole Mitochondrial Genome Reconstructions.

    PubMed

    Lobon, Irene; Tucci, Serena; de Manuel, Marc; Ghirotto, Silvia; Benazzo, Andrea; Prado-Martinez, Javier; Lorente-Galdos, Belen; Nam, Kiwoong; Dabad, Marc; Hernandez-Rodriguez, Jessica; Comas, David; Navarro, Arcadi; Schierup, Mikkel H; Andres, Aida M; Barbujani, Guido; Hvilsom, Christina; Marques-Bonet, Tomas

    2016-07-03

    The genus Pan is the closest genus to our own and it includes two species, Pan paniscus (bonobos) and Pan troglodytes (chimpanzees). The latter is constituted by four subspecies, all highly endangered. The study of the genus Pan has been persistently complicated by the intricate relationship among subspecies and the statistical limitations imposed by the reduced number of samples or genomic markers analyzed. Here, we present a new method to reconstruct complete mitochondrial genomes (mitogenomes) from whole genome shotgun (WGS) datasets, mtArchitect, showing that its reconstructions are highly accurate and consistent with long-range PCR mitogenomes. We used this approach to build the mitochondrial genomes of 20 newly sequenced samples which, together with available genomes, allowed us to analyze the hitherto most complete Pan mitochondrial genome dataset including 156 chimpanzee and 44 bonobo individuals, with a proportional contribution from all chimpanzee subspecies. We estimated the separation time between chimpanzees and bonobos around 1.15 million years ago (Mya) [0.81-1.49]. Further, we found that under the most probable genealogical model the two clades of chimpanzees, Western + Nigeria-Cameroon and Central + Eastern, separated at 0.59 Mya [0.41-0.78] with further internal separations at 0.32 Mya [0.22-0.43] and 0.16 Mya [0.17-0.34], respectively. Finally, for a subset of our samples, we compared nuclear versus mitochondrial genomes and we found that chimpanzee subspecies have different patterns of nuclear and mitochondrial diversity, which could be a result of either processes affecting the mitochondrial genome, such as hitchhiking or background selection, or a result of population dynamics. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  18. Inferring Selective Constraint from Population Genomic Data Suggests Recent Regulatory Turnover in the Human Brain.

    PubMed

    Schrider, Daniel R; Kern, Andrew D

    2015-11-19

    The comparative genomics revolution of the past decade has enabled the discovery of functional elements in the human genome via sequence comparison. However, an important class of elements, those specific to humans, is entirely missed by searching for sequence conservation across species. Here we present an analysis based on variation data among human genomes that utilizes a supervised machine learning approach for the identification of human-specific purifying selection in the genome. Using only allele frequency information from the complete low-coverage 1000 Genomes Project data set in conjunction with a support vector machine trained from known functional and nonfunctional portions of the genome, we are able to accurately identify portions of the genome constrained by purifying selection. Our method identifies previously known human-specific gains or losses of function and uncovers many novel candidates. Candidate targets for gain and loss of function along the human lineage include numerous putative regulatory regions of genes essential for normal development of the central nervous system, including a significant enrichment of gain of function events near neurotransmitter receptor genes. These results are consistent with regulatory turnover being a key mechanism in the evolution of human-specific characteristics of brain development. Finally, we show that the majority of the genome is unconstrained by natural selection currently, in agreement with what has been estimated from phylogenetic methods but in sharp contrast to estimates based on transcriptomics or other high-throughput functional methods. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  19. Demographic History of the Genus Pan Inferred from Whole Mitochondrial Genome Reconstructions

    PubMed Central

    Tucci, Serena; de Manuel, Marc; Ghirotto, Silvia; Benazzo, Andrea; Prado-Martinez, Javier; Lorente-Galdos, Belen; Nam, Kiwoong; Dabad, Marc; Hernandez-Rodriguez, Jessica; Comas, David; Navarro, Arcadi; Schierup, Mikkel H.; Andres, Aida M.; Barbujani, Guido; Hvilsom, Christina; Marques-Bonet, Tomas

    2016-01-01

    The genus Pan is the closest genus to our own and it includes two species, Pan paniscus (bonobos) and Pan troglodytes (chimpanzees). The latter is constituted by four subspecies, all highly endangered. The study of the genus Pan has been persistently complicated by the intricate relationship among subspecies and the statistical limitations imposed by the reduced number of samples or genomic markers analyzed. Here, we present a new method to reconstruct complete mitochondrial genomes (mitogenomes) from whole genome shotgun (WGS) datasets, mtArchitect, showing that its reconstructions are highly accurate and consistent with long-range PCR mitogenomes. We used this approach to build the mitochondrial genomes of 20 newly sequenced samples which, together with available genomes, allowed us to analyze the hitherto most complete Pan mitochondrial genome dataset including 156 chimpanzee and 44 bonobo individuals, with a proportional contribution from all chimpanzee subspecies. We estimated the separation time between chimpanzees and bonobos around 1.15 million years ago (Mya) [0.81–1.49]. Further, we found that under the most probable genealogical model the two clades of chimpanzees, Western + Nigeria-Cameroon and Central + Eastern, separated at 0.59 Mya [0.41–0.78] with further internal separations at 0.32 Mya [0.22–0.43] and 0.16 Mya [0.17–0.34], respectively. Finally, for a subset of our samples, we compared nuclear versus mitochondrial genomes and we found that chimpanzee subspecies have different patterns of nuclear and mitochondrial diversity, which could be a result of either processes affecting the mitochondrial genome, such as hitchhiking or background selection, or a result of population dynamics. PMID:27345955

  20. Adaptation, Ecology, and Evolution of the Halophilic Stromatolite Archaeon Halococcus hamelinensis Inferred through Genome Analyses

    PubMed Central

    Gudhka, Reema K.; Neilan, Brett A.; Burns, Brendan P.

    2015-01-01

    Halococcus hamelinensis was the first archaeon isolated from stromatolites. These geomicrobial ecosystems are thought to be some of the earliest known on Earth, yet, despite their evolutionary significance, the role of Archaea in these systems is still not well understood. Detailed here is the genome sequencing and analysis of an archaeon isolated from stromatolites. The genome of H. hamelinensis consisted of 3,133,046 base pairs with an average G+C content of 60.08% and contained 3,150 predicted coding sequences or ORFs, 2,196 (68.67%) of which were protein-coding genes with functional assignments and 954 (29.83%) of which were of unknown function. Codon usage of the H. hamelinensis genome was consistent with a highly acidic proteome, a major adaptive mechanism towards high salinity. Amino acid transport and metabolism, inorganic ion transport and metabolism, energy production and conversion, ribosomal structure, and unknown function COG genes were overrepresented. The genome of H. hamelinensis also revealed characteristics reflecting its survival in its extreme environment, including putative genes/pathways involved in osmoprotection, oxidative stress response, and UV damage repair. Finally, genome analyses indicated the presence of putative transposases as well as positive matches of genes of H. hamelinensis against various genomes of Bacteria, Archaea, and viruses, suggesting the potential for horizontal gene transfer. PMID:25709556

  1. Comparative genomics of four Liliales families inferred from the complete chloroplast genome sequence of Veratrum patulum O. Loes. (Melanthiaceae).

    PubMed

    Do, Hoang Dang Khoa; Kim, Jung Sung; Kim, Joo-Hwan

    2013-11-10

    The sequence of the chloroplast genome, which is inherited maternally, contains useful information for many scientific fields such as plant systematics, biogeography and biotechnology because its characteristics are highly conserved among species. The number of sequenced angiosperm chloroplast genomes has increased in recent years. In this study, the nucleotide sequence of the chloroplast genome (cpDNA) of Veratrum patulum Loes. (Melanthiaceae, Liliales) was analyzed completely. The circular double-stranded DNA of 153,699 bp consists of two inverted repeat (IR) regions of 26,360 bp each, a large single copy of 83,372 bp, and a small single copy of 17,607 bp. This plastome contains 81 protein-coding genes, 30 distinct tRNA genes and four rRNA genes. In addition, there are six hypothetical coding regions (ycf1, ycf2, ycf3, ycf4, ycf15 and ycf68) and two open reading frames (ORF42 and ORF56), which are also found in the chloroplast genomes of the other species. The gene order and gene content of the V. patulum plastid genome are similar to those of Smilax china, Lilium longiflorum and Alstroemeria aurea, members of the Smilacaceae, Liliaceae and Alstroemeriaceae (Liliales), respectively. However, the loss of rps16 exon 2 in V. patulum results in a difference in the large single copy region in comparison with other species. The base substitution rate is quite similar among genes of these species. Additionally, the base substitution rate of the inverted repeat region was smaller than that of the single copy regions in all observed species of Liliales. The IR regions were expanded to trnH_GUG in V. patulum, a part of rps19 in L. longiflorum and A. aurea, and the whole sequence of rps19 in S. china. Furthermore, the IGS lengths of the rbcL-accD-psaI region were variable among Liliales species, suggesting that this region might be a hotspot of indel events and an informative site for phylogenetic studies in Liliales. In general, the whole chloroplast genome of V. patulum, a
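
    The region lengths quoted in this abstract can be checked against the stated genome size, since a quadripartite plastome is the large single copy, the small single copy and two copies of the inverted repeat. A short Python sketch of that arithmetic, using only the figures given above (the variable names are illustrative):

```python
# Region lengths in bp, taken from the abstract above.
LSC = 83_372  # large single copy region
SSC = 17_607  # small single copy region
IR = 26_360   # one inverted repeat; the plastome carries two copies

# Quadripartite structure: LSC + SSC + 2 x IR.
total = LSC + SSC + 2 * IR
ir_fraction = 2 * IR / total  # share of the genome occupied by both IR copies

print(f"total length: {total:,} bp")
print(f"IR share of genome: {ir_fraction:.1%}")
```

    The sum reproduces the reported 153,699 bp, so the four region lengths are internally consistent.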

  2. Disentangling seasonal bacterioplankton population dynamics by high-frequency sampling.

    PubMed

    Lindh, Markus V; Sjöstedt, Johanna; Andersson, Anders F; Baltar, Federico; Hugerth, Luisa W; Lundin, Daniel; Muthusamy, Saraladevi; Legrand, Catherine; Pinhassi, Jarone

    2015-07-01

    Multiyear comparisons of bacterioplankton succession reveal that environmental conditions drive community shifts with repeatable patterns between years. However, corresponding insight into bacterioplankton dynamics at a temporal resolution relevant for detailed examination of variation and characteristics of specific populations within years is essentially lacking. During 1 year, we collected 46 samples in the Baltic Sea for assessing bacterial community composition by 16S rRNA gene pyrosequencing (nearly twice weekly during productive season). Beta-diversity analysis showed distinct clustering of samples, attributable to seemingly synchronous temporal transitions among populations (populations defined by 97% 16S rRNA gene sequence identity). A wide spectrum of bacterioplankton dynamics was evident, where divergent temporal patterns resulted both from pronounced differences in relative abundance and presence/absence of populations. Rates of change in relative abundance calculated for individual populations ranged from 0.23 to 1.79 day−1. Populations that were persistently dominant, transiently abundant or generally rare were found in several major bacterial groups, implying evolution has favoured a similar variety of life strategies within these groups. These findings suggest that high temporal resolution sampling allows constraining the timescales and frequencies at which distinct populations transition between being abundant or rare, thus potentially providing clues about physical, chemical or biological forcing on bacterioplankton community structure. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.

  3. The evolutionary history of Saccharomyces species inferred from completed mitochondrial genomes and revision in the 'yeast mitochondrial genetic code'.

    PubMed

    Sulo, Pavol; Szabóová, Dana; Bielik, Peter; Poláková, Silvia; Šoltys, Katarína; Jatzová, Katarína; Szemes, Tomáš

    2017-06-15

    The yeast Saccharomyces are widely used to test ecological and evolutionary hypotheses. A large number of nuclear genomic DNA sequences are available, but mitochondrial genomic data are insufficient. We completed mitochondrial DNA (mtDNA) sequencing from Illumina MiSeq reads for all Saccharomyces species. All are circularly mapped molecules decreasing in size with phylogenetic distance from Saccharomyces cerevisiae but with similar gene content including regulatory and selfish elements like origins of replication, introns, free-standing open reading frames or GC clusters. Their most profound feature is species-specific alteration in gene order. The genetic code slightly differs from well-established yeast mitochondrial code as GUG is used rarely as the translation start and CGA and CGC code for arginine. The multilocus phylogeny, inferred from mtDNA, does not correlate with the trees derived from nuclear genes. mtDNA data demonstrate that Saccharomyces cariocanus should be assigned as a separate species and Saccharomyces bayanus CBS 380T should not be considered as a distinct species due to mtDNA nearly identical to Saccharomyces uvarum mtDNA. Apparently, comparison of mtDNAs should not be neglected in genomic studies as it is an important tool to understand the origin and evolutionary history of some yeast species. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  4. Inferences of drug responses in cancer cells from cancer genomic features and compound chemical and therapeutic properties

    PubMed Central

    Wang, Yongcui; Fang, Jianwen; Chen, Shilong

    2016-01-01

    Accurately predicting the response of a cancer patient to a therapeutic agent is a core goal of precision medicine. Existing approaches have relied primarily on genomic alterations in cancer cells that have been treated with different drugs. Here we focus on predicting drug response based on integration of heterogeneous pharmacogenomics data from both the cell and drug sides. Through a systematic approach, named PDRCC (Predict Drug Response in Cancer Cells), the cancer genomic alterations and compound chemical and therapeutic properties were incorporated to determine the chemotherapeutic response in cancer patients. Using the Cancer Cell Line Encyclopedia (CCLE) study as the benchmark dataset, all pharmacogenomics data exhibited their roles in inferring the relationships between cancer cells and drugs. When integrating both genomic resources and compound information, the prediction coverage was significantly increased. The validity of PDRCC was also supported by its effectiveness in uncovering unknown cell-drug associations with database and literature evidence. It set the stage for clinical testing of novel therapeutic strategies, such as the sensitive association between cancer cell ‘A549_LUNG’ and compound ‘Topotecan’. In conclusion, PDRCC offers the possibility of faster, safer, and cheaper development of novel anti-cancer therapeutics in early-stage clinical trials. PMID:27645580

  5. Inferences of drug responses in cancer cells from cancer genomic features and compound chemical and therapeutic properties

    NASA Astrophysics Data System (ADS)

    Wang, Yongcui; Fang, Jianwen; Chen, Shilong

    2016-09-01

    Accurately predicting the response of a cancer patient to a therapeutic agent is a core goal of precision medicine. Existing approaches have relied primarily on genomic alterations in cancer cells that have been treated with different drugs. Here we focus on predicting drug response based on integration of heterogeneous pharmacogenomics data from both the cell and drug sides. Through a systematic approach, named PDRCC (Predict Drug Response in Cancer Cells), the cancer genomic alterations and compound chemical and therapeutic properties were incorporated to determine the chemotherapeutic response in cancer patients. Using the Cancer Cell Line Encyclopedia (CCLE) study as the benchmark dataset, all pharmacogenomics data exhibited their roles in inferring the relationships between cancer cells and drugs. When integrating both genomic resources and compound information, the prediction coverage was significantly increased. The validity of PDRCC was also supported by its effectiveness in uncovering unknown cell-drug associations with database and literature evidence. It set the stage for clinical testing of novel therapeutic strategies, such as the sensitive association between cancer cell ‘A549_LUNG’ and compound ‘Topotecan’. In conclusion, PDRCC offers the possibility of faster, safer, and cheaper development of novel anti-cancer therapeutics in early-stage clinical trials.

  6. Genome-wide inference of natural selection on human transcription factor binding sites.

    PubMed

    Arbiza, Leonardo; Gronau, Ilan; Aksoy, Bulent A; Hubisz, Melissa J; Gulko, Brad; Keinan, Alon; Siepel, Adam

    2013-07-01

    For decades, it has been hypothesized that gene regulation has had a central role in human evolution, yet much remains unknown about the genome-wide impact of regulatory mutations. Here we use whole-genome sequences and genome-wide chromatin immunoprecipitation and sequencing data to demonstrate that natural selection has profoundly influenced human transcription factor binding sites since the divergence of humans from chimpanzees 4-6 million years ago. Our analysis uses a new probabilistic method, called INSIGHT, for measuring the influence of selection on collections of short, interspersed noncoding elements. We find that, on average, transcription factor binding sites have experienced somewhat weaker selection than protein-coding genes. However, the binding sites of several transcription factors show clear evidence of adaptation. Several measures of selection are strongly correlated with predicted binding affinity. Overall, regulatory elements seem to contribute substantially to both adaptive substitutions and deleterious polymorphisms with key implications for human evolution and disease.

  7. Phylogenetic position of the coral symbiont Ostreobium (Ulvophyceae) inferred from chloroplast genome data.

    PubMed

    Verbruggen, Heroen; Marcelino, Vanessa R; Guiry, Michael D; Cremen, M Chiela M; Jackson, Christopher J

    2017-04-10

    The green algal genus Ostreobium is an important symbiont of corals, playing roles in reef decalcification and providing photosynthates to the coral during bleaching events. A chloroplast genome of a cultured strain of Ostreobium was available, but low taxon sampling and Ostreobium's early-branching nature left doubt about its phylogenetic position. Here we generate and describe chloroplast genomes from four Ostreobium strains as well as Avrainvillea mazei and Neomeris sp., strategically sampled early-branching lineages in the Bryopsidales and Dasycladales, respectively. At 80,584 bp, the chloroplast genome of Ostreobium sp. HV05042 is the most compact yet found in the Ulvophyceae. The Avrainvillea chloroplast genome is ca. 94 kbp and contains introns in infA and cysT that have nearly complete sequence identity except for an ORF in infA that is not present in cysT. In line with other bryopsidalean species, it also contains regions with possibly bacteria-derived ORFs. The Neomeris data did not assemble into a canonical circular chloroplast genome but a large number of contigs containing fragments of chloroplast genes and showing evidence of long introns and intergenic regions, and the Neomeris chloroplast genome size was estimated to exceed 1.87 Mb. Chloroplast phylogenomics and 18S nrDNA data showed strong support for the Ostreobium lineage being sister to the remaining Bryopsidales. There were differences in branch support when outgroups were varied, but the overall support for the placement of Ostreobium was strong. These results permitted us to validate two suborders and introduce a third, the Ostreobineae.

  8. Partial short-read sequencing of a highly inbred Iberian pig and genomics inference thereof

    PubMed Central

    Esteve-Codina, A; Kofler, R; Himmelbauer, H; Ferretti, L; Vivancos, A P; Groenen, M A M; Folch, J M; Rodríguez, M C; Pérez-Enciso, M

    2011-01-01

    Despite dramatic reduction in sequencing costs with the advent of next generation sequencing technologies, obtaining a complete mammalian genome sequence at sufficient depth is still costly. An alternative is partial sequencing. Here, we have sequenced a reduced representation library of an Iberian sow from the Guadyerbas strain, a highly inbred strain that has been used in numerous QTL studies because of its extreme phenotypic characteristics. Using the Illumina Genome Analyzer II (San Diego, CA, USA), we resequenced ∼1% of the genome with average 4 × depth, identifying 68 778 polymorphisms. Of these, 55 457 were putative fixed differences with respect to the assembly, based on the genome of a Duroc pig, and 13 321 were heterozygous positions within Guadyerbas. Despite being highly inbred, the estimate of heterozygosity within Guadyerbas was ∼0.78 kb−1 in autosomes, after correcting for low depth. Nucleotide variability was consistently higher at the telomeric regions than on the rest of the chromosome, likely a result of increased recombination rates. Further, variability was 50% lower in the X-chromosome than in autosomes, which may be explained by a recent bottleneck or by selection. We divided the whole genome in 500 kb windows and we analyzed overrepresented gene ontology terms in regions of low and high variability. Multi organism process, pigmentation and cell killing were overrepresented in high variability regions and metabolic process ontology, within low variability regions. Further, a genome wide Hudson–Kreitman–Aguadé test was carried out per window; overall, variability was in agreement with neutral expectations. PMID:21407255
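
    The window-based analysis described above starts by assigning each variant position to a fixed-size window. A minimal Python sketch of that binning step follows, assuming 0-based coordinates on a single chromosome; the SNP positions are invented for illustration, and the sketch does not reproduce the authors' depth correction or statistical tests. It also checks that the two polymorphism classes reported in the abstract add up to the stated total.

```python
from collections import Counter

WINDOW = 500_000  # window size in bp, as stated in the abstract


def count_per_window(positions, window=WINDOW):
    """Map each window start coordinate to the number of variants inside it."""
    return Counter((pos // window) * window for pos in positions)


# Invented 0-based SNP positions on one chromosome (not data from the study).
snps = [12_000, 480_500, 510_000, 1_250_000, 1_499_999]
counts = count_per_window(snps)

for start in sorted(counts):
    print(f"{start:>9,}-{start + WINDOW - 1:>9,}: {counts[start]} SNPs")

# Consistency check on the counts reported in the abstract.
fixed_differences = 55_457
heterozygous = 13_321
print(fixed_differences + heterozygous)  # reported total: 68 778
```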

  9. A core phylogeny of Dictyostelia inferred from genomes representative of the eight major and minor taxonomic divisions of the group.

    PubMed

    Singh, Reema; Schilde, Christina; Schaap, Pauline

    2016-11-17

    Dictyostelia are a well-studied group of organisms with colonial multicellularity, which are members of the mostly unicellular Amoebozoa. A phylogeny based on SSU rDNA data subdivided all Dictyostelia into four major groups, but left the position of the root and of six group-intermediate taxa unresolved. Recent phylogenies inferred from 30 or 213 proteins from sequenced genomes, positioned the root between two branches, each containing two major groups, but lacked data to position the group-intermediate taxa. Since the positions of these early diverging taxa are crucial for understanding the evolution of phenotypic complexity in Dictyostelia, we sequenced six representative genomes of early diverging taxa. We retrieved orthologs of 47 housekeeping proteins with an average size of 890 amino acids from six newly sequenced and eight published genomes of Dictyostelia and unicellular Amoebozoa and inferred phylogenies from single and concatenated protein sequence alignments. Concatenated alignments of all 47 proteins, and four out of five subsets of nine concatenated proteins all produced the same consensus phylogeny with 100% statistical support. Trees inferred from just two out of the 47 proteins, individually reproduced the consensus phylogeny, highlighting that single gene phylogenies will rarely reflect correct species relationships. However, sets of two or three concatenated proteins again reproduced the consensus phylogeny, indicating that a small selection of genes suffices for low cost classification of as yet unincorporated or newly discovered dictyostelid and amoebozoan taxa by gene amplification. The multi-locus consensus phylogeny shows that groups 1 and 2 are sister clades in branch I, with the group-intermediate taxon D. polycarpum positioned as outgroup to group 2. 
Branch II consists of groups 3 and 4, with the group-intermediate taxon Polysphondylium violaceum positioned as sister to group 4, and the group-intermediate taxon Dictyostelium polycephalum

  10. Inference of Candidate Germline Mutator Loci in Humans from Genome-Wide Haplotype Data

    PubMed Central

    2017-01-01

    The rate of germline mutation varies widely between species but little is known about the extent of variation in the germline mutation rate between individuals of the same species. Here we demonstrate that an allele that increases the rate of germline mutation can result in a distinctive signature in the genomic region linked to the affected locus, characterized by a number of haplotypes with a locally high proportion of derived alleles, against a background of haplotypes carrying a typical proportion of derived alleles. We searched for this signature in human haplotype data from phase 3 of the 1000 Genomes Project and report a number of candidate mutator loci, several of which are located close to or within genes involved in DNA repair or the DNA damage response. To investigate whether mutator alleles remained active at any of these loci, we used de novo mutation counts from human parent-offspring trios in the 1000 Genomes and Genome of the Netherlands cohorts, looking for an elevated number of de novo mutations in the offspring of parents carrying a candidate mutator haplotype at each of these loci. We found some support for two of the candidate loci, including one locus just upstream of the BRSK2 gene, which is expressed in the testis and has been reported to be involved in the response to DNA damage. PMID:28095480
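
    The signature described above is a subset of haplotypes whose proportion of derived alleles in a region is well above that of the background haplotypes. A minimal Python sketch of that idea follows, assuming haplotypes are encoded as lists of 0 (ancestral) and 1 (derived). The data, the function names and the twofold threshold are all assumptions made for illustration; this is not the statistical test used in the study.

```python
from statistics import median


def derived_proportion(haplotype):
    """Fraction of sites in the window carrying the derived allele (coded 1)."""
    return sum(haplotype) / len(haplotype)


def flag_outliers(haplotypes, fold=2.0):
    """Indices of haplotypes whose derived proportion exceeds `fold` times
    the median proportion across all haplotypes in the window."""
    props = [derived_proportion(h) for h in haplotypes]
    background = median(props)
    return [i for i, p in enumerate(props) if p > fold * background]


# Invented haplotypes over ten sites in one window (0 = ancestral, 1 = derived).
haps = [
    [0, 1, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 1, 0, 1, 0],  # locally high proportion of derived alleles
    [0, 1, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 0, 0],
]

print(flag_outliers(haps))
```

    Using the median as the background keeps a few extreme haplotypes from inflating the reference level they are compared against.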

  11. Interspecific chromosome substitution lines as genetic resources for improvement, trait analysis and genomic inference

    USDA-ARS's Scientific Manuscript database

    The genetic base that cotton breeders commonly use to improve Upland cultivars is very narrow. The AD-genome species G. barbadense, G. tomentosum, and G. mustelinum are part of the primary germplasm pool, too, and constitute genetic reservoirs of genes for resistance to abiotic stress, pests and pa...

  12. A Novel Candidate Vaccine for Cytauxzoonosis Inferred from Comparative Apicomplexan Genomics

    PubMed Central

    Tarigo, Jaime L.; Scholl, Elizabeth H.; Bird, David McK.; Brown, Corrie C.; Cohn, Leah A.; Dean, Gregg A.; Levy, Michael G.; Doolan, Denise L.; Trieu, Angela; Nordone, Shila K.; Felgner, Philip L.; Vigil, Adam; Birkenheuer, Adam J.

    2013-01-01

    Cytauxzoonosis is an emerging infectious disease of domestic cats (Felis catus) caused by the apicomplexan protozoan parasite Cytauxzoon felis. The growing epidemic, with its high morbidity and mortality points to the need for a protective vaccine against cytauxzoonosis. Unfortunately, the causative agent has yet to be cultured continuously in vitro, rendering traditional vaccine development approaches beyond reach. Here we report the use of comparative genomics to computationally and experimentally interpret the C. felis genome to identify a novel candidate vaccine antigen for cytauxzoonosis. As a starting point we sequenced, assembled, and annotated the C. felis genome and the proteins it encodes. Whole genome alignment revealed considerable conserved synteny with other apicomplexans. In particular, alignments with the bovine parasite Theileria parva revealed that a C. felis gene, cf76, is syntenic to p67 (the leading vaccine candidate for bovine theileriosis), despite a lack of significant sequence similarity. Recombinant subdomains of cf76 were challenged with survivor-cat antiserum and found to be highly seroreactive. Comparison of eleven geographically diverse samples from the south-central and southeastern USA demonstrated 91–100% amino acid sequence identity across cf76, including a high level of conservation in an immunogenic 226 amino acid (24 kDa) carboxyl terminal domain. Using in situ hybridization, transcription of cf76 was documented in the schizogenous stage of parasite replication, the life stage that is believed to be the most important for development of a protective immune response. Collectively, these data point to identification of the first potential vaccine candidate antigen for cytauxzoonosis. Further, our bioinformatic approach emphasizes the use of comparative genomics as an accelerated path to developing vaccines against experimentally intractable pathogens. PMID:23977000

  13. Chromosomal instability in Afrotheria: fragile sites, evolutionary breakpoints and phylogenetic inference from genome sequence assemblies

    PubMed Central

    Ruiz-Herrera, Aurora; Robinson, Terence J

    2007-01-01

    Background Extant placental mammals are divided into four major clades (Laurasiatheria, Supraprimates, Xenarthra and Afrotheria). Given that Afrotheria is generally thought to root the eutherian tree in phylogenetic analysis of large nuclear gene data sets, the study of the organization of the genomes of afrotherian species provides new insights into the dynamics of mammalian chromosomal evolution. Here we test if there are chromosomal bands with a high tendency to break and reorganize in Afrotheria, and by analyzing the expression of aphidicolin-induced common fragile sites in three afrotherian species, whether these are coincidental with recognized evolutionary breakpoints. Results We described 29 fragile sites in the aardvark (OAF) genome, 27 in the golden mole (CAS), and 35 in the elephant-shrew (EED) genome. We show that fragile sites are conserved among afrotherian species and these are correlated with evolutionary breakpoints when compared to the human (HSA) genome. In addition, by computationally scanning the newly released opossum (Monodelphis domestica) and chicken sequence assemblies for use as outgroups to Placentalia, we validate the HSA 3/21/5 chromosomal synteny as a rare genomic change that defines the monophyly of this ancient African clade of mammals. On the other hand, support for HSA 1/19p, which is also thought to underpin Afrotheria, is currently ambiguous. Conclusion We provide evidence that (i) the evolutionary breakpoints that characterise human syntenies detected in the basal Afrotheria correspond at the chromosomal band level with fragile sites, (ii) that HSA 3p/21 was in the amniote ancestor (i.e., common to turtles, lepidosaurs, crocodilians, birds and mammals) and was subsequently disrupted in the lineage leading to marsupials. Its expansion to include HSA 5 in Afrotheria is unique and (iii) that its fragmentation to HSA 3p/21 + HSA 5/21 in elephant and manatee was due to a fission within HSA 21 that is probably shared by all

  14. Phylogeny and physiology of candidate phylum ‘Atribacteria’ (OP9/JS1) inferred from cultivation-independent genomics

    PubMed Central

    Nobu, Masaru K; Dodsworth, Jeremy A; Murugapiran, Senthil K; Rinke, Christian; Gies, Esther A; Webster, Gordon; Schwientek, Patrick; Kille, Peter; Parkes, R John; Sass, Henrik; Jørgensen, Bo B; Weightman, Andrew J; Liu, Wen-Tso; Hallam, Steven J; Tsiamis, George; Woyke, Tanja; Hedlund, Brian P

    2016-01-01

    The ‘Atribacteria’ is a candidate phylum in the Bacteria recently proposed to include members of the OP9 and JS1 lineages. OP9 and JS1 are globally distributed, and in some cases abundant, in anaerobic marine sediments, geothermal environments, anaerobic digesters and reactors and petroleum reservoirs. However, the monophyly of OP9 and JS1 has been questioned and their physiology and ecology remain largely enigmatic due to a lack of cultivated representatives. Here cultivation-independent genomic approaches were used to provide a first comprehensive view of the phylogeny, conserved genomic features and metabolic potential of members of this ubiquitous candidate phylum. Previously available and heretofore unpublished OP9 and JS1 single-cell genomic data sets were used as recruitment platforms for the reconstruction of atribacterial metagenome bins from a terephthalate-degrading reactor biofilm and from the monimolimnion of meromictic Sakinaw Lake. The single-cell genomes and metagenome bins together comprise six species- to genus-level groups that represent most major lineages within OP9 and JS1. Phylogenomic analyses of these combined data sets confirmed the monophyly of the ‘Atribacteria’ inclusive of OP9 and JS1. Additional conserved features within the ‘Atribacteria’ were identified, including a gene cluster encoding putative bacterial microcompartments that may be involved in aldehyde and sugar metabolism, energy conservation and carbon storage. Comparative analysis of the metabolic potential inferred from these data sets revealed that members of the ‘Atribacteria’ are likely to be heterotrophic anaerobes that lack respiratory capacity, with some lineages predicted to specialize in either primary fermentation of carbohydrates or secondary fermentation of organic acids, such as propionate. PMID:26090992

  15. Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach

    PubMed Central

    Boitard, Simon; Rodríguez, Willy; Jay, Flora; Mona, Stefano; Austerlitz, Frédéric

    2016-01-01

    Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles. PMID:26943927
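    One of the two summary-statistic classes named in the abstract above, the folded allele frequency spectrum, can be illustrated with a minimal sketch. This is not the PopSizeABC implementation; the function name `folded_afs` and the input layout (one list of per-individual alternate-allele counts per SNP, diploid and unphased) are assumptions made for illustration only.

```python
def folded_afs(genotypes):
    """Folded allele frequency spectrum from unphased diploid genotypes.

    genotypes: list of SNPs; each SNP is a list with one entry per
    individual giving the number of copies (0, 1 or 2) of an arbitrarily
    chosen allele. Because the data are unpolarized, each SNP is counted
    by its minor allele count, which cannot exceed the number of individuals.
    Returns a list where spectrum[k] is the number of SNPs whose minor
    allele count equals k.
    """
    if not genotypes:
        return []
    n_ind = len(genotypes[0])
    n_chrom = 2 * n_ind  # diploid assumption
    spectrum = [0] * (n_ind + 1)
    for snp in genotypes:
        if len(snp) != n_ind:
            raise ValueError("all SNPs must cover the same individuals")
        count = sum(snp)
        minor = min(count, n_chrom - count)  # fold: ignore which allele is derived
        spectrum[minor] += 1
    return spectrum


# Hypothetical toy data: 4 SNPs typed in 3 individuals.
example = [[0, 1, 0], [2, 2, 1], [1, 1, 1], [0, 0, 0]]
spectrum = folded_afs(example)
```

    Folding by the minor allele is what lets the statistic be computed without knowing the ancestral state, which matches the abstract's point that unpolarized SNP data suffice.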

  16. Phylogeny and physiology of candidate phylum 'Atribacteria' (OP9/JS1) inferred from cultivation-independent genomics.

    PubMed

    Nobu, Masaru K; Dodsworth, Jeremy A; Murugapiran, Senthil K; Rinke, Christian; Gies, Esther A; Webster, Gordon; Schwientek, Patrick; Kille, Peter; Parkes, R John; Sass, Henrik; Jørgensen, Bo B; Weightman, Andrew J; Liu, Wen-Tso; Hallam, Steven J; Tsiamis, George; Woyke, Tanja; Hedlund, Brian P

    2016-02-01

    The 'Atribacteria' is a candidate phylum in the Bacteria recently proposed to include members of the OP9 and JS1 lineages. OP9 and JS1 are globally distributed, and in some cases abundant, in anaerobic marine sediments, geothermal environments, anaerobic digesters and reactors and petroleum reservoirs. However, the monophyly of OP9 and JS1 has been questioned and their physiology and ecology remain largely enigmatic due to a lack of cultivated representatives. Here cultivation-independent genomic approaches were used to provide a first comprehensive view of the phylogeny, conserved genomic features and metabolic potential of members of this ubiquitous candidate phylum. Previously available and heretofore unpublished OP9 and JS1 single-cell genomic data sets were used as recruitment platforms for the reconstruction of atribacterial metagenome bins from a terephthalate-degrading reactor biofilm and from the monimolimnion of meromictic Sakinaw Lake. The single-cell genomes and metagenome bins together comprise six species- to genus-level groups that represent most major lineages within OP9 and JS1. Phylogenomic analyses of these combined data sets confirmed the monophyly of the 'Atribacteria' inclusive of OP9 and JS1. Additional conserved features within the 'Atribacteria' were identified, including a gene cluster encoding putative bacterial microcompartments that may be involved in aldehyde and sugar metabolism, energy conservation and carbon storage. Comparative analysis of the metabolic potential inferred from these data sets revealed that members of the 'Atribacteria' are likely to be heterotrophic anaerobes that lack respiratory capacity, with some lineages predicted to specialize in either primary fermentation of carbohydrates or secondary fermentation of organic acids, such as propionate.

  17. Genome-Wide SNP Discovery, Genotyping and Their Preliminary Applications for Population Genetic Inference in Spotted Sea Bass (Lateolabrax maculatus)

    PubMed Central

    Wang, Juan; Xue, Dong-Xiu; Zhang, Bai-Dong; Li, Yu-Long; Liu, Bing-Jian; Liu, Jin-Xian

    2016-01-01

    Next-generation sequencing and the collection of genome-wide single-nucleotide polymorphisms (SNPs) allow identifying fine-scale population genetic structure and genomic regions under selection. The spotted sea bass (Lateolabrax maculatus) is a non-model species of ecological and commercial importance and is widely distributed in the northwestern Pacific. A total of 22 648 SNPs was discovered across the genome of L. maculatus by paired-end sequencing of restriction-site associated DNA (RAD-PE) for 30 individuals from two populations. The nucleotide diversity (π) was 0.0028±0.0001 in Dandong and 0.0018±0.0001 in Beihai. Shallow but significant genetic differentiation was detected between the two populations analyzed by using both the whole data set (FST = 0.0550, P < 0.001) and the putatively neutral SNPs (FST = 0.0347, P < 0.001). However, the two populations were highly differentiated based on the putatively adaptive SNPs (FST = 0.6929, P < 0.001). Moreover, a total of 356 SNPs representing 298 unique loci were detected as outliers putatively under divergent selection by FST-based outlier tests as implemented in BAYESCAN and LOSITAN. Functional annotation of the contigs containing putatively adaptive SNPs yielded hits for 22 of 55 (40%) significant BLASTX matches. Candidate genes for local selection constituted a wide array of functions, including binding, catalytic and metabolic activities, etc. The analyses with the SNPs developed in the present study highlighted the importance of genome-wide genetic variation for inference of population structure and local adaptation in L. maculatus. PMID:27336696

  18. The green impact: bacterioplankton response toward a phytoplankton spring bloom in the southern North Sea assessed by comparative metagenomic and metatranscriptomic approaches

    PubMed Central

    Wemheuer, Bernd; Wemheuer, Franziska; Hollensteiner, Jacqueline; Meyer, Frauke-Dorothee; Voget, Sonja; Daniel, Rolf

    2015-01-01

    Phytoplankton blooms exhibit a severe impact on bacterioplankton communities as they change nutrient availabilities and other environmental factors. In the current study, the response of a bacterioplankton community to a Phaeocystis globosa spring bloom was investigated in the southern North Sea. For this purpose, water samples were taken inside and reference samples outside of an algal spring bloom. Structural changes of the bacterioplankton community were assessed by amplicon-based analysis of 16S rRNA genes and transcripts generated from environmental DNA and RNA, respectively. Several marine groups responded to bloom presence. The abundance of the Roseobacter RCA cluster and the SAR92 clade significantly increased in bloom presence in the total and active fraction of the bacterial community. Functional changes were investigated by direct sequencing of environmental DNA and mRNA. The corresponding datasets comprised more than 500 million sequences across all samples. Metatranscriptomic data sets were mapped on representative genomes of abundant marine groups present in the samples and on assembled metagenomic and metatranscriptomic datasets. Differences in gene expression profiles between non-bloom and bloom samples were recorded. The genome-wide gene expression level of Planktomarina temperata, an abundant member of the Roseobacter RCA cluster, was higher inside the bloom. Genes that were differently expressed included transposases, which showed increased expression levels inside the bloom. This might contribute to the adaptation of this organism toward environmental stresses through genome reorganization. In addition, several genes affiliated to the SAR92 clade were significantly upregulated inside the bloom including genes encoding for proteins involved in isoleucine and leucine incorporation. Obtained results provide novel insights into compositional and functional variations of marine bacterioplankton communities as response to a phytoplankton bloom. 

  19. The green impact: bacterioplankton response toward a phytoplankton spring bloom in the southern North Sea assessed by comparative metagenomic and metatranscriptomic approaches.

    PubMed

    Wemheuer, Bernd; Wemheuer, Franziska; Hollensteiner, Jacqueline; Meyer, Frauke-Dorothee; Voget, Sonja; Daniel, Rolf

    2015-01-01

    Phytoplankton blooms exhibit a severe impact on bacterioplankton communities as they change nutrient availabilities and other environmental factors. In the current study, the response of a bacterioplankton community to a Phaeocystis globosa spring bloom was investigated in the southern North Sea. For this purpose, water samples were taken inside and reference samples outside of an algal spring bloom. Structural changes of the bacterioplankton community were assessed by amplicon-based analysis of 16S rRNA genes and transcripts generated from environmental DNA and RNA, respectively. Several marine groups responded to bloom presence. The abundance of the Roseobacter RCA cluster and the SAR92 clade significantly increased in bloom presence in the total and active fraction of the bacterial community. Functional changes were investigated by direct sequencing of environmental DNA and mRNA. The corresponding datasets comprised more than 500 million sequences across all samples. Metatranscriptomic data sets were mapped on representative genomes of abundant marine groups present in the samples and on assembled metagenomic and metatranscriptomic datasets. Differences in gene expression profiles between non-bloom and bloom samples were recorded. The genome-wide gene expression level of Planktomarina temperata, an abundant member of the Roseobacter RCA cluster, was higher inside the bloom. Genes that were differently expressed included transposases, which showed increased expression levels inside the bloom. This might contribute to the adaptation of this organism toward environmental stresses through genome reorganization. In addition, several genes affiliated to the SAR92 clade were significantly upregulated inside the bloom including genes encoding for proteins involved in isoleucine and leucine incorporation. Obtained results provide novel insights into compositional and functional variations of marine bacterioplankton communities as response to a phytoplankton bloom.

  20. Causal inference of gene regulation with subnetwork assembly from genetical genomics data.

    PubMed

    Peng, Chien-Hua; Jiang, Yi-Zhi; Tai, An-Shun; Liu, Chun-Bin; Peng, Shih-Chi; Liao, Chun-Ta; Yen, Tzu-Chen; Hsieh, Wen-Ping

    2014-03-01

    Deciphering the causal networks of gene interactions is critical for identifying disease pathways and disease-causing genes. We introduce a method to reconstruct causal networks based on exploring phenotype-specific modules in the human interactome and including the expression quantitative trait loci (eQTLs) that underlie the joint expression variation of each module. Closely associated eQTLs help anchor the orientation of the network. To overcome the inherent computational complexity of causal network reconstruction, we first deduce the local causality of individual subnetworks using the selected eQTLs and module transcripts. These subnetworks are then integrated to infer a global causal network using a random-field ranking method, which was motivated by animal sociology. We demonstrate how effectively the inferred causality restores the regulatory structure of the networks that mediate lymph node metastasis in oral cancer. Network rewiring clearly characterizes the dynamic regulatory systems of distinct disease states. This study is the first to associate an RXRB-causal network with increased risks of nodal metastasis, tumor relapse, distant metastases and poor survival for oral cancer. Thus, identifying crucial upstream drivers of a signal cascade can facilitate the discovery of potential biomarkers and effective therapeutic targets.

  1. High-throughput single-cell sequencing identifies photoheterotrophs and chemoautotrophs in freshwater bacterioplankton

    PubMed Central

    Martinez-Garcia, Manuel; Swan, Brandon K; Poulton, Nicole J; Gomez, Monica Lluesma; Masland, Dashiell; Sieracki, Michael E; Stepanauskas, Ramunas

    2012-01-01

    Recent discoveries suggest that photoheterotrophs (rhodopsin-containing bacteria (RBs) and aerobic anoxygenic phototrophs (AAPs)) and chemoautotrophs may be significant for marine and freshwater ecosystem productivity. However, their abundance and taxonomic identities remain largely unknown. We used a combination of single-cell and metagenomic DNA sequencing to study the predominant photoheterotrophs and chemoautotrophs inhabiting the euphotic zone of temperate, physicochemically diverse freshwater lakes. Multi-locus sequencing of 712 single amplified genomes, generated by fluorescence-activated cell sorting and whole genome multiple displacement amplification, showed that most of the cosmopolitan freshwater clusters contain photoheterotrophs. These comprised at least 10–23% of bacterioplankton, and RBs were the dominant fraction. Our data demonstrate that Actinobacteria, including clusters acI, Luna and acSTL, are the predominant freshwater RBs. We significantly broaden the known taxonomic range of freshwater RBs, to include Alpha-, Beta-, Gamma- and Deltaproteobacteria, Verrucomicrobia and Sphingobacteria. By sequencing single cells, we found evidence for inter-phyla horizontal gene transfer and recombination of rhodopsin genes and identified specific taxonomic groups involved in these evolutionary processes. Our data suggest that members of the ubiquitous betaproteobacteria Polynucleobacter spp. are the dominant AAPs in temperate freshwater lakes. Furthermore, the RuBisCO (ribulose 1,5-bisphosphate carboxylase/oxygenase) gene was found in several single cells of Betaproteobacteria, Bacteroidetes and Gammaproteobacteria, suggesting that chemoautotrophs may be more prevalent among aerobic bacterioplankton than previously thought. This study demonstrates the power of single-cell DNA sequencing addressing previously unresolved questions about the metabolic potential and evolutionary histories of uncultured microorganisms, which dominate most natural environments.

  2. High-throughput single-cell sequencing identifies photoheterotrophs and chemoautotrophs in freshwater bacterioplankton.

    PubMed

    Martinez-Garcia, Manuel; Swan, Brandon K; Poulton, Nicole J; Gomez, Monica Lluesma; Masland, Dashiell; Sieracki, Michael E; Stepanauskas, Ramunas

    2012-01-01

    Recent discoveries suggest that photoheterotrophs (rhodopsin-containing bacteria (RBs) and aerobic anoxygenic phototrophs (AAPs)) and chemoautotrophs may be significant for marine and freshwater ecosystem productivity. However, their abundance and taxonomic identities remain largely unknown. We used a combination of single-cell and metagenomic DNA sequencing to study the predominant photoheterotrophs and chemoautotrophs inhabiting the euphotic zone of temperate, physicochemically diverse freshwater lakes. Multi-locus sequencing of 712 single amplified genomes, generated by fluorescence-activated cell sorting and whole genome multiple displacement amplification, showed that most of the cosmopolitan freshwater clusters contain photoheterotrophs. These comprised at least 10-23% of bacterioplankton, and RBs were the dominant fraction. Our data demonstrate that Actinobacteria, including clusters acI, Luna and acSTL, are the predominant freshwater RBs. We significantly broaden the known taxonomic range of freshwater RBs, to include Alpha-, Beta-, Gamma- and Deltaproteobacteria, Verrucomicrobia and Sphingobacteria. By sequencing single cells, we found evidence for inter-phyla horizontal gene transfer and recombination of rhodopsin genes and identified specific taxonomic groups involved in these evolutionary processes. Our data suggest that members of the ubiquitous betaproteobacteria Polynucleobacter spp. are the dominant AAPs in temperate freshwater lakes. Furthermore, the RuBisCO (ribulose 1,5-bisphosphate carboxylase/oxygenase) gene was found in several single cells of Betaproteobacteria, Bacteroidetes and Gammaproteobacteria, suggesting that chemoautotrophs may be more prevalent among aerobic bacterioplankton than previously thought. This study demonstrates the power of single-cell DNA sequencing addressing previously unresolved questions about the metabolic potential and evolutionary histories of uncultured microorganisms, which dominate most natural environments.

  3. Simple Math is Enough: Two Examples of Inferring Functional Associations from Genomic Data

    NASA Technical Reports Server (NTRS)

    Liang, Shoudan

    2003-01-01

    Non-random features in the genomic data are usually biologically meaningful. The key is to choose the feature well. Having a p-value based score prioritizes the findings. If two proteins share an unusually large number of common interaction partners, they tend to be involved in the same biological process. We used this finding to predict the functions of 81 un-annotated proteins in yeast.
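    The abstract above does not state how its p-value based score is computed. As an illustration only, a common way to ask whether two proteins share more interaction partners than expected by chance is a hypergeometric tail probability; the sketch below assumes that model, and the function name `shared_partner_pvalue` and its parameters are hypothetical rather than taken from the report.

```python
from math import comb


def shared_partner_pvalue(n_total, deg_a, deg_b, shared):
    """Probability of seeing at least `shared` common partners by chance.

    Assumed null model (not from the source): protein A has deg_a partners
    among n_total candidate proteins, and protein B's deg_b partners are
    drawn uniformly at random without replacement from the same n_total.
    The overlap then follows a hypergeometric distribution.
    """
    if max(deg_a, deg_b) > n_total or not 0 <= shared <= min(deg_a, deg_b):
        raise ValueError("inconsistent counts")
    upper = min(deg_a, deg_b)
    # Sum the hypergeometric probabilities for overlaps of size shared..upper.
    tail = sum(
        comb(deg_a, k) * comb(n_total - deg_a, deg_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / comb(n_total, deg_b)


# Hypothetical numbers: 10 candidate proteins, each protein has 3 partners,
# and all 3 are shared.
p = shared_partner_pvalue(n_total=10, deg_a=3, deg_b=3, shared=3)
```

    A smaller value means the overlap is harder to explain by chance, so ranking protein pairs by this value is one way a p-value based score could prioritize candidate functional associations.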

  4. Low rate of genomic repatterning in Xenarthra inferred from chromosome painting data.

    PubMed

    Dobigny, G; Yang, F; O'Brien, P C M; Volobouev, V; Kovács, A; Pieczarka, J C; Ferguson-Smith, M A; Robinson, T J

    2005-01-01

    Comparative cytogenetic studies on Xenarthra, one of the most basal mammalian clades in the Placentalia, are virtually absent, being restricted largely to descriptions of conventional karyotypes and diploid numbers. We present a molecular cytogenetic comparison of chromosomes from the two-toed (Choloepus didactylus, 2n = 65) and three-toed sloth species (Bradypus tridactylus, 2n = 52), an anteater (Tamandua tetradactyla, 2n = 54) which, together with some data on the six-banded armadillo (Euphractus sexcinctus, 2n = 58), collectively represent all the major xenarthran lineages. Our results, based on interspecific chromosome painting using flow-sorted two-toed sloth chromosomes as painting probes, show the sloth species to be karyotypically closely related but markedly different from the anteater. We also test the synteny disruptions and segmental associations identified within Pilosa (anteaters and sloths) against the chromosomes of the six-banded armadillo as outgroup taxon. We could thus polarize the 35 non-ambiguously identified chromosomal changes characterizing the evolution of the anteater and sloth genomes and map these to a published sequence-based phylogeny for the group. These data suggest a low rate of genomic repatterning when placed in the context of divergence estimates based on molecular and fossil data. Finally, our results provide a glimpse of a likely ancestral karyotype for the extant Xenarthra, a pivotal group for understanding eutherian genome evolution.

  5. ARG-walker: inference of individual specific strengths of meiotic recombination hotspots by population genomics analysis

    PubMed Central

    2015-01-01

    Background Meiotic recombination hotspots play important roles in various aspects of genomics, but the underlying mechanisms for regulating the locations and strengths of recombination hotspots are not yet fully revealed. Most existing algorithms for estimating recombination rates from sequence polymorphism data can only output average recombination rates of a population, although there is evidence for the heterogeneity in recombination rates among individuals. For genome-wide association studies (GWAS) of recombination hotspots, an efficient algorithm that estimates the individualized strengths of recombination hotspots is highly desirable. Results In this work, we propose a novel graph mining algorithm named ARG-walker, based on random walks on ancestral recombination graphs (ARG), to estimate individual-specific recombination hotspot strengths. Extensive simulations demonstrate that ARG-walker is able to distinguish the hot allele of a recombination hotspot from the cold allele. Integrated with output of ARG-walker, we performed GWAS on the phased haplotype data of the 22 autosome chromosomes of the HapMap Asian population samples of Chinese and Japanese (JPT+CHB). Significant cis-regulatory signals have been detected, which is corroborated by the enrichment of the well-known 13-mer motif CCNCCNTNNCCNC of PRDM9 protein. Moreover, two new DNA motifs have been identified in the flanking regions of the significantly associated SNPs (single nucleotide polymorphisms), which are likely to be new cis-regulatory elements of meiotic recombination hotspots of the human genome. Conclusions Our results on both simulated and real data suggest that ARG-walker is a promising new method for estimating the individual recombination variations. In the future, it could be used to uncover the mechanisms of recombination regulation and human diseases related with recombination hotspots. PMID:26679564

  6. Remarkable variation in maize genome structure inferred from haplotype diversity at the bz locus.

    PubMed

    Wang, Qinghua; Dooner, Hugo K

    2006-11-21

    Maize is probably the most diverse of all crop species. Unexpectedly large differences among haplotypes were first revealed in a comparison of the bz genomic regions of two different inbred lines, McC and B73. Retrotransposon clusters, which comprise most of the repetitive DNA in maize, varied markedly in makeup and location relative to the genes in the region, and genic sequences, later shown to be carried by two helitron transposons, also differed between the inbreds. Thus, the allelic bz regions of these Corn Belt inbreds shared only a minority of the total sequence. To investigate further the variation caused by retrotransposons, helitrons, and other insertions, we have analyzed the organization of the bz genomic region in five additional cultivars selected because of their geographic and genetic diversity: the inbreds A188, CML258, and I137TN, and the land races Coroico and NalTel. This vertical comparison has revealed the existence of several new helitrons, new retrotransposons, members of every superfamily of DNA transposons, numerous miniature elements, and novel insertions flanked at either end by TA repeats, which we call TAFTs (TA-flanked transposons). The extent of variation in the region is remarkable. In pairwise comparisons of eight bz haplotypes, the percentage of shared sequences ranges from 25% to 84%. Chimeric haplotypes were identified that combine retrotransposon clusters found in different haplotypes. We propose that recombination in the common gene space greatly amplifies the variability produced by the retrotransposition explosion in the maize ancestry, creating the heterogeneity in genome organization found in modern maize.

  7. Remarkable variation in maize genome structure inferred from haplotype diversity at the bz locus

    PubMed Central

    Wang, Qinghua; Dooner, Hugo K.

    2006-01-01

    Maize is probably the most diverse of all crop species. Unexpectedly large differences among haplotypes were first revealed in a comparison of the bz genomic regions of two different inbred lines, McC and B73. Retrotransposon clusters, which comprise most of the repetitive DNA in maize, varied markedly in makeup and location relative to the genes in the region, and genic sequences, later shown to be carried by two helitron transposons, also differed between the inbreds. Thus, the allelic bz regions of these Corn Belt inbreds shared only a minority of the total sequence. To investigate further the variation caused by retrotransposons, helitrons, and other insertions, we have analyzed the organization of the bz genomic region in five additional cultivars selected because of their geographic and genetic diversity: the inbreds A188, CML258, and I137TN, and the land races Coroico and NalTel. This vertical comparison has revealed the existence of several new helitrons, new retrotransposons, members of every superfamily of DNA transposons, numerous miniature elements, and novel insertions flanked at either end by TA repeats, which we call TAFTs (TA-flanked transposons). The extent of variation in the region is remarkable. In pairwise comparisons of eight bz haplotypes, the percentage of shared sequences ranges from 25% to 84%. Chimeric haplotypes were identified that combine retrotransposon clusters found in different haplotypes. We propose that recombination in the common gene space greatly amplifies the variability produced by the retrotransposition explosion in the maize ancestry, creating the heterogeneity in genome organization found in modern maize. PMID:17101975

  8. Revealing Less Derived Nature of Cartilaginous Fish Genomes with Their Evolutionary Time Scale Inferred with Nuclear Genes

    PubMed Central

    Renz, Adina J.; Meyer, Axel; Kuraku, Shigehiro

    2013-01-01

    Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on the Bayesian inference resulted in the mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a lower substitution rate with a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon. PMID:23825540

  9. Inferring Properties of Ancient Cyanobacteria from Biogeochemical Activity and Genomes of Siderophilic Cyanobacteria

    NASA Technical Reports Server (NTRS)

    McKay, David S.; Brown, I. I.; Tringe, S. G.; Thomas-Keprta, K. E.; Bryant, D. A.; Sarkisova, S. S.; Malley, K.; Sosa, O.; Klatt, C. G.; McKay, D. S.

    2010-01-01

    Interrelationships between life and the planetary system could have simultaneously left landmarks in genomes of microbes and physicochemical signatures in the lithosphere. Verifying the links between genomic features in living organisms and the mineralized signatures generated by these organisms will help to reveal traces of life on Earth and beyond. Among contemporary environments, iron-depositing hot springs (IDHS) may represent one of the most appropriate natural models [1] for insights into ancient life since organisms may have originated on Earth and probably Mars in association with hydrothermal activity [2,3]. IDHS also seem to be appropriate models for studying certain biogeochemical processes that could have taken place in the late Archean and/or early Paleoproterozoic eras [4, 5]. It has been suggested that inorganic polyphosphate (PPi), in chains of tens to hundreds of phosphate residues linked by high-energy bonds, is environmentally ubiquitous and abundant [6]. Cyanobacteria (CB) react to increased heavy metal concentrations and UV by enhanced generation of PPi bodies (PPB) [7], which are believed to be signatures of life [8]. However, the role of PPi in oxygenic prokaryotes for the suppression of oxidative stress induced by high Fe is poorly studied. Here we present preliminary results of a new mechanism of Fe mineralization in oxygenic prokaryotes, the effect of Fe on the generation of PPi bodies in CB, as well as preliminary analysis of the diversity and phylogeny of proteins involved in the prevention of oxidative stress in phototrophs inhabiting IDHS.

  10. Primate phylogenetic relationships and divergence dates inferred from complete mitochondrial genomes

    PubMed Central

    Hodgson, Jason A.; Burrell, Andrew S.; Sterner, Kirstin N.; Raaum, Ryan L.; Disotell, Todd R.

    2014-01-01

    The origins and the divergence times of the most basal lineages within primates have been difficult to resolve mainly due to the incomplete sampling of early fossil taxa. The main source of contention is related to the discordance between molecular and fossil estimates: while there are no crown primate fossils older than 56 Ma, most molecule-based estimates extend the origins of crown primates into the Cretaceous. Here we present a comprehensive mitogenomic study of primates. We assembled 87 mammalian mitochondrial genomes, including 62 primate species representing all the families of the order. We newly sequenced eleven mitochondrial genomes, including eight Old World monkeys and three strepsirrhines. Phylogenetic analyses support a strong topology, confirming the monophyly for all the major primate clades. In contrast to previous mitogenomic studies, the positions of tarsiers and colugos relative to strepsirrhines and anthropoids are well resolved. In order to improve our understanding of how fossil calibrations affect age estimates within primates, we explore the effect of seventeen fossil calibrations across primates and other mammalian groups and we select a subset of calibrations to date our mitogenomic tree. The divergence date estimates of the Strepsirrhine/Haplorhine split support an origin of crown primates in the Late Cretaceous, at around 74 Ma. This result supports a short fuse model of primate origins, whereby relatively little time passed between the origin of the order and the diversification of its major clades. It also suggests that the early primate fossil record is likely poorly sampled. PMID:24583291

  11. Genetic Structure and Phylogeography of the Leopard Cat (Prionailurus bengalensis) Inferred from Mitochondrial Genomes.

    PubMed

    Patel, Riddhi P; Wutke, Saskia; Lenz, Dorina; Mukherjee, Shomita; Ramakrishnan, Uma; Veron, Géraldine; Fickel, Jörns; Wilting, Andreas; Förster, Daniel W

    2017-06-01

    The Leopard cat Prionailurus bengalensis is a habitat generalist that is widely distributed across Southeast Asia. Based on morphological traits, this species has been subdivided into 12 subspecies. Thus far, there have been few molecular studies investigating intraspecific variation, and those had been limited in geographic scope. For this reason, we aimed to study the genetic structure and evolutionary history of this species across its very large distribution range in Asia. We employed both PCR-based (short mtDNA fragments, 94 samples) and high throughput sequencing based methods (whole mitochondrial genomes, 52 samples) on archival, noninvasively collected and fresh samples to investigate the distribution of intraspecific genetic variation. Our comprehensive sampling coupled with the improved resolution of a mitochondrial genome analyses provided strong support for a deep split between Mainland and Sundaic Leopard cats. Although we identified multiple haplogroups within the species' distribution, we found no matrilineal evidence for the distinction of 12 subspecies. In the context of Leopard cat biogeography, we cautiously recommend a revision of the Prionailurus bengalensis subspecific taxonomy: namely, a reduction to 4 subspecies (2 mainland and 2 Sundaic forms). © The American Genetic Association 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  12. Inferring the choreography of parental genomes during fertilization from ultralarge-scale whole-transcriptome analysis.

    PubMed

    Park, Sung-Joon; Komata, Makiko; Inoue, Fukashi; Yamada, Kaori; Nakai, Kenta; Ohsugi, Miho; Shirahige, Katsuhiko

    2013-12-15

    Fertilization precisely choreographs parental genomes by using gamete-derived cellular factors and activating genome regulatory programs. However, the mechanism remains elusive owing to the technical difficulties of preparing large numbers of high-quality preimplantation cells. Here, we collected >14 × 10^4 high-quality mouse metaphase II oocytes and used these to establish detailed transcriptional profiles for four early embryo stages and parthenogenetic development. By combining these profiles with other public resources, we found evidence that gene silencing appeared to be mediated in part by noncoding RNAs and that this was a prerequisite for post-fertilization development. Notably, we identified 817 genes that were differentially expressed in embryos after fertilization compared with parthenotes. The regulation of these genes was distinctly different from those expressed in parthenotes, suggesting functional specialization of particular transcription factors prior to first cell cleavage. We identified five transcription factors that were potentially necessary for developmental progression: Foxd1, Nkx2-5, Sox18, Myod1, and Runx1. Our very large-scale whole-transcriptome profile of early mouse embryos yielded a novel and valuable resource for studies in developmental biology and stem cell research. The database is available at http://dbtmee.hgc.jp.

  13. Karyotypic evolution of the family Sciuridae: inferences from the genome organizations of ground squirrels.

    PubMed

    Li, T; Wang, J; Su, W; Nie, W; Yang, F

    2006-01-01

    Cross-species chromosome painting has made a great contribution to our understanding of the evolution of karyotypes and genome organizations of mammals. Several recent papers of comparative painting between tree and flying squirrels have shed some light on the evolution of the family Sciuridae and the order Rodentia. In the present study we have extended the comparative painting to the Himalayan marmot (Marmota himalayana) and the African ground squirrel (Xerus cf. erythropus), i.e. representative species from another important squirrel group--the ground squirrels--, and have established genome-wide comparative chromosome maps between human, eastern gray squirrel, and these two ground squirrels. The results show that 1) the squirrels so far studied all have conserved karyotypes that resemble the ancestral karyotype of the order Rodentia; 2) the African ground squirrels could have retained the ancestral karyotype of the family Sciuridae. Furthermore, we have mapped the evolutionary rearrangements onto a molecular-based consensus phylogenetic tree of the family Sciuridae. 2006 S. Karger AG, Basel.

  14. Metabolic Roles of Uncultivated Bacterioplankton Lineages in the Northern Gulf of Mexico "Dead Zone".

    PubMed

    Thrash, J Cameron; Seitz, Kiley W; Baker, Brett J; Temperton, Ben; Gillies, Lauren E; Rabalais, Nancy N; Henrissat, Bernard; Mason, Olivia U

    2017-09-12

    Marine regions that have seasonal to long-term low dissolved oxygen (DO) concentrations, sometimes called "dead zones," are increasing in number and severity around the globe with deleterious effects on ecology and economics. One of the largest of these coastal dead zones occurs on the continental shelf of the northern Gulf of Mexico (nGOM), which results from eutrophication-enhanced bacterioplankton respiration and strong seasonal stratification. Previous research in this dead zone revealed the presence of multiple cosmopolitan bacterioplankton lineages that have eluded cultivation, and thus their metabolic roles in this ecosystem remain unknown. We used a coupled shotgun metagenomic and metatranscriptomic approach to determine the metabolic potential of Marine Group II Euryarchaeota, SAR406, and SAR202. We recovered multiple high-quality, nearly complete genomes from all three groups as well as candidate phyla usually associated with anoxic environments: Parcubacteria (OD1) and Peregrinibacteria. Two additional groups with putative assignments to ACD39 and PAUC34f supplement the metabolic contributions by uncultivated taxa. Our results indicate active metabolism in all groups, including prevalent aerobic respiration, with concurrent expression of genes for nitrate reduction in SAR406 and SAR202, and dissimilatory nitrite reduction to ammonia and sulfur reduction by SAR406. We also report a variety of active heterotrophic carbon processing mechanisms, including degradation of complex carbohydrate compounds by SAR406, SAR202, ACD39, and PAUC34f. Together, these data help constrain the metabolic contributions from uncultivated groups in the nGOM during periods of low DO and suggest roles for these organisms in the breakdown of complex organic matter. IMPORTANCE Dead zones receive their name primarily from the reduction of eukaryotic macrobiota (demersal fish, shrimp, etc.) that are also key coastal fisheries. Excess nutrients contributed from anthropogenic activity

  15. King penguin demography since the last glaciation inferred from genome-wide data.

    PubMed

    Trucchi, Emiliano; Gratton, Paolo; Whittington, Jason D; Cristofari, Robin; Le Maho, Yvon; Stenseth, Nils Chr; Le Bohec, Céline

    2014-07-22

    How natural climate cycles, such as past glacial/interglacial patterns, have shaped species distributions at the high-latitude regions of the Southern Hemisphere is still largely unclear. Here, we show how the post-glacial warming following the Last Glacial Maximum (ca 18 000 years ago) allowed the (re)colonization of the fragmented sub-Antarctic habitat by an upper-level marine predator, the king penguin Aptenodytes patagonicus. Using restriction site-associated DNA sequencing and standard mitochondrial data, we tested the behaviour of subsets of anonymous nuclear loci in inferring past demography through coalescent-based and allele frequency spectrum analyses. Our results show that the king penguin population breeding on Crozet archipelago steeply increased in size, closely following the Holocene warming recorded in the Epica Dome C ice core. The following population growth can be explained by a threshold model in which the ecological requirements of this species (year-round ice-free habitat for breeding and access to a major source of food such as the Antarctic Polar Front) were met on Crozet soon after the Pleistocene/Holocene climatic transition. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  16. Triallelic Population Genomics for Inferring Correlated Fitness Effects of Same Site Nonsynonymous Mutations.

    PubMed

    Ragsdale, Aaron P; Coffman, Alec J; Hsieh, PingHsun; Struck, Travis J; Gutenkunst, Ryan N

    2016-05-01

    The distribution of mutational effects on fitness is central to evolutionary genetics. Typical univariate distributions, however, cannot model the effects of multiple mutations at the same site, so we introduce a model in which mutations at the same site have correlated fitness effects. To infer the strength of that correlation, we developed a diffusion approximation to the triallelic frequency spectrum, which we applied to data from Drosophila melanogaster. We found a moderate positive correlation between the fitness effects of nonsynonymous mutations at the same codon, suggesting that both mutation identity and location are important for determining fitness effects in proteins. We validated our approach by comparing it to biochemical mutational scanning experiments, finding strong quantitative agreement, even between different organisms. We also found that the correlation of mutational fitness effects was not affected by protein solvent exposure or structural disorder. Together, our results suggest that the correlation of fitness effects at the same site is a previously overlooked yet fundamental property of protein evolution.

  17. King penguin demography since the last glaciation inferred from genome-wide data

    PubMed Central

    Trucchi, Emiliano; Gratton, Paolo; Whittington, Jason D.; Cristofari, Robin; Le Maho, Yvon; Stenseth, Nils Chr; Le Bohec, Céline

    2014-01-01

    How natural climate cycles, such as past glacial/interglacial patterns, have shaped species distributions at the high-latitude regions of the Southern Hemisphere is still largely unclear. Here, we show how the post-glacial warming following the Last Glacial Maximum (ca 18 000 years ago) allowed the (re)colonization of the fragmented sub-Antarctic habitat by an upper-level marine predator, the king penguin Aptenodytes patagonicus. Using restriction site-associated DNA sequencing and standard mitochondrial data, we tested the behaviour of subsets of anonymous nuclear loci in inferring past demography through coalescent-based and allele frequency spectrum analyses. Our results show that the king penguin population breeding on Crozet archipelago steeply increased in size, closely following the Holocene warming recorded in the Epica Dome C ice core. The following population growth can be explained by a threshold model in which the ecological requirements of this species (year-round ice-free habitat for breeding and access to a major source of food such as the Antarctic Polar Front) were met on Crozet soon after the Pleistocene/Holocene climatic transition. PMID:24920481

  18. Integrating gene and protein expression data with genome-scale metabolic networks to infer functional pathways.

    PubMed

    Pey, Jon; Valgepea, Kaspar; Rubio, Angel; Beasley, John E; Planes, Francisco J

    2013-12-08

    The study of cellular metabolism in the context of high-throughput -omics data has allowed us to decipher novel mechanisms of importance in biotechnology and health. To continue with this progress, it is essential to efficiently integrate experimental data into metabolic modeling. We present here an in-silico framework to infer relevant metabolic pathways for a particular phenotype under study based on its gene/protein expression data. This framework is based on the Carbon Flux Path (CFP) approach, a mixed-integer linear program that expands classical path finding techniques by considering additional biophysical constraints. In particular, the objective function of the CFP approach is amended to account for gene/protein expression data and influence obtained paths. This approach is termed integrative Carbon Flux Path (iCFP). We show that gene/protein expression data also influences the stoichiometric balancing of CFPs, which provides a more accurate picture of active metabolic pathways. This is illustrated in both a theoretical and real scenario. Finally, we apply this approach to find novel pathways relevant in the regulation of acetate overflow metabolism in Escherichia coli. As a result, several targets which could be relevant for better understanding of the phenomenon leading to impaired acetate overflow are proposed. A novel mathematical framework that determines functional pathways based on gene/protein expression data is presented and validated. We show that our approach is able to provide new insights into complex biological scenarios such as acetate overflow in Escherichia coli.

  19. Integrating gene and protein expression data with genome-scale metabolic networks to infer functional pathways

    PubMed Central

    2013-01-01

    Background The study of cellular metabolism in the context of high-throughput -omics data has allowed us to decipher novel mechanisms of importance in biotechnology and health. To continue with this progress, it is essential to efficiently integrate experimental data into metabolic modeling. Results We present here an in-silico framework to infer relevant metabolic pathways for a particular phenotype under study based on its gene/protein expression data. This framework is based on the Carbon Flux Path (CFP) approach, a mixed-integer linear program that expands classical path finding techniques by considering additional biophysical constraints. In particular, the objective function of the CFP approach is amended to account for gene/protein expression data and influence obtained paths. This approach is termed integrative Carbon Flux Path (iCFP). We show that gene/protein expression data also influences the stoichiometric balancing of CFPs, which provides a more accurate picture of active metabolic pathways. This is illustrated in both a theoretical and real scenario. Finally, we apply this approach to find novel pathways relevant in the regulation of acetate overflow metabolism in Escherichia coli. As a result, several targets which could be relevant for better understanding of the phenomenon leading to impaired acetate overflow are proposed. Conclusions A novel mathematical framework that determines functional pathways based on gene/protein expression data is presented and validated. We show that our approach is able to provide new insights into complex biological scenarios such as acetate overflow in Escherichia coli. PMID:24314206

  20. Evolutionary landscape of amphibians emerging from ancient freshwater fish inferred from complete mitochondrial genomes.

    PubMed

    Wang, Xiao-Tong; Zhang, Yan-Feng; Wu, Qian; Zhang, Hao

    2012-05-04

    It is very interesting that the only extant marine amphibian is the marine frog, Fejervarya cancrivora. This study investigated the reasons for this apparent rarity by conducting a phylogenetic tree analysis of the complete mitochondrial genomes from 14 amphibians, 67 freshwater fishes, four migratory fishes, 35 saltwater fishes, and one hemichordate. The results showed that amphibians, living fossil fishes, and the common ancestors of modern fishes are phylogenetically separated. In general, amphibians, living fossil fishes, saltwater fishes, and freshwater fishes are clustered in different clades. This suggests that the ancestor of living amphibians arose from a type of primordial freshwater fish, rather than the coelacanth, lungfish, or modern saltwater fish. Modern freshwater fish and modern saltwater fish were probably separated from a common ancestor by a single event, caused by crustal movement. Copyright © 2012 Elsevier Inc. All rights reserved.

  1. Temporal patterns of phyto- and bacterioplankton and their relationships with environmental factors in Lake Taihu, China.

    PubMed

    Su, Xiaomei; Steinman, Alan D; Xue, Qingju; Zhao, Yanyan; Tang, Xiangming; Xie, Liqiang

    2017-10-01

    Phytoplankton and bacterioplankton are integral components of aquatic food webs and play essential roles in the structure and function of freshwater ecosystems. However, little is known about how phyto- and bacterioplankton may respond synchronously to changing environmental conditions. Thus, we analyzed simultaneously the composition and structure of phyto- and bacterioplankton on a monthly basis over 12 months in cyanobacteria-dominated areas of Lake Taihu and compared their responses to changes in environmental factors. Metric multi-dimensional scaling (mMDS) revealed that the temporal variations of phyto- and bacterioplankton were significant. Time lag analysis (TLA) indicated that the temporal pattern of phytoplankton tended to exhibit convergent dynamics while bacterioplankton showed highly stable or stochastic variation. A significant directional change was found for bacterioplankton at the genus level and the slopes (rate of change) and regression R(2) (low stochasticity or stability) were greater if Cyanobacteria were included, suggesting a higher level of instability in the bacterial community at lower taxonomy level. Consequently, phytoplankton responded more rapidly to the change in environmental conditions than bacterioplankton when analyzed at the phylum level, while bacterioplankton were more sensitive at the finer taxonomic resolution in Lake Taihu. Redundancy analysis (RDA) results showed that environmental variables collectively explained 51.0% variance of phytoplankton and 46.7% variance of bacterioplankton, suggesting that environmental conditions have a significant influence on the temporal variations of phyto- and bacterioplankton. Furthermore, variance partitioning indicated that the bacterial community structure was largely explained by water temperature and nitrogen, suggesting that these factors were the primary drivers shaping bacterioplankton. Copyright © 2017. Published by Elsevier Ltd.

  2. Phylogeography of the fire-bellied toads Bombina: independent Pleistocene histories inferred from mitochondrial genomes.

    PubMed

    Hofman, Sebastian; Spolsky, Christina; Uzzell, Thomas; Cogălniceanu, Dan; Babik, Wiesław; Szymura, Jacek M

    2007-06-01

    The fire-bellied toads Bombina bombina and Bombina variegata interbreed in a long, narrow zone maintained by a balance between selection and dispersal. Hybridization takes place between local, genetically differentiated groups. To quantify divergence between these groups and reconstruct their history and demography, we analysed nucleotide variation at the mitochondrial cytochrome b gene (1096 bp) in 364 individuals from 156 sites representing the entire range of both species. Three distinct clades with high sequence divergence (K2P = 8-11%) were distinguished. One clade grouped B. bombina haplotypes; the two other clades grouped B. variegata haplotypes. One B. variegata clade included only Carpathian individuals; the other represented B. variegata from the southwestern parts of its distribution: Southern and Western Europe (Balkano-Western lineage), Apennines, and the Rhodope Mountains. Differentiation between the Carpathian and Balkano-Western lineages, K2P approximately 8%, approached interspecific divergence. Deep divergence among European Bombina lineages suggests their preglacial origin, and implies long and largely independent evolutionary histories of the species. Multiple glacial refugia were identified in the lowlands adjoining the Black Sea, in the Carpathians, in the Balkans, and in the Apennines. The results of the nested clade and demographic analyses suggest drastic reductions of population sizes during the last glacial period, and significant demographic growth related to postglacial colonization. Inferred history, supported by fossil evidence, demonstrates that Bombina ranges underwent repeated contractions and expansions. Geographical concordance between morphology, allozymes, and mtDNA shows that previous episodes of interspecific hybridization have left no detectable mtDNA introgression. Either the admixed populations went extinct, or selection against hybrids hindered mtDNA gene flow in ancient hybrid zones.

  3. Conflicting genomic signals affect phylogenetic inference in four species of North American pines.

    PubMed

    Koralewski, Tomasz E; Mateos, Mariana; Krutovsky, Konstantin V

    2016-01-01

    Adaptive evolutionary processes in plants may be accompanied by episodes of introgression, parallel evolution and incomplete lineage sorting that pose challenges in untangling species evolutionary history. Genus Pinus (pines) is one of the most abundant and most studied groups among gymnosperms, and a good example of a lineage where these phenomena have been observed. Pines are among the most ecologically and economically important plant species. Some, such as the pines of the southeastern USA (southern pines in subsection Australes), are subjects of intensive breeding programmes. Despite numerous published studies, the evolutionary history of Australes remains ambiguous and often controversial. We studied the phylogeny of four major southern pine species: shortleaf (Pinus echinata), slash (P. elliottii), longleaf (P. palustris) and loblolly (P. taeda), using sequences from 11 nuclear loci and maximum likelihood and Bayesian methods. Our analysis encountered resolution difficulties similar to earlier published studies. Although incomplete lineage sorting and introgression are two phenomena presumptively underlying our results, the phylogenetic inferences seem to be also influenced by the genes examined, with certain topologies supported by sets of genes sharing common putative functionalities. For example, genes involved in wood formation supported the clade echinata-taeda, genes linked to plant defence supported the clade echinata-elliottii and genes linked to water management properties supported the clade echinata-palustris. The support for these clades was very high and consistent across methods. We discuss the potential factors that could underlie these observations, including incomplete lineage sorting, hybridization and parallel or adaptive evolution. Our results likely reflect the relatively short evolutionary history of the subsection that is thought to have begun during the middle Miocene and has been influenced by climate fluctuations. Published by Oxford

  4. Conflicting genomic signals affect phylogenetic inference in four species of North American pines

    PubMed Central

    Koralewski, Tomasz E.; Mateos, Mariana; Krutovsky, Konstantin V.

    2016-01-01

    Adaptive evolutionary processes in plants may be accompanied by episodes of introgression, parallel evolution and incomplete lineage sorting that pose challenges in untangling species evolutionary history. Genus Pinus (pines) is one of the most abundant and most studied groups among gymnosperms, and a good example of a lineage where these phenomena have been observed. Pines are among the most ecologically and economically important plant species. Some, such as the pines of the southeastern USA (southern pines in subsection Australes), are subjects of intensive breeding programmes. Despite numerous published studies, the evolutionary history of Australes remains ambiguous and often controversial. We studied the phylogeny of four major southern pine species: shortleaf (Pinus echinata), slash (P. elliottii), longleaf (P. palustris) and loblolly (P. taeda), using sequences from 11 nuclear loci and maximum likelihood and Bayesian methods. Our analysis encountered resolution difficulties similar to earlier published studies. Although incomplete lineage sorting and introgression are two phenomena presumptively underlying our results, the phylogenetic inferences seem to be also influenced by the genes examined, with certain topologies supported by sets of genes sharing common putative functionalities. For example, genes involved in wood formation supported the clade echinata–taeda, genes linked to plant defence supported the clade echinata–elliottii and genes linked to water management properties supported the clade echinata–palustris. The support for these clades was very high and consistent across methods. We discuss the potential factors that could underlie these observations, including incomplete lineage sorting, hybridization and parallel or adaptive evolution. Our results likely reflect the relatively short evolutionary history of the subsection that is thought to have begun during the middle Miocene and has been influenced by climate fluctuations. PMID

  5. Module Anchored Network Inference: A Sequential Module-Based Approach to Novel Gene Network Construction from Genomic Expression Data on Human Disease Mechanism

    PubMed Central

    Keller, Susanna R.; Lee, Jae K.

    2017-01-01

    Different computational approaches have been examined and compared for inferring network relationships from time-series genomic data on human disease mechanisms under the recent Dialogue on Reverse Engineering Assessment and Methods (DREAM) challenge. Many of these approaches infer all possible relationships among all candidate genes, often resulting in extremely crowded candidate network relationships with many more False Positives than True Positives. To overcome this limitation, we introduce a novel approach, Module Anchored Network Inference (MANI), that constructs networks by sequentially analyzing small adjacent building blocks (modules). Using MANI, we inferred a 7-gene adipogenesis network based on time-series gene expression data during adipocyte differentiation. MANI was also applied to infer two 10-gene networks based on time-course perturbation datasets from DREAM3 and DREAM4 challenges. In these applications, MANI successfully inferred and distinguished serial, parallel, and time-dependent gene interactions and network cascades, showing superior performance to other in silico network inference techniques for discovering and reconstructing gene network relationships. PMID:28197408

  6. Impact of UV Radiation on Bacterioplankton Community Composition

    PubMed Central

    Winter, Christian; Moeseneder, Markus M.; Herndl, Gerhard J.

    2001-01-01

    The potential effect of UV radiation on the composition of coastal marine bacterioplankton communities was investigated. Dilution cultures with seawater collected from the surface mixed layer of the coastal North Sea were exposed to different ranges of natural or artificial solar radiation for up to two diurnal cycles. The composition of the bacterioplankton community prior to exposure was compared to that after exposure to the different radiation regimes using denaturing gradient gel electrophoresis (DGGE) of 16S rRNA and 16S ribosomal DNA. Only minor changes in the composition of the bacterial community in the different radiation regimes were detectable. Sequencing of selected bands obtained by DGGE revealed that some species of the Flexibacter-Cytophaga-Bacteroides (FCB) group were sensitive to UV radiation while other species of the FCB group were resistant. Overall, only up to ≈10% of the operational taxonomic units present in the dilution cultures appeared to be affected by UV radiation. Thus, we conclude that UV radiation has little effect on the composition of coastal marine bacterioplankton communities in the North Sea. PMID:11157229

  7. Bacterioplankton assembly and interspecies interaction indicating increasing coastal eutrophication.

    PubMed

    Dai, Wenfang; Zhang, Jinjie; Tu, Qichao; Deng, Ye; Qiu, Qiongfen; Xiong, Jinbo

    2017-06-01

    Anthropogenic perturbations impose negative effects on coastal ecosystems, such as increasing levels of eutrophication. Given the biogeochemical significance of microorganisms, understanding the processes and mechanisms underlying their spatial distribution under changing environmental conditions is critical. To address this question, we examined how coastal bacterioplankton communities respond to increasing eutrophication levels created by anthropogenic perturbations. The results showed that the magnitude of changes in the bacterioplankton community compositions (BCCs) and the importance of deterministic processes that constrained bacterial assembly were closely associated with eutrophication levels. Moreover, increasing eutrophication significantly (P < 0.001) attenuated the distance decay rate, with a random spatial distribution of BCCs in the undisturbed location. In contrast, the complexity of interspecies interaction was enhanced under moderate eutrophication levels but declined under heavy eutrophication. Changes in the relative abundances of 27 bacterial families were significantly correlated with eutrophication levels. Notably, the pattern of enrichment or decrease for a given bacterial family was consistent with its known ecological functions. Our findings demonstrate that the magnitude of changes in BCCs and underlying determinism are dependent on eutrophication levels. However, the buffer capacity of bacterioplankton community is limited, with disrupted interspecies interaction occurring under heavy eutrophication. As such, bacterial assemblages are sensitive to changes in environmental conditions and could thus potentially serve as bio-indicators for increasing eutrophication. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Redox-Specialized Bacterioplankton Metacommunity in a Temperate Estuary

    PubMed Central

    Laas, Peeter; Simm, Jaak; Lips, Inga; Lips, Urmas; Kisand, Veljo; Metsis, Madis

    2015-01-01

    This study explored the spatiotemporal dynamics of the bacterioplankton community composition in the Gulf of Finland (easternmost sub-basin of the Baltic Sea) based on phylogenetic analysis of 16S rDNA sequences acquired from community samples via pyrosequencing. Investigations of bacterioplankton in hydrographically complex systems provide good insight into the strategies by which microbes deal with spatiotemporal hydrographic gradients, as demonstrated by our research. Many ribotypes were closely affiliated with sequences isolated from environments with similar steep physicochemical gradients and/or seasonal changes, including seasonally anoxic estuaries. Hence, one of the main conclusions of this study is that marine ecosystems where oxygen and salinity gradients co-occur can be considered a habitat for a cosmopolitan metacommunity consisting of specialized groups occupying niches universal to such environments throughout the world. These niches revolve around functional capabilities to utilize different electron acceptors and donors (including trace metal and single carbon compounds). On the other hand, temporal shifts in the bacterioplankton community composition at the surface layer were mainly connected to the seasonal succession of phytoplankton and the inflow of freshwater species. We also conclude that many relatively abundant populations are indigenous and well-established in the area. PMID:25860812

  9. Global patterns of diversity and community structure in marine bacterioplankton.

    PubMed

    Pommier, T; Canbäck, B; Riemann, L; Boström, K H; Simu, K; Lundberg, P; Tunlid, A; Hagström, A

    2007-02-01

    Because of their small size, great abundance and easy dispersal, it is often assumed that marine planktonic microorganisms have a ubiquitous distribution that prevents any structured assembly into local communities. To challenge this view, marine bacterioplankton communities from coastal waters at nine locations distributed world-wide were examined through the use of comprehensive clone libraries of 16S ribosomal RNA genes, used as operational taxonomic units (OTU). Our survey and analyses show that there were marked differences in the composition and richness of OTUs between locations. Remarkably, the global marine bacterioplankton community showed a high degree of endemism, and conversely included few cosmopolitan OTUs. Our data were consistent with a latitudinal gradient of OTU richness. We observed a positive relationship between the relative OTU abundances and their range of occupation, i.e. cosmopolitans had the largest population sizes. Although OTU richness differed among locations, the distributions of the major taxonomic groups represented in the communities were analogous, and all local communities were similarly structured and dominated by a few OTUs showing variable taxonomic affiliations. The observed patterns of OTU richness indicate that similar evolutionary and ecological processes structured the communities. We conclude that marine bacterioplankton share many of the biogeographical and macroecological features of macroscopic organisms. The general processes behind those patterns are likely to be comparable across taxa and major global biomes.

  10. Redox-specialized bacterioplankton metacommunity in a temperate estuary.

    PubMed

    Laas, Peeter; Simm, Jaak; Lips, Inga; Lips, Urmas; Kisand, Veljo; Metsis, Madis

    2015-01-01

    This study explored the spatiotemporal dynamics of the bacterioplankton community composition in the Gulf of Finland (easternmost sub-basin of the Baltic Sea) based on phylogenetic analysis of 16S rDNA sequences acquired from community samples via pyrosequencing. Investigations of bacterioplankton in hydrographically complex systems provide good insight into the strategies by which microbes deal with spatiotemporal hydrographic gradients, as demonstrated by our research. Many ribotypes were closely affiliated with sequences isolated from environments with similar steep physicochemical gradients and/or seasonal changes, including seasonally anoxic estuaries. Hence, one of the main conclusions of this study is that marine ecosystems where oxygen and salinity gradients co-occur can be considered a habitat for a cosmopolitan metacommunity consisting of specialized groups occupying niches universal to such environments throughout the world. These niches revolve around functional capabilities to utilize different electron acceptors and donors (including trace metal and single carbon compounds). On the other hand, temporal shifts in the bacterioplankton community composition at the surface layer were mainly connected to the seasonal succession of phytoplankton and the inflow of freshwater species. We also conclude that many relatively abundant populations are indigenous and well-established in the area.

  11. Metabolic Roles of Uncultivated Bacterioplankton Lineages in the Northern Gulf of Mexico “Dead Zone”

    PubMed Central

    Seitz, Kiley W.; Temperton, Ben; Gillies, Lauren E.; Rabalais, Nancy N.; Henrissat, Bernard; Mason, Olivia U.

    2017-01-01

    Marine regions that have seasonal to long-term low dissolved oxygen (DO) concentrations, sometimes called “dead zones,” are increasing in number and severity around the globe with deleterious effects on ecology and economics. One of the largest of these coastal dead zones occurs on the continental shelf of the northern Gulf of Mexico (nGOM), which results from eutrophication-enhanced bacterioplankton respiration and strong seasonal stratification. Previous research in this dead zone revealed the presence of multiple cosmopolitan bacterioplankton lineages that have eluded cultivation, and thus their metabolic roles in this ecosystem remain unknown. We used a coupled shotgun metagenomic and metatranscriptomic approach to determine the metabolic potential of Marine Group II Euryarchaeota, SAR406, and SAR202. We recovered multiple high-quality, nearly complete genomes from all three groups as well as candidate phyla usually associated with anoxic environments—Parcubacteria (OD1) and Peregrinibacteria. Two additional groups with putative assignments to ACD39 and PAUC34f supplement the metabolic contributions by uncultivated taxa. Our results indicate active metabolism in all groups, including prevalent aerobic respiration, with concurrent expression of genes for nitrate reduction in SAR406 and SAR202, and dissimilatory nitrite reduction to ammonia and sulfur reduction by SAR406. We also report a variety of active heterotrophic carbon processing mechanisms, including degradation of complex carbohydrate compounds by SAR406, SAR202, ACD39, and PAUC34f. Together, these data help constrain the metabolic contributions from uncultivated groups in the nGOM during periods of low DO and suggest roles for these organisms in the breakdown of complex organic matter. PMID:28900024

  12. Inferring regulatory elements from a whole genome. An analysis of Helicobacter pylori sigma(80) family of promoter signals.

    PubMed

    Vanet, A; Marsan, L; Labigne, A; Sagot, M F

    2000-03-24

    Helicobacter pylori is adapted to life in a unique niche, the gastric epithelium of primates. Its promoters may therefore be different from those of other bacteria. Here, we determine motifs possibly involved in the recognition of such promoter sequences by the RNA polymerase using a new motif identification method. An important feature of this method is that the motifs are sought with the fewest possible assumptions about what they may look like. The method starts by considering the whole genome of H. pylori and attempts to infer directly from it a description for a family of promoters. Thus, this approach differs from searching for such promoters with a previously established description. The two algorithms are based on the idea of inferring motifs by flexibly comparing words in the sequences with an external object, instead of between themselves. The first algorithm infers single motifs, the second a combination of two motifs separated from one another by strictly defined, sterically constrained distances. Besides independently finding motifs known to be present in other bacteria, such as the Shine-Dalgarno sequence and the TATA-box, this approach suggests the existence in H. pylori of a new, combined motif, TTAAGC, followed optimally 21 bp downstream by TATAAT. Between these two motifs, there is in some cases another, TTTTAA or, less frequently, a repetition of TTAAGC separated optimally from the TATA-box by 12 bp. The combined motif TTAAGCx(21±2)TATAAT is present with no errors immediately upstream from the only two copies of the ribosomal 23S-5S RNA genes in H. pylori, and with one error upstream from the only two copies of the ribosomal 16S RNA genes. The operons of both ribosomal RNA molecules are strongly expressed, representing an encouraging sign of the pertinence of the motifs found by the algorithms. In 25 cases out of a possible 30, the combined motif is found with no more than three substitutions immediately upstream from ribosomal proteins, or
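
    The combined motif described in this record, TTAAGC followed 21 ± 2 bp downstream by TATAAT, can be scanned for with a few lines of code. The sketch below is illustrative only and is not the authors' inference algorithm: it assumes the spacer means 19 to 23 arbitrary bases, counts errors as substitutions only (Hamming distance), and the function name `find_combined_motif` is hypothetical.

```python
# Illustrative scan for the combined motif TTAAGC x(21 +/- 2) TATAAT.
# Assumptions (not from the source): the spacer is 19-23 arbitrary bases,
# and "errors" are substitutions only, summed over both boxes.
UPSTREAM = "TTAAGC"
DOWNSTREAM = "TATAAT"
SPACER_MIN, SPACER_MAX = 19, 23


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_combined_motif(seq, max_errors=0):
    """Return a list of (start, spacer_length, errors) for every hit.

    start is the 0-based position of the upstream box; errors is the
    total number of substitutions across both boxes.
    """
    seq = seq.upper()
    box = len(UPSTREAM)
    hits = []
    for start in range(len(seq) - box + 1):
        up_err = hamming(seq[start:start + box], UPSTREAM)
        if up_err > max_errors:
            continue
        for spacer in range(SPACER_MIN, SPACER_MAX + 1):
            d_start = start + box + spacer
            window = seq[d_start:d_start + len(DOWNSTREAM)]
            if len(window) < len(DOWNSTREAM):
                break  # ran off the end of the sequence
            total = up_err + hamming(window, DOWNSTREAM)
            if total <= max_errors:
                hits.append((start, spacer, total))
    return hits


example = "GG" + "TTAAGC" + "A" * 21 + "TATAAT" + "CC"
exact_hits = find_combined_motif(example)
```

    Raising `max_errors` to 1 or 3 mirrors the "one error" and "no more than three substitutions" cases mentioned in the abstract, under the same substitution-only assumption.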

  13. Morphological homoplasy, life history evolution, and historical biogeography of plethodontid salamanders inferred from complete mitochondrial genomes

    PubMed Central

    Mueller, Rachel Lockridge; Macey, J. Robert; Jaekel, Martin; Wake, David B.; Boore, Jeffrey L.

    2004-01-01

    The evolutionary history of the largest salamander family (Plethodontidae) is characterized by extreme morphological homoplasy. Analysis of the mechanisms generating such homoplasy requires an independent molecular phylogeny. To this end, we sequenced 24 complete mitochondrial genomes (22 plethodontids and two outgroup taxa), added data for three species from GenBank, and performed partitioned and unpartitioned Bayesian, maximum likelihood, and maximum parsimony phylogenetic analyses. We explored four dataset partitioning strategies to account for evolutionary process heterogeneity among genes and codon positions, all of which yielded increased model likelihoods and decreased numbers of supported nodes in the topologies (Bayesian posterior probability >0.95) relative to the unpartitioned analysis. Our phylogenetic analyses yielded congruent trees that contrast with the traditional morphology-based taxonomy; the monophyly of three of four major groups is rejected. Reanalysis of current hypotheses in light of these evolutionary relationships suggests that (i) a larval life history stage reevolved from a direct-developing ancestor multiple times; (ii) there is no phylogenetic support for the “Out of Appalachia” hypothesis of plethodontid origins; and (iii) novel scenarios must be reconstructed for the convergent evolution of projectile tongues, reduction in toe number, and specialization for defensive tail loss. Some of these scenarios imply morphological transformation series that proceed in the opposite direction than was previously thought. In addition, they suggest surprising evolutionary lability in traits previously interpreted to be conservative. PMID:15365171

  14. Origins of the Moken Sea Gypsies inferred from mitochondrial hypervariable region and whole genome sequences.

    PubMed

    Dancause, Kelsey Needham; Chan, Chim W; Arunotai, Narumon Hinshiranan; Lum, J Koji

    2009-02-01

    The origins of the Moken 'Sea Gypsies,' a group of traditionally boat-dwelling nomadic foragers, remain speculative despite previous examinations from linguistic, sociocultural and genetic perspectives. We explored Moken origin(s) and affinities by comparing whole mitochondrial genome and hypervariable segment I sequences from 12 Moken individuals, sampled from four islands of the Mergui Archipelago, to other mainland Asian, Island Southeast Asian (ISEA) and Oceanic populations. These analyses revealed a major (11/12) and a minor (1/12) haplotype in the population, indicating low mitochondrial diversity likely resulting from historically low population sizes, isolation and consequent genetic drift. Phylogenetic analyses revealed close relationships between the major lineage (MKN1) and ISEA, mainland Asian and aboriginal Malay populations, and of the minor lineage (MKN2) to populations from ISEA. MKN1 belongs to a recently defined subclade of the ancient yet localized M21 haplogroup. MKN2 is not closely related to any previously sampled lineages, but has been tentatively assigned to the basal M46 haplogroup that possibly originated among the original inhabitants of ISEA. Our analyses suggest that MKN1 originated within coastal mainland SEA and dispersed into ISEA and rapidly into the Mergui Archipelago within the past few thousand years as a result of climate change induced population pressure.

  15. Morphological homoplasy, life history evolution, and historical biogeography of plethodontid salamanders inferred from complete mitochondrial genomes

    SciTech Connect

    Mueller, Rachel Lockridge; Macey, J. Robert; Jaekel, Martin; Wake, David B.; Boore, Jeffrey L.

    2004-08-01

    The evolutionary history of the largest salamander family (Plethodontidae) is characterized by extreme morphological homoplasy. Analysis of the mechanisms generating such homoplasy requires an independent, molecular phylogeny. To this end, we sequenced 24 complete mitochondrial genomes (22 plethodontids and two outgroup taxa), added data for three species from GenBank, and performed partitioned and unpartitioned Bayesian, ML, and MP phylogenetic analyses. We explored four dataset partitioning strategies to account for evolutionary process heterogeneity among genes and codon positions, all of which yielded increased model likelihoods and decreased numbers of supported nodes in the topologies (PP > 0.95) relative to the unpartitioned analysis. Our phylogenetic analyses yielded congruent trees that contrast with the traditional morphology-based taxonomy; the monophyly of three out of four major groups is rejected. Reanalysis of current hypotheses in light of these new evolutionary relationships suggests that (1) a larval life history stage re-evolved from a direct-developing ancestor multiple times, (2) there is no phylogenetic support for the “Out of Appalachia” hypothesis of plethodontid origins, and (3) novel scenarios must be reconstructed for the convergent evolution of projectile tongues, reduction in toe number, and specialization for defensive tail loss. Some of these novel scenarios imply morphological transformation series that proceed in the opposite direction than was previously thought. In addition, they suggest surprising evolutionary lability in traits previously interpreted to be conservative.

  16. Effects of UV-B Radiation on the Structural and Physiological Diversity of Bacterioneuston and Bacterioplankton

    PubMed Central

    Santos, Ana L.; Oliveira, Vanessa; Baptista, Inês; Henriques, Isabel; Gomes, Newton C. M.; Almeida, Adelaide; Correia, António

    2012-01-01

    The effects of UV radiation (UVR) on estuarine bacterioneuston and bacterioplankton were assessed in microcosm experiments. Bacterial abundance and DNA synthesis were more affected in bacterioplankton. Protein synthesis was more inhibited in bacterioneuston. Community analysis indicated that UVR has the potential to select resistant bacteria (e.g., Gammaproteobacteria), particularly abundant in bacterioneuston. PMID:22247171

  17. Phylogeny and biogeography of the family Salamandridae (Amphibia: Caudata) inferred from complete mitochondrial genomes.

    PubMed

    Zhang, Peng; Papenfuss, Theodore J; Wake, Marvalee H; Qu, Lianghu; Wake, David B

    2008-11-01

    Phylogenetic relationships of members of the salamander family Salamandridae were examined using complete mitochondrial genomes collected from 42 species representing all 20 salamandrid genera and five outgroup taxa. Weighted maximum parsimony, partitioned maximum likelihood, and partitioned Bayesian approaches all produce an identical, well-resolved phylogeny; most branches are strongly supported with greater than 90% bootstrap values and 1.0 Bayesian posterior probabilities. Our results support recent taxonomic changes in finding the traditional genera Mertensiella, Euproctus, and Triturus to be non-monophyletic species assemblages. We successfully resolved the current polytomy at the base of the salamandrid tree: the Italian newt genus Salamandrina is sister to all remaining salamandrids. Beyond Salamandrina, a clade comprising all remaining newts is separated from a clade containing the true salamanders. Among these newts, the branching orders of well-supported clades are: primitive newts (Echinotriton, Pleurodeles, and Tylototriton), New World newts (Notophthalmus-Taricha), Corsica-Sardinia newts (Euproctus), and modern European newts (Calotriton, Lissotriton, Mesotriton, Neurergus, Ommatotriton, and Triturus) plus modern Asian newts (Cynops, Pachytriton, and Paramesotriton). Two alternative sets of calibration points and two Bayesian dating methods (BEAST and MultiDivTime) were used to estimate timescales for salamandrid evolution. The estimation difference by dating methods is slight and we propose two sets of timescales based on different calibration choices. The two timescales suggest that the initial diversification of extant salamandrids took place in Europe about 97 or 69 Ma. North American salamandrids were derived from their European ancestors by dispersal through North Atlantic Land Bridges in the Late Cretaceous (approximately 69 Ma) or Middle Eocene (approximately 43 Ma). Ancestors of Asian salamandrids most probably dispersed to the eastern Asia

  18. Phylogenetic Diversity of the Enteric Pathogen Salmonella enterica subsp. enterica Inferred from Genome-Wide Reference-Free SNP Characters

    PubMed Central

    Timme, Ruth E.; Pettengill, James B.; Allard, Marc W.; Strain, Errol; Barrangou, Rodolphe; Wehnes, Chris; Van Kessel, JoAnn S.; Karns, Jeffrey S.; Musser, Steven M.; Brown, Eric W.

    2013-01-01

    The enteric pathogen Salmonella enterica is one of the leading causes of foodborne illness in the world. The species is extremely diverse, containing more than 2,500 named serovars that are designated for their unique antigen characters and pathogenicity profiles—some are known to be virulent pathogens, while others are not. Questions regarding the evolution of pathogenicity, significance of antigen characters, diversity of clustered regularly interspaced short palindromic repeat (CRISPR) loci, among others, will remain elusive until a strong evolutionary framework is established. We present the first large-scale S. enterica subsp. enterica phylogeny inferred from a new reference-free k-mer approach of gathering single nucleotide polymorphisms (SNPs) from whole genomes. The phylogeny of 156 isolates representing 78 serovars (102 were newly sequenced) reveals two major lineages, each with many strongly supported sublineages. One of these lineages is the S. Typhi group; well nested within the phylogeny. Lineage-through-time analyses suggest there have been two instances of accelerated rates of diversification within the subspecies. We also found that antigen characters and CRISPR loci reveal different evolutionary patterns than that of the phylogeny, suggesting that a horizontal gene transfer or possibly a shared environmental acquisition might have influenced the present character distribution. Our study also shows the ability to extract reference-free SNPs from a large set of genomes and then to use these SNPs for phylogenetic reconstruction. This automated, annotation-free approach is an important step forward for bacterial disease tracking and in efficiently elucidating the evolutionary history of highly clonal organisms. PMID:24158624
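
    The reference-free k-mer idea mentioned in this record can be pictured as follows: collect odd-length k-mers from each genome, group them by their flanking bases, and flag groups whose central base differs between genomes. The sketch below is a toy illustration of that general idea, not the pipeline used in the study; the function name `kmer_snp_candidates`, the default `k=5` and the filtering rules are all assumptions chosen for readability.

```python
from collections import defaultdict


def kmer_snp_candidates(genomes, k=5):
    """Find candidate SNPs without a reference genome (toy version).

    genomes: dict mapping a genome name to its sequence.
    Returns {(left_flank, right_flank): {name: central_base}} for flank
    pairs seen in at least two genomes, with exactly one central base per
    genome and at least two distinct central bases overall.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that a central base exists")
    half = k // 2
    # flank pair -> genome name -> set of central bases observed
    alleles = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            key = (kmer[:half], kmer[half + 1:])
            alleles[key][name].add(kmer[half])

    candidates = {}
    for key, per_genome in alleles.items():
        # Skip flank pairs that are ambiguous within a single genome.
        if any(len(bases) != 1 for bases in per_genome.values()):
            continue
        calls = {name: next(iter(bases)) for name, bases in per_genome.items()}
        if len(calls) >= 2 and len(set(calls.values())) > 1:
            candidates[key] = calls
    return candidates


toy = {"a": "ACGTACCGTTA", "b": "ACGTAGCGTTA"}
snps = kmer_snp_candidates(toy, k=5)
```

    Real genomes would need longer k-mers, reverse-complement handling and far more careful filtering; none of that is attempted here.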

  19. Phylogenetic relationships and divergence dates of softshell turtles (Testudines: Trionychidae) inferred from complete mitochondrial genomes.

    PubMed

    Li, Haifeng; Liu, Juanjuan; Xiong, Lei; Zhang, Huanhuan; Zhou, Huaxing; Yin, Huazong; Jing, Wanxing; Li, Jun; Shi, Qiong; Wang, Yuqin; Liu, Jianjun; Nie, Liuwang

    2017-03-15

    The softshell turtles (Trionychidae) are one of the most widely distributed reptile groups in the world, and fossils have been found on all continents except Antarctica. The phylogenetic relationships among members of this group have been previously studied; however, there are disagreements regarding its taxonomy, and its phylogeography and divergence times are still poorly understood. Here we present a comprehensive mitogenomic study of softshell turtles. We sequenced the complete mitochondrial genomes of 10 softshell turtles, in addition to the GenBank sequences of Dogania subplana, Lissemys punctata and Trionyx triunguis, which cover all extant genera within Trionychidae except for Cyclanorbis and Cycloderma. These data were combined with other mitogenomes of turtles for phylogenetic analyses. Divergence time-calibration and ancestral reconstruction were calculated using BEAST and RASP software, respectively. Our phylogenetic analyses indicate that Trionychidae is the sister taxon of Carettochelyidae, and support the monophyly of Trionychinae and Cyclanorbinae, which is consistent with morphological data and molecular analysis. Our phylogenetic analyses have established a sister taxon relationship between the Asian Rafetus and the Asian Palea + Pelodiscus + Dogania + Nilssonia + Amyda, whereas a previous study grouped the Asian Rafetus with the American Apalone. The results of divergence time estimates and area ancestral reconstruction show that extant Trionychidae originated in Asia around 108 million years ago (Ma), and radiations mainly occurred during two warm periods, namely, Late Cretaceous-Early Eocene and Oligocene. By combining the estimated divergence time and the reconstructed ancestral area of softshell turtles, we determined that the dispersal of softshell turtles out of Asia may have taken three routes. Furthermore, the times of dispersal seem to be in agreement with the time of the India-Asia collision and opening of the Bering Strait, which

  20. Higher-level salamander relationships and divergence dates inferred from complete mitochondrial genomes.

    PubMed

    Zhang, Peng; Wake, David B

    2009-11-01

    Phylogenetic relationships among the salamander families have been difficult to resolve, largely because the window of time in which major lineages diverged was very short relative to the subsequently long evolutionary history of each family. We present seven new complete mitochondrial genomes representing five salamander families that have no or few mitogenome records in GenBank in order to assess the phylogenetic relationships of all salamander families from a mitogenomic perspective. Phylogenetic analyses of two data sets (one combining the entire mitogenome sequence except for the D-loop, and the other combining the deduced amino acid sequences of all 13 mitochondrial protein-coding genes) produce nearly identical well-resolved topologies. The monophyly of each family is supported, including the controversial Proteidae. The internally fertilizing salamanders are demonstrated to be a clade, concordant with recent results using nuclear genes. The internally fertilizing salamanders include two well-supported clades: one is composed of Ambystomatidae, Dicamptodontidae, and Salamandridae, the other Proteidae, Rhyacotritonidae, Amphiumidae, and Plethodontidae. In contrast to results from nuclear loci, our results support the conventional morphological hypothesis that Sirenidae is the sister-group to all other salamanders and they statistically reject the hypothesis from nuclear genes that the suborder Cryptobranchoidea (Cryptobranchidae + Hynobiidae) branched earlier than the Sirenidae. Using recently recommended fossil calibration points and a "soft bound" calibration strategy, we recalculated evolutionary timescales for tetrapods with an emphasis on living salamanders, under a Bayesian framework with and without a rate-autocorrelation assumption. Our dating results indicate: (i) the widely used rate-autocorrelation assumption in relaxed clock analyses is problematic and the accuracy of molecular dating for early lissamphibian evolution is questionable; (ii) the initial

  1. Limitations to estimating bacterial cross-species transmission using genetic and genomic markers: inferences from simulation modeling

    PubMed Central

    Benavides, Julio A; Cross, Paul C; Luikart, Gordon; Creel, Scott

    2014-01-01

    Cross-species transmission (CST) of bacterial pathogens has major implications for human health, livestock, and wildlife management because it determines whether control actions in one species may have subsequent effects on other potential host species. The study of bacterial transmission has benefitted from methods measuring two types of genetic variation: variable number of tandem repeats (VNTRs) and single nucleotide polymorphisms (SNPs). However, it is unclear whether these data can distinguish between different epidemiological scenarios. We used a simulation model with two host species and known transmission rates (within and between species) to evaluate the utility of these markers for inferring CST. We found that CST estimates are biased for a wide range of parameters when based on VNTRs and a most parsimonious reconstructed phylogeny. However, estimations of CST rates lower than 5% can be achieved with relatively low bias using as few as 250 SNPs. CST estimates are sensitive to several parameters, including the number of mutations accumulated since introduction, stochasticity, the genetic difference of strains introduced, and the sampling effort. Our results suggest that, even with whole-genome sequences, unbiased estimates of CST will be difficult when sampling is limited, mutation rates are low, or for pathogens that were recently introduced. PMID:25469159

  2. Limitations to estimating bacterial cross-species transmission using genetic and genomic markers: inferences from simulation modeling.

    PubMed

    Benavides, Julio A; Cross, Paul C; Luikart, Gordon; Creel, Scott

    2014-08-01

    Cross-species transmission (CST) of bacterial pathogens has major implications for human health, livestock, and wildlife management because it determines whether control actions in one species may have subsequent effects on other potential host species. The study of bacterial transmission has benefitted from methods measuring two types of genetic variation: variable number of tandem repeats (VNTRs) and single nucleotide polymorphisms (SNPs). However, it is unclear whether these data can distinguish between different epidemiological scenarios. We used a simulation model with two host species and known transmission rates (within and between species) to evaluate the utility of these markers for inferring CST. We found that CST estimates are biased for a wide range of parameters when based on VNTRs and a most parsimonious reconstructed phylogeny. However, estimations of CST rates lower than 5% can be achieved with relatively low bias using as few as 250 SNPs. CST estimates are sensitive to several parameters, including the number of mutations accumulated since introduction, stochasticity, the genetic difference of strains introduced, and the sampling effort. Our results suggest that, even with whole-genome sequences, unbiased estimates of CST will be difficult when sampling is limited, mutation rates are low, or for pathogens that were recently introduced.

  3. Inferring the genetic history of lactase persistence along the Italian peninsula from a large genomic interval surrounding the LCT gene.

    PubMed

    De Fanti, Sara; Sazzini, Marco; Giuliani, Cristina; Frazzoni, Federica; Sarno, Stefania; Boattini, Alessio; Marasco, Elena; Mantovani, Vilma; Franceschi, Claudio; Moral, Pedro; Garagnani, Paolo; Luiselli, Donata

    2015-12-01

    Although genetic variants related to lactase persistence in European populations are thought to have first undergone positive selection in farmers from the Balkans and Central Europe, the demographic and evolutionary dynamics that subsequently shaped the distribution of this adaptive trait across the continent have yet to be elucidated. To deepen the knowledge about potential routes of diffusion of lactase persistence to Western Europe we investigated variation at a large genomic region surrounding the LCT gene along the Italian peninsula, a geographical area that played a key role in population movements responsible for Neolithic diffusion across Europe. By genotyping 40 highly selected SNPs in more than 400 Italian individuals we described gradients of nucleotide and haplotype variation potentially related to lactase persistence and compared them with those observed in several European and Mediterranean human groups. Multiple migratory events responsible for earlier introduction of the examined alleles in Italy than in Northern European regions could be invoked. Different demic processes that occurred along the western and eastern sides of the peninsula were also inferred via linkage disequilibrium and population structure analyses. The appreciable genetic continuum observed between people from Northern or Central-Western Italy and Central European populations suggested a local arrival of lactase persistence-related variants mainly via overland routes. On the contrary, diversity of Central-Eastern and Southern Italian groups also entailed gene flow from South-Eastern Mediterranean regions, in accordance with the earlier entrance of the Neolithic in Southern Italy via maritime population movements along the Mediterranean coastlines. © 2015 Wiley Periodicals, Inc.

  4. A new statistic and its power to infer membership in a genome-wide association study using genotype frequencies.

    PubMed

    Jacobs, Kevin B; Yeager, Meredith; Wacholder, Sholom; Craig, David; Kraft, Peter; Hunter, David J; Paschal, Justin; Manolio, Teri A; Tucker, Margaret; Hoover, Robert N; Thomas, Gilles D; Chanock, Stephen J; Chatterjee, Nilanjan

    2009-11-01

    Aggregate results from genome-wide association studies (GWAS), such as genotype frequencies for cases and controls, were until recently often made available on public websites because they were thought to disclose negligible information concerning an individual's participation in a study. Homer et al. recently suggested that a method for forensic detection of an individual's contribution to an admixed DNA sample could be applied to aggregate GWAS data. Using a likelihood-based statistical framework, we developed an improved statistic that uses genotype frequencies and individual genotypes to infer whether a specific individual or any close relatives participated in the GWAS and, if so, what the participant's phenotype status is. Our statistic compares the logarithm of genotype frequencies, in contrast to that of Homer et al., which is based on differences in either SNP probe intensity or allele frequencies. We derive the theoretical power of our test statistics and explore the empirical performance in scenarios with varying numbers of randomly chosen or top-associated SNPs.

  5. Genome at Juncture of Early Human Migration: A Systematic Analysis of Two Whole Genomes and Thirteen Exomes from Kuwaiti Population Subgroup of Inferred Saudi Arabian Tribe Ancestry

    PubMed Central

    Alsmadi, Osama; Hebbar, Prashantha; Antony, Dinu; Behbehani, Kazem; Thanaraj, Thangavel Alphonse

    2014-01-01

    The population of the State of Kuwait is composed of three genetic subgroups of inferred Persian, Saudi Arabian tribe and Bedouin ancestry. The Saudi Arabian tribe subgroup traces its origin to the Najd region of Saudi Arabia. By sequencing two whole genomes and thirteen exomes from this subgroup at high coverage (>40X), we identify 4,950,724 Single Nucleotide Polymorphisms (SNPs), 515,802 indels and 39,762 structural variations. Of the identified variants, 10,098 (8.3%) exomic SNPs, 139,923 (2.9%) non-exomic SNPs, 5,256 (54.3%) exomic indels, and 374,959 (74.08%) non-exomic indels are ‘novel’. Up to 8,070 (79.9%) of the reported novel biallelic exomic SNPs are seen in low frequency (minor allele frequency <5%). We observe 5,462 known and 1,004 novel potentially deleterious nonsynonymous SNPs. Allele frequencies of common SNPs from the 15 exomes are significantly correlated with those from genotype data of a larger cohort of 48 individuals (Pearson correlation coefficient, 0.91; p < 2.2×10⁻¹⁶). A set of 2,485 SNPs show significantly different allele frequencies when compared to populations from other continents. Two notable variants having risk alleles in high frequencies in this subgroup are: a nonsynonymous deleterious SNP (rs2108622 [19:g.15990431C>T] from CYP4F2 gene [MIM:*604426]) associated with warfarin dosage levels [MIM:#122700] required to elicit normal anticoagulant response; and a 3′ UTR SNP (rs6151429 [22:g.51063477T>C] from ARSA gene [MIM:*607574]) associated with Metachromatic Leukodystrophy [MIM:#250100]. Hemoglobin Riyadh variant (identified for the first time in a Saudi Arabian woman) is observed in the exome data. The mitochondrial haplogroup profiles of the 15 individuals are consistent with the haplogroup diversity seen in Saudi Arabian natives, who are believed to have received substantial gene flow from Africa and eastern provenance. We present the first genome resource imperative for designing future genetic studies in Saudi Arabian

  6. Unusual bacterioplankton community structure in ultra-oligotrophic Crater Lake

    USGS Publications Warehouse

    Urbach, Ena; Vergin, Kevin L.; Morse, Ariel

    2001-01-01

    The bacterioplankton assemblage in Crater Lake, Oregon (U.S.A.), is different from communities found in other oxygenated lakes, as demonstrated by four small subunit ribosomal ribonucleic acid (SSU rRNA) gene clone libraries and oligonucleotide probe hybridization to RNA from lake water. Populations in the euphotic zone of this deep (589 m), oligotrophic caldera lake are dominated by two phylogenetic clusters of currently uncultivated bacteria: CL120-10, a newly identified cluster in the verrucomicrobiales, and ACK4 actinomycetes, known as a minor constituent of bacterioplankton in other lakes. Deep-water populations at 300 and 500 m are dominated by a different pair of uncultivated taxa: CL500-11, a novel cluster in the green nonsulfur bacteria, and group I marine crenarchaeota. β-Proteobacteria, dominant in most other freshwater environments, are relatively rare in Crater Lake (≤16% of nonchloroplast bacterial rRNA at all depths). Other taxa identified in Crater Lake libraries include a newly identified candidate bacterial division, ABY1, and a newly identified subcluster, CL0-1, within candidate division OP10. Probe analyses confirmed vertical stratification of several microbial groups, similar to patterns observed in open-ocean systems. Additional similarities between Crater Lake and ocean microbial populations include aphotic zone dominance of group I marine crenarchaeota and green nonsulfur bacteria. Comparison of Crater Lake to other lakes studied by rRNA methods suggests that selective factors structuring Crater Lake bacterioplankton populations may include low concentrations of available trace metals and dissolved organic matter, chemistry of infiltrating hydrothermal waters, and irradiation by high levels of ultraviolet light.

  7. Coastal Bacterioplankton Community Dynamics in Response to a Natural Disturbance

    PubMed Central

    Rappé, Michael S.

    2013-01-01

    In order to characterize how disturbances to microbial communities are propagated over temporal and spatial scales in aquatic environments, the dynamics of bacterial assemblages throughout a subtropical coastal embayment were investigated via SSU rRNA gene analyses over an 8-month period, which encompassed a large storm event. During non-perturbed conditions, sampling sites clustered into three groups based on their microbial community composition: an offshore oceanic group, a freshwater group, and a distinct and persistent coastal group. Significant differences in measured environmental parameters or in the bacterial community due to the storm event were found only within the coastal cluster of sampling sites, and only at 5 of 12 locations; three of these sites showed a significant response in both environmental and bacterial community characteristics. These responses were most pronounced at sites close to the shoreline. During the storm event, otherwise common bacterioplankton community members such as marine Synechococcus sp. and members of the SAR11 clade of Alphaproteobacteria decreased in relative abundance in the affected coastal zone, whereas several lineages of Gammaproteobacteria, Betaproteobacteria, and members of the Roseobacter clade of Alphaproteobacteria increased. The complex spatial patterns in both environmental conditions and microbial community structure related to freshwater runoff and wind convection during the perturbation event lead us to conclude that spatial heterogeneity was an important factor influencing both the dynamics and the resistance of the bacterioplankton communities to disturbances throughout this complex subtropical coastal system. This heterogeneity may play a role in facilitating a rapid rebound of regions harboring distinctly coastal bacterioplankton communities to their pre-disturbed taxonomic composition. PMID:23409156

  8. Phylogeny and biogeography of highly diverged freshwater fish species (Leuciscinae, Cyprinidae, Teleostei) inferred from mitochondrial genome analysis.

    PubMed

    Imoto, Junichi M; Saitoh, Kenji; Sasaki, Takeshi; Yonezawa, Takahiro; Adachi, Jun; Kartavtsev, Yuri P; Miya, Masaki; Nishida, Mutsumi; Hanzawa, Naoto

    2013-02-10

    The distribution of freshwater taxa is a good biogeographic model to study pattern and process of vicariance and dispersal. The subfamily Leuciscinae (Cyprinidae, Teleostei) consists of many species distributed widely in Eurasia and North America. Leuciscinae have been divided into two phyletic groups, leuciscin and phoxinin. The phylogenetic relationships between major clades within the subfamily are poorly understood, largely because of the overwhelming diversity of the group. The origin of the Far Eastern phoxinin is an interesting question regarding the evolutionary history of Leuciscinae. Here we present phylogenetic analysis of 31 species of Leuciscinae and outgroups based on complete mitochondrial genome sequences to clarify the phylogenetic relationships and to infer the evolutionary history of the subfamily. Phylogenetic analysis suggests that the Far Eastern phoxinin species comprised the monophyletic clades Tribolodon, Pseudaspius, Oreoleuciscus and Far Eastern Phoxinus. The Far Eastern phoxinin clade was independent of other Leuciscinae lineages and was closer to North American phoxinins than European leuciscins. All of our analyses also suggested that leuciscins and phoxinins each constituted monophyletic groups. Divergence time estimation suggested that Leuciscinae species diverged from outgroups such as Tincinae 83.3 million years ago (Mya) in the Late Cretaceous and that leuciscin and phoxinin shared a common ancestor 70.7 Mya. Radiation of Leuciscinae lineages occurred during the Late Cretaceous to Paleocene. This period also witnessed the radiation of tetrapods. Reconstruction of ancestral areas indicates Leuciscinae species originated within Europe. Leuciscin species evolved in Europe and the ancestor of phoxinin was distributed in North America. The Far Eastern phoxinins would have dispersed from North America to Far East across the Beringia land bridge. The present study suggests important roles for the continental rearrangements during the

  9. Bacterioplankton secondary production estimates for artificially fertilized shrimp pond

    NASA Astrophysics Data System (ADS)

    Lu, Jing-Rang; Li, De-Shang; Zhang, Hong-Yan

    1997-03-01

    Experiments were conducted from June to September, 1995 in a controlled integrated culture pond-enclosure ecosystem. The principal objective of this study was to quantify the rate of heterotrophic bacterioplankton production in situ in a fertilized pond ecosystem. This paper presents a method by which bacterial production was estimated through incubation in situ and measurement of increased bacterial abundance with time. Bacterial growth rates, production and turnover per day during the periods of culture were estimated. The influence of zooplankton grazing, substrate limitation and water temperature on the bacterial growth rates and production was also studied.

  10. Bacterioplankton carbon cycling along the Subtropical Frontal Zone off New Zealand

    NASA Astrophysics Data System (ADS)

    Baltar, Federico; Stuck, Esther; Morales, Sergio; Currie, Kim

    2015-06-01

    Marine heterotrophic bacterioplankton (Bacteria and Archaea) play a central role in ocean carbon cycling. As such, identifying the factors controlling these microbial populations is crucial to fully understanding carbon fluxes. We studied bacterioplankton activities along a transect crossing three water masses (i.e., Subtropical waters [STW], Sub-Antarctic waters [SAW] and neritic waters [NW]) with contrasting nutrient regimes across the Subtropical Frontal Zone. In contrast to bacterioplankton production and community respiration, bacterioplankton respiration increased in the offshore SAW, causing a seaward increase in the contribution of bacteria to community respiration (from 7% to 100%). Cell-specific bacterioplankton respiration also increased in SAW, but cell-specific production did not, suggesting that prokaryotic cells in SAW were investing more energy towards respiration than growth. This was reflected in a 5-fold decline in bacterioplankton growth efficiency (BGE) towards SAW. One explanation for this decrease in BGE could be the observed reduction in phytoplankton biomass (and presumably organic matter concentration) towards SAW. However, this would not explain why bacterioplankton respiration was highest in SAW, where phytoplankton biomass was lowest. Another factor affecting BGE could be the iron limitation characteristic of high-nutrient low-chlorophyll (HNLC) regions like SAW. Our field-based evidence agrees with previous laboratory experiments in which iron stress provoked a decrease in BGE of marine bacterial isolates. Our results suggest that there is a strong gradient in bacterioplankton carbon cycling rates along the Subtropical Frontal Zone, mainly due to the HNLC conditions of SAW. We suggest that Fe-induced reduction of BGE in HNLC regions like SAW could be relevant in marine carbon cycling, inducing bacterioplankton to act as a link or a sink of organic carbon by impacting on the quantity of organic carbon they incorporate

  11. Pseudoscorpion mitochondria show rearranged genes and genome-wide reductions of RNA gene sizes and inferred structures, yet typical nucleotide composition bias

    PubMed Central

    2012-01-01

    Background Pseudoscorpions are chelicerates and have historically been viewed as being most closely related to solifuges, harvestmen, and scorpions. No mitochondrial genomes of pseudoscorpions have been published, but the mitochondrial genomes of some lineages of Chelicerata possess unusual features, including short rRNA genes and tRNA genes that lack sequence to encode arms of the canonical cloverleaf-shaped tRNA. Additionally, some chelicerates possess an atypical guanine-thymine nucleotide bias on the major coding strand of their mitochondrial genomes. Results We sequenced the mitochondrial genomes of two divergent taxa from the chelicerate order Pseudoscorpiones. We find that these genomes possess unusually short tRNA genes that do not encode cloverleaf-shaped tRNA structures. Indeed, in one genome, all 22 tRNA genes lack sequence to encode canonical cloverleaf structures. We also find that the large ribosomal RNA genes are substantially shorter than those of most arthropods. We inferred secondary structures of the LSU rRNAs from both pseudoscorpions, and find that they have lost multiple helices. Based on comparisons with the crystal structure of the bacterial ribosome, two of these helices were likely contact points with tRNA T-arms or D-arms as they pass through the ribosome during protein synthesis. The mitochondrial gene arrangements of both pseudoscorpions differ from the ancestral chelicerate gene arrangement. One genome is rearranged with respect to the location of protein-coding genes, the small rRNA gene, and at least 8 tRNA genes. The other genome contains 6 tRNA genes in novel locations. Most chelicerates with rearranged mitochondrial genes show a genome-wide reversal of the CA nucleotide bias typical for arthropods on their major coding strand, and instead possess a GT bias. Yet despite their extensive rearrangement, these pseudoscorpion mitochondrial genomes possess a CA bias on the major coding strand. Phylogenetic analyses of all 13

  12. Changes of bacterioplankton apparent species richness in two ornamental fish aquaria.

    PubMed

    Vlahos, Nikolaos; Kormas, Konstantinos Ar; Pachiadaki, Maria G; Meziti, Alexandra; Hotos, George N; Mente, Eleni

    2013-12-01

    We analysed the 16S rRNA gene diversity within the bacterioplankton community in the water column of the ornamental fish Pterophyllum scalare and Archocentrus nigrofasciatus aquaria during a 60-day growth experiment in order to detect any dominant bacterial species and their possible association with the rearing organisms. The basic physical and chemical parameters remained stable but the bacterial community at 0, 30 and 60 days showed marked differences in bacterial cell abundance and diversity. We found high species richness but no dominant phylotypes were detected. Only a few of the phylotypes were found in more than one time point per treatment and always with low relative abundance. The majority of the common phylotypes belonged to the Proteobacteria phylum and were closely related to Acinetobacter junii, Pseudomonas sp., Nevskia ramosa, Vogesella perlucida, Chitinomonas taiwanensis, Acidovorax sp., Pelomonas saccharophila and the rest belonged to the α-Proteobacteria, Bacteroidetes, Actinobacteria, candidate division OP11 and one unaffiliated group. Several of these phylotypes were closely related to known taxa including Sphingopyxis chilensis, Flexibacter aurantiacus subsp. excathedrus and Mycobacterium sp. Despite the high phylogenetic diversity most of the inferred ecophysiological roles of the found phylotypes are related to nitrogen metabolism, a key process for fish aquaria.

  13. Oligotrophic Bacterioplankton with a Novel Single-Cell Life Strategy

    PubMed Central

    Simu, Karin; Hagström, Åke

    2004-01-01

    A large fraction of the marine bacterioplankton community is unable to form colonies on agar surfaces, which so far no experimental evidence can explain. Here we describe a previously undescribed growth behavior of three non-colony-forming oligotrophic bacterioplankton, including a SAR11 cluster representative, the world's most abundant organism. We found that these bacteria exhibit a behavior that promotes growth and dispersal instead of colony formation. Although these bacteria do not form colonies on agar, it was possible to monitor growth on the surface of seawater agar slides containing a fluorescent stain, 4′,6′-diamidino-2-phenylindole (DAPI). Agar slides were prepared by pouring a solution containing 0.7% agar and 0.5 μg of DAPI per ml in seawater onto glass slides. Prompt dispersal of newly divided cells explained the inability to form colonies since immobilized cells (cells immersed in agar) formed microcolonies. The behavior observed suggests a life strategy intended to optimize access of individual cells to substrates. Thus, the inability to form colonies or biofilms appears to be part of a K-selected population strategy in which oligotrophic bacteria explore dissolved organic matter in seawater as single cells. PMID:15066843

  14. Bacterioplankton community variation across river to ocean environmental gradients.

    PubMed

    Fortunato, Caroline S; Crump, Byron C

    2011-08-01

    Coastal zones encompass a complex spectrum of environmental gradients that each impact the composition of bacterioplankton communities. Few studies have attempted to address these gradients comprehensively. We generated a synoptic, 16S rRNA gene-based bacterioplankton community profile of a coastal zone by applying the fingerprinting technique denaturing gradient gel electrophoresis to water samples collected from the Columbia River, estuary, and plume, and along coastal transects covering 360 km of the Oregon and Washington coasts and extending to the deep ocean (>2,000 m). Communities were found to cluster into five distinct groups based on location in the system (ANOSIM, p < 0.003): estuary, plume, epipelagic, shelf bottom (depth < 150 m), and slope bottom (depth > 650 m). Across all environments, abiotic factors (salinity, temperature, depth) explained most of the community variability (ρ = 0.734). But within each coastal environment, biotic factors explained most of the variability. Thus, structuring physical factors in coastal zones, such as salinity and temperature, define the boundaries of many distinct microbial habitats, but within these habitats variability in microbial communities is explained by biological gradients in primary and secondary productivity.

  15. Structuring of Bacterioplankton Diversity in a Large Tropical Bay

    PubMed Central

    Gregoracci, Gustavo B.; Nascimento, Juliana R.; Cabral, Anderson S.; Paranhos, Rodolfo; Valentin, Jean L.; Thompson, Cristiane C.; Thompson, Fabiano L.

    2012-01-01

    Structuring of bacterioplanktonic populations and factors that determine the structuring of specific niche partitions have been demonstrated only for a limited number of colder water environments. In order to better understand the physical, chemical and biological parameters that may influence bacterioplankton diversity and abundance, we examined their productivity, abundance and diversity in the second largest Brazilian tropical bay (Guanabara Bay, GB), as well as seawater physical, chemical and biological parameters of GB. The inner bay location with higher nutrient input favored higher microbial (including vibrio) growth. Metagenomic analysis revealed a predominance of Gammaproteobacteria in this location, while GB locations with lower nutrient concentration favored Alphaproteobacteria and Flavobacteria. According to the subsystems (SEED) functional analysis, GB has a distinctive metabolic signature, comprising a higher number of sequences in the metabolism of phosphorus and aromatic compounds and a lower number of sequences in the photosynthesis subsystem. The apparent phosphorus limitation appears to influence the GB metagenomic signature of the three locations. Phosphorus is also one of the main factors determining changes in the abundance of planktonic vibrios, suggesting that nutrient limitation can be observed at community (metagenomic) and population levels (total prokaryote and vibrio counts). PMID:22363639

  16. Species-Specific Associations Between Bacterioplankton and Photosynthetic Picoeukaryotes

    NASA Astrophysics Data System (ADS)

    Farnelid, H.; Turk-Kubo, K.; Zehr, J. P.

    2016-02-01

    Photosynthetic picoeukaryotes are significant contributors to marine primary productivity. Interactions between marine bacterioplankton and picoeukaryotes frequently occur and can have large biogeochemical impacts. Currently, partly due to methodological difficulties for studying microbial associations in situ, these ecological interactions are poorly characterized. Here we use flow cytometry sorting to identify novel bacterial phylotypes found in physical association with photosynthetic picoeukaryotes. Samples were collected on eight occasions at the Santa Cruz wharf on Monterey Bay during summer and fall, 2014. The phylogeny of associated microbes was assessed through clone libraries and Illumina MiSeq sequencing of amplicons of the 16S rRNA gene. In addition, 16 bacterial isolates comprising 14 taxa were obtained from sorted photosynthetic picoeukaryote cells. The most frequently detected bacterioplankton phyla were Alphaproteobacteria, Bacteroidetes, and Gammaproteobacteria. The sequences from the sorted populations were a community distinct from the unsorted seawater samples, suggesting species-specific functional associations. These species-specific patterns were further supported by re-occurring patterns between replicates and sampling dates. The finding of sequences from the free-living genera Synechococcus and Pelagibacter also suggests that photosynthetic picoeukaryotes can be bacterivores, possibly feeding on some of the most numerically abundant bacteria. The results show that specific bacterial phylotypes are found in association with photosynthetic picoeukaryotes. Taxonomic identification of these associations is a prerequisite for further characterizing the interactions, their metabolic pathways and ecological functions.

  17. BACTERIOPLANKTON DYNAMICS IN NORTHERN SAN FRANCISCO BAY: ROLE OF PARTICLE ASSOCIATION AND SEASONAL FRESHWATER FLOW

    EPA Science Inventory

    Bacterioplankton abundance and metabolic characteristics were observed in northern San Francisco Bay, California, during spring and summer 1996 at three sites: Central Bay, Suisun Bay, and the Sacramento River. These sites spanned a salinity gradient from marine to freshwater, an...

  19. Insights into archaeal evolution and symbiosis from the genomes of a nanoarchaeon and its inferred crenarchaeal host from Obsidian Pool, Yellowstone National Park.

    PubMed

    Podar, Mircea; Makarova, Kira S; Graham, David E; Wolf, Yuri I; Koonin, Eugene V; Reysenbach, Anna-Louise

    2013-04-22

    A single cultured marine organism, Nanoarchaeum equitans, represents the Nanoarchaeota branch of symbiotic Archaea, with a highly reduced genome and unusual features such as multiple split genes. The first terrestrial hyperthermophilic member of the Nanoarchaeota was collected from Obsidian Pool, a thermal feature in Yellowstone National Park, separated by single cell isolation, and sequenced together with its putative host, a Sulfolobales archaeon. Both the new Nanoarchaeota (Nst1) and N. equitans lack most biosynthetic capabilities, and phylogenetic analysis of ribosomal RNA and protein sequences indicates that the two form a deep-branching archaeal lineage. However, the Nst1 genome is more than 20% larger, and encodes a complete gluconeogenesis pathway as well as the full complement of archaeal flagellum proteins. With a larger genome, a smaller repertoire of split protein encoding genes and no split non-contiguous tRNAs, Nst1 appears to have experienced less severe genome reduction than N. equitans. These findings imply that, rather than representing ancestral characters, the extremely compact genomes and multiple split genes of Nanoarchaeota are derived characters associated with their symbiotic or parasitic lifestyle. The inferred host of Nst1 is potentially autotrophic, with a streamlined genome and simplified central and energetic metabolism as compared to other Sulfolobales. Comparison of the N. equitans and Nst1 genomes suggests that the marine and terrestrial lineages of Nanoarchaeota share a common ancestor that was already a symbiont of another archaeon. The two distinct Nanoarchaeota-host genomic data sets offer novel insights into the evolution of archaeal symbiosis and parasitism, enabling further studies of the cellular and molecular mechanisms of these relationships. This article was reviewed by Patrick Forterre, Bettina Siebers (nominated by Michael Galperin) and Purification Lopez-Garcia.

  20. Insights into archaeal evolution and symbiosis from the genomes of a nanoarchaeon and its inferred crenarchaeal host from Obsidian Pool, Yellowstone National Park

    PubMed Central

    2013-01-01

    Background A single cultured marine organism, Nanoarchaeum equitans, represents the Nanoarchaeota branch of symbiotic Archaea, with a highly reduced genome and unusual features such as multiple split genes. Results The first terrestrial hyperthermophilic member of the Nanoarchaeota was collected from Obsidian Pool, a thermal feature in Yellowstone National Park, separated by single cell isolation, and sequenced together with its putative host, a Sulfolobales archaeon. Both the new Nanoarchaeota (Nst1) and N. equitans lack most biosynthetic capabilities, and phylogenetic analysis of ribosomal RNA and protein sequences indicates that the two form a deep-branching archaeal lineage. However, the Nst1 genome is more than 20% larger, and encodes a complete gluconeogenesis pathway as well as the full complement of archaeal flagellum proteins. With a larger genome, a smaller repertoire of split protein encoding genes and no split non-contiguous tRNAs, Nst1 appears to have experienced less severe genome reduction than N. equitans. These findings imply that, rather than representing ancestral characters, the extremely compact genomes and multiple split genes of Nanoarchaeota are derived characters associated with their symbiotic or parasitic lifestyle. The inferred host of Nst1 is potentially autotrophic, with a streamlined genome and simplified central and energetic metabolism as compared to other Sulfolobales. Conclusions Comparison of the N. equitans and Nst1 genomes suggests that the marine and terrestrial lineages of Nanoarchaeota share a common ancestor that was already a symbiont of another archaeon. The two distinct Nanoarchaeota-host genomic data sets offer novel insights into the evolution of archaeal symbiosis and parasitism, enabling further studies of the cellular and molecular mechanisms of these relationships. Reviewers This article was reviewed by Patrick Forterre, Bettina Siebers (nominated by Michael Galperin) and Purification Lopez-Garcia PMID:23607440

  1. Bacterioplankton Populations within the Oxygen Minimum Zone of the Sargasso Sea

    NASA Astrophysics Data System (ADS)

    Schuler, G.; Parsons, R. J.; Johnson, R. J.

    2016-02-01

    Oxygen minimum zones are present throughout the world's oceans, and occur at depths between 200 and 1000m. Heterotrophic bacteria reduce the dissolved oxygen within this layer through respiration, while metabolizing falling particles. This report studied the bacterioplankton in the oxygen minimum zone at the BATS (Bermuda Atlantic Time-series Study) site from July 2014 until November 2014. Total bacterioplankton populations were enumerated through direct counts. In the transitional zone (400m-800m) of the oxygen minimum zone, a secondary bacterioplankton peak formed. This study used FISH (Fluorescent in situ hybridization) and CARD-FISH (Catalyzed Reporter Deposition-Fluorescent in situ hybridization) to enumerate specific bacterial and archaeal taxa. Crenarchaeota (including Thaumarchaeota) increased in abundance within the upper oxycline. Thaumarchaeota have the ammonia monooxygenase gene that oxidizes ammonium into nitrite in low oxygen conditions. Amplification of the amoA gene confirmed that ammonia oxidizing archaea (AOA) were present within the OMZ. Using Terminal Restriction Fragment Length Polymorphism (T-RFLP), the bacterial community structure showed high similarity based on depth zones (0-80m, 160-600m, and 800-4500m). Niskin experiments determined that water collected at 800m had an exponential increase in bacterioplankton over time. While experimental design did not allow for oxygen levels to be maintained, the bacterioplankton community was predominantly bacteria with eubacteria positive cells making up 89.3% of the total bacterioplankton community by day 34. Improvements to the experimental design are required to determine which specific bacterial taxa caused this increase at 800m. This study suggests that there are factors other than oxygen influencing bacterioplankton populations at the BATS site, and more analysis is needed once the BATS data is available to determine the key drivers of bacterioplankton dynamics within the BATS OMZ.

  2. Verrucomicrobia Are Candidates for Polysaccharide-Degrading Bacterioplankton in an Arctic Fjord of Svalbard

    PubMed Central

    Cardman, Z.; Arnosti, C.; Durbin, A.; Ziervogel, K.; Cox, C.; Steen, A. D.

    2014-01-01

    In Arctic marine bacterial communities, members of the phylum Verrucomicrobia are consistently detected, although not typically abundant, in 16S rRNA gene clone libraries and pyrotag surveys of the marine water column and in sediments. In an Arctic fjord (Smeerenburgfjord) of Svalbard, members of the Verrucomicrobia, together with Flavobacteria and smaller proportions of Alpha- and Gammaproteobacteria, constituted the most frequently detected bacterioplankton community members in 16S rRNA gene-based clone library analyses of the water column. Parallel measurements in the water column of the activities of six endo-acting polysaccharide hydrolases showed that chondroitin sulfate, laminarin, and xylan hydrolysis accounted for most of the activity. Several Verrucomicrobia water column phylotypes were affiliated with previously sequenced, glycoside hydrolase-rich genomes of individual Verrucomicrobia cells that bound fluorescently labeled laminarin and xylan and therefore constituted candidates for laminarin and xylan hydrolysis. In sediments, the bacterial community was dominated by different lineages of Verrucomicrobia, Bacteroidetes, and Proteobacteria but also included members of multiple phylum-level lineages not observed in the water column. This community hydrolyzed laminarin, xylan, chondroitin sulfate, and three additional polysaccharide substrates at high rates. Comparisons with data from the same fjord in the previous summer showed that the bacterial community in Smeerenburgfjord changed in composition, most conspicuously in the changing detection frequency of Verrucomicrobia in the water column. Nonetheless, in both years the community hydrolyzed the same polysaccharide substrates. PMID:24727271

  3. Verrucomicrobia are candidates for polysaccharide-degrading bacterioplankton in an arctic fjord of Svalbard.

    PubMed

    Cardman, Z; Arnosti, C; Durbin, A; Ziervogel, K; Cox, C; Steen, A D; Teske, A

    2014-06-01

    In Arctic marine bacterial communities, members of the phylum Verrucomicrobia are consistently detected, although not typically abundant, in 16S rRNA gene clone libraries and pyrotag surveys of the marine water column and in sediments. In an Arctic fjord (Smeerenburgfjord) of Svalbard, members of the Verrucomicrobia, together with Flavobacteria and smaller proportions of Alpha- and Gammaproteobacteria, constituted the most frequently detected bacterioplankton community members in 16S rRNA gene-based clone library analyses of the water column. Parallel measurements in the water column of the activities of six endo-acting polysaccharide hydrolases showed that chondroitin sulfate, laminarin, and xylan hydrolysis accounted for most of the activity. Several Verrucomicrobia water column phylotypes were affiliated with previously sequenced, glycoside hydrolase-rich genomes of individual Verrucomicrobia cells that bound fluorescently labeled laminarin and xylan and therefore constituted candidates for laminarin and xylan hydrolysis. In sediments, the bacterial community was dominated by different lineages of Verrucomicrobia, Bacteroidetes, and Proteobacteria but also included members of multiple phylum-level lineages not observed in the water column. This community hydrolyzed laminarin, xylan, chondroitin sulfate, and three additional polysaccharide substrates at high rates. Comparisons with data from the same fjord in the previous summer showed that the bacterial community in Smeerenburgfjord changed in composition, most conspicuously in the changing detection frequency of Verrucomicrobia in the water column. Nonetheless, in both years the community hydrolyzed the same polysaccharide substrates.

  4. Biogeography of bacterioplankton in lakes and streams of an Arctic tundra catchment.

    PubMed

    Crump, Byron C; Adams, Heather E; Hobbie, John E; Kling, George W

    2007-06-01

    Bacterioplankton community composition was compared across 10 lakes and 14 streams within the catchment of Toolik Lake, a tundra lake in Arctic Alaska, during seven surveys conducted over three years using denaturing gradient gel electrophoresis (DGGE) of PCR-amplified rDNA. Bacterioplankton communities in streams draining tundra were very different than those in streams draining lakes. Communities in streams draining lakes were similar to communities in lakes. In a connected series of lakes and streams, the stream communities changed with distance from the upstream lake and with changes in water chemistry, suggesting inoculation and dilution with bacteria from soil waters or hyporheic zones. In the same system, lakes shared similar bacterioplankton communities (78% similar) that shifted gradually down the catchment. In contrast, unconnected lakes contained somewhat different communities (67% similar). We found evidence that dispersal influences bacterioplankton communities via advection and dilution (mass effects) in streams, and via inoculation and subsequent growth in lakes. The spatial pattern of bacterioplankton community composition was strongly influenced by interactions among soil water, stream, and lake environments. Our results reveal large differences in lake-specific and stream-specific bacterial community composition over restricted spatial scales (<10 km) and suggest that geographic distance and connectivity influence the distribution of bacterioplankton communities across a landscape.

  5. Elevated pCO2 enhances bacterioplankton removal of organic carbon

    PubMed Central

    James, Anna K.; Passow, Uta; Brzezinski, Mark A.; Parsons, Rachel J.; Trapani, Jennifer N.; Carlson, Craig A.

    2017-01-01

    Factors that affect the removal of organic carbon by heterotrophic bacterioplankton can impact the rate and magnitude of organic carbon loss in the ocean through the conversion of a portion of consumed organic carbon to CO2. Through enhanced rates of consumption, surface bacterioplankton communities can also reduce the amount of dissolved organic carbon (DOC) available for export from the surface ocean. The present study investigated the direct effects of elevated pCO2 on bacterioplankton removal of several forms of DOC ranging from glucose to complex phytoplankton exudate and lysate, and naturally occurring DOC. Elevated pCO2 (1000–1500 ppm) enhanced both the rate and magnitude of organic carbon removal by bacterioplankton communities compared to low (pre-industrial and ambient) pCO2 (250 to ~400 ppm). The increased removal was largely due to enhanced respiration, rather than enhanced production of bacterioplankton biomass. The results suggest that elevated pCO2 can increase DOC consumption and decrease bacterioplankton growth efficiency, ultimately decreasing the amount of DOC available for vertical export and increasing the production of CO2 in the surface ocean. PMID:28257422

  6. Elevated pCO2 enhances bacterioplankton removal of organic carbon.

    PubMed

    James, Anna K; Passow, Uta; Brzezinski, Mark A; Parsons, Rachel J; Trapani, Jennifer N; Carlson, Craig A

    2017-01-01

    Factors that affect the removal of organic carbon by heterotrophic bacterioplankton can impact the rate and magnitude of organic carbon loss in the ocean through the conversion of a portion of consumed organic carbon to CO2. Through enhanced rates of consumption, surface bacterioplankton communities can also reduce the amount of dissolved organic carbon (DOC) available for export from the surface ocean. The present study investigated the direct effects of elevated pCO2 on bacterioplankton removal of several forms of DOC ranging from glucose to complex phytoplankton exudate and lysate, and naturally occurring DOC. Elevated pCO2 (1000-1500 ppm) enhanced both the rate and magnitude of organic carbon removal by bacterioplankton communities compared to low (pre-industrial and ambient) pCO2 (250 to ~400 ppm). The increased removal was largely due to enhanced respiration, rather than enhanced production of bacterioplankton biomass. The results suggest that elevated pCO2 can increase DOC consumption and decrease bacterioplankton growth efficiency, ultimately decreasing the amount of DOC available for vertical export and increasing the production of CO2 in the surface ocean.

  7. Seasonality of freshwater bacterioplankton diversity in two tropical shallow lakes from the Brazilian Atlantic Forest.

    PubMed

    Ávila, Marcelo P; Staehr, Peter A; Barbosa, Francisco A R; Chartone-Souza, Edmar; Nascimento, Andréa M A

    2017-01-01

    Bacteria are highly important for the cycling of organic and inorganic matter in freshwater environments; however, little is known about the diversity of bacterioplankton in tropical systems. Studies on carbon and nutrient cycling in tropical lakes suggest a very different seasonality from that of temperate climates. Here, we used 16S rRNA gene next-generation sequencing (NGS) to investigate seasonal changes in bacterioplankton communities of two tropical lakes, which differed in trophic status and mixing regime. Our findings revealed seasonally and depth-wise highly dynamic bacterioplankton communities. Differences in richness and structure appeared strongly related to the physicochemical characteristics of the water column, especially phosphate, pH and oxygen. Bacterioplankton communities were dominated by common taxonomic groups, such as Synechococcus and Actinobacteria acI, as well as rare and poorly characterized taxa such as 'Candidatus Methylacidiphilum' (Verrucomicrobia). Stratification and oxygen depletion during the rainy season promoted the occurrence of anoxygenic phototrophic and methanotrophic bacteria important for carbon and nutrient cycling. Differences in lake mixing regime were associated with seasonal beta diversity. Our study is the first attempt to use NGS for cataloging the diversity of bacterioplankton communities in Brazilian lakes and thus contributes to the ongoing worldwide endeavor to characterize freshwater lake bacterioplankton signatures.

  8. DNA hybridization to compare species compositions of natural bacterioplankton assemblages.

    PubMed Central

    Lee, S; Fuhrman, J A

    1990-01-01

    Little is known about the species composition and variability of natural bacterial communities, mostly because conventional identification requires pure cultures, but less than 1% of active natural bacteria are cultivable. This problem was circumvented by comparing species compositions via hybridization of total DNA of natural bacterioplankton communities for the estimation of the fraction of DNA in common between two samples (similarity). DNA probes that were labeled with 35S by nick translation were hybridized to filter-bound DNA in a reciprocal fashion; similarities (in percent) were calculated by normalizing the values to self-hybridizations. In tests with DNA mixtures of pure cultures, the experimentally observed similarities agreed with expectations. However, reciprocal similarities (probe and target reversed) were often asymmetric, unlike those of DNA from single strains. This was due to the relative complexity and G + C content of DNA, which provided a means to interpret the asymmetry that was occasionally observed in natural samples. Natural bacteria were collected by filtration from Long Island Sound (LIS), N.Y., the Caribbean and Sargasso seas, and a coral reef lagoon near Bermuda. The samples showed similarities of less than 10 to 95%. The LIS and Sargasso and Caribbean sea samples were 20 to 50% similar to each other. The coral reef sample was less than 10% similar to the others, indicating its unique composition. Seasonality was also observed; an LIS sample obtained in the autumn was 40% similar to two LIS samples obtained in the summer; these latter two samples were 95% similar. We concluded that total DNA hybridization is a rapid, simple, and unbiased method for investigating the variation of bacterioplankton species composition over time and space, avoiding the need of culturing. PMID:2317044
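
    The normalization described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the signal values are hypothetical, and the abstract does not give the exact formula or any background correction the authors applied. The sketch simply divides each cross-hybridization signal by the probe's self-hybridization signal, computed in both directions so that asymmetry between reciprocal similarities is visible.

```python
def percent_similarity(cross_signal, self_signal):
    """Similarity (%) of a probe to a target, normalized to the probe's
    self-hybridization signal (assumed normalization, see lead-in)."""
    if self_signal <= 0:
        raise ValueError("self-hybridization signal must be positive")
    return 100.0 * cross_signal / self_signal


def reciprocal_similarities(a_on_b, a_on_a, b_on_a, b_on_b):
    """Return (probe A vs. target B, probe B vs. target A) in percent."""
    return (percent_similarity(a_on_b, a_on_a),
            percent_similarity(b_on_a, b_on_b))


# Hypothetical bound-label counts in arbitrary units, not data from the study.
sim_ab, sim_ba = reciprocal_similarities(a_on_b=400.0, a_on_a=1000.0,
                                         b_on_a=300.0, b_on_b=1200.0)
print(sim_ab, sim_ba)  # 40.0 25.0
```

    In this made-up case the two directions disagree (40% vs. 25%), which is the kind of asymmetry the abstract attributes to differences in DNA complexity and G + C content.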

  9. Functional characterization of somatic mutations in cancer using network-based inference of protein activity | Office of Cancer Genomics

    Cancer.gov

    Identifying the multiple dysregulated oncoproteins that contribute to tumorigenesis in a given patient is crucial for developing personalized treatment plans. However, accurate inference of aberrant protein activity in biological samples is still challenging as genetic alterations are only partially predictive and direct measurements of protein activity are generally not feasible.

  10. Arthropod Phylogenetics in Light of Three Novel Millipede (Myriapoda: Diplopoda) Mitochondrial Genomes with Comments on the Appropriateness of Mitochondrial Genome Sequence Data for Inferring Deep Level Relationships

    PubMed Central

    Brewer, Michael S.; Swafford, Lynn; Spruill, Chad L.; Bond, Jason E.

    2013-01-01

    Background: Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. Results: The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. Conclusions: The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the

  11. Quantifying the effects of geographical and environmental factors on distribution of stream bacterioplankton within nature reserves of Fujian, China.

    PubMed

    Wang, Yongming; Yang, Jun; Liu, Lemian; Yu, Zheng

    2015-07-01

    Bacterioplankton are important components of freshwater ecosystems and play essential roles in ecological functions and processes; however, little is known about their geographical distribution and the factors influencing their ecology, especially in stream ecosystems. To examine how geographical and environmental factors affect the composition of bacterioplankton communities, we used denaturing gradient gel electrophoresis and clone sequencing to survey bacterioplankton communities in 31 samples of streamwater from seven nature reserves in Fujian province, southeast China. Our results revealed that dominant bacterioplankton communities exhibited a distinct geographical pattern. Further, we provided evidence for distance decay relationships in bacterioplankton community similarity and found similar community gradients in response to elevation and latitude. Both redundancy analyses and Mantel tests showed that bacterioplankton community composition was significantly correlated with both environmental (electrical conductivity, total phosphorus, and PO4-P) and geographical factors (latitude, longitude, and elevation). Variance partitioning further showed that the joint effect of geographical and environmental factors explained the largest proportion of the variation in distribution of bacterioplankton communities (13.6 %), followed by purely geographical factors (11.2 %), and purely environmental factors (0.6 %). The Betaproteobacteria were the most common taxa in the streams, followed by Firmicutes and Gammaproteobacteria. Therefore, our results suggest that the biogeographical patterns of stream bacterioplankton communities across the Fujian nature reserves are more influenced by geographical factors than by local physicochemical properties.
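
    The variance-partitioning figures in this abstract follow the usual two-set decomposition, which can be sketched as below. This is a hedged illustration, not the authors' analysis: the function name is hypothetical, and the three R-squared inputs are back-calculated from the percentages quoted in the abstract on the assumption that those percentages are the pure and joint fractions. Published analyses typically use adjusted R-squared values from constrained ordinations.

```python
def partition_variance(r2_env, r2_geo, r2_full):
    """Split explained variation into pure and joint fractions.

    r2_env  -- variation explained by environmental variables alone
    r2_geo  -- variation explained by geographical variables alone
    r2_full -- variation explained by both sets together
    """
    return {
        "pure_env": r2_full - r2_geo,          # unique to environment
        "pure_geo": r2_full - r2_env,          # unique to geography
        "joint": r2_env + r2_geo - r2_full,    # shared by both sets
        "residual": 1.0 - r2_full,             # unexplained
    }


# Inputs assumed from the abstract: 0.6% + 13.6%, 11.2% + 13.6%, and the sum of all three.
fractions = partition_variance(r2_env=0.142, r2_geo=0.248, r2_full=0.254)
for name, value in fractions.items():
    print(f"{name}: {100 * value:.1f}%")
```

    Under these assumptions about three quarters of the variation in community composition remains unexplained by either set of factors.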

  12. Contrasting diversity of epibiotic bacteria and surrounding bacterioplankton of a common submerged macrophyte, Potamogeton crispus, in freshwater lakes.

    PubMed

    He, Dan; Ren, Lijuan; Wu, Qinglong L

    2014-12-01

    Epibiotic bacteria on surfaces of submerged macrophytes play important roles in the ecological processes of shallow lakes. However, their community ecology and dynamics are far from understood in comparison with those of bacterioplankton. Here, we conducted a comparative study of the species diversity and composition of the epibiotic bacterial and surrounding bacterioplankton communities of a common submerged macrophyte, Potamogeton crispus, in 12 lakes at a regional scale in China. We found that in different freshwater lakes, epibiotic bacteria possessed higher taxonomic richness than bacterioplankton did. There was a marked divergence in community structure between epibiotic bacteria and bacterioplankton. Alphaproteobacteria was the most dominant group among epibiotic bacteria, whereas Actinobacteria dominated bacterioplankton. Although variations in both bacterioplankton and epibiotic bacterial community compositions in different lakes were better explained by environmental than spatial factors, both environment and space had stronger effects on epibiotic bacteria. This implies more complex and diverse 'microhabitats' for epibiotic bacteria on surfaces of submerged macrophytes, which may lead to higher variation in epibiotic bacteria than in bacterioplankton. Our study suggests that epibiotic bacteria exhibit higher diversity than, and a community composition distinct from, the surrounding bacterioplankton. More attention should be focused on the productive and diverse microbial habitats on submerged macrophytes.

  13. Marine bacterioplankton biomass, activity and community structure in the vicinity of Antarctic icebergs

    NASA Astrophysics Data System (ADS)

    Murray, Alison E.; Peng, Vivian; Tyler, Charlotte; Wagh, Protima

    2011-06-01

    We studied marine bacterioplankton in the Scotia Sea in June 2008 and in the northwest Weddell Sea in March to mid April 2009 in waters proximal to three free-drifting icebergs (SS-1, A-43k, and C-18a), in a region with a high density of smaller icebergs (iceberg alley), and at stations upstream of the iceberg trajectories, 16-75 km away, that were designated as far-field reference sites. Hydrographic parameters were used to define water masses in which comparisons between bacterioplankton-associated characteristics (abundance, leucine incorporation into protein, aminopeptidase activities and community structure) within and between water masses could be made. Early winter Scotia Sea bacterioplankton had low levels of cells and low heterotrophic production rates in the upper 50 m. Influences of the icebergs on bacterioplankton at this time of year were minimal, if not deleterious, as we found lower levels of heterotrophic production near A-43k in comparison to stations >16 km away. Additionally, the results point to small but significant differences in cell abundance, heterotrophic production, and community structure between the two icebergs studied. These icebergs differed greatly in size and the findings suggest that the larger iceberg had a greater effect. In the NW Weddell Sea in March to mid April, bacterioplankton were twice as abundant and had heterotrophic production rates that were 8-fold higher than what we determined in the Scotia Sea, though levels were still quite low, which is typical for autumn. We did not detect direct iceberg-related influences on the bacterioplankton characteristics studied here. Clues to understanding bacterioplankton responses may lie in the details of community structure, as there were some significant differences in community structure in the winter water and underlying upper circumpolar deep-water masses between stations occupied close to C-18a and at stations 18 km away (i.e. Polaribacter and Pelagibacter

  14. Inferring Cell Differentiation Processes Based on Phylogenetic Analysis of Genome-Wide Epigenetic Information: Hematopoiesis as a Model Case

    PubMed Central

    Koyanagi, Kanako O.

    2015-01-01

    How cells divide and differentiate is a fundamental question in organismal development; however, the discovery of differentiation processes in various cell types is laborious and sometimes impossible. Phylogenetic analysis is typically used to reconstruct evolutionary processes based on inherent characters. It could also be used to reconstruct developmental processes based on the developmental changes that occur during cell proliferation and differentiation. In this study, DNA methylation information from differentiated hematopoietic cells was used to perform phylogenetic analyses. The results were assessed for their validity in inferring hierarchical differentiation processes of hematopoietic cells and DNA methylation processes of differentiating progenitor cells. Overall, phylogenetic analyses based on DNA methylation information facilitated inferences regarding hematopoiesis. PMID:25638259

  15. The new physician as unwitting quantum mechanic: is adapting Dirac's inference system best practice for personalized medicine, genomics, and proteomics?

    PubMed

    Robson, Barry

    2007-08-01

    What is the Best Practice for automated inference in Medical Decision Support for personalized medicine? A known system already exists as Dirac's inference system from quantum mechanics (QM) using bra-kets and bras where A and B are states, events, or measurements representing, say, clinical and biomedical rules. Dirac's system should theoretically be the universal best practice for all inference, though QM is notorious as sometimes leading to bizarre conclusions that appear not to be applicable to the macroscopic world of everyday human experience and medical practice. It is here argued that this apparent difficulty vanishes if QM is assigned one new multiplication function @, which conserves conditionality appropriately, making QM applicable to classical inference including a quantitative form of the predicate calculus. An alternative interpretation with the same consequences is if every i = √-1 in Dirac's QM is replaced by h, an entity distinct from 1 and i and arguably a hidden root of 1 such that h² = 1. With that exception, this paper is thus primarily a review of the application of Dirac's system, by application of linear algebra in the complex domain to help manipulate information about associations and ontology in complicated data. Any combined bra-ket can be shown to be composed only of the sum of QM-like bra and ket weights c(), times an exponential function of Fano's mutual information measure I(A; B) about the association between A and B, that is, an association rule from data mining. With the weights and Fano measure re-expressed as expectations on finite data using Riemann's Incomplete (i.e., Generalized) Zeta Functions, actual counts of observations for real world sparse data can be readily utilized. Finally, the paper compares identical character, distinguishability of states, events, or measurements, correlation, mutual information, and orthogonal character, important issues in data mining

  16. The effect of water exchange on bacterioplankton depletion and inorganic nutrient dynamics in coral reef cavities

    NASA Astrophysics Data System (ADS)

    van Duyl, F. C.; Scheffers, S. R.; Thomas, F. I. M.; Driscoll, M.

    2006-03-01

    We studied the effect of water exchange on the depletion (or accumulation) of bacterioplankton, dissolved organic matter and inorganic nutrients in small open framework cavities (50-70 l) at 15 m depth on the coral reef along Curaçao, Netherlands Antilles. The bacterioplankton removal rate in cavities increased with increasing water exchange rates up to a threshold of 0.0045 s⁻¹, reaching values of 50-100 mg C m⁻² total interior cavity surface area (CSA) per day. Beyond the threshold, bacterioplankton removal dropped. The cryptic community is apparently adapted to the average water exchange in these cavities (0.0041 s⁻¹). Dissolved inorganic nitrogen (DIN), nitrate + nitrite (NOx) in particular, accumulated in cavity water and the accumulation decreased with increasing water exchange. Net NOx effluxes exceeded net DIN effluxes from cavities (average efflux rate of 1.9 mmol NOx vs. 0.8 mmol DIN m⁻² interior CSA per day). The difference is ascribed to net ammonium losses (NH4) in cavities at reef concentrations >0.025 μM NH4, possibly due to enhanced nitrification. Dissolved inorganic phosphate accumulated in cavities, but was not related to water exchange. The cryptic biota in cavities depend on water exchange for optimization of consumption of bacterioplankton and removal of inorganic nitrogen. Coral cavities are an evident sink of bacterioplankton and a source of NOx and PO₄³⁻.

  17. High Temporal but Low Spatial Heterogeneity of Bacterioplankton in the Chesapeake Bay

    PubMed Central

    Kan, Jinjun; Suzuki, Marcelino T.; Wang, Kui; Evans, Sarah E.; Chen, Feng

    2007-01-01

    Compared to freshwater and the open ocean, less is known about bacterioplankton community structure and spatiotemporal dynamics in estuaries, particularly those with long residence times. The Chesapeake Bay is the largest estuary in the United States, but despite its ecological and economic significance, little is known about its microbial community composition. A rapid screening approach, ITS (internal transcribed spacer)-LH (length heterogeneity)-PCR, was used to screen six rRNA operon (16S rRNA-ITS-23S rRNA) clone libraries constructed from bacterioplankton collected in three distinct regions of the Chesapeake Bay over two seasons. The natural length variation of the 16S-23S rRNA gene ITS region, as well as the presence and location of tRNA-alanine coding regions within the ITS, was determined for 576 clones. Clones representing unique ITS-LH-PCR sizes were sequenced and identified. Dramatic shifts in bacterial composition (changes within subgroups or clades) were observed for the Alphaproteobacteria (Roseobacter clade, SAR11), Cyanobacteria (Synechococcus), and Actinobacteria, suggesting strong seasonal variation within these taxonomic groups. Despite large gradients in salinity and phytoplankton parameters, a remarkably homogeneous bacterioplankton community was observed in the bay in each season. Stronger seasonal, rather than spatial, variation of the bacterioplankton population was also supported by denaturing gradient gel electrophoresis and LH-PCR analyses, indicating that environmental parameters with stronger seasonal, rather than regional, dynamics, such as temperature, might determine bacterioplankton community composition in the Chesapeake Bay. PMID:17827310

  18. Coupling Bacterioplankton Populations and Environment to Community Function in Coastal Temperate Waters

    PubMed Central

    Traving, Sachia J.; Bentzon-Tilia, Mikkel; Knudsen-Leerbeck, Helle; Mantikci, Mustafa; Hansen, Jørgen L. S.; Stedmon, Colin A.; Sørensen, Helle; Markager, Stiig; Riemann, Lasse

    2016-01-01

    Bacterioplankton play a key role in marine waters, facilitating processes important for carbon cycling. However, the influence of specific bacterial populations and environmental conditions on bacterioplankton community performance remains unclear. The aim of the present study was to identify drivers of bacterioplankton community functions, taking into account the variability in community composition and environmental conditions over seasons, in two contrasting coastal systems. A Least Absolute Shrinkage and Selection Operator (LASSO) analysis of the biological and chemical data obtained from surface waters over a full year indicated that specific bacterial populations were linked to measured functions. Namely, Synechococcus (Cyanobacteria) was strongly correlated with protease activity. Both function and community composition showed seasonal variation. However, the pattern of substrate utilization capacity could not be directly linked to the community dynamics. The overall importance of dissolved organic matter (DOM) parameters in the LASSO models indicates that bacterioplankton respond to the present substrate landscape, with a particular importance of nitrogenous DOM. The identification of common drivers of bacterioplankton community functions in two different systems indicates that the drivers may be of broader relevance in coastal temperate waters. PMID:27729909

  19. Prevalent genome streamlining and latitudinal divergence of planktonic bacteria in the surface ocean.

    PubMed

    Swan, Brandon K; Tupper, Ben; Sczyrba, Alexander; Lauro, Federico M; Martinez-Garcia, Manuel; González, José M; Luo, Haiwei; Wright, Jody J; Landry, Zachary C; Hanson, Niels W; Thompson, Brian P; Poulton, Nicole J; Schwientek, Patrick; Acinas, Silvia G; Giovannoni, Stephen J; Moran, Mary Ann; Hallam, Steven J; Cavicchioli, Ricardo; Woyke, Tanja; Stepanauskas, Ramunas

    2013-07-09

    Planktonic bacteria dominate surface ocean biomass and influence global biogeochemical processes, but remain poorly characterized owing to difficulties in cultivation. Using large-scale single cell genomics, we obtained insight into the genome content and biogeography of many bacterial lineages inhabiting the surface ocean. We found that, compared with existing cultures, natural bacterioplankton have smaller genomes, fewer gene duplications, and are depleted in guanine and cytosine, noncoding nucleotides, and genes encoding transcription, signal transduction, and noncytoplasmic proteins. These findings provide strong evidence that genome streamlining and oligotrophy are prevalent features among diverse, free-living bacterioplankton, whereas existing laboratory cultures consist primarily of copiotrophs. The apparent ubiquity of metabolic specialization and mixotrophy, as predicted from single cell genomes, also may contribute to the difficulty in bacterioplankton cultivation. Using metagenome fragment recruitment against single cell genomes, we show that the global distribution of surface ocean bacterioplankton correlates with temperature and latitude and is not limited by dispersal at the time scales required for nucleotide substitution to exceed the current operational definition of bacterial species. Single cell genomes with highly similar small subunit rRNA gene sequences exhibited significant genomic and biogeographic variability, highlighting challenges in the interpretation of individual gene surveys and metagenome assemblies in environmental microbiology. Our study demonstrates the utility of single cell genomics for gaining an improved understanding of the composition and dynamics of natural microbial assemblages.

  20. Prevalent genome streamlining and latitudinal divergence of planktonic bacteria in the surface ocean

    PubMed Central

    Swan, Brandon K.; Tupper, Ben; Sczyrba, Alexander; Lauro, Federico M.; Martinez-Garcia, Manuel; González, José M.; Luo, Haiwei; Wright, Jody J.; Landry, Zachary C.; Hanson, Niels W.; Thompson, Brian P.; Poulton, Nicole J.; Schwientek, Patrick; Acinas, Silvia G.; Giovannoni, Stephen J.; Moran, Mary Ann; Hallam, Steven J.; Cavicchioli, Ricardo; Woyke, Tanja; Stepanauskas, Ramunas

    2013-01-01

    Planktonic bacteria dominate surface ocean biomass and influence global biogeochemical processes, but remain poorly characterized owing to difficulties in cultivation. Using large-scale single cell genomics, we obtained insight into the genome content and biogeography of many bacterial lineages inhabiting the surface ocean. We found that, compared with existing cultures, natural bacterioplankton have smaller genomes, fewer gene duplications, and are depleted in guanine and cytosine, noncoding nucleotides, and genes encoding transcription, signal transduction, and noncytoplasmic proteins. These findings provide strong evidence that genome streamlining and oligotrophy are prevalent features among diverse, free-living bacterioplankton, whereas existing laboratory cultures consist primarily of copiotrophs. The apparent ubiquity of metabolic specialization and mixotrophy, as predicted from single cell genomes, also may contribute to the difficulty in bacterioplankton cultivation. Using metagenome fragment recruitment against single cell genomes, we show that the global distribution of surface ocean bacterioplankton correlates with temperature and latitude and is not limited by dispersal at the time scales required for nucleotide substitution to exceed the current operational definition of bacterial species. Single cell genomes with highly similar small subunit rRNA gene sequences exhibited significant genomic and biogeographic variability, highlighting challenges in the interpretation of individual gene surveys and metagenome assemblies in environmental microbiology. Our study demonstrates the utility of single cell genomics for gaining an improved understanding of the composition and dynamics of natural microbial assemblages. PMID:23801761

  1. Phylogeny and genetic history of the Siberian salamander (Salamandrella keyserlingii, Dybowski, 1870) inferred from complete mitochondrial genomes.

    PubMed

    Malyarchuk, Boris; Derenko, Miroslava; Denisova, Galina

    2013-05-01

    We assessed phylogeny of the Siberian salamander (Salamandrella keyserlingii, Dybowski, 1870), the most northern ectothermic, terrestrial vertebrate in Eurasia, by sequence analysis of complete mitochondrial genomes in 26 specimens from different localities (China, Khabarovsk region, Sakhalin, Yakutia, Magadan region, Chukotka, Kamchatka, Ural, European part of Russia). In addition, a complete mitochondrial genome of the Schrenck salamander, Salamandrella schrenckii, was determined for the first time. Bayesian phylogenetic analysis of the entire mtDNA genomes of S. keyserlingii demonstrates that two haplotype clades, AB and C, radiated about 1.4 million years ago (Mya). Bayesian skyline plots of population size change through time show an expansion around 250 thousand years ago (kya) and then a decline around the Last Glacial Maximum (25 kya) with subsequent restoration of population size. Climatic changes during the Quaternary period have dramatically affected the population genetic structure of the Siberian salamanders. In addition, complete mtDNA sequence analysis allowed us to recognize that the vast area of Northern Eurasia was colonized only by the Siberian salamander clade C1b during the last 150 kya. Meanwhile, we were unable to find evidence of molecular adaptation in this clade by analyzing the whole mitochondrial genomes of the Siberian salamanders.

  2. Introgression and phenotypic assimilation in Zimmerius flycatchers (Tyrannidae): population genetic and phylogenetic inferences from genome-wide SNPs.

    PubMed

    Rheindt, Frank E; Fujita, Matthew K; Wilton, Peter R; Edwards, Scott V

    2014-03-01

    Genetic introgression is pervasive in nature and may lead to large-scale phenotypic assimilation and/or admixture of populations, but there is limited knowledge on whether large phenotypic changes are typically accompanied by high levels of introgression throughout the genome. Using bioacoustic, biometric, and spectrophotometric data from a flycatcher (Tyrannidae) system in the Neotropical genus Zimmerius, we document a mosaic pattern of phenotypic admixture in which a population of Zimmerius viridiflavus in northern Peru (henceforth "mosaic") is vocally and biometrically similar to conspecifics to the south but shares plumage characteristics with a different species (Zimmerius chrysops) to the north. To clarify the origins of the mosaic population, we used the RAD-seq approach to generate a data set of 37,361 genome-wide single nucleotide polymorphisms (SNPs). A range of population-genetic diagnostics shows that the genome of the mosaic population is largely indistinguishable from southern Z. viridiflavus and distinct from northern Z. chrysops, and the application of parsimony and species tree methods to the genome-wide SNP data set confirms the close affinity of the mosaic population with southern Z. viridiflavus. Even so, using a subset of 2710 SNPs found across all sampled lineages in configurations appropriate for a recently proposed statistical ("ABBA/BABA") test that distinguishes gene flow from incomplete lineage sorting, we detected low levels of gene flow from northern Z. chrysops into the mosaic population. Mapping the candidate loci for introgression from Z. chrysops into the mosaic population to the zebra finch genome reveals close linkage with genes significantly enriched in functions involving cell projection and plasma membranes. Introgression of key alleles may have led to phenotypic assimilation in the plumage of mosaic birds, suggesting that selection may have been a key factor facilitating introgression.
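    The "ABBA/BABA" test mentioned in this abstract is usually summarized by Patterson's D statistic. The sketch below is a minimal illustration, not the authors' pipeline. For a tree (((P1, P2), P3), outgroup), ABBA sites are those where P2 and P3 share the derived allele and BABA sites are those where P1 and P3 share it. Incomplete lineage sorting alone is expected to produce the two patterns equally often, so a D clearly above zero points to gene flow between P2 and P3. The function name and the site counts are hypothetical and are not taken from the study. Significance testing (typically a block jackknife) is omitted.

```python
def patterson_d(n_abba: int, n_baba: int) -> float:
    """Return Patterson's D = (ABBA - BABA) / (ABBA + BABA)."""
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (n_abba - n_baba) / total


# Hypothetical site-pattern counts, chosen only for illustration.
d_stat = patterson_d(n_abba=120, n_baba=100)
print(f"D = {d_stat:.3f}")
```

    A value near zero is consistent with incomplete lineage sorting alone. A small positive value, as in this made-up example, would correspond to the low level of gene flow the abstract describes.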

  3. Alkane hydroxylase gene (alkB) phylotype composition and diversity in northern Gulf of Mexico bacterioplankton

    PubMed Central

    Smith, Conor B.; Tolar, Bradley B.; Hollibaugh, James T.; King, Gary M.

    2013-01-01

    Natural and anthropogenic activities introduce alkanes into marine systems where they are degraded by alkane hydroxylases expressed by phylogenetically diverse bacteria. Partial sequences for alkB, one of the structural genes of alkane hydroxylase, have been used to assess the composition of alkane-degrading communities, and to determine their responses to hydrocarbon inputs. We present here the first spatially extensive analysis of alkB in bacterioplankton of the northern Gulf of Mexico (nGoM), a region that experiences numerous hydrocarbon inputs. We have analyzed 401 partial alkB gene sequences amplified from genomic extracts collected during March 2010 from 17 water column samples that included surface waters and bathypelagic depths. Previous analyses of 16S rRNA gene sequences for these and related samples have shown that nGoM bacterial community composition and structure stratify strongly with depth, with distinctly different communities above and below 100 m. Although we hypothesized that alkB gene sequences would exhibit a similar pattern, PCA analyses of operational protein units (OPU) indicated that community composition did not vary consistently with depth or other major physical-chemical variables. We observed 22 distinct OPUs, one of which was ubiquitous and accounted for 57% of all sequences. This OPU clustered with AlkB sequences from known hydrocarbon oxidizers (e.g., Alcanivorax and Marinobacter). Some OPUs could not be associated with known alkane degraders, however, and perhaps represent novel hydrocarbon-oxidizing populations or genes. These results indicate that the capacity for alkane hydrolysis occurs widely in the nGoM, but that alkane degrader diversity varies substantially among sites and responds differently than bulk communities to physical-chemical variables. PMID:24376439

  4. Metatranscriptomic analysis of ammonia-oxidizing organisms in an estuarine bacterioplankton assemblage

    PubMed Central

    Hollibaugh, James T; Gifford, Scott; Sharma, Shalabh; Bano, Nasreen; Moran, Mary Ann

    2011-01-01

    Quantitative PCR (qPCR) analysis revealed elevated relative abundance (1.8% of prokaryotes) of marine group 1 Crenarchaeota (MG1C) in two samples of southeastern US coastal bacterioplankton, collected in August 2008, compared with samples collected from the same site at different times (mean 0.026%). We analyzed the MG1C sequences in metatranscriptomes from these samples to gain an insight into the metabolism of MG1C population growing in the environment, and for comparison with ammonia-oxidizing bacteria (AOB) in the same samples. Assemblies revealed low diversity within sequences assigned to most individual MG1C open reading frames (ORFs) and high homology with 'Candidatus Nitrosopumilus maritimus' strain SCM1 genome sequences. Reads assigned to ORFs for ammonia uptake and oxidation accounted for 37% of all MG1C transcripts. We did not recover any reads for Nmar_1354–Nmar_1357, proposed to encode components of an alternative, nitroxyl-based ammonia oxidation pathway; however, reads from Nmar_1259 and Nmar_1667, annotated as encoding a multicopper oxidase with homology to nirK, were abundant. Reads assigned to two homologous ORFs (Nmar_1201 and Nmar_1547), annotated as hypothetical proteins, were also abundant, suggesting that their unknown function is important to MG1C. Superoxide dismutase and peroxiredoxin-like transcripts were more abundant in the MG1C transcript pool than in the complete metatranscriptome, suggesting an enhanced response to oxidative stress by the MG1C population. qPCR indicated low AOB abundance (0.0010% of prokaryotes), and we found no transcripts related to ammonia oxidation and only one RuBisCO transcript among the transcripts assigned to AOB, suggesting they were not responding to the same environmental cues as the MG1C population. PMID:21085199

  5. Automatic Determination of Bacterioplankton Biomass by Image Analysis

    PubMed Central

    Bjørnsen, Peter Koefoed

    1986-01-01

    Image analysis was applied to epifluorescence microscopy of acridine orange-stained plankton samples. A program was developed for discrimination and binary segmentation of digitized video images, taken by an ultrasensitive video camera mounted on the microscope. Cell volumes were estimated from area and perimeter of the objects in the binary image. The program was tested on fluorescent latex beads of known diameters. Biovolumes measured by image analysis were compared with directly determined carbon biomasses in batch cultures of estuarine and freshwater bacterioplankton. This calibration revealed an empirical conversion factor from biovolume to biomass of 0.35 pg of C μm−3 (± 0.03 95% confidence limit). The deviation of this value from the normally used conversion factors of 0.086 to 0.121 pg of C μm−3 is discussed. The described system was capable of measuring 250 cells within 10 min, providing estimates of cell number, mean cell volume, and biovolume with a precision of 5%. PMID:16347077
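    The practical effect of the conversion factor reported in this abstract can be seen with a short worked calculation. The sketch below multiplies a biovolume by the empirical factor (0.35 pg C per cubic micrometre) and by the two ends of the conventional range (0.086 to 0.121) quoted in the abstract. The cell count of 250 echoes the throughput figure in the abstract, but the mean cell volume and the function name are assumptions made only for illustration.

```python
# Conversion factors in pg C per cubic micrometre, as quoted in the abstract.
EMPIRICAL_FACTOR = 0.35
CONVENTIONAL_RANGE = (0.086, 0.121)


def biomass_pg_c(biovolume_um3: float, factor: float = EMPIRICAL_FACTOR) -> float:
    """Carbon biomass (pg C) = biovolume (um^3) * factor (pg C per um^3)."""
    if biovolume_um3 < 0:
        raise ValueError("biovolume must be non-negative")
    return biovolume_um3 * factor


# Hypothetical sample: the mean cell volume is an assumed value.
cell_count = 250
mean_cell_volume_um3 = 0.1
biovolume_um3 = cell_count * mean_cell_volume_um3

empirical = biomass_pg_c(biovolume_um3)
low, high = (biomass_pg_c(biovolume_um3, f) for f in CONVENTIONAL_RANGE)
print(f"empirical factor:    {empirical:.2f} pg C")
print(f"conventional range:  {low:.2f} to {high:.2f} pg C")
```

    For the same biovolume, the empirical factor gives roughly three to four times the biomass implied by the conventional factors, which is the deviation the abstract says is discussed.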

  6. Energetic differences between bacterioplankton trophic groups and coral reef resistance

    PubMed Central

    McDole Somera, Tracey; Bailey, Barbara; Barott, Katie; Grasis, Juris; Hatay, Mark; Hilton, Brett J.; Hisakawa, Nao; Nosrat, Bahador; Nulton, James; Silveira, Cynthia B.; Sullivan, Chris; Brainard, Russell E.; Rohwer, Forest

    2016-01-01

    Coral reefs are among the most productive and diverse marine ecosystems on the Earth. They are also particularly sensitive to changing energetic requirements by different trophic levels. Microbialization specifically refers to the increase in the energetic metabolic demands of microbes relative to macrobes and is significantly correlated with increasing human influence on coral reefs. In this study, metabolic theory of ecology is used to quantify the relative contributions of two broad bacterioplankton groups, autotrophs and heterotrophs, to energy flux on 27 Pacific coral reef ecosystems experiencing human impact to varying degrees. The effective activation energy required for photosynthesis is lower than the average energy of activation for the biochemical reactions of the Krebs cycle, and changes in the proportional abundance of these two groups can greatly affect rates of energy and materials cycling. We show that reef-water communities with a higher proportional abundance of microbial autotrophs expend more metabolic energy per gram of microbial biomass. Increased energy and materials flux through fast energy channels (i.e. water-column associated microbial autotrophs) may dampen the detrimental effects of increased heterotrophic loads (e.g. coral disease) on coral reef systems experiencing anthropogenic disturbance. PMID:27097927

  7. Energetic differences between bacterioplankton trophic groups and coral reef resistance.

    PubMed

    McDole Somera, Tracey; Bailey, Barbara; Barott, Katie; Grasis, Juris; Hatay, Mark; Hilton, Brett J; Hisakawa, Nao; Nosrat, Bahador; Nulton, James; Silveira, Cynthia B; Sullivan, Chris; Brainard, Russell E; Rohwer, Forest

    2016-04-27

    Coral reefs are among the most productive and diverse marine ecosystems on the Earth. They are also particularly sensitive to changing energetic requirements by different trophic levels. Microbialization specifically refers to the increase in the energetic metabolic demands of microbes relative to macrobes and is significantly correlated with increasing human influence on coral reefs. In this study, metabolic theory of ecology is used to quantify the relative contributions of two broad bacterioplankton groups, autotrophs and heterotrophs, to energy flux on 27 Pacific coral reef ecosystems experiencing human impact to varying degrees. The effective activation energy required for photosynthesis is lower than the average energy of activation for the biochemical reactions of the Krebs cycle, and changes in the proportional abundance of these two groups can greatly affect rates of energy and materials cycling. We show that reef-water communities with a higher proportional abundance of microbial autotrophs expend more metabolic energy per gram of microbial biomass. Increased energy and materials flux through fast energy channels (i.e. water-column associated microbial autotrophs) may dampen the detrimental effects of increased heterotrophic loads (e.g. coral disease) on coral reef systems experiencing anthropogenic disturbance.

  8. The phylogenetic position of eriophyoid mites (superfamily Eriophyoidea) in Acariformes inferred from the sequences of mitochondrial genomes and nuclear small subunit (18S) rRNA gene.

    PubMed

    Xue, Xiao-Feng; Dong, Yan; Deng, Wei; Hong, Xiao-Yue; Shao, Renfu

    2017-04-01

    Eriophyoid mites (superfamily Eriophyoidea) comprise >4400 species worldwide. Despite over a century of study, the phylogenetic position of these mites within Acariformes is still poorly resolved. Currently, Eriophyoidea is placed in the order Trombidiformes. We inferred the high-level phylogeny of Acari with the mitochondrial (mt) genome sequences of 110 species including four eriophyoid species, and the nuclear small subunit (18S) rRNA gene sequences of 226 species including 25 eriophyoid species. Maximum likelihood (ML), Bayesian inference (BI) and Maximum parsimony (MP) methods were used to analyze the sequence data. Divergence times were estimated for major lineages of Acari using Bayesian approaches. Our analyses consistently recovered the monophyly of Eriophyoidea but rejected the monophyly of Trombidiformes. The eriophyoid mites were grouped with the sarcoptiform mites, or were the sister group of sarcoptiform mites+non-eriophyoid trombidiform mites, depending on data partition strategies. Eriophyoid mites diverged from other mites in the Devonian (384Mya, 95% HPD, 352-410Mya). The origin of eriophyoid mites was dated to the Permian (262Mya, 95% HPD 230-307Mya), mostly prior to the radiation of gymnosperms (Triassic-Jurassic) and angiosperms (early Cretaceous). We propose that the placement of Eriophyoidea in the order Trombidiformes under the current classification system should be reviewed.

  9. Molecular phylogeny of soft ticks (Ixodida: Argasidae) inferred from mitochondrial genome and nuclear rRNA sequences.

    PubMed

    Burger, Thomas D; Shao, Renfu; Labruna, Marcelo B; Barker, Stephen C

    2014-03-01

    The genus-level classification of soft ticks (Argasidae) is controversial. A previous phylogenetic analysis of morphological and developmental characters found that the genus Ornithodoros was paraphyletic and raised a new genus, Carios, for species previously in the genera Antricola, Argas, Ornithodoros, and Nothoaspis (Klompen and Oliver, 1993). Genetic analyses of soft ticks to date have been limited to 16S rRNA, which is not highly phylogenetically informative for this group. We sequenced the entire mitochondrial genomes of 7 species of soft ticks, and the partial mitochondrial genomes of a further 5 species of soft ticks. We used these sequences to test the genus-level classification of soft ticks. Our analyses strongly support a clade of Neotropical species (mostly bat-associated) within the subfamily Ornithodorinae. This clade, which we call Neotropical Ornithodorinae, has species from 2 genera, Antricola and Nothoaspis, and 2 subgenera, Ornithodoros (Alectorobius) and Ornithodoros (Subparmatus). We also addressed the phylogenetic position of Ornithodoros savignyi, the type species of the genus Ornithodoros. Our analysis strongly supports a clade consisting of Ornithodoros savignyi and 4 other Ornithodoros species: Or. brasiliensis, Or. moubata, Or. porcinus, and Or. rostratus. This clade, Ornithodoros sensu stricto, did not contain the Alectorobius and Subparmatus species, Or. (Alectorobius) fonsecai, Or. (Alectorobius) capensis, and Or. (Subparmatus) marinkellei, which in traditional classification schemes have been placed in the genus Ornithodoros. Our comparison of mitochondrial rRNA, nuclear rRNA, and mitochondrial genome analyses show that only mitochondrial genome sequences have the potential to resolve the controversial phylogenetic relationships within the major soft tick lineages, such as the taxonomic status of Carios sensu Klompen and Oliver (1993). Crown Copyright © 2013. Published by Elsevier GmbH. All rights reserved.

  10. Polytene Chromosomal Maps of 11 Drosophila Species: The Order of Genomic Scaffolds Inferred From Genetic and Physical Maps

    PubMed Central

    Schaeffer, Stephen W.; Bhutkar, Arjun; McAllister, Bryant F.; Matsuda, Muneo; Matzkin, Luciano M.; O'Grady, Patrick M.; Rohde, Claudia; Valente, Vera L. S.; Aguadé, Montserrat; Anderson, Wyatt W.; Edwards, Kevin; Garcia, Ana C. L.; Goodman, Josh; Hartigan, James; Kataoka, Eiko; Lapoint, Richard T.; Lozovsky, Elena R.; Machado, Carlos A.; Noor, Mohamed A. F.; Papaceit, Montserrat; Reed, Laura K.; Richards, Stephen; Rieger, Tania T.; Russo, Susan M.; Sato, Hajime; Segarra, Carmen; Smith, Douglas R.; Smith, Temple F.; Strelets, Victor; Tobari, Yoshiko N.; Tomimura, Yoshihiko; Wasserman, Marvin; Watts, Thomas; Wilson, Robert; Yoshida, Kiyohito; Markow, Therese A.; Gelbart, William M.; Kaufman, Thomas C.

    2008-01-01

    The sequencing of the 12 genomes of members of the genus Drosophila was taken as an opportunity to reevaluate the genetic and physical maps for 11 of the species, in part to aid in the mapping of assembled scaffolds. Here, we present an overview of the importance of cytogenetic maps to Drosophila biology and to the concepts of chromosomal evolution. Physical and genetic markers were used to anchor the genome assembly scaffolds to the polytene chromosomal maps for each species. In addition, a computational approach was used to anchor smaller scaffolds on the basis of the analysis of syntenic blocks. We present the chromosomal map data from each of the 11 sequenced non-Drosophila melanogaster species as a series of sections. Each section reviews the history of the polytene chromosome maps for each species, presents the new polytene chromosome maps, and anchors the genomic scaffolds to the cytological maps using genetic and physical markers. The mapping data agree with Muller's idea that the majority of Drosophila genes are syntenic. Despite the conservation of genes within homologous chromosome arms across species, the karyotypes of these species have changed through the fusion of chromosomal arms followed by subsequent rearrangement events. PMID:18622037

  11. Bacterioplankton Community Dynamics and Nutrient Availability in a Shallow Well Mixed Estuary of the Northern Gulf of Mexico.

    NASA Astrophysics Data System (ADS)

    Hoch, M. P.

    2016-02-01

    Sabine Lake Estuary is a shallow, well mixed, tidal lagoon of the Northern Gulf of Mexico. This study defines the bacterioplankton community composition and factors that may influence its variation in Sabine Lake Estuary. Twenty physicochemical parameters, phytoplankton photopigments, and bacterial 16S rDNA sequences were analyzed seasonally from twelve sites ranging from the inflows of Sabine and Neches Rivers to the Sabine Pass outflow. Photopigments were used to estimate phytoplankton groups via CHEMTAX, and bacterioplankton 16S rDNA sequences of 97% similarity were quantified and taxa identified. Nutrient availability experiments were conducted on bacterioplankton. Notable seasonal differences were seen in six of the ten most common (>3% of total sequences) classes of bacterioplankton. Canonical correspondence analysis (CCA) of common classes was used to explore physicochemical parameters and phytoplankton groups influencing variation in the bacterioplankton. Alphaproteobacteria were most abundant throughout the year. Opitutae, Actinobacteria, Sphingobacteria, and Beta-proteobacteria were strongly influenced by conditions with higher TDN, DOC, turbidity, and Chlorophytes during winter when high river discharges reduced salinity. Planctomycetacia were most prevalent during spring and coincided with predominance of Cryptophytes. In summer and fall the aforementioned classes decline, and there is an increase in Synechococcophycideae. Nitrogen was least available to bacterioplankton during summer and fall. Clearer, warmer and more saline conditions with lower DOC reflect tidal movement of seawater into the estuary when river discharges were low, conditions favorable for Synechococcophycideae. Seasonal fluctuations in physicochemical conditions and certain phytoplankton groups influence the variation in the bacterioplankton community in Sabine Lake Estuary.

  12. pH Influences the Importance of Niche-Related and Neutral Processes in Lacustrine Bacterioplankton Assembly

    PubMed Central

    Ren, Lijuan; Jeppesen, Erik; He, Dan; Wang, Jianjun; Liboriussen, Lone; Xing, Peng

    2015-01-01

    pH is an important factor that shapes the structure of bacterial communities. However, we have very limited information about the patterns and processes by which overall bacterioplankton communities assemble across wide pH gradients in natural freshwater lakes. Here, we used pyrosequencing to analyze the bacterioplankton communities in 25 discrete freshwater lakes in Denmark with pH levels ranging from 3.8 to 8.8. We found that pH was the key factor impacting lacustrine bacterioplankton community assembly. More acidic lakes imposed stronger environmental filtering, which decreased the richness and evenness of bacterioplankton operational taxonomic units (OTUs) and largely shifted community composition. Although environmental filtering was determined to be the most important determinant of bacterioplankton community assembly, the importance of neutral assembly processes must also be considered, notably in acidic lakes, where the species (OTU) diversity was low. We observed that the strong effect of environmental filtering in more acidic lakes was weakened by the enhanced relative importance of neutral community assembly, and bacterioplankton communities tended to be less phylogenetically clustered in more acidic lakes. In summary, we propose that pH is a major environmental determinant in freshwater lakes, regulating the relative importance and interplay between niche-related and neutral processes and shaping the patterns of freshwater lake bacterioplankton biodiversity. PMID:25724952
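    The richness and evenness of OTUs referred to in this abstract can be computed from a table of per-OTU read counts. The sketch below is a minimal illustration under stated assumptions: the abstract does not say which evenness index was used, so Pielou's evenness (Shannon diversity divided by the natural log of richness) is assumed here, and the two count vectors are invented to contrast an even community with one dominated by a single OTU.

```python
import math
from typing import Sequence


def richness(counts: Sequence[int]) -> int:
    """Number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts: Sequence[int]) -> float:
    """Shannon diversity H' = -sum(p_i * ln p_i) over observed OTUs."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one read")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts: Sequence[int]) -> float:
    """Pielou's J = H' / ln(S); defined as 0.0 when fewer than two OTUs."""
    s = richness(counts)
    if s < 2:
        return 0.0
    return shannon(counts) / math.log(s)


# Hypothetical OTU count vectors, not data from the study.
even_lake = [10, 10, 10, 10]
dominated_lake = [97, 1, 1, 1, 0]

for name, counts in (("even", even_lake), ("dominated", dominated_lake)):
    print(name, richness(counts), round(pielou_evenness(counts), 3))
```

    Under strong environmental filtering, as the abstract reports for more acidic lakes, one would expect both numbers to drop: fewer OTUs are observed, and the reads concentrate in a few of them.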

  13. Structure, expression profile and phylogenetic inference of chalcone isomerase-like genes from the narrow-leafed lupin (Lupinus angustifolius L.) genome

    PubMed Central

    Przysiecka, Łucja; Książkiewicz, Michał; Wolko, Bogdan; Naganowska, Barbara

    2015-01-01

    Lupins, like other legumes, have a unique biosynthesis scheme of 5-deoxy-type flavonoids and isoflavonoids. A key enzyme in this pathway is chalcone isomerase (CHI), a member of CHI-fold protein family, encompassing subfamilies of CHI1, CHI2, CHI-like (CHIL), and fatty acid-binding (FAP) proteins. Here, two Lupinus angustifolius (narrow-leafed lupin) CHILs, LangCHIL1 and LangCHIL2, were identified and characterized using DNA fingerprinting, cytogenetic and linkage mapping, sequencing and expression profiling. Clones carrying CHIL sequences were assembled into two contigs. Full gene sequences were obtained from these contigs, and mapped in two L. angustifolius linkage groups by gene-specific markers. Bacterial artificial chromosome fluorescence in situ hybridization approach confirmed the localization of two LangCHIL genes in distinct chromosomes. The expression profiles of both LangCHIL isoforms were very similar. The highest level of transcription was in the roots of the third week of plant growth; thereafter, expression declined. The expression of both LangCHIL genes in leaves and stems was similar and low. Comparative mapping to reference legume genome sequences revealed strong syntenic links; however, LangCHIL2 contig had a much more conserved structure than LangCHIL1. LangCHIL2 is assumed to be an ancestor gene, whereas LangCHIL1 probably appeared as a result of duplication. As both copies are transcriptionally active, questions arise concerning their hypothetical functional divergence. Screening of the narrow-leafed lupin genome and transcriptome with CHI-fold protein sequences, followed by Bayesian inference of phylogeny and cross-genera synteny survey, identified representatives of all but one (CHI1) main subfamilies. They are as follows: two copies of CHI2, FAPa2 and CHIL, and single copies of FAPb and FAPa1. Duplicated genes are remnants of whole genome duplication which is assumed to have occurred after the divergence of Lupinus, Arachis, and Glycine

  14. Adaptive Change Inferred from Genomic Population Analysis of the ST93 Epidemic Clone of Community-Associated Methicillin-Resistant Staphylococcus aureus

    PubMed Central

    Stinear, Timothy P.; Holt, Kathryn E.; Chua, Kyra; Stepnell, Justin; Tuck, Kellie L.; Coombs, Geoffrey; Harrison, Paul Francis; Seemann, Torsten; Howden, Benjamin P.

    2014-01-01

    Community-associated methicillin-resistant Staphylococcus aureus (CA-MRSA) has emerged as a major public health problem around the world. In Australia, ST93-IV[2B] is the dominant CA-MRSA clone and displays significantly greater virulence than other S. aureus. Here, we have examined the evolution of ST93 via genomic analysis of 12 MSSA and 44 MRSA ST93 isolates, collected from around Australia over a 17-year period. Comparative analysis revealed a core genome of 2.6 Mb, sharing greater than 99.7% nucleotide identity. The accessory genome was 0.45 Mb and comprised additional mobile DNA elements, harboring resistance to erythromycin, trimethoprim, and tetracycline. Phylogenetic inference revealed a molecular clock and suggested that a single clone of methicillin susceptible, Panton-Valentine leukocidin (PVL) positive, ST93 S. aureus likely spread from North Western Australia in the early 1970s, acquiring methicillin resistance at least twice in the mid 1990s. We also explored associations between genotype and important MRSA phenotypes including oxacillin MIC and production of exotoxins (α-hemolysin [Hla], δ-hemolysin [Hld], PSMα3, and PVL). High-level expression of Hla is a signature feature of ST93 and reduced expression in eight isolates was readily explained by mutations in the agr locus. However, subtle but significant decreases in Hld were also noted over time that coincided with decreasing oxacillin resistance and were independent of agr mutations. The evolution of ST93 S. aureus is thus associated with a reduction in both exotoxin expression and oxacillin MIC, suggesting MRSA ST93 isolates are under pressure for adaptive change. PMID:24482534

  15. Complete Genome and Molecular Epidemiological Data Infer the Maintenance of Rabies among Kudu (Tragelaphus strepsiceros) in Namibia

    PubMed Central

    Scott, Terence P.; Fischer, Melina; Khaiseb, Siegfried; Freuling, Conrad; Höper, Dirk; Hoffmann, Bernd; Markotter, Wanda; Müller, Thomas; Nel, Louis H.

    2013-01-01

    Rabies in kudu is unique to Namibia and two major peaks in the epizootic have occurred since it was first noted in 1977. Due to the large numbers of kudu that were affected, it was suspected that horizontal transmission of rabies occurs among kudu and that rabies was being maintained independently within the Namibian kudu population – separate from canid cycles, despite geographic overlap. In this study, it was our aim to show, through phylogenetic analyses, that rabies was being maintained independently within the Namibian kudu population. We also tested, through complete genome sequencing of four rabies virus isolates from jackal and kudu, whether specific mutations occurred in the virus genome due to host adaptation. We found the separate grouping of all rabies isolates from kudu to those of any other canid species in Namibia, suggesting that rabies was being maintained independently in kudu. Additionally, we noted several mutations unique to isolates from kudu, suggesting that these mutations may be due to the adaptation of rabies to a new host. In conclusion, we show clear evidence that rabies is being maintained independently in the Namibian kudu population – a unique phenomenon with ecological and economic impacts. PMID:23527015

  16. Complete genome and molecular epidemiological data infer the maintenance of rabies among kudu (Tragelaphus strepsiceros) in Namibia.

    PubMed

    Scott, Terence P; Fischer, Melina; Khaiseb, Siegfried; Freuling, Conrad; Höper, Dirk; Hoffmann, Bernd; Markotter, Wanda; Müller, Thomas; Nel, Louis H

    2013-01-01

    Rabies in kudu is unique to Namibia and two major peaks in the epizootic have occurred since it was first noted in 1977. Due to the large numbers of kudu that were affected, it was suspected that horizontal transmission of rabies occurs among kudu and that rabies was being maintained independently within the Namibian kudu population - separate from canid cycles, despite geographic overlap. In this study, it was our aim to show, through phylogenetic analyses, that rabies was being maintained independently within the Namibian kudu population. We also tested, through complete genome sequencing of four rabies virus isolates from jackal and kudu, whether specific mutations occurred in the virus genome due to host adaptation. We found the separate grouping of all rabies isolates from kudu to those of any other canid species in Namibia, suggesting that rabies was being maintained independently in kudu. Additionally, we noted several mutations unique to isolates from kudu, suggesting that these mutations may be due to the adaptation of rabies to a new host. In conclusion, we show clear evidence that rabies is being maintained independently in the Namibian kudu population - a unique phenomenon with ecological and economic impacts.

  17. Structuring of bacterioplankton communities by specific dissolved organic carbon compounds.

    PubMed

    Gómez-Consarnau, Laura; Lindh, Markus V; Gasol, Josep M; Pinhassi, Jarone

    2012-09-01

    The main role of microorganisms in the cycling of the bulk dissolved organic carbon pool in the ocean is well established. Nevertheless, it remains unclear if particular bacteria preferentially utilize specific carbon compounds and whether such compounds have the potential to shape bacterial community composition. Enrichment experiments in the Mediterranean Sea, Baltic Sea and the North Sea (Skagerrak) showed that different low-molecular-weight organic compounds, with a proven importance for the growth of marine bacteria (e.g. amino acids, glucose, dimethylsulphoniopropionate, acetate or pyruvate), in most cases differentially stimulated bacterial growth. Denaturing gradient gel electrophoresis 'fingerprints' and 16S rRNA gene sequencing revealed that some bacterial phylotypes that became abundant were highly specific to enrichment with specific carbon compounds (e.g. Acinetobacter sp. B1-A3 with acetate or Psychromonas sp. B3-U1 with glucose). In contrast, other phylotypes increased in relative abundance in response to enrichment with several, or all, of the investigated carbon compounds (e.g. Neptuniibacter sp. M2-A4 with acetate, pyruvate and dimethylsulphoniopropionate, and Thalassobacter sp. M3-A3 with pyruvate and amino acids). Furthermore, different carbon compounds triggered the development of unique combinations of dominant phylotypes in several of the experiments. These results suggest that bacteria differ substantially in their abilities to utilize specific carbon compounds, with some bacteria being specialists and others having a more generalist strategy. Thus, changes in the supply or composition of the dissolved organic carbon pool can act as selective forces structuring bacterioplankton communities.

  18. Molecular phylogeny of the nettle family (Urticaceae) inferred from multiple loci of three genomes and extensive generic sampling.

    PubMed

    Wu, Zeng-Yuan; Monro, Alex K; Milne, Richard I; Wang, Hong; Yi, Ting-Shuang; Liu, Jie; Li, De-Zhu

    2013-12-01

    Urticaceae is one of the larger Angiosperm families, but relationships within it remain poorly known. This study presents the first densely sampled molecular phylogeny of Urticaceae, using maximum likelihood (ML), maximum parsimony (MP) and Bayesian inference (BI) to analyze the DNA sequence data from two nuclear (ITS and 18S), four chloroplast (matK, rbcL, rpl14-rps8-infA-rpl36, trnL-trnF) and one mitochondrial (matR) loci. We sampled 169 accessions representing 122 species, representing 47 of the 54 recognized genera within Urticaceae, including four of the six sometimes separated as Cecropiaceae. Major results included: (1) Urticaceae including Cecropiaceae was monophyletic; (2) Cecropiaceae was biphyletic, with both lineages nested within Urticaceae; (3) Urticaceae can be divided into four well-supported clades; (4) previously erected tribes or subfamilies were broadly supported, with some additions and alterations; (5) the monophyly of many genera was supported, whereas Boehmeria, Pellionia, Pouzolzia and Urera were clearly polyphyletic, while Urtica and Pilea each had a small genus nested within them; (6) relationships between genera were clarified, mostly with substantial support. These results clarify that some morphological characters have been overstated and others understated in previous classifications of the family, and provide a strong foundation for future studies on biogeography, character evolution, and circumscription of difficult genera.

  19. INFLUENCE OF LIGHT ON BACTERIOPLANKTON PRODUCTION AND RESPIRATION IN A SUBTROPICAL CORAL REEF

    EPA Science Inventory

    The influence of sunlight on bacterioplankton production (14C-leucine (Leu) and 3H-thymidine (TdR) incorporation; changes in cell abundances) and O2 consumption was investigated in a shallow subtropical coral reef located near Key Largo, Florida. Quartz (light) and opaque (dark) ...

  20. Effects of Nutrients on Specific Growth Rate of Bacterioplankton in Oligotrophic Lake Water Cultures

    PubMed Central

    Coveney, Michael F.; Wetzel, Robert G.

    1992-01-01

    The effects of organic and inorganic nutrient additions on the specific growth rates of bacterioplankton in oligotrophic lake water cultures were investigated. Lake water was first passed through 0.8-μm-pore-size filters (prescreening) to remove bacterivores and to minimize confounding effects of algae. Specific growth rates were calculated from changes in both bacterial cell numbers and biovolumes over 36 h. Gross specific growth rates in unmanipulated control samples were estimated through separate measurements of grazing losses by use of penicillin. The addition of mixed organic substrates alone to prescreened water did not significantly increase bacterioplankton specific growth rates. The addition of inorganic phosphorus alone significantly increased one or both specific growth rates in three of four experiments, and one experiment showed a secondary stimulation by organic substrates. The stimulatory effects of phosphorus addition were greatest concurrently with the highest alkaline phosphatase activity in the lake water. Because bacteria have been shown to dominate inorganic phosphorus uptake in other P-deficient systems, the demonstration that phosphorus, rather than organic carbon, can limit bacterioplankton growth suggests direct competition between phytoplankton and bacterioplankton for inorganic phosphorus. PMID:16348620

  2. Uptake of picophytoplankton, bacterioplankton and virioplankton by a fringing coral reef community (Ningaloo Reef, Australia)

    NASA Astrophysics Data System (ADS)

    Patten, N. L.; Wyatt, A. S. J.; Lowe, R. J.; Waite, A. M.

    2011-09-01

    We examined the importance of picoplankton and virioplankton to reef trophodynamics at Ningaloo Reef (north-western Australia) in May and November 2008. Picophytoplankton (Prochlorococcus, Synechococcus and picoeukaryotes), bacterioplankton (inclusive of bacteria and Archaea), virioplankton and chlorophyll a (Chl a) were measured at five stations following the consistent wave-driven unidirectional mean flow path of seawater across the reef and into the lagoon. Prochlorococcus, Synechococcus, picoeukaryotes and bacterioplankton were depleted to similar levels (~40% on average) over the fore reef, reef crest and reef flat (= 'active reef'), with negligible uptake occurring over the sandy bottom lagoon. Depletion of virioplankton also occurred but to more variable levels. Highest uptake rates, m, of picoplankton occurred over the reef crest, while uptake coefficients, S (independent of cell concentration), were similarly scaled over the reef zones, indicating no preferential uptake of any one group. Collectively, picophytoplankton, bacterioplankton and virioplankton accounted for the uptake of 29 mmol C m⁻² day⁻¹, with Synechococcus contributing the highest proportion of the removed C. Picoplankton and virioplankton accounted for 1-5 mmol N m⁻² day⁻¹ of the removed N, with bacterioplankton estimated to be a highly rich source of N. Results indicate the importance of ocean-reef interactions and the dependence of certain reef organisms on picoplanktonic supply for reef-level biogeochemistry processes.

  3. Interactions between hydrology and water chemistry shape bacterioplankton biogeography across boreal freshwater networks.

    PubMed

    Niño-García, Juan Pablo; Ruiz-González, Clara; Del Giorgio, Paul A

    2016-07-01

    Disentangling the mechanisms shaping bacterioplankton communities across freshwater ecosystems requires considering a hydrologic dimension that can influence both dispersal and local sorting, but how the environment and hydrology interact to shape the biogeography of freshwater bacterioplankton over large spatial scales remains unexplored. Using Illumina sequencing of the 16S ribosomal RNA gene, we investigate the large-scale spatial patterns of bacterioplankton across 386 freshwater systems from seven distinct regions in boreal Québec. We show that both hydrology and local water chemistry (mostly pH) interact to shape a sequential structuring of communities from highly diverse assemblages in headwater streams toward larger rivers and lakes dominated by fewer taxa. Increases in water residence time along the hydrologic continuum were accompanied by major losses of bacterial richness and by an increased differentiation of communities driven by local conditions (pH and other related variables). This suggests that hydrology and network position modulate the relative role of environmental sorting and mass effects on community assembly by determining both the time frame for bacterial growth and the composition of the immigrant pool. The apparent low dispersal limitation (that is, the lack of influence of geographic distance on the spatial patterns observed at the taxonomic resolution used) suggests that these boreal bacterioplankton communities derive from a shared bacterial pool that enters the networks through the smallest streams, largely dominated by mass effects, and that is increasingly subjected to local sorting of species during transit along the hydrologic continuum.

  4. Contrasted Effects of Diversity and Immigration on Ecological Insurance in Marine Bacterioplankton Communities

    PubMed Central

    Bouvier, Corinne; Barbera, Claire; Mouquet, Nicolas

    2012-01-01

    The ecological insurance hypothesis predicts a positive effect of species richness on ecosystem functioning in a variable environment. This effect stems from temporal and spatial complementarity among species within metacommunities coupled with optimal levels of dispersal. Despite its importance in the context of global change by human activities, empirical evidence for ecological insurance remains scarce and controversial. Here we use natural aquatic bacterial communities to explore some of the predictions of the spatial and temporal aspects of the ecological insurance hypothesis. Addressing ecological insurance with bacterioplankton is of strong relevance given their central role in fundamental ecosystem processes. Our experimental setup consisted of water and bacterioplankton communities from two contrasting coastal lagoons. In order to mimic environmental fluctuations, the bacterioplankton community from one lagoon was successively transferred between tanks containing water from each of the two lagoons. We manipulated initial bacterial diversity for experimental communities and immigration during the experiment. We found that the abundance and production of bacterioplankton communities were higher and more stable (lower temporal variance) for treatments with high initial bacterial diversity. Immigration was only marginally beneficial to bacterial communities, probably because microbial communities operate at different time scales compared to the frequency of perturbation selected in this study, and because of their intrinsically high physiological plasticity. Such local “physiological insurance” may have a strong significance for the maintenance of bacterial abundance and production in the face of environmental perturbations. PMID:22701572

  5. In situ interactions between photosynthetic picoeukaryotes and bacterioplankton in the Atlantic Ocean: evidence for mixotrophy.

    PubMed

    Hartmann, Manuela; Zubkov, Mikhail V; Scanlan, Dave J; Lepère, Cécile

    2013-12-01

    Heterotrophic bacterioplankton, cyanobacteria and phototrophic picoeukaryotes (< 5 μm in size) numerically dominate planktonic oceanic communities. While feeding on bacterioplankton is often attributed to aplastidic protists, recent evidence suggests that phototrophic picoeukaryotes could be important bacterivores. Here, we present direct visual evidence from the surface mixed layer of the Atlantic Ocean that bacterioplankton are internalized by phototrophic picoeukaryotes. In situ interactions of phototrophic picoeukaryotes and bacterioplankton (specifically Prochlorococcus cyanobacteria and the SAR11 clade) were investigated using a combination of flow cytometric cell sorting and dual tyramide signal amplification fluorescence in situ hybridization. Using this method, we observed plastidic Prymnesiophyceae and Chrysophyceae cells containing Prochlorococcus, and to a lesser extent SAR11 cells. These microscopic observations of in situ microbial trophic interactions demonstrate the frequency and likely selectivity of phototrophic picoeukaryote bacterivory in the surface mixed layer of both the North and South Atlantic subtropical gyres and adjacent equatorial region, broadening our views on the ecological role of the smallest oceanic plastidic protists.

  6. Simultaneous Extraction from Bacterioplankton of Total RNA and DNA Suitable for Quantitative Structure and Function Analyses

    PubMed Central

    Weinbauer, Markus G.; Fritz, Ingo; Wenderoth, Dirk F.; Höfle, Manfred G.

    2002-01-01

    The aim of this study was to develop a protocol for the simultaneous extraction from bacterioplankton of RNA and DNA suitable for quantitative molecular analysis. By using a combined mechanical and chemical extraction method, the highest RNA and DNA yield was obtained with sodium lauryl sarcosinate-phenol or DivoLab-phenol as the extraction mix. The efficiency of extraction of nucleic acids was comparatively high and varied only moderately in gram-negative bacterial isolates and bacterioplankton (RNA, 52 to 66%; DNA, 43 to 61%); significant amounts of nucleic acids were also obtained for a gram-positive bacterial isolate (RNA, 20 to 30%; DNA, 20 to 25%). Reverse transcription-PCR and PCR amplification products of fragments of 16S rRNA and its genes were obtained from all isolates and communities, indicating that the extracted nucleic acids were intact and pure enough for community structure analyses. By using single-strand conformation polymorphism of fragments of 16S rRNA and its gene, community fingerprints were obtained from pond bacterioplankton. mRNA transcripts encoding fragments of the enzyme nitrite reductase gene (nir gene) could be detected in a pond water sample, indicating that the extraction method is also suitable for studying gene expression. The extraction method presented yields nucleic acids that can be used to perform structural and functional studies of bacterioplankton communities from a single sample. PMID:11872453

  7. Bacterioplankton: a sink for carbon in a coastal marine plankton community

    SciTech Connect

    Ducklow, H.W.; Purdie, D.A.; Williams, P.J.LeB.; Davis, J.M.

    1986-05-16

    Recent determinations of high production rates (up to 30% of primary production in surface waters) implicate free-living marine bacterioplankton as a link in a microbial loop that supplements phytoplankton as food for herbivores. An enclosed water column of 300 cubic meters was used to test the microbial loop hypothesis by following the fate of carbon-14-labeled bacterioplankton for over 50 days. Only 2% of the label initially fixed from carbon-14-labeled glucose by bacteria was present in larger organisms after 13 days, at which time about 20% of the total label added remained in the particulate fraction. Most of the label appeared to pass directly from particles smaller than 1 micrometer (heterotrophic bacterioplankton and some bacteriovores) to respired labeled carbon dioxide or to regenerated dissolved organic carbon-14. Secondary (and, by implication, primary) production by organisms smaller than 1 micrometer may not be an important food source in marine food chains. Bacterioplankton can be a sink for carbon in planktonic food webs and may serve principally as agents of nutrient regeneration rather than as food.

  8. Ubiquity and quantitative significance of bacterioplankton lineages inhabiting the oxygenated hypolimnion of deep freshwater lakes.

    PubMed

    Okazaki, Yusuke; Fujinaga, Shohei; Tanaka, Atsushi; Kohzu, Ayato; Oyagi, Hideo; Nakano, Shin-Ichi

    2017-10-01

    The oxygenated hypolimnion accounts for a volumetrically significant part of the global freshwater systems. Previous studies have proposed the presence of hypolimnion-specific bacterioplankton lineages that are distinct from those inhabiting the epilimnion. To date, however, no consensus exists regarding their ubiquity and abundance, which is necessary to evaluate their ecological importance. The present study investigated the bacterioplankton community in the oxygenated hypolimnia of 10 deep freshwater lakes. Despite the broad geochemical characteristics of the lakes, 16S rRNA gene sequencing demonstrated that the communities in the oxygenated hypolimnia were distinct from those in the epilimnia and identified several predominant lineages inhabiting multiple lakes. Catalyzed reporter deposition fluorescence in situ hybridization revealed that abundant hypolimnion-specific lineages, CL500-11 (Chloroflexi), CL500-3, CL500-37, CL500-15 (Planctomycetes) and Marine Group I (Thaumarchaeota), together accounted for 1.5-32.9% of all bacterioplankton in the hypolimnion of the lakes. Furthermore, an analysis of single-nucleotide variation in the partial 16S rRNA gene sequence (oligotyping) suggested the presence of different sub-populations between lakes and water layers among the lineages occurring in the entire water layer (for example, acI-B1 and acI-A7). Collectively, these results provide the first comprehensive overview of the bacterioplankton community in the oxygenated hypolimnion of deep freshwater lakes.

  9. Effects of nutrients on specific growth rate of bacterioplankton in oligotrophic lake water cultures

    SciTech Connect

    Coveney, M.F.; Wetzel, R.G.

    1992-01-01

    The effects of organic and inorganic nutrient additions on the specific growth rates of bacterioplankton in oligotrophic lake water cultures were investigated. Lake water was first passed through 0.8-μm-pore-size filters (prescreening) to remove bacterivores and to minimize confounding effects of algae. Specific growth rates were calculated from changes in both bacterial cell numbers and biovolumes over 36 h. Gross specific growth rates in unmanipulated control samples were estimated through separate measurements of grazing losses by use of penicillin. The addition of mixed organic substrates alone to prescreened water did not significantly increase bacterioplankton specific growth rates. The addition of inorganic phosphorus alone significantly increased one or both specific growth rates in three of four experiments, and one experiment showed a secondary stimulation by organic substrates. The stimulatory effects of phosphorus addition were greatest concurrently with the highest alkaline phosphatase activity in the lake water. Because bacteria have been shown to dominate inorganic phosphorus uptake in other P-deficient systems, the demonstration that phosphorus, rather than organic carbon, can limit bacterioplankton growth suggests direct competition between phytoplankton and bacterioplankton for inorganic phosphorus.

  10. Phytoplankton community succession shaping bacterioplankton community composition in Lake Taihu, China.

    PubMed

    Niu, Yuan; Shen, Hong; Chen, Jun; Xie, Ping; Yang, Xi; Tao, Min; Ma, Zhimei; Qi, Min

    2011-08-01

    PCR-denaturing gradient gel electrophoresis (DGGE) and canonical correspondence analysis (CCA) were used to explore the relationship between succession of the phytoplankton community and temporal variation of bacterioplankton community composition (BCC) in the eutrophic Lake Taihu. A serious Microcystis bloom was observed in July-December 2008, and Bacillariophyta and Cryptophyta dominated in January-June 2009. BCC was characterized by DGGE of the 16S rRNA gene with subsequent sequencing. The DGGE banding patterns revealed a remarkable seasonality which was closely related to phytoplankton community succession. The trend in the Shannon-Wiener diversity index of the bacterioplankton community was similar to that of the phytoplankton community. CCA revealed that temperature and phytoplankton played key roles in structuring BCC. Sequencing of DGGE bands suggested that the majority of the sequences were affiliated with common phylogenetic groups in freshwater: Alphaproteobacteria, Betaproteobacteria, Bacteroidetes and Actinobacteria. The cluster STA2-30 (affiliated with Actinobacteria) was found throughout almost the entire sampling period at the two study sites. We observed that the family Flavobacteriaceae (affiliated with Bacteroidetes) was tightly coupled to the diatom bloom and that the cluster ML-5-51.2 (affiliated with Actinobacteria) dominated the bacterioplankton communities during the Microcystis bloom. These results were quite similar at the two sampling sites, indicating that BCC changes were not random but followed a fixed pattern. Our study provides insights into relationships between phytoplankton and bacterioplankton communities at the species level, facilitating a better understanding of the microbial loop and ecosystem functioning in the lake.

  11. BACTERIOPLANKTON DYNAMICS IN PENSACOLA BAY, FL, USA: ROLE OF PHYTOPLANKTON AND DETRITAL CARBON SOURCES

    EPA Science Inventory

    Bacterioplankton Dynamics in Pensacola Bay, FL, USA: Role of Phytoplankton and Detrital Carbon Sources (Abstract). To be presented at the 16th Biennial Conference of the Estuarine Research Foundation, ERF 2001: An Estuarine Odyssey, 4-8 November 2001, St. Pete Beach, FL. 1 p. (ER...

  12. Phylogenetic inference and SSR characterization of tropical woody bamboos tribe Bambuseae (Poaceae: Bambusoideae) based on complete plastid genome sequences.

    PubMed

    Vieira, Leila do Nascimento; Dos Anjos, Karina Goulart; Faoro, Helisson; Fraga, Hugo Pacheco de Freitas; Greco, Thiago Machado; Pedrosa, Fábio de Oliveira; de Souza, Emanuel Maltempi; Rogalski, Marcelo; de Souza, Robson Francisco; Guerra, Miguel Pedro

    2016-05-01

    Complete plastome sequencing is an efficient option for increasing phylogenetic resolution and for evolutionary studies, and may greatly facilitate the use of plastid DNA markers in plant population genetic studies. Merostachys and Guadua stand out as the most common bamboos indigenous to Brazil and those with the highest potential for utilization. Here, we sequenced the complete plastomes of the Brazilian Guadua chacoensis and Merostachys sp. to perform full plastome phylogeny and characterize the occurrence, type, and distribution of SSRs using 20 Bambuseae species. The determined plastome sequences of Merostachys sp. and G. chacoensis are 136,334 and 135,403 bp in size, respectively, with an identical gene content and typical quadripartite structure consisting of a pair of IRs separated by the LSC and SSC regions. The Maximum Likelihood and Bayesian Inference analyses produced phylogenomic trees identical in topology. These trees supported monophyly of the Paleotropical and Neotropical bamboo clades. The Neotropical bamboos segregated into three well-supported lineages, Chusqueinae, Guaduinae, and Arthrostylidiinae, with the last two forming a well-supported sister relationship. Paleotropical bamboos segregated into two well-supported lineages, Hickeliinae and Bambusinae + Melocanninae. We identified 141.8 cpSSRs in Bambuseae plastomes and a lower value (38.15) in plastome coding sequences. Among them, we identified 16 polymorphic SSR loci, with the number of alleles varying from 3 to 10. These 16 polymorphic cpSSR loci in the Bambuseae plastome can be assessed for the intraspecific level of polymorphism, leading to innovative, highly sensitive phylogeographic and population genetics studies for this tribe.

  13. Genomic Alteration in Head and Neck Squamous Cell Carcinoma (HNSCC) Cell Lines Inferred from Karyotyping, Molecular Cytogenetics, and Array Comparative Genomic Hybridization.

    PubMed

    Singchat, Worapong; Hitakomate, Ekarat; Rerkarmnuaychoke, Budsaba; Suntronpong, Aorarat; Fu, Beiyuan; Bodhisuwan, Winai; Peyachoknagul, Surin; Yang, Fengtang; Koontongkaew, Sittichai; Srikulnath, Kornsorn

    2016-01-01

    Genomic alteration in head and neck squamous cell carcinoma (HNSCC) was studied in two cell line pairs (HN30-HN31 and HN4-HN12) using conventional C-banding, multiplex fluorescence in situ hybridization (M-FISH), and array comparative genomic hybridization (array CGH). HN30 and HN4 were derived from primary lesions in the pharynx and base of tongue, respectively, and HN31 and HN12 were derived from lymph-node metastatic lesions belonging to the same patients. Gains of chromosomes 1, 7, and 11 were shared in almost all cell lines. Hierarchical clustering revealed that HN31 was closely related to HN4, which shared eight chromosome alteration cases. Large C-positive heterochromatins were found in the centromeric region of chromosome 9 in HN31 and HN4, which suggests complex structural amplification of the repetitive sequence. Array CGH revealed amplification of 7p22.3p11.2, 8q11.23q12.1, and 14q32.33 in all cell lines; these regions contain genes involved in tumorigenesis and inflammation. The amplification of 2p21 (SIX3), 11p15.5 (H19), and 11q21q22.3 (MAML2, PGR, TRPC6, and MMP family) regions, and deletion of 9p23 (PTPRD) and 16q23.1 (WWOX) regions were identified in HN31 and HN12. Interestingly, partial loss of PTPRD (9p23) and WWOX (16q23.1) genes was identified in HN31 and HN12, and gene expression analysis showed a tendency toward down-regulation of PTPRD, with no detectable expression of the WWOX gene. This suggests that the scarcity of PTPRD and WWOX genes might have played an important role in progression of HNSCC, and could be considered as a target for cancer therapy or a biomarker in molecular pathology.

  14. Genomic Alteration in Head and Neck Squamous Cell Carcinoma (HNSCC) Cell Lines Inferred from Karyotyping, Molecular Cytogenetics, and Array Comparative Genomic Hybridization

    PubMed Central

    Rerkarmnuaychoke, Budsaba; Suntronpong, Aorarat; Fu, Beiyuan; Bodhisuwan, Winai; Peyachoknagul, Surin; Yang, Fengtang; Koontongkaew, Sittichai; Srikulnath, Kornsorn

    2016-01-01

    Genomic alteration in head and neck squamous cell carcinoma (HNSCC) was studied in two cell line pairs (HN30-HN31 and HN4-HN12) using conventional C-banding, multiplex fluorescence in situ hybridization (M-FISH), and array comparative genomic hybridization (array CGH). HN30 and HN4 were derived from primary lesions in the pharynx and base of tongue, respectively, and HN31 and HN12 were derived from lymph-node metastatic lesions belonging to the same patients. Gains of chromosomes 1, 7, and 11 were shared in almost all cell lines. Hierarchical clustering revealed that HN31 was closely related to HN4, which shared eight chromosome alteration cases. Large C-positive heterochromatins were found in the centromeric region of chromosome 9 in HN31 and HN4, which suggests complex structural amplification of the repetitive sequence. Array CGH revealed amplification of 7p22.3p11.2, 8q11.23q12.1, and 14q32.33 in all cell lines; these regions contain genes involved in tumorigenesis and inflammation. The amplification of 2p21 (SIX3), 11p15.5 (H19), and 11q21q22.3 (MAML2, PGR, TRPC6, and MMP family) regions, and deletion of 9p23 (PTPRD) and 16q23.1 (WWOX) regions were identified in HN31 and HN12. Interestingly, partial loss of PTPRD (9p23) and WWOX (16q23.1) genes was identified in HN31 and HN12, and gene expression analysis showed a tendency toward down-regulation of PTPRD, with no detectable expression of the WWOX gene. This suggests that the scarcity of PTPRD and WWOX genes might have played an important role in progression of HNSCC, and could be considered as a target for cancer therapy or a biomarker in molecular pathology. PMID:27501229

  15. Can we continue to neglect genomic variation in introgression rates when inferring the history of speciation? A case study in a Mytilus hybrid zone.

    PubMed

    Roux, C; Fraïsse, C; Castric, V; Vekemans, X; Pogson, G H; Bierne, N

    2014-08-01

    The use of molecular data to reconstruct the history of divergence and gene flow between populations of closely related taxa represents a challenging problem. It has been proposed that the long-standing debate about the geography of speciation can be resolved by comparing the likelihoods of a model of isolation with migration and a model of secondary contact. However, data are commonly only fit to a model of isolation with migration and rarely tested against the secondary contact alternative. Furthermore, most demographic inference methods have neglected variation in introgression rates and assume that the gene flow parameter (Nm) is similar among loci. Here, we show that neglecting this source of variation can give misleading results. We analysed DNA sequences sampled from populations of the marine mussels, Mytilus edulis and M. galloprovincialis, across a well-studied mosaic hybrid zone in Europe and evaluated various scenarios of speciation, with or without variation in introgression rates, using an Approximate Bayesian Computation (ABC) approach. Models with heterogeneous gene flow across loci always outperformed models assuming equal migration rates irrespective of the history of gene flow being considered. By incorporating this heterogeneity, the best-supported scenario was a long period of allopatric isolation during the first three-quarters of the time since divergence followed by secondary contact and introgression during the last quarter. By contrast, constraining migration to be homogeneous failed to discriminate among any of the different models of gene flow tested. Our simulations thus provide statistical support for the secondary contact scenario in the European Mytilus hybrid zone that the standard coalescent approach failed to confirm. Our results demonstrate that genomic variation in introgression rates can have profound impacts on the biological conclusions drawn from inference methods and needs to be incorporated in future studies.

  16. First all-in-one diagnostic tool for DNA intelligence: genome-wide inference of biogeographic ancestry, appearance, relatedness, and sex with the Identitas v1 Forensic Chip.

    PubMed

    Keating, Brendan; Bansal, Aruna T; Walsh, Susan; Millman, Jonathan; Newman, Jonathan; Kidd, Kenneth; Budowle, Bruce; Eisenberg, Arthur; Donfack, Joseph; Gasparini, Paolo; Budimlija, Zoran; Henders, Anjali K; Chandrupatla, Hareesh; Duffy, David L; Gordon, Scott D; Hysi, Pirro; Liu, Fan; Medland, Sarah E; Rubin, Laurence; Martin, Nicholas G; Spector, Timothy D; Kayser, Manfred

    2013-05-01

    When a forensic DNA sample cannot be associated directly with a previously genotyped reference sample by standard short tandem repeat profiling, the investigation required for identifying perpetrators, victims, or missing persons can be both costly and time consuming. Here, we describe the outcome of a collaborative study using the Identitas Version 1 (v1) Forensic Chip, the first commercially available all-in-one tool dedicated to the concept of developing intelligence leads based on DNA. The chip allows parallel interrogation of 201,173 genome-wide autosomal, X-chromosomal, Y-chromosomal, and mitochondrial single nucleotide polymorphisms for inference of biogeographic ancestry, appearance, relatedness, and sex. The first assessment of the chip's performance was carried out on 3,196 blinded DNA samples of varying quantities and qualities, covering a wide range of biogeographic origin and eye/hair coloration as well as variation in relatedness and sex. Overall, 95 % of the samples (N = 3,034) passed quality checks with an overall genotype call rate >90 % on variable numbers of available recorded trait information. Predictions of sex, direct match, and first to third degree relatedness were highly accurate. Chip-based predictions of biparental continental ancestry were on average ~94 % correct (further support provided by separately inferred patrilineal and matrilineal ancestry). Predictions of eye color were 85 % correct for brown and 70 % correct for blue eyes, and predictions of hair color were 72 % for brown, 63 % for blond, 58 % for black, and 48 % for red hair. From the 5 % of samples (N = 162) with <90 % call rate, 56 % yielded correct continental ancestry predictions while 7 % yielded sufficient genotypes to allow hair and eye color prediction. Our results demonstrate that the Identitas v1 Forensic Chip holds great promise for a wide range of applications including criminal investigations, missing person investigations, and for national security

  17. Demographic inferences using short-read genomic data in an approximate Bayesian computation framework: in silico evaluation of power, biases and proof of concept in Atlantic walrus.

    PubMed

    Shafer, Aaron B A; Gattepaille, Lucie M; Stewart, Robert E A; Wolf, Jochen B W

    2015-01-01

    Approximate Bayesian computation (ABC) is a powerful tool for model-based inference of demographic histories from large genetic data sets. For most organisms, its implementation has been hampered by the lack of sufficient genetic data. Genotyping-by-sequencing (GBS) provides cheap genome-scale data to fill this gap, but its potential has not fully been exploited. Here, we explored power, precision and biases of a coalescent-based ABC approach where GBS data were modelled with either a population mutation parameter (θ) or a fixed site (FS) approach, allowing single or several segregating sites per locus. With simulated data ranging from 500 to 50 000 loci, a variety of demographic models could be reliably inferred across a range of timescales and migration scenarios. Posterior estimates were informative with 1000 loci for migration and split time in simple population divergence models. In more complex models, posterior distributions were wide and almost reverted to the uninformative prior even with 50 000 loci. ABC parameter estimates, however, were generally more accurate than an alternative composite-likelihood method. Bottleneck scenarios proved particularly difficult, and only recent bottlenecks without recovery could be reliably detected and dated. Notably, minor-allele-frequency filters - usual practice for GBS data - negatively affected nearly all estimates. With this in mind, we used a combination of FS and θ approaches on empirical GBS data generated from the Atlantic walrus (Odobenus rosmarus rosmarus), collectively providing support for a population split before the last glacial maximum followed by asymmetrical migration and a high Arctic bottleneck. Overall, this study evaluates the potential and limitations of GBS data in an ABC-coalescence framework and proposes a best-practice approach.

  18. Southeast Asian origins of five Hill Tribe populations and correlation of genetic to linguistic relationships inferred with genome-wide SNP data.

    PubMed

    Listman, J B; Malison, R T; Sanichwankul, K; Ittiwut, C; Mutirangura, A; Gelernter, J

    2011-02-01

    In Thailand, the term Hill Tribe is used to describe populations whose members traditionally practice slash and burn agriculture and reside in the mountains. These tribes are thought to have migrated throughout Asia for up to 5,000 years, including migrations through Southern China and/or Southeast Asia. There have been continuous migrations southward from China into Thailand for approximately the past thousand years and the present geographic range of any given tribe straddles multiple political borders. As none of these populations have autochthonous scripts, written histories have, until recently, been externally produced. Northern Asian, Tibetan, and Siberian origins of Hill Tribes have been proposed. All purport endogamy and have nonmutually intelligible languages. To test hypotheses regarding the geographic origins of these populations, relatedness and migrations among them and neighboring populations, and whether their genetic relationships correspond with their linguistic relationships, we analyzed 2,445 genome-wide SNP markers in 118 individuals from five Thai Hill Tribe populations (Akha, Hmong, Karen, Lahu, and Lisu), 90 individuals from majority Thai populations, and 826 individuals from Asian and Oceanian HGDP and HapMap populations using a Bayesian clustering method. Considering these results within the context of results of recent large-scale studies of Asian geographic genetic variation allows us to infer a shared Southeast Asian origin of these five Hill Tribe populations as well as ancestry components that distinguish among them seen in successive levels of clustering. In addition, the inferred level of shared ancestry among the Hill Tribes corresponds well to relationships among their languages.

  19. Southeast Asian origins of five Hill Tribe populations and correlation of genetic to linguistic relationships inferred with genome-wide SNP data

    PubMed Central

    Listman, JB; Malison, RT; Sanichwankul, K; Ittiwut, C; Mutirangura, A; Gelernter, J

    2010-01-01

    In Thailand, the term Hill Tribe is used to describe populations whose members traditionally practice slash and burn agriculture and reside in the mountains. These tribes are thought to have migrated throughout Asia for up to 5,000 years, including migrations through Southern China and/or Southeast Asia. There have been continuous migrations southward from China into Thailand for approximately the past thousand years and the present geographic range of any given tribe straddles multiple political borders. As none of these populations have autochthonous scripts, written histories have, until recently, been externally produced. Northern Asian, Tibetan, and Siberian origins of Hill Tribes have been proposed. All purport endogamy and have non-mutually intelligible languages. In order to test hypotheses regarding the geographic origins of these populations, relatedness and migrations among them and neighboring populations, and whether their genetic relationships correspond with their linguistic relationships, we analyzed 2445 genome-wide SNP markers in 118 individuals from five Thai Hill Tribe populations (Akha, Hmong, Karen, Lahu, and Lisu), 90 individuals from majority Thai populations, and 826 individuals from Asian and Oceanian HGDP and HapMap populations using a Bayesian clustering method. Considering these results within the context of results of recent large-scale studies of Asian geographic genetic variation allows us to infer a shared Southeast Asian origin of these five Hill Tribe populations as well as ancestry components that distinguish among them seen in successive levels of clustering. In addition, the inferred level of shared ancestry among the Hill Tribes corresponds well to relationships among their languages. PMID:20979205

  20. Fish-mediated changes in bacterioplankton community composition: an in situ mesocosm experiment

    NASA Astrophysics Data System (ADS)

    Luo, Congqiang; Yi, Chunlong; Ni, Leyi; Guo, Longgen

    2017-06-01

    We characterized variations in bacterioplankton community composition (BCC) in mesocosms subject to three different treatments. Two groups contained fish (group one: Cyprinus carpio; group two: Hypophthalmichthys molitrix); and group three, the untreated mesocosm, was the control. Samples were taken seven times over a 49-day period, and BCC was analyzed by PCR-denaturing gradient gel electrophoresis (DGGE) and real-time quantitative PCR (qPCR). Results revealed that introduction of C. carpio and H. molitrix had a remarkable impact on the composition of bacterioplankton communities, and the BCC was significantly different between each treatment. Sequencing of DGGE bands revealed that the bacterioplankton community in the different treatment groups was consistent at a taxonomic level, but differed in its abundance. H. molitrix promoted the richness of Alphaproteobacteria and Actinobacteria, while more bands affiliated to Cyanobacteria were detected in C. carpio mesocosms. The redundancy analysis (RDA) result demonstrated that the BCC was closely related to the bottom-up (total phosphorus, chlorophyll a, phytoplankton biomass) and top-down forces (biomass of copepods and cladocera) in C. carpio and control mesocosms, respectively. We found no evidence for top-down regulation of BCC by zooplankton in H. molitrix mesocosms, while grazing by protozoa (heterotrophic nanoflagellates, ciliates) became the major way to regulate BCC. Total bacterioplankton abundances were significantly higher in C. carpio mesocosms because of high nutrient concentration and suspended solids. Our study provided insights into the relationship between fish and bacterioplankton at species level, leading to a deep understanding of the function of the microbial loop and the aquatic ecosystem.

  1. Coral and macroalgal exudates vary in neutral sugar composition and differentially enrich reef bacterioplankton lineages

    PubMed Central

    Nelson, Craig E; Goldberg, Stuart J; Wegley Kelly, Linda; Haas, Andreas F; Smith, Jennifer E; Rohwer, Forest; Carlson, Craig A

    2013-01-01

    Increasing algal cover on tropical reefs worldwide may be maintained through feedbacks whereby algae outcompete coral by altering microbial activity. We hypothesized that algae and coral release compositionally distinct exudates that differentially alter bacterioplankton growth and community structure. We collected exudates from the dominant hermatypic coral holobiont Porites spp. and three dominant macroalgae (one each Ochrophyta, Rhodophyta and Chlorophyta) from reefs of Mo'orea, French Polynesia. We characterized exudates by measuring dissolved organic carbon (DOC) and fractional dissolved combined neutral sugars (DCNSs) and subsequently tracked bacterioplankton responses to each exudate over 48 h, assessing cellular growth, DOC/DCNS utilization and changes in taxonomic composition (via 16S rRNA amplicon pyrosequencing). Fleshy macroalgal exudates were enriched in the DCNS components fucose (Ochrophyta) and galactose (Rhodophyta); coral and calcareous algal exudates were enriched in total DCNS but in the same component proportions as ambient seawater. Rates of bacterioplankton growth and DOC utilization were significantly higher in algal exudate treatments than in coral exudate and control incubations with each community selectively removing different DCNS components. Coral exudates engendered the smallest shift in overall bacterioplankton community structure, maintained high diversity and enriched taxa from Alphaproteobacteria lineages containing cultured representatives with relatively few virulence factors (VFs) (Hyphomonadaceae and Erythrobacteraceae). In contrast, macroalgal exudates selected for less diverse communities heavily enriched in copiotrophic Gammaproteobacteria lineages containing cultured pathogens with increased VFs (Vibrionaceae and Pseudoalteromonadaceae). Our results demonstrate that algal exudates are enriched in DCNS components, foster rapid growth of bacterioplankton and select for bacterial populations with more potential VFs than

  2. Coral and macroalgal exudates vary in neutral sugar composition and differentially enrich reef bacterioplankton lineages.

    PubMed

    Nelson, Craig E; Goldberg, Stuart J; Wegley Kelly, Linda; Haas, Andreas F; Smith, Jennifer E; Rohwer, Forest; Carlson, Craig A

    2013-05-01

    Increasing algal cover on tropical reefs worldwide may be maintained through feedbacks whereby algae outcompete coral by altering microbial activity. We hypothesized that algae and coral release compositionally distinct exudates that differentially alter bacterioplankton growth and community structure. We collected exudates from the dominant hermatypic coral holobiont Porites spp. and three dominant macroalgae (one each Ochrophyta, Rhodophyta and Chlorophyta) from reefs of Mo'orea, French Polynesia. We characterized exudates by measuring dissolved organic carbon (DOC) and fractional dissolved combined neutral sugars (DCNSs) and subsequently tracked bacterioplankton responses to each exudate over 48 h, assessing cellular growth, DOC/DCNS utilization and changes in taxonomic composition (via 16S rRNA amplicon pyrosequencing). Fleshy macroalgal exudates were enriched in the DCNS components fucose (Ochrophyta) and galactose (Rhodophyta); coral and calcareous algal exudates were enriched in total DCNS but in the same component proportions as ambient seawater. Rates of bacterioplankton growth and DOC utilization were significantly higher in algal exudate treatments than in coral exudate and control incubations with each community selectively removing different DCNS components. Coral exudates engendered the smallest shift in overall bacterioplankton community structure, maintained high diversity and enriched taxa from Alphaproteobacteria lineages containing cultured representatives with relatively few virulence factors (VFs) (Hyphomonadaceae and Erythrobacteraceae). In contrast, macroalgal exudates selected for less diverse communities heavily enriched in copiotrophic Gammaproteobacteria lineages containing cultured pathogens with increased VFs (Vibrionaceae and Pseudoalteromonadaceae). Our results demonstrate that algal exudates are enriched in DCNS components, foster rapid growth of bacterioplankton and select for bacterial populations with more potential VFs than

  3. Response of Bacterioplankton Communities to Cadmium Exposure in Coastal Water Microcosms with High Temporal Variability

    PubMed Central

    Wang, Kai; Xiong, Jinbo; Chen, Xinxin; Zheng, Jialai; Hu, Changju; Yang, Yina; Zhu, Jianlin

    2014-01-01

    Multiple anthropogenic disturbances to bacterial diversity have been investigated in coastal ecosystems, in which temporal variability in the bacterioplankton community has been considered a ubiquitous process. However, far less is known about the temporal dynamics of a bacterioplankton community responding to pollution disturbances such as toxic metals. We used coastal water microcosms perturbed with 0, 10, 100, and 1,000 μg liter−1 of cadmium (Cd) for 2 weeks to investigate temporal variability, Cd-induced patterns, and their interaction in the coastal bacterioplankton community and to reveal whether the bacterial community structure would reflect the Cd gradient in a temporally varying system. Our results showed that the bacterioplankton community structure shifted along the Cd gradient consistently after a 4-day incubation, although it exhibited some resistance to Cd at low concentration (10 μg liter−1). A process akin to an arms race between temporal variability and Cd exposure was observed, and the temporal variability overwhelmed Cd-induced patterns in the bacterial community. The temporal succession of the bacterial community was correlated with pH, dissolved oxygen, NO3−-N, NO2−-N, PO43−-P, dissolved organic carbon, and chlorophyll a, and each of these parameters contributed more to community variance than Cd did. However, elevated Cd levels did decrease the temporal turnover rate of community. Furthermore, key taxa, affiliated to the families Flavobacteriaceae, Rhodobacteraceae, Erythrobacteraceae, Piscirickettsiaceae, and Alteromonadaceae, showed a high frequency of being associated with Cd levels during 2 weeks. This study provides direct evidence that specific Cd-induced patterns in bacterioplankton communities exist in highly varying manipulated coastal systems. Future investigations on an ecosystem scale across longer temporal scales are needed to validate the observed pattern. PMID:25326310

  4. Response of bacterioplankton communities to cadmium exposure in coastal water microcosms with high temporal variability.

    PubMed

    Wang, Kai; Zhang, Demin; Xiong, Jinbo; Chen, Xinxin; Zheng, Jialai; Hu, Changju; Yang, Yina; Zhu, Jianlin

    2015-01-01

    Multiple anthropogenic disturbances to bacterial diversity have been investigated in coastal ecosystems, in which temporal variability in the bacterioplankton community has been considered a ubiquitous process. However, far less is known about the temporal dynamics of a bacterioplankton community responding to pollution disturbances such as toxic metals. We used coastal water microcosms perturbed with 0, 10, 100, and 1,000 μg liter(-1) of cadmium (Cd) for 2 weeks to investigate temporal variability, Cd-induced patterns, and their interaction in the coastal bacterioplankton community and to reveal whether the bacterial community structure would reflect the Cd gradient in a temporally varying system. Our results showed that the bacterioplankton community structure shifted along the Cd gradient consistently after a 4-day incubation, although it exhibited some resistance to Cd at low concentration (10 μg liter(-1)). A process akin to an arms race between temporal variability and Cd exposure was observed, and the temporal variability overwhelmed Cd-induced patterns in the bacterial community. The temporal succession of the bacterial community was correlated with pH, dissolved oxygen, NO3 (-)-N, NO2 (-)-N, PO4 (3-)-P, dissolved organic carbon, and chlorophyll a, and each of these parameters contributed more to community variance than Cd did. However, elevated Cd levels did decrease the temporal turnover rate of community. Furthermore, key taxa, affiliated to the families Flavobacteriaceae, Rhodobacteraceae, Erythrobacteraceae, Piscirickettsiaceae, and Alteromonadaceae, showed a high frequency of being associated with Cd levels during 2 weeks. This study provides direct evidence that specific Cd-induced patterns in bacterioplankton communities exist in highly varying manipulated coastal systems. Future investigations on an ecosystem scale across longer temporal scales are needed to validate the observed pattern.

  5. Topologically inferring pathway activity toward precise cancer classification via integrating genomic and metabolomic data: prostate cancer as a case

    PubMed Central

    Liu, Wei; Bai, Xuefeng; Liu, Yuejuan; Wang, Wei; Han, Junwei; Wang, Qiuyu; Xu, Yanjun; Zhang, Chunlong; Zhang, Shihua; Li, Xuecang; Ren, Zhonggui; Zhang, Jian; Li, Chunquan

    2015-01-01

    Precise cancer classification is a central challenge in clinical cancer research such as diagnosis, prognosis and metastasis prediction. Most existing cancer classification methods based on gene or metabolite biomarkers were limited to single genomics or metabolomics, and lacked integration and utilization of multiple ‘omics’ data. The accuracy and robustness of these methods when applied to independent cohorts of patients must be improved. In this study, we propose a directed random walk-based method to evaluate the topological importance of each gene in a reconstructed gene–metabolite graph by integrating information from matched gene expression profiles and metabolomic profiles. The joint use of gene and metabolite information contributes to accurate evaluation of the topological importance of genes and reproducible pathway activities. We constructed classifiers using reproducible pathway activities for precise cancer classification and risk metabolic pathway identification. We applied the proposed method to the classification of prostate cancer. Within-dataset experiments and cross-dataset experiments on three independent datasets demonstrated that the proposed method achieved a more accurate and robust overall performance compared to several existing classification methods. The resulting risk pathways and topologically important differential genes and metabolites provide biologically informative models for prostate cancer prognosis and therapeutic strategies development. PMID:26286638

  6. Phylogenetic position of tetraodontiform fishes within the higher teleosts: Bayesian inferences based on 44 whole mitochondrial genome sequences.

    PubMed

    Yamanoue, Yusuke; Miya, Masaki; Matsuura, Keiichi; Yagishita, Naoki; Mabuchi, Kohji; Sakai, Harumi; Katoh, Masaya; Nishida, Mutsumi

    2007-10-01

    Tetraodontiformes includes approximately 350 species assigned to nine families, sharing several reduced morphological features of higher teleosts. The order has been accepted as a monophyletic group by many authors, although several alternative hypotheses exist regarding its phylogenetic position within the higher teleosts. To date, acanthuroids, zeiforms, and lophiiforms have been proposed as sister-groups of the tetraodontiforms. The monophyly and sister-group status was investigated using whole mitochondrial genome (mitogenome) sequences from 44 purposefully-chosen species (26 sequences newly-determined during the study) that fully represent the major tetraodontiform lineages plus all the groups that have been hypothesized as being close relatives. Partitioned Bayesian analyses were conducted with the three datasets that comprised concatenated nucleotide sequences from 13 protein-coding genes (with and without, or with RY-coding, 3rd codon positions), plus 22 transfer RNA and two ribosomal RNA genes. The resultant trees were well resolved and largely congruent, with most internal branches being supported by high posterior probabilities. Mitogenomic data strongly supported the monophyly of tetraodontiform fishes, placing them as a sister-group of either Lophiiformes plus Caproidei or Caproidei only. The sister-group relationship between Acanthuroidei and Tetraodontiformes was statistically rejected using Bayes factors. These results were confirmed by a reanalysis of the previously published nuclear RAG1 gene sequences using the Bayesian method. Within the Tetraodontiformes, however, monophylies of the three superfamilies were not recovered and further taxonomic sampling and subsequent efforts should clarify these relationships.

  7. The Evolutionary History of Plasmodium vivax as Inferred from Mitochondrial Genomes: Parasite Genetic Diversity in the Americas

    PubMed Central

    Taylor, Jesse E.; Pacheco, M. Andreína; Bacon, David J.; Beg, Mohammad A.; Machado, Ricardo Luiz; Fairhurst, Rick M.; Herrera, Socrates; Kim, Jung-Yeon; Menard, Didier; Póvoa, Marinete Marins; Villegas, Leopoldo; Mulyanto; Snounou, Georges; Cui, Liwang; Zeyrek, Fadile Yildiz; Escalante, Ananias A.

    2013-01-01

    Plasmodium vivax is the most prevalent human malaria parasite in the Americas. Previous studies have contrasted the genetic diversity of parasite populations in the Americas with those in Asia and Oceania, concluding that New World populations exhibit low genetic diversity consistent with a recent introduction. Here we used an expanded sample of complete mitochondrial genome sequences to investigate the diversity of P. vivax in the Americas as well as in other continental populations. We show that the diversity of P. vivax in the Americas is comparable to that in Asia and Oceania, and we identify several divergent clades circulating in South America that may have resulted from independent introductions. In particular, we show that several haplotypes sampled in Venezuela and northeastern Brazil belong to a clade that diverged from the other P. vivax lineages at least 30,000 years ago, albeit not necessarily in the Americas. We propose that, unlike in Asia where human migration increases local genetic diversity, the combined effects of the geographical structure and the low incidence of vivax malaria in the Americas have resulted in patterns of low local but high regional genetic diversity. This could explain previous views that P. vivax in the Americas has low genetic diversity because these were based on studies carried out in limited areas. Further elucidation of the complex geographical pattern of P. vivax variation will be important both for diversity assessments of genes encoding candidate vaccine antigens and in the formulation of control and surveillance measures aimed at malaria elimination. PMID:23733143

  8. The evolutionary history of Plasmodium vivax as inferred from mitochondrial genomes: parasite genetic diversity in the Americas.

    PubMed

    Taylor, Jesse E; Pacheco, M Andreína; Bacon, David J; Beg, Mohammad A; Machado, Ricardo Luiz; Fairhurst, Rick M; Herrera, Socrates; Kim, Jung-Yeon; Menard, Didier; Póvoa, Marinete Marins; Villegas, Leopoldo; Mulyanto; Snounou, Georges; Cui, Liwang; Zeyrek, Fadile Yildiz; Escalante, Ananias A

    2013-09-01

    Plasmodium vivax is the most prevalent human malaria parasite in the Americas. Previous studies have contrasted the genetic diversity of parasite populations in the Americas with those in Asia and Oceania, concluding that New World populations exhibit low genetic diversity consistent with a recent introduction. Here we used an expanded sample of complete mitochondrial genome sequences to investigate the diversity of P. vivax in the Americas as well as in other continental populations. We show that the diversity of P. vivax in the Americas is comparable to that in Asia and Oceania, and we identify several divergent clades circulating in South America that may have resulted from independent introductions. In particular, we show that several haplotypes sampled in Venezuela and northeastern Brazil belong to a clade that diverged from the other P. vivax lineages at least 30,000 years ago, albeit not necessarily in the Americas. We propose that, unlike in Asia where human migration increases local genetic diversity, the combined effects of the geographical structure and the low incidence of vivax malaria in the Americas have resulted in patterns of low local but high regional genetic diversity. This could explain previous views that P. vivax in the Americas has low genetic diversity because these were based on studies carried out in limited areas. Further elucidation of the complex geographical pattern of P. vivax variation will be important both for diversity assessments of genes encoding candidate vaccine antigens and in the formulation of control and surveillance measures aimed at malaria elimination.

  9. ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments

    PubMed Central

    Lachmann, Alexander; Xu, Huilei; Krishnan, Jayanth; Berger, Seth I.; Mazloom, Amin R.; Ma'ayan, Avi

    2010-01-01

    Motivation: Experiments such as ChIP-chip, ChIP-seq, ChIP-PET and DamID (the four methods referred to herein as ChIP-X) are used to profile the binding of transcription factors to DNA at a genome-wide scale. Such experiments provide hundreds to thousands of potential binding sites for a given transcription factor in proximity to gene coding regions. Results: In order to integrate data from such studies and utilize it for further biological discovery, we collected interactions from such experiments to construct a mammalian ChIP-X database. The database contains 189 933 interactions, manually extracted from 87 publications, describing the binding of 92 transcription factors to 31 932 target genes. We used the database to analyze mRNA expression data where we perform gene-list enrichment analysis using the ChIP-X database as the prior biological knowledge gene-list library. The system is delivered as a web-based interactive application called ChIP Enrichment Analysis (ChEA). With ChEA, users can input lists of mammalian gene symbols for which the program computes over-representation of transcription factor targets from the ChIP-X database. The ChEA database allowed us to reconstruct an initial network of transcription factors connected based on shared overlapping targets and binding site proximity. To demonstrate the utility of ChEA we present three case studies. We show how by combining the Connectivity Map (CMAP) with ChEA, we can rank pairs of compounds to be used to target specific transcription factor activity in cancer cells. Availability: The ChEA software and ChIP-X database are freely available online at: http://amp.pharm.mssm.edu/lib/chea.jsp Contact: avi.maayan@mssm.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20709693
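
    The over-representation step described in this abstract can be sketched with a hypergeometric tail probability. The abstract does not state which statistic ChEA computes, so the choice of test, the function name, and the toy numbers below are assumptions for illustration only.

```python
from math import comb


def overrepresentation_p(overlap, list_size, target_size, background_size):
    """P(X >= overlap) when list_size genes are drawn at random from a
    background of background_size genes, target_size of which are targets
    of one transcription factor (hypergeometric null, assumed here)."""
    max_overlap = min(list_size, target_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(target_size, k) * comb(background_size - target_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy case: 20 background genes, 5 are targets of the factor,
# the user submits 5 genes and 3 of them are targets.
p_value = overrepresentation_p(overlap=3, list_size=5, target_size=5, background_size=20)
```

    A small p-value would suggest that the submitted list shares more targets with the factor than chance alone predicts; in practice such values are usually corrected for testing many factors at once.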

  10. Testing the phylogenetic position of a parasitic plant (Cuscuta, Convolvulaceae, Asteridae): Bayesian inference and the parametric bootstrap on data drawn from three genomes.

    PubMed

    Stefanović, Sasa; Olmstead, Richard G

    2004-06-01

    Previous findings on structural rearrangements in the chloroplast genome of Cuscuta (dodder), the only parasitic genus in the morning-glory family, Convolvulaceae, were attributed to its parasitic life style, but without proper comparison to related nonparasitic members of the family. Before molecular evolutionary questions regarding genome evolution can be answered, the phylogenetic problems within the family need to be resolved. However, the phylogenetic position of parasitic angiosperms and their precise relationship to nonparasitic relatives are difficult to infer. Problems are encountered with both morphological and molecular evidence. Molecular data have been used in numerous studies to elucidate relationships of parasitic taxa, despite accelerated rates of sequence evolution. To address the question of the position of the genus Cuscuta within Convolvulaceae, we generated a new molecular data set consisting of mitochondrial (atpA) and nuclear (RPB2) genes, and analyzed these data together with an existing chloroplast data matrix (rbcL, atpB, trnL-F, and psbE-J), to which an additional chloroplast gene (rpl2) was added. This data set was analyzed with an array of phylogenetic methods, including Bayesian analysis, maximum likelihood, and maximum parsimony. Further exploration of data was done by using methods of phylogeny hypothesis testing. At least two nonparasitic lineages are shown to diverge within the Convolvulaceae before Cuscuta. However, the exact sister group of Cuscuta could not be ascertained, even though many alternatives were rejected with confidence. Caution is therefore warranted when interpreting the causes of molecular evolution in Cuscuta. Detailed comparisons with nonparasitic Convolvulaceae are necessary before firm conclusions can be reached regarding the effects of the parasitic mode of life on patterns of molecular evolution in Cuscuta.

  11. Bacterioplankton responses to iron enrichment during the SAGE experiment

    NASA Astrophysics Data System (ADS)

    Kuparinen, J.; Hall, J.; Ellwood, M.; Safi, K.; Peloquin, J.; Katz, D.

    2011-03-01

    We studied the microbial food web in the upper 100 m of the water column in iron-limited sub-Antarctic HNLC waters south-east of New Zealand in the SAGE experiment in 2004, with a focus on bacterioplankton. Samples were collected daily from inside and outside the iron-enriched patch. Short-term enrichment experiments were conducted on board in 4 L polycarbonate bottles with water from outside the iron-enriched patch to study single and combined effects of micronutrient additions on the microbial food web. Low bacterial growth was recorded in the study area, with community turnover times of 50 h or more during the study period. Measurements of bacterial standing stocks and production rates show minor responses to the large-scale iron enrichment, with increases in rates and stocks after the first enrichment and at the end of the study period after the third iron enrichment, when solar radiation increased and wind mixing decreased. The average daily bacterial production rates were 31.5 and 33.7 mg C m⁻² d⁻¹ for the OUT and IN stations, respectively; thus overall there was no significant difference between the control and the iron-enriched patch. In the bottle experiments bacterial thymidine incorporation showed responses to single iron and silicic acid enrichments and a major growth response to the combined iron and sucrose enrichments. Phytoplankton chlorophyll a showed clear stimulation by single additions of iron and silicic acid, and silicic acid enhanced the iron impact. Cobalt additions had no effect on bacterial growth and a negative effect on phytoplankton growth. Low bacterial in situ growth rates and the enrichment experiments suggest that bacteria are co-limited by iron and carbon, and that bacterial iron uptake is dependent on carbon supply by the food web. With the high iron quota (μmol Fe mol C⁻¹), bacteria may scavenge considerable amounts of the excess iron, and thus influence the relative importance of the microbial food web as a carbon sink.

  12. Ecological Inference

    NASA Astrophysics Data System (ADS)

    King, Gary; Rosen, Ori; Tanner, Martin A.

    2004-09-01

    This collection of essays brings together a diverse group of scholars to survey the latest strategies for solving ecological inference problems in various fields. The last half-decade has witnessed an explosion of research in ecological inference--the process of trying to infer individual behavior from aggregate data. Although uncertainties and information lost in aggregation make ecological inference one of the most problematic types of research to rely on, these inferences are required in many academic fields, as well as by legislatures and the Courts in redistricting, by business in marketing research, and by governments in policy analysis.

  13. Distribution, Community Composition, and Potential Metabolic Activity of Bacterioplankton in an Urbanized Mediterranean Sea Coastal Zone.

    PubMed

    Richa, Kumari; Balestra, Cecilia; Piredda, Roberta; Benes, Vladimir; Borra, Marco; Passarelli, Augusto; Margiotta, Francesca; Saggiomo, Maria; Biffali, Elio; Sanges, Remo; Scanlan, David J; Casotti, Raffaella

    2017-09-01

    Bacterioplankton are fundamental components of marine ecosystems and influence the entire biosphere by contributing to the global biogeochemical cycles of key elements. Yet, there is a significant gap in knowledge about their diversity and specific activities, as well as environmental factors that shape their community composition and function. Here, the distribution and diversity of surface bacterioplankton along the coastline of the Gulf of Naples (GON; Italy) were investigated using flow cytometry coupled with high-throughput sequencing of the 16S rRNA gene. Heterotrophic bacteria numerically dominated the bacterioplankton and comprised mainly Alphaproteobacteria, Gammaproteobacteria, and Bacteroidetes. Distinct communities occupied river-influenced, coastal, and offshore sites, as indicated by Bray-Curtis dissimilarity, distance metric (UniFrac), linear discriminant analysis effect size (LEfSe), and multivariate analyses. The heterogeneity in diversity and community composition was mainly due to salinity and changes in environmental conditions across sites, as defined by nutrient and chlorophyll a concentrations. Bacterioplankton communities were composed of a few dominant taxa and a large proportion (92%) of rare taxa (here defined as operational taxonomic units [OTUs] accounting for <0.1% of the total sequence abundance), the majority of which were unique to each site. The relationship between 16S rRNA and the 16S rRNA gene, i.e., between potential metabolic activity and abundance, was positive for the whole community. However, analysis of individual OTUs revealed high rRNA-to-rRNA gene ratios for most (71.6% ± 16.7%) of the rare taxa, suggesting that these low-abundance organisms were potentially active and hence might be playing an important role in ecosystem diversity and functioning in the GON. IMPORTANCE: The study of bacterioplankton in coastal zones is of critical importance, considering that these areas are highly productive and anthropogenically

  14. Rapid turnover of dissolved DMS and DMSP by defined bacterioplankton communities in the stratified euphotic zone of the North Sea

    NASA Astrophysics Data System (ADS)

    Zubkov, Mikhail V.; Fuchs, Bernhard M.; Archer, Stephen D.; Kiene, Ronald P.; Amann, Rudolf; Burkill, Peter H.

    Bacterioplankton-driven turnover of the algal osmolyte, dimethylsulphoniopropionate (DMSP), and its degradation product, dimethylsulphide (DMS), the major natural source of atmospheric sulphur, were studied during a Lagrangian SF6-tracer experiment in the North Sea (60°N, 3°E). The water mass sampled within the euphotic zone was characterised by a surface mixed layer (from 0 m to 13-30 m) and a subsurface layer (from 13-30 m to 45-58 m) separated by a 2°C thermocline spanning 2 m. The fluxes of dissolved DMSP (DMSPd) and DMS were determined using radioactive tracer techniques. Rates of the simultaneous incorporation of 14C-leucine and 3H-thymidine were measured to estimate bacterioplankton production. Flow cytometry was employed to discriminate subpopulations and to determine the numbers and biomass of bacterioplankton by staining for nucleic acids and proteins. Bacterioplankton subpopulations were separated by flow cytometric sorting and their composition determined using 16S ribosomal gene cloning/sequencing and fluorescence in situ hybridisation with designed group-specific oligonucleotide probes. A subpopulation, dominated by bacteria related to Roseobacter (α-proteobacteria), constituted 26-33% of total bacterioplankton numbers and 45-48% of biomass in both surface and subsurface layers. The other abundant prokaryotes were a group within the SAR86 cluster of γ-proteobacteria and bacteria from the Cytophaga-Flavobacterium cluster. Bacterial consumption of DMSPd was greater in the subsurface layer (41 nM d-1) than in the surface layer (20 nM d-1). Bacterioplankton tightly controlled the DMSPd pool, particularly in the subsurface layer, with a turnover time of 2 h, whereas the turnover time of DMSPd in the surface layer was 10 h. Consumed DMSP satisfied the majority of sulphur demands of bacterioplankton, even though bacterioplankton assimilated only about 2.5% and 6.0% of consumed DMSPd sulphur in the surface and subsurface layers, respectively

  15. Covariance of bacterioplankton composition and environmental variables in a temperate delta system

    USGS Publications Warehouse

    Stepanauskas, R.; Moran, M.A.; Bergamaschi, B.A.; Hollibaugh, J.T.

    2003-01-01

    We examined seasonal and spatial variation in bacterioplankton composition in the Sacramento-San Joaquin River Delta (CA) using terminal restriction fragment length polymorphism (T-RFLP) analysis. Cloned 16S rRNA genes from this system were used for putative identification of taxa dominating the T-RFLP profiles. Both cloning and T-RFLP analysis indicated that Actinobacteria, Verrucomicrobia, Cytophaga-Flavobacterium and Proteobacteria were the most abundant bacterioplankton groups in the Delta. Despite the broad variety of sampled habitats (deep water channels, lakes, marshes, agricultural drains, freshwater and brackish areas), and the spatial and temporal differences in hydrology, temperature and water chemistry among the sampling campaigns, T-RFLP electropherograms from all samples were similar, indicating that the same bacterioplankton phylotypes dominated in the various habitats of the Delta throughout the year. However, principal component analysis (PCA) and partial least-squares regression (PLS) of T-RFLP profiles revealed consistent grouping of samples on a seasonal, but not a spatial, basis. β-Proteobacteria related to Ralstonia, Actinobacteria related to Microthrix, and α-Proteobacteria identical to the environmental Clone LD12 had the highest relative abundance in summer/fall T-RFLP profiles and were associated with low river flow, high pH, and a number of optical and chemical characteristics of dissolved organic carbon (DOC) indicative of an increased proportion of phytoplankton-produced organic material as opposed to allochthonous, terrestrially derived organic material. On the other hand, Geobacter-related δ-Proteobacteria showed a relative increase in abundance in T-RFLP analysis during winter/spring, and probably were washed out from watershed soils or sediment. Various phylotypes associated with the same phylogenetic division, based on tentative identification of T-RFLP fragments, exhibited diverse seasonal patterns, suggesting that ecological

  16. Magnitude and regulation of bacterioplankton respiratory quotient across freshwater environmental gradients

    PubMed Central

    Berggren, Martin; Lapierre, Jean-François; del Giorgio, Paul A

    2012-01-01

    Bacterioplankton respiration (BR) may represent the largest single sink of organic carbon in the biosphere and constitutes an important driver of atmospheric carbon dioxide (CO2) emissions from freshwaters. Complete understanding of BR is precluded by the fact that most studies need to assume a respiratory quotient (RQ; mole of CO2 produced per mole of O2 consumed) to calculate rates of BR. Many studies have, without clear support, assumed a fixed RQ around 1. Here we present 72 direct measurements of bacterioplankton RQ that we carried out in epilimnetic samples of 52 freshwater sites in Québec (Canada), using O2 and CO2 optic sensors. The RQs tended to converge around 1.2, but showed large variability (s.d.=0.45) and significant correlations with major gradients of ecosystem-level, substrate-level and bacterial community-level characteristics. Experiments with natural bacterioplankton using different single substrates suggested that RQ is intimately linked to the elemental composition of the respired compounds. RQs were on average low in net autotrophic systems, where bacteria likely were utilizing mainly reduced substrates, whereas we found evidence that the dominance of highly oxidized substrates, for example, organic acids formed by photo-chemical processes, led to high RQ in the more heterotrophic systems. Further, we suggest that BR contributes to a substantially larger share of freshwater CO2 emissions than presently believed based on the assumption that RQ is ∼1. Our study demonstrates that bacterioplankton RQ is not only a practical aspect of BR determination, but also a major ecosystem state variable that provides unique information about aquatic ecosystem functioning. PMID:22094347
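
    As a rough illustration of why the assumed RQ matters, the sketch below converts an O2 consumption rate into a CO2 production rate under two RQ values: the conventional 1.0 and the value of about 1.2 around which the measured RQs converged. The function name and the example O2 rate are hypothetical and are not taken from the study.

```python
def co2_production(o2_consumption, rq):
    """Convert O2 consumption to CO2 production using RQ = mol CO2 / mol O2.

    The result is in the same molar units as o2_consumption.
    """
    if rq <= 0:
        raise ValueError("RQ must be positive")
    return o2_consumption * rq


# Hypothetical O2 consumption rate (umol O2 per litre per day); not a study value.
o2_rate = 10.0

assumed = co2_production(o2_rate, 1.0)   # conventional fixed RQ of 1
measured = co2_production(o2_rate, 1.2)  # RQ near which the measurements converged

# Fraction of the CO2 production that is missed when RQ = 1 is assumed
# but the true RQ is 1.2.
underestimate = (measured - assumed) / measured

print(f"RQ=1.0: {assumed:.1f}  RQ=1.2: {measured:.1f}  missed: {underestimate:.1%}")
```

    Because the conversion is linear, any bias in the assumed RQ carries over proportionally to the estimated CO2 flux, which is the reason the reported variability (s.d.=0.45) matters for BR-based emission estimates.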

  17. Magnitude and regulation of bacterioplankton respiratory quotient across freshwater environmental gradients.

    PubMed

    Berggren, Martin; Lapierre, Jean-François; del Giorgio, Paul A

    2012-05-01

    Bacterioplankton respiration (BR) may represent the largest single sink of organic carbon in the biosphere and constitutes an important driver of atmospheric carbon dioxide (CO2) emissions from freshwaters. Complete understanding of BR is precluded by the fact that most studies need to assume a respiratory quotient (RQ; mole of CO2 produced per mole of O2 consumed) to calculate rates of BR. Many studies have, without clear support, assumed a fixed RQ around 1. Here we present 72 direct measurements of bacterioplankton RQ that we carried out in epilimnetic samples of 52 freshwater sites in Québec (Canada), using O2 and CO2 optic sensors. The RQs tended to converge around 1.2, but showed large variability (s.d.=0.45) and significant correlations with major gradients of ecosystem-level, substrate-level and bacterial community-level characteristics. Experiments with natural bacterioplankton using different single substrates suggested that RQ is intimately linked to the elemental composition of the respired compounds. RQs were on average low in net autotrophic systems, where bacteria likely were utilizing mainly reduced substrates, whereas we found evidence that the dominance of highly oxidized substrates, for example, organic acids formed by photo-chemical processes, led to high RQ in the more heterotrophic systems. Further, we suggest that BR contributes to a substantially larger share of freshwater CO2 emissions than presently believed based on the assumption that RQ is ∼1. Our study demonstrates that bacterioplankton RQ is not only a practical aspect of BR determination, but also a major ecosystem state variable that provides unique information about aquatic ecosystem functioning.

  18. Impact of solar radiation on bacterioplankton in Laguna Vilama, a hypersaline Andean lake (4650 m)

    NASA Astrophysics Data System (ADS)

    Farías, María Eugenia; Fernández-Zenoff, Verónica; Flores, Regina; Ordóñez, Omar; Estévez, Cristina

    2009-06-01

    Laguna Vilama is a hypersaline lake located at 4660 m altitude in the northwest of Argentina, high up in the Andean Puna. The impact of ultraviolet (UV) radiation on bacterioplankton was studied by collecting samples at different times of the day. Molecular analysis (DGGE) showed that the bacterioplankton community is characterized by Gamma-proteobacteria (Halomonas sp., Marinobacter sp.), Alpha-proteobacteria (Roseobacter sp.), HGC (Agrococcus jenensis and an uncultured bacterium), and CFB (uncultured Bacteroidetes). During the day, minor modifications in bacterial diversity such as intensification of Bacteroidetes' signal and an emergence of Gamma-proteobacteria (Marinobacter flavimaris) were observed after solar exposure. DNA damage, measured as an accumulation of Cyclobutane Pyrimidine Dimers (CPDs), in bacterioplankton and naked DNA increased from 100 CPDs MB-1 at 1200 local time (LT) to 300 CPDs MB-1 at 1600 LT, and from 80 CPDs MB-1 at 1200 LT to 640 CPDs MB-1 at 1600 LT, respectively. In addition, pure cultures of Pseudomonas sp. V1 and Brachybacterium sp. V5, two bacteria previously isolated from this environment, were exposed simultaneously with the community, and viability of both strains diminished after solar exposure. No CPD accumulation was observed in either of the exposed cultures, but an increase in mutagenesis was detected in V5. Of the two strains, only Brachybacterium sp. V5 showed CPD accumulation in naked DNA. These results suggest that the bacterioplankton community is well adapted to this highly solar-irradiated environment, showing little accumulation of CPDs and few changes in the community composition. They also demonstrate that these microorganisms contain efficient mechanisms against UV damage.

  19. Seasonal patterns of the bacterioplankton community composition in a lake threatened by a pesticide disposal site.

    PubMed

    Lew, Sylwia; Lew, Marcin; Szarek, Józef; Babińska, Izabella

    2011-03-01

    BACKGROUND, AIM AND SCOPE: The objective of the study was to determine the effects of ca. 35 years of pesticide contamination (pesticide dump, PD) of Lake Szeląg Wielki (located in north-eastern Poland) on changes in the microbial communities of aquatic ecosystems. In the years 2008-2009, analyses were carried out for seasonal changes in the quantity and composition of bacterioplankton in the lake examined, which is of high significance to the tourism and fishing industries and is located in the vicinity of an area subjected to reclamation after a pesticide dump. Bacterioplankton composition was assayed by fluorescence in situ hybridisation technique for the contribution of major groups of the Bacteria domain: α-, β- and γ-Proteobacteria, Cytophaga-Flavobacterium and Actinobacteria, as well as bacteria capable of degrading pesticides in an aquatic environment (Pseudomonas spp.). Seasonal patterns of the total number of bacteria were determined by direct counting of 4',6-diamidino-2-phenylindole (DAPI)-stained cells. The percentage of the detected Eubacteria (EUB 338 probe) relative to all the DAPI-stained bacteria in Lake Szeląg Wielki ranged from 46% to 63%. Bacteria capable of degrading pesticides in an aquatic environment (Pseudomonas spp.) were identified with a highly specific probe PEA 998. The highest mean values of this parameter reached 5.1%. In the spring, Pseudomonas spp. bacteria accounted for up to 80% of all Gamma-Proteobacteria microbes. The study showed that the qualitative and quantitative changes in the bacterioplankton of the lake can be characterised by tendencies which are typical of a eutrophic water reservoir. However, a higher contribution of microorganisms capable of degrading sparingly degradable, toxic compounds and pesticides was determined in bacterioplankton from the PD-contaminated lake, as compared to microbial communities of a lake not contaminated with pesticides.

  20. The dynamics of carbon exchange in vertically stratified coastal bacterioplankton communities

    SciTech Connect

    Blum, P.

    1998-07-01

    This research focuses on the development and application of novel molecular methods to measure bacterioplankton growth state in situ. These methods included bulk or population-based studies and single cell studies. Due to the limited duration of support and subsequent termination of the molecular-focused PIs, only the former bulk method was applied to marine samples. In addition, basic laboratory studies were completed which addressed why the selected biomarkers were regulated by bacterial growth state.

  1. Non-random assembly of bacterioplankton communities in the subtropical North Pacific Ocean.

    PubMed

    Eiler, Alexander; Hayakawa, Darin H; Rappé, Michael S

    2011-01-01

    The exploration of bacterial diversity in the global ocean has revealed new taxa and previously unrecognized metabolic potential; however, our understanding of what regulates this diversity is limited. Using terminal restriction fragment length polymorphism (T-RFLP) data from bacterial small-subunit ribosomal RNA genes we show that, independent of depth and time, a large fraction of bacterioplankton co-occurrence patterns are non-random in the oligotrophic North Pacific subtropical gyre (NPSG). Pair-wise correlations of all identified operational taxonomic units (OTUs) revealed a high degree of significance, with 6.6% of the pair-wise co-occurrences being negatively correlated and 20.7% of them being positive. The most abundant OTUs, putatively identified as Prochlorococcus, SAR11, and SAR116 bacteria, were among the most correlated OTUs. As expected, bacterial community composition lacked statistically significant patterns of seasonality in the mostly stratified water column except in a few depth horizons of the sunlit surface waters, with higher frequency variations in community structure apparently related to populations associated with the deep chlorophyll maximum. Communities were structured vertically into epipelagic, mesopelagic, and bathypelagic populations. Permutation-based statistical analyses of T-RFLP data and their corresponding metadata revealed a broad range of putative environmental drivers controlling bacterioplankton community composition in the NPSG, including concentrations of inorganic nutrients and phytoplankton pigments. Together, our results suggest that deterministic forces such as environmental filtering and interactions among taxa determine bacterioplankton community patterns, and consequently affect ecosystem functions in the NPSG.
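
    The pair-wise co-occurrence analysis described above can be sketched as follows. This is a minimal illustration only: the abstract does not state which correlation statistic or significance test was used, so the Spearman rank correlation, the permutation test, the function names, and the toy OTU table below are all assumptions.

```python
import random
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for m in range(i, j + 1):
            result[order[m]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def classify_pairs(otu_table, n_perm=999, alpha=0.05, seed=0):
    """Label each OTU pair 'positive', 'negative' or 'n.s.' (not significant).

    otu_table maps an OTU name to its abundances across the same samples.
    Significance comes from a two-sided permutation test (an assumed choice).
    """
    rng = random.Random(seed)
    labels = {}
    for a, b in combinations(sorted(otu_table), 2):
        x, y = otu_table[a], otu_table[b]
        rho = spearman(x, y)
        hits = 0
        for _ in range(n_perm):
            shuffled = list(y)
            rng.shuffle(shuffled)
            if abs(spearman(x, shuffled)) >= abs(rho):
                hits += 1
        p_value = (hits + 1) / (n_perm + 1)
        if p_value >= alpha:
            labels[(a, b)] = "n.s."
        else:
            labels[(a, b)] = "positive" if rho > 0 else "negative"
    return labels


# Hypothetical abundances for three OTUs across eight samples.
table = {
    "OTU_A": [1, 2, 3, 4, 5, 6, 7, 8],
    "OTU_B": [2, 4, 6, 8, 10, 12, 14, 16],
    "OTU_C": [8, 7, 6, 5, 4, 3, 2, 1],
}
labels = classify_pairs(table)
share_positive = sum(v == "positive" for v in labels.values()) / len(labels)
share_negative = sum(v == "negative" for v in labels.values()) / len(labels)
print(labels, share_positive, share_negative)
```

    The shares of positive and negative pairs computed at the end correspond in kind to the 20.7% and 6.6% figures reported in the abstract, although a real analysis over many OTU pairs would also need a correction for multiple testing.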

  2. Non-Random Assembly of Bacterioplankton Communities in the Subtropical North Pacific Ocean

    PubMed Central

    Eiler, Alexander; Hayakawa, Darin H.; Rappé, Michael S.

    2011-01-01

    The exploration of bacterial diversity in the global ocean has revealed new taxa and previously unrecognized metabolic potential; however, our understanding of what regulates this diversity is limited. Using terminal restriction fragment length polymorphism (T-RFLP) data from bacterial small-subunit ribosomal RNA genes we show that, independent of depth and time, a large fraction of bacterioplankton co-occurrence patterns are non-random in the oligotrophic North Pacific subtropical gyre (NPSG). Pair-wise correlations of all identified operational taxonomic units (OTUs) revealed a high degree of significance, with 6.6% of the pair-wise co-occurrences being negatively correlated and 20.7% of them being positive. The most abundant OTUs, putatively identified as Prochlorococcus, SAR11, and SAR116 bacteria, were among the most correlated OTUs. As expected, bacterial community composition lacked statistically significant patterns of seasonality in the mostly stratified water column except in a few depth horizons of the sunlit surface waters, with higher frequency variations in community structure apparently related to populations associated with the deep chlorophyll maximum. Communities were structured vertically into epipelagic, mesopelagic, and bathypelagic populations. Permutation-based statistical analyses of T-RFLP data and their corresponding metadata revealed a broad range of putative environmental drivers controlling bacterioplankton community composition in the NPSG, including concentrations of inorganic nutrients and phytoplankton pigments. Together, our results suggest that deterministic forces such as environmental filtering and interactions among taxa determine bacterioplankton community patterns, and consequently affect ecosystem functions in the NPSG. PMID:21747815

  3. Tracking differential incorporation of dissolved organic carbon types among diverse lineages of Sargasso Sea bacterioplankton.

    PubMed

    Nelson, Craig E; Carlson, Craig A

    2012-06-01

    Bacterioplankton are the primary trophic conduit for dissolved organic carbon (DOC) and linking community structure with DOC utilization is central to understanding global carbon cycling. We coupled stable isotope probing (SIP) with 16S rRNA pyrosequencing in dark seawater culture experiments on euphotic and mesopelagic communities from the Sargasso Sea. Parallel cultures were amended with equimolar quantities of four DO13C substrates to simultaneously evaluate community utilization and population-specific incorporation. Of the substrates tested - two cyanobacterial products (exudates or lysates from a culture of Synechococcus) and two defined monosaccharides (glucose or gluconic acid) - the cyanobacterial exudates were incorporated by the greatest diversity of oligotrophic bacterioplankton populations in surface waters, including taxa from > 10 major subclades within the Flavobacteria, Actinobacteria, Verrucomicrobia and Proteobacteria (including SAR11). In contrast, the monosaccharide glucose was not incorporated by any taxa belonging to extant oligotrophic oceanic clades. Conversely, proteobacterial copiotrophs, which were rare in the ambient water (< 0.1% of sequences), grew rapidly on all DOC amendments at both depths, but with different substrate preferences among lineages. We present a new analytical framework for using SIP to detect DOC incorporation across diverse oligotrophic bacterioplankton and discuss implications for the ecology of bacterial-DOC interactions among populations of diverging trophic strategies.

  4. Understanding diversity patterns in bacterioplankton communities from a sub-Antarctic peatland.

    PubMed

    Quiroga, María Victoria; Valverde, Angel; Mataloni, Gabriela; Cowan, Don

    2015-06-01

    Bacterioplankton communities inhabiting peatlands have the potential to influence local ecosystem functions. However, most microbial ecology research in such wetlands has been done in ecosystems (mostly peat soils) of the Northern Hemisphere, and very little is known of the factors that drive bacterial community assembly in other regions of the world. In this study, we used high-throughput sequencing to analyse the structure of the bacterial communities in five pools located in a sub-Antarctic peat bog (Tierra del Fuego, Argentina), and tested for relationships between bacterial communities and environmental conditions. Bacterioplankton communities in peat bog pools were diverse and dominated by members of the Proteobacteria, Actinobacteria, Bacteroidetes and Verrucomicrobia. Community structure was largely explained by differences in hydrological connectivity, pH and nutrient status (ombrotrophic versus minerotrophic pools). Bacterioplankton communities in ombrotrophic pools showed phylogenetic clustering, suggesting a dominant role of deterministic processes in shaping these assemblages. These correlations between habitat characteristics and bacterial diversity patterns provide new insights into the factors regulating microbial populations in peatland ecosystems.

  5. Alkaline phosphatases in microbialites and bacterioplankton from Alchichica soda lake, Mexico.

    PubMed

    Valdespino-Castillo, Patricia M; Alcántara-Hernández, Rocio J; Alcocer, Javier; Merino-Ibarra, Martín; Macek, Miroslav; Falcón, Luisa I

    2014-11-01

    Dissolved organic phosphorus utilization by different members of natural communities has been closely linked to microbial alkaline phosphatases whose affiliation and diversity is largely unknown. Here we assessed genetic diversity of bacterial alkaline phosphatases phoX and phoD, using highly diverse microbial consortia (microbialites and bacterioplankton) as study models. These microbial consortia are found in an oligo-mesotrophic soda lake with a particular geochemistry, exhibiting a low calcium concentration and a high Mg : Ca ratio relative to seawater. In spite of the relatively low calcium concentration in the studied system, our results highlight the diversity of calcium-based metallophosphatases phoX and phoD-like in heterotrophic bacteria of microbialites and bacterioplankton, where phoX was the most abundant alkaline phosphatase found. phoX and phoD-like phylotypes were more numerous in microbialites than in bacterioplankton. A larger potential community for DOP utilization in microbialites was consistent with the TN : TP ratio, suggesting P limitation within these assemblages. A cross-system comparison indicated that diversity of phoX in Lake Alchichica was similar to that of other aquatic systems with a naturally contrasting ionic composition and trophic state, although no phylotypes were shared among systems.

  6. Bacterioplankton community shifts associated with epipelagic and mesopelagic waters in the Southern Ocean

    PubMed Central

    Yu, Zheng; Yang, Jun; Liu, Lemian; Zhang, Wenjing; Amalfitano, Stefano

    2015-01-01

    The Southern Ocean is among the least explored marine environments on Earth, and still little is known about regional and vertical variability in the diversity of Antarctic marine prokaryotes. In this study, the bacterioplankton community in both epipelagic and mesopelagic waters was assessed at two adjacent stations by high-throughput sequencing and quantitative PCR. Water temperature was significantly higher in the superficial photic zone, while higher salinity and dissolved oxygen were recorded in the deeper water layers. The highest abundance of the bacterioplankton was found at a depth of 75 m, corresponding to the deep chlorophyll maximum layer. Both Alphaproteobacteria and Gammaproteobacteria were the most abundant taxa throughout the water column, while more sequences affiliated to Cyanobacteria and unclassified bacteria were identified from surface and the deepest waters, respectively. Temperature was the most significant environmental variable affecting the bacterial community structure. The bacterial community composition displayed significant differences at the epipelagic layers between two stations, whereas those in the mesopelagic waters were more similar to each other. Our results indicated that the epipelagic bacterioplankton might be dominated by short-term variable environmental conditions, whereas the mesopelagic communities appeared to be structured by longer water-mass residence time and relatively stable environmental factors. PMID:26256889

  7. Seasonal assemblages and short-lived blooms in coastal north-west Atlantic Ocean bacterioplankton.

    PubMed

    El-Swais, Heba; Dunn, Katherine A; Bielawski, Joseph P; Li, William K W; Walsh, David A

    2015-10-01

    Temperate oceans are inhabited by diverse and temporally dynamic bacterioplankton communities. However, the role of the environment, resources and phytoplankton dynamics in shaping marine bacterioplankton communities at different time scales remains poorly constrained. Here, we combined time series observations (time scales of weeks to years) with molecular analysis of formalin-fixed samples from a coastal inlet of the north-west Atlantic Ocean to show that a combination of temperature, nitrate, small phytoplankton and Synechococcus abundances are the best predictors for annual bacterioplankton community variability, explaining 38% of the variation. Using Bayesian mixed modelling, we identified assemblages of co-occurring bacteria associated with different seasonal periods, including the spring bloom (e.g. Polaribacter, Ulvibacter, Alteromonadales and ARCTIC96B-16) and the autumn bloom (e.g. OM42, OM25, OM38 and Arctic96A-1 clades of Alphaproteobacteria, and SAR86, OM60 and SAR92 clades of Gammaproteobacteria). Community variability over spring bloom development was best explained by silicate (32%), an indication of rapid succession of bacterial taxa in response to diatom biomass, while nanophytoplankton as well as picophytoplankton abundance explained community variability (16-27%) over the transition into and out of the autumn bloom. Moreover, the seasonal structure was punctuated with short-lived blooms of rare bacteria including the KSA-1 clade of Sphingobacteria related to aromatic hydrocarbon-degrading bacteria.

  8. Temporal variability in the diversity and composition of stream bacterioplankton communities.

    PubMed

    Portillo, Maria C; Anderson, Suzanne P; Fierer, Noah

    2012-09-01

    Bacterioplankton in freshwater streams play a critical role in stream nutrient cycling. Despite their ecological importance, the temporal variability in the structure of stream bacterioplankton communities remains understudied. We investigated the composition and temporal variability of stream bacterial communities and the influence of physicochemical parameters on these communities. We used barcoded pyrosequencing to survey bacterial communities in 107 streamwater samples collected from four locations in the Colorado Rocky Mountains from September 2008 to November 2009. The four sampled locations harboured distinct communities yet, at each sampling location, there was pronounced temporal variability in both community composition and alpha diversity levels. These temporal shifts in bacterioplankton community structure were not seasonal; rather, their diversity and composition appeared to be driven by intermittent changes in various streamwater biogeochemical conditions. Bacterial communities varied independently of time, as indicated by the observation that communities in samples collected close together in time were no more similar than those collected months apart. The temporal turnover in community composition was higher than observed in most previously studied microbial, plant or animal communities, highlighting the importance of stochastic processes and disturbance events in structuring these communities over time. Detailed temporal sampling is important if the objective is to monitor microbial community dynamics in pulsed ecosystems like streams.
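
    The comparison between samples taken close together in time and those taken months apart can be sketched as below. The abstract does not name the similarity measure, so Bray-Curtis dissimilarity is an assumed choice here, and the function names, the 14-day threshold, and the sample data are hypothetical.

```python
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    if len(x) != len(y):
        raise ValueError("abundance vectors must have the same length")
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(x, y)) / total


def mean_dissimilarity_by_lag(samples, max_short_lag):
    """Return (mean short-lag dissimilarity, mean long-lag dissimilarity).

    samples is a list of (sampling_day, abundance_vector) tuples. Pairs whose
    time difference is at most max_short_lag days count as short-lag pairs.
    """
    short, long_ = [], []
    for (t1, c1), (t2, c2) in combinations(samples, 2):
        bucket = short if abs(t2 - t1) <= max_short_lag else long_
        bucket.append(bray_curtis(c1, c2))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(short), mean(long_)


# Hypothetical OTU counts for one location sampled on four days.
samples = [
    (0, [10, 0, 5]),
    (7, [0, 10, 5]),
    (90, [10, 0, 5]),
    (97, [0, 10, 5]),
]
short_mean, long_mean = mean_dissimilarity_by_lag(samples, max_short_lag=14)
print(f"short-lag mean: {short_mean:.3f}, long-lag mean: {long_mean:.3f}")
```

    In this toy data set the short-lag pairs are no more similar than the long-lag pairs, which is the qualitative pattern the abstract reports; a seasonal community would instead show dissimilarity rising with time lag over part of the year.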

  9. Marine bacterioplankton community turnover within seasonally hypoxic waters of a subtropical sound: Devil's Hole, Bermuda.

    PubMed

    Parsons, Rachel J; Nelson, Craig E; Carlson, Craig A; Denman, Carmen C; Andersson, Andreas J; Kledzik, Andrew L; Vergin, Kevin L; McNally, Sean P; Treusch, Alexander H; Giovannoni, Stephen J

    2015-10-01

    Understanding bacterioplankton community dynamics in coastal hypoxic environments is relevant to global biogeochemistry because coastal hypoxia is increasing worldwide. The temporal dynamics of bacterioplankton communities were analysed throughout the illuminated water column of Devil's Hole, Bermuda during the 6-week annual transition from a strongly stratified water column with suboxic and high-pCO2 bottom waters to a fully mixed and ventilated state during 2008. A suite of culture-independent methods provided a quantitative spatiotemporal characterization of bacterioplankton community changes, including both direct counts and rRNA gene sequencing. During stratification, the surface waters were dominated by the SAR11 clade of Alphaproteobacteria and the cyanobacterium Synechococcus. In the suboxic bottom waters, cells from the order Chlorobiales prevailed, with gene sequences indicating members of the genera Chlorobium and Prosthecochloris, anoxygenic photoautotrophs that utilize sulfide as a source of electrons for photosynthesis. Transitional zones of hypoxia also exhibited elevated levels of methane- and sulfur-oxidizing bacteria relative to the overlying waters. The abundances of both Thaumarchaeota and Euryarchaeota were elevated in the suboxic bottom waters (> 10^9 cells l-1). Following convective mixing, the entire water column returned to a community typical of oxygenated waters, with Euryarchaeota only averaging 5% of cells, and Chlorobiales and Thaumarchaeota absent.

  10. Combined Carbohydrates Support Rich Communities of Particle-Associated Marine Bacterioplankton

    PubMed Central

    Sperling, Martin; Piontek, Judith; Engel, Anja; Wiltshire, Karen H.; Niggemann, Jutta; Gerdts, Gunnar; Wichels, Antje

    2017-01-01

    Carbohydrates represent an important fraction of labile and semi-labile marine organic matter that is mainly comprised of exopolymeric substances derived from phytoplankton exudation and decay. This study investigates the composition of total combined carbohydrates (tCCHO; >1 kDa) and the community development of free-living (0.2–3 μm) and particle-associated (PA) (3–10 μm) bacterioplankton during a spring phytoplankton bloom in the southern North Sea. Furthermore, rates were determined for the extracellular enzymatic hydrolysis that catalyzes the initial step in bacterial organic matter remineralization. Concentrations of tCCHO greatly increased during bloom development, while the composition showed only minor changes over time. The combined concentration of glucose, galactose, fucose, rhamnose, galactosamine, glucosamine, and glucuronic acid in tCCHO was a significant factor shaping the community composition of the PA bacteria. The richness of PA bacteria greatly increased in the post-bloom phase. At the same time, the increase in extracellular β-glucosidase activity was sufficient to explain the observed decrease in tCCHO, indicating the efficient utilization of carbohydrates by the bacterioplankton community during the post-bloom phase. Our results suggest that carbohydrate concentration and composition are important factors in the multifactorial environmental control of bacterioplankton succession and the enzymatic hydrolysis of organic matter during phytoplankton blooms. PMID:28197132

  11. Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples

    PubMed Central

    Pettengill, James B.; Pightling, Arthur W.; Baugher, Joseph D.; Rand, Hugh; Strain, Errol

    2016-01-01

    The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionarily diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). When analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases. PMID:27832109
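
    The contrast between k-mer-based and site-based distances can be illustrated with a toy example. The sketch below computes an exact Jaccard distance over k-mer sets and the Mash distance formula D = -(1/k) ln(2j / (1 + j)), where j is the Jaccard index; the real Mash tool estimates j from MinHash sketches rather than exact sets. The `site_distance` function is a hypothetical stand-in for alignment-based methods such as NUCmer and simply compares positions in a shared prefix. All sequences and function names are illustrative assumptions.

```python
import math


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_index(a, b, k):
    """Shared k-mers divided by all distinct k-mers of the two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        raise ValueError("both sequences are shorter than k")
    return len(ka & kb) / len(union)


def jaccard_distance(a, b, k):
    """One minus the Jaccard index of the k-mer sets."""
    return 1.0 - jaccard_index(a, b, k)


def mash_distance(a, b, k):
    """Mash distance formula applied to an exactly computed Jaccard index."""
    j = jaccard_index(a, b, k)
    if j == 0.0:
        return 1.0  # no shared k-mers: distance capped at 1
    if j == 1.0:
        return 0.0
    return min(1.0, -math.log(2.0 * j / (1.0 + j)) / k)


def site_distance(a, b):
    """Fraction of differing sites over the positions present in both.

    Assumes the sequences are already aligned from position 0; this is a
    simplification, not how NUCmer or an MLST scheme actually works.
    """
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("empty sequence")
    return sum(1 for x, y in zip(a, b) if x != y) / n


# Hypothetical isolate and an incomplete assembly of the same isolate.
full = "ACGTTGCAAGCTTAGC"
partial = full[:10]

print("k-mer Jaccard distance:", jaccard_distance(full, partial, 4))
print("Mash-style distance:   ", mash_distance(full, partial, 4))
print("site-based distance:   ", site_distance(full, partial))
```

    The partial assembly has no nucleotide differences from the full sequence, so the site-based distance is zero, yet both k-mer distances are positive because the missing k-mers are counted as differences. This is the sense in which k-mer profiles treat absent data as informative.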

  12. Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples.

    PubMed

    Pettengill, James B; Pightling, Arthur W; Baugher, Joseph D; Rand, Hugh; Strain, Errol

    2016-01-01

    The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionary diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). When analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.

  13. Responses of spatial-temporal dynamics of bacterioplankton community to large-scale reservoir operation: a case study in the Three Gorges Reservoir, China

    PubMed Central

    Li, Zhe; Lu, Lunhui; Guo, Jinsong; Yang, Jixiang; Zhang, Jiachao; He, Bin; Xu, Linlin

    2017-01-01

    Large rivers are commonly regulated by damming, yet the effects of such disruption on bacterioplankton community structures have not been adequately studied. The aim of this study was to explore the biogeographical patterns present under dam regulation and to uncover the major drivers structuring bacterioplankton communities. Bacterioplankton assemblages in the Three Gorges Reservoir (TGR) were analyzed using Illumina MiSeq sequencing by comparing seven sites located within the TGR before and after impoundment. This approach revealed ecological and spatial-temporal variations in bacterioplankton community composition along the longitudinal axis. The community was dynamic and dominated by Proteobacteria and Actinobacteria phyla, encompassing 39.26% and 37.14% of all sequences, respectively, followed by Bacteroidetes (8.67%) and Cyanobacteria (3.90%). The Shannon-Wiener index of the bacterioplankton community in the flood season (August) was generally higher than that in the impoundment season (November). Principal Component Analysis of the bacterioplankton community compositions showed separation between different seasons and sampling sites. Results of the relationship between bacterioplankton community compositions and environmental variables highlighted that ecological processes of element cycling and large dam disturbances are of prime importance in driving the assemblages of riverine bacterioplankton communities. PMID:28211884
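
    The Shannon-Wiener index mentioned above is conventionally computed as H' = -sum(p_i * ln p_i) over the relative abundances p_i of the taxa in a sample. The following is a minimal sketch of that formula; the counts are hypothetical and are not data from the study, and the natural logarithm is an assumption, since the abstract does not state the base used.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln(p_i)) over non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Taxa with zero counts contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical read counts per taxon for two samples.
even_sample = [25, 25, 25, 25]    # four equally abundant taxa
skewed_sample = [97, 1, 1, 1]     # one dominant taxon
h_even = shannon_index(even_sample)
h_skewed = shannon_index(skewed_sample)
```

    A perfectly even sample of four taxa gives the maximum value ln(4), while the skewed sample gives a lower value, so a higher index reflects both more taxa and a more even distribution among them.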

  14. Bacterioplankton and phytoplankton biomass and production during summer stratification in the northwestern Mediterranean Sea

    NASA Astrophysics Data System (ADS)

    Pedrós-Alió, Carlos; Calderón-Paz, Juan-Isidro; Guixa-Boixereu, Núria; Estrada, Marta; Gasol, Josep M.

    1999-06-01

    We examined bacterioplankton biomass and heterotrophic production (BHP) during summer stratification in the northwestern Mediterranean in four successive stratification seasons (June-July of 1993-1996). Values of phytoplankton biomass and primary production were determined simultaneously so that the data sets for autotrophic and heterotrophic microbial plankton could be compared. Three standard stations were set along a transect from Barcelona to the channel between Mallorca and Menorca, representing coastally influenced shelf waters, frontal waters over the slope front, and open sea waters. Conversion factors from ³H-leucine incorporation to BHP were empirically determined and varied between 0.29 and 3.25 kg C mol⁻¹. Bacterial biomass values were among the lowest found in any marine environment. BHP values (between 0.02 and 2.5 μg C L⁻¹ d⁻¹) were larger than those of low nutrient low chlorophyll areas such as the Sargasso Sea and lower than those from high nutrient low chlorophyll areas such as the equatorial Pacific. Growth rates of bacterioplankton were highest at the slope front (0.20 d⁻¹) and lowest at the open sea station (0.04 d⁻¹). Phytoplankton growth rates were similar at the three stations (~0.50 d⁻¹). Integrated values of bacterioplankton biomass, BHP and bacterial growth rates did not show significant differences among years, but differences between the three stations were clearly significant. Phytoplankton biomass, primary production, and phytoplankton growth rates did not show significant differences either with year or with station. As a consequence the bacterioplankton to phytoplankton biomass (BB/BPHY) and production (BHP/PP) ratios varied from the coastal to the open sea stations. The BB/BPHY ratio was 0.98 at the coast and ~0.70 at the other two stations. These ratios are similar to those found in other oligotrophic marine environments. The BHP/PP ratio was 0.83 at the coast, 0.36 at the slope and 0.09 at the open sea station. The last

  15. Inferring Horizontal Gene Transfer

    PubMed Central

    Lassalle, Florent; Dessimoz, Christophe

    2015-01-01

    Horizontal or Lateral Gene Transfer (HGT or LGT) is the transmission of portions of genomic DNA between organisms through a process decoupled from vertical inheritance. In the presence of HGT events, different fragments of the genome are the result of different evolutionary histories. This can therefore complicate the investigations of evolutionary relatedness of lineages and species. Also, as HGT can bring into genomes radically different genotypes from distant lineages, or even new genes bearing new functions, it is a major source of phenotypic innovation and a mechanism of niche adaptation. For example, of particular relevance to human health is the lateral transfer of antibiotic resistance and pathogenicity determinants, leading to the emergence of pathogenic lineages [1]. Computational identification of HGT events relies upon the investigation of sequence composition or evolutionary history of genes. Sequence composition-based ("parametric") methods search for deviations from the genomic average, whereas evolutionary history-based ("phylogenetic") approaches identify genes whose evolutionary history significantly differs from that of the host species. The evaluation and benchmarking of HGT inference methods typically rely upon simulated genomes, for which the true history is known. On real data, different methods tend to infer different HGT events, and as a result it can be difficult to ascertain all but simple and clear-cut HGT events. PMID:26020646
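
    The composition-based ("parametric") idea described above can be sketched as a screen for genes whose G+C content deviates strongly from the rest of the genome. This is only an illustration under simplifying assumptions: it uses the mean G+C content across genes as a stand-in for the genomic average, a z-score cutoff chosen arbitrarily, and hypothetical gene names and sequences. Published parametric methods use richer signatures such as codon usage or oligonucleotide frequencies.

```python
from statistics import mean, pstdev


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical_genes(genes, z_cutoff=2.0):
    """Return names of genes whose GC content lies more than z_cutoff
    standard deviations from the mean GC content of all genes."""
    gc = {name: gc_content(seq) for name, seq in genes.items()}
    values = list(gc.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every gene has identical composition
    return [name for name, v in gc.items() if abs(v - mu) / sigma > z_cutoff]


# Hypothetical genome: nine genes at 50% GC and one GC-rich candidate.
genes = {"gene%d" % i: "ATGC" * 10 for i in range(9)}
genes["candidate"] = "GGCC" * 10
flagged = flag_atypical_genes(genes)
```

    A gene flagged this way is only a candidate; as the abstract notes, different methods tend to infer different HGT events on real data, so a compositional outlier is not by itself evidence of transfer.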

  16. Evolutionary origin of a streamlined marine bacterioplankton lineage.

    PubMed

    Luo, Haiwei

    2015-06-01

    Planktonic bacterial lineages with streamlined genomes are prevalent in the ocean. The base composition of their DNA is often highly biased towards low G+C content, a possible source of systematic error in phylogenetic reconstruction. A total of 228 orthologous protein families were sampled that are shared among major lineages of Alphaproteobacteria, including the marine free-living SAR11 clade and the obligate endosymbiotic Rickettsiales. These two ecologically distinct lineages share genome sizes of <1.5 Mbp and genomic G+C content of <30%. Statistical analyses showed that only 28 protein families are composition-homogeneous, whereas the other 200 families significantly violate the composition-homogeneous assumption included in most phylogenetic methods. RAxML analysis based on the concatenation of 24 ribosomal proteins that fall into the heterogeneous protein category clustered the SAR11 and Rickettsiales lineages at the base of the Alphaproteobacteria tree, whereas that based on the concatenation of 28 homogeneous proteins (including 19 ribosomal proteins) disassociated the lineages and placed SAR11 at the base of the non-endosymbiotic lineages. When the two data sets were concatenated, only a model that accounted for compositional bias yielded a tree identical to the tree built with composition-homogeneous proteins. Ancestral genome analysis suggests that the first evolved SAR11 cell had a small genome streamlined from its ancestor by a factor of two and coinciding with an ecological transition, followed by further gradual streamlining towards the extant SAR11 populations.

  17. Decrease of NH4+-N by bacterioplankton accelerated the removal of cyanobacterial blooms in aerated aquatic ecosystem.

    PubMed

    Yang, Xi; Xie, Ping; Ma, Zhimei; Wang, Qing; Fan, Huihui; Shen, Hong

    2013-11-01

    We used aerated systems to assess the influence of the bacterioplankton community on cyanobacterial blooms in algae/post-bloom of Lake Taihu, China. Bacterioplankton community diversity was evaluated by polymerase chain reaction-denaturing gradient gel electrophoresis (PCR-DGGE) fingerprinting. Chemical analysis and nitrogen dynamic changes illustrated that NH4+-N was nitrified to NO2⁻-N and NO3⁻-N by bacterioplankton. Finally, NH4+-N was exhausted and NO3⁻-N was denitrified to NO2⁻-N, while the accumulation of NO2⁻-N indicated that bacterioplankton with completely aerobic denitrification ability were lacking in the water samples collected from Lake Taihu. We suggested that adding completely aerobic denitrification bacteria (to denitrify NO2⁻-N to N2) would improve the water quality. PCR-DGGE and sequencing results showed that more than 1/3 of the bacterial species were associated with the removal of nitrogen, and Acidovorax temperans was the dominant one. PCR-DGGE, variation of nitrogen, removal efficiencies of chlorophyll-a and canonical correspondence analysis indicated that the bacterioplankton significantly influenced the physiological and biochemical changes of cyanobacteria. Additionally, the unweighted pair-group method with arithmetic means revealed there was no obvious harm to the microecosystem from aeration. The present study demonstrated that bacterioplankton can play crucial roles in aerated ecosystems, which could control the impact of cyanobacterial blooms in eutrophicated fresh water systems.

  18. A Global eDNA Comparison of Freshwater Bacterioplankton Assemblages Focusing on Large-River Floodplain Lakes of Brazil.

    PubMed

    Tessler, Michael; Brugler, Mercer R; DeSalle, Rob; Hersch, Rebecca; Velho, Luiz Felipe M; Segovia, Bianca T; Lansac-Toha, Fabio A; Lemke, Michael J

    2017-01-01

    With their networks of lotic and lentic habitats that shift during changes in seasonal connection, tropical and subtropical large-river systems represent possibly the most dynamic of all aquatic environments. Pelagic water samples were collected from Brazilian floodplain lakes (total n = 58) in four flood-pulsed systems (Amazon [n = 21], Araguaia [n = 14], Paraná [n = 15], and Pantanal [n = 8]) in 2011-2012 and sequenced via 454 for bacterial environmental DNA using 16S amplicons; additional abiotic field and laboratory measurements were collected for the assayed lakes. We report here a global comparison of the bacterioplankton makeup of freshwater systems, focusing on a comparison of Brazilian lakes with similar freshwater systems across the globe. The results indicate a surprising similarity at higher taxonomic levels of the bacterioplankton in Brazilian freshwater with global sites. However, substantial novel diversity at the family level was also observed for the Brazilian freshwater systems. Brazilian freshwater bacterioplankton richness was about average relative to global sites. Ordination results indicate that Brazilian bacterioplankton composition is distinct from that of other areas of the globe. Using Brazil-only ordinations, floodplain system differentiation most strongly correlated with dissolved oxygen, pH, and phosphate. Our data on Brazilian freshwater systems in combination with analysis of a collection of freshwater environmental samples from across the globe offers the first regional picture of bacterioplankton diversity in these important freshwater systems.

  19. Linking the composition of bacterioplankton to rapid turnover of dissolved dimethylsulphoniopropionate in an algal bloom in the North Sea.

    PubMed

    Zubkov, M V; Fuchs, B M; Archer, S D; Kiene, R P; Amann, R; Burkill, P H

    2001-05-01

    The algal osmolyte, dimethylsulphoniopropionate (DMSP), is abundant in the surface oceans and is the major precursor of dimethyl sulphide (DMS), a gas involved in global climate regulation. Here, we report results from an in situ Lagrangian study that suggests a link between the microbially driven fluxes of dissolved DMSP (DMSPd) and specific members of the bacterioplankton community in a North Sea coccolithophore bloom. The bacterial population in the bloom was dominated by a single species related to the genus Roseobacter, which accounted for 24% of the bacterioplankton numbers and up to 50% of the biomass. The abundance of the Roseobacter cells showed significant paired correlation with DMSPd consumption and bacterioplankton production, whereas abundances of other bacteria did not. Consumed DMSPd (28 nM day⁻¹) contributed 95% of the sulphur and up to 15% of the carbon demand of the total bacterial populations, suggesting the importance of DMSP as a substrate for the Roseobacter-dominated bacterioplankton. In dominating DMSPd flux, the Roseobacter species may exert a major control on DMS production. DMSPd turnover rate was 10 times that of DMS (2.7 nM day⁻¹), indicating that DMSPd was probably the major source of DMS, but that most of the DMSPd was metabolized without DMS production. Our study suggests that single species of bacterioplankton may at times be important in metabolizing DMSP and regulating the generation of DMS in the sea.

  20. Spatial and seasonal distributions of bacterioplankton in the Pearl River Estuary: The combined effects of riverine inputs, temperature, and phytoplankton.

    PubMed

    Li, Jiajun; Jiang, Xin; Jing, Zhiyou; Li, Gang; Chen, Zuozhi; Zhou, Linbin; Zhao, Chunyu; Liu, Jiaxing; Tan, Yehui

    2017-08-16

    In this study, we used flow cytometry and 16S rRNA gene pyrosequencing to investigate bacterioplankton (heterotrophic bacteria and picocyanobacteria) abundance and community structure in surface waters along the Pearl River Estuary. The results showed significant differences in bacterioplankton dynamics between fresh- and saltwater sites and between wet and dry seasons. Synechococcus constituted the majority of picocyanobacteria in both seasons. During the wet season, Synechococcus reached extremely high abundance at the mouth of the estuary, and heterotrophic bacteria were highly abundant (>10⁶ cells ml⁻¹) throughout the studied region. At the same time, bacterioplankton decreased dramatically during the dry season. Pyrosequencing data indicated that salinity was a key parameter in shaping microbial community structure during both seasons. Phytoplankton was also an important factor; the proportion of Synechococcus and Rhodobacterales was elevated at the frontal zone with higher chlorophyll a during the wet season, whereas Synechococcus were markedly reduced during the dry season. Copyright © 2017. Published by Elsevier Ltd.

  1. Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples

    SciTech Connect

    Pettengill, James B.; Pightling, Arthur W.; Baugher, Joseph D.; Rand, Hugh; Strain, Errol

    2016-11-10

    The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionary diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). Finally, when analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.

  2. Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples

    DOE PAGES

    Pettengill, James B.; Pightling, Arthur W.; Baugher, Joseph D.; ...

    2016-11-10

    The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionary diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). Finally, when analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases.

  3. Perceptual inference.

    PubMed

    Aggelopoulos, Nikolaos C

    2015-08-01

    Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience.

  4. A BAC library of the SP80-3280 sugarcane variety (saccharum sp.) and its inferred microsynteny with the sorghum genome

    PubMed Central

    2012-01-01

    Background: Sugarcane breeding has significantly progressed in the last 30 years, but achieving additional yield gains has been difficult because of the constraints imposed by the complex ploidy of this crop. Sugarcane cultivars are interspecific hybrids between Saccharum officinarum and Saccharum spontaneum. S. officinarum is an octoploid with 2n = 80 chromosomes while S. spontaneum has 2n = 40 to 128 chromosomes and ploidy varying from 5 to 16. The hybrid genome is composed of 70-80% S. officinarum and 5-20% S. spontaneum chromosomes and a small proportion of recombinants. Sequencing the genome of this complex crop may help identify useful genes, either per se or through comparative genomics using closely related grasses. The construction and sequencing of a bacterial artificial chromosome (BAC) library of an elite commercial variety of sugarcane could help assemble the sugarcane genome. Results: A BAC library designated SS_SBa was constructed with DNA isolated from the commercial sugarcane variety SP80-3280. The library contains 36,864 clones with an average insert size of 125 Kb, 88% of which have inserts larger than 90 Kb. Based on the estimated genome size of 760–930 Mb, the library provides 5–6 times coverage of the monoploid sugarcane genome. Bidirectional BAC end sequencing (BESs) from a random sample of 192 BAC clones sampled genes and repetitive elements of the sugarcane genome. Forty-five per cent of the total BES nucleotides represent repetitive elements, 83% of which belong to LTR retrotransposons. Alignment of BESs corresponding to 42 BACs to the genome sequence of the 10 sorghum chromosomes revealed regions of microsynteny, with expansions and contractions of sorghum genome regions relative to the sugarcane BAC clones. In general, the sampled sorghum genome regions presented an average 29% expansion in relation to the sugarcane syntenic BACs. Conclusion: The SS_SBa BAC library represents a new resource for sugarcane genome sequencing
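
    The stated 5–6 times coverage can be checked from the figures in the abstract, assuming coverage is simply the number of clones times the average insert size, divided by the genome size. The function name below is hypothetical.

```python
def library_coverage(n_clones, mean_insert_kb, genome_size_mb):
    """Genome equivalents held in a clone library:
    (clones * mean insert size) / genome size, with sizes converted to Mb."""
    total_insert_mb = n_clones * mean_insert_kb / 1000.0
    return total_insert_mb / genome_size_mb


# Figures taken from the abstract: 36,864 clones, 125 Kb average insert,
# estimated monoploid genome size of 760 to 930 Mb.
coverage_small_genome = library_coverage(36864, 125, 760)
coverage_large_genome = library_coverage(36864, 125, 930)
```

    The library holds 4,608 Mb of insert DNA in total, which gives roughly 6.1 genome equivalents for a 760 Mb genome and roughly 5.0 for a 930 Mb genome, in line with the range reported.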

  5. A BAC library of the SP80-3280 sugarcane variety (saccharum sp.) and its inferred microsynteny with the sorghum genome.

    PubMed

    Figueira, Thais Rezende e Silva; Okura, Vagner; Rodrigues da Silva, Felipe; Jose da Silva, Marcio; Kudrna, Dave; Ammiraju, Jetty S S; Talag, Jayson; Wing, Rod; Arruda, Paulo

    2012-04-23

    Sugarcane breeding has significantly progressed in the last 30 years, but achieving additional yield gains has been difficult because of the constraints imposed by the complex ploidy of this crop. Sugarcane cultivars are interspecific hybrids between Saccharum officinarum and Saccharum spontaneum. S. officinarum is an octoploid with 2n = 80 chromosomes while S. spontaneum has 2n = 40 to 128 chromosomes and ploidy varying from 5 to 16. The hybrid genome is composed of 70-80% S. officinarum and 5-20% S. spontaneum chromosomes and a small proportion of recombinants. Sequencing the genome of this complex crop may help identify useful genes, either per se or through comparative genomics using closely related grasses. The construction and sequencing of a bacterial artificial chromosome (BAC) library of an elite commercial variety of sugarcane could help assemble the sugarcane genome. A BAC library designated SS_SBa was constructed with DNA isolated from the commercial sugarcane variety SP80-3280. The library contains 36,864 clones with an average insert size of 125 Kb, 88% of which have inserts larger than 90 Kb. Based on the estimated genome size of 760-930 Mb, the library provides 5-6 times coverage of the monoploid sugarcane genome. Bidirectional BAC end sequencing (BESs) from a random sample of 192 BAC clones sampled genes and repetitive elements of the sugarcane genome. Forty-five per cent of the total BES nucleotides represent repetitive elements, 83% of which belong to LTR retrotransposons. Alignment of BESs corresponding to 42 BACs to the genome sequence of the 10 sorghum chromosomes revealed regions of microsynteny, with expansions and contractions of sorghum genome regions relative to the sugarcane BAC clones. In general, the sampled sorghum genome regions presented an average 29% expansion in relation to the sugarcane syntenic BACs. The SS_SBa BAC library represents a new resource for sugarcane genome sequencing. An analysis of insert size, genome

  6. Statistical Inference

    NASA Astrophysics Data System (ADS)

    Khan, Shahjahan

    Often scientific information on various data generating processes is presented in the form of numerical and categorical data. Except for some very rare occasions, generally such data represent a small part of the population, or selected outcomes of any data generating process. Although valuable and useful information is lurking in the array of scientific data, generally, it is unavailable to the users. Appropriate statistical methods are essential to reveal the hidden "jewels" in the mess of the raw data. Exploratory data analysis methods are used to uncover such valuable characteristics of the observed data. Statistical inference provides techniques to make valid conclusions about the unknown characteristics or parameters of the population from which scientifically drawn sample data are selected. Usually, statistical inference includes estimation of population parameters as well as performing tests of hypotheses on the parameters. However, prediction of future responses and determining the prediction distributions are also part of statistical inference. Both Classical (or Frequentist) and Bayesian approaches are used in statistical inference. The commonly used Classical approach is based on the sample data alone. In contrast, the increasingly popular Bayesian approach uses a prior distribution on the parameters along with the sample data to make inferences. The non-parametric and robust methods are also being used in situations where commonly used model assumptions are unsupported. In this chapter, we cover the philosophical and methodological aspects of both the Classical and Bayesian approaches. Moreover, some aspects of predictive inference are also included. In the absence of any evidence to support assumptions regarding the distribution of the underlying population, or if the variable is measured only in ordinal scale, non-parametric methods are used. Robust methods are employed to avoid any significant changes in the results due to deviations from the model
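
    The contrast drawn above between the Classical and Bayesian approaches can be illustrated with a minimal sketch for estimating a population proportion. The example is not taken from the chapter: it assumes a binomial sample and, for the Bayesian estimate, a Beta(a, b) prior, for which the posterior is Beta(a + successes, b + failures). The function names and data are hypothetical.

```python
def classical_estimate(successes, n):
    """Maximum-likelihood estimate of a proportion: uses the sample alone."""
    if n <= 0:
        raise ValueError("n must be positive")
    return successes / n


def bayesian_posterior_mean(successes, n, prior_a=1.0, prior_b=1.0):
    """Posterior mean of a proportion under a Beta(prior_a, prior_b) prior.
    The posterior is Beta(prior_a + successes, prior_b + n - successes)."""
    if n < 0 or successes < 0 or successes > n:
        raise ValueError("need 0 <= successes <= n")
    return (prior_a + successes) / (prior_a + prior_b + n)


# Hypothetical sample: 7 successes in 10 trials.
mle = classical_estimate(7, 10)
posterior_mean_flat = bayesian_posterior_mean(7, 10)             # Beta(1, 1) prior
posterior_mean_strong = bayesian_posterior_mean(7, 10, 50, 50)   # prior centred on 0.5
```

    With a flat prior the Bayesian estimate stays close to the sample proportion, while a strong prior centred on 0.5 pulls it toward 0.5, which shows how the prior distribution enters the inference alongside the sample data.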

  8. Diversity and genomics of Antarctic marine micro-organisms.

    PubMed

    Murray, Alison E; Grzymski, Joseph J

    2007-12-29

    Marine bacterioplankton are thought to play a vital role in Southern Ocean ecology and ecosystem function, as they do in other ocean systems. However, our understanding of phylogenetic diversity, genome-enabled capabilities and specific adaptations to this persistently cold environment is limited. Bacterioplankton community composition shifts significantly over the annual cycle as sea ice melts and phytoplankton bloom. Microbial diversity in sea ice is better known than that of the plankton, where culture collections do not appear to represent organisms detected with molecular surveys. Broad phylogenetic groupings of Antarctic bacterioplankton such as the marine group I Crenarchaeota, alpha-Proteobacteria (Roseobacter-related and SAR-11 clusters), gamma-Proteobacteria (both cultivated and uncultivated groups) and Bacteroidetes-affiliated organisms in Southern Ocean waters are in common with other ocean systems. Antarctic SSU rRNA gene phylotypes are typically affiliated with other polar sequences. Some species such as Polaribacter irgensii and currently uncultivated gamma-Proteobacteria (Ant4D3 and Ant10A4) may flourish in Antarctic waters, though further studies are needed to address diversity on a larger scale. Insights from initial genomics studies on both cultivated organisms and genomes accessed through shotgun cloning of environmental samples suggest that there are many unique features of these organisms that facilitate survival in high-latitude, persistently cold environments.

  9. Stimulated bacterioplankton growth and selection for certain bacterial taxa in the vicinity of the ctenophore Mnemiopsis leidyi.

    PubMed

    Dinasquet, Julie; Granhag, Lena; Riemann, Lasse

    2012-01-01

    Episodic blooms of voracious gelatinous zooplankton, such as the ctenophore Mnemiopsis leidyi, affect pools of inorganic nutrients and dissolved organic carbon by intensive grazing activities and mucus release. This will potentially influence bacterioplankton activity and community composition, at least at local scales; however, available studies on this are scarce. In the present study we examined effects of M. leidyi on bacterioplankton growth and composition in incubation experiments. Moreover, we examined community composition of bacteria associated with the surface and gut of M. leidyi. High release of ammonium and high bacterial growth were observed in the treatments with M. leidyi relative to controls. Deep 454 pyrosequencing of 16S rRNA genes showed specific bacterial communities in treatments with M. leidyi as well as specific communities associated with M. leidyi tissue and gut. In particular, members of Flavobacteriaceae were associated with M. leidyi. Our study shows that M. leidyi influences bacterioplankton activity and community composition in the vicinity of the jellyfish. In particular during temporary aggregations of jellyfish, these local zones of high bacterial growth may contribute significantly to the spatial heterogeneity of bacterioplankton activity and community composition in the sea.

  10. Snowmelt-driven changes in dissolved organic matter and bacterioplankton communities in the Heilongjiang watershed of China.

    PubMed

    Qiu, Linlin; Cui, Hongyang; Wu, Junqiu; Wang, Baijie; Zhao, Yue; Li, Jiming; Jia, Liming; Wei, Zimin

    2016-06-15

    Bacterioplankton plays a significant role in the circulation of materials and ecosystem function in the biosphere. Dissolved organic matter (DOM) from dead plant material and surface soil leaches into water bodies when snow melts. In our study, water samples from nine sampling sites along the Heilongjiang watershed were collected in February and June 2014 during which period snowmelt occurred. The goal of this study was to characterize changes in DOM and bacterioplankton community composition (BCC) associated with snowmelt, the effects of DOM, environmental and geographical factors on the distribution of BCC and interactions of aquatic bacterioplankton populations with different sources of DOM in the Heilongjiang watershed. BCC was measured by denaturing gradient gel electrophoresis (DGGE). DOM was measured by excitation-emission matrix (EEM) fluorescence spectroscopy. Bacterioplankton exhibited a distinct seasonal change in community composition due to snowmelt at all sampling points except for EG. Redundancy analysis (RDA) indicated that BCC was more closely related to DOM (Components 1 and 4, dissolved organic carbon, biochemical oxygen demand and chlorophyll a) and environmental factors (water temperature and nitrate nitrogen) than geographical factors. Furthermore, DOM had a greater impact on BCC than environmental factors (29.80 vs. 15.90% of the variation). Overall, spring snowmelt played an important role in altering the quality and quantity of DOM and BCC in the Heilongjiang watershed. Copyright © 2016 Elsevier B.V. All rights reserved.

  11. Stimulated bacterioplankton growth and selection for certain bacterial taxa in the vicinity of the ctenophore Mnemiopsis leidyi

    PubMed Central

    Dinasquet, Julie; Granhag, Lena; Riemann, Lasse

    2012-01-01

    Episodic blooms of voracious gelatinous zooplankton, such as the ctenophore Mnemiopsis leidyi, affect pools of inorganic nutrients and dissolved organic carbon by intensive grazing activities and mucus release. This will potentially influence bacterioplankton activity and community composition, at least at local scales; however, available studies on this are scarce. In the present study we examined effects of M. leidyi on bacterioplankton growth and composition in incubation experiments. Moreover, we examined community composition of bacteria associated with the surface and gut of M. leidyi. High release of ammonium and high bacterial growth were observed in the treatments with M. leidyi relative to controls. Deep 454 pyrosequencing of 16S rRNA genes showed specific bacterial communities in treatments with M. leidyi as well as specific communities associated with M. leidyi tissue and gut. In particular, members of Flavobacteriaceae were associated with M. leidyi. Our study shows that M. leidyi influences bacterioplankton activity and community composition in the vicinity of the jellyfish. In particular during temporary aggregations of jellyfish, these local zones of high bacterial growth may contribute significantly to the spatial heterogeneity of bacterioplankton activity and community composition in the sea. PMID:22912629

  12. Diel fluctuations in the abundance and community diversity of coastal bacterioplankton assemblages over a tidal cycle.

    PubMed

    Olapade, Ola A

    2012-01-01

    The diel changes in abundance and community diversity of the bacterioplankton assemblages within the Pacific Ocean at a fixed location in Monterey Bay, California (USA) were examined with several culture-independent approaches (i.e., nucleic acid staining, fluorescence in situ hybridization [FISH], and 16S ribosomal RNA gene libraries) over a tidal cycle. FISH analyses revealed the quantitative predominance of bacterial members belonging to the Cytophaga-Flavobacterium cluster as well as two Proteobacteria subclasses (α- and γ-) within the bacterioplankton assemblages, more so during high tide (HT) and outgoing tide (OT) than during the other tidal events. While the clone libraries showed that the majority of the sequences were similar to the 16S rRNA gene sequences of unknown bacteria (32% to 73%), operational taxonomic units from members of the α-Proteobacteria, Bacteroidetes, Firmicutes, and Cyanobacteria were also well represented during the four tidal events examined. Comparatively, sequence diversity was highest in OT, lowest in low tide, and very similar between HT and incoming tide. The results indicate that the dynamics of bacterial occurrence and diversity were more pronounced during HT and OT, further indicating the ecological importance of several environmental variables, including temperature, light intensity, and nutrient availability, that fluctuate concurrently during these tidal events in marine systems.

  13. Phytoplankton, bacterioplankton and virioplankton structure and function across the southern Great Barrier Reef shelf

    NASA Astrophysics Data System (ADS)

    Alongi, Daniel M.; Patten, Nicole L.; McKinnon, David; Köstner, Nicole; Bourne, David G.; Brinkman, Richard

    2015-02-01

    Bacterioplankton and phytoplankton dynamics, pelagic respiration, virioplankton abundance, and the diversity of pelagic diazotrophs and other bacteria were examined in relation to water-column nutrients and vertical mixing across the southern Great Barrier Reef (GBR) shelf where sharp inshore to offshore gradients in water chemistry and hydrology prevail. A principal component analysis (PCA) revealed station groups clustered geographically, suggesting across-shelf differences in plankton function and structure driven by changes in mixing intensity, sediment resuspension, and the relative contributions of terrestrial, reef and oceanic nutrients. At most stations and sampling periods, microbial abundance and activities peaked both inshore and at channels between outer shelf reefs of the Pompey Reef complex. PCA also revealed that virioplankton numbers and biomass correlated with bacterioplankton numbers and production, and that bacterial growth and respiration correlated with net primary production, suggesting close virus-bacteria-phytoplankton interactions; all plankton groups correlated with particulate C, N, and P. Strong vertical mixing facilitates tight coupling of pelagic and benthic shelf processes as, on average, 37% and 56% of N and P demands of phytoplankton are derived from benthic nutrient regeneration and resuspension. These across-shelf planktonic trends mirror those of the benthic microbial community.

  14. Response of marine bacterioplankton pH homeostasis gene expression to elevated CO2

    NASA Astrophysics Data System (ADS)

    Bunse, Carina; Lundin, Daniel; Karlsson, Christofer M. G.; Akram, Neelam; Vila-Costa, Maria; Palovaara, Joakim; Svensson, Lovisa; Holmfeldt, Karin; González, José M.; Calvo, Eva; Pelejero, Carles; Marrasé, Cèlia; Dopson, Mark; Gasol, Josep M.; Pinhassi, Jarone

    2016-05-01

    Human-induced ocean acidification impacts marine life. Marine bacteria are major drivers of biogeochemical nutrient cycles and energy fluxes; hence, understanding their performance under projected climate change scenarios is crucial for assessing ecosystem functioning. Whereas genetic and physiological responses of phytoplankton to ocean acidification are being disentangled, corresponding functional responses of bacterioplankton to pH reduction from elevated CO2 are essentially unknown. Here we show, from metatranscriptome analyses of a phytoplankton bloom mesocosm experiment, that marine bacteria responded to lowered pH by enhancing the expression of genes encoding proton pumps, such as respiration complexes, proteorhodopsin and membrane transporters. Moreover, taxonomic transcript analysis showed that distinct bacterial groups expressed different pH homeostasis genes in response to elevated CO2. These responses were substantial for numerous pH homeostasis genes under low-chlorophyll conditions (chlorophyll a <2.5 μg l−1); however, the changes in gene expression under high-chlorophyll conditions (chlorophyll a >20 μg l−1) were low. Given that proton expulsion through pH homeostasis mechanisms is energetically costly, these findings suggest that bacterioplankton adaptation to ocean acidification could have long-term effects on the economy of ocean ecosystems.

  15. Phylotype Dynamics of Bacterial P Utilization Genes in Microbialites and Bacterioplankton of a Monomictic Endorheic Lake.

    PubMed

    Valdespino-Castillo, Patricia M; Alcántara-Hernández, Rocío J; Merino-Ibarra, Martín; Alcocer, Javier; Macek, Miroslav; Moreno-Guillén, Octavio A; Falcón, Luisa I

    2017-02-01

    Microbes can modulate ecosystem function since they harbor a vast genetic potential for biogeochemical cycling. The spatial and temporal dynamics of this genetic diversity should be acknowledged to establish a link between ecosystem function and community structure. In this study, we analyzed the genetic diversity of bacterial phosphorus utilization genes in two microbial assemblages, microbialites and bacterioplankton of Lake Alchichica, a semiclosed (i.e., endorheic) system with marked seasonality that varies in nutrient conditions, temperature, dissolved oxygen, and water column stability. We focused on dissolved organic phosphorus (DOP) utilization gene dynamics during contrasting mixing and stratification periods. Bacterial alkaline phosphatases (phoX and phoD) and alkaline beta-propeller phytases (bpp) were surveyed. DOP utilization genes showed different dynamics evidenced by a marked change within an intra-annual period and a differential circadian pattern of expression. Although Lake Alchichica is a semiclosed system, this dynamic turnover of phylotypes (from lake circulation to stratification) points to a different potential of DOP utilization by the microbial communities within periods. DOP utilization gene dynamics was different among genetic markers and among assemblages (microbialite vs. bacterioplankton). As estimated by the system's P mass balance, P inputs and outputs were similar in magnitude (difference was <10 %). A theoretical estimation of water column P monoesters was used to calculate the potential P fraction that can be remineralized on an annual basis. Overall, bacterial groups including Proteobacteria (Alpha and Gamma) and Bacteroidetes seem to be key participants in DOP utilization responses.

  16. Flavobacteria Blooms in Four Eutrophic Lakes: Linking Population Dynamics of Freshwater Bacterioplankton to Resource Availability

    PubMed Central

    Eiler, Alexander; Bertilsson, Stefan

    2007-01-01

    Heterotrophic bacteria are major contributors to biogeochemical cycles and influence water quality. Still, the lack of representative isolates and the few quantitative surveys leave the ecological role and significance of single bacterial populations to be revealed. Here we analyzed the diversity and dynamics of freshwater Flavobacteria populations in four eutrophic temperate lakes. From each lake, clone libraries were constructed using primers specific for either the class Flavobacteria or Bacteria. Sequencing of 194 Flavobacteria clones from 8 libraries revealed a diverse freshwater Flavobacteria community and distinct differences among lakes. Abundance and seasonal dynamics of Flavobacteria were assessed by quantitative PCR with class-specific primers. In parallel, the dynamics of individual populations within the Flavobacteria community were assessed with terminal restriction fragment length polymorphism analysis using identical primers. The contribution of Flavobacteria to the total bacterioplankton community ranged from 0.4 to almost 100% (average, 24%). Blooms where Flavobacteria represented more than 30% of the bacterioplankton were observed at different times in the four lakes. In general, high proportions of Flavobacteria appeared during episodes of high bacterial production. Phylogenetic analyses combined with Flavobacteria community fingerprints suggested dominance of two Flavobacteria lineages. Both drastic alterations in total Flavobacteria and in community composition of this class significantly correlated with bacterial production, emphasizing that resource availability is an important driver of heterotrophic bacterial succession in eutrophic lakes. PMID:17435002

  17. Impact of warming on phyto-bacterioplankton coupling and bacterial community composition in experimental mesocosms.

    PubMed

    von Scheibner, Markus; Dörge, Petra; Biermann, Antje; Sommer, Ulrich; Hoppe, Hans-Georg; Jürgens, Klaus

    2014-03-01

    Global warming is assumed to alter the trophic interactions and carbon flow patterns of aquatic food webs. The impact of temperature on phyto-bacterioplankton coupling and bacterial community composition (BCC) was the focus of the present study, in which an indoor mesocosm experiment with natural plankton communities from the western Baltic Sea was conducted. A 6 °C increase in water temperature resulted, as predicted, in tighter coupling between the diatom-dominated phytoplankton and heterotrophic bacteria, accompanied by a strong increase in carbon flow into bacterioplankton during the phytoplankton bloom phase. Suppressed bacterial development at cold in situ temperatures probably reflected lowered bacterial production and grazing by protists, as the latter were less affected by low temperatures. BCC was strongly influenced by the phytoplankton bloom stage and to a lesser extent by temperature. Under both temperature regimes, Gammaproteobacteria clearly dominated during the phytoplankton peak, with Glaciecola sp. as the single most abundant taxon. However, warming induced the appearance of additional bacterial taxa belonging to Betaproteobacteria and Bacteroidetes. Our results show that warming during an early phytoplankton bloom causes a shift towards a more heterotrophic system, with the appearance of new bacterial taxa suggesting a potential for utilization of a broader substrate spectrum. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.

  18. Metagenomic identification of bacterioplankton taxa and pathways involved in microcystin degradation in Lake Erie.

    PubMed

    Mou, Xiaozhen; Lu, Xinxin; Jacob, Jisha; Sun, Shulei; Heath, Robert

    2013-01-01

    Cyanobacterial harmful blooms (CyanoHABs) that produce microcystins are appearing in an increasing number of freshwater ecosystems worldwide, degrading water quality for humans and aquatic life. Heterotrophic bacterial assemblages are thought to be important in transforming and detoxifying microcystins in natural environments. However, little is known about their taxonomic composition or the pathways involved in the process. To address this knowledge gap, we compared the metagenomes of Lake Erie free-living bacterioplankton assemblages in laboratory microcosms amended with microcystins relative to unamended controls. A diverse array of bacterial phyla were responsive to elevated supply of microcystins, including Acidobacteria, Actinobacteria, Bacteroidetes, Planctomycetes, Proteobacteria of the alpha, beta, gamma, delta and epsilon subdivisions and Verrucomicrobia. At more detailed taxonomic levels, Methylophilales (mainly in genus Methylotenera) and Burkholderiales (mainly in genera Bordetella, Burkholderia, Cupriavidus, Polaromonas, Ralstonia, Polynucleobacter and Variovorax) of Betaproteobacteria were suggested to be more important in microcystin degradation than Sphingomonadales of Alphaproteobacteria. The latter taxa were previously thought to be major microcystin degraders. Homologs to known microcystin-degrading genes (mlr) were not overrepresented in microcystin-amended metagenomes, indicating that Lake Erie bacterioplankton might employ alternative genes and/or pathways in microcystin degradation. Genes for xenobiotic metabolism were overrepresented in microcystin-amended microcosms, suggesting they are important in bacterial degradation of microcystin, a phenomenon that has been identified previously only in eukaryotic systems.

  19. Viruses and flagellates sustain apparent richness and reduce biomass accumulation of bacterioplankton in coastal marine waters.

    PubMed

    Zhang, Rui; Weinbauer, Markus G; Qian, Pei-Yuan

    2007-12-01

    To gain a better understanding of the interactions among bacteria, viruses and flagellates in coastal marine ecosystems, we investigated the effect of viral lysis and protistan bacterivory on bacterial abundance, production and diversity [determined by 16S rRNA gene polymerase chain reaction (PCR) and denaturing gradient gel electrophoresis (DGGE)] in three coastal marine sites with different nutrient supplies in Hong Kong. Six experiments were set up using filtration and dilution methods to develop virus, flagellate and virus+flagellate treatments for natural bacterial populations. All three predation treatments had significant repressing effects on bacterial abundance. Bacterial production was significantly repressed by flagellates and both predators (flagellates and viruses). Bacterial apparent species richness (indicated as the number of DGGE bands) was always significantly higher in the presence of viruses, flagellates and both predators than in the predator-free control. Cluster analysis of the DGGE patterns showed that the effects of viruses and flagellates on bacterial community structure were relatively stochastic while the co-effects of predators caused consistent trends (DGGE always showed the most similar patterns when compared with those of in situ environments) and substantially increased the apparent richness. Overall, we found strong evidence that viral lysis and protist bacterivory act additively to reduce bacterial production and to sustain diversity. This first systematic attempt to study the interactive effects of viruses and flagellates on the diversity and production of bacterial communities in coastal waters suggests that a tight control of bacterioplankton dominants results in relatively stable bacterioplankton communities.

  20. Recruitment of Members from the Rare Biosphere of Marine Bacterioplankton Communities after an Environmental Disturbance

    PubMed Central

    Sjöstedt, Johanna; Koch-Schmidt, Per; Pontarp, Mikael; Canbäck, Björn; Tunlid, Anders; Lundberg, Per; Hagström, Åke

    2012-01-01

    A bacterial community may be resistant to environmental disturbances if some of its species show metabolic flexibility and physiological tolerance to the changing conditions. Alternatively, disturbances can change the composition of the community and thereby potentially affect ecosystem processes. The impact of disturbance on the composition of bacterioplankton communities was examined in continuous seawater cultures. Bacterial assemblages from geographically closely connected areas, the Baltic Sea (salinity 7 and high dissolved organic carbon [DOC]) and Skagerrak (salinity 28 and low DOC), were exposed to gradual opposing changes in salinity and DOC over a 3-week period such that the Baltic community was exposed to Skagerrak salinity and DOC and vice versa. Denaturing gradient gel electrophoresis and clone libraries of PCR-amplified 16S rRNA genes showed that the composition of the transplanted communities differed significantly from those held at constant salinity. Despite this, the growth yields (number of cells ml−1) were similar, which suggests similar levels of substrate utilization. Deep 454 pyrosequencing of 16S rRNA genes showed that the composition of the disturbed communities had changed due to the recruitment of phylotypes present in the rare biosphere of the original community. The study shows that members of the rare biosphere can become abundant in a bacterioplankton community after disturbance and that those bacteria can have important roles in maintaining ecosystem processes. PMID:22194288

  1. Short-Term Dynamics of North Sea Bacterioplankton-Dissolved Organic Matter Coherence on Molecular Level

    PubMed Central

    Lucas, Judith; Koester, Irina; Wichels, Antje; Niggemann, Jutta; Dittmar, Thorsten; Callies, Ulrich; Wiltshire, Karen H.; Gerdts, Gunnar

    2016-01-01

    Remineralization and transformation of dissolved organic matter (DOM) by marine microbes shape the DOM composition and thus have a large impact on global carbon and nutrient cycling. However, information on bacterioplankton-DOM interactions on a molecular level is limited. We examined the variation of bacterial community composition (BCC) at Helgoland Roads (North Sea) in relation to variation of molecular DOM composition and various environmental parameters on short time scales. Surface water samples were taken daily over a period of 20 days. Bacterial community and molecular DOM composition were assessed via 16S rRNA gene tag sequencing and ultrahigh resolution Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), respectively. Environmental conditions were driven by a coastal water influx during the first half of the sampling period and the onset of a summer phytoplankton bloom toward the end of the sampling period. These phenomena led to a distinct grouping of bacterial communities and DOM composition, which was particularly influenced by total dissolved nitrogen (TDN) concentration, temperature, and salinity, as revealed by distance-based linear regression analyses. Bacterioplankton-DOM interaction was demonstrated in strong correlations between specific bacterial taxa and particular DOM molecules, suggesting potential specialization on particular substrates. We propose that a combination of high resolution techniques, as used in this study, may provide substantial information on substrate generalists and specialists and thus contribute to prediction of BCC variation. PMID:27014241

  2. Recruitment of members from the rare biosphere of marine bacterioplankton communities after an environmental disturbance.

    PubMed

    Sjöstedt, Johanna; Koch-Schmidt, Per; Pontarp, Mikael; Canbäck, Björn; Tunlid, Anders; Lundberg, Per; Hagström, Åke; Riemann, Lasse

    2012-03-01

    A bacterial community may be resistant to environmental disturbances if some of its species show metabolic flexibility and physiological tolerance to the changing conditions. Alternatively, disturbances can change the composition of the community and thereby potentially affect ecosystem processes. The impact of disturbance on the composition of bacterioplankton communities was examined in continuous seawater cultures. Bacterial assemblages from geographically closely connected areas, the Baltic Sea (salinity 7 and high dissolved organic carbon [DOC]) and Skagerrak (salinity 28 and low DOC), were exposed to gradual opposing changes in salinity and DOC over a 3-week period such that the Baltic community was exposed to Skagerrak salinity and DOC and vice versa. Denaturing gradient gel electrophoresis and clone libraries of PCR-amplified 16S rRNA genes showed that the composition of the transplanted communities differed significantly from those held at constant salinity. Despite this, the growth yields (number of cells ml−1) were similar, which suggests similar levels of substrate utilization. Deep 454 pyrosequencing of 16S rRNA genes showed that the composition of the disturbed communities had changed due to the recruitment of phylotypes present in the rare biosphere of the original community. The study shows that members of the rare biosphere can become abundant in a bacterioplankton community after disturbance and that those bacteria can have important roles in maintaining ecosystem processes.

  3. Response of rare, common and abundant bacterioplankton to anthropogenic perturbations in a Mediterranean coastal site.

    PubMed

    Baltar, Federico; Palovaara, Joakim; Vila-Costa, Maria; Salazar, Guillem; Calvo, Eva; Pelejero, Carles; Marrasé, Cèlia; Gasol, Josep M; Pinhassi, Jarone

    2015-06-01

    Bacterioplankton communities are made up of a small set of abundant taxa and a large number of low-abundance organisms (i.e. the 'rare biosphere'). Despite the critical role played by bacteria in marine ecosystems, it remains unknown how this large diversity of organisms is affected by human-induced perturbations, or what controls the responsiveness of rare compared to abundant bacteria. We studied the response of a Mediterranean bacterioplankton community to two anthropogenic perturbations (i.e. nutrient enrichment and/or acidification) in two mesocosm experiments (in winter and summer). Nutrient enrichment increased the relative abundance of some operational taxonomic units (OTUs; e.g. Polaribacter, Tenacibaculum, Rhodobacteraceae) and caused a relative decrease in others (e.g. Croceibacter). Interestingly, a synergistic effect of acidification and nutrient enrichment was observed on specific OTUs (e.g. SAR86). We analyzed the OTUs that became abundant at the end of the experiments and whether they belonged to the rare (<0.1% of relative abundance), the common (0.1-1.0% of relative abundance) or the abundant (>1% relative abundance) fractions. Most of the abundant OTUs at the end of the experiments were abundant, or at least common, in the original community of both experiments, suggesting that ecosystem alterations do not necessarily call for rare members to grow.
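
    The rare/common/abundant partition quoted in this abstract can be illustrated with a short sketch. The thresholds (<0.1%, 0.1-1.0%, >1%) are taken from the abstract; the function names, the OTU labels and the read counts are hypothetical and are not from the study, and treating exactly 1.0% as "common" is an assumption about the boundary.

```python
# Minimal sketch: bin OTUs into rare / common / abundant fractions by
# relative abundance. Thresholds follow the abstract; everything else
# (names, counts) is illustrative only.

def classify_otu(relative_abundance_pct: float) -> str:
    """Return the abundance fraction for one OTU, given its share in percent."""
    if relative_abundance_pct < 0.1:
        return "rare"
    if relative_abundance_pct <= 1.0:  # assumption: exactly 1.0% counts as common
        return "common"
    return "abundant"


def classify_community(counts: dict[str, int]) -> dict[str, str]:
    """Convert raw read counts to percentages and classify every OTU."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("community has no reads")
    return {otu: classify_otu(100.0 * n / total) for otu, n in counts.items()}


# Hypothetical read counts for four OTUs (10,000 reads in total).
example_counts = {"OTU_A": 9500, "OTU_B": 450, "OTU_C": 45, "OTU_D": 5}
example_fractions = classify_community(example_counts)
```

    Comparing such a classification at the start and end of an experiment would show whether end-point abundant OTUs started out rare, common or abundant, which is the comparison the abstract describes.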

  4. Spatial variability overwhelms seasonal patterns in bacterioplankton communities across a river to ocean gradient

    PubMed Central

    Fortunato, Caroline S; Herfort, Lydie; Zuber, Peter; Baptista, Antonio M; Crump, Byron C

    2012-01-01

    Few studies of microbial biogeography address variability across both multiple habitats and multiple seasons. Here we examine the spatial and temporal variability of bacterioplankton community composition of the Columbia River coastal margin using 16S amplicon pyrosequencing of 300 water samples collected in 2007 and 2008. Communities separated into seven groups (ANOSIM, P<0.001): river, estuary, plume, epipelagic, mesopelagic, shelf bottom (depth<350 m) and slope bottom (depth>850 m). The ordination of these samples was correlated with salinity (ρ=−0.83) and depth (ρ=−0.62). Temporal patterns were obscured by spatial variability among the coastal environments, and could only be detected within individual groups. Thus, structuring environmental factors (for example, salinity, depth) dominate over seasonal changes in determining community composition. Seasonal variability was detected across an annual cycle in the river, estuary and plume where communities separated into two groups, early year (April–July) and late year (August–Nov), demonstrating annual reassembly of communities over time. Determining both the spatial and temporal variability of bacterioplankton communities provides a framework for modeling these communities across environmental gradients from river to deep ocean. PMID:22011718

  5. Identification of Associations between Bacterioplankton and Photosynthetic Picoeukaryotes in Coastal Waters

    PubMed Central

    Farnelid, Hanna M.; Turk-Kubo, Kendra A.; Zehr, Jonathan P.

    2016-01-01

    Photosynthetic picoeukaryotes are significant contributors to marine primary productivity. Associations between marine bacterioplankton and picoeukaryotes frequently occur and can have large biogeochemical impacts. We used flow cytometry to sort cells from seawater to identify non-eukaryotic phylotypes that are associated with photosynthetic picoeukaryotes. Samples were collected at the Santa Cruz wharf on Monterey Bay, CA, USA during summer and fall, 2014. The phylogeny of associated microbes was assessed through 16S rRNA gene amplicon clone and Illumina MiSeq libraries. The most frequently detected bacterioplankton phyla within the photosynthetic picoeukaryote sorts were Proteobacteria (Alphaproteobacteria and Gammaproteobacteria) and Bacteroidetes. Intriguingly, the presence of free-living bacterial genera in the photosynthetic picoeukaryote sorts could suggest that some of the photosynthetic picoeukaryotes were mixotrophs. However, the occurrence of bacterial sequences that were not prevalent in the corresponding bulk seawater samples indicates that there was also a selection for specific OTUs in association with photosynthetic picoeukaryotes, suggesting specific functional associations. The results show that diverse bacterial phylotypes are found in association with photosynthetic picoeukaryotes. Taxonomic identification of these associations is a prerequisite for further characterizing them and elucidating their metabolic pathways and ecological functions. PMID:27148165

  6. Evidence of bacterioplankton community adaptation in response to long-term mariculture disturbance.

    PubMed

    Xiong, Jinbo; Chen, Heping; Hu, Changju; Ye, Xiansen; Kong, Dingjiang; Zhang, Demin

    2015-10-16

    Understanding the underlying mechanisms that shape the temporal dynamics of a microbial community has important implications for predicting the trajectory of an ecosystem's response to anthropogenic disturbances. Here, we evaluated the seasonal dynamics of bacterioplankton community composition (BCC) following more than three decades of mariculture disturbance in Xiangshan Bay. Clear seasonal succession and site (fish farm and control site) separation of the BCC were observed, which were primarily shaped by temperature, dissolved oxygen and sampling time. However, the sensitive bacterial families consistently changed in relative abundance in response to mariculture disturbance, regardless of the season. Temporal changes in the BCC followed the time-decay for similarity relationship at both sites. Notably, mariculture disturbance significantly (P < 0.001) flattened the temporal turnover but intensified bacterial species-to-species interactions. The decrease in bacterial temporal turnover under long-term mariculture disturbance was coupled with a consistent increase in the percentage of deterministic processes that constrained bacterial assembly based on a null model analysis. The results demonstrate that the BCC is sensitive to mariculture disturbance; however, a bacterioplankton community could adapt to a long-term disturbance via attenuating temporal turnover and intensifying species-species interactions. These findings expand our current understanding of microbial assembly in response to long-term anthropogenic disturbances.

  7. Influence of macrophyte decomposition on growth rate and community structure of Okefenokee Swamp bacterioplankton

    SciTech Connect

    Murray, R.E.; Hodson, R.E.

    1986-02-01

    Dissolved substances released during decomposition of the white water lily (Nymphaea odorata) can alter the growth rate of Okefenokee Swamp bacterioplankton. In microcosm experiments, dissolved compounds released from Nymphaea initially inhibited bacterioplankton, followed by a period of intense bacterial growth. Rates of [3H]thymidine incorporation and turnover of dissolved D-glucose were depressed by over 85% 3 h after the addition of Nymphaea leachates to microcosms containing Okefenokee Swamp water. Bacterial activity subsequently recovered; after 20 h, [3H]thymidine incorporation in leachate-treated microcosms was 10-fold greater than that in control microcosms. The recovery of activity was due to a shift in the composition of the bacterial population toward resistance to the inhibitory compounds present in Nymphaea leachates. Inhibitory compounds released during the decomposition of aquatic macrophytes thus act as selective agents which alter the community structure of the bacterial population with respect to leachate resistance. Soluble compounds derived from macrophyte decomposition influence the rate of bacterial secondary production and the availability of microbial biomass to microconsumers.

  8. Relation between presence-absence of a visible nucleoid and metabolic activity in bacterioplankton cells

    SciTech Connect

    Choi, Joon W.; Sherr, E.B.; Sherr, B.F.

    1996-09-01

    We investigated the report of Zweifel and Hagström that only a portion of marine bacteria contain nucleoids (the DNA-containing regions of procaryotic cells) and that such bacteria correspond to the active or viable fraction of bacterioplankton. In Oregon coastal waters, 21-64% of bacteria had visible nucleoids; numbers of nucleoid-visible (NV) bacteria were greater than numbers of metabolically active bacteria, based on cells with active electron transport systems (ETS) and intact cell membranes. During log growth of a marine isolate, proportions of NV and ETS-active cells approached 100%. In stationary growth phase, the fraction of ETS-active cells decreased rapidly, while that of NV cells remained high for 7 d. When starved cells of the isolate were resupplied with nutrient (50 mg liter⁻¹ peptone), total cell number did not increase during the initial 6 h, but the proportion of NV cells increased from 27 to 100%, and that of ETS-active cells from 6 to 75%. In an analogous experiment with a bacterioplankton assemblage, a similar trend was observed: the number of NV cells doubled during the initial 6 h prior to an increase in total cell counts. These results show that some bacteria without visible nucleoids are capable of becoming NV cells, and thus have DNA in a nucleoid region not detectable with the method used here. 18 refs., 4 figs., 1 tab.

  9. Evidence of bacterioplankton community adaptation in response to long-term mariculture disturbance

    PubMed Central

    Xiong, Jinbo; Chen, Heping; Hu, Changju; Ye, Xiansen; Kong, Dingjiang; Zhang, Demin

    2015-01-01

    Understanding the underlying mechanisms that shape the temporal dynamics of a microbial community has important implications for predicting the trajectory of an ecosystem’s response to anthropogenic disturbances. Here, we evaluated the seasonal dynamics of bacterioplankton community composition (BCC) following more than three decades of mariculture disturbance in Xiangshan Bay. Clear seasonal succession and site (fish farm and control site) separation of the BCC were observed, which were primarily shaped by temperature, dissolved oxygen and sampling time. However, the sensitive bacterial families consistently changed in relative abundance in response to mariculture disturbance, regardless of the season. Temporal changes in the BCC followed the time-decay for similarity relationship at both sites. Notably, mariculture disturbance significantly (P < 0.001) flattened the temporal turnover but intensified bacterial species-to-species interactions. The decrease in bacterial temporal turnover under long-term mariculture disturbance was coupled with a consistent increase in the percentage of deterministic processes that constrained bacterial assembly based on a null model analysis. The results demonstrate that the BCC is sensitive to mariculture disturbance; however, a bacterioplankton community could adapt to a long-term disturbance via attenuating temporal turnover and intensifying species-species interactions. These findings expand our current understanding of microbial assembly in response to long-term anthropogenic disturbances. PMID:26471739

  10. Occurrence of Plasmids in the Aromatic Degrading Bacterioplankton of the Baltic Sea

    PubMed Central

    Jutkina, Jekaterina; Heinaru, Eeva; Vedler, Eve; Juhanson, Jaanis; Heinaru, Ain

    2011-01-01

    Plasmids are mobile genetic elements that provide their hosts with many beneficial traits including, in some cases, the ability to degrade different aromatic compounds. To fill the knowledge gap regarding catabolic plasmids of the Baltic Sea water, a total of 209 biodegrading bacterial strains were isolated and screened for the presence of these mobile genetic elements. We found that both large and small plasmids are common in the cultivable Baltic Sea bacterioplankton and are particularly prevalent among the bacterial genera Pseudomonas and Acinetobacter. Out of 61 plasmid-containing strains (29% of all isolates), 34 strains were found to carry large plasmids, which could be associated with the biodegradative capabilities of the host bacterial strains. Focusing on the diversity of IncP-9 plasmids, self-transmissible m-toluate (TOL) and salicylate (SAL) plasmids were detected. Sequencing the repA gene of IncP-9-carrying isolates revealed a high diversity within the IncP-9 plasmid family and extended the assumed bacterial host species range of the IncP-9 representatives. This study is the first insight into the genetic pool of the IncP-9 catabolic plasmids in the Baltic Sea bacterioplankton. PMID:24710296

  11. Phylogenetic relationship and virulence inference of Streptococcus Anginosus Group: curated annotation and whole-genome comparative analysis support distinct species designation

    PubMed Central

    2013-01-01

    Background: The Streptococcus Anginosus Group (SAG) represents three closely related species of the viridans group streptococci recognized as commensal bacteria of the oral, gastrointestinal and urogenital tracts. The SAG also cause severe invasive infections, and are pathogens during cystic fibrosis (CF) pulmonary exacerbation. Little genomic information or description of virulence mechanisms is currently available for SAG. We conducted intra- and inter-species whole-genome comparative analyses with 59 publicly available Streptococcus genomes and seven in-house closed high-quality finished SAG genomes: S. constellatus (3), S. intermedius (2), and S. anginosus (2). For each SAG species, we sequenced at least one numerically dominant strain from CF airways recovered during acute exacerbation and an invasive, non-lung isolate. We also evaluated microevolution that occurred within two isolates that were cultured from one individual one year apart. Results: The SAG genomes were most closely related to S. gordonii and S. sanguinis, based on shared orthologs, and harbor a similar number of proteins within each COG category as other Streptococcus species. Numerous characterized Streptococcus virulence factor homologs were identified within the SAG genomes, including adherence, invasion and spreading factors, LPxTG cell wall proteins, and two-component histidine kinases known to be involved in virulence gene regulation. Mobile elements, primarily integrative conjugative elements and bacteriophage, account for greater than 10% of the SAG genomes. S. anginosus was the most variable species sequenced in this study, yielding both the smallest and the largest SAG genomes containing multiple genomic rearrangements, insertions and deletions. In contrast, within the S. constellatus and S. intermedius species, there was extensive continuous synteny, with only slight differences in genome size between strains. Within S. constellatus we were able to determine important SNPs and changes in

  12. Palaeohexaploid ancestry for Caryophyllales inferred from extensive gene-based physical and genetic mapping of the sugar beet genome (Beta vulgaris).

    PubMed

    Dohm, Juliane C; Lange, Cornelia; Holtgräwe, Daniela; Sörensen, Thomas Rosleff; Borchardt, Dietrich; Schulz, Britta; Lehrach, Hans; Weisshaar, Bernd; Himmelbauer, Heinz

    2012-05-01

    Sugar beet (Beta vulgaris) is an important crop plant that accounts for 30% of the world's sugar production annually. The genus Beta is a distant relative of currently sequenced taxa within the core eudicotyledons; the genomic characterization of sugar beet is essential to make its genome accessible to molecular dissection. Here, we present comprehensive genomic information in genetic and physical maps that cover all nine chromosomes. Based on this information we identified the proposed ancestral linkage groups of rosids and asterids within the sugar beet genome. We generated an extended genetic map that comprises 1127 single nucleotide polymorphism markers prepared from expressed sequence tags and bacterial artificial chromosome (BAC) end sequences. To construct a genome-wide physical map, we hybridized gene-derived oligomer probes against two BAC libraries with 9.5-fold cumulative coverage of the 758 Mbp genome. More than 2500 probes and clones were integrated both in genetic maps and the physical data. The final physical map encompasses 535 chromosomally anchored contigs that contain 8361 probes and 22 815 BAC clones. By using the gene order established with the physical map, we detected regions of synteny between sugar beet (order Caryophyllales) and rosid species that involve 1400-2700 genes in the sequenced genomes of Arabidopsis, poplar, grapevine, and cacao. The data suggest that Caryophyllales share the palaeohexaploid ancestor proposed for rosids and asterids. Taken together, we here provide extensive molecular resources for sugar beet and enable future high-resolution trait mapping, gene identification, and cross-referencing to regions sequenced in other plant species.

  13. A spruce gene map infers ancient plant genome reshuffling and subsequent slow evolution in the gymnosperm lineage leading to extant conifers

    PubMed Central

    2012-01-01

    Background: Seed plants are composed of angiosperms and gymnosperms, which diverged from each other around 300 million years ago. While much light has been shed on the mechanisms and rate of genome evolution in flowering plants, such knowledge remains conspicuously meagre for the gymnosperms. Conifers are key representatives of gymnosperms and the sheer size of their genomes represents a significant challenge for characterization, sequencing and assembling. Results: To gain insight into the macro-organisation and long-term evolution of the conifer genome, we developed a genetic map involving 1,801 spruce genes. We designed a statistical approach based on kernel density estimation to analyse gene density and identified seven gene-rich isochores. Groups of co-localizing genes were also found that were transcriptionally co-regulated, indicative of functional clusters. Phylogenetic analyses of 157 gene families for which at least two duplicates were mapped on the spruce genome indicated that ancient gene duplicates shared by angiosperms and gymnosperms outnumbered conifer-specific duplicates by a ratio of eight to one. Ancient duplicates were much more translocated within and among spruce chromosomes than conifer-specific duplicates, which were mostly organised in tandem arrays. Both high synteny and collinearity were also observed between the genomes of spruce and pine, two conifers that diverged more than 100 million years ago. Conclusions: Taken together, these results indicate that much genomic evolution has occurred in the seed plant lineage before the split between gymnosperms and angiosperms, and that the pace of evolution of the genome macro-structure has been much slower in the gymnosperm lineage leading to extant conifers than that seen for the same period of time in flowering plants. This trend is largely congruent with the contrasted rates of diversification and morphological evolution observed between these two groups of seed plants. PMID:23102090

  14. Comparative cytogenetic mapping of Sox2 and Sox14 in cichlid fishes and inferences on the genomic organization of both genes in vertebrates

    PubMed Central

    Mazzuchelli, Juliana; Yang, Fengtang; Kocher, Thomas D.; Martins, Cesar

    2011-01-01

    To better understand the genomic organization and evolution of Sox genes in vertebrates, we cytogenetically mapped Sox2 and Sox14 genes in cichlid fishes and performed comparative analyses of their orthologs in several vertebrate species. The genomic regions neighbouring Sox2 and Sox14 have been conserved during vertebrate diversification. Although cichlids seem to have undergone high rates of genomic rearrangements, Sox2 and Sox14 are linked in the same chromosome in the Etroplinae Etroplus maculatus, which represents the sister group of all remaining cichlids. However, these genes are located on different chromosomes in several species of the sister group Pseudocrenilabrinae. Similarly, the ancestral synteny of Sox2 and Sox14 has been maintained in several vertebrates, but this synteny has been broken independently in all major groups as a consequence of karyotype rearrangements that took place during vertebrate evolution. PMID:21691861

  15. Proteomic Stable Isotope Probing Reveals Taxonomically Distinct Patterns in Amino Acid Assimilation by Coastal Marine Bacterioplankton

    PubMed Central

    Bryson, Samuel; Li, Zhou; Pett-Ridge, Jennifer; Hettich, Robert L.; Mayali, Xavier; Pan, Chongle

    2016-01-01

    Heterotrophic marine bacterioplankton are a critical component of the carbon cycle, processing nearly a quarter of annual primary production, yet defining how substrate utilization preferences and resource partitioning structure microbial communities remains a challenge. In this study, proteomic stable isotope probing (proteomic SIP) was used to characterize population-specific assimilation of dissolved free amino acids (DFAAs), a major source of dissolved organic carbon for bacterial secondary production in aquatic environments. Microcosms of seawater collected from Newport, Oregon, and Monterey Bay, California, were incubated with 1 µM 13C-labeled amino acids for 15 and 32 h. The taxonomic compositions of microcosm metaproteomes were highly similar to those of the sampled natural communities, with Rhodobacterales, SAR11, and Flavobacteriales representing the dominant taxa. Analysis of 13C incorporation into protein biomass allowed for quantification of the isotopic enrichment of identified proteins and subsequent determination of differential amino acid assimilation patterns between specific bacterioplankton populations. Proteins associated with Rhodobacterales tended to have a significantly high frequency of 13C-enriched peptides, opposite the trend for Flavobacteriales and SAR11 proteins. Rhodobacterales proteins associated with amino acid transport and metabolism had an increased frequency of 13C-enriched spectra at time point 2. Alteromonadales proteins also had a significantly high frequency of 13C-enriched peptides, particularly within ribosomal proteins, demonstrating their rapid growth during incubations. Overall, proteomic SIP facilitated quantitative comparisons of DFAA assimilation by specific taxa, both between sympatric populations and between protein functional groups within discrete populations, allowing an unprecedented examination of population-level metabolic responses to resource acquisition in complex microbial communities.

  16. Proteomic Stable Isotope Probing Reveals Taxonomically Distinct Patterns in Amino Acid Assimilation by Coastal Marine Bacterioplankton.

    PubMed

    Bryson, Samuel; Li, Zhou; Pett-Ridge, Jennifer; Hettich, Robert L; Mayali, Xavier; Pan, Chongle; Mueller, Ryan S

    2016-01-01

    Heterotrophic marine bacterioplankton are a critical component of the carbon cycle, processing nearly a quarter of annual primary production, yet defining how substrate utilization preferences and resource partitioning structure microbial communities remains a challenge. In this study, proteomic stable isotope probing (proteomic SIP) was used to characterize population-specific assimilation of dissolved free amino acids (DFAAs), a major source of dissolved organic carbon for bacterial secondary production in aquatic environments. Microcosms of seawater collected from Newport, Oregon, and Monterey Bay, California, were incubated with 1 µM (13)C-labeled amino acids for 15 and 32 h. The taxonomic compositions of microcosm metaproteomes were highly similar to those of the sampled natural communities, with Rhodobacterales, SAR11, and Flavobacteriales representing the dominant taxa. Analysis of (13)C incorporation into protein biomass allowed for quantification of the isotopic enrichment of identified proteins and subsequent determination of differential amino acid assimilation patterns between specific bacterioplankton populations. Proteins associated with Rhodobacterales tended to have a significantly high frequency of (13)C-enriched peptides, opposite the trend for Flavobacteriales and SAR11 proteins. Rhodobacterales proteins associated with amino acid transport and metabolism had an increased frequency of (13)C-enriched spectra at time point 2. Alteromonadales proteins also had a significantly high frequency of (13)C-enriched peptides, particularly within ribosomal proteins, demonstrating their rapid growth during incubations. Overall, proteomic SIP facilitated quantitative comparisons of DFAA assimilation by specific taxa, both between sympatric populations and between protein functional groups within discrete populations, allowing an unprecedented examination of population-level metabolic responses to resource acquisition in complex microbial communities.

  17. Bacterio-plankton transformation of diazepam and 2-amino-5-chlorobenzophenone in river waters.

    PubMed

    Tappin, Alan D; Loughnane, J Paul; McCarthy, Alan J; Fitzsimons, Mark F

    2014-01-01

    Benzodiazepines are a large class of commonly-prescribed drugs used to treat a variety of clinical disorders. They have been shown to produce ecological effects at environmental concentrations, making understanding their fate in aquatic environments very important. In this study, uptake and biotransformations by riverine bacterio-plankton of the benzodiazepine, diazepam, and 2-amino-5-chlorobenzophenone, ACB (a photo-degradation product of diazepam and several other benzodiazepines), were investigated using batch microcosm incubations. These were conducted using water and bacterio-plankton populations from contrasting river catchments (Tamar and Mersey, UK), both in the presence and absence of a peptide, added as an alternative organic substrate. Incubations lasted 21 days, reflecting the expected water residence time in the catchments. In River Tamar water, 36% of diazepam (p < 0.001) was removed when the peptide was absent. In contrast, there was no removal of diazepam when the peptide was added, although the peptide itself was consumed. For ACB, 61% was removed in the absence of the peptide, and 84% in its presence (p < 0.001 in both cases). In River Mersey water, diazepam removal did not occur in the presence or absence of the peptide, with the latter again consumed, while ACB removal decreased from 44 to 22% with the peptide present. This suggests that bacterio-plankton from the Mersey water degraded the peptide in preference to both diazepam and ACB. Biotransformation products were not detected in any of the samples analysed but a significant increase in ammonium concentration (p < 0.038) was measured in incubations with ACB, confirming mineralization of the amine substituent. Sequential inoculation and incubation of Mersey and Tamar microcosms, for 5 periods of 21 days each, did not produce any evidence of increased ability of the microbial community to remove ACB, suggesting that an indigenous consortium was probably responsible for its metabolism. 
As ACB

  18. Dimethylsulfoniopropionate and methanethiol are important precursors of methionine and protein-sulfur in marine bacterioplankton.

    PubMed

    Kiene, R P; Linn, L J; González, J; Moran, M A; Bruton, J A

    1999-10-01

    Organic sulfur compounds are present in all aquatic systems, but their use as sources of sulfur for bacteria is generally not considered important because of the high sulfate concentrations in natural waters. This study investigated whether dimethylsulfoniopropionate (DMSP), an algal osmolyte that is abundant and rapidly cycled in seawater, is used as a source of sulfur by bacterioplankton. Natural populations of bacterioplankton from subtropical and temperate marine waters rapidly incorporated 15 to 40% of the sulfur from tracer-level additions of [(35)S]DMSP into a macromolecule fraction. Tests with proteinase K and chloramphenicol showed that the sulfur from DMSP was incorporated into proteins, and analysis of protein hydrolysis products by high-pressure liquid chromatography showed that methionine was the major labeled amino acid produced from [(35)S]DMSP. Bacterial strains isolated from coastal seawater and belonging to the alpha-subdivision of the division Proteobacteria incorporated DMSP sulfur into protein only if they were capable of degrading DMSP to methanethiol (MeSH), whereas MeSH was rapidly incorporated into macromolecules by all tested strains and by natural bacterioplankton. These findings indicate that the demethylation/demethiolation pathway of DMSP degradation is important for sulfur assimilation and that MeSH is a key intermediate in the pathway leading to protein sulfur. Incorporation of sulfur from DMSP and MeSH by natural populations was inhibited by nanomolar levels of other reduced sulfur compounds including sulfide, methionine, homocysteine, cysteine, and cystathionine. In addition, propargylglycine and vinylglycine were potent inhibitors of incorporation of sulfur from DMSP and MeSH, suggesting involvement of the enzyme cystathionine gamma-synthetase in sulfur assimilation by natural populations. 
Experiments with [methyl-(3)H]MeSH and [(35)S]MeSH showed that the entire methiol group of MeSH was efficiently incorporated into methionine, a

  19. Inferring protein function from genomic sequence: Giardia lamblia expresses a phosphatidylinositol kinase-related kinase similar to yeast and mammalian TOR.

    PubMed

    Morrison, Hilary G; Zamora, Gus; Campbell, Robert K; Sogin, Mitchell L

    2002-12-01

    Functional assays of genes have historically led to insights about the activities of a protein or protein cascade. However, the rapid expansion of genomic and proteomic information for a variety of diverse taxa is an alternative and powerful means of predicting function by comparing the enzymes and metabolic pathways used by different organisms. As part of the Giardia lamblia genome sequencing project, we routinely survey the complement of predicted proteins and compare those found in this putatively early diverging eukaryote with those of prokaryotes and more recently evolved eukaryotic lineages. Such comparisons reveal the minimal composition of conserved metabolic pathways, suggest which proteins may have been acquired by lateral transfer, and, by their absence, hint at functions lost in the transition from a free-living to a parasitic lifestyle. Here, we describe the use of bioinformatic approaches to investigate the complement and conservation of proteins in Giardia involved in the regulation of translation. We compare an FK506 binding protein homologue and phosphatidylinositol kinase-related kinase present in Giardia to those found in other eukaryotes for which complete genomic sequence data are available. Our investigation of the Giardia genome suggests that PIK-related kinases are of ancient origin and are highly conserved.

  20. Temporal Patterns in Bacterioplankton Community Composition in Three Reservoirs of Similar Trophic Status in Shenzhen, China

    PubMed Central

    Li, Jiancheng; Chen, Cheng; Lu, Jun; Lei, Anping; Hu, Zhangli

    2016-01-01

    The spatial and temporal variation patterns of bacterioplankton community composition (BCC) in three reservoirs (Shiyan, Xikeng, and LuoTian Reservoir) of similar trophic status in Bao’an District, Shenzhen (China), were investigated using PCR amplification of the 16S rDNA gene and denaturing gradient gel electrophoresis (DGGE). Water samples were collected monthly in each reservoir during 12 consecutive months. Distinct differences were detected in band number, pattern, and density of DGGE at different sampling sites and time points. Analysis of the DGGE fingerprints showed that the bacterial community structure varied mainly with season, and the patterns of change indicated that seasonal forces might have a more significant impact on the BCC than eutrophic status in the reservoirs, despite the similar Shannon-Wiener index among the three reservoirs. The sequences obtained from excised bands were affiliated with Cyanobacteria, Firmicutes, Bacteroidetes, Acidobacteria, Actinobacteria, Planctomycetes, and Proteobacteria. PMID:27322295

  1. Assembly-free metagenomic analysis reveals new metabolic capabilities in surface ocean bacterioplankton.

    PubMed

    Luo, Haiwei; Moran, Mary Ann

    2013-10-01

    Uncovering the metabolic capabilities of microbes is key to understanding global energy flux and nutrient transformations. Since the vast majority of environmental microorganisms are uncultured, metagenomics has become an important tool to genotype the microbial community. This study uses a recently developed computational method to confidently assign metagenomic reads to microbial clades without the requirement of metagenome assembly by comparing the evolutionary pattern of nucleotide sequences at non-synonymous sites between metagenomic and orthologous reference genes. We found evidence for new, ecologically relevant metabolic pathways in several lineages of surface ocean bacterioplankton using the Global Ocean Survey (GOS) metagenomic data, including assimilatory sulfate reduction and alkaline phosphatase capabilities in the alphaproteobacterial SAR11 clade, and proteorhodopsin-like genes in the cyanobacterial genus Prochlorococcus. These findings raise new hypotheses about microbial roles in energy flux and organic matter transformation in the ocean. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.

  2. Exploring Microdiversity in Novel Kordia sp. (Bacteroidetes) with Proteorhodopsin from the Tropical Indian Ocean via Single Amplified Genomes

    PubMed Central

    Royo-Llonch, Marta; Ferrera, Isabel; Cornejo-Castillo, Francisco M.; Sánchez, Pablo; Salazar, Guillem; Stepanauskas, Ramunas; González, José M.; Sieracki, Michael E.; Speich, Sabrina; Stemmann, Lars; Pedrós-Alió, Carlos; Acinas, Silvia G.

    2017-01-01

    Marine Bacteroidetes constitute a very abundant bacterioplankton group in the oceans that plays a key role in recycling particulate organic matter and includes several photoheterotrophic members containing proteorhodopsin. Relatively few marine Bacteroidetes species have been described and, moreover, they correspond to cultured isolates, which in most cases do not represent the actual abundant or ecologically relevant microorganisms in the natural environment. In this study, we explored the microdiversity of 98 Single Amplified Genomes (SAGs) retrieved from the surface waters of the underexplored North Indian Ocean, whose most closely related isolate is Kordia algicida OT-1. Using Multi Locus Sequencing Analysis (MLSA) we found no microdiversity in the tested conserved phylogenetic markers (16S rRNA and 23S rRNA genes), the fast-evolving Internal Transcribed Spacer and the functional markers proteorhodopsin and the beta-subunit of RNA polymerase. Furthermore, we carried out a Fragment Recruitment Analysis (FRA) with marine metagenomes to learn about the distribution and dynamics of this microorganism in different locations, depths and size fractions. This analysis indicated that this taxon belongs to the rare biosphere, showing its highest abundance after upwelling-induced phytoplankton blooms and sinking to the deep ocean with large organic matter particles. This uncultured Kordia lineage likely represents a novel Kordia species (Kordia sp. CFSAG39SUR) that contains the proteorhodopsin gene and has a widespread spatial and vertical distribution. The combination of SAGs and MLSA makes a valuable approach to infer putative ecological roles of uncultured abundant microorganisms. PMID:28790980

  3. Transient changes in bacterioplankton communities induced by the submarine volcanic eruption of El Hierro (Canary Islands).

    PubMed

    Ferrera, Isabel; Arístegui, Javier; González, José M; Montero, María F; Fraile-Nuez, Eugenio; Gasol, Josep M

    2015-01-01

    The submarine volcanic eruption occurring near El Hierro (Canary Islands) in October 2011 provided a unique opportunity to determine the effects of such events on the microbial populations of the surrounding waters. The birth of a new underwater volcano produced a large plume of vent material detectable from space that led to abrupt changes in the physical-chemical properties of the water column. We combined flow cytometry and 454-pyrosequencing of 16S rRNA gene amplicons (V1-V3 regions for Bacteria and V3-V5 for Archaea) to monitor the area around the volcano through the eruptive and post-eruptive phases (November 2011 to April 2012). Flow cytometric analyses revealed higher abundance and relative activity (expressed as a percentage of high-nucleic acid content cells) of heterotrophic prokaryotes during the eruptive process as compared to post-eruptive stages. Changes observed in populations detectable by flow cytometry were more evident at depths closer to the volcano (~70-200 m), coinciding also with oxygen depletion. Alpha-diversity analyses revealed that species richness (Chao1 index) decreased during the eruptive phase; however, no dramatic changes in community composition were observed. The most abundant taxa during the eruptive phase were similar to those in the post-eruptive stages and to those typically prevalent in oceanic bacterioplankton communities (i.e. the alphaproteobacterial SAR11 group, the Flavobacteriia class of the Bacteroidetes and certain groups of Gammaproteobacteria). Yet, although at low abundance, we also detected the presence of taxa not typically found in bacterioplankton communities such as the Epsilonproteobacteria and members of the candidate division ZB3, particularly during the eruptive stage. These groups are often associated with deep-sea hydrothermal vents or sulfur-rich springs. Both cytometric and sequence analyses showed that once the eruption ceased, evidence of the volcano-induced changes was no longer observed.

  4. Quantification of Carbon and Phosphorus Co-Limitation in Bacterioplankton: New Insights on an Old Topic

    PubMed Central

    Dorado-García, Irene; Medina-Sánchez, Juan Manuel; Herrera, Guillermo; Cabrerizo, Marco J.; Carrillo, Presentación

    2014-01-01

    Because the nature of the main resource that limits bacterioplankton (e.g. organic carbon [C] or phosphorus [P]) has biogeochemical implications concerning organic C accumulation in freshwater ecosystems, empirical knowledge is needed concerning how bacteria respond to these two resources, available alone or together. We performed field experiments of resource manipulation (2×2 factorial design, with the addition of C, P, or both combined) in two Mediterranean freshwater ecosystems with contrasting trophic states (oligotrophy vs. eutrophy) and trophic natures (autotrophy vs. heterotrophy, measured as gross primary production:respiration ratio). Overall, the two resources synergistically co-limited bacterioplankton, i.e. the magnitude of the response of bacterial production and abundance to the two resources combined was higher than the additive response in both ecosystems. However, bacteria also responded positively to single P and C additions in the eutrophic ecosystem, but not to single C in the oligotrophic one, consistent with the value of the ratio between bacterial C demand and algal C supply. Accordingly, the trophic nature rather than the trophic state of the ecosystems proves to be a key feature determining the expected types of resource co-limitation of bacteria, as summarized in a proposed theoretical framework. The actual types of co-limitation shifted over time and partially deviated (a lesser degree of synergism) from the theoretical expectations, particularly in the eutrophic ecosystem. These deviations may be explained by extrinsic ecological forces to physiological limitations of bacteria, such as predation, whose role in our experiments is supported by the relationship between the dynamics of bacteria and bacterivores tested by SEMs (structural equation models). 
Our study, in line with the increasingly recognized role of freshwater ecosystems in the global C cycle, suggests that further attention should be focussed on the biotic interactions that
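
    The abstract above defines synergistic co-limitation as a response to the combined C+P addition that exceeds the additive response. As a minimal sketch of that comparison for one 2×2 factorial experiment, the function below contrasts the combined effect with the sum of the single-resource effects. The function name, the tolerance argument, the "sub-additive" label and the example numbers are all illustrative assumptions, not values or methods taken from the study.

```python
def classify_colimitation(control, plus_c, plus_p, plus_cp, tol=0.0):
    """Classify the response in a 2x2 resource-addition experiment.

    All arguments are mean responses (e.g. bacterial production) in the
    control, +C, +P and +C+P treatments. Names are hypothetical.
    """
    effect_c = plus_c - control      # response to carbon alone
    effect_p = plus_p - control      # response to phosphorus alone
    effect_cp = plus_cp - control    # response to both resources together
    additive = effect_c + effect_p   # expectation if the effects simply add
    interaction = effect_cp - additive
    if interaction > tol:
        return "synergistic"         # combined response exceeds additive one
    if interaction < -tol:
        return "sub-additive"        # label chosen here for illustration
    return "additive"


# Hypothetical numbers, for illustration only.
print(classify_colimitation(control=10, plus_c=11, plus_p=14, plus_cp=25))
```

    In practice the comparison would be made with replicated treatments and a statistical test of the interaction term rather than with a fixed tolerance.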

  5. Successive changes in bacterioplankton communities in the River Rhine after copper additions

    SciTech Connect

    Tubbing, D.M.J.; Admiraal, W.; Katako, A.

    1995-09-01

    The sensitivity of bacterioplankton to copper was analyzed to see whether initial steps in the selection of copper-tolerant life-forms in mixed populations of bacteria were accompanied by changes in basic metabolic parameters. Analysis took place by measuring the incorporation of [³H]thymidine and [³H]leucine, and the hydrolysis of leucyl-β-naphthylamide over a period of 4 d. In acute toxicity tests the radiochemically determined parameters showed the same sensitivities to copper, whereas in the enzyme test the dose-response curve had a much lower slope, indicating less sensitivity. Marked differences were observed in the susceptibility of the different processes after prolonged exposure to copper. Incorporation of [³H]thymidine, [³H]leucine, and proteolytic activity changed substantially during exposure to concentrations as low as 2 to 31 µg Cu L⁻¹. Higher copper concentrations (126 to 1,000 µg Cu L⁻¹) led in the course of 24 to 48 h to the development of a bacterial community with a higher overall copper tolerance. In winter, these successive events in bacterial populations were observed in the absence of substantial populations of algae or zooplankton. In summer, the metabolic changes in bacterioplankton exposed to copper were strongly affected by the poisoning of other organisms, notably algae, and the subsequent release of organic material. Thus, moderate copper concentrations alter the metabolic profile of bacterial communities, probably as an initial step in the selection of tolerant life-forms.

  6. Transient Changes in Bacterioplankton Communities Induced by the Submarine Volcanic Eruption of El Hierro (Canary Islands)

    PubMed Central

    Ferrera, Isabel; Arístegui, Javier; González, José M.; Montero, María F.; Fraile-Nuez, Eugenio; Gasol, Josep M.

    2015-01-01

    The submarine volcanic eruption occurring near El Hierro (Canary Islands) in October 2011 provided a unique opportunity to determine the effects of such events on the microbial populations of the surrounding waters. The birth of a new underwater volcano produced a large plume of vent material detectable from space that led to abrupt changes in the physical-chemical properties of the water column. We combined flow cytometry and 454-pyrosequencing of 16S rRNA gene amplicons (V1–V3 regions for Bacteria and V3–V5 for Archaea) to monitor the area around the volcano through the eruptive and post-eruptive phases (November 2011 to April 2012). Flow cytometric analyses revealed higher abundance and relative activity (expressed as a percentage of high-nucleic acid content cells) of heterotrophic prokaryotes during the eruptive process as compared to post-eruptive stages. Changes observed in populations detectable by flow cytometry were more evident at depths closer to the volcano (~70–200 m), coinciding also with oxygen depletion. Alpha-diversity analyses revealed that species richness (Chao1 index) decreased during the eruptive phase; however, no dramatic changes in community composition were observed. The most abundant taxa during the eruptive phase were similar to those in the post-eruptive stages and to those typically prevalent in oceanic bacterioplankton communities (i.e. the alphaproteobacterial SAR11 group, the Flavobacteriia class of the Bacteroidetes and certain groups of Gammaproteobacteria). Yet, although at low abundance, we also detected the presence of taxa not typically found in bacterioplankton communities such as the Epsilonproteobacteria and members of the candidate division ZB3, particularly during the eruptive stage. These groups are often associated with deep-sea hydrothermal vents or sulfur-rich springs. Both cytometric and sequence analyses showed that once the eruption ceased, evidences of the volcano-induced changes were no longer observed
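
    The abstract above reports species richness as the Chao1 index. As a minimal sketch of how such an estimate is obtained from a vector of per-taxon read counts, the function below uses the bias-corrected form of Chao1, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of taxa seen exactly once and exactly twice. The abstract does not say which variant or software the authors used, so the choice of formula, the function name and the example counts are assumptions made for illustration.

```python
def chao1(abundances):
    """Bias-corrected Chao1 richness estimate from per-taxon counts."""
    counts = [n for n in abundances if n > 0]    # taxa actually observed
    s_obs = len(counts)                          # observed richness
    f1 = sum(1 for n in counts if n == 1)        # singletons
    f2 = sum(1 for n in counts if n == 2)        # doubletons
    # The bias-corrected form stays defined when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical OTU table row: 7 observed taxa, 3 singletons, 2 doubletons.
print(chao1([10, 5, 2, 2, 1, 1, 1]))
```

    A drop in this estimate between sampling dates, as described for the eruptive phase, reflects fewer observed taxa, fewer rare taxa, or both.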

  7. Physiology and phylogeny of the candidate phylum "Atribacteria" (formerly OP9/JS1) inferred from single-cell genomics and metagenomics

    NASA Astrophysics Data System (ADS)

    Dodsworth, J. A.; Murugapiran, S.; Blainey, P. C.; Nobu, M.; Rinke, C.; Schwientek, P.; Gies, E.; Webster, G.; Kille, P.; Weightman, A.; Liu, W. T.; Hallam, S.; Tsiamis, G.; Swingley, W.; Ross, C.; Tringe, S. G.; Chain, P. S.; Scholz, M. B.; Lo, C. C.; Raymond, J.; Quake, S. R.; Woyke, T.; Hedlund, B. P.

    2014-12-01

    Single-cell sequencing and metagenomics have extended the genomics revolution to yet-uncultivated microorganisms and provided insights into the coding potential of this so-called "microbial dark matter", including microbes belonging to candidate phyla with no cultivated representatives. As more datasets emerge, comparison of individual genomes from different lineages and habitats can provide insight into the phylogeny, conserved features, and potential metabolic diversity of candidate phyla. The candidate bacterial phylum OP9 was originally found in Obsidian Pool, Yellowstone National Park, and it has since been detected in geothermal springs, petroleum reservoirs, and engineered thermal environments worldwide. JS1, another uncultivated bacterial lineage affiliated with OP9, is often abundant in marine sediments associated with methane hydrates, hydrocarbon seeps, and on continental margins and shelves, and is found in other non-thermal marine and subsurface environments. The phylogenetic relationship between OP9, JS1, and other Bacteria has not been fully resolved, and to date no axenic cultures from these lineages have been reported. Recently, 31 single amplified genomes (SAGs) from six distinct OP9 and JS1 lineages have been obtained using flow cytometric and microfluidic techniques. These SAGs were used to inform metagenome binning techniques that identified OP9/JS1 sequences in several metagenomes, extending genomic coverage in three of the OP9 and JS1 lineages. Phylogenomic analyses of these SAG and metagenome bin datasets suggest that OP9 and JS1 constitute a single, deeply branching phylum, for which the name "Atribacteria" has recently been proposed. Overall, members of the "Atribacteria" are predicted to be heterotrophic anaerobes without the capacity for respiration, with some lineages potentially specializing in secondary fermentation of organic acids. 
A set of signature "Atribacteria" genes was tentatively identified, including components of a bacterial

  8. Complete mitochondrial genomes from four subspecies of common chaffinch (Fringilla coelebs): new inferences about mitochondrial rate heterogeneity, neutral theory, and phylogenetic relationships within the order Passeriformes.

    PubMed

    Marshall, H Dawn; Baker, Allan J; Grant, Allison R

    2013-03-15

    We describe whole mitochondrial genome sequences from four subspecies of the common chaffinch (Fringilla coelebs), and compare them to 31 publicly available mitochondrial genome sequences from other Passeriformes. Rates and patterns of mitochondrial gene evolution are analyzed at different taxonomic levels within this avian order, and evidence is adduced for and against the nearly neutral theory of molecular evolution and the role of positive selection in shaping genetic variation of this small but critical genome. We find evidence of mitochondrial rate heterogeneity in birds as in other vertebrates, likely due to differences in mutational pressure across the genome. Unlike in gadine fish and some of the human mitochondrial work we do not observe strong support for the nearly neutral theory of molecular evolution; instead evidence from molecular clocks, distribution of dN/dS ratios at different levels of the taxonomic hierarchy and in different lineages, McDonald-Kreitman tests within Fringillidae, and site-specific tests of selection within Passeriformes, all point to a role for positive selection, especially for the complex I NADH dehydrogenase genes. The protein-coding mitogenome phylogeny of the order Passeriformes is broadly consistent with previously-reported molecular findings, but provides support for a sister relationship between the superfamilies Muscicapoidea and Passeroidea on a short basal internode of the Passerida where relationships have been difficult to resolve. An unexpected placement of the Paridae (represented by Hume's groundpecker) within the Muscicapoidea was observed. Consistent with other molecular studies the mtDNA phylogeny reveals paraphyly within the Muscicapoidea and a sister relationship of Fringilla with Carduelis rather than Emberiza. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Taxonomy, molecular phylogeny and evolution of plant reverse transcribing viruses (family Caulimoviridae) inferred from full-length genome and reverse transcriptase sequences.

    PubMed

    Bousalem, M; Douzery, E J P; Seal, S E

    2008-01-01

    This study constitutes the first evaluation and application of quantitative taxonomy to the family Caulimoviridae and the first in-depth phylogenetic study of the family Caulimoviridae that integrates the common origin between LTR retrotransposons and caulimoviruses. The phylogenetic trees and PASC analyses derived from the full genome and from the corresponding partial RT concurred, providing strong support for the current genus classification based mainly on genome organisation and use of partial RT sequence as a molecular marker. The PASC distributions obtained are multimodal, making it possible to distinguish between genus, species and strain. The taxonomy of badnaviruses infecting banana (Musa spp.) was clarified, and the consequence of endogenous badnaviruses on the genetic diversity and evolution of caulimoviruses is discussed. The use of LTR retrotransposons as outgroups reveals a structured bipolar topology separating the genus Badnavirus from the other genera. Badnaviruses appear to be the most recent genus, with the genus Tungrovirus in an intermediary position. This structuring intersects the one established by genomic and biological properties and allows us to make a correlation between phylogeny and biogeography. The variability shown between members of the family Caulimoviridae is in a similar range to that reported within other DNA and RNA plant virus families.

  10. Inference of the Genetic Polymorphisms of CYP2D6 in Six Subtribes of the Malaysian Orang Asli from Whole-Genome Sequencing Data.

    PubMed

    Yu, Choo Yee; Ang, Geik Yong; Subramaniam, Vinothini; Johari James, Richard; Ahmad, Aminuddin; Abdul Rahman, Thuhairah; Mohd Nor, Fadzilah; Shaari, Syahrul Azlin; Teh, Lay Kek; Salleh, Mohd Zaki

    2017-07-01

    CYP2D6 is one of the major enzymes in the cytochrome P450 monooxygenase system. It metabolizes ∼25% of prescribed drugs and hence, the genetic diversity of the CYP2D6 gene has continued to be of great interest to the medical and pharmaceutical industries. This study was designed to perform a systematic analysis of the CYP2D6 gene in six subtribes of the Malaysian Orang Asli. Genomic DNAs were extracted from the blood samples followed by whole-genome sequencing. The reads were aligned to the reference human genome hg19 and variants in the CYP2D6 gene were analyzed. CYP2D6*5 and duplication of CYP2D6 were analyzed using previously established methods. A total of 72 single nucleotide polymorphisms were identified. CYP2D6*1, *2, *4, *5, *10, *41, and duplication of the gene were found in the Orang Asli, whereby CYP2D6*2 and *41 alleles are reported for the first time in the Malaysian population. The findings in this study provide insights into the genetic polymorphisms of CYP2D6 in the Orang Asli of Peninsular Malaysia.

  11. Sensitivity of bacterioplankton nitrogen metabolism to eutrophication in sub-tropical coastal waters of Key West, Florida.

    PubMed

    Hoch, Matthew P; Dillon, Kevin S; Coffin, Richard B; Cifuentes, Luis A

    2008-05-01

    Expression of intracellular ammonium assimilation enzymes was used to assess the response of nitrogen (N) metabolism in bacterioplankton to N-loading of sub-tropical coastal waters of Key West, Florida. Specific activities of glutamine synthetase (GS) and total glutamate dehydrogenase (GDHT) were measured on the bacterial size fraction (<0.8 µm) to assess N-deplete versus N-replete metabolic states, respectively. Enzyme results were compared to concentrations of dissolved organic matter and nutrients and to the biomass and production of phytoplankton and bacteria. Concentrations of dissolved inorganic N (DIN), dissolved organic N (DON), and dissolved organic carbon (DOC) positively correlated with specific activities of GDHT and negatively correlated with that of GS. Total dissolved N (TDN) concentration explained 81% of variance in bacterioplankton GDHT:GS activity ratio. The GDHT:GS ratio, TDN, DOC, and bacterial parameters decreased in magnitude along a tidally dynamic trophic gradient from north of Key West to south at the reef tract, which is consistent with the combined effects of localized coastal eutrophication and tidal exchange of seawater from the Southwest Florida Shelf and Florida Strait. The N-replete bacterioplankton north of Key West can regenerate ammonium which sustains primary production transported south to the reef. The range in GDHT:GS ratios was 5-30 times greater than that for commonly used indicators of planktonic eutrophication, which emphasizes the sensitivity of bacterioplankton N-metabolism to changes in N-bioavailability caused by nutrient pollution in sub-tropical coastal waters and the utility of the GDHT:GS ratio as a bioindicator of N-replete conditions.

  12. Seasonal Dynamics of Bacterioplankton Community Structure in a Eutrophic Lake as Determined by 5S rRNA Analysis

    PubMed Central

    Höfle, Manfred G.; Haas, Heike; Dominik, Katja

    1999-01-01

    Community structure of bacterioplankton was studied during the major growth season for phytoplankton (April to October) in the epilimnion of a temperate eutrophic lake (Lake Plußsee, northern Germany) by using comparative 5S rRNA analysis. Estimates of the relative abundances of single taxonomic groups were made on the basis of the amounts of single 5S rRNA bands obtained after high-resolution electrophoresis of RNA directly from the bacterioplankton. Full-sequence analysis of single environmental 5S rRNAs enabled the identification of single taxonomic groups of bacteria. Comparison of partial 5S rRNA sequences allowed the detection of changes of single taxa over time. Overall, the whole bacterioplankton community showed two to eight abundant (>4% of the total 5S rRNA) taxa. A distinctive seasonal succession was observed in the taxonomic structure of this pelagic community. A rather-stable community structure, with seven to eight different taxonomic units, was observed beginning in April during the spring phytoplankton bloom. A strong reduction in this diversity occurred at the beginning of the clear-water phase (early May), when only two to four abundant taxa were observed, with one taxon dominating (up to 72% of the total 5S rRNA). The community structure during summer stagnation (June and July) was characterized by frequent changes of different dominating taxa. During late summer, a dinoflagellate bloom (Ceratium hirundinella) occurred, with Comamonas acidovorans (β-subclass of the class Proteobacteria) becoming the dominant bacterial species (average abundance of 43% of the total 5S rRNA). Finally, the seasonal dynamics of the community structure of bacterioplankton were compared with the abundances of other major groups of the aquatic food web, such as phyto- and zooplankton, revealing that strong grazing pressure by zooplankton can reduce microbial diversity substantially in pelagic environments. PMID:10388718

  13. An independent genome duplication inferred from Hox paralogs in the American paddlefish--a representative basal ray-finned fish and important comparative reference.

    PubMed

    Crow, Karen D; Smith, Christopher D; Cheng, Jan-Fang; Wagner, Günter P; Amemiya, Chris T

    2012-01-01

    Vertebrates have experienced two rounds of whole-genome duplication (WGD) in the stem lineages of deep nodes within the group and a subsequent duplication event in the stem lineage of the teleosts, a highly diverse group of ray-finned fishes. Here, we present the first full Hox gene sequences for any member of the Acipenseriformes, the American paddlefish, and confirm that an independent WGD occurred in the paddlefish lineage, approximately 42 Ma based on sequences spanning the entire HoxA cluster and eight genes on the HoxD gene cluster. These clusters comprise different HOX loci and maintain conserved synteny relative to bichir, zebrafish, stickleback, and pufferfish, as well as human, mouse, and chick. We also provide a gene genealogy for the duplicated fzd8 gene in paddlefish and present evidence for the first Hox14 gene in any ray-finned fish. Taken together, these data demonstrate that the American paddlefish has an independently duplicated genome. Substitution patterns of the "alpha" paralogs on both the HoxA and HoxD gene clusters suggest transcriptional inactivation consistent with functional diploidization. Further, there are similarities in the pattern of sequence divergence among duplicated Hox genes in paddlefish and teleost lineages, even though they occurred independently approximately 200 Myr apart. We highlight implications on comparative analyses in the study of the "fin-limb transition" as well as gene and genome duplication in bony fishes, which includes all ray-finned fishes as well as the lobe-finned fishes and tetrapod vertebrates.

  14. Using Cases to Strengthen Inference on the Association between Single Nucleotide Polymorphisms and a Secondary Phenotype in Genome-Wide Association Studies

    PubMed Central

    Li, Huilin; Gail, Mitchell H.; Berndt, Sonja; Chatterjee, Nilanjan

    2010-01-01

    Case-control genome-wide association studies provide a vast amount of genetic information that may be used to investigate secondary phenotypes. We study the situation in which the primary disease is rare and the secondary phenotype and genetic markers are dichotomous. An analysis of the association between a genetic marker and the secondary phenotype based on controls only is valid, whereas standard methods that also use cases result in biased estimates and highly inflated type I error if there is an interaction between the secondary phenotype and the genetic marker on the risk of the primary disease. Here we present an adaptively weighted method that combines the case and control data to study the association, while reducing to the controls only analysis if there is strong evidence of an interaction. The possibility of such an interaction and the misleading results for standard methods, but not for the adaptively weighted or controls only approaches, are illustrated by data from a case-control study of colorectal adenoma, in which the secondary phenotype is smoking. Simulations and asymptotic theory indicate that the adaptively weighted method can reduce the mean square error for estimation with a pre-specified SNP and increase the power to discover a new association in a genome-wide study, compared to an analysis of controls only. Further experience with genome-wide studies is needed to determine when methods that assume no interaction and gain precision and power, thereby can be recommended, and when methods such as the adaptively weighted or controls only approaches are needed to guard against the possibility of non-zero interactions. PMID:20583284

  15. Skim-Based Genotyping by Sequencing Using a Double Haploid Population to Call SNPs, Infer Gene Conversions, and Improve Genome Assemblies.

    PubMed

    Bayer, Philipp Emanuel

    2016-01-01

    Genotyping by sequencing (GBS) is an emerging technology to rapidly call an abundance of Single Nucleotide Polymorphisms (SNPs) using genome sequencing technology. Several different methodologies and approaches have recently been established, most of these relying on a specific preparation of data. Here we describe our GBS-pipeline, which uses high coverage reads from two parents and low coverage reads from their double haploid offspring to call SNPs on a large scale. The upside of this approach is the high resolution and scalability of the method.

  16. Response of bacterioplankton community structure to an artificial gradient of pCO2 in the Arctic Ocean

    NASA Astrophysics Data System (ADS)

    Zhang, R.; Xia, X.; Lau, S. C. K.; Motegi, C.; Weinbauer, M. G.; Jiao, N.

    2013-06-01

    In order to test the influences of ocean acidification on the ocean pelagic ecosystem, so far the largest CO2 manipulation mesocosm study (European Project on Ocean Acidification, EPOCA) was performed in Kings Bay (Kongsfjorden), Spitsbergen. During a 30 day incubation, bacterial diversity was investigated using DNA fingerprinting and clone library analysis of bacterioplankton samples. Terminal restriction fragment length polymorphism (T-RFLP) analysis of the PCR amplicons of the 16S rRNA genes revealed that general bacterial diversity, taxonomic richness and community structure were influenced by the variation of productivity during the time of incubation, but not the degree of ocean acidification. A BIOENV analysis suggested a complex control of bacterial community structure by various biological and chemical environmental parameters. The maximum apparent diversity of bacterioplankton (i.e., the number of T-RFs) in high and low pCO2 treatments differed significantly. A negative relationship between the relative abundance of Bacteroidetes and pCO2 levels was observed for samples at the end of the experiment by the combination of T-RFLP and clone library analysis. Our study suggests that ocean acidification affects the development of bacterial assemblages and potentially impacts the ecological function of the bacterioplankton in the marine ecosystem.

  17. Proteomic-based stable isotope probing reveals taxonomically Distinct Patterns in Amino Acid Assimilation by Coastal Marine Bacterioplankton

    DOE PAGES

    Bryson, Samuel; Li, Zhou; Pett-Ridge, Jennifer; ...

    2016-04-26

    Heterotrophic marine bacterioplankton are a critical component of the carbon cycle, processing nearly a quarter of annual global primary production, yet defining how substrate utilization preferences and resource partitioning structure these microbial communities remains a challenge. In this study, we utilized proteomics-based stable isotope probing (proteomic SIP) to characterize the assimilation of amino acids by coastal marine bacterioplankton populations. We incubated microcosms of seawater collected from Newport, OR and Monterey Bay, CA with 1 µM 13C-amino acids for 15 and 32 hours. Subsequent analysis of 13C incorporation into protein biomass quantified the frequency and extent of isotope enrichment for identified proteins. Using these metrics we tested whether amino acid assimilation patterns were different for specific bacterioplankton populations. Proteins associated with Rhodobacterales and Alteromonadales tended to have a significantly high number of tandem mass spectra from 13C-enriched peptides, while Flavobacteriales and SAR11 proteins generally had significantly low numbers of 13C-enriched spectra. Rhodobacterales proteins associated with amino acid transport and metabolism had an increased frequency of 13C-enriched spectra at time-point 2, while Alteromonadales ribosomal proteins were 13C-enriched across time-points. Overall, proteomic SIP facilitated quantitative comparisons of dissolved free amino acids assimilation by specific taxa, both between sympatric populations and between protein functional groups within discrete populations, allowing an unprecedented examination of population-level metabolic responses to resource acquisition in complex microbial communities.

  18. Distribution of bacterioplankton with active metabolism in waters of the St. Anna Trough, Kara Sea, in autumn 2011

    NASA Astrophysics Data System (ADS)

    Mosharova, I. V.; Mosharov, S. A.; Ilinskiy, V. V.

    2017-01-01

    The distribution of bacterioplankton with active electron transport chains, as well as bacteria with intact cell membranes, was investigated for the first time in the region of St. Anna Trough in the Kara Sea. The average number of bacteria with active electron transport chains in the waters of the St. Anna Trough was 15.55 × 10³ cells mL⁻¹ (the limits of variation were 1.06 to 92.17 × 10³ cells mL⁻¹). The average number of bacteria with intact membranes was 33.46 × 10³ cells mL⁻¹ (the limits of variation were 6.78 to 103.18 × 10³ cells mL⁻¹). Almost all bacterioplankton microorganisms in the studied area were potentially viable, and the average share of bacteria with intact membranes was 92.1% of the total number of bacterioplankton (TNB) (the limits of variation were 76.2 to 98.4%). The share of bacteria with active metabolisms was 38.2% of the TNB (the limits of variation were 5.6 to 93.4%). The shares of the bacteria with active metabolisms were maximum in areas with the most stable environmental conditions (on the shelf and in deep water), whereas on the slope, where the gradients of water temperature and salinity were maximum, these values were lower.

  19. The UV responses of bacterioneuston and bacterioplankton isolates depend on the physiological condition and involve a metabolic shift.

    PubMed

    Santos, Ana L; Baptista, Inês; Lopes, Sílvia; Henriques, Isabel; Gomes, Newton C M; Almeida, Adelaide; Correia, António; Cunha, Angela

    2012-06-01

    Bacteria from the surface microlayer (bacterioneuston) and underlying waters (bacterioplankton) were isolated upon exposure to UV-B radiation, and their individual UV sensitivity in terms of CFU numbers, activity (leucine and thymidine incorporation), sole-carbon source use profiles, repair potential (light-dependent and independent), and photoadaptation potential, under different physiological conditions, was compared. Colony counts were 11.5-16.2% more reduced by UV-B exposure in bacterioplankton isolates (P < 0.05). Inhibition of leucine incorporation in bacterioneuston isolates was 10.9-11.5% higher than in bacterioplankton (P < 0.05). These effects were accompanied by a shift in sole-carbon source use profiles, assessed with Biolog® EcoPlates, with a reduction in consumption of amines and amino acids and increased use of polymers, particularly in bacterioneuston isolates. Recovery under starvation was generally enhanced compared with nourished conditions, especially in bacterioneuston isolates. Overall, only insignificant increases in the induction of antibiotic resistant mutant phenotypes (Rif(R) and Nal(R)) were observed. In general, a potential for photoadaptation could not be detected among the tested isolates. These results indicate that UV effects on bacteria are influenced by their physiological condition and are accompanied by a shift in metabolic profiles, more significant in bacterioneuston isolates, suggesting the presence of bacterial strains adapted to high UV levels in the SML.

  20. The Diversity of the Limnohabitans Genus, an Important Group of Freshwater Bacterioplankton, by Characterization of 35 Isolated Strains

    PubMed Central

    Kasalický, Vojtěch; Jezbera, Jan; Hahn, Martin W.; Šimek, Karel

    2013-01-01

    Bacteria of the genus Limnohabitans, more precisely the R-BT lineage, have a prominent role in freshwater bacterioplankton communities due to their high rates of substrate uptake and growth, growth on algal-derived substrates and high mortality rates from bacterivory. Moreover, due to their generally larger mean cell volume, compared to typical bacterioplankton cells, they contribute over-proportionally to total bacterioplankton biomass. Here we present genetic, morphological and ecophysiological properties of 35 bacterial strains affiliated with the Limnohabitans genus newly isolated from 11 non-acidic European freshwater habitats. The low genetic diversity indicated by the previous studies using the ribosomal SSU gene highly contrasted with the surprisingly rich morphologies and different patterns in substrate utilization of isolated strains. Therefore, the intergenic spacer between 16S and 23S rRNA genes was successfully tested as a fine-scale marker to delineate individual lineages and even genotypes. For further studies, we propose the division of the Limnohabitans genus into five lineages (provisionally named as LimA, LimB, LimC, LimD and LimE) and also additional sublineages within the most diversified lineage LimC. Such a delineation is supported by the morphology of isolated strains which predetermine large differences in their ecology. PMID:23505469

  1. Phylogenetic shifts of bacterioplankton community composition along the Pearl Estuary: the potential impact of hypoxia and nutrients.

    PubMed

    Liu, Jiwen; Fu, Bingbing; Yang, Hongmei; Zhao, Meixun; He, Biyan; Zhang, Xiao-Hua

    2015-01-01

    The significance of salinity in shaping bacterial communities dwelling in estuarine areas has been well documented. However, the influences of other environmental factors such as dissolved oxygen and nutrients in determining distribution patterns of both individual taxa and bacterial communities inhabiting local estuarine regions remain elusive. Here, bacterioplankton community structures of surface and bottom waters from eight sites along the Pearl Estuary were characterized with 16S rRNA gene pyrosequencing. The results showed significant differences in bacterioplankton community between freshwater and saltwater sites, and further between surface and bottom waters of saltwater sites. Synechococcus dominated the surface water of saltwater sites while Oceanospirillales, SAR11 and SAR406 were prevalent in the bottom water. Betaproteobacteria was abundant in freshwater sites, with no significant difference between water layers. Occurrence of phylogenetic shifts in taxa affiliated to the same clade was also detected. Dissolved oxygen explained most of the bacterial community variation in the redundancy analysis targeting only freshwater sites, whereas nutrients and salinity explained most of the variation across all samples in the Pearl Estuary. Methylophilales (mainly PE2 clade) was positively correlated to dissolved oxygen, whereas Rhodocyclales (mainly R.12up clade) was negatively correlated. Moreover, high nutrient inputs to the freshwater area of the Pearl Estuary have shifted the bacterial communities toward copiotrophic groups, such as Sphingomonadales. The present study demonstrated that the overall nutrients and freshwater hypoxia play important roles in determining bacterioplankton compositions and provided insights into the potential ecological roles of specific taxa in estuarine environments.

  2. Phylogenetic shifts of bacterioplankton community composition along the Pearl Estuary: the potential impact of hypoxia and nutrients

    PubMed Central

    Liu, Jiwen; Fu, Bingbing; Yang, Hongmei; Zhao, Meixun; He, Biyan; Zhang, Xiao-Hua

    2015-01-01

    The significance of salinity in shaping bacterial communities dwelling in estuarine areas has been well documented. However, the influences of other environmental factors such as dissolved oxygen and nutrients in determining distribution patterns of both individual taxa and bacterial communities inhabiting local estuarine regions remain elusive. Here, bacterioplankton community structures of surface and bottom waters from eight sites along the Pearl Estuary were characterized with 16S rRNA gene pyrosequencing. The results showed significant differences in bacterioplankton community between freshwater and saltwater sites, and further between surface and bottom waters of saltwater sites. Synechococcus dominated the surface water of saltwater sites while Oceanospirillales, SAR11 and SAR406 were prevalent in the bottom water. Betaproteobacteria was abundant in freshwater sites, with no significant difference between water layers. Occurrence of phylogenetic shifts in taxa affiliated to the same clade was also detected. Dissolved oxygen explained most of the bacterial community variation in the redundancy analysis targeting only freshwater sites, whereas nutrients and salinity explained most of the variation across all samples in the Pearl Estuary. Methylophilales (mainly PE2 clade) was positively correlated to dissolved oxygen, whereas Rhodocyclales (mainly R.12up clade) was negatively correlated. Moreover, high nutrient inputs to the freshwater area of the Pearl Estuary have shifted the bacterial communities toward copiotrophic groups, such as Sphingomonadales. The present study demonstrated that the overall nutrients and freshwater hypoxia play important roles in determining bacterioplankton compositions and provided insights into the potential ecological roles of specific taxa in estuarine environments. PMID:25713564

  3. Seasonality Affects the Diversity and Composition of Bacterioplankton Communities in Dongjiang River, a Drinking Water Source of Hong Kong

    PubMed Central

    Sun, Wei; Xia, Chunyu; Xu, Meiying; Guo, Jun; Sun, Guoping

    2017-01-01

    Water quality ranks as the most vital criterion for rivers serving as drinking water sources, and it changes periodically with the seasons. Such fluctuation is believed to be associated with state shifts of the resident bacterial community. To date, however, seasonal effects on bacterioplankton community patterns in large rivers serving as drinking water sources are still poorly understood. Here we investigated the intra-annual bacterial community structure in the Dongjiang River, a drinking water source of Hong Kong, using high-throughput pyrosequencing in concert with geochemical property measurements during the dry and wet seasons. Our results showed that Proteobacteria, Actinobacteria, and Bacteroidetes were the dominant phyla of the bacterioplankton communities, which varied in composition and distribution from dry to wet seasons and exhibited profound seasonal changes. Actinobacteria, Bacteroidetes, and Cyanobacteria seemed to be more associated with seasonality: the relative abundances of Actinobacteria and Bacteroidetes were significantly higher in the dry season than in the wet season (p < 0.01), while the relative abundance of Cyanobacteria was about 10-fold higher in the wet season than in the dry season. Temperature and NO₃⁻-N concentration were key contributing factors to the observed seasonal variations. These findings help understand the roles of various bacterioplankton and their interactions with the biogeochemical processes in the river ecosystem. PMID:28912759

  4. Consequences of increased temperature and acidification on bacterioplankton community composition during a mesocosm spring bloom in the Baltic Sea.

    PubMed

    Lindh, Markus V; Riemann, Lasse; Baltar, Federico; Romero-Oliva, Claudia; Salomon, Paulo S; Granéli, Edna; Pinhassi, Jarone

    2013-04-01

    Despite the paramount importance of bacteria for biogeochemical cycling of carbon and nutrients, little is known about the potential effects of climate change on these key organisms. The consequences of the projected climate change on bacterioplankton community dynamics were investigated in a Baltic Sea spring phytoplankton bloom mesocosm experiment by increasing temperature by 3°C and decreasing pH by approximately 0.4 units via CO₂ addition in a factorial design. Temperature was the major driver of differences in community composition during the experiment, as shown by denaturing gradient gel electrophoresis (DGGE) of amplified 16S rRNA gene fragments. Several bacterial phylotypes belonging to Betaproteobacteria were predominant at 3°C but were replaced by members of the Bacteroidetes in the 6°C mesocosms. Acidification alone had a limited impact on phylogenetic composition, but when combined with increased temperature, resulted in the proliferation of specific microbial phylotypes. Our results suggest that although temperature is an important driver in structuring bacterioplankton composition, evaluation of the combined effects of temperature and acidification is necessary to fully understand consequences of climate change for marine bacterioplankton, their implications for future spring bloom dynamics, and their role in ecosystem functioning.

  5. Bacterioplankton community responses to key environmental variables in plateau freshwater lake ecosystems: A structural equation modeling and change point analysis.

    PubMed

    Cao, Xiaofeng; Wang, Jie; Liao, Jingqiu; Gao, Zhe; Jiang, Dalin; Sun, Jinhua; Zhao, Lei; Huang, Yi; Luan, Shengji

    2017-02-15

    Elevated environmental pressures negatively affect bacterial community structure. However, little is known about the nonlinear responses of bacterioplankton communities to spatially related environmental variables across multiple plateau lake ecosystems. Here, we used 454 pyrosequencing of 16S rRNA genes to study the associations between bacterial communities and environmental characteristics, as well as potential ecological thresholds that induce shifts in bacterial community structure along key environmental variables, based on hypothesized structural equation models and the SEGMENTED method in 21 plateau lakes. Our results showed that water transparency was the major driving force and that total nitrogen was more significant than total phosphorus in determining the taxon composition of the bacterioplankton community. Significant community threshold estimates for bacterioplankton were observed at 7.36 for pH and 25.6% for the percentage of agricultural area, while the notable change point of the cyanobacterial community structure in response to pH was at 7.74. Furthermore, the findings indicated that increasing nutrient loads can induce a distinct shift in dominance from Proteobacteria to Cyanobacteria, as well as a sharp decrease and adjacent increase when crossing the change point for Actinobacteria and Bacteroidetes along the gradient of agricultural area.

  6. Seasonality Affects the Diversity and Composition of Bacterioplankton Communities in Dongjiang River, a Drinking Water Source of Hong Kong.

    PubMed

    Sun, Wei; Xia, Chunyu; Xu, Meiying; Guo, Jun; Sun, Guoping

    2017-01-01

    Water quality ranks as the most vital criterion for rivers serving as drinking water sources, and it changes periodically with the seasons. Such fluctuation is believed to be associated with state shifts of the resident bacterial community. To date, however, seasonal effects on bacterioplankton community patterns in large rivers serving as drinking water sources are still poorly understood. Here we investigated the intra-annual bacterial community structure in the Dongjiang River, a drinking water source of Hong Kong, using high-throughput pyrosequencing in concert with geochemical property measurements during the dry and wet seasons. Our results showed that Proteobacteria, Actinobacteria, and Bacteroidetes were the dominant phyla of the bacterioplankton communities, which varied in composition and distribution from dry to wet seasons and exhibited profound seasonal changes. Actinobacteria, Bacteroidetes, and Cyanobacteria seemed to be more associated with seasonality: the relative abundances of Actinobacteria and Bacteroidetes were significantly higher in the dry season than in the wet season (p < 0.01), while the relative abundance of Cyanobacteria was about 10-fold higher in the wet season than in the dry season. Temperature and NO₃⁻-N concentration were key contributing factors to the observed seasonal variations. These findings help understand the roles of various bacterioplankton and their interactions with the biogeochemical processes in the river ecosystem.

  7. Differences in inferred genome-wide signals of positive selection during the evolution of Trypanosoma cruzi and Leishmania spp. lineages: A result of disparities in host and tissue infection ranges?

    PubMed

    Flores-López, Carlos A; Machado, Carlos A

    2015-07-01

    Trypanosoma cruzi and Leishmania spp. are kinetoplastids responsible for Chagas disease and Leishmaniasis, neglected tropical diseases for which there are no effective methods of control. These two human pathogens differ widely in the range of mammal species they can infect, their cell/tissue tropism and cell invasion mechanisms. Whether such major biological differences have had any impact on genome-wide patterns of genetic diversification in both pathogens has not been explored. The recent genome sequencing projects of medically important species of Leishmania and T. cruzi lineages provide unique resources for performing comparative evolutionary analyses to address that question. We show that inferred genome-wide signals of positive selection are higher in T. cruzi proteins than in Leishmania spp. proteins. We report significant differences in the fraction of protein-coding genes showing evidence of positive selection in the two groups of parasites, and also report that the intensity of positive selection and the proportion of sites under selection are higher in T. cruzi than in Leishmania spp. The pattern is unlikely to be the result of confounding factors like differences in GC content, average gene length or differences in reproductive mode between the two taxa. We propose that the greater versatility of T. cruzi in its host range, cell tropism and cell invasion mechanisms may explain the observed differences between the two groups of parasites. Genes showing evidence of positive selection within each taxonomic group may be under diversifying selection to evade the immune system and thus, depending on their functions, could represent viable candidates for the development of drugs or vaccines for these neglected human diseases. Copyright © 2015 Elsevier B.V. All rights reserved.

  8. Comparison of inferred relatedness based on multilocus variable-number tandem-repeat analysis and whole genome sequencing of Vibrio cholerae O1.

    PubMed

    Rashid, Mahamud-Ur; Almeida, Mathieu; Azman, Andrew S; Lindsay, Brianna R; Sack, David A; Colwell, Rita R; Huq, Anwar; Morris, J Glenn; Alam, Munirul; Stine, O Colin

    2016-06-01

    Vibrio cholerae causes cholera, a severe diarrheal disease. Understanding the local genetic diversity and transmission of V. cholerae will improve our ability to control cholera. Vibrio cholerae isolates clustered in genetically related groups (clonal complexes, CC) by multilocus variable-number tandem-repeat analysis (MLVA) were compared by whole genome sequencing (WGS). Isolates in CC1 had been isolated from two geographical locations. Isolates in a second genetically distinct group, CC2, were isolated only at one location. Using WGS, CC1 isolates from both locations revealed, on average, 43.8 nucleotide differences, while those strains comprising CC2 averaged 19.7 differences. Between the two MLVA-CCs, strains differed by 106.6 nucleotides on average. Thus, isolates comprising CC1 were more closely related (P < 10⁻⁶) to each other than to isolates in CC2. Within a MLVA-CC, after removing all paralogs, alternative alleles were found in all possible combinations on separate chromosomes, indicative of recombination within the core genome. Including recombination did not affect the distinctiveness of the MLVA-CCs when measured by WGS. We found that WGS generally reflected the same genetic relatedness of isolates as MLVA, indicating that isolates from the same MLVA-CC shared a more recent common ancestor than isolates from the same location that clustered in a distinct MLVA-CC. © FEMS 2016.

  9. Bioinformatic Analyses of Unique (Orphan) Core Genes of the Genus Acidithiobacillus: Functional Inferences and Use As Molecular Probes for Genomic and Metagenomic/Transcriptomic Interrogation

    PubMed Central

    González, Carolina; Lazcano, Marcelo; Valdés, Jorge; Holmes, David S.

    2016-01-01

    Using phylogenomic and gene compositional analyses, five highly conserved gene families have been detected in the core genome of the phylogenetically coherent genus Acidithiobacillus of the class Acidithiobacillia. These core gene families are absent in the closest extant genus Thermithiobacillus tepidarius that subtends the Acidithiobacillus genus and roots the deepest in this class. The predicted proteins encoded by these core gene families are not detected by a BLAST search in the NCBI non-redundant database of more than 90 million proteins using a relaxed cut-off of 1.0e−5. None of the five families has a clear functional prediction. However, bioinformatic scrutiny, using pI prediction, motif/domain searches, cellular location predictions, genomic context analyses, and chromosome topology studies together with previously published transcriptomic and proteomic data, suggests that some may have functions associated with membrane remodeling during cell division perhaps in response to pH stress. Despite the high level of amino acid sequence conservation within each family, there is sufficient nucleotide variation of the respective genes to permit the use of the DNA sequences to distinguish different species of Acidithiobacillus, making them useful additions to the armamentarium of tools for phylogenetic analysis. Since the protein families are unique to the Acidithiobacillus genus, they can also be leveraged as probes to detect the genus in environmental metagenomes and metatranscriptomes, including industrial biomining operations, and acid mine drainage (AMD). PMID:28082953

  10. High-throughput sequencing of complete human mtDNA genomes from the Caucasus and West Asia: high diversity and demographic inferences

    PubMed Central

    Schönberg, Anna; Theunert, Christoph; Li, Mingkun; Stoneking, Mark; Nasidze, Ivan

    2011-01-01

    To investigate the demographic history of human populations from the Caucasus and surrounding regions, we used high-throughput sequencing to generate 147 complete mtDNA genome sequences from random samples of individuals from three groups from the Caucasus (Armenians, Azeri and Georgians), and one group each from Iran and Turkey. Overall diversity is very high, with 144 different sequences that fall into 97 different haplogroups found among the 147 individuals. Bayesian skyline plots (BSPs) of population size change through time show a population expansion around 40–50 kya, followed by a constant population size, and then another expansion around 15–18 kya for the groups from the Caucasus and Iran. The BSP for Turkey differs the most from the others, with an increase from 35 to 50 kya followed by a prolonged period of constant population size, and no indication of a second period of growth. An approximate Bayesian computation approach was used to estimate divergence times between each pair of populations; the oldest divergence times were between Turkey and the other four groups from the South Caucasus and Iran (∼400–600 generations), while the divergence time of the three Caucasus groups from each other was comparable to their divergence time from Iran (average of ∼360 generations). These results illustrate the value of random sampling of complete mtDNA genome sequences that can be obtained with high-throughput sequencing platforms. PMID:21487439

  11. The phylogenetic relationships of insectivores with special reference to the lesser hedgehog tenrec as inferred from the complete sequence of their mitochondrial genome.

    PubMed

    Nikaido, Masato; Cao, Ying; Okada, Norihiro; Hasegawa, Masami

    2003-02-01

    The complete mitochondrial genome of a lesser hedgehog tenrec Echinops telfairi was determined in this study. It is an endemic African insectivore that is found specifically in Madagascar. The tenrec's back is covered with hedgehog-like spines. Unlike other spiny mammals, such as spiny mice, spiny rats, spiny dormice and porcupines, lesser hedgehog tenrecs look amazingly like true hedgehogs (Erinaceidae). However, they are distinguished morphologically from hedgehogs by the absence of a jugal bone. We determined the complete sequence of the mitochondrial genome of a lesser hedgehog tenrec and analyzed the results phylogenetically to determine the relationships between the tenrec and other insectivores (moles, shrews and hedgehogs), as well as the relationships between the tenrec and endemic African mammals, classified as Afrotheria, that have recently been shown by molecular analysis to be close relatives of the tenrec. Our data confirmed the afrotherian status of the tenrec, and no direct relation was recovered between the tenrec and the hedgehog. Comparing our data with those of others, we found that within-species variations in the mitochondrial DNA of lesser hedgehog tenrecs appear to be the largest recognized to date among mammals, apart from orangutans, which might be interesting from the view point of evolutionary history of tenrecs on Madagascar.

  12. Short-term variability of heterotrophic bacterioplankton during upwelling off the NW Iberian margin

    NASA Astrophysics Data System (ADS)

    Barbosa, A. B.; Galvão, H. M.; Mendes, P. A.; Álvarez-Salgado, X. A.; Figueiras, F. G.; Joint, I.

    2001-11-01

    Short-term variability of heterotrophic bacterioplankton was studied in a recently upwelled water mass at the NW Iberian margin (August 1998). Bacterioplankton abundance (BA), biomass (BB), production (BP), and specific production (SBP) were monitored during two Lagrangian drift experiments, one along the shelf-edge, the other off-shelf along an upwelling filament. Other measurements included chlorophyll a (Chl a), primary production (PP), suspended particulate organic carbon (POC) and nitrogen (PON), and dissolved organic carbon (DOC) and nitrogen (DON). Although primary production was significantly higher during the shelf-edge drift experiment, bacterial biomass in the euphotic zone (2.68 to 22.20 μg C l⁻¹) was not significantly different from that in the offshore filament. In contrast, bacterial production (0.13-3.52 μg C l⁻¹ d⁻¹), estimated using an empirically determined ¹⁴C-leucine to carbon conversion factor, and bacterial growth rates (doubling time, DT: 3.9-29.7 d), were significantly higher during the shelf-edge drift (BP: 1.50±0.11 versus 0.50±0.02 μg C l⁻¹ d⁻¹; DT: 6.9±0.3 versus 16.2±0.9 d; p<0.01). Depth-integrated BB over the euphotic zone comprised 15±1% of phytoplankton biomass during shelf-edge drift and 39±4% under the more oligotrophic conditions in the filament. However, daily BP to net primary production ratios were not significantly different in the two regions (6±1% versus 7±1%). BA, BB, BP and SBP were enhanced in the later part of the shelf-edge drift following a pronounced increase in both PP and gross DOC production, suggesting that phytoplankton was a source of substrates for bacteria in recently upwelled waters. This contrasted with the filament drift in which short-term variability of bacterioplankton was much less pronounced and there was no correlation between BP and PP. In both regions, SBP and DOC in the euphotic zone were significantly correlated (p<0.005) indicating some regulatory effect of DOC over bacterial activity

  13. Euphotic zone bacterioplankton sources major sedimentary bacteriohopanepolyols in the Holocene Black Sea

    NASA Astrophysics Data System (ADS)

    Blumenberg, Martin; Seifert, Richard; Kasten, Sabine; Bahlmann, Enno; Michaelis, Walter

    2009-02-01

    Bacteriohopanepolyols (BHPs) are lipid constituents of many bacterial groups. Geohopanoids, the diagenetic products, are therefore ubiquitous in organic matter of the geosphere. To examine the potential of BHPs as environmental markers in marine sediments, we investigated a Holocene sediment core from the Black Sea. The concentrations of BHPs mirror the environmental shift from a well-mixed lake to a stratified marine environment by a strong and gradual increase from low values (~30 μg g⁻¹ TOC) in the oldest sediments to ~170 μg g⁻¹ TOC in sediments representing the onset of a permanently anoxic water body at about 7500 years before present (BP). This increase in BHP concentrations was most likely caused by a strong increase in bacterioplanktonic paleoproductivity brought about by several ingressions of Mediterranean Sea waters at the end of the lacustrine stage (~9500 years BP). δ¹⁵N values coevally decreasing with increasing BHP concentrations may indicate a shift from a phosphorus- to a nitrogen-limited setting supporting growth of N₂-fixing, BHP-producing bacteria. In sediments of the last ~3000 years BHP concentrations have remained relatively stable at about 50 μg g⁻¹ TOC. The distributions of major BHPs did not change significantly during the shift from lacustrine (or oligohaline) to marine conditions. Tetrafunctionalized BHPs prevailed throughout the entire sediment core, with the common bacteriohopanetetrol and 35-aminobacteriohopanetriol and the rare 35-aminobacteriohopenetriol, so far only known from a purple non-sulfur α-proteobacterium, being the main components. Other BHPs specific to cyanobacteria and pelagic methanotrophic bacteria were also found but only in much smaller amounts. Our results demonstrate that BHPs from microorganisms living in deeper biogeochemical zones of marine water columns are underrepresented or even absent in the sediment compared to the BHPs of bacteria present in the euphotic zone. Obviously, the assemblage of

  14. Inhibitory Effect of Solar Radiation on Thymidine and Leucine Incorporation by Freshwater and Marine Bacterioplankton

    PubMed Central

    Sommaruga, R.; Obernosterer, I.; Herndl, G. J.; Psenner, R.

    1997-01-01

    We studied the effect of solar radiation on the incorporation of [³H]thymidine ([³H]TdR) and [¹⁴C]leucine ([¹⁴C]Leu) by bacterioplankton in a high mountain lake and the northern Adriatic Sea. After short-term exposure (3 to 4 h) of natural bacterial assemblages to sunlight just beneath the surface, the rates of incorporation of [³H]TdR and [¹⁴C]Leu were reduced at both sites by up to ~70% compared to those for the dark control. Within the solar UV radiation (290 to 400 nm), the inhibition was caused exclusively by UV-A radiation (320 to 400 nm). However, photosynthetically active radiation (PAR) (400 to 700 nm) contributed almost equally to this effect. Experiments with samples from the high mountain lake showed that at a depth of 2.5 m, the inhibition was caused almost exclusively by UV-A radiation. At a depth of 8.5 m, where chlorophyll a concentrations were higher than those in the upper water column, the rates of incorporation of [³H]TdR were higher in those samples exposed to full sunlight or to UV-A plus PAR than in the dark control. In laboratory experiments with artificial UV light, the incorporation of [³H]TdR and [¹⁴C]Leu by mixed bacterial lake cultures was also inhibited mainly by UV-A. In contrast, in the presence of the green alga Chlamydomonas geitleri at a chlorophyll a concentration of 2.5 μg liter⁻¹, inhibition by UV radiation was significantly reduced. These results suggest that there may be complex interactions among UV radiation, heterotrophic bacteria, and phytoplankton and their release of extracellular organic carbon. Our findings indicate that the wavelengths which caused the strongest inhibition of TdR and Leu incorporation by bacterioplankton in the water column were in the UV-A range. However, it may be premature to extrapolate this effect to estimates of bacterial production before more precise information on how solar radiation affects the transport of TdR and Leu into the cell

  15. The bioinvasion of Guam: inferring geographic origin, pace, pattern and process of an invasive lizard (Carlia) in the Pacific using multi-locus genomic data

    USGS Publications Warehouse

    Austin, C.C.; Rittmeyer, E.N.; Oliver, L.A.; Andermann, J.O.; Zug, G.R.; Rodda, G.H.; Jackson, N.D.

    2011-01-01

    Invasive species often have dramatic negative effects that lead to the deterioration and loss of biodiversity frequently coupled with the burden of expensive biocontrol programs and subversion of socioeconomic stability. The fauna and flora of oceanic islands are particularly susceptible to invasive species and the increase of global movements of humans and their products since WW II has caused numerous anthropogenic translocations and increased the ills of human-mediated invasions. We use a multi-locus genomic dat