Science.gov

Sample records for bacterioplankton genomes inferred

  1. Comparative analysis of deep-sea bacterioplankton OMICS revealed the occurrence of habitat-specific genomic attributes.

    PubMed

    Smedile, Francesco; Messina, Enzo; La Cono, Violetta; Yakimov, Michail M

    2014-10-01

    Bathyal aphotic ocean represents the largest biotope on our planet, which sustains highly diverse but low-density microbial communities, with yet untapped genomic attributes, potentially useful for discovery of new biomolecules, industrial enzymes and pathways. In the last two decades, culture-independent approaches of high-throughput sequencing have provided new insights into structure and function of marine bacterioplankton, leading to unprecedented opportunities to accurately characterize microbial communities and their interactions with the environments. In the present review we focused on the analysis of relatively few deep-sea OMICS studies, completed thus far, to find the specific genomic patterns determining the lifeway and adaptation mechanisms of prokaryotes thriving in the dark deep ocean below the depth of 1000 m. Phylogenomic and omic studies provided clear evidence that the bathyal microbial communities are distinct from the epipelagic counterparts and, along with generally larger genomes, possess their own habitat-specific genomic attributes. The high abundance in the deep ocean OMICS of the systems for environmental sensing, signal transduction and metabolic versatility as compared to the epipelagic counterparts is thought to enable the deep-sea bacterioplankton to rapidly adapt to changing environmental conditions associated with resource scarcity and high diversity of energy and carbon substrates in the bathyal biotopes. Together with a versatile heterotrophy, mixotrophy and anaplerosis are thought to enable the deep-sea bacterioplankton to cope with these environmental conditions.

  2. Single-cell genomics-based analysis of virus–host interactions in marine surface bacterioplankton

    DOE PAGES

    Labonté, Jessica M.; Swan, Brandon K.; Poulos, Bonnie; ...

    2015-04-07

    Viral infections dynamically alter the composition and metabolic potential of marine microbial communities and the evolutionary trajectories of host populations with resulting feedback on biogeochemical cycles. It is quite possible that all microbial populations in the ocean are impacted by viral infections. Our knowledge of virus–host relationships, however, has been limited to a minute fraction of cultivated host groups. Here, we utilized single-cell sequencing to obtain genomic blueprints of viruses inside or attached to individual bacterial and archaeal cells captured in their native environment, circumventing the need for host and virus cultivation. Furthermore, a combination of comparative genomics, metagenomic fragment recruitment, sequence anomalies and irregularities in sequence coverage depth and genome recovery were utilized to detect viruses and to decipher modes of virus–host interactions. Members of all three tailed phage families were identified in 20 out of 58 phylogenetically and geographically diverse single amplified genomes (SAGs) of marine bacteria and archaea. At least four phage–host interactions had the characteristics of late lytic infections, all of which were found in metabolically active cells. One virus had genetic potential for lysogeny. Our findings include first known viruses of Thaumarchaeota, Marinimicrobia, Verrucomicrobia and Gammaproteobacteria clusters SAR86 and SAR92. Viruses were also found in SAGs of Alphaproteobacteria and Bacteroidetes. A high fragment recruitment of viral metagenomic reads confirmed that most of the SAG-associated viruses are abundant in the ocean. This study demonstrates that single-cell genomics, in conjunction with sequence-based computational tools, enable in situ, cultivation-independent insights into host–virus interactions in complex microbial communities.

  3. Single-cell genomics-based analysis of virus–host interactions in marine surface bacterioplankton

    SciTech Connect

    Labonté, Jessica M.; Swan, Brandon K.; Poulos, Bonnie; Luo, Haiwei; Koren, Sergey; Hallam, Steven J.; Sullivan, Matthew B.; Woyke, Tanja; Eric Wommack, K.; Stepanauskas, Ramunas

    2015-04-07

    Viral infections dynamically alter the composition and metabolic potential of marine microbial communities and the evolutionary trajectories of host populations with resulting feedback on biogeochemical cycles. It is quite possible that all microbial populations in the ocean are impacted by viral infections. Our knowledge of virus–host relationships, however, has been limited to a minute fraction of cultivated host groups. Here, we utilized single-cell sequencing to obtain genomic blueprints of viruses inside or attached to individual bacterial and archaeal cells captured in their native environment, circumventing the need for host and virus cultivation. Furthermore, a combination of comparative genomics, metagenomic fragment recruitment, sequence anomalies and irregularities in sequence coverage depth and genome recovery were utilized to detect viruses and to decipher modes of virus–host interactions. Members of all three tailed phage families were identified in 20 out of 58 phylogenetically and geographically diverse single amplified genomes (SAGs) of marine bacteria and archaea. At least four phage–host interactions had the characteristics of late lytic infections, all of which were found in metabolically active cells. One virus had genetic potential for lysogeny. Our findings include first known viruses of Thaumarchaeota, Marinimicrobia, Verrucomicrobia and Gammaproteobacteria clusters SAR86 and SAR92. Viruses were also found in SAGs of Alphaproteobacteria and Bacteroidetes. A high fragment recruitment of viral metagenomic reads confirmed that most of the SAG-associated viruses are abundant in the ocean. This study demonstrates that single-cell genomics, in conjunction with sequence-based computational tools, enable in situ, cultivation-independent insights into host–virus interactions in complex microbial communities.

  4. Freshwater bacterial lifestyles inferred from comparative genomics.

    PubMed

    Livermore, Joshua A; Emrich, Scott J; Tan, John; Jones, Stuart E

    2014-03-01

    While micro-organisms actively mediate and participate in freshwater ecosystem services, we know little about freshwater microbial genetic diversity. Genome sequences are available for many bacteria from the human microbiome and the ocean (over 800 and 200, respectively), but only two freshwater genomes are currently available: the streamlined genomes of Polynucleobacter necessarius ssp. asymbioticus and the Actinobacterium AcI-B1. Here, we sequenced and analysed draft genomes of eight phylogenetically diverse freshwater bacteria exhibiting a range of lifestyle characteristics. Comparative genomics of these bacteria reveals putative freshwater bacterial lifestyles based on differences in predicted growth rate, capability to respond to environmental stimuli and diversity of useable carbon substrates. Our conceptual model based on these genomic characteristics provides a foundation on which further ecophysiological and genomic studies can be built. In addition, these genomes greatly expand the diversity of existing genomic context for future studies on the ecology and genetics of freshwater bacteria.

  5. Inference of self-regulated transcriptional networks by comparative genomics.

    PubMed

    Cornish, Joseph P; Matthews, Fialelei; Thomas, Julien R; Erill, Ivan

    2012-01-01

    The assumption of basic properties, like self-regulation, in simple transcriptional regulatory networks can be exploited to infer regulatory motifs from the growing amounts of genomic and meta-genomic data. These motifs can in principle be used to elucidate the nature and scope of transcriptional networks through comparative genomics. Here we assess the feasibility of this approach using the SOS regulatory network of Gram-positive bacteria as a test case. Using experimentally validated data, we show that the known regulatory motif can be inferred through the assumption of self-regulation. Furthermore, the inferred motif provides a more robust search pattern for comparative genomics than the experimental motifs defined in reference organisms. We take advantage of this robustness to generate a functional map of the SOS response in Gram-positive bacteria. Our results reveal definite differences in the composition of the LexA regulon between Firmicutes and Actinobacteria, and confirm that regulation of cell-division inhibition is a widespread characteristic of this network among Gram-positive bacteria.

  6. SOP for pathway inference in Integrated Microbial Genomes (IMG).

    PubMed

    Anderson, Iain; Chen, Amy; Markowitz, Victor; Kyrpides, Nikos; Ivanova, Natalia

    2011-12-31

    One of the most important aspects of genomic analysis is the prediction of which pathways, both metabolic and non-metabolic, are present in an organism. In IMG, this is carried out by the assignment of IMG terms, which are organized into IMG pathways. Based on manual and automatic assignment of IMG terms, the presence or absence of IMG pathways is automatically inferred. The three categories of pathway assertion are asserted (likely present), not asserted (likely absent), and unknown. In the unknown category, at least one term necessary for the pathway is missing, but an ortholog in another organism has the corresponding term assigned to it. Automatic pathway inference is an important initial step in genome analysis.
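
The three assertion categories described above can be sketched as a small set-based rule. This is only an illustration under stated assumptions, not the IMG implementation: the function name, the flat set of required terms per pathway, and the rule that a pathway is "unknown" only when every missing term has ortholog support are all hypothetical simplifications.

```python
from enum import Enum


class Assertion(Enum):
    ASSERTED = "asserted (likely present)"
    NOT_ASSERTED = "not asserted (likely absent)"
    UNKNOWN = "unknown"


def infer_pathway(required_terms, genome_terms, ortholog_terms):
    """Classify one pathway for one genome (hypothetical helper).

    required_terms: terms the pathway needs (assumed to be a flat set).
    genome_terms:   terms assigned to genes of this genome.
    ortholog_terms: terms assigned to orthologs of this genome's genes
                    in other organisms.
    """
    missing = set(required_terms) - set(genome_terms)
    if not missing:
        # Every required term is assigned in this genome.
        return Assertion.ASSERTED
    if missing <= set(ortholog_terms):
        # Assumption: all missing terms are backed by an ortholog elsewhere.
        return Assertion.UNKNOWN
    return Assertion.NOT_ASSERTED


if __name__ == "__main__":
    pathway = {"term_A", "term_B", "term_C"}  # placeholder term names
    print(infer_pathway(pathway, {"term_A", "term_B", "term_C"}, set()))
    print(infer_pathway(pathway, {"term_A", "term_B"}, {"term_C"}))
    print(infer_pathway(pathway, {"term_A"}, {"term_C"}))
```

A real pathway definition may allow alternative terms for a step; the flat set used here ignores that.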

  7. Inferring Heterozygosity from Ancient and Low Coverage Genomes

    PubMed Central

    Kousathanas, Athanasios; Leuenberger, Christoph; Link, Vivian; Sell, Christian; Burger, Joachim; Wegmann, Daniel

    2017-01-01

    While genetic diversity can be quantified accurately from high coverage sequencing data, it is often desirable to obtain such estimates from data with low coverage, either to save costs or because of low DNA quality, as is observed for ancient samples. Here, we introduce a method to accurately infer heterozygosity probabilistically from sequences with average coverage <1× of a single individual. The method relaxes the infinite sites assumption of previous methods, does not require a reference sequence, except for the initial alignment of the sequencing data, and takes into account both variable sequencing errors and potential postmortem damage. It is thus also applicable to nonmodel organisms and ancient genomes. Since error rates as reported by sequencing machines are generally distorted and require recalibration, we also introduce a method to accurately infer recalibration parameters in the presence of postmortem damage. This method does not require knowledge about the underlying genome sequence, but instead works with haploid data (e.g., from the X-chromosome from mammalian males) and integrates over the unknown genotypes. Using extensive simulations we show that a few megabasepairs of haploid data are sufficient for accurate recalibration, even at average coverages as low as 1×. At similar coverages, our method also produces very accurate estimates of heterozygosity down to 10⁻⁴ within windows of about 1 Mbp. We further illustrate the usefulness of our approach by inferring genome-wide patterns of diversity for several ancient human samples, and we found that 3000–5000-year-old samples showed diversity patterns comparable to those of modern humans. In contrast, two European hunter-gatherer samples exhibited not only considerably lower levels of diversity than modern samples, but also highly distinct distributions of diversity along their genomes. Interestingly, these distributions were also very different between the two samples, supporting earlier

  8. Genome-Wide Inference of Ancestral Recombination Graphs

    PubMed Central

    Rasmussen, Matthew D.; Hubisz, Melissa J.; Gronau, Ilan; Siepel, Adam

    2014-01-01

    The complex correlation structure of a collection of orthologous DNA sequences is uniquely captured by the “ancestral recombination graph” (ARG), a complete record of coalescence and recombination events in the history of the sample. However, existing methods for ARG inference are computationally intensive, highly approximate, or limited to small numbers of sequences, and, as a consequence, explicit ARG inference is rarely used in applied population genomics. Here, we introduce a new algorithm for ARG inference that is efficient enough to apply to dozens of complete mammalian genomes. The key idea of our approach is to sample an ARG of n chromosomes conditional on an ARG of n − 1 chromosomes, an operation we call “threading.” Using techniques based on hidden Markov models, we can perform this threading operation exactly, up to the assumptions of the sequentially Markov coalescent and a discretization of time. An extension allows for threading of subtrees instead of individual sequences. Repeated application of these threading operations results in highly efficient Markov chain Monte Carlo samplers for ARGs. We have implemented these methods in a computer program called ARGweaver. Experiments with simulated data indicate that ARGweaver converges rapidly to the posterior distribution over ARGs and is effective in recovering various features of the ARG for dozens of sequences generated under realistic parameters for human populations. In applications of ARGweaver to 54 human genome sequences from Complete Genomics, we find clear signatures of natural selection, including regions of unusually ancient ancestry associated with balancing selection and reductions in allele age in sites under directional selection. The patterns we observe near protein-coding genes are consistent with a primary influence from background selection rather than hitchhiking, although we cannot rule out a contribution from recurrent selective sweeps. PMID:24831947

  9. Population Genetic Inference from Personal Genome Data: Impact of Ancestry and Admixture on Human Genomic Variation

    PubMed Central

    Kidd, Jeffrey M.; Gravel, Simon; Byrnes, Jake; Moreno-Estrada, Andres; Musharoff, Shaila; Bryc, Katarzyna; Degenhardt, Jeremiah D.; Brisbin, Abra; Sheth, Vrunda; Chen, Rong; McLaughlin, Stephen F.; Peckham, Heather E.; Omberg, Larsson; Bormann Chung, Christina A.; Stanley, Sarah; Pearlstein, Kevin; Levandowsky, Elizabeth; Acevedo-Acevedo, Suehelay; Auton, Adam; Keinan, Alon; Acuña-Alonzo, Victor; Barquera-Lozano, Rodrigo; Canizales-Quinteros, Samuel; Eng, Celeste; Burchard, Esteban G.; Russell, Archie; Reynolds, Andy; Clark, Andrew G.; Reese, Martin G.; Lincoln, Stephen E.; Butte, Atul J.; De La Vega, Francisco M.; Bustamante, Carlos D.

    2012-01-01

    Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas—70% of the European ancestry in today’s African Americans dates back to European gene flow happening only 7–8 generations ago. PMID:23040495
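
As a quick consistency check on the figures quoted in this abstract, the reported 1.75-fold range appears to follow directly from the two heterozygosity extremes it gives (one site per kilobase versus 0.57 sites per kilobase). This is a reading of the abstract's numbers, not a calculation taken from the paper:

```latex
\frac{1.00\ \text{heterozygous sites/kb}}{0.57\ \text{heterozygous sites/kb}} \approx 1.75
```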

  10. Genomic inference of the metabolism of cosmopolitan subsurface Archaea, Hadesarchaea.

    PubMed

    Baker, Brett J; Saw, Jimmy H; Lind, Anders E; Lazar, Cassandre Sara; Hinrichs, Kai-Uwe; Teske, Andreas P; Ettema, Thijs J G

    2016-02-15

    The subsurface biosphere is largely unexplored and contains a broad diversity of uncultured microbes(1). Despite being one of the few prokaryotic lineages that is cosmopolitan in both the terrestrial and marine subsurface(2-4), the physiological and ecological roles of SAGMEG (South-African Gold Mine Miscellaneous Euryarchaeal Group) Archaea are unknown. Here, we report the metabolic capabilities of this enigmatic group as inferred from genomic reconstructions. Four high-quality (63-90% complete) genomes were obtained from White Oak River estuary and Yellowstone National Park hot spring sediment metagenomes. Phylogenomic analyses place SAGMEG Archaea as a deeply rooting sister clade of the Thermococci, leading us to propose the name Hadesarchaea for this new Archaeal class. With an estimated genome size of around 1.5 Mbp, the genomes of Hadesarchaea are distinctly streamlined, yet metabolically versatile. They share several physiological mechanisms with strict anaerobic Euryarchaeota. Several metabolic characteristics make them successful in the subsurface, including genes involved in CO and H2 oxidation (or H2 production), with potential coupling to nitrite reduction to ammonia (DNRA). This first glimpse into the metabolic capabilities of these cosmopolitan Archaea suggests they are mediating key geochemical processes and are specialized for survival in the subsurface biosphere.

  11. The aggregate site frequency spectrum for comparative population genomic inference.

    PubMed

    Xue, Alexander T; Hickerson, Michael J

    2015-12-01

    Understanding how assemblages of species responded to past climate change is a central goal of comparative phylogeography and comparative population genomics, an endeavour that has increasing potential to integrate with community ecology. New sequencing technology now provides the potential to perform complex demographic inference at unprecedented resolution across assemblages of nonmodel species. To this end, we introduce the aggregate site frequency spectrum (aSFS), an expansion of the site frequency spectrum to use single nucleotide polymorphism (SNP) data sets collected from multiple, co-distributed species for assemblage-level demographic inference. We describe how the aSFS is constructed over an arbitrary number of independent population samples and then demonstrate how the aSFS can differentiate various multispecies demographic histories under a wide range of sampling configurations while allowing effective population sizes and expansion magnitudes to vary independently. We subsequently couple the aSFS with a hierarchical approximate Bayesian computation (hABC) framework to estimate degree of temporal synchronicity in expansion times across taxa, including an empirical demonstration with a data set consisting of five populations of the threespine stickleback (Gasterosteus aculeatus). Corroborating what is generally understood about the recent postglacial origins of these populations, the joint aSFS/hABC analysis strongly suggests that the stickleback data are most consistent with synchronous expansion after the Last Glacial Maximum (posterior probability = 0.99). The aSFS will have general application for multilevel statistical frameworks to test models involving assemblages and/or communities, and as large-scale SNP data from nonmodel species become routine, the aSFS expands the potential for powerful next-generation comparative population genomic inference.

  12. Catchment-scale biogeography of riverine bacterioplankton.

    PubMed

    Read, Daniel S; Gweon, Hyun S; Bowes, Michael J; Newbold, Lindsay K; Field, Dawn; Bailey, Mark J; Griffiths, Robert I

    2015-02-01

    Lotic ecosystems such as rivers and streams are unique in that they represent a continuum of both space and time during the transition from headwaters to the river mouth. As microbes have very different controls over their ecology, distribution and dispersion compared with macrobiota, we wished to explore biogeographical patterns within a river catchment and uncover the major drivers structuring bacterioplankton communities. Water samples collected across the River Thames Basin, UK, covering the transition from headwater tributaries to the lower reaches of the main river channel were characterised using 16S rRNA gene pyrosequencing. This approach revealed an ecological succession in the bacterial community composition along the river continuum, moving from a community dominated by Bacteroidetes in the headwaters to Actinobacteria-dominated downstream. Location of the sampling point in the river network (measured as the cumulative water channel distance upstream) was found to be the most predictive spatial feature; inferring that ecological processes pertaining to temporal community succession are of prime importance in driving the assemblages of riverine bacterioplankton communities. A decrease in bacterial activity rates and an increase in the abundance of low nucleic acid bacteria relative to high nucleic acid bacteria were found to correspond with these downstream changes in community structure, suggesting corresponding functional changes. Our findings show that bacterial communities across the Thames basin exhibit an ecological succession along the river continuum, and that this is primarily driven by water residence time rather than the physico-chemical status of the river.

  13. Catchment-scale biogeography of riverine bacterioplankton

    PubMed Central

    Read, Daniel S; Gweon, Hyun S; Bowes, Michael J; Newbold, Lindsay K; Field, Dawn; Bailey, Mark J; Griffiths, Robert I

    2015-01-01

    Lotic ecosystems such as rivers and streams are unique in that they represent a continuum of both space and time during the transition from headwaters to the river mouth. As microbes have very different controls over their ecology, distribution and dispersion compared with macrobiota, we wished to explore biogeographical patterns within a river catchment and uncover the major drivers structuring bacterioplankton communities. Water samples collected across the River Thames Basin, UK, covering the transition from headwater tributaries to the lower reaches of the main river channel were characterised using 16S rRNA gene pyrosequencing. This approach revealed an ecological succession in the bacterial community composition along the river continuum, moving from a community dominated by Bacteroidetes in the headwaters to Actinobacteria-dominated downstream. Location of the sampling point in the river network (measured as the cumulative water channel distance upstream) was found to be the most predictive spatial feature; inferring that ecological processes pertaining to temporal community succession are of prime importance in driving the assemblages of riverine bacterioplankton communities. A decrease in bacterial activity rates and an increase in the abundance of low nucleic acid bacteria relative to high nucleic acid bacteria were found to correspond with these downstream changes in community structure, suggesting corresponding functional changes. Our findings show that bacterial communities across the Thames basin exhibit an ecological succession along the river continuum, and that this is primarily driven by water residence time rather than the physico-chemical status of the river. PMID:25238398

  14. Phylogeny Inference of Closely Related Bacterial Genomes: Combining the Features of Both Overlapping Genes and Collinear Genomic Regions

    PubMed Central

    Zhang, Yan-Cong; Lin, Kui

    2015-01-01

    Overlapping genes (OGs) represent one type of widespread genomic feature in bacterial genomes and have been used as rare genomic markers in phylogeny inference of closely related bacterial species. However, the inference may experience a decrease in performance for phylogenomic analysis of too closely or too distantly related genomes. Another drawback of OGs as phylogenetic markers is that they usually take little account of the effects of genomic rearrangement on the similarity estimation, such as intra-chromosome/genome translocations, horizontal gene transfer, and gene losses. To explore such effects on the accuracy of phylogeny reconstruction, we combine phylogenetic signals of OGs with collinear genomic regions, here called locally collinear blocks (LCBs). By putting these together, we refine our previous metric of pairwise similarity between two closely related bacterial genomes. As a case study, we used this new method to reconstruct the phylogenies of 88 Enterobacteriale genomes of the class Gammaproteobacteria. Our results demonstrated that the topological accuracy of the inferred phylogeny was improved when both OGs and LCBs were simultaneously considered, suggesting that combining these two phylogenetic markers may reduce, to some extent, the influence of gene loss on phylogeny inference. Such phylogenomic studies, we believe, will help us to explore a more effective approach to increasing the robustness of phylogeny reconstruction of closely related bacterial organisms. PMID:26715828
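
To make the notion of an overlapping gene pair concrete, the following minimal sketch finds genes whose coordinate ranges overlap on a single replicon. It is not the authors' similarity metric and does not handle LCBs; the `Gene` record, its 1-based inclusive coordinates, and the function name are assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive (assumed convention)
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-", carried along but not used here


def overlapping_pairs(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (gene_a, gene_b, overlap_length_bp) for every overlapping pair."""
    ordered = sorted(genes, key=lambda g: (g.start, g.end))
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start > a.end:
                # Genes are sorted by start, so no later gene can overlap a.
                break
            overlap = min(a.end, b.end) - b.start + 1
            pairs.append((a.name, b.name, overlap))
    return pairs


if __name__ == "__main__":
    demo = [
        Gene("geneA", 1, 100, "+"),
        Gene("geneB", 97, 200, "+"),
        Gene("geneC", 300, 400, "-"),
    ]
    print(overlapping_pairs(demo))
```

Circular chromosomes and genes that wrap around the origin are ignored in this sketch.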

  15. Coherent dynamics and association networks among lake bacterioplankton taxa

    PubMed Central

    Eiler, Alexander; Heinrich, Friederike; Bertilsson, Stefan

    2012-01-01

    Bacteria have important roles in freshwater food webs and in the cycling of elements in the ecosystem. Yet specific ecological features of individual phylogenetic groups and interactions among these are largely unknown. We used 454 pyrosequencing of 16S rRNA genes to study associations of different bacterioplankton groups to environmental characteristics and their co-occurrence patterns over an annual cycle in a dimictic lake. Clear seasonal succession of the bacterioplankton community was observed. After binning of sequences into previously described and highly resolved phylogenetic groups (tribes), their temporal dynamics revealed extensive synchrony and associations with seasonal events such as ice coverage, ice-off, mixing and phytoplankton blooms. Coupling between closely and distantly related tribes was resolved by time-dependent rank correlations, suggesting ecological coherence that was often dependent on taxonomic relatedness. Association networks with the abundant freshwater Actinobacteria and Proteobacteria in focus revealed complex interdependencies within bacterioplankton communities and contrasting linkages to environmental conditions. Accordingly, unique ecological features can be inferred for each tribe and reveal the natural history of abundant cultured and uncultured freshwater bacteria. PMID:21881616
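
One simple way to picture a time-dependent rank correlation between two taxa is a Spearman correlation computed with a time lag. The sketch below assumes that reading; the abstract does not specify the exact procedure, so the lag convention and the function names are hypothetical rather than the authors' method.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def lagged_spearman(a, b, lag=0):
    """Spearman correlation between a[t] and b[t + lag] (assumed convention)."""
    if lag > 0:
        a, b = a[:-lag], b[lag:]
    elif lag < 0:
        a, b = a[-lag:], b[:lag]
    return pearson(ranks(a), ranks(b))


if __name__ == "__main__":
    tribe_a = [1, 3, 2, 5, 4]  # made-up abundances over five sampling dates
    tribe_b = [0, 1, 3, 2, 5]  # follows tribe_a one sampling date later
    print(lagged_spearman(tribe_a, tribe_b, lag=0))
    print(lagged_spearman(tribe_a, tribe_b, lag=1))
```

The abundance values are invented for the demonstration and carry no biological meaning.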

  16. Robust and scalable inference of population history from hundreds of unphased whole genomes.

    PubMed

    Terhorst, Jonathan; Kamm, John A; Song, Yun S

    2017-02-01

    It has recently been demonstrated that inference methods based on genealogical processes with recombination can uncover past population history in unprecedented detail. However, these methods scale poorly with sample size, limiting resolution in the recent past, and they require phased genomes, which contain switch errors that can catastrophically distort the inferred history. Here we present SMC++, a new statistical tool capable of analyzing orders of magnitude more samples than existing methods while requiring only unphased genomes (its results are independent of phasing). SMC++ can jointly infer population size histories and split times in diverged populations, and it employs a novel spline regularization scheme that greatly reduces estimation error. We apply SMC++ to analyze sequence data from over a thousand human genomes in Africa and Eurasia, hundreds of genomes from a Drosophila melanogaster population in Africa, and tens of genomes from zebra finch and long-tailed finch populations in Australia.

  17. Genomic inferences from Afrotheria and the evolution of elephants.

    PubMed

    Roca, Alfred L; O'Brien, Stephen J

    2005-12-01

    Recent genetic studies have established that African forest and savanna elephants are distinct species with dissociated cytonuclear genomic patterns, and have identified Asian elephants from Borneo and Sumatra as conservation priorities. Representative of Afrotheria, a superordinal clade encompassing six eutherian orders, the African savanna elephant was among the first mammals chosen for whole-genome sequencing to provide a comparative understanding of the human genome. Elephants have large and complex brains and display advanced levels of social structure, communication, learning and intelligence. The elephant genome sequence might prove useful for comparative genomic studies of these advanced traits, which have appeared independently in only three mammalian orders: primates, cetaceans and proboscideans.

  18. Systems Biology and Ecology of Streamlined Bacterioplankton

    NASA Astrophysics Data System (ADS)

    Giovannoni, S. J.

    2014-12-01

    The salient feature of streamlined cells is their small genome size, but "streamlining" refers more generally to selection that favors minimization of cell size and complexity. The essence of streamlining theory is that selection is most efficient in organisms that have large effective population sizes, and, in nutrient-limited systems, favors cell architecture that minimizes resources required for replication. Regardless of the cause of genome reduction, lost coding potential eventually dictates loss of function, raising the questions, what genome features are expendable, and how do cells become highly successful with a minimal genomic repertoire? One consequence of reductive evolution in streamlined organisms is atypical patterns of prototrophy, for example the recent discovery of a requirement for the thiamin precursor 4-amino-5-hydroxymethyl-2-methylpyrimidine in some plankton taxa. Examples such as this fit within the framework of the Black Queen Hypothesis, which describes genome reduction that results in reliance on community goods and increased community connectivity. Other examples of genome reduction include losses of regulatory functions, or replacement with simpler regulatory systems, and increased metabolic integration. In one such case, in the order Pelagibacterales, the PII system for regulating responses to N limitation has been replaced with a simpler system composed of fewer genes. Both the absence of common regulatory systems and atypical patterns of prototrophy have been linked to difficulty in culturing Pelagibacterales, lending credibility to the idea that streamlining might broadly explain the phenomenon of the uncultured microbial majority. The success of streamlined osmotrophic bacterioplankton suggests that they successfully compete for labile organic matter and capture a large share of this resource, but an alternative theory postulates they are not good resource competitors and instead prosper by avoiding predation. The answers to these

  19. Inferring Ancestral Recombination Graphs from Bacterial Genomic Data

    PubMed Central

    Vaughan, Timothy G.; Welch, David; Drummond, Alexei J.; Biggs, Patrick J.; George, Tessy; French, Nigel P.

    2017-01-01

    Homologous recombination is a central feature of bacterial evolution, yet it confounds traditional phylogenetic methods. While a number of methods specific to bacterial evolution have been developed, none of these permit joint inference of a bacterial recombination graph and associated parameters. In this article, we present a new method which addresses this shortcoming. Our method uses a novel Markov chain Monte Carlo algorithm to perform phylogenetic inference under the ClonalOrigin model. We demonstrate the utility of our method by applying it to ribosomal multilocus sequence typing data sequenced from pathogenic and nonpathogenic Escherichia coli serotype O157 and O26 isolates collected in rural New Zealand. The method is implemented as an open source BEAST 2 package, Bacter, which is available via the project web page at http://tgvaughan.github.io/bacter. PMID:28007885

  20. OMA 2011: orthology inference among 1000 complete genomes.

    PubMed

    Altenhoff, Adrian M; Schneider, Adrian; Gonnet, Gaston H; Dessimoz, Christophe

    2011-01-01

    OMA (Orthologous MAtrix) is a database that identifies orthologs among publicly available, complete genomes. Initiated in 2004, the project is at its 11th release. It now includes 1000 genomes, making it one of the largest resources of its kind. Here, we describe recent developments in terms of species covered; the algorithmic pipeline--in particular regarding the treatment of alternative splicing, and new features of the web (OMA Browser) and programming interface (SOAP API). In the second part, we review the various representations provided by OMA and their typical applications. The database is publicly accessible at http://omabrowser.org.

  1. Using Genetic Distance to Infer the Accuracy of Genomic Prediction

    PubMed Central

    Scutari, Marco; Mackay, Ian

    2016-01-01

    The prediction of phenotypic traits using high-density genomic data has many applications such as the selection of plants and animals of commercial interest; and it is expected to play an increasing role in medical diagnostics. Statistical models used for this task are usually tested using cross-validation, which implicitly assumes that new individuals (whose phenotypes we would like to predict) originate from the same population the genomic prediction model is trained on. In this paper we propose an approach based on clustering and resampling to investigate the effect of increasing genetic distance between training and target populations when predicting quantitative traits. This is important for plant and animal genetics, where genomic selection programs rely on the precision of predictions in future rounds of breeding. Therefore, estimating how quickly predictive accuracy decays is important in deciding which training population to use and how often the model has to be recalibrated. We find that the correlation between true and predicted values decays approximately linearly with respect to either FST or mean kinship between the training and the target populations. We illustrate this relationship using simulations and a collection of data sets from mice, wheat and human genetics. PMID:27589268

  2. The History of Slavs Inferred from Complete Mitochondrial Genome Sequences

    PubMed Central

    Mielnik-Sikorska, Marta; Daca, Patrycja; Malyarchuk, Boris; Derenko, Miroslava; Skonieczna, Katarzyna; Perkova, Maria; Dobosz, Tadeusz; Grzybowski, Tomasz

    2013-01-01

    To shed more light on the processes leading to crystallization of a Slavic identity, we investigated variability of complete mitochondrial genomes belonging to haplogroups H5 and H6 (63 mtDNA genomes) from the populations of Eastern and Western Slavs, including new samples of Poles, Ukrainians and Czechs presented here. Molecular dating implies formation of H5 approximately 11.5–16 thousand years ago (kya) in the areas of southern Europe. Within ancient haplogroup H6, dated at around 15–28 kya, there is a subhaplogroup H6c, which probably survived the last glaciation in Europe and has undergone expansion only 3–4 kya, together with the ancestors of some European groups, including the Slavs, because H6c has been detected in Czechs, Poles and Slovaks. Detailed analysis of complete mtDNAs allowed us to identify a number of lineages that seem specific for Central and Eastern Europe (H5a1f, H5a2, H5a1r, H5a1s, H5b4, H5e1a, H5u1, some subbranches of H5a1a and H6a1a9). Some of them could possibly be traced back to at least ∼4 kya, which indicates that some of the ancestors of today's Slavs (Poles, Czechs, Slovaks, Ukrainians and Russians) inhabited areas of Central and Eastern Europe much earlier than it was estimated on the basis of archaeological and historical data. We also sequenced entire mitochondrial genomes of several non-European lineages (A, C, D, G, L) found in contemporary populations of Poland and Ukraine. The analysis of these haplogroups confirms the presence of Siberian (C5c1, A8a1) and Ashkenazi-specific (L2a1l2a) mtDNA lineages in Slavic populations. Moreover, we were able to pinpoint some lineages which could possibly reflect the relatively recent contacts of Slavs with nomadic Altaic peoples (C4a1a, G2a, D5a2a1a1). PMID:23342138

  3. A molecular phylogeny of Hemiptera inferred from mitochondrial genome sequences.

    PubMed

    Song, Nan; Liang, Ai-Ping; Bu, Cui-Ping

    2012-01-01

    Classically, Hemiptera is comprised of two suborders: Homoptera and Heteroptera. Homoptera includes Cicadomorpha, Fulgoromorpha and Sternorrhyncha. However, according to previous molecular phylogenetic studies based on 18S rDNA, Fulgoromorpha has a closer relationship to Heteroptera than to other hemipterans, leaving Homoptera as paraphyletic. Therefore, the position of Fulgoromorpha is important for studying phylogenetic structure of Hemiptera. We inferred the evolutionary affiliations of twenty-five superfamilies of Hemiptera using mitochondrial protein-coding genes and rRNAs. We sequenced three mitogenomes, from Pyrops candelaria, Lycorma delicatula and Ricania marginalis, representing two additional families in Fulgoromorpha. Pyrops and Lycorma are representatives of an additional major family Fulgoridae in Fulgoromorpha, whereas Ricania is a second representative of the highly derived clade Ricaniidae. The organization and size of these mitogenomes are similar to those of the sequenced fulgoroid species. Our consensus phylogeny of Hemiptera largely supported the relationships (((Fulgoromorpha,Sternorrhyncha),Cicadomorpha),Heteroptera), and thus supported the classic phylogeny of Hemiptera. Selection of optimal evolutionary models (exclusion and inclusion of two rRNA genes or of third codon positions of protein-coding genes) demonstrated that rapidly evolving and saturated sites should be removed from the analyses.

  4. Inference of gorilla demographic and selective history from whole-genome sequence data.

    PubMed

    McManus, Kimberly F; Kelley, Joanna L; Song, Shiya; Veeramah, Krishna R; Woerner, August E; Stevison, Laurie S; Ryder, Oliver A; Great Ape Genome Project; Kidd, Jeffrey M; Wall, Jeffrey D; Bustamante, Carlos D; Hammer, Michael F

    2015-03-01

    Although population-level genomic sequence data have been gathered extensively for humans, similar data from our closest living relatives are just beginning to emerge. Examination of genomic variation within great apes offers many opportunities to increase our understanding of the forces that have differentially shaped the evolutionary history of hominid taxa. Here, we expand upon the work of the Great Ape Genome Project by analyzing medium to high coverage whole-genome sequences from 14 western lowland gorillas (Gorilla gorilla gorilla), 2 eastern lowland gorillas (G. beringei graueri), and a single Cross River individual (G. gorilla diehli). We infer that the ancestors of western and eastern lowland gorillas diverged from a common ancestor approximately 261 ka, and that the ancestors of the Cross River population diverged from the western lowland gorilla lineage approximately 68 ka. Using a diffusion approximation approach to model the genome-wide site frequency spectrum, we infer a history of western lowland gorillas that includes an ancestral population expansion of 1.4-fold around 970 ka and a recent 5.6-fold contraction in population size 23 ka. The latter may correspond to a major reduction in African equatorial forests around the Last Glacial Maximum. We also analyze patterns of variation among western lowland gorillas to identify several genomic regions with strong signatures of recent selective sweeps. We find that processes related to taste, pancreatic and saliva secretion, sodium ion transmembrane transport, and cardiac muscle function are overrepresented in genomic regions predicted to have experienced recent positive selection.

  5. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    SciTech Connect

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    2014-09-29

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.
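
    The abstract describes assigning reads to their most likely reference genomes and estimating relative abundances. A minimal sketch of that general idea, written as a textbook expectation-maximization loop for mixture proportions, is given below. It is not Sigma's implementation: the function name `estimate_abundances`, the read-by-genome likelihood matrix, and the fixed iteration count are all assumptions made for illustration.

```python
def estimate_abundances(likelihoods, iters=50):
    """Estimate genome mixture proportions from per-read likelihoods.

    likelihoods[r][g] is a hypothetical likelihood of read r given genome g
    (for example, derived from alignment mismatches). Generic EM, not Sigma's
    actual model.
    """
    n_genomes = len(likelihoods[0])
    weights = [1.0 / n_genomes] * n_genomes  # start from uniform abundances
    for _ in range(iters):
        totals = [0.0] * n_genomes
        for row in likelihoods:
            denom = sum(w * l for w, l in zip(weights, row))
            if denom == 0.0:
                continue  # read matches no genome; ignore it
            # E-step: fractional assignment of this read to each genome
            for g in range(n_genomes):
                totals[g] += weights[g] * row[g] / denom
        used = sum(totals)
        if used == 0.0:
            break  # nothing could be assigned; keep current weights
        # M-step: new abundance is the mean responsibility per genome
        weights = [t / used for t in totals]
    return weights
```

    With reads that each match exactly one genome, the estimate reduces to the fraction of reads assigned to that genome; ambiguous reads are shared in proportion to the current abundance estimates.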

  6. Insights and inferences about integron evolution from genomic data

    PubMed Central

    Nemergut, Diana R; Robeson, Michael S; Kysela, Robert F; Martin, Andrew P; Schmidt, Steven K; Knight, Rob

    2008-01-01

    Background: Integrons are mechanisms that facilitate horizontal gene transfer, allowing bacteria to integrate and express foreign DNA. These are important in the exchange of antibiotic resistance determinants, but can also transfer a diverse suite of genes unrelated to pathogenicity. Here, we provide a systematic analysis of the distribution and diversity of integron intI genes and integron-containing bacteria. Results: We found integrons in 103 different pathogenic and non-pathogenic bacteria, in six major phyla. Integrons were widely scattered, and their presence was not confined to specific clades within bacterial orders. Nearly 1/3 of the intI genes that we identified were pseudogenes, containing either an internal stop codon or a frameshift mutation that would render the protein product non-functional. Additionally, 20% of bacteria contained more than one integrase gene. dN/dS ratios revealed mutational hotspots in clades of Vibrio and Shewanella intI genes. Finally, we characterized the gene cassettes associated with integrons in Methylobacillus flagellatus KT and Dechloromonas aromatica RCB, and found a heavy metal efflux gene as well as genes involved in protein folding and stability. Conclusion: Our analysis suggests that the present distribution of integrons is due to multiple losses and gene transfer events. While, in some cases, the ability to integrate and excise foreign DNA may be selectively advantageous, the gain, loss, or rearrangement of gene cassettes could also be deleterious, selecting against functional integrases. Thus, such a high fraction of pseudogenes may suggest that the selective impact of integrons on genomes is variable, oscillating between beneficial and deleterious, possibly depending on environmental conditions. PMID:18513439
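
    The pseudogene criteria named in the abstract (an internal stop codon or a frameshift) can be illustrated with a rough screen like the one below. This is only a sketch under stated assumptions: the function name `classify_orf` is hypothetical, the input is assumed to be a coding sequence starting in frame, and a length that is not a multiple of three is used as a crude stand-in for a frameshift. The abstract does not say how the authors detected frameshifts, which in practice usually requires alignment to an intact homolog.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard and bacterial (table 11) stop codons

def classify_orf(seq):
    """Classify a coding sequence as 'intact', 'internal_stop',
    or 'frameshift_suspect' (length not a multiple of three)."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        return "frameshift_suspect"
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # The final codon is allowed to be a stop; any earlier stop is premature.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return "internal_stop"
    return "intact"
```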

  7. Inferring Strain Mixture within Clinical Plasmodium falciparum Isolates from Genomic Sequence Data

    PubMed Central

    O’Brien, John D.; Amenga-Etego, Lucas

    2016-01-01

    We present a rigorous statistical model that infers the structure of P. falciparum mixtures—including the number of strains present, their proportion within the samples, and the amount of unexplained mixture—using whole genome sequence (WGS) data. Applied to simulation data, artificial laboratory mixtures, and field samples, the model provides reasonable inference with as few as 10 reads or 50 SNPs and works efficiently even with much larger data sets. Source code and example data for the model are provided in an open source fashion. We discuss the possible uses of this model as a window into within-host selection for clinical and epidemiological studies. PMID:27362949

  8. Untangling statistical and biological models to understand network inference: the need for a genomics network ontology.

    PubMed

    Emmert-Streib, Frank; Dehmer, Matthias; Haibe-Kains, Benjamin

    2014-01-01

    In this paper, we shed light on approaches that are currently used to infer networks from gene expression data with respect to their biological meaning. As we will show, the biological interpretation of these networks depends on the chosen theoretical perspective. For this reason, we distinguish a statistical perspective from a mathematical modeling perspective and elaborate their differences and implications. Our results indicate the imperative need for a genomic network ontology in order to avoid increasing confusion about the biological interpretation of inferred networks, which can be made even worse by approaches that integrate multiple data sets or data types.

  9. Improved genome inference in the MHC using a population reference graph.

    PubMed

    Dilthey, Alexander; Cox, Charles; Iqbal, Zamin; Nelson, Matthew R; McVean, Gil

    2015-06-01

    Although much is known about human genetic variation, such information is typically ignored in assembling new genomes. Instead, reads are mapped to a single reference, which can lead to poor characterization of regions of high sequence or structural diversity. We introduce a population reference graph, which combines multiple reference sequences and catalogs of variation. The genomes of new samples are reconstructed as paths through the graph using an efficient hidden Markov model, allowing for recombination between different haplotypes and additional variants. By applying the method to the 4.5-Mb extended MHC region on human chromosome 6, combining 8 assembled haplotypes, the sequences of known classical HLA alleles and 87,640 SNP variants from the 1000 Genomes Project, we demonstrate using simulations, SNP genotyping, and short-read and long-read data how the method improves the accuracy of genome inference and identify regions where the current set of reference sequences is substantially incomplete.

  10. Streamlining and Large Ancestral Genomes in Archaea Inferred with a Phylogenetic Birth-and-Death Model

    PubMed Central

    Miklós, István

    2009-01-01

    Homologous genes originate from a common ancestor through vertical inheritance, duplication, or horizontal gene transfer. Entire homolog families spawned by a single ancestral gene can be identified across multiple genomes based on protein sequence similarity. The sequences, however, do not always reveal conclusively the history of large families. To study the evolution of complete gene repertoires, we propose here a mathematical framework that does not rely on resolved gene family histories. We show that so-called phylogenetic profiles, formed by family sizes across multiple genomes, are sufficient to infer principal evolutionary trends. The main novelty in our approach is an efficient algorithm to compute the likelihood of a phylogenetic profile in a model of birth-and-death processes acting on a phylogeny. We examine known gene families in 28 archaeal genomes using a probabilistic model that involves lineage- and family-specific components of gene acquisition, duplication, and loss. The model enables us to consider all possible histories when inferring statistics about archaeal evolution. According to our reconstruction, most lineages are characterized by a net loss of gene families. Major increases in gene repertoire have occurred only a few times. Our reconstruction underlines the importance of persistent streamlining processes in shaping genome composition in Archaea. It also suggests that early archaeal genomes were as complex as typical modern ones, and even show signs, in the case of the methanogenic ancestor, of an extremely large gene repertoire. PMID:19570746

  11. How to infer relative fitness from a sample of genomic sequences.

    PubMed

    Dayarian, Adel; Shraiman, Boris I

    2014-07-01

    Mounting evidence suggests that natural populations can harbor extensive fitness diversity with numerous genomic loci under selection. It is also known that genealogical trees for populations under selection are quantifiably different from those expected under neutral evolution and described statistically by Kingman's coalescent. While differences in the statistical structure of genealogies have long been used as a test for the presence of selection, the full extent of the information that they contain has not been exploited. Here we demonstrate that the shape of the reconstructed genealogical tree for a moderately large number of random genomic samples taken from a fitness diverse, but otherwise unstructured, asexual population can be used to predict the relative fitness of individuals within the sample. To achieve this we define a heuristic algorithm, which we test in silico, using simulations of a Wright-Fisher model for a realistic range of mutation rates and selection strength. Our inferred fitness ranking is based on a linear discriminator that identifies rapidly coalescing lineages in the reconstructed tree. Inferred fitness ranking correlates strongly with actual fitness, with a genome in the top 10% ranked being in the top 20% fittest with false discovery rate of 0.1-0.3, depending on the mutation/selection parameters. The ranking also enables us to predict the genotypes that future populations inherit from the present one. While the inference accuracy increases monotonically with sample size, samples of 200 nearly saturate the performance. We propose that our approach can be used for inferring relative fitness of genomes obtained in single-cell sequencing of tumors and in monitoring viral outbreaks.
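
    The simulations mentioned in the abstract are based on a Wright-Fisher model with mutation and selection. A minimal sketch of such a simulation follows, assuming a fixed population size, multiplicative fitness, and a single mutation effect `s`; the function name and all parameters are hypothetical, and the authors' simulation setup and fitness-ranking algorithm are not reproduced here. The parent indices are recorded because they are what a genealogy would be reconstructed from.

```python
import random

def wright_fisher(n, generations, mu, s, seed=0):
    """Simulate an asexual Wright-Fisher population of constant size n.

    Each generation, every offspring picks a parent with probability
    proportional to parental fitness, then mutates with probability mu,
    multiplying its fitness by (1 + s). Returns the final fitness values
    and the list of parent indices for each generation.
    """
    rng = random.Random(seed)
    fitness = [1.0] * n
    ancestry = []
    for _ in range(generations):
        # Selection: fitter individuals are more likely to be chosen as parents.
        parents = rng.choices(range(n), weights=fitness, k=n)
        ancestry.append(parents)
        # Mutation: occasionally change the inherited fitness.
        fitness = [
            fitness[p] * ((1.0 + s) if rng.random() < mu else 1.0)
            for p in parents
        ]
    return fitness, ancestry
```

    Tracing `ancestry` backwards from the final generation gives the genealogical tree of any sample of individuals, which is the kind of object the abstract's ranking method takes as input.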

  12. How to Infer Relative Fitness from a Sample of Genomic Sequences

    PubMed Central

    Dayarian, Adel; Shraiman, Boris I.

    2014-01-01

    Mounting evidence suggests that natural populations can harbor extensive fitness diversity with numerous genomic loci under selection. It is also known that genealogical trees for populations under selection are quantifiably different from those expected under neutral evolution and described statistically by Kingman’s coalescent. While differences in the statistical structure of genealogies have long been used as a test for the presence of selection, the full extent of the information that they contain has not been exploited. Here we demonstrate that the shape of the reconstructed genealogical tree for a moderately large number of random genomic samples taken from a fitness diverse, but otherwise unstructured, asexual population can be used to predict the relative fitness of individuals within the sample. To achieve this we define a heuristic algorithm, which we test in silico, using simulations of a Wright–Fisher model for a realistic range of mutation rates and selection strength. Our inferred fitness ranking is based on a linear discriminator that identifies rapidly coalescing lineages in the reconstructed tree. Inferred fitness ranking correlates strongly with actual fitness, with a genome in the top 10% ranked being in the top 20% fittest with false discovery rate of 0.1–0.3, depending on the mutation/selection parameters. The ranking also enables us to predict the genotypes that future populations inherit from the present one. While the inference accuracy increases monotonically with sample size, samples of 200 nearly saturate the performance. We propose that our approach can be used for inferring relative fitness of genomes obtained in single-cell sequencing of tumors and in monitoring viral outbreaks. PMID:24770330

  13. Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories.

    PubMed

    Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas

    2016-09-19

    Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.

  14. Robust Inference of Genetic Exchange Communities from Microbial Genomes Using TF-IDF

    PubMed Central

    Cong, Yingnan; Chan, Yao-ban; Phillips, Charles A.; Langston, Michael A.; Ragan, Mark A.

    2017-01-01

    Bacteria and archaea can exchange genetic material across lineages through processes of lateral genetic transfer (LGT). Collectively, these exchange relationships can be modeled as a network and analyzed using concepts from graph theory. In particular, densely connected regions within an LGT network have been defined as genetic exchange communities (GECs). However, it has been problematic to construct networks in which edges solely represent LGT. Here we apply term frequency-inverse document frequency (TF-IDF), an alignment-free method originating from document analysis, to infer regions of lateral origin in bacterial genomes. We examine four empirical datasets of different size (number of genomes) and phyletic breadth, varying a key parameter (word length k) within bounds established in previous work. We map the inferred lateral regions to genes in recipient genomes, and construct networks in which the nodes are groups of genomes, and the edges natively represent LGT. We then extract maximum and maximal cliques (i.e., GECs) from these graphs, and identify nodes that belong to GECs across a wide range of k. Most surviving lateral transfer has happened within these GECs. Using Gene Ontology enrichment tests we demonstrate that biological processes associated with metabolism, regulation and transport are often over-represented among the genes affected by LGT within these communities. These enrichments are largely robust to change of k. PMID:28154557
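
    For readers unfamiliar with TF-IDF applied to sequences, the sketch below treats each genome as a document and each word of length k (k-mer) as a term, then computes the textbook score tf × log(N / df). It is an illustration only: the abstract does not describe how the authors turn these scores into inferred lateral regions, and the function names and the exact weighting formula are assumptions.

```python
import math
from collections import Counter

def kmers(seq, k):
    """Return all overlapping words of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def tfidf(genomes, k):
    """genomes: dict mapping a genome name to its sequence.

    Returns a dict mapping each genome name to {kmer: tf-idf score},
    using term frequency within the genome and log(N / document frequency).
    """
    counts = {name: Counter(kmers(seq, k)) for name, seq in genomes.items()}
    n_docs = len(genomes)
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())  # each genome counts once per k-mer
    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores[name] = {
            word: (n / total) * math.log(n_docs / doc_freq[word])
            for word, n in c.items()
        }
    return scores
```

    A k-mer present in every genome scores zero, while a k-mer confined to few genomes scores high, which is why word length k matters: short words are shared by nearly all genomes and carry little signal.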

  15. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    SciTech Connect

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  16. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center.

    PubMed

    Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-01

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  17. ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes

    PubMed Central

    Didelot, Xavier; Wilson, Daniel J.

    2015-01-01

    Recombination is an important evolutionary force in bacteria, but it remains challenging to reconstruct the imports that occurred in the ancestry of a genomic sample. Here we present ClonalFrameML, which uses maximum likelihood inference to simultaneously detect recombination in bacterial genomes and account for it in phylogenetic reconstruction. ClonalFrameML can analyse hundreds of genomes in a matter of hours, and we demonstrate its usefulness on simulated and real datasets. We find evidence for recombination hotspots associated with mobile elements in Clostridium difficile ST6 and a previously undescribed 310kb chromosomal replacement in Staphylococcus aureus ST582. ClonalFrameML is freely available at http://clonalframeml.googlecode.com/. PMID:25675341

  18. Genetic diversity in Sargasso Sea bacterioplankton.

    PubMed

    Giovannoni, S J; Britschgi, T B; Moyer, C L; Field, K G

    1990-05-03

    Bacterioplankton are recognized as important agents of biogeochemical change in marine ecosystems, yet relatively little is known about the species that make up these communities. Uncertainties about the genetic structure and diversity of natural bacterioplankton populations stem from the traditional difficulties associated with microbial cultivation techniques. Discrepancies between direct counts and plate counts are typically several orders of magnitude, raising doubts as to whether cultivated marine bacteria are actually representative of dominant planktonic species. We have phylogenetically analysed clone libraries of eubacterial 16S ribosomal RNA genes amplified from natural populations of Sargasso Sea picoplankton by the polymerase chain reaction. The analysis indicates the presence of a novel microbial group, the SAR11 cluster, which appears to be a significant component of this oligotrophic bacterioplankton community. A second cluster of lineages related to the oxygenic phototrophs--cyanobacteria, prochlorophytes and chloroplasts--was also observed. However, none of the genes matched the small subunit rRNA sequences of cultivated marine cyanobacteria from similar habitats. The diversity of 16S rRNA genes observed within the clusters suggests that these bacterioplankton may be consortia of independent lineages sharing surprisingly distant common ancestors.

  19. The feasibility of genome-scale biological network inference using Graphics Processing Units.

    PubMed

    Thiagarajan, Raghuram; Alavi, Amir; Podichetty, Jagdeep T; Bazil, Jason N; Beard, Daniel A

    2017-01-01

    Systems research spanning fields from biology to finance involves the identification of models to represent the underpinnings of complex systems. Formal approaches for data-driven identification of network interactions include statistical inference-based approaches and methods to identify dynamical systems models that are capable of fitting multivariate data. Availability of large data sets and so-called 'big data' applications in biology present great opportunities as well as major challenges for systems identification/reverse engineering applications. For example, both inverse identification and forward simulations of genome-scale gene regulatory network models pose compute-intensive problems. This issue is addressed here by combining the processing power of Graphics Processing Units (GPUs) and a parallel reverse engineering algorithm for inference of regulatory networks. It is shown that, given an appropriate data set, information on genome-scale networks (systems of 1000 or more state variables) can be inferred using a reverse-engineering algorithm in a matter of days on a small-scale modern GPU cluster.

  20. Inferring drug-disease associations from integration of chemical, genomic and phenotype data using network propagation

    PubMed Central

    2013-01-01

    Background: During the last few years, knowledge of drugs, disease phenotypes and proteins has accumulated rapidly, and more and more scientists have turned their attention to inferring drug-disease associations by computational methods. Developing an integrated approach for systematically discovering drug-disease associations from these data is an important issue. Methods: We combine three different networks of drug, genomic and disease phenotype and assign the weights to the edges from available experimental data and knowledge. Given a specific disease, we use our network propagation approach to infer the drug-disease associations. Results: We use prostate cancer and colorectal cancer as our test data. We use the manually curated drug-disease associations from the Comparative Toxicogenomics Database as our benchmark. The ranked results show that our proposed method obtains higher specificity and sensitivity and clearly outperforms previous methods. Our results also show that our method with off-target information achieves higher performance than that with only primary drug targets in both test data sets. Conclusions: We clearly demonstrate the feasibility and benefits of using network-based analyses of chemical, genomic and phenotype data to reveal drug-disease associations. The potential associations inferred by our method provide new perspectives for toxicogenomics and drug repositioning evaluation. PMID:24565337
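
    The abstract does not specify which propagation scheme the authors use. One common form of network propagation is a random walk with restart, sketched below on a small weighted adjacency matrix as an assumption-laden illustration. The function name `propagate`, the restart probability, and the iteration count are hypothetical and are not taken from the paper.

```python
def propagate(adj, seeds, restart=0.5, iters=100):
    """Random walk with restart on a weighted graph.

    adj[i][j] is the edge weight from node j to node i; seeds is a list of
    node indices (for example, the query disease). Returns one score per
    node; higher scores mean closer to the seeds in the network.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    start = [0.0] * n
    for s in seeds:
        start[s] = 1.0 / len(seeds)
    scores = start[:]
    for _ in range(iters):
        nxt = []
        for i in range(n):
            # Mass flowing into node i from its neighbours (column-normalized).
            flow = sum(
                adj[i][j] / col_sums[j] * scores[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            # Mix the walk step with a jump back to the seed nodes.
            nxt.append((1.0 - restart) * flow + restart * start[i])
        scores = nxt
    return scores
```

    In a heterogeneous network of drugs, genes and phenotypes, the seed would be the disease node and candidate drugs would be ranked by their final scores.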

  1. Gene network inference via structural equation modeling in genetical genomics experiments.

    PubMed

    Liu, Bing; de la Fuente, Alberto; Hoeschele, Ina

    2008-03-01

    Our goal is gene network inference in genetical genomics or systems genetics experiments. For species where sequence information is available, we first perform expression quantitative trait locus (eQTL) mapping by jointly utilizing cis-, cis-trans-, and trans-regulation. After using local structural models to identify regulator-target pairs for each eQTL, we construct an encompassing directed network (EDN) by assembling all retained regulator-target relationships. The EDN has nodes corresponding to expressed genes and eQTL and directed edges from eQTL to cis-regulated target genes, from cis-regulated genes to cis-trans-regulated target genes, from trans-regulator genes to target genes, and from trans-eQTL to target genes. For network inference within the strongly constrained search space defined by the EDN, we propose structural equation modeling (SEM), because it can model cyclic networks and the EDN indeed contains feedback relationships. On the basis of a factorization of the likelihood and the constrained search space, our SEM algorithm infers networks involving several hundred genes and eQTL. Structure inference is based on a penalized likelihood ratio and an adaptation of Occam's window model selection. The SEM algorithm was evaluated using data simulated with nonlinear ordinary differential equations and known cyclic network topologies and was applied to a real yeast data set.

  2. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    PubMed Central

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    2015-01-01

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. The algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org. Contact: panc@ornl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25266224

  3. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    DOE PAGES

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    2014-09-29

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.

  4. Inferring Bottlenecks from Genome-Wide Samples of Short Sequence Blocks

    PubMed Central

    Bunnefeld, Lynsey; Frantz, Laurent A. F.; Lohse, Konrad

    2015-01-01

    The advent of the genomic era has necessitated the development of methods capable of analyzing large volumes of genomic data efficiently. Being able to reliably identify bottlenecks—extreme population size changes of short duration—not only is interesting in the context of speciation and extinction but also matters (as a null model) when inferring selection. Bottlenecks can be detected in polymorphism data via their distorting effect on the shape of the underlying genealogy. Here, we use the generating function of genealogies to derive the probability of mutational configurations in short sequence blocks under a simple bottleneck model. Given a large number of nonrecombining blocks, we can compute maximum-likelihood estimates of the time and strength of the bottleneck. Our method relies on a simple summary of the joint distribution of polymorphic sites. We extend the site frequency spectrum by counting mutations in frequency classes in short sequence blocks. Using linkage information over short distances in this way gives greater power to detect bottlenecks than the site frequency spectrum and potentially opens up a wide range of demographic histories to blockwise inference. Finally, we apply our method to genomic data from a species of pig (Sus cebifrons) endemic to islands in the center and west of the Philippines to estimate whether a bottleneck occurred upon island colonization and compare our scheme to Li and Durbin’s pairwise sequentially Markovian coalescent (PSMC) both for the pig data and using simulations. PMID:26341659
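
    To make the two summaries in this abstract concrete, the sketch below computes an unfolded site frequency spectrum and a simple blockwise tally of it from a toy haplotype matrix. It is a minimal illustration, not the authors' implementation: the function names, the 0/1 encoding (1 = derived allele) and the fixed, non-overlapping block layout are assumptions.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Unfolded SFS for n haplotypes: entry k-1 counts the sites at which
    exactly k samples carry the derived allele, for k = 1 .. n-1."""
    n = len(haplotypes)
    sfs = [0] * (n + 1)
    for column in zip(*haplotypes):  # iterate over sites
        sfs[sum(column)] += 1
    return sfs[1:n]  # drop the monomorphic classes 0 and n


def blockwise_sfs(haplotypes, block_len):
    """Tally how often each per-block mutation configuration occurs.

    Each non-overlapping block of block_len sites is summarised by its own
    SFS vector; the Counter maps that vector to the number of blocks."""
    length = len(haplotypes[0])
    counts = Counter()
    for start in range(0, length - block_len + 1, block_len):
        block = [h[start:start + block_len] for h in haplotypes]
        counts[tuple(site_frequency_spectrum(block))] += 1
    return counts


# Toy data: 4 haplotypes, 6 sites (rows = samples, columns = sites).
haps = [
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
]
genome_wide = site_frequency_spectrum(haps)
per_block = blockwise_sfs(haps, block_len=3)
```

    The genome-wide spectrum pools all sites, whereas the blockwise tally keeps track of which frequency classes occur together within a short block; that short-range linkage information is what the abstract credits for the gain in power.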

  5. RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

    SciTech Connect

    Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S.; Arkin, Adam P.; Mironov, Andrey A.; Dubchak, Inna

    2010-05-26

    The RegPredict web server provides comparative genomics tools for the reconstruction and analysis of microbial regulons. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences and on ortholog and operon predictions from MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.

  6. Hybrid Origins of Citrus Varieties Inferred from DNA Marker Analysis of Nuclear and Organelle Genomes.

    PubMed

    Shimizu, Tokurou; Kitajima, Akira; Nonaka, Keisuke; Yoshioka, Terutaka; Ohta, Satoshi; Goto, Shingo; Toyoda, Atsushi; Fujiyama, Asao; Mochizuki, Takako; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2016-01-01

    Most indigenous citrus varieties are assumed to be natural hybrids, but their parentage has so far been determined in only a few cases because of their wide genetic diversity and the low transferability of DNA markers. Here we infer the parentage of indigenous citrus varieties using simple sequence repeat and indel markers developed from various citrus genome sequence resources. Parentage tests with 122 known hybrids using the selected DNA markers certify their transferability among those hybrids. Identity tests confirm that most variant strains are selected mutants, but we find four types of kunenbo (Citrus nobilis) and three types of tachibana (Citrus tachibana) for which we suggest different origins. Structure analysis with DNA markers that are in Hardy-Weinberg equilibrium deduces three basic taxa coinciding with the current understanding of citrus ancestors. Genotyping analysis of 101 indigenous citrus varieties with 123 selected DNA markers infers the parentages of 22 indigenous citrus varieties including Satsuma, Temple, and iyo, and single parents of 45 indigenous citrus varieties, including kunenbo, C. ichangensis, and Ichang lemon by allele-sharing and parentage tests. Genotyping analysis of chloroplast and mitochondrial genomes using 11 DNA markers classifies their cytoplasmic genotypes into 18 categories and deduces the combination of seed and pollen parents. Likelihood ratio analysis verifies the inferred parentages with significant scores. The reconstructed genealogy identifies 12 types of varieties consisting of Kishu, kunenbo, yuzu, koji, sour orange, dancy, kobeni mikan, sweet orange, tachibana, Cleopatra, willowleaf mandarin, and pummelo, which have played pivotal roles in the occurrence of these indigenous varieties. The inferred parentage of the indigenous varieties confirms their hybrid origins, as found by recent studies.

  7. Hybrid Origins of Citrus Varieties Inferred from DNA Marker Analysis of Nuclear and Organelle Genomes

    PubMed Central

    Kitajima, Akira; Nonaka, Keisuke; Yoshioka, Terutaka; Ohta, Satoshi; Goto, Shingo; Toyoda, Atsushi; Fujiyama, Asao; Mochizuki, Takako; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2016-01-01

    Most indigenous citrus varieties are assumed to be natural hybrids, but their parentage has so far been determined in only a few cases because of their wide genetic diversity and the low transferability of DNA markers. Here we infer the parentage of indigenous citrus varieties using simple sequence repeat and indel markers developed from various citrus genome sequence resources. Parentage tests with 122 known hybrids using the selected DNA markers certify their transferability among those hybrids. Identity tests confirm that most variant strains are selected mutants, but we find four types of kunenbo (Citrus nobilis) and three types of tachibana (Citrus tachibana) for which we suggest different origins. Structure analysis with DNA markers that are in Hardy–Weinberg equilibrium deduces three basic taxa coinciding with the current understanding of citrus ancestors. Genotyping analysis of 101 indigenous citrus varieties with 123 selected DNA markers infers the parentages of 22 indigenous citrus varieties including Satsuma, Temple, and iyo, and single parents of 45 indigenous citrus varieties, including kunenbo, C. ichangensis, and Ichang lemon by allele-sharing and parentage tests. Genotyping analysis of chloroplast and mitochondrial genomes using 11 DNA markers classifies their cytoplasmic genotypes into 18 categories and deduces the combination of seed and pollen parents. Likelihood ratio analysis verifies the inferred parentages with significant scores. The reconstructed genealogy identifies 12 types of varieties consisting of Kishu, kunenbo, yuzu, koji, sour orange, dancy, kobeni mikan, sweet orange, tachibana, Cleopatra, willowleaf mandarin, and pummelo, which have played pivotal roles in the occurrence of these indigenous varieties. The inferred parentage of the indigenous varieties confirms their hybrid origins, as found by recent studies. PMID:27902727

  8. Inferring Where and When Replication Initiates from Genome-Wide Replication Timing Data

    NASA Astrophysics Data System (ADS)

    Baker, A.; Audit, B.; Yang, S. C.-H.; Bechhoefer, J.; Arneodo, A.

    2012-06-01

    Based on an analogy between DNA replication and one dimensional nucleation-and-growth processes, various attempts to infer the local initiation rate I(x,t) of DNA replication origins from replication timing data have been developed in the framework of phase transition kinetics theories. These works have all used curve-fit strategies to estimate I(x,t) from genome-wide replication timing data. Here, we show how to invert analytically the Kolmogorov-Johnson-Mehl-Avrami model and extract I(x,t) directly. Tests on both simulated and experimental budding-yeast data confirm the location and firing-time distribution of replication origins.
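
    For orientation, the inversion mentioned above can be written out in its standard one-dimensional form. This is a sketch under stated assumptions rather than a quotation from the paper: forks are assumed to move at a constant speed v, f(x,t) denotes the fraction of cells in which locus x has replicated by time t, and V_v[x,t] is the past "light cone" of the point (x,t). The symbols v, f and V_v are introduced here for illustration; only I(x,t) appears in the abstract.

```latex
% Assumptions (not stated in the abstract): constant fork speed v;
% f(x,t) = replicated fraction; V_v[x,t] = past light cone of (x,t).
\begin{align}
1 - f(x,t) &= \exp\!\Big(-\iint_{V_v[x,t]} I(x',t')\,dx'\,dt'\Big),
\qquad
V_v[x,t] = \{(x',t') : 0 \le t' \le t,\ |x - x'| \le v\,(t - t')\},\\
I(x,t) &= -\frac{v}{2}\left(\frac{1}{v^{2}}\frac{\partial^{2}}{\partial t^{2}}
          - \frac{\partial^{2}}{\partial x^{2}}\right)\ln\big(1 - f(x,t)\big).
\end{align}
```

    The second line follows from differentiating the light-cone integral twice in t and twice in x: the boundary contributions cancel in the wave operator, leaving 2I(x,t)/v.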

  9. Inferring human population size and separation history from multiple genome sequences

    PubMed Central

    Schiffels, Stephan; Durbin, Richard

    2014-01-01

    The availability of complete human genome sequences from populations across the world has given rise to new population genetic inference methods that explicitly model their ancestral relationship under recombination and mutation. So far, application of these methods to evolutionary history more recent than 20-30 thousand years ago and to population separations has been limited. Here we present a new method that overcomes these shortcomings. The Multiple Sequentially Markovian Coalescent (MSMC) analyses the observed pattern of mutations in multiple individuals, focusing on the first coalescence between any two individuals. Results from applying MSMC to genome sequences from nine populations across the world suggest that the genetic separation of non-African ancestors from African Yoruban ancestors started long before 50,000 years ago, and give information about human population history as recently as 2,000 years ago, including the bottleneck in the peopling of the Americas, and separations within Africa, East Asia and Europe. PMID:24952747

  10. Inference of gene regulatory networks from genome-wide knockout fitness data

    PubMed Central

    Wang, Liming; Wang, Xiaodong; Arkin, Adam P.; Samoilov, Michael S.

    2013-01-01

    Motivation: Genome-wide fitness is an emerging type of high-throughput biological data generated for individual organisms by creating libraries of knockouts, subjecting them to broad ranges of environmental conditions, and measuring the resulting clone-specific fitnesses. Since fitness is an organism-scale measure of gene regulatory network behaviour, it may offer certain advantages when insights into such phenotypical and functional features are of primary interest over individual gene expression. Previous works have shown that genome-wide fitness data can be used to uncover novel gene regulatory interactions, when compared with results of more conventional gene expression analysis. Yet, to date, few algorithms have been proposed for systematically using genome-wide mutant fitness data for gene regulatory network inference. Results: In this article, we describe a model and propose an inference algorithm for using fitness data from knockout libraries to identify underlying gene regulatory networks. Unlike most prior methods, the presented approach captures not only structural, but also dynamical and non-linear nature of biomolecular systems involved. A state–space model with non-linear basis is used for dynamically describing gene regulatory networks. Network structure is then elucidated by estimating unknown model parameters. Unscented Kalman filter is used to cope with the non-linearities introduced in the model, which also enables the algorithm to run in on-line mode for practical use. Here, we demonstrate that the algorithm provides satisfying results for both synthetic data as well as empirical measurements of GAL network in yeast Saccharomyces cerevisiae and TyrR–LiuR network in bacteria Shewanella oneidensis. Availability: MATLAB code and datasets are available to download at http://www.duke.edu/∼lw174/Fitness.zip and http://genomics.lbl.gov/supplemental/fitness-bioinf/ Contact: wangx@ee.columbia.edu or mssamoilov@lbl.gov Supplementary information

  11. Occurrence and expression of gene transfer agent genes in marine bacterioplankton.

    PubMed

    Biers, Erin J; Wang, Kui; Pennington, Catherine; Belas, Robert; Chen, Feng; Moran, Mary Ann

    2008-05-01

    Genes with homology to the transduction-like gene transfer agent (GTA) were observed in genome sequences of three cultured members of the marine Roseobacter clade. A broader search for homologs for this host-controlled virus-like gene transfer system identified likely GTA systems in cultured Alphaproteobacteria, and particularly in marine bacterioplankton representatives. Expression of GTA genes and extracellular release of GTA particles (approximately 50 to 70 nm) was demonstrated experimentally for the Roseobacter clade member Silicibacter pomeroyi DSS-3, and intraspecific gene transfer was documented. GTA homologs are surprisingly infrequent in marine metagenomic sequence data, however, and the role of this lateral gene transfer mechanism in ocean bacterioplankton communities remains unclear.

  12. Phylogenetics and biogeography of the dung beetle genus Onthophagus inferred from mitochondrial genomes.

    PubMed

    Breeschoten, Thijmen; Doorenweerd, Camiel; Tarasov, Sergei; Vogler, Alfried P

    2016-12-01

    Phylogenetic relationships of dung beetles in the tribe Onthophagini, including the species-rich, cosmopolitan genus Onthophagus, were inferred using whole mitochondrial genomes. Data were generated by shotgun sequencing of mixed genomic DNA from >100 individuals on 50% of an Illumina MiSeq flow cell. Genome assembly of the mixed reads produced contigs of 74 (nearly) complete mitogenomes. The final dataset included representatives of Onthophagus from all biogeographic regions, closely related genera of Onthophagini, and the related tribes Onitini and Oniticellini. The analysis defined four major clades of Onthophagini, which was paraphyletic for Oniticellini, with Onitini as sister group to all others. Several (sub)genera considered as members of Onthophagus in the older literature formed separate deep lineages. All New World species of Onthophagus formed a monophyletic group, and the Australian taxa are confined to a single or two closely related clades, one of which forms the sister group of the New World species. Dating the tree by constraining the basal splits with existing calibrations of Scarabaeoidea suggests an origin of Onthophagini sensu lato in the Eocene and a rapid spread from an African ancestral stock into the Oriental region, and secondarily to Australia and the Americas at about 20-24 Mya. The successful assembly of mitogenomes and the well-supported tree obtained from these sequences demonstrates the power of shotgun sequencing from total genomic DNA of species pools as an efficient tool in genus-level phylogenetics.

  13. Proteomics-inferred genome typing (PIGT) demonstrates inter-population recombination as a strategy for environmental adaptation

    SciTech Connect

    Denef, Vincent; Verberkmoes, Nathan C; Shah, Manesh B; Abraham, Paul E; Lefsrud, Mark G; Hettich, Robert L; Banfield, Jillian F.

    2009-01-01

    Analyses of ecological and evolutionary processes that shape microbial consortia are facilitated by comprehensive studies of ecosystems with low species richness. In the current study we evaluated the role of recombination in altering the fitness of chemoautotrophic bacteria in their natural environment. Proteomics-inferred genome typing (PIGT) was used to determine the genomic make-up of Leptospirillum group II populations in 27 biofilms sampled from six locations in the Richmond Mine acid mine drainage system (Iron Mountain, CA) over a four-year period. We observed six distinct genotypes that are recombinants comprised of segments from two parental genotypes. Community genomic analyses revealed additional low abundance recombinant variants. The dominance of some genotypes despite a larger available genome pool, and patterns of spatiotemporal distribution within the ecosystem, indicate selection for distinct recombinants. Genes involved in motility, signal transduction and transport were overrepresented in the tens to hundreds of kilobase recombinant blocks, whereas core metabolic functions were significantly underrepresented. Our findings demonstrate the power of PIGT and reveal that recombination is a mechanism for fine-scale adaptation in this system.

  14. Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags.

    PubMed

    Xu, Y; Mural, R J; Uberbacher, E C

    1997-01-01

    Computational methods for gene identification in genomic sequences typically have two phases: coding region prediction and gene parsing. While there are many effective methods for predicting coding regions (exons), parsing the predicted exons into proper gene structures, to a large extent, remains an unsolved problem. This paper presents an algorithm for inferring gene structures from predicted exon candidates, based on Expressed Sequence Tags (ESTs) and biological intuition/rules. The algorithm first finds all the related ESTs in the EST database (dbEST) for each predicted exon, and infers the boundaries of one or a series of genes based on the available EST information and biological rules. Then it constructs gene models within each pair of gene boundaries, that are most consistent with the EST information. By exploiting EST information and biological rules, the algorithm can (1) model complicated multiple gene structures, including embedded genes, (2) identify falsely-predicted exons and locate missed exons, and (3) make more accurate exon boundary predictions. The algorithm has been implemented and tested on long genomic sequences with a number of genes. Test results show that very accurate (predicted) gene models can be expected when related ESTs exist for the predicted exons.

  15. Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags

    SciTech Connect

    Xu, Y.; Mural, R.; Uberbacher, E.

    1997-02-01

    Computational methods for gene identification in genomic sequences typically have two phases: coding region prediction and gene parsing. While there are many effective methods for predicting coding regions (exons), parsing the predicted exons into proper gene structures, to a large extent, remains an unsolved problem. This paper presents an algorithm for inferring gene structures from predicted exon candidates, based on Expressed Sequence Tags (ESTs) and biological intuition/rules. The algorithm first finds all the related ESTs in the EST database (dbEST) for each predicted exon, and infers the boundaries of one or a series of genes based on the available EST information and biological rules. Then it constructs gene models within each pair of gene boundaries, that are most consistent with the EST information. By exploiting EST information and biological rules, the algorithm can (1) model complicated multiple gene structures, including embedded genes, (2) identify falsely-predicted exons and locate missed exons, and (3) make more accurate exon boundary predictions. The algorithm has been implemented and tested on long genomic sequences with a number of genes. Test results show that very accurate (predicted) gene models can be expected when related ESTs exist for the predicted exons.

  16. The aggregate site frequency spectrum (aSFS) for comparative population genomic inference

    PubMed Central

    Xue, Alexander T.; Hickerson, Michael J.

    2015-01-01

    Understanding how assemblages of species responded to past climate change is a central goal of comparative phylogeography and comparative population genomics, an endeavor that has increasing potential to integrate with community ecology. New sequencing technology now provides the potential to perform complex demographic inference at unprecedented resolution across assemblages of non-model species. To this end, we introduce the aggregate site frequency spectrum (aSFS), an expansion of the site frequency spectrum to use single nucleotide polymorphism (SNP) datasets collected from multiple, co-distributed species for assemblage-level demographic inference. We describe how the aSFS is constructed over an arbitrary number of independent population samples and then demonstrate how the aSFS can differentiate various multi-species demographic histories under a wide range of sampling configurations while allowing effective population sizes and expansion magnitudes to vary independently. We subsequently couple the aSFS with a hierarchical approximate Bayesian computation (hABC) framework to estimate degree of temporal synchronicity in expansion times across taxa, including an empirical demonstration with a dataset consisting of five populations of the threespine stickleback (Gasterosteus aculeatus). Corroborating what is generally understood about the recent post-glacial origins of these populations, the joint aSFS/hABC analysis strongly suggests that the stickleback data are most consistent with synchronous expansion after the Last Glacial Maximum (posterior probability = 0.99). The aSFS will have general application for multi-level statistical frameworks to test models involving assemblages and/or communities and as large-scale SNP data from non-model species become routine, the aSFS expands the potential for powerful next-generation comparative population genomic inference. PMID:26769405

  17. High-Accuracy HLA Type Inference from Whole-Genome Sequencing Data Using Population Reference Graphs.

    PubMed

    Dilthey, Alexander T; Gourraud, Pierre-Antoine; Mentzer, Alexander J; Cereb, Nezih; Iqbal, Zamin; McVean, Gil

    2016-10-01

    Genetic variation at the Human Leucocyte Antigen (HLA) genes is associated with many autoimmune and infectious disease phenotypes, is an important element of the immunological distinction between self and non-self, and shapes immune epitope repertoires. Determining the allelic state of the HLA genes (HLA typing) as a by-product of standard whole-genome sequencing data would therefore be highly desirable and enable the immunogenetic characterization of samples in currently ongoing population sequencing projects. Extensive hyperpolymorphism and sequence similarity between the HLA genes, however, pose problems for accurate read mapping and make HLA type inference from whole-genome sequencing data a challenging problem. We describe how to address these challenges in a Population Reference Graph (PRG) framework. First, we construct a PRG for 46 (mostly HLA) genes and pseudogenes, their genomic context and their characterized sequence variants, integrating a database of over 10,000 known allele sequences. Second, we present a sequence-to-PRG paired-end read mapping algorithm that enables accurate read mapping for the HLA genes. Third, we infer the most likely pair of underlying alleles at G group resolution from the IMGT/HLA database at each locus, employing a simple likelihood framework. We show that HLA*PRG, our algorithm, outperforms existing methods by a wide margin. We evaluate HLA*PRG on six classical class I and class II HLA genes (HLA-A, -B, -C, -DQA1, -DQB1, -DRB1) and on a set of 14 samples (3 samples with 2 x 100bp, 11 samples with 2 x 250bp Illumina HiSeq data). Of 158 alleles tested, we correctly infer 157 alleles (99.4%). We also identify and re-type two erroneous alleles in the original validation data. We conclude that HLA*PRG for the first time achieves accuracies comparable to gold-standard reference methods from standard whole-genome sequencing data, though high computational demands (currently ~30-250 CPU hours per sample) remain a significant

  18. High-Accuracy HLA Type Inference from Whole-Genome Sequencing Data Using Population Reference Graphs

    PubMed Central

    Dilthey, Alexander T.; Gourraud, Pierre-Antoine; McVean, Gil

    2016-01-01

    Genetic variation at the Human Leucocyte Antigen (HLA) genes is associated with many autoimmune and infectious disease phenotypes, is an important element of the immunological distinction between self and non-self, and shapes immune epitope repertoires. Determining the allelic state of the HLA genes (HLA typing) as a by-product of standard whole-genome sequencing data would therefore be highly desirable and enable the immunogenetic characterization of samples in currently ongoing population sequencing projects. Extensive hyperpolymorphism and sequence similarity between the HLA genes, however, pose problems for accurate read mapping and make HLA type inference from whole-genome sequencing data a challenging problem. We describe how to address these challenges in a Population Reference Graph (PRG) framework. First, we construct a PRG for 46 (mostly HLA) genes and pseudogenes, their genomic context and their characterized sequence variants, integrating a database of over 10,000 known allele sequences. Second, we present a sequence-to-PRG paired-end read mapping algorithm that enables accurate read mapping for the HLA genes. Third, we infer the most likely pair of underlying alleles at G group resolution from the IMGT/HLA database at each locus, employing a simple likelihood framework. We show that HLA*PRG, our algorithm, outperforms existing methods by a wide margin. We evaluate HLA*PRG on six classical class I and class II HLA genes (HLA-A, -B, -C, -DQA1, -DQB1, -DRB1) and on a set of 14 samples (3 samples with 2 x 100bp, 11 samples with 2 x 250bp Illumina HiSeq data). Of 158 alleles tested, we correctly infer 157 alleles (99.4%). We also identify and re-type two erroneous alleles in the original validation data. We conclude that HLA*PRG for the first time achieves accuracies comparable to gold-standard reference methods from standard whole-genome sequencing data, though high computational demands (currently ~30–250 CPU hours per sample) remain a significant

  19. Inferring causal genomic alterations in breast cancer using gene expression data

    PubMed Central

    2011-01-01

    Background: One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development. Many valuable studies lack genomic data to detect CNV; therefore, methods that are able to infer CNVs from gene expression data would help maximize the value of these studies. Results: We developed a framework for identifying recurrent regions of CNV and distinguishing the cancer driver genes from the passenger genes in the regions. By inferring CNV regions across many datasets we were able to identify 109 recurrent amplified/deleted CNV regions. Many of these regions are enriched for genes involved in important processes associated with tumorigenesis and cancer progression. Genes in these recurrent CNV regions were then examined in the context of gene regulatory networks to prioritize putative cancer driver genes. The cancer driver genes uncovered by the framework include not only well-known oncogenes but also a number of novel cancer susceptibility genes validated via siRNA experiments. Conclusions: To our knowledge, this is the first effort to systematically identify and validate drivers for expression-based CNV regions in breast cancer. The framework, which couples wavelet analysis of expression-based copy number alteration with gene regulatory network analysis, provides a blueprint for leveraging genomic data to identify key regulatory components and gene targets. This integrative approach can be applied to many other large-scale gene expression studies and other novel types of cancer data such as next-generation sequencing based expression (RNA-Seq) as well as CNV data. PMID:21806811

  20. Co-occurrence Analysis of Microbial Taxa in the Atlantic Ocean Reveals High Connectivity in the Free-Living Bacterioplankton.

    PubMed

    Milici, Mathias; Deng, Zhi-Luo; Tomasch, Jürgen; Decelle, Johan; Wos-Oxley, Melissa L; Wang, Hui; Jáuregui, Ruy; Plumeier, Iris; Giebel, Helge-Ansgar; Badewien, Thomas H; Wurst, Mascha; Pieper, Dietmar H; Simon, Meinhard; Wagner-Döbler, Irene

    2016-01-01

    We determined the taxonomic composition of the bacterioplankton of the epipelagic zone of the Atlantic Ocean along a latitudinal transect (51°S-47°N) using Illumina sequencing of the V5-V6 region of the 16S rRNA gene and inferred co-occurrence networks. Bacterioplankton community composition was distinct for Longhurstian provinces and water depth. Free-living microbial communities (between 0.22 and 3 μm) were dominated by highly abundant and ubiquitous taxa with streamlined genomes (e.g., SAR11, SAR86, OM1, Prochlorococcus) and could clearly be separated from particle-associated communities which were dominated by Bacteroidetes, Planctomycetes, Verrucomicrobia, and Roseobacters. From a total of 369 different communities we then inferred co-occurrence networks for each size fraction and depth layer of the plankton between bacteria and between bacteria and phototrophic micro-eukaryotes. The inferred networks showed a reduction of edges in the deepest layer of the photic zone. Networks composed of free-living bacteria had a larger number of connections per OTU when compared to the particle-associated communities throughout the water column. Negative correlations accounted for roughly one third of the total edges in the free-living communities at all depths, while they decreased with depth in the particle-associated communities, where they amounted to roughly 10% of the total in the last part of the epipelagic zone. Co-occurrence networks of bacteria with phototrophic micro-eukaryotes were not taxon-specific, and dominated by mutual exclusion (~60%). The data show a high degree of specialization to micro-environments in the water column and highlight the importance of interdependencies particularly between free-living bacteria in the upper layers of the epipelagic zone.
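
    The abstract does not say which association measure or cut-off was used to build the networks. Purely as an illustration of the general idea, the sketch below links two taxa whenever the Spearman rank correlation of their abundances across samples exceeds a threshold in absolute value. The function names, the 0.8 threshold, the placeholder labels OTU_x and OTU_y, and the abundance values are all invented for this example.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def cooccurrence_edges(abundance, threshold=0.8):
    """Return (taxon_a, taxon_b, rho) for pairs with |Spearman rho| >= threshold."""
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        rho = pearson(ranks(abundance[a]), ranks(abundance[b]))
        if abs(rho) >= threshold:
            edges.append((a, b, round(rho, 3)))
    return edges


# Invented abundances across five samples (not data from the study).
abundance = {
    "SAR11": [10, 20, 30, 40, 50],
    "SAR86": [1, 2, 4, 8, 16],
    "OTU_x": [9, 7, 5, 3, 1],
    "OTU_y": [3, 1, 4, 1, 5],
}
edges = cooccurrence_edges(abundance)
negative = [e for e in edges if e[2] < 0]
```

    Positive edges indicate co-occurrence and negative edges mutual exclusion, so the share of negative edges in such a network corresponds to the kind of proportion the abstract reports. A real analysis would also need to handle compositional effects and multiple testing, which this sketch ignores.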

  1. Co-occurrence Analysis of Microbial Taxa in the Atlantic Ocean Reveals High Connectivity in the Free-Living Bacterioplankton

    PubMed Central

    Milici, Mathias; Deng, Zhi-Luo; Tomasch, Jürgen; Decelle, Johan; Wos-Oxley, Melissa L.; Wang, Hui; Jáuregui, Ruy; Plumeier, Iris; Giebel, Helge-Ansgar; Badewien, Thomas H.; Wurst, Mascha; Pieper, Dietmar H.; Simon, Meinhard; Wagner-Döbler, Irene

    2016-01-01

    We determined the taxonomic composition of the bacterioplankton of the epipelagic zone of the Atlantic Ocean along a latitudinal transect (51°S–47°N) using Illumina sequencing of the V5-V6 region of the 16S rRNA gene and inferred co-occurrence networks. Bacterioplankton community composition was distinct for Longhurstian provinces and water depth. Free-living microbial communities (between 0.22 and 3 μm) were dominated by highly abundant and ubiquitous taxa with streamlined genomes (e.g., SAR11, SAR86, OM1, Prochlorococcus) and could clearly be separated from particle-associated communities which were dominated by Bacteroidetes, Planctomycetes, Verrucomicrobia, and Roseobacters. From a total of 369 different communities we then inferred co-occurrence networks for each size fraction and depth layer of the plankton between bacteria and between bacteria and phototrophic micro-eukaryotes. The inferred networks showed a reduction of edges in the deepest layer of the photic zone. Networks composed of free-living bacteria had a larger number of connections per OTU when compared to the particle-associated communities throughout the water column. Negative correlations accounted for roughly one third of the total edges in the free-living communities at all depths, while they decreased with depth in the particle-associated communities where they accounted for roughly 10% of the total in the last part of the epipelagic zone. Co-occurrence networks of bacteria with phototrophic micro-eukaryotes were not taxon-specific, and dominated by mutual exclusion (~60%). The data show a high degree of specialization to micro-environments in the water column and highlight the importance of interdependencies particularly between free-living bacteria in the upper layers of the epipelagic zone. PMID:27199970

  2. Adaptive evolution of chloroplast genome structure inferred using a parametric bootstrap approach

    PubMed Central

    Cui, Liying; Leebens-Mack, Jim; Wang, Li-San; Tang, Jijun; Rymarquis, Linda; Stern, David B; dePamphilis, Claude W

    2006-01-01

    Background: Genome rearrangements influence gene order and configuration of gene clusters in all genomes. Most land plant chloroplast DNAs (cpDNAs) share a highly conserved gene content and, with notable exceptions, a largely co-linear gene order. Conserved gene orders may reflect a slow intrinsic rate of neutral chromosomal rearrangements, or selective constraint. It is unknown to what extent observed changes in gene order are random or adaptive. We investigate the influence of natural selection on gene order in association with increased rate of chromosomal rearrangement. We use a novel parametric bootstrap approach to test if directional selection is responsible for the clustering of functionally related genes observed in the highly rearranged chloroplast genome of the unicellular green alga Chlamydomonas reinhardtii, relative to ancestral chloroplast genomes. Results: Ancestral gene orders were inferred and then subjected to simulated rearrangement events under the random breakage model with varying ratios of inversions and transpositions. We found that adjacent chloroplast genes in C. reinhardtii were located on the same strand much more frequently than in simulated genomes that were generated under a random rearrangement process (increased sidedness; p < 0.0001). In addition, functionally related genes were found to be more clustered than those evolved under random rearrangements (p < 0.0001). We report evidence of co-transcription of neighboring genes, which may be responsible for the observed gene clusters in C. reinhardtii cpDNA. Conclusion: Simulations and experimental evidence suggest that both selective maintenance and directional selection for gene clusters are determinants of chloroplast gene order. PMID:16469102
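
    The bootstrap logic in the abstract above can be illustrated with a toy sketch. It is not the authors' implementation: it assumes genes are encoded as signed integers (the sign is the strand), applies random inversions only (the study also varied the ratio of inversions to transpositions), and all function names are hypothetical.

```python
import random

def sidedness(genome):
    """Count adjacent gene pairs that lie on the same strand (same sign)."""
    return sum(1 for a, b in zip(genome, genome[1:]) if (a > 0) == (b > 0))

def random_inversion(genome, rng):
    """Reverse a random segment and flip the strand of every gene in it."""
    i, j = sorted(rng.sample(range(len(genome) + 1), 2))
    segment = [-g for g in reversed(genome[i:j])]
    return genome[:i] + segment + genome[j:]

def bootstrap_p_value(ancestor, observed_sidedness, n_events, n_replicates, rng):
    """Fraction of simulated genomes at least as 'sided' as the observed one."""
    hits = 0
    for _ in range(n_replicates):
        genome = list(ancestor)
        for _ in range(n_events):
            genome = random_inversion(genome, rng)
        if sidedness(genome) >= observed_sidedness:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_replicates + 1)

# Hypothetical 20-gene ancestor, all genes on the forward strand.
ancestor = list(range(1, 21))
p = bootstrap_p_value(ancestor, observed_sidedness=17, n_events=30,
                      n_replicates=500, rng=random.Random(0))
```

    A small p here would mean that random inversions alone rarely produce as many same-strand neighbours as observed, which is the shape of the argument the abstract makes for C. reinhardtii.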

  3. Comparative genome analyses of Arabidopsis spp.: Inferring chromosomal rearrangement events in the evolutionary history of A. thaliana

    PubMed Central

    Yogeeswaran, Krithika; Frary, Amy; York, Thomas L.; Amenta, Alison; Lesser, Andrew H.; Nasrallah, June B.; Tanksley, Steven D.; Nasrallah, Mikhail E.

    2005-01-01

    Comparative genome analysis is a powerful tool that can facilitate the reconstruction of the evolutionary history of the genomes of modern-day species. The model plant Arabidopsis thaliana with its n = 5 genome is thought to be derived from an ancestral n = 8 genome. Pairwise comparative genome analyses of A. thaliana with polyploid and diploid Brassicaceae species have suggested that rapid genome evolution, manifested by chromosomal rearrangements and duplications, characterizes the polyploid, but not the diploid, lineages of this family. In this study, we constructed a low-density genetic linkage map of Arabidopsis lyrata ssp. lyrata (A. l. lyrata; n = 8, diploid), the closest known relative of A. thaliana (MRCA ∼5 Mya), using A. thaliana-specific markers that resolve into the expected eight linkage groups. We then performed comparative Bayesian analyses using raw mapping data from this study and from a Capsella study to infer the number and nature of rearrangements that distinguish the n = 8 genomes of A. l. lyrata and Capsella from the n = 5 genome of A. thaliana. We conclude that there is strong statistical support in favor of the parsimony scenarios of 10 major chromosomal rearrangements separating these n = 8 genomes from A. thaliana. These chromosomal rearrangement events contribute to a rate of chromosomal evolution higher than previously reported in this lineage. We infer that at least seven of these events, common to both sets of data, are responsible for the change in karyotype and underlie genome reduction in A. thaliana. PMID:15805492

  4. Impact of Sample Type and DNA Isolation Procedure on Genomic Inference of Microbiome Composition

    PubMed Central

    Munk, Patrick; Lukjancenko, Oksana; Priemé, Anders; Aarestrup, Frank M.

    2016-01-01

    ABSTRACT Explorations of complex microbiomes using genomics greatly enhance our understanding about their diversity, biogeography, and function. The isolation of DNA from microbiome specimens is a key prerequisite for such examinations, but challenges remain in obtaining sufficient DNA quantities required for certain sequencing approaches, achieving accurate genomic inference of microbiome composition, and facilitating comparability of findings across specimen types and sequencing projects. These aspects are particularly relevant for the genomics-based global surveillance of infectious agents and antimicrobial resistance from different reservoirs. Here, we compare in a stepwise approach a total of eight commercially available DNA extraction kits and 16 procedures based on these for three specimen types (human feces, pig feces, and hospital sewage). We assess DNA extraction using spike-in controls and different types of beads for bead beating, facilitating cell lysis. We evaluate DNA concentration, purity, and stability and microbial community composition using 16S rRNA gene sequencing and for selected samples using shotgun metagenomic sequencing. Our results suggest that inferred community composition was dependent on inherent specimen properties as well as DNA extraction method. We further show that bead beating or enzymatic treatment can increase the extraction of DNA from Gram-positive bacteria. Final DNA quantities could be increased by isolating DNA from a larger volume of cell lysate than that in standard protocols. Based on this insight, we designed an improved DNA isolation procedure optimized for microbiome genomics that can be used for the three examined specimen types and potentially also for other biological specimens. A standard operating procedure is available from https://dx.doi.org/10.6084/m9.figshare.3475406. 
IMPORTANCE Sequencing-based analyses of microbiomes may lead to a breakthrough in our understanding of the microbial worlds associated with

  5. The Phylogeny and Evolutionary Timescale of Muscoidea (Diptera: Brachycera: Calyptratae) Inferred from Mitochondrial Genomes

    PubMed Central

    Wang, Ning; Cameron, Stephen L.; Mao, Meng; Wang, Yuyu; Xi, Yuqiang; Yang, Ding

    2015-01-01

    Muscoidea is a significant dipteran clade that includes house flies (Family Muscidae), latrine flies (F. Fanniidae), dung flies (F. Scathophagidae) and root maggot flies (F. Anthomyiidae). It comprises approximately 7000 described species. The monophyly of the Muscoidea and the precise relationships of muscoids to the closest superfamily the Oestroidea (blow flies, flesh flies etc.) are both unresolved. Until now mitochondrial (mt) genomes were available for only two of the four muscoid families, precluding a thorough test of phylogenetic relationships using this data source. Here we present the first two mt genomes for the families Fanniidae (Euryomma sp.) and Anthomyiidae (Delia platura (Meigen, 1826)). We also conducted phylogenetic analyses containing these newly sequenced mt genomes plus 15 other species representative of dipteran diversity to address the internal relationship of Muscoidea and its systematic position. Both maximum-likelihood and Bayesian analyses suggested that Muscoidea was not a monophyletic group with the relationship: (Fanniidae + Muscidae) + ((Anthomyiidae + Scathophagidae) + (Calliphoridae + Sarcophagidae)), supported by the majority of analysed datasets. This also implies that Oestroidea was paraphyletic in the majority of analyses. Divergence time estimation suggested that the earliest split within the Calyptratae, separating (Tachinidae + Oestridae) from the remaining families, occurred in the Early Eocene. The main divergence within the paraphyletic Muscoidea grade was between Fanniidae + Muscidae and the lineage ((Anthomyiidae + Scathophagidae) + (Calliphoridae + Sarcophagidae)) which occurred in the Late Eocene. PMID:26225760

  6. The influence of genomic context on mutation patterns in the human genome inferred from rare variants.

    PubMed

    Schaibley, Valerie M; Zawistowski, Matthew; Wegmann, Daniel; Ehm, Margaret G; Nelson, Matthew R; St Jean, Pamela L; Abecasis, Gonçalo R; Novembre, John; Zöllner, Sebastian; Li, Jun Z

    2013-12-01

    Understanding patterns of spontaneous mutations is of fundamental interest in studies of human genome evolution and genetic disease. Here, we used extremely rare variants in humans to model the molecular spectrum of single-nucleotide mutations. Compared to common variants in humans and human-chimpanzee fixed differences (substitutions), rare variants, on average, arose more recently in the human lineage and are less affected by the potentially confounding effects of natural selection, population demographic history, and biased gene conversion. We analyzed variants obtained from a population-based sequencing study of 202 genes in >14,000 individuals. We observed considerable variability in the per-gene mutation rate, which was correlated with local GC content, but not recombination rate. Using >20,000 variants with a derived allele frequency ≤ 10⁻⁴, we examined the effect of local GC content and recombination rate on individual variant subtypes and performed comparisons with common variants and substitutions. The influence of local GC content on rare variants differed from that on common variants or substitutions, and the differences varied by variant subtype. Furthermore, recombination rate and recombination hotspots have little effect on rare variants of any subtype, yet both have a relatively strong impact on multiple variant subtypes in common variants and substitutions. This observation is consistent with the effect of biased gene conversion or selection-dependent processes. Our results highlight the distinct biases inherent in the initial mutation patterns and subsequent evolutionary processes that affect segregating variants.

  7. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations

    PubMed Central

    Zhou, Meng; Cheng, Liang; Yang, Haixiu; Wang, Jing; Sun, Jie; Wang, Zhenzhen

    2016-01-01

    MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, any individual source of genomic data tends to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated on experimentally verified miRNA-disease associations, achieving an area under the ROC curve (AUC) of 0.834 in 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective into molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD. PMID:26849207
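
    For readers unfamiliar with the AUC figure quoted above, the following sketch shows how an area under the ROC curve can be computed from prediction scores and known labels. It is a generic pairwise (Mann-Whitney) formulation, not code from CHNmiRD; the scores and labels in the example are made up for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical scores for four candidate associations; 1 marks a verified one.
example_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

    In a 5-fold cross-validation, a value like this would typically be computed on each held-out fold and then summarized; how CHNmiRD aggregates folds is not stated in the abstract.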

  8. Integration of Multiple Genomic and Phenotype Data to Infer Novel miRNA-Disease Associations.

    PubMed

    Shi, Hongbo; Zhang, Guangde; Zhou, Meng; Cheng, Liang; Yang, Haixiu; Wang, Jing; Sun, Jie; Wang, Zhenzhen

    2016-01-01

    MicroRNAs (miRNAs) play an important role in the development and progression of human diseases. The identification of disease-associated miRNAs will be helpful for understanding the molecular mechanisms of diseases at the post-transcriptional level. Based on different types of genomic data sources, computational methods for miRNA-disease association prediction have been proposed. However, any individual source of genomic data tends to be incomplete and noisy; therefore, the integration of various types of genomic data for inferring reliable miRNA-disease associations is urgently needed. In this study, we present a computational framework, CHNmiRD, for identifying miRNA-disease associations by integrating multiple genomic and phenotype data, including protein-protein interaction data, gene ontology data, experimentally verified miRNA-target relationships, disease phenotype information and known miRNA-disease connections. The performance of CHNmiRD was evaluated on experimentally verified miRNA-disease associations, achieving an area under the ROC curve (AUC) of 0.834 in 5-fold cross-validation. In particular, CHNmiRD displayed excellent performance for diseases without any known related miRNAs. The results of case studies for three human diseases (glioblastoma, myocardial infarction and type 1 diabetes) showed that all of the top 10 ranked miRNAs having no known associations with these three diseases in existing miRNA-disease databases were directly or indirectly confirmed by our latest literature mining. All these results demonstrated the reliability and efficiency of CHNmiRD, and it is anticipated that CHNmiRD will serve as a powerful bioinformatics method for mining novel disease-related miRNAs and providing a new perspective into molecular mechanisms underlying human diseases at the post-transcriptional level. CHNmiRD is freely available at http://www.bio-bigdata.com/CHNmiRD.

  9. Inferring Quantitative Trait Pathways Associated with Bull Fertility from a Genome-Wide Association Study

    PubMed Central

    Peñagaricano, Francisco; Weigel, Kent A.; Rosa, Guilherme J. M.; Khatib, Hasan

    2013-01-01

    Whole-genome association studies typically focus on genetic markers with the strongest evidence of association. However, single markers often explain only a small component of the genetic variance and hence offer a limited understanding of the trait under study. As such, the objective of this study was to perform a pathway-based association analysis in Holstein dairy cattle in order to identify relevant pathways involved in bull fertility. The results of a single-marker association analysis, using 1,755 bulls with sire conception rate data and genotypes for 38,650 single nucleotide polymorphisms (SNPs), were used in this study. A total of 16,819 annotated genes, including 2,767 significantly associated with bull fertility, were used to interrogate a total of 662 Gene Ontology (GO) terms and 248 InterPro (IP) entries using a test of proportions based on the cumulative hypergeometric distribution. After multiple-testing correction, 20 GO categories and one IP entry showed significant overrepresentation of genes statistically associated with bull fertility. Several of these functional categories such as small GTPases mediated signal transduction, neurogenesis, calcium ion binding, and cytoskeleton are known to be involved in biological processes closely related to male fertility. These results could provide insight into the genetic architecture of this complex trait in dairy cattle. In addition, this study shows that quantitative trait pathways inferred from single-marker analyses could enhance our interpretations of the results of genome-wide association studies. PMID:23335935
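
    The overrepresentation test described above, a test of proportions based on the cumulative hypergeometric distribution, can be sketched as follows. The totals (16,819 annotated genes, 2,767 significant) come from the abstract; the term size and hit count in the example are hypothetical, and the function name is illustrative only.

```python
from math import comb

def hypergeom_upper_tail(total, significant, term_size, term_hits):
    """P(X >= term_hits) for X ~ Hypergeometric(total, significant, term_size).

    total       : all annotated genes
    significant : genes flagged as associated with the trait
    term_size   : genes annotated to the functional term being tested
    term_hits   : significant genes observed within that term
    """
    upper = min(term_size, significant)
    numerator = sum(
        comb(significant, i) * comb(total - significant, term_size - i)
        for i in range(term_hits, upper + 1)
    )
    return numerator / comb(total, term_size)

# Hypothetical GO term with 100 genes, 30 of them significant.
p_value = hypergeom_upper_tail(16819, 2767, 100, 30)
```

    Under the null, about 100 × 2767 / 16819 ≈ 16 significant genes would be expected in such a term, so 30 hits would yield a small p-value. With 662 GO terms and 248 InterPro entries tested, a multiple-testing correction is still required, as the abstract notes.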

  10. ABC inference of multi-population divergence with admixture from unphased population genomic data

    PubMed Central

    Robinson, John D; Bunnefeld, Lynsey; Hearn, Jack; Stone, Graham N; Hickerson, Michael J

    2014-01-01

    Rapidly developing sequencing technologies and declining costs have made it possible to collect genome-scale data from population-level samples in nonmodel systems. Inferential tools for historical demography given these data sets are, at present, underdeveloped. In particular, approximate Bayesian computation (ABC) has yet to be widely embraced by researchers generating these data. Here, we demonstrate the promise of ABC for analysis of the large data sets that are now attainable from nonmodel taxa through current genomic sequencing technologies. We develop and test an ABC framework for model selection and parameter estimation, given histories of three-population divergence with admixture. We then explore different sampling regimes to illustrate how sampling more loci, longer loci or more individuals affects the quality of model selection and parameter estimation in this ABC framework. Our results show that inferences improved substantially with increases in the number and/or length of sequenced loci, while less benefit was gained by sampling large numbers of individuals. Optimal sampling strategies given our inferential models included at least 2000 loci, each approximately 2 kb in length, sampled from five diploid individuals per population, although specific strategies are model and question dependent. We tested our ABC approach through simulation-based cross-validations and illustrate its application using previously analysed data from the oak gall wasp, Biorhiza pallida. PMID:25113024
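
    As a reminder of what approximate Bayesian computation does, here is a minimal rejection-sampling sketch. It uses a toy one-parameter model rather than the three-population divergence-with-admixture histories studied in the paper; the prior, the simulator, the tolerance and all names are assumptions chosen for illustration.

```python
import random
import statistics

def abc_rejection(observed_stat, simulate, prior_sample, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary statistic is close to the data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        if abs(simulate(theta, rng) - observed_stat) <= tolerance:
            accepted.append(theta)
    return accepted

def prior_sample(rng):
    """Toy prior: uniform on [0, 10]."""
    return rng.uniform(0.0, 10.0)

def simulate(theta, rng, n_loci=20):
    """Toy simulator: mean of per-locus values scattered around theta."""
    return statistics.fmean([rng.gauss(theta, 1.0) for _ in range(n_loci)])

rng = random.Random(1)
posterior = abc_rejection(observed_stat=4.0, simulate=simulate,
                          prior_sample=prior_sample, n_draws=5000,
                          tolerance=0.1, rng=rng)
posterior_mean = statistics.fmean(posterior)
```

    Raising `n_loci` in this toy simulator narrows the spread of the summary statistic and therefore of the accepted sample, which loosely mirrors the abstract's finding that more or longer loci improved inference.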

  11. ABC inference of multi-population divergence with admixture from unphased population genomic data.

    PubMed

    Robinson, John D; Bunnefeld, Lynsey; Hearn, Jack; Stone, Graham N; Hickerson, Michael J

    2014-09-01

    Rapidly developing sequencing technologies and declining costs have made it possible to collect genome-scale data from population-level samples in nonmodel systems. Inferential tools for historical demography given these data sets are, at present, underdeveloped. In particular, approximate Bayesian computation (ABC) has yet to be widely embraced by researchers generating these data. Here, we demonstrate the promise of ABC for analysis of the large data sets that are now attainable from nonmodel taxa through current genomic sequencing technologies. We develop and test an ABC framework for model selection and parameter estimation, given histories of three-population divergence with admixture. We then explore different sampling regimes to illustrate how sampling more loci, longer loci or more individuals affects the quality of model selection and parameter estimation in this ABC framework. Our results show that inferences improved substantially with increases in the number and/or length of sequenced loci, while less benefit was gained by sampling large numbers of individuals. Optimal sampling strategies given our inferential models included at least 2000 loci, each approximately 2 kb in length, sampled from five diploid individuals per population, although specific strategies are model and question dependent. We tested our ABC approach through simulation-based cross-validations and illustrate its application using previously analysed data from the oak gall wasp, Biorhiza pallida.

  12. Species Delimitation and Interspecific Relationships of the Genus Orychophragmus (Brassicaceae) Inferred from Whole Chloroplast Genomes

    PubMed Central

    Hu, Huan; Hu, Quanjun; Al-Shehbaz, Ihsan A.; Luo, Xin; Zeng, Tingting; Guo, Xinyi; Liu, Jianquan

    2016-01-01

    Genetic variation from a few chloroplast DNA fragments shows lower discriminatory power in the delimitation of closely related species and less resolving power for interspecific relationships than variation from nrITS. Here we use Orychophragmus (Brassicaceae) as a model system to test the hypothesis that whole chloroplast genomes (plastomes), which accumulate more variation despite their slow evolution, can overcome these weaknesses. We used Illumina sequencing technology via a reference-guided assembly to construct complete plastomes of 17 individuals from six putative species in the genus. All plastomes are highly conserved in genome structure, gene order, and orientation, and they are around 153 kb in length and contain 113 unique genes. However, nucleotide variation is substantial enough to support the delimitation of all sampled species and to resolve interspecific relationships with high statistical support. As expected, the estimated divergences between major clades and species are lower than those estimated from nrITS, probably due to the slow substitution rate of the plastomes. However, the plastome and nrITS phylogenies were contradictory in the placements of most species, thus suggesting that these species may have experienced complex non-bifurcating evolution with incomplete lineage sorting and/or hybrid introgression. Overall, our case study highlights the importance of using plastomes to examine species boundaries and establish an independent phylogeny to infer the speciation history of plants. PMID:27999584

  13. Orthology Inference in Nonmodel Organisms Using Transcriptomes and Low-Coverage Genomes: Improving Accuracy and Matrix Occupancy for Phylogenomics

    PubMed Central

    Yang, Ya; Smith, Stephen A.

    2014-01-01

    Orthology inference is central to phylogenomic analyses. Phylogenomic data sets commonly include transcriptomes and low-coverage genomes that are incomplete and contain errors and isoforms. These properties can severely violate the underlying assumptions of orthology inference with existing heuristics. We present a procedure that uses phylogenies for both homology and orthology assignment. The procedure first uses similarity scores to infer putative homologs that are then aligned, constructed into phylogenies, and pruned of spurious branches caused by deep paralogs, misassembly, frameshifts, or recombination. These final homologs are then used to identify orthologs. We explore four alternative tree-based orthology inference approaches, of which two are new. These accommodate gene and genome duplications as well as gene tree discordance. We demonstrate these methods in three published data sets including the grape family, Hymenoptera, and millipedes with divergence times ranging from approximately 100 to over 400 Ma. The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. To explicitly evaluate sources of conflicting phylogenetic signals, we applied serial jackknife analyses of gene regions keeping each locus intact. The methods described here can scale to over 100 taxa. They have been implemented in python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines. All scripts are available from https://bitbucket.org/yangya/phylogenomic_dataset_construction. PMID:25158799

  14. PICARA, an analytical pipeline providing probabilistic inference about a priori candidates genes underlying genome-wide association QTL in plants

    Technology Transfer Automated Retrieval System (TEKTRAN)

    PICARA is an analytical pipeline designed to systematically summarize observed SNP/trait associations identified by genome wide association studies (GWAS) and to identify candidate genes involved in the regulation of complex trait variation. The pipeline provides probabilistic inference about a prio...

  15. Genomic analysis of circulating cell-free DNA infers breast cancer dormancy

    PubMed Central

    Shaw, Jacqueline A.; Page, Karen; Blighe, Kevin; Hava, Natasha; Guttery, David; Ward, Becky; Brown, James; Ruangpratheep, Chetana; Stebbing, Justin; Payne, Rachel; Palmieri, Carlo; Cleator, Suzy; Walker, Rosemary A.; Coombes, R. Charles

    2012-01-01

    Biomarkers in breast cancer to monitor minimal residual disease have remained elusive. We hypothesized that genomic analysis of circulating free DNA (cfDNA) isolated from plasma may form the basis for a means of detecting and monitoring breast cancer. We profiled 251 genomes using Affymetrix SNP 6.0 arrays to determine copy number variations (CNVs) and loss of heterozygosity (LOH), comparing 138 cfDNA samples with matched primary tumor and normal leukocyte DNA in 65 breast cancer patients and eight healthy female controls. Concordance of SNP genotype calls in paired cfDNA and leukocyte DNA samples distinguished between breast cancer patients and healthy female controls (P < 0.0001) and between preoperative patients and patients on follow-up who had surgery and treatment (P = 0.0016). Principal component analyses of cfDNA SNP/copy number results also separated presurgical breast cancer patients from the healthy controls, suggesting specific CNVs in cfDNA have clinical significance. We identified focal high-level DNA amplification in paired tumor and cfDNA clustered in a number of chromosome arms, some of which harbor genes with oncogenic potential, including USP17L2 (DUB3), BRF1, MTA1, and JAG2. Remarkably, in 50 patients on follow-up, specific CNVs were detected in cfDNA, mirroring the primary tumor, up to 12 yr after diagnosis despite no other evidence of disease. These data demonstrate the potential of SNP/CNV analysis of cfDNA to distinguish between patients with breast cancer and healthy controls during routine follow-up. The genomic profiles of cfDNA infer dormancy/minimal residual disease in the majority of patients on follow-up. PMID:21990379

  16. Genomic analysis of circulating cell-free DNA infers breast cancer dormancy.

    PubMed

    Shaw, Jacqueline A; Page, Karen; Blighe, Kevin; Hava, Natasha; Guttery, David; Ward, Becky; Brown, James; Ruangpratheep, Chetana; Stebbing, Justin; Payne, Rachel; Palmieri, Carlo; Cleator, Suzy; Walker, Rosemary A; Coombes, R Charles

    2012-02-01

    Biomarkers in breast cancer to monitor minimal residual disease have remained elusive. We hypothesized that genomic analysis of circulating free DNA (cfDNA) isolated from plasma may form the basis for a means of detecting and monitoring breast cancer. We profiled 251 genomes using Affymetrix SNP 6.0 arrays to determine copy number variations (CNVs) and loss of heterozygosity (LOH), comparing 138 cfDNA samples with matched primary tumor and normal leukocyte DNA in 65 breast cancer patients and eight healthy female controls. Concordance of SNP genotype calls in paired cfDNA and leukocyte DNA samples distinguished between breast cancer patients and healthy female controls (P < 0.0001) and between preoperative patients and patients on follow-up who had surgery and treatment (P = 0.0016). Principal component analyses of cfDNA SNP/copy number results also separated presurgical breast cancer patients from the healthy controls, suggesting specific CNVs in cfDNA have clinical significance. We identified focal high-level DNA amplification in paired tumor and cfDNA clustered in a number of chromosome arms, some of which harbor genes with oncogenic potential, including USP17L2 (DUB3), BRF1, MTA1, and JAG2. Remarkably, in 50 patients on follow-up, specific CNVs were detected in cfDNA, mirroring the primary tumor, up to 12 yr after diagnosis despite no other evidence of disease. These data demonstrate the potential of SNP/CNV analysis of cfDNA to distinguish between patients with breast cancer and healthy controls during routine follow-up. The genomic profiles of cfDNA infer dormancy/minimal residual disease in the majority of patients on follow-up.

  17. Covariance Between Genotypic Effects and its Use for Genomic Inference in Half-Sib Families

    PubMed Central

    Wittenburg, Dörte; Teuscher, Friedrich; Klosa, Jan; Reinsch, Norbert

    2016-01-01

    In livestock, current statistical approaches utilize extensive molecular data, e.g., single nucleotide polymorphisms (SNPs), to improve the genetic evaluation of individuals. The number of model parameters increases with the number of SNPs, so the multicollinearity between covariates can affect the results obtained using whole genome regression methods. In this study, dependencies between SNPs due to linkage and linkage disequilibrium among the chromosome segments were explicitly considered in methods used to estimate the effects of SNPs. The population structure affects the extent of such dependencies, so the covariance among SNP genotypes was derived for half-sib families, which are typical in livestock populations. Conditional on the SNP haplotypes of the common parent (sire), the theoretical covariance was determined using the haplotype frequencies of the population from which the individual parent (dam) was derived. The resulting covariance matrix was included in a statistical model for a trait of interest, and this covariance matrix was then used to specify prior assumptions for SNP effects in a Bayesian framework. The approach was applied to one family in simulated scenarios (few and many quantitative trait loci) and using semireal data obtained from dairy cattle to identify genome segments that affect performance traits, as well as to investigate the impact on predictive ability. Compared with a method that does not explicitly consider any of the relationship among predictor variables, the accuracy of genetic value prediction was improved by 10–22%. The results show that the inclusion of dependence is particularly important for genomic inference based on small sample sizes. PMID:27402363

  18. Covariance Between Genotypic Effects and its Use for Genomic Inference in Half-Sib Families.

    PubMed

    Wittenburg, Dörte; Teuscher, Friedrich; Klosa, Jan; Reinsch, Norbert

    2016-09-08

    In livestock, current statistical approaches utilize extensive molecular data, e.g., single nucleotide polymorphisms (SNPs), to improve the genetic evaluation of individuals. The number of model parameters increases with the number of SNPs, so the multicollinearity between covariates can affect the results obtained using whole genome regression methods. In this study, dependencies between SNPs due to linkage and linkage disequilibrium among the chromosome segments were explicitly considered in methods used to estimate the effects of SNPs. The population structure affects the extent of such dependencies, so the covariance among SNP genotypes was derived for half-sib families, which are typical in livestock populations. Conditional on the SNP haplotypes of the common parent (sire), the theoretical covariance was determined using the haplotype frequencies of the population from which the individual parent (dam) was derived. The resulting covariance matrix was included in a statistical model for a trait of interest, and this covariance matrix was then used to specify prior assumptions for SNP effects in a Bayesian framework. The approach was applied to one family in simulated scenarios (few and many quantitative trait loci) and using semireal data obtained from dairy cattle to identify genome segments that affect performance traits, as well as to investigate the impact on predictive ability. Compared with a method that does not explicitly consider any of the relationship among predictor variables, the accuracy of genetic value prediction was improved by 10-22%. The results show that the inclusion of dependence is particularly important for genomic inference based on small sample sizes.

  19. The evolutionary history of termites as inferred from 66 mitochondrial genomes.

    PubMed

    Bourguignon, Thomas; Lo, Nathan; Cameron, Stephen L; Šobotník, Jan; Hayashi, Yoshinobu; Shigenobu, Shuji; Watanabe, Dai; Roisin, Yves; Miura, Toru; Evans, Theodore A

    2015-02-01

    Termites have colonized many habitats and are among the most abundant animals in tropical ecosystems, which they modify considerably through their actions. The timing of their rise in abundance and of the dispersal events that gave rise to modern termite lineages is not well understood. To shed light on termite origins and diversification, we sequenced the mitochondrial genome of 48 termite species and combined them with 18 previously sequenced termite mitochondrial genomes for phylogenetic and molecular clock analyses using multiple fossil calibrations. The 66 genomes represent most major clades of termites. Unlike previous phylogenetic studies based on fewer molecular data, our phylogenetic tree is fully resolved for the lower termites. The phylogenetic positions of Macrotermitinae and Apicotermitinae are also resolved as the basal groups in the higher termites, but in the crown termitid groups, including Termitinae + Syntermitinae + Nasutitermitinae + Cubitermitinae, the position of some nodes remains uncertain. Our molecular clock tree indicates that the lineages leading to termites and Cryptocercus roaches diverged 170 Ma (153-196 Ma 95% confidence interval [CI]), that modern Termitidae arose 54 Ma (46-66 Ma 95% CI), and that the crown termitid group arose 40 Ma (35-49 Ma 95% CI). This indicates that the distribution of basal termite clades was influenced by the final stages of the breakup of Pangaea. Our inference of ancestral geographic ranges shows that the Termitidae, which includes more than 75% of extant termite species, most likely originated in Africa or Asia, and acquired their pantropical distribution after a series of dispersal and subsequent diversification events.

  20. Photoheterotrophy of bacterioplankton is ubiquitous in the surface oligotrophic ocean

    NASA Astrophysics Data System (ADS)

    Evans, Claire; Gómez-Pereira, Paola R.; Martin, Adrian P.; Scanlan, David J.; Zubkov, Mikhail V.

    2015-06-01

    Accurate measurements in the Southern Hemisphere were obtained to test a hypothesis of the ubiquity of photoheterotrophy in the oligotrophic ocean. We present experimental results of light-enhanced uptake of methionine, leucine and ATP by bacterioplankton during two large-scale transects of the South Atlantic. Light increased the uptake of substrates by both dominant bacterioplankton groups, Prochlorococcus and SAR11, as well as for the bulk microbial community. Our consistent experimental evidence strongly indicates that photoheterotrophy is characteristic of dominant bacterioplankton populations in the global oligotrophic ocean.

  1. Effect of sampling on the extent and accuracy of the inferred genetic history of recombining genome.

    PubMed

    Platt, Daniel E; Utro, Filippo; Parida, Laxmi

    2014-06-01

    Accessible biotechnology is enabling the cataloging of genetic variants in individuals in populations at unprecedented scales. The use of phylogeny of the individuals within populations allows a model-based approach to studying these variations, which is important in understanding relationships between and across populations. For the somatic genome, however, the phylogeny must take recombinations (and other genetic mixing events) into account. Hence the resulting topology is more complex than a tree. Unlike a tree topology, it is not as apparent which events are visible from the extant samples. An earlier work presented a mathematical model (called the minimal descriptor) for teasing apart the inherent visible information from that which any specific algorithm might see. We use this framework to study the effect of sampling sizes on the overall inferred genetic history. In this paper, we seek to understand the extent, characteristics (in terms of recent versus ancient genetic events) and reliability of what was resolvable within field samples drawn from modern populations. We observed that most of the visible ancient events are recoverable from relatively small sample sizes. However, without identification of this relatively small minority of ancient genetic events, most of the signal will appear to reflect modern events and admixtures. We also found that the more ancient events are likely to be reproduced with higher fidelity between multiple samplings, and that the identified older events are less likely to yield false positive discrimination between populations. We conclude that a recombinant phylogenetic reconstruction is necessary to identify which markers are most likely to discriminate ancient events, and to discriminate between populations with lower risk of false positives. Secondly, on a broader note, this study also provides a general methodology for a critical assessment of the inferred common genetic history of populations (say, in plant cultivars or

  2. Comparative analysis of mitochondrial genomes in Diplura (Hexapoda, Arthropoda): taxon sampling is crucial for phylogenetic inferences.

    PubMed

    Chen, Wan-Jun; Koch, Markus; Mallatt, Jon M; Luan, Yun-Xia

    2014-01-01

    Two-pronged bristletails (Diplura) are traditionally classified into three major superfamilies: Campodeoidea, Projapygoidea, and Japygoidea. The interrelationships of these three superfamilies and the monophyly of Diplura have been much debated. Few previous studies included Projapygoidea in their phylogenetic considerations, and its position within Diplura still is a puzzle from both morphological and molecular points of view. Until now, no mitochondrial genome has been sequenced for any projapygoid species. To fill in this gap, we determined and annotated the complete mitochondrial genome of Octostigma sinensis (Octostigmatidae, Projapygoidea), and of three more dipluran species, one each from the Campodeidae, Parajapygidae, and Japygidae. All four newly sequenced dipluran mtDNAs encode the same set of genes in the same gene order as shared by most crustaceans and hexapods. Secondary structure truncations have occurred in trnR, trnC, trnS1, and trnS2, and the reduction of transfer RNA D-arms was found to be taxonomically correlated, with Campodeoidea having experienced the most reduction. Partitioned phylogenetic analyses, based on both amino acids and nucleotides of the protein-coding genes plus the ribosomal RNA genes, retrieve significant support for a monophyletic Diplura within Pancrustacea, with Projapygoidea more closely related to Campodeoidea than to Japygoidea. Another key finding is that monophyly of Diplura cannot be recovered unless Projapygoidea is included in the phylogenetic analyses; this explains the dipluran polyphyly found by past mitogenomic studies. Including Projapygoidea increased the sampling density within Diplura and probably helped by breaking up a long-branch-attraction artifact. This finding provides an example of how proper sampling is significant for phylogenetic inference.

  3. Genomic inference accurately predicts the timing and severity of a recent bottleneck in a non-model insect population

    PubMed Central

    McCoy, Rajiv C.; Garud, Nandita R.; Kelley, Joanna L.; Boggs, Carol L.; Petrov, Dmitri A.

    2015-01-01

    The analysis of molecular data from natural populations has allowed researchers to answer diverse ecological questions that were previously intractable. In particular, ecologists are often interested in the demographic history of populations, information that is rarely available from historical records. Methods have been developed to infer demographic parameters from genomic data, but it is not well understood how inferred parameters compare to true population history or depend on aspects of experimental design. Here we present and evaluate a method of SNP discovery using RNA-sequencing and demographic inference using the program δaδi, which uses a diffusion approximation to the allele frequency spectrum to fit demographic models. We test these methods in a population of the checkerspot butterfly Euphydryas gillettii. This population was intentionally introduced to Gothic, Colorado in 1977 and has since experienced extreme fluctuations including bottlenecks of fewer than 25 adults, as documented by nearly annual field surveys. Using RNA-sequencing of eight individuals from Colorado and eight individuals from a native population in Wyoming, we generate the first genomic resources for this system. While demographic inference is commonly used to examine ancient demography, our study demonstrates that our inexpensive, all-in-one approach to marker discovery and genotyping provides sufficient data to accurately infer the timing of a recent bottleneck. This demographic scenario is relevant for many species of conservation concern, few of which have sequenced genomes. Our results are remarkably insensitive to sample size or number of genomic markers, which has important implications for applying this method to other non-model systems. PMID:24237665
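    The allele frequency spectrum that the demographic models are fitted to can be illustrated with a short sketch. The code below only tabulates an unfolded spectrum from diploid genotype counts; it does not call the δaδi program or reproduce the study's pipeline. The function name and the toy genotype matrix are hypothetical.

```python
from collections import Counter


def allele_frequency_spectrum(genotypes):
    """Tabulate an unfolded allele frequency spectrum.

    genotypes: list of sites; each site is a list with one entry per diploid
    individual giving the number of alternate-allele copies (0, 1 or 2).
    Returns a list whose k-th entry is the number of sites at which the
    alternate allele occurs in exactly k of the 2n sampled chromosomes.
    """
    n_chromosomes = 2 * len(genotypes[0])
    counts = Counter(sum(site) for site in genotypes)
    return [counts.get(k, 0) for k in range(n_chromosomes + 1)]


# Toy data: 5 sites genotyped in 4 individuals (not real data).
sites = [
    [0, 1, 0, 0],  # alternate allele seen once
    [1, 1, 0, 0],  # seen twice
    [0, 0, 0, 1],  # seen once
    [2, 2, 1, 0],  # seen five times
    [0, 0, 0, 0],  # monomorphic site
]

print(allele_frequency_spectrum(sites))  # [1, 2, 1, 0, 0, 1, 0, 0, 0]
```

    A recent bottleneck changes the shape of this spectrum, for example the proportion of rare variants, which is the signal a model-fitting program can exploit.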

  4. epiG: statistical inference and profiling of DNA methylation from whole-genome bisulfite sequencing data.

    PubMed

    Vincent, Martin; Mundbjerg, Kamilla; Skou Pedersen, Jakob; Liang, Gangning; Jones, Peter A; Ørntoft, Torben Falck; Dalsgaard Sørensen, Karina; Wiuf, Carsten

    2017-02-21

    The study of epigenetic heterogeneity at the level of individual cells and in whole populations is the key to understanding cellular differentiation, organismal development, and the evolution of cancer. We develop a statistical method, epiG, to infer and differentiate between different epi-allelic haplotypes, annotated with CpG methylation status and DNA polymorphisms, from whole-genome bisulfite sequencing data, and nucleosome occupancy from NOMe-seq data. We demonstrate the capabilities of the method by inferring allele-specific methylation and nucleosome occupancy in cell lines, and colon and tumor samples, and by benchmarking the method against independent experimental data.

  5. BACTERIOPLANKTON DYNAMICS IN A SUBTROPICAL ESTUARY: EVIDENCE FOR SUBSTRATE LIMITATION

    EPA Science Inventory

    Bacterioplankton abundance and metabolic characteristics were measured along a transect in Pensacola Bay, Florida, USA, to examine the factors that control microbial water column processes in this subtropical estuary. The microbial measures included 3H-L-leucine incorporation, e...

  6. Diazotrophic bacterioplankton in a coral reef lagoon: phylogeny, diel nitrogenase expression and response to phosphate enrichment.

    PubMed

    Hewson, Ian; Moisander, Pia H; Morrison, Amanda E; Zehr, Jonathan P

    2007-05-01

    We investigated diazotrophic bacterioplankton assemblage composition in the Heron Reef lagoon (Great Barrier Reef, Australia) using culture-independent techniques targeting the nifH fragment of the nitrogenase gene. Seawater was collected at 3 h intervals over a period of 72 h (i.e. over diel as well as tidal cycles). An incubation experiment was also conducted to assess the impact of phosphate (PO₄³⁻) availability on nifH expression patterns. DNA-based nifH libraries contained primarily sequences that were most similar to nifH from sediment, microbial mat and surface-associated microorganisms, with a few sequences that clustered with typical open ocean phylotypes. In contrast to genomic DNA sequences, libraries prepared from gene transcripts (mRNA amplified by reverse transcription-polymerase chain reaction) were entirely cyanobacterial and contained phylotypes similar to those observed in open ocean plankton. The abundance of Trichodesmium and two uncultured cyanobacterial phylotypes from previous studies (group A and group B) were studied by quantitative-polymerase chain reaction in the lagoon samples. These were detected as transcripts, but were not detected in genomic DNA. The gene transcript abundance of these phylotypes demonstrated variability over several diel cycles. The PO₄³⁻ enrichment experiment had a clearer pattern of gene expression over diel cycles than the lagoon sampling, however PO₄³⁻ additions did not result in enhanced transcript abundance relative to control incubations. The results suggest that a number of diazotrophs in bacterioplankton of the reef lagoon may originate from sediment, coral or beachrock surfaces, sloughing into plankton with the flooding tide. The presence of typical open ocean phylotype transcripts in lagoon bacterioplankton may indicate that they are an important component of the N cycle of the coral reef.

  7. Genome Size Variation and Species Relationships in Hieracium Sub-genus Pilosella (Asteraceae) as Inferred by Flow Cytometry

    PubMed Central

    Suda, Jan; Krahulcová, Anna; Trávníček, Pavel; Rosenbaumová, Radka; Peckert, Tomáš; Krahulec, František

    2007-01-01

    Background and Aims Hieracium sub-genus Pilosella (hawkweeds) is a taxonomically complicated group of vascular plants, the structure of which is substantially influenced by frequent interspecific hybridization and polyploidization. Two kinds of species, ‘basic’ and ‘intermediate’ (i.e. hybridogenous), are usually recognized. In this study, genome size variation was investigated in a representative set of Central European hawkweeds in order to assess the value of such a data set for species delineation and inference of evolutionary relationships. Methods Holoploid and monoploid genome sizes (C- and Cx-values) were determined using propidium iodide flow cytometry for 376 homogeneously cultivated individuals of Hieracium sub-genus Pilosella, including 24 species (271 individuals), five recent natural hybrids (seven individuals) and experimental F1 hybrids from four parental combinations (98 individuals). Chromosome counts were available for more than half of the plant accessions. Base composition (proportion of AT/GC bases) was cytometrically estimated in 73 individuals. Key Results Seven different ploidy levels (2x–8x) were detected, with intraspecific ploidy polymorphism (up to four different cytotypes) occurring in 11 wild species. Mean 2C-values varied approx. 4·3-fold from 3·53 pg in diploid H. hoppeanum to 15·30 pg in octoploid H. brachiatum. 1Cx-values ranged from 1·72 pg in H. pilosella to 2·16 pg in H. echioides (1·26-fold). The DNA content of (high) polyploids was usually proportional to the DNA values of their diploid/low polyploid counterparts, indicating lack of processes altering genome size (i.e. genome down-sizing). Most species showed constant nuclear DNA amounts, exceptions being three hybridogenous taxa, in which introgressive hybridization was suggested as a presumable trigger for genome size variation. Monoploid genome sizes of hybridogenous species were always between the corresponding values of their putative parents. In addition
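    The fold ranges quoted in the Key Results can be checked with simple arithmetic. The sketch below uses only the values given in the abstract and the standard definition of monoploid genome size (1Cx = 2C-value divided by ploidy level); the function name is hypothetical.

```python
def monoploid_genome_size(two_c_pg, ploidy):
    """Monoploid genome size (1Cx, in pg): 2C-value divided by ploidy level."""
    return two_c_pg / ploidy


# Mean 2C-values quoted in the abstract (pg).
two_c_hoppeanum = 3.53    # diploid H. hoppeanum (2x)
two_c_brachiatum = 15.30  # octoploid H. brachiatum (8x)

# Fold variation in 2C-values: matches the reported "approx. 4.3-fold".
print(round(two_c_brachiatum / two_c_hoppeanum, 2))  # 4.33

# Fold variation in 1Cx-values (1.72 pg to 2.16 pg): the reported 1.26-fold.
print(round(2.16 / 1.72, 2))  # 1.26

# 1Cx for the two extreme 2C-values; both fall inside the 1.72-2.16 pg range.
print(round(monoploid_genome_size(two_c_hoppeanum, 2), 3))   # 1.765
print(round(monoploid_genome_size(two_c_brachiatum, 8), 4))  # 1.9125
```

    The much narrower spread of 1Cx-values compared with 2C-values is what one expects when polyploid DNA content is roughly proportional to that of the lower-ploidy counterparts.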

  8. Genome Alignment Spanning Major Poaceae Lineages Reveals Heterogeneous Evolutionary Rates and Alters Inferred Dates for Key Evolutionary Events.

    PubMed

    Wang, Xiyin; Wang, Jingpeng; Jin, Dianchuan; Guo, Hui; Lee, Tae-Ho; Liu, Tao; Paterson, Andrew H

    2015-06-01

    Multiple comparisons among genomes can clarify their evolution, speciation, and functional innovations. To date, the genome sequences of eight grasses representing the most economically important Poaceae (grass) clades have been published, and their genomic-level comparison is an essential foundation for evolutionary, functional, and translational research. Using a formal and conservative approach, we aligned these genomes. Direct comparison of paralogous gene pairs all duplicated simultaneously reveals striking variation in evolutionary rates among whole genomes, with nucleotide substitution slowest in rice and up to 48% faster in other grasses, adding a new dimension to the value of rice as a grass model. We reconstructed ancestral genome contents for major evolutionary nodes, potentially contributing to understanding the divergence and speciation of grasses. Recent fossil evidence suggests revisions of the estimated dates of key evolutionary events, implying that the pan-grass polyploidization occurred ∼96 million years ago and could not be related to the Cretaceous-Tertiary mass extinction as previously inferred. Adjusted dating to reflect both updated fossil evidence and lineage-specific evolutionary rates suggested that maize subgenome divergence and maize-sorghum divergence were virtually simultaneous, a coincidence that would be explained if polyploidization directly contributed to speciation. This work lays a solid foundation for Poaceae translational genomics.

  9. Biogeography of bacterioplankton in the tropical seawaters of Singapore.

    PubMed

    Lau, Stanley C K; Zhang, Rui; Brodie, Eoin L; Piceno, Yvette M; Andersen, Gary; Liu, Wen-Tso

    2013-05-01

    Knowledge about the biogeography of marine bacterioplankton on the global scale in general and in Southeast Asia in particular has been scarce. This study investigated the biogeography of the bacterioplankton community in Singapore seawaters. Twelve stations around Singapore island were sampled on different schedules over 1 year. Using PCR-DNA fingerprinting, DNA cloning and sequencing, and microarray hybridization of the 16S rRNA genes, we observed clear spatial variations of bacterioplankton diversity within the small area of the Singapore seas. Water samples collected from the Singapore Strait (south) throughout the year were dominated by DNA sequences affiliated with Cyanobacteria and Alphaproteobacteria that were believed to be associated with the influx of water from the open seas in Southeast Asia. In contrast, water samples from the relatively polluted Johor Strait (north) were dominated by Betaproteobacteria, Gammaproteobacteria, and Bacteroidetes, which were presumably associated with river discharge and the relatively eutrophic conditions of the waterway. Bacterioplankton diversity was temporally stable, except for the episodic surge of Pseudoalteromonas, associated with algal blooms. Overall, these results provide valuable insights into the diversity of bacterioplankton communities in Singapore seas and the possible influences of hydrological conditions and anthropogenic activities on the dynamics of the communities.

  10. A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes

    PubMed Central

    Csuros, Miklos; Rogozin, Igor B.; Koonin, Eugene V.

    2011-01-01

    Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing. PMID:21935348

  11. A detailed history of intron-rich eukaryotic ancestors inferred from a global survey of 100 complete genomes.

    PubMed

    Csuros, Miklos; Rogozin, Igor B; Koonin, Eugene V

    2011-09-01

    Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6-7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing.

  12. Demographic Divergence History of Pied Flycatcher and Collared Flycatcher Inferred from Whole-Genome Re-sequencing Data

    PubMed Central

    Nadachowska-Brzyska, Krystyna; Burri, Reto; Olason, Pall I.; Kawakami, Takeshi; Smeds, Linnéa; Ellegren, Hans

    2013-01-01

    Profound knowledge of demographic history is a prerequisite for the understanding and inference of processes involved in the evolution of population differentiation and speciation. Together with new coalescent-based methods, the recent availability of genome-wide data enables investigation of differentiation and divergence processes at unprecedented depth. We combined two powerful approaches, full Approximate Bayesian Computation analysis (ABC) and pairwise sequentially Markovian coalescent modeling (PSMC), to reconstruct the demographic history of the split between two avian speciation model species, the pied flycatcher and collared flycatcher. Using whole-genome re-sequencing data from 20 individuals, we investigated 15 demographic models including different levels and patterns of gene flow, and changes in effective population size over time. ABC provided high support for recent (mode 0.3 my, range <0.7 my) species divergence, declines in effective population size of both species since their initial divergence, and unidirectional recent gene flow from pied flycatcher into collared flycatcher. The estimated divergence time and population size changes, supported by PSMC results, suggest that the ancestral species persisted through one of the glacial periods of middle Pleistocene and then split into two large populations that first increased in size before going through severe bottlenecks and expanding into their current ranges. Secondary contact appears to have been established after the last glacial maximum. The severity of the bottlenecks at the last glacial maximum is indicated by the discrepancy between current effective population sizes (20,000–80,000) and census sizes (5–50 million birds) of the two species. The recent divergence time challenges the supposition that avian speciation is a relatively slow process with extended times for intrinsic postzygotic reproductive barriers to evolve. Our study emphasizes the importance of using genome-wide data to

  13. Inferring Selective Constraint from Population Genomic Data Suggests Recent Regulatory Turnover in the Human Brain

    PubMed Central

    Schrider, Daniel R.; Kern, Andrew D.

    2015-01-01

    The comparative genomics revolution of the past decade has enabled the discovery of functional elements in the human genome via sequence comparison. While that is so, an important class of elements, those specific to humans, is entirely missed by searching for sequence conservation across species. Here we present an analysis based on variation data among human genomes that utilizes a supervised machine learning approach for the identification of human-specific purifying selection in the genome. Using only allele frequency information from the complete low-coverage 1000 Genomes Project data set in conjunction with a support vector machine trained from known functional and nonfunctional portions of the genome, we are able to accurately identify portions of the genome constrained by purifying selection. Our method identifies previously known human-specific gains or losses of function and uncovers many novel candidates. Candidate targets for gain and loss of function along the human lineage include numerous putative regulatory regions of genes essential for normal development of the central nervous system, including a significant enrichment of gain of function events near neurotransmitter receptor genes. These results are consistent with regulatory turnover being a key mechanism in the evolution of human-specific characteristics of brain development. Finally, we show that the majority of the genome is unconstrained by natural selection currently, in agreement with what has been estimated from phylogenetic methods but in sharp contrast to estimates based on transcriptomics or other high-throughput functional methods. PMID:26590212

  14. Inferring Selective Constraint from Population Genomic Data Suggests Recent Regulatory Turnover in the Human Brain.

    PubMed

    Schrider, Daniel R; Kern, Andrew D

    2015-11-19

    The comparative genomics revolution of the past decade has enabled the discovery of functional elements in the human genome via sequence comparison. While that is so, an important class of elements, those specific to humans, is entirely missed by searching for sequence conservation across species. Here we present an analysis based on variation data among human genomes that utilizes a supervised machine learning approach for the identification of human-specific purifying selection in the genome. Using only allele frequency information from the complete low-coverage 1000 Genomes Project data set in conjunction with a support vector machine trained from known functional and nonfunctional portions of the genome, we are able to accurately identify portions of the genome constrained by purifying selection. Our method identifies previously known human-specific gains or losses of function and uncovers many novel candidates. Candidate targets for gain and loss of function along the human lineage include numerous putative regulatory regions of genes essential for normal development of the central nervous system, including a significant enrichment of gain of function events near neurotransmitter receptor genes. These results are consistent with regulatory turnover being a key mechanism in the evolution of human-specific characteristics of brain development. Finally, we show that the majority of the genome is unconstrained by natural selection currently, in agreement with what has been estimated from phylogenetic methods but in sharp contrast to estimates based on transcriptomics or other high-throughput functional methods.

  15. Reference set of regulons in Desulfovibrionales inferred by comparative genomics approach

    SciTech Connect

    Kazakov, A.E.; Rodionov, D.A.; Price, M.N.; Arkin, A.P.; Dubchak, I.; Novichkov, P.S.

    2010-11-15

    In this study, we carried out large-scale comparative genomics analysis of regulatory interactions in Desulfovibrio vulgaris and 12 related genomes from Desulfovibrionales order using our recently developed web server RegPredict (http://regpredict.lbl.gov). An overall reference collection of 26 Desulfovibrionales regulogs can be accessed through RegPrecise database (http://regpredict.lbl.gov).

  16. High-level phylogeny of the Coleoptera inferred with mitochondrial genome sequences.

    PubMed

    Yuan, Ming-Long; Zhang, Qi-Lin; Zhang, Li; Guo, Zhong-Long; Liu, Yong-Jian; Shen, Yu-Ying; Shao, Renfu

    2016-11-01

    The Coleoptera (beetles) exhibits tremendous morphological, ecological, and behavioral diversity. To better understand the phylogenetics and evolution of beetles, we sequenced three complete mitogenomes from two families (Cleridae and Meloidae), which share conserved mitogenomic features with other completely sequenced beetles. We assessed the influence of six datasets and three inference methods on topology and nodal support within the Coleoptera. We found that both Bayesian inference and maximum likelihood with homogeneous-site models were greatly affected by nucleotide compositional heterogeneity, while the heterogeneous-site mixture model in PhyloBayes could provide better phylogenetic signals for the Coleoptera. The amino acid dataset generated more reliable tree topology at the higher taxonomic levels (i.e. suborders and series), where the inclusion of rRNA genes and the third positions of protein-coding genes improved phylogenetic inference at the superfamily level, especially under a heterogeneous-site model. We recovered the suborder relationships as (Archostemata+Adephaga)+(Myxophaga+Polyphaga). The series relationships within Polyphaga were recovered as (Scirtiformia+(Elateriformia+((Bostrichiformia+Scarabaeiformia+Staphyliniformia)+Cucujiformia))). All superfamilies within Cucujiformia were recovered as monophyletic. We obtained a cucujiform phylogeny of (Cleroidea+(Coccinelloidea+((Lymexyloidea+Tenebrionoidea)+(Cucujoidea+(Chrysomeloidea+Curculionoidea))))). This study showed that although tree topologies were sensitive to data types and inference methods, mitogenomic data could provide useful information for resolving the Coleoptera phylogeny at various taxonomic levels by using suitable datasets and heterogeneous-site models.

  17. Disentangling seasonal bacterioplankton population dynamics by high-frequency sampling.

    PubMed

    Lindh, Markus V; Sjöstedt, Johanna; Andersson, Anders F; Baltar, Federico; Hugerth, Luisa W; Lundin, Daniel; Muthusamy, Saraladevi; Legrand, Catherine; Pinhassi, Jarone

    2015-07-01

    Multiyear comparisons of bacterioplankton succession reveal that environmental conditions drive community shifts with repeatable patterns between years. However, corresponding insight into bacterioplankton dynamics at a temporal resolution relevant for detailed examination of variation and characteristics of specific populations within years is essentially lacking. During 1 year, we collected 46 samples in the Baltic Sea for assessing bacterial community composition by 16S rRNA gene pyrosequencing (nearly twice weekly during productive season). Beta-diversity analysis showed distinct clustering of samples, attributable to seemingly synchronous temporal transitions among populations (populations defined by 97% 16S rRNA gene sequence identity). A wide spectrum of bacterioplankton dynamics was evident, where divergent temporal patterns resulted both from pronounced differences in relative abundance and presence/absence of populations. Rates of change in relative abundance calculated for individual populations ranged from 0.23 to 1.79 day⁻¹. Populations that were persistently dominant, transiently abundant or generally rare were found in several major bacterial groups, implying evolution has favoured a similar variety of life strategies within these groups. These findings suggest that high temporal resolution sampling allows constraining the timescales and frequencies at which distinct populations transition between being abundant or rare, thus potentially providing clues about physical, chemical or biological forcing on bacterioplankton community structure.

  18. Demographic History of the Genus Pan Inferred from Whole Mitochondrial Genome Reconstructions

    PubMed Central

    Tucci, Serena; de Manuel, Marc; Ghirotto, Silvia; Benazzo, Andrea; Prado-Martinez, Javier; Lorente-Galdos, Belen; Nam, Kiwoong; Dabad, Marc; Hernandez-Rodriguez, Jessica; Comas, David; Navarro, Arcadi; Schierup, Mikkel H.; Andres, Aida M.; Barbujani, Guido; Hvilsom, Christina; Marques-Bonet, Tomas

    2016-01-01

    The genus Pan is the closest genus to our own and it includes two species, Pan paniscus (bonobos) and Pan troglodytes (chimpanzees). The latter is constituted by four subspecies, all highly endangered. The study of the genus Pan has been incessantly complicated by the intricate relationship among subspecies and the statistical limitations imposed by the reduced number of samples or genomic markers analyzed. Here, we present a new method to reconstruct complete mitochondrial genomes (mitogenomes) from whole genome shotgun (WGS) datasets, mtArchitect, showing that its reconstructions are highly accurate and consistent with long-range PCR mitogenomes. We used this approach to build the mitochondrial genomes of 20 newly sequenced samples which, together with available genomes, allowed us to analyze the hitherto most complete Pan mitochondrial genome dataset including 156 chimpanzee and 44 bonobo individuals, with a proportional contribution from all chimpanzee subspecies. We estimated the separation time between chimpanzees and bonobos around 1.15 million years ago (Mya) [0.81–1.49]. Further, we found that under the most probable genealogical model the two clades of chimpanzees, Western + Nigeria-Cameroon and Central + Eastern, separated at 0.59 Mya [0.41–0.78] with further internal separations at 0.32 Mya [0.22–0.43] and 0.16 Mya [0.17–0.34], respectively. Finally, for a subset of our samples, we compared nuclear versus mitochondrial genomes and we found that chimpanzee subspecies have different patterns of nuclear and mitochondrial diversity, which could be a result of either processes affecting the mitochondrial genome, such as hitchhiking or background selection, or a result of population dynamics. PMID:27345955

  19. Demographic History of the Genus Pan Inferred from Whole Mitochondrial Genome Reconstructions.

    PubMed

    Lobon, Irene; Tucci, Serena; de Manuel, Marc; Ghirotto, Silvia; Benazzo, Andrea; Prado-Martinez, Javier; Lorente-Galdos, Belen; Nam, Kiwoong; Dabad, Marc; Hernandez-Rodriguez, Jessica; Comas, David; Navarro, Arcadi; Schierup, Mikkel H; Andres, Aida M; Barbujani, Guido; Hvilsom, Christina; Marques-Bonet, Tomas

    2016-07-03

    The genus Pan is the closest genus to our own and it includes two species, Pan paniscus (bonobos) and Pan troglodytes (chimpanzees). The latter is constituted by four subspecies, all highly endangered. The study of the genus Pan has been incessantly complicated by the intricate relationship among subspecies and the statistical limitations imposed by the reduced number of samples or genomic markers analyzed. Here, we present a new method to reconstruct complete mitochondrial genomes (mitogenomes) from whole genome shotgun (WGS) datasets, mtArchitect, showing that its reconstructions are highly accurate and consistent with long-range PCR mitogenomes. We used this approach to build the mitochondrial genomes of 20 newly sequenced samples which, together with available genomes, allowed us to analyze the hitherto most complete Pan mitochondrial genome dataset including 156 chimpanzee and 44 bonobo individuals, with a proportional contribution from all chimpanzee subspecies. We estimated the separation time between chimpanzees and bonobos around 1.15 million years ago (Mya) [0.81-1.49]. Further, we found that under the most probable genealogical model the two clades of chimpanzees, Western + Nigeria-Cameroon and Central + Eastern, separated at 0.59 Mya [0.41-0.78] with further internal separations at 0.32 Mya [0.22-0.43] and 0.16 Mya [0.17-0.34], respectively. Finally, for a subset of our samples, we compared nuclear versus mitochondrial genomes and we found that chimpanzee subspecies have different patterns of nuclear and mitochondrial diversity, which could be a result of either processes affecting the mitochondrial genome, such as hitchhiking or background selection, or a result of population dynamics.

  20. Comparative genomics of four Liliales families inferred from the complete chloroplast genome sequence of Veratrum patulum O. Loes. (Melanthiaceae).

    PubMed

    Do, Hoang Dang Khoa; Kim, Jung Sung; Kim, Joo-Hwan

    2013-11-10

    The sequence of the chloroplast genome, which is inherited maternally, contains useful information for many scientific fields such as plant systematics, biogeography and biotechnology because its characteristics are highly conserved among species. The number of sequenced angiosperm chloroplast genomes has increased in recent years. In this study, the nucleotide sequence of the chloroplast genome (cpDNA) of Veratrum patulum Loes. (Melanthiaceae, Liliales) was analyzed completely. The circular double-stranded DNA of 153,699 bp consists of two inverted repeat (IR) regions of 26,360 bp each, a large single copy of 83,372 bp, and a small single copy of 17,607 bp. This plastome contains 81 protein-coding genes, 30 distinct tRNA genes and four rRNA genes. In addition, there are six hypothetical coding regions (ycf1, ycf2, ycf3, ycf4, ycf15 and ycf68) and two open reading frames (ORF42 and ORF56), which are also found in the chloroplast genomes of the other species. The gene orders and gene contents of the V. patulum plastid genome are similar to those of Smilax china, Lilium longiflorum and Alstroemeria aurea, members of the Smilacaceae, Liliaceae and Alstroemeriaceae (Liliales), respectively. However, the loss of rps16 exon 2 in V. patulum results in a difference in the large single copy region in comparison with other species. The base substitution rate is quite similar among genes of these species. Additionally, the base substitution rate of the inverted repeat region was smaller than that of the single copy regions in all observed species of Liliales. The IR regions were expanded to trnH_GUG in V. patulum, a part of rps19 in L. longiflorum and A. aurea, and the whole sequence of rps19 in S. china. Furthermore, the IGS lengths of the rbcL-accD-psaI region were variable among Liliales species, suggesting that this region might be a hotspot of indel events and an informative site for phylogenetic studies in Liliales. In general, the whole chloroplast genome of V. patulum, a
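    The region lengths quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below assumes the usual quadripartite plastome layout (one large single copy, one small single copy, two inverted repeat copies); the function and constant names are illustrative and do not come from the study.

```python
# Region lengths (bp) as reported in the abstract for Veratrum patulum.
INVERTED_REPEAT = 26_360      # length of each of the two IR copies
LARGE_SINGLE_COPY = 83_372
SMALL_SINGLE_COPY = 17_607
REPORTED_TOTAL = 153_699


def plastome_length(ir: int, lsc: int, ssc: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total = plastome_length(INVERTED_REPEAT, LARGE_SINGLE_COPY, SMALL_SINGLE_COPY)

# Share of the genome occupied by the two IR copies together.
ir_fraction = 2 * INVERTED_REPEAT / total

print(total == REPORTED_TOTAL)
print(round(ir_fraction, 3))
```

    Under that assumption the four regions sum to the reported genome size, and the two IR copies together account for roughly a third of the plastome.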

  1. Inferences of drug responses in cancer cells from cancer genomic features and compound chemical and therapeutic properties

    NASA Astrophysics Data System (ADS)

    Wang, Yongcui; Fang, Jianwen; Chen, Shilong

    2016-09-01

    Accurately predicting the response of a cancer patient to a therapeutic agent is a core goal of precision medicine. Existing approaches have relied primarily on genomic alterations in cancer cells that have been treated with different drugs. Here we focus on predicting drug response based on integration of heterogeneous pharmacogenomic data from both the cell and drug sides. Through a systematic approach named PDRCC (Predict Drug Response in Cancer Cells), the cancer genomic alterations and compound chemical and therapeutic properties were incorporated to determine the chemotherapeutic response in cancer patients. Using the Cancer Cell Line Encyclopedia (CCLE) study as the benchmark dataset, all pharmacogenomics data exhibited their roles in inferring the relationships between cancer cells and drugs. When integrating both genomic resources and compound information, the prediction coverage was significantly increased. The validity of PDRCC was also supported by its effectiveness in uncovering unknown cell-drug associations with database and literature evidence. It set the stage for clinical testing of novel therapeutic strategies, such as the sensitive association between cancer cell ‘A549_LUNG’ and compound ‘Topotecan’. In conclusion, PDRCC offers the possibility of faster, safer, and cheaper development of novel anti-cancer therapeutics in early-stage clinical trials.

  2. Inferences of drug responses in cancer cells from cancer genomic features and compound chemical and therapeutic properties

    PubMed Central

    Wang, Yongcui; Fang, Jianwen; Chen, Shilong

    2016-01-01

    Accurately predicting the response of a cancer patient to a therapeutic agent is a core goal of precision medicine. Existing approaches have relied primarily on genomic alterations in cancer cells that have been treated with different drugs. Here we focus on predicting drug response based on integration of heterogeneous pharmacogenomic data from both the cell and drug sides. Through a systematic approach named PDRCC (Predict Drug Response in Cancer Cells), the cancer genomic alterations and compound chemical and therapeutic properties were incorporated to determine the chemotherapeutic response in cancer patients. Using the Cancer Cell Line Encyclopedia (CCLE) study as the benchmark dataset, all pharmacogenomics data exhibited their roles in inferring the relationships between cancer cells and drugs. When integrating both genomic resources and compound information, the prediction coverage was significantly increased. The validity of PDRCC was also supported by its effectiveness in uncovering unknown cell-drug associations with database and literature evidence. It set the stage for clinical testing of novel therapeutic strategies, such as the sensitive association between cancer cell ‘A549_LUNG’ and compound ‘Topotecan’. In conclusion, PDRCC offers the possibility of faster, safer, and cheaper development of novel anti-cancer therapeutics in early-stage clinical trials. PMID:27645580

  3. Phylogenetic position of the coral symbiont Ostreobium (Ulvophyceae) inferred from chloroplast genome data.

    PubMed

    Verbruggen, Heroen; Marcelino, Vanessa R; Guiry, Michael D; Cremen, M Chiela M; Jackson, Christopher J

    2017-04-10

    The green algal genus Ostreobium is an important symbiont of corals, playing roles in reef decalcification and providing photosynthates to the coral during bleaching events. A chloroplast genome of a cultured strain of Ostreobium was available, but low taxon sampling and Ostreobium's early-branching nature left doubt about its phylogenetic position. Here we generate and describe chloroplast genomes from four Ostreobium strains as well as Avrainvillea mazei and Neomeris sp., strategically sampled early-branching lineages in the Bryopsidales and Dasycladales, respectively. At 80,584 bp, the chloroplast genome of Ostreobium sp. HV05042 is the most compact yet found in the Ulvophyceae. The Avrainvillea chloroplast genome is ca. 94 kbp and contains introns in infA and cysT that have nearly complete sequence identity except for an ORF in infA that is not present in cysT. In line with other bryopsidalean species, it also contains regions with possibly bacteria-derived ORFs. The Neomeris data did not assemble into a canonical circular chloroplast genome but a large number of contigs containing fragments of chloroplast genes and showing evidence of long introns and intergenic regions, and the Neomeris chloroplast genome size was estimated to exceed 1.87 Mb. Chloroplast phylogenomics and 18S nrDNA data showed strong support for the Ostreobium lineage being sister to the remaining Bryopsidales. There were differences in branch support when outgroups were varied, but the overall support for the placement of Ostreobium was strong. These results permitted us to validate two suborders and introduce a third, the Ostreobineae.

  4. High-throughput single-cell sequencing identifies photoheterotrophs and chemoautotrophs in freshwater bacterioplankton

    PubMed Central

    Martinez-Garcia, Manuel; Swan, Brandon K; Poulton, Nicole J; Gomez, Monica Lluesma; Masland, Dashiell; Sieracki, Michael E; Stepanauskas, Ramunas

    2012-01-01

    Recent discoveries suggest that photoheterotrophs (rhodopsin-containing bacteria (RBs) and aerobic anoxygenic phototrophs (AAPs)) and chemoautotrophs may be significant for marine and freshwater ecosystem productivity. However, their abundance and taxonomic identities remain largely unknown. We used a combination of single-cell and metagenomic DNA sequencing to study the predominant photoheterotrophs and chemoautotrophs inhabiting the euphotic zone of temperate, physicochemically diverse freshwater lakes. Multi-locus sequencing of 712 single amplified genomes, generated by fluorescence-activated cell sorting and whole genome multiple displacement amplification, showed that most of the cosmopolitan freshwater clusters contain photoheterotrophs. These comprised at least 10–23% of bacterioplankton, and RBs were the dominant fraction. Our data demonstrate that Actinobacteria, including clusters acI, Luna and acSTL, are the predominant freshwater RBs. We significantly broaden the known taxonomic range of freshwater RBs, to include Alpha-, Beta-, Gamma- and Deltaproteobacteria, Verrucomicrobia and Sphingobacteria. By sequencing single cells, we found evidence for inter-phyla horizontal gene transfer and recombination of rhodopsin genes and identified specific taxonomic groups involved in these evolutionary processes. Our data suggest that members of the ubiquitous betaproteobacteria Polynucleobacter spp. are the dominant AAPs in temperate freshwater lakes. Furthermore, the RuBisCO (ribulose 1,5-bisphosphate carboxylase/oxygenase) gene was found in several single cells of Betaproteobacteria, Bacteroidetes and Gammaproteobacteria, suggesting that chemoautotrophs may be more prevalent among aerobic bacterioplankton than previously thought. This study demonstrates the power of single-cell DNA sequencing in addressing previously unresolved questions about the metabolic potential and evolutionary histories of uncultured microorganisms, which dominate most natural environments.

  5. High-throughput single-cell sequencing identifies photoheterotrophs and chemoautotrophs in freshwater bacterioplankton.

    PubMed

    Martinez-Garcia, Manuel; Swan, Brandon K; Poulton, Nicole J; Gomez, Monica Lluesma; Masland, Dashiell; Sieracki, Michael E; Stepanauskas, Ramunas

    2012-01-01

    Recent discoveries suggest that photoheterotrophs (rhodopsin-containing bacteria (RBs) and aerobic anoxygenic phototrophs (AAPs)) and chemoautotrophs may be significant for marine and freshwater ecosystem productivity. However, their abundance and taxonomic identities remain largely unknown. We used a combination of single-cell and metagenomic DNA sequencing to study the predominant photoheterotrophs and chemoautotrophs inhabiting the euphotic zone of temperate, physicochemically diverse freshwater lakes. Multi-locus sequencing of 712 single amplified genomes, generated by fluorescence-activated cell sorting and whole genome multiple displacement amplification, showed that most of the cosmopolitan freshwater clusters contain photoheterotrophs. These comprised at least 10-23% of bacterioplankton, and RBs were the dominant fraction. Our data demonstrate that Actinobacteria, including clusters acI, Luna and acSTL, are the predominant freshwater RBs. We significantly broaden the known taxonomic range of freshwater RBs, to include Alpha-, Beta-, Gamma- and Deltaproteobacteria, Verrucomicrobia and Sphingobacteria. By sequencing single cells, we found evidence for inter-phyla horizontal gene transfer and recombination of rhodopsin genes and identified specific taxonomic groups involved in these evolutionary processes. Our data suggest that members of the ubiquitous betaproteobacteria Polynucleobacter spp. are the dominant AAPs in temperate freshwater lakes. Furthermore, the RuBisCO (ribulose 1,5-bisphosphate carboxylase/oxygenase) gene was found in several single cells of Betaproteobacteria, Bacteroidetes and Gammaproteobacteria, suggesting that chemoautotrophs may be more prevalent among aerobic bacterioplankton than previously thought. This study demonstrates the power of single-cell DNA sequencing in addressing previously unresolved questions about the metabolic potential and evolutionary histories of uncultured microorganisms, which dominate most natural environments.

  6. The green impact: bacterioplankton response toward a phytoplankton spring bloom in the southern North Sea assessed by comparative metagenomic and metatranscriptomic approaches

    PubMed Central

    Wemheuer, Bernd; Wemheuer, Franziska; Hollensteiner, Jacqueline; Meyer, Frauke-Dorothee; Voget, Sonja; Daniel, Rolf

    2015-01-01

    Phytoplankton blooms have a severe impact on bacterioplankton communities as they change nutrient availabilities and other environmental factors. In the current study, the response of a bacterioplankton community to a Phaeocystis globosa spring bloom was investigated in the southern North Sea. For this purpose, water samples were taken inside and reference samples outside of an algal spring bloom. Structural changes of the bacterioplankton community were assessed by amplicon-based analysis of 16S rRNA genes and transcripts generated from environmental DNA and RNA, respectively. Several marine groups responded to bloom presence. The abundance of the Roseobacter RCA cluster and the SAR92 clade significantly increased in bloom presence in the total and active fraction of the bacterial community. Functional changes were investigated by direct sequencing of environmental DNA and mRNA. The corresponding datasets comprised more than 500 million sequences across all samples. Metatranscriptomic data sets were mapped on representative genomes of abundant marine groups present in the samples and on assembled metagenomic and metatranscriptomic datasets. Differences in gene expression profiles between non-bloom and bloom samples were recorded. The genome-wide gene expression level of Planktomarina temperata, an abundant member of the Roseobacter RCA cluster, was higher inside the bloom. Genes that were differentially expressed included transposases, which showed increased expression levels inside the bloom. This might contribute to the adaptation of this organism toward environmental stresses through genome reorganization. In addition, several genes affiliated to the SAR92 clade were significantly upregulated inside the bloom, including genes encoding proteins involved in isoleucine and leucine incorporation. The results provide novel insights into compositional and functional variations of marine bacterioplankton communities in response to a phytoplankton bloom.

  7. The green impact: bacterioplankton response toward a phytoplankton spring bloom in the southern North Sea assessed by comparative metagenomic and metatranscriptomic approaches.

    PubMed

    Wemheuer, Bernd; Wemheuer, Franziska; Hollensteiner, Jacqueline; Meyer, Frauke-Dorothee; Voget, Sonja; Daniel, Rolf

    2015-01-01

    Phytoplankton blooms have a severe impact on bacterioplankton communities as they change nutrient availabilities and other environmental factors. In the current study, the response of a bacterioplankton community to a Phaeocystis globosa spring bloom was investigated in the southern North Sea. For this purpose, water samples were taken inside and reference samples outside of an algal spring bloom. Structural changes of the bacterioplankton community were assessed by amplicon-based analysis of 16S rRNA genes and transcripts generated from environmental DNA and RNA, respectively. Several marine groups responded to bloom presence. The abundance of the Roseobacter RCA cluster and the SAR92 clade significantly increased in bloom presence in the total and active fraction of the bacterial community. Functional changes were investigated by direct sequencing of environmental DNA and mRNA. The corresponding datasets comprised more than 500 million sequences across all samples. Metatranscriptomic data sets were mapped on representative genomes of abundant marine groups present in the samples and on assembled metagenomic and metatranscriptomic datasets. Differences in gene expression profiles between non-bloom and bloom samples were recorded. The genome-wide gene expression level of Planktomarina temperata, an abundant member of the Roseobacter RCA cluster, was higher inside the bloom. Genes that were differentially expressed included transposases, which showed increased expression levels inside the bloom. This might contribute to the adaptation of this organism toward environmental stresses through genome reorganization. In addition, several genes affiliated to the SAR92 clade were significantly upregulated inside the bloom, including genes encoding proteins involved in isoleucine and leucine incorporation. The results provide novel insights into compositional and functional variations of marine bacterioplankton communities in response to a phytoplankton bloom.

  8. Inference of Candidate Germline Mutator Loci in Humans from Genome-Wide Haplotype Data

    PubMed Central

    2017-01-01

    The rate of germline mutation varies widely between species but little is known about the extent of variation in the germline mutation rate between individuals of the same species. Here we demonstrate that an allele that increases the rate of germline mutation can result in a distinctive signature in the genomic region linked to the affected locus, characterized by a number of haplotypes with a locally high proportion of derived alleles, against a background of haplotypes carrying a typical proportion of derived alleles. We searched for this signature in human haplotype data from phase 3 of the 1000 Genomes Project and report a number of candidate mutator loci, several of which are located close to or within genes involved in DNA repair or the DNA damage response. To investigate whether mutator alleles remained active at any of these loci, we used de novo mutation counts from human parent-offspring trios in the 1000 Genomes and Genome of the Netherlands cohorts, looking for an elevated number of de novo mutations in the offspring of parents carrying a candidate mutator haplotype at each of these loci. We found some support for two of the candidate loci, including one locus just upstream of the BRSK2 gene, which is expressed in the testis and has been reported to be involved in the response to DNA damage. PMID:28095480
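    The quantity at the centre of the signature described above, the proportion of derived alleles carried by each haplotype in a genomic window, can be illustrated with a short sketch. This is not the detection procedure used in the study: the 0/1 encoding (0 = ancestral, 1 = derived), the median background, the `fold` threshold and all function names are assumptions made only for illustration.

```python
from statistics import median


def derived_proportion(haplotype: list[int]) -> float:
    """Fraction of sites in the window at which this haplotype carries the derived allele."""
    return sum(haplotype) / len(haplotype)


def flag_enriched(haplotypes: list[list[int]], fold: float = 2.0) -> list[int]:
    """Indices of haplotypes whose derived-allele proportion is at least
    `fold` times the median proportion across all haplotypes in the window.

    The median stands in for the "typical" background; both it and `fold`
    are hypothetical choices, not parameters taken from the study.
    """
    proportions = [derived_proportion(h) for h in haplotypes]
    background = median(proportions)
    if background == 0:
        return []
    return [i for i, p in enumerate(proportions) if p >= fold * background]


# Toy window of four haplotypes over ten sites (made-up data).
window = [
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
]
flagged = flag_enriched(window)
print(flagged)
```

    In the toy window only the third haplotype stands out against the background. A real analysis would need a statistical model of the expected proportion rather than a fixed fold threshold.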

  9. A Novel Candidate Vaccine for Cytauxzoonosis Inferred from Comparative Apicomplexan Genomics

    PubMed Central

    Tarigo, Jaime L.; Scholl, Elizabeth H.; Bird, David McK.; Brown, Corrie C.; Cohn, Leah A.; Dean, Gregg A.; Levy, Michael G.; Doolan, Denise L.; Trieu, Angela; Nordone, Shila K.; Felgner, Philip L.; Vigil, Adam; Birkenheuer, Adam J.

    2013-01-01

    Cytauxzoonosis is an emerging infectious disease of domestic cats (Felis catus) caused by the apicomplexan protozoan parasite Cytauxzoon felis. The growing epidemic, with its high morbidity and mortality points to the need for a protective vaccine against cytauxzoonosis. Unfortunately, the causative agent has yet to be cultured continuously in vitro, rendering traditional vaccine development approaches beyond reach. Here we report the use of comparative genomics to computationally and experimentally interpret the C. felis genome to identify a novel candidate vaccine antigen for cytauxzoonosis. As a starting point we sequenced, assembled, and annotated the C. felis genome and the proteins it encodes. Whole genome alignment revealed considerable conserved synteny with other apicomplexans. In particular, alignments with the bovine parasite Theileria parva revealed that a C. felis gene, cf76, is syntenic to p67 (the leading vaccine candidate for bovine theileriosis), despite a lack of significant sequence similarity. Recombinant subdomains of cf76 were challenged with survivor-cat antiserum and found to be highly seroreactive. Comparison of eleven geographically diverse samples from the south-central and southeastern USA demonstrated 91–100% amino acid sequence identity across cf76, including a high level of conservation in an immunogenic 226 amino acid (24 kDa) carboxyl terminal domain. Using in situ hybridization, transcription of cf76 was documented in the schizogenous stage of parasite replication, the life stage that is believed to be the most important for development of a protective immune response. Collectively, these data point to identification of the first potential vaccine candidate antigen for cytauxzoonosis. Further, our bioinformatic approach emphasizes the use of comparative genomics as an accelerated path to developing vaccines against experimentally intractable pathogens. PMID:23977000

  10. Chromosomal instability in Afrotheria: fragile sites, evolutionary breakpoints and phylogenetic inference from genome sequence assemblies

    PubMed Central

    Ruiz-Herrera, Aurora; Robinson, Terence J

    2007-01-01

    Background: Extant placental mammals are divided into four major clades (Laurasiatheria, Supraprimates, Xenarthra and Afrotheria). Given that Afrotheria is generally thought to root the eutherian tree in phylogenetic analysis of large nuclear gene data sets, the study of the organization of the genomes of afrotherian species provides new insights into the dynamics of mammalian chromosomal evolution. Here we test if there are chromosomal bands with a high tendency to break and reorganize in Afrotheria, and by analyzing the expression of aphidicolin-induced common fragile sites in three afrotherian species, whether these are coincidental with recognized evolutionary breakpoints. Results: We described 29 fragile sites in the aardvark (OAF) genome, 27 in the golden mole (CAS), and 35 in the elephant-shrew (EED) genome. We show that fragile sites are conserved among afrotherian species and these are correlated with evolutionary breakpoints when compared to the human (HSA) genome. In addition, by computationally scanning the newly released opossum (Monodelphis domestica) and chicken sequence assemblies for use as outgroups to Placentalia, we validate the HSA 3/21/5 chromosomal synteny as a rare genomic change that defines the monophyly of this ancient African clade of mammals. On the other hand, support for HSA 1/19p, which is also thought to underpin Afrotheria, is currently ambiguous. Conclusion: We provide evidence that (i) the evolutionary breakpoints that characterise human syntenies detected in the basal Afrotheria correspond at the chromosomal band level with fragile sites, (ii) that HSA 3p/21 was in the amniote ancestor (i.e., common to turtles, lepidosaurs, crocodilians, birds and mammals) and was subsequently disrupted in the lineage leading to marsupials. Its expansion to include HSA 5 in Afrotheria is unique and (iii) that its fragmentation to HSA 3p/21 + HSA 5/21 in elephant and manatee was due to a fission within HSA 21 that is probably shared by all

  11. Phylogeny and physiology of candidate phylum 'Atribacteria' (OP9/JS1) inferred from cultivation-independent genomics.

    PubMed

    Nobu, Masaru K; Dodsworth, Jeremy A; Murugapiran, Senthil K; Rinke, Christian; Gies, Esther A; Webster, Gordon; Schwientek, Patrick; Kille, Peter; Parkes, R John; Sass, Henrik; Jørgensen, Bo B; Weightman, Andrew J; Liu, Wen-Tso; Hallam, Steven J; Tsiamis, George; Woyke, Tanja; Hedlund, Brian P

    2016-02-01

    The 'Atribacteria' is a candidate phylum in the Bacteria recently proposed to include members of the OP9 and JS1 lineages. OP9 and JS1 are globally distributed, and in some cases abundant, in anaerobic marine sediments, geothermal environments, anaerobic digesters and reactors and petroleum reservoirs. However, the monophyly of OP9 and JS1 has been questioned and their physiology and ecology remain largely enigmatic due to a lack of cultivated representatives. Here cultivation-independent genomic approaches were used to provide a first comprehensive view of the phylogeny, conserved genomic features and metabolic potential of members of this ubiquitous candidate phylum. Previously available and heretofore unpublished OP9 and JS1 single-cell genomic data sets were used as recruitment platforms for the reconstruction of atribacterial metagenome bins from a terephthalate-degrading reactor biofilm and from the monimolimnion of meromictic Sakinaw Lake. The single-cell genomes and metagenome bins together comprise six species- to genus-level groups that represent most major lineages within OP9 and JS1. Phylogenomic analyses of these combined data sets confirmed the monophyly of the 'Atribacteria' inclusive of OP9 and JS1. Additional conserved features within the 'Atribacteria' were identified, including a gene cluster encoding putative bacterial microcompartments that may be involved in aldehyde and sugar metabolism, energy conservation and carbon storage. Comparative analysis of the metabolic potential inferred from these data sets revealed that members of the 'Atribacteria' are likely to be heterotrophic anaerobes that lack respiratory capacity, with some lineages predicted to specialize in either primary fermentation of carbohydrates or secondary fermentation of organic acids, such as propionate.

  12. Genome-Wide SNP Discovery, Genotyping and Their Preliminary Applications for Population Genetic Inference in Spotted Sea Bass (Lateolabrax maculatus)

    PubMed Central

    Wang, Juan; Xue, Dong-Xiu; Zhang, Bai-Dong; Li, Yu-Long; Liu, Bing-Jian; Liu, Jin-Xian

    2016-01-01

    Next-generation sequencing and the collection of genome-wide single-nucleotide polymorphisms (SNPs) allow the identification of fine-scale population genetic structure and genomic regions under selection. The spotted sea bass (Lateolabrax maculatus) is a non-model species of ecological and commercial importance that is widely distributed in the northwestern Pacific. A total of 22,648 SNPs were discovered across the genome of L. maculatus by paired-end sequencing of restriction-site associated DNA (RAD-PE) for 30 individuals from two populations. The nucleotide diversity (π) was 0.0028±0.0001 in Dandong and 0.0018±0.0001 in Beihai. Shallow but significant genetic differentiation was detected between the two populations using both the whole data set (FST = 0.0550, P < 0.001) and the putatively neutral SNPs (FST = 0.0347, P < 0.001). However, the two populations were highly differentiated based on the putatively adaptive SNPs (FST = 0.6929, P < 0.001). Moreover, a total of 356 SNPs representing 298 unique loci were detected as outliers putatively under divergent selection by FST-based outlier tests as implemented in BAYESCAN and LOSITAN. Functional annotation of the contigs containing putatively adaptive SNPs yielded hits for 22 of 55 (40%) significant BLASTX matches. Candidate genes for local selection constituted a wide array of functions, including binding, catalytic and metabolic activities, etc. The analyses with the SNPs developed in the present study highlighted the importance of genome-wide genetic variation for inference of population structure and local adaptation in L. maculatus. PMID:27336696
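    To make the FST values in the abstract above easier to read, the sketch below computes the textbook Wright's FST, (HT - HS) / HT, for a single biallelic SNP in two equally weighted populations. This is an assumption-laden illustration: the abstract does not state which estimator the authors used, BAYESCAN and LOSITAN rely on their own models, and the allele frequencies shown are made up.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def wright_fst(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP and two equally weighted populations.

    HS is the mean within-population heterozygosity and HT the heterozygosity
    of the pooled population. Returns 0.0 when the locus is monomorphic
    overall (HT == 0), where FST is undefined.
    """
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    h_t = expected_heterozygosity((p1 + p2) / 2.0)
    if h_t == 0.0:
        return 0.0
    return (h_t - h_s) / h_t


# Made-up allele frequencies, not data from the study.
print(wright_fst(0.5, 0.5))   # identical frequencies: no differentiation
print(wright_fst(1.0, 0.0))   # fixed for different alleles: complete differentiation
print(round(wright_fst(0.9, 0.1), 2))
```

    On this scale, values near 0 such as the reported 0.0550 and 0.0347 indicate that most variation is shared between populations, while 0.6929 for the putatively adaptive SNPs indicates strongly diverged allele frequencies.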

  13. Phylogeny and physiology of candidate phylum 'Atribacteria' (OP9/JS1) inferred from cultivation-independent genomics

    PubMed Central

    Nobu, Masaru K; Dodsworth, Jeremy A; Murugapiran, Senthil K; Rinke, Christian; Gies, Esther A; Webster, Gordon; Schwientek, Patrick; Kille, Peter; Parkes, R John; Sass, Henrik; Jørgensen, Bo B; Weightman, Andrew J; Liu, Wen-Tso; Hallam, Steven J; Tsiamis, George; Woyke, Tanja; Hedlund, Brian P

    2016-01-01

    The 'Atribacteria' is a candidate phylum in the Bacteria recently proposed to include members of the OP9 and JS1 lineages. OP9 and JS1 are globally distributed, and in some cases abundant, in anaerobic marine sediments, geothermal environments, anaerobic digesters and reactors and petroleum reservoirs. However, the monophyly of OP9 and JS1 has been questioned and their physiology and ecology remain largely enigmatic due to a lack of cultivated representatives. Here cultivation-independent genomic approaches were used to provide a first comprehensive view of the phylogeny, conserved genomic features and metabolic potential of members of this ubiquitous candidate phylum. Previously available and heretofore unpublished OP9 and JS1 single-cell genomic data sets were used as recruitment platforms for the reconstruction of atribacterial metagenome bins from a terephthalate-degrading reactor biofilm and from the monimolimnion of meromictic Sakinaw Lake. The single-cell genomes and metagenome bins together comprise six species- to genus-level groups that represent most major lineages within OP9 and JS1. Phylogenomic analyses of these combined data sets confirmed the monophyly of the 'Atribacteria' inclusive of OP9 and JS1. Additional conserved features within the 'Atribacteria' were identified, including a gene cluster encoding putative bacterial microcompartments that may be involved in aldehyde and sugar metabolism, energy conservation and carbon storage. Comparative analysis of the metabolic potential inferred from these data sets revealed that members of the 'Atribacteria' are likely to be heterotrophic anaerobes that lack respiratory capacity, with some lineages predicted to specialize in either primary fermentation of carbohydrates or secondary fermentation of organic acids, such as propionate. PMID:26090992

  14. Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach

    PubMed Central

    Boitard, Simon; Rodríguez, Willy; Jay, Flora; Mona, Stefano; Austerlitz, Frédéric

    2016-01-01

    Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles. PMID:26943927
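
    The two classes of summary statistics named in this abstract can both be computed from unphased, unpolarized genotypes. The sketch below is illustrative only and is not the PopSizeABC code: it assumes diploid genotypes coded as 0/1/2 copies of an arbitrary allele, and it takes the squared correlation between genotype vectors as the zygotic linkage disequilibrium measure. The function names `folded_sfs` and `genotype_r2` are hypothetical.

```python
def folded_sfs(snps):
    """Folded allele frequency spectrum from unphased diploid genotypes.

    snps: list of SNPs, each a list of per-individual allele counts (0, 1 or 2).
    Returns a list where entry m is the number of SNPs whose minor allele
    is carried by m chromosomes (entry 0 counts monomorphic sites).
    """
    n_chrom = 2 * len(snps[0])
    sfs = [0] * (n_chrom // 2 + 1)
    for genotypes in snps:
        count = sum(genotypes)
        # Folding: the allele labels are arbitrary, so use the rarer one.
        minor = min(count, n_chrom - count)
        sfs[minor] += 1
    return sfs


def genotype_r2(x, y):
    """Squared correlation between genotype vectors at two SNPs.

    Needs no phase information. Returns NaN if either SNP is monomorphic.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov * cov / (var_x * var_y)


# Toy data: 4 SNPs genotyped in 3 diploid individuals (6 chromosomes).
toy_snps = [[0, 1, 0], [2, 2, 1], [1, 1, 1], [0, 0, 0]]
toy_sfs = folded_sfs(toy_snps)
```

    In practice the r² values would be averaged over SNP pairs falling into bins of physical distance, as the abstract describes; the binning is omitted here.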

  15. Karyotypic evolution of the family Sciuridae: inferences from the genome organizations of ground squirrels.

    PubMed

    Li, T; Wang, J; Su, W; Nie, W; Yang, F

    2006-01-01

    Cross-species chromosome painting has made a great contribution to our understanding of the evolution of karyotypes and genome organizations of mammals. Several recent papers of comparative painting between tree and flying squirrels have shed some light on the evolution of the family Sciuridae and the order Rodentia. In the present study we have extended the comparative painting to the Himalayan marmot (Marmota himalayana) and the African ground squirrel (Xerus cf. erythropus), i.e. representative species from another important squirrel group, the ground squirrels, and have established genome-wide comparative chromosome maps between human, eastern gray squirrel, and these two ground squirrels. The results show that 1) the squirrels so far studied all have conserved karyotypes that resemble the ancestral karyotype of the order Rodentia; 2) the African ground squirrels could have retained the ancestral karyotype of the family Sciuridae. Furthermore, we have mapped the evolutionary rearrangements onto a molecular-based consensus phylogenetic tree of the family Sciuridae.

  16. Simple Math is Enough: Two Examples of Inferring Functional Associations from Genomic Data

    NASA Technical Reports Server (NTRS)

    Liang, Shoudan

    2003-01-01

    Non-random features in the genomic data are usually biologically meaningful. The key is to choose the feature well. Having a p-value based score prioritizes the findings. If two proteins share an unusually large number of common interaction partners, they tend to be involved in the same biological process. We used this finding to predict the functions of 81 un-annotated proteins in yeast.
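
    One way to turn "an unusually large number of common interaction partners" into a p-value based score is a hypergeometric tail probability. The abstract does not state which test was used, so the sketch below is an assumption for illustration, and the names `shared_partner_pvalue` and `score_pair` are hypothetical. It also simplifies by treating all `n_total` proteins as possible partners of either protein.

```python
from math import comb


def shared_partner_pvalue(n_total, n_a, n_b, n_shared):
    """P(at least n_shared common partners) under random partner choice.

    n_total: number of proteins in the network
    n_a, n_b: number of interaction partners of proteins A and B
    n_shared: observed number of partners common to A and B
    """
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    # Sum the hypergeometric probabilities for k = n_shared .. upper.
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_shared, upper + 1)
    )
    return tail / denom


def score_pair(network, a, b):
    """Score a protein pair from a dict mapping protein -> set of partners."""
    shared = network[a] & network[b]
    return shared_partner_pvalue(
        len(network), len(network[a]), len(network[b]), len(shared)
    )


# Toy network of 10 proteins; P1 and P2 share all three of their partners.
toy_network = {f"P{i}": set() for i in range(1, 11)}
toy_network["P1"] = {"P3", "P4", "P5"}
toy_network["P2"] = {"P3", "P4", "P5"}
toy_p = score_pair(toy_network, "P1", "P2")
```

    Smaller p-values indicate pairs whose overlap is harder to explain by chance, so ranking pairs by this score prioritizes candidates for shared function.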

  17. Low rate of genomic repatterning in Xenarthra inferred from chromosome painting data.

    PubMed

    Dobigny, G; Yang, F; O'Brien, P C M; Volobouev, V; Kovács, A; Pieczarka, J C; Ferguson-Smith, M A; Robinson, T J

    2005-01-01

    Comparative cytogenetic studies on Xenarthra, one of the most basal mammalian clades in the Placentalia, are virtually absent, being restricted largely to descriptions of conventional karyotypes and diploid numbers. We present a molecular cytogenetic comparison of chromosomes from the two-toed (Choloepus didactylus, 2n = 65) and three-toed sloth species (Bradypus tridactylus, 2n = 52), an anteater (Tamandua tetradactyla, 2n = 54) which, together with some data on the six-banded armadillo (Euphractus sexcinctus, 2n = 58), collectively represent all the major xenarthran lineages. Our results, based on interspecific chromosome painting using flow-sorted two-toed sloth chromosomes as painting probes, show the sloth species to be karyotypically closely related but markedly different from the anteater. We also test the synteny disruptions and segmental associations identified within Pilosa (anteaters and sloths) against the chromosomes of the six-banded armadillo as outgroup taxon. We could thus polarize the 35 non-ambiguously identified chromosomal changes characterizing the evolution of the anteater and sloth genomes and map these to a published sequence-based phylogeny for the group. These data suggest a low rate of genomic repatterning when placed in the context of divergence estimates based on molecular and fossil data. Finally, our results provide a glimpse of a likely ancestral karyotype for the extant Xenarthra, a pivotal group for understanding eutherian genome evolution.

  18. ARG-walker: inference of individual specific strengths of meiotic recombination hotspots by population genomics analysis

    PubMed Central

    2015-01-01

    Background: Meiotic recombination hotspots play important roles in various aspects of genomics, but the underlying mechanisms for regulating the locations and strengths of recombination hotspots are not yet fully revealed. Most existing algorithms for estimating recombination rates from sequence polymorphism data can only output average recombination rates of a population, although there is evidence for the heterogeneity in recombination rates among individuals. For genome-wide association studies (GWAS) of recombination hotspots, an efficient algorithm that estimates the individualized strengths of recombination hotspots is highly desirable. Results: In this work, we propose a novel graph mining algorithm named ARG-walker, based on random walks on ancestral recombination graphs (ARG), to estimate individual-specific recombination hotspot strengths. Extensive simulations demonstrate that ARG-walker is able to distinguish the hot allele of a recombination hotspot from the cold allele. Integrated with output of ARG-walker, we performed GWAS on the phased haplotype data of the 22 autosomes of the HapMap Asian population samples of Chinese and Japanese (JPT+CHB). Significant cis-regulatory signals have been detected, which is corroborated by the enrichment of the well-known 13-mer motif CCNCCNTNNCCNC of PRDM9 protein. Moreover, two new DNA motifs have been identified in the flanking regions of the significantly associated SNPs (single nucleotide polymorphisms), which are likely to be new cis-regulatory elements of meiotic recombination hotspots of the human genome. Conclusions: Our results on both simulated and real data suggest that ARG-walker is a promising new method for estimating the individual recombination variations. In the future, it could be used to uncover the mechanisms of recombination regulation and human diseases related to recombination hotspots. PMID:26679564

  19. Remarkable variation in maize genome structure inferred from haplotype diversity at the bz locus

    PubMed Central

    Wang, Qinghua; Dooner, Hugo K.

    2006-01-01

    Maize is probably the most diverse of all crop species. Unexpectedly large differences among haplotypes were first revealed in a comparison of the bz genomic regions of two different inbred lines, McC and B73. Retrotransposon clusters, which comprise most of the repetitive DNA in maize, varied markedly in makeup, and location relative to the genes in the region and genic sequences, later shown to be carried by two helitron transposons, also differed between the inbreds. Thus, the allelic bz regions of these Corn Belt inbreds shared only a minority of the total sequence. To investigate further the variation caused by retrotransposons, helitrons, and other insertions, we have analyzed the organization of the bz genomic region in five additional cultivars selected because of their geographic and genetic diversity: the inbreds A188, CML258, and I137TN, and the land races Coroico and NalTel. This vertical comparison has revealed the existence of several new helitrons, new retrotransposons, members of every superfamily of DNA transposons, numerous miniature elements, and novel insertions flanked at either end by TA repeats, which we call TAFTs (TA-flanked transposons). The extent of variation in the region is remarkable. In pairwise comparisons of eight bz haplotypes, the percentage of shared sequences ranges from 25% to 84%. Chimeric haplotypes were identified that combine retrotransposon clusters found in different haplotypes. We propose that recombination in the common gene space greatly amplifies the variability produced by the retrotransposition explosion in the maize ancestry, creating the heterogeneity in genome organization found in modern maize. PMID:17101975

  20. Remarkable variation in maize genome structure inferred from haplotype diversity at the bz locus.

    PubMed

    Wang, Qinghua; Dooner, Hugo K

    2006-11-21

    Maize is probably the most diverse of all crop species. Unexpectedly large differences among haplotypes were first revealed in a comparison of the bz genomic regions of two different inbred lines, McC and B73. Retrotransposon clusters, which comprise most of the repetitive DNA in maize, varied markedly in makeup, and location relative to the genes in the region and genic sequences, later shown to be carried by two helitron transposons, also differed between the inbreds. Thus, the allelic bz regions of these Corn Belt inbreds shared only a minority of the total sequence. To investigate further the variation caused by retrotransposons, helitrons, and other insertions, we have analyzed the organization of the bz genomic region in five additional cultivars selected because of their geographic and genetic diversity: the inbreds A188, CML258, and I137TN, and the land races Coroico and NalTel. This vertical comparison has revealed the existence of several new helitrons, new retrotransposons, members of every superfamily of DNA transposons, numerous miniature elements, and novel insertions flanked at either end by TA repeats, which we call TAFTs (TA-flanked transposons). The extent of variation in the region is remarkable. In pairwise comparisons of eight bz haplotypes, the percentage of shared sequences ranges from 25% to 84%. Chimeric haplotypes were identified that combine retrotransposon clusters found in different haplotypes. We propose that recombination in the common gene space greatly amplifies the variability produced by the retrotransposition explosion in the maize ancestry, creating the heterogeneity in genome organization found in modern maize.

  1. Inferring Properties of Ancient Cyanobacteria from Biogeochemical Activity and Genomes of Siderophilic Cyanobacteria

    NASA Technical Reports Server (NTRS)

    McKay, David S.; Brown, I. I.; Tringe, S. G.; Thomas-Keprta, K. E.; Bryant, D. A.; Sarkisova, S. S.; Malley, K.; Sosa, O.; Klatt, C. G.; McKay, D. S.

    2010-01-01

    Interrelationships between life and the planetary system could have simultaneously left landmarks in genomes of microbes and physicochemical signatures in the lithosphere. Verifying the links between genomic features in living organisms and the mineralized signatures generated by these organisms will help to reveal traces of life on Earth and beyond. Among contemporary environments, iron-depositing hot springs (IDHS) may represent one of the most appropriate natural models [1] for insights into ancient life since organisms may have originated on Earth and probably Mars in association with hydrothermal activity [2,3]. IDHS also seem to be appropriate models for studying certain biogeochemical processes that could have taken place in the late Archean and/or early Paleoproterozoic eras [4, 5]. It has been suggested that inorganic polyphosphate (PPi), in chains of tens to hundreds of phosphate residues linked by high-energy bonds, is environmentally ubiquitous and abundant [6]. Cyanobacteria (CB) react to increased heavy metal concentrations and UV by enhanced generation of PPi bodies (PPB) [7], which are believed to be signatures of life [8]. However, the role of PPi in oxygenic prokaryotes for the suppression of oxidative stress induced by high Fe is poorly studied. Here we present preliminary results of a new mechanism of Fe mineralization in oxygenic prokaryotes, the effect of Fe on the generation of PPi bodies in CB, as well as preliminary analysis of the diversity and phylogeny of proteins involved in the prevention of oxidative stress in phototrophs inhabiting IDHS.

  2. Inferring the choreography of parental genomes during fertilization from ultralarge-scale whole-transcriptome analysis.

    PubMed

    Park, Sung-Joon; Komata, Makiko; Inoue, Fukashi; Yamada, Kaori; Nakai, Kenta; Ohsugi, Miho; Shirahige, Katsuhiko

    2013-12-15

    Fertilization precisely choreographs parental genomes by using gamete-derived cellular factors and activating genome regulatory programs. However, the mechanism remains elusive owing to the technical difficulties of preparing large numbers of high-quality preimplantation cells. Here, we collected >14 × 10^4 high-quality mouse metaphase II oocytes and used these to establish detailed transcriptional profiles for four early embryo stages and parthenogenetic development. By combining these profiles with other public resources, we found evidence that gene silencing appeared to be mediated in part by noncoding RNAs and that this was a prerequisite for post-fertilization development. Notably, we identified 817 genes that were differentially expressed in embryos after fertilization compared with parthenotes. The regulation of these genes was distinctly different from those expressed in parthenotes, suggesting functional specialization of particular transcription factors prior to first cell cleavage. We identified five transcription factors that were potentially necessary for developmental progression: Foxd1, Nkx2-5, Sox18, Myod1, and Runx1. Our very large-scale whole-transcriptome profile of early mouse embryos yielded a novel and valuable resource for studies in developmental biology and stem cell research. The database is available at http://dbtmee.hgc.jp.

  3. Primate phylogenetic relationships and divergence dates inferred from complete mitochondrial genomes

    PubMed Central

    Hodgson, Jason A.; Burrell, Andrew S.; Sterner, Kirstin N.; Raaum, Ryan L.; Disotell, Todd R.

    2014-01-01

    The origins and the divergence times of the most basal lineages within primates have been difficult to resolve mainly due to the incomplete sampling of early fossil taxa. The main source of contention is related to the discordance between molecular and fossil estimates: while there are no crown primate fossils older than 56 Ma, most molecule-based estimates extend the origins of crown primates into the Cretaceous. Here we present a comprehensive mitogenomic study of primates. We assembled 87 mammalian mitochondrial genomes, including 62 primate species representing all the families of the order. We newly sequenced eleven mitochondrial genomes, including eight Old World monkeys and three strepsirrhines. Phylogenetic analyses support a strong topology, confirming the monophyly for all the major primate clades. In contrast to previous mitogenomic studies, the positions of tarsiers and colugos relative to strepsirrhines and anthropoids are well resolved. In order to improve our understanding of how fossil calibrations affect age estimates within primates, we explore the effect of seventeen fossil calibrations across primates and other mammalian groups and we select a subset of calibrations to date our mitogenomic tree. The divergence date estimates of the Strepsirrhine/Haplorhine split support an origin of crown primates in the Late Cretaceous, at around 74 Ma. This result supports a short fuse model of primate origins, whereby relatively little time passed between the origin of the order and the diversification of its major clades. It also suggests that the early primate fossil record is likely poorly sampled. PMID:24583291

  4. Bacterioplankton assembly and interspecies interaction indicating increasing coastal eutrophication.

    PubMed

    Dai, Wenfang; Zhang, Jinjie; Tu, Qichao; Deng, Ye; Qiu, Qiongfen; Xiong, Jinbo

    2017-06-01

    Anthropogenic perturbations impose negative effects on coastal ecosystems, such as increasing levels of eutrophication. Given the biogeochemical significance of microorganisms, understanding the processes and mechanisms underlying their spatial distribution under changing environmental conditions is critical. To address this question, we examined how coastal bacterioplankton communities respond to increasing eutrophication levels created by anthropogenic perturbations. The results showed that the magnitude of changes in the bacterioplankton community compositions (BCCs) and the importance of deterministic processes that constrained bacterial assembly were closely associated with eutrophication levels. Moreover, increasing eutrophication significantly (P < 0.001) attenuated the distance decay rate, with a random spatial distribution of BCCs in the undisturbed location. In contrast, the complexity of interspecies interaction was enhanced under moderate eutrophication levels but declined under heavy eutrophication. Changes in the relative abundances of 27 bacterial families were significantly correlated with eutrophication levels. Notably, the pattern of enrichment or decrease for a given bacterial family was consistent with its known ecological functions. Our findings demonstrate that the magnitude of changes in BCCs and underlying determinism are dependent on eutrophication levels. However, the buffer capacity of bacterioplankton community is limited, with disrupted interspecies interaction occurring under heavy eutrophication. As such, bacterial assemblages are sensitive to changes in environmental conditions and could thus potentially serve as bio-indicators for increasing eutrophication.

  5. Redox-Specialized Bacterioplankton Metacommunity in a Temperate Estuary

    PubMed Central

    Laas, Peeter; Simm, Jaak; Lips, Inga; Lips, Urmas; Kisand, Veljo; Metsis, Madis

    2015-01-01

    This study explored the spatiotemporal dynamics of the bacterioplankton community composition in the Gulf of Finland (easternmost sub-basin of the Baltic Sea) based on phylogenetic analysis of 16S rDNA sequences acquired from community samples via pyrosequencing. Investigations of bacterioplankton in hydrographically complex systems provide good insight into the strategies by which microbes deal with spatiotemporal hydrographic gradients, as demonstrated by our research. Many ribotypes were closely affiliated with sequences isolated from environments with similar steep physiochemical gradients and/or seasonal changes, including seasonally anoxic estuaries. Hence, one of the main conclusions of this study is that marine ecosystems where oxygen and salinity gradients co-occur can be considered a habitat for a cosmopolitan metacommunity consisting of specialized groups occupying niches universal to such environments throughout the world. These niches revolve around functional capabilities to utilize different electron receptors and donors (including trace metal and single carbon compounds). On the other hand, temporal shifts in the bacterioplankton community composition at the surface layer were mainly connected to the seasonal succession of phytoplankton and the inflow of freshwater species. We also conclude that many relatively abundant populations are indigenous and well-established in the area. PMID:25860812

  6. King penguin demography since the last glaciation inferred from genome-wide data

    PubMed Central

    Trucchi, Emiliano; Gratton, Paolo; Whittington, Jason D.; Cristofari, Robin; Le Maho, Yvon; Stenseth, Nils Chr; Le Bohec, Céline

    2014-01-01

    How natural climate cycles, such as past glacial/interglacial patterns, have shaped species distributions at the high-latitude regions of the Southern Hemisphere is still largely unclear. Here, we show how the post-glacial warming following the Last Glacial Maximum (ca 18 000 years ago), allowed the (re)colonization of the fragmented sub-Antarctic habitat by an upper-level marine predator, the king penguin Aptenodytes patagonicus. Using restriction site-associated DNA sequencing and standard mitochondrial data, we tested the behaviour of subsets of anonymous nuclear loci in inferring past demography through coalescent-based and allele frequency spectrum analyses. Our results show that the king penguin population breeding on Crozet archipelago steeply increased in size, closely following the Holocene warming recorded in the Epica Dome C ice core. The following population growth can be explained by a threshold model in which the ecological requirements of this species (year-round ice-free habitat for breeding and access to a major source of food such as the Antarctic Polar Front) were met on Crozet soon after the Pleistocene/Holocene climatic transition. PMID:24920481

  7. King penguin demography since the last glaciation inferred from genome-wide data.

    PubMed

    Trucchi, Emiliano; Gratton, Paolo; Whittington, Jason D; Cristofari, Robin; Le Maho, Yvon; Stenseth, Nils Chr; Le Bohec, Céline

    2014-07-22

    How natural climate cycles, such as past glacial/interglacial patterns, have shaped species distributions at the high-latitude regions of the Southern Hemisphere is still largely unclear. Here, we show how the post-glacial warming following the Last Glacial Maximum (ca 18 000 years ago), allowed the (re)colonization of the fragmented sub-Antarctic habitat by an upper-level marine predator, the king penguin Aptenodytes patagonicus. Using restriction site-associated DNA sequencing and standard mitochondrial data, we tested the behaviour of subsets of anonymous nuclear loci in inferring past demography through coalescent-based and allele frequency spectrum analyses. Our results show that the king penguin population breeding on Crozet archipelago steeply increased in size, closely following the Holocene warming recorded in the Epica Dome C ice core. The following population growth can be explained by a threshold model in which the ecological requirements of this species (year-round ice-free habitat for breeding and access to a major source of food such as the Antarctic Polar Front) were met on Crozet soon after the Pleistocene/Holocene climatic transition.

  8. Triallelic Population Genomics for Inferring Correlated Fitness Effects of Same Site Nonsynonymous Mutations.

    PubMed

    Ragsdale, Aaron P; Coffman, Alec J; Hsieh, PingHsun; Struck, Travis J; Gutenkunst, Ryan N

    2016-05-01

    The distribution of mutational effects on fitness is central to evolutionary genetics. Typical univariate distributions, however, cannot model the effects of multiple mutations at the same site, so we introduce a model in which mutations at the same site have correlated fitness effects. To infer the strength of that correlation, we developed a diffusion approximation to the triallelic frequency spectrum, which we applied to data from Drosophila melanogaster. We found a moderate positive correlation between the fitness effects of nonsynonymous mutations at the same codon, suggesting that both mutation identity and location are important for determining fitness effects in proteins. We validated our approach by comparing it to biochemical mutational scanning experiments, finding strong quantitative agreement, even between different organisms. We also found that the correlation of mutational fitness effects was not affected by protein solvent exposure or structural disorder. Together, our results suggest that the correlation of fitness effects at the same site is a previously overlooked yet fundamental property of protein evolution.

  9. Evolutionary landscape of amphibians emerging from ancient freshwater fish inferred from complete mitochondrial genomes.

    PubMed

    Wang, Xiao-Tong; Zhang, Yan-Feng; Wu, Qian; Zhang, Hao

    2012-05-04

    It is very interesting that the only extant marine amphibian is the marine frog, Fejervarya cancrivora. This study investigated the reasons for this apparent rarity by conducting a phylogenetic tree analysis of the complete mitochondrial genomes from 14 amphibians, 67 freshwater fishes, four migratory fishes, 35 saltwater fishes, and one hemichordate. The results showed that amphibians, living fossil fishes, and the common ancestors of modern fishes are phylogenetically separated. In general, amphibians, living fossil fishes, saltwater fishes, and freshwater fishes are clustered in different clades. This suggests that the ancestor of living amphibians arose from a type of primordial freshwater fish, rather than the coelacanth, lungfish, or modern saltwater fish. Modern freshwater fish and modern saltwater fish were probably separated from a common ancestor by a single event, caused by crustal movement.

  10. Conflicting genomic signals affect phylogenetic inference in four species of North American pines

    PubMed Central

    Koralewski, Tomasz E.; Mateos, Mariana; Krutovsky, Konstantin V.

    2016-01-01

    Adaptive evolutionary processes in plants may be accompanied by episodes of introgression, parallel evolution and incomplete lineage sorting that pose challenges in untangling species evolutionary history. Genus Pinus (pines) is one of the most abundant and most studied groups among gymnosperms, and a good example of a lineage where these phenomena have been observed. Pines are among the most ecologically and economically important plant species. Some, such as the pines of the southeastern USA (southern pines in subsection Australes), are subjects of intensive breeding programmes. Despite numerous published studies, the evolutionary history of Australes remains ambiguous and often controversial. We studied the phylogeny of four major southern pine species: shortleaf (Pinus echinata), slash (P. elliottii), longleaf (P. palustris) and loblolly (P. taeda), using sequences from 11 nuclear loci and maximum likelihood and Bayesian methods. Our analysis encountered resolution difficulties similar to earlier published studies. Although incomplete lineage sorting and introgression are two phenomena presumptively underlying our results, the phylogenetic inferences seem to be also influenced by the genes examined, with certain topologies supported by sets of genes sharing common putative functionalities. For example, genes involved in wood formation supported the clade echinata–taeda, genes linked to plant defence supported the clade echinata–elliottii and genes linked to water management properties supported the clade echinata–palustris. The support for these clades was very high and consistent across methods. We discuss the potential factors that could underlie these observations, including incomplete lineage sorting, hybridization and parallel or adaptive evolution. Our results likely reflect the relatively short evolutionary history of the subsection that is thought to have begun during the middle Miocene and has been influenced by climate fluctuations. PMID

  11. Conflicting genomic signals affect phylogenetic inference in four species of North American pines.

    PubMed

    Koralewski, Tomasz E; Mateos, Mariana; Krutovsky, Konstantin V

    2016-01-01

    Adaptive evolutionary processes in plants may be accompanied by episodes of introgression, parallel evolution and incomplete lineage sorting that pose challenges in untangling species evolutionary history. Genus Pinus (pines) is one of the most abundant and most studied groups among gymnosperms, and a good example of a lineage where these phenomena have been observed. Pines are among the most ecologically and economically important plant species. Some, such as the pines of the southeastern USA (southern pines in subsection Australes), are subjects of intensive breeding programmes. Despite numerous published studies, the evolutionary history of Australes remains ambiguous and often controversial. We studied the phylogeny of four major southern pine species: shortleaf (Pinus echinata), slash (P. elliottii), longleaf (P. palustris) and loblolly (P. taeda), using sequences from 11 nuclear loci and maximum likelihood and Bayesian methods. Our analysis encountered resolution difficulties similar to earlier published studies. Although incomplete lineage sorting and introgression are two phenomena presumptively underlying our results, the phylogenetic inferences seem to be also influenced by the genes examined, with certain topologies supported by sets of genes sharing common putative functionalities. For example, genes involved in wood formation supported the clade echinata-taeda, genes linked to plant defence supported the clade echinata-elliottii and genes linked to water management properties supported the clade echinata-palustris. The support for these clades was very high and consistent across methods. We discuss the potential factors that could underlie these observations, including incomplete lineage sorting, hybridization and parallel or adaptive evolution. Our results likely reflect the relatively short evolutionary history of the subsection that is thought to have begun during the middle Miocene and has been influenced by climate fluctuations.

  12. Phylogeography of the fire-bellied toads Bombina: independent Pleistocene histories inferred from mitochondrial genomes.

    PubMed

    Hofman, Sebastian; Spolsky, Christina; Uzzell, Thomas; Cogălniceanu, Dan; Babik, Wiesław; Szymura, Jacek M

    2007-06-01

    The fire-bellied toads Bombina bombina and Bombina variegata interbreed in a long, narrow zone maintained by a balance between selection and dispersal. Hybridization takes place between local, genetically differentiated groups. To quantify divergence between these groups and reconstruct their history and demography, we analysed nucleotide variation at the mitochondrial cytochrome b gene (1096 bp) in 364 individuals from 156 sites representing the entire range of both species. Three distinct clades with high sequence divergence (K2P = 8-11%) were distinguished. One clade grouped B. bombina haplotypes; the two other clades grouped B. variegata haplotypes. One B. variegata clade included only Carpathian individuals; the other represented B. variegata from the southwestern parts of its distribution: Southern and Western Europe (Balkano-Western lineage), Apennines, and the Rhodope Mountains. Differentiation between the Carpathian and Balkano-Western lineages, K2P approximately 8%, approached interspecific divergence. Deep divergence among European Bombina lineages suggests their preglacial origin, and implies long and largely independent evolutionary histories of the species. Multiple glacial refugia were identified in the lowlands adjoining the Black Sea, in the Carpathians, in the Balkans, and in the Apennines. The results of the nested clade and demographic analyses suggest drastic reductions of population sizes during the last glacial period, and significant demographic growth related to postglacial colonization. Inferred history, supported by fossil evidence, demonstrates that Bombina ranges underwent repeated contractions and expansions. Geographical concordance between morphology, allozymes, and mtDNA shows that previous episodes of interspecific hybridization have left no detectable mtDNA introgression. Either the admixed populations went extinct, or selection against hybrids hindered mtDNA gene flow in ancient hybrid zones.
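
    The K2P values quoted in this record are Kimura two-parameter distances, which correct the observed proportions of transitions (P) and transversions (Q) between two aligned sequences as d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). A minimal sketch of that calculation follows. It assumes pre-aligned sequences of equal length and skips any site that is not A, C, G or T; the function name `k2p_distance` is hypothetical and this is not the authors' pipeline.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguity codes (an assumption of this sketch)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

    Multiplying the returned value by 100 gives a percentage comparable in form to the figures in the abstract.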

  13. Module Anchored Network Inference: A Sequential Module-Based Approach to Novel Gene Network Construction from Genomic Expression Data on Human Disease Mechanism

    PubMed Central

    Keller, Susanna R.; Lee, Jae K.

    2017-01-01

    Different computational approaches have been examined and compared for inferring network relationships from time-series genomic data on human disease mechanisms under the recent Dialogue on Reverse Engineering Assessment and Methods (DREAM) challenge. Many of these approaches infer all possible relationships among all candidate genes, often resulting in extremely crowded candidate network relationships with many more False Positives than True Positives. To overcome this limitation, we introduce a novel approach, Module Anchored Network Inference (MANI), that constructs networks by sequentially analyzing small adjacent building blocks (modules). Using MANI, we inferred a 7-gene adipogenesis network based on time-series gene expression data during adipocyte differentiation. MANI was also applied to infer two 10-gene networks based on time-course perturbation datasets from DREAM3 and DREAM4 challenges. In these applications, MANI successfully inferred and distinguished serial, parallel, and time-dependent gene interactions and network cascades, showing superior performance to other in silico network inference techniques for discovering and reconstructing gene network relationships. PMID:28197408

  14. Effects of UV-B Radiation on the Structural and Physiological Diversity of Bacterioneuston and Bacterioplankton

    PubMed Central

    Santos, Ana L.; Oliveira, Vanessa; Baptista, Inês; Henriques, Isabel; Gomes, Newton C. M.; Almeida, Adelaide; Correia, António

    2012-01-01

    The effects of UV radiation (UVR) on estuarine bacterioneuston and bacterioplankton were assessed in microcosm experiments. Bacterial abundance and DNA synthesis were more affected in bacterioplankton. Protein synthesis was more inhibited in bacterioneuston. Community analysis indicated that UVR has the potential to select resistant bacteria (e.g., Gammaproteobacteria), particularly abundant in bacterioneuston. PMID:22247171

  15. Inferring regulatory elements from a whole genome. An analysis of Helicobacter pylori sigma(80) family of promoter signals.

    PubMed

    Vanet, A; Marsan, L; Labigne, A; Sagot, M F

    2000-03-24

    Helicobacter pylori is adapted to life in a unique niche, the gastric epithelium of primates. Its promoters may therefore be different from those of other bacteria. Here, we determine motifs possibly involved in the recognition of such promoter sequences by the RNA polymerase using a new motif identification method. An important feature of this method is that the motifs are sought with the least possible assumptions about what they may look like. The method starts by considering the whole genome of H. pylori and attempts to infer directly from it a description for a family of promoters. Thus, this approach differs from searching for such promoters with a previously established description. The two algorithms are based on the idea of inferring motifs by flexibly comparing words in the sequences with an external object, instead of between themselves. The first algorithm infers single motifs, the second a combination of two motifs separated from one another by strictly defined, sterically constrained distances. Besides independently finding motifs known to be present in other bacteria, such as the Shine-Dalgarno sequence and the TATA-box, this approach suggests the existence in H. pylori of a new, combined motif, TTAAGC, followed optimally 21 bp downstream by TATAAT. Between these two motifs, there is in some cases another, TTTTAA or, less frequently, a repetition of TTAAGC separated optimally from the TATA-box by 12 bp. The combined motif TTAAGCx(21+/-2)TATAAT is present with no errors immediately upstream from the only two copies of the ribosomal 23 S-5 S RNA genes in H. pylori, and with one error upstream from the only two copies of the ribosomal 16 S RNA genes. The operons of both ribosomal RNA molecules are strongly expressed, representing an encouraging sign of the pertinence of the motifs found by the algorithms. In 25 cases out of a possible 30, the combined motif is found with no more than three substitutions immediately upstream from ribosomal proteins, or
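
    The combined-motif notation TTAAGCx(21+/-2)TATAAT in this record can be read as a pattern: TTAAGC, a spacer of 19 to 23 arbitrary bases, then TATAAT. The sketch below scans a sequence for exact occurrences of that pattern. It assumes this reading of the notation and does not allow substitutions, so it is not the inference method described in the abstract, only a lookup for an already-known motif. The name `find_combined_motif` is hypothetical.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
# Spacer of 19-23 bases is this sketch's reading of x(21+/-2).
COMBINED_MOTIF = re.compile(r"(?=(TTAAGC[ACGT]{19,23}TATAAT))")


def find_combined_motif(sequence):
    """Return (start, matched_text) for each exact occurrence on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in COMBINED_MOTIF.finditer(seq)]
```

    Only the strand passed in is searched; scanning the reverse complement as well would be a separate call on the reverse-complemented sequence.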

  16. Morphological homoplasy, life history evolution, and historical biogeography of plethodontid salamanders inferred from complete mitochondrial genomes

    PubMed Central

    Mueller, Rachel Lockridge; Macey, J. Robert; Jaekel, Martin; Wake, David B.; Boore, Jeffrey L.

    2004-01-01

    The evolutionary history of the largest salamander family (Plethodontidae) is characterized by extreme morphological homoplasy. Analysis of the mechanisms generating such homoplasy requires an independent molecular phylogeny. To this end, we sequenced 24 complete mitochondrial genomes (22 plethodontids and two outgroup taxa), added data for three species from GenBank, and performed partitioned and unpartitioned Bayesian, maximum likelihood, and maximum parsimony phylogenetic analyses. We explored four dataset partitioning strategies to account for evolutionary process heterogeneity among genes and codon positions, all of which yielded increased model likelihoods and decreased numbers of supported nodes in the topologies (Bayesian posterior probability >0.95) relative to the unpartitioned analysis. Our phylogenetic analyses yielded congruent trees that contrast with the traditional morphology-based taxonomy; the monophyly of three of four major groups is rejected. Reanalysis of current hypotheses in light of these evolutionary relationships suggests that (i) a larval life history stage reevolved from a direct-developing ancestor multiple times; (ii) there is no phylogenetic support for the “Out of Appalachia” hypothesis of plethodontid origins; and (iii) novel scenarios must be reconstructed for the convergent evolution of projectile tongues, reduction in toe number, and specialization for defensive tail loss. Some of these scenarios imply morphological transformation series that proceed in the opposite direction than was previously thought. In addition, they suggest surprising evolutionary lability in traits previously interpreted to be conservative. PMID:15365171

  17. Morphological homoplasy, life history evolution, and historical biogeography of plethodontid salamanders inferred from complete mitochondrial genomes

    SciTech Connect

    Mueller, Rachel Lockridge; Macey, J. Robert; Jaekel, Martin; Wake, David B.; Boore, Jeffrey L.

    2004-08-01

    The evolutionary history of the largest salamander family (Plethodontidae) is characterized by extreme morphological homoplasy. Analysis of the mechanisms generating such homoplasy requires an independent, molecular phylogeny. To this end, we sequenced 24 complete mitochondrial genomes (22 plethodontids and two outgroup taxa), added data for three species from GenBank, and performed partitioned and unpartitioned Bayesian, ML, and MP phylogenetic analyses. We explored four dataset partitioning strategies to account for evolutionary process heterogeneity among genes and codon positions, all of which yielded increased model likelihoods and decreased numbers of supported nodes in the topologies (PP > 0.95) relative to the unpartitioned analysis. Our phylogenetic analyses yielded congruent trees that contrast with the traditional morphology-based taxonomy; the monophyly of three out of four major groups is rejected. Reanalysis of current hypotheses in light of these new evolutionary relationships suggests that (1) a larval life history stage re-evolved from a direct-developing ancestor multiple times, (2) there is no phylogenetic support for the "Out of Appalachia" hypothesis of plethodontid origins, and (3) novel scenarios must be reconstructed for the convergent evolution of projectile tongues, reduction in toe number, and specialization for defensive tail loss. Some of these novel scenarios imply morphological transformation series that proceed in the opposite direction than was previously thought. In addition, they suggest surprising evolutionary lability in traits previously interpreted to be conservative.

  18. Phylogenetic Diversity of the Enteric Pathogen Salmonella enterica subsp. enterica Inferred from Genome-Wide Reference-Free SNP Characters

    PubMed Central

    Timme, Ruth E.; Pettengill, James B.; Allard, Marc W.; Strain, Errol; Barrangou, Rodolphe; Wehnes, Chris; Van Kessel, JoAnn S.; Karns, Jeffrey S.; Musser, Steven M.; Brown, Eric W.

    2013-01-01

    The enteric pathogen Salmonella enterica is one of the leading causes of foodborne illness in the world. The species is extremely diverse, containing more than 2,500 named serovars that are designated for their unique antigen characters and pathogenicity profiles; some are known to be virulent pathogens, while others are not. Questions regarding the evolution of pathogenicity, significance of antigen characters, diversity of clustered regularly interspaced short palindromic repeat (CRISPR) loci, among others, will remain elusive until a strong evolutionary framework is established. We present the first large-scale S. enterica subsp. enterica phylogeny inferred from a new reference-free k-mer approach of gathering single nucleotide polymorphisms (SNPs) from whole genomes. The phylogeny of 156 isolates representing 78 serovars (102 were newly sequenced) reveals two major lineages, each with many strongly supported sublineages. One of these lineages is the S. Typhi group, well nested within the phylogeny. Lineage-through-time analyses suggest there have been two instances of accelerated rates of diversification within the subspecies. We also found that antigen characters and CRISPR loci reveal evolutionary patterns different from that of the phylogeny, suggesting that a horizontal gene transfer or possibly a shared environmental acquisition might have influenced the present character distribution. Our study also shows the ability to extract reference-free SNPs from a large set of genomes and then to use these SNPs for phylogenetic reconstruction. This automated, annotation-free approach is an important step forward for bacterial disease tracking and in efficiently elucidating the evolutionary history of highly clonal organisms. PMID:24158624

  19. Phylogeny and biogeography of the family Salamandridae (Amphibia: Caudata) inferred from complete mitochondrial genomes.

    PubMed

    Zhang, Peng; Papenfuss, Theodore J; Wake, Marvalee H; Qu, Lianghu; Wake, David B

    2008-11-01

    Phylogenetic relationships of members of the salamander family Salamandridae were examined using complete mitochondrial genomes collected from 42 species representing all 20 salamandrid genera and five outgroup taxa. Weighted maximum parsimony, partitioned maximum likelihood, and partitioned Bayesian approaches all produce an identical, well-resolved phylogeny; most branches are strongly supported with greater than 90% bootstrap values and 1.0 Bayesian posterior probabilities. Our results support recent taxonomic changes in finding the traditional genera Mertensiella, Euproctus, and Triturus to be non-monophyletic species assemblages. We successfully resolved the current polytomy at the base of the salamandrid tree: the Italian newt genus Salamandrina is sister to all remaining salamandrids. Beyond Salamandrina, a clade comprising all remaining newts is separated from a clade containing the true salamanders. Among these newts, the branching orders of well-supported clades are: primitive newts (Echinotriton, Pleurodeles, and Tylototriton), New World newts (Notophthalmus-Taricha), Corsica-Sardinia newts (Euproctus), and modern European newts (Calotriton, Lissotriton, Mesotriton, Neurergus, Ommatotriton, and Triturus) plus modern Asian newts (Cynops, Pachytriton, and Paramesotriton). Two alternative sets of calibration points and two Bayesian dating methods (BEAST and MultiDivTime) were used to estimate timescales for salamandrid evolution. The estimation difference by dating methods is slight and we propose two sets of timescales based on different calibration choices. The two timescales suggest that the initial diversification of extant salamandrids took place in Europe about 97 or 69 Ma. North American salamandrids were derived from their European ancestors by dispersal through North Atlantic Land Bridges in the Late Cretaceous (approximately 69 Ma) or Middle Eocene (approximately 43 Ma). Ancestors of Asian salamandrids most probably dispersed to the eastern Asia

  20. Phylogenetic relationships and divergence dates of softshell turtles (Testudines: Trionychidae) inferred from complete mitochondrial genomes.

    PubMed

    Li, Haifeng; Liu, Juanjuan; Xiong, Lei; Zhang, Huanhuan; Zhou, Huaxing; Yin, Huazong; Jing, Wanxing; Li, Jun; Shi, Qiong; Wang, Yuqin; Liu, Jianjun; Nie, Liuwang

    2017-03-15

    The softshell turtles (Trionychidae) are one of the most widely distributed reptile groups in the world, and fossils have been found on all continents except Antarctica. The phylogenetic relationships among members of this group have been previously studied; however, there are disagreements regarding its taxonomy, and its phylogeography and divergence times are still poorly understood. Here we present a comprehensive mitogenomic study of softshell turtles. We sequenced the complete mitochondrial genomes of 10 softshell turtles, in addition to the GenBank sequences of Dogania subplana, Lissemys punctata and Trionyx triunguis, which cover all extant genera within Trionychidae except for Cyclanorbis and Cycloderma. These data were combined with other mitogenomes of turtles for phylogenetic analyses. Divergence time-calibration and ancestral reconstruction were calculated using BEAST and RASP software, respectively. Our phylogenetic analyses indicate that Trionychidae is the sister taxon of Carettochelyidae, and support the monophyly of Trionychinae and Cyclanorbinae, which is consistent with morphological data and molecular analysis. Our phylogenetic analyses have established a sister taxon relationship between the Asian Rafetus and the Asian Palea + Pelodiscus + Dogania + Nilssonia + Amyda, whereas a previous study grouped the Asian Rafetus with the American Apalone. The results of divergence time estimates and area ancestral reconstruction show that extant Trionychidae originated in Asia at around 108 million years ago (MA), and radiations mainly occurred during two warm periods, namely, Late Cretaceous-Early Eocene and Oligocene. By combining the estimated divergence time and the reconstructed ancestral area of softshell turtles, we determined that the dispersal of softshell turtles out of Asia may have taken three routes. Furthermore, the times of dispersal seem to be in agreement with the time of the India-Asia collision and opening of the Bering Strait, which

  1. Higher-level salamander relationships and divergence dates inferred from complete mitochondrial genomes.

    PubMed

    Zhang, Peng; Wake, David B

    2009-11-01

    Phylogenetic relationships among the salamander families have been difficult to resolve, largely because the window of time in which major lineages diverged was very short relative to the subsequently long evolutionary history of each family. We present seven new complete mitochondrial genomes representing five salamander families that have no or few mitogenome records in GenBank in order to assess the phylogenetic relationships of all salamander families from a mitogenomic perspective. Phylogenetic analyses of two data sets (one combining the entire mitogenome sequence except for the D-loop, and the other combining the deduced amino acid sequences of all 13 mitochondrial protein-coding genes) produce nearly identical well-resolved topologies. The monophyly of each family is supported, including the controversial Proteidae. The internally fertilizing salamanders are demonstrated to be a clade, concordant with recent results using nuclear genes. The internally fertilizing salamanders include two well-supported clades: one is composed of Ambystomatidae, Dicamptodontidae, and Salamandridae, the other Proteidae, Rhyacotritonidae, Amphiumidae, and Plethodontidae. In contrast to results from nuclear loci, our results support the conventional morphological hypothesis that Sirenidae is the sister-group to all other salamanders and they statistically reject the hypothesis from nuclear genes that the suborder Cryptobranchoidea (Cryptobranchidae+Hynobiidae) branched earlier than the Sirenidae. Using recently recommended fossil calibration points and a "soft bound" calibration strategy, we recalculated evolutionary timescales for tetrapods with an emphasis on living salamanders, under a Bayesian framework with and without a rate-autocorrelation assumption. Our dating results indicate: (i) the widely used rate-autocorrelation assumption in relaxed clock analyses is problematic and the accuracy of molecular dating for early lissamphibian evolution is questionable; (ii) the initial

  2. A new statistic and its power to infer membership in a genome-wide association study using genotype frequencies.

    PubMed

    Jacobs, Kevin B; Yeager, Meredith; Wacholder, Sholom; Craig, David; Kraft, Peter; Hunter, David J; Paschal, Justin; Manolio, Teri A; Tucker, Margaret; Hoover, Robert N; Thomas, Gilles D; Chanock, Stephen J; Chatterjee, Nilanjan

    2009-11-01

    Aggregate results from genome-wide association studies (GWAS), such as genotype frequencies for cases and controls, were until recently often made available on public websites because they were thought to disclose negligible information concerning an individual's participation in a study. Homer et al. recently suggested that a method for forensic detection of an individual's contribution to an admixed DNA sample could be applied to aggregate GWAS data. Using a likelihood-based statistical framework, we developed an improved statistic that uses genotype frequencies and individual genotypes to infer whether a specific individual or any close relatives participated in the GWAS and, if so, what the participant's phenotype status is. Our statistic compares the logarithm of genotype frequencies, in contrast to that of Homer et al., which is based on differences in either SNP probe intensity or allele frequencies. We derive the theoretical power of our test statistics and explore the empirical performance in scenarios with varying numbers of randomly chosen or top-associated SNPs.

  3. Genome at Juncture of Early Human Migration: A Systematic Analysis of Two Whole Genomes and Thirteen Exomes from Kuwaiti Population Subgroup of Inferred Saudi Arabian Tribe Ancestry

    PubMed Central

    Alsmadi, Osama; Hebbar, Prashantha; Antony, Dinu; Behbehani, Kazem; Thanaraj, Thangavel Alphonse

    2014-01-01

    The population of the State of Kuwait is composed of three genetic subgroups of inferred Persian, Saudi Arabian tribe and Bedouin ancestry. The Saudi Arabian tribe subgroup traces its origin to the Najd region of Saudi Arabia. By sequencing two whole genomes and thirteen exomes from this subgroup at high coverage (>40X), we identify 4,950,724 Single Nucleotide Polymorphisms (SNPs), 515,802 indels and 39,762 structural variations. Of the identified variants, 10,098 (8.3%) exomic SNPs, 139,923 (2.9%) non-exomic SNPs, 5,256 (54.3%) exomic indels, and 374,959 (74.08%) non-exomic indels are ‘novel’. Up to 8,070 (79.9%) of the reported novel biallelic exomic SNPs are seen in low frequency (minor allele frequency <5%). We observe 5,462 known and 1,004 novel potentially deleterious nonsynonymous SNPs. Allele frequencies of common SNPs from the 15 exomes are significantly correlated with those from genotype data of a larger cohort of 48 individuals (Pearson correlation coefficient, 0.91; p < 2.2×10^-16). A set of 2,485 SNPs show significantly different allele frequencies when compared to populations from other continents. Two notable variants having risk alleles in high frequencies in this subgroup are: a nonsynonymous deleterious SNP (rs2108622 [19:g.15990431C>T] from CYP4F2 gene [MIM:*604426]) associated with warfarin dosage levels [MIM:#122700] required to elicit normal anticoagulant response; and a 3′ UTR SNP (rs6151429 [22:g.51063477T>C] from ARSA gene [MIM:*607574]) associated with Metachromatic Leukodystrophy [MIM:#250100]. Hemoglobin Riyadh variant (identified for the first time in a Saudi Arabian woman) is observed in the exome data. The mitochondrial haplogroup profiles of the 15 individuals are consistent with the haplogroup diversity seen in Saudi Arabian natives, who are believed to have received substantial gene flow from Africa and eastern provenance. We present the first genome resource imperative for designing future genetic studies in Saudi Arabian

  4. Coastal Bacterioplankton Community Dynamics in Response to a Natural Disturbance

    PubMed Central

    Rappé, Michael S.

    2013-01-01

    In order to characterize how disturbances to microbial communities are propagated over temporal and spatial scales in aquatic environments, the dynamics of bacterial assemblages throughout a subtropical coastal embayment were investigated via SSU rRNA gene analyses over an 8-month period, which encompassed a large storm event. During non-perturbed conditions, sampling sites clustered into three groups based on their microbial community composition: an offshore oceanic group, a freshwater group, and a distinct and persistent coastal group. Significant differences in measured environmental parameters or in the bacterial community due to the storm event were found only within the coastal cluster of sampling sites, and only at 5 of 12 locations; three of these sites showed a significant response in both environmental and bacterial community characteristics. These responses were most pronounced at sites close to the shoreline. During the storm event, otherwise common bacterioplankton community members such as marine Synechococcus sp. and members of the SAR11 clade of Alphaproteobacteria decreased in relative abundance in the affected coastal zone, whereas several lineages of Gammaproteobacteria, Betaproteobacteria, and members of the Roseobacter clade of Alphaproteobacteria increased. The complex spatial patterns in both environmental conditions and microbial community structure related to freshwater runoff and wind convection during the perturbation event lead us to conclude that spatial heterogeneity was an important factor influencing both the dynamics and the resistance of the bacterioplankton communities to disturbances throughout this complex subtropical coastal system. This heterogeneity may play a role in facilitating a rapid rebound of regions harboring distinctly coastal bacterioplankton communities to their pre-disturbed taxonomic composition. PMID:23409156

  5. Unusual bacterioplankton community structure in ultra-oligotrophic Crater Lake

    USGS Publications Warehouse

    Urbach, Ena; Vergin, Kevin L.; Morse, Ariel

    2001-01-01

    The bacterioplankton assemblage in Crater Lake, Oregon (U.S.A.), is different from communities found in other oxygenated lakes, as demonstrated by four small subunit ribosomal ribonucleic acid (SSU rRNA) gene clone libraries and oligonucleotide probe hybridization to RNA from lake water. Populations in the euphotic zone of this deep (589 m), oligotrophic caldera lake are dominated by two phylogenetic clusters of currently uncultivated bacteria: CL120-10, a newly identified cluster in the verrucomicrobiales, and ACK4 actinomycetes, known as a minor constituent of bacterioplankton in other lakes. Deep-water populations at 300 and 500 m are dominated by a different pair of uncultivated taxa: CL500-11, a novel cluster in the green nonsulfur bacteria, and group I marine crenarchaeota. β-Proteobacteria, dominant in most other freshwater environments, are relatively rare in Crater Lake (≤16% of nonchloroplast bacterial rRNA at all depths). Other taxa identified in Crater Lake libraries include a newly identified candidate bacterial division, ABY1, and a newly identified subcluster, CL0-1, within candidate division OP10. Probe analyses confirmed vertical stratification of several microbial groups, similar to patterns observed in open-ocean systems. Additional similarities between Crater Lake and ocean microbial populations include aphotic zone dominance of group I marine crenarchaeota and green nonsulfur bacteria. Comparison of Crater Lake to other lakes studied by rRNA methods suggests that selective factors structuring Crater Lake bacterioplankton populations may include low concentrations of available trace metals and dissolved organic matter, chemistry of infiltrating hydrothermal waters, and irradiation by high levels of ultraviolet light.

  6. Bacterioplankton carbon cycling along the Subtropical Frontal Zone off New Zealand

    NASA Astrophysics Data System (ADS)

    Baltar, Federico; Stuck, Esther; Morales, Sergio; Currie, Kim

    2015-06-01

    Marine heterotrophic bacterioplankton (Bacteria and Archaea) play a central role in ocean carbon cycling. As such, identifying the factors controlling these microbial populations is crucial to fully understanding carbon fluxes. We studied bacterioplankton activities along a transect crossing three water masses (i.e., Subtropical waters [STW], Sub-Antarctic waters [SAW] and neritic waters [NW]) with contrasting nutrient regimes across the Subtropical Frontal Zone. In contrast to bacterioplankton production and community respiration, bacterioplankton respiration increased in the offshore SAW, causing a seaward increase in the contribution of bacteria to community respiration (from 7% to 100%). Cell-specific bacterioplankton respiration also increased in SAW, but cell-specific production did not, suggesting that prokaryotic cells in SAW were investing more energy towards respiration than growth. This was reflected in a 5-fold decline in bacterioplankton growth efficiency (BGE) towards SAW. One explanation for this decrease in BGE could be the observed reduction in phytoplankton biomass (and presumably organic matter concentration) towards SAW. However, this would not explain why bacterioplankton respiration was highest in SAW, where phytoplankton biomass was lowest. Another factor affecting BGE could be the iron limitation characteristic of high-nutrient low-chlorophyll (HNLC) regions like SAW. Our field-based evidence would agree with previous laboratory experiments in which iron stress provoked a decrease in BGE of marine bacterial isolates. Our results suggest that there is a strong gradient in bacterioplankton carbon cycling rates along the Subtropical Frontal Zone, mainly due to the HNLC conditions of SAW. We suggest that Fe-induced reduction of BGE in HNLC regions like SAW could be relevant in marine carbon cycling, inducing bacterioplankton to act as a link or a sink of organic carbon by impacting on the quantity of organic carbon they incorporate
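
    The record does not define bacterioplankton growth efficiency (BGE). It is conventionally computed as production divided by total carbon demand, BGE = BP / (BP + BR), with production (BP) and respiration (BR) in the same carbon units. The sketch below applies that conventional definition, which is an assumption here rather than something stated in the abstract. The function name and all numbers are hypothetical and are not data from the study.

```python
def growth_efficiency(production, respiration):
    """BGE = BP / (BP + BR); both arguments in the same carbon units."""
    if production < 0 or respiration < 0:
        raise ValueError("rates must be non-negative")
    total_demand = production + respiration
    if total_demand == 0:
        raise ValueError("total carbon demand is zero")
    return production / total_demand


# Hypothetical rates chosen only to illustrate a 5-fold drop in BGE
# when respiration rises while production stays constant.
inshore_bge = growth_efficiency(production=1.0, respiration=3.0)
offshore_bge = growth_efficiency(production=1.0, respiration=19.0)
fold_decline = inshore_bge / offshore_bge
```

    With production held constant, the decline is driven entirely by the respiration term, which is the pattern the abstract describes for SAW.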

  7. Phylogeny and biogeography of highly diverged freshwater fish species (Leuciscinae, Cyprinidae, Teleostei) inferred from mitochondrial genome analysis.

    PubMed

    Imoto, Junichi M; Saitoh, Kenji; Sasaki, Takeshi; Yonezawa, Takahiro; Adachi, Jun; Kartavtsev, Yuri P; Miya, Masaki; Nishida, Mutsumi; Hanzawa, Naoto

    2013-02-10

    The distribution of freshwater taxa is a good biogeographic model to study pattern and process of vicariance and dispersal. The subfamily Leuciscinae (Cyprinidae, Teleostei) consists of many species distributed widely in Eurasia and North America. Leuciscinae have been divided into two phyletic groups, leuciscin and phoxinin. The phylogenetic relationships between major clades within the subfamily are poorly understood, largely because of the overwhelming diversity of the group. The origin of the Far Eastern phoxinin is an interesting question regarding the evolutionary history of Leuciscinae. Here we present phylogenetic analysis of 31 species of Leuciscinae and outgroups based on complete mitochondrial genome sequences to clarify the phylogenetic relationships and to infer the evolutionary history of the subfamily. Phylogenetic analysis suggests that the Far Eastern phoxinin species comprised the monophyletic clades Tribolodon, Pseudaspius, Oreoleuciscus and Far Eastern Phoxinus. The Far Eastern phoxinin clade was independent of other Leuciscinae lineages and was closer to North American phoxinins than European leuciscins. All of our analyses also suggested that leuciscins and phoxinins each constituted monophyletic groups. Divergence time estimation suggested that Leuciscinae species diverged from outgroups such as Tincinae 83.3 million years ago (Mya) in the Late Cretaceous and that leuciscin and phoxinin shared a common ancestor 70.7 Mya. Radiation of Leuciscinae lineages occurred during the Late Cretaceous to Paleocene. This period also witnessed the radiation of tetrapods. Reconstruction of ancestral areas indicates Leuciscinae species originated within Europe. Leuciscin species evolved in Europe and the ancestor of phoxinin was distributed in North America. The Far Eastern phoxinins would have dispersed from North America to Far East across the Beringia land bridge. The present study suggests important roles for the continental rearrangements during the

  8. Changes of bacterioplankton apparent species richness in two ornamental fish aquaria.

    PubMed

    Vlahos, Nikolaos; Kormas, Konstantinos Ar; Pachiadaki, Maria G; Meziti, Alexandra; Hotos, George N; Mente, Eleni

    2013-12-01

    We analysed the 16S rRNA gene diversity within the bacterioplankton community in the water column of aquaria housing the ornamental fish Pterophyllum scalare and Archocentrus nigrofasciatus during a 60-day growth experiment in order to detect any dominant bacterial species and their possible association with the rearing organisms. The basic physical and chemical parameters remained stable but the bacterial community at 0, 30 and 60 days showed marked differences in bacterial cell abundance and diversity. We found high species richness but no dominant phylotypes were detected. Only a few of the phylotypes were found in more than one time point per treatment and always with low relative abundance. The majority of the common phylotypes belonged to the Proteobacteria phylum and were closely related to Acinetobacter junii, Pseudomonas sp., Nevskia ramosa, Vogesella perlucida, Chitinomonas taiwanensis, Acidovorax sp., Pelomonas saccharophila and the rest belonged to the α-Proteobacteria, Bacteroidetes, Actinobacteria, candidate division OP11 and one unaffiliated group. Several of these phylotypes were closely related to known taxa including Sphingopyxis chilensis, Flexibacter aurantiacus subsp. excathedrus and Mycobacterium sp. Despite the high phylogenetic diversity, most of the inferred ecophysiological roles of the detected phylotypes are related to nitrogen metabolism, a key process for fish aquaria.

  9. Structuring of Bacterioplankton Diversity in a Large Tropical Bay

    PubMed Central

    Gregoracci, Gustavo B.; Nascimento, Juliana R.; Cabral, Anderson S.; Paranhos, Rodolfo; Valentin, Jean L.; Thompson, Cristiane C.; Thompson, Fabiano L.

    2012-01-01

    Structuring of bacterioplanktonic populations and factors that determine the structuring of specific niche partitions have been demonstrated only for a limited number of colder water environments. In order to better understand the physical, chemical and biological parameters that may influence bacterioplankton diversity and abundance, we examined their productivity, abundance and diversity in the second largest Brazilian tropical bay (Guanabara Bay, GB), as well as seawater physical, chemical and biological parameters of GB. The inner bay location with higher nutrient input favored higher microbial (including vibrio) growth. Metagenomic analysis revealed a predominance of Gammaproteobacteria in this location, while GB locations with lower nutrient concentration favored Alphaproteobacteria and Flavobacteria. According to the subsystems (SEED) functional analysis, GB has a distinctive metabolic signature, comprising a higher number of sequences in the metabolism of phosphorus and aromatic compounds and a lower number of sequences in the photosynthesis subsystem. The apparent phosphorus limitation appears to influence the GB metagenomic signature of the three locations. Phosphorus is also one of the main factors determining changes in the abundance of planktonic vibrios, suggesting that nutrient limitation can be observed at community (metagenomic) and population levels (total prokaryote and vibrio counts). PMID:22363639

  10. BACTERIOPLANKTON DYNAMICS IN NORTHERN SAN FRANCISCO BAY: ROLE OF PARTICLE ASSOCIATION AND SEASONAL FRESHWATER FLOW

    EPA Science Inventory

    Bacterioplankton abundance and metabolic characteristics were observed in northern San Francisco Bay, California, during spring and summer 1996 at three sites: Central Bay, Suisun Bay, and the Sacramento River. These sites spanned a salinity gradient from marine to freshwater, an...

  11. Verrucomicrobia Are Candidates for Polysaccharide-Degrading Bacterioplankton in an Arctic Fjord of Svalbard

    PubMed Central

    Cardman, Z.; Arnosti, C.; Durbin, A.; Ziervogel, K.; Cox, C.; Steen, A. D.; Teske, A.

    2014-01-01

    In Arctic marine bacterial communities, members of the phylum Verrucomicrobia are consistently detected, although not typically abundant, in 16S rRNA gene clone libraries and pyrotag surveys of the marine water column and in sediments. In an Arctic fjord (Smeerenburgfjord) of Svalbard, members of the Verrucomicrobia, together with Flavobacteria and smaller proportions of Alpha- and Gammaproteobacteria, constituted the most frequently detected bacterioplankton community members in 16S rRNA gene-based clone library analyses of the water column. Parallel measurements in the water column of the activities of six endo-acting polysaccharide hydrolases showed that chondroitin sulfate, laminarin, and xylan hydrolysis accounted for most of the activity. Several Verrucomicrobia water column phylotypes were affiliated with previously sequenced, glycoside hydrolase-rich genomes of individual Verrucomicrobia cells that bound fluorescently labeled laminarin and xylan and therefore constituted candidates for laminarin and xylan hydrolysis. In sediments, the bacterial community was dominated by different lineages of Verrucomicrobia, Bacteroidetes, and Proteobacteria but also included members of multiple phylum-level lineages not observed in the water column. This community hydrolyzed laminarin, xylan, chondroitin sulfate, and three additional polysaccharide substrates at high rates. Comparisons with data from the same fjord in the previous summer showed that the bacterial community in Smeerenburgfjord changed in composition, most conspicuously in the changing detection frequency of Verrucomicrobia in the water column. Nonetheless, in both years the community hydrolyzed the same polysaccharide substrates. PMID:24727271

  12. Verrucomicrobia are candidates for polysaccharide-degrading bacterioplankton in an arctic fjord of Svalbard.

    PubMed

    Cardman, Z; Arnosti, C; Durbin, A; Ziervogel, K; Cox, C; Steen, A D; Teske, A

    2014-06-01

    In Arctic marine bacterial communities, members of the phylum Verrucomicrobia are consistently detected, although not typically abundant, in 16S rRNA gene clone libraries and pyrotag surveys of the marine water column and in sediments. In an Arctic fjord (Smeerenburgfjord) of Svalbard, members of the Verrucomicrobia, together with Flavobacteria and smaller proportions of Alpha- and Gammaproteobacteria, constituted the most frequently detected bacterioplankton community members in 16S rRNA gene-based clone library analyses of the water column. Parallel measurements in the water column of the activities of six endo-acting polysaccharide hydrolases showed that chondroitin sulfate, laminarin, and xylan hydrolysis accounted for most of the activity. Several Verrucomicrobia water column phylotypes were affiliated with previously sequenced, glycoside hydrolase-rich genomes of individual Verrucomicrobia cells that bound fluorescently labeled laminarin and xylan and therefore constituted candidates for laminarin and xylan hydrolysis. In sediments, the bacterial community was dominated by different lineages of Verrucomicrobia, Bacteroidetes, and Proteobacteria but also included members of multiple phylum-level lineages not observed in the water column. This community hydrolyzed laminarin, xylan, chondroitin sulfate, and three additional polysaccharide substrates at high rates. Comparisons with data from the same fjord in the previous summer showed that the bacterial community in Smeerenburgfjord changed in composition, most conspicuously in the changing detection frequency of Verrucomicrobia in the water column. Nonetheless, in both years the community hydrolyzed the same polysaccharide substrates.

  13. Biogeography of bacterioplankton in lakes and streams of an Arctic tundra catchment.

    PubMed

    Crump, Byron C; Adams, Heather E; Hobbie, John E; Kling, George W

    2007-06-01

    Bacterioplankton community composition was compared across 10 lakes and 14 streams within the catchment of Toolik Lake, a tundra lake in Arctic Alaska, during seven surveys conducted over three years using denaturing gradient gel electrophoresis (DGGE) of PCR-amplified rDNA. Bacterioplankton communities in streams draining tundra were very different than those in streams draining lakes. Communities in streams draining lakes were similar to communities in lakes. In a connected series of lakes and streams, the stream communities changed with distance from the upstream lake and with changes in water chemistry, suggesting inoculation and dilution with bacteria from soil waters or hyporheic zones. In the same system, lakes shared similar bacterioplankton communities (78% similar) that shifted gradually down the catchment. In contrast, unconnected lakes contained somewhat different communities (67% similar). We found evidence that dispersal influences bacterioplankton communities via advection and dilution (mass effects) in streams, and via inoculation and subsequent growth in lakes. The spatial pattern of bacterioplankton community composition was strongly influenced by interactions among soil water, stream, and lake environments. Our results reveal large differences in lake-specific and stream-specific bacterial community composition over restricted spatial scales (<10 km) and suggest that geographic distance and connectivity influence the distribution of bacterioplankton communities across a landscape.

  14. Elevated pCO2 enhances bacterioplankton removal of organic carbon

    PubMed Central

    James, Anna K.; Passow, Uta; Brzezinski, Mark A.; Parsons, Rachel J.; Trapani, Jennifer N.; Carlson, Craig A.

    2017-01-01

    Factors that affect the removal of organic carbon by heterotrophic bacterioplankton can impact the rate and magnitude of organic carbon loss in the ocean through the conversion of a portion of consumed organic carbon to CO2. Through enhanced rates of consumption, surface bacterioplankton communities can also reduce the amount of dissolved organic carbon (DOC) available for export from the surface ocean. The present study investigated the direct effects of elevated pCO2 on bacterioplankton removal of several forms of DOC ranging from glucose to complex phytoplankton exudate and lysate, and naturally occurring DOC. Elevated pCO2 (1000–1500 ppm) enhanced both the rate and magnitude of organic carbon removal by bacterioplankton communities compared to low (pre-industrial and ambient) pCO2 (250–~400 ppm). The increased removal was largely due to enhanced respiration, rather than enhanced production of bacterioplankton biomass. The results suggest that elevated pCO2 can increase DOC consumption and decrease bacterioplankton growth efficiency, ultimately decreasing the amount of DOC available for vertical export and increasing the production of CO2 in the surface ocean. PMID:28257422
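
    The link drawn above between enhanced respiration and lower growth efficiency can be illustrated with the standard definition of bacterial growth efficiency, production / (production + respiration). The sketch below is only an illustration: the function name and the carbon values are hypothetical and are not taken from the study.

```python
def growth_efficiency(production, respiration):
    """Bacterial growth efficiency: production / (production + respiration).

    Both arguments must be in the same carbon units (e.g. umol C per litre).
    """
    removed = production + respiration  # total organic carbon removed
    if removed <= 0:
        raise ValueError("total carbon removal must be positive")
    return production / removed


# Hypothetical values, not data from the study: biomass production is held
# constant while respiration rises, as the abstract describes qualitatively.
low_pco2 = growth_efficiency(production=2.0, respiration=6.0)
high_pco2 = growth_efficiency(production=2.0, respiration=8.0)
print(low_pco2, high_pco2)
```

    With production unchanged, any increase in respiration raises total carbon removal and lowers growth efficiency at the same time, which is the pattern the abstract reports.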

  15. Functional characterization of somatic mutations in cancer using network-based inference of protein activity | Office of Cancer Genomics

    Cancer.gov

    Identifying the multiple dysregulated oncoproteins that contribute to tumorigenesis in a given patient is crucial for developing personalized treatment plans. However, accurate inference of aberrant protein activity in biological samples is still challenging as genetic alterations are only partially predictive and direct measurements of protein activity are generally not feasible.

  16. Arthropod Phylogenetics in Light of Three Novel Millipede (Myriapoda: Diplopoda) Mitochondrial Genomes with Comments on the Appropriateness of Mitochondrial Genome Sequence Data for Inferring Deep Level Relationships

    PubMed Central

    Brewer, Michael S.; Swafford, Lynn; Spruill, Chad L.; Bond, Jason E.

    2013-01-01

    Background Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. Results The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. Conclusions The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the

  17. Marine bacterioplankton biomass, activity and community structure in the vicinity of Antarctic icebergs

    NASA Astrophysics Data System (ADS)

    Murray, Alison E.; Peng, Vivian; Tyler, Charlotte; Wagh, Protima

    2011-06-01

    We studied marine bacterioplankton in the Scotia Sea in June 2008 and in the northwest Weddell Sea in March to mid April 2009 in waters proximal to three free-drifting icebergs (SS-1, A-43k, and C-18a), in a region with a high density of smaller icebergs (iceberg alley), and at stations that were upstream of the iceberg trajectories designated as far-field reference sites that were between 16-75 km away. Hydrographic parameters were used to define water masses in which comparisons between bacterioplankton-associated characteristics (abundance, leucine incorporation into protein, aminopeptidase activities and community structure) within and between water masses could be made. Early winter Scotia Sea bacterioplankton had low levels of cells and low heterotrophic production rates in the upper 50 m. Influences of the icebergs on bacterioplankton at this time of year were minimal, if not deleterious, as we found lower levels of heterotrophic production near A-43k in comparison to stations >16 km away. Additionally, the results point to small but significant differences in cell abundance, heterotrophic production, and community structure between the two icebergs studied. These icebergs differed greatly in size and the findings suggest that the larger iceberg had a greater effect. In the NW Weddell Sea in March-mid April bacterioplankton were twice as abundant and had heterotrophic production rates that were 8-fold higher than what we determined in the Scotia Sea, though levels were still quite low, which is typical for autumn. We did not detect direct iceberg-related influences on the bacterioplankton characteristics studied here. Clues to understanding bacterioplankton responses may lie in the details of community structure, as there were some significant differences in community structure in the winter water and underlying upper circumpolar deep-water masses between stations occupied close to C-18a and at stations 18 km away (i.e. Polaribacter and Pelagibacter

  18. Coupling Bacterioplankton Populations and Environment to Community Function in Coastal Temperate Waters

    PubMed Central

    Traving, Sachia J.; Bentzon-Tilia, Mikkel; Knudsen-Leerbeck, Helle; Mantikci, Mustafa; Hansen, Jørgen L. S.; Stedmon, Colin A.; Sørensen, Helle; Markager, Stiig; Riemann, Lasse

    2016-01-01

    Bacterioplankton play a key role in marine waters facilitating processes important for carbon cycling. However, the influence of specific bacterial populations and environmental conditions on bacterioplankton community performance remains unclear. The aim of the present study was to identify drivers of bacterioplankton community functions, taking into account the variability in community composition and environmental conditions over seasons, in two contrasting coastal systems. A Least Absolute Shrinkage and Selection Operator (LASSO) analysis of the biological and chemical data obtained from surface waters over a full year indicated that specific bacterial populations were linked to measured functions. Namely, Synechococcus (Cyanobacteria) was strongly correlated with protease activity. Both function and community composition showed seasonal variation. However, the pattern of substrate utilization capacity could not be directly linked to the community dynamics. The overall importance of dissolved organic matter (DOM) parameters in the LASSO models indicates that bacterioplankton respond to the present substrate landscape, with a particular importance of nitrogenous DOM. The identification of common drivers of bacterioplankton community functions in two different systems indicates that the drivers may be of broader relevance in coastal temperate waters. PMID:27729909

  19. Inferring Cell Differentiation Processes Based on Phylogenetic Analysis of Genome-Wide Epigenetic Information: Hematopoiesis as a Model Case

    PubMed Central

    Koyanagi, Kanako O.

    2015-01-01

    How cells divide and differentiate is a fundamental question in organismal development; however, the discovery of differentiation processes in various cell types is laborious and sometimes impossible. Phylogenetic analysis is typically used to reconstruct evolutionary processes based on inherent characters. It could also be used to reconstruct developmental processes based on the developmental changes that occur during cell proliferation and differentiation. In this study, DNA methylation information from differentiated hematopoietic cells was used to perform phylogenetic analyses. The results were assessed for their validity in inferring hierarchical differentiation processes of hematopoietic cells and DNA methylation processes of differentiating progenitor cells. Overall, phylogenetic analyses based on DNA methylation information facilitated inferences regarding hematopoiesis. PMID:25638259

  20. The new physician as unwitting quantum mechanic: is adapting Dirac's inference system best practice for personalized medicine, genomics, and proteomics?

    PubMed

    Robson, Barry

    2007-08-01

    What is the Best Practice for automated inference in Medical Decision Support for personalized medicine? A known system already exists as Dirac's inference system from quantum mechanics (QM) using bra-kets ⟨A|B⟩ and bras ⟨A| where A and B are states, events, or measurements representing, say, clinical and biomedical rules. Dirac's system should theoretically be the universal best practice for all inference, though QM is notorious as sometimes leading to bizarre conclusions that appear not to be applicable to the macroscopic world of everyday world human experience and medical practice. It is here argued that this apparent difficulty vanishes if QM is assigned one new multiplication function @, which conserves conditionality appropriately, making QM applicable to classical inference including a quantitative form of the predicate calculus. An alternative interpretation with the same consequences is if every i = √−1 in Dirac's QM is replaced by h, an entity distinct from 1 and i and arguably a hidden root of 1 such that h² = 1. With that exception, this paper is thus primarily a review of the application of Dirac's system, by application of linear algebra in the complex domain to help manipulate information about associations and ontology in complicated data. Any combined bra-ket can be shown to be composed only of the sum of QM-like bra and ket weights c(), times an exponential function of Fano's mutual information measure I(A; B) about the association between A and B, that is, an association rule from data mining. With the weights and Fano measure re-expressed as expectations on finite data using Riemann's Incomplete (i.e., Generalized) Zeta Functions, actual counts of observations for real world sparse data can be readily utilized. Finally, the paper compares identical character, distinguishability of states events or measurements, correlation, mutual information, and orthogonal character, important issues in data mining

  1. Alkane hydroxylase gene (alkB) phylotype composition and diversity in northern Gulf of Mexico bacterioplankton

    PubMed Central

    Smith, Conor B.; Tolar, Bradley B.; Hollibaugh, James T.; King, Gary M.

    2013-01-01

    Natural and anthropogenic activities introduce alkanes into marine systems where they are degraded by alkane hydroxylases expressed by phylogenetically diverse bacteria. Partial sequences for alkB, one of the structural genes of alkane hydroxylase, have been used to assess the composition of alkane-degrading communities, and to determine their responses to hydrocarbon inputs. We present here the first spatially extensive analysis of alkB in bacterioplankton of the northern Gulf of Mexico (nGoM), a region that experiences numerous hydrocarbon inputs. We have analyzed 401 partial alkB gene sequences amplified from genomic extracts collected during March 2010 from 17 water column samples that included surface waters and bathypelagic depths. Previous analyses of 16S rRNA gene sequences for these and related samples have shown that nGoM bacterial community composition and structure stratify strongly with depth, with distinctly different communities above and below 100 m. Although we hypothesized that alkB gene sequences would exhibit a similar pattern, PCA analyses of operational protein units (OPU) indicated that community composition did not vary consistently with depth or other major physical-chemical variables. We observed 22 distinct OPUs, one of which was ubiquitous and accounted for 57% of all sequences. This OPU clustered with AlkB sequences from known hydrocarbon oxidizers (e.g., Alcanivorax and Marinobacter). Some OPUs could not be associated with known alkane degraders, however, and perhaps represent novel hydrocarbon-oxidizing populations or genes. These results indicate that the capacity for alkane hydroxylation occurs widely in the nGoM, but that alkane degrader diversity varies substantially among sites and responds differently than bulk communities to physical-chemical variables. PMID:24376439

  2. Automatic Determination of Bacterioplankton Biomass by Image Analysis

    PubMed Central

    Bjørnsen, Peter Koefoed

    1986-01-01

    Image analysis was applied to epifluorescence microscopy of acridine orange-stained plankton samples. A program was developed for discrimination and binary segmentation of digitized video images, taken by an ultrasensitive video camera mounted on the microscope. Cell volumes were estimated from area and perimeter of the objects in the binary image. The program was tested on fluorescent latex beads of known diameters. Biovolumes measured by image analysis were compared with directly determined carbon biomasses in batch cultures of estuarine and freshwater bacterioplankton. This calibration revealed an empirical conversion factor from biovolume to biomass of 0.35 pg of C μm⁻³ (± 0.03, 95% confidence limit). The deviation of this value from the normally used conversion factors of 0.086 to 0.121 pg of C μm⁻³ is discussed. The described system was capable of measuring 250 cells within 10 min, providing estimates of cell number, mean cell volume, and biovolume with a precision of 5%. PMID:16347077
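
    The biovolume-to-biomass conversion described above is a single multiplication, sketched here with the factors quoted in the abstract. The function name and the cell volumes are hypothetical, and the step that derives cell volume from area and perimeter is not reproduced because the abstract does not give that formula.

```python
EMPIRICAL_FACTOR = 0.35             # pg C per um^3, value reported in the abstract
LITERATURE_FACTORS = (0.086, 0.121)  # pg C per um^3, range quoted in the abstract


def carbon_biomass(cell_volumes_um3, factor=EMPIRICAL_FACTOR):
    """Return total carbon biomass (pg C) for a list of cell volumes (um^3)."""
    biovolume = sum(cell_volumes_um3)  # total biovolume of the measured cells
    return biovolume * factor


# Hypothetical cell volumes (um^3), for illustration only.
volumes = [0.05, 0.10, 0.15]
print(carbon_biomass(volumes))
for literature_factor in LITERATURE_FACTORS:
    print(carbon_biomass(volumes, factor=literature_factor))
```

    For any given set of cells, the empirical factor yields roughly three to four times the biomass obtained with the quoted literature factors, which is the deviation the abstract says is discussed.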

  3. Energetic differences between bacterioplankton trophic groups and coral reef resistance.

    PubMed

    McDole Somera, Tracey; Bailey, Barbara; Barott, Katie; Grasis, Juris; Hatay, Mark; Hilton, Brett J; Hisakawa, Nao; Nosrat, Bahador; Nulton, James; Silveira, Cynthia B; Sullivan, Chris; Brainard, Russell E; Rohwer, Forest

    2016-04-27

    Coral reefs are among the most productive and diverse marine ecosystems on the Earth. They are also particularly sensitive to changing energetic requirements by different trophic levels. Microbialization specifically refers to the increase in the energetic metabolic demands of microbes relative to macrobes and is significantly correlated with increasing human influence on coral reefs. In this study, metabolic theory of ecology is used to quantify the relative contributions of two broad bacterioplankton groups, autotrophs and heterotrophs, to energy flux on 27 Pacific coral reef ecosystems experiencing human impact to varying degrees. The effective activation energy required for photosynthesis is lower than the average energy of activation for the biochemical reactions of the Krebs cycle, and changes in the proportional abundance of these two groups can greatly affect rates of energy and materials cycling. We show that reef-water communities with a higher proportional abundance of microbial autotrophs expend more metabolic energy per gram of microbial biomass. Increased energy and materials flux through fast energy channels (i.e. water-column associated microbial autotrophs) may dampen the detrimental effects of increased heterotrophic loads (e.g. coral disease) on coral reef systems experiencing anthropogenic disturbance.

  4. Phylogeny and genetic history of the Siberian salamander (Salamandrella keyserlingii, Dybowski, 1870) inferred from complete mitochondrial genomes.

    PubMed

    Malyarchuk, Boris; Derenko, Miroslava; Denisova, Galina

    2013-05-01

    We assessed phylogeny of the Siberian salamander (Salamandrella keyserlingii, Dybowski, 1870), the most northern ectothermic, terrestrial vertebrate in Eurasia, by sequence analysis of complete mitochondrial genomes in 26 specimens from different localities (China, Khabarovsk region, Sakhalin, Yakutia, Magadan region, Chukotka, Kamchatka, Ural, European part of Russia). In addition, a complete mitochondrial genome of the Schrenck salamander, Salamandrella schrenckii, was determined for the first time. Bayesian phylogenetic analysis of the entire mtDNA genomes of S. keyserlingii demonstrates that two haplotype clades, AB and C, radiated about 1.4 million years ago (Mya). Bayesian skyline plots of population size change through time show an expansion around 250 thousand years ago (kya) and then a decline around the Last Glacial Maximum (25 kya) with subsequent restoration of population size. Climatic changes during the Quaternary period have dramatically affected the population genetic structure of the Siberian salamanders. In addition, complete mtDNA sequence analysis allowed us to recognize that the vast area of Northern Eurasia was colonized only by the Siberian salamander clade C1b during the last 150 kya. Meanwhile, we were unable to find evidence of molecular adaptation in this clade by analyzing the whole mitochondrial genomes of the Siberian salamanders.

  5. Introgression and phenotypic assimilation in Zimmerius flycatchers (Tyrannidae): population genetic and phylogenetic inferences from genome-wide SNPs.

    PubMed

    Rheindt, Frank E; Fujita, Matthew K; Wilton, Peter R; Edwards, Scott V

    2014-03-01

    Genetic introgression is pervasive in nature and may lead to large-scale phenotypic assimilation and/or admixture of populations, but there is limited knowledge on whether large phenotypic changes are typically accompanied by high levels of introgression throughout the genome. Using bioacoustic, biometric, and spectrophotometric data from a flycatcher (Tyrannidae) system in the Neotropical genus Zimmerius, we document a mosaic pattern of phenotypic admixture in which a population of Zimmerius viridiflavus in northern Peru (henceforth "mosaic") is vocally and biometrically similar to conspecifics to the south but shares plumage characteristics with a different species (Zimmerius chrysops) to the north. To clarify the origins of the mosaic population, we used the RAD-seq approach to generate a data set of 37,361 genome-wide single nucleotide polymorphisms (SNPs). A range of population-genetic diagnostics shows that the genome of the mosaic population is largely indistinguishable from southern Z. viridiflavus and distinct from northern Z. chrysops, and the application of parsimony and species tree methods to the genome-wide SNP data set confirms the close affinity of the mosaic population with southern Z. viridiflavus. Even so, using a subset of 2710 SNPs found across all sampled lineages in configurations appropriate for a recently proposed statistical ("ABBA/BABA") test that distinguishes gene flow from incomplete lineage sorting, we detected low levels of gene flow from northern Z. chrysops into the mosaic population. Mapping the candidate loci for introgression from Z. chrysops into the mosaic population to the zebra finch genome reveals close linkage with genes significantly enriched in functions involving cell projection and plasma membranes. Introgression of key alleles may have led to phenotypic assimilation in the plumage of mosaic birds, suggesting that selection may have been a key factor facilitating introgression.
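
    The "ABBA/BABA" test mentioned above is commonly summarized by Patterson's D statistic, D = (nABBA − nBABA) / (nABBA + nBABA), computed over biallelic sites for four taxa ordered P1, P2, P3, outgroup. The sketch below is a minimal illustration of that site-pattern count, not the authors' pipeline: the function name and toy sites are hypothetical, and the mapping to this study (P1 = southern Z. viridiflavus, P2 = mosaic population, P3 = Z. chrysops) is an assumption made for illustration.

```python
def patterson_d(sites):
    """Compute Patterson's D from site patterns.

    sites: iterable of 4-character strings giving one sampled allele per
    taxon in the order P1, P2, P3, outgroup. The outgroup allele is treated
    as ancestral (A) and the other allele as derived (B).
    """
    abba = baba = 0
    for p1, p2, p3, out in sites:
        if p1 == out and p2 == p3 and p2 != out:
            abba += 1  # derived allele shared by P2 and P3
        elif p2 == out and p1 == p3 and p1 != out:
            baba += 1  # derived allele shared by P1 and P3
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)


# Hypothetical toy data: three ABBA sites, one BABA site, two uninformative.
toy_sites = ["ACCA", "GTTG", "TCCT", "CACA", "AAAA", "ACGT"]
print(patterson_d(toy_sites))
```

    Under incomplete lineage sorting alone the two patterns are expected in equal numbers (D near 0); an excess of ABBA sites (D > 0) points to gene flow between P2 and P3. In practice the significance of D is usually assessed with a block jackknife over the genome, which this sketch omits.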

  6. The phylogenetic position of eriophyoid mites (superfamily Eriophyoidea) in Acariformes inferred from the sequences of mitochondrial genomes and nuclear small subunit (18S) rRNA gene.

    PubMed

    Xue, Xiao-Feng; Dong, Yan; Deng, Wei; Hong, Xiao-Yue; Shao, Renfu

    2017-04-01

    Eriophyoid mites (superfamily Eriophyoidea) comprise >4400 species worldwide. Despite over a century of study, the phylogenetic position of these mites within Acariformes is still poorly resolved. Currently, Eriophyoidea is placed in the order Trombidiformes. We inferred the high-level phylogeny of Acari with the mitochondrial (mt) genome sequences of 110 species including four eriophyoid species, and the nuclear small subunit (18S) rRNA gene sequences of 226 species including 25 eriophyoid species. Maximum likelihood (ML), Bayesian inference (BI) and Maximum parsimony (MP) methods were used to analyze the sequence data. Divergence times were estimated for major lineages of Acari using Bayesian approaches. Our analyses consistently recovered the monophyly of Eriophyoidea but rejected the monophyly of Trombidiformes. The eriophyoid mites were grouped with the sarcoptiform mites, or were the sister group of sarcoptiform mites + non-eriophyoid trombidiform mites, depending on data partition strategies. Eriophyoid mites diverged from other mites in the Devonian (384 Mya, 95% HPD 352-410 Mya). The origin of eriophyoid mites was dated to the Permian (262 Mya, 95% HPD 230-307 Mya), mostly prior to the radiation of gymnosperms (Triassic-Jurassic) and angiosperms (early Cretaceous). We propose that the placement of Eriophyoidea in the order Trombidiformes under the current classification system should be reviewed.

  7. Polytene Chromosomal Maps of 11 Drosophila Species: The Order of Genomic Scaffolds Inferred From Genetic and Physical Maps

    PubMed Central

    Schaeffer, Stephen W.; Bhutkar, Arjun; McAllister, Bryant F.; Matsuda, Muneo; Matzkin, Luciano M.; O'Grady, Patrick M.; Rohde, Claudia; Valente, Vera L. S.; Aguadé, Montserrat; Anderson, Wyatt W.; Edwards, Kevin; Garcia, Ana C. L.; Goodman, Josh; Hartigan, James; Kataoka, Eiko; Lapoint, Richard T.; Lozovsky, Elena R.; Machado, Carlos A.; Noor, Mohamed A. F.; Papaceit, Montserrat; Reed, Laura K.; Richards, Stephen; Rieger, Tania T.; Russo, Susan M.; Sato, Hajime; Segarra, Carmen; Smith, Douglas R.; Smith, Temple F.; Strelets, Victor; Tobari, Yoshiko N.; Tomimura, Yoshihiko; Wasserman, Marvin; Watts, Thomas; Wilson, Robert; Yoshida, Kiyohito; Markow, Therese A.; Gelbart, William M.; Kaufman, Thomas C.

    2008-01-01

    The sequencing of the 12 genomes of members of the genus Drosophila was taken as an opportunity to reevaluate the genetic and physical maps for 11 of the species, in part to aid in the mapping of assembled scaffolds. Here, we present an overview of the importance of cytogenetic maps to Drosophila biology and to the concepts of chromosomal evolution. Physical and genetic markers were used to anchor the genome assembly scaffolds to the polytene chromosomal maps for each species. In addition, a computational approach was used to anchor smaller scaffolds on the basis of the analysis of syntenic blocks. We present the chromosomal map data from each of the 11 sequenced non-Drosophila melanogaster species as a series of sections. Each section reviews the history of the polytene chromosome maps for each species, presents the new polytene chromosome maps, and anchors the genomic scaffolds to the cytological maps using genetic and physical markers. The mapping data agree with Muller's idea that the majority of Drosophila genes are syntenic. Despite the conservation of genes within homologous chromosome arms across species, the karyotypes of these species have changed through the fusion of chromosomal arms followed by subsequent rearrangement events. PMID:18622037

  8. pH Influences the Importance of Niche-Related and Neutral Processes in Lacustrine Bacterioplankton Assembly

    PubMed Central

    Ren, Lijuan; Jeppesen, Erik; He, Dan; Wang, Jianjun; Liboriussen, Lone; Xing, Peng

    2015-01-01

    pH is an important factor that shapes the structure of bacterial communities. However, we have very limited information about the patterns and processes by which overall bacterioplankton communities assemble across wide pH gradients in natural freshwater lakes. Here, we used pyrosequencing to analyze the bacterioplankton communities in 25 discrete freshwater lakes in Denmark with pH levels ranging from 3.8 to 8.8. We found that pH was the key factor impacting lacustrine bacterioplankton community assembly. More acidic lakes imposed stronger environmental filtering, which decreased the richness and evenness of bacterioplankton operational taxonomic units (OTUs) and largely shifted community composition. Although environmental filtering was determined to be the most important determinant of bacterioplankton community assembly, the importance of neutral assembly processes must also be considered, notably in acidic lakes, where the species (OTU) diversity was low. We observed that the strong effect of environmental filtering in more acidic lakes was weakened by the enhanced relative importance of neutral community assembly, and bacterioplankton communities tended to be less phylogenetically clustered in more acidic lakes. In summary, we propose that pH is a major environmental determinant in freshwater lakes, regulating the relative importance and interplay between niche-related and neutral processes and shaping the patterns of freshwater lake bacterioplankton biodiversity. PMID:25724952

  9. Structuring of bacterioplankton communities by specific dissolved organic carbon compounds.

    PubMed

    Gómez-Consarnau, Laura; Lindh, Markus V; Gasol, Josep M; Pinhassi, Jarone

    2012-09-01

    The main role of microorganisms in the cycling of the bulk dissolved organic carbon pool in the ocean is well established. Nevertheless, it remains unclear if particular bacteria preferentially utilize specific carbon compounds and whether such compounds have the potential to shape bacterial community composition. Enrichment experiments in the Mediterranean Sea, Baltic Sea and the North Sea (Skagerrak) showed that different low-molecular-weight organic compounds, with a proven importance for the growth of marine bacteria (e.g. amino acids, glucose, dimethylsulphoniopropionate, acetate or pyruvate), in most cases differentially stimulated bacterial growth. Denaturing gradient gel electrophoresis 'fingerprints' and 16S rRNA gene sequencing revealed that some bacterial phylotypes that became abundant were highly specific to enrichment with specific carbon compounds (e.g. Acinetobacter sp. B1-A3 with acetate or Psychromonas sp. B3-U1 with glucose). In contrast, other phylotypes increased in relative abundance in response to enrichment with several, or all, of the investigated carbon compounds (e.g. Neptuniibacter sp. M2-A4 with acetate, pyruvate and dimethylsulphoniopropionate, and Thalassobacter sp. M3-A3 with pyruvate and amino acids). Furthermore, different carbon compounds triggered the development of unique combinations of dominant phylotypes in several of the experiments. These results suggest that bacteria differ substantially in their abilities to utilize specific carbon compounds, with some bacteria being specialists and others having a more generalist strategy. Thus, changes in the supply or composition of the dissolved organic carbon pool can act as selective forces structuring bacterioplankton communities.

  10. Structure, expression profile and phylogenetic inference of chalcone isomerase-like genes from the narrow-leafed lupin (Lupinus angustifolius L.) genome

    PubMed Central

    Przysiecka, Łucja; Książkiewicz, Michał; Wolko, Bogdan; Naganowska, Barbara

    2015-01-01

    Lupins, like other legumes, have a unique biosynthesis scheme of 5-deoxy-type flavonoids and isoflavonoids. A key enzyme in this pathway is chalcone isomerase (CHI), a member of CHI-fold protein family, encompassing subfamilies of CHI1, CHI2, CHI-like (CHIL), and fatty acid-binding (FAP) proteins. Here, two Lupinus angustifolius (narrow-leafed lupin) CHILs, LangCHIL1 and LangCHIL2, were identified and characterized using DNA fingerprinting, cytogenetic and linkage mapping, sequencing and expression profiling. Clones carrying CHIL sequences were assembled into two contigs. Full gene sequences were obtained from these contigs, and mapped in two L. angustifolius linkage groups by gene-specific markers. Bacterial artificial chromosome fluorescence in situ hybridization approach confirmed the localization of two LangCHIL genes in distinct chromosomes. The expression profiles of both LangCHIL isoforms were very similar. The highest level of transcription was in the roots of the third week of plant growth; thereafter, expression declined. The expression of both LangCHIL genes in leaves and stems was similar and low. Comparative mapping to reference legume genome sequences revealed strong syntenic links; however, LangCHIL2 contig had a much more conserved structure than LangCHIL1. LangCHIL2 is assumed to be an ancestor gene, whereas LangCHIL1 probably appeared as a result of duplication. As both copies are transcriptionally active, questions arise concerning their hypothetical functional divergence. Screening of the narrow-leafed lupin genome and transcriptome with CHI-fold protein sequences, followed by Bayesian inference of phylogeny and cross-genera synteny survey, identified representatives of all but one (CHI1) main subfamilies. They are as follows: two copies of CHI2, FAPa2 and CHIL, and single copies of FAPb and FAPa1. Duplicated genes are remnants of whole genome duplication which is assumed to have occurred after the divergence of Lupinus, Arachis, and Glycine

  11. Complete Genome and Molecular Epidemiological Data Infer the Maintenance of Rabies among Kudu (Tragelaphus strepsiceros) in Namibia

    PubMed Central

    Scott, Terence P.; Fischer, Melina; Khaiseb, Siegfried; Freuling, Conrad; Höper, Dirk; Hoffmann, Bernd; Markotter, Wanda; Müller, Thomas; Nel, Louis H.

    2013-01-01

    Rabies in kudu is unique to Namibia and two major peaks in the epizootic have occurred since it was first noted in 1977. Due to the large numbers of kudu that were affected, it was suspected that horizontal transmission of rabies occurs among kudu and that rabies was being maintained independently within the Namibian kudu population – separate from canid cycles, despite geographic overlap. In this study, it was our aim to show, through phylogenetic analyses, that rabies was being maintained independently within the Namibian kudu population. We also tested, through complete genome sequencing of four rabies virus isolates from jackal and kudu, whether specific mutations occurred in the virus genome due to host adaptation. We found that all rabies isolates from kudu grouped separately from those of any canid species in Namibia, suggesting that rabies was being maintained independently in kudu. Additionally, we noted several mutations unique to isolates from kudu, suggesting that these mutations may be due to the adaptation of rabies to a new host. In conclusion, we show clear evidence that rabies is being maintained independently in the Namibian kudu population – a unique phenomenon with ecological and economic impacts. PMID:23527015

  12. In situ interactions between photosynthetic picoeukaryotes and bacterioplankton in the Atlantic Ocean: evidence for mixotrophy.

    PubMed

    Hartmann, Manuela; Zubkov, Mikhail V; Scanlan, Dave J; Lepère, Cécile

    2013-12-01

    Heterotrophic bacterioplankton, cyanobacteria and phototrophic picoeukaryotes (< 5 μm in size) numerically dominate planktonic oceanic communities. While feeding on bacterioplankton is often attributed to aplastidic protists, recent evidence suggests that phototrophic picoeukaryotes could be important bacterivores. Here, we present direct visual evidence from the surface mixed layer of the Atlantic Ocean that bacterioplankton are internalized by phototrophic picoeukaryotes. In situ interactions of phototrophic picoeukaryotes and bacterioplankton (specifically Prochlorococcus cyanobacteria and the SAR11 clade) were investigated using a combination of flow cytometric cell sorting and dual tyramide signal amplification fluorescence in situ hybridization. Using this method, we observed plastidic Prymnesiophyceae and Chrysophyceae cells containing Prochlorococcus, and to a lesser extent SAR11 cells. These microscopic observations of in situ microbial trophic interactions demonstrate the frequency and likely selectivity of phototrophic picoeukaryote bacterivory in the surface mixed layer of both the North and South Atlantic subtropical gyres and adjacent equatorial region, broadening our views on the ecological role of the smallest oceanic plastidic protists.

  13. INFLUENCE OF LIGHT ON BACTERIOPLANKTON PRODUCTION AND RESPIRATION IN A SUBTROPICAL CORAL REEF

    EPA Science Inventory

    The influence of sunlight on bacterioplankton production (14C-leucine (Leu) and 3H-thymidine (TdR) incorporation; changes in cell abundances) and O2 consumption was investigated in a shallow subtropical coral reef located near Key Largo, Florida. Quartz (light) and opaque (dark) ...

  14. Interactions between hydrology and water chemistry shape bacterioplankton biogeography across boreal freshwater networks.

    PubMed

    Niño-García, Juan Pablo; Ruiz-González, Clara; Del Giorgio, Paul A

    2016-07-01

    Disentangling the mechanisms shaping bacterioplankton communities across freshwater ecosystems requires considering a hydrologic dimension that can influence both dispersal and local sorting, but how the environment and hydrology interact to shape the biogeography of freshwater bacterioplankton over large spatial scales remains unexplored. Using Illumina sequencing of the 16S ribosomal RNA gene, we investigate the large-scale spatial patterns of bacterioplankton across 386 freshwater systems from seven distinct regions in boreal Québec. We show that both hydrology and local water chemistry (mostly pH) interact to shape a sequential structuring of communities from highly diverse assemblages in headwater streams toward larger rivers and lakes dominated by fewer taxa. Increases in water residence time along the hydrologic continuum were accompanied by major losses of bacterial richness and by an increased differentiation of communities driven by local conditions (pH and other related variables). This suggests that hydrology and network position modulate the relative role of environmental sorting and mass effects on community assembly by determining both the time frame for bacterial growth and the composition of the immigrant pool. The apparent low dispersal limitation (that is, the lack of influence of geographic distance on the spatial patterns observed at the taxonomic resolution used) suggests that these boreal bacterioplankton communities derive from a shared bacterial pool that enters the networks through the smallest streams, largely dominated by mass effects, and that is increasingly subjected to local sorting of species during transit along the hydrologic continuum.

  15. Uptake of picophytoplankton, bacterioplankton and virioplankton by a fringing coral reef community (Ningaloo Reef, Australia)

    NASA Astrophysics Data System (ADS)

    Patten, N. L.; Wyatt, A. S. J.; Lowe, R. J.; Waite, A. M.

    2011-09-01

    We examined the importance of picoplankton and virioplankton to reef trophodynamics at Ningaloo Reef (north-western Australia) in May and November 2008. Picophytoplankton (Prochlorococcus, Synechococcus and picoeukaryotes), bacterioplankton (inclusive of bacteria and Archaea), virioplankton and chlorophyll a (Chl a) were measured at five stations following the consistent wave-driven unidirectional mean flow path of seawater across the reef and into the lagoon. Prochlorococcus, Synechococcus, picoeukaryotes and bacterioplankton were depleted to similar levels (~40% on average) over the fore reef, reef crest and reef flat (= 'active reef'), with negligible uptake occurring over the sandy bottom lagoon. Depletion of virioplankton also occurred but to more variable levels. Highest uptake rates, m, of picoplankton occurred over the reef crest, while uptake coefficients, S (independent of cell concentration), were similarly scaled over the reef zones, indicating no preferential uptake of any one group. Collectively, picophytoplankton, bacterioplankton and virioplankton accounted for the uptake of 29 mmol C m⁻² day⁻¹, with Synechococcus contributing the highest proportion of the removed C. Picoplankton and virioplankton accounted for 1-5 mmol N m⁻² day⁻¹ of the removed N, with bacterioplankton estimated to be a highly rich source of N. Results indicate the importance of ocean-reef interactions and the dependence of certain reef organisms on picoplanktonic supply for reef-level biogeochemistry processes.

  16. Effects of nutrients on specific growth rate of bacterioplankton in oligotrophic lake water cultures

    SciTech Connect

    Coveney, M.F.; Wetzel, R.G.

    1992-01-01

    The effects of organic and inorganic nutrient additions on the specific growth rates of bacterioplankton in oligotrophic lake water cultures were investigated. Lake water was first passed through 0.8-μm-pore-size filters (prescreening) to remove bacterivores and to minimize confounding effects of algae. Specific growth rates were calculated from changes in both bacterial cell numbers and biovolumes over 36 h. Gross specific growth rates in unmanipulated control samples were estimated through separate measurements of grazing losses by use of penicillin. The addition of mixed organic substrates alone to prescreened water did not significantly increase bacterioplankton specific growth rates. The addition of inorganic phosphorus alone significantly increased one or both specific growth rates in three of four experiments, and one experiment showed a secondary stimulation by organic substrates. The stimulatory effects of phosphorus addition were greatest concurrently with the highest alkaline phosphatase activity in the lake water. Because bacteria have been shown to dominate inorganic phosphorus uptake in other P-deficient systems, the demonstration that phosphorus, rather than organic carbon, can limit bacterioplankton growth suggests direct competition between phytoplankton and bacterioplankton for inorganic phosphorus.
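
    The abstract does not state how specific growth rates were computed from cell numbers. A common convention, assumed here rather than taken from the paper, is exponential growth, mu = ln(N_t / N_0) / t. The sketch below applies that formula; the function names and the cell counts in the usage note are hypothetical.

```python
import math


def specific_growth_rate(n0, nt, hours):
    """Return the specific growth rate (per hour), assuming exponential growth.

    n0, nt: cell numbers (or biovolumes) at the start and end of the incubation.
    hours:  incubation length; the abstract reports 36 h.
    """
    if n0 <= 0 or nt <= 0 or hours <= 0:
        raise ValueError("n0, nt and hours must all be positive")
    return math.log(nt / n0) / hours


def doubling_time(mu):
    """Return the doubling time (hours) for a positive specific growth rate."""
    if mu <= 0:
        raise ValueError("doubling time is only defined for mu > 0")
    return math.log(2) / mu
```

    For example, a hypothetical culture that goes from 1.0e6 to 2.0e6 cells per ml over 36 h has a doubling time of 36 h, so specific_growth_rate(1.0e6, 2.0e6, 36) equals ln(2) / 36 per hour.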

  17. Phytoplankton community succession shaping bacterioplankton community composition in Lake Taihu, China.

    PubMed

    Niu, Yuan; Shen, Hong; Chen, Jun; Xie, Ping; Yang, Xi; Tao, Min; Ma, Zhimei; Qi, Min

    2011-08-01

    PCR-denaturing gradient gel electrophoresis (DGGE) and canonical correspondence analysis (CCA) were used to explore the relationship between succession of phytoplankton community and temporal variation of bacterioplankton community composition (BCC) in the eutrophic Lake Taihu. Serious Microcystis bloom was observed in July-December 2008 and Bacillariophyta and Cryptophyta dominated in January-June 2009. BCC was characterized by DGGE of 16S rRNA gene with subsequent sequencing. The DGGE banding patterns revealed a remarkable seasonality which was closely related to phytoplankton community succession. Variation trend of Shannon-Wiener diversity index in bacterioplankton community was similar to that of phytoplankton community. CCA revealed that temperature and phytoplankton played key roles in structuring BCC. Sequencing of DGGE bands suggested that the majority of the sequences were affiliated with common phylogenetic groups in freshwater: Alphaproteobacteria, Betaproteobacteria, Bacteroidetes and Actinobacteria. The cluster STA2-30 (affiliated with Actinobacteria) was found almost across the sampling time at the two study sites. We observed that the family Flavobacteriaceae (affiliated with Bacteroidetes) tightly coupled to diatom bloom and the cluster ML-5-51.2 (affiliated with Actinobacteria) dominated the bacterioplankton communities during Microcystis bloom. These results were quite similar at the two sampling sites, indicating that BCC changes were not random but with fixed pattern. Our study showed insights into relationships between phytoplankton and bacterioplankton communities at species level, facilitating a better understanding of microbial loop and ecosystem functioning in the lake.
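
    The Shannon-Wiener diversity index mentioned above is H' = -sum(p_i * ln p_i), where p_i is the relative abundance of the i-th taxon. The sketch below computes it from a list of abundances. Using DGGE band intensities as the abundance measure is an assumption made here for illustration; the abstract does not say how the index was derived, and the function name is hypothetical.

```python
import math


def shannon_wiener(abundances):
    """Return H' = -sum(p_i * ln(p_i)) for non-negative abundance values.

    abundances: one value per taxon (for DGGE, assumed here to be one
    intensity per band). Zero entries contribute nothing to the sum.
    """
    if any(a < 0 for a in abundances):
        raise ValueError("abundances must be non-negative")
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one abundance must be positive")
    h = 0.0
    for a in abundances:
        if a > 0:
            p = a / total
            h -= p * math.log(p)
    return h
```

    With this definition, a community of k equally abundant taxa gives H' = ln(k), and a community with a single taxon gives H' = 0.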

  18. Bacterioplankton: a sink for carbon in a coastal marine plankton community

    SciTech Connect

    Ducklow, H.W.; Purdie, D.A.; Williams, P.J.LeB.; Davis, J.M.

    1986-05-16

    Recent determinations of high production rates (up to 30% of primary production in surface waters) implicate free-living marine bacterioplankton as a link in a microbial loop that supplements phytoplankton as food for herbivores. An enclosed water column of 300 cubic meters was used to test the microbial loop hypothesis by following the fate of carbon-14-labeled bacterioplankton for over 50 days. Only 2% of the label initially fixed from carbon-14-labeled glucose by bacteria was present in larger organisms after 13 days, at which time about 20% of the total label added remained in the particulate fraction. Most of the label appeared to pass directly from particles smaller than 1 micrometer (heterotrophic bacterioplankton and some bacteriovores) to respired labeled carbon dioxide or to regenerated dissolved organic carbon-14. Secondary (and, by implication, primary) production by organisms smaller than 1 micrometer may not be an important food source in marine food chains. Bacterioplankton can be a sink for carbon in planktonic food webs and may serve principally as agents of nutrient regeneration rather than as food.

  19. Simultaneous Extraction from Bacterioplankton of Total RNA and DNA Suitable for Quantitative Structure and Function Analyses

    PubMed Central

    Weinbauer, Markus G.; Fritz, Ingo; Wenderoth, Dirk F.; Höfle, Manfred G.

    2002-01-01

    The aim of this study was to develop a protocol for the simultaneous extraction from bacterioplankton of RNA and DNA suitable for quantitative molecular analysis. By using a combined mechanical and chemical extraction method, the highest RNA and DNA yield was obtained with sodium lauryl sarcosinate-phenol or DivoLab-phenol as the extraction mix. The efficiency of extraction of nucleic acids was comparatively high and varied only moderately in gram-negative bacterial isolates and bacterioplankton (RNA, 52 to 66%; DNA, 43 to 61%); significant amounts of nucleic acids were also obtained for a gram-positive bacterial isolate (RNA, 20 to 30%; DNA, 20 to 25%). Reverse transcription-PCR and PCR amplification products of fragments of 16S rRNA and its genes were obtained from all isolates and communities, indicating that the extracted nucleic acids were intact and pure enough for community structure analyses. By using single-strand conformation polymorphism of fragments of 16S rRNA and its gene, community fingerprints were obtained from pond bacterioplankton. mRNA transcripts encoding fragments of the enzyme nitrite reductase gene (nir gene) could be detected in a pond water sample, indicating that the extraction method is also suitable for studying gene expression. The extraction method presented yields nucleic acids that can be used to perform structural and functional studies of bacterioplankton communities from a single sample. PMID:11872453

  20. Effects of Nutrients on Specific Growth Rate of Bacterioplankton in Oligotrophic Lake Water Cultures †

    PubMed Central

    Coveney, Michael F.; Wetzel, Robert G.

    1992-01-01

    The effects of organic and inorganic nutrient additions on the specific growth rates of bacterioplankton in oligotrophic lake water cultures were investigated. Lake water was first passed through 0.8-μm-pore-size filters (prescreening) to remove bacterivores and to minimize confounding effects of algae. Specific growth rates were calculated from changes in both bacterial cell numbers and biovolumes over 36 h. Gross specific growth rates in unmanipulated control samples were estimated through separate measurements of grazing losses by use of penicillin. The addition of mixed organic substrates alone to prescreened water did not significantly increase bacterioplankton specific growth rates. The addition of inorganic phosphorus alone significantly increased one or both specific growth rates in three of four experiments, and one experiment showed a secondary stimulation by organic substrates. The stimulatory effects of phosphorus addition were greatest concurrently with the highest alkaline phosphatase activity in the lake water. Because bacteria have been shown to dominate inorganic phosphorus uptake in other P-deficient systems, the demonstration that phosphorus, rather than organic carbon, can limit bacterioplankton growth suggests direct competition between phytoplankton and bacterioplankton for inorganic phosphorus. PMID:16348620

  1. BACTERIOPLANKTON DYNAMICS IN PENSACOLA BAY, FL, USA: ROLE OF PHYTOPLANKTON AND DETRITAL CARBON SOURCES

    EPA Science Inventory

    Bacterioplankton Dynamics in Pensacola Bay, FL, USA: Role of Phytoplankton and Detrital Carbon Sources (Abstract). To be presented at the 16th Biennial Conference of the Estuarine Research Foundation, ERF 2001: An Estuarine Odyssey, 4-8 November 2001, St. Pete Beach, FL. 1 p. (ER...

  2. Molecular phylogeny of the nettle family (Urticaceae) inferred from multiple loci of three genomes and extensive generic sampling.

    PubMed

    Wu, Zeng-Yuan; Monro, Alex K; Milne, Richard I; Wang, Hong; Yi, Ting-Shuang; Liu, Jie; Li, De-Zhu

    2013-12-01

    Urticaceae is one of the larger Angiosperm families, but relationships within it remain poorly known. This study presents the first densely sampled molecular phylogeny of Urticaceae, using maximum likelihood (ML), maximum parsimony (MP) and Bayesian inference (BI) to analyze the DNA sequence data from two nuclear (ITS and 18S), four chloroplast (matK, rbcL, rpl14-rps8-infA-rpl36, trnL-trnF) and one mitochondrial (matR) loci. We sampled 169 accessions representing 122 species from 47 of the 54 recognized genera within Urticaceae, including four of the six sometimes separated as Cecropiaceae. Major results included: (1) Urticaceae including Cecropiaceae was monophyletic; (2) Cecropiaceae was biphyletic, with both lineages nested within Urticaceae; (3) Urticaceae can be divided into four well-supported clades; (4) previously erected tribes or subfamilies were broadly supported, with some additions and alterations; (5) the monophyly of many genera was supported, whereas Boehmeria, Pellionia, Pouzolzia and Urera were clearly polyphyletic, while Urtica and Pilea each had a small genus nested within them; (6) relationships between genera were clarified, mostly with substantial support. These results clarify that some morphological characters have been overstated and others understated in previous classifications of the family, and provide a strong foundation for future studies on biogeography, character evolution, and circumscription of difficult genera.

  3. Coral and macroalgal exudates vary in neutral sugar composition and differentially enrich reef bacterioplankton lineages

    PubMed Central

    Nelson, Craig E; Goldberg, Stuart J; Wegley Kelly, Linda; Haas, Andreas F; Smith, Jennifer E; Rohwer, Forest; Carlson, Craig A

    2013-01-01

    Increasing algal cover on tropical reefs worldwide may be maintained through feedbacks whereby algae outcompete coral by altering microbial activity. We hypothesized that algae and coral release compositionally distinct exudates that differentially alter bacterioplankton growth and community structure. We collected exudates from the dominant hermatypic coral holobiont Porites spp. and three dominant macroalgae (one each Ochrophyta, Rhodophyta and Chlorophyta) from reefs of Mo'orea, French Polynesia. We characterized exudates by measuring dissolved organic carbon (DOC) and fractional dissolved combined neutral sugars (DCNSs) and subsequently tracked bacterioplankton responses to each exudate over 48 h, assessing cellular growth, DOC/DCNS utilization and changes in taxonomic composition (via 16S rRNA amplicon pyrosequencing). Fleshy macroalgal exudates were enriched in the DCNS components fucose (Ochrophyta) and galactose (Rhodophyta); coral and calcareous algal exudates were enriched in total DCNS but in the same component proportions as ambient seawater. Rates of bacterioplankton growth and DOC utilization were significantly higher in algal exudate treatments than in coral exudate and control incubations with each community selectively removing different DCNS components. Coral exudates engendered the smallest shift in overall bacterioplankton community structure, maintained high diversity and enriched taxa from Alphaproteobacteria lineages containing cultured representatives with relatively few virulence factors (VFs) (Hyphomonadaceae and Erythrobacteraceae). In contrast, macroalgal exudates selected for less diverse communities heavily enriched in copiotrophic Gammaproteobacteria lineages containing cultured pathogens with increased VFs (Vibrionaceae and Pseudoalteromonadaceae). Our results demonstrate that algal exudates are enriched in DCNS components, foster rapid growth of bacterioplankton and select for bacterial populations with more potential VFs than

  4. Coral and macroalgal exudates vary in neutral sugar composition and differentially enrich reef bacterioplankton lineages.

    PubMed

    Nelson, Craig E; Goldberg, Stuart J; Wegley Kelly, Linda; Haas, Andreas F; Smith, Jennifer E; Rohwer, Forest; Carlson, Craig A

    2013-05-01

    Increasing algal cover on tropical reefs worldwide may be maintained through feedbacks whereby algae outcompete coral by altering microbial activity. We hypothesized that algae and coral release compositionally distinct exudates that differentially alter bacterioplankton growth and community structure. We collected exudates from the dominant hermatypic coral holobiont Porites spp. and three dominant macroalgae (one each Ochrophyta, Rhodophyta and Chlorophyta) from reefs of Mo'orea, French Polynesia. We characterized exudates by measuring dissolved organic carbon (DOC) and fractional dissolved combined neutral sugars (DCNSs) and subsequently tracked bacterioplankton responses to each exudate over 48 h, assessing cellular growth, DOC/DCNS utilization and changes in taxonomic composition (via 16S rRNA amplicon pyrosequencing). Fleshy macroalgal exudates were enriched in the DCNS components fucose (Ochrophyta) and galactose (Rhodophyta); coral and calcareous algal exudates were enriched in total DCNS but in the same component proportions as ambient seawater. Rates of bacterioplankton growth and DOC utilization were significantly higher in algal exudate treatments than in coral exudate and control incubations with each community selectively removing different DCNS components. Coral exudates engendered the smallest shift in overall bacterioplankton community structure, maintained high diversity and enriched taxa from Alphaproteobacteria lineages containing cultured representatives with relatively few virulence factors (VFs) (Hyphomonadaceae and Erythrobacteraceae). In contrast, macroalgal exudates selected for less diverse communities heavily enriched in copiotrophic Gammaproteobacteria lineages containing cultured pathogens with increased VFs (Vibrionaceae and Pseudoalteromonadaceae). Our results demonstrate that algal exudates are enriched in DCNS components, foster rapid growth of bacterioplankton and select for bacterial populations with more potential VFs than

  5. Genomic Alteration in Head and Neck Squamous Cell Carcinoma (HNSCC) Cell Lines Inferred from Karyotyping, Molecular Cytogenetics, and Array Comparative Genomic Hybridization

    PubMed Central

    Rerkarmnuaychoke, Budsaba; Suntronpong, Aorarat; Fu, Beiyuan; Bodhisuwan, Winai; Peyachoknagul, Surin; Yang, Fengtang; Koontongkaew, Sittichai; Srikulnath, Kornsorn

    2016-01-01

    Genomic alteration in head and neck squamous cell carcinoma (HNSCC) was studied in two cell line pairs (HN30-HN31 and HN4-HN12) using conventional C-banding, multiplex fluorescence in situ hybridization (M-FISH), and array comparative genomic hybridization (array CGH). HN30 and HN4 were derived from primary lesions in the pharynx and base of tongue, respectively, and HN31 and HN12 were derived from lymph-node metastatic lesions belonging to the same patients. Gains of chromosomes 1, 7, and 11 were shared in almost all cell lines. Hierarchical clustering revealed that HN31 was closely related to HN4, which shared eight chromosome alteration cases. Large C-positive heterochromatins were found in the centromeric region of chromosome 9 in HN31 and HN4, which suggests complex structural amplification of the repetitive sequence. Array CGH revealed amplification of 7p22.3p11.2, 8q11.23q12.1, and 14q32.33 in all cell lines involved with tumorigenesis and inflammation genes. The amplification of 2p21 (SIX3), 11p15.5 (H19), and 11q21q22.3 (MAML2, PGR, TRPC6, and MMP family) regions, and deletion of 9p23 (PTPRD) and 16q23.1 (WWOX) regions were identified in HN31 and HN12. Interestingly, partial loss of PTPRD (9p23) and WWOX (16q23.1) genes was identified in HN31 and HN12; PTPRD expression tended to be down-regulated, and no expression of the WWOX gene was detectable. This suggests that the scarcity of PTPRD and WWOX genes might have played an important role in progression of HNSCC, and could be considered as a target for cancer therapy or a biomarker in molecular pathology. PMID:27501229

  6. Phylogenetic inference and SSR characterization of tropical woody bamboos tribe Bambuseae (Poaceae: Bambusoideae) based on complete plastid genome sequences.

    PubMed

    Vieira, Leila do Nascimento; Dos Anjos, Karina Goulart; Faoro, Helisson; Fraga, Hugo Pacheco de Freitas; Greco, Thiago Machado; Pedrosa, Fábio de Oliveira; de Souza, Emanuel Maltempi; Rogalski, Marcelo; de Souza, Robson Francisco; Guerra, Miguel Pedro

    2016-05-01

    Complete plastome sequencing is an efficient option for increasing phylogenetic resolution in evolutionary studies, and may greatly facilitate the use of plastid DNA markers in plant population genetic studies. Merostachys and Guadua stand out as the most common bamboos indigenous to Brazil and those with the highest potential for utilization. Here, we sequenced the complete plastomes of the Brazilian Guadua chacoensis and Merostachys sp. to perform full plastome phylogeny and characterize the occurrence, type, and distribution of SSRs using 20 Bambuseae species. The determined plastome sequence of Merostachys sp. and G. chacoensis is 136,334 and 135,403 bp in size, respectively, with an identical gene content and typical quadripartite structure consisting of a pair of IRs separated by the LSC and SSC regions. The Maximum Likelihood and Bayesian Inference analyses produced phylogenomic trees identical in topology. These trees supported monophyly of Paleotropical and Neotropical Bamboos clades. The Neotropical bamboos segregated into three well-supported lineages, Chusqueinae, Guaduinae, and Arthrostylidiinae, with the last two forming a well-supported sister relationship. Paleotropical bamboos segregated into two well-supported lineages, Hickeliinae and Bambusinae + Melocanninae. We identified 141.8 cpSSR in Bambuseae plastomes and an inferior value (38.15) for plastome coding sequences. Among them, we identified 16 polymorphic SSR loci, with number of alleles varying from 3 to 10. These 16 polymorphic cpSSR loci in Bambuseae plastome can be assessed for the intraspecific level of polymorphism, leading to innovative highly sensitive phylogeographic and population genetics studies for this tribe.

  7. Demographic inferences using short-read genomic data in an approximate Bayesian computation framework: in silico evaluation of power, biases and proof of concept in Atlantic walrus.

    PubMed

    Shafer, Aaron B A; Gattepaille, Lucie M; Stewart, Robert E A; Wolf, Jochen B W

    2015-01-01

    Approximate Bayesian computation (ABC) is a powerful tool for model-based inference of demographic histories from large genetic data sets. For most organisms, its implementation has been hampered by the lack of sufficient genetic data. Genotyping-by-sequencing (GBS) provides cheap genome-scale data to fill this gap, but its potential has not fully been exploited. Here, we explored power, precision and biases of a coalescent-based ABC approach where GBS data were modelled with either a population mutation parameter (θ) or a fixed site (FS) approach, allowing single or several segregating sites per locus. With simulated data ranging from 500 to 50 000 loci, a variety of demographic models could be reliably inferred across a range of timescales and migration scenarios. Posterior estimates were informative with 1000 loci for migration and split time in simple population divergence models. In more complex models, posterior distributions were wide and almost reverted to the uninformative prior even with 50 000 loci. ABC parameter estimates, however, were generally more accurate than an alternative composite-likelihood method. Bottleneck scenarios proved particularly difficult, and only recent bottlenecks without recovery could be reliably detected and dated. Notably, minor-allele-frequency filters - usual practice for GBS data - negatively affected nearly all estimates. With this in mind, we used a combination of FS and θ approaches on empirical GBS data generated from the Atlantic walrus (Odobenus rosmarus rosmarus), collectively providing support for a population split before the last glacial maximum followed by asymmetrical migration and a high Arctic bottleneck. Overall, this study evaluates the potential and limitations of GBS data in an ABC-coalescence framework and proposes a best-practice approach.
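
    The rejection form of ABC that underlies approaches like the one in this abstract can be illustrated with a toy example. The following is a minimal sketch and not the authors' pipeline: it assumes a single constant-size population, uses the mean number of segregating sites per locus as the only summary statistic, and places a hypothetical flat prior on θ. All function names, the prior bounds, the sample sizes and the tolerance are illustrative assumptions.

```python
import random


def simulate_segregating_sites(theta, n_samples, rng):
    """Segregating sites at one locus under the standard neutral
    coalescent (time measured in units of 2N generations)."""
    total_branch_length = 0.0
    for k in range(n_samples, 1, -1):
        # With k lineages, the waiting time to the next coalescence
        # is exponential with rate k(k-1)/2; k branches are present.
        total_branch_length += k * rng.expovariate(k * (k - 1) / 2)
    # Mutations fall on the tree as a Poisson process with rate theta/2.
    rate = theta / 2 * total_branch_length
    # Poisson draw: count unit-rate exponential arrivals before `rate`.
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < rate:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def mean_sites(theta, n_samples, n_loci, rng):
    """Summary statistic: mean segregating sites across independent loci."""
    total = sum(simulate_segregating_sites(theta, n_samples, rng)
                for _ in range(n_loci))
    return total / n_loci


def abc_rejection(observed, n_samples, n_loci, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies close to the data."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 10.0)  # hypothetical flat prior on theta
        simulated = mean_sites(theta, n_samples, n_loci, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(1)
# Pseudo-observed data generated with a known theta of 2.0.
observed = mean_sites(2.0, 10, 100, rng)
posterior = abc_rejection(observed, 10, 100, 1000, 0.3, rng)
posterior_mean = sum(posterior) / len(posterior)
```

    The accepted values in `posterior` approximate the posterior distribution of θ. Raising `n_loci` narrows that distribution, which mirrors the abstract's point that the number of loci governs how informative the estimates are.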

  8. Southeast Asian origins of five Hill Tribe populations and correlation of genetic to linguistic relationships inferred with genome-wide SNP data.

    PubMed

    Listman, J B; Malison, R T; Sanichwankul, K; Ittiwut, C; Mutirangura, A; Gelernter, J

    2011-02-01

    In Thailand, the term Hill Tribe is used to describe populations whose members traditionally practice slash and burn agriculture and reside in the mountains. These tribes are thought to have migrated throughout Asia for up to 5,000 years, including migrations through Southern China and/or Southeast Asia. There have been continuous migrations southward from China into Thailand for approximately the past thousand years and the present geographic range of any given tribe straddles multiple political borders. As none of these populations have autochthonous scripts, written histories have, until recently, been externally produced. Northern Asian, Tibetan, and Siberian origins of Hill Tribes have been proposed. All purport endogamy and have nonmutually intelligible languages. To test hypotheses regarding the geographic origins of these populations, relatedness and migrations among them and neighboring populations, and whether their genetic relationships correspond with their linguistic relationships, we analyzed 2,445 genome-wide SNP markers in 118 individuals from five Thai Hill Tribe populations (Akha, Hmong, Karen, Lahu, and Lisu), 90 individuals from majority Thai populations, and 826 individuals from Asian and Oceanian HGDP and HapMap populations using a Bayesian clustering method. Considering these results within the context of results of recent large-scale studies of Asian geographic genetic variation allows us to infer a shared Southeast Asian origin of these five Hill Tribe populations as well as ancestry components that distinguish among them seen in successive levels of clustering. In addition, the inferred level of shared ancestry among the Hill Tribes corresponds well to relationships among their languages.

  9. First all-in-one diagnostic tool for DNA intelligence: genome-wide inference of biogeographic ancestry, appearance, relatedness, and sex with the Identitas v1 Forensic Chip.

    PubMed

    Keating, Brendan; Bansal, Aruna T; Walsh, Susan; Millman, Jonathan; Newman, Jonathan; Kidd, Kenneth; Budowle, Bruce; Eisenberg, Arthur; Donfack, Joseph; Gasparini, Paolo; Budimlija, Zoran; Henders, Anjali K; Chandrupatla, Hareesh; Duffy, David L; Gordon, Scott D; Hysi, Pirro; Liu, Fan; Medland, Sarah E; Rubin, Laurence; Martin, Nicholas G; Spector, Timothy D; Kayser, Manfred

    2013-05-01

    When a forensic DNA sample cannot be associated directly with a previously genotyped reference sample by standard short tandem repeat profiling, the investigation required for identifying perpetrators, victims, or missing persons can be both costly and time consuming. Here, we describe the outcome of a collaborative study using the Identitas Version 1 (v1) Forensic Chip, the first commercially available all-in-one tool dedicated to the concept of developing intelligence leads based on DNA. The chip allows parallel interrogation of 201,173 genome-wide autosomal, X-chromosomal, Y-chromosomal, and mitochondrial single nucleotide polymorphisms for inference of biogeographic ancestry, appearance, relatedness, and sex. The first assessment of the chip's performance was carried out on 3,196 blinded DNA samples of varying quantities and qualities, covering a wide range of biogeographic origin and eye/hair coloration as well as variation in relatedness and sex. Overall, 95 % of the samples (N = 3,034) passed quality checks with an overall genotype call rate >90 % on variable numbers of available recorded trait information. Predictions of sex, direct match, and first to third degree relatedness were highly accurate. Chip-based predictions of biparental continental ancestry were on average ~94 % correct (further support provided by separately inferred patrilineal and matrilineal ancestry). Predictions of eye color were 85 % correct for brown and 70 % correct for blue eyes, and predictions of hair color were 72 % for brown, 63 % for blond, 58 % for black, and 48 % for red hair. From the 5 % of samples (N = 162) with <90 % call rate, 56 % yielded correct continental ancestry predictions while 7 % yielded sufficient genotypes to allow hair and eye color prediction. Our results demonstrate that the Identitas v1 Forensic Chip holds great promise for a wide range of applications including criminal investigations, missing person investigations, and for national security

  10. Can we continue to neglect genomic variation in introgression rates when inferring the history of speciation? A case study in a Mytilus hybrid zone.

    PubMed

    Roux, C; Fraïsse, C; Castric, V; Vekemans, X; Pogson, G H; Bierne, N

    2014-08-01

    The use of molecular data to reconstruct the history of divergence and gene flow between populations of closely related taxa represents a challenging problem. It has been proposed that the long-standing debate about the geography of speciation can be resolved by comparing the likelihoods of a model of isolation with migration and a model of secondary contact. However, data are commonly only fit to a model of isolation with migration and rarely tested against the secondary contact alternative. Furthermore, most demographic inference methods have neglected variation in introgression rates and assume that the gene flow parameter (Nm) is similar among loci. Here, we show that neglecting this source of variation can give misleading results. We analysed DNA sequences sampled from populations of the marine mussels, Mytilus edulis and M. galloprovincialis, across a well-studied mosaic hybrid zone in Europe and evaluated various scenarios of speciation, with or without variation in introgression rates, using an Approximate Bayesian Computation (ABC) approach. Models with heterogeneous gene flow across loci always outperformed models assuming equal migration rates irrespective of the history of gene flow being considered. By incorporating this heterogeneity, the best-supported scenario was a long period of allopatric isolation during the first three-quarters of the time since divergence followed by secondary contact and introgression during the last quarter. By contrast, constraining migration to be homogeneous failed to discriminate among any of the different models of gene flow tested. Our simulations thus provide statistical support for the secondary contact scenario in the European Mytilus hybrid zone that the standard coalescent approach failed to confirm. Our results demonstrate that genomic variation in introgression rates can have profound impacts on the biological conclusions drawn from inference methods and needs to be incorporated in future studies.

  11. The evolutionary history of Plasmodium vivax as inferred from mitochondrial genomes: parasite genetic diversity in the Americas.

    PubMed

    Taylor, Jesse E; Pacheco, M Andreína; Bacon, David J; Beg, Mohammad A; Machado, Ricardo Luiz; Fairhurst, Rick M; Herrera, Socrates; Kim, Jung-Yeon; Menard, Didier; Póvoa, Marinete Marins; Villegas, Leopoldo; Mulyanto; Snounou, Georges; Cui, Liwang; Zeyrek, Fadile Yildiz; Escalante, Ananias A

    2013-09-01

    Plasmodium vivax is the most prevalent human malaria parasite in the Americas. Previous studies have contrasted the genetic diversity of parasite populations in the Americas with those in Asia and Oceania, concluding that New World populations exhibit low genetic diversity consistent with a recent introduction. Here we used an expanded sample of complete mitochondrial genome sequences to investigate the diversity of P. vivax in the Americas as well as in other continental populations. We show that the diversity of P. vivax in the Americas is comparable to that in Asia and Oceania, and we identify several divergent clades circulating in South America that may have resulted from independent introductions. In particular, we show that several haplotypes sampled in Venezuela and northeastern Brazil belong to a clade that diverged from the other P. vivax lineages at least 30,000 years ago, albeit not necessarily in the Americas. We propose that, unlike in Asia where human migration increases local genetic diversity, the combined effects of the geographical structure and the low incidence of vivax malaria in the Americas has resulted in patterns of low local but high regional genetic diversity. This could explain previous views that P. vivax in the Americas has low genetic diversity because these were based on studies carried out in limited areas. Further elucidation of the complex geographical pattern of P. vivax variation will be important both for diversity assessments of genes encoding candidate vaccine antigens and in the formulation of control and surveillance measures aimed at malaria elimination.

  12. The Evolutionary History of Plasmodium vivax as Inferred from Mitochondrial Genomes: Parasite Genetic Diversity in the Americas

    PubMed Central

    Taylor, Jesse E.; Pacheco, M. Andreína; Bacon, David J.; Beg, Mohammad A.; Machado, Ricardo Luiz; Fairhurst, Rick M.; Herrera, Socrates; Kim, Jung-Yeon; Menard, Didier; Póvoa, Marinete Marins; Villegas, Leopoldo; Mulyanto; Snounou, Georges; Cui, Liwang; Zeyrek, Fadile Yildiz; Escalante, Ananias A.

    2013-01-01

    Plasmodium vivax is the most prevalent human malaria parasite in the Americas. Previous studies have contrasted the genetic diversity of parasite populations in the Americas with those in Asia and Oceania, concluding that New World populations exhibit low genetic diversity consistent with a recent introduction. Here we used an expanded sample of complete mitochondrial genome sequences to investigate the diversity of P. vivax in the Americas as well as in other continental populations. We show that the diversity of P. vivax in the Americas is comparable to that in Asia and Oceania, and we identify several divergent clades circulating in South America that may have resulted from independent introductions. In particular, we show that several haplotypes sampled in Venezuela and northeastern Brazil belong to a clade that diverged from the other P. vivax lineages at least 30,000 years ago, albeit not necessarily in the Americas. We propose that, unlike in Asia where human migration increases local genetic diversity, the combined effects of the geographical structure and the low incidence of vivax malaria in the Americas has resulted in patterns of low local but high regional genetic diversity. This could explain previous views that P. vivax in the Americas has low genetic diversity because these were based on studies carried out in limited areas. Further elucidation of the complex geographical pattern of P. vivax variation will be important both for diversity assessments of genes encoding candidate vaccine antigens and in the formulation of control and surveillance measures aimed at malaria elimination. PMID:23733143

  13. Phylogenetic position of tetraodontiform fishes within the higher teleosts: Bayesian inferences based on 44 whole mitochondrial genome sequences.

    PubMed

    Yamanoue, Yusuke; Miya, Masaki; Matsuura, Keiichi; Yagishita, Naoki; Mabuchi, Kohji; Sakai, Harumi; Katoh, Masaya; Nishida, Mutsumi

    2007-10-01

    Tetraodontiformes includes approximately 350 species assigned to nine families, sharing several reduced morphological features of higher teleosts. The order has been accepted as a monophyletic group by many authors, although several alternative hypotheses exist regarding its phylogenetic position within the higher teleosts. To date, acanthuroids, zeiforms, and lophiiforms have been proposed as sister-groups of the tetraodontiforms. The monophyly and sister-group status was investigated using whole mitochondrial genome (mitogenome) sequences from 44 purposefully-chosen species (26 sequences newly-determined during the study) that fully represent the major tetraodontiform lineages plus all the groups that have been hypothesized as being close relatives. Partitioned Bayesian analyses were conducted with the three datasets that comprised concatenated nucleotide sequences from 13 protein-coding genes (with and without, or with RY-coding, 3rd codon positions), plus 22 transfer RNA and two ribosomal RNA genes. The resultant trees were well resolved and largely congruent, with most internal branches being supported by high posterior probabilities. Mitogenomic data strongly supported the monophyly of tetraodontiform fishes, placing them as a sister-group of either Lophiiformes plus Caproidei or Caproidei only. The sister-group relationship between Acanthuroidei and Tetraodontiformes was statistically rejected using Bayes factors. These results were confirmed by a reanalysis of the previously published nuclear RAG1 gene sequences using the Bayesian method. Within the Tetraodontiformes, however, monophylies of the three superfamilies were not recovered and further taxonomic sampling and subsequent efforts should clarify these relationships.

  14. ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments

    PubMed Central

    Lachmann, Alexander; Xu, Huilei; Krishnan, Jayanth; Berger, Seth I.; Mazloom, Amin R.; Ma'ayan, Avi

    2010-01-01

    Motivation: Experiments such as ChIP-chip, ChIP-seq, ChIP-PET and DamID (the four methods referred herein as ChIP-X) are used to profile the binding of transcription factors to DNA at a genome-wide scale. Such experiments provide hundreds to thousands of potential binding sites for a given transcription factor in proximity to gene coding regions. Results: In order to integrate data from such studies and utilize it for further biological discovery, we collected interactions from such experiments to construct a mammalian ChIP-X database. The database contains 189 933 interactions, manually extracted from 87 publications, describing the binding of 92 transcription factors to 31 932 target genes. We used the database to analyze mRNA expression data where we perform gene-list enrichment analysis using the ChIP-X database as the prior biological knowledge gene-list library. The system is delivered as a web-based interactive application called ChIP Enrichment Analysis (ChEA). With ChEA, users can input lists of mammalian gene symbols for which the program computes over-representation of transcription factor targets from the ChIP-X database. The ChEA database allowed us to reconstruct an initial network of transcription factors connected based on shared overlapping targets and binding site proximity. To demonstrate the utility of ChEA we present three case studies. We show how by combining the Connectivity Map (CMAP) with ChEA, we can rank pairs of compounds to be used to target specific transcription factor activity in cancer cells. Availability: The ChEA software and ChIP-X database is freely available online at: http://amp.pharm.mssm.edu/lib/chea.jsp Contact: avi.maayan@mssm.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20709693
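
    Over-representation of transcription factor targets in an input gene list is commonly scored with a one-sided hypergeometric (Fisher exact) test. The abstract does not state which statistic ChEA uses, so the following is only a minimal sketch under that assumption; the function name and all gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_targets, n_query, n_overlap):
    """Probability of seeing at least `n_overlap` targets of one
    transcription factor in a query list of `n_query` genes drawn
    without replacement from `n_background` genes, of which
    `n_targets` are known targets (upper hypergeometric tail)."""
    upper = min(n_targets, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_targets, k) * comb(n_background - n_targets, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, 500 known targets,
# a 100-gene input list, 10 of which are targets.
p_value = enrichment_p_value(20000, 500, 100, 10)
```

    With these illustrative counts about 2.5 targets would be expected by chance, so an overlap of 10 yields a small p-value. Ranking transcription factors by such p-values gives a simple enrichment ordering.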

  15. Ecological Inference

    NASA Astrophysics Data System (ADS)

    King, Gary; Rosen, Ori; Tanner, Martin A.

    2004-09-01

    This collection of essays brings together a diverse group of scholars to survey the latest strategies for solving ecological inference problems in various fields. The last half-decade has witnessed an explosion of research in ecological inference--the process of trying to infer individual behavior from aggregate data. Although uncertainties and information lost in aggregation make ecological inference one of the most problematic types of research to rely on, these inferences are required in many academic fields, as well as by legislatures and the Courts in redistricting, by business in marketing research, and by governments in policy analysis.

  16. Rapid turnover of dissolved DMS and DMSP by defined bacterioplankton communities in the stratified euphotic zone of the North Sea

    NASA Astrophysics Data System (ADS)

    Zubkov, Mikhail V.; Fuchs, Bernhard M.; Archer, Stephen D.; Kiene, Ronald P.; Amann, Rudolf; Burkill, Peter H.

    Bacterioplankton-driven turnover of the algal osmolyte, dimethylsulphoniopropionate (DMSP), and its degradation product, dimethylsulphide (DMS), the major natural source of atmospheric sulphur, were studied during a Lagrangian SF6-tracer experiment in the North Sea (60°N, 3°E). The water mass sampled within the euphotic zone was characterised by a surface mixed layer (from 0 m to 13-30 m) and a subsurface layer (from 13-30 m to 45-58 m) separated by a 2°C thermocline spanning 2 m. The fluxes of dissolved DMSP (DMSPd) and DMS were determined using radioactive tracer techniques. Rates of the simultaneous incorporation of 14C-leucine and 3H-thymidine were measured to estimate bacterioplankton production. Flow cytometry was employed to discriminate subpopulations and to determine the numbers and biomass of bacterioplankton by staining for nucleic acids and proteins. Bacterioplankton subpopulations were separated by flow cytometric sorting and their composition determined using 16S ribosomal gene cloning/sequencing and fluorescence in situ hybridisation with designed group-specific oligonucleotide probes. A subpopulation, dominated by bacteria related to Roseobacter (α-proteobacteria), constituted 26-33% of total bacterioplankton numbers and 45-48% of biomass in both surface and subsurface layers. The other abundant prokaryotes were a group within the SAR86 cluster of γ-proteobacteria and bacteria from the Cytophaga-Flavobacterium cluster. Bacterial consumption of DMSPd was greater in the subsurface layer (41 nM d⁻¹) than in the surface layer (20 nM d⁻¹). Bacterioplankton tightly controlled the DMSPd pool, particularly in the subsurface layer, with a turnover time of 2 h, whereas the turnover time of DMSPd in the surface layer was 10 h. Consumed DMSP satisfied the majority of sulphur demands of bacterioplankton, even though bacterioplankton assimilated only about 2.5% and 6.0% of consumed DMSPd sulphur in the surface and subsurface layers, respectively

  17. Covariance of bacterioplankton composition and environmental variables in a temperate delta system

    USGS Publications Warehouse

    Stepanauskas, R.; Moran, M.A.; Bergamaschi, B.A.; Hollibaugh, J.T.

    2003-01-01

    We examined seasonal and spatial variation in bacterioplankton composition in the Sacramento-San Joaquin River Delta (CA) using terminal restriction fragment length polymorphism (T-RFLP) analysis. Cloned 16S rRNA genes from this system were used for putative identification of taxa dominating the T-RFLP profiles. Both cloning and T-RFLP analysis indicated that Actinobacteria, Verrucomicrobia, Cytophaga-Flavobacterium and Proteobacteria were the most abundant bacterioplankton groups in the Delta. Despite the broad variety of sampled habitats (deep water channels, lakes, marshes, agricultural drains, freshwater and brackish areas), and the spatial and temporal differences in hydrology, temperature and water chemistry among the sampling campaigns, T-RFLP electropherograms from all samples were similar, indicating that the same bacterioplankton phylotypes dominated in the various habitats of the Delta throughout the year. However, principal component analysis (PCA) and partial least-squares regression (PLS) of T-RFLP profiles revealed consistent grouping of samples on a seasonal, but not a spatial, basis. β-Proteobacteria related to Ralstonia, Actinobacteria related to Microthrix, and ??-Proteobacteria identical to the environmental Clone LD12 had the highest relative abundance in summer/fall T-RFLP profiles and were associated with low river flow, high pH, and a number of optical and chemical characteristics of dissolved organic carbon (DOC) indicative of an increased proportion of phytoplankton-produced organic material as opposed to allochthonous, terrestrially derived organic material. On the other hand, Geobacter-related δ-Proteobacteria showed a relative increase in abundance in T-RFLP analysis during winter/spring, and probably were washed out from watershed soils or sediment. Various phylotypes associated with the same phylogenetic division, based on tentative identification of T-RFLP fragments, exhibited diverse seasonal patterns, suggesting that ecological

  18. The dynamics of carbon exchange in vertically stratified coastal bacterioplankton communities

    SciTech Connect

    Blum, P.

    1998-07-01

    This research focuses on the development and application of novel molecular methods to measure bacterioplankton growth state in situ. These methods included bulk or population-based studies and single cell studies. Due to the limited duration of support and subsequent termination of the molecular-focused PIs, only the former bulk method was applied to marine samples. In addition, basic laboratory studies were completed which addressed why the selected biomarkers were regulated by bacterial growth state.

  19. Magnitude and regulation of bacterioplankton respiratory quotient across freshwater environmental gradients.

    PubMed

    Berggren, Martin; Lapierre, Jean-François; del Giorgio, Paul A

    2012-05-01

    Bacterioplankton respiration (BR) may represent the largest single sink of organic carbon in the biosphere and constitutes an important driver of atmospheric carbon dioxide (CO2) emissions from freshwaters. Complete understanding of BR is precluded by the fact that most studies need to assume a respiratory quotient (RQ; mole of CO2 produced per mole of O2 consumed) to calculate rates of BR. Many studies have, without clear support, assumed a fixed RQ around 1. Here we present 72 direct measurements of bacterioplankton RQ that we carried out in epilimnetic samples of 52 freshwater sites in Québec (Canada), using O2 and CO2 optic sensors. The RQs tended to converge around 1.2, but showed large variability (s.d.=0.45) and significant correlations with major gradients of ecosystem-level, substrate-level and bacterial community-level characteristics. Experiments with natural bacterioplankton using different single substrates suggested that RQ is intimately linked to the elemental composition of the respired compounds. RQs were on average low in net autotrophic systems, where bacteria likely were utilizing mainly reduced substrates, whereas we found evidence that the dominance of highly oxidized substrates, for example, organic acids formed by photo-chemical processes, led to high RQ in the more heterotrophic systems. Further, we suggest that BR contributes to a substantially larger share of freshwater CO2 emissions than presently believed based on the assumption that RQ is ∼1. Our study demonstrates that bacterioplankton RQ is not only a practical aspect of BR determination, but also a major ecosystem state variable that provides unique information about aquatic ecosystem functioning.
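
    The effect of the assumed RQ on a CO2-based respiration estimate follows directly from the definition given in the abstract (moles of CO2 produced per mole of O2 consumed). The short sketch below works through that arithmetic; the O2 consumption rate is a hypothetical value chosen for illustration, and only the RQ values of 1 and 1.2 come from the abstract.

```python
def co2_production(o2_consumption, rq=1.0):
    """Convert an O2 consumption rate into a CO2 production rate
    using the respiratory quotient (mol CO2 per mol O2)."""
    return rq * o2_consumption


o2_rate = 10.0  # hypothetical O2 consumption, e.g. in µmol per litre per day
assumed = co2_production(o2_rate)            # conventional assumption, RQ = 1
measured = co2_production(o2_rate, rq=1.2)   # RQ near the reported central value
relative_increase = (measured - assumed) / assumed
```

    Under these assumptions the CO2 estimate rises by about one fifth when the RQ moves from 1 to 1.2, independent of the O2 rate chosen.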

  20. Magnitude and regulation of bacterioplankton respiratory quotient across freshwater environmental gradients

    PubMed Central

    Berggren, Martin; Lapierre, Jean-François; del Giorgio, Paul A

    2012-01-01

    Bacterioplankton respiration (BR) may represent the largest single sink of organic carbon in the biosphere and constitutes an important driver of atmospheric carbon dioxide (CO2) emissions from freshwaters. Complete understanding of BR is precluded by the fact that most studies need to assume a respiratory quotient (RQ; mole of CO2 produced per mole of O2 consumed) to calculate rates of BR. Many studies have, without clear support, assumed a fixed RQ around 1. Here we present 72 direct measurements of bacterioplankton RQ that we carried out in epilimnetic samples of 52 freshwater sites in Québec (Canada), using O2 and CO2 optic sensors. The RQs tended to converge around 1.2, but showed large variability (s.d.=0.45) and significant correlations with major gradients of ecosystem-level, substrate-level and bacterial community-level characteristics. Experiments with natural bacterioplankton using different single substrates suggested that RQ is intimately linked to the elemental composition of the respired compounds. RQs were on average low in net autotrophic systems, where bacteria likely were utilizing mainly reduced substrates, whereas we found evidence that the dominance of highly oxidized substrates, for example, organic acids formed by photo-chemical processes, led to high RQ in the more heterotrophic systems. Further, we suggest that BR contributes to a substantially larger share of freshwater CO2 emissions than presently believed based on the assumption that RQ is ∼1. Our study demonstrates that bacterioplankton RQ is not only a practical aspect of BR determination, but also a major ecosystem state variable that provides unique information about aquatic ecosystem functioning. PMID:22094347

  1. Temporal variability in the diversity and composition of stream bacterioplankton communities.

    PubMed

    Portillo, Maria C; Anderson, Suzanne P; Fierer, Noah

    2012-09-01

    Bacterioplankton in freshwater streams play a critical role in stream nutrient cycling. Despite their ecological importance, the temporal variability in the structure of stream bacterioplankton communities remains understudied. We investigated the composition and temporal variability of stream bacterial communities and the influence of physicochemical parameters on these communities. We used barcoded pyrosequencing to survey bacterial communities in 107 streamwater samples collected from four locations in the Colorado Rocky Mountains from September 2008 to November 2009. The four sampled locations harboured distinct communities yet, at each sampling location, there was pronounced temporal variability in both community composition and alpha diversity levels. These temporal shifts in bacterioplankton community structure were not seasonal; rather, their diversity and composition appeared to be driven by intermittent changes in various streamwater biogeochemical conditions. Bacterial communities varied independently of time, as indicated by the observation that communities in samples collected close together in time were no more similar than those collected months apart. The temporal turnover in community composition was higher than observed in most previously studied microbial, plant or animal communities, highlighting the importance of stochastic processes and disturbance events in structuring these communities over time. Detailed temporal sampling is important if the objective is to monitor microbial community dynamics in pulsed ecosystems like streams.

  2. Bacterioplankton community shifts associated with epipelagic and mesopelagic waters in the Southern Ocean

    PubMed Central

    Yu, Zheng; Yang, Jun; Liu, Lemian; Zhang, Wenjing; Amalfitano, Stefano

    2015-01-01

    The Southern Ocean is among the least explored marine environments on Earth, and still little is known about regional and vertical variability in the diversity of Antarctic marine prokaryotes. In this study, the bacterioplankton community in both epipelagic and mesopelagic waters was assessed at two adjacent stations by high-throughput sequencing and quantitative PCR. Water temperature was significantly higher in the superficial photic zone, while higher salinity and dissolved oxygen were recorded in the deeper water layers. The highest abundance of the bacterioplankton was found at a depth of 75 m, corresponding to the deep chlorophyll maximum layer. Both Alphaproteobacteria and Gammaproteobacteria were the most abundant taxa throughout the water column, while more sequences affiliated to Cyanobacteria and unclassified bacteria were identified from surface and the deepest waters, respectively. Temperature was the most significant environmental variable affecting the bacterial community structure. The bacterial community composition displayed significant differences at the epipelagic layers between two stations, whereas those in the mesopelagic waters were more similar to each other. Our results indicated that the epipelagic bacterioplankton might be dominated by short-term environmental variable conditions, whereas the mesopelagic communities appeared to be structured by longer water-mass residence time and relative stable environmental factors. PMID:26256889

  3. The temporal scaling of bacterioplankton composition: high turnover and predictability during shrimp cultivation.

    PubMed

    Xiong, Jinbo; Zhu, Jianlin; Wang, Kai; Wang, Xin; Ye, Xiansen; Liu, Lian; Zhao, Qunfen; Hou, Manhua; Qiuqian, Linglin; Zhang, Demin

    2014-02-01

    The spatial distribution of microbial communities has recently been reliably documented in the form of a distance-similarity decay relationship. In contrast, temporal scaling, the pattern defined by the microbial similarity-time relationships (STRs), has received far less attention. As a result, it is unclear whether the spatial and temporal variations of microbial communities share a similar power law. In this study, we applied the 454 pyrosequencing technique to investigate temporal scaling in patterns of bacterioplankton community dynamics during the process of shrimp culture. Our results showed that the similarities decreased significantly (P = 0.002) with time during the period over which the bacterioplankton community was monitored, with a scaling exponent of w = 0.400. However, the diversities did not change dramatically. The community dynamics followed a gradual process of succession relative to the parent communities, with greater similarities between samples from consecutive sampling points. In particular, the variations of the bacterial communities from different ponds shared similar successional trajectories, suggesting that bacterial temporal dynamics are predictable to a certain extent. Changes in bacterial community structure were significantly correlated with the combination of Chl a, TN, PO₄³⁻, and the C/N ratio. In this study, we identified predictable patterns in the temporal dynamics of bacterioplankton community structure, demonstrating that the STR of the bacterial community mirrors the spatial distance-similarity decay model.
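
    A scaling exponent such as the w reported here is typically estimated by fitting a power law on log-log axes. The abstract does not give the fitting procedure, so the following is a minimal sketch assuming the form S = c · t^(−w), where S is community similarity and t is the time between samples. The function name and the synthetic data are hypothetical; the data are generated with w = 0.4 only so that the fit can be checked.

```python
import math


def fit_power_law(times, similarities):
    """Least-squares fit of log S = log c - w * log t.
    Returns the pair (c, w)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(s) for s in similarities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope on log-log axes is -w; the intercept is log c.
    return math.exp(intercept), -slope


# Synthetic, noise-free similarities generated with c = 0.8 and w = 0.4.
time_lags = [1, 2, 4, 8, 16, 32]  # hypothetical lags in days
similarities = [0.8 * t ** -0.4 for t in time_lags]
c_hat, w_hat = fit_power_law(time_lags, similarities)
```

    On real data the pairwise similarities would be noisy, and the significance of the slope would need a permutation-based test, because pairwise comparisons are not independent.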

  4. Tracking differential incorporation of dissolved organic carbon types among diverse lineages of Sargasso Sea bacterioplankton.

    PubMed

    Nelson, Craig E; Carlson, Craig A

    2012-06-01

    Bacterioplankton are the primary trophic conduit for dissolved organic carbon (DOC) and linking community structure with DOC utilization is central to understanding global carbon cycling. We coupled stable isotope probing (SIP) with 16S rRNA pyrosequencing in dark seawater culture experiments on euphotic and mesopelagic communities from the Sargasso Sea. Parallel cultures were amended with equimolar quantities of four DO(13)C substrates to simultaneously evaluate community utilization and population-specific incorporation. Of the substrates tested - two cyanobacterial products (exudates or lysates from a culture of Synechococcus) and two defined monosaccharides (glucose or gluconic acid) - the cyanobacterial exudates were incorporated by the greatest diversity of oligotrophic bacterioplankton populations in surface waters, including taxa from > 10 major subclades within the Flavobacteria, Actinobacteria, Verrucomicrobia and Proteobacteria (including SAR11). In contrast, the monosaccharide glucose was not incorporated by any taxa belonging to extant oligotrophic oceanic clades. Conversely, proteobacterial copiotrophs, which were rare in the ambient water (< 0.1% of sequences), grew rapidly on all DOC amendments at both depths, but with different substrate preferences among lineages. We present a new analytical framework for using SIP to detect DOC incorporation across diverse oligotrophic bacterioplankton and discuss implications for the ecology of bacterial-DOC interactions among populations of diverging trophic strategies.

  5. Understanding diversity patterns in bacterioplankton communities from a sub-Antarctic peatland.

    PubMed

    Quiroga, María Victoria; Valverde, Angel; Mataloni, Gabriela; Cowan, Don

    2015-06-01

    Bacterioplankton communities inhabiting peatlands have the potential to influence local ecosystem functions. However, most microbial ecology research in such wetlands has been done in ecosystems (mostly peat soils) of the Northern Hemisphere, and very little is known of the factors that drive bacterial community assembly in other regions of the world. In this study, we used high-throughput sequencing to analyse the structure of the bacterial communities in five pools located in a sub-Antarctic peat bog (Tierra del Fuego, Argentina), and tested for relationships between bacterial communities and environmental conditions. Bacterioplankton communities in peat bog pools were diverse and dominated by members of the Proteobacteria, Actinobacteria, Bacteroidetes and Verrucomicrobia. Community structure was largely explained by differences in hydrological connectivity, pH and nutrient status (ombrotrophic versus minerotrophic pools). Bacterioplankton communities in ombrotrophic pools showed phylogenetic clustering, suggesting a dominant role of deterministic processes in shaping these assemblages. These correlations between habitat characteristics and bacterial diversity patterns provide new insights into the factors regulating microbial populations in peatland ecosystems.

  6. Combined Carbohydrates Support Rich Communities of Particle-Associated Marine Bacterioplankton

    PubMed Central

    Sperling, Martin; Piontek, Judith; Engel, Anja; Wiltshire, Karen H.; Niggemann, Jutta; Gerdts, Gunnar; Wichels, Antje

    2017-01-01

    Carbohydrates represent an important fraction of labile and semi-labile marine organic matter that is mainly comprised of exopolymeric substances derived from phytoplankton exudation and decay. This study investigates the composition of total combined carbohydrates (tCCHO; >1 kDa) and the community development of free-living (0.2–3 μm) and particle-associated (PA) (3–10 μm) bacterioplankton during a spring phytoplankton bloom in the southern North Sea. Furthermore, rates were determined for the extracellular enzymatic hydrolysis that catalyzes the initial step in bacterial organic matter remineralization. Concentrations of tCCHO greatly increased during bloom development, while the composition showed only minor changes over time. The combined concentration of glucose, galactose, fucose, rhamnose, galactosamine, glucosamine, and glucuronic acid in tCCHO was a significant factor shaping the community composition of the PA bacteria. The richness of PA bacteria greatly increased in the post-bloom phase. At the same time, the increase in extracellular β-glucosidase activity was sufficient to explain the observed decrease in tCCHO, indicating the efficient utilization of carbohydrates by the bacterioplankton community during the post-bloom phase. Our results suggest that carbohydrate concentration and composition are important factors in the multifactorial environmental control of bacterioplankton succession and the enzymatic hydrolysis of organic matter during phytoplankton blooms. PMID:28197132

  7. Marine bacterioplankton community turnover within seasonally hypoxic waters of a subtropical sound: Devil's Hole, Bermuda.

    PubMed

    Parsons, Rachel J; Nelson, Craig E; Carlson, Craig A; Denman, Carmen C; Andersson, Andreas J; Kledzik, Andrew L; Vergin, Kevin L; McNally, Sean P; Treusch, Alexander H; Giovannoni, Stephen J

    2015-10-01

    Understanding bacterioplankton community dynamics in coastal hypoxic environments is relevant to global biogeochemistry because coastal hypoxia is increasing worldwide. The temporal dynamics of bacterioplankton communities were analysed throughout the illuminated water column of Devil's Hole, Bermuda during the 6-week annual transition from a strongly stratified water column with suboxic and high-pCO2 bottom waters to a fully mixed and ventilated state during 2008. A suite of culture-independent methods provided a quantitative spatiotemporal characterization of bacterioplankton community changes, including both direct counts and rRNA gene sequencing. During stratification, the surface waters were dominated by the SAR11 clade of Alphaproteobacteria and the cyanobacterium Synechococcus. In the suboxic bottom waters, cells from the order Chlorobiales prevailed, with gene sequences indicating members of the genera Chlorobium and Prosthecochloris, which are anoxygenic photoautotrophs that utilize sulfide as a source of electrons for photosynthesis. Transitional zones of hypoxia also exhibited elevated levels of methane- and sulfur-oxidizing bacteria relative to the overlying waters. The abundances of both Thaumarchaeota and Euryarchaeota were elevated in the suboxic bottom waters (> 10(9) cells l(-1)). Following convective mixing, the entire water column returned to a community typical of oxygenated waters, with Euryarchaeota averaging only 5% of cells, and Chlorobiales and Thaumarchaeota absent.

  8. Evolutionary origin of a streamlined marine bacterioplankton lineage.

    PubMed

    Luo, Haiwei

    2015-06-01

    Planktonic bacterial lineages with streamlined genomes are prevalent in the ocean. The base composition of their DNA is often highly biased towards low G+C content, a possible source of systematic error in phylogenetic reconstruction. A total of 228 orthologous protein families were sampled that are shared among major lineages of Alphaproteobacteria, including the marine free-living SAR11 clade and the obligate endosymbiotic Rickettsiales. These two ecologically distinct lineages share genome sizes of <1.5 Mbp and genomic G+C content of <30%. Statistical analyses showed that only 28 protein families are composition-homogeneous, whereas the other 200 families significantly violate the composition-homogeneous assumption included in most phylogenetic methods. RAxML analysis based on the concatenation of 24 ribosomal proteins that fall into the heterogeneous protein category clustered the SAR11 and Rickettsiales lineages at the base of the Alphaproteobacteria tree, whereas that based on the concatenation of 28 homogeneous proteins (including 19 ribosomal proteins) disassociated the lineages and placed SAR11 at the base of the non-endosymbiotic lineages. When the two data sets were concatenated, only a model that accounted for compositional bias yielded a tree identical to the tree built with composition-homogeneous proteins. Ancestral genome analysis suggests that the first evolved SAR11 cell had a small genome streamlined from its ancestor by a factor of two and coinciding with an ecological transition, followed by further gradual streamlining towards the extant SAR11 populations.

  9. Bacterioplankton and phytoplankton biomass and production during summer stratification in the northwestern Mediterranean Sea

    NASA Astrophysics Data System (ADS)

    Pedrós-Alió, Carlos; Calderón-Paz, Juan-Isidro; Guixa-Boixereu, Núria; Estrada, Marta; Gasol, Josep M.

    1999-06-01

    We examined bacterioplankton biomass and heterotrophic production (BHP) during summer stratification in the northwestern Mediterranean in four successive stratification seasons (June-July of 1993-1996). Values of phytoplankton biomass and primary production were determined simultaneously so that the data sets for autotrophic and heterotrophic microbial plankton could be compared. Three standard stations were set along a transect from Barcelona to the channel between Mallorca and Menorca, representing coastally influenced shelf waters, frontal waters over the slope front, and open sea waters. Conversion factors from 3H-leucine incorporation to BHP were empirically determined and varied between 0.29 and 3.25 kg C mol⁻¹. Bacterial biomass values were among the lowest found in any marine environment. BHP values (between 0.02 and 2.5 μg C L⁻¹ d⁻¹) were larger than those of low nutrient low chlorophyll areas such as the Sargasso Sea and lower than those from high nutrient low chlorophyll areas such as the equatorial Pacific. Growth rates of bacterioplankton were highest at the slope front (0.20 d⁻¹) and lowest at the open sea station (0.04 d⁻¹). Phytoplankton growth rates were similar at the three stations (~0.50 d⁻¹). Integrated values of bacterioplankton biomass, BHP and bacterial growth rates did not show significant differences among years, but differences between the three stations were clearly significant. Phytoplankton biomass, primary production, and phytoplankton growth rates did not show significant differences either with year or with station. As a consequence the bacterioplankton to phytoplankton biomass (BB/BPHY) and production (BHP/PP) ratios varied from the coastal to the open sea stations. The BB/BPHY ratio was 0.98 at the coast and ~0.70 at the other two stations. These ratios are similar to those found in other oligotrophic marine environments. The BHP/PP ratio was 0.83 at the coast, 0.36 at the slope and 0.09 at the open sea station. The last

  10. Responses of spatial-temporal dynamics of bacterioplankton community to large-scale reservoir operation: a case study in the Three Gorges Reservoir, China

    PubMed Central

    Li, Zhe; Lu, Lunhui; Guo, Jinsong; Yang, Jixiang; Zhang, Jiachao; He, Bin; Xu, Linlin

    2017-01-01

    Large rivers are commonly regulated by damming, yet the effects of such disruption on bacterioplankton community structures have not been adequately studied. The aim of this study was to explore the biogeographical patterns present under dam regulation and to uncover the major drivers structuring bacterioplankton communities. Bacterioplankton assemblages in the Three Gorges Reservoir (TGR) were analyzed using Illumina MiSeq sequencing by comparing seven sites located within the TGR before and after impoundment. This approach revealed ecological and spatial-temporal variations in bacterioplankton community composition along the longitudinal axis. The community was dynamic and dominated by the phyla Proteobacteria and Actinobacteria, which accounted for 39.26% and 37.14% of all sequences, respectively, followed by Bacteroidetes (8.67%) and Cyanobacteria (3.90%). The Shannon-Wiener index of the bacterioplankton community in the flood season (August) was generally higher than that in the impoundment season (November). Principal Component Analysis of the bacterioplankton community compositions showed separation between different seasons and sampling sites. Analysis of the relationship between bacterioplankton community compositions and environmental variables highlighted that ecological processes of element cycling and large dam disturbances are of prime importance in driving the assemblages of riverine bacterioplankton communities. PMID:28211884
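
    For readers unfamiliar with the Shannon-Wiener index mentioned above, the following minimal sketch computes H' = -sum(p_i * ln p_i) from a list of abundances. The function name is hypothetical. The example input reuses the four phylum percentages quoted in the abstract and lumps the remainder into a single "other" class, which is an assumption made purely for illustration; the study itself presumably computed the index at a finer taxonomic resolution, so the value printed here is not comparable to its results.

```python
import math

def shannon_index(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln(p_i)) over non-zero classes."""
    total = sum(counts)
    h = 0.0
    for count in counts:
        if count > 0:                 # zero-abundance classes contribute nothing
            p = count / total
            h -= p * math.log(p)
    return h

# Phylum-level percentages from the abstract, plus an assumed "other" remainder.
named = [39.26, 37.14, 8.67, 3.90]
phylum_shares = named + [100.0 - sum(named)]
print(shannon_index(phylum_shares))
```

    The index grows both with the number of classes and with the evenness of their abundances, which is why a more even flood-season community can score higher than an impoundment-season community with the same taxa.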

  11. Real-Time Pathogen Detection in the Era of Whole-Genome Sequencing and Big Data: Comparison of k-mer and Site-Based Methods for Inferring the Genetic Distances among Tens of Thousands of Salmonella Samples

    PubMed Central

    Pettengill, James B.; Pightling, Arthur W.; Baugher, Joseph D.; Rand, Hugh; Strain, Errol

    2016-01-01

    The adoption of whole-genome sequencing within the public health realm for molecular characterization of bacterial pathogens has been followed by an increased emphasis on real-time detection of emerging outbreaks (e.g., food-borne Salmonellosis). In turn, large databases of whole-genome sequence data are being populated. These databases currently contain tens of thousands of samples and are expected to grow to hundreds of thousands within a few years. For these databases to be of optimal use one must be able to quickly interrogate them to accurately determine the genetic distances among a set of samples. Being able to do so is challenging due to both biological (evolutionarily diverse samples) and computational (petabytes of sequence data) issues. We evaluated seven measures of genetic distance, which were estimated from either k-mer profiles (Jaccard, Euclidean, Manhattan, Mash Jaccard, and Mash distances) or nucleotide sites (NUCmer and an extended multi-locus sequence typing (MLST) scheme). When analyzing empirical data (whole-genome sequence data from 18,997 Salmonella isolates) there are features (e.g., genomic, assembly, and contamination) that cause distances inferred from k-mer profiles, which treat absent data as informative, to fail to accurately capture the distance between samples when compared to distances inferred from differences in nucleotide sites. Thus, site-based distances, like NUCmer and extended MLST, are superior in performance, but accessing the computing resources necessary to perform them may be challenging when analyzing large databases. PMID:27832109
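
    The k-mer-based measures named above can be sketched in a few lines. The code below builds exact k-mer sets, computes a Jaccard distance, and converts a Jaccard index into a Mash-style distance using the published formula D = -(1/k) * ln(2j / (1 + j)). All function names and sequences are hypothetical. Two simplifications are assumed: the Jaccard index is computed exactly rather than estimated from MinHash sketches as the Mash tool does, and reverse-complement (canonical) k-mers are ignored.

```python
import math

def kmer_set(seq, k):
    """Return the set of all overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B| for two k-mer sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def mash_distance(jaccard_index, k):
    """Mash-style distance from a Jaccard index j: -(1/k) * ln(2j / (1 + j))."""
    if jaccard_index <= 0.0:
        return 1.0                     # no shared k-mers: maximal distance
    return -math.log(2.0 * jaccard_index / (1.0 + jaccard_index)) / k

# A "complete" toy sequence and a truncated copy with no nucleotide differences.
full = "ACGTACGTGACC"
partial = full[:8]
d = jaccard_distance(kmer_set(full, 4), kmer_set(partial, 4))
print(d)                               # 0.5
print(mash_distance(1.0 - d, 4))
```

    The truncated copy differs from the full sequence at no aligned site, yet its k-mer distance is well above zero. This is a toy version of the effect the abstract describes, where missing assembly data is treated as informative by k-mer profiles.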

  13. Linking the composition of bacterioplankton to rapid turnover of dissolved dimethylsulphoniopropionate in an algal bloom in the North Sea.

    PubMed

    Zubkov, M V; Fuchs, B M; Archer, S D; Kiene, R P; Amann, R; Burkill, P H

    2001-05-01

    The algal osmolyte, dimethylsulphoniopropionate (DMSP), is abundant in the surface oceans and is the major precursor of dimethyl sulphide (DMS), a gas involved in global climate regulation. Here, we report results from an in situ Lagrangian study that suggests a link between the microbially driven fluxes of dissolved DMSP (DMSPd) and specific members of the bacterioplankton community in a North Sea coccolithophore bloom. The bacterial population in the bloom was dominated by a single species related to the genus Roseobacter, which accounted for 24% of the bacterioplankton numbers and up to 50% of the biomass. The abundance of the Roseobacter cells showed significant paired correlation with DMSPd consumption and bacterioplankton production, whereas abundances of other bacteria did not. Consumed DMSPd (28 nM day(-1)) contributed 95% of the sulphur and up to 15% of the carbon demand of the total bacterial populations, suggesting the importance of DMSP as a substrate for the Roseobacter-dominated bacterioplankton. In dominating DMSPd flux, the Roseobacter species may exert a major control on DMS production. DMSPd turnover rate was 10 times that of DMS (2.7 nM day(-1)), indicating that DMSPd was probably the major source of DMS, but that most of the DMSPd was metabolized without DMS production. Our study suggests that single species of bacterioplankton may at times be important in metabolizing DMSP and regulating the generation of DMS in the sea.

  14. Inferring Horizontal Gene Transfer

    PubMed Central

    Lassalle, Florent; Dessimoz, Christophe

    2015-01-01

    Horizontal or Lateral Gene Transfer (HGT or LGT) is the transmission of portions of genomic DNA between organisms through a process decoupled from vertical inheritance. In the presence of HGT events, different fragments of the genome are the result of different evolutionary histories. This can therefore complicate the investigations of evolutionary relatedness of lineages and species. Also, as HGT can bring into genomes radically different genotypes from distant lineages, or even new genes bearing new functions, it is a major source of phenotypic innovation and a mechanism of niche adaptation. For example, of particular relevance to human health is the lateral transfer of antibiotic resistance and pathogenicity determinants, leading to the emergence of pathogenic lineages [1]. Computational identification of HGT events relies upon the investigation of sequence composition or evolutionary history of genes. Sequence composition-based ("parametric") methods search for deviations from the genomic average, whereas evolutionary history-based ("phylogenetic") approaches identify genes whose evolutionary history significantly differs from that of the host species. The evaluation and benchmarking of HGT inference methods typically rely upon simulated genomes, for which the true history is known. On real data, different methods tend to infer different HGT events, and as a result it can be difficult to ascertain all but simple and clear-cut HGT events. PMID:26020646
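
    A toy version of a composition-based ("parametric") screen is sketched below: it computes G+C content per gene and flags genes whose value deviates strongly from the average. The function names, the z-score cutoff of 2.0, and the sequences are all hypothetical, and the "genomic average" is approximated here by the unweighted mean over genes. Practical parametric methods use richer signatures (for example codon usage or oligonucleotide frequencies) and proper statistical testing; this is only meant to show the basic idea.

```python
import statistics

def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def flag_atypical_genes(genes, z_cutoff=2.0):
    """Return names of genes whose G+C content is an outlier by z-score.

    genes maps a gene name to its nucleotide sequence.
    """
    gc = {name: gc_content(seq) for name, seq in genes.items()}
    values = list(gc.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []                      # identical composition everywhere
    return [name for name, v in gc.items() if abs(v - mean) / sd > z_cutoff]

# Seven toy "native" genes at 50% G+C and one AT-rich candidate island.
toy_genome = {f"gene{i}": "ATGC" * 5 for i in range(7)}
toy_genome["island"] = "AT" * 9 + "GC"
print(flag_atypical_genes(toy_genome))
```

    A screen like this can only detect transfers from donors with a composition different from the recipient, and recent ones at that, which is one reason the abstract notes that different methods tend to infer different HGT events.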

  15. Perceptual inference.

    PubMed

    Aggelopoulos, Nikolaos C

    2015-08-01

    Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory and the invariance in the responses of some neurons to variations in the stimulus, as well as by situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience.

  16. Statistical Inference

    NASA Astrophysics Data System (ADS)

    Khan, Shahjahan

    Scientific information on data-generating processes is often presented in the form of numerical and categorical data. Except on very rare occasions, such data represent only a small part of the population, or selected outcomes of a data-generating process. Although valuable and useful information lurks in the array of scientific data, it is generally not directly available to users. Appropriate statistical methods are essential to reveal the hidden "jewels" in the mess of the raw data. Exploratory data analysis methods are used to uncover such valuable characteristics of the observed data. Statistical inference provides techniques for drawing valid conclusions about the unknown characteristics or parameters of the population from which scientifically drawn sample data are selected. Usually, statistical inference includes estimation of population parameters as well as tests of hypotheses on the parameters. However, prediction of future responses and determination of the prediction distributions are also part of statistical inference. Both Classical (Frequentist) and Bayesian approaches are used in statistical inference. The commonly used Classical approach is based on the sample data alone. In contrast, the increasingly popular Bayesian approach uses a prior distribution on the parameters along with the sample data to make inferences. Non-parametric and robust methods are also used in situations where commonly used model assumptions are unsupported. In this chapter, we cover the philosophical and methodological aspects of both the Classical and Bayesian approaches. Moreover, some aspects of predictive inference are also included. In the absence of any evidence to support assumptions regarding the distribution of the underlying population, or if the variable is measured only on an ordinal scale, non-parametric methods are used. Robust methods are employed to avoid any significant changes in the results due to deviations from the model
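
    The contrast between the Classical and Bayesian approaches can be made concrete with a standard textbook case that is not taken from the chapter itself: estimating a binomial proportion. The sketch below computes the Classical maximum-likelihood estimate from the sample alone, and the Bayesian posterior under a Beta prior, which by conjugacy is again a Beta distribution. The function names and the uniform Beta(1, 1) default prior are illustrative assumptions.

```python
def classical_estimate(successes, trials):
    """Maximum-likelihood estimate of a binomial proportion (sample data only)."""
    return successes / trials

def bayesian_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior Beta parameters and posterior mean under a Beta(prior_a, prior_b) prior.

    Conjugacy gives Beta(prior_a + successes, prior_b + failures).
    """
    post_a = prior_a + successes
    post_b = prior_b + (trials - successes)
    return post_a, post_b, post_a / (post_a + post_b)

# Toy data: 7 successes in 10 trials.
print(classical_estimate(7, 10))
print(bayesian_posterior(7, 10))
```

    With few observations the prior pulls the Bayesian estimate toward the prior mean; as the number of trials grows, the two estimates converge.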

  18. Diversity and genomics of Antarctic marine micro-organisms.

    PubMed

    Murray, Alison E; Grzymski, Joseph J

    2007-12-29

    Marine bacterioplankton are thought to play a vital role in Southern Ocean ecology and ecosystem function, as they do in other ocean systems. However, our understanding of phylogenetic diversity, genome-enabled capabilities and specific adaptations to this persistently cold environment is limited. Bacterioplankton community composition shifts significantly over the annual cycle as sea ice melts and phytoplankton bloom. Microbial diversity in sea ice is better known than that of the plankton, where culture collections do not appear to represent organisms detected with molecular surveys. Broad phylogenetic groupings of Antarctic bacterioplankton such as the marine group I Crenarchaeota, alpha-Proteobacteria (Roseobacter-related and SAR11 clusters), gamma-Proteobacteria (both cultivated and uncultivated groups) and Bacteroidetes-affiliated organisms in Southern Ocean waters are shared with other ocean systems. Antarctic SSU rRNA gene phylotypes are typically affiliated with other polar sequences. Some species such as Polaribacter irgensii and currently uncultivated gamma-Proteobacteria (Ant4D3 and Ant10A4) may flourish in Antarctic waters, though further studies are needed to address diversity on a larger scale. Insights from initial genomics studies on both cultivated organisms and genomes accessed through shotgun cloning of environmental samples suggest that there are many unique features of these organisms that facilitate survival in high-latitude, persistently cold environments.

  19. Snowmelt-driven changes in dissolved organic matter and bacterioplankton communities in the Heilongjiang watershed of China.

    PubMed

    Qiu, Linlin; Cui, Hongyang; Wu, Junqiu; Wang, Baijie; Zhao, Yue; Li, Jiming; Jia, Liming; Wei, Zimin

    2016-06-15

    Bacterioplankton plays a significant role in the circulation of materials and ecosystem function in the biosphere. Dissolved organic matter (DOM) from dead plant material and surface soil leaches into water bodies when snow melts. In our study, water samples from nine sampling sites along the Heilongjiang watershed were collected in February and June 2014 during which period snowmelt occurred. The goal of this study was to characterize changes in DOM and bacterioplankton community composition (BCC) associated with snowmelt, the effects of DOM, environmental and geographical factors on the distribution of BCC and interactions of aquatic bacterioplankton populations with different sources of DOM in the Heilongjiang watershed. BCC was measured by denaturing gradient gel electrophoresis (DGGE). DOM was measured by excitation-emission matrix (EEM) fluorescence spectroscopy. Bacterioplankton exhibited a distinct seasonal change in community composition due to snowmelt at all sampling points except for EG. Redundancy analysis (RDA) indicated that BCC was more closely related to DOM (Components 1 and 4, dissolved organic carbon, biochemical oxygen demand and chlorophyll a) and environmental factors (water temperature and nitrate nitrogen) than geographical factors. Furthermore, DOM had a greater impact on BCC than environmental factors (29.80 vs. 15.90% of the variation). Overall, spring snowmelt played an important role in altering the quality and quantity of DOM and BCC in the Heilongjiang watershed.

  20. Stimulated bacterioplankton growth and selection for certain bacterial taxa in the vicinity of the ctenophore Mnemiopsis leidyi.

    PubMed

    Dinasquet, Julie; Granhag, Lena; Riemann, Lasse

    2012-01-01

    Episodic blooms of voracious gelatinous zooplankton, such as the ctenophore Mnemiopsis leidyi, affect pools of inorganic nutrients and dissolved organic carbon by intensive grazing activities and mucus release. This will potentially influence bacterioplankton activity and community composition, at least at local scales; however, available studies on this are scarce. In the present study we examined effects of M. leidyi on bacterioplankton growth and composition in incubation experiments. Moreover, we examined community composition of bacteria associated with the surface and gut of M. leidyi. High release of ammonium and high bacterial growth were observed in the treatments with M. leidyi relative to controls. Deep 454 pyrosequencing of 16S rRNA genes showed specific bacterial communities in treatments with M. leidyi as well as specific communities associated with M. leidyi tissue and gut. In particular, members of Flavobacteriaceae were associated with M. leidyi. Our study shows that M. leidyi influences bacterioplankton activity and community composition in the vicinity of the jellyfish. In particular during temporary aggregations of jellyfish, these local zones of high bacterial growth may contribute significantly to the spatial heterogeneity of bacterioplankton activity and community composition in the sea.

  1. Diel fluctuations in the abundance and community diversity of coastal bacterioplankton assemblages over a tidal cycle.

    PubMed

    Olapade, Ola A

    2012-01-01

    The diel changes in abundance and community diversity of the bacterioplankton assemblages within the Pacific Ocean at a fixed location in Monterey Bay, California (USA) were examined over a tidal cycle with several culture-independent approaches (i.e., nucleic acid staining, fluorescence in situ hybridization [FISH], and 16S ribosomal RNA gene libraries). FISH analyses revealed the quantitative predominance of bacterial members belonging to the Cytophaga-Flavobacterium cluster as well as two Proteobacteria (α- and γ-) subclasses within the bacterioplankton assemblages, more so during high tide (HT) and outgoing tide (OT) than during the other tidal events. Although the clone libraries showed that the majority of the sequences were similar to the 16S rRNA gene sequences of unknown bacteria (32% to 73%), operational taxonomic units from members of the α-Proteobacteria, Bacteroidetes, Firmicutes, and Cyanobacteria were also well represented during the four tidal events examined. Comparatively, sequence diversity was highest in OT, lowest in low tide, and very similar between HT and incoming tide. The results indicate that the dynamics of bacterial occurrence and diversity were more pronounced during HT and OT, which points to the ecological importance of several environmental variables, including temperature, light intensity, and nutrient availability, that fluctuate concurrently during these tidal events in marine systems.

  2. Metagenomic identification of bacterioplankton taxa and pathways involved in microcystin degradation in Lake Erie.

    PubMed

    Mou, Xiaozhen; Lu, Xinxin; Jacob, Jisha; Sun, Shulei; Heath, Robert

    2013-01-01

    Cyanobacterial harmful blooms (CyanoHABs) that produce microcystins are appearing in an increasing number of freshwater ecosystems worldwide, damaging quality of water for use by human and aquatic life. Heterotrophic bacteria assemblages are thought to be important in transforming and detoxifying microcystins in natural environments. However, little is known about their taxonomic composition or pathways involved in the process. To address this knowledge gap, we compared the metagenomes of Lake Erie free-living bacterioplankton assemblages in laboratory microcosms amended with microcystins relative to unamended controls. A diverse array of bacterial phyla were responsive to elevated supply of microcystins, including Acidobacteria, Actinobacteria, Bacteroidetes, Planctomycetes, Proteobacteria of the alpha, beta, gamma, delta and epsilon subdivisions and Verrucomicrobia. At more detailed taxonomic levels, Methylophilales (mainly in genus Methylotenera) and Burkholderiales (mainly in genera Bordetella, Burkholderia, Cupriavidus, Polaromonas, Ralstonia, Polynucleobacter and Variovorax) of Betaproteobacteria were suggested to be more important in microcystin degradation than Sphingomonadales of Alphaproteobacteria. The latter taxa were previously thought to be major microcystin degraders. Homologs to known microcystin-degrading genes (mlr) were not overrepresented in microcystin-amended metagenomes, indicating that Lake Erie bacterioplankton might employ alternative genes and/or pathways in microcystin degradation. Genes for xenobiotic metabolism were overrepresented in microcystin-amended microcosms, suggesting they are important in bacterial degradation of microcystin, a phenomenon that has been identified previously only in eukaryotic systems.

  3. Recruitment of Members from the Rare Biosphere of Marine Bacterioplankton Communities after an Environmental Disturbance

    PubMed Central

    Sjöstedt, Johanna; Koch-Schmidt, Per; Pontarp, Mikael; Canbäck, Björn; Tunlid, Anders; Lundberg, Per; Hagström, Åke

    2012-01-01

    A bacterial community may be resistant to environmental disturbances if some of its species show metabolic flexibility and physiological tolerance to the changing conditions. Alternatively, disturbances can change the composition of the community and thereby potentially affect ecosystem processes. The impact of disturbance on the composition of bacterioplankton communities was examined in continuous seawater cultures. Bacterial assemblages from geographically closely connected areas, the Baltic Sea (salinity 7 and high dissolved organic carbon [DOC]) and Skagerrak (salinity 28 and low DOC), were exposed to gradual opposing changes in salinity and DOC over a 3-week period such that the Baltic community was exposed to Skagerrak salinity and DOC and vice versa. Denaturing gradient gel electrophoresis and clone libraries of PCR-amplified 16S rRNA genes showed that the composition of the transplanted communities differed significantly from those held at constant salinity. Despite this, the growth yields (number of cells ml−1) were similar, which suggests similar levels of substrate utilization. Deep 454 pyrosequencing of 16S rRNA genes showed that the composition of the disturbed communities had changed due to the recruitment of phylotypes present in the rare biosphere of the original community. The study shows that members of the rare biosphere can become abundant in a bacterioplankton community after disturbance and that those bacteria can have important roles in maintaining ecosystem processes. PMID:22194288

  4. Response of rare, common and abundant bacterioplankton to anthropogenic perturbations in a Mediterranean coastal site.

    PubMed

    Baltar, Federico; Palovaara, Joakim; Vila-Costa, Maria; Salazar, Guillem; Calvo, Eva; Pelejero, Carles; Marrasé, Cèlia; Gasol, Josep M; Pinhassi, Jarone

    2015-06-01

    Bacterioplankton communities are made up of a small set of abundant taxa and a large number of low-abundant organisms (i.e. 'rare biosphere'). Despite the critical role played by bacteria in marine ecosystems, it remains unknown how this large diversity of organisms are affected by human-induced perturbations, or what controls the responsiveness of rare compared to abundant bacteria. We studied the response of a Mediterranean bacterioplankton community to two anthropogenic perturbations (i.e. nutrient enrichment and/or acidification) in two mesocosm experiments (in winter and summer). Nutrient enrichment increased the relative abundance of some operational taxonomic units (OTUs), e.g. Polaribacter, Tenacibaculum, Rhodobacteraceae and caused a relative decrease in others (e.g. Croceibacter). Interestingly, a synergistic effect of acidification and nutrient enrichment was observed on specific OTUs (e.g. SAR86). We analyzed the OTUs that became abundant at the end of the experiments and whether they belonged to the rare (<0.1% of relative abundance), the common (0.1-1.0% of relative abundance) or the abundant (>1% relative abundance) fractions. Most of the abundant OTUs at the end of the experiments were abundant, or at least common, in the original community of both experiments, suggesting that ecosystem alterations do not necessarily call for rare members to grow.
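
    The rare/common/abundant thresholds quoted in this abstract (<0.1%, 0.1-1.0% and >1% of relative abundance) can be illustrated with a minimal sketch. The function name, the OTU labels and the read counts below are hypothetical and are not taken from the study; the sketch also assumes that relative abundance is computed as a share of total reads in one sample and that the 1.0% boundary falls in the common class.

```python
from typing import Dict


def classify_otus(counts: Dict[str, int]) -> Dict[str, str]:
    """Label each OTU as rare, common or abundant by relative abundance.

    Thresholds follow the abstract: rare < 0.1%, common 0.1-1.0%,
    abundant > 1%. Input counts are hypothetical read counts per OTU.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one read")

    labels: Dict[str, str] = {}
    for otu, n in counts.items():
        pct = 100.0 * n / total  # relative abundance in percent
        if pct < 0.1:
            labels[otu] = "rare"
        elif pct <= 1.0:
            labels[otu] = "common"
        else:
            labels[otu] = "abundant"
    return labels


# Made-up example sample with 10,000 reads in total.
example = {"OTU_a": 9500, "OTU_b": 450, "OTU_c": 45, "OTU_d": 5}
labels = classify_otus(example)
```

    Applying the same function to the start and end samples of an experiment would show which fraction the finally abundant OTUs came from, which is the comparison the abstract describes.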

  5. Spatial variability overwhelms seasonal patterns in bacterioplankton communities across a river to ocean gradient

    PubMed Central

    Fortunato, Caroline S; Herfort, Lydie; Zuber, Peter; Baptista, Antonio M; Crump, Byron C

    2012-01-01

    Few studies of microbial biogeography address variability across both multiple habitats and multiple seasons. Here we examine the spatial and temporal variability of bacterioplankton community composition of the Columbia River coastal margin using 16S amplicon pyrosequencing of 300 water samples collected in 2007 and 2008. Communities separated into seven groups (ANOSIM, P<0.001): river, estuary, plume, epipelagic, mesopelagic, shelf bottom (depth<350 m) and slope bottom (depth>850 m). The ordination of these samples was correlated with salinity (ρ=−0.83) and depth (ρ=−0.62). Temporal patterns were obscured by spatial variability among the coastal environments, and could only be detected within individual groups. Thus, structuring environmental factors (for example, salinity, depth) dominate over seasonal changes in determining community composition. Seasonal variability was detected across an annual cycle in the river, estuary and plume where communities separated into two groups, early year (April–July) and late year (August–Nov), demonstrating annual reassembly of communities over time. Determining both the spatial and temporal variability of bacterioplankton communities provides a framework for modeling these communities across environmental gradients from river to deep ocean. PMID:22011718

  6. Response of marine bacterioplankton pH homeostasis gene expression to elevated CO2

    NASA Astrophysics Data System (ADS)

    Bunse, Carina; Lundin, Daniel; Karlsson, Christofer M. G.; Akram, Neelam; Vila-Costa, Maria; Palovaara, Joakim; Svensson, Lovisa; Holmfeldt, Karin; González, José M.; Calvo, Eva; Pelejero, Carles; Marrasé, Cèlia; Dopson, Mark; Gasol, Josep M.; Pinhassi, Jarone

    2016-05-01

    Human-induced ocean acidification impacts marine life. Marine bacteria are major drivers of biogeochemical nutrient cycles and energy fluxes; hence, understanding their performance under projected climate change scenarios is crucial for assessing ecosystem functioning. Whereas genetic and physiological responses of phytoplankton to ocean acidification are being disentangled, corresponding functional responses of bacterioplankton to pH reduction from elevated CO2 are essentially unknown. Here we show, from metatranscriptome analyses of a phytoplankton bloom mesocosm experiment, that marine bacteria responded to lowered pH by enhancing the expression of genes encoding proton pumps, such as respiration complexes, proteorhodopsin and membrane transporters. Moreover, taxonomic transcript analysis showed that distinct bacterial groups expressed different pH homeostasis genes in response to elevated CO2. These responses were substantial for numerous pH homeostasis genes under low-chlorophyll conditions (chlorophyll a <2.5 μg l−1); however, the changes in gene expression under high-chlorophyll conditions (chlorophyll a >20 μg l−1) were low. Given that proton expulsion through pH homeostasis mechanisms is energetically costly, these findings suggest that bacterioplankton adaptation to ocean acidification could have long-term effects on the economy of ocean ecosystems.

  7. Evidence of bacterioplankton community adaptation in response to long-term mariculture disturbance.

    PubMed

    Xiong, Jinbo; Chen, Heping; Hu, Changju; Ye, Xiansen; Kong, Dingjiang; Zhang, Demin

    2015-10-16

    Understanding the underlying mechanisms that shape the temporal dynamics of a microbial community has important implications for predicting the trajectory of an ecosystem's response to anthropogenic disturbances. Here, we evaluated the seasonal dynamics of bacterioplankton community composition (BCC) following more than three decades of mariculture disturbance in Xiangshan Bay. Clear seasonal succession and site (fish farm and control site) separation of the BCC were observed, which were primarily shaped by temperature, dissolved oxygen and sampling time. However, the sensitive bacterial families consistently changed in relative abundance in response to mariculture disturbance, regardless of the season. Temporal changes in the BCC followed the time-decay for similarity relationship at both sites. Notably, mariculture disturbance significantly (P < 0.001) flattened the temporal turnover but intensified bacterial species-to-species interactions. The decrease in bacterial temporal turnover under long-term mariculture disturbance was coupled with a consistent increase in the percentage of deterministic processes that constrained bacterial assembly based on a null model analysis. The results demonstrate that the BCC is sensitive to mariculture disturbance; however, a bacterioplankton community could adapt to a long-term disturbance via attenuating temporal turnover and intensifying species-species interactions. These findings expand our current understanding of microbial assembly in response to long-term anthropogenic disturbances.
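
    The "time-decay for similarity relationship" mentioned in this abstract is commonly fitted as a log-log regression of community similarity against the time lag between samples, with the slope read as the temporal turnover rate. The sketch below assumes that form; the abstract does not state the exact model the authors used. The function name and the synthetic data are hypothetical, and a slope closer to zero corresponds to the "flattened" turnover the abstract reports.

```python
import math
from typing import Sequence, Tuple


def time_decay_slope(
    lags: Sequence[float], similarities: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of ln(similarity) = intercept + slope * ln(lag).

    Returns (slope, intercept). Lags and similarities must be positive.
    """
    if len(lags) != len(similarities) or len(lags) < 2:
        raise ValueError("need two or more paired observations")

    xs = [math.log(t) for t in lags]
    ys = [math.log(s) for s in similarities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("time lags must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, noise-free data generated with a known slope of -0.2.
lags_days = [1.0, 2.0, 4.0, 8.0]
sims = [math.exp(-0.5) * t ** -0.2 for t in lags_days]
slope, intercept = time_decay_slope(lags_days, sims)
```

    Fitting the two sites separately and comparing the slopes would give the kind of contrast the abstract describes between the fish farm and the control site.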

  8. Identification of Associations between Bacterioplankton and Photosynthetic Picoeukaryotes in Coastal Waters

    PubMed Central

    Farnelid, Hanna M.; Turk-Kubo, Kendra A.; Zehr, Jonathan P.

    2016-01-01

    Photosynthetic picoeukaryotes are significant contributors to marine primary productivity. Associations between marine bacterioplankton and picoeukaryotes frequently occur and can have large biogeochemical impacts. We used flow cytometry to sort cells from seawater to identify non-eukaryotic phylotypes that are associated with photosynthetic picoeukaryotes. Samples were collected at the Santa Cruz wharf on Monterey Bay, CA, USA during summer and fall, 2014. The phylogeny of associated microbes was assessed through 16S rRNA gene amplicon clone and Illumina MiSeq libraries. The most frequently detected bacterioplankton phyla within the photosynthetic picoeukaryote sorts were Proteobacteria (Alphaproteobacteria and Gammaproteobacteria) and Bacteroidetes. Intriguingly, the presence of free-living bacterial genera in the photosynthetic picoeukaryote sorts could suggest that some of the photosynthetic picoeukaryotes were mixotrophs. However, the occurrence of bacterial sequences that were not prevalent in the corresponding bulk seawater samples indicates that there was also a selection for specific OTUs in association with photosynthetic picoeukaryotes, suggesting specific functional associations. The results show that diverse bacterial phylotypes are found in association with photosynthetic picoeukaryotes. Taxonomic identification of these associations is a prerequisite for further characterizing them and elucidating their metabolic pathways and ecological functions. PMID:27148165

  9. Phylotype Dynamics of Bacterial P Utilization Genes in Microbialites and Bacterioplankton of a Monomictic Endorheic Lake.

    PubMed

    Valdespino-Castillo, Patricia M; Alcántara-Hernández, Rocío J; Merino-Ibarra, Martín; Alcocer, Javier; Macek, Miroslav; Moreno-Guillén, Octavio A; Falcón, Luisa I

    2017-02-01

    Microbes can modulate ecosystem function since they harbor a vast genetic potential for biogeochemical cycling. The spatial and temporal dynamics of this genetic diversity should be acknowledged to establish a link between ecosystem function and community structure. In this study, we analyzed the genetic diversity of bacterial phosphorus utilization genes in two microbial assemblages, microbialites and bacterioplankton of Lake Alchichica, a semiclosed (i.e., endorheic) system with marked seasonality that varies in nutrient conditions, temperature, dissolved oxygen, and water column stability. We focused on dissolved organic phosphorus (DOP) utilization gene dynamics during contrasting mixing and stratification periods. Bacterial alkaline phosphatases (phoX and phoD) and alkaline beta-propeller phytases (bpp) were surveyed. DOP utilization genes showed different dynamics evidenced by a marked change within an intra-annual period and a differential circadian pattern of expression. Although Lake Alchichica is a semiclosed system, this dynamic turnover of phylotypes (from lake circulation to stratification) points to a different potential of DOP utilization by the microbial communities within periods. DOP utilization gene dynamics was different among genetic markers and among assemblages (microbialite vs. bacterioplankton). As estimated by the system's P mass balance, P inputs and outputs were similar in magnitude (difference was <10 %). A theoretical estimation of water column P monoesters was used to calculate the potential P fraction that can be remineralized on an annual basis. Overall, bacterial groups including Proteobacteria (Alpha and Gamma) and Bacteroidetes seem to be key participants in DOP utilization responses.

  10. Phytoplankton, bacterioplankton and virioplankton structure and function across the southern Great Barrier Reef shelf

    NASA Astrophysics Data System (ADS)

    Alongi, Daniel M.; Patten, Nicole L.; McKinnon, David; Köstner, Nicole; Bourne, David G.; Brinkman, Richard

    2015-02-01

    Bacterioplankton and phytoplankton dynamics, pelagic respiration, virioplankton abundance, and the diversity of pelagic diazotrophs and other bacteria were examined in relation to water-column nutrients and vertical mixing across the southern Great Barrier Reef (GBR) shelf where sharp inshore to offshore gradients in water chemistry and hydrology prevail. A principal component analysis (PCA) revealed station groups clustered geographically, suggesting across-shelf differences in plankton function and structure driven by changes in mixing intensity, sediment resuspension, and the relative contributions of terrestrial, reef and oceanic nutrients. At most stations and sampling periods, microbial abundance and activities peaked both inshore and at channels between outer shelf reefs of the Pompey Reef complex. PCA also revealed that virioplankton numbers and biomass correlated with bacterioplankton numbers and production, and that bacterial growth and respiration correlated with net primary production, suggesting close virus-bacteria-phytoplankton interactions; all plankton groups correlated with particulate C, N, and P. Strong vertical mixing facilitates tight coupling of pelagic and benthic shelf processes as, on average, 37% and 56% of N and P demands of phytoplankton are derived from benthic nutrient regeneration and resuspension. These across-shelf planktonic trends mirror those of the benthic microbial community.

  11. Short-Term Dynamics of North Sea Bacterioplankton-Dissolved Organic Matter Coherence on Molecular Level

    PubMed Central

    Lucas, Judith; Koester, Irina; Wichels, Antje; Niggemann, Jutta; Dittmar, Thorsten; Callies, Ulrich; Wiltshire, Karen H.; Gerdts, Gunnar

    2016-01-01

    Remineralization and transformation of dissolved organic matter (DOM) by marine microbes shape the DOM composition and thus, have large impact on global carbon and nutrient cycling. However, information on bacterioplankton-DOM interactions on a molecular level is limited. We examined the variation of bacterial community composition (BCC) at Helgoland Roads (North Sea) in relation to variation of molecular DOM composition and various environmental parameters on short-time scales. Surface water samples were taken daily over a period of 20 days. Bacterial community and molecular DOM composition were assessed via 16S rRNA gene tag sequencing and ultrahigh resolution Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), respectively. Environmental conditions were driven by a coastal water influx during the first half of the sampling period and the onset of a summer phytoplankton bloom toward the end of the sampling period. These phenomena led to a distinct grouping of bacterial communities and DOM composition which was particularly influenced by total dissolved nitrogen (TDN) concentration, temperature, and salinity, as revealed by distance-based linear regression analyses. Bacterioplankton-DOM interaction was demonstrated in strong correlations between specific bacterial taxa and particular DOM molecules, thus, suggesting potential specialization on particular substrates. We propose that a combination of high resolution techniques, as used in this study, may provide substantial information on substrate generalists and specialists and thus, contribute to prediction of BCC variation. PMID:27014241

  12. Impact of warming on phyto-bacterioplankton coupling and bacterial community composition in experimental mesocosms.

    PubMed

    von Scheibner, Markus; Dörge, Petra; Biermann, Antje; Sommer, Ulrich; Hoppe, Hans-Georg; Jürgens, Klaus

    2014-03-01

    Global warming is assumed to alter the trophic interactions and carbon flow patterns of aquatic food webs. The impact of temperature on phyto-bacterioplankton coupling and bacterial community composition (BCC) was the focus of the present study, in which an indoor mesocosm experiment with natural plankton communities from the western Baltic Sea was conducted. A 6 °C increase in water temperature resulted, as predicted, in tighter coupling between the diatom-dominated phytoplankton and heterotrophic bacteria, accompanied by a strong increase in carbon flow into bacterioplankton during the phytoplankton bloom phase. Suppressed bacterial development at cold in situ temperatures probably reflected lowered bacterial production and grazing by protists, as the latter were less affected by low temperatures. BCC was strongly influenced by the phytoplankton bloom stage and to a lesser extent by temperature. Under both temperature regimes, Gammaproteobacteria clearly dominated during the phytoplankton peak, with Glaciecola sp. as the single most abundant taxon. However, warming induced the appearance of additional bacterial taxa belonging to Betaproteobacteria and Bacteroidetes. Our results show that warming during an early phytoplankton bloom causes a shift towards a more heterotrophic system, with the appearance of new bacterial taxa suggesting a potential for utilization of a broader substrate spectrum.

  13. Evidence of bacterioplankton community adaptation in response to long-term mariculture disturbance

    PubMed Central

    Xiong, Jinbo; Chen, Heping; Hu, Changju; Ye, Xiansen; Kong, Dingjiang; Zhang, Demin

    2015-01-01

    Understanding the underlying mechanisms that shape the temporal dynamics of a microbial community has important implications for predicting the trajectory of an ecosystem’s response to anthropogenic disturbances. Here, we evaluated the seasonal dynamics of bacterioplankton community composition (BCC) following more than three decades of mariculture disturbance in Xiangshan Bay. Clear seasonal succession and site (fish farm and control site) separation of the BCC were observed, which were primarily shaped by temperature, dissolved oxygen and sampling time. However, the sensitive bacterial families consistently changed in relative abundance in response to mariculture disturbance, regardless of the season. Temporal changes in the BCC followed the time-decay for similarity relationship at both sites. Notably, mariculture disturbance significantly (P < 0.001) flattened the temporal turnover but intensified bacterial species-to-species interactions. The decrease in bacterial temporal turnover under long-term mariculture disturbance was coupled with a consistent increase in the percentage of deterministic processes that constrained bacterial assembly based on a null model analysis. The results demonstrate that the BCC is sensitive to mariculture disturbance; however, a bacterioplankton community could adapt to a long-term disturbance via attenuating temporal turnover and intensifying species-species interactions. These findings expand our current understanding of microbial assembly in response to long-term anthropogenic disturbances. PMID:26471739

  14. Influence of macrophyte decomposition on growth rate and community structure of Okefenokee Swamp bacterioplankton

    SciTech Connect

    Murray, R.E.; Hodson, R.E.

    1986-02-01

    Dissolved substances released during decomposition of the white water lily (Nymphaea odorata) can alter the growth rate of Okefenokee Swamp bacterioplankton. In microcosm experiments dissolved compounds released from Nymphaea initially inhibited bacterioplankton, followed by a period of intense bacterial growth. Rates of [3H]thymidine incorporation and turnover of dissolved D-glucose were depressed by over 85%, 3 h after the addition of Nymphaea leachates to microcosms containing Okefenokee Swamp water. Bacterial activity subsequently recovered; after 20 h [3H]thymidine incorporation in leachate-treated microcosms was 10-fold greater than that in control microcosms. The recovery of activity was due to a shift in the composition of the bacterial population toward resistance to the inhibitory compounds present in Nymphaea leachates. Inhibitory compounds released during the decomposition of aquatic macrophytes thus act as selective agents which alter the community structure of the bacterial population with respect to leachate resistance. Soluble compounds derived from macrophyte decomposition influence the rate of bacterial secondary production and the availability of microbial biomass to microconsumers.

  15. Occurrence of Plasmids in the Aromatic Degrading Bacterioplankton of the Baltic Sea

    PubMed Central

    Jutkina, Jekaterina; Heinaru, Eeva; Vedler, Eve; Juhanson, Jaanis; Heinaru, Ain

    2011-01-01

    Plasmids are mobile genetic elements that provide their hosts with many beneficial traits including in some cases the ability to degrade different aromatic compounds. To fill the knowledge gap regarding catabolic plasmids of the Baltic Sea water, a total of 209 biodegrading bacterial strains were isolated and screened for the presence of these mobile genetic elements. We found that both large and small plasmids are common in the cultivable Baltic Sea bacterioplankton and are particularly prevalent among bacterial genera Pseudomonas and Acinetobacter. Out of 61 plasmid-containing strains (29% of all isolates), 34 strains were found to carry large plasmids, which could be associated with the biodegradative capabilities of the host bacterial strains. Focusing on the diversity of IncP-9 plasmids, self-transmissible m-toluate (TOL) and salicylate (SAL) plasmids were detected. Sequencing the repA gene of IncP-9 carrying isolates revealed a high diversity within the IncP-9 plasmid family and extended the assumed bacterial host species range of the IncP-9 representatives. This study is the first insight into the genetic pool of the IncP-9 catabolic plasmids in the Baltic Sea bacterioplankton. PMID:24710296

  16. A spruce gene map infers ancient plant genome reshuffling and subsequent slow evolution in the gymnosperm lineage leading to extant conifers

    PubMed Central

    2012-01-01

    Background: Seed plants are composed of angiosperms and gymnosperms, which diverged from each other around 300 million years ago. While much light has been shed on the mechanisms and rate of genome evolution in flowering plants, such knowledge remains conspicuously meagre for the gymnosperms. Conifers are key representatives of gymnosperms and the sheer size of their genomes represents a significant challenge for characterization, sequencing and assembling. Results: To gain insight into the macro-organisation and long-term evolution of the conifer genome, we developed a genetic map involving 1,801 spruce genes. We designed a statistical approach based on kernel density estimation to analyse gene density and identified seven gene-rich isochors. Groups of co-localizing genes were also found that were transcriptionally co-regulated, indicative of functional clusters. Phylogenetic analyses of 157 gene families for which at least two duplicates were mapped on the spruce genome indicated that ancient gene duplicates shared by angiosperms and gymnosperms outnumbered conifer-specific duplicates by a ratio of eight to one. Ancient duplicates were much more translocated within and among spruce chromosomes than conifer-specific duplicates, which were mostly organised in tandem arrays. Both high synteny and collinearity were also observed between the genomes of spruce and pine, two conifers that diverged more than 100 million years ago. Conclusions: Taken together, these results indicate that much genomic evolution has occurred in the seed plant lineage before the split between gymnosperms and angiosperms, and that the pace of evolution of the genome macro-structure has been much slower in the gymnosperm lineage leading to extant conifers than that seen for the same period of time in flowering plants. This trend is largely congruent with the contrasted rates of diversification and morphological evolution observed between these two groups of seed plants. PMID:23102090

  17. Palaeohexaploid ancestry for Caryophyllales inferred from extensive gene-based physical and genetic mapping of the sugar beet genome (Beta vulgaris).

    PubMed

    Dohm, Juliane C; Lange, Cornelia; Holtgräwe, Daniela; Sörensen, Thomas Rosleff; Borchardt, Dietrich; Schulz, Britta; Lehrach, Hans; Weisshaar, Bernd; Himmelbauer, Heinz

    2012-05-01

    Sugar beet (Beta vulgaris) is an important crop plant that accounts for 30% of the world's sugar production annually. The genus Beta is a distant relative of currently sequenced taxa within the core eudicotyledons; the genomic characterization of sugar beet is essential to make its genome accessible to molecular dissection. Here, we present comprehensive genomic information in genetic and physical maps that cover all nine chromosomes. Based on this information we identified the proposed ancestral linkage groups of rosids and asterids within the sugar beet genome. We generated an extended genetic map that comprises 1127 single nucleotide polymorphism markers prepared from expressed sequence tags and bacterial artificial chromosome (BAC) end sequences. To construct a genome-wide physical map, we hybridized gene-derived oligomer probes against two BAC libraries with 9.5-fold cumulative coverage of the 758 Mbp genome. More than 2500 probes and clones were integrated both in genetic maps and the physical data. The final physical map encompasses 535 chromosomally anchored contigs that contain 8361 probes and 22 815 BAC clones. By using the gene order established with the physical map, we detected regions of synteny between sugar beet (order Caryophyllales) and rosid species that involve 1400-2700 genes in the sequenced genomes of Arabidopsis, poplar, grapevine, and cacao. The data suggest that Caryophyllales share the palaeohexaploid ancestor proposed for rosids and asterids. Taken together, we here provide extensive molecular resources for sugar beet and enable future high-resolution trait mapping, gene identification, and cross-referencing to regions sequenced in other plant species.

  18. Bacterio-plankton transformation of diazepam and 2-amino-5-chlorobenzophenone in river waters.

    PubMed

    Tappin, Alan D; Loughnane, J Paul; McCarthy, Alan J; Fitzsimons, Mark F

    2014-01-01

    Benzodiazepines are a large class of commonly-prescribed drugs used to treat a variety of clinical disorders. They have been shown to produce ecological effects at environmental concentrations, making understanding their fate in aquatic environments very important. In this study, uptake and biotransformations by riverine bacterio-plankton of the benzodiazepine, diazepam, and 2-amino-5-chlorobenzophenone, ACB (a photo-degradation product of diazepam and several other benzodiazepines), were investigated using batch microcosm incubations. These were conducted using water and bacterio-plankton populations from contrasting river catchments (Tamar and Mersey, UK), both in the presence and absence of a peptide, added as an alternative organic substrate. Incubations lasted 21 days, reflecting the expected water residence time in the catchments. In River Tamar water, 36% of diazepam (p < 0.001) was removed when the peptide was absent. In contrast, there was no removal of diazepam when the peptide was added, although the peptide itself was consumed. For ACB, 61% was removed in the absence of the peptide, and 84% in its presence (p < 0.001 in both cases). In River Mersey water, diazepam removal did not occur in the presence or absence of the peptide, with the latter again consumed, while ACB removal decreased from 44 to 22% with the peptide present. This suggests that bacterio-plankton from the Mersey water degraded the peptide in preference to both diazepam and ACB. Biotransformation products were not detected in any of the samples analysed but a significant increase in ammonium concentration (p < 0.038) was measured in incubations with ACB, confirming mineralization of the amine substituent. Sequential inoculation and incubation of Mersey and Tamar microcosms, for 5 periods of 21 days each, did not produce any evidence of increased ability of the microbial community to remove ACB, suggesting that an indigenous consortium was probably responsible for its metabolism. 
As ACB

  19. Proteomic Stable Isotope Probing Reveals Taxonomically Distinct Patterns in Amino Acid Assimilation by Coastal Marine Bacterioplankton

    PubMed Central

    Bryson, Samuel; Li, Zhou; Pett-Ridge, Jennifer; Hettich, Robert L.; Mayali, Xavier; Pan, Chongle

    2016-01-01

    Heterotrophic marine bacterioplankton are a critical component of the carbon cycle, processing nearly a quarter of annual primary production, yet defining how substrate utilization preferences and resource partitioning structure microbial communities remains a challenge. In this study, proteomic stable isotope probing (proteomic SIP) was used to characterize population-specific assimilation of dissolved free amino acids (DFAAs), a major source of dissolved organic carbon for bacterial secondary production in aquatic environments. Microcosms of seawater collected from Newport, Oregon, and Monterey Bay, California, were incubated with 1 µM 13C-labeled amino acids for 15 and 32 h. The taxonomic compositions of microcosm metaproteomes were highly similar to those of the sampled natural communities, with Rhodobacterales, SAR11, and Flavobacteriales representing the dominant taxa. Analysis of 13C incorporation into protein biomass allowed for quantification of the isotopic enrichment of identified proteins and subsequent determination of differential amino acid assimilation patterns between specific bacterioplankton populations. Proteins associated with Rhodobacterales tended to have a significantly high frequency of 13C-enriched peptides, opposite the trend for Flavobacteriales and SAR11 proteins. Rhodobacterales proteins associated with amino acid transport and metabolism had an increased frequency of 13C-enriched spectra at time point 2. Alteromonadales proteins also had a significantly high frequency of 13C-enriched peptides, particularly within ribosomal proteins, demonstrating their rapid growth during incubations. Overall, proteomic SIP facilitated quantitative comparisons of DFAA assimilation by specific taxa, both between sympatric populations and between protein functional groups within discrete populations, allowing an unprecedented examination of population level metabolic responses to resource acquisition in complex microbial communities.

  20. Proteomic Stable Isotope Probing Reveals Taxonomically Distinct Patterns in Amino Acid Assimilation by Coastal Marine Bacterioplankton.

    PubMed

    Bryson, Samuel; Li, Zhou; Pett-Ridge, Jennifer; Hettich, Robert L; Mayali, Xavier; Pan, Chongle; Mueller, Ryan S

    2016-01-01

    Heterotrophic marine bacterioplankton are a critical component of the carbon cycle, processing nearly a quarter of annual primary production, yet defining how substrate utilization preferences and resource partitioning structure microbial communities remains a challenge. In this study, proteomic stable isotope probing (proteomic SIP) was used to characterize population-specific assimilation of dissolved free amino acids (DFAAs), a major source of dissolved organic carbon for bacterial secondary production in aquatic environments. Microcosms of seawater collected from Newport, Oregon, and Monterey Bay, California, were incubated with 1 µM 13C-labeled amino acids for 15 and 32 h. The taxonomic compositions of microcosm metaproteomes were highly similar to those of the sampled natural communities, with Rhodobacterales, SAR11, and Flavobacteriales representing the dominant taxa. Analysis of 13C incorporation into protein biomass allowed for quantification of the isotopic enrichment of identified proteins and subsequent determination of differential amino acid assimilation patterns between specific bacterioplankton populations. Proteins associated with Rhodobacterales tended to have a significantly high frequency of 13C-enriched peptides, opposite the trend for Flavobacteriales and SAR11 proteins. Rhodobacterales proteins associated with amino acid transport and metabolism had an increased frequency of 13C-enriched spectra at time point 2. Alteromonadales proteins also had a significantly high frequency of 13C-enriched peptides, particularly within ribosomal proteins, demonstrating their rapid growth during incubations. Overall, proteomic SIP facilitated quantitative comparisons of DFAA assimilation by specific taxa, both between sympatric populations and between protein functional groups within discrete populations, allowing an unprecedented examination of population-level metabolic responses to resource acquisition in complex microbial communities.

  1. Dimethylsulfoniopropionate and methanethiol are important precursors of methionine and protein-sulfur in marine bacterioplankton.

    PubMed

    Kiene, R P; Linn, L J; González, J; Moran, M A; Bruton, J A

    1999-10-01

    Organic sulfur compounds are present in all aquatic systems, but their use as sources of sulfur for bacteria is generally not considered important because of the high sulfate concentrations in natural waters. This study investigated whether dimethylsulfoniopropionate (DMSP), an algal osmolyte that is abundant and rapidly cycled in seawater, is used as a source of sulfur by bacterioplankton. Natural populations of bacterioplankton from subtropical and temperate marine waters rapidly incorporated 15 to 40% of the sulfur from tracer-level additions of [35S]DMSP into a macromolecule fraction. Tests with proteinase K and chloramphenicol showed that the sulfur from DMSP was incorporated into proteins, and analysis of protein hydrolysis products by high-pressure liquid chromatography showed that methionine was the major labeled amino acid produced from [35S]DMSP. Bacterial strains isolated from coastal seawater and belonging to the alpha-subdivision of the division Proteobacteria incorporated DMSP sulfur into protein only if they were capable of degrading DMSP to methanethiol (MeSH), whereas MeSH was rapidly incorporated into macromolecules by all tested strains and by natural bacterioplankton. These findings indicate that the demethylation/demethiolation pathway of DMSP degradation is important for sulfur assimilation and that MeSH is a key intermediate in the pathway leading to protein sulfur. Incorporation of sulfur from DMSP and MeSH by natural populations was inhibited by nanomolar levels of other reduced sulfur compounds including sulfide, methionine, homocysteine, cysteine, and cystathionine. In addition, propargylglycine and vinylglycine were potent inhibitors of incorporation of sulfur from DMSP and MeSH, suggesting involvement of the enzyme cystathionine gamma-synthetase in sulfur assimilation by natural populations.
Experiments with [methyl-3H]MeSH and [35S]MeSH showed that the entire methiol group of MeSH was efficiently incorporated into methionine, a

  2. Comparative cytogenetic mapping of Sox2 and Sox14 in cichlid fishes and inferences on the genomic organization of both genes in vertebrates

    PubMed Central

    Mazzuchelli, Juliana; Yang, Fengtang; Kocher, Thomas D.; Martins, Cesar

    2011-01-01

    To better understand the genomic organization and evolution of Sox genes in vertebrates, we cytogenetically mapped Sox2 and Sox14 genes in cichlid fishes and performed comparative analyses of their orthologs in several vertebrate species. The genomic regions neighbouring Sox2 and Sox14 have been conserved during vertebrate diversification. Although cichlids seem to have undergone high rates of genomic rearrangements, Sox2 and Sox14 are linked on the same chromosome in the Etroplinae Etroplus maculatus, which represents the sister group of all remaining cichlids. However, these genes are located on different chromosomes in several species of the sister group Pseudocrenilabrinae. Similarly, the ancestral synteny of Sox2 and Sox14 has been maintained in several vertebrates, but this synteny has been broken independently in all major groups as a consequence of karyotype rearrangements that took place during vertebrate evolution. PMID:21691861

  3. Temporal Patterns in Bacterioplankton Community Composition in Three Reservoirs of Similar Trophic Status in Shenzhen, China

    PubMed Central

    Li, Jiancheng; Chen, Cheng; Lu, Jun; Lei, Anping; Hu, Zhangli

    2016-01-01

    Spatial and temporal variation in bacterioplankton community composition (BCC) in three reservoirs (Shiyan, Xikeng, and LuoTian Reservoir) of similar trophic status in Bao’an District, Shenzhen (China), was investigated using PCR amplification of the 16S rDNA gene and denaturing gradient gel electrophoresis (DGGE). Water samples were collected monthly in each reservoir during 12 consecutive months. Distinct differences were detected in band number, pattern, and density of DGGE at different sampling sites and time points. Analysis of the DGGE fingerprints showed that the bacterial community structure varied mainly with season, and the patterns of change indicated that seasonal forces might have a more significant impact on the BCC than eutrophic status in the reservoirs, despite the similar Shannon-Wiener index among the three reservoirs. The sequences obtained from excised bands were affiliated with Cyanobacteria, Firmicutes, Bacteroidetes, Acidobacteria, Actinobacteria, Planctomycetes, and Proteobacteria. PMID:27322295
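
    A Shannon-Wiener index such as the one reported in this record is commonly computed from a DGGE fingerprint by treating each band's relative intensity as a proportional abundance. The following is a minimal Python sketch under that assumption; the function name `shannon_wiener` and the example intensities are hypothetical and are not taken from the study.

```python
import math


def shannon_wiener(band_intensities):
    """Return H' = -sum(p_i * ln(p_i)) for a list of band intensities.

    Each intensity is converted to a proportion p_i of the lane total.
    Bands with zero intensity contribute nothing to the sum.
    """
    if any(x < 0 for x in band_intensities):
        raise ValueError("band intensities must be non-negative")
    total = sum(band_intensities)
    if total <= 0:
        raise ValueError("at least one band must have positive intensity")

    h = 0.0
    for x in band_intensities:
        if x > 0:
            p = x / total
            h -= p * math.log(p)
    return h


if __name__ == "__main__":
    # Hypothetical lane with four equally intense bands: H' equals ln(4).
    print(round(shannon_wiener([1.0, 1.0, 1.0, 1.0]), 3))
```

    Two lanes can share a similar index while differing in which bands are present, which is consistent with the abstract reporting seasonal shifts in composition alongside a similar index across reservoirs.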

  4. Inferring protein function from genomic sequence: Giardia lamblia expresses a phosphatidylinositol kinase-related kinase similar to yeast and mammalian TOR.

    PubMed

    Morrison, Hilary G; Zamora, Gus; Campbell, Robert K; Sogin, Mitchell L

    2002-12-01

    Functional assays of genes have historically led to insights about the activities of a protein or protein cascade. However, the rapid expansion of genomic and proteomic information for a variety of diverse taxa is an alternative and powerful means of predicting function by comparing the enzymes and metabolic pathways used by different organisms. As part of the Giardia lamblia genome sequencing project, we routinely survey the complement of predicted proteins and compare those found in this putatively early diverging eukaryote with those of prokaryotes and more recently evolved eukaryotic lineages. Such comparisons reveal the minimal composition of conserved metabolic pathways, suggest which proteins may have been acquired by lateral transfer, and, by their absence, hint at functions lost in the transition from a free-living to a parasitic lifestyle. Here, we describe the use of bioinformatic approaches to investigate the complement and conservation of proteins in Giardia involved in the regulation of translation. We compare an FK506 binding protein homologue and phosphatidylinositol kinase-related kinase present in Giardia to those found in other eukaryotes for which complete genomic sequence data are available. Our investigation of the Giardia genome suggests that PIK-related kinases are of ancient origin and are highly conserved.

  5. Successive changes in bacterioplankton communities in the River Rhine after copper additions

    SciTech Connect

    Tubbing, D.M.J.; Admiraal, W.; Katako, A.

    1995-09-01

    The sensitivity of bacterioplankton to copper was analyzed to see whether initial steps in the selection of copper-tolerant life-forms in mixed populations of bacteria were accompanied by changes in basic metabolic parameters. Analysis took place by measuring the incorporation of [3H]thymidine and [3H]leucine, and the hydrolysis of leucyl-β-naphthylamide over a period of 4 d. In acute toxicity tests the radiochemically determined parameters showed the same sensitivities to copper, whereas in the enzyme test the dose-response curve had a much lower slope, indicating less sensitivity. Marked differences were observed in the susceptibility of the different processes after prolonged exposure to copper. Incorporation of [3H]thymidine, [3H]leucine, and proteolytic activity changed substantially during exposure to concentrations as low as 2 to 31 µg Cu L⁻¹. Higher copper concentrations (126 to 1,000 µg Cu L⁻¹) led in the course of 24 to 48 h to the development of a bacterial community with a higher overall copper tolerance. In winter, these successive events in bacterial populations were observed in the absence of substantial populations of algae or zooplankton. In summer, the metabolic changes in bacterioplankton exposed to copper were strongly affected by the poisoning of other organisms, notably algae, and the subsequent release of organic material. Thus, moderate copper concentrations alter the metabolic profile of bacterial communities, probably as an initial step in the selection of tolerant life-forms.

  6. Transient changes in bacterioplankton communities induced by the submarine volcanic eruption of El Hierro (Canary Islands).

    PubMed

    Ferrera, Isabel; Arístegui, Javier; González, José M; Montero, María F; Fraile-Nuez, Eugenio; Gasol, Josep M

    2015-01-01

    The submarine volcanic eruption occurring near El Hierro (Canary Islands) in October 2011 provided a unique opportunity to determine the effects of such events on the microbial populations of the surrounding waters. The birth of a new underwater volcano produced a large plume of vent material detectable from space that led to abrupt changes in the physical-chemical properties of the water column. We combined flow cytometry and 454-pyrosequencing of 16S rRNA gene amplicons (V1-V3 regions for Bacteria and V3-V5 for Archaea) to monitor the area around the volcano through the eruptive and post-eruptive phases (November 2011 to April 2012). Flow cytometric analyses revealed higher abundance and relative activity (expressed as a percentage of high-nucleic acid content cells) of heterotrophic prokaryotes during the eruptive process as compared to post-eruptive stages. Changes observed in populations detectable by flow cytometry were more evident at depths closer to the volcano (~70-200 m), coinciding also with oxygen depletion. Alpha-diversity analyses revealed that species richness (Chao1 index) decreased during the eruptive phase; however, no dramatic changes in community composition were observed. The most abundant taxa during the eruptive phase were similar to those in the post-eruptive stages and to those typically prevalent in oceanic bacterioplankton communities (i.e. the alphaproteobacterial SAR11 group, the Flavobacteriia class of the Bacteroidetes and certain groups of Gammaproteobacteria). Yet, although at low abundance, we also detected the presence of taxa not typically found in bacterioplankton communities such as the Epsilonproteobacteria and members of the candidate division ZB3, particularly during the eruptive stage. These groups are often associated with deep-sea hydrothermal vents or sulfur-rich springs. Both cytometric and sequence analyses showed that once the eruption ceased, evidence of the volcano-induced changes was no longer observed.
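
    Chao1, the richness estimator named in this record, extrapolates total richness from the number of taxa observed exactly once and exactly twice. The sketch below uses the bias-corrected form, S_obs + F1(F1 - 1) / (2(F2 + 1)); the abstract does not say which variant the authors applied, and the function name `chao1` and the example counts are illustrative assumptions.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-taxon read counts.

    S_obs is the number of taxa with at least one read, F1 the number of
    singletons and F2 the number of doubletons.
    """
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    # Hypothetical sample: 7 observed taxa, 3 singletons, 2 doubletons.
    print(chao1([10, 5, 1, 1, 1, 2, 2]))
```

    A sample with many singletons yields an estimate well above the observed taxon count, so a drop in Chao1 during the eruptive phase reflects fewer rare taxa being detected as well as fewer taxa overall.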

  7. Quantification of Carbon and Phosphorus Co-Limitation in Bacterioplankton: New Insights on an Old Topic

    PubMed Central

    Dorado-García, Irene; Medina-Sánchez, Juan Manuel; Herrera, Guillermo; Cabrerizo, Marco J.; Carrillo, Presentación

    2014-01-01

    Because the nature of the main resource that limits bacterioplankton (e.g. organic carbon [C] or phosphorus [P]) has biogeochemical implications concerning organic C accumulation in freshwater ecosystems, empirical knowledge is needed concerning how bacteria respond to these two resources, available alone or together. We performed field experiments of resource manipulation (2×2 factorial design, with the addition of C, P, or both combined) in two Mediterranean freshwater ecosystems with contrasting trophic states (oligotrophy vs. eutrophy) and trophic natures (autotrophy vs. heterotrophy, measured as gross primary production:respiration ratio). Overall, the two resources synergistically co-limited bacterioplankton, i.e. the magnitude of the response of bacterial production and abundance to the two resources combined was higher than the additive response in both ecosystems. However, bacteria also responded positively to single P and C additions in the eutrophic ecosystem, but not to single C in the oligotrophic one, consistent with the value of the ratio between bacterial C demand and algal C supply. Accordingly, the trophic nature rather than the trophic state of the ecosystems proves to be a key feature determining the expected types of resource co-limitation of bacteria, as summarized in a proposed theoretical framework. The actual types of co-limitation shifted over time and partially deviated (a lesser degree of synergism) from the theoretical expectations, particularly in the eutrophic ecosystem. These deviations may be explained by extrinsic ecological forces to physiological limitations of bacteria, such as predation, whose role in our experiments is supported by the relationship between the dynamics of bacteria and bacterivores tested by SEMs (structural equation models). 
Our study, in line with the increasingly recognized role of freshwater ecosystems in the global C cycle, suggests that further attention should be focussed on the biotic interactions that
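
    The abstract above defines synergistic co-limitation as a combined C+P response that exceeds the sum of the single-resource responses. A minimal Python sketch of that comparison for one 2×2 factorial experiment follows; the function names, the tolerance parameter, and the numbers are hypothetical, and a real analysis would test the interaction term statistically across replicates rather than compare treatment means directly.

```python
def interaction_effect(control, plus_c, plus_p, plus_cp):
    """Combined response minus the additive expectation.

    All arguments are mean responses (e.g. bacterial production) for the
    control, +C, +P and +C+P treatments of a 2x2 factorial design.
    """
    additive = (plus_c - control) + (plus_p - control)
    combined = plus_cp - control
    return combined - additive


def classify_colimitation(control, plus_c, plus_p, plus_cp, tol=0.0):
    """Label the joint response relative to the additive expectation."""
    effect = interaction_effect(control, plus_c, plus_p, plus_cp)
    if effect > tol:
        return "synergistic"
    if effect < -tol:
        return "sub-additive"
    return "additive"


if __name__ == "__main__":
    # Made-up treatment means, for illustration only.
    print(classify_colimitation(10, 12, 15, 25))
```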

  8. Transient Changes in Bacterioplankton Communities Induced by the Submarine Volcanic Eruption of El Hierro (Canary Islands)

    PubMed Central

    Ferrera, Isabel; Arístegui, Javier; González, José M.; Montero, María F.; Fraile-Nuez, Eugenio; Gasol, Josep M.

    2015-01-01

    The submarine volcanic eruption occurring near El Hierro (Canary Islands) in October 2011 provided a unique opportunity to determine the effects of such events on the microbial populations of the surrounding waters. The birth of a new underwater volcano produced a large plume of vent material detectable from space that led to abrupt changes in the physical-chemical properties of the water column. We combined flow cytometry and 454-pyrosequencing of 16S rRNA gene amplicons (V1–V3 regions for Bacteria and V3–V5 for Archaea) to monitor the area around the volcano through the eruptive and post-eruptive phases (November 2011 to April 2012). Flow cytometric analyses revealed higher abundance and relative activity (expressed as a percentage of high-nucleic acid content cells) of heterotrophic prokaryotes during the eruptive process as compared to post-eruptive stages. Changes observed in populations detectable by flow cytometry were more evident at depths closer to the volcano (~70–200 m), coinciding also with oxygen depletion. Alpha-diversity analyses revealed that species richness (Chao1 index) decreased during the eruptive phase; however, no dramatic changes in community composition were observed. The most abundant taxa during the eruptive phase were similar to those in the post-eruptive stages and to those typically prevalent in oceanic bacterioplankton communities (i.e. the alphaproteobacterial SAR11 group, the Flavobacteriia class of the Bacteroidetes and certain groups of Gammaproteobacteria). Yet, although at low abundance, we also detected the presence of taxa not typically found in bacterioplankton communities such as the Epsilonproteobacteria and members of the candidate division ZB3, particularly during the eruptive stage. These groups are often associated with deep-sea hydrothermal vents or sulfur-rich springs. Both cytometric and sequence analyses showed that once the eruption ceased, evidence of the volcano-induced changes was no longer observed.

  9. Physiology and phylogeny of the candidate phylum "Atribacteria" (formerly OP9/JS1) inferred from single-cell genomics and metagenomics

    NASA Astrophysics Data System (ADS)

    Dodsworth, J. A.; Murugapiran, S.; Blainey, P. C.; Nobu, M.; Rinke, C.; Schwientek, P.; Gies, E.; Webster, G.; Kille, P.; Weightman, A.; Liu, W. T.; Hallam, S.; Tsiamis, G.; Swingley, W.; Ross, C.; Tringe, S. G.; Chain, P. S.; Scholz, M. B.; Lo, C. C.; Raymond, J.; Quake, S. R.; Woyke, T.; Hedlund, B. P.

    2014-12-01

    Single-cell sequencing and metagenomics have extended the genomics revolution to yet-uncultivated microorganisms and provided insights into the coding potential of this so-called "microbial dark matter", including microbes belonging to candidate phyla with no cultivated representatives. As more datasets emerge, comparison of individual genomes from different lineages and habitats can provide insight into the phylogeny, conserved features, and potential metabolic diversity of candidate phyla. The candidate bacterial phylum OP9 was originally found in Obsidian Pool, Yellowstone National Park, and it has since been detected in geothermal springs, petroleum reservoirs, and engineered thermal environments worldwide. JS1, another uncultivated bacterial lineage affiliated with OP9, is often abundant in marine sediments associated with methane hydrates, hydrocarbon seeps, and on continental margins and shelves, and is found in other non-thermal marine and subsurface environments. The phylogenetic relationship between OP9, JS1, and other Bacteria has not been fully resolved, and to date no axenic cultures from these lineages have been reported. Recently, 31 single amplified genomes (SAGs) from six distinct OP9 and JS1 lineages have been obtained using flow cytometric and microfluidic techniques. These SAGs were used to inform metagenome binning techniques that identified OP9/JS1 sequences in several metagenomes, extending genomic coverage in three of the OP9 and JS1 lineages. Phylogenomic analyses of these SAG and metagenome bin datasets suggest that OP9 and JS1 constitute a single, deeply branching phylum, for which the name "Atribacteria" has recently been proposed. Overall, members of the "Atribacteria" are predicted to be heterotrophic anaerobes without the capacity for respiration, with some lineages potentially specializing in secondary fermentation of organic acids.
A set of signature "Atribacteria" genes was tentatively identified, including components of a bacterial

  10. Taxonomy, molecular phylogeny and evolution of plant reverse transcribing viruses (family Caulimoviridae) inferred from full-length genome and reverse transcriptase sequences.

    PubMed

    Bousalem, M; Douzery, E J P; Seal, S E

    2008-01-01

    This study constitutes the first evaluation and application of quantitative taxonomy to the family Caulimoviridae and the first in-depth phylogenetic study of the family Caulimoviridae that integrates the common origin between LTR retrotransposons and caulimoviruses. The phylogenetic trees and PASC analyses derived from the full genome and from the corresponding partial RT concurred, providing strong support for the current genus classification based mainly on genome organisation and use of partial RT sequence as a molecular marker. The PASC distributions obtained are multimodal, making it possible to distinguish between genus, species and strain. The taxonomy of badnaviruses infecting banana (Musa spp.) was clarified, and the consequence of endogenous badnaviruses on the genetic diversity and evolution of caulimoviruses is discussed. The use of LTR retrotransposons as outgroups reveals a structured bipolar topology separating the genus Badnavirus from the other genera. Badnaviruses appear to be the most recent genus, with the genus Tungrovirus in an intermediary position. This structuring intersects the one established by genomic and biological properties and allows us to make a correlation between phylogeny and biogeography. The variability shown between members of the family Caulimoviridae is in a similar range to that reported within other DNA and RNA plant virus families.

  11. Sensitivity of bacterioplankton nitrogen metabolism to eutrophication in sub-tropical coastal waters of Key West, Florida.

    PubMed

    Hoch, Matthew P; Dillon, Kevin S; Coffin, Richard B; Cifuentes, Luis A

    2008-05-01

    Expression of intracellular ammonium assimilation enzymes was used to assess the response of nitrogen (N) metabolism in bacterioplankton to N-loading of sub-tropical coastal waters of Key West, Florida. Specific activities of glutamine synthetase (GS) and total glutamate dehydrogenase (GDHT) were measured on the bacterial size fraction (<0.8 µm) to assess N-deplete versus N-replete metabolic states, respectively. Enzyme results were compared to concentrations of dissolved organic matter and nutrients and to the biomass and production of phytoplankton and bacteria. Concentrations of dissolved inorganic N (DIN), dissolved organic N (DON), and dissolved organic carbon (DOC) positively correlated with specific activities of GDHT and negatively correlated with that of GS. Total dissolved N (TDN) concentration explained 81% of variance in bacterioplankton GDHT:GS activity ratio. The GDHT:GS ratio, TDN, DOC, and bacterial parameters decreased in magnitude along a tidally dynamic trophic gradient from north of Key West to south at the reef tract, which is consistent with the combined effects of localized coastal eutrophication and tidal exchange of seawater from the Southwest Florida Shelf and Florida Strait. The N-replete bacterioplankton north of Key West can regenerate ammonium which sustains primary production transported south to the reef. The range in GDHT:GS ratios was 5-30 times greater than that for commonly used indicators of planktonic eutrophication, which emphasizes the sensitivity of bacterioplankton N-metabolism to changes in N-bioavailability caused by nutrient pollution in sub-tropical coastal waters and the utility of the GDHT:GS ratio as a bioindicator of N-replete conditions.

  12. Proteomic-based stable isotope probing reveals taxonomically Distinct Patterns in Amino Acid Assimilation by Coastal Marine Bacterioplankton

    DOE PAGES

    Bryson, Samuel; Li, Zhou; Pett-Ridge, Jennifer; ...

    2016-04-26

    Heterotrophic marine bacterioplankton are a critical component of the carbon cycle, processing nearly a quarter of annual global primary production, yet defining how substrate utilization preferences and resource partitioning structure these microbial communities remains a challenge. In this study, we utilized proteomics-based stable isotope probing (proteomic SIP) to characterize the assimilation of amino acids by coastal marine bacterioplankton populations. We incubated microcosms of seawater collected from Newport, OR and Monterey Bay, CA with 1 µM 13C-amino acids for 15 and 32 hours. Subsequent analysis of 13C incorporation into protein biomass quantified the frequency and extent of isotope enrichment for identified proteins. Using these metrics we tested whether amino acid assimilation patterns were different for specific bacterioplankton populations. Proteins associated with Rhodobacterales and Alteromonadales tended to have a significantly high number of tandem mass spectra from 13C-enriched peptides, while Flavobacteriales and SAR11 proteins generally had significantly low numbers of 13C-enriched spectra. Rhodobacterales proteins associated with amino acid transport and metabolism had an increased frequency of 13C-enriched spectra at time-point 2, while Alteromonadales ribosomal proteins were 13C-enriched across time-points. Overall, proteomic SIP facilitated quantitative comparisons of dissolved free amino acids assimilation by specific taxa, both between sympatric populations and between protein functional groups within discrete populations, allowing an unprecedented examination of population-level metabolic responses to resource acquisition in complex microbial communities.

  13. Consequences of increased temperature and acidification on bacterioplankton community composition during a mesocosm spring bloom in the Baltic Sea.

    PubMed

    Lindh, Markus V; Riemann, Lasse; Baltar, Federico; Romero-Oliva, Claudia; Salomon, Paulo S; Granéli, Edna; Pinhassi, Jarone

    2013-04-01

    Despite the paramount importance of bacteria for biogeochemical cycling of carbon and nutrients, little is known about the potential effects of climate change on these key organisms. The consequences of the projected climate change on bacterioplankton community dynamics were investigated in a Baltic Sea spring phytoplankton bloom mesocosm experiment by raising temperature by 3°C and decreasing pH by approximately 0.4 units via CO₂ addition in a factorial design. Temperature was the major driver of differences in community composition during the experiment, as shown by denaturing gradient gel electrophoresis (DGGE) of amplified 16S rRNA gene fragments. Several bacterial phylotypes belonging to Betaproteobacteria were predominant at 3°C but were replaced by members of the Bacteroidetes in the 6°C mesocosms. Acidification alone had a limited impact on phylogenetic composition, but when combined with increased temperature, resulted in the proliferation of specific microbial phylotypes. Our results suggest that although temperature is an important driver in structuring bacterioplankton composition, evaluation of the combined effects of temperature and acidification is necessary to fully understand consequences of climate change for marine bacterioplankton, their implications for future spring bloom dynamics, and their role in ecosystem functioning.

  14. Bacterioplankton community responses to key environmental variables in plateau freshwater lake ecosystems: A structural equation modeling and change point analysis.

    PubMed

    Cao, Xiaofeng; Wang, Jie; Liao, Jingqiu; Gao, Zhe; Jiang, Dalin; Sun, Jinhua; Zhao, Lei; Huang, Yi; Luan, Shengji

    2017-02-15

    Elevated environmental pressures negatively affect the bacterial community structure. However, little is known about the nonlinear responses of bacterioplankton communities to spatially related environmental variables across multiple plateau lake ecosystems. Here, we used 454 pyrosequencing of 16S rRNA genes to study the associations of bacterial communities with environmental characteristics, as well as potential ecological thresholds that induce shifts in bacterial community structure along key environmental variables, based on hypothesized structural equation models and the SEGMENTED method in 21 plateau lakes. Our results showed that water transparency was the major driving force and that total nitrogen was more significant than total phosphorus in determining the taxon composition of the bacterioplankton community. Significant community threshold estimates for bacterioplankton were observed at 7.36 for pH and 25.6% for the percentage of the agricultural area, while the remarkable change point of the cyanobacteria community structure responding to pH was at 7.74. Furthermore, the findings indicated that increasing nutrient loads can induce a distinct shift in dominance from Proteobacteria to Cyanobacteria, as well as a sharp decrease and adjacent increase when crossing the change point for Actinobacteria and Bacteroidetes along the gradient of the agricultural area.

  15. The UV responses of bacterioneuston and bacterioplankton isolates depend on the physiological condition and involve a metabolic shift.

    PubMed

    Santos, Ana L; Baptista, Inês; Lopes, Sílvia; Henriques, Isabel; Gomes, Newton C M; Almeida, Adelaide; Correia, António; Cunha, Angela

    2012-06-01

    Bacteria from the surface microlayer (bacterioneuston) and underlying waters (bacterioplankton) were isolated upon exposure to UV-B radiation, and their individual UV sensitivity in terms of CFU numbers, activity (leucine and thymidine incorporation), sole-carbon source use profiles, repair potential (light-dependent and independent), and photoadaptation potential, under different physiological conditions, was compared. Colony counts were 11.5-16.2% more reduced by UV-B exposure in bacterioplankton isolates (P < 0.05). Inhibition of leucine incorporation in bacterioneuston isolates was 10.9-11.5% higher than in bacterioplankton (P < 0.05). These effects were accompanied by a shift in sole-carbon source use profiles, assessed with Biolog® EcoPlates, with a reduction in consumption of amines and amino acids and increased use of polymers, particularly in bacterioneuston isolates. Recovery under starvation was generally enhanced compared with nourished conditions, especially in bacterioneuston isolates. Overall, only insignificant increases in the induction of antibiotic resistant mutant phenotypes (RifR and NalR) were observed. In general, a potential for photoadaptation could not be detected among the tested isolates. These results indicate that UV effects on bacteria are influenced by their physiological condition and are accompanied by a shift in metabolic profiles, more significant in bacterioneuston isolates, suggesting the presence of bacterial strains adapted to high UV levels in the SML.

  16. Distribution of bacterioplankton with active metabolism in waters of the St. Anna Trough, Kara Sea, in autumn 2011

    NASA Astrophysics Data System (ADS)

    Mosharova, I. V.; Mosharov, S. A.; Ilinskiy, V. V.

    2017-01-01

    The distribution of bacterioplankton with active electron transport chains, as well as bacteria with intact cell membranes, was investigated for the first time in the region of St. Anna Trough in the Kara Sea. The average number of bacteria with active electron transport chains in the waters of the St. Anna Trough was 15.55 × 10³ cells mL⁻¹ (the limits of variation were 1.06 to 92.17 × 10³ cells mL⁻¹). The average number of bacteria with intact membranes was 33.46 × 10³ cells mL⁻¹ (the limits of variation were 6.78 to 103.18 × 10³ cells mL⁻¹). Almost all bacterioplankton microorganisms in the studied area were potentially viable, and the average share of bacteria with intact membranes was 92.1% of the total number of bacterioplankton (TNB) (the limits of variation were 76.2 to 98.4%). The share of bacteria with active metabolisms was 38.2% of the TNB (the limits of variation were 5.6 to 93.4%). The shares of the bacteria with active metabolisms were maximum in areas with the most stable environmental conditions (on the shelf and in deep water), whereas on the slope, where the gradients of water temperature and salinity were maximum, these values were lower.

  17. Response of bacterioplankton community structure to an artificial gradient of pCO2 in the Arctic Ocean

    NASA Astrophysics Data System (ADS)

    Zhang, R.; Xia, X.; Lau, S. C. K.; Motegi, C.; Weinbauer, M. G.; Jiao, N.

    2013-06-01

    To test the influence of ocean acidification on the ocean pelagic ecosystem, the largest CO2 manipulation mesocosm study to date (European Project on Ocean Acidification, EPOCA) was performed in Kings Bay (Kongsfjorden), Spitsbergen. During a 30-day incubation, bacterial diversity was investigated using DNA fingerprinting and clone library analysis of bacterioplankton samples. Terminal restriction fragment length polymorphism (T-RFLP) analysis of the PCR amplicons of the 16S rRNA genes revealed that general bacterial diversity, taxonomic richness and community structure were influenced by the variation of productivity during the time of incubation, but not by the degree of ocean acidification. A BIOENV analysis suggested a complex control of bacterial community structure by various biological and chemical environmental parameters. The maximum apparent diversity of bacterioplankton (i.e., the number of T-RFs) in high and low pCO2 treatments differed significantly. A negative relationship between the relative abundance of Bacteroidetes and pCO2 levels was observed for samples at the end of the experiment by the combination of T-RFLP and clone library analysis. Our study suggests that ocean acidification affects the development of bacterial assemblages and potentially impacts the ecological function of the bacterioplankton in the marine ecosystem.

  18. Using Cases to Strengthen Inference on the Association between Single Nucleotide Polymorphisms and a Secondary Phenotype in Genome-Wide Association Studies

    PubMed Central

    Li, Huilin; Gail, Mitchell H.; Berndt, Sonja; Chatterjee, Nilanjan

    2010-01-01

    Case-control genome-wide association studies provide a vast amount of genetic information that may be used to investigate secondary phenotypes. We study the situation in which the primary disease is rare and the secondary phenotype and genetic markers are dichotomous. An analysis of the association between a genetic marker and the secondary phenotype based on controls only is valid, whereas standard methods that also use cases result in biased estimates and highly inflated type I error if there is an interaction between the secondary phenotype and the genetic marker on the risk of the primary disease. Here we present an adaptively weighted method that combines the case and control data to study the association, while reducing to the controls-only analysis if there is strong evidence of an interaction. The possibility of such an interaction and the misleading results for standard methods, but not for the adaptively weighted or controls-only approaches, are illustrated by data from a case-control study of colorectal adenoma, in which the secondary phenotype is smoking. Simulations and asymptotic theory indicate that the adaptively weighted method can reduce the mean square error for estimation with a pre-specified SNP and increase the power to discover a new association in a genome-wide study, compared to an analysis of controls only. Further experience with genome-wide studies is needed to determine when methods that assume no interaction, and thereby gain precision and power, can be recommended, and when methods such as the adaptively weighted or controls-only approaches are needed to guard against the possibility of non-zero interactions. PMID:20583284

  19. An independent genome duplication inferred from Hox paralogs in the American paddlefish--a representative basal ray-finned fish and important comparative reference.

    PubMed

    Crow, Karen D; Smith, Christopher D; Cheng, Jan-Fang; Wagner, Günter P; Amemiya, Chris T

    2012-01-01

    Vertebrates have experienced two rounds of whole-genome duplication (WGD) in the stem lineages of deep nodes within the group and a subsequent duplication event in the stem lineage of the teleosts, a highly diverse group of ray-finned fishes. Here, we present the first full Hox gene sequences for any member of the Acipenseriformes, the American paddlefish, and confirm that an independent WGD occurred in the paddlefish lineage, approximately 42 Ma based on sequences spanning the entire HoxA cluster and eight genes on the HoxD gene cluster. These clusters comprise different HOX loci and maintain conserved synteny relative to bichir, zebrafish, stickleback, and pufferfish, as well as human, mouse, and chick. We also provide a gene genealogy for the duplicated fzd8 gene in paddlefish and present evidence for the first Hox14 gene in any ray-finned fish. Taken together, these data demonstrate that the American paddlefish has an independently duplicated genome. Substitution patterns of the "alpha" paralogs on both the HoxA and HoxD gene clusters suggest transcriptional inactivation consistent with functional diploidization. Further, there are similarities in the pattern of sequence divergence among duplicated Hox genes in paddlefish and teleost lineages, even though they occurred independently approximately 200 Myr apart. We highlight implications for comparative analyses in the study of the "fin-limb transition" as well as gene and genome duplication in bony fishes, which includes all ray-finned fishes as well as the lobe-finned fishes and tetrapod vertebrates.

  20. Skim-Based Genotyping by Sequencing Using a Double Haploid Population to Call SNPs, Infer Gene Conversions, and Improve Genome Assemblies.

    PubMed

    Bayer, Philipp Emanuel

    2016-01-01

    Genotyping by sequencing (GBS) is an emerging technology to rapidly call an abundance of Single Nucleotide Polymorphisms (SNPs) using genome sequencing technology. Several different methodologies and approaches have recently been established, most of these relying on a specific preparation of data. Here we describe our GBS-pipeline, which uses high coverage reads from two parents and low coverage reads from their double haploid offspring to call SNPs on a large scale. The upside of this approach is the high resolution and scalability of the method.

  1. The phylogenetic relationships of insectivores with special reference to the lesser hedgehog tenrec as inferred from the complete sequence of their mitochondrial genome.

    PubMed

    Nikaido, Masato; Cao, Ying; Okada, Norihiro; Hasegawa, Masami

    2003-02-01

    The complete mitochondrial genome of a lesser hedgehog tenrec Echinops telfairi was determined in this study. It is an endemic African insectivore that is found specifically in Madagascar. The tenrec's back is covered with hedgehog-like spines. Unlike other spiny mammals, such as spiny mice, spiny rats, spiny dormice and porcupines, lesser hedgehog tenrecs look amazingly like true hedgehogs (Erinaceidae). However, they are distinguished morphologically from hedgehogs by the absence of a jugal bone. We determined the complete sequence of the mitochondrial genome of a lesser hedgehog tenrec and analyzed the results phylogenetically to determine the relationships between the tenrec and other insectivores (moles, shrews and hedgehogs), as well as the relationships between the tenrec and endemic African mammals, classified as Afrotheria, that have recently been shown by molecular analysis to be close relatives of the tenrec. Our data confirmed the afrotherian status of the tenrec, and no direct relation was recovered between the tenrec and the hedgehog. Comparing our data with those of others, we found that within-species variations in the mitochondrial DNA of lesser hedgehog tenrecs appear to be the largest recognized to date among mammals, apart from orangutans, which might be interesting from the view point of evolutionary history of tenrecs on Madagascar.

  2. Bioinformatic Analyses of Unique (Orphan) Core Genes of the Genus Acidithiobacillus: Functional Inferences and Use As Molecular Probes for Genomic and Metagenomic/Transcriptomic Interrogation

    PubMed Central

    González, Carolina; Lazcano, Marcelo; Valdés, Jorge; Holmes, David S.

    2016-01-01

    Using phylogenomic and gene compositional analyses, five highly conserved gene families have been detected in the core genome of the phylogenetically coherent genus Acidithiobacillus of the class Acidithiobacillia. These core gene families are absent in the closest extant genus Thermithiobacillus tepidarius that subtends the Acidithiobacillus genus and roots the deepest in this class. The predicted proteins encoded by these core gene families are not detected by a BLAST search in the NCBI non-redundant database of more than 90 million proteins using a relaxed cut-off of 1.0e−5. None of the five families has a clear functional prediction. However, bioinformatic scrutiny, using pI prediction, motif/domain searches, cellular location predictions, genomic context analyses, and chromosome topology studies together with previously published transcriptomic and proteomic data, suggests that some may have functions associated with membrane remodeling during cell division perhaps in response to pH stress. Despite the high level of amino acid sequence conservation within each family, there is sufficient nucleotide variation of the respective genes to permit the use of the DNA sequences to distinguish different species of Acidithiobacillus, making them useful additions to the armamentarium of tools for phylogenetic analysis. Since the protein families are unique to the Acidithiobacillus genus, they can also be leveraged as probes to detect the genus in environmental metagenomes and metatranscriptomes, including industrial biomining operations, and acid mine drainage (AMD). PMID:28082953

  3. Short-term variability of heterotrophic bacterioplankton during upwelling off the NW Iberian margin

    NASA Astrophysics Data System (ADS)

    Barbosa, A. B.; Galvão, H. M.; Mendes, P. A.; Álvarez-Salgado, X. A.; Figueiras, F. G.; Joint, I.

    2001-11-01

    Short-term variability of heterotrophic bacterioplankton was studied in a recently upwelled water mass at the NW Iberian margin (August 1998). Bacterioplankton abundance (BA), biomass (BB), production (BP), and specific production (SBP) were monitored during two Lagrangian drift experiments, one along the shelf-edge, the other off-shelf along an upwelling filament. Other measurements included chlorophyll a (Chl a), primary production (PP), suspended particulate organic carbon (POC) and nitrogen (PON), and dissolved organic carbon (DOC) and nitrogen (DON). Although primary production was significantly higher during the shelf-edge drift experiment, bacterial biomass in the euphotic zone (2.68 to 22.20 μg C l⁻¹) was not significantly different from that in the offshore filament. In contrast, bacterial production (0.13-3.52 μg C l⁻¹ d⁻¹), estimated using an empirically determined ¹⁴C-leucine to carbon conversion factor, and bacterial growth rates (doubling time, DT: 3.9-29.7 d), were significantly higher during the shelf-edge drift (BP: 1.50±0.11 versus 0.50±0.02 μg C l⁻¹ d⁻¹; DT: 6.9±0.3 versus 16.2±0.9 d; p<0.01). Depth-integrated BB over the euphotic zone comprised 15±1% of phytoplankton biomass during shelf-edge drift and 39±4% under the more oligotrophic conditions in the filament. However, daily BP to net primary production ratios were not significantly different in the two regions (6±1% versus 7±1%). BA, BB, BP and SBP were enhanced in the later part of the shelf-edge drift following a pronounced increase in both PP and gross DOC production, suggesting that phytoplankton was a source of substrates for bacteria in recently upwelled waters. This contrasted with the filament drift in which short-term variability of bacterioplankton was much less pronounced and there was no correlation between BP and PP. In both regions, SBP and DOC in the euphotic zone were significantly correlated (p<0.005) indicating some regulatory effect of DOC over bacterial activity
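
    The doubling times quoted in the abstract above can be read as specific growth rates. The sketch below is only an illustration, not the authors' calculation: it assumes simple exponential growth, so that the rate is ln(2) divided by the doubling time, and the function name `specific_growth_rate` is hypothetical. The two input values are the mean doubling times reported in the abstract.

```python
import math


def specific_growth_rate(doubling_time_days: float) -> float:
    """Specific growth rate (per day) for a doubling time in days.

    Assumes exponential growth, so mu = ln(2) / DT. This is an
    illustrative assumption, not a method stated in the abstract.
    """
    if doubling_time_days <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_days


# Mean doubling times (days) as reported in the abstract.
reported_doubling_times = {
    "shelf-edge drift": 6.9,
    "filament drift": 16.2,
}

rates = {
    name: specific_growth_rate(dt)
    for name, dt in reported_doubling_times.items()
}

for name, mu in rates.items():
    print(f"{name}: {mu:.3f} per day")
```

    Under that assumption, the shorter doubling time on the shelf edge corresponds to a specific growth rate a little over twice that in the filament.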

  4. Environmental rather than spatial factors structure bacterioplankton communities in shallow lakes along a > 6000 km latitudinal gradient in South America.

    PubMed

    Souffreau, Caroline; Van der Gucht, Katleen; van Gremberghe, Ineke; Kosten, Sarian; Lacerot, Gissell; Lobão, Lúcia Meirelles; de Moraes Huszar, Vera Lúcia; Roland, Fabio; Jeppesen, Erik; Vyverman, Wim; De Meester, Luc

    2015-07-01

    Metacommunity studies on lake bacterioplankton indicate the importance of environmental factors in structuring communities. Yet most of these studies cover relatively small spatial scales. We assessed the relative importance of environmental and spatial factors in shaping bacterioplankton communities across a > 6000 km latitudinal range, studying 48 shallow lowland lakes in the tropical, tropical_i (isothermal subzone of the tropics) and tundra climate regions of South America using denaturing gradient gel electrophoresis. Bacterioplankton community composition (BCC) differed significantly across regions. Although a large fraction of the variation in BCC remained unexplained, the results supported a consistent significant contribution of local environmental variables and to a lesser extent spatial variables, irrespective of spatial scale. Upon correction for space, mainly biotic environmental factors significantly explained the variation in BCC. The abundance of pelagic cladocerans remained particularly significant, suggesting grazer effects on bacterioplankton communities in the studied lakes. These results confirm that bacterioplankton communities are predominantly structured by environmental factors, even over a large-scale latitudinal gradient (6026 km), and stress the importance of including biotic variables in studies that aim to understand patterns in BCC.

  5. Inhibitory Effect of Solar Radiation on Thymidine and Leucine Incorporation by Freshwater and Marine Bacterioplankton

    PubMed Central

    Sommaruga, R.; Obernosterer, I.; Herndl, G. J.; Psenner, R.

    1997-01-01

    We studied the effect of solar radiation on the incorporation of [³H]thymidine ([³H]TdR) and [¹⁴C]leucine ([¹⁴C]Leu) by bacterioplankton in a high mountain lake and the northern Adriatic Sea. After short-term exposure (3 to 4 h) of natural bacterial assemblages to sunlight just beneath the surface, the rates of incorporation of [³H]TdR and [¹⁴C]Leu were reduced at both sites by up to ~70% compared to those for the dark control. Within the solar UV radiation (290 to 400 nm), the inhibition was caused exclusively by UV-A radiation (320 to 400 nm). However, photosynthetically active radiation (PAR) (400 to 700 nm) contributed almost equally to this effect. Experiments with samples from the high mountain lake showed that at a depth of 2.5 m, the inhibition was caused almost exclusively by UV-A radiation. At a depth of 8.5 m, where chlorophyll a concentrations were higher than those in the upper water column, the rates of incorporation of [³H]TdR were higher in those samples exposed to full sunlight or to UV-A plus PAR than in the dark control. In laboratory experiments with artificial UV light, the incorporation of [³H]TdR and [¹⁴C]Leu by mixed bacterial lake cultures was also inhibited mainly by UV-A. In contrast, in the presence of the green alga Chlamydomonas geitleri at a chlorophyll a concentration of 2.5 μg liter⁻¹, inhibition by UV radiation was significantly reduced. These results suggest that there may be complex interactions among UV radiation, heterotrophic bacteria, and phytoplankton and their release of extracellular organic carbon. Our findings indicate that the wavelengths which caused the strongest inhibition of TdR and Leu incorporation by bacterioplankton in the water column were in the UV-A range. However, it may be premature to extrapolate this effect to estimates of bacterial production before more precise information on how solar radiation affects the transport of TdR and Leu into the cell

  6. Euphotic zone bacterioplankton sources major sedimentary bacteriohopanepolyols in the Holocene Black Sea

    NASA Astrophysics Data System (ADS)

    Blumenberg, Martin; Seifert, Richard; Kasten, Sabine; Bahlmann, Enno; Michaelis, Walter

    2009-02-01

    Bacteriohopanepolyols (BHPs) are lipid constituents of many bacterial groups. Geohopanoids, the diagenetic products, are therefore ubiquitous in organic matter of the geosphere. To examine the potential of BHPs as environmental markers in marine sediments, we investigated a Holocene sediment core from the Black Sea. The concentrations of BHPs mirror the environmental shift from a well-mixed lake to a stratified marine environment by a strong and gradual increase from low values (~30 μg g⁻¹ TOC) in the oldest sediments to ~170 μg g⁻¹ TOC in sediments representing the onset of a permanently anoxic water body at about 7500 years before present (BP). This increase in BHP concentrations was most likely caused by a strong increase in bacterioplanktonic paleoproductivity brought about by several ingressions of Mediterranean Sea waters at the end of the lacustrine stage (~9500 years BP). δ¹⁵N values coevally decreasing with increasing BHP concentrations may indicate a shift from a phosphorus- to a nitrogen-limited setting supporting growth of N₂-fixing, BHP-producing bacteria. In sediments of the last ~3000 years BHP concentrations have remained relatively stable at about 50 μg g⁻¹ TOC. The distributions of major BHPs did not change significantly during the shift from lacustrine (or oligohaline) to marine conditions. Tetrafunctionalized BHPs prevailed throughout the entire sediment core, with the common bacteriohopanetetrol and 35-aminobacteriohopanetriol and the rare 35-aminobacteriohopenetriol, so far only known from a purple non-sulfur α-proteobacterium, being the main components. Other BHPs specific to cyanobacteria and pelagic methanotrophic bacteria were also found but only in much smaller amounts. Our results demonstrate that BHPs from microorganisms living in deeper biogeochemical zones of marine water columns are underrepresented or even absent in the sediment compared to the BHPs of bacteria present in the euphotic zone. Obviously, the assemblage of

  7. The bioinvasion of Guam: inferring geographic origin, pace, pattern and process of an invasive lizard (Carlia) in the Pacific using multi-locus genomic data

    USGS Publications Warehouse

    Austin, C.C.; Rittmeyer, E.N.; Oliver, L.A.; Andermann, J.O.; Zug, G.R.; Rodda, G.H.; Jackson, N.D.

    2011-01-01

    Invasive species often have dramatic negative effects that lead to the deterioration and loss of biodiversity frequently coupled with the burden of expensive biocontrol programs and subversion of socioeconomic stability. The fauna and flora of oceanic islands are particularly susceptible to invasive species and the increase of global movements of humans and their products since WW II has caused numerous anthropogenic translocations and increased the ills of human-mediated invasions. We use a multi-locus genomic dataset to identify geographic origin, pace, pattern and historical process of an invasive scincid lizard (Carlia) that has been inadvertently introduced to Guam, the Northern Marianas, and Palau. This lizard is of major importance as its introduction is thought to have assisted in the establishment of the invasive brown treesnake (Boiga irregularis) on Guam by providing a food resource. Our findings demonstrate multiple waves of introductions that appear to be concordant with movements of Allied and Imperial Japanese forces in the Pacific during World War II.

  8. Use of phytoplankton-derived dissolved organic carbon by different types of bacterioplankton.

    PubMed

    Sarmento, Hugo; Gasol, Josep M

    2012-09-01

    Phytoplankton and heterotrophic prokaryotes are major components of the microbial food web and interact continuously: heterotrophic prokaryotes utilize the dissolved organic carbon derived from phytoplankton exudation or cell lysis (DOCp), and mineralization by heterotrophic prokaryotes provides inorganic nutrients for phytoplankton. For this reason, these communities are expected to be closely linked, although the study of the interactions between them is still a major challenge. Recent studies have presented interactions between phytoplankton and heterotrophic prokaryotes based on coexistence or covariation throughout a time-series. However, a real quantification of the carbon flow within these networks (defined as the interaction strength, IS) has not been achieved yet. This is critical to understand the selectivity degree of bacteria responding to specific algal DOCp. Here we used microautoradiography to quantify the preferences of the major heterotrophic prokaryote phylogenetic groups on DOC derived from several representative phytoplankton species, and expressed these preferences as an IS value. The distribution of the ISs was not random but rather skewed towards weak interactions, in a similar way as the distributions described for stable complex non-microbial ecosystems, indicating that there are some cases of high specificity on the use of specific algal DOCp by some bacterial groups, but weak interactions are more common and may be relevant as well. The variety of IS patterns observed supports the view that the vast range of different resources (different types of organic molecules) available in the sea selects and maintains the high levels of diversity described for marine bacterioplankton.

  9. Vertical and Seasonal Variations of Bacterioplankton Subgroups with Different Nucleic Acid Contents: Possible Regulation by Phosphorus

    PubMed Central

    Nishimura, Yoko; Kim, Chulgoo; Nagata, Toshi

    2005-01-01

    We used flow cytometry to examine seasonal variations in basin-scale distributions of bacterioplankton in Lake Biwa, Japan, a large mesotrophic freshwater lake with an oxygenated hypolimnion. The bacterial communities were divided into three subgroups: bacteria with very high nucleic acid contents (VHNA bacteria), bacteria with high nucleic acid contents (HNA bacteria), and bacteria with low nucleic acid contents (LNA bacteria). During the thermal stratification period, the relative abundance of VHNA bacteria (%VHNA) increased with depth, while the reverse trend was evident for LNA bacteria. Seasonally, the %VHNA was strongly positively correlated (r = 0.87; P < 0.001) with the concentration of dissolved inorganic phosphorus, but not with the concentration of chlorophyll a. The growth of VHNA bacteria was significantly enhanced by addition of phosphate or phosphate plus glucose but not by addition of glucose alone. Although the growth of VHNA and HNA bacteria generally exceeded that of LNA bacteria, our data also revealed that LNA bacteria grew faster than and were grazed as fast as VHNA bacteria in late August, when nutrient limitation was presumably severe. Based on these results, we hypothesize that in severely P-limited environments such as Lake Biwa, P limitation exerts more severe constraints on the growth of bacterial groups with higher nucleic acid contents, which allows LNA bacteria to be competitive and become an important component of the microbial loop. PMID:16204494

  10. Functional diversity of bacterioplankton in three North Florida freshwater lakes over an annual cycle.

    PubMed

    Dickerson, Tamar L; Williams, Henry N

    2014-01-01

    The phylogenetic diversity of freshwater bacterioplankton is widely known; however, there is minimal information on the functional diversity of the bacterial communities in these systems. Understanding the functional diversity of freshwater bacterial communities is important because heterotrophic bacteria can be impacted by anthropogenic perturbation, which in turn can alter biogeochemical cycling. The objective of this study was to use Biolog EcoPlates to acquire spatial and temporal community-level physiological profiles (CLPPs) for three freshwater lakes of different trophic levels and to assess the phylogenetic affiliation of the bacteria responsible for utilizing the various carbon guilds within them by denaturing gradient gel electrophoresis (DGGE). CLPP results showed that bacterial communities utilized the carbon guilds similarly between sites within the three lakes. However, when the metabolic profile of each lake was compared, Lake Bradford and Moore Lake were more similar to one another than to Lake Munson, the eutrophic lake. Additionally, although the bacteria that utilized the five carbon guilds included representatives from the classes α-, β-, γ-Proteobacteria, Flavobacteria and Sphingobacteria, Lake Munson had the largest number of Flavobacteria and γ-Proteobacteria in comparison to Moore Lake and Lake Bradford. Overall, Biolog analysis was useful in identifying differences in the functional diversity of bacterial communities between lakes of different trophic statuses and can be used as a tool to assess ecosystem health.

  11. Away from darkness: a review on the effects of solar radiation on heterotrophic bacterioplankton activity

    PubMed Central

    Ruiz-González, Clara; Simó, Rafel; Sommaruga, Ruben; Gasol, Josep M.

    2013-01-01

    Heterotrophic bacterioplankton are main consumers of dissolved organic matter (OM) in aquatic ecosystems, including the sunlit upper layers of the ocean and freshwater bodies. Their well-known sensitivity to ultraviolet radiation (UVR), together with some recently discovered mechanisms bacteria have evolved to benefit from photosynthetically available radiation (PAR), suggest that natural sunlight plays a relevant, yet difficult to predict role in modulating bacterial biogeochemical functions in aquatic ecosystems. Three decades of experimental work assessing the effects of sunlight on natural bacterial heterotrophic activity reveal responses ranging from high stimulation to total inhibition. In this review, we compile the existing studies on the topic and discuss the potential causes underlying these contrasting results, with special emphasis on the largely overlooked influences of the community composition and the previous light exposure conditions, as well as the different temporal and spatial scales at which exposure to solar radiation fluctuates. These intricate sunlight-bacteria interactions have implications for our understanding of carbon fluxes in aquatic systems, yet further research is necessary before we can accurately evaluate or predict the consequences of increasing surface UVR levels associated with global change. PMID:23734148

  12. Tips and tricks for high quality MAR-FISH preparations: focus on bacterioplankton analysis.

    PubMed

    Alonso, Cecilia

    2012-12-01

    The combination of microautoradiography and fluorescence in situ hybridization (MAR-FISH) is a powerful technique for tracking the incorporation of radiolabelled compounds by specific bacterial populations at a single cell resolution. It has been widely applied in aquatic microbial ecology as a tool to unveil key ecophysiological features, shedding light on relevant ecological issues such as bacterial biomass production, the role of different bacterioplankton groups in the global carbon and sulphur cycle, and, at the same time, providing insights into the life styles and niche differentiation of cosmopolitan members of the aquatic microbial communities. Despite its great potential, its application has remained restricted to a few laboratories around the world, in part due to its reputation as a "difficult technique". Therefore, the objective of this minireview is to highlight the impact of MAR-FISH application on aquatic microbial ecology, and also to provide basic concepts, as well as practical tips, for processing MAR-FISH preparations, thus aiming to contribute to a more widespread application of this powerful method.

  13. Jellyfish-associated bacterial communities and bacterioplankton in Indonesian Marine lakes.

    PubMed

    Cleary, Daniel F R; Becking, Leontine E; Polónia, Ana R M; Freitas, Rossana M; Gomes, Newton C M

    2016-05-01

    In the present study, we compared communities of bacteria in two jellyfish species (the 'golden' jellyfish Mastigias cf. papua and the box jellyfish Tripedalia cf. cystophora) and water in three marine lakes located in the Berau region of northeastern Borneo, Indonesia. Jellyfish-associated bacterial communities were compositionally distinct and less diverse than bacterioplankton communities. Alphaproteobacteria, Gammaproteobacteria, Synechococcophycidae and Flavobacteriia were the most abundant classes in water. Jellyfish-associated bacterial communities were dominated by OTUs assigned to the Gammaproteobacteria (family Endozoicimonaceae), Mollicutes, Spirochaetes and Alphaproteobacteria (orders Kiloniellales and Rhodobacterales). Mollicutes were mainly restricted to Mastigias whereas Spirochaetes and the order Kiloniellales were most abundant in Tripedalia hosts. The most abundant OTU overall in jellyfish hosts was assigned to the family Endozoicimonaceae and was highly similar to organisms in GenBank obtained from various hosts including an octocoral, bivalve and fish species. Other abundant OTUs included an OTU assigned to the order Entomoplasmatales and mainly found in Mastigias hosts and OTUs assigned to the Spirochaetes and order Kiloniellales and mainly found in Tripedalia hosts. The low sequence similarity of the Entomoplasmatales OTU to sequences in GenBank suggests that it may be a novel lineage inhabiting Mastigias and possibly restricted to marine lakes.

  14. Macrophyte Species Drive the Variation of Bacterioplankton Community Composition in a Shallow Freshwater Lake

    PubMed Central

    Zeng, Jin; Bian, Yuanqi; Xing, Peng

    2012-01-01

    Macrophytes play an important role in structuring aquatic ecosystems. In this study, we explored whether macrophyte species are involved in determining the bacterioplankton community composition (BCC) in shallow freshwater lakes. The BCC in field areas dominated by different macrophyte species in Taihu Lake, a large, shallow freshwater lake, was investigated over a 1-year period. Subsequently, microcosm experiments were conducted to determine if single species of different types of macrophytes in an isolated environment would alter the BCC. Denaturing gradient gel electrophoresis (DGGE), followed by cloning and sequence analysis of selected samples, was employed to analyze the BCC. The DGGE results of the field investigations indicated that the BCC changed significantly from season to season and that the presence of different macrophyte species resulted in lower BCC similarities in the summer and fall. LIBSHUFF analysis of selected clone libraries from the summer demonstrated different BCCs in the water column surrounding different macrophytes. Relative to the field observations, the microcosm studies indicated that the BCC differed more pronouncedly when associated with different species of macrophytes, which was also supported by LIBSHUFF analysis of the selected clone libraries. Overall, this study suggested that macrophyte species might be an important factor in determining the composition of bacterial communities in this shallow freshwater lake and that the species-specific influence of macrophytes on BCC is variable with the season and distance. PMID:22038598

  15. Phylogenetic conservation of freshwater lake habitat preference varies between abundant bacterioplankton phyla.

    PubMed

    Schmidt, Marian L; White, Jeffrey D; Denef, Vincent J

    2016-04-01

    Despite their homogeneous appearance, aquatic systems harbour heterogeneous habitats resulting from nutrient gradients, suspended particulate matter and stratification. Recent reports suggest phylogenetically conserved habitat preferences among bacterioplankton, particularly for particle-associated (PA) and free-living (FL) habitats. Here, we show that independent of lake nutrient level and layer, PA and FL abundance-weighted bacterial community composition (BCC) differed and that inter-lake BCC varied more for PA than for FL fractions. In low-nutrient lakes, BCC differences between PA and FL fractions were larger than those between lake layers. The reverse was true for high-nutrient lakes. Nutrient level affected BCC more in hypolimnia than in epilimnia, likely due to hypolimnetic hypoxia in high-nutrient lakes. In line with previous reports, we observed within-phylum operational taxonomic unit (OTU) habitat preference conservation, although not for all phyla, including the phylum with the highest average relative abundance across all habitats (Bacteroidetes). Consistent phylum-level habitat preferences may indicate that the functional traits that underpin ecological adaptation of freshwater bacteria to lake habitats can be phylogenetically conserved, although the levels of conservation are phylum dependent. Resolving taxa preferences for freshwater habitats sets the stage for identification of traits that underpin habitat specialization and associated functional traits that influence differences in biogeochemical cycling across freshwater lake habitats.

  16. Insights into bacterioplankton community structure from Sundarbans mangrove ecoregion using Sanger and Illumina MiSeq sequencing approaches: A comparative analysis.

    PubMed

    Ghosh, Anwesha; Bhadury, Punyasloke

    2017-03-01

    Next generation sequencing using platforms such as Illumina MiSeq provides a deeper insight into the structure and function of bacterioplankton communities in coastal ecosystems compared to traditional molecular techniques such as clone library approach which incorporates Sanger sequencing. In this study, structure of bacterioplankton communities was investigated from two stations of Sundarbans mangrove ecoregion using both Sanger and Illumina MiSeq sequencing approaches. The Illumina MiSeq data is available under the BioProject ID PRJNA35180 and Sanger sequencing data under accession numbers KX014101-KX014140 (Stn1) and KX014372-KX014410 (Stn3). Proteobacteria-, Firmicutes- and Bacteroidetes-like sequences retrieved from both approaches appeared to be abundant in the studied ecosystem. The Illumina MiSeq data (2.1 GB) provided a deeper insight into the structure of bacterioplankton communities and revealed the presence of bacterial phyla such as Actinobacteria, Cyanobacteria, Tenericutes, Verrucomicrobia which were not recovered based on Sanger sequencing. A comparative analysis of bacterioplankton communities from both stations highlighted the presence of genera that appear in both stations and genera that occur exclusively in either station. However, both the Sanger sequencing and Illumina MiSeq data were coherent at broader taxonomic levels. Pseudomonas, Devosia, Hyphomonas and Erythrobacter-like sequences were the abundant bacterial genera found in the studied ecosystem. Both the sequencing methods showed broad coherence although as expected the Illumina MiSeq data helped identify rarer bacterioplankton groups and also showed the presence of unassigned OTUs indicating possible presence of novel bacterioplankton from the studied mangrove ecosystem.

  17. Anisakis simplex complex: ecological significance of recombinant genotypes in an allopatric area of the Adriatic Sea inferred by genome-derived simple sequence repeats.

    PubMed

    Mladineo, Ivona; Trumbić, Željka; Radonić, Ivana; Vrbatović, Anamarija; Hrabar, Jerko; Bušelić, Ivana

    2017-03-01

    The genus Anisakis includes nine species which, due to close morphological resemblance even in the adult stage, have previously caused many issues in their correct identification. Recently observed interspecific hybridisation in sympatric areas of two closely related species, Anisakis simplex sensu stricto (s.s.) and Anisakis pegreffii, has raised concerns whether a F1 hybrid generation is capable of overriding the breeding barrier, potentially giving rise to more resistant/pathogenic strains infecting humans. To assess the ecological significance of anisakid genotypes in the Adriatic Sea, an allopatric area for the two above-mentioned species, we analysed data from PCR-RFLP genotyping of the ITS region and the sequence of the cytochrome oxidase 2 (cox2) mtDNA locus to discern the parental genotype and maternal haplotype of the individuals. Furthermore, using in silico genome-wide screening of the A. simplex database for polymorphic simple sequence repeats or microsatellites in non-coding regions, we randomly selected potentially informative loci that were tested and optimised for multiplex PCR. The first panel of microsatellites developed for Anisakis was shown to be highly polymorphic, sensitive and amplified in both A. simplex s.s. and A. pegreffii. It was used to inspect genetic differentiation of individuals showing mito-nuclear mosaicism which is characteristic for both species. The observed low level of intergroup heterozygosity suggests that existing mosaicism is likely a retention of an ancestral polymorphism rather than a recent recombination event. This is also supported by allopatry of pure A. simplex s.s. and A. pegreffii in the geographical area under study.

  18. Organic substrate quality as the link between bacterioplankton carbon demand and growth efficiency in a temperate salt-marsh estuary.

    PubMed

    Apple, Jude K; del Giorgio, P A

    2007-12-01

    Bacterioplankton communities play a key role in aquatic carbon cycling, specifically with respect to the magnitude of organic carbon processed and partitioning of this carbon into biomass and respiratory losses. Studies of bacterioplankton carbon demand (BCD) and growth efficiency (BGE) frequently report higher values in more productive systems, suggesting these aspects of carbon metabolism may be positively coupled. However, the existence of such a relationship in natural aquatic systems has yet to be identified. Using a comprehensive 2-year study of bacterioplankton carbon metabolism in a temperate estuary, we investigated BCD and BGE and explored factors that may modulate their magnitude and coherence, including nutrient concentrations, dissolved nutrient uptake and source and quality of dissolved organic carbon (DOC). During the course of our study, BCD ranged from 0.4 to 15.9 microg l(-1) h(-1), with an overall mean of 3.8 microg l(-1) h(-1). Mean BGE was similar to that reported for other estuarine systems (0.32) and of comparable range (that is, 0.06-0.68). Initial analyses identified a negative correlation between BCD and BGE, yet removal of the effect of temperature revealed an underlying positive coupling that was also correlated with long-term DOC lability. Whereas BCD was weakly related to ambient DOC concentrations, neither BCD nor BGE showed any relationship with ambient nutrient concentrations or nutrient uptake stoichiometries. We conclude that in this carbon-rich estuary, organic matter source and quality play an important role in regulating the magnitude of carbon metabolism and may be more important than nutrient availability alone in the regulation of BGE.
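
    The two quantities in this record can be related with a short sketch. It assumes the conventional definitions, which the abstract does not spell out: carbon demand is production plus respiration (BCD = BP + BR) and growth efficiency is production over demand (BGE = BP / (BP + BR)). The function names are hypothetical, and values are in the abstract's units (microg l(-1) h(-1)).

```python
# Sketch only: assumes the conventional definitions BCD = BP + BR and
# BGE = BP / (BP + BR); the abstract itself does not state these formulas.

def carbon_demand(bp, br):
    """Bacterioplankton carbon demand: production plus respiration."""
    return bp + br


def growth_efficiency(bp, br):
    """Fraction of the carbon demand that ends up as biomass production."""
    total = bp + br
    if total <= 0:
        raise ValueError("production plus respiration must be positive")
    return bp / total


def split_demand(bcd, bge):
    """Recover (production, respiration) from a demand and an efficiency."""
    bp = bcd * bge
    return bp, bcd - bp


# Illustration with the reported means (BCD 3.8, BGE 0.32).
bp, br = split_demand(3.8, 0.32)
print(round(bp, 2), round(br, 2))
```

    Combining the two reported means this way is illustrative only, because a mean efficiency is not the same as the efficiency computed from mean rates.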

  19. Free-Living and Particle-Associated Bacterioplankton in Large Rivers of the Mississippi River Basin Demonstrate Biogeographic Patterns

    PubMed Central

    Millar, Justin J.; Payne, Jason T.; Ochs, Clifford A.

    2014-01-01

    The different drainage basins of large rivers such as the Mississippi River represent interesting systems in which to study patterns in freshwater microbial biogeography. Spatial variability in bacterioplankton communities in six major rivers (the Upper Mississippi, Missouri, Illinois, Ohio, Tennessee, and Arkansas) of the Mississippi River Basin was characterized using Ion Torrent 16S rRNA amplicon sequencing. When all systems were combined, particle-associated (>3 μm) bacterial assemblages were found to be different from free-living bacterioplankton in terms of overall community structure, partly because of differences in the proportional abundance of sequences affiliated with major bacterial lineages (Alphaproteobacteria, Cyanobacteria, and Planctomycetes). Both particle-associated and free-living communities ordinated by river system, a pattern that was apparent even after rare sequences or those affiliated with Cyanobacteria were removed from the analyses. Ordination of samples by river system correlated with environmental characteristics of each river, such as nutrient status and turbidity. Communities in the Upper Mississippi and the Missouri and in the Ohio and the Tennessee, pairs of rivers that join each other, contained similar taxa in terms of presence-absence data but differed in the proportional abundance of major lineages. The most common sequence types detected in particle-associated communities were picocyanobacteria in the Synechococcus/Prochlorococcus/Cyanobium (Syn/Pro) clade, while free-living communities also contained a high proportion of LD12 (SAR11/Pelagibacter)-like Alphaproteobacteria. This research shows that while different tributaries of large river systems such as the Mississippi River harbor distinct bacterioplankton communities, there is also microhabitat variation such as that between free-living and particle-associated assemblages. PMID:25217018

  20. Proteomic-Based Stable Isotope Probing Reveals Taxonomically Distinct Patterns in Amino Acid Assimilation by Coastal Marine Bacterioplankton

    SciTech Connect

    Bryson, Samuel; Li, Zhou; Pett-Ridge, Jennifer; Hettich, Robert L.; Mayali, Xavier; Pan, Chongle; Mueller, Ryan S.

    2016-04-26

    Heterotrophic marine bacterioplankton are a critical component of the carbon cycle, processing nearly a quarter of annual global primary production, yet defining how substrate utilization preferences and resource partitioning structure these microbial communities remains a challenge. In this study, we utilized proteomics-based stable isotope probing (proteomic SIP) to characterize the assimilation of amino acids by coastal marine bacterioplankton populations. We incubated microcosms of seawater collected from Newport, OR, and Monterey Bay, CA, with 1 M 13C-amino acids for 15 and 32 hours. Subsequent analysis of 13C incorporation into protein biomass quantified the frequency and extent of isotope enrichment for identified proteins. Using these metrics we tested whether amino acid assimilation patterns were different for specific bacterioplankton populations. Proteins associated with Rhodobacterales and Alteromonadales tended to have a significantly high number of tandem mass spectra from 13C-enriched peptides, while Flavobacteriales and SAR11 proteins generally had significantly low numbers of 13C-enriched spectra. Rhodobacterales proteins associated with amino acid transport and metabolism had an increased frequency of 13C-enriched spectra at time-point 2, while Alteromonadales ribosomal proteins were 13C-enriched across time-points. Overall, proteomic SIP facilitated quantitative comparisons of dissolved free amino acid assimilation by specific taxa, both between sympatric populations and between protein functional groups within discrete populations, allowing an unprecedented examination of population-level metabolic responses to resource acquisition in complex microbial communities.

  1. Free-Living and Particle-Associated Bacterioplankton in Large Rivers of the Mississippi River Basin Demonstrate Biogeographic Patterns.

    PubMed

    Jackson, Colin R; Millar, Justin J; Payne, Jason T; Ochs, Clifford A

    2014-12-01

    The different drainage basins of large rivers such as the Mississippi River represent interesting systems in which to study patterns in freshwater microbial biogeography. Spatial variability in bacterioplankton communities in six major rivers (the Upper Mississippi, Missouri, Illinois, Ohio, Tennessee, and Arkansas) of the Mississippi River Basin was characterized using Ion Torrent 16S rRNA amplicon sequencing. When all systems were combined, particle-associated (>3 μm) bacterial assemblages were found to be different from free-living bacterioplankton in terms of overall community structure, partly because of differences in the proportional abundance of sequences affiliated with major bacterial lineages (Alphaproteobacteria, Cyanobacteria, and Planctomycetes). Both particle-associated and free-living communities ordinated by river system, a pattern that was apparent even after rare sequences or those affiliated with Cyanobacteria were removed from the analyses. Ordination of samples by river system correlated with environmental characteristics of each river, such as nutrient status and turbidity. Communities in the Upper Mississippi and the Missouri and in the Ohio and the Tennessee, pairs of rivers that join each other, contained similar taxa in terms of presence-absence data but differed in the proportional abundance of major lineages. The most common sequence types detected in particle-associated communities were picocyanobacteria in the Synechococcus/Prochlorococcus/Cyanobium (Syn/Pro) clade, while free-living communities also contained a high proportion of LD12 (SAR11/Pelagibacter)-like Alphaproteobacteria. This research shows that while different tributaries of large river systems such as the Mississippi River harbor distinct bacterioplankton communities, there is also microhabitat variation such as that between free-living and particle-associated assemblages.

  2. Deep Learning for Population Genetic Inference

    PubMed Central

    Sheehan, Sara; Song, Yun S.

    2016-01-01

    Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme. PMID:27018908

  3. Deep Learning for Population Genetic Inference.

    PubMed

    Sheehan, Sara; Song, Yun S

    2016-03-01

    Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.

  4. Depth profiles of bacterioplankton assemblages and their activities in the Ross Sea

    NASA Astrophysics Data System (ADS)

    Celussi, Mauro; Cataletto, Bruno; Fonda Umani, Serena; Del Negro, Paola

    2009-12-01

    The identification of bacterial community structure has led, since the beginning of the 1990s, to the idea that bacterioplankton populations are stratified in the water column and that diverse lineages with mostly unknown phenotypes dominate marine microbial communities. The diversity of depth-related assemblages is also reflected in their patterns of activities, as bacteria affiliated to different groups can express different activities in a given ecosystem. We analysed bacterial assemblages (DGGE fingerprinting) and their activities (prokaryotic carbon production, protease, phosphatase, chitinase, beta-glucosidase and lipase activities) in two areas in the Ross Sea, differing mainly in their productivity regime: two stations are located in the Terra Nova Bay polynya area (highly productive during summer) and two close to Cape Adare (low phytoplankton biomass and activity). At every station a pronounced stratification of bacterial assemblages was identified, highlighting epipelagic communities differing substantially from the mesopelagic and the bathypelagic communities. Multivariate analysis suggested that pressure and indirectly light-affected variables (i.e. oxygen and fluorescence) had a great effect on the bacterial communities outcompeting the possible influences of temperature and dissolved organic carbon concentration. Generally activities decreased with depth even though a signal of the Circumpolar Deep Water (CDW) at one of the northern stations corresponded to an increase in some of the degradative activities, generating some 'hot spots' in the profile. We also found that similar assemblages express similar metabolic requirements reflected in analogous patterns of activity (similar degradative potential and leucine uptake rate). Furthermore, the presence of eukaryotic chloroplasts' 16S rDNA in deep samples highlighted how in some cases the dense surface-water formation (in this case High Salinity Shelf Water—HSSW) and downwelling can affect, at least

  5. Bacterioplankton Biogeography of the Atlantic Ocean: A Case Study of the Distance-Decay Relationship.

    PubMed

    Milici, Mathias; Tomasch, Jürgen; Wos-Oxley, Melissa L; Decelle, Johan; Jáuregui, Ruy; Wang, Hui; Deng, Zhi-Luo; Plumeier, Iris; Giebel, Helge-Ansgar; Badewien, Thomas H; Wurst, Mascha; Pieper, Dietmar H; Simon, Meinhard; Wagner-Döbler, Irene

    2016-01-01

    In order to determine the influence of geographical distance, depth, and Longhurstian province on bacterial community composition and compare it with the composition of photosynthetic micro-eukaryote communities, 382 samples from a depth-resolved latitudinal transect (51°S-47°N) from the epipelagic zone of the Atlantic ocean were analyzed by Illumina amplicon sequencing. In the upper 100 m of the ocean, community similarity decreased toward the equator for 6000 km, but subsequently increased again, reaching similarity values of 40-60% for samples that were separated by ~12,000 km, resulting in a U-shaped distance-decay curve. We conclude that adaptation to local conditions can override the linear distance-decay relationship in the upper epipelagial of the Atlantic Ocean which is apparently not restrained by barriers to dispersal, since the same taxa were shared between the most distant communities. The six Longhurstian provinces covered by the transect were comprised of distinct microbial communities; ~30% of variation in community composition could be explained by province. Bacterial communities belonging to the deeper layer of the epipelagic zone (140-200 m) lacked a distance-decay relationship altogether and showed little provincialism. Interestingly, those biogeographical patterns were consistently found for bacteria from three different size fractions of the plankton with different taxonomic composition, indicating conserved underlying mechanisms. Analysis of the chloroplast 16S rRNA gene sequences revealed that phytoplankton composition was strongly correlated with both free-living and particle associated bacterial community composition (R between 0.51 and 0.62, p < 0.002). The data show that biogeographical patterns commonly found in macroecology do not hold for marine bacterioplankton, most likely because dispersal and evolution occur at drastically different rates in bacteria.
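
    A distance-decay curve of the kind described in this record pairs each sample-to-sample distance with a community similarity. The sketch below is a minimal illustration that assumes Bray-Curtis similarity on abundance counts and a one-dimensional transect position in kilometres; the abstract names neither the similarity index nor the distance calculation, and the function names are hypothetical.

```python
# Sketch only: Bray-Curtis similarity and 1-D transect positions are
# assumptions made for illustration, not the study's stated method.
from itertools import combinations


def bray_curtis_similarity(a, b):
    """1 - sum|a_i - b_i| / sum(a_i + b_i) for two abundance vectors."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / total


def distance_decay_pairs(samples):
    """samples: list of (position_km, abundance_vector).

    Returns one (distance_km, similarity) tuple per pair of samples,
    which is the raw material for plotting a distance-decay curve.
    """
    pairs = []
    for (pos_a, ab_a), (pos_b, ab_b) in combinations(samples, 2):
        pairs.append((abs(pos_a - pos_b), bray_curtis_similarity(ab_a, ab_b)))
    return pairs


# Toy transect: the two end samples resemble each other more than the middle
# one, which gives a U-shaped relationship of the kind the abstract reports.
toy = [(0, [10, 0, 5]), (6000, [0, 10, 5]), (12000, [9, 1, 5])]
for distance, similarity in distance_decay_pairs(toy):
    print(distance, round(similarity, 2))
```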

  6. Virio- and Bacterioplankton Microscale Distributions at the Sediment-Water Interface

    PubMed Central

    Dann, Lisa M.; Mitchell, James G.; Speck, Peter G.; Newton, Kelly; Jeffries, Thomas; Paterson, James

    2014-01-01

    The marine sediment-water interface is an important location for microbially controlled nutrient and gas exchange processes. While microbial distributions on the sediment side of the interface are well established in many locations, the distributions of microbes on the water side of the interface are less well known. Here, we measured that distribution for marine virio- and bacterioplankton with a new two-dimensional technique. Our results revealed higher heterogeneity in sediment-water interface biomass distributions than previously reported with a greater than 45– and 2500-fold change cm−1 found within bacterial and viral subpopulations compared to previous maxima of 1.5- and 1.4-fold cm−1 in bacteria and viruses in the same environments. The 45-fold and 2500-fold changes were due to patches of elevated and patches of reduced viral and bacterial abundance. The bacterial and viral hotspots were found over single and multiple sample points and the two groups often coincided whilst the coldspots only occurred over single sample points and the bacterial and viral abundances showed no correlation. The total mean abundances of viruses strongly correlated with bacteria (r = 0.90, p<0.0001, n = 12) for all three microplates (n = 1350). Spatial autocorrelation analysis via Moran’s I and Geary’s C revealed non-random distributions in bacterial subpopulations and random distributions in viral subpopulations. The variable distributions of viral and bacterial abundance over centimetre-scale distances suggest that competition and the likelihood of viral infection are higher in the small volumes important for individual cell encounters than bulk measurements indicate. We conclude that large scale measurements are not an accurate measurement of the conditions under which microbial dynamics exist. The high variability we report indicates that few microbes experience the ‘average’ concentrations that are frequently measured. PMID:25057797
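
    The spatial autocorrelation statistic named in this record, Moran's I, can be sketched for a small two-dimensional grid of abundances. This is a minimal illustration that assumes binary rook adjacency (each cell is a neighbour of the cells directly above, below, left and right); the abstract does not state which weighting scheme was used, and the function name is hypothetical.

```python
# Sketch only: rook (4-neighbour) binary weights are an assumption; the
# abstract does not say how neighbours were defined.

def morans_i(grid):
    """Moran's I for a rectangular grid of values with rook adjacency.

    I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
    where m is the mean and W is the sum of all weights w_ij.
    """
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(v for row in grid for v in row) / n
    dev = [[v - mean for v in row] for row in grid]
    denom = sum(d * d for row in dev for d in row)
    if denom == 0:
        raise ValueError("all values are identical; Moran's I is undefined")

    cross = 0.0        # sum of w_ij * dev_i * dev_j over ordered neighbour pairs
    weight_total = 0   # W, the number of ordered neighbour pairs
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_total += 1
    if weight_total == 0:
        raise ValueError("grid has no neighbouring cells")
    return (n / weight_total) * (cross / denom)


clustered = [[1, 1, 0, 0],
             [1, 1, 0, 0]]     # like values sit together: I > 0
checkerboard = [[1, 0],
                [0, 1]]        # like values avoid each other: I < 0
print(morans_i(clustered), morans_i(checkerboard))
```

    Values near zero are what a spatially random arrangement would be expected to give, positive values indicate patches of similar abundance, and negative values indicate alternation.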

  7. Bacterioplankton Biogeography of the Atlantic Ocean: A Case Study of the Distance-Decay Relationship

    PubMed Central

    Milici, Mathias; Tomasch, Jürgen; Wos-Oxley, Melissa L.; Decelle, Johan; Jáuregui, Ruy; Wang, Hui; Deng, Zhi-Luo; Plumeier, Iris; Giebel, Helge-Ansgar; Badewien, Thomas H.; Wurst, Mascha; Pieper, Dietmar H.; Simon, Meinhard; Wagner-Döbler, Irene

    2016-01-01

    In order to determine the influence of geographical distance, depth, and Longhurstian province on bacterial community composition and compare it with the composition of photosynthetic micro-eukaryote communities, 382 samples from a depth-resolved latitudinal transect (51°S–47°N) from the epipelagic zone of the Atlantic ocean were analyzed by Illumina amplicon sequencing. In the upper 100 m of the ocean, community similarity decreased toward the equator for 6000 km, but subsequently increased again, reaching similarity values of 40–60% for samples that were separated by ~12,000 km, resulting in a U-shaped distance-decay curve. We conclude that adaptation to local conditions can override the linear distance-decay relationship in the upper epipelagial of the Atlantic Ocean which is apparently not restrained by barriers to dispersal, since the same taxa were shared between the most distant communities. The six Longhurstian provinces covered by the transect were comprised of distinct microbial communities; ~30% of variation in community composition could be explained by province. Bacterial communities belonging to the deeper layer of the epipelagic zone (140–200 m) lacked a distance-decay relationship altogether and showed little provincialism. Interestingly, those biogeographical patterns were consistently found for bacteria from three different size fractions of the plankton with different taxonomic composition, indicating conserved underlying mechanisms. Analysis of the chloroplast 16S rRNA gene sequences revealed that phytoplankton composition was strongly correlated with both free-living and particle associated bacterial community composition (R between 0.51 and 0.62, p < 0.002). The data show that biogeographical patterns commonly found in macroecology do not hold for marine bacterioplankton, most likely because dispersal and evolution occur at drastically different rates in bacteria. PMID:27199923

  8. Bacterioplankton features and its relations with DOC characteristics and other limnological variables in Paraná River floodplain environments (PR/MS-Brazil)

    PubMed Central

    Teixeira, Mariana Carolina; Santana, Natália Fernanda; de Azevedo, Júlio César Rodrigues; Pagioro, Thomaz Aurélio

    2011-01-01

    Since the introduction of the Microbial Loop concept, many studies have aimed to explain the role of bacterioplankton and dissolved organic carbon (DOC) in aquatic ecosystems. The Paraná River floodplain system is a very complex environment where these subjects have been little explored. The aim of this work was to characterize the bacterial community in terms of density, biomass and biovolume in some water bodies of this floodplain and to verify its temporal variation and its relation with some limnological variables, including some indicators of DOC quality, obtained through ultraviolet-visible (UV-VIS) and fluorescence spectroscopic analysis. Bacterial density, biomass and biovolume were similar to those from other freshwater environments, and both density and biomass were higher in the period with less rain. The limnological and spectroscopic features that showed a relation with bacterioplankton were the concentrations of N-NH4 and P-PO4, water transparency, and some indicators of DOC quality and origin. The analysis of these relations indicated a possible competition between bacterioplankton and phytoplankton for inorganic nutrients and that the DOC used by bacterioplankton is labile and probably derived from aquatic macrophytes. PMID:24031705

  9. Attached and Free-Floating Bacterioplankton in Howe Sound, British Columbia, a Coastal Marine Fjord-Embayment

    PubMed Central

    Albright, L. J.; McCrae, S. K.; May, B. E.

    1986-01-01

    Factors which influence the attachment of bacterioplankton to particles (including phytoplankton) were investigated by using (i) water samples removed from a coastal temperate fjord over an annual cycle and (ii) unialgal cultures of Prorocentrum minimum, Dunaliella tertiolecta, and Skeletonema costatum. Silt and salinity levels in this fjord seawater did not appear to influence bacterial attachment, but the percent attached bacteria was inversely related to both chlorophyll a concentrations and primary productivities. During periods of high primary productivities the percent attached bacteria was low, whereas during periods of low, increasing, and declining primary productivities the percent attached bacteria was high. A similar pattern of bacterial attachment was observed when the three phytoplankton were grown as batch cultures. The percent attached bacterial numbers increased upon the initiation of algal growth and after these cells stopped growing, but not while the algae were growing. We suggest that a major factor influencing the attachment of bacterioplankton is the physiological condition of their major nutrient source, the phytoplankton; mainly free-living bacteria are associated with growing phytoplankton, whereas a much greater proportion of the bacteria are attached among senescent phytoplankton populations. PMID:16347023

  10. Bacterioplankton communities of Crater Lake, OR: Dynamic changes with euphotic zone food web structure and stable deep water populations

    USGS Publications Warehouse

    Urbach, E.; Vergin, K.L.; Larson, G.L.; Giovannoni, S.J.

    2007-01-01

    The distribution of bacterial and archaeal species in Crater Lake plankton varies dramatically over depth and with time, as assessed by hybridization of group-specific oligonucleotides to RNA extracted from lakewater. Nonmetric, multidimensional scaling (MDS) analysis of relative bacterial phylotype densities revealed complex relationships among assemblages sampled from depth profiles in July, August and September of 1997 through 1999. CL500-11 green nonsulfur bacteria (Phylum Chloroflexi) and marine Group I crenarchaeota are consistently dominant groups in the oxygenated deep waters at 300 and 500 m. Other phylotypes found in the deep waters are similar to surface and mid-depth populations and vary with time. Euphotic zone assemblages are dominated either by β-proteobacteria or CL120-10 verrucomicrobia, and ACK4 actinomycetes. MDS analyses of euphotic zone populations in relation to environmental variables and phytoplankton and zooplankton population structures reveal apparent links between Daphnia pulicaria zooplankton population densities and microbial community structure. These patterns may reflect food web interactions that link kokanee salmon population densities to community structure of the bacterioplankton, via fish predation on Daphnia with cascading consequences to Daphnia bacterivory and predation on bacterivorous protists. These results demonstrate a stable bottom-water microbial community. They also extend previous observations of food web-driven changes in euphotic zone bacterioplankton community structure to an oligotrophic setting. © 2007 Springer Science+Business Media B.V.

  11. Consequences of contaminant mixture on the dynamics and functional diversity of bacterioplankton in a southwestern Mediterranean coastal ecosystem.

    PubMed

    Pringault, Olivier; Lafabrie, Céline; Avezac, Murielle; Bancon-Montigny, Chrystelle; Carre, Claire; Chalghaf, Mohamed; Delpoux, Sophie; Duvivier, Adrien; Elbaz-Poulichet, Françoise; Gonzalez, Catherine; Got, Patrice; Leboulanger, Christophe; Spinelli, Sylvie; Hlaili, Asma Sakka; Bouvy, Marc

    2016-02-01

    Contamination of coastal environments is often due to a complex mixture of pollutants, sometimes in trace levels, that may have significant effects on diversity and function of organisms. The aim of this study was to evaluate the short-term dynamics of bacterioplankton exposed to natural and artificial mixtures of contaminants. Bacterial communities from a southwestern Mediterranean ecosystem, the lagoon and the bay (offshore) of Bizerte, were exposed to i) elutriate from resuspension of contaminated sediment, and ii) an artificial mixture of metals and herbicides mimicking the contamination observed during sediment resuspension. Elutriate incubation as well as artificial spiking induced strong enrichments in nutrients (up to 18 times), metals (up to six times) and herbicides (up to 20 times) relative to the in situ concentrations in the offshore station, whereas the increases in contaminants were less marked in the lagoon station. In the offshore waters, the artificial mixture of pollutants provoked a strong inhibition of bacterial abundance, production and respiration and significant modifications of the potential functional diversity of bacterioplankton with a strong decrease of the carbohydrate utilization. In contrast, incubation with elutriate resulted in a stimulation of bacterial activities and abundances, suggesting that the toxic effects of pollutants were modified by the increase in nutrient and DOM concentrations due to the sediment resuspension. The effects of elutriate and the artificial mixture of pollutants on bacterial dynamics and the functional diversity were less marked in the lagoon waters than in offshore waters, suggesting a relative tolerance of lagoon bacteria against contaminants.

  12. Contribution of chemical water properties to the differential responses of bacterioneuston and bacterioplankton to ultraviolet-B radiation.

    PubMed

    Santos, Ana L; Baptista, Inês; Gomes, Newton C M; Henriques, Isabel; Almeida, Adelaide; Correia, António; Cunha, Angela

    2014-02-01

    The surface microlayer (SML) is characterized by different physicochemical properties from underlying waters (UW). However, whether these differences in abiotic factors underlie the distinct sensitivity of bacterioneuston (i.e. SML bacteria) and bacterioplankton to environmental stressors remains to be addressed. We investigated the contribution of abiotic factors to the UV-B sensitivity of bacterioneuston and bacterioplankton. Nutrients (especially nitrogen and phosphate) emerged as important determinants of bacterial UV-B sensitivity. The role of particles, nutrients, and dissolved organic components on bacterial UV-B sensitivity was further evaluated using dilution cultures. Filtered samples were twofold more UV sensitive than unfiltered samples, suggesting a UV-protective effect of particles. High nutrient concentrations attenuated bacterial UV-B sensitivity (up to 40%), compared with unamended conditions, by influencing bacterial physiology and/or community composition. Suspending cells in natural water, particularly from the SML, also attenuated UV-B sensitivity (up to 23%), compared with suspension in an artificial mineral solution. Bioassays using Pseudomonas sp. strain NT5I1.2B revealed that chemical water properties influence UV-induced oxidative damage. UV-B sensitivity was associated with high cell-specific activities. The chemical environment of the SML and UW influences UV-B effects on the corresponding bacterial communities. Maintaining low cell activities might be advantageous in stressful environments, like the SML.

  13. Virio- and bacterioplankton in the estuary zone of the Ob River and adjacent regions of the Kara Sea shelf

    NASA Astrophysics Data System (ADS)

    Kopylov, A. I.; Sazhin, A. F.; Zabotkina, E. A.; Romanenko, A. V.; Romanova, N. D.

    2017-01-01

    The distribution of structural and functional characteristics of virioplankton in the north of the Ob River estuary and the adjacent Kara Sea shelf (between latitudes 71°44'44″ N and 73°45'24″ N) was studied with consideration of the spatial variations in the number (N_B) and productivity (P_B) of bacteria and water properties (temperature, salinity, density) by analyzing samples taken in September 2013. The number of plankton viruses (N_V), the occurrence of visibly infected bacterial cells, virus-induced mortality of bacteria, and virioplankton production in the studied region varied within (214-2917) × 10^3 particles/mL, 0.3-5.6% of N_B, 2.2-64.4% of P_B, and (6-17248) × 10^3 particles/(mL day), respectively. These parameters were the highest in water layers with a temperature of +7.3-7.5°C, salinity of 3.75-5.41 psu, and conventional density (στ) of 2.846-4.144. The number of bacterioplankton was (614-822) × 10^3 cells/mL, and the N_V/N_B ratio was 1.1-4.5. A large number of virus particles were attached to bacterial cells and suspended matter. The data testify to the considerable role of viruses in controlling the number and production of heterotrophic bacterioplankton in the interaction zone of river and sea waters.

  14. Molecular analyses of the diversity in marine bacterioplankton assemblages along the coastline of the northeastern Gulf of Mexico.

    PubMed

    Olapade, Ola A

    2010-10-01

    Bacterial community diversity in marine bacterioplankton assemblages was examined in 3 coastal locations along the northeastern Gulf of Mexico (GOM) using 16S rRNA gene libraries and fluorescence in situ hybridization approaches. The majority of the sequences (30%-60%) were similar to the 16S rRNA gene sequences of unknown bacteria; however, the operational taxonomic units from members of the Cyanobacteria, Proteobacteria, and Bacteroidetes were also present at the 3 GOM sites. Overall, sequence diversity was more similar between the Gulf sites of Carrabelle and Ochlockonee than between either of the Gulf sites and Apalachicola Bay. Fluorescence in situ hybridization analyses revealed the quantitative predominance of members of the Alphaproteobacteria subclass and the Cytophaga-Flavobacterium cluster within the bacterioplankton assemblages. In general, the study further reveals the presence of many bacterial taxa that have been previously found to be dominant in coastal marine environments. Differences observed in the representation of the various bacterial phylogenetic groups among the GOM coastal sites could be partly attributed to dynamic variations in several site-specific conditions, including intermittent tidal events, nutrient availability, and anthropogenic influences.

  15. Response of bacterioplankton community structure to an artificial gradient of pCO2 in the Arctic Ocean

    NASA Astrophysics Data System (ADS)

    Zhang, R.; Xia, X.; Lau, S. C. K.; Motegi, C.; Weinbauer, M. G.; Jiao, N.

    2012-08-01

    The influences of ocean acidification on bacterial diversity were investigated using DNA fingerprinting and clone library analysis of bacterioplankton samples collected from the largest CO2 manipulation mesocosm study that had been performed thus far. Terminal restriction fragment length polymorphism analysis of the PCR amplicons of the 16S rRNA genes revealed that bacterial diversity, species richness and community structure varied with the time of incubation but not the degree of ocean acidification. The phylogenetic composition of the major bacterial assemblage after a 30-day incubation under various pCO2 concentrations did not show clear effects of pCO2 levels. However, the maximum apparent diversity and species richness which occurred during incubation differed between the high and low pCO2 treatments, which harbored different bacterial community structures. In addition, total alkalinity was one of the contributing factors for the temporal variations in bacterial community structure observed during incubation. A negative relationship between the relative abundance of Bacteroidetes and pCO2 levels was observed for samples at the end of the experiment. Our study suggested that ocean acidification affected the development of bacterial assemblages and potentially impacts the ecological function of the bacterioplankton in the marine ecosystem.

  16. Seasonality in molecular and cytometric diversity of marine bacterioplankton: the re-shuffling of bacterial taxa by vertical mixing.

    PubMed

    García, Francisca C; Alonso-Sáez, Laura; Morán, Xosé Anxelu G; López-Urrutia, Ángel

    2015-10-01

    The 'cytometric diversity' of phytoplankton communities has been studied based on single-cell properties, but the applicability of this method to characterize bacterioplankton has been unexplored. Here, we analysed seasonal changes in cytometric diversity of marine bacterioplankton along a decadal time-series at three coastal stations in the Southern Bay of Biscay. Shannon-Weaver diversity estimates and Bray-Curtis similarities obtained by cytometric and molecular (16S rRNA tag sequencing) methods were significantly correlated in samples from a 3.5 year monthly time-series. Both methods showed a consistent cyclical pattern in the diversity of surface bacterial communities with maximal values in winter. The analysis of the highly resolved flow cytometry time-series across the vertical profile showed that water column mixing was a key factor explaining the seasonal changes in bacterial composition and the winter increase in bacterial diversity in coastal surface waters. Due to its low cost and short processing time as compared with genetic methods, the cytometric diversity approach represents a useful complementary tool in the macroecology of aquatic microbes.
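
    The two quantities compared in the abstract above, a Shannon-Weaver diversity estimate and a Bray-Curtis similarity, can be sketched in a few lines. This is a minimal illustration only: the function names and the toy count vectors are assumptions for the example, not the authors' pipeline, and the counts could stand for either cytometric bins or 16S tag abundances.

```python
import math


def shannon_index(counts):
    """Shannon-Weaver index H' = -sum(p_i * ln p_i) over non-empty categories."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity 2 * sum(min(a_i, b_i)) / (sum(a) + sum(b))."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 2 * shared / (sum(a) + sum(b))


# Hypothetical count vectors (illustration values, not data from the study)
winter = [30, 25, 25, 20]
summer = [70, 20, 10, 0]

h_winter = shannon_index(winter)
h_summer = shannon_index(summer)
similarity = bray_curtis_similarity(winter, summer)
```

    With these toy vectors the more even "winter" sample has the higher index, which is the direction of the seasonal pattern the abstract reports; the numbers themselves carry no meaning.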

  17. Near-Bottom Hypoxia Impacts Dynamics of Bacterioplankton Assemblage throughout Water Column of the Gulf of Finland (Baltic Sea)

    PubMed Central

    Laas, Peeter; Šatova, Elina; Lips, Inga; Lips, Urmas; Simm, Jaak; Kisand, Veljo; Metsis, Madis

    2016-01-01

    Over the past century the spread of hypoxia in the Baltic Sea has been drastic, reaching its ‘arm’ into the easternmost sub-basin, the Gulf of Finland. The hydrographic and climatological properties of the gulf offer a broad suite of discrete niches for microbial communities. The current study explores spatiotemporal dynamics of the bacterioplankton community in the Gulf of Finland using massively parallel sequencing of 16S rRNA fragments obtained by amplifying community DNA from the spring to autumn period. The presence of a redoxcline and drastic seasonal changes make spatiotemporal dynamics of bacterioplankton community composition (BCC) and abundances in such an estuary remarkably complex. To the best of our knowledge, this is the first study that analyses spatiotemporal dynamics of BCC in relation to phytoplankton bloom throughout the water column (and redoxcline), not only at the surface layer. We conclude that the capability to survive (or benefit from) shifts between oxic and hypoxic conditions is a vital adaptation for bacteria to thrive in such environments. Our results contribute to the understanding of emerging patterns in BCCs that occupy hydrographically similar estuaries dispersed all over the world, and we suggest the presence of a global redox- and salinity-driven metacommunity. These results have important implications for understanding long-term ecological and biogeochemical impacts of hypoxia expansion in the Baltic Sea (and similar ecosystems), as well as global biogeography of bacteria specialized in inhabiting similar ecosystems. PMID:27213812

  18. Marine bacterioplankton can increase evaporation and gas transfer by metabolizing insoluble surfactants from the air-seawater interface.

    PubMed

    Salter, Ian; Zubkov, Mikhail V; Warwick, Phil E; Burkill, Peter H

    2009-05-01

    Hydrophobic surfactants at the air-sea interface can retard evaporative and gaseous exchange between the atmosphere and the ocean. While numerous studies have examined the metabolic role of bacterioneuston at the air-sea interface, the interactions between hydrophobic surfactants and bacterioplankton are not well constrained. A novel experimental design was developed, using Vibrio natriegens and ³H-labelled hexadecanoic acid tracer, to determine how the bacterial metabolism of fatty acids affects evaporative fluxes. In abiotic systems, >92% of the added hexadecanoic acid remained at the air-water interface. In contrast, the presence of V. natriegens cells draws down insoluble hexadecanoic acid from the air-water interface as an exponential function of time. The exponents characterizing the removal of hexadecanoic acid from the interface co-vary with the concentration of V. natriegens cells in the underlying water, with the largest exponent corresponding to the highest cell abundance. Radiochemical budgets show that evaporative fluxes from the system are linearly proportional to the quantity of hexadecanoic acid at the interface. Thus, bacterioplankton could influence the rate of evaporation and gas transfer in the ocean through the metabolism of otherwise insoluble surfactants.
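
    The exponential drawdown described above has the form A(t) = A0 * exp(-k * t), where the exponent k is the quantity said to co-vary with cell concentration. One common way to estimate k from a time series is a least-squares line through ln A(t) against t. The sketch below assumes that approach; the abstract does not state how the exponents were actually fitted, and the function name and data points are hypothetical.

```python
import math


def fit_exponential_decay(times, amounts):
    """Fit amounts ~ a0 * exp(-k * t) by ordinary least squares on ln(amounts).

    Returns (a0, k). All amounts must be positive.
    """
    logs = [math.log(a) for a in amounts]
    n = len(times)
    t_mean = sum(times) / n
    log_mean = sum(logs) / n
    # Slope of the regression line ln(amount) = intercept + slope * t
    slope = sum((t - t_mean) * (y - log_mean) for t, y in zip(times, logs)) / sum(
        (t - t_mean) ** 2 for t in times
    )
    intercept = log_mean - slope * t_mean
    return math.exp(intercept), -slope


# Hypothetical, noise-free illustration values (not data from the study)
times = [0.0, 1.0, 2.0, 4.0, 8.0]
amounts = [100.0 * math.exp(-0.5 * t) for t in times]

a0, k = fit_exponential_decay(times, amounts)
```

    Repeating the fit for incubations at different cell concentrations would give one k per concentration, which is the comparison the abstract summarizes.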

  19. In vitro study of possible microbial indicators for drowning: Salinity and types of bacterioplankton proliferating in blood.

    PubMed

    Kakizaki, Eiji; Kozawa, Shuji; Matsuda, Hirokazu; Muraoka, Eri; Uchiyama, Taketo; Sakai, Masahiro; Yukawa, Nobuhiro

    2011-01-30

    Numbers and types of bacterioplankton proliferating in blood samples mixed with water of various salinity levels were examined to determine the characteristics of species associated with salinity. Water samples (total n=88) were collected from the midstream of two rivers (freshwater; n=10; salinity <0.05%), from around their estuaries (areas of freshwater, n=20, salinity <0.05%; areas of brackish water, n=20, salinity <0.05-3.1%; areas of marine water beyond the mouths of the rivers, n=28, salinity 2.4-3.3%), and from the coast (areas of marine water; n=10; salinity 3.3-3.5%). Freshwater bacteria were identified in 41 of 42 blood samples mixed with water at ≤1.3% salinity, and the genus Aeromonas, which is universally distributed in freshwater environments, was predominant. Marine bacteria were identified in all of 46 blood samples mixed with water at ≥1.8% salinity, and most comprised the genera Vibrio and Photobacterium, which are universally distributed in seawater environments. Aeromonas was undetectable in all blood samples mixed with brackish or sea water at ≥1.8% salinity, although it is detectable even in seawater environments. Thus, the present results showed that the bacterioplankton capable of proliferating in human blood reflect the salinity of the water.

  20. Community differentiation and population enrichment of Sargasso Sea bacterioplankton in the euphotic zone of a mesoscale mode-water eddy.

    PubMed

    Nelson, Craig E; Carlson, Craig A; Ewart, Courtney S; Halewood, Elisa R

    2014-03-01

    Eddies are mesoscale oceanographic features (∼ 200 km diameter) that can cause transient blooms of phytoplankton by shifting density isoclines in relation to light and nutrient resources. To better understand how bacterioplankton respond to eddies, we examined depth-resolved distributions of bacterial populations across an anticyclonic mode-water eddy in the Sargasso Sea. Previous work on this eddy has documented elevated phytoplankton productivity and diatom abundance within the eddy centre with coincident bacterial productivity and biomass maxima. We illustrate bacterial community shifts within the eddy centre, differentiating populations uplifted along isopycnals from those enriched or depleted at horizons of enhanced bacterial and primary productivity. Phylotypes belonging to the Roseobacter, OCS116 and marine Actinobacteria clades were enriched in the eddy core and were highly correlated with pigment-based indicators of diatom abundance, supporting developing hypotheses that members of these clades associate with phytoplankton blooms. Typical mesopelagic clades (SAR202, SAR324, SAR406 and SAR11 IIb) were uplifted within the eddy centre, increasing bacterial diversity in the lower euphotic zone. Typical surface oligotrophic clades (SAR116, OM75, Prochlorococcus and SAR11 Ia) were relatively depleted in the eddy centre. The biogeochemical context of a bloom-inducing eddy provides insight into the ecology of the diverse uncultured bacterioplankton dominating the oligotrophic oceans.

  1. Inferring ethnicity from mitochondrial DNA sequence

    PubMed Central

    2011-01-01

    Background: The assignment of DNA samples to coarse population groups can be a useful but difficult task. One such example is the inference of coarse ethnic groupings for forensic applications. Ethnicity plays an important role in forensic investigation and can be inferred with the help of genetic markers. Being maternally inherited, present in high copy number, and robustly persistent in degraded samples, mitochondrial DNA may be useful for inferring coarse ethnicity. In this study, we compare the performance of methods for inferring ethnicity from the sequence of the hypervariable region of the mitochondrial genome. Results: We present the results of comprehensive experiments conducted on datasets extracted from the mtDNA population database, showing that ethnicity inference based on support vector machines (SVM) achieves an overall accuracy of 80-90%, consistently outperforming nearest neighbor and discriminant analysis methods previously proposed in the literature. We also evaluate methods of handling missing data and characterize the most informative segments of the hypervariable region of the mitochondrial genome. Conclusions: Support vector machines can be used to infer coarse ethnicity from a small region of mitochondrial DNA sequence with surprisingly high accuracy. In the presence of missing data, utilizing only the regions common to the training sequences and a test sequence proves to be the best strategy. Given these results, SVM algorithms are likely to also be useful in other DNA sequence classification applications. PMID:21554759

  2. Phytozome Comparative Plant Genomics Portal

    SciTech Connect

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: (1) understand and accelerate the improvement (domestication) of bioenergy crops; (2) characterize and moderate plant response to climate change; (3) use comparative genomics to identify constrained elements and infer gene function; (4) build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work; (5) expand functional genomic resources for Plant Flagship genomes.

  3. Complementary Metaproteomic Approaches to Assess the Bacterioplankton Response toward a Phytoplankton Spring Bloom in the Southern North Sea

    PubMed Central

    Wöhlbrand, Lars; Wemheuer, Bernd; Feenders, Christoph; Ruppersberg, Hanna S.; Hinrichs, Christina; Blasius, Bernd; Daniel, Rolf; Rabus, Ralf

    2017-01-01

    Annually recurring phytoplankton spring blooms are characteristic of temperate coastal shelf seas. During these blooms, environmental conditions, including nutrient availability, differ considerably from non-bloom conditions, affecting the entire ecosystem including the bacterioplankton. Accordingly, the emerging ecological niches during bloom transition are occupied by different bacterial populations, with Roseobacter RCA cluster and SAR92 clade members exhibiting high metabolic activity during bloom events. In this study, the functional response of the ambient bacterial community toward a Phaeocystis globosa bloom in the southern North Sea was studied using metaproteomic approaches. In contrast to other metaproteomic studies of marine bacterial communities, this is the first study comparing two different cell lysis and protein preparation methods [using trifluoroethanol (TFE) and in-solution digest as well as bead beating and SDS-based solubilization and in-gel digest (BB GeLC)]. In addition, two different mass spectrometric techniques (ESI-iontrap MS and MALDI-TOF MS) were used for peptide analysis. A total of 585 different proteins were identified, 296 of which were only detected using the TFE and 191 by the BB GeLC method, demonstrating the complementarity of these sample preparation methods. Furthermore, 158 proteins of the TFE cell lysis samples were exclusively detected by ESI-iontrap MS while 105 were only detected using MALDI-TOF MS, underpinning the value of using two different ionization and mass analysis methods. Notably, 12% of the detected proteins represent predicted integral membrane proteins, including the difficult to detect rhodopsin, indicating a considerable coverage of membrane proteins by this approach. This comprehensive approach verified previous metaproteomic studies of marine bacterioplankton, e.g., detection of many transport-related proteins (17% of the detected proteins). In addition, new insights into e.g., carbon and nitrogen

  4. Complementary Metaproteomic Approaches to Assess the Bacterioplankton Response toward a Phytoplankton Spring Bloom in the Southern North Sea.

    PubMed

    Wöhlbrand, Lars; Wemheuer, Bernd; Feenders, Christoph; Ruppersberg, Hanna S; Hinrichs, Christina; Blasius, Bernd; Daniel, Rolf; Rabus, Ralf

    2017-01-01

    Annually recurring phytoplankton spring blooms are characteristic of temperate coastal shelf seas. During these blooms, environmental conditions, including nutrient availability, differ considerably from non-bloom conditions, affecting the entire ecosystem including the bacterioplankton. Accordingly, the emerging ecological niches during bloom transition are occupied by different bacterial populations, with Roseobacter RCA cluster and SAR92 clade members exhibiting high metabolic activity during bloom events. In this study, the functional response of the ambient bacterial community toward a Phaeocystis globosa bloom in the southern North Sea was studied using metaproteomic approaches. In contrast to other metaproteomic studies of marine bacterial communities, this is the first study comparing two different cell lysis and protein preparation methods [using trifluoroethanol (TFE) and in-solution digest as well as bead beating and SDS-based solubilization and in-gel digest (BB GeLC)]. In addition, two different mass spectrometric techniques (ESI-iontrap MS and MALDI-TOF MS) were used for peptide analysis. A total of 585 different proteins were identified, 296 of which were only detected using the TFE and 191 by the BB GeLC method, demonstrating the complementarity of these sample preparation methods. Furthermore, 158 proteins of the TFE cell lysis samples were exclusively detected by ESI-iontrap MS while 105 were only detected using MALDI-TOF MS, underpinning the value of using two different ionization and mass analysis methods. Notably, 12% of the detected proteins represent predicted integral membrane proteins, including the difficult to detect rhodopsin, indicating a considerable coverage of membrane proteins by this approach. This comprehensive approach verified previous metaproteomic studies of marine bacterioplankton, e.g., detection of many transport-related proteins (17% of the detected proteins). In addition, new insights into e.g., carbon and nitrogen

  5. Inferring genetic networks from microarray data.

    SciTech Connect

    May, Elebeoba Eni; Davidson, George S.; Martin, Shawn Bryan; Werner-Washburne, Margaret C.; Faulon, Jean-Loup Michel

    2004-06-01

    In theory, it should be possible to infer realistic genetic networks from time series microarray data. In practice, however, network discovery has proved problematic. The three major challenges are: (1) inferring the network; (2) estimating the stability of the inferred network; and (3) making the network visually accessible to the user. Here we describe a method, tested on publicly available time series microarray data, which addresses these concerns. The inference of genetic networks from genome-wide experimental data is an important biological problem which has received much attention. Approaches to this problem have typically included application of clustering algorithms [6]; the use of Boolean networks [12, 1, 10]; the use of Bayesian networks [8, 11]; and the use of continuous models [21, 14, 19]. Overviews of the problem and general approaches to network inference can be found in [4, 3]. Our approach to network inference is similar to earlier methods in that we use both clustering and Boolean network inference. However, we have attempted to extend the process to better serve the end-user, the biologist. In particular, we have incorporated a system to assess the reliability of our network, and we have developed tools which allow interactive visualization of the proposed network.

  6. Impacts of combined overfishing and oil spills on the plankton trophodynamics of the West Florida shelf over the last half century of 1965-2011: A two-dimensional simulation analysis of biotic state transitions, from a zooplankton- to a bacterioplankton-modulated ecosystem.

    NASA Astrophysics Data System (ADS)

    Walsh, J. J.; Lenes, J. M.; Darrow, B.; Parks, A.; Weisberg, R. H.

    2016-03-01

    Over 50 years of multiple anthropogenic perturbations, Florida zooplankton stocks of the northeastern Gulf of Mexico declined ten-fold, with increments of mainly dominant toxic dinoflagellate harmful algal blooms (HABs), rather than diatoms, and a shift in loci of nutrient remineralization and oxygen depletion by bacterioplankton, from the sea floor to near surface waters. Yet, lytic bacterial biomass and associated ammonification only increased at most five-fold over the same time period, with consequently little indication of new, expanded "dead zones" of diatom-induced hypoxia. After bacterial lysis of intact cells of these increased HABs, the remaining residues of zooplankton biomass decrements evidently instead exited the water column as malign aerosolized HAB asthma triggers, correlated by co-traveling mercury aerosols, within wind-borne sea sprays. To unravel the causal mechanisms of these inferred decadal food web transitions, a 36-state variable plankton model of algal, bacterial, protozoan, and copepod component communities replicated daily time series of each plankton group's representatives on the West Florida shelf (WFS) during 1965-2011. At the lower phytoplankton trophic levels, 52% of the ungrazed HAB increments, between 1965-1967 and 2001-2002 before recent oil spills, remained in the water column to kill fishes and fuel bacterioplankton. But, another 48% of the WFS primary production then left the ocean's surface as a harbinger of increased public health hazards during continuing sea spray exports of salts, HAB toxins, and Hg poisons. Following the Deepwater Horizon petroleum releases in 2010, little additional change of element partition among the altered importance of WFS food web components of the trophic pyramid then pertained between 2001-2002 and 2010-2011, despite when anomalous upwelled nutrient supplies instead favored retrograde benign, oil-tolerant diatoms over the HABs during 2010. Indeed, by 2011 HABs were back, with biomass

  7. Multiple Instance Fuzzy Inference

    DTIC Science & Technology

    2015-12-02

    A novel fuzzy learning framework that employs fuzzy inference to solve the problem of multiple instance learning (MIL) is presented. The framework introduces a...or learned from data. In multiple instance problems, the training data is ambiguously labeled. Instances are grouped into bags, labels of bags are

  8. Gene-network inference by message passing

    NASA Astrophysics Data System (ADS)

    Braunstein, A.; Pagnani, A.; Weigt, M.; Zecchina, R.

    2008-01-01

    The inference of gene-regulatory processes from gene-expression data belongs to the major challenges of computational systems biology. Here we address the problem from a statistical-physics perspective and develop a message-passing algorithm which is able to infer sparse, directed and combinatorial regulatory mechanisms. Using the replica technique, the algorithmic performance can be characterized analytically for artificially generated data. The algorithm is applied to genome-wide expression data of baker's yeast under various environmental conditions. We find clear cases of combinatorial control, and enrichment in common functional annotations of regulated genes and their regulators.

  9. Post-mortem computed tomography coaxial cutting needle biopsy to facilitate the detection of bacterioplankton using PCR probes as a diagnostic indicator for drowning.

    PubMed

    Rutty, Guy N; Johnson, Christopher; Amoroso, Jasmin; Robinson, Claire; Bradley, Carina J; Morgan, Bruno

    2017-01-01

    We report for the first time the use of coaxial cutting needle biopsy, guided by post-mortem computed tomography (PMCT), to sample internal body tissues for bacterioplankton PCR analysis to investigate drowning. This technical report describes the biopsy technique, the comparison of the needle biopsy and the invasive autopsy sampling results, as well as the PMCT and autopsy findings. By using this new biopsy sampling approach for bacterioplankton PCR, we have built on previous papers describing the minimally invasive PMCT approach for the diagnosis of drowning. When such a system is used, the operator must take all precautions to avoid contamination of the core biopsy samples due to the sensitivity of PCR-based analytic systems.

  10. Inference in "poor" languages

    SciTech Connect

    Petrov, S.

    1996-10-01

    Languages with a solvable implication problem but without complete and consistent systems of inference rules ("poor" languages) are considered. The problem of the existence of a finite complete and consistent inference rule system for a "poor" language is stated independently of the language or rule syntax. Several properties of the problem are proved. An application of the results to the language of join dependencies is given.

  11. Contrasting patterns of free-living bacterioplankton diversity in macrophyte-dominated versus phytoplankton blooming regimes in Dianchi Lake, a shallow lake in China

    NASA Astrophysics Data System (ADS)

    Wang, Yujing; Li, Huabing; Xing, Peng; Wu, Qinglong

    2017-03-01

    Freshwater shallow lakes typically exhibit two alternative stable states under certain nutrient loadings: macrophyte-dominated and phytoplankton-dominated water regimes. An ecosystem regime shift from macrophytes to phytoplankton blooming typically reduces the number of species of invertebrates and fishes and results in the homogenization of communities in freshwater lakes. We investigated how microbial biodiversity has responded to a shift of the ecosystem regime in Dianchi Lake, which was previously fully covered with submerged macrophytes but currently harbors both ecological states. We observed marked divergence in the diversity and community composition of bacterioplankton between the two regimes. Although species richness, estimated as the number of operational taxonomic units and phylogenetic diversity (PD), was higher in the phytoplankton-dominated ecosystem after this shift, the dissimilarity of the bacterioplankton community across space decreased. This decrease in beta diversity was accompanied by loss of planktonic bacteria unique to the macrophyte-dominated ecosystem. Mantel tests between bacterioplankton community distances and Euclidean distance of environmental parameters indicated that this reduced bacterial community differentiation primarily reflected the loss of environmental niches, particularly in the macrophyte regime. The loss of this small-scale heterogeneity in bacterial communities should be considered when assessing long-term biodiversity changes in response to ecosystem regime conversions in freshwater lakes.
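
    A Mantel test of the kind mentioned above correlates the pairwise entries of two distance matrices and assesses significance by permuting the sample labels of one matrix. The sketch below is a generic, simplified version that assumes a Pearson statistic, a one-sided test, and random permutations; the abstract does not specify which variant or software the authors used, and every name and value here is hypothetical.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def upper_triangle(matrix, order):
    """Upper-triangle entries of `matrix` after relabelling samples by `order`."""
    pairs = itertools.combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test; returns (observed r, permutation p-value)."""
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    r_obs = pearson(x, upper_triangle(d2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of d2 together
        if pearson(x, upper_triangle(d2, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical example: five sites placed on a single environmental gradient,
# with community distance set equal to environmental distance (illustration only).
gradient = [0.0, 1.0, 3.0, 7.0, 12.0]
env_dist = [[abs(a - b) for b in gradient] for a in gradient]
community_dist = [row[:] for row in env_dist]

r, p = mantel(community_dist, env_dist)
```

    Rows and columns are permuted together because the pairwise distances are not independent observations, which is why an ordinary correlation test is not used here.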

  12. Contrasting patterns of free-living bacterioplankton diversity in macrophyte-dominated versus phytoplankton blooming regimes in Dianchi Lake, a shallow lake in China

    NASA Astrophysics Data System (ADS)

    Wang, Yujing; Li, Huabing; Xing, Peng; Wu, Qinglong

    2016-04-01

    Freshwater shallow lakes typically exhibit two alternative stable states under certain nutrient loadings: macrophyte-dominated and phytoplankton-dominated water regimes. An ecosystem regime shift from macrophytes to phytoplankton blooming typically reduces the number of species of invertebrates and fishes and results in the homogenization of communities in freshwater lakes. We investigated how microbial biodiversity has responded to a shift of the ecosystem regime in Dianchi Lake, which was previously fully covered with submerged macrophytes but currently harbors both ecological states. We observed marked divergence in the diversity and community composition of bacterioplankton between the two regimes. Although species richness, estimated as the number of operational taxonomic units and phylogenetic diversity (PD), was higher in the phytoplankton-dominated ecosystem after this shift, the dissimilarity of the bacterioplankton community across space decreased. This decrease in beta diversity was accompanied by loss of planktonic bacteria unique to the macrophyte-dominated ecosystem. Mantel tests between bacterioplankton community distances and Euclidean distance of environmental parameters indicated that this reduced bacterial community differentiation primarily reflected the loss of environmental niches, particularly in the macrophyte regime. The loss of this small-scale heterogeneity in bacterial communities should be considered when assessing long-term biodiversity changes in response to ecosystem regime conversions in freshwater lakes.

  13. Effects of decreased resource availability, protozoan grazing and viral impact on a structure of bacterioplankton assemblage in a canyon-shaped reservoir.

    PubMed

    Hornák, Karel; Masín, Michal; Jezbera, Jan; Bettarel, Yvan; Nedoma, Jirí; Sime-Ngando, Télesphore; Simek, Karel

    2005-05-01

    We conducted a transplant experiment to elucidate the effects of different levels of grazing pressure, nutrient availability, especially phosphorus, and the impact of viruses on the changes in the structure of bacterioplankton assemblage in a meso-eutrophic reservoir. A sample taken from the nutrient-rich inflow part of the reservoir was size-fractionated and incubated in dialysis bags in both inflow and dam area. The structure of bacterial assemblage was examined by fluorescence in situ hybridization using oligonucleotide probes with different levels of specificity. In terms of the relative proportions of different bacterial groups, we found very few significant changes in the bacterioplankton composition after transplanting the treatments to the nutrient-poor dam area. However, we observed marked shifts in morphology and biomass towards the development of filaments, flocs and "vibrio-like" morphotypes of selected probe-defined groups of bacteria induced by increased grazing pressure. Despite the very high abundances of viruses in all the treatments, their effects on bacterioplankton were rather negligible.

  14. Quality of computationally inferred gene ontology annotations.

    PubMed

    Skunca, Nives; Altenhoff, Adrian; Dessimoz, Christophe

    2012-05-01

    Gene Ontology (GO) has established itself as the undisputed standard for protein function annotation. Most annotations are inferred electronically, i.e. without individual curator supervision, but they are widely considered unreliable. At the same time, we crucially depend on those automated annotations, as most newly sequenced genomes are non-model organisms. Here, we introduce a methodology to systematically and quantitatively evaluate electronic annotations. By exploiting changes in successive releases of the UniProt Gene Ontology Annotation database, we assessed the quality of electronic annotations in terms of specificity, reliability, and coverage. Overall, we not only found that electronic annotations have significantly improved in recent years, but also that their reliability now rivals that of annotations inferred by curators when they use evidence other than experiments from primary literature. This work provides the means to identify the subset of electronic annotations that can be relied upon, an important outcome given that >98% of all annotations are inferred without direct curation.

  15. [Genomic structure of the autotetraploid oat species Avena macrostachya inferred from comparative analysis of the ITS1 and ITS2 sequences: on the oat karyotype evolution during the early stages of the Avena species divergence].

    PubMed

    Rodionov, A V; Tiupa, N B; Kim, E S; Machs, E M; Loskutov, I G

    2005-05-01

    To examine the genomic structure of Avena macrostachya, internal transcribed spacers, ITS1 and ITS2, as well as nuclear 5.8S rRNA genes from three oat species with AsAs karyotype (A. wiestii, A. hirtula, and A. atlantica), and those from A. longiglumis (AlAl), A. canariensis (AcAc), A. ventricosa (CvCv), A. pilosa, and A. clauda (CpCp) were sequenced. All species of the genus Avena examined represented a monophyletic group (bootstrap index = 98), within which two branches, i.e., species with A- and C-genomes, were distinguished (bootstrap indices = 100). The subject of our study, A. macrostachya, albeit belonging to the phylogenetic branch of C-genome oat species (karyotype with submetacentric and subacrocentric chromosomes), has preserved an isobrachyal karyotype (i.e., one containing metacentric chromosomes), probably typical of the common Avena ancestor. It was suggested to classify the A. macrostachya genome as a specific form of C-genome, Cm-genome. Among the species from other genera studied, Arrhenatherum elatius was found to be the closest to Avena in ITS1 and ITS structure. Phylogenetic relationships between Avena and Helictotrichon remain intriguingly uncertain. The HPR389153 sequence from H. pratense genome was closest to the ITS1 sequences specific to the Avena A-genomes (p-distance = 0.0237), while the differences of this sequence from the ITS1 of A. macrostachya reached 0.1221. On the other hand, HAD389117 from H. adsurgens was close to the ITS1 specific to Avena C-genomes (p-distance = 0.0189), while its differences from the A-genome specific ITS1 sequences reached 0.1221. It seems likely that the appearance of highly polyploid (2n = 12-21x) species of H. pratense and H. adsurgens could be associated with interspecific hybridization involving Mediterranean oat species carrying A- and C-genomes. A hypothesis on the pathways of Avena chromosome evolution during the early stages of the oat species divergence is proposed.
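
    The p-distances quoted in the abstract above (e.g. 0.0237) are proportions of differing sites between two aligned sequences. A minimal sketch of that calculation follows; the function name, the choice to skip gapped sites, and the toy sequences are illustrative assumptions, not the authors' procedure or data.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites where either sequence has a gap are skipped
    (pairwise deletion); other gap treatments are possible.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped alignment columns
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical toy alignment: 1 mismatch over 10 compared sites.
print(p_distance("ACGTACGTAC", "ACGTACGTAA"))
```

    On this reading, a p-distance of 0.0237 corresponds to roughly 2.4 differing sites per 100 aligned positions.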

  16. Maximum growth rates and possible life strategies of different bacterioplankton groups in relation to phosphorus availability in a freshwater reservoir.

    PubMed

    Simek, Karel; Hornák, Karel; Jezbera, Jan; Nedoma, Jirí; Vrba, Jaroslav; Straskrábová, Viera; Macek, Miroslav; Dolan, John R; Hahn, Martin W

    2006-09-01

    We investigated net growth rates of distinct bacterioplankton groups and heterotrophic nanoflagellate (HNF) communities in relation to phosphorus availability by analysing eight in situ manipulation experiments, conducted between 1997 and 2003, in the canyon-shaped Rímov reservoir (Czech Republic). Water samples were size-fractionated and incubated in dialysis bags at the sampling site or transplanted into an area of the reservoir, which differed in phosphorus limitation (range of soluble reactive phosphorus (SRP) concentrations, 0.7-96 μg l⁻¹). Using five different rRNA-targeted oligonucleotide probes, net growth rates of the probe-defined bacterial groups and HNF assemblages were estimated and related to SRP using Monod kinetics, yielding growth rate constants specific for each bacterial group. We found highly significant differences among their maximum growth rates while insignificant differences were detected in the saturation constants. However, the latter constants represent only tentative estimates mainly due to insufficient sensitivity of the method used at low in situ SRP concentrations. Interestingly, in these same experiments HNF assemblages grew significantly faster than any bacterial group studied except for a small, but abundant cluster of Betaproteobacteria (targeted by the R-BT065 probe). Potential ecological implications of different growth capabilities for possible life strategies of different bacterial phylogenetic lineages are discussed.
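
    The Monod relation mentioned in the abstract above is usually written as mu = mu_max * S / (K_s + S), where S is the substrate (here SRP) concentration, mu_max the maximum growth rate and K_s the saturation constant. The sketch below only illustrates that formula; the function name and all numeric values are hypothetical and are not taken from the study.

```python
def monod_growth_rate(srp: float, mu_max: float, k_s: float) -> float:
    """Monod kinetics: growth rate as a saturating function of substrate.

    srp    -- substrate concentration (e.g. SRP), same units as k_s
    mu_max -- maximum growth rate (e.g. per day)
    k_s    -- saturation constant: concentration at which rate = mu_max / 2
    """
    if srp < 0:
        raise ValueError("substrate concentration must be non-negative")
    if k_s <= 0:
        raise ValueError("saturation constant must be positive")
    return mu_max * srp / (k_s + srp)


# Hypothetical parameters, for illustration only.
for srp in (0.7, 5.0, 96.0):
    print(srp, round(monod_growth_rate(srp, mu_max=1.0, k_s=2.0), 3))
```

    Because the curve flattens once S is well above K_s, groups can differ strongly in mu_max while their K_s estimates remain hard to distinguish, which is consistent with the caveat about low SRP concentrations in the abstract.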

  17. Diurnal variation in bacterioplankton composition and DNA damage in the microbial community from an Andean oligotrophic lake.

    PubMed

    Fernández-Zenoff, María V; Estévez, María C; Farías, María E

    2014-01-01

    Laguna Azul is an oligotrophic lake situated at 4,560 m above sea level and subject to a high level of solar radiation. Bacterioplankton community composition (BCC) was analysed by denaturing gradient gel electrophoresis and the impact of solar ultraviolet radiation was assessed by measuring cyclobutane pyrimidine dimers (CPD). Furthermore, pure cultures of Acinetobacter johnsonii A2 and Rhodococcus sp. A5 were exposed simultaneously and CPD accumulation was studied. Gel analyses generated a total of 7 sequences belonging to Alpha-proteobacteria (1 band), Beta-proteobacteria (1 band), Bacteroidetes (2 bands), Actinobacteria (1 band), and Firmicutes (1 band). DGGE profiles showed minimal changes in BCC and no CPD was detected even though a high level of damage was found in biodosimeters. A. johnsonii A2 showed low level of DNA damage while Rhodococcus sp. A5 exhibited high resistance since no CPD were detected under natural UV-B exposure, suggesting that the bacterial community is well adapted to this highly solar irradiated environment.

  18. Diversity of bacterioplankton in contrasting Tibetan lakes revealed by high-density microarray and clone library analysis.

    PubMed

    Zhang, Rui; Wu, Qinglong; Piceno, Yvette M; Desantis, Todd Z; Saunders, F Michael; Andersen, Gary L; Liu, Wen-Tso

    2013-11-01

    Tibetan lakes represent a unique microbial environment and are a good ecosystem to investigate the microbial diversity of high mountain lakes and their relationship with environmental factors. The diversity and community structure of bacterioplankton in Tibetan lakes was determined using DNA fingerprinting analysis, high-density 16S rRNA gene microarray (PhyloChip) analysis, and extensive clone library analysis of bacterial 16S rRNA genes. A previously unseen high microbial diversity (1732 operational taxonomic units based on PhyloChip data) and numerous novel bacterial 16S rRNA gene sequences were observed. Abundant SAR11-like sequences retrieved from saline Lake Qinghai demonstrated a unique SAR11 phylogenetic sister clade related to the freshwater LD12 clade. Water chemistry (e.g. salinity) and altitude played important roles in the selection of bacterial taxa (both presence and relative abundance) in Tibetan lakes. The ubiquity and uniqueness of bacterial taxa, as well as the correlation between environmental factors and bacterial taxa, was observed to vary gradually with different phylogenetic levels. Our study suggested high microbial cosmopolitanism and high endemicity observed at higher and lower phylogenetic levels, respectively.

  19. Combining culture-dependent and -independent methodologies for estimation of richness of estuarine bacterioplankton consuming riverine dissolved organic matter.

    PubMed

    Kisand, Veljo; Wikner, Johan

    2003-06-01

    Three different methods for analyzing natural microbial community diversity were combined to maximize an estimate of the richness of bacterioplankton catabolizing riverine dissolved organic matter (RDOM). We also evaluated the ability of culture-dependent quantitative DNA-DNA hybridization, a 16S rRNA gene clone library, and denaturing gradient gel electrophoresis (DGGE) to detect bacterial taxa in the same sample. Forty-two different cultivatable strains were isolated from rich and poor solid media. In addition, 50 unique clones were obtained by cloning of the bacterial 16S rDNA gene amplified by PCR from the community DNA into an Escherichia coli vector. Twenty-three unique bands were sequenced from 12 DGGE profiles, excluding a composite fuzzy band of the Cytophaga-Flavobacterium group. The different methods gave similar distributions of taxa at the genus level and higher. However, the match at the species level among the methods was poor, and only one species was identified by all three methods. Consequently, all three methods identified unique subsets of bacterial species, amounting to a total richness of 97 operational taxonomic units in the experimental system. The confidence in the results was, however, dependent on the current precision of the phylogenetic determination and definition of the species. Bacterial consumers of RDOM in the studied estuary were primarily both cultivatable and uncultivable taxa of the Cytophaga-Flavobacterium group, a concordant result among the methods applied. Culture-independent methods also suggested several not-yet-cultivated beta-proteobacteria to be RDOM consumers.

  20. The Bayes Inference Engine

    SciTech Connect

    Hanson, K.M.; Cunningham, G.S.

    1996-04-01

    The authors are developing a computer application, called the Bayes Inference Engine, to provide the means to make inferences about models of physical reality within a Bayesian framework. The construction of complex nonlinear models is achieved by a fully object-oriented design. The models are represented by a data-flow diagram that may be manipulated by the analyst through a graphical programming environment. Maximum a posteriori solutions are achieved using a general, gradient-based optimization algorithm. The application incorporates a new technique of estimating and visualizing the uncertainties in specific aspects of the model.

  1. Inference as Prediction

    ERIC Educational Resources Information Center

    Watson, Jane

    2007-01-01

    Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…

  2. Virioplankton and bacterioplankton in a shallow CO2-dominated hydrothermal vent (Panarea Island, Tyrrhenian Sea)

    NASA Astrophysics Data System (ADS)

    Karuza, Ana; Celussi, Mauro; Cibic, Tamara; Del Negro, Paola; De Vittor, Cinzia

    2012-01-01

    Gas hydrothermal vents are used as a natural analogue for studying the effects of CO2 leakage from hypothetical shallow marine storage sites on benthic and pelagic systems. This study investigated the interrelationships between planktonic prokaryotes and viruses in the Panarea Islands hydrothermal system (southern Tyrrhenian Sea, Italy), especially their abundance, distribution and diversity. No difference in prokaryotic abundance was shown between high-CO2 and control sites. The community structure displayed differences between the fumarolic field and the control, and between surface and bottom waters, the latter likely due to the presence of different water masses. Bacterial assemblages were qualitatively dominated by chemo- and photoautotrophic organisms, able to utilise both CO2 and H2S for their metabolic requirements. From significantly lower virioplankton abundance in the proximity of the exhalative area together with a particularly low Virus-to-Prokaryotes Ratio, we inferred a reduced impact on prokaryotic abundance and proliferation. Even if the fate of viruses in this particular condition remains unknown, we consider that lower viral abundance could be reflected in an enhanced energy flow to higher trophic levels, thus largely influencing the overall functioning of the system.

  3. Comparison of growth rates of aerobic anoxygenic phototrophic bacteria and other bacterioplankton groups in coastal Mediterranean waters.

    PubMed

    Ferrera, Isabel; Gasol, Josep M; Sebastián, Marta; Hojerová, Eva; Koblízek, Michal

    2011-11-01

    Growth is one of the basic attributes of any living organism. Surprisingly, the growth rates of marine bacterioplankton are only poorly known. Current data suggest that marine bacteria grow relatively slowly, having generation times of several days. However, some bacterial groups, such as the aerobic anoxygenic phototrophic (AAP) bacteria, have been shown to grow much faster. Two manipulation experiments, in which grazing, viruses, and resource competition were reduced, were conducted in the coastal Mediterranean Sea (Blanes Bay Microbial Observatory). The growth rates of AAP bacteria and of several important phylogenetic groups (the Bacteroidetes, the alphaproteobacterial groups Roseobacter and SAR11, and the Gammaproteobacteria group and its subgroups the Alteromonadaceae and the NOR5/OM60 clade) were calculated from changes in cell numbers in the manipulation treatments. In addition, we examined the role that top-down (mortality due to grazers and viruses) and bottom-up (resource availability) factors play in determining the growth rates of these groups. Manipulations resulted in an increase of the growth rates of all groups studied, but its extent differed largely among the individual treatments and among the different groups. Interestingly, higher growth rates were found for the AAP bacteria (up to 3.71 day⁻¹) and for the Alteromonadaceae (up to 5.44 day⁻¹), in spite of the fact that these bacterial groups represented only a very low percentage of the total prokaryotic community. In contrast, the SAR11 clade, which was the most abundant group, was the slowest grower in all treatments. Our results show that, in general, the least abundant groups exhibited the highest rates, whereas the most abundant groups were those growing more slowly, indicating that some minor groups, such as the AAP bacteria, very likely contribute much more to the recycling of organic matter in the ocean than what their abundances alone would predict.
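
    The abstract above reports growth rates "calculated from changes in cell numbers" without giving the formula. A common convention, assumed here, is exponential growth: mu = ln(N_t / N_0) / t, with doubling time ln(2) / mu. The function names and cell counts below are hypothetical.

```python
import math


def net_growth_rate(n_start: float, n_end: float, days: float) -> float:
    """Specific growth rate (per day), assuming exponential growth."""
    if n_start <= 0 or n_end <= 0 or days <= 0:
        raise ValueError("cell counts and elapsed time must be positive")
    return math.log(n_end / n_start) / days


def doubling_time(rate_per_day: float) -> float:
    """Doubling time in days for a positive specific growth rate."""
    if rate_per_day <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / rate_per_day


# Hypothetical counts: a population doubling within one day.
mu = net_growth_rate(1e5, 2e5, days=1.0)
print(round(mu, 3), round(doubling_time(mu), 3))
```

    Under this assumption, the reported maximum of 5.44 day⁻¹ would correspond to a doubling time of roughly 3 hours.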

  4. Combining Culture-Dependent and -Independent Methodologies for Estimation of Richness of Estuarine Bacterioplankton Consuming Riverine Dissolved Organic Matter

    PubMed Central

    Kisand, Veljo; Wikner, Johan

    2003-01-01

    Three different methods for analyzing natural microbial community diversity were combined to maximize an estimate of the richness of bacterioplankton catabolizing riverine dissolved organic matter (RDOM). We also evaluated the ability of culture-dependent quantitative DNA-DNA hybridization, a 16S rRNA gene clone library, and denaturing gradient gel electrophoresis (DGGE) to detect bacterial taxa in the same sample. Forty-two different cultivatable strains were isolated from rich and poor solid media. In addition, 50 unique clones were obtained by cloning of the bacterial 16S rDNA gene amplified by PCR from the community DNA into an Escherichia coli vector. Twenty-three unique bands were sequenced from 12 DGGE profiles, excluding a composite fuzzy band of the Cytophaga-Flavobacterium group. The different methods gave similar distributions of taxa at the genus level and higher. However, the match at the species level among the methods was poor, and only one species was identified by all three methods. Consequently, all three methods identified unique subsets of bacterial species, amounting to a total richness of 97 operational taxonomic units in the experimental system. The confidence in the results was, however, dependent on the current precision of the phylogenetic determination and definition of the species. Bacterial consumers of RDOM in the studied estuary were primarily both cultivatable and uncultivable taxa of the Cytophaga-Flavobacterium group, a concordant result among the methods applied. Culture-independent methods also suggested several not-yet-cultivated β-proteobacteria to be RDOM consumers. PMID:12788769

  5. Inference and validation of predictive gene networks from biomedical literature and gene expression data.

    PubMed

    Olsen, Catharina; Fleming, Kathleen; Prendergast, Niall; Rubio, Renee; Emmert-Streib, Frank; Bontempi, Gianluca; Haibe-Kains, Benjamin; Quackenbush, John

    2014-01-01

    Although many methods have been developed for inference of biological networks, the validation of the resulting models has largely remained an unsolved problem. Here we present a framework for quantitative assessment of inferred gene interaction networks using knock-down data from cell line experiments. Using this framework we are able to show that network inference based on integration of prior knowledge derived from the biomedical literature with genomic data significantly improves the quality of inferred networks relative to other approaches. Our results also suggest that cell line experiments can be used to quantitatively assess the quality of networks inferred from tumor samples.

  6. Transplant experiments uncover Baltic Sea basin-specific responses in bacterioplankton community composition and metabolic activities

    PubMed Central

    Lindh, Markus V.; Figueroa, Daniela; Sjöstedt, Johanna; Baltar, Federico; Lundin, Daniel; Andersson, Agneta; Legrand, Catherine; Pinhassi, Jarone

    2015-01-01

    Anthropogenically induced changes in precipitation are projected to generate increased river runoff to semi-enclosed seas, increasing loads of terrestrial dissolved organic matter and decreasing salinity. To determine how bacterial community structure and functioning adjust to such changes, we designed microcosm transplant experiments with Baltic Proper (salinity 7.2) and Bothnian Sea (salinity 3.6) water. Baltic Proper bacteria generally reached higher abundances than Bothnian Sea bacteria in both Baltic Proper and Bothnian Sea water, indicating higher adaptability. Moreover, Baltic Proper bacteria growing in Bothnian Sea water consistently showed highest bacterial production and beta-glucosidase activity. These metabolic responses were accompanied by basin-specific changes in bacterial community structure. For example, Baltic Proper Pseudomonas and Limnobacter populations increased markedly in relative abundance in Bothnian Sea water, indicating a replacement effect. In contrast, Roseobacter and Rheinheimera populations were stable or increased in abundance when challenged by either of the waters, indicating an adjustment effect. Transplants to Bothnian Sea water triggered the initial emergence of particular Burkholderiaceae populations, and transplants to Baltic Proper water triggered Alteromonadaceae populations. Notably, in the subsequent re-transplant experiment, a priming effect resulted in further increases to dominance of these populations. Correlated changes in community composition and metabolic activity were observed only in the transplant experiment and only at relatively high phylogenetic resolution. This suggested an importance of successional progression for interpreting relationships between bacterial community composition and functioning. We infer that priming effects on bacterial community structure by natural episodic events or climate change induced forcing could translate into long-term changes in bacterial ecosystem process rates. PMID

  7. Optical Inference Machines

    DTIC Science & Technology

    1988-06-27

    Optical artificial intelligence; Optical inference engines; Optical logic; Optical information processing...common. They arise in areas such as expert systems and other artificial intelligence systems. In recent years, the computer science language PROLOG has...cal processors should in principle be well suited for artificial intelligence applications. In recent years, symbolic logic processing, the

  8. Active inference and learning.

    PubMed

    Friston, Karl; FitzGerald, Thomas; Rigoli, Francesco; Schwartenbeck, Philipp; O'Doherty, John; Pezzulo, Giovanni

    2016-09-01

    This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity.

  9. Response of bacterioplankton activity in an Arctic fjord system to elevated pCO2: results from a mesocosm perturbation study

    NASA Astrophysics Data System (ADS)

    Piontek, J.; Borchard, C.; Sperling, M.; Schulz, K. G.; Riebesell, U.; Engel, A.

    2013-01-01

    The effect of elevated seawater carbon dioxide (CO2) on the activity of a natural bacterioplankton community in an Arctic fjord system was investigated by a mesocosm perturbation study in the frame of the European Project on Ocean Acidification (EPOCA). A pCO2 range of 175-1085 μatm was set up in nine mesocosms deployed in the Kongsfjorden (Svalbard). The activity of natural extracellular enzyme assemblages increased in response to acidification. Rates of β-glucosidase and leucine-aminopeptidase increased along the gradient of mesocosm pCO2. A decrease in seawater pH of 0.5 units almost doubled rates of both enzymes. Heterotrophic bacterial activity was closely coupled to phytoplankton productivity in this experiment. The bacterioplankton community responded to rising chlorophyll a concentrations after a lag phase of only a few days with increasing protein production and extracellular enzyme activity. Time-integrated primary production and bacterial protein production were positively correlated, strongly suggesting that higher amounts of phytoplankton-derived organic matter were assimilated by heterotrophic bacteria at increased primary production. Primary production increased under high pCO2 in this study, and it can be suggested that the efficient heterotrophic carbon utilisation had the potential to counteract the enhanced autotrophic CO2 fixation. However, our results also show that beneficial pCO2-related effects on bacterial activity can be mitigated by the top-down control of bacterial abundances in natural microbial communities.

  10. Successional trajectories of bacterioplankton community over the complete cycle of a sudden phytoplankton bloom in the Xiangshan Bay, East China Sea.

    PubMed

    Chen, Heping; Zhang, Huajun; Xiong, Jinbo; Wang, Kai; Zhu, Jianlin; Zhu, Xiangyu; Zhou, Xiaoyan; Zhang, Demin

    2016-12-01

    Phytoplankton blooms have raised ecological concerns worldwide; however, few studies have focused on the successional trajectories of the bacterioplankton community over a complete phytoplankton bloom cycle. Using 16S pyrosequencing, we investigated how the coastal bacterioplankton community compositions (BCCs) respond to a phytoplankton bloom in the Xiangshan Bay, East China Sea. The results showed that BCCs were significantly different among the pre-bloom, bloom, and after-bloom stages, with the lowest bacterial diversity at the bloom phase. The BCCs at the short-term after-bloom phase showed a rapid but incomplete recovery to the pre-bloom phase, evidenced by 69.8% similarity between pre-bloom and after-bloom communities. This recovery paralleled the dynamics of the operational taxonomic units (OTUs) affiliated with Actinobacteria, Bacteroidetes, Cyanobacteria, Alphaproteobacteria and Gammaproteobacteria, whose abundance was enriched when the bloom occurred and decreased after the bloom, or vice versa. Collectively, the results showed that the BCCs were sensitive to algal-induced disturbances, but could recover to a certain extent after the bloom. In addition, OTUs that were enriched or decreased during this process are closely associated with this temporal pattern, thus holding the potential to evaluate and indicate the succession stage of a phytoplankton bloom.

  11. Inferring cellular networks using probabilistic graphical models.

    PubMed

    Friedman, Nir

    2004-02-06

    High-throughput genome-wide molecular assays, which probe cellular networks from different perspectives, have become central to molecular biology. Probabilistic graphical models are useful for extracting meaningful biological insights from the resulting data sets. These models provide a concise representation of complex cellular networks by composing simpler submodels. Procedures based on well-understood principles for inferring such models from data facilitate a model-based methodology for analysis and discovery. This methodology and its capabilities are illustrated by several recent applications to gene expression data.

  12. Multimodel inference and adaptive management

    USGS Publications Warehouse

    Rehme, S.E.; Powell, L.A.; Allen, C.R.

    2011-01-01

    Ecology is an inherently complex science coping with correlated variables, nonlinear interactions and multiple scales of pattern and process, making it difficult for experiments to result in clear, strong inference. Natural resource managers, policy makers, and stakeholders rely on science to provide timely and accurate management recommendations. However, the time necessary to untangle the complexities of interactions within ecosystems is often far greater than the time available to make management decisions. One method of coping with this problem is multimodel inference. Multimodel inference assesses uncertainty by calculating likelihoods among multiple competing hypotheses, but multimodel inference results are often equivocal. Despite this, there may be pressure for ecologists to provide management recommendations regardless of the strength of their study’s inference. We reviewed papers in the Journal of Wildlife Management (JWM) and the journal Conservation Biology (CB) to quantify the prevalence of multimodel inference approaches, the resulting inference (weak versus strong), and how authors dealt with the uncertainty. Thirty-eight percent and 14%, respectively, of articles in the JWM and CB used multimodel inference approaches. Strong inference was rarely observed, with only 7% of JWM and 20% of CB articles resulting in strong inference. We found the majority of weak inference papers in both journals (59%) gave specific management recommendations. Model selection uncertainty was ignored in most recommendations for management. We suggest that adaptive management is an ideal method to resolve uncertainty when research results in weak inference.
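
    The abstract above does not say which criterion the reviewed papers used to weigh competing models. One widely used option in multimodel inference is Akaike weights, sketched here as an assumed illustration; the function name and the AIC values are made up.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC scores into relative model weights that sum to 1."""
    if not aic_values:
        raise ValueError("need at least one AIC value")
    best = min(aic_values)
    # Relative likelihood of each model: exp(-delta_i / 2),
    # where delta_i is the AIC difference from the best model.
    rel = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical AIC scores for three competing models.
for aic, w in zip([100.0, 102.0, 110.0], akaike_weights([100.0, 102.0, 110.0])):
    print(aic, round(w, 3))
```

    When several models receive similar weights, no single hypothesis is clearly supported, which is the kind of equivocal result the abstract describes as weak inference.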

  13. Transcriptional response of bathypelagic marine bacterioplankton to the Deepwater Horizon oil spill.

    PubMed

    Rivers, Adam R; Sharma, Shalabh; Tringe, Susannah G; Martin, Jeffrey; Joye, Samantha B; Moran, Mary Ann

    2013-12-01

    The Deepwater Horizon blowout released a massive amount of oil and gas into the deep ocean between April and July 2010, stimulating microbial blooms of petroleum-degrading bacteria. To understand the metabolic response of marine microorganisms, we sequenced ≈ 66 million community transcripts that revealed the identity of metabolically active microbes and their roles in petroleum consumption. Reads were assigned to reference genes from ≈ 2700 bacterial and archaeal taxa, but most assignments (39%) were to just six genomes representing predominantly methane- and petroleum-degrading Gammaproteobacteria. Specific pathways for the degradation of alkanes, aromatic compounds and methane emerged from the metatranscriptomes, with some transcripts assigned to methane monooxygenases representing highly divergent homologs that may degrade either methane or short alkanes. The microbial community in the plume was less taxonomically and functionally diverse than the unexposed community below the plume; this was due primarily to decreased species evenness resulting from Gammaproteobacteria blooms. Surprisingly, a number of taxa (related to SAR11, Nitrosopumilus and Bacteroides, among others) contributed equal numbers of transcripts per liter in both the unexposed and plume samples, suggesting that some groups were unaffected by the petroleum inputs and blooms of degrader taxa, and may be important for re-establishing the pre-spill microbial community structure.

  14. Stimulation of viral infection of bacterioplankton during a mesoscale iron fertilization experiment in the Southern Ocean

    NASA Astrophysics Data System (ADS)

    Weinbauer, M. G.; Arrieta, J.-M.; Herndl, G. J.

    2003-04-01

    A mesoscale iron fertilization in the Southern Ocean (Eisenex) induced a phytoplankton bloom within three weeks of observation, as well as increased bacterial abundance and production. Viral abundance and viral production were stimulated as well. A virus-dilution approach was used to estimate the frequency of infected cells (FIC) and the frequency of lysogenic cells (FLC), i.e. cells with a dormant viral genome. While the FLC did not vary strongly within the iron-enriched patch and did not differ from waters outside the patch, FIC increased significantly within the iron fertilized patch. This suggests that induction of the lytic cycle in lysogenic cells was not significant. Rather, the stimulated bacterial production and abundance within the patch resulted in more frequent and more successful encounters between viruses and hosts and thus in higher FIC values. Consequently, the iron fertilization enhanced the influence of viral infection in the microbial food web. According to the current model, this should result in a stimulation of bacterial production, since lysed bacterial cells cannot be consumed by protists and transferred to higher trophic levels; lysis products can be taken up by bacteria and thus organic carbon spins within this viral loop. Viral infection is a significant and previously overlooked factor in the carbon flow during iron fertilization experiments.

  15. Genomics on a phylogeny: Evolution of genes and genomes in the genus Drosophila

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of inferences in evolutionary genomics. The genomes of 12 Drosophila species, nine of which are presented here for the first time (sechellia, yakuba, erecta, ananassae, persimili...

  16. Visual Inference Programming

    NASA Technical Reports Server (NTRS)

    Wheeler, Kevin; Timucin, Dogan; Rabbette, Maura; Curry, Charles; Allan, Mark; Lvov, Nikolay; Clanton, Sam; Pilewskie, Peter

    2002-01-01

    The goal of visual inference programming is to develop a software framework for data analysis and to provide machine learning algorithms for interactive data exploration and visualization. The topics include: 1) Intelligent Data Understanding (IDU) framework; 2) Challenge problems; 3) What's new here; 4) Framework features; 5) Wiring diagram; 6) Generated script; 7) Results of script; 8) Initial algorithms; 9) Independent Component Analysis for instrument diagnosis; 10) Output sensory mapping virtual joystick; 11) Output sensory mapping typing; 12) Closed-loop feedback mu-rhythm control; 13) Closed-loop training; 14) Data sources; and 15) Algorithms. This paper is in viewgraph form.

  17. Expansion of biological pathways based on evolutionary inference

    PubMed Central

    Li, Yang; Calvo, Sarah E.; Gutman, Roee

    2014-01-01

    Summary: Availability of diverse genomes makes it possible to predict gene function based on shared evolutionary history. This approach can be challenging, however, for pathways whose components do not exhibit a shared history, but rather, consist of distinct “evolutionary modules.” We introduce a computational algorithm, CLIME (clustering by inferred models of evolution), which inputs a eukaryotic species tree, homology matrix, and pathway (gene set) of interest. CLIME partitions the gene set into disjoint evolutionary modules, simultaneously learning the number of modules and a tree-based evolutionary history that defines each module. CLIME then expands each module by scanning the genome for new components that likely arose under the inferred evolutionary model. Application of CLIME to ∼1000 annotated human pathways, organelles and proteomes of yeast, red algae, and malaria, reveals unanticipated evolutionary modularity and novel, co-evolving components. CLIME is freely available and should become increasingly powerful with the growing wealth of eukaryotic genomes. PMID:24995987

  18. Effect of elevated CO2 on the dynamics of particle attached and free living bacterioplankton communities in an Arctic fjord

    NASA Astrophysics Data System (ADS)

    Sperling, M.; Piontek, J.; Gerdts, G.; Wichels, A.; Schunck, H.; Roy, A.-S.; La Roche, J.; Gilbert, J.; Bittner, L.; Romac, S.; Riebesell, U.; Engel, A.

    2012-08-01

    The increase in atmospheric carbon dioxide (CO2) results in acidification of the oceans, expected to lead to the fastest drop in ocean pH in the last 300 million years if anthropogenic emissions continue at the present rate. Due to higher solubility of gases in cold waters and increased exposure to the atmosphere by decreasing ice cover, the Arctic Ocean will be among the areas most strongly affected by ocean acidification. Yet, the response of the plankton community of high latitudes to ocean acidification has not been studied so far. This work is part of the Arctic campaign of the European Project on Ocean Acidification (EPOCA) in 2010, employing 9 in situ mesocosms of about 45 000 l each to simulate ocean acidification in Kongsfjorden, Svalbard (78°56.2' N 11°53.6' E). In the present study, we investigated effects of elevated CO2 on the composition and richness of particle attached (PA; >3 μm) and free living (FL; <3 μm >0.2 μm) bacterial communities by Automated Ribosomal Intergenic Spacer Analysis (ARISA) in 6 of the mesocosms and the surrounding fjord, ranging from 185 to 1050 initial μatm pCO2. ARISA was able to resolve about 20-30 bacterial band-classes per sample and allowed for a detailed investigation of the explicit richness. Both the PA and the FL bacterioplankton communities exhibited a strong temporal development, which was driven mainly by temperature and phytoplankton development. In response to the breakdown of a picophytoplankton bloom (phase 3 of the experiment), the number of ARISA band-classes in the PA community was reduced at low and medium CO2 (∼180-600 μatm) by about 25%, while it remained more or less stable at high CO2 (∼650-800 μatm). We hypothesise that enhanced viral lysis and enhanced availability of organic substrates at high CO2 resulted in a more diverse PA-bacterial community in the post-bloom phase. Despite lower cell numbers and extracellular enzyme activities in the post-bloom phase, bacterial protein production was

  19. Circular inferences in schizophrenia.

    PubMed

    Jardri, Renaud; Denève, Sophie

    2013-11-01

    A considerable number of recent experimental and computational studies suggest that subtle impairments of excitatory to inhibitory balance or regulation are involved in many neurological and psychiatric conditions. The current paper aims to relate, specifically and quantitatively, excitatory to inhibitory imbalance with psychotic symptoms in schizophrenia. Considering that the brain constructs hierarchical causal models of the external world, we show that the failure to maintain the excitatory to inhibitory balance results in hallucinations as well as in the formation and subsequent consolidation of delusional beliefs. Indeed, the consequence of excitatory to inhibitory imbalance in a hierarchical neural network is equated to a pathological form of causal inference called 'circular belief propagation'. In circular belief propagation, bottom-up sensory information and top-down predictions are reverberated, i.e. prior beliefs are misinterpreted as sensory observations and vice versa. As a result, these predictions are counted multiple times. Circular inference explains the emergence of erroneous percepts, the patient's overconfidence when facing probabilistic choices, the learning of 'unshakable' causal relationships between unrelated events and a paradoxical immunity to perceptual illusions, which are all known to be associated with schizophrenia.

  20. Moment inference from tomograms

    USGS Publications Warehouse

    Day-Lewis, F. D.; Chen, Y.; Singha, K.

    2007-01-01

    Time-lapse geophysical tomography can provide valuable qualitative insights into hydrologic transport phenomena associated with aquifer dynamics, tracer experiments, and engineered remediation. Increasingly, tomograms are used to infer the spatial and/or temporal moments of solute plumes; these moments provide quantitative information about transport processes (e.g., advection, dispersion, and rate-limited mass transfer) and controlling parameters (e.g., permeability, dispersivity, and rate coefficients). The reliability of moments calculated from tomograms is, however, poorly understood because classic approaches to image appraisal (e.g., the model resolution matrix) are not directly applicable to moment inference. Here, we present a semi-analytical approach to construct a moment resolution matrix based on (1) the classic model resolution matrix and (2) image reconstruction from orthogonal moments. Numerical results for radar and electrical-resistivity imaging of solute plumes demonstrate that moment values calculated from tomograms depend strongly on plume location within the tomogram, survey geometry, regularization criteria, and measurement error. Copyright 2007 by the American Geophysical Union.

  1. Quality of Computationally Inferred Gene Ontology Annotations

    PubMed Central

    Škunca, Nives; Altenhoff, Adrian; Dessimoz, Christophe

    2012-01-01

    Gene Ontology (GO) has established itself as the undisputed standard for protein function annotation. Most annotations are inferred electronically, i.e. without individual curator supervision, but they are widely considered unreliable. At the same time, we crucially depend on those automated annotations, as most newly sequenced genomes are non-model organisms. Here, we introduce a methodology to systematically and quantitatively evaluate electronic annotations. By exploiting changes in successive releases of the UniProt Gene Ontology Annotation database, we assessed the quality of electronic annotations in terms of specificity, reliability, and coverage. Overall, we not only found that electronic annotations have significantly improved in recent years, but also that their reliability now rivals that of annotations inferred by curators when they use evidence other than experiments from primary literature. This work provides the means to identify the subset of electronic annotations that can be relied upon—an important outcome given that >98% of all annotations are inferred without direct curation. PMID:22693439

  2. BIE: Bayesian Inference Engine

    NASA Astrophysics Data System (ADS)

    Weinberg, Martin D.

    2013-12-01

    The Bayesian Inference Engine (BIE) is an object-oriented library of tools written in C++ designed explicitly to enable Bayesian update and model comparison for astronomical problems. To facilitate "what if" exploration, BIE provides a command line interface (written with Bison and Flex) to run input scripts. The output of the code is a simulation of the Bayesian posterior distribution from which summary statistics, e.g. moments, confidence intervals and so forth, can be determined. All of these quantities are fundamentally integrals, and the Markov Chain approach produces variates θ distributed according to P(θ|D), so moments are trivially obtained by summing over the ensemble of variates.
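
    The remark that moments reduce to sums over posterior variates can be illustrated with a short sketch. This is not BIE's C++ interface; every function name below is hypothetical, and a toy random-walk Metropolis sampler targeting an assumed standard normal posterior stands in for a real Markov Chain engine.

```python
import math
import random

def metropolis(log_post, theta0, n_steps, step=1.0, seed=0):
    """Toy random-walk Metropolis sampler (a stand-in for a real MCMC engine)."""
    rng = random.Random(seed)
    theta, lp = theta0, log_post(theta0)
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        lp_new = log_post(proposal)
        # Accept with probability min(1, P(proposal|D) / P(theta|D)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            theta, lp = proposal, lp_new
        samples.append(theta)
    return samples

def posterior_moments(samples):
    """Estimate the posterior mean and variance as ensemble averages."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var

def credible_interval(samples, level=0.95):
    """Equal-tailed interval read off the sorted variates."""
    ordered = sorted(samples)
    n = len(ordered)
    lo = ordered[int((1 - level) / 2 * n)]
    hi = ordered[min(n - 1, int((1 + level) / 2 * n))]
    return lo, hi

# Assumed target for illustration: log P(theta|D) of a standard normal posterior.
chain = metropolis(lambda t: -0.5 * t * t, theta0=0.0, n_steps=20000)
kept = chain[2000:]  # discard burn-in
mean, var = posterior_moments(kept)
interval = credible_interval(kept)
```

    Because the variates are already distributed according to the posterior, no explicit integration is needed: each summary is a sum or an order statistic of the retained chain.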

  3. Bayesian inference in geomagnetism

    NASA Technical Reports Server (NTRS)

    Backus, George E.

    1988-01-01

    The inverse problem in empirical geomagnetic modeling is investigated, with critical examination of recently published studies. Particular attention is given to the use of Bayesian inference (BI) to select the damping parameter lambda in the uniqueness portion of the inverse problem. The mathematical bases of BI and stochastic inversion are explored, with consideration of bound-softening problems and resolution in linear Gaussian BI. The problem of estimating the radial magnetic field B(r) at the earth core-mantle boundary from surface and satellite measurements is then analyzed in detail, with specific attention to the selection of lambda in the studies of Gubbins (1983) and Gubbins and Bloxham (1985). It is argued that the selection method is inappropriate and leads to lambda values much larger than those that would result if a reasonable bound on the heat flow at the CMB were assumed.

  4. Inferring Past Effective Population Size from Distributions of Coalescent Times

    PubMed Central

    Gattepaille, Lucie; Günther, Torsten; Jakobsson, Mattias

    2016-01-01

    Inferring and understanding changes in effective population size over time is a major challenge for population genetics. Here we investigate some theoretical properties of random-mating populations with varying size over time. In particular, we present an exact solution to compute the population size as a function of time, Ne(t), based on distributions of coalescent times of samples of any size. This result reduces the problem of population size inference to a problem of estimating coalescent time distributions. To illustrate the analytic results, we design a heuristic method using a tree-inference algorithm and investigate simulated and empirical population-genetic data. We investigate the effects of a range of conditions associated with empirical data, for instance number of loci, sample size, mutation rate, and cryptic recombination. We show that our approach performs well with genomic data (≥ 10,000 loci) and that increasing the sample size from 2 to 10 greatly improves the inference of Ne(t) whereas further increase in sample size results in modest improvements, even under a scenario of exponential growth. We also investigate the impact of recombination and characterize the potential biases in inference of Ne(t). The approach can handle large sample sizes and the computations are fast. We apply our method to human genomes from four populations and reconstruct population size profiles that are coherent with previous findings, including the Out-of-Africa bottleneck. Additionally, we uncover a potential difference in population size between African and non-African populations as early as 400 KYA. In summary, we provide an analytic relationship between distributions of coalescent times and Ne(t), which can be incorporated into powerful approaches for inferring past population sizes from population-genomic data. PMID:27638421
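
    The link between coalescent time distributions and Ne(t) can be sketched for the simplest case only. The code below is not the authors' general solution for samples of any size; it assumes a sample of two lineages in a diploid random-mating population, where the standard result is that the coalescence hazard at time t (in generations) is 1/(2 Ne(t)). The function name, the bin edges, and the simulated data are all hypothetical.

```python
import random

def ne_from_pairwise_times(times, bin_edges):
    """Piecewise-constant Ne estimate from pairwise coalescent times (generations).

    Within each bin the hazard is estimated as events / exposure, and
    Ne = 1 / (2 * hazard) under the diploid pairwise-coalescent assumption.
    """
    estimates = []
    for start, end in zip(bin_edges[:-1], bin_edges[1:]):
        # Coalescence events that fall inside this bin.
        events = sum(1 for t in times if start <= t < end)
        # Total time that pairs spend still uncoalesced inside the bin.
        exposure = sum(min(t, end) - start for t in times if t > start)
        if events == 0 or exposure == 0:
            estimates.append(None)  # hazard not estimable in this bin
        else:
            hazard = events / exposure
            estimates.append(1.0 / (2.0 * hazard))
    return estimates

# Simulated check with an assumed constant Ne of 1000: pairwise coalescent
# times are then exponential with rate 1 / (2 * 1000) per generation.
rng = random.Random(1)
true_ne = 1000
times = [rng.expovariate(1.0 / (2 * true_ne)) for _ in range(20000)]
estimates = ne_from_pairwise_times(times, [0, 500, 1000, 2000, 4000])
```

    With real data the coalescent times are not observed directly and must themselves be estimated, which is the step the abstract's tree-inference heuristic addresses.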

  5. Bayes factors and multimodel inference

    USGS Publications Warehouse

    Link, W.A.; Barker, R.J.; Thomson, David L.; Cooch, Evan G.; Conroy, Michael J.

    2009-01-01

    Multimodel inference has two main themes: model selection, and model averaging. Model averaging is a means of making inference conditional on a model set, rather than on a selected model, allowing formal recognition of the uncertainty associated with model choice. The Bayesian paradigm provides a natural framework for model averaging, and provides a context for evaluation of the commonly used AIC weights. We review Bayesian multimodel inference, noting the importance of Bayes factors. Noting the sensitivity of Bayes factors to the choice of priors on parameters, we define and propose nonpreferential priors as offering a reasonable standard for objective multimodel inference.
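
    The quantities named in this abstract can be made concrete with a small sketch. It uses only the textbook definitions (AIC weights proportional to exp(-ΔAIC/2); posterior model probabilities proportional to prior times marginal likelihood; the Bayes factor as a ratio of marginal likelihoods) and does not implement the nonpreferential priors proposed by the authors. All function names and numbers are hypothetical.

```python
import math

def aic_weights(aic_values):
    """AIC weights: exp(-delta_i / 2), normalised over the model set."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

def posterior_model_probs(log_marginal_likelihoods, prior_probs):
    """Posterior model probabilities from log marginal likelihoods and priors."""
    logs = [lm + math.log(p)
            for lm, p in zip(log_marginal_likelihoods, prior_probs)]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def bayes_factor(log_ml_1, log_ml_2):
    """Bayes factor of model 1 against model 2."""
    return math.exp(log_ml_1 - log_ml_2)

def model_average(estimates, weights):
    """Model-averaged estimate: per-model estimates weighted by model weights."""
    return sum(e * w for e, w in zip(estimates, weights))

# Hypothetical two-model example.
weights = aic_weights([100.0, 102.0])
probs = posterior_model_probs([-50.0, -51.0], [0.5, 0.5])
averaged = model_average([0.80, 0.60], probs)
```

    Note that the posterior model probabilities depend on the marginal likelihoods, which in turn depend on the parameter priors; this is the sensitivity the abstract refers to.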

  6. The Pattern of Change in the Abundances of Specific Bacterioplankton Groups Is Consistent across Different Nutrient-Enriched Habitats in Crete

    PubMed Central

    Fodelianakis, Stilianos; Papageorgiou, Nafsika; Pitta, Paraskevi; Kasapidis, Panagiotis; Karakassis, Ioannis

    2014-01-01

    A common source of disturbance for coastal aquatic habitats is nutrient enrichment through anthropogenic activities. Although the water column bacterioplankton communities in these environments have been characterized in some cases, changes in α-diversity and/or the abundances of specific taxonomic groups across enriched habitats remain unclear. Here, we investigated the bacterial community changes at three different nutrient-enriched and adjacent undisturbed habitats along the north coast of Crete, Greece: a fish farm, a closed bay within a town with low water renewal rates, and a city port where the level of nutrient enrichment and the trophic status of the habitat were different. Even though changes in α-diversity were different at each site, we observed across the sites a common change pattern accounting for most of the community variation for five of the most abundant bacterial groups: a decrease in the abundance of the Pelagibacteraceae and SAR86 and an increase in the abundance of the Alteromonadaceae, Rhodobacteraceae, and Cryomorphaceae in the impacted sites. The abundances of the groups that increased and decreased in the impacted sites were significantly correlated (positively and negatively, respectively) with the total heterotrophic bacterial counts and the concentrations of dissolved organic carbon and/or dissolved nitrogen and chlorophyll a, indicating that the common change pattern was associated with nutrient enrichment. Our results provide an in situ indication concerning the association of specific bacterioplankton groups with nutrient enrichment. These groups could potentially be used as indicators for nutrient enrichment if the pattern is confirmed over a broader spatial and temporal scale by future studies. PMID:24747897

  7. Alteration in successional trajectories of bacterioplankton communities in response to co-exposure of cadmium and phenanthrene in coastal water microcosms.

    PubMed

    Qian, Jie; Ding, Qifang; Guo, Annan; Zhang, Demin; Wang, Kai

    2017-02-01

    Coexistence of heavy metals and organic contaminants in coastal ecosystems may lead to complicated circumstances in ecotoxicological assessment for biological communities due to potential interactions of contaminants. Consequences of metals and polycyclic aromatic hydrocarbons (PAHs) co-contamination on coastal marine microbes at the community level have received less attention. We chose cadmium (Cd) and phenanthrene (PHE) as representatives of metals and PAHs, respectively, and mimicked contaminations using coastal water microcosms spiked with Cd (1 mg/L), PHE (1 mg/L), and their mixture over two weeks. 16S rRNA gene amplicon sequencing was used to compare individual and cumulative effects of Cd and PHE on temporal succession of bacterioplankton communities. Although we found dramatic impacts of dimethylsulfoxide (DMSO, used as a carrier solvent for PHE) on bacterial α-diversity and composition, the individual and cumulative effects of Cd and PHE on bacterial α-diversity were temporally variable, showing an antagonistic pattern at an early stage in the presence of DMSO. Temporal succession of bacterial community composition (BCC) was associated with temporal variability of water physicochemical parameters, each of which explained more variation in BCC than the two target contaminants did. However, Cd, PHE, and their mixture distinctly altered the successional trajectories of BCC, while only the effect of Cd was retained at the end of the experiment, suggesting certain resilience in BCC after the complete dissipation of PHE along the temporal trajectory. Moreover, bacterial assemblages at the genus level associated with the target contaminants were highly time-dependent and more unpredictable in the co-contamination group, in which some genera possessing hydrocarbon-degrading members might contribute to PHE degradation. These results provide preliminary insights into how co-exposure of Cd and PHE phylogenetically alters successional trajectories of bacterioplankton communities

  8. Response of bacterioplankton activity in an Arctic fjord system to elevated pCO2: results from a mesocosm perturbation study

    NASA Astrophysics Data System (ADS)

    Piontek, J.; Borchard, C.; Sperling, M.; Schulz, K. G.; Riebesell, U.; Engel, A.

    2012-08-01

    The effect of elevated seawater carbon dioxide (CO2) on the activity of a natural bacterioplankton community in an Arctic fjord system was investigated by a mesocosm perturbation study in the frame of the European Project on Ocean Acidification (EPOCA). A pCO2 range of 175-1085 μatm was set up in nine mesocosms deployed in the Kongsfjorden (Svalbard). The bacterioplankton communities responded to rising chlorophyll a concentrations after a lag phase of only a few days with increasing protein production and extracellular enzyme activity and revealed a close coupling of heterotrophic bacterial activity to phytoplankton productivity in this experiment. The natural extracellular enzyme assemblages showed increased activity in response to moderate acidification. A decrease in seawater pH of 0.5 units roughly doubled rates of β-glucosidase and leucine-aminopeptidase. Activities of extracellular enzymes in the mesocosms were directly related to both seawater pH and primary production. Also primary production and bacterial protein production in the mesocosms at different pCO2 were positively correlated. Therefore, it can be suggested that the efficient heterotrophic carbon utilization in this Arctic microbial food web had the potential to counteract increased phytoplankton production that was achieved under elevated pCO2 in this study. However, our results also show that the transfer of beneficial pCO2-related effects on the cellular bacterial metabolism to the scale of community activity and organic matter degradation can be mitigated by the top-down control of bacterial abundances in natural microbial communities.

  9. Computationally Efficient Composite Likelihood Statistics for Demographic Inference.

    PubMed

    Coffman, Alec J; Hsieh, Ping Hsun; Gravel, Simon; Gutenkunst, Ryan N

    2016-02-01

    Many population genetics tools employ composite likelihoods, because fully modeling genomic linkage is challenging. But traditional approaches to estimating parameter uncertainties and performing model selection require full likelihoods, so these tools have relied on computationally expensive maximum-likelihood estimation (MLE) on bootstrapped data. Here, we demonstrate that statistical theory can be applied to adjust composite likelihoods and perform robust computationally efficient statistical inference in two demographic inference tools: ∂a∂i and TRACTS. On both simulated and real data, the adjustments perform comparably to MLE bootstrapping while using orders of magnitude less computational time.

  10. GENIES: gene network inference engine based on supervised analysis.

    PubMed

    Kotera, Masaaki; Yamanishi, Yoshihiro; Moriya, Yuki; Kanehisa, Minoru; Goto, Susumu

    2012-07-01

    Gene network inference engine based on supervised analysis (GENIES) is a web server to predict unknown parts of a gene network from various types of genome-wide data in the framework of supervised network inference. The originality of GENIES lies in the construction of a predictive model using partially known network information and in the integration of heterogeneous data with kernel methods. The GENIES server accepts any 'profiles' of genes or proteins (e.g. gene expression profiles, protein subcellular localization profiles and phylogenetic profiles) or pre-calculated gene-gene similarity matrices (or 'kernels') in the tab-delimited file format. As a training data set to learn a predictive model, the users can choose either known molecular network information in the KEGG PATHWAY database or their own gene network data. The user can also select an algorithm of supervised network inference, choose various parameters in the method, and control the weights of heterogeneous data integration. The server provides the list of newly predicted gene pairs, maps the predicted gene pairs onto the associated pathway diagrams in KEGG PATHWAY and indicates candidate genes for missing enzymes in organism-specific metabolic pathways. GENIES (http://www.genome.jp/tools/genies/) is publicly available as one of the genome analysis tools in GenomeNet.

  11. Improving Inferences from Multiple Methods.

    ERIC Educational Resources Information Center

    Shotland, R. Lance; Mark, Melvin M.

    1987-01-01

    Multiple evaluation methods (MEMs) can cause an inferential challenge, although there are strategies to strengthen inferences. Practical and theoretical issues involved in the use by social scientists of MEMs, three potential problems in drawing inferences from MEMs, and short- and long-term strategies for alleviating these problems are outlined.…

  12. Causal Inference and Developmental Psychology

    ERIC Educational Resources Information Center

    Foster, E. Michael

    2010-01-01

    Causal inference is of central importance to developmental psychology. Many key questions in the field revolve around improving the lives of children and their families. These include identifying risk factors that if manipulated in some way would foster child development. Such a task inherently involves causal inference: One wants to know whether…

  13. Causal Inference in Retrospective Studies.

    ERIC Educational Resources Information Center

    Holland, Paul W.; Rubin, Donald B.

    1988-01-01

    The problem of drawing causal inferences from retrospective case-controlled studies is considered. A model for causal inference in prospective studies is applied to retrospective studies. Limitations of case-controlled studies are formulated concerning relevant parameters that can be estimated in such studies. A coffee-drinking/myocardial…

  14. Social Inference Through Technology

    NASA Astrophysics Data System (ADS)

    Oulasvirta, Antti

    Awareness cues are computer-mediated, real-time indicators of people’s undertakings, whereabouts, and intentions. Already in the mid-1970s, UNIX users could use commands such as “finger” and “talk” to find out who was online and to chat. The small icons in instant messaging (IM) applications that indicate coconversants’ presence in the discussion space are the successors of “finger” output. Similar indicators can be found in online communities, media-sharing services, Internet relay chat (IRC), and location-based messaging applications. But presence and availability indicators are only the tip of the iceberg. Technological progress has enabled richer, more accurate, and more intimate indicators. For example, there are mobile services that allow friends to query and follow each other’s locations. Remote monitoring systems developed for health care allow relatives and doctors to assess the wellbeing of homebound patients (see, e.g., Tang and Venables 2000). But users also utilize cues that have not been deliberately designed for this purpose. For example, online gamers pay attention to other characters’ behavior to infer what the other players are like “in real life.” There is a common denominator underlying these examples: shared activities rely on the technology’s representation of the remote person. The other human being is not physically present but present only through a narrow technological channel.

  15. Inferring Indel Parameters using a Simulation-based Approach

    PubMed Central

    Levy Karin, Eli; Rabin, Avigayel; Ashkenazy, Haim; Shkedy, Dafna; Avram, Oren; Cartwright, Reed A.; Pupko, Tal

    2015-01-01

    In this study, we present a novel methodology to infer indel parameters from multiple sequence alignments (MSAs) based on simulations. Our algorithm searches for the set of evolutionary parameters describing indel dynamics which best fits a given input MSA. In each step of the search, we use parametric bootstraps and the Mahalanobis distance to estimate how well a proposed set of parameters fits input data. Using simulations, we demonstrate that our methodology can accurately infer the indel parameters for a large variety of plausible settings. Moreover, using our methodology, we show that indel parameters substantially vary between three genomic data sets: Mammals, bacteria, and retroviruses. Finally, we demonstrate how our methodology can be used to simulate MSAs based on indel parameters inferred from real data sets. PMID:26537226
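
    The fit criterion mentioned above, the Mahalanobis distance between an observation and a cloud of parametric-bootstrap simulations, can be sketched as follows. This is only an illustration of the distance itself: the summary statistics, the function names, and the toy numbers are hypothetical, and the paper's actual statistics and search procedure are not reproduced here.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator)."""
    n = len(rows)
    mu = mean_vector(rows)
    d = len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mahalanobis(observed, simulated):
    """Distance of observed summary statistics from the simulated cloud."""
    mu = mean_vector(simulated)
    cov = covariance_matrix(simulated)
    diff = [o - m for o, m in zip(observed, mu)]
    # Solve cov * y = diff instead of inverting cov explicitly.
    y = solve(cov, diff)
    return math.sqrt(sum(d * v for d, v in zip(diff, y)))

# Hypothetical summary statistics (two per alignment) from four simulated MSAs.
simulated_stats = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
distance = mahalanobis([3.0, 1.0], simulated_stats)
```

    A smaller distance means the simulations generated under a candidate parameter set resemble the input more closely, so a search would favour parameters that minimise it; the sketch assumes the simulated covariance matrix is non-singular.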

  16. Chloroplast Phylogenomic Inference of Green Algae Relationships.

    PubMed

    Sun, Linhua; Fang, Ling; Zhang, Zhenhua; Chang, Xin; Penny, David; Zhong, Bojian

    2016-02-05

    The green algal phylum Chlorophyta has six diverse classes, but the phylogenetic relationship of the classes within Chlorophyta remains uncertain. In order to better understand the ancient Chlorophyta evolution, we have applied a site pattern sorting method to study compositional heterogeneity and the model fit in the green algal chloroplast genomic data. We show that the fastest-evolving sites are significantly correlated with among-site compositional heterogeneity, and these sites have a much poorer fit to the evolutionary model. Our phylogenomic analyses suggest that the class Chlorophyceae is a monophyletic group, and the classes Ulvophyceae, Trebouxiophyceae and Prasinophyceae are non-monophyletic groups. Our proposed phylogenetic tree of Chlorophyta will offer new insights to investigate ancient green algae evolution, and our analytical framework will provide a useful approach for evaluating and mitigating the potential errors of phylogenomic inferences.

  17. Pathway network inference from gene expression data

    PubMed Central

    2014-01-01

    Background: The development of high-throughput omics technologies enabled genome-wide measurements of the activity of cellular elements and provides the analytical resources for the progress of the Systems Biology discipline. Analysis and interpretation of gene expression data has evolved from the gene to the pathway and interaction level, i.e. from the detection of differentially expressed genes, to the establishment of gene interaction networks and the identification of enriched functional categories. Still, the understanding of biological systems requires a further level of analysis that addresses the characterization of the interaction between functional modules. Results: We present a novel computational methodology to study the functional interconnections among the molecular elements of a biological system. The PANA approach uses high-throughput genomics measurements and a functional annotation scheme to extract an activity profile from each functional block (or pathway), followed by machine-learning methods to infer the relationships between these functional profiles. The result is a global, interconnected network of pathways that represents the functional cross-talk within the molecular system. We have applied this approach to describe the functional transcriptional connections during the yeast cell cycle and to identify pathways that change their connectivity in a disease condition using an Alzheimer example. Conclusions: PANA is a useful tool to deepen our understanding of the functional interdependences that operate within complex biological systems. We show the approach is algorithmically consistent and the inferred network is well supported by the available functional data. The method allows the dissection of the molecular basis of the functional connections and we describe the different regulatory mechanisms that explain the network's topology obtained for the yeast cell cycle data. PMID:25032889

  18. The small genome of an abundant coastal ocean methylotroph.

    PubMed

    Giovannoni, Stephen J; Hayakawa, Darin H; Tripp, H James; Stingl, Ulrich; Givan, Scott A; Cho, Jang-Cheon; Oh, Hyun-Myung; Kitner, Joshua B; Vergin, Kevin L; Rappé, Michael S

    2008-07-01

    OM43 is a clade of uncultured beta-proteobacteria that is commonly found in environmental nucleic acid sequences from productive coastal ocean ecosystems, and some freshwater environments, but is rarely detected in ocean gyres. Ecological studies associate OM43 with phytoplankton blooms, and evolutionary relationships indicate that they might be methylotrophs. Here we report on the genome sequence and metabolic properties of the first axenic isolate of the OM43 clade, strain HTCC2181, which was obtained using new procedures for culturing cells in natural seawater. We found that this strain is an obligate methylotroph that cannot oxidize methane but can use the oxidized C1 compounds methanol and formaldehyde as sources of carbon and energy. Its complete genome is 1,304,428 bp in length, the smallest yet reported for a free-living cell. The HTCC2181 genome includes genes for xanthorhodopsin and retinal biosynthesis, an auxiliary system for producing transmembrane electrochemical potentials from light. The discovery that HTCC2181 is an extremely simple specialist in C1 metabolism suggests an unanticipated, important role for oxidized C1 compounds as substrates for bacterioplankton productivity in coastal ecosystems.

  19. Algorithmic methods to infer the evolutionary trajectories in cancer progression

    PubMed Central

    Graudenzi, Alex; Ramazzotti, Daniele; Sanz-Pamplona, Rebeca; De Sano, Luca; Mauri, Giancarlo; Moreno, Victor; Antoniotti, Marco; Mishra, Bud

    2016-01-01

    The genomic evolution inherent to cancer relates directly to a renewed focus on the voluminous next-generation sequencing data and machine learning for the inference of explanatory models of how the (epi)genomic events are choreographed in cancer initiation and development. However, despite the increasing availability of multiple additional -omics data, this quest has been frustrated by various theoretical and technical hurdles, mostly stemming from the dramatic heterogeneity of the disease. In this paper, we build on our recent work on the “selective advantage” relation among driver mutations in cancer progression and investigate its applicability to the modeling problem at the population level. Here, we introduce PiCnIc (Pipeline for Cancer Inference), a versatile, modular, and customizable pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes. The pipeline has many translational implications because it combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations, and progression model inference. We demonstrate PiCnIc’s ability to reproduce much of the current knowledge on colorectal cancer progression as well as to suggest novel experimentally verifiable hypotheses. PMID:27357673

  20. Bayesian Inference of Galaxy Morphology

    NASA Astrophysics Data System (ADS)

    Yoon, Ilsang; Weinberg, M.; Katz, N.

    2011-01-01

    Reliable inference on galaxy morphology from quantitative analysis of ensemble galaxy images is a challenging but essential ingredient in studying galaxy formation and evolution using current and forthcoming large-scale surveys. To put the galaxy image decomposition problem in the broader context of statistical inference and derive rigorous statistical confidence levels for the inference, I developed a novel galaxy image decomposition tool, GALPHAT (GALaxy PHotometric ATtributes), that exploits recent developments in Bayesian computation to provide full posterior probability distributions and reliable confidence intervals for all parameters. I will highlight the significant improvements in galaxy image decomposition using GALPHAT over conventional model-fitting algorithms and introduce GALPHAT's potential to infer the statistical distribution of galaxy morphological structures, using ensemble posteriors of galaxy morphological parameters from the entire galaxy population under study.

  1. Statistical Inference in Graphical Models

    DTIC Science & Technology

    2008-06-17

    Probabilistic Network Library (PNL). While not fully mature, PNL does provide the most commonly-used algorithms for inference and learning with the efficiency...of C++, and also offers interfaces for calling the library from MATLAB and R 1361. Notably, both BNT and PNL provide learning and inference algorithms...mature and has been used for research purposes for several years, it is written in MATLAB and thus is not suitable to be used in real-time settings. PNL

  2. Statistical Inference: The Big Picture.

    PubMed

    Kass, Robert E

    2011-02-01

    Statistics has moved beyond the frequentist-Bayesian controversies of the past. Where does this leave our ability to interpret results? I suggest that a philosophy compatible with statistical practice, labelled here statistical pragmatism, serves as a foundation for inference. Statistical pragmatism is inclusive and emphasizes the assumptions that connect statistical models with observed data. I argue that introductory courses often mis-characterize the process of statistical inference and I propose an alternative "big picture" depiction.

  3. Bayesian Inference: with ecological applications

    USGS Publications Warehouse

    Link, William A.; Barker, Richard J.

    2010-01-01

    This text provides a mathematically rigorous yet accessible and engaging introduction to Bayesian inference with relevant examples that will be of interest to biologists working in the fields of ecology, wildlife management and environmental studies as well as students in advanced undergraduate statistics. This text opens the door to Bayesian inference, taking advantage of modern computational efficiencies and easily accessible software to evaluate complex hierarchical models.

  4. Inferring the Why in Images

    DTIC Science & Technology

    2014-01-01

    images. To our knowledge, this challenging problem has not yet been extensively explored in computer vision. We present a novel learning based...automatically infers why people are performing actions in images by learning from visual data and written language. ∗denotes equal contribution 1 Report...explored in computer vision. We present a novel learning based framework that uses high-level visual recognition to infer why people are performing

  5. Active inference, communication and hermeneutics

    PubMed Central

    Friston, Karl J.; Frith, Christopher D.

    2015-01-01

    Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others – during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions – both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then – in principle – they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa. PMID:25957007

  6. Active inference, communication and hermeneutics.

    PubMed

    Friston, Karl J; Frith, Christopher D

    2015-07-01

    Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others--during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions--both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then--in principle--they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa.

  7. Causal inference and developmental psychology.

    PubMed

    Foster, E Michael

    2010-11-01

    Causal inference is of central importance to developmental psychology. Many key questions in the field revolve around improving the lives of children and their families. These include identifying risk factors that if manipulated in some way would foster child development. Such a task inherently involves causal inference: One wants to know whether the risk factor actually causes outcomes. Random assignment is not possible in many instances, and for that reason, psychologists must rely on observational studies. Such studies identify associations, and causal interpretation of such associations requires additional assumptions. Research in developmental psychology generally has relied on various forms of linear regression, but this methodology has limitations for causal inference. Fortunately, methodological developments in various fields are providing new tools for causal inference, tools that rely on more plausible assumptions. This article describes the limitations of regression for causal inference and describes how new tools might offer better causal inference. This discussion highlights the importance of properly identifying covariates to include in (and exclude from) the analysis. This discussion considers the directed acyclic graph for use in accomplishing this task. With the proper covariates having been chosen, many of the available methods rely on the assumption of "ignorability." The article discusses the meaning of ignorability and considers alternatives to this assumption, such as instrumental variables estimation. Finally, the article considers the use of the tools discussed in the context of a specific research question, the effect of family structure on child development.

  8. Decoding Plant and Animal Genome Plasticity from Differential Paleo-Evolutionary Patterns and Processes

    PubMed Central

    Murat, Florent; Van de Peer, Yves; Salse, Jérôme

    2012-01-01

    Continuing advances in genome sequencing technologies and computational methods for comparative genomics currently allow inferring the evolutionary history of entire plant and animal genomes. Based on the comparison of the plant and animal genome paleohistory, major differences are unveiled in 1) evolutionary mechanisms (i.e., polyploidization versus diploidization processes), 2) genome conservation (i.e., coding versus noncoding sequence maintenance), and 3) modern genome architecture (i.e., genome organization including repeats expansion versus contraction phenomena). This article discusses how extant animal and plant genomes are the result of inherently different rates and modes of genome evolution resulting in relatively stable animal and much more dynamic and plastic plant genomes. PMID:22833223

  9. Biochemical composition of pico-, nano- and micro-particulate organic matter and bacterioplankton biomass in the oligotrophic Cretan Sea (NE Mediterranean)

    NASA Astrophysics Data System (ADS)

    Danovaro, Roberto; Dell'Anno, Antonio; Pusceddu, Antonio; Marrale, Daniela; Della Croce, Norberto; Fabiano, Mauro; Tselepides, Anastasios

    2000-08-01

    The biochemical composition of different particle size classes (pico-, nano- and micro-particulate matter) and the bacterioplankton biomass were studied over an annual cycle in the Cretan Sea (South Aegean Sea, NE Mediterranean; from 40 to 1540 m depth) to investigate the origin, composition and fate of the suspended particles and to quantify bacterioplankton contribution to organic carbon pools. The oligotrophy of this system was indicated by the extremely low particulate lipid, protein and carbohydrate concentrations (4-15 times lower than in more productive systems). The biopolymeric carbon (BPC as the sum of lipid, protein and carbohydrate carbon) accounted for 80-100% of POC, suggesting the autochthonous origin of the particles. The most evident characteristic of this oligotrophic environment was the dominance of the pico-particles through all seasons, accounting for 43-45% of total carbohydrates, proteins and lipids. The proximate composition of the organic particles revealed the dominance of carbohydrates in all size-classes and highest values of the protein to carbohydrate ratio in the pico-particulate fraction. The relative proportion of the pico-, nano- and micro-particulate carbohydrates, proteins and lipids varied seasonally. The increase in the average particle size from February to September 95, probably as a result of aggregation, appeared to be related to the ‘thermal stability’ of the water column. The analysis of the vertical distribution of the three size classes revealed an increase in the pico fraction and a decrease in the larger components with increasing depth, suggesting that nano- and micro-particles were being degraded and fragmented in the deeper water layers. Bacterial densities ranged from 1.1 to 8.8 × 10^8 cells l^-1. Bacterial biomass accounted on average for more than 56% (up to 74%) of BPC and was by far the most important living component. Bacterial-N accounted for a large proportion (>90%) of the protein nitrogen pool

  10. Optimal inference with suboptimal models: Addiction and active Bayesian inference

    PubMed Central

    Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl

    2015-01-01

    When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321

  11. The Causal Meaning of Genomic Predictors and How It Affects Construction and Comparison of Genome-Enabled Selection Models

    PubMed Central

    Valente, Bruno D.; Morota, Gota; Peñagaricano, Francisco; Gianola, Daniel; Weigel, Kent; Rosa, Guilherme J. M.

    2015-01-01

    The term “effect” in additive genetic effect suggests a causal meaning. However, inferences of such quantities for selection purposes are typically viewed and conducted as a prediction task. Predictive ability as tested by cross-validation is currently the most acceptable criterion for comparing models and evaluating new methodologies. Nevertheless, it does not directly indicate if predictors reflect causal effects. Such evaluations would require causal inference methods that are not typical in genomic prediction for selection. This suggests that the usual approach to infer genetic effects contradicts the label of the quantity inferred. Here we investigate if genomic predictors for selection should be treated as standard predictors or if they must reflect a causal effect to be useful, requiring causal inference methods. Conducting the analysis as a prediction or as a causal inference task affects, for example, how covariates of the regression model are chosen, which may heavily affect the magnitude of genomic predictors and therefore selection decisions. We demonstrate that selection requires learning causal genetic effects. However, genomic predictors from some models might capture noncausal signal, providing good predictive ability but poorly representing true genetic effects. Simulated examples are used to show that aiming for predictive ability may lead to poor modeling decisions, while causal inference approaches may guide the construction of regression models that better infer the target genetic effect even when they underperform in cross-validation tests. In conclusion, genomic selection models should be constructed to aim primarily for identifiability of causal genetic effects, not for predictive ability. PMID:25908318

  12. Statistical inference and string theory

    NASA Astrophysics Data System (ADS)

    Heckman, Jonathan J.

    2015-09-01

    In this paper, we expose some surprising connections between string theory and statistical inference. We consider a large collective of agents sweeping out a family of nearby statistical models for an M-dimensional manifold of statistical fitting parameters. When the agents making nearby inferences align along a d-dimensional grid, we find that the pooled probability that the collective reaches a correct inference is the partition function of a nonlinear sigma model in d dimensions. Stability under perturbations to the original inference scheme requires the agents of the collective to distribute along two dimensions. Conformal invariance of the sigma model corresponds to the condition of a stable inference scheme, directly leading to the Einstein field equations for classical gravity. By summing over all possible arrangements of the agents in the collective, we reach a string theory. We also use this perspective to quantify how much an observer can hope to learn about the internal geometry of a superstring compactification. Finally, we present some brief speculative remarks on applications to the AdS/CFT correspondence and Lorentzian signature space-times.

  13. Enhanced viral production and virus-mediated mortality of bacterioplankton in a natural iron-fertilized bloom event above the Kerguelen Plateau

    NASA Astrophysics Data System (ADS)

    Malits, A.; Christaki, U.; Obernosterer, I.; Weinbauer, M. G.

    2014-07-01

    Above the Kerguelen Plateau in the Southern Ocean natural iron fertilization sustains a large phytoplankton bloom over three months during austral summer. During the KEOPS1 project (KErguelen Ocean and Plateau compared Study1) we sampled this phytoplankton bloom during its declining phase along with the surrounding HNLC waters to study the effect of natural iron fertilization on the role of viruses in the microbial food web. Bacterial and viral abundances were 1.7 and 2.1 times, respectively, higher within the bloom than in HNLC waters. Viral production and virus-mediated mortality of bacterioplankton were 4.1 and 4.9 times, respectively, higher in the bloom, while the fraction of infected cells (FIC) and the fraction of lysogenic cells (FLC) showed no significant differences between environments. The present study suggests viruses to be more important for bacterial mortality within the bloom and dominate over protozoan grazing during the late bloom phase. As a consequence, at least at a late bloom stage, viral lysis shunts part of the photosynthetically fixed carbon in iron-fertilized regions into the dissolved organic matter (DOM) pool with potentially less particulate organic carbon transferred to larger members of the food web or exported.

  14. Effect of 5-Fluoro-2′-Deoxyuridine on [3H]Thymidine Incorporation by Bacterioplankton in the Waters of Southwest Florida

    PubMed Central

    Jeffrey, Wade H.; Paul, John H.

    1988-01-01

    The effect of 5-fluoro-2′-deoxyuridine (FdUrd) on [methyl-3H] thymidine incorporation by bacterioplankton populations in subtropical freshwater, estuarine, and oceanic environments was examined. In estuarine waters, intracellular isotope dilution was inhibited by FdUrd, which enabled us to estimate both intracellular and extracellular isotope dilution. In 2 of 10 cases, extracellular isotope dilution was significant. At low concentrations of [methyl-3H]thymidine or [6-3H]thymidine, FdUrd completely inhibited incorporation of radioactivity into protein and RNA. At high concentrations of [3H]thymidine, however, FdUrd had little effect on labeling patterns. The dihydrofolate reductase inhibitors amethopterin and trimethoprim had no effect on macromolecular labeling patterns. These results suggest that thymidylate synthase is not involved in nonspecific labeling and that FdUrd inhibits nonspecific labeling by blocking some other enzyme involved in thymidine catabolism. In oligotrophic oceanic and freshwater samples, FdUrd did not inhibit intracellular isotope dilution or [3H]thymidine labeling of protein and RNA, but caused some inhibition of [3H]thymidine incorporation into DNA. The ability of FdUrd to inhibit nonspecific macromolecular labeling during [3H]thymidine incorporation was significantly correlated (r = 0.84) with total thymidine incorporation (in picomoles per liter per hour). The results are discussed in terms of applications of FdUrd to routine bacterial production measurements and the general assumptions of [3H]thymidine incorporation. PMID:16347546

  15. Effects of UV radiation on the taxonomic composition of natural bacterioplankton communities from Bahía Engaño (Patagonia, Argentina).

    PubMed

    Manrique, Julieta M; Calvo, Andrea Y; Halac, Silvana R; Villafañe, Virginia E; Jones, Leandro R; Walter Helbling, E

    2012-12-05

    In order to gain insights into the effects of solar ultraviolet radiation (UVR, 280-400 nm) on the composition of marine bacterioplankton communities from South Atlantic waters - Bahía Engaño (Patagonia, Argentina), we performed microcosm experiments during the Austral summer of 2010. Water samples were exposed to three solar radiation treatments in 25 L microcosms for 8 days: PAR+UV-A+UV-B (280-700 nm; PAB treatment), PAR+UV-A (320-700 nm; PA treatment), and PAR only (400-700 nm; P treatment). The taxonomic composition of the bacterial communities, at the beginning and at the end of the experiment, was studied by the analysis of 16S rDNA gene libraries. Multivariate and phylogenetic analyses demonstrated substantial differences in community composition: the samples exposed to PAR and PAR+UV-A had taxa assemblages more similar to each other than to that of the sample exposed to PAR+UV-A+UV-B. Our results indicate that overall, exposure to different radiation treatments can shape the taxonomic composition of marine bacterial populations, grown in microcosms, from this Patagonian area.

  16. Enhanced viral production and virus-mediated mortality of bacterioplankton in a natural iron-fertilized bloom event above the Kerguelen Plateau

    NASA Astrophysics Data System (ADS)

    Malits, A.; Christaki, U.; Obernosterer, I.; Weinbauer, M. G.

    2014-12-01

    Above the Kerguelen Plateau in the Southern Ocean natural iron fertilization sustains a large phytoplankton bloom over 3 months during austral summer. During the KEOPS1 project (KErguelen Ocean and Plateau compared Study1) we sampled this phytoplankton bloom during its declining phase along with the surrounding high-nutrient-low-chlorophyll (HNLC) waters to study the effect of natural iron fertilization on the role of viruses in the microbial food web. Bacterial and viral abundances were 1.7 and 2.1 times, respectively, higher within the bloom than in HNLC waters. Viral production and virus-mediated mortality of bacterioplankton were 4.1 and 4.9 times, respectively, higher in the bloom, while the fraction of infected cells (FIC) and the fraction of lysogenic cells (FLC) showed no significant differences between environments. The present study suggests viruses to be more important for bacterial mortality within the bloom and dominate over grazing of heterotrophic nanoflagellates (HNFs) during the late bloom phase. As a consequence, at least at a late bloom stage, viral lysis shunts part of the photosynthetically fixed carbon in iron-fertilized regions into the dissolved organic matter (DOM) pool with potentially less particulate organic carbon transferred to larger members of the food web or exported.

  17. Locative inferences in medical texts.

    PubMed

    Mayer, P S; Bailey, G H; Mayer, R J; Hillis, A; Dvoracek, J E

    1987-06-01

    Medical research relies on epidemiological studies conducted on a large set of clinical records that have been collected from physicians recording individual patient observations. These clinical records are recorded for the purpose of individual care of the patient with little consideration for their use by a biostatistician interested in studying a disease over a large population. Natural language processing of clinical records for epidemiological studies must deal with temporal, locative, and conceptual issues. This makes text understanding and data extraction of clinical records an excellent area for applied research. While much has been done in making temporal or conceptual inferences in medical texts, parallel work in locative inferences has not been done. This paper examines the locative inferences as well as the integration of temporal, locative, and conceptual issues in the clinical record understanding domain by presenting an application that utilizes two key concepts in its parsing strategy--a knowledge-based parsing strategy and a minimal lexicon.

  18. Ecological Genomics of the Uncultivated Marine Roseobacter Lineage CHAB-I-5

    PubMed Central

    Zhang, Yao; Sun, Ying; Jiao, Nianzhi; Stepanauskas, Ramunas

    2016-01-01

    Members of the marine Roseobacter clade are major participants in global carbon and sulfur cycles. While roseobacters are well represented in cultures, several abundant pelagic lineages, including SAG-O19, DC5-80-3, and NAC11-7, remain largely uncultivated and show evidence of genome streamlining. Here, we analyzed the partial genomes of three single cells affiliated with CHAB-I-5, another abundant but exclusively uncultivated Roseobacter lineage. Members of this lineage encode several metabolic potentials that are absent in streamlined genomes. Examples are quorum sensing and type VI secretion systems, which enable them to effectively interact with host and other bacteria. Further analysis of the CHAB-I-5 single-cell amplified genomes (SAGs) predicted that this lineage comprises members with relatively large genomes (4.1 to 4.4 Mbp) and a high fraction of noncoding DNA (10 to 12%), which is similar to what is observed in many cultured, nonstreamlined Roseobacter lineages. The four uncultured lineages, while exhibiting highly variable geographic distributions, together represent >60% of the global pelagic roseobacters. They are consistently enriched in genes encoding the capabilities of light harvesting, oxidation of “energy-rich” reduced sulfur compounds and methylated amines, uptake and catabolism of various carbohydrates and osmolytes, and consumption of abundant exudates from phytoplankton. These traits may define the global prevalence of the four lineages among marine bacterioplankton. PMID:26826224

  19. How Forgetting Aids Heuristic Inference

    ERIC Educational Resources Information Center

    Schooler, Lael J.; Hertwig, Ralph

    2005-01-01

    Some theorists, ranging from W. James (1890) to contemporary psychologists, have argued that forgetting is the key to proper functioning of memory. The authors elaborate on the notion of beneficial forgetting by proposing that loss of information aids inference heuristics that exploit mnemonic information. To this end, the authors bring together 2…

  20. Science Shorts: Observation versus Inference

    ERIC Educational Resources Information Center

    Leager, Craig R.

    2008-01-01

    When you observe something, how do you know for sure what you are seeing, feeling, smelling, or hearing? Asking students to think critically about their encounters with the natural world will help to strengthen their understanding and application of the science-process skills of observation and inference. In the following lesson, students make…

  1. The mechanisms of temporal inference

    NASA Technical Reports Server (NTRS)

    Fox, B. R.; Green, S. R.

    1987-01-01

    The properties of a temporal language are determined by its constituent elements: the temporal objects which it can represent, the attributes of those objects, the relationships between them, the axioms which define the default relationships, and the rules which define the statements that can be formulated. The methods of inference which can be applied to a temporal language are derived in part from a small number of axioms which define the meaning of equality and order and how those relationships can be propagated. More complex inferences involve detailed analysis of the stated relationships. Perhaps the most challenging area of temporal inference is reasoning over disjunctive temporal constraints. Simple forms of disjunction do not sufficiently increase the expressive power of a language while unrestricted use of disjunction makes the analysis NP-hard. In many cases a set of disjunctive constraints can be converted to disjunctive normal form and familiar methods of inference can be applied to the conjunctive sub-expressions. This process itself is NP-hard but it is made more tractable by careful expansion of a tree-structured search space.
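
    The conversion to disjunctive normal form described in this abstract can be illustrated with a minimal sketch. It is not taken from the report: the encoding of a constraint as a pair (a, b) meaning "a occurs before b", the function names, and the cycle check used as the consistency test are all assumptions made for illustration. The expansion here is exhaustive, so it shows why the process is expensive rather than the careful tree-structured search the authors mention.

```python
from itertools import product

# Hypothetical encoding: a constraint (a, b) means "event a occurs before event b".
# The input is a conjunction of clauses; each clause is a disjunction of constraints.

def to_dnf(clauses):
    """Expand a conjunction of disjunctive clauses into a list of conjunctions."""
    return [list(choice) for choice in product(*clauses)]

def is_consistent(conjunction):
    """A set of 'before' constraints is consistent iff its graph has no cycle."""
    graph = {}
    for a, b in conjunction:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def has_cycle(node):
        if state.get(node) == "visiting":
            return True   # reached a node that is still on the current path
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(has_cycle(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return not any(has_cycle(node) for node in graph)

def consistent_conjunctions(clauses):
    """Keep only the conjunctive sub-expressions that can be satisfied."""
    return [c for c in to_dnf(clauses) if is_consistent(c)]

constraints = [
    [("A", "B")],               # A before B
    [("B", "C"), ("C", "A")],   # B before C, or C before A
    [("C", "A"), ("A", "C")],   # C before A, or A before C
]

for conjunction in consistent_conjunctions(constraints):
    print(conjunction)
```

    The number of conjunctions is the product of the clause sizes (here 1 x 2 x 2 = 4), which is the source of the exponential cost; a practical implementation would presumably prune inconsistent branches while the tree is being expanded rather than afterwards.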

  2. Statistical inference and Aristotle's Rhetoric.

    PubMed

    Macdonald, Ranald R

    2004-11-01

    Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.

  3. Word Learning as Bayesian Inference

    ERIC Educational Resources Information Center

    Xu, Fei; Tenenbaum, Joshua B.

    2007-01-01

    The authors present a Bayesian framework for understanding how adults and children learn the meanings of words. The theory explains how learners can generalize meaningfully from just one or a few positive examples of a novel word's referents, by making rational inductive inferences that integrate prior knowledge about plausible word meanings with…

  4. Starfish: Robust spectroscopic inference tools

    NASA Astrophysics Data System (ADS)

    Czekala, Ian; Andrews, Sean M.; Mandel, Kaisey S.; Hogg, David W.; Green, Gregory M.

    2015-05-01

    Starfish is a set of tools used for spectroscopic inference. It robustly determines stellar parameters using high resolution spectral models and uses Markov Chain Monte Carlo (MCMC) to explore the full posterior probability distribution of the stellar parameters. Additional potential applications include other types of spectra, such as unresolved stellar clusters or supernovae spectra.
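
    As a rough illustration of what exploring a posterior with MCMC means, the following toy Metropolis sampler estimates a single parameter. It does not use or reproduce the Starfish code or its API: the Gaussian likelihood with known sigma, the flat prior, the data values, and every function and parameter name are assumptions chosen only to keep the sketch self-contained.

```python
import math
import random

def log_posterior(mu, data, sigma=1.0):
    """Log posterior of mu up to a constant: flat prior, Gaussian likelihood."""
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)

def metropolis(data, n_steps=20000, step=0.5, start=0.0, seed=0):
    """Random-walk Metropolis sampler for the single parameter mu."""
    rng = random.Random(seed)
    mu = start
    logp = log_posterior(mu, data)
    samples = []
    for _ in range(n_steps):
        proposal = mu + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal, data)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            mu, logp = proposal, logp_new
        samples.append(mu)
    return samples

# Made-up measurements of one quantity with unit measurement error.
data = [4.8, 5.1, 5.3, 4.9, 5.2]
samples = metropolis(data)[2000:]  # discard burn-in

mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(f"posterior mean ~ {mean:.2f}, posterior std ~ {std:.2f}")
```

    For this toy model the posterior is known analytically (a Gaussian centred on the sample mean with standard deviation sigma divided by the square root of the number of points), which makes it a convenient check that the sampler behaves sensibly before anything like it is applied to a multi-parameter spectral model.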

  5. Improving Explanatory Inferences from Assessments

    ERIC Educational Resources Information Center

    Diakow, Ronli Phyllis

    2013-01-01

    This dissertation comprises three papers that propose, discuss, and illustrate models to make improved inferences about research questions regarding student achievement in education. Addressing the types of questions common in educational research today requires three different "extensions" to traditional educational assessment: (1)…

  6. Perceptual Inference and Autistic Traits

    ERIC Educational Resources Information Center

    Skewes, Joshua C; Jegindø, Else-Marie; Gebauer, Line

    2015-01-01

    Autistic people are better at perceiving details. Major theories explain this in terms of bottom-up sensory mechanisms or in terms of top-down cognitive biases. Recently, it has become possible to link these theories within a common framework. This framework assumes that perception is implicit neural inference, combining sensory evidence with…

  7. A genomic perspective on hybridization and speciation

    PubMed Central

    Payseur, Bret A.; Rieseberg, Loren H.

    2016-01-01

    Hybridization among diverging lineages is common in nature. Genomic data provide a special opportunity to characterize the history of hybridization and the genetic basis of speciation. We review existing methods and empirical studies to identify recent advances in the genomics of hybridization, as well as issues that need to be addressed. Notable progress has been made in the development of methods for detecting hybridization and inferring individual ancestries. However, few approaches reconstruct the magnitude and timing of gene flow, estimate the fitness of hybrids or incorporate knowledge of recombination rate. Empirical studies indicate that the genomic consequences of hybridization are complex, including a highly heterogeneous landscape of differentiation. Inferred characteristics of hybridization differ substantially among species groups. Loci showing unusual patterns – which may contribute to reproductive barriers – are usually scattered throughout the genome, with potential enrichment in sex chromosomes and regions of reduced recombination. We caution against the growing trend of interpreting genomic variation in summary statistics across genomes as evidence of differential gene flow. We argue that converting genomic patterns into useful inferences about hybridization will ultimately require models and methods that directly incorporate key ingredients of speciation, including the dynamic nature of gene flow, selection acting in hybrid populations and recombination rate variation. PMID:26836441

  8. A new molecular approach to help conclude drowning as a cause of death: simultaneous detection of eight bacterioplankton species using real-time PCR assays with TaqMan probes.

    PubMed

    Uchiyama, Taketo; Kakizaki, Eiji; Kozawa, Shuji; Nishida, Sho; Imamura, Nahoko; Yukawa, Nobuhiro

    2012-10-10

    We developed a novel tool for concluding drowning as a cause of death. We designed nine primer pairs to detect representative freshwater or marine bacterioplankton (aquatic bacteria) and then used real-time PCR with TaqMan probes to rapidly and specifically detect them. We previously cultured the genus Aeromonas, which is a representative freshwater bacterial species, in blood samples from 94% of victims who drowned in freshwater and the genera Vibrio and/or Photobacterium that are representative marine bacteria in 88% of victims who drowned in seawater. Based on these results, we simultaneously detected eight species of bacterioplankton (Aeromonas hydrophila, A. salmonicida; Vibrio fischeri, V. harveyi, V. parahaemolyticus; Photobacterium damselae, P. leiognathi, P. phosphoreum) using three sets of triplex real-time PCR assays and TaqMan probes labelled with fluorophores (FAM, NED, Cy5). We assayed 266 specimens (109 blood, 157 tissues) from 43 victims, including 32 who had drowned in rivers, ditches, wells, sea or around estuaries. All lung samples of these 32 victims were TaqMan PCR-positive including the lung periphery into which water does not readily enter postmortem. On the other hand, findings in blood and/or closed organs (kidney or liver) were PCR-positive in 84% of the drowned victims (except for those who drowned in baths) although the conventional test detected diatoms in closed organs in only 44% of the victims. Thus, the results of the PCR assay reinforced those of diatom tests when only a few diatoms were detectable in organs due to the low density of diatoms in the water where they were found. Multiplex TaqMan PCR assays for bacterioplankton were rapid, less laborious and high-throughput as well as sensitive and specific. Therefore, these assays would be useful for routine forensic screening tests to estimate the amount and type of aspirated water.

  9. Computational inference of gene regulatory networks: Approaches, limitations and opportunities.

    PubMed

    Banf, Michael; Rhee, Seung Y

    2017-01-01

    Gene regulatory networks lie at the core of cell function control. In E. coli and S. cerevisiae, the study of gene regulatory networks has led to the discovery of regulatory mechanisms responsible for the control of cell growth, differentiation and responses to environmental stimuli. In plants, computational rendering of gene regulatory networks is gaining momentum, thanks to the recent availability of high-quality genomes and transcriptomes and development of computational network inference approaches. Here, we review current techniques, challenges and trends in gene regulatory network inference and highlight challenges and opportunities for plant science. We provide plant-specific application examples to guide researchers in selecting methodologies that suit their particular research questions. Given the interdisciplinary nature of gene regulatory network inference, we tried to cater to both biologists and computer scientists to help them engage in a dialogue about concepts and caveats in network inference. Specifically, we discuss problems and opportunities in heterogeneous data integration for eukaryotic organisms and common caveats to be considered during network model evaluation. This article is part of a Special Issue entitled: Plant Gene Regulatory Mechanisms and Networks, edited by Dr. Erich Grotewold and Dr. Nathan Springer.

  10. Gene regulatory network inference using out of equilibrium statistical mechanics

    PubMed Central

    Benecke, Arndt

    2008-01-01

    Spatiotemporal control of gene expression is fundamental to multicellular life. Despite prodigious efforts, the encoding of gene expression regulation in eukaryotes is not understood. Gene expression analyses nourish the hope to reverse engineer effector-target gene networks using inference techniques. Inference from noisy and circumstantial data relies on using robust models with few parameters for the underlying mechanisms. However, a systematic path to gene regulatory network reverse engineering from functional genomics data is still impeded by fundamental problems. Recently, Johannes Berg from the Theoretical Physics Institute of Cologne University has made two remarkable contributions that significantly advance the gene regulatory network inference problem. Berg, who uses gene expression data from yeast, has demonstrated a nonequilibrium regime for mRNA concentration dynamics and was able to map the gene regulatory process upon simple stochastic systems driven out of equilibrium. The impact of his demonstration is twofold, affecting both the understanding of the operational constraints under which transcription occurs and the capacity to extract relevant information from highly time-resolved expression data. Berg has used his observation to predict target genes of selected transcription factors, and thereby, in principle, demonstrated applicability of his out of equilibrium statistical mechanics approach to the gene network inference problem. PMID:19404429

  11. Towards General Algorithms for Grammatical Inference

    NASA Astrophysics Data System (ADS)

    Clark, Alexander

    Many algorithms for grammatical inference can be viewed as instances of a more general algorithm which maintains a set of primitive elements, which distributionally define sets of strings, and a set of features or tests that constrain various inference rules. Using this general framework, which we cast as a process of logical inference, we re-analyse Angluin's famous lstar algorithm and several recent algorithms for the inference of context-free grammars and multiple context-free grammars. Finally, to illustrate the advantages of this approach, we extend it to the inference of functional transductions from positive data only, and we present a new algorithm for the inference of finite state transducers.

  12. Genomic Reconstruction of the Transcriptional Regulatory Network in Bacillus subtilis

    PubMed Central

    Leyn, Semen A.; Kazanov, Marat D.; Sernova, Natalia V.; Ermakova, Ekaterina O.; Novichkov, Pavel S.

    2013-01-01

    The adaptation of microorganisms to their environment is controlled by complex transcriptional regulatory networks (TRNs), which are still only partially understood even for model species. Genome scale annotation of regulatory features of genes and TRN reconstruction are challenging tasks of microbial genomics. We used the knowledge-driven comparative-genomics approach implemented in the RegPredict Web server to infer TRN in the model Gram-positive bacterium Bacillus subtilis and 10 related Bacillales species. For transcription factor (TF) regulons, we combined the available information from the DBTBS database and the literature with bioinformatics tools, allowing inference of TF binding sites (TFBSs), comparative analysis of the genomic context of predicted TFBSs, functional assignment of target genes, and effector prediction. For RNA regulons, we used known RNA regulatory motifs collected in the Rfam database to scan genomes and analyze the genomic context of new RNA sites. The inferred TRN in B. subtilis comprises regulons for 129 TFs and 24 regulatory RNA families. First, we analyzed 66 TF regulons with previously known TFBSs in B. subtilis and projected them to other Bacillales genomes, resulting in refinement of TFBS motifs and identification of novel regulon members. Second, we inferred motifs and described regulons for 28 experimentally studied TFs with previously unknown TFBSs. Third, we discovered novel motifs and reconstructed regulons for 36 previously uncharacterized TFs. The inferred collection of regulons is available in the RegPrecise database (http://regprecise.lbl.gov/) and can be used in genetic experiments, metabolic modeling, and evolutionary analysis. PMID:23504016

  13. Comparative analysis of chromosome counts infers three paleopolyploidies in the mollusca.

    PubMed

    Hallinan, Nathaniel M; Lindberg, David R

    2011-01-01

    The study of paleopolyploidies requires the comparison of multiple whole genome sequences. If the branches of a phylogeny on which a whole-genome duplication (WGD) occurred could be identified before genome sequencing, taxa could be selected that provided a better assessment of that genome duplication. Here, we describe a likelihood model in which the number of chromosomes in a genome evolves according to a Markov process with one rate of chromosome duplication and loss that is proportional to the number of chromosomes in the genome and another stochastic rate at which every chromosome in the genome could duplicate in a single event. We compare the maximum likelihoods of a model in which the genome duplication rate varies to one in which it is fixed at zero using the Akaike information criterion, to determine if a model with WGDs is a good fit for the data. Once it has been determined that the data does fit the WGD model, we infer the phylogenetic position of paleopolyploidies by calculating the posterior probability that a WGD occurred on each branch of the taxon tree. Here, we apply this model to a molluscan tree represented by 124 taxa and infer three putative WGD events. In the Gastropoda, we identify a single branch within the Hypsogastropoda and one of two branches at the base of the Stylommatophora. We also identify one or two branches near the base of the Cephalopoda.
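    The model comparison step described in this abstract can be sketched as follows. This is only an illustration of the Akaike information criterion itself, not the authors' likelihood model: the log-likelihood values are made up, and the parameter counts (one rate for the null model, two for the WGD model) are an assumption read off the abstract.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def prefers_wgd(loglik_null: float, loglik_wgd: float,
                k_null: int = 1, k_wgd: int = 2) -> bool:
    """Return True if the model with a free genome duplication rate has lower AIC.

    k_null and k_wgd are assumed parameter counts: the null model fixes the
    whole-genome duplication rate at zero, the WGD model estimates it.
    """
    return aic(loglik_wgd, k_wgd) < aic(loglik_null, k_null)


if __name__ == "__main__":
    # Hypothetical maximum log-likelihoods, for illustration only.
    print(aic(-100.0, 1))               # null model
    print(aic(-95.0, 2))                # WGD model
    print(prefers_wgd(-100.0, -95.0))   # large gain in fit: WGD model preferred
    print(prefers_wgd(-100.0, -99.5))   # small gain in fit: extra parameter not justified
```

    The extra parameter is accepted only when it raises the maximum log-likelihood by more than one unit, which is the trade-off the criterion encodes.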

  14. Statistical learning and selective inference

    PubMed Central

    Taylor, Jonathan; Tibshirani, Robert J.

    2015-01-01

    We describe the problem of “selective inference.” This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have “cherry-picked”—searched for the strongest associations—means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis. PMID:26100887
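    A short calculation shows why "cherry-picking" calls for a higher bar. The sketch below assumes m independent tests of true null hypotheses and uses the classical Bonferroni correction as a stand-in; it is not one of the selective inference procedures developed in the paper, and the function names are illustrative.

```python
def prob_any_false_positive(m: int, alpha: float = 0.05) -> float:
    """Chance that at least one of m independent null tests has p < alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(m: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    for m in (1, 20, 100):
        print(m, round(prob_any_false_positive(m), 4), bonferroni_threshold(m))
```

    With 20 candidate associations and no real effects, the strongest one clears the usual 0.05 threshold roughly 64% of the time, so reporting it at face value overstates the evidence.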

  15. Causal inference based on counterfactuals

    PubMed Central

    Höfler, M

    2005-01-01

    Background: The counterfactual or potential outcome model has become increasingly standard for causal inference in epidemiological and medical studies. Discussion: This paper provides an overview of the counterfactual and related approaches. A variety of conceptual as well as practical issues when estimating causal effects are reviewed. These include causal interactions, imperfect experiments, adjustment for confounding, time-varying exposures, competing risks and the probability of causation. It is argued that the counterfactual model of causal effects captures the main aspects of causality in health sciences and relates to many statistical procedures. Summary: Counterfactuals are the basis of causal inference in medicine and epidemiology. Nevertheless, the estimation of counterfactual differences poses several difficulties, primarily in observational studies. These problems, however, reflect fundamental barriers only when learning from observations, and this does not invalidate the counterfactual concept. PMID:16159397

  16. Statistical learning and selective inference.

    PubMed

    Taylor, Jonathan; Tibshirani, Robert J

    2015-06-23

    We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.

  17. Inferring Centrality from Network Snapshots

    NASA Astrophysics Data System (ADS)

    Shao, Haibin; Mesbahi, Mehran; Li, Dewei; Xi, Yugeng

    2017-01-01

    The topology and dynamics of a complex network shape its functionality. However, the topologies of many large-scale networks are either unavailable or incomplete. Without the explicit knowledge of network topology, we show how the data generated from the network dynamics can be utilised to infer the tempo centrality, which is proposed to quantify the influence of nodes in a consensus network. We show that the tempo centrality can be used to construct an accurate estimate of both the propagation rate of influence exerted on consensus networks and the Kirchhoff index of the underlying graph. Moreover, the tempo centrality also encodes the disturbance rejection of nodes in a consensus network. Our findings provide an approach to infer the performance of a consensus network from its temporal data.

  18. Network Plasticity as Bayesian Inference

    PubMed Central

    Legenstein, Robert; Maass, Wolfgang

    2015-01-01

    General results from statistical learning theory suggest that not only brain computations but also brain plasticity should be understood as probabilistic inference. But a model for that has been missing. We propose that inherently stochastic features of synaptic plasticity and spine motility enable cortical networks of neurons to carry out probabilistic inference by sampling from a posterior distribution of network configurations. This model provides a viable alternative to existing models that propose convergence of parameters to maximum likelihood values. It explains how priors on weight distributions and connection probabilities can be merged optimally with learned experience, how cortical networks can generalize learned information so well to novel experiences, and how they can compensate continuously for unforeseen disturbances of the network. The resulting new theory of network plasticity explains from a functional perspective a number of experimental data on stochastic aspects of synaptic plasticity that previously appeared to be quite puzzling. PMID:26545099

  19. Bayesian Inference on Proportional Elections

    PubMed Central

    Brunello, Gabriel Hideki Vatanabe; Nakano, Eduardo Yoshio

    2015-01-01

    Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee the candidate with the most percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied on proportional elections. In this context, the purpose of this paper was to perform a Bayesian inference on proportional elections considering the Brazilian system of seats distribution. More specifically, a methodology to answer the probability that a given party will have representation on the chamber of deputies was developed. Inferences were made on a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied on data from the Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies in 2010. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software. PMID:25786259
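    The Monte Carlo idea in this abstract can be sketched in a few lines. The paper's calculations were done in R and follow the Brazilian seat distribution rules; the sketch below is in Python and makes simplifying assumptions that are not from the paper: a uniform Dirichlet prior on vote shares, a plain D'Hondt highest-averages allocation in place of the Brazilian rules, and made-up poll counts.

```python
import random


def dhondt(votes, seats):
    """Allocate seats with the D'Hondt highest-averages rule (assumed stand-in)."""
    won = [0] * len(votes)
    for _ in range(seats):
        quotients = [v / (w + 1) for v, w in zip(votes, won)]
        winner = max(range(len(votes)), key=quotients.__getitem__)
        won[winner] += 1
    return won


def sample_dirichlet(alphas, rng):
    """Draw one vector of vote shares from a Dirichlet distribution."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def prob_representation(poll_counts, seats, n_sims=2000, seed=0):
    """Estimate, for each party, the posterior probability of winning >= 1 seat."""
    rng = random.Random(seed)
    alphas = [c + 1 for c in poll_counts]  # uniform prior plus poll counts
    hits = [0] * len(poll_counts)
    for _ in range(n_sims):
        shares = sample_dirichlet(alphas, rng)
        for party, n_seats in enumerate(dhondt(shares, seats)):
            if n_seats > 0:
                hits[party] += 1
    return [h / n_sims for h in hits]


if __name__ == "__main__":
    # Hypothetical poll: respondents per party, 3 seats in dispute.
    print(prob_representation([500, 300, 5], seats=3))
```

    Each simulated draw turns one plausible set of vote shares into a seat allocation, so the fraction of draws in which a party holds at least one seat approximates the probability of representation.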

  20. System Support for Forensic Inference

    NASA Astrophysics Data System (ADS)

    Gehani, Ashish; Kirchner, Florent; Shankar, Natarajan

    Digital evidence is playing an increasingly important role in prosecuting crimes. The reasons are manifold: financially lucrative targets are now connected online, systems are so complex that vulnerabilities abound and strong digital identities are being adopted, making audit trails more useful. If the discoveries of forensic analysts are to hold up to scrutiny in court, they must meet the standard for scientific evidence. Software systems are currently developed without consideration of this fact. This paper argues for the development of a formal framework for constructing “digital artifacts” that can serve as proxies for physical evidence; a system so imbued would facilitate sound digital forensic inference. A case study involving a filesystem augmentation that provides transparent support for forensic inference is described.

  1. Inferring Centrality from Network Snapshots

    PubMed Central

    Shao, Haibin; Mesbahi, Mehran; Li, Dewei; Xi, Yugeng

    2017-01-01

    The topology and dynamics of a complex network shape its functionality. However, the topologies of many large-scale networks are either unavailable or incomplete. Without the explicit knowledge of network topology, we show how the data generated from the network dynamics can be utilised to infer the tempo centrality, which is proposed to quantify the influence of nodes in a consensus network. We show that the tempo centrality can be used to construct an accurate estimate of both the propagation rate of influence exerted on consensus networks and the Kirchhoff index of the underlying graph. Moreover, the tempo centrality also encodes the disturbance rejection of nodes in a consensus network. Our findings provide an approach to infer the performance of a consensus network from its temporal data. PMID:28098166

  2. Bayesian inference for agreement measures.

    PubMed

    Vidal, Ignacio; de Castro, Mário

    2016-08-25

    The agreement of different measurement methods is an important issue in several disciplines like, for example, Medicine, Metrology, and Engineering. In this article, some agreement measures, common in the literature, were analyzed from a Bayesian point of view. Posterior inferences for such agreement measures were obtained based on well-known Bayesian inference procedures for the bivariate normal distribution. As a consequence, a general, simple, and effective method is presented, which does not require Markov Chain Monte Carlo methods and can be applied considering a great variety of prior distributions. Illustratively, the method was exemplified using five objective priors for the bivariate normal distribution. A tool for assessing the adequacy of the model is discussed. Results from a simulation study and an application to a real dataset are also reported.

  3. Comparative 16S rRNA Analysis of Lake Bacterioplankton Reveals Globally Distributed Phylogenetic Clusters Including an Abundant Group of Actinobacteria

    PubMed Central

    Glöckner, Frank Oliver; Zaichikov, Evgeny; Belkova, Natalia; Denissova, Ludmilla; Pernthaler, Jakob; Pernthaler, Annelie; Amann, Rudolf

    2000-01-01

    In a search for cosmopolitan phylogenetic clusters of freshwater bacteria, we recovered a total of 190 full and partial 16S ribosomal DNA (rDNA) sequences from three different lakes (Lake Gossenköllesee, Austria; Lake Fuchskuhle, Germany; and Lake Baikal, Russia). The phylogenetic comparison with the currently available rDNA data set showed that our sequences fall into 16 clusters, which otherwise include bacterial rDNA sequences of primarily freshwater and soil, but not marine, origin. Six of the clusters were affiliated with the α, four were affiliated with the β, and one was affiliated with the γ subclass of the Proteobacteria; four were affiliated with the Cytophaga-Flavobacterium-Bacteroides group; and one was affiliated with the class Actinobacteria (formerly known as the high-G+C gram-positive bacteria). The latter cluster (hgcI) is monophyletic and so far includes only sequences directly retrieved from aquatic environments. Fluorescence in situ hybridization (FISH) with probes specific for the hgcI cluster showed abundances of up to 1.7 × 10^5 cells ml^-1 in Lake Gossenköllesee, with strong seasonal fluctuations, and high abundances in the two other lakes investigated. Cell size measurements revealed that Actinobacteria in Lake Gossenköllesee can account for up to 63% of the bacterioplankton biomass. A combination of phylogenetic analysis and FISH was used to reveal 16 globally distributed sequence clusters and to confirm the broad distribution, abundance, and high biomass of members of the class Actinobacteria in freshwater ecosystems. PMID:11055963

  4. Inference of reversible tree languages.

    PubMed

    López, Damián; Sempere, José M; García, Pedro

    2004-08-01

    In this paper, we study the notion of k-reversibility and k-testability when regular tree languages are involved. We present an inference algorithm for learning a k-testable tree language that runs in polynomial time with respect to the size of the sample used. We also study the tree language classes in relation to other well known ones, and some properties of these languages are proven.

  5. Fast, Flexible, Rational Inductive Inference

    DTIC Science & Technology

    2013-08-23

    learning phonetic categories – the sounds that make up speech – learning the words that those sounds appear in provides sufficiently strong constraints...first to be able to infer realistic phonetic categories directly from simulated speech data. Objective 2.2: Forming feature-based representations...lexicon in phonetic category acquisition. Psychological Review. Griffiths, T. L., Austerweil, J. L., & Berthiaume, V. G. (2012). Comparing the

  6. Cortical circuits for perceptual inference.

    PubMed

    Friston, Karl; Kiebel, Stefan

    2009-10-01

    This paper assumes that cortical circuits have evolved to enable inference about the causes of sensory input received by the brain. This provides a principled specification of what neural circuits have to achieve. Here, we attempt to address how the brain makes inferences by casting inference as an optimisation problem. We look at how the ensuing recognition dynamics could be supported by directed connections and message-passing among neuronal populations, given our knowledge of intrinsic and extrinsic neuronal connections. We assume that the brain models the world as a dynamic system, which imposes causal structure on the sensorium. Perception is equated with the optimisation or inversion of this internal model, to explain sensory input. Given a model of how sensory data are generated, we use a generic variational approach to model inversion to furnish equations that prescribe recognition; i.e., the dynamics of neuronal activity that represents the causes of sensory input. Here, we focus on a model whose hierarchical and dynamical structure enables simulated brains to recognise and predict sequences of sensory states. We first review these models and their inversion under a variational free-energy formulation. We then show that the brain has the necessary infrastructure to implement this inversion and present simulations using synthetic birds that generate and recognise birdsongs.

  7. An introduction to causal inference.

    PubMed

    Pearl, Judea

    2010-02-26

    This paper summarizes recent advances in causal inference and underscores the paradigmatic shifts that must be undertaken in moving from traditional statistical analysis to causal analysis of multivariate data. Special emphasis is placed on the assumptions that underlie all causal inferences, the languages used in formulating those assumptions, the conditional nature of all causal and counterfactual claims, and the methods that have been developed for the assessment of such claims. These advances are illustrated using a general theory of causation based on the Structural Causal Model (SCM) described in Pearl (2000a), which subsumes and unifies other approaches to causation, and provides a coherent mathematical foundation for the analysis of causes and counterfactuals. In particular, the paper surveys the development of mathematical tools for inferring (from a combination of data and assumptions) answers to three types of causal queries: those about (1) the effects of potential interventions, (2) probabilities of counterfactuals, and (3) direct and indirect effects (also known as "mediation"). Finally, the paper defines the formal and conceptual relationships between the structural and potential-outcome frameworks and presents tools for a symbiotic analysis that uses the strong features of both. The tools are demonstrated in the analyses of mediation, causes of effects, and probabilities of causation.

  8. Children's and adults' evaluation of the certainty of deductive inferences, inductive inferences, and guesses.

    PubMed

    Pillow, Bradford H

    2002-01-01

    Two experiments investigated kindergarten through fourth-grade children's and adults' (N = 128) ability to (1) evaluate the certainty of deductive inferences, inductive inferences, and guesses; and (2) explain the origins of inferential knowledge. When judging their own cognitive state, children in first grade and older rated deductive inferences as more certain than guesses; but when judging another person's knowledge, children did not distinguish valid inferences from invalid inferences and guesses until fourth grade. By third grade, children differentiated their own deductive inferences from inductive inferences and guesses, but only adults both differentiated deductive inferences from inductive inferences and differentiated inductive inferences from guesses. Children's recognition of their own inferences may contribute to the development of knowledge about cognitive processes, scientific reasoning, and a constructivist epistemology.

  9. Single-cell enabled comparative genomics of a deep ocean SAR11 bathytype

    PubMed Central

    Cameron Thrash, J; Temperton, Ben; Swan, Brandon K; Landry, Zachary C; Woyke, Tanja; DeLong, Edward F; Stepanauskas, Ramunas; Giovannoni, Stephan J

    2014-01-01

    Bacterioplankton of the SAR11 clade are the most abundant microorganisms in marine systems, usually representing 25% or more of the total bacterial cells in seawater worldwide. SAR11 is divided into subclades with distinct spatiotemporal distributions (ecotypes), some of which appear to be specific to deep water. Here we examine the genomic basis for deep ocean distribution of one SAR11 bathytype (depth-specific ecotype), subclade Ic. Four single-cell Ic genomes, with estimated completeness of 55%–86%, were isolated from 770 m at station ALOHA and compared with eight SAR11 surface genomes and metagenomic datasets. Subclade Ic genomes dominated metagenomic fragment recruitment below the euphotic zone. They had similar COG distributions, high local synteny and shared a large number (69%) of orthologous clusters with SAR11 surface genomes, yet were distinct at the 16S rRNA gene and amino-acid level, and formed a separate, monophyletic group in phylogenetic trees. Subclade Ic genomes were enriched in genes associated with membrane/cell wall/envelope biosynthesis and showed evidence of unique phage defenses. The majority of subclade Ic-specific genes were hypothetical, and some were highly abundant in deep ocean metagenomic data, potentially masking mechanisms for niche differentiation. However, the evidence suggests these organisms have a similar metabolism to their surface counterparts, and that subclade Ic adaptations to the deep ocean do not involve large variations in gene content, but rather more subtle differences previously observed in deep ocean genomic data, like preferential amino-acid substitutions, larger coding regions among SAR11 clade orthologs, larger intergenic regions and larger estimated average genome size. PMID:24451205

  10. Bayesian Computation Methods for Inferring Regulatory Network Models Using Biomedical Data.

    PubMed

    Tian, Tianhai

    2016-01-01

    The rapid advancement of high-throughput technologies provides huge amounts of information on gene expression and protein activity on a genome-wide scale. The availability of genomics, transcriptomics, proteomics, and metabolomics datasets gives an unprecedented opportunity to study detailed molecular regulation, which is very important to precision medicine. However, it is still a significant challenge to design effective and efficient methods to infer the network structure and dynamic properties of regulatory networks. In recent years a number of computational methods have been designed to explore the regulatory mechanisms as well as to estimate unknown model parameters. Among them, the Bayesian inference method can combine both prior knowledge and experimental data to generate updated information regarding the regulatory mechanisms. This chapter gives a brief review of Bayesian statistical methods that are used to infer the network structure and estimate model parameters based on experimental data.

  11. Plant functional genomics

    NASA Astrophysics Data System (ADS)

    Holtorf, Hauke; Guitton, Marie-Christine; Reski, Ralf

    2002-04-01

    Functional genome analysis of plants has entered the high-throughput stage. The complete genome information from key species such as Arabidopsis thaliana and rice is now available and will further boost the application of a range of new technologies to functional plant gene analysis. To broadly assign functions to unknown genes, different fast and multiparallel approaches are currently used and developed. These new technologies are based on known methods but are adapted and improved to accommodate for comprehensive, large-scale gene analysis, i.e. such techniques are novel in the sense that their design allows researchers to analyse many genes at the same time and at an unprecedented pace. Such methods allow analysis of the different constituents of the cell that help to deduce gene function, namely the transcripts, proteins and metabolites. Similarly the phenotypic variations of entire mutant collections can now be analysed in a much faster and more efficient way than before. The different methodologies have developed to form their own fields within the functional genomics technological platform and are termed transcriptomics, proteomics, metabolomics and phenomics. Gene function, however, cannot solely be inferred by using only one such approach. Rather, it is only by bringing together all the information collected by different functional genomic tools that one will be able to unequivocally assign functions to unknown plant genes. This review focuses on current technical developments and their impact on the field of plant functional genomics. The lower plant Physcomitrella is introduced as a new model system for gene function analysis, owing to its high rate of homologous recombination.

  12. Statistical inference for inverse problems

    NASA Astrophysics Data System (ADS)

    Bissantz, Nicolai; Holzmann, Hajo

    2008-06-01

    In this paper we study statistical inference for certain inverse problems. We go beyond mere estimation purposes and review and develop the construction of confidence intervals and confidence bands in some inverse problems, including deconvolution and the backward heat equation. Further, we discuss the construction of certain hypothesis tests, in particular concerning the number of local maxima of the unknown function. The methods are illustrated in a case study, where we analyze the distribution of heliocentric escape velocities of galaxies in the Centaurus galaxy cluster, and provide statistical evidence for its bimodality.

  13. sick: The Spectroscopic Inference Crank

    NASA Astrophysics Data System (ADS)

    Casey, Andrew R.

    2016-03-01

    There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal

  14. Universum Inference and Corpus Homogeneity

    NASA Astrophysics Data System (ADS)

    Vogel, Carl; Lynch, Gerard; Janssen, Jerom

    Universum Inference is re-interpreted for assessment of corpus homogeneity in computational stylometry. Recent stylometric research quantifies strength of characterization within dramatic works by assessing the homogeneity of corpora associated with dramatic personas. A methodological advance is suggested to mitigate the potential for the assessment of homogeneity to be achieved by chance. Baseline comparison analysis is constructed for contributions to debates by nonfictional participants: the corpus analyzed consists of transcripts of US Presidential and Vice-Presidential debates from the 2000 election cycle. The corpus is also analyzed in translation to Italian, Spanish and Portuguese. Adding randomized categories makes assessments of homogeneity more conservative.

  15. Accurate inference of local phased ancestry of modern admixed populations.

    PubMed

    Ma, Yamin; Zhao, Jian; Wong, Jian-Syuan; Ma, Li; Li, Wenzhi; Fu, Guoxing; Xu, Wei; Zhang, Kui; Kittles, Rick A; Li, Yun; Song, Qing

    2014-07-23

    Population stratification is a growing concern in genetic-association studies. Averaged ancestry at the genome level (global ancestry) is insufficient for detecting the population substructures and correcting population stratifications in association studies. Local and phase stratification are needed for human genetic studies, but current technologies cannot be applied on the entire genome data due to various technical caveats. Here we developed a novel approach (aMAP, ancestry of Modern Admixed Populations) for inferring local phased ancestry. It took about 3 seconds on a desktop computer to finish a local ancestry analysis for each human genome with 1.4-million SNPs. This method also exhibits the scalability to larger datasets with respect to the number of SNPs, the number of samples, and the size of reference panels. It can detect the lack of the proxy of reference panels. The accuracy was 99.4%. The aMAP software has a capacity for analyzing 6-way admixed individuals. As the biomedical community continues to expand its efforts to increase the representation of diverse populations, and as the number of large whole-genome sequence datasets continues to grow rapidly, there is an increasing demand on rapid and accurate local ancestry analysis in genetics, pharmacogenomics, population genetics, and clinical diagnosis.

  16. Thinking too positive? Revisiting current methods of population genetic selection inference.

    PubMed

    Bank, Claudia; Ewing, Gregory B; Ferrer-Admettla, Anna; Foll, Matthieu; Jensen, Jeffrey D

    2014-12-01

    In the age of next-generation sequencing, the availability of increasing amounts and improved quality of data at decreasing cost ought to allow for a better understanding of how natural selection is shaping the genome than ever before. However, alternative forces, such as demography and background selection (BGS), obscure the footprints of positive selection that we would like to identify. In this review, we illustrate recent developments in this area, and outline a roadmap for improved selection inference. We argue (i) that the development and obligatory use of advanced simulation tools is necessary for improved identification of selected loci, (ii) that genomic information from multiple time points will enhance the power of inference, and (iii) that results from experimental evolution should be utilized to better inform population genomic studies.

  17. The genome of Theobroma cacao.

    PubMed

    Argout, Xavier; Salse, Jerome; Aury, Jean-Marc; Guiltinan, Mark J; Droc, Gaetan; Gouzy, Jerome; Allegre, Mathilde; Chaparro, Cristian; Legavre, Thierry; Maximova, Siela N; Abrouk, Michael; Murat, Florent; Fouet, Olivier; Poulain, Julie; Ruiz, Manuel; Roguet, Yolande; Rodier-Goud, Maguy; Barbosa-Neto, Jose Fernandes; Sabot, Francois; Kudrna, Dave; Ammiraju, Jetty Siva S; Schuster, Stephan C; Carlson, John E; Sallet, Erika; Schiex, Thomas; Dievart, Anne; Kramer, Melissa; Gelley, Laura; Shi, Zi; Bérard, Aurélie; Viot, Christopher; Boccara, Michel; Risterucci, Ange Marie; Guignon, Valentin; Sabau, Xavier; Axtell, Michael J; Ma, Zhaorong; Zhang, Yufan; Brown, Spencer; Bourge, Mickael; Golser, Wolfgang; Song, Xiang; Clement, Didier; Rivallan, Ronan; Tahi, Mathias; Akaza, Joseph Moroh; Pitollat, Bertrand; Gramacho, Karina; D'Hont, Angélique; Brunel, Dominique; Infante, Diogenes; Kebe, Ismael; Costet, Pierre; Wing, Rod; McCombie, W Richard; Guiderdoni, Emmanuel; Quetier, Francis; Panaud, Olivier; Wincker, Patrick; Bocs, Stephanie; Lanaud, Claire

    2011-02-01

    We sequenced and assembled the draft genome of Theobroma cacao, an economically important tropical-fruit tree crop that is the source of chocolate. This assembly corresponds to 76% of the estimated genome size and contains almost all previously described genes, with 82% of these genes anchored on the 10 T. cacao chromosomes. Analysis of this sequence information highlighted specific expansion of some gene families during evolution, for example, flavonoid-related genes. It also provides a major source of candidate genes for T. cacao improvement. Based on the inferred paleohistory of the T. cacao genome, we propose an evolutionary scenario whereby the ten T. cacao chromosomes were shaped from an ancestor through eleven chromosome fusions.

  18. Bayesian inference for OPC modeling

    NASA Astrophysics Data System (ADS)

    Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.

    2016-03-01

    The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semi-independently explore the space. The convergence of these walkers to global maxima of the likelihood volume determines the parameter values' highest density intervals (HDI) to reveal champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not and outline continued experiments to vet the method.
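
    As a rough illustration of the ensemble sampling described in this record, the Python sketch below implements the Goodman and Weare stretch move, the update commonly used by affine invariant ensemble samplers. It is not the authors' lithography model: the one-parameter `log_posterior`, the walker count, the burn-in and the stretch scale `a` are all assumptions chosen only to keep the example small.

```python
import math
import random


def log_posterior(theta):
    """Hypothetical target: N(0, 1) prior times a unit-noise Gaussian
    likelihood for one made-up observation y = 1.0."""
    (mu,) = theta
    log_prior = -0.5 * mu * mu
    log_like = -0.5 * (1.0 - mu) ** 2
    return log_prior + log_like


def stretch_move_sampler(log_prob, walkers, n_steps, rng, a=2.0):
    """Run an ensemble of walkers with the stretch move; return the chain."""
    dim = len(walkers[0])
    walkers = [list(w) for w in walkers]
    log_p = [log_prob(w) for w in walkers]
    chain = []
    for _ in range(n_steps):
        for k in range(len(walkers)):
            # Pick a partner walker j different from k.
            j = rng.randrange(len(walkers) - 1)
            if j >= k:
                j += 1
            # Draw the stretch factor z from g(z) ~ 1/sqrt(z) on [1/a, a].
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            proposal = [xj + z * (xk - xj)
                        for xk, xj in zip(walkers[k], walkers[j])]
            log_p_new = log_prob(proposal)
            log_accept = (dim - 1) * math.log(z) + log_p_new - log_p[k]
            if rng.random() < math.exp(min(0.0, log_accept)):
                walkers[k], log_p[k] = proposal, log_p_new
        chain.append([list(w) for w in walkers])
    return chain


def run_demo(seed=0, n_walkers=20, n_steps=2000, burn_in=500):
    """Sample the toy posterior and return the pooled mean and variance."""
    rng = random.Random(seed)
    start = [[rng.gauss(0.0, 1.0)] for _ in range(n_walkers)]
    chain = stretch_move_sampler(log_posterior, start, n_steps, rng)
    samples = [w[0] for step in chain[burn_in:] for w in step]
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, var
```

    For this toy target the analytic posterior is Gaussian with mean 0.5 and variance 0.5, so the pooled samples from `run_demo()` should settle near those values.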

  19. Bayesian inference for radio observations

    NASA Astrophysics Data System (ADS)

    Lochner, Michelle; Natarajan, Iniyan; Zwart, Jonathan T. L.; Smirnov, Oleg; Bassett, Bruce A.; Oozeer, Nadeem; Kunz, Martin

    2015-06-01

    New telescopes like the Square Kilometre Array (SKA) will push into a new sensitivity regime and expose systematics, such as direction-dependent effects, that could previously be ignored. Current methods for handling such systematics rely on alternating best estimates of instrumental calibration and models of the underlying sky, which can lead to inadequate uncertainty estimates and biased results because any correlations between parameters are ignored. These deconvolution algorithms produce a single image that is assumed to be a true representation of the sky, when in fact it is just one realization of an infinite ensemble of images compatible with the noise in the data. In contrast, here we report a Bayesian formalism that simultaneously infers both systematics and science. Our technique, Bayesian Inference for Radio Observations (BIRO), determines all parameters directly from the raw data, bypassing image-making entirely, by sampling from the joint posterior probability distribution. This enables it to derive both correlations and accurate uncertainties, making use of the flexible software MEQTREES to model the sky and telescope simultaneously. We demonstrate BIRO with two simulated Westerbork Synthesis Radio Telescope data sets. In the first, we perform joint estimates of 103 scientific (flux densities of sources) and instrumental (pointing errors, beamwidth and noise) parameters. In the second example, we perform source separation with BIRO. Using the Bayesian evidence, we can accurately select between a single point source, two point sources and an extended Gaussian source, allowing for 'super-resolution' on scales much smaller than the synthesized beam.

  20. Quantum Inference on Bayesian Networks

    NASA Astrophysics Data System (ADS)

    Yoder, Theodore; Low, Guang Hao; Chuang, Isaac

    2014-03-01

    Because quantum physics is naturally probabilistic, it seems reasonable to expect physical systems to describe probabilities and their evolution in a natural fashion. Here, we use quantum computation to speed up sampling from a graphical probability model, the Bayesian network. A specialization of this sampling problem is approximate Bayesian inference, where the distribution on query variables is sampled given the values e of evidence variables. Inference is a key part of modern machine learning and artificial intelligence tasks, but is known to be NP-hard. Classically, a single unbiased sample is obtained from a Bayesian network on n variables with at most m parents per node in time O(n m P(e)^{-1}), depending critically on P(e), the probability the evidence might occur in the first place. However, by implementing a quantum version of rejection sampling, we obtain a square-root speedup, taking O(n 2^m P(e)^{-1/2}) time per sample. The speedup is the result of amplitude amplification, which is proving to be broadly applicable in sampling and machine learning tasks. In particular, we provide an explicit and efficient circuit construction that implements the algorithm without the need for oracle access.
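
    The classical baseline mentioned in this record, rejection sampling on a Bayesian network, can be sketched in a few lines of Python. This shows only the classical procedure, not the quantum algorithm. The three-node network and every probability in it are invented for illustration.

```python
import random

# Hypothetical network: Rain -> WetGrass <- Sprinkler (made-up numbers).
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {  # P(WetGrass=True | Rain, Sprinkler)
    (True, True): 0.99,
    (True, False): 0.90,
    (False, True): 0.80,
    (False, False): 0.05,
}


def sample_network(rng):
    """Draw one joint sample by visiting the nodes in topological order."""
    rain = rng.random() < P_RAIN
    sprinkler = rng.random() < P_SPRINKLER
    wet = rng.random() < P_WET[(rain, sprinkler)]
    return rain, sprinkler, wet


def rejection_sample(rng, n_accept):
    """Estimate P(Rain | WetGrass=True); return (estimate, total draws)."""
    accepted = rain_count = draws = 0
    while accepted < n_accept:
        rain, _, wet = sample_network(rng)
        draws += 1
        if wet:  # keep only samples that agree with the evidence
            accepted += 1
            rain_count += rain
    return rain_count / accepted, draws


def exact_posterior():
    """Return exact P(Rain | WetGrass=True) and P(e) by enumeration."""
    p_e = p_rain_and_e = 0.0
    for rain in (True, False):
        for sprinkler in (True, False):
            p = ((P_RAIN if rain else 1 - P_RAIN)
                 * (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER)
                 * P_WET[(rain, sprinkler)])
            p_e += p
            if rain:
                p_rain_and_e += p
    return p_rain_and_e / p_e, p_e
```

    The average number of draws per accepted sample approaches 1/P(e), which is the classical dependence on the evidence probability that the quantum version is reported to improve to the square root.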

  1. Dopamine, Affordance and Active Inference

    PubMed Central

    Friston, Karl J.; Shiner, Tamara; FitzGerald, Thomas; Galea, Joseph M.; Adams, Rick; Brown, Harriet; Dolan, Raymond J.; Moran, Rosalyn; Stephan, Klaas Enno; Bestmann, Sven

    2012-01-01

    The role of dopamine in behaviour and decision-making is often cast in terms of reinforcement learning and optimal decision theory. Here, we present an alternative view that frames the physiology of dopamine in terms of Bayes-optimal behaviour. In this account, dopamine controls the precision or salience of (external or internal) cues that engender action. In other words, dopamine balances bottom-up sensory information and top-down prior beliefs when making hierarchical inferences (predictions) about cues that have affordance. In this paper, we focus on the consequences of changing tonic levels of dopamine firing using simulations of cued sequential movements. Crucially, the predictions driving movements are based upon a hierarchical generative model that infers the context in which movements are made. This means that we can confuse agents by changing the context (order) in which cues are presented. These simulations provide a (Bayes-optimal) model of contextual uncertainty and set switching that can be quantified in terms of behavioural and electrophysiological responses. Furthermore, one can simulate dopaminergic lesions (by changing the precision of prediction errors) to produce pathological behaviours that are reminiscent of those seen in neurological disorders such as Parkinson's disease. We use these simulations to demonstrate how a single functional role for dopamine at the synaptic level can manifest in different ways at the behavioural level. PMID:22241972

  2. Shannon Information in Complete Genomes

    NASA Astrophysics Data System (ADS)

    Hsieh, Li-Ching; Chang, Chang-Heng; Lee, Hoong-Chien

    2004-03-01

    Genomes are books of life and necessarily carry a huge amount of information. This study was first motivated by the question: "How much information do complete genomes have?" As an answer we measured a particular type of Shannon information in all prokaryotes and eukaryotes whose complete genomes have been sequenced and are available in publicly accessible databases. The Shannon information in complete genome sequences follows an extremely simple pattern. With the exception of one eukaryote, the Shannon information in all (more than 200) complete sequences belongs to a single universality class given by a simple geometric recursion formula. The data are interpreted in terms of models for genome growth and inferred to suggest that the ancestors of present day genomes began to grow, mainly by stochastic, selectively neutral, duplications and short mutations, most likely when they were not more than 300 nt long. This notion of selective neutralism independently corroborates Kimura's neutral theory of evolution which was based on the investigation of polymorphisms of genes.
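
    The abstract does not say which Shannon measure was used. As one plausible reading, and purely as an assumption, the Python sketch below computes the Shannon entropy of the k-mer frequency distribution of a sequence. The function name `kmer_entropy` is hypothetical.

```python
import math
from collections import Counter


def kmer_entropy(sequence, k):
    """Shannon entropy (in bits) of overlapping k-mer frequencies."""
    if k < 1 or k > len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    counts = Counter(sequence[i:i + k]
                     for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

    For a four-letter alphabet the entropy of k-mers is bounded above by 2k bits, reached only when every k-mer is equally frequent.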

  3. The Complete Mitochondrial Genome of Aleurocanthus camelliae: Insights into Gene Arrangement and Genome Organization within the Family Aleyrodidae

    PubMed Central

    Chen, Shi-Chun; Wang, Xiao-Qing; Li, Pin-Wu; Hu, Xiang; Wang, Jin-Jun; Peng, Ping

    2016-01-01

    Numerous gene rearrangements and transfer RNA gene absences exist in mitochondrial (mt) genomes of Aleyrodidae species. To understand how mt genomes evolved in the family Aleyrodidae, we have sequenced the complete mt genome of Aleurocanthus camelliae and comparatively analyzed all reported whitefly mt genomes. The mt genome of A. camelliae is 15,188 bp long, and consists of 13 protein-coding genes, two rRNA genes, 21 tRNA genes and a putative control region (GenBank: KU761949). The tRNA gene, trnI, has not been observed in this genome. The mt genome has a unique gene order and shares most gene boundaries with Tetraleurodes acaciae. Nineteen of 21 tRNA genes have the conventional cloverleaf shaped secondary structure and two (trnS1 and trnS2) lack the dihydrouridine (DHU) arm. Using ARWEN and homologous sequence alignment, we have identified five tRNA genes and revised the annotation for three whitefly mt genomes. This result suggests that most absent genes exist in the genomes and have not been identified, due to a lack of technology and inference sequence. The phylogenetic relationships among 11 whiteflies and Drosophila melanogaster were inferred by maximum likelihood and Bayesian inference methods. Aleurocanthus camelliae and T. acaciae form a sister group, and all three Bemisia tabaci and two Bemisia afer strains cluster together. These results are identical to the relationships inferred from gene order. We inferred that gene rearrangement plays an important role in the evolution of whitefly mt genomes. PMID:27827992

  4. The Complete Mitochondrial Genome of Aleurocanthus camelliae: Insights into Gene Arrangement and Genome Organization within the Family Aleyrodidae.

    PubMed

    Chen, Shi-Chun; Wang, Xiao-Qing; Li, Pin-Wu; Hu, Xiang; Wang, Jin-Jun; Peng, Ping

    2016-11-07

    Numerous gene rearrangements and transfer RNA gene absences exist in mitochondrial (mt) genomes of Aleyrodidae species. To understand how mt genomes evolved in the family Aleyrodidae, we have sequenced the complete mt genome of Aleurocanthus camelliae and comparatively analyzed all reported whitefly mt genomes. The mt genome of A. camelliae is 15,188 bp long, and consists of 13 protein-coding genes, two rRNA genes, 21 tRNA genes and a putative control region (GenBank: KU761949). The tRNA gene, trnI, has not been observed in this genome. The mt genome has a unique gene order and shares most gene boundaries with Tetraleurodes acaciae. Nineteen of 21 tRNA genes have the conventional cloverleaf shaped secondary structure and two (trnS₁ and trnS₂) lack the dihydrouridine (DHU) arm. Using ARWEN and homologous sequence alignment, we have identified five tRNA genes and revised the annotation for three whitefly mt genomes. This result suggests that most absent genes exist in the genomes and have not been identified, due to a lack of technology and inference sequence. The phylogenetic relationships among 11 whiteflies and Drosophila melanogaster were inferred by maximum likelihood and Bayesian inference methods. Aleurocanthus camelliae and T. acaciae form a sister group, and all three Bemisia tabaci and two Bemisia afer strains cluster together. These results are identical to the relationships inferred from gene order. We inferred that gene rearrangement plays an important role in the evolution of whitefly mt genomes.

  5. Direct and indirect effects of vertical mixing, nutrients and ultraviolet radiation on the bacterioplankton metabolism in high-mountain lakes from southern Europe

    NASA Astrophysics Data System (ADS)

    Durán, C.; Medina-Sánchez, J. M.; Herrera, G.; Villar-Argaiz, M.; Villafañe, V. E.; Helbling, E. W.; Carrillo, P.

    2014-05-01

    led to higher HBP. Consequently, EOC satisfied BCD in the clear lakes, particularly in the clearest one [LC]. Our results suggest that the higher vulnerability of bacteria to the damaging effects of UVR may be particularly accentuated in the opaque lakes and further recognizes the relevance of light exposure history and biotic interactions on bacterioplankton metabolism when coping with fluctuating radiation and nutrient inputs.

  6. Inferring Demographic History from a Spectrum of Shared Haplotype Lengths

    PubMed Central

    Harris, Kelley; Nielsen, Rasmus

    2013-01-01

    There has been much recent excitement about the use of genetics to elucidate ancestral history and demography. Whole genome data from humans and other species are revealing complex stories of divergence and admixture that were left undiscovered by previous smaller data sets. A central challenge is to estimate the timing of past admixture and divergence events, for example the time at which Neanderthals exchanged genetic material with humans and the time at which modern humans left Africa. Here, we present a method for using sequence data to jointly estimate the timing and magnitude of past admixture events, along with population divergence times and changes in effective population size. We infer demography from a collection of pairwise sequence alignments by summarizing their length distribution of tracts of identity by state (IBS) and maximizing an analytic composite likelihood derived from a Markovian coalescent approximation. Recent gene flow between populations leaves behind long tracts of identity by descent (IBD), and these tracts give our method power by influencing the distribution of shared IBS tracts. In simulated data, we accurately infer the timing and strength of admixture events, population size changes, and divergence times over a variety of ancient and recent time scales. Using the same technique, we analyze deeply sequenced trio parents from the 1000 Genomes project. The data show evidence of extensive gene flow between Africa and Europe after the time of divergence as well as substructure and gene flow among ancestral hominids. In particular, we infer that recent African-European gene flow and ancient ghost admixture into Europe are both necessary to explain the spectrum of IBS sharing in the trios, rejecting simpler models that contain less population structure. PMID:23754952
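
    The summary statistic named in this record, the length distribution of identity-by-state (IBS) tracts in a pairwise alignment, can be illustrated with a short Python sketch. It covers only the counting step, not the composite-likelihood inference. Counting tracts at the alignment ends and skipping zero-length tracts are assumptions made here, and both function names are hypothetical.

```python
from collections import Counter


def ibs_tract_lengths(seq_a, seq_b):
    """Lengths of maximal runs of identical sites between mismatches."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    lengths, run = [], 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            run += 1
        else:
            if run:  # close the current tract at a mismatch
                lengths.append(run)
            run = 0
    if run:  # tract that reaches the end of the alignment
        lengths.append(run)
    return lengths


def tract_length_spectrum(seq_a, seq_b):
    """Histogram mapping tract length to the number of tracts."""
    return Counter(ibs_tract_lengths(seq_a, seq_b))
```

    For example, `ibs_tract_lengths("ACGTACGT", "ACGAACGT")` yields tracts of length 3 and 4, one on each side of the single mismatch.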

  7. Multilevel modeling for inference of genetic regulatory networks

    NASA Astrophysics Data System (ADS)

    Ng, Shu-Kay; Wang, Kui; McLachlan, Geoffrey J.

    2005-12-01

    Time-course experiments with microarrays are often used to study dynamic biological systems and genetic regulatory networks (GRNs) that model how genes influence each other in cell-level development of organisms. The inference for GRNs provides important insights into the fundamental biological processes such as growth and is useful in disease diagnosis and genomic drug design. Due to the experimental design, multilevel data hierarchies are often present in time-course gene expression data. Most existing methods, however, ignore the dependency of the expression measurements over time and the correlation among gene expression profiles. Such independence assumptions violate regulatory interactions and can result in overlooking certain important subject effects and lead to spurious inference for regulatory networks or mechanisms. In this paper, a multilevel mixed-effects model is adopted to incorporate data hierarchies in the analysis of time-course data, where temporal and subject effects are both assumed to be random. The method starts with the clustering of genes by fitting the mixture model within the multilevel random-effects model framework using the expectation-maximization (EM) algorithm. The network of regulatory interactions is then determined by searching for regulatory control elements (activators and inhibitors) shared by the clusters of co-expressed genes, based on a time-lagged correlation coefficients measurement. The method is applied to two real time-course datasets from the budding yeast (Saccharomyces cerevisiae) genome. It is shown that the proposed method provides clusters of cell-cycle regulated genes that are supported by existing gene function annotations, and hence enables inference on regulatory interactions for the genetic network.
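
    The time-lagged correlation measure mentioned in this record can be sketched as a Pearson correlation between a candidate regulator at time t and a target at time t + lag. The Python below is a minimal illustration under that assumption. It omits the mixed-effects clustering step, and the function names and the toy series in the usage note are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(regulator, target, lag):
    """Correlate regulator[t] with target[t + lag]."""
    if len(regulator) != len(target):
        raise ValueError("series must have the same length")
    if lag < 0 or lag > len(regulator) - 2:
        raise ValueError("lag leaves fewer than two paired points")
    return pearson(regulator[:len(regulator) - lag], target[lag:])


def best_lag(regulator, target, max_lag):
    """Lag in 0..max_lag with the largest absolute correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: abs(lagged_correlation(regulator, target, lag)))
```

    With `regulator = [1, 3, 2, 5, 4, 7, 6, 9]` and a target that repeats it one time point later, `best_lag(regulator, target, 2)` picks lag 1. A strongly positive lagged correlation would suggest an activator and a strongly negative one an inhibitor.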

  8. Regulatory component analysis: a semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge

    PubMed Central

    Wang, Chen; Xuan, Jianhua; Shih, Ie-Ming; Clarke, Robert; Wang, Yue

    2011-01-01

    With the advent of high-throughput biotechnology capable of monitoring genomic signals, it becomes increasingly promising to understand molecular cellular mechanisms through systems biology approaches. One of the active research topics in systems biology is to infer gene transcriptional regulatory networks using various genomic data; this inference problem can be formulated as a linear model with latent signals associated with some regulatory proteins called transcription factors (TFs). As common statistical assumptions may not hold for genomic signals, typical latent variable algorithms such as independent component analysis (ICA) are incapable of revealing the true underlying regulatory signals. Liao et al. [1] proposed to perform inference using an approach named network component analysis (NCA), the optimization of which is achieved by a least-squares fitting approach with biological knowledge constraints. However, the incompleteness of biological knowledge and its inconsistency with gene expression data are not considered in the original NCA solution, which could greatly affect the inference accuracy. To overcome these limitations, we propose a linear extraction scheme, namely regulatory component analysis (RCA), to infer underlying regulatory signals even with partial biological knowledge. Numerical simulations show a significant improvement of our proposed RCA over NCA, not only when the signal-to-noise ratio (SNR) is low, but also when the given biological knowledge is incomplete and inconsistent with gene expression data. Furthermore, real biological experiments on E. coli are performed for regulatory network inference in comparison with several typical linear latent variable methods, which again demonstrates the effectiveness and improved performance of the proposed algorithm. PMID:22685363

  9. Spontaneous Trait Inferences on Social Media

    PubMed Central

    Utz, Sonja

    2016-01-01

    The present research investigates whether spontaneous trait inferences occur under conditions characteristic of social media and networking sites: nonextreme, ostensibly self-generated content, simultaneous presentation of multiple cues, and self-paced browsing. We used an established measure of trait inferences (false recognition paradigm) and a direct assessment of impressions. Without being asked to do so, participants spontaneously formed impressions of people whose status updates they saw. Our results suggest that trait inferences occurred from nonextreme self-generated content, which is commonly found in social media updates (Experiment 1) and when nine status updates from different people were presented in parallel (Experiment 2). Although inferences did occur during free browsing, the results suggest that participants did not necessarily associate the traits with the corresponding status update authors (Experiment 3). Overall, the findings suggest that spontaneous trait inferences occur on social media. We discuss implications for online communication and research on spontaneous trait inferences. PMID:28123646

  10. Inferring echolocation in ancient bats.

    PubMed

    Simmons, Nancy B; Seymour, Kevin L; Habersetzer, Jörg; Gunnell, Gregg F

    2010-08-19

    Laryngeal echolocation, used by most living bats to form images of their surroundings and to detect and capture flying prey, is considered to be a key innovation for the evolutionary success of bats, and palaeontologists have long sought osteological correlates of echolocation that can be used to infer the behaviour of fossil bats. Veselka et al. argued that the most reliable trait indicating echolocation capabilities in bats is an articulation between the stylohyal bone (part of the hyoid apparatus that supports the throat and larynx) and the tympanic bone, which forms the floor of the middle ear. They examined the oldest and most primitive known bat, Onychonycteris finneyi (early Eocene, USA), and argued that it showed evidence of this stylohyal-tympanic articulation, from which they concluded that O. finneyi may have been capable of echolocation. We disagree with their interpretation of key fossil data and instead argue that O. finneyi was probably not an echolocating bat.

  11. Motion Inference During +Gz Acceleration

    DTIC Science & Technology

    2006-09-01

    AFRL-HW-WP-TP-2006-0091. Motion Inference During +Gz Acceleration. Lloyd D. Tripp Jr., Richard A. McKinley, Robert L. Esken. Air Force Research Laboratory... Program element number: 62202F. Project number: 7184. Task number: 03...

  12. Inferred properties of stellar granulation

    SciTech Connect

    Gray, D.F.; Toner, C.G.

    1985-06-01

    Apparent characteristics of stellar granulation in F and G main-sequence stars are inferred directly from observed spectral-line asymmetries and from comparisons of numerical simulations with the observations: (1) the apparent granulation velocity increases with effective temperature, (2) the dispersion of granule velocities about their mean velocity of rise increases with the apparent granulation velocity, (3) the mean velocity of rise of granules must be less than the total line broadening, (4) the apparent velocity difference between granules and dark lanes corresponds to the granulation velocity deduced from stellar line bisectors, (5) the dark lanes show velocities of fall approximately twice as large as the granule rise velocities, (6) the light contributed to the stellar flux by the granules is four to ten times more than the light from the dark lanes. Stellar rotation is predicted to produce distortions in the line bisectors which may give information on the absolute velocity displacements of the line bisectors. 37 references.

  13. Synaptic Computation Underlying Probabilistic Inference

    PubMed Central

    Soltani, Alireza; Wang, Xiao-Jing

    2010-01-01

    In this paper we propose that synapses may be the workhorse of neuronal computations that underlie probabilistic reasoning. We built a neural circuit model for probabilistic inference when information provided by different sensory cues needs to be integrated, and the predictive powers of individual cues about an outcome are deduced through experience. We found that bounded synapses naturally compute, through reward-dependent plasticity, the posterior probability that a choice alternative is correct given that a cue is presented. Furthermore, a decision circuit endowed with such synapses makes choices based on the summated log posterior odds and performs near-optimal cue combination. The model is validated by reproducing salient observations of, and provides insights into, a monkey experiment using a categorization task. Our model thus suggests a biophysical instantiation of the Bayesian decision rule, while predicting important deviations from it similar to ‘base-rate neglect’ observed in human studies when alternatives have unequal priors. PMID:20010823
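
    The decision rule described in this record, choosing on the basis of summed log posterior odds, can be written out as a small Python sketch. It abstracts away the synaptic model entirely. The logistic read-out in `choice_probability` and both function names are assumptions made for illustration.

```python
import math


def log_odds(p):
    """Log posterior odds log(p / (1 - p)) for a probability in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def choice_probability(cue_posteriors):
    """Probability of choosing alternative A from per-cue P(A | cue).

    The per-cue log posterior odds are summed and passed through a
    logistic function (a hypothetical read-out of the decision circuit).
    """
    total = sum(log_odds(p) for p in cue_posteriors)
    return 1.0 / (1.0 + math.exp(-total))
```

    Two cues that each give P(A | cue) = 0.8 combine to odds of 16 to 1, so `choice_probability([0.8, 0.8])` is about 0.94. Summing posterior odds in this way matches the Bayesian rule only when the alternatives have equal priors, which is consistent with the deviations the abstract reports for unequal priors.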

  14. Generic comparison of protein inference engines.

    PubMed

    Claassen, Manfred; Reiter, Lukas; Hengartner, Michael O; Buhmann, Joachim M; Aebersold, Ruedi

    2012-04-01

    Protein identifications, instead of peptide-spectrum matches, constitute the biologically relevant result of shotgun proteomics studies. How to appropriately infer and report protein identifications has triggered a still ongoing debate. This debate has so far suffered from the lack of appropriate performance measures that allow us to objectively assess protein inference approaches. This study describes an intuitive, generic and yet formal performance measure and demonstrates how it enables experimentalists to select an optimal protein inference strategy for a given collection of fragment ion spectra. We applied the performance measure to systematically explore the benefit of excluding possibly unreliable protein identifications, such as single-hit wonders. To this end, we defined a family of protein inference engines by extending a simple inference engine by thousands of pruning variants, each excluding a different specified set of possibly unreliable identifications. We benchmarked these protein inference engines on several data sets representing different proteomes and mass spectrometry platforms. Optimally performing inference engines retained all high confidence spectral evidence, without posterior exclusion of any type of protein identifications. Despite the diversity of studied data sets consistently supporting this rule, other data sets might behave differently. In order to ensure maximal reliable proteome coverage for data sets arising in other studies we advocate abstaining from rigid protein inference rules, such as exclusion of single-hit wonders, and instead considering several protein inference approaches and assessing these with respect to the presented performance measure in the specific application context.

  15. Antarctic Genomics

    PubMed Central

    Clarke, Andrew; Cockell, Charles S.; Convey, Peter; Detrich III, H. William; Fraser, Keiron P. P.; Johnston, Ian A.; Methe, Barbara A.; Murray, Alison E.; Peck, Lloyd S.; Römisch, Karin; Rogers, Alex D.

    2004-01-01

    With the development of genomic science and its battery of technologies, polar biology stands on the threshold of a revolution, one that will enable the investigation of important questions of unprecedented scope and with extraordinary depth and precision. The exotic organisms of polar ecosystems are ideal candidates for genomic analysis. Through such analyses, it will be possible to learn not only the novel features that enable polar organisms to survive, and indeed thrive, in their extreme environments, but also fundamental biological principles that are common to most, if not all, organisms. This article aims to review recent developments in Antarctic genomics and to demonstrate the global context of such studies. PMID:18629155

  16. Evolutionary Inference across Eukaryotes Identifies Specific Pressures Favoring Mitochondrial Gene Retention.

    PubMed

    Johnston, Iain G; Williams, Ben P

    2016-02-24

    Since their endosymbiotic origin, mitochondria have lost most of their genes. Although many selective mechanisms underlying the evolution of mitochondrial genomes have been proposed, a data-driven exploration of these hypotheses is lacking, and a quantitatively supported consensus remains absent. We developed HyperTraPS, a methodology coupling stochastic modeling with Bayesian inference, to identify the ordering of evolutionary events and suggest their causes. Using 2015 complete mitochondrial genomes, we inferred evolutionary trajectories of mtDNA gene loss across the eukaryotic tree of life. We find that proteins comprising the structural cores of the electron transport chain are preferentially encoded within mitochondrial genomes across eukaryotes. A combination of high GC content and high protein hydrophobicity is required to explain patterns of mtDNA gene retention; a model that accounts for these selective pressures can also predict the success of artificial gene transfer experiments in vivo. This work provides a general method for data-driven inference of the ordering of evolutionary and progressive events, here identifying the distinct features shaping mitochondrial genomes of present-day species.

  17. Protein inference: A protein quantification perspective.

    PubMed

    He, Zengyou; Huang, Ting; Liu, Xiaoqing; Zhu, Peijun; Teng, Ben; Deng, Shengchun

    2016-08-01

    In mass spectrometry-based shotgun proteomics, protein quantification and protein identification are two major computational problems. To quantify the protein abundance, a list of proteins must first be inferred from the raw data. Then the relative or absolute protein abundance is estimated with quantification methods, such as spectral counting. Until now, most researchers have been dealing with these two processes separately. In fact, the protein inference problem can be regarded as a special protein quantification problem in the sense that truly present proteins are those proteins whose abundance values are not zero. Some recently published papers have conceptually discussed this possibility. However, there is still a lack of rigorous experimental studies to test this hypothesis. In this paper, we investigate the feasibility of using protein quantification methods to solve the protein inference problem. Protein inference methods aim to determine whether each candidate protein is present in the sample or not. Protein quantification methods estimate the abundance value of each inferred protein. Naturally, the abundance value of an absent protein should be zero. Thus, we argue that the protein inference problem can be viewed as a special protein quantification problem in which one protein is considered to be present if its abundance is not zero. Based on this idea, our paper tries to use three simple protein quantification methods to solve the protein inference problem effectively. The experimental results on six data sets show that these three methods are competitive with previous protein inference algorithms. This demonstrates that it is plausible to model the protein inference problem as a special protein quantification task, which opens the door to devising more effective protein inference algorithms from a quantification perspective. The source code of our methods is available at: http://code.google.com/p/protein-inference/.
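
    The idea of treating inference as quantification (a protein counts as present when its estimated abundance is not zero) can be sketched with spectral counting. This is an illustrative sketch only and is not the code released by the authors: the function names, the input layout, and the rule that a peptide shared by several proteins splits its count equally among them are assumptions introduced here.

```python
from collections import Counter


def spectral_counts(psm_peptides, peptide_to_proteins):
    """Estimate protein abundance by counting peptide-spectrum matches.

    psm_peptides: list of peptide sequences, one entry per matched spectrum.
    peptide_to_proteins: dict mapping a peptide to the proteins containing it.
    Assumption: a shared peptide contributes an equal fraction to each protein.
    """
    counts = Counter()
    for peptide in psm_peptides:
        proteins = peptide_to_proteins.get(peptide, [])
        for protein in proteins:
            counts[protein] += 1.0 / len(proteins)
    return counts


def infer_present(candidates, counts, threshold=0.0):
    """Call a candidate protein present if its abundance exceeds the threshold."""
    return sorted(p for p in candidates if counts.get(p, 0.0) > threshold)


if __name__ == "__main__":
    # Hypothetical toy data: two spectra match a peptide unique to P1,
    # one spectrum matches a peptide shared by P1 and P2, none match P3.
    psms = ["AAK", "AAK", "CCR"]
    mapping = {"AAK": ["P1"], "CCR": ["P1", "P2"]}
    abundance = spectral_counts(psms, mapping)
    print(infer_present(["P1", "P2", "P3"], abundance))
```

    Raising the threshold above zero turns the same quantities into a stricter presence call, which is one way the two problems could be tuned together.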

  18. Genomic Testing

    MedlinePlus

    ... Services released a report identifying gaps in the regulation, oversight, and usefulness of genetic testing. They expressed ... December 20, 2016 Content source: Center for Surveillance, Epidemiology and Laboratory Services (CSELS) , Public Health Genomics Email ...

  19. A Comparison of Two Student Instructional Rating Forms Utilizing High-Inference Versus Moderate Inference Items.

    ERIC Educational Resources Information Center

    Wilson, Pamela W.

    Two types of items used in student evaluations of college teaching were compared: high-inference items, which require considerable inferring from what is seen or heard in the classroom to labelling of teacher behavior; and moderate-inference items, such as "teacher listens carefully." Two instruments were administered to random halves of…

  20. Forward and Backward Inference in Spatial Cognition

    PubMed Central

    Penny, Will D.; Zeidman, Peter; Burgess, Neil

    2013-01-01

    This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of ‘lower-level’ computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, we propose that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus. PMID:24348230
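
    The forward-inference step described above, in which a location estimate from path integration is combined with sensory input, can be sketched as one cycle of a discrete Bayes filter. This is a generic textbook filter under stated assumptions, not the model from the paper: the environment is taken to be a circular track of discrete cells, the intended move succeeds with a fixed probability, and the function names and numbers are hypothetical.

```python
def predict(belief, step, p_move=0.8):
    """Path-integration step: shift the belief by `step` cells.

    Assumption: the move succeeds with probability p_move; otherwise the
    agent stays where it was. The track is circular (indices wrap around).
    """
    n = len(belief)
    predicted = [0.0] * n
    for i, b in enumerate(belief):
        predicted[(i + step) % n] += p_move * b
        predicted[i] += (1.0 - p_move) * b
    return predicted


def update(belief, likelihood):
    """Sensory step: weight each cell by the observation likelihood, renormalise."""
    posterior = [b * l for b, l in zip(belief, likelihood)]
    total = sum(posterior)
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return [p / total for p in posterior]


if __name__ == "__main__":
    belief = [1.0, 0.0, 0.0, 0.0]       # start: certainly in cell 0
    belief = predict(belief, step=1)     # path integration says "moved one cell"
    sensed = [0.1, 0.9, 0.1, 0.1]        # hypothetical P(observation | cell)
    belief = update(belief, sensed)
    print(belief.index(max(belief)))     # most probable cell after both steps
```

    Repeating predict and update over successive time steps gives forward inference over time; the backward pass mentioned in the abstract is not shown here.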

  1. Application of Transformations in Parametric Inference

    ERIC Educational Resources Information Center

    Brownstein, Naomi; Pensky, Marianna

    2008-01-01

    The objective of the present paper is to provide a simple approach to statistical inference using the method of transformations of variables. We demonstrate performance of this powerful tool on examples of constructions of various estimation procedures, hypothesis testing, Bayes analysis and statistical inference for the stress-strength systems.…

  2. Scalar Inferences in Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Chevallier, Coralie; Wilson, Deirdre; Happe, Francesca; Noveck, Ira

    2010-01-01

    On being told "John or Mary will come", one might infer that "not both" of them will come. Yet the semantics of "or" is compatible with a situation where both John and Mary come. Inferences of this type, which enrich the semantics of "or" from an "inclusive" to an "exclusive" interpretation, have been extensively studied in linguistic pragmatics.…
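
    The contrast between the inclusive and exclusive readings of "or" can be made concrete by enumerating truth values. The sketch below is only an illustration of the logical point; the function names are hypothetical and it says nothing about how the study tested participants.

```python
from itertools import product


def inclusive_or(john_comes, mary_comes):
    """Literal semantics of "or": true if at least one of them comes."""
    return john_comes or mary_comes


def exclusive_or(john_comes, mary_comes):
    """Enriched reading ("not both"): true if exactly one of them comes."""
    return john_comes != mary_comes


def differing_cases():
    """List the situations where the two readings give different truth values."""
    return [
        (john, mary)
        for john, mary in product([False, True], repeat=2)
        if inclusive_or(john, mary) != exclusive_or(john, mary)
    ]
```

    The two readings disagree only in the situation where both John and Mary come, which is exactly the case the scalar inference rules out.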

  3. The Reasoning behind Informal Statistical Inference

    ERIC Educational Resources Information Center

    Makar, Katie; Bakker, Arthur; Ben-Zvi, Dani

    2011-01-01

    Informal statistical inference (ISI) has been a frequent focus of recent research in statistics education. Considering the role that context plays in developing ISI calls into question the need to be more explicit about the reasoning that underpins ISI. This paper uses educational literature on informal statistical inference and philosophical…

  4. Local and Global Thinking in Statistical Inference

    ERIC Educational Resources Information Center

    Pratt, Dave; Johnston-Wilder, Peter; Ainley, Janet; Mason, John

    2008-01-01

    In this reflective paper, we explore students' local and global thinking about informal statistical inference through our observations of 10- to 11-year-olds, challenged to infer the unknown configuration of a virtual die, but able to use the die to generate as much data as they felt necessary. We report how they tended to focus on local changes…

  5. Forward and backward inference in spatial cognition.

    PubMed

    Penny, Will D; Zeidman, Peter; Burgess, Neil

    2013-01-01

    This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, we propose that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.

  6. Inferring Learners' Knowledge from Their Actions

    ERIC Educational Resources Information Center

    Rafferty, Anna N.; LaMar, Michelle M.; Griffiths, Thomas L.

    2015-01-01

    Watching another person take actions to complete a goal and making inferences about that person's knowledge is a relatively natural task for people. This ability can be especially important in educational settings, where the inferences can be used for assessment, diagnosing misconceptions, and providing informa