Sample records for evolution metagenomic analysis

  1. Methods for comparative metagenomics

    PubMed Central

    Huson, Daniel H; Richter, Daniel C; Mitra, Suparna; Auch, Alexander F; Schuster, Stephan C

    2009-01-01

    Background Metagenomics is a rapidly growing field of research that aims at studying uncultured organisms to understand the true diversity of microbes, their functions, cooperation and evolution, in environments such as soil, water, ancient remains of animals, or the digestive system of animals and humans. The recent development of ultra-high throughput sequencing technologies, which do not require cloning or PCR amplification, and can produce huge numbers of DNA reads at an affordable cost, has boosted the number and scope of metagenomic sequencing projects. Increasingly, there is a need for new ways of comparing multiple metagenomics datasets, and for fast and user-friendly implementations of such approaches. Results This paper introduces a number of new methods for interactively exploring, analyzing and comparing multiple metagenomic datasets, which will be made freely available in a new, comparative version 2.0 of the stand-alone metagenome analysis tool MEGAN. Conclusion There is a great need for powerful and user-friendly tools for comparative analysis of metagenomic data and MEGAN 2.0 will help to fill this gap. PMID:19208111

  2. Survey of (Meta)genomic Approaches for Understanding Microbial Community Dynamics.

    PubMed

    Sharma, Anukriti; Lal, Rup

    2017-03-01

    Advances in next-generation sequencing technologies have driven the rapid evolution of genomics and metagenomics within a short time and at nominal cost. While metagenomics and genomics can be separately used to reveal the culture-independent and culture-based microbial evolution, respectively, (meta)genomics together can be used to demonstrate results at population level revealing in-depth complex community interactions for specific ecotypes. The field of metagenomics which started with answering "who is out there?" based on 16S rRNA gene has evolved immensely with the precise organismal reconstruction at species/strain level from the deeply covered metagenome data outweighing the need to isolate bacteria of which 99% are de facto non-cultivable. In this review we have underlined the appeal of metagenomic-derived genomes in providing insights into the evolutionary patterns, growth dynamics, genome/gene-specific sweeps, and durability of environmental pressures. We have demonstrated the use of culture-based genomics and environmental shotgun metagenome data together to elucidate environment specific genome modulations via metagenomic recruitments in terms of gene loss/gain, accessory and core-genome extent. We further illustrated the benefit of (meta)genomics in the understanding of infectious diseases by deducing the relationship between human microbiota and clinical microbiology. This review summarizes the technological advances in the (meta)genomic strategies using the genome and metagenome datasets together to increase the resolution of microbial population studies.

  3. Prospecting Metagenomic Enzyme Subfamily Genes for DNA Family Shuffling by a Novel PCR-based Approach

    PubMed Central

    Wang, Qiuyan; Wu, Huili; Wang, Anming; Du, Pengfei; Pei, Xiaolin; Li, Haifeng; Yin, Xiaopu; Huang, Lifeng; Xiong, Xiaolong

    2010-01-01

    DNA family shuffling is a powerful method for enzyme engineering, which utilizes recombination of naturally occurring functional diversity to accelerate laboratory-directed evolution. However, the use of this technique has been hindered by the scarcity of family genes with the required level of sequence identity in the genome database. We describe here a strategy for collecting metagenomic homologous genes for DNA shuffling from environmental samples by truncated metagenomic gene-specific PCR (TMGS-PCR). Using identified metagenomic gene-specific primers, twenty-three 921-bp truncated lipase gene fragments, which shared 64–99% identity with each other and formed a distinct subfamily of lipases, were retrieved from 60 metagenomic samples. These lipase genes were shuffled, and selected active clones were characterized. The chimeric clones show extensive functional and genetic diversity, as demonstrated by functional characterization and sequence analysis. Our results indicate that homologous sequences of genes captured by TMGS-PCR can be used as suitable genetic material for DNA family shuffling with broad applications in enzyme engineering. PMID:20962349
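
The 64–99% identity range reported above is a pairwise measure across the 23 fragments. The following is a minimal sketch of how such a pairwise percent-identity table could be computed. It assumes sequences that are already aligned to equal length and counts simple position-wise matches; the abstract does not state how identity was actually calculated, and the toy fragments and their names are hypothetical.

```python
from itertools import combinations
from math import comb


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical toy fragments for illustration; the reported fragments were 921 bp.
fragments = {
    "frag_a": "ATGGCTAGCT",
    "frag_b": "ATGGCTAGGT",
    "frag_c": "ATGACTTGGT",
}

# Identity for every unordered pair of fragments.
pairwise = {
    (m, n): percent_identity(fragments[m], fragments[n])
    for m, n in combinations(fragments, 2)
}

# Number of pairwise comparisons among the 23 fragments reported in the abstract.
n_pairs_in_study = comb(23, 2)

for pair, identity in pairwise.items():
    print(pair, identity)
print("pairs among 23 fragments:", n_pairs_in_study)
```

A position-wise count is the simplest choice; a real analysis would normally align the sequences first and decide how to treat gaps.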

  4. A bioinformatic analysis of ribonucleotide reductase genes in phage genomes and metagenomes

    PubMed Central

    2013-01-01

    Background Ribonucleotide reductase (RNR), the enzyme responsible for the formation of deoxyribonucleotides from ribonucleotides, is found in all domains of life and many viral genomes. RNRs are also amongst the most abundant genes identified in environmental metagenomes. This study focused on understanding the distribution, diversity, and evolution of RNRs in phages (viruses that infect bacteria). Hidden Markov Model profiles were used to analyze the proteins encoded by 685 completely sequenced double-stranded DNA phages and 22 environmental viral metagenomes to identify RNR homologs in cultured phages and uncultured viral communities, respectively. Results RNRs were identified in 128 phage genomes, nearly tripling the number of phages known to encode RNRs. Class I RNR was the most common RNR class observed in phages (70%), followed by class II (29%) and class III (28%). Twenty-eight percent of the phages contained genes belonging to multiple RNR classes. RNR class distribution varied according to phage type, isolation environment, and the host’s ability to utilize oxygen. The majority of the phages containing RNRs are Myoviridae (65%), followed by Siphoviridae (30%) and Podoviridae (3%). The phylogeny and genomic organization of phage and host RNRs reveal several distinct evolutionary scenarios involving horizontal gene transfer, co-evolution, and differential selection pressure. Several putative split RNR genes interrupted by self-splicing introns or inteins were identified, providing further evidence for the role of frequent genetic exchange. Finally, viral metagenomic data indicate that RNRs are prevalent and highly dynamic in uncultured viral communities, necessitating future research to determine the environmental conditions under which RNRs provide a selective advantage. Conclusions This comprehensive study describes the distribution, diversity, and evolution of RNRs in phage genomes and environmental viral metagenomes. The distinct distributions of specific RNR classes amongst phages, combined with the various evolutionary scenarios predicted from RNR phylogenies suggest multiple inheritance sources and different selective forces for RNRs in phages. This study significantly improves our understanding of phage RNRs, providing insight into the diversity and evolution of this important auxiliary metabolic gene as well as the evolution of phages in response to their bacterial hosts and environments. PMID:23391036
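
The class percentages in this abstract (70%, 29%, 28%) add up to more than 100% because phages carrying several RNR classes are counted once per class. The short check below works through that arithmetic. The idea that each multi-class phage carries exactly two classes is an assumption made here for illustration, as is the two-point rounding tolerance; neither comes from the abstract.

```python
# Percentages of RNR-encoding phages per class, as reported in the abstract.
class_share = {"I": 70, "II": 29, "III": 28}
multi_class_share = 28  # percent of phages with genes from more than one class

total = sum(class_share.values())  # class memberships per 100 phages
excess = total - 100               # memberships beyond one per phage

# Assumption: if every multi-class phage carried exactly two classes, the
# excess would equal the multi-class share, up to rounding of the percentages.
consistent_with_two_classes = abs(excess - multi_class_share) <= 2

print(total, excess, consistent_with_two_classes)
```

Under that assumption the reported figures are mutually consistent: the 27-point excess is within rounding of the 28% multi-class share.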

  5. Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes

    PubMed Central

    Aziz, Ramy K.; Dwivedi, Bhakti; Akhter, Sajia; Breitbart, Mya; Edwards, Robert A.

    2015-01-01

    Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution. PMID:26005436

  6. Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes

    DOE PAGES

    Aziz, Ramy K.; Dwivedi, Bhakti; Akhter, Sajia; ...

    2015-05-08

    Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution.

  7. Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aziz, Ramy K.; Dwivedi, Bhakti; Akhter, Sajia

    Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution.

  8. (Meta)genomic insights into the pathogenome of Cellulosimicrobium cellulans

    DOE PAGES

    Sharma, Anukriti; Gilbert, Jack A.; Lal, Rup

    2016-05-06

    Despite having serious clinical manifestations, Cellulosimicrobium cellulans remains under-reported with only three genome sequences available at the time of writing. Genome sequences of C. cellulans LMG16121, C. cellulans J36 and Cellulosimicrobium sp. strain MM were used to determine distribution of pathogenicity islands (PAIs) across C. cellulans, which revealed 49 potential marker genes with known association to human infections, e.g. Fic and VbhA toxin-antitoxin system. Oligonucleotide composition-based analysis of orthologous proteins (n = 791) across three genomes revealed significant negative correlation (P < 0.05) between frequency of optimal codons (Fopt) and gene G+C content, highlighting the G+C-biased gene conversion (gBGC) effect across Cellulosimicrobium strains. Bayesian molecular-clock analysis performed on three virulent PAI proteins (Fic; D-alanyl-D-alanine-carboxypeptidase; transposase) dated the divergence event at 300 million years ago from the most recent common ancestor. Synteny-based annotation of hypothetical proteins highlighted gene transfers from non-pathogenic bacteria as a key factor in the evolution of PAIs. Additionally, deciphering the metagenomic islands using strain MM's genome with environmental data from the site of isolation (hot-spring biofilm) revealed (an)aerobic respiration as population segregation factor across the in situ cohorts. Furthermore, using reference genomes and metagenomic data, our results highlight the emergence and evolution of PAIs in the genus Cellulosimicrobium.

  9. Challenges and opportunities of airborne metagenomics.

    PubMed

    Behzad, Hayedeh; Gojobori, Takashi; Mineta, Katsuhiko

    2015-05-06

    Recent metagenomic studies of environments, such as marine and soil, have significantly enhanced our understanding of the diverse microbial communities living in these habitats and their essential roles in sustaining vast ecosystems. The increase in the number of publications related to soil and marine metagenomics is in sharp contrast to those of air, yet airborne microbes are thought to have significant impacts on many aspects of our lives from their potential roles in atmospheric events such as cloud formation, precipitation, and atmospheric chemistry to their major impact on human health. In this review, we will discuss the current progress in airborne metagenomics, with a special focus on exploring the challenges and opportunities of undertaking such studies. The main challenges of conducting metagenomic studies of airborne microbes are as follows: 1) Low density of microorganisms in the air, 2) efficient retrieval of microorganisms from the air, 3) variability in airborne microbial community composition, 4) the lack of standardized protocols and methodologies, and 5) DNA sequencing and bioinformatics-related challenges. Overcoming these challenges could provide the groundwork for comprehensive analysis of airborne microbes and their potential impact on the atmosphere, global climate, and our health. Metagenomic studies offer a unique opportunity to examine viral and bacterial diversity in the air and monitor their spread locally or across the globe, including threats from pathogenic microorganisms. Airborne metagenomic studies could also lead to discoveries of novel genes and metabolic pathways relevant to meteorological and industrial applications, environmental bioremediation, and biogeochemical cycles. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. Metagenomic Insights into Evolution of a Heavy Metal-Contaminated Groundwater Microbial Community

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hemme, Christopher L.; Deng, Ye; Gentry, Terry J.

    2010-02-15

    Understanding adaptation of biological communities to environmental change is a central issue in ecology and evolution. Metagenomic analysis of a stressed groundwater microbial community reveals that prolonged exposure to high concentrations of heavy metals, nitric acid and organic solvents (~50 years) has resulted in a massive decrease in species and allelic diversity as well as a significant loss of metabolic diversity. Although the surviving microbial community possesses all metabolic pathways necessary for survival and growth in such an extreme environment, its structure is very simple, primarily composed of clonal denitrifying γ- and β-proteobacterial populations. The resulting community is over-abundant in key genes conferring resistance to specific stresses including nitrate, heavy metals and acetone. Evolutionary analysis indicates that lateral gene transfer could be a key mechanism in rapidly responding and adapting to environmental contamination. The results presented in this study have important implications in understanding, assessing and predicting the impacts of human-induced activities on microbial communities ranging from human health to agriculture to environmental management, and their responses to environmental changes.

  11. Metagenomic insights into evolution of heavy metal-contaminated groundwater microbial community

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hemme, C.L.; Deng, Y.; Gentry, T.J.

    2010-07-01

    Understanding adaptation of biological communities to environmental change is a central issue in ecology and evolution. Metagenomic analysis of a stressed groundwater microbial community reveals that prolonged exposure to high concentrations of heavy metals, nitric acid and organic solvents (~50 years) has resulted in a massive decrease in species and allelic diversity as well as a significant loss of metabolic diversity. Although the surviving microbial community possesses all metabolic pathways necessary for survival and growth in such an extreme environment, its structure is very simple, primarily composed of clonal denitrifying γ- and β-proteobacterial populations. The resulting community is overabundant in key genes conferring resistance to specific stresses including nitrate, heavy metals and acetone. Evolutionary analysis indicates that lateral gene transfer could have a key function in rapid response and adaptation to environmental contamination. The results presented in this study have important implications in understanding, assessing and predicting the impacts of human-induced activities on microbial communities ranging from human health to agriculture to environmental management, and their responses to environmental changes.

  12. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    PubMed

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.

  13. Integrative workflows for metagenomic analysis

    PubMed Central

    Ladoukakis, Efthymios; Kolisis, Fragiskos N.; Chatziioannou, Aristotelis A.

    2014-01-01

    The rapid evolution of all sequencing technologies, described by the term Next Generation Sequencing (NGS), has revolutionized metagenomic analysis. They constitute a combination of high-throughput analytical protocols, coupled to delicate measuring techniques, in order to potentially discover, properly assemble and map allelic sequences to the correct genomes, achieving particularly high yields for only a fraction of the cost of traditional processes (i.e., Sanger). From a bioinformatic perspective, this boils down to many GB of data being generated from each single sequencing experiment, rendering the management, or even the storage, critical bottlenecks with respect to the overall analytical endeavor. The enormous complexity is further aggravated by the versatility of the processing steps available, represented by the numerous bioinformatic tools that are essential, for each analytical task, in order to fully unveil the genetic content of a metagenomic dataset. These disparate tasks range from simple, nonetheless non-trivial, quality control of raw data to exceptionally complex protein annotation procedures, requiring a high level of expertise for their proper application or the neat implementation of the whole workflow. Furthermore, a bioinformatic analysis of such scale requires grand computational resources, imposing the utilization of cloud computing infrastructures as the sole realistic solution. In this review article we discuss different, integrative, bioinformatic solutions available, which address the aforementioned issues, by performing a critical assessment of the available automated pipelines for data management, quality control, and annotation of metagenomic data, embracing various, major sequencing technologies and applications. PMID:25478562

  14. From cultured to uncultured genome sequences: metagenomics and modeling microbial ecosystems.

    PubMed

    Garza, Daniel R; Dutilh, Bas E

    2015-11-01

    Microorganisms and the viruses that infect them are the most numerous biological entities on Earth and enclose its greatest biodiversity and genetic reservoir. With strength in their numbers, these microscopic organisms are major players in the cycles of energy and matter that sustain all life. Scientists have only scratched the surface of this vast microbial world through culture-dependent methods. Recent developments in generating metagenomes, large random samples of nucleic acid sequences isolated directly from the environment, are providing comprehensive portraits of the composition, structure, and functioning of microbial communities. Moreover, advances in metagenomic analysis have created the possibility of obtaining complete or nearly complete genome sequences from uncultured microorganisms, providing important means to study their biology, ecology, and evolution. Here we review some of the recent developments in the field of metagenomics, focusing on the discovery of genetic novelty and on methods for obtaining uncultured genome sequences, including through the recycling of previously published datasets. Moreover we discuss how metagenomics has become a core scientific tool to characterize eco-evolutionary patterns of microbial ecosystems, thus allowing us to simultaneously discover new microbes and study their natural communities. We conclude by discussing general guidelines and challenges for modeling the interactions between uncultured microorganisms and viruses based on the information contained in their genome sequences. These models will significantly advance our understanding of the functioning of microbial ecosystems and the roles of microbes in the environment.

  15. Genome and metagenome analyses reveal adaptive evolution of the host and interaction with the gut microbiota in the goose

    PubMed Central

    Gao, Guangliang; Zhao, Xianzhi; Li, Qin; He, Chuan; Zhao, Wenjing; Liu, Shuyun; Ding, Jinmei; Ye, Weixing; Wang, Jun; Chen, Ye; Wang, Haiwei; Li, Jing; Luo, Yi; Su, Jian; Huang, Yong; Liu, Zuohua; Dai, Ronghua; Shi, Yixiang; Meng, He; Wang, Qigui

    2016-01-01

    The goose is an economically important waterfowl that exhibits unique characteristics and abilities, such as liver fat deposition and fibre digestion. Here, we report de novo whole-genome assemblies for the goose and swan goose and describe the evolutionary relationships among 7 bird species, including domestic and wild geese, which diverged approximately 3.4~6.3 million years ago (Mya). In contrast to chickens as a proximal species, the expanded and rapidly evolving genes found in the goose genome are mainly involved in metabolism, including energy, amino acid and carbohydrate metabolism. Further integrated analysis of the host genome and gut metagenome indicated that the most widely shared functional enrichment of genes occurs for functions such as glycolysis/gluconeogenesis, starch and sucrose metabolism, propanoate metabolism and the citrate cycle. We speculate that the unique physiological abilities of geese benefit from the adaptive evolution of the host genome and symbiotic interactions with gut microbes. PMID:27608918

  16. A metagenomic survey of viral abundance and diversity in mosquitoes from Hubei province.

    PubMed

    Shi, Chenyan; Liu, Yi; Hu, Xiaomin; Xiong, Jinfeng; Zhang, Bo; Yuan, Zhiming

    2015-01-01

    Mosquitoes as one of the most common but important vectors have the potential to transmit or acquire a lot of viruses through biting; however, viral flora in mosquitoes and its impact on mosquito-borne disease transmission has not been well investigated and evaluated. In this study, the metagenomic technique has been successfully employed in analyzing the abundance and diversity of viral community in three mosquito samples from Hubei, China. Among 92,304 reads produced through a run with 454 GS FLX system, 39% have high similarities with viral sequences belonging to identified bacterial, fungal, animal, plant and insect viruses, and 0.02% were classed into unidentified viral sequences, demonstrating high abundance and diversity of viruses in mosquitoes. Furthermore, two novel viruses in subfamily Densovirinae and family Dicistroviridae were identified, and six torque teno sus virus 1 in family Anelloviridae, three porcine parvoviruses in subfamily Parvovirinae and a Culex tritaeniorhynchus rhabdovirus in family Rhabdoviridae were preliminarily characterized. The viral metagenomic analysis offered us a deep insight into the viral population of mosquito which played an important role in viral initiative or passive transmission and evolution during the process.
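
The read fractions in this abstract can be turned into approximate read counts. The sketch below does only that arithmetic; because the abstract gives rounded percentages, the resulting counts are estimates rather than figures reported by the authors.

```python
total_reads = 92_304             # reads from the 454 GS FLX run
viral_fraction = 0.39            # reads similar to identified viral sequences
unidentified_fraction = 0.0002   # 0.02% classed as unidentified viral sequences

# Approximate counts implied by the rounded percentages.
viral_reads = round(total_reads * viral_fraction)
unidentified_reads = round(total_reads * unidentified_fraction)

print(viral_reads, unidentified_reads)
```

This gives roughly 36,000 reads with similarity to known viruses and about 18 reads in the unidentified viral category.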

  17. The metagenomic data life-cycle: standards and best practices

    PubMed Central

    ten Hoopen, Petra; Finn, Robert D.; Bongo, Lars Ailo; Corre, Erwan; Meyer, Folker; Mitchell, Alex; Pelletier, Eric; Pesole, Graziano; Santamaria, Monica; Willassen, Nils Peder

    2017-01-01

    Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonized way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (i) material sampling, (ii) material sequencing, (iii) data analysis, and (iv) data archiving and publishing. Taking examples from marine research, we summarize essential variables used to describe material sampling processes and sequencing procedures in a metagenomics experiment. These aspects of metagenomics dataset generation have been to some extent addressed by the scientific community, but greater awareness and adoption is still needed. We emphasize the lack of standards relating to reporting how metagenomics datasets are analysed and how the metagenomics data analysis outputs should be archived and published. We propose best practice as a foundation for a community standard to enable reproducibility and better sharing of metagenomics datasets, leading ultimately to greater metagenomics data reuse and repurposing. PMID:28637310

  18. The metagenomic data life-cycle: standards and best practices

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    ten Hoopen, Petra; Finn, Robert D.; Bongo, Lars Ailo

    Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonised way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (1) material sampling, (2) material sequencing, (3) data analysis and (4) data archiving & publishing. Taking examples from marine research, we summarise essential variables used to describe material sampling processes and sequencing procedures in a metagenomics experiment. These aspects of metagenomics dataset generation have been to some extent addressed by the scientific community but greater awareness and adoption is still needed. We emphasise the lack of standards relating to reporting how metagenomics datasets are analysed and how the metagenomics data analysis outputs should be archived and published. We propose best practice as a foundation for a community standard to enable reproducibility and better sharing of metagenomics datasets, leading ultimately to greater metagenomics data reuse and repurposing.

  19. Horizontal gene transfer in an acid mine drainage microbial community.

    PubMed

    Guo, Jiangtao; Wang, Qi; Wang, Xiaoqi; Wang, Fumeng; Yao, Jinxian; Zhu, Huaiqiu

    2015-07-04

    Horizontal gene transfer (HGT) has been widely identified in complete prokaryotic genomes. However, the roles of HGT among members of a microbial community and in evolution remain largely unknown. With the emergence of metagenomics, it is nontrivial to investigate such horizontal flow of genetic materials among members in a microbial community from the natural environment. Because of the lack of suitable methods for metagenomics gene transfer detection, microorganisms from a low-complexity community acid mine drainage (AMD) with near-complete genomes were used to detect possible gene transfer events and suggest the biological significance. Using the annotation of coding regions by the current tools, a phylogenetic approach, and an approximately unbiased test, we found that HGTs in AMD organisms are not rare, and we predicted 119 putative transferred genes. Among them, 14 HGT events were determined to be transfer events among the AMD members. Further analysis of the 14 transferred genes revealed that the HGT events affected the functional evolution of archaea or bacteria in AMD, and it probably shaped the community structure, such as the dominance of G-plasma in archaea in AMD through HGT. Our study provides a novel insight into HGT events among microorganisms in natural communities. The interconnectedness between HGT and community evolution is essential to understand microbial community formation and development.

  20. Effective Analysis of NGS Metagenomic Data with Ultra-Fast Clustering Algorithms (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Li, Weizhong

    2018-02-12

    San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  1. Genomics and Metagenomics of Extreme Acidophiles in Biomining Environments

    NASA Astrophysics Data System (ADS)

    Holmes, D. S.

    2015-12-01

    Over 160 draft or complete genomes of extreme acidophiles (pH < 3) have been published, many of which are from bioleaching and other biomining environments, or are closely related to such microorganisms. In addition, there are over 20 metagenomic studies of such environments. This provides a rich source of latent data that can be exploited for understanding the biology of biomining environments and for advancing biotechnological applications. Genomic and metagenomic data are already yielding valuable insights into cellular processes, including carbon and nitrogen management, heavy metal and acid resistance, iron and sulfur oxido-reduction, linking biogeochemical processes to organismal physiology. The data also allow the construction of useful models of the ecophysiology of biomining environments and provide insight into the gene and genome evolution of extreme acidophiles. Additionally, since most of these acidophiles are also chemoautolithotrophs that use minerals as energy sources or electron sinks, their genomes can be plundered for clues about the evolution of cellular metabolism and bioenergetic pathways during the Archaean abiotic/biotic transition on early Earth. Acknowledgements: Fondecyt 1130683.

  2. Reverse transcriptase genes are highly abundant and transcriptionally active in marine plankton assemblages

    PubMed Central

    Lescot, Magali; Hingamp, Pascal; Kojima, Kenji K; Villar, Emilie; Romac, Sarah; Veluchamy, Alaguraj; Boccara, Martine; Jaillon, Olivier; Iudicone, Daniele; Bowler, Chris; Wincker, Patrick; Claverie, Jean-Michel; Ogata, Hiroyuki

    2016-01-01

    Genes encoding reverse transcriptases (RTs) are found in most eukaryotes, often as a component of retrotransposons, as well as in retroviruses and in prokaryotic retroelements. We investigated the abundance, classification and transcriptional status of RTs based on Tara Oceans marine metagenomes and metatranscriptomes encompassing a wide organism size range. Our analyses revealed that RTs predominate in large-size fraction metagenomes (>5 μm), where they reached a maximum of 13.5% of the total gene abundance. Metagenomic RTs were widely distributed across the phylogeny of known RTs, but many belonged to previously uncharacterized clades. Metatranscriptomic RTs showed distinct abundance patterns across samples compared with metagenomic RTs. The relative abundances of viral and bacterial RTs among identified RT sequences were higher in metatranscriptomes than in metagenomes and these sequences were detected in all metatranscriptome size fractions. Overall, these observations suggest an active proliferation of various RT-assisted elements, which could be involved in genome evolution or adaptive processes of plankton assemblages. PMID:26613339

  3. EBI metagenomics--a new resource for the analysis and archiving of metagenomic data.

    PubMed

    Hunter, Sarah; Corbett, Matthew; Denise, Hubert; Fraser, Matthew; Gonzalez-Beltran, Alejandra; Hunter, Christopher; Jones, Philip; Leinonen, Rasko; McAnulla, Craig; Maguire, Eamonn; Maslen, John; Mitchell, Alex; Nuka, Gift; Oisel, Arnaud; Pesseat, Sebastien; Radhakrishnan, Rajesh; Rocca-Serra, Philippe; Scheremetjew, Maxim; Sterk, Peter; Vaughan, Daniel; Cochrane, Guy; Field, Dawn; Sansone, Susanna-Assunta

    2014-01-01

    Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.

  4. Metagenomic Approaches to Assess Bacteriophages in Various Environmental Niches

    PubMed Central

    Hayes, Stephen; Mahony, Jennifer; Nauta, Arjen; van Sinderen, Douwe

    2017-01-01

    Bacteriophages are ubiquitous and numerous parasites of bacteria and play a critical evolutionary role in virtually every ecosystem, yet our understanding of the extent of the diversity and role of phages remains inadequate for many ecological niches, particularly in cases in which the host is unculturable. During the past 15 years, the emergence of the field of viral metagenomics has drastically enhanced our ability to analyse the so-called viral ‘dark matter’ of the biosphere. Here, we review the evolution of viral metagenomic methodologies, as well as providing an overview of some of the most significant applications and findings in this field of research. PMID:28538703

  5. Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures

    PubMed Central

    Pride, David T; Schoenfeld, Thomas

    2008-01-01

    Background Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. Results From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. 
As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs are predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. Conclusion That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis. PMID:18798991

  6. Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures.

    PubMed

    Pride, David T; Schoenfeld, Thomas

    2008-09-17

    Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. 
As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs are predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes. That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis.
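
    The idea of classifying a fragment by its oligonucleotide signature rather than by alignment can be illustrated with a small sketch. This is not the authors' GSPC implementation: the function names, the choice of normalized k-mer frequencies, the Euclidean distance, and the toy reference sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math


def kmer_signature(seq, k=4):
    """Normalized k-mer frequency vector over the ACGT alphabet (hypothetical helper)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made of A, C, G, T contribute; ambiguous bases are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def signature_distance(a, b):
    """Euclidean distance between two signatures (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_fragment(fragment, references, k=4):
    """Return the name of the reference whose signature is closest to the fragment's."""
    sig = kmer_signature(fragment, k)
    return min(
        references,
        key=lambda name: signature_distance(sig, kmer_signature(references[name], k)),
    )


# Toy reference "genomes"; real use would need full genome sequences.
references = {
    "at_rich": "ATATATATTTAAATATATAT" * 5,
    "gc_rich": "GCGCGGCCGCGCGGGCCC" * 5,
}
label = classify_fragment("ATATTTAAATAT", references, k=2)
```

    Because the comparison uses composition only, a fragment can be assigned to the nearest reference even when no alignment-based homolog exists, which is the complementary role the abstract attributes to signature-based methods.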

  7. BioMaS: a modular pipeline for Bioinformatic analysis of Metagenomic AmpliconS.

    PubMed

    Fosso, Bruno; Santamaria, Monica; Marzano, Marinella; Alonso-Alemany, Daniel; Valiente, Gabriel; Donvito, Giacinto; Monaco, Alfonso; Notarangelo, Pasquale; Pesole, Graziano

    2015-07-01

    Substantial advances in microbiology, molecular evolution and biodiversity have been made in recent years thanks to Metagenomics, which makes it possible to unveil the composition and functions of mixed microbial communities in any environmental niche. If the investigation is aimed only at the microbiome taxonomic structure, a target-based metagenomic approach, here also referred to as Meta-barcoding, is generally applied. This approach commonly involves the selective amplification of a species-specific genetic marker (DNA meta-barcode) in the whole taxonomic range of interest and the exploration of its taxon-related variants through High-Throughput Sequencing (HTS) technologies. Access to proper computational systems for the large-scale bioinformatic analysis of HTS data currently represents one of the major challenges in advanced Meta-barcoding projects. BioMaS (Bioinformatic analysis of Metagenomic AmpliconS) is a new bioinformatic pipeline designed to support biomolecular researchers involved in taxonomic studies of environmental microbial communities through a completely automated workflow comprising all the fundamental steps, from raw sequence data upload and cleaning to final taxonomic identification, that are required in an appropriately designed Meta-barcoding HTS-based experiment. In its current version, BioMaS allows the analysis of both bacterial and fungal environments starting directly from the raw sequencing data from either Roche 454 or Illumina HTS platforms, following two alternative paths, respectively. BioMaS is implemented as a public web service available at https://recasgateway.ba.infn.it/ and is also available in Galaxy at http://galaxy.cloud.ba.infn.it:8080 (only for Illumina data). BioMaS is a user-friendly pipeline for Meta-barcoding HTS data analysis specifically designed for users without particular computing skills. 
A comparative benchmark, carried out by using a simulated dataset suitably designed to broadly represent the currently known bacterial and fungal world, showed that BioMaS outperforms QIIME and MOTHUR in terms of extent and accuracy of deep taxonomic sequence assignments.

  8. Technical Report on Modeling for Quasispecies Abundance Inference with Confidence Intervals from Metagenomic Sequence Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McLoughlin, K.

    2016-01-11

    The overall aim of this project is to develop a software package, called MetaQuant, that can determine the constituents of a complex microbial sample and estimate their relative abundances by analysis of metagenomic sequencing data. The goal for Task 1 is to create a generative model describing the stochastic process underlying the creation of sequence read pairs in the data set. The stages in this generative process include the selection of a source genome sequence for each read pair, with probability dependent on its abundance in the sample. The other stages describe the evolution of the source genome from its nearest common ancestor with a reference genome, breakage of the source DNA into short fragments, and the errors in sequencing the ends of the fragments to produce read pairs.
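
    The stages of such a generative process can be sketched as a small simulation. This is not the MetaQuant model: the function names, the fixed fragment length, the uniform substitution errors, and all parameter values are assumptions, and the stage describing divergence of the source genome from a reference is omitted for brevity.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA string over ACGT."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def add_errors(read, error_rate, rng):
    """Substitute each base with a different one with probability error_rate (assumed error model)."""
    out = []
    for base in read:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_read_pairs(genomes, abundances, n_pairs, read_len=50,
                        frag_len=200, error_rate=0.01, seed=0):
    """Draw read pairs: pick a genome by abundance, cut a fragment, sequence both ends."""
    rng = random.Random(seed)
    names = list(genomes)
    weights = [abundances[name] for name in names]
    pairs = []
    for _ in range(n_pairs):
        # Stage 1: choose the source genome with probability proportional to abundance.
        source = rng.choices(names, weights=weights, k=1)[0]
        genome = genomes[source]
        # Stage 2: break out a fragment (fixed length here, a simplification).
        start = rng.randrange(0, len(genome) - frag_len + 1)
        fragment = genome[start:start + frag_len]
        # Stage 3: sequence both ends of the fragment with errors.
        forward = add_errors(fragment[:read_len], error_rate, rng)
        reverse = add_errors(reverse_complement(fragment[-read_len:]), error_rate, rng)
        pairs.append((source, forward, reverse))
    return pairs


# Two random toy genomes with unequal abundances.
_rng = random.Random(1)
genomes = {name: "".join(_rng.choice("ACGT") for _ in range(1000)) for name in ("a", "b")}
pairs = simulate_read_pairs(genomes, {"a": 0.9, "b": 0.1}, n_pairs=200, error_rate=0.0)
```

    Abundance inference then amounts to running this process in reverse: given the observed read pairs, estimate the weights that most plausibly produced them.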

  9. Evaluation of the Cow Rumen Metagenome: Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Sczyrba, Alex

    2018-02-13

    DOE JGI's Alex Sczyrba on "Evaluation of the Cow Rumen Metagenome" and "Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  10. Evaluation of the Cow Rumen Metagenome: Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sczyrba, Alex

    2011-10-13

    DOE JGI's Alex Sczyrba on "Evaluation of the Cow Rumen Metagenome" and "Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  11. Recovering complete and draft population genomes from metagenome datasets

    DOE PAGES

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.

    2016-03-08

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on genome-wide evolution.
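
    The use of covarying coverage for binning can be illustrated with a minimal sketch: contigs from the same genome should rise and fall together in coverage across samples, so their coverage profiles correlate strongly. The greedy grouping, the Pearson threshold of 0.95, and the toy coverage values below are assumptions for illustration and do not correspond to any specific binning tool.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length coverage profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no covariation signal
    return cov / math.sqrt(var_x * var_y)


def bin_by_coverage(coverage, threshold=0.95):
    """Greedily group contigs whose profiles correlate with a bin's first member."""
    bins = []
    for contig, profile in coverage.items():
        for members in bins:
            if pearson(profile, coverage[members[0]]) >= threshold:
                members.append(contig)
                break
        else:
            bins.append([contig])
    return bins


# Mean coverage of four contigs in four samples (made-up numbers).
coverage = {
    "c1": [10, 20, 5, 40],
    "c2": [11, 19, 6, 42],
    "c3": [30, 5, 25, 2],
    "c4": [29, 6, 27, 3],
}
bins = bin_by_coverage(coverage)
```

    With a single sample every profile has length one and no correlation can be computed, which is one way to see why multiple datasets improve binning.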

  12. Meta genome-wide network from functional linkages of genes in human gut microbial ecosystems.

    PubMed

    Ji, Yan; Shi, Yixiang; Wang, Chuan; Dai, Jianliang; Li, Yixue

    2013-03-01

    The human gut microbial ecosystem (HGME) exerts an important influence on human health. In recent research, meta-genomics has provided deep insights into the HGME in terms of gene contents, metabolic processes and genome constitutions of the meta-genome. Here we present a novel methodology to investigate the HGME on the basis of a set of functionally coupled genes, regardless of their genome origins, by considering the co-evolution properties of genes. By analyzing these coupled genes, we showed that some basic properties of the HGME are significantly associated with each other, and further constructed a protein interaction map of the human gut meta-genome to discover functional modules that may relate to essential metabolic processes. Compared with other studies, our method provides a new way to extract basic functional elements from meta-genome systems and to investigate complex microbial environments by associating their biological traits with the co-evolutionary fingerprints encoded in them.

  13. Recovery of a Medieval Brucella melitensis Genome Using Shotgun Metagenomics

    PubMed Central

    Kay, Gemma L.; Sergeant, Martin J.; Giuffra, Valentina; Bandiera, Pasquale; Milanese, Marco; Bramanti, Barbara

    2014-01-01

    ABSTRACT Shotgun metagenomics provides a powerful assumption-free approach to the recovery of pathogen genomes from contemporary and historical material. We sequenced the metagenome of a calcified nodule from the skeleton of a 14th-century middle-aged male excavated from the medieval Sardinian settlement of Geridu. We obtained 6.5-fold coverage of a Brucella melitensis genome. Sequence reads from this genome showed signatures typical of ancient or aged DNA. Despite the relatively low coverage, we were able to use information from single-nucleotide polymorphisms to place the medieval pathogen genome within a clade of B. melitensis strains that included the well-studied Ether strain and two other recent Italian isolates. We confirmed this placement using information from deletions and IS711 insertions. We conclude that metagenomics stands ready to document past and present infections, shedding light on the emergence, evolution, and spread of microbial pathogens. PMID:25028426

  14. Recovering complete and draft population genomes from metagenome datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on genome-wide evolution.

  15. A Metagenomic Survey of Viral Abundance and Diversity in Mosquitoes from Hubei Province

    PubMed Central

    Shi, Chenyan; Liu, Yi; Hu, Xiaomin; Xiong, Jinfeng; Zhang, Bo; Yuan, Zhiming

    2015-01-01

    Mosquitoes, as one of the most common and important vectors, have the potential to transmit or acquire many viruses through biting; however, the viral flora in mosquitoes and its impact on mosquito-borne disease transmission have not been well investigated and evaluated. In this study, the metagenomic technique has been successfully employed in analyzing the abundance and diversity of the viral community in three mosquito samples from Hubei, China. Among 92,304 reads produced through a run with the 454 GS FLX system, 39% have high similarities with viral sequences belonging to identified bacterial, fungal, animal, plant and insect viruses, and 0.02% were classed as unidentified viral sequences, demonstrating high abundance and diversity of viruses in mosquitoes. Furthermore, two novel viruses in subfamily Densovirinae and family Dicistroviridae were identified, and six torque teno sus virus 1 in family Anelloviridae, three porcine parvoviruses in subfamily Parvovirinae and a Culex tritaeniorhynchus rhabdovirus in family Rhabdoviridae were preliminarily characterized. The viral metagenomic analysis offered us a deep insight into the viral population of mosquitoes, which plays an important role in active or passive viral transmission and in viral evolution during the process. PMID:26030271

  16. MetAMOS: a modular and open source metagenomic assembly and analysis pipeline

    PubMed Central

    2013-01-01

    We describe MetAMOS, an open source and modular metagenomic assembly and analysis pipeline. MetAMOS represents an important step towards fully automated metagenomic analysis, starting with next-generation sequencing reads and producing genomic scaffolds, open-reading frames and taxonomic or functional annotations. MetAMOS can aid in reducing assembly errors, commonly encountered when assembling metagenomic samples, and improves taxonomic assignment accuracy while also reducing computational cost. MetAMOS can be downloaded from: https://github.com/treangen/MetAMOS. PMID:23320958

  17. Metagenomic analysis revealed highly diverse microbial arsenic metabolism genes in paddy soils with low-arsenic contents.

    PubMed

    Xiao, Ke-Qing; Li, Li-Guan; Ma, Li-Ping; Zhang, Si-Yu; Bao, Peng; Zhang, Tong; Zhu, Yong-Guan

    2016-04-01

    Microbe-mediated arsenic (As) metabolism plays a critical role in the global As cycle, and As metabolism involves different types of genes encoding proteins facilitating its biotransformation and transportation processes. Here, we used metagenomic analysis based on high-throughput sequencing and constructed As metabolism protein databases to analyze As metabolism genes in five paddy soils with low-As contents. The results showed that highly diverse As metabolism genes were present in these paddy soils, with varied abundances and distribution for different types and subtypes of these genes. Arsenate reduction genes (ars) dominated in all soil samples, and significant correlation existed between the abundance of arr (arsenate respiration), aio (arsenite oxidation), and arsM (arsenite methylation) genes, indicating the co-existence and close relation of different As resistance systems of microbes in wetland environments similar to these paddy soils after long-term evolution. Among all soil parameters, pH was an important factor controlling the distribution of As metabolism genes in the five paddy soils (p = 0.018). To the best of our knowledge, this is the first study using high-throughput sequencing and a metagenomics approach to characterize As metabolism genes in the five paddy soils, showing their great potential in As biotransformation, and therefore in mitigating arsenic risk to humans. Copyright © 2015 Elsevier Ltd. All rights reserved.

  18. Microevolutionary dynamics in Methanothermococcus populations from deep-sea hydrothermal vents in the Mid-Cayman Rise

    NASA Astrophysics Data System (ADS)

    Hoffert, M.; Anderson, R. E.; Stepanauskas, R.; Huber, J. A.

    2017-12-01

    Deep-sea hydrothermal vents sustain diverse communities of microorganisms. The effects of geochemical and biological interactions on the process of evolution in these ecosystems remains poorly understood because the majority of subsurface microorganisms remain uncultivated. By examining metagenomic samples from hydrothermal fluids and mapping the samples to closely-related genomes found in vent sites, we can better understand how the process of evolution is affected by the geochemical and environmental context in deep-sea vents. The Mid-Cayman Rise is a spreading ridge that hosts both mafic-influenced and ultramafic-influenced vent fields. Previous research on metagenomic samples from sites in the Mid-Cayman Rise has shown that these vents contain metabolically and taxonomically diverse microbial communities. Here, we investigate five single cell amplified Methanothermococcus genomes (SAGs) to investigate patterns in pangenomic variation and molecular evolution in these methanogens. Mappings of metagenomic reads from 15 sample sites to the SAGs reveal substantial variation in Methanothermococcus population abundance, nucleotide variability and selection pressure among the 15 geochemically distinct sample sites. Within each sample site, we observed distinct patterns of single nucleotide variant (SNV) accumulation and selection pressure within the SAG populations. Closely related genomes showed similar patterns of SNV accumulation. Analysis of open reading frames (ORFs) from the SAGs indicated that homologous genes accumulated variation at the same rate. For example, a genomic island for Nif genes was identified in three of the five genomes with significantly elevated SNV counts. dN/dS analyses revealed evidence for frequency-dependent selection, in which genes unique to individual SAGs displayed elevated diversifying selection relative to other genes. 
These results indicate that different strains of Methanothermococcus outcompete others in specific environmental settings, and that these fitness advantages may result from variation in the pangenome, as revealed by dN/dS and SNV analyses. By examining variation at the scale of nucleotides and genes, we aim to gain insight into the roles of genetic diversity and environmental selection in microbial evolution in these ecosystems.

  19. EBI metagenomics—a new resource for the analysis and archiving of metagenomic data

    PubMed Central

    Hunter, Sarah; Corbett, Matthew; Denise, Hubert; Fraser, Matthew; Gonzalez-Beltran, Alejandra; Hunter, Christopher; Jones, Philip; Leinonen, Rasko; McAnulla, Craig; Maguire, Eamonn; Maslen, John; Mitchell, Alex; Nuka, Gift; Oisel, Arnaud; Pesseat, Sebastien; Radhakrishnan, Rajesh; Rocca-Serra, Philippe; Scheremetjew, Maxim; Sterk, Peter; Vaughan, Daniel; Cochrane, Guy; Field, Dawn; Sansone, Susanna-Assunta

    2014-01-01

    Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive. PMID:24165880

  20. Identifying biologically relevant differences between metagenomic communities.

    PubMed

    Parks, Donovan H; Beiko, Robert G

    2010-03-15

    Metagenomics is the study of genetic material recovered directly from environmental samples. Taxonomic and functional differences between metagenomic samples can highlight the influence of ecological factors on patterns of microbial life in a wide range of habitats. Statistical hypothesis tests can help us distinguish ecological influences from sampling artifacts, but knowledge of only the P-value from a statistical hypothesis test is insufficient to make inferences about biological relevance. Current reporting practices for pairwise comparative metagenomics are inadequate, and better tools are needed for comparative metagenomic analysis. We have developed a new software package, STAMP, for comparative metagenomics that supports best practices in analysis and reporting. Examination of a pair of iron mine metagenomes demonstrates that deeper biological insights can be gained using statistical techniques available in our software. An analysis of the functional potential of 'Candidatus Accumulibacter phosphatis' in two enhanced biological phosphorus removal metagenomes identified several subsystems that differ between the A. phosphatis strains in these related communities, including phosphate metabolism, secretion and metal transport. Python source code and binaries are freely available from our website at http://kiwi.cs.dal.ca/Software/STAMP. Contact: beiko@cs.dal.ca. Supplementary data are available at Bioinformatics online.
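
    The point that a P-value alone says little about biological relevance can be made concrete by reporting an effect size with a confidence interval. The sketch below is not taken from STAMP; it assumes a simple difference in proportions with an asymptotic (Wald) 95% interval, and the function name and read counts are made up for illustration.

```python
import math


def diff_in_proportions(x1, n1, x2, n2, z=1.96):
    """Difference in proportions p1 - p2 with a Wald confidence interval.

    x1, x2: reads assigned to a feature in samples 1 and 2 (hypothetical inputs)
    n1, n2: total reads in each sample
    z:      normal quantile, 1.96 for an approximate 95% interval
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, (diff - z * se, diff + z * se)


# With a million reads per sample, a tiny difference is statistically detectable.
diff, (low, high) = diff_in_proportions(5300, 1_000_000, 5000, 1_000_000)
```

    Here the interval excludes zero, so a test would call the difference significant, yet the effect is only about 0.03 percentage points. Whether a shift of that size matters is a biological judgment that the interval makes visible and the P-value hides.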

  1. EBI metagenomics in 2016 - an expanding and evolving resource for the analysis and archiving of metagenomic data

    PubMed Central

    Mitchell, Alex; Bucchini, Francois; Cochrane, Guy; Denise, Hubert; Hoopen, Petra ten; Fraser, Matthew; Pesseat, Sebastien; Potter, Simon; Scheremetjew, Maxim; Sterk, Peter; Finn, Robert D.

    2016-01-01

    EBI metagenomics (https://www.ebi.ac.uk/metagenomics/) is a freely available hub for the analysis and archiving of metagenomic and metatranscriptomic data. Over the last 2 years, the resource has undergone rapid growth, with an increase of over five-fold in the number of processed samples and consequently represents one of the largest resources of analysed shotgun metagenomes. Here, we report the status of the resource in 2016 and give an overview of new developments. In particular, we describe updates to data content, a complete overhaul of the analysis pipeline, streamlining of data presentation via the website and the development of a new web based tool to compare functional analyses of sequence runs within a study. We also highlight two of the higher profile projects that have been analysed using the resource in the last year: the oceanographic projects Ocean Sampling Day and Tara Oceans. PMID:26582919

  2. Bioinformatics tools for quantitative and functional metagenome and metatranscriptome data analysis in microbes.

    PubMed

    Niu, Sheng-Yong; Yang, Jinyu; McDermaid, Adam; Zhao, Jing; Kang, Yu; Ma, Qin

    2017-05-08

    Metagenomic and metatranscriptomic sequencing approaches are more frequently being used to link microbiota to important diseases and ecological changes. Many analyses have been used to compare the taxonomic and functional profiles of microbiota across habitats or individuals. While a large portion of metagenomic analyses focus on species-level profiling, some studies use strain-level metagenomic analyses to investigate the relationship between specific strains and certain circumstances. Metatranscriptomic analysis provides another important insight into activities of genes by examining gene expression levels of microbiota. Hence, combining metagenomic and metatranscriptomic analyses will help understand the activity or enrichment of a given gene set, such as drug-resistant genes among microbiome samples. Here, we summarize existing bioinformatics tools of metagenomic and metatranscriptomic data analysis, the purpose of which is to assist researchers in deciding the appropriate tools for their microbiome studies. Additionally, we propose an Integrated Meta-Function mapping pipeline to incorporate various reference databases and accelerate functional gene mapping procedures for both metagenomic and metatranscriptomic analyses. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  3. Whole-genome sequencing in bacteriology: state of the art

    PubMed Central

    Dark, Michael J

    2013-01-01

    Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, and highlights recent studies that demonstrate new applications for bacterial genomics. PMID:24143115

  4. MALINA: a web service for visual analytics of human gut microbiota whole-genome metagenomic reads.

    PubMed

    Tyakht, Alexander V; Popenko, Anna S; Belenikin, Maxim S; Altukhov, Ilya A; Pavlenko, Alexander V; Kostryukova, Elena S; Selezneva, Oksana V; Larin, Andrei K; Karpova, Irina Y; Alexeev, Dmitry G

    2012-12-07

    MALINA is a web service for bioinformatic analysis of whole-genome metagenomic data obtained from human gut microbiota sequencing. As input data, it accepts metagenomic reads of various sequencing technologies, including long reads (such as Sanger and 454 sequencing) and next-generation (including SOLiD and Illumina). To the authors' knowledge, it is the first metagenomic web service capable of processing SOLiD color-space reads. The web service allows phylogenetic and functional profiling of metagenomic samples using coverage depth resulting from the alignment of the reads to the catalogue of reference sequences, which are built into the pipeline and contain prevalent microbial genomes and genes of human gut microbiota. The obtained metagenomic composition vectors are processed by the statistical analysis and visualization module containing methods for clustering, dimension reduction and group comparison. Additionally, the MALINA database includes vectors of bacterial and functional composition for human gut microbiota samples from a large number of existing studies, allowing their comparative analysis together with user samples, namely datasets from Russian Metagenome project, MetaHIT and Human Microbiome Project (downloaded from http://hmpdacc.org). MALINA is made freely available on the web at http://malina.metagenome.ru. The website is implemented in JavaScript (using Ext JS), Microsoft .NET Framework, MS SQL, Python, with all major browsers supported.
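
    The idea of turning read alignments into a composition vector via coverage depth can be sketched as follows. This is a minimal illustration, not MALINA's implementation: the function name, the reference names, and the toy numbers are all assumptions made for the example. Depth is taken here as aligned bases divided by reference length, and the vector is normalized to sum to 1.

```python
def composition_vector(aligned_bases_by_ref, ref_lengths):
    """Return relative abundances derived from per-reference coverage depth.

    aligned_bases_by_ref: hypothetical dict mapping reference name -> total aligned bases
    ref_lengths: hypothetical dict mapping reference name -> reference length in bp
    """
    # Coverage depth = aligned bases / reference length (corrects for genome size).
    depths = {
        ref: aligned_bases_by_ref.get(ref, 0) / length
        for ref, length in ref_lengths.items()
    }
    total = sum(depths.values())
    if total == 0:
        # No reads aligned to any reference: return an all-zero vector.
        return {ref: 0.0 for ref in depths}
    # Normalize so the components sum to 1.
    return {ref: depth / total for ref, depth in depths.items()}


# Toy example with made-up reference genomes and counts.
ref_lengths = {"genome_A": 1000, "genome_B": 2000}
aligned = {"genome_A": 5000, "genome_B": 10000}
vector = composition_vector(aligned, ref_lengths)
```

    Dividing by reference length before normalizing keeps a long genome from appearing more abundant simply because it attracts more reads.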

  5. 16S rRNA Gene-Based Metagenomic Analysis of Ozark Cave Bacteria.

    PubMed

    Oliveira, Cássia; Gunderman, Lauren; Coles, Cathryn A; Lochmann, Jason; Parks, Megan; Ballard, Ethan; Glazko, Galina; Rahmatallah, Yasir; Tackett, Alan J; Thomas, David J

    2017-09-01

    The microbial diversity within cave ecosystems is largely unknown. Ozark caves maintain a year-round stable temperature (12-14 °C), but most parts of the caves experience complete darkness. The lack of sunlight and geological isolation from surface-energy inputs generate nutrient-poor conditions that may limit species diversity in such environments. Although microorganisms play a crucial role in sustaining life on Earth and impacting human health, little is known about their diversity, ecology, and evolution in community structures. We used five Ozark region caves as test sites for exploring bacterial diversity and monitoring long-term biodiversity. Illumina MiSeq sequencing of five cave soil samples and a control sample revealed a total of 49 bacterial phyla, with seven major phyla: Proteobacteria, Acidobacteria, Actinobacteria, Firmicutes, Chloroflexi, Bacteroidetes, and Nitrospirae. Variation in bacterial composition was observed among the five caves studied. Sandtown Cave had the lowest richness and most divergent community composition. 16S rRNA gene-based metagenomic analysis of cave-dwelling microbial communities in the Ozark caves revealed that species abundance and diversity are vast and included ecologically, agriculturally, and economically relevant taxa.

  6. Stable isotope probing in the metagenomics era: a bridge towards improved bioremediation

    PubMed Central

    Uhlik, Ondrej; Leewis, Mary-Cathrine; Strejcek, Michal; Musilova, Lucie; Mackova, Martina; Leigh, Mary Beth; Macek, Tomas

    2012-01-01

    Microbial biodegradation and biotransformation reactions are essential to most bioremediation processes, yet the specific organisms, genes, and mechanisms involved are often not well understood. Stable isotope probing (SIP) enables researchers to directly link microbial metabolic capability to phylogenetic and metagenomic information within a community context by tracking isotopically labeled substances into phylogenetically and functionally informative biomarkers. SIP is thus applicable as a tool for the identification of active members of the microbial community and associated genes integral to the community functional potential, such as biodegradative processes. The rapid evolution of SIP over the last decade and integration with metagenomics provides researchers with a much deeper insight into potential biodegradative genes, processes, and applications, thereby enabling an improved mechanistic understanding that can facilitate advances in the field of bioremediation. PMID:23022353

  7. Metagenomic and metaproteomic insights into bacterial communities in leaf-cutter ant fungus gardens.

    PubMed

    Aylward, Frank O; Burnum, Kristin E; Scott, Jarrod J; Suen, Garret; Tringe, Susannah G; Adams, Sandra M; Barry, Kerrie W; Nicora, Carrie D; Piehowski, Paul D; Purvine, Samuel O; Starrett, Gabriel J; Goodwin, Lynne A; Smith, Richard D; Lipton, Mary S; Currie, Cameron R

    2012-09-01

    Herbivores gain access to nutrients stored in plant biomass largely by harnessing the metabolic activities of microbes. Leaf-cutter ants of the genus Atta are a hallmark example; these dominant neotropical herbivores cultivate symbiotic fungus gardens on large quantities of fresh plant forage. As the external digestive system of the ants, fungus gardens facilitate the production and sustenance of millions of workers. Using metagenomic and metaproteomic techniques, we characterize the bacterial diversity and physiological potential of fungus gardens from two species of Atta. Our analysis of over 1.2 Gbp of community metagenomic sequence and three 16S pyrotag libraries reveals that in addition to harboring the dominant fungal crop, these ecosystems contain abundant populations of Enterobacteriaceae, including the genera Enterobacter, Pantoea, Klebsiella, Citrobacter and Escherichia. We show that these bacterial communities possess genes associated with lignocellulose degradation and diverse biosynthetic pathways, suggesting that they play a role in nutrient cycling by converting the nitrogen-poor forage of the ants into B-vitamins, amino acids and other cellular components. Our metaproteomic analysis confirms that bacterial glycosyl hydrolases and proteins with putative biosynthetic functions are produced in both field-collected and laboratory-reared colonies. These results are consistent with the hypothesis that fungus gardens are specialized fungus-bacteria communities that convert plant material into energy for their ant hosts. Together with recent investigations into the microbial symbionts of vertebrates, our work underscores the importance of microbial communities in the ecology and evolution of herbivorous metazoans.

  8. Metagenomic and metaproteomic insights into bacterial communities in leaf-cutter ant fungus gardens

    PubMed Central

    Aylward, Frank O; Burnum, Kristin E; Scott, Jarrod J; Suen, Garret; Tringe, Susannah G; Adams, Sandra M; Barry, Kerrie W; Nicora, Carrie D; Piehowski, Paul D; Purvine, Samuel O; Starrett, Gabriel J; Goodwin, Lynne A; Smith, Richard D; Lipton, Mary S; Currie, Cameron R

    2012-01-01

    Herbivores gain access to nutrients stored in plant biomass largely by harnessing the metabolic activities of microbes. Leaf-cutter ants of the genus Atta are a hallmark example; these dominant neotropical herbivores cultivate symbiotic fungus gardens on large quantities of fresh plant forage. As the external digestive system of the ants, fungus gardens facilitate the production and sustenance of millions of workers. Using metagenomic and metaproteomic techniques, we characterize the bacterial diversity and physiological potential of fungus gardens from two species of Atta. Our analysis of over 1.2 Gbp of community metagenomic sequence and three 16S pyrotag libraries reveals that in addition to harboring the dominant fungal crop, these ecosystems contain abundant populations of Enterobacteriaceae, including the genera Enterobacter, Pantoea, Klebsiella, Citrobacter and Escherichia. We show that these bacterial communities possess genes associated with lignocellulose degradation and diverse biosynthetic pathways, suggesting that they play a role in nutrient cycling by converting the nitrogen-poor forage of the ants into B-vitamins, amino acids and other cellular components. Our metaproteomic analysis confirms that bacterial glycosyl hydrolases and proteins with putative biosynthetic functions are produced in both field-collected and laboratory-reared colonies. These results are consistent with the hypothesis that fungus gardens are specialized fungus–bacteria communities that convert plant material into energy for their ant hosts. Together with recent investigations into the microbial symbionts of vertebrates, our work underscores the importance of microbial communities in the ecology and evolution of herbivorous metazoans. PMID:22378535

  9. RIEMS: a software pipeline for sensitive and comprehensive taxonomic classification of reads from metagenomics datasets.

    PubMed

    Scheuch, Matthias; Höper, Dirk; Beer, Martin

    2015-03-03

    Fuelled by the advent and subsequent development of next generation sequencing technologies, metagenomics became a powerful tool for the analysis of microbial communities both scientifically and diagnostically. The biggest challenge is the extraction of relevant information from the huge sequence datasets generated for metagenomics studies. Although a plethora of tools are available, data analysis is still a bottleneck. To overcome the bottleneck of data analysis, we developed an automated computational workflow called RIEMS - Reliable Information Extraction from Metagenomic Sequence datasets. RIEMS assigns every individual read sequence within a dataset taxonomically by cascading different sequence analyses with decreasing stringency of the assignments using various software applications. After completion of the analyses, the results are summarised in a clearly structured result protocol organised taxonomically. The high accuracy and performance of RIEMS analyses were proven in comparison with other tools for metagenomics data analysis using simulated sequencing read datasets. RIEMS has the potential to fill the gap that still exists with regard to data analysis for metagenomics studies. The usefulness and power of RIEMS for the analysis of genuine sequencing datasets was demonstrated with an early version of RIEMS in 2011 when it was used to detect the orthobunyavirus sequences leading to the discovery of Schmallenberg virus.

  10. Metagenomic Insights into the Evolution, Function, and Complexity of the Planktonic Microbial Community of Lake Lanier, a Temperate Freshwater Ecosystem

    PubMed Central

    Oh, Seungdae; Caro-Quintero, Alejandro; Tsementzi, Despina; DeLeon-Rodriguez, Natasha; Luo, Chengwei; Poretsky, Rachel; Konstantinidis, Konstantinos T.

    2011-01-01

    Lake Lanier is an important freshwater lake for the southeast United States, as it represents the main source of drinking water for the Atlanta metropolitan area and is popular for recreational activities. Temperate freshwater lakes such as Lake Lanier are underrepresented among the growing number of environmental metagenomic data sets, and little is known about how functional gene content in freshwater communities relates to that of other ecosystems. To better characterize the gene content and variability of this freshwater planktonic microbial community, we sequenced several samples obtained around a strong summer storm event and during the fall water mixing using a random whole-genome shotgun (WGS) approach. Comparative metagenomics revealed that the gene content was relatively stable over time and more related to that of another freshwater lake and the surface ocean than to soil. However, the phylogenetic diversity of Lake Lanier communities was distinct from that of soil and marine communities. We identified several important genomic adaptations that account for these findings, such as the use of potassium (as opposed to sodium) osmoregulators by freshwater organisms and differences in the community average genome size. We show that the lake community is predominantly composed of sequence-discrete populations and describe a simple method to assess community complexity based on population richness and evenness and to determine the sequencing effort required to cover diversity in a sample. This study provides the first comprehensive analysis of the genetic diversity and metabolic potential of a temperate planktonic freshwater community and advances approaches for comparative metagenomics. PMID:21764968
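
    The abstract does not give the formula behind its complexity measure, so the sketch below does not reproduce the authors' method. It only illustrates how richness and evenness are commonly quantified, using the Shannon index and Pielou's evenness as stand-ins; the function names and the toy abundance counts are assumptions.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over populations with nonzero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(counts):
    """Pielou's evenness J = H / ln(S), where S is the number of observed populations."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        # Evenness is undefined for fewer than two populations; report 0.0 here.
        return 0.0
    return shannon_index(counts) / math.log(richness)


# Toy abundance counts (made up): a perfectly even and a skewed community.
even_community = [10, 10, 10, 10]
skewed_community = [37, 1, 1, 1]
even_j = pielou_evenness(even_community)
skewed_j = pielou_evenness(skewed_community)
```

    Both toy communities have the same richness (four populations), but the skewed one scores a lower evenness, which is the distinction a richness-plus-evenness measure is meant to capture.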

  11. Metagenomic sequence of saline desert microbiota from wild ass sanctuary, Little Rann of Kutch, Gujarat, India.

    PubMed

    Patel, Rajesh; Mevada, Vishal; Prajapati, Dhaval; Dudhagara, Pravin; Koringa, Prakash; Joshi, C G

    2015-03-01

    We report the metagenome of a saline desert soil sample from Little Rann of Kutch, Gujarat State, India. The metagenome consisted of 633,760 sequences totaling 141,307,202 bp, with 56% G + C content. The metagenome sequence data are available in the EBI Metagenomics database under accession no. ERP005612. Community metagenomics revealed a total of 1802 species belonging to 43 different phyla, dominated by the genera Marinobacter (48.7%) and Halobacterium (4.6%) in the bacterial and archaeal domains, respectively. Remarkably, functional metagenome analysis showed that 18.2% of sequences fell into a poorly characterized group and 4% of genes were associated with various stress responses, along with a versatile presence of commercial enzymes.
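
    As a quick arithmetic check, the totals reported above (141,307,202 bp across 633,760 sequences) imply a mean sequence length of roughly 223 bp. The sketch below reproduces that division and shows how a G + C fraction is computed from a sequence. The function names and the toy sequence are illustrative assumptions, not part of the study.

```python
def mean_sequence_length(total_bp, n_sequences):
    """Mean length in bp, given the total base count and the number of sequences."""
    return total_bp / n_sequences


def gc_fraction(sequence):
    """Fraction of bases in the sequence that are G or C (case-insensitive)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc = sum(1 for base in sequence if base in "GC")
    return gc / len(sequence)


# Figures taken from the abstract above.
mean_len = mean_sequence_length(141_307_202, 633_760)  # about 223 bp

# Toy sequence (made up) with 2 of 4 bases being G or C.
toy_gc = gc_fraction("ATGC")
```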

  12. Deduction and Analysis of the Interacting Stress Response Pathways of Metal/Radionuclide-reducing Bacteria

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Jizhong; He, Zhili

    2010-02-28

    Project Title: Deduction and Analysis of the Interacting Stress Response Pathways of Metal/Radionuclide-reducing Bacteria DOE Grant Number: DE-FG02-06ER64205 Principal Investigator: Jizhong (Joe) Zhou (University of Oklahoma) Key members: Zhili He, Aifen Zhou, Christopher Hemme, Joy Van Nostrand, Ye Deng, and Qichao Tu Collaborators: Terry Hazen, Judy Wall, Adam Arkin, Matthew Fields, Aindrila Mukhopadhyay, and David Stahl Summary: Three major objectives have been conducted in the Zhou group at the University of Oklahoma (OU): (i) understanding of gene function, regulation, network and evolution of Desulfovibrio vulgaris Hildenborough in response to environmental stresses, (ii) development of metagenomics technologies for microbial community analysis, and (iii) functional characterization of microbial communities with metagenomic approaches. In the past few years, we characterized four CRP/FNR regulators, sequenced ancestor and evolved D. vulgaris strains, and functionally analyzed those mutated genes identified in salt-adapted strains. Also, a new version of GeoChip 4.0 has been developed, which also includes stress response genes (StressChip), and a random matrix theory-based conceptual framework for identifying functional molecular ecological networks has been developed with the high throughput functional gene array hybridization data as well as pyrosequencing data from 16S rRNA genes. In addition, GeoChip and sequencing technologies as well as network analysis approaches have been used to analyze microbial communities from different habitats. Those studies provide a comprehensive understanding of gene function, regulation, network, and evolution in D. vulgaris, and microbial community diversity, composition and structure as well as their linkages with environmental factors and ecosystem functioning, which has resulted in more than 60 publications.

  13. Challenges and opportunities in understanding microbial communities with metagenome assembly (accompanied by IPython Notebook tutorial)

    DOE PAGES

    Howe, Adina; Chain, Patrick S. G.

    2015-07-09

    Metagenomic investigations hold great promise for informing the genetics, physiology, and ecology of environmental microorganisms. Current challenges for metagenomic analysis are related to our ability to connect the dots between sequencing reads, their population of origin, and their encoding functions. Assembly-based methods reduce dataset size by extending overlapping reads into larger contiguous sequences (contigs), providing contextual information for genetic sequences that does not rely on existing references. These methods, however, tend to be computationally intensive and are again challenged by sequencing errors as well as by genomic repeats. While numerous tools have been developed based on these methodological concepts, they present confounding choices and training requirements to metagenomic investigators. To help with accessibility to assembly tools, this review also includes an IPython Notebook metagenomic assembly tutorial. This tutorial has instructions for execution on any operating system using Amazon Elastic Cloud Compute and guides users through downloading, assembly, and mapping reads to contigs of a mock microbiome metagenome. Despite its challenges, metagenomic analysis has already revealed novel insights into many environments on Earth. As software, training, and data continue to emerge, metagenomic data access and its discoveries will continue to grow.
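
    The notion of extending overlapping reads into contigs can be illustrated with a toy greedy overlap merger. This is a teaching sketch under simplifying assumptions (error-free reads, a single strand, exact suffix-prefix overlaps), not the approach of any particular assembler or of the tutorial mentioned above; the function names and reads are invented for the example.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b` (at least min_len), else 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap until none remains."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no remaining overlaps of at least min_len
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up reads that tile a 12 bp sequence with 3 bp overlaps.
reads = ["ATGGCC", "GCCTTA", "TTAGGA"]
contigs = greedy_assemble(reads, min_len=3)
```

    The all-against-all overlap search is quadratic in the number of reads, and a repeated 3-mer could trigger a wrong merge. These are small-scale versions of the computational cost and the repeat problem the abstract mentions.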

  14. Challenges and opportunities in understanding microbial communities with metagenome assembly (accompanied by IPython Notebook tutorial)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Howe, Adina; Chain, Patrick S. G.

    Metagenomic investigations hold great promise for informing the genetics, physiology, and ecology of environmental microorganisms. Current challenges for metagenomic analysis are related to our ability to connect the dots between sequencing reads, their population of origin, and their encoding functions. Assembly-based methods reduce dataset size by extending overlapping reads into larger contiguous sequences (contigs), providing contextual information for genetic sequences that does not rely on existing references. These methods, however, tend to be computationally intensive and are again challenged by sequencing errors as well as by genomic repeats. While numerous tools have been developed based on these methodological concepts, they present confounding choices and training requirements to metagenomic investigators. To help with accessibility to assembly tools, this review also includes an IPython Notebook metagenomic assembly tutorial. This tutorial has instructions for execution on any operating system using Amazon Elastic Cloud Compute and guides users through downloading, assembly, and mapping reads to contigs of a mock microbiome metagenome. Despite its challenges, metagenomic analysis has already revealed novel insights into many environments on Earth. As software, training, and data continue to emerge, metagenomic data access and its discoveries will continue to grow.

  15. MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes

    PubMed Central

    Verneau, Jonathan; Levasseur, Anthony; Raoult, Didier; La Scola, Bernard; Colson, Philippe

    2016-01-01

    The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such big data is difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant amount of non-annotated sequences, representing a ‘dark matter.’ We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set and applied this tool to the detection of sequences matching large and giant DNA viral members of the proposed order Megavirales or virophages. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and 5 virophages, were collected. The pipeline was generated by scripts written in Python language and entitled MG-Digger. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate 100s of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome. The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about the presence and prevalence of giant viruses in the environment and the human body. PMID:27065984

  16. Improving microbial fitness in the mammalian gut by in vivo temporal functional metagenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yaung, Stephanie J.; Deng, Luxue; Li, Ning

    Elucidating functions of commensal microbial genes in the mammalian gut is challenging because many commensals are recalcitrant to laboratory cultivation and genetic manipulation. We present Temporal FUnctional Metagenomics sequencing (TFUMseq), a platform to functionally mine bacterial genomes for genes that contribute to fitness of commensal bacteria in vivo. Our approach uses metagenomic DNA to construct large-scale heterologous expression libraries that are tracked over time in vivo by deep sequencing and computational methods. To demonstrate our approach, we built a TFUMseq plasmid library using the gut commensal Bacteroides thetaiotaomicron (Bt) and introduced Escherichia coli carrying this library into germfree mice. Population dynamics of library clones revealed Bt genes conferring significant fitness advantages in E. coli over time, including carbohydrate utilization genes, with a Bt galactokinase central to early colonization, and subsequent dominance by a Bt glycoside hydrolase enabling sucrose metabolism coupled with co-evolution of the plasmid library and E. coli genome driving increased galactose utilization. Here, our findings highlight the utility of functional metagenomics for engineering commensal bacteria with improved properties, including expanded colonization capabilities in vivo.

  17. Improving microbial fitness in the mammalian gut by in vivo temporal functional metagenomics

    DOE PAGES

    Yaung, Stephanie J.; Deng, Luxue; Li, Ning; ...

    2015-03-11

    Elucidating functions of commensal microbial genes in the mammalian gut is challenging because many commensals are recalcitrant to laboratory cultivation and genetic manipulation. We present Temporal FUnctional Metagenomics sequencing (TFUMseq), a platform to functionally mine bacterial genomes for genes that contribute to fitness of commensal bacteria in vivo. Our approach uses metagenomic DNA to construct large-scale heterologous expression libraries that are tracked over time in vivo by deep sequencing and computational methods. To demonstrate our approach, we built a TFUMseq plasmid library using the gut commensal Bacteroides thetaiotaomicron (Bt) and introduced Escherichia coli carrying this library into germfree mice. Population dynamics of library clones revealed Bt genes conferring significant fitness advantages in E. coli over time, including carbohydrate utilization genes, with a Bt galactokinase central to early colonization, and subsequent dominance by a Bt glycoside hydrolase enabling sucrose metabolism coupled with co-evolution of the plasmid library and E. coli genome driving increased galactose utilization. Here, our findings highlight the utility of functional metagenomics for engineering commensal bacteria with improved properties, including expanded colonization capabilities in vivo.

  18. Viruses as Winners in the Game of Life.

    PubMed

    Cobián Güemes, Ana Georgina; Youle, Merry; Cantú, Vito Adrian; Felts, Ben; Nulton, James; Rohwer, Forest

    2016-09-29

    Viruses are the most abundant and the most diverse life form. In this meta-analysis we estimate that there are 4.80×10^31 phages on Earth. Further, 97% of viruses are in soil and sediment, two underinvestigated biomes that combined account for only ∼2.5% of publicly available viral metagenomes. The majority of the most abundant viral sequences from all biomes are novel. Our analysis drawing on all publicly available viral metagenomes observed a mere 257,698 viral genotypes on Earth (an unrealistically low number), which attests to the current paucity of viral metagenomic data. Further advances in viral ecology and diversity call for a shift of attention to previously ignored major biomes and careful application of verified methods for viral metagenomic analysis.

  19. Heterologous viral expression systems in fosmid vectors increase the functional analysis potential of metagenomic libraries.

    PubMed

    Terrón-González, L; Medina, C; Limón-Mortés, M C; Santero, E

    2013-01-01

    The extraordinary potential of metagenomic functional analyses to identify activities of interest present in uncultured microorganisms has been limited by reduced gene expression in surrogate hosts. We have developed vectors and specialized E. coli strains as improved metagenomic DNA heterologous expression systems, taking advantage of viral components that prevent transcription termination at metagenomic terminators. One of the systems uses the phage T7 RNA-polymerase to drive metagenomic gene expression, while the other approach uses the lambda phage transcription anti-termination protein N to limit transcription termination. A metagenomic library was constructed and functionally screened to identify genes conferring carbenicillin resistance to E. coli. The use of these enhanced expression systems resulted in a 6-fold increase in the frequency of carbenicillin resistant clones. Subcloning and sequence analysis showed that, besides β-lactamases, efflux pumps are not only able to contribute to carbenicillin resistance but may in fact be sufficient by themselves to convey carbenicillin resistance.

  20. Longitudinal Metagenomic Analysis of Hospital Air Identifies Clinically Relevant Microbes.

    PubMed

    King, Paula; Pham, Long K; Waltz, Shannon; Sphar, Dan; Yamamoto, Robert T; Conrad, Douglas; Taplitz, Randy; Torriani, Francesca; Forsyth, R Allyn

    2016-01-01

    We describe the sampling of sixty-three uncultured hospital air samples collected over a six-month period and analysis using shotgun metagenomic sequencing. Our primary goals were to determine the longitudinal metagenomic variability of this environment, identify and characterize genomes of potential pathogens and determine whether they are atypical to the hospital airborne metagenome. Air samples were collected from eight locations which included patient wards, the main lobby and outside. The resulting DNA libraries produced 972 million sequences representing 51 gigabases. Hierarchical clustering of samples by the most abundant 50 microbial orders generated three major nodes which primarily clustered by type of location. Because the indoor locations were longitudinally consistent, episodic relative increases in microbial genomic signatures related to the opportunistic pathogens Aspergillus, Penicillium and Stenotrophomonas were identified as outliers at specific locations. Further analysis of microbial reads specific for Stenotrophomonas maltophilia indicated homology to a sequenced multi-drug resistant clinical strain and we observed broad sequence coverage of resistance genes. We demonstrate that a shotgun metagenomic sequencing approach can be used to characterize the resistance determinants of pathogen genomes that are uncharacteristic for an otherwise consistent hospital air microbial metagenomic profile.

  1. MetaStorm: A Public Resource for Customizable Metagenomics Annotation

    PubMed Central

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S.; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution. PMID:27632579

  2. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    PubMed

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  3. Novel Metagenome-Derived, Cold-Adapted Alkaline Phospholipase with Superior Lipase Activity as an Intermediate between Phospholipase and Lipase

    PubMed Central

    Lee, Mi-Hwa; Oh, Ki-Hoon; Kang, Chul-Hyung; Kim, Ji-Hoon; Oh, Tae-Kwang; Ryu, Choong-Min

    2012-01-01

    A novel lipolytic enzyme was isolated from a metagenomic library obtained from tidal flat sediments on the Korean west coast. Its putative functional domain, designated MPlaG, showed the highest similarity to phospholipase A from Grimontia hollisae CIP 101886, though it was screened from an emulsified tricaprylin plate. Phylogenetic analysis showed that MPlaG is far from family I.6 lipases, including Staphylococcus hyicus lipase, a unique lipase which can hydrolyze phospholipids, and is more evolutionarily related to the bacterial phospholipase A1 family. The specific activities of MPlaG against olive oil and phosphatidylcholine were determined to be 2,957 ± 144 and 1,735 ± 147 U mg−1, respectively, which means that MPlaG is a lipid-preferred phospholipase. Among different synthetic esters, triglycerides, and phosphatidylcholine, purified MPlaG exhibited the highest activity toward p-nitrophenyl palmitate (C16), tributyrin (C4), and 1,2-dihexanoyl-phosphatidylcholine (C8). Finally, MPlaG was identified as a phospholipase A1 with lipase activity by cleavage of the sn-1 position of OPPC, interfacial activity, and triolein hydrolysis. These findings suggest that MPlaG is the first experimentally characterized phospholipase A1 with lipase activity obtained from a metagenomic library. Our study provides an opportunity to improve our insight into the evolution of lipases and phospholipases. PMID:22544255

  4. The Amordad database engine for metagenomics.

    PubMed

    Behnam, Ehsan; Smith, Andrew D

    2014-10-15

    Several technical challenges in metagenomic data analysis, including assembling metagenomic sequence data or identifying operational taxonomic units, are both significant and well known. These forms of analysis are increasingly cited as conceptually flawed, given the extreme variation within traditionally defined species and rampant horizontal gene transfer. Furthermore, computational requirements of such analysis have hindered content-based organization of metagenomic data at large scale. In this article, we introduce the Amordad database engine for alignment-free, content-based indexing of metagenomic datasets. Amordad places the metagenome comparison problem in a geometric context, and uses an indexing strategy that combines random hashing with a regular nearest neighbor graph. This framework allows refinement of the database over time by continual application of random hash functions, with the effect of each hash function encoded in the nearest neighbor graph. This eliminates the need to explicitly maintain the hash functions in order for query efficiency to benefit from the accumulated randomness. Results on real and simulated data show that Amordad can support logarithmic query time for identifying similar metagenomes even as the database size reaches into the millions. Source code, licensed under the GNU General Public License (version 3), is freely available for download from http://smithlabresearch.org/amordad. Contact: andrewds@usc.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
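
    The abstract above describes alignment-free comparison in a geometric setting combined with random hashing. The following is a minimal sketch of that general idea only, not Amordad's implementation: it assumes each metagenome is summarized as a k-mer frequency vector (an assumption, since the abstract does not name the representation) and uses plain random-hyperplane hashing. The nearest neighbor graph is not modeled, and all function names are hypothetical.

```python
import itertools
import random


def kmer_profile(seq, k=3):
    """Normalized k-mer frequency vector over the ACGT alphabet (fixed order)."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # k-mers with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1
    total = sum(counts) or 1  # avoid division by zero for very short input
    return [c / total for c in counts]


def random_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def hash_signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def hamming(a, b):
    """Number of differing bits; small values suggest similar profiles."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # Hypothetical toy sequences standing in for whole metagenomes.
    planes = random_hyperplanes(dim=4 ** 3, n_bits=64, seed=42)
    sig_a = hash_signature(kmer_profile("ACGTACGTACGTTTGACA"), planes)
    sig_b = hash_signature(kmer_profile("ACGTACGTACGTTTGACC"), planes)
    print("Hamming distance:", hamming(sig_a, sig_b))
```

    Signatures that differ in few bits point to profiles with a small angle between them, so candidate neighbors can be shortlisted without any alignment.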

  5. 16S rRNA Gene-Based Metagenomic Analysis of Ozark Cave Bacteria

    PubMed Central

    Oliveira, Cássia; Gunderman, Lauren; Coles, Cathryn A.; Lochmann, Jason; Parks, Megan; Ballard, Ethan; Glazko, Galina; Rahmatallah, Yasir; Tackett, Alan J.; Thomas, David J.

    2018-01-01

    The microbial diversity within cave ecosystems is largely unknown. Ozark caves maintain a year-round stable temperature (12–14 °C), but most parts of the caves experience complete darkness. The lack of sunlight and geological isolation from surface-energy inputs generate nutrient-poor conditions that may limit species diversity in such environments. Although microorganisms play a crucial role in sustaining life on Earth and impacting human health, little is known about their diversity, ecology, and evolution in community structures. We used five Ozark region caves as test sites for exploring bacterial diversity and monitoring long-term biodiversity. Illumina MiSeq sequencing of five cave soil samples and a control sample revealed a total of 49 bacterial phyla, with seven major phyla: Proteobacteria, Acidobacteria, Actinobacteria, Firmicutes, Chloroflexi, Bacteroidetes, and Nitrospirae. Variation in bacterial composition was observed among the five caves studied. Sandtown Cave had the lowest richness and most divergent community composition. 16S rRNA gene-based metagenomic analysis of cave-dwelling microbial communities in the Ozark caves revealed that species abundance and diversity are vast and included ecologically, agriculturally, and economically relevant taxa. PMID:29551950

  6. Toward Accurate and Quantitative Comparative Metagenomics

    PubMed Central

    Nayfach, Stephen; Pollard, Katherine S.

    2016-01-01

    Shotgun metagenomics and computational analysis are used to compare the taxonomic and functional profiles of microbial communities. Leveraging this approach to understand roles of microbes in human biology and other environments requires quantitative data summaries whose values are comparable across samples and studies. Comparability is currently hampered by the use of abundance statistics that do not estimate a meaningful parameter of the microbial community and biases introduced by experimental protocols and data-cleaning approaches. Addressing these challenges, along with improving study design, data access, metadata standardization, and analysis tools, will enable accurate comparative metagenomics. We envision a future in which microbiome studies are replicable and new metagenomes are easily and rapidly integrated with existing data. Only then can the potential of metagenomics for predictive ecological modeling, well-powered association studies, and effective microbiome medicine be fully realized. PMID:27565341

  7. Toward Accurate and Quantitative Comparative Metagenomics.

    PubMed

    Nayfach, Stephen; Pollard, Katherine S

    2016-08-25

    Shotgun metagenomics and computational analysis are used to compare the taxonomic and functional profiles of microbial communities. Leveraging this approach to understand roles of microbes in human biology and other environments requires quantitative data summaries whose values are comparable across samples and studies. Comparability is currently hampered by the use of abundance statistics that do not estimate a meaningful parameter of the microbial community and biases introduced by experimental protocols and data-cleaning approaches. Addressing these challenges, along with improving study design, data access, metadata standardization, and analysis tools, will enable accurate comparative metagenomics. We envision a future in which microbiome studies are replicable and new metagenomes are easily and rapidly integrated with existing data. Only then can the potential of metagenomics for predictive ecological modeling, well-powered association studies, and effective microbiome medicine be fully realized. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Bioactive compounds synthesized by non-ribosomal peptide synthetases and type-I polyketide synthases discovered through genome-mining and metagenomics.

    PubMed

    Nikolouli, Katerina; Mossialos, Dimitris

    2012-08-01

    Non-ribosomal peptide synthetases (NRPS) and type-I polyketide synthases (PKS-I) are multimodular enzymes involved in biosynthesis of oligopeptide and polyketide secondary metabolites produced by microorganisms such as bacteria and fungi. New findings regarding the mechanisms underlying NRPS and PKS-I evolution illustrate how microorganisms expand their metabolic potential. During the last decade rapid development of bioinformatics tools as well as improved sequencing and annotation of microbial genomes led to discovery of novel bioactive compounds synthesized by NRPS and PKS-I through genome-mining. Taking advantage of these technological developments metagenomics is a fast growing research field which directly studies microbial genomes or specific gene groups and their products. Discovery of novel bioactive compounds synthesized by NRPS and PKS-I will certainly be accelerated through metagenomics, allowing the exploitation of so far untapped microbial resources in biotechnology and medicine.

  9. Merging metagenomics and geochemistry reveals environmental controls on biological diversity and evolution.

    PubMed

    Alsop, Eric B; Boyd, Eric S; Raymond, Jason

    2014-05-28

    The metabolic strategies employed by microbes inhabiting natural systems are, in large part, dictated by the physical and geochemical properties of the environment. This study sheds light onto the complex relationship between biology and environmental geochemistry using forty-three metagenomes collected from geochemically diverse and globally distributed natural systems. It is widely hypothesized that many uncommonly measured geochemical parameters affect community dynamics and this study leverages the development and application of multidimensional biogeochemical metrics to study correlations between geochemistry and microbial ecology. Analysis techniques such as a Markov cluster-based measure of the evolutionary distance between whole communities and a principal component analysis (PCA) of the geochemical gradients between environments allow for the determination of correlations between microbial community dynamics and environmental geochemistry and provide insight into which geochemical parameters most strongly influence microbial biodiversity. By progressively building from samples taken along well defined geochemical gradients to samples widely dispersed in geochemical space this study reveals strong links between the extent of taxonomic and functional diversification of resident communities and environmental geochemistry and reveals temperature and pH as the primary factors that have shaped the evolution of these communities. Moreover, the inclusion of extensive geochemical data into analyses reveals new links between geochemical parameters (e.g. oxygen and trace element availability) and the distribution and taxonomic diversification of communities at the functional level. Further, an overall geochemical gradient (from multivariate analyses) between natural systems provides one of the most complete predictions of microbial taxonomic and functional composition. Clustering based on the frequency with which orthologous proteins occur among metagenomes facilitated accurate prediction of the ordering of community functional composition along geochemical gradients, despite a lack of geochemical input. The consistency in the results obtained from the application of Markov clustering and multivariate methods to distinct natural systems underscores their utility in predicting the functional potential of microbial communities within a natural system based on system geochemistry alone, allowing geochemical measurements to be used to predict purely biological metrics such as microbial community composition and metabolism.
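
    The abstract above relies on a principal component analysis of geochemical measurements. As a rough illustration of that step alone, the sketch below standardizes a small table of site measurements and extracts the first principal component by power iteration. The sites and values are hypothetical, the function names are invented for this example, and the study's actual pipeline (including the Markov clustering) is not reproduced.

```python
import math


def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    sds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / sds[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [
        [sum(r[a] * r[b] for r in rows) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


def first_component(cov, iters=500):
    """Leading eigenvector and eigenvalue of cov via power iteration."""
    d = len(cov)
    # Asymmetric start vector, so it is unlikely to be orthogonal to the answer.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigval = sum(
        v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)
    )
    return v, eigval


if __name__ == "__main__":
    # Hypothetical sites: temperature (deg C), pH, dissolved oxygen (mg/L).
    sites = [[85.0, 3.1, 0.4], [72.0, 6.5, 1.2], [15.0, 7.8, 8.9], [4.0, 8.1, 11.0]]
    cov = covariance(standardize(sites))
    pc1, eigval = first_component(cov)
    trace = sum(cov[i][i] for i in range(len(cov)))
    print("PC1 loadings:", pc1)
    print("Variance explained by PC1:", eigval / trace)
```

    The loadings indicate which measured parameters dominate the main gradient, which is the kind of question the abstract answers with temperature and pH.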

  10. Metagenomic analysis of viral diversity in respiratory samples from patients with respiratory tract infections in Kuwait.

    PubMed

    Madi, Nada; Al-Nakib, Widad; Mustafa, Abu Salim; Habibi, Nazima

    2018-03-01

    A metagenomic approach based on target independent next-generation sequencing has become a known method for the detection of both known and novel viruses in clinical samples. This study aimed to use the metagenomic sequencing approach to characterize the viral diversity in respiratory samples from patients with respiratory tract infections. We have investigated 86 respiratory samples received from various hospitals in Kuwait between 2015 and 2016 for the diagnosis of respiratory tract infections. A metagenomic approach using the next-generation sequencer to characterize viruses was used. According to the metagenomic analysis, an average of 145,019 reads were identified, and 2% of these reads were of viral origin. Also, metagenomic analysis of the viral sequences revealed many known respiratory viruses, which were detected in 30.2% of the clinical samples. Also, sequences of non-respiratory viruses were detected in 14% of the clinical samples, while sequences of non-human viruses were detected in 55.8% of the clinical samples. The average genome coverage of the viruses was 12% with the highest genome coverage of 99.2% for respiratory syncytial virus, and the lowest was 1% for torque teno midi virus 2. Our results showed 47.7% agreement between multiplex Real-Time PCR and metagenomics sequencing in the detection of respiratory viruses in the clinical samples. Though there are some difficulties in applying this method to clinical samples such as specimen quality, these observations are indicative of the promising utility of the metagenomic sequencing approach for the identification of respiratory viruses in patients with respiratory tract infections. © 2017 Wiley Periodicals, Inc.

  11. Environmental Metagenomics: The Data Assembly and Data Analysis Perspectives

    NASA Astrophysics Data System (ADS)

    Kumar, Vinay; Maitra, S. S.; Shukla, Rohit Nandan

    2015-03-01

    Novel gene finding is one of the emerging fields in environmental research. In the past decades, research focused mainly on the discovery of microorganisms capable of degrading a particular compound. Many methods are available in the literature for the cultivation and screening of these novel microorganisms. All of these methods are efficient for screening microbes that can be cultivated in the laboratory. Microorganisms that live in extreme conditions such as hot springs, frozen glaciers, and acid mine drainage cannot be cultivated in the laboratory because of incomplete knowledge about their growth requirements, such as temperature, nutrients, and their mutual dependence on each other. The microbes that can be cultivated correspond to less than 1% of the total microbes present on Earth. The remaining 99%, the uncultivated majority, remain inaccessible. Metagenomics transcends the culture requirements of microbes. In metagenomics, DNA is extracted directly from environmental samples such as soil, seawater, and acid mine drainage, followed by construction and screening of a metagenomic library. With ongoing research, a huge amount of metagenomic data is accumulating. Understanding these data is an essential step in extracting novel genes of industrial importance. Various bioinformatics tools have been designed to analyze and annotate the data produced from the metagenome. The bioinformatic requirements of metagenomic data analysis differ in theory and practice. This paper reviews the tools that are available for metagenomic data analysis and the capabilities of such tools: what they can do and their web availability.

  12. Metagenomic Assembly: Overview, Challenges and Applications

    PubMed Central

    Ghurye, Jay S.; Cepeda-Espinoza, Victoria; Pop, Mihai

    2016-01-01

    Advances in sequencing technologies have led to the increased use of high throughput sequencing in characterizing the microbial communities associated with our bodies and our environment. Critical to the analysis of the resulting data are sequence assembly algorithms able to reconstruct genes and organisms from complex mixtures. Metagenomic assembly involves new computational challenges due to the specific characteristics of the metagenomic data. In this survey, we focus on major algorithmic approaches for genome and metagenome assembly, and discuss the new challenges and opportunities afforded by this new field. We also review several applications of metagenome assembly in addressing interesting biological problems. PMID:27698619

  13. IMG/M-HMP: a metagenome comparative analysis system for the Human Microbiome Project.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Chu, Ken; Szeto, Ernest; Palaniappan, Krishna; Jacob, Biju; Ratner, Anna; Liolios, Konstantinos; Pagani, Ioanna; Huntemann, Marcel; Mavromatis, Konstantinos; Ivanova, Natalia N; Kyrpides, Nikos C

    2012-01-01

    The Integrated Microbial Genomes and Metagenomes (IMG/M) resource is a data management system that supports the analysis of sequence data from microbial communities in the integrated context of all publicly available draft and complete genomes from the three domains of life as well as a large number of plasmids and viruses. IMG/M currently contains thousands of genomes and metagenome samples with billions of genes. IMG/M-HMP is an IMG/M data mart serving the US National Institutes of Health (NIH) Human Microbiome Project (HMP), focussed on HMP generated metagenome datasets, and is one of the central resources provided from the HMP Data Analysis and Coordination Center (DACC). IMG/M-HMP is available at http://www.hmpdacc-resources.org/imgm_hmp/.

  14. Evolutionary, ecological and biotechnological perspectives on plasmids resident in the human gut mobile metagenome

    PubMed Central

    Ogilvie, Lesley A.; Firouzmand, Sepinoud; Jones, Brian V.

    2012-01-01

    Numerous mobile genetic elements (MGE) are associated with the human gut microbiota and collectively referred to as the gut mobile metagenome. The role of this flexible gene pool in development and functioning of the gut microbial community remains largely unexplored, yet recent evidence suggests that at least some MGE comprising this fraction of the gut microbiome reflect the co-evolution of host and microbe in the gastro-intestinal tract. In conjunction, the high level of novel gene content typical of MGE coupled with their predicted high diversity, suggests that the mobile metagenome constitutes an immense and largely unexplored gene-space likely to encode many novel activities with potential biotechnological or pharmaceutical value, as well as being important to the development and functioning of the gut microbiota. Of the various types of MGE that comprise the gut mobile metagenome, plasmids are of particular importance since these elements are often capable of autonomous transfer between disparate bacterial species, and are known to encode accessory functions that increase bacterial fitness in a given environment facilitating bacterial adaptation. In this article current knowledge regarding plasmids resident in the human gut mobile metagenome is reviewed, and available strategies to access and characterize this portion of the gut microbiome are described. The relative merits of these methods and their present as well as prospective impact on our understanding of the human gut microbiota is discussed. PMID:22126801

  15. Analysis of Metagenomic Sequences: From Megabases to Terabases

    ScienceCinema

    Kyrpides, Nikos

    2018-05-04

    Nikos Kyrpides of the DOE Joint Genome Institute discusses metagenomics and the challenge of dealing with terabases of data on June 4, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  16. Direct Detection and Identification of Prosthetic Joint Infection Pathogens in Synovial Fluid by Metagenomic Shotgun Sequencing.

    PubMed

    Ivy, Morgan I; Thoendel, Matthew J; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Hanssen, Arlen D; Abdel, Matthew P; Chia, Nicholas; Yao, Janet Z; Tande, Aaron J; Mandrekar, Jayawant N; Patel, Robin

    2018-05-30

    Background: Metagenomic shotgun sequencing has the potential to transform how serious infections are diagnosed by offering universal, culture-free pathogen detection. This may be especially advantageous for microbial diagnosis of prosthetic joint infection (PJI) by synovial fluid analysis, since synovial fluid cultures are not universally positive, and synovial fluid is easily obtained pre-operatively. We applied a metagenomics-based approach to synovial fluid in an attempt to detect microorganisms in 168 failed total knee arthroplasties. Results: Genus- and species-level analysis of metagenomic sequencing yielded the known pathogen in 74 (90%) and 68 (83%) of the 82 culture-positive PJIs analyzed, respectively, with testing of two (2%) and three (4%) samples, respectively, yielding additional pathogens not detected by culture. For the 25 culture-negative PJIs tested, genus- and species-level analysis yielded 19 (76%) and 21 (84%) samples with insignificant findings, respectively, and 6 (24%) and 4 (16%) with potential pathogens detected, respectively. Genus- and species-level analysis of the 60 culture-negative aseptic failure cases yielded 53 (88.3%) and 56 (93.3%) cases with insignificant findings, and 7 (11.7%) and 4 (6.7%) with potential clinically-significant organisms detected, respectively. There was one case of aseptic failure with synovial fluid culture growth; metagenomic analysis showed insignificant findings, suggesting possible synovial fluid culture contamination. Conclusion: Metagenomic shotgun sequencing can detect pathogens involved in PJI when applied to synovial fluid and may be particularly useful for culture-negative cases. Copyright © 2018 American Society for Microbiology.

  17. Experimental Design and Bioinformatics Analysis for the Application of Metagenomics in Environmental Sciences and Biotechnology.

    PubMed

    Ju, Feng; Zhang, Tong

    2015-11-03

    Recent advances in DNA sequencing technologies have prompted the widespread application of metagenomics for the investigation of novel bioresources (e.g., industrial enzymes and bioactive molecules) and unknown biohazards (e.g., pathogens and antibiotic resistance genes) in natural and engineered microbial systems across multiple disciplines. This review discusses the rigorous experimental design and sample preparation in the context of applying metagenomics in environmental sciences and biotechnology. Moreover, this review summarizes the principles, methodologies, and state-of-the-art bioinformatics procedures, tools and database resources for metagenomics applications and discusses two popular strategies (analysis of unassembled reads versus assembled contigs/draft genomes) for quantitative or qualitative insights of microbial community structure and functions. Overall, this review aims to facilitate more extensive application of metagenomics in the investigation of uncultured microorganisms, novel enzymes, microbe-environment interactions, and biohazards in biotechnological applications where microbial communities are engineered for bioenergy production, wastewater treatment, and bioremediation.

  18. An application of statistics to comparative metagenomics

    PubMed Central

    Rodriguez-Brito, Beltran; Rohwer, Forest; Edwards, Robert A

    2006-01-01

    Background Metagenomics, sequence analyses of genomic DNA isolated directly from the environments, can be used to identify organisms and model community dynamics of a particular ecosystem. Metagenomics also has the potential to identify significantly different metabolic potential in different environments. Results Here we use a statistical method to compare curated subsystems, to predict the physiology, metabolism, and ecology from metagenomes. This approach can be used to identify those subsystems that are significantly different between metagenome sequences. Subsystems that were overrepresented in the Sargasso Sea and Acid Mine Drainage metagenome when compared to non-redundant databases were identified. Conclusion The methodology described herein applies statistics to the comparisons of metabolic potential in metagenomes. This analysis reveals those subsystems that are more, or less, represented in the different environments that are compared. These differences in metabolic potential lead to several testable hypotheses about physiology and metabolism of microbes from these ecosystems. PMID:16549025
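
    The abstract above compares how strongly a subsystem is represented in different metagenomes but does not spell out the statistic. As a hedged illustration of the general idea, the sketch below runs a simple permutation test on the difference in the proportion of reads assigned to one subsystem in two samples. This is a generic stand-in, not the authors' published procedure, and the counts and function name are hypothetical.

```python
import random


def permutation_p_value(hits_a, n_a, hits_b, n_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in hit proportions.

    hits_a of n_a reads in sample A, and hits_b of n_b reads in sample B,
    were assigned to the subsystem of interest.
    """
    rng = random.Random(seed)
    observed = abs(hits_a / n_a - hits_b / n_b)
    total_hits = hits_a + hits_b
    # Pool all reads: 1 marks a read assigned to the subsystem, 0 any other read.
    pool = [1] * total_hits + [0] * (n_a + n_b - total_hits)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pool)
        perm_a = sum(pool[:n_a])  # hits landing in a random "sample A"
        perm_b = total_hits - perm_a
        if abs(perm_a / n_a - perm_b / n_b) >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical counts: reads assigned to one subsystem in two metagenomes.
    p = permutation_p_value(hits_a=40, n_a=200, hits_b=15, n_b=200)
    print("Permutation p-value:", p)
```

    In practice the test would be repeated for every subsystem, with a multiple-testing correction applied across them.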

  19. An application of statistics to comparative metagenomics.

    PubMed

    Rodriguez-Brito, Beltran; Rohwer, Forest; Edwards, Robert A

    2006-03-20

    Metagenomics, sequence analyses of genomic DNA isolated directly from the environments, can be used to identify organisms and model community dynamics of a particular ecosystem. Metagenomics also has the potential to identify significantly different metabolic potential in different environments. Here we use a statistical method to compare curated subsystems, to predict the physiology, metabolism, and ecology from metagenomes. This approach can be used to identify those subsystems that are significantly different between metagenome sequences. Subsystems that were overrepresented in the Sargasso Sea and Acid Mine Drainage metagenome when compared to non-redundant databases were identified. The methodology described herein applies statistics to the comparisons of metabolic potential in metagenomes. This analysis reveals those subsystems that are more, or less, represented in the different environments that are compared. These differences in metabolic potential lead to several testable hypotheses about physiology and metabolism of microbes from these ecosystems.

  20. Simultaneous virus identification and characterization of severe unexplained pneumonia cases using a metagenomics sequencing technique.

    PubMed

    Zou, Xiaohui; Tang, Guangpeng; Zhao, Xiang; Huang, Yan; Chen, Tao; Lei, Mingyu; Chen, Wenbing; Yang, Lei; Zhu, Wenfei; Zhuang, Li; Yang, Jing; Feng, Zhaomin; Wang, Dayan; Wang, Dingming; Shu, Yuelong

    2017-03-01

    Many viruses can cause respiratory diseases in humans. Although great advances have been achieved in methods of diagnosis, it remains challenging to identify pathogens in unexplained pneumonia (UP) cases. In this study, we applied next-generation sequencing (NGS) technology and a metagenomic approach to detect and characterize respiratory viruses in UP cases from Guizhou Province, China. A total of 33 oropharyngeal swabs were obtained from hospitalized UP patients and subjected to NGS. An unbiased metagenomic analysis pipeline identified 13 virus species in 16 samples. Human rhinovirus C was the virus most frequently detected and was identified in seven samples. Human measles virus, adenovirus B 55 and coxsackievirus A10 were also identified. Metagenomic sequencing also provided virus genomic sequences, which enabled genotype characterization and phylogenetic analysis. For cases of multiple infection, metagenomic sequencing afforded information regarding the quantity of each virus in the sample, which could be used to evaluate each virus's role in the disease. Our study highlights the potential of metagenomic sequencing for pathogen identification in UP cases.

  1. Comparative fecal metagenomics unveils unique functional capacity of the swine gut

    PubMed Central

    2011-01-01

    Background Uncovering the taxonomic composition and functional capacity within the swine gut microbial consortia is of great importance to animal physiology and health as well as to food and water safety due to the presence of human pathogens in pig feces. Nonetheless, limited information on the functional diversity of the swine gut microbiome is available. Results Analysis of 637,722 pyrosequencing reads (130 megabases) generated from Yorkshire pig fecal DNA extracts was performed to help better understand the microbial diversity and largely unknown functional capacity of the swine gut microbiome. Swine fecal metagenomic sequences were annotated using both MG-RAST and JGI IMG/M-ER pipelines. Taxonomic analysis of metagenomic reads indicated that swine fecal microbiomes were dominated by Firmicutes and Bacteroidetes phyla. At a finer phylogenetic resolution, Prevotella spp. dominated the swine fecal metagenome, while some genes associated with Treponema and Anaerovibrio species were found to be exclusively within the pig fecal metagenomic sequences analyzed. Functional analysis revealed that carbohydrate metabolism was the most abundant SEED subsystem, representing 13% of the swine metagenome. Genes associated with stress, virulence, cell wall and cell capsule were also abundant. Virulence factors associated with antibiotic resistance genes with highest sequence homology to genes in Bacteroidetes, Clostridia, and Methanosarcina were numerous within the gene families unique to the swine fecal metagenomes. Other abundant proteins unique to the distal swine gut shared high sequence homology to putative carbohydrate membrane transporters. Conclusions The results from this metagenomic survey demonstrated the presence of genes associated with resistance to antibiotics and carbohydrate metabolism suggesting that the swine gut microbiome may be shaped by husbandry practices. PMID:21575148

  2. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas

    The number of genomes from uncultivated microbes will soon surpass the number of isolate genomes in public databases (Hugenholtz, Skarshewski, & Parks, 2016). Technological advancements in high-throughput sequencing and assembly, including single-cell genomics and the computational extraction of genomes from metagenomes (GFMs), are largely responsible. Here we propose community standards for reporting the Minimum Information about a Single-Cell Genome (MIxS-SCG) and Minimum Information about Genomes extracted From Metagenomes (MIxS-GFM) specific for Bacteria and Archaea. The standards have been developed in the context of the International Genomics Standards Consortium (GSC) community (Field et al., 2014) and can be viewed as a supplement to other GSC checklists including the Minimum Information about a Genome Sequence (MIGS), Minimum information about a Metagenomic Sequence(s) (MIMS) (Field et al., 2008) and Minimum Information about a Marker Gene Sequence (MIMARKS) (P. Yilmaz et al., 2011). Community-wide acceptance of MIxS-SCG and MIxS-GFM for Bacteria and Archaea will enable broad comparative analyses of genomes from the majority of taxa that remain uncultivated, improving our understanding of microbial function, ecology, and evolution.

  3. Genomic and metagenomic challenges and opportunities for bioleaching: a mini-review.

    PubMed

    Cárdenas, Juan Pablo; Quatrini, Raquel; Holmes, David S

    2016-09-01

    High-throughput genomic technologies are accelerating progress in understanding the diversity of microbial life in many environments. Here we highlight advances in genomics and metagenomics of microorganisms from bioleaching heaps and related acidic mining environments. Bioleaching heaps used for copper recovery provide significant opportunities to study the processes and mechanisms underlying microbial successions and the influence of community composition on ecosystem functioning. Obtaining quantitative and process-level knowledge of these dynamics is pivotal for understanding how microorganisms contribute to the solubilization of copper for industrial recovery. Advances in DNA sequencing technology provide unprecedented opportunities to obtain information about the genomes of bioleaching microorganisms, allowing predictive models of metabolic potential and ecosystem-level interactions to be constructed. These approaches are enabling predictive phenotyping of organisms many of which are recalcitrant to genetic approaches or are unculturable. This mini-review describes current bioleaching genomic and metagenomic projects and addresses the use of genome information to: (i) build metabolic models; (ii) predict microbial interactions; (iii) estimate genetic diversity; and (iv) study microbial evolution. Key challenges and perspectives of bioleaching genomics/metagenomics are addressed. Copyright © 2016 The Author(s). Published by Elsevier Masson SAS. All rights reserved.

  4. Domestication and cereal feeding developed domestic pig-type intestinal microbiota in animals of suidae.

    PubMed

    Ushida, Kazunari; Tsuchida, Sayaka; Ogura, Yoshitoshi; Toyoda, Atsushi; Maruyama, Fumito

    2016-06-01

    Intestinal microbiota are characterized by host-specific microorganisms, which have been selected through host-microbe interactions under phylogenetic evolution and transition of feeding behavior by the host. Although many studies have focused on disease-related intestinal microbiota, the origin and evolution of host-specific intestinal microbiota have not been well elucidated. The pig is the ideal mammalian model to reveal the origin and evolution of host-specific intestinal microbiota because its direct wild ancestor and close phylogenetic neighbors are available for comparison. The pig has been recognized as a Lactobacillus-type animal. We analyzed the intestinal microbiota of various animals in Suidae (domestic pigs, wild boars and red river hogs) to survey the origin and evolution of Lactobacillus-dominated intestinal microbiota using a metagenomic approach followed by quantitative PCR confirmation. The metagenomic datasets separated into two clusters: the wild animal cluster was characterized by a high abundance of Bifidobacterium, whereas the domesticated (or captured) animal cluster was characterized by Lactobacillus. In addition, Enterobacteriaceae were harbored as the major family only in domestic Sus scrofa. We conclude that domestication may have induced a larger Enterobacteriaceae population in pigs, and that the introduction of a modern feeding system further caused the development of Lactobacillus-dominated intestinal microbiota, with genetic and geographical factors possibly having a minor impact. © 2015 Japanese Society of Animal Science.

  5. HoloVir: A Workflow for Investigating the Diversity and Function of Viruses in Invertebrate Holobionts

    PubMed Central

    Laffy, Patrick W.; Wood-Charlson, Elisha M.; Turaev, Dmitrij; Weynberg, Karen D.; Botté, Emmanuelle S.; van Oppen, Madeleine J. H.; Webster, Nicole S.; Rattei, Thomas

    2016-01-01

    Abundant bioinformatics resources are available for the study of complex microbial metagenomes, however their utility in viral metagenomics is limited. HoloVir is a robust and flexible data analysis pipeline that provides an optimized and validated workflow for taxonomic and functional characterization of viral metagenomes derived from invertebrate holobionts. Simulated viral metagenomes comprising varying levels of viral diversity and abundance were used to determine the optimal assembly and gene prediction strategy, and multiple sequence assembly methods and gene prediction tools were tested in order to optimize our analysis workflow. HoloVir performs pairwise comparisons of single read and predicted gene datasets against the viral RefSeq database to assign taxonomy and additional comparison to phage-specific and cellular markers is undertaken to support the taxonomic assignments and identify potential cellular contamination. Broad functional classification of the predicted genes is provided by assignment of COG microbial functional category classifications using EggNOG and higher resolution functional analysis is achieved by searching for enrichment of specific Swiss-Prot keywords within the viral metagenome. Application of HoloVir to viral metagenomes from the coral Pocillopora damicornis and the sponge Rhopaloeides odorabile demonstrated that HoloVir provides a valuable tool to characterize holobiont viral communities across species, environments, or experiments. PMID:27375564

  6. IDENTIFICATION OF AVIAN-SPECIFIC FECAL METAGENOMIC SEQUENCES USING GENOME FRAGMENT ENRICHMENTS

    EPA Science Inventory

    Sequence analysis of microbial genomes has provided biologists the opportunity to compare genetic differences between closely related microorganisms. While random sequencing has also been used to study natural microbial communities, metagenomic comparisons via sequencing analysis...

  7. Genetic variability of psychrotolerant Acidithiobacillus ferrivorans revealed by (meta)genomic analysis.

    PubMed

    González, Carolina; Yanquepe, María; Cardenas, Juan Pablo; Valdes, Jorge; Quatrini, Raquel; Holmes, David S; Dopson, Mark

    2014-11-01

    Acidophilic microorganisms inhabit low pH environments such as acid mine drainage that is generated when sulfide minerals are exposed to air. The genome sequence of the psychrotolerant Acidithiobacillus ferrivorans SS3 was compared to a metagenome from a low temperature acidic stream dominated by an A. ferrivorans-like strain. Stretches of genomic DNA characterized by few matches to the metagenome, termed 'metagenomic islands', encoded genes associated with metal efflux and pH homeostasis. The metagenomic islands were enriched in mobile elements such as phage proteins, transposases, integrases and in one case, predicted to be flanked by truncated tRNAs. Cus gene clusters predicted to be involved in copper efflux and further Cus-like RND systems were predicted to be located in metagenomic islands and therefore, constitute part of the flexible gene complement of the species. Phylogenetic analysis of Cus clusters showed both lineage specificity within the Acidithiobacillus genus as well as niche specificity associated with an acidic environment. The metagenomic islands also contained a predicted copper efflux P-type ATPase system and a polyphosphate kinase potentially involved in polyphosphate mediated copper resistance. This study identifies genetic variability of low temperature acidophiles that likely reflects metal resistance selective pressures in the copper rich environment. Copyright © 2014 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  8. WHAM!: a web-based visualization suite for user-defined analysis of metagenomic shotgun sequencing data.

    PubMed

    Devlin, Joseph C; Battaglia, Thomas; Blaser, Martin J; Ruggles, Kelly V

    2018-06-25

    Exploration of large data sets, such as shotgun metagenomic sequence or expression data, by biomedical experts and medical professionals remains a major bottleneck in the scientific discovery process. Although tools for this purpose exist for 16S ribosomal RNA sequencing analysis, there is a growing but still insufficient number of user-friendly interactive visualization workflows for easy data exploration and figure generation. The development of such platforms is necessary to accelerate and streamline microbiome laboratory research. We developed the Workflow Hub for Automated Metagenomic Exploration (WHAM!) as a web-based interactive tool capable of user-directed data visualization and statistical analysis of annotated shotgun metagenomic and metatranscriptomic data sets. WHAM! includes exploratory and hypothesis-based gene and taxa search modules for visualizing differences in microbial taxa and gene family expression across experimental groups, and for creating publication quality figures without the need for command line interface or in-house bioinformatics. WHAM! is an interactive and customizable tool for downstream metagenomic and metatranscriptomic analysis providing a user-friendly interface allowing for easy data exploration by microbiome and ecological experts to facilitate discovery in multi-dimensional and large-scale data sets.

  9. Ten years of maintaining and expanding a microbial genome and metagenome analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Chu, Ken; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2015-11-01

    Launched in March 2005, the Integrated Microbial Genomes (IMG) system is a comprehensive data management system that supports multidimensional comparative analysis of genomic data. At the core of the IMG system is a data warehouse that contains genome and metagenome datasets sequenced at the Joint Genome Institute or provided by scientific users, as well as public genome datasets available at the National Center for Biotechnology Information Genbank sequence data archive. Genomes and metagenome datasets are processed using IMG's microbial genome and metagenome sequence data processing pipelines and are integrated into the data warehouse using IMG's data integration toolkits. Microbial genome and metagenome application specific data marts and user interfaces provide access to different subsets of IMG's data and analysis toolkits. This review article revisits IMG's original aims, highlights key milestones reached by the system during the past 10 years, and discusses the main challenges faced by a rapidly expanding system, in particular the complexity of maintaining such a system in an academic setting with limited budgets and computing and data management infrastructure. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. Memory Efficient Sequence Analysis Using Compressed Data Structures (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Simpson, Jared

    2018-01-24

    Wellcome Trust Sanger Institute's Jared Simpson on Memory efficient sequence analysis using compressed data structures at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  11. Antibiotic Resistome: Improving Detection and Quantification Accuracy for Comparative Metagenomics.

    PubMed

    Elbehery, Ali H A; Aziz, Ramy K; Siam, Rania

    2016-04-01

    The unprecedented rise of life-threatening antibiotic resistance (AR), combined with the unparalleled advances in DNA sequencing of genomes and metagenomes, has pushed the need for in silico detection of the resistance potential of clinical and environmental metagenomic samples through the quantification of AR genes (i.e., genes conferring antibiotic resistance). Therefore, determining an optimal methodology to quantitatively and accurately assess AR genes in a given environment is pivotal. Here, we optimized and improved existing AR detection methodologies from metagenomic datasets to properly consider AR-generating mutations in antibiotic target genes. Through comparative metagenomic analysis of previously published AR gene abundance in three publicly available metagenomes, we illustrate how mutation-generated resistance genes are either falsely assigned or neglected, which alters the detection and quantitation of the antibiotic resistome. In addition, we inspected factors influencing the outcome of AR gene quantification using metagenome simulation experiments, and identified that genome size, AR gene length, total number of metagenomics reads and selected sequencing platforms had pronounced effects on the level of detected AR. In conclusion, our proposed improvements in the current methodologies for accurate AR detection and resistome assessment show reliable results when tested on real and simulated metagenomic datasets.
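    The dependence of detected AR levels on gene length and total read count, noted in the abstract above, can be illustrated with a generic length- and depth-normalized abundance. The sketch below uses a reads-per-kilobase-per-million style measure; this is a common normalization chosen here for illustration, not necessarily the one used by the authors, and the function name and example counts are hypothetical.

```python
def rpk_per_million(read_count, gene_length_bp, total_reads):
    """Reads per kilobase of gene per million sequenced reads (generic normalization)."""
    if gene_length_bp <= 0 or total_reads <= 0:
        raise ValueError("gene length and total reads must be positive")
    reads_per_kb = read_count / (gene_length_bp / 1_000)
    return reads_per_kb / (total_reads / 1_000_000)


# Hypothetical inputs: the same raw hit count, but different gene lengths
# and sequencing depths, gives very different normalized abundances.
samples = {
    "short_gene_shallow": (50, 500, 1_000_000),
    "long_gene_deep": (50, 2_000, 4_000_000),
}
for name, (hits, length_bp, depth) in samples.items():
    print(name, rpk_per_million(hits, length_bp, depth))
```

    Comparing raw hit counts alone would treat the two hypothetical cases as equal, which is one way gene length and read depth can distort a resistome comparison.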

  12. Integrating Metagenomics and NanoSIMS to Investigate the Evolution and Ecophysiology of Magnetotactic Bacteria

    NASA Astrophysics Data System (ADS)

    Lin, W.; Zhang, W.; He, M.; Pan, Y.

    2017-12-01

    Magnetotactic bacteria (MTB) synthesize intracellular nano-sized magnetite (Fe3O4) and/or greigite (Fe3S4) crystals, called magnetosomes, which impart a permanent magnetic dipole moment to the cell causing it to align along the geomagnetic field lines as it swims. MTB play essential roles in global cycling of Fe, S, N and C, and represent an excellent model system not just for the investigation of the mechanisms of microbial engines that drive Earth's biogeochemical cycles but also for magnetotaxis and microbial biomineralization. Most of the previous studies on MTB were based on 16S rRNA gene-targeting analyses, which are powerful approaches to characterize the diversity, ecology and biogeography of MTB in nature. However, these approaches are somewhat limited in the physiological detail they can provide. In the present study, we have combined the genome-resolved metagenomics and nanoscale secondary ion mass spectrometry (NanoSIMS) analyses to study the genomic information, biomineralization mechanism and metabolic potential of environmental MTB. Two nearly complete genomes from uncultivated MTB belonging to the Nitrospirae phylum were reconstructed and their proposed metabolisms were further investigated and confirmed through NanoSIMS analyses. These results improve our understanding about the ecophysiology and evolution of MTB and their environmental function. The development of metagenomics-NanoSIMS integrated approach will provide a powerful tool for the research of geomicrobiology and environmental microbiology.

  13. Interactive metagenomic visualization in a Web browser.

    PubMed

    Ondov, Brian D; Bergman, Nicholas H; Phillippy, Adam M

    2011-09-30

    A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex hierarchies of metagenomic classifications. Krona combines a variant of radial, space-filling displays with parametric coloring and interactive polar-coordinate zooming. The HTML5 and JavaScript implementation enables fully interactive charts that can be explored with any modern Web browser, without the need for installed software or plug-ins. This Web-based architecture also allows each chart to be an independent document, making them easy to share via e-mail or post to a standard Web server. To illustrate Krona's utility, we describe its application to various metagenomic data sets and its compatibility with popular metagenomic analysis tools. Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from: http://krona.sourceforge.net.

  14. Transposases are the most abundant, most ubiquitous genes in nature.

    PubMed

    Aziz, Ramy K; Breitbart, Mya; Edwards, Robert A

    2010-07-01

    Genes, like organisms, struggle for existence, and the most successful genes persist and widely disseminate in nature. The unbiased determination of the most successful genes requires access to sequence data from a wide range of phylogenetic taxa and ecosystems, which has finally become achievable thanks to the deluge of genomic and metagenomic sequences. Here, we analyzed 10 million protein-encoding genes and gene tags in sequenced bacterial, archaeal, eukaryotic and viral genomes and metagenomes, and our analysis demonstrates that genes encoding transposases are the most prevalent genes in nature. The finding that these genes, classically considered as selfish genes, outnumber essential or housekeeping genes suggests that they offer selective advantage to the genomes and ecosystems they inhabit, a hypothesis in agreement with an emerging body of literature. Their mobile nature not only promotes dissemination of transposable elements within and between genomes but also leads to mutations and rearrangements that can accelerate biological diversification and--consequently--evolution. By securing their own replication and dissemination, transposases guarantee to thrive so long as nucleic acid-based life forms exist.

  15. Scalability of Comparative Analysis, Novel Algorithms and Tools (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Mavrommatis, Kostas

    2017-12-22

    DOE JGI's Kostas Mavrommatis, chair of the Scalability of Comparative Analysis, Novel Algorithms and Tools panel, at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  16. Signature Peptide-Enabled Metagenomics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    McMahon, Ben

    2018-01-11

    Ben McMahon of Los Alamos National Laboratory (LANL) presents "Signature Peptide-Enabled Metagenomics" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  17. Signature Peptide-Enabled Metagenomics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McMahon, Ben

    2012-06-01

    Ben McMahon of Los Alamos National Laboratory (LANL) presents "Signature Peptide-Enabled Metagenomics" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  18. Metagenomic and metaproteomic insights into bacterial communities in leaf-cutter ant fungus gardens

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aylward, Frank O.; Burnum, Kristin E.; Scott, Jarrod J.

    2012-09-01

    Herbivores gain access to nutrients stored in plant biomass largely by harnessing the metabolic activities of microbes. Leaf-cutter ants of the genus Atta are a hallmark example; these dominant Neotropical herbivores cultivate symbiotic fungus gardens on massive quantities of fresh plant forage. As the external digestive system of the ants, fungus gardens facilitate the production and sustenance of millions of workers in mature Atta colonies. Here we use metagenomic and metaproteomic techniques to characterize the bacterial diversity and overall physiological potential of fungus gardens from two species of Atta. Our analysis of over 1.2 Gbp of community metagenomic sequence and three 16S pyrotag libraries reveals that, in addition to harboring the dominant fungal crop, these ecosystems contain abundant populations of Enterobacteriaceae, including the genera Enterobacter, Pantoea, Klebsiella, Citrobacter, and Escherichia. We show that these bacterial communities possess genes commonly associated with lignocellulose degradation, and likely participate in the processing of plant biomass. Additionally, we demonstrate that bacteria in these environments encode a diverse suite of biosynthetic pathways, and that they may enrich the nitrogen-poor forage of the ants with B-vitamins, amino acids, and proteins. These results are consistent with the hypothesis that fungus gardens are highly-specialized fungus-bacteria communities that efficiently convert plant material into usable energy for their ant hosts. Together with recent investigations into the microbial symbionts of vertebrates, our work underscores the importance of microbial communities to the ecology and evolution of herbivorous metazoans.

  19. Meta-Storms: efficient search for similar microbial communities based on a novel indexing scheme and similarity score for metagenomic data.

    PubMed

    Su, Xiaoquan; Xu, Jian; Ning, Kang

    2012-10-01

    Scientists have long been interested in effectively comparing different microbial communities (also referred to as 'metagenomic samples' here) on a large scale: given a set of unknown samples, find similar metagenomic samples in a large repository and examine how similar these samples are. With the metagenomic samples accumulated so far, it is possible to build a database of metagenomic samples of interest. Any metagenomic sample could then be searched against this database to find the most similar metagenomic sample(s). However, on one hand, current databases with a large number of metagenomic samples mostly serve as data repositories that offer few functionalities for analysis; and on the other hand, methods to measure the similarity of metagenomic data work well only for a small set of samples by pairwise comparison. It is not yet clear how to efficiently search for metagenomic samples against a large metagenomic database. In this study, we have proposed a novel method, Meta-Storms, that could systematically and efficiently organize and search metagenomic data. It includes the following components: (i) creating a database of metagenomic samples based on their taxonomical annotations, (ii) efficient indexing of samples in the database based on a hierarchical taxonomy indexing strategy, (iii) searching for a metagenomic sample against the database by a fast scoring function based on quantitative phylogeny and (iv) managing the database by index export, index import, data insertion, data deletion and database merging. We have collected more than 1300 metagenomic datasets from the public domain and in-house facilities, and tested the Meta-Storms method on these datasets. Our experimental results show that Meta-Storms is capable of database creation and effective searching for a large number of metagenomic samples, and it could achieve similar accuracies compared with the current popular significance testing-based methods. The Meta-Storms method would serve as a suitable database management and search system to quickly identify similar metagenomic samples from a large pool of samples. Contact: ningkang@qibebt.ac.cn. Supplementary data are available at Bioinformatics online.
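    The general idea of searching a query sample against a database of taxonomic profiles can be sketched as follows. This is a minimal illustration only: it ranks samples by Bray-Curtis similarity over flat taxon abundances, which is a stand-in and not the Meta-Storms scoring function (described above as based on quantitative phylogeny with hierarchical taxonomy indexing). All sample names and abundances are hypothetical.

```python
def bray_curtis_similarity(a, b):
    """Return 1 - Bray-Curtis dissimilarity for two {taxon: abundance} profiles."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total else 0.0


def search(query, database, top_n=1):
    """Score every database profile against the query and return the best hits."""
    scored = [(bray_curtis_similarity(query, profile), name)
              for name, profile in database.items()]
    return sorted(scored, reverse=True)[:top_n]


# Hypothetical database of relative-abundance profiles.
database = {
    "gut_A": {"Bacteroides": 0.6, "Lactobacillus": 0.3, "Escherichia": 0.1},
    "soil_B": {"Streptomyces": 0.7, "Bacillus": 0.3},
}
query = {"Bacteroides": 0.5, "Lactobacillus": 0.5}
print(search(query, database, top_n=2))
```

    A linear scan like this scores every stored sample, which is the cost that an index over the taxonomy is meant to avoid for large repositories.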

  20. Some considerations for analyzing biodiversity using integrative metagenomics and gene networks.

    PubMed

    Bittner, Lucie; Halary, Sébastien; Payri, Claude; Cruaud, Corinne; de Reviers, Bruno; Lopez, Philippe; Bapteste, Eric

    2010-07-30

    Improving knowledge of biodiversity will benefit conservation biology, enhance bioremediation studies, and could lead to new medical treatments. However there is no standard approach to estimate and to compare the diversity of different environments, or to study its past, and possibly, future evolution. We argue that there are two conditions for significant progress in the identification and quantification of biodiversity. First, integrative metagenomic studies - aiming at the simultaneous examination (or even better at the integration) of observations about the elements, functions and evolutionary processes captured by the massive sequencing of multiple markers - should be preferred over DNA barcoding projects and over metagenomic projects based on a single marker. Second, such metagenomic data should be studied with novel inclusive network-based approaches, designed to draw inferences both on the many units and on the many processes present in the environments. We reached these conclusions through a comparison of the theoretical foundations of two molecular approaches seeking to assess biodiversity: metagenomics (mostly used on prokaryotes and protists) and DNA barcoding (mostly used on multicellular eukaryotes), and by pragmatic considerations of the issues caused by the 'species problem' in biodiversity studies. Evolutionary gene networks reduce the risk of producing biodiversity estimates with limited explanatory power, biased either by unequal rates of LGT, or difficult to interpret due to (practical) problems caused by type I and type II grey zones. Moreover, these networks would easily accommodate additional (meta)transcriptomic and (meta)proteomic data.

  1. Random whole metagenomic sequencing for forensic discrimination of soils.

    PubMed

    Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian

    2014-01-01

    Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.

  2. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.

  3. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    DOE PAGES

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    2014-09-29

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.

  4. Human milk metagenome: a functional capacity analysis

    PubMed Central

    2013-01-01

    Background Human milk contains a diverse population of bacteria that likely influences colonization of the infant gastrointestinal tract. Recent studies, however, have been limited to characterization of this microbial community by 16S rRNA analysis. In the present study, a metagenomic approach using Illumina sequencing of a pooled milk sample (ten donors) was employed to determine the genera of bacteria and the types of bacterial open reading frames in human milk that may influence bacterial establishment and stability in this primal food matrix. The human milk metagenome was also compared to that of breast-fed and formula-fed infants’ feces (n = 5, each) and mothers’ feces (n = 3) at the phylum level and at a functional level using open reading frame abundance. Additionally, immune-modulatory bacterial-DNA motifs were also searched for within human milk. Results The bacterial community in human milk contained over 360 prokaryotic genera, with sequences aligning predominantly to the phyla of Proteobacteria (65%) and Firmicutes (34%), and the genera of Pseudomonas (61.1%), Staphylococcus (33.4%) and Streptococcus (0.5%). From assembled human milk-derived contigs, 30,128 open reading frames were annotated and assigned to functional categories. When compared to the metagenome of infants’ and mothers’ feces, the human milk metagenome was less diverse at the phylum level, and contained more open reading frames associated with nitrogen metabolism, membrane transport and stress response (P < 0.05). The human milk metagenome also contained a similar occurrence of immune-modulatory DNA motifs to that of infants’ and mothers’ fecal metagenomes. Conclusions Our results further expand the complexity of the human milk metagenome and enforce the benefits of human milk ingestion on the microbial colonization of the infant gut and immunity. Discovery of immune-modulatory motifs in the metagenome of human milk indicates more exhaustive analyses of the functionality of the human milk metagenome are warranted. PMID:23705844

  5. Comparative analysis of taxonomic, functional, and metabolic patterns of microbiomes from 14 full-scale biogas reactors by metagenomic sequencing and radioisotopic analysis.

    PubMed

    Luo, Gang; Fotidis, Ioannis A; Angelidaki, Irini

    2016-01-01

    Biogas production is a very complex process due to the high complexity in diversity and interactions of the microorganisms mediating it, and only limited and diffuse knowledge exists about the variation of taxonomic and functional patterns of microbiomes across different biogas reactors, and their relationships with the metabolic patterns. The present study used metagenomic sequencing and radioisotopic analysis to assess the taxonomic, functional, and metabolic patterns of microbiomes from 14 full-scale biogas reactors operated under various conditions treating either sludge or manure. The results from metagenomic analysis showed that the dominant methanogenic pathway revealed by radioisotopic analysis was not always correlated with the taxonomic and functional compositions. It was found by radioisotopic experiments that the aceticlastic methanogenic pathway was dominant, while metagenomic analysis showed a higher relative abundance of hydrogenotrophic methanogens. Principal coordinates analysis showed that the sludge-based samples were clearly distinct from the manure-based samples for both taxonomic and functional patterns, and canonical correspondence analysis showed that both temperature and free ammonia were crucial environmental variables shaping the taxonomic and functional patterns. The study further showed that the overall patterns of functional genes were strongly correlated with the overall patterns of taxonomic composition across different biogas reactors. A discrepancy was found between the metabolic patterns determined by metagenomic analysis and the metabolic pathways determined by radioisotopic analysis. In addition, a clear correlation between taxonomic and functional patterns was demonstrated for biogas reactors, and the environmental factors shaping both taxonomic and functional gene patterns were identified.

  6. The effects of variable sample biomass on comparative metagenomics.

    PubMed

    Chafee, Meghan; Maignien, Loïs; Simmons, Sheri L

    2015-07-01

    Longitudinal studies that integrate samples with variable biomass are essential to understand microbial community dynamics across space or time. Shotgun metagenomics is widely used to investigate these communities at the functional level, but little is known about the effects of combining low and high biomass samples on downstream analysis. We investigated the interacting effects of DNA input and library amplification by polymerase chain reaction on comparative metagenomic analysis using dilutions of a single complex template from an Arabidopsis thaliana-associated microbial community. We modified the Illumina Nextera kit to generate high-quality large-insert (680 bp) paired-end libraries using a range of 50 pg to 50 ng of input DNA. Using assembly-based metagenomic analysis, we demonstrate that DNA input level has a significant impact on community structure due to overrepresentation of low-GC genomic regions following library amplification. In our system, these differences were largely superseded by variations between biological replicates, but our results advocate verifying the influence of library amplification on a case-by-case basis. Overall, this study provides recommendations for quality filtering and de-replication prior to analysis, as well as a practical framework to address the issue of low biomass or biomass heterogeneity in longitudinal metagenomic surveys. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.

  7. Microbial and viral-like rhodopsins present in coastal marine sediments from four polar and subpolar regions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    López, José L.; Golemba, Marcelo; Hernández, Edgardo

    Rhodopsins are broadly distributed. In this work, we analyzed 23 metagenomes corresponding to marine sediment samples from four regions that share cold climate conditions (Norway; Sweden; Argentina and Antarctica). In order to investigate the evolution of viral rhodopsin genes, an initial set of 6224 bacterial rhodopsin sequences according to COG5524 were retrieved from the 23 metagenomes. After selection by the presence of transmembrane domains and alignment, 123 viral (51) and non-viral (72) sequences (>50 amino acids) were finally included in further analysis. Viral rhodopsin genes were homologs of Phaeocystis globosa virus and Organic lake Phycodnavirus. Non-viral microbial rhodopsin genes were ascribed to Bacteroidetes, Planctomycetes, Firmicutes, Actinobacteria, Cyanobacteria, Proteobacteria, Deinococcus-Thermus and Cryptophyta and Fungi. A rescreening using Blastp, using as queries the viral sequences previously described, retrieved 30 sequences (>100 amino acids). Phylogeographic analysis revealed a geographical clustering of the sequences affiliated to the viral group. This clustering was not observed for the microbial non-viral sequences. The phylogenetic reconstruction allowed us to propose the existence of a putative ancestor of viral rhodopsin genes related to Actinobacteria and Chloroflexi. This is the first report about the existence of a phylogeographic association of the viral rhodopsin sequences from marine sediments.

  8. Interactive metagenomic visualization in a Web browser

    PubMed Central

    2011-01-01

    Background A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. Results Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex hierarchies of metagenomic classifications. Krona combines a variant of radial, space-filling displays with parametric coloring and interactive polar-coordinate zooming. The HTML5 and JavaScript implementation enables fully interactive charts that can be explored with any modern Web browser, without the need for installed software or plug-ins. This Web-based architecture also allows each chart to be an independent document, making them easy to share via e-mail or post to a standard Web server. To illustrate Krona's utility, we describe its application to various metagenomic data sets and its compatibility with popular metagenomic analysis tools. Conclusions Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from: http://krona.sourceforge.net. PMID:21961884

  9. Metagenomic Analysis of Viruses in Feces from Unsolved Outbreaks of Gastroenteritis in Humans

    PubMed Central

    Moore, Nicole E.; Wang, Jing; Hewitt, Joanne; Croucher, Dawn; Williamson, Deborah A.; Paine, Shevaun; Yen, Seiha; Greening, Gail E.

    2014-01-01

    The etiology of an outbreak of gastroenteritis in humans cannot always be determined, and ∼25% of outbreaks remain unsolved in New Zealand. It is hypothesized that novel viruses may account for a proportion of unsolved cases, and new unbiased high-throughput sequencing methods hold promise for their detection. Analysis of the fecal metagenome can reveal the presence of viruses, bacteria, and parasites which may have evaded routine diagnostic testing. Thirty-one fecal samples from 26 gastroenteritis outbreaks of unknown etiology occurring in New Zealand between 2011 and 2012 were selected for de novo metagenomic analysis. A total data set of 193 million sequence reads of 150 bp in length was produced on an Illumina MiSeq. The metagenomic data set was searched for virus and parasite sequences, with no evidence of novel pathogens found. Eight viruses and one parasite were detected, each already known to be associated with gastroenteritis, including adenovirus, rotavirus, sapovirus, and Dientamoeba fragilis. In addition, we also describe the first detection of human parechovirus 3 (HPeV3) in Australasia. Metagenomics may thus provide a useful audit tool when applied retrospectively to determine where routine diagnostic processes may have failed to detect a pathogen. PMID:25339401

  10. Loeffler 4.0: Diagnostic Metagenomics.

    PubMed

    Höper, Dirk; Wylezich, Claudia; Beer, Martin

    2017-01-01

    A new world of possibilities for "virus discovery" was opened up with high-throughput sequencing becoming available in the last decade. While scientifically metagenomic analysis was established before the start of the era of high-throughput sequencing, the availability of the first second-generation sequencers was the kick-off for diagnosticians to use sequencing for the detection of novel pathogens. Today, diagnostic metagenomics is becoming the standard procedure for the detection and genetic characterization of new viruses or novel virus variants. Here, we provide an overview about technical considerations of high-throughput sequencing-based diagnostic metagenomics together with selected examples of "virus discovery" for animal diseases or zoonoses and metagenomics for food safety or basic veterinary research. © 2017 Elsevier Inc. All rights reserved.

  11. Evaluating the Quantitative Capabilities of Metagenomic Analysis Software.

    PubMed

    Kerepesi, Csaba; Grolmusz, Vince

    2016-05-01

    DNA sequencing technologies are applied widely and frequently today to describe metagenomes, i.e., microbial communities in environmental or clinical samples, without the need for culturing them. These technologies usually return short (100-300 base-pairs long) DNA reads, and these reads are processed by metagenomic analysis software that assign phylogenetic composition-information to the dataset. Here we evaluate three metagenomic analysis software (AmphoraNet--a webserver implementation of AMPHORA2--, MG-RAST, and MEGAN5) for their capabilities of assigning quantitative phylogenetic information for the data, describing the frequency of appearance of the microorganisms of the same taxa in the sample. The difficulties of the task arise from the fact that longer genomes produce more reads from the same organism than shorter genomes, and some software assign higher frequencies to species with longer genomes than to those with shorter ones. This phenomenon is called the "genome length bias." Dozens of complex artificial metagenome benchmarks can be found in the literature. Because of the complexity of those benchmarks, it is usually difficult to judge the resistance of a metagenomic software to this "genome length bias." Therefore, we have made a simple benchmark for the evaluation of the "taxon-counting" in a metagenomic sample: we have taken the same number of copies of three full bacterial genomes of different lengths, broken them up randomly into short reads with an average length of 150 bp, and mixed the reads, creating our simple benchmark. Because of its simplicity, the benchmark is not supposed to serve as a mock metagenome, but if a software fails on that simple task, it will surely fail on most real metagenomes. We applied three software for the benchmark. The ideal quantitative solution would assign the same proportion to the three bacterial taxa. We have found that AMPHORA2/AmphoraNet gave the most accurate results and the other two software were under-performers: they counted quite reliably each short read to their respective taxon, producing the typical genome length bias. The benchmark dataset is available at http://pitgroup.org/static/3RandomGenome-100kavg150bps.fna.
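
    The genome length bias described in this abstract can be illustrated with a few lines of arithmetic. The sketch below is not the authors' benchmark code: the taxon names and genome lengths are hypothetical, and it assumes reads are sampled uniformly from all DNA in the sample. Under that assumption, equal copy numbers give read fractions proportional to genome length, and dividing by genome length recovers the equal proportions the abstract calls the ideal quantitative solution.

```python
def read_fractions(genome_lengths, copies):
    """Expected fraction of reads per taxon, assuming uniform sampling of all DNA."""
    dna = {t: genome_lengths[t] * copies[t] for t in genome_lengths}
    total = sum(dna.values())
    return {t: amount / total for t, amount in dna.items()}


def length_normalized(fractions, genome_lengths):
    """Divide each read fraction by genome length, then renormalize to sum to 1."""
    per_bp = {t: fractions[t] / genome_lengths[t] for t in fractions}
    total = sum(per_bp.values())
    return {t: value / total for t, value in per_bp.items()}


# Hypothetical genome lengths in base pairs (not the genomes used in the paper).
lengths = {"taxon_A": 1_000_000, "taxon_B": 3_000_000, "taxon_C": 6_000_000}
# Same number of genome copies for every taxon, as in the benchmark design.
copies = {taxon: 100 for taxon in lengths}

naive = read_fractions(lengths, copies)        # biased toward long genomes
corrected = length_normalized(naive, lengths)  # equal shares for all taxa

for taxon in lengths:
    print(taxon, round(naive[taxon], 3), round(corrected[taxon], 3))
```

    A tool that simply counts reads per taxon would report something close to the naive column; a tool that counts single-copy marker genes or normalizes by length would be expected to land nearer the corrected column.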

  12. Exploring neighborhoods in the metagenome universe.

    PubMed

    Aßhauer, Kathrin P; Klingenberg, Heiner; Lingner, Thomas; Meinicke, Peter

    2014-07-14

    The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata is still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods we found that vector-based distance measures are well-suitable for the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that for a query metagenome from a particular habitat on average nine out of ten nearest neighbors represent the same habitat category independent of the utilized profiling method or distance measure. While for well-defined labels a neighborhood accuracy of 100% can be achieved, in general the neighbor detection is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as compared with the high-dimensional profile space. Our study suggests that for inspection of metagenome neighborhoods the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples in the future needs to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D-visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis.
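
    The profile-based k-nearest-neighbor search mentioned in this abstract can be sketched as follows. This is an illustration only, not the CoMet-Universe implementation: the sample names, feature counts, and habitat labels are invented, and Euclidean distance on relative-abundance vectors is just one assumed choice among the vector-based distance measures the abstract refers to.

```python
import math


def normalize(profile):
    """Convert raw feature counts into relative abundances."""
    total = sum(profile)
    return [x / total for x in profile]


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_neighbors(query, database, k):
    """Return the names of the k database profiles closest to the query."""
    q = normalize(query)
    scored = sorted(
        (euclidean(q, normalize(vec)), name)
        for name, (vec, _label) in database.items()
    )
    return [name for _dist, name in scored[:k]]


def neighborhood_accuracy(query_label, neighbors, database):
    """Fraction of neighbors that share the query's habitat label."""
    hits = sum(1 for name in neighbors if database[name][1] == query_label)
    return hits / len(neighbors)


# Hypothetical database: name -> (feature counts, habitat label).
database = {
    "soil_1": ([40, 10, 50], "soil"),
    "soil_2": ([35, 15, 50], "soil"),
    "marine_1": ([5, 80, 15], "marine"),
    "marine_2": ([10, 70, 20], "marine"),
}

query_profile = [38, 12, 50]  # hypothetical query metagenome from soil
neighbors = nearest_neighbors(query_profile, database, k=2)
accuracy = neighborhood_accuracy("soil", neighbors, database)
print(neighbors, accuracy)
```

    Swapping `euclidean` for another distance function is the only change needed to compare distance measures on the same profiles.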

  13. Exploring Neighborhoods in the Metagenome Universe

    PubMed Central

    Aßhauer, Kathrin P.; Klingenberg, Heiner; Lingner, Thomas; Meinicke, Peter

    2014-01-01

    The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata is still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods we found that vector-based distance measures are well-suitable for the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that for a query metagenome from a particular habitat on average nine out of ten nearest neighbors represent the same habitat category independent of the utilized profiling method or distance measure. While for well-defined labels a neighborhood accuracy of 100% can be achieved, in general the neighbor detection is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as compared with the high-dimensional profile space. Our study suggests that for inspection of metagenome neighborhoods the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples in the future needs to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D-visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis. PMID:25026170

  14. The GAAS metagenomic tool and its estimations of viral and microbial average genome size in four major biomes.

    PubMed

    Angly, Florent E; Willner, Dana; Prieto-Davó, Alejandra; Edwards, Robert A; Schmieder, Robert; Vega-Thurber, Rebecca; Antonopoulos, Dionysios A; Barott, Katie; Cottrell, Matthew T; Desnues, Christelle; Dinsdale, Elizabeth A; Furlan, Mike; Haynes, Matthew; Henn, Matthew R; Hu, Yongfei; Kirchman, David L; McDole, Tracey; McPherson, John D; Meyer, Folker; Miller, R Michael; Mundt, Egbert; Naviaux, Robert K; Rodriguez-Mueller, Beltran; Stevens, Rick; Wegley, Linda; Zhang, Lixin; Zhu, Baoli; Rohwer, Forest

    2009-12-01

    Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions.
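
    The ideas of similarity weighting and length normalization named in this abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the GAAS algorithm: the genome names, lengths, and hit scores are hypothetical, each read's hits are weighted in proportion to an assumed similarity score, and the weighted counts are divided by genome length before renormalizing.

```python
from collections import defaultdict


def relative_abundance(read_hits, genome_lengths):
    """Estimate genome relative abundance from per-read similarity hits.

    read_hits: list of dicts mapping genome name -> similarity score (assumed > 0).
    """
    weighted = defaultdict(float)
    for hits in read_hits:
        total = sum(hits.values())
        for genome, score in hits.items():
            # Similarity weighting: each read contributes one unit in total.
            weighted[genome] += score / total
    # Length normalization: longer genomes yield more reads per genome copy.
    per_bp = {g: w / genome_lengths[g] for g, w in weighted.items()}
    norm = sum(per_bp.values())
    return {g: value / norm for g, value in per_bp.items()}


def average_genome_length(abundance, genome_lengths):
    """Abundance-weighted mean genome length of the community."""
    return sum(abundance[g] * genome_lengths[g] for g in abundance)


# Hypothetical reference genomes (lengths in bp) and hits.
genome_lengths = {"small_phage": 50_000, "large_bacterium": 5_000_000}
read_hits = [{"small_phage": 1.0}] + [{"large_bacterium": 1.0}] * 100

abundance = relative_abundance(read_hits, genome_lengths)
avg_len = average_genome_length(abundance, genome_lengths)
print(abundance, avg_len)
```

    In this toy input the small genome receives only 1 of 101 reads, yet after length normalization both genomes are estimated to be equally abundant, which mirrors the abstract's point that unnormalized analyses may underestimate organisms with small genomes.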

  15. Metagenomic Analysis of the Indian Ocean Picocyanobacterial Community: Structure, Potential Function and Evolution

    PubMed Central

    Díez, Beatriz; Nylander, Johan A. A.; Ininbergs, Karolina; Dupont, Christopher L.; Allen, Andrew E.; Yooseph, Shibu; Rusch, Douglas B.; Bergman, Birgitta

    2016-01-01

    Unicellular cyanobacteria are ubiquitous photoautotrophic microbes that contribute substantially to global primary production. Picocyanobacteria such as Synechococcus and Prochlorococcus depend on chlorophyll a-binding protein complexes to capture light energy. In addition, Synechococcus has accessory pigments organized into phycobilisomes, and Prochlorococcus contains chlorophyll b. Across a surface water transect spanning the sparsely studied tropical Indian Ocean, we examined Synechococcus and Prochlorococcus occurrence, taxonomy and habitat preference in an evolutionary context. Shotgun sequencing of size fractionated microbial communities from 0.1 μm to 20 μm and subsequent phylogenetic analysis indicated that cyanobacteria account for up to 15% of annotated reads, with the genera Prochlorococcus and Synechococcus comprising 90% of the cyanobacterial reads, even in the largest size fraction (3.0–20 μm). Phylogenetic analyses of cyanobacterial light-harvesting genes (chl-binding pcb/isiA, allophycocyanin (apcAB), phycocyanin (cpcAB) and phycoerythrin (cpeAB)) mostly identified picocyanobacteria clades comprised of overlapping sequences obtained from Indian Ocean, Atlantic and/or Pacific Oceans samples. Habitat reconstructions coupled with phylogenetic analysis of the Indian Ocean samples suggested that large Synechococcus-like ancestors in coastal waters expanded their ecological niche towards open oligotrophic waters in the Indian Ocean through lineage diversification and associated streamlining of genomes (e.g. loss of phycobilisomes and acquisition of Chl b); resulting in contemporary small celled Prochlorococcus. Comparative metagenomic analysis with picocyanobacteria populations in other oceans suggests that this evolutionary scenario may be globally important. PMID:27196065

  16. Metagenomic Analysis of the Indian Ocean Picocyanobacterial Community: Structure, Potential Function and Evolution

    DOE PAGES

    Diez, Beatriz; Nylander, Johan A. A.; Ininbergs, Karolina; ...

    2016-05-19

    Unicellular cyanobacteria are ubiquitous photoautotrophic microbes that contribute substantially to global primary production. Picocyanobacteria such as Synechococcus and Prochlorococcus depend on chlorophyll a-binding protein complexes to capture light energy. In addition, Synechococcus has accessory pigments organized into phycobilisomes, and Prochlorococcus contains chlorophyll b. Across a surface water transect spanning the sparsely studied tropical Indian Ocean, we examined Synechococcus and Prochlorococcus occurrence, taxonomy and habitat preference in an evolutionary context. Shotgun sequencing of size fractionated microbial communities from 0.1 μm to 20 μm and subsequent phylogenetic analysis indicated that cyanobacteria account for up to 15% of annotated reads, with the genera Prochlorococcus and Synechococcus comprising 90% of the cyanobacterial reads, even in the largest size fraction (3.0–20 μm). Phylogenetic analyses of cyanobacterial light-harvesting genes (chl-binding pcb/isiA, allophycocyanin (apcAB), phycocyanin (cpcAB) and phycoerythrin (cpeAB)) mostly identified picocyanobacteria clades comprised of overlapping sequences obtained from Indian Ocean, Atlantic and/or Pacific Oceans samples. Habitat reconstructions coupled with phylogenetic analysis of the Indian Ocean samples suggested that large Synechococcus-like ancestors in coastal waters expanded their ecological niche towards open oligotrophic waters in the Indian Ocean through lineage diversification and associated streamlining of genomes (e.g. loss of phycobilisomes and acquisition of Chl b); resulting in contemporary small celled Prochlorococcus. As a result, comparative metagenomic analysis with picocyanobacteria populations in other oceans suggests that this evolutionary scenario may be globally important.

  17. Metagenomic Analysis of the Indian Ocean Picocyanobacterial Community: Structure, Potential Function and Evolution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Diez, Beatriz; Nylander, Johan A. A.; Ininbergs, Karolina

    Unicellular cyanobacteria are ubiquitous photoautotrophic microbes that contribute substantially to global primary production. Picocyanobacteria such as Synechococcus and Prochlorococcus depend on chlorophyll a-binding protein complexes to capture light energy. In addition, Synechococcus has accessory pigments organized into phycobilisomes, and Prochlorococcus contains chlorophyll b. Across a surface water transect spanning the sparsely studied tropical Indian Ocean, we examined Synechococcus and Prochlorococcus occurrence, taxonomy and habitat preference in an evolutionary context. Shotgun sequencing of size fractionated microbial communities from 0.1 μm to 20 μm and subsequent phylogenetic analysis indicated that cyanobacteria account for up to 15% of annotated reads, with the genera Prochlorococcus and Synechococcus comprising 90% of the cyanobacterial reads, even in the largest size fraction (3.0–20 μm). Phylogenetic analyses of cyanobacterial light-harvesting genes (chl-binding pcb/isiA, allophycocyanin (apcAB), phycocyanin (cpcAB) and phycoerythrin (cpeAB)) mostly identified picocyanobacteria clades comprised of overlapping sequences obtained from Indian Ocean, Atlantic and/or Pacific Oceans samples. Habitat reconstructions coupled with phylogenetic analysis of the Indian Ocean samples suggested that large Synechococcus-like ancestors in coastal waters expanded their ecological niche towards open oligotrophic waters in the Indian Ocean through lineage diversification and associated streamlining of genomes (e.g. loss of phycobilisomes and acquisition of Chl b); resulting in contemporary small celled Prochlorococcus. As a result, comparative metagenomic analysis with picocyanobacteria populations in other oceans suggests that this evolutionary scenario may be globally important.

  18. High definition for systems biology of microbial communities: metagenomics gets genome-centric and strain-resolved.

    PubMed

    Turaev, Dmitrij; Rattei, Thomas

    2016-06-01

    The systems biology of microbial communities, organismal communities inhabiting all ecological niches on earth, has in recent years been strongly facilitated by the rapid development of experimental, sequencing and data analysis methods. Novel experimental approaches and binning methods in metagenomics render the semi-automatic reconstructions of near-complete genomes of uncultivable bacteria possible, while advances in high-resolution amplicon analysis allow for efficient and less biased taxonomic community characterization. This will also facilitate predictive modeling approaches, hitherto limited by the low resolution of metagenomic data. In this review, we pinpoint the most promising current developments in metagenomics. They facilitate microbial systems biology towards a systemic understanding of mechanisms in microbial communities with scopes of application in many areas of our daily life. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. A user's guide to quantitative and comparative analysis of metagenomic datasets.

    PubMed

    Luo, Chengwei; Rodriguez-R, Luis M; Konstantinidis, Konstantinos T

    2013-01-01

    Metagenomics has revolutionized microbiological studies during the past decade and provided new insights into the diversity, dynamics, and metabolic potential of natural microbial communities. However, metagenomics still represents a field in development, and standardized tools and approaches to handle and compare metagenomes have not been established yet. An important reason accounting for the latter is the continuous changes in the type of sequencing data available, for example, long versus short sequencing reads. Here, we provide a guide to bioinformatic pipelines developed to accomplish the following tasks, focusing primarily on those developed by our team: (i) assemble a metagenomic dataset; (ii) determine the level of sequence coverage obtained and the amount of sequencing required to obtain complete coverage; (iii) identify the taxonomic affiliation of a metagenomic read or assembled contig; and (iv) determine differentially abundant genes, pathways, and species between different datasets. Most of these pipelines do not depend on the type of sequences available or can be easily adjusted to fit different types of sequences, and are freely available (for instance, through our lab Web site: http://www.enve-omics.gatech.edu/). The limitations of current approaches, as well as the computational aspects that can be further improved, will also be briefly discussed. The work presented here provides practical guidelines on how to perform metagenomic analysis of microbial communities characterized by varied levels of diversity and establishes approaches to handle the resulting data, independent of the sequencing platform employed. © 2013 Elsevier Inc. All rights reserved.

  20. Antibiotic resistance genes across a wide variety of metagenomes.

    PubMed

    Fitzpatrick, David; Walsh, Fiona

    2016-02-01

    The distribution of potential clinically relevant antibiotic resistance (AR) genes across soil, water, animal, plant and human microbiomes is not well understood. We aimed to investigate if there were differences in the distribution and relative abundances of resistance genes across a variety of ecological niches. All sequence reads (human, animal, water, soil, plant and insect metagenomes) from the MG-RAST database were downloaded and assembled into a local sequence database. We show that there are many reservoirs of the basic form of resistance genes e.g. blaTEM, but the human and mammalian gut microbiomes contain the widest diversity of clinically relevant resistance genes using metagenomic analysis. The human microbiomes contained a high relative abundance of resistance genes, while the relative abundances varied greatly in the marine and soil metagenomes, when datasets with greater than one million genes were compared. While these results reflect a bias in the distribution of AR genes across the metagenomes, we note this interpretation with caution. Metagenomics analysis includes limits in terms of detection and identification of AR genes in complex and diverse microbiome populations. Therefore, if we do not detect the AR gene, is it in fact not there or just below the limits of our techniques? © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  1. Validation of Metagenomic Next-Generation Sequencing Tests for Universal Pathogen Detection.

    PubMed

    Schlaberg, Robert; Chiu, Charles Y; Miller, Steve; Procop, Gary W; Weinstock, George

    2017-06-01

    Metagenomic sequencing can be used for detection of any pathogens using unbiased, shotgun next-generation sequencing (NGS), without the need for sequence-specific amplification. Proof-of-concept has been demonstrated in infectious disease outbreaks of unknown causes and in patients with suspected infections but negative results for conventional tests. Metagenomic NGS tests hold great promise to improve infectious disease diagnostics, especially in immunocompromised and critically ill patients. To discuss challenges and provide example solutions for validating metagenomic pathogen detection tests in clinical laboratories. A summary of current regulatory requirements, largely based on prior guidance for NGS testing in constitutional genetics and oncology, is provided. Examples from 2 separate validation studies are provided for steps from assay design, and validation of wet bench and bioinformatics protocols, to quality control and assurance. Although laboratory and data analysis workflows are still complex, metagenomic NGS tests for infectious diseases are increasingly being validated in clinical laboratories. Many parallels exist to NGS tests in other fields. Nevertheless, specimen preparation, rapidly evolving data analysis algorithms, and incomplete reference sequence databases are idiosyncratic to the field of microbiology and often overlooked.

  2. Shotgun metagenomic data streams: surfing without fear

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berendzen, Joel R

    2010-12-06

    Timely information about bio-threat prevalence, consequence, propagation, attribution, and mitigation is needed to support decision-making, both routinely and in a crisis. One DNA sequencer can stream 25 Gbp of information per day, but sampling strategies and analysis techniques are needed to turn raw sequencing power into actionable knowledge. Shotgun metagenomics can enable biosurveillance at the level of a single city, hospital, or airplane. Metagenomics characterizes viruses and bacteria from complex environments such as soil, air filters, or sewage. Unlike targeted-primer-based sequencing, shotgun methods are not blind to sequences that are truly novel, and they can measure absolute prevalence. Shotgun metagenomic sampling can be non-invasive, efficient, and inexpensive while being informative. We have developed analysis techniques for shotgun metagenomic sequencing that rely upon phylogenetic signature patterns. They work by indexing local sequence patterns in a manner similar to web search engines. Our methods are laptop-fast and favorable scaling properties ensure they will be sustainable as sequencing methods grow. We show examples of application to soil metagenomic samples.

  3. myPhyloDB: a local web server for the storage and analysis of metagenomics data

    USDA-ARS's Scientific Manuscript database

    myPhyloDB is a user-friendly personal database with a browser-interface designed to facilitate the storage, processing, analysis, and distribution of metagenomics data. MyPhyloDB archives raw sequencing files, and allows for easy selection of project(s)/sample(s) of any combination from all availab...

  4. Applying meta-pathway analyses through metagenomics to identify the functional properties of the major bacterial communities of a single spontaneous cocoa bean fermentation process sample.

    PubMed

    Illeghems, Koen; Weckx, Stefan; De Vuyst, Luc

    2015-09-01

    A high-resolution functional metagenomic analysis of a representative single sample of a Brazilian spontaneous cocoa bean fermentation process was carried out to gain insight into its bacterial community functioning. By reconstruction of microbial meta-pathways based on metagenomic data, the current knowledge about the metabolic capabilities of bacterial members involved in the cocoa bean fermentation ecosystem was extended. Functional meta-pathway analysis revealed the distribution of the metabolic pathways between the bacterial members involved. The metabolic capabilities of the lactic acid bacteria present were most associated with the heterolactic fermentation and citrate assimilation pathways. The role of Enterobacteriaceae in the conversion of substrates was shown through the use of the mixed-acid fermentation and methylglyoxal detoxification pathways. Furthermore, several other potential functional roles for Enterobacteriaceae were indicated, such as pectinolysis and citrate assimilation. Concerning acetic acid bacteria, metabolic pathways were partially reconstructed, in particular those related to responses toward stress, explaining their metabolic activities during cocoa bean fermentation processes. Further, the in-depth metagenomic analysis unveiled functionalities involved in bacterial competitiveness, such as the occurrence of CRISPRs and potential bacteriocin production. Finally, comparative analysis of the metagenomic data with bacterial genomes of cocoa bean fermentation isolates revealed the applicability of the selected strains as functional starter cultures. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Host-Associated Genomic Features of the Novel Uncultured Intracellular Pathogen Ca. Ichthyocystis Revealed by Direct Sequencing of Epitheliocysts

    PubMed Central

    Qi, Weihong; Vaughan, Lloyd; Katharios, Pantelis; Schlapbach, Ralph; Seth-Smith, Helena M.B.

    2016-01-01

    Advances in single-cell and mini-metagenome sequencing have enabled important investigations into uncultured bacteria. In this study, we applied the mini-metagenome sequencing method to assemble genome drafts of the uncultured causative agents of epitheliocystis, an emerging infectious disease in the Mediterranean aquaculture species gilthead seabream. We sequenced multiple cyst samples and constructed 11 genome drafts from a novel beta-proteobacterial lineage, Candidatus Ichthyocystis. The draft genomes demonstrate features typical of pathogenic bacteria with an obligate intracellular lifestyle: a reduced genome of up to 2.6 Mb, reduced G + C content, and reduced metabolic capacity. Reconstruction of metabolic pathways reveals that Ca. Ichthyocystis genomes lack all amino acid synthesis pathways, compelling them to scavenge from the fish host. All genomes encode type II, III, and IV secretion systems, a large repertoire of predicted effectors, and a type IV pilus. These are all considered to be virulence factors, required for adherence, invasion, and host manipulation. However, no evidence of lipopolysaccharide synthesis could be found. Beyond the core functions shared within the genus, alignments showed distinction into different species, characterized by alternative large gene families. These comprise up to a third of each genome, appear to have arisen through duplication and diversification, encode many effector proteins, and are seemingly critical for virulence. Thus, Ca. Ichthyocystis represents a novel obligatory intracellular pathogenic beta-proteobacterial lineage. The methods used: mini-metagenome analysis and manual annotation, have generated important insights into the lifestyle and evolution of the novel, uncultured pathogens, elucidating many putative virulence factors including an unprecedented array of novel gene families. PMID:27190004

  6. A Metagenomic Survey of Serpentinites and Nearby Soils in Taiwan

    NASA Astrophysics Data System (ADS)

    Li, K. Y.; Hsu, Y. W.; Chen, Y. W.; Huang, T. Y.; Shih, Y. J.; Chen, J. S.; Hsu, B. M.

    2016-12-01

    The serpentinite of Taiwan originated in the subduction zone of the Eurasian plate and the Philippine Sea plate. Many small bodies of serpentinite are scattered around the East Rift Valley, which is also one of the major agricultural areas in Taiwan. Since microbial communities play a role in both weathering and soil recovery, uncovering the microbial compositions of serpentinites and surrounding soils may help clarify the roles of microorganisms on serpentinites during natural weathering. In this study, microorganisms growing on the surface of serpentinites, in the surrounding soil, and in agricultural soils miles away horizontally from serpentinite were collected. Next generation sequencing (NGS) was carried out to examine the metagenomes of the uncultured microbial communities in these samples. The sequences were further clustered into operational taxonomic units (OTUs) to analyze relative abundance, heatmaps of OTUs, and principal coordinates analysis (PCoA). Our data revealed that the different types of geographic material had their own distinct microbial community structures. In serpentinites, the heatmaps based on the phylogenetic pattern showed that the OTU distributions were similar in the phyla Bacteroidetes, Cyanobacteria, Proteobacteria, Verrucomicrobia, and WPS-1/WPS-2. On the other hand, the heatmaps of agricultural soils showed that the OTU distributions in the phyla Chloroflexi, Acidobacteria, Actinobacteria, WPS-1/WPS-2, and Proteobacteria were similar. In soil near the serpentinite, some clusters of OTUs in the phyla Bacteroidetes, Cyanobacteria, and WPS-1/WPS-2 had disappeared. Our data provide evidence regarding the kinetic evolution of microbial communities in different geographic materials.
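
    The relative-abundance step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which dissimilarity measure fed the PCoA, so the Bray-Curtis function below is an assumption chosen only because it is a common input to PCoA; the OTU names and counts are made up.

```python
def relative_abundance(counts):
    """Convert raw OTU read counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: n / total for otu, n in counts.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    otus = set(p) | set(q)
    num = sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in otus)
    den = sum(p.get(o, 0.0) + q.get(o, 0.0) for o in otus)
    return num / den if den else 0.0


# Hypothetical counts, for illustration only (not data from the study).
serpentinite = relative_abundance({"OTU1": 30, "OTU2": 10})
farm_soil = relative_abundance({"OTU2": 10, "OTU3": 30})
distance = bray_curtis(serpentinite, farm_soil)
```

    A matrix of such pairwise distances across all samples is what an ordination method such as PCoA would take as input.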

  7. An evaluation of the accuracy and speed of metagenome analysis tools

    PubMed Central

    Lindgreen, Stinus; Adair, Karen L.; Gardner, Paul P.

    2016-01-01

    Metagenome studies are becoming increasingly widespread, yielding important insights into microbial communities covering diverse environments from terrestrial and aquatic ecosystems to human skin and gut. With the advent of high-throughput sequencing platforms, the use of large scale shotgun sequencing approaches is now commonplace. However, a thorough independent benchmark comparing state-of-the-art metagenome analysis tools is lacking. Here, we present a benchmark where the most widely used tools are tested on complex, realistic data sets. Our results clearly show that the most widely used tools are not necessarily the most accurate, that the most accurate tool is not necessarily the most time consuming, and that there is a high degree of variability between available tools. These findings are important as the conclusions of any metagenomics study are affected by errors in the predicted community composition and functional capacity. Data sets and results are freely available from http://www.ucbioinformatics.org/metabenchmark.html PMID:26778510

  8. FY11 Report on Metagenome Analysis using Pathogen Marker Libraries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, Shea N.; Allen, Jonathan E.; McLoughlin, Kevin S.

    2011-06-02

    A method, sequence library, and software suite was invented to rapidly assess whether any member of a pre-specified list of threat organisms or their near neighbors is present in a metagenome. The system was designed to handle mega- to giga-bases of FASTA-formatted raw sequence reads from short or long read next generation sequencing platforms. The approach is to pre-calculate a viral and a bacterial "Pathogen Marker Library" (PML) containing sub-sequences specific to pathogens or their near neighbors. A list of expected matches comparing every bacterial or viral genome against the PML sequences is also pre-calculated. To analyze a metagenome, reads are compared to the PML, and observed PML-metagenome matches are compared to the expected PML-genome matches, and the ratio of observed relative to expected matches is reported. In other words, a 3-way comparison among the PML, metagenome, and existing genome sequences is used to quickly assess which (if any) species included in the PML is likely to be present in the metagenome, based on available sequence data. Our tests showed that the species with the most PML matches correctly indicated the organism sequenced for empirical metagenomes consisting of a cultured, relatively pure isolate. These runs completed in 1 minute to 3 hours on 12 CPU (1 thread/CPU), depending on the metagenome and PML. Using more threads on the same number of CPU resulted in speed improvements roughly proportional to the number of threads. Simulations indicated that detection sensitivity depends on both sequencing coverage levels for a species and the size of the PML: species were correctly detected even at ~0.003x coverage by the large PMLs, and at ~0.03x coverage by the smaller PMLs. Matches to true positive species were 3-4 orders of magnitude higher than to false positives. Simulations with short reads (36 nt and ~260 nt) showed that species were usually detected for metagenome coverage above 0.005x and coverage in the PML above 0.05x, and detection probability appears to be a function of both coverages. Multiple species could be detected simultaneously in a simulated low-coverage, complex metagenome, and the largest PML gave no false negative species and no false positive genera. The presence of multiple species was predicted in a complex metagenome from a human gut microbiome with 1.9 GB of short reads (75 nt); the species predicted were reasonable gut flora and no biothreat agents were detected, showing the feasibility of PML analysis of empirical complex metagenomes.
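
    The observed-to-expected ratio described in this report can be sketched as follows. This is not the report's software: the function names, the dictionary layout, and the counts are all hypothetical, and the sketch assumes the per-species match counts have already been produced by some read-versus-PML comparison.

```python
def observed_to_expected(observed, expected):
    """Return {species: observed / expected} for species with expected > 0.

    observed: PML-metagenome match counts per species (hypothetical input).
    expected: pre-calculated PML-genome match counts per species.
    """
    ratios = {}
    for species, exp_count in expected.items():
        if exp_count <= 0:
            continue  # ratio is undefined without expected matches
        ratios[species] = observed.get(species, 0) / exp_count
    return ratios


def rank_species(ratios):
    """Sort species from highest to lowest observed/expected ratio."""
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)


# Made-up counts, for illustration only.
expected = {"species_A": 200, "species_B": 150, "species_C": 0}
observed = {"species_A": 50, "species_B": 3}
ratios = observed_to_expected(observed, expected)
ranking = rank_species(ratios)
```

    Ranking by ratio rather than by raw match count is one way to avoid favoring species that simply have more markers in the library; whether the report ranks this way is not stated in the abstract.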

  9. Metagenomic studies of the Red Sea.

    PubMed

    Behzad, Hayedeh; Ibarra, Martin Augusto; Mineta, Katsuhiko; Gojobori, Takashi

    2016-02-01

    Metagenomics has significantly advanced the field of marine microbial ecology, revealing the vast diversity of previously unknown microbial life forms in different marine niches. The tremendous amount of data generated has enabled identification of a large number of microbial genes (metagenomes), their community interactions, adaptation mechanisms, and their potential applications in pharmaceutical and biotechnology-based industries. Comparative metagenomics reveals that microbial diversity is a function of the local environment, meaning that unique or unusual environments typically harbor novel microbial species with unique genes and metabolic pathways. The Red Sea has an abundance of unique characteristics; however, its microbiota is one of the least studied among marine environments. The Red Sea harbors approximately 25 hot anoxic brine pools, plus a vibrant coral reef ecosystem. Physiochemical studies describe the Red Sea as an oligotrophic environment that contains one of the warmest and saltiest waters in the world with year-round high UV radiations. These characteristics are believed to have shaped the evolution of microbial communities in the Red Sea. Over-representation of genes involved in DNA repair, high-intensity light responses, and osmoregulation were found in the Red Sea metagenomic databases suggesting acquisition of specific environmental adaptation by the Red Sea microbiota. The Red Sea brine pools harbor a diverse range of halophilic and thermophilic bacterial and archaeal communities, which are potential sources of enzymes for pharmaceutical and biotechnology-based application. Understanding the mechanisms of these adaptations and their function within the larger ecosystem could also prove useful in light of predicted global warming scenarios where global ocean temperatures are expected to rise by 1-3°C in the next few decades. 
In this review, we provide an overview of the published metagenomic studies that were conducted in the Red Sea, and the bio-prospecting potential of the Red Sea microbiota. Furthermore, we discuss the limitations of the previous studies and the need for generating a large and representative metagenomic database of the Red Sea to help establish a dynamic model of the Red Sea microbiota. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. Mining virulence genes using metagenomics.

    PubMed

    Belda-Ferre, Pedro; Cabrera-Rubio, Raúl; Moya, Andrés; Mira, Alex

    2011-01-01

    When a bacterial genome is compared to the metagenome of an environment it inhabits, most genes recruit at high sequence identity. In free-living bacteria (for instance marine bacteria compared against the ocean metagenome) certain genomic regions are totally absent in recruitment plots, representing therefore genes unique to individual bacterial isolates. We show that these Metagenomic Islands (MIs) are also visible in bacteria living in human hosts when their genomes are compared to sequences from the human microbiome, despite the compartmentalized structure of human-related environments such as the gut. From an applied point of view, MIs of human pathogens (e.g. those identified in enterohaemorragic Escherichia coli against the gut metagenome or in pathogenic Neisseria meningitidis against the oral metagenome) include virulence genes that appear to be absent in related strains or species present in the microbiome of healthy individuals. We propose that this strategy (i.e. recruitment analysis of pathogenic bacteria against the metagenome of healthy subjects) can be used to detect pathogenicity regions in species where the genes involved in virulence are poorly characterized. Using this approach, we detect well-known pathogenicity islands and identify new potential virulence genes in several human pathogens.

  11. Translational metagenomics and the human resistome: confronting the menace of the new millennium.

    PubMed

    Willmann, Matthias; Peter, Silke

    2017-01-01

    The increasing threat of antimicrobial resistance poses one of the greatest challenges to modern medicine. The collection of all antimicrobial resistance genes carried by various microorganisms in the human body is called the human resistome and represents the source of resistance in pathogens that can eventually cause life-threatening and untreatable infections. A deep understanding of the human resistome and its multilateral interaction with various environments is necessary for developing proper measures that can efficiently reduce the spread of resistance. However, the human resistome and its evolution still remain, for the most part, a mystery to researchers. Metagenomics, particularly in combination with next-generation-sequencing technology, provides a powerful methodological approach for studying the human microbiome as well as the pathogenome, the virolume and especially the resistome. We summarize below current knowledge on how the human resistome is shaped and discuss how metagenomics can be employed to improve our understanding of these complex processes, particularly as regards a rapid translation of new findings into clinical diagnostics, infection control and public health.

  12. Functional metagenomic selection of RubisCOs from uncultivated bacteria

    USGS Publications Warehouse

    Varaljay, Vanessa A; Satagopan, Sriram; North, Justin A.; Witteveen, Briana; Dourado, Manuella N.; Anantharaman, Karthik; Arbing, Mark A.; McCann, Shelley; Oremland, Ronald S.; Banfield, Jillian F.; Wrighton, Kelly C.; Tabita, F. Robert

    2016-01-01

    Ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO) is a critical yet severely inefficient enzyme that catalyses the fixation of virtually all of the carbon found on Earth. Here, we report a functional metagenomic selection that recovers physiologically active RubisCO molecules directly from uncultivated and largely unknown members of natural microbial communities. Selection is based on CO2-dependent growth in a host strain capable of expressing environmental deoxyribonucleic acid (DNA), precluding the need for pure cultures or screening of recombinant clones for enzymatic activity. Seventeen functional RubisCO-encoded sequences were selected using DNA extracted from soil and river autotrophic enrichments, a photosynthetic biofilm and a subsurface groundwater aquifer. Notably, three related form II RubisCOs were recovered which share high sequence similarity with metagenomic scaffolds from uncultivated members of the Gallionellaceae family. One of the Gallionellaceae RubisCOs was purified and shown to possess CO2/O2 specificity typical of form II enzymes. X-ray crystallography determined that this enzyme is a hexamer, only the second form II multimer ever solved and the first RubisCO structure obtained from an uncultivated bacterium. Functional metagenomic selection leverages natural biological diversity and billions of years of evolution inherent in environmental communities, providing a new window into the discovery of CO2-fixing enzymes not previously characterized.

  13. Metagenomic Characterization and Biochemical Analysis of Cellulose-Degrading Bacterial Communities from Sheep Rumen, Termite Hindgut, Decaying Plant Materials, and Soil

    DTIC Science & Technology

    2016-01-04

    Biochemical Analysis of Cellulose-Degrading Bacterial Communities from Sheep Rumen, Termite Hindgut, Decaying Plant Materials, and Soil In an effort to...degrading bacteria from various samples, including termite gut, sheep rumen, soil, and decaying plant materials. Using selective media culture with...Metagenomic Characterization and Biochemical Analysis of Cellulose-Degrading Bacterial Communities from Sheep Rumen, Termite Hindgut, Decaying Plant

  14. Convergent Evolution of Rumen Microbiomes in High-Altitude Mammals.

    PubMed

    Zhang, Zhigang; Xu, Dongming; Wang, Li; Hao, Junjun; Wang, Jinfeng; Zhou, Xin; Wang, Weiwei; Qiu, Qiang; Huang, Xiaodan; Zhou, Jianwei; Long, Ruijun; Zhao, Fangqing; Shi, Peng

    2016-07-25

    Studies of genetic adaptation, a central focus of evolutionary biology, most often focus on the host's genome and only rarely on its co-evolved microbiome. The Qinghai-Tibetan Plateau (QTP) offers one of the most extreme environments for the survival of human and other mammalian species. Yaks (Bos grunniens) and Tibetan sheep (T-sheep) (Ovis aries) have adaptations for living in this harsh high-altitude environment, where nomadic Tibetan people keep them primarily for food and livelihood [1]. Adaptive evolution affects energy-metabolism-related genes in a way that helps these ruminants live at high altitude [2, 3]. Herein, we report convergent evolution of rumen microbiomes for energy harvesting persistence in two typical high-altitude ruminants, yaks and T-sheep. Both ruminants yield significantly lower levels of methane and higher yields of volatile fatty acids (VFAs) than their low-altitude relatives, cattle (Bos taurus) and ordinary sheep (Ovis aries). Ultra-deep metagenomic sequencing reveals significant enrichment in VFA-yielding pathways of rumen microbial genes in high-altitude ruminants, whereas methanogenesis pathways show enrichment in the cattle metagenome. Analyses of RNA transcriptomes reveal significant upregulation in 36 genes associated with VFA transport and absorption in the ruminal epithelium of high-altitude ruminants. Our study provides novel insights into the contributions of microbiomes to adaptive evolution in mammals and sheds light on the biological control of greenhouse gas emissions from livestock enteric fermentation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. A novel genome signature based on inter-nucleotide distances profiles for visualization of metagenomic data

    NASA Astrophysics Data System (ADS)

    Xie, Xian-Hua; Yu, Zu-Guo; Ma, Yuan-Lin; Han, Guo-Sheng; Anh, Vo

    2017-09-01

    There has been growing interest in the visualization of metagenomic data. The present study focuses on visualizing metagenomic data using inter-nucleotide distances profiles. We first convert the fragment sequences into inter-nucleotide distances profiles. We then analyze these profiles by principal component analysis. Finally, the principal components are used to obtain a 2-D scatter plot in which fragments are grouped according to their source species. We call our method the inter-nucleotide distances profiles (INP) method. The method is evaluated on three benchmark data sets used in previously published papers. Our results demonstrate that the INP method is an efficient alternative for the visualization of metagenomic data.
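
    The first step of this method, turning a sequence fragment into a distance profile, can be sketched as below. The abstract does not define the profile, so this sketch assumes the usual reading of "inter-nucleotide distance" (the gap between a nucleotide and the next occurrence of the same nucleotide); the function name, the `max_dist` cutoff, and the per-base normalization are illustrative assumptions, not the paper's specification.

```python
def inter_nucleotide_profile(seq, max_dist=10):
    """Return a 4 * max_dist vector of distance frequencies for A, C, G, T.

    For each base, count how often consecutive occurrences of that base are
    d positions apart (1 <= d <= max_dist), then normalize the counts per base.
    Distances larger than max_dist are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    profile = []
    for base in "ACGT":
        positions = [i for i, ch in enumerate(seq) if ch == base]
        counts = [0] * max_dist
        for a, b in zip(positions, positions[1:]):
            d = b - a
            if d <= max_dist:
                counts[d - 1] += 1
        total = sum(counts)
        profile.extend(c / total if total else 0.0 for c in counts)
    return profile


# Toy fragment, for illustration only.
profile = inter_nucleotide_profile("ACGTACGTAACC", max_dist=4)
```

    Stacking one such vector per fragment gives a matrix to which principal component analysis can be applied; the PCA step itself is omitted here.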

  16. Preparation of fosmid libraries and functional metagenomic analysis of microbial community DNA.

    PubMed

    Martínez, Asunción; Osburne, Marcia S

    2013-01-01

    One of the most important challenges in contemporary microbial ecology is to assign a functional role to the large number of novel genes discovered through large-scale sequencing of natural microbial communities that lack similarity to genes of known function. Functional screening of metagenomic libraries, that is, screening environmental DNA clones for the ability to confer an activity of interest to a heterologous bacterial host, is a promising approach for bridging the gap between metagenomic DNA sequencing and functional characterization. Here, we describe methods for isolating environmental DNA and constructing metagenomic fosmid libraries, as well as methods for designing and implementing successful functional screens of such libraries. © 2013 Elsevier Inc. All rights reserved.

  17. Colonic Mucosal Microbiota in Colorectal Cancer: A Single-Center Metagenomic Study in Saudi Arabia.

    PubMed

    Alomair, Ahmed O; Masoodi, Ibrahim; Alyamani, Essam J; Allehibi, Abed A; Qutub, Adel N; Alsayari, Khalid N; Altammami, Musaad A; Alshanqeeti, Ali S

    2018-01-01

    Because genetic and geographic variations in intestinal microbiota are known to exist, the focus of this study was to establish an estimation of microbiota in colorectal cancer (CRC) patients in Saudi Arabia by means of metagenomic studies. From July 2010 to November 2012, colorectal cancer patients attending our hospital were enrolled for the metagenomic studies. All underwent clinical, endoscopic, and histological assessment. Mucosal microbiota samples were collected from each patient by jet-flushing colonic mucosa with distilled water at unified segments of the colon, followed by aspiration, during colonoscopy. Total purified dsDNA was extracted and quantified prior to metagenomic sequencing using an Illumina platform. Satisfactory DNA samples (n = 29) were subjected to metagenomics studies, followed by comprehensive comparative phylogenetic analysis. An equal number of healthy age-matched controls were also examined for colonic mucosal microbiota. Metagenomics data on 29 patients (14 females) in the age range 38-77 years were analyzed. The majority 11 (37%) of our patients were overweight (BMI = 25-30). Rectal bleeding was the presenting symptom in 18/29 (62%), while symptomatic anemia was the presenting symptom in 11/29 (37%). The location of colon cancer was rectal in 14 (48%), while cecal growth was observed in 8 (27%). Hepatic flexure growth was found in 1 (3%), descending colonic growth was found in 2 (6%), and 4 (13%) patients had transverse colon growth. The metagenomics analysis was carried out, and a total of 3.58G reads were sequenced, and about 321.91G data were used in the analysis. This study identified 11 genera specific to colorectal cancer patients when compared to genera in the control group. Bacteroides fragilis and Fusobacterium were found to be significantly prevalent in the carcinoma group when compared to the control group. 
The current study has given an insight into the microbiota of colorectal cancer patients in Saudi Arabia and has identified various genera significantly present in these patients when compared to those of the control group.

  18. Critical Assessment of Metagenome Interpretation – a benchmark of computational metagenomics software

    PubMed Central

    Sczyrba, Alexander; Hofmann, Peter; Belmann, Peter; Koslicki, David; Janssen, Stefan; Dröge, Johannes; Gregor, Ivan; Majda, Stephan; Fiedler, Jessika; Dahms, Eik; Bremges, Andreas; Fritz, Adrian; Garrido-Oter, Ruben; Jørgensen, Tue Sparholt; Shapiro, Nicole; Blood, Philip D.; Gurevich, Alexey; Bai, Yang; Turaev, Dmitrij; DeMaere, Matthew Z.; Chikhi, Rayan; Nagarajan, Niranjan; Quince, Christopher; Meyer, Fernando; Balvočiūtė, Monika; Hansen, Lars Hestbjerg; Sørensen, Søren J.; Chia, Burton K. H.; Denis, Bertrand; Froula, Jeff L.; Wang, Zhong; Egan, Robert; Kang, Dongwan Don; Cook, Jeffrey J.; Deltel, Charles; Beckstette, Michael; Lemaitre, Claire; Peterlongo, Pierre; Rizk, Guillaume; Lavenier, Dominique; Wu, Yu-Wei; Singer, Steven W.; Jain, Chirag; Strous, Marc; Klingenberg, Heiner; Meinicke, Peter; Barton, Michael; Lingner, Thomas; Lin, Hsin-Hung; Liao, Yu-Chieh; Silva, Genivaldo Gueiros Z.; Cuevas, Daniel A.; Edwards, Robert A.; Saha, Surya; Piro, Vitor C.; Renard, Bernhard Y.; Pop, Mihai; Klenk, Hans-Peter; Göker, Markus; Kyrpides, Nikos C.; Woyke, Tanja; Vorholt, Julia A.; Schulze-Lefert, Paul; Rubin, Edward M.; Darling, Aaron E.; Rattei, Thomas; McHardy, Alice C.

    2018-01-01

    In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions. PMID:28967888

  19. Endosymbiont hunting in the metagenome of Asian citrus psyllid (Diaphorina citri) (7th Annual SFAF Meeting, 2012)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saha, Surya

    Surya Saha on "Endosymbiont hunting in the metagenome of Asian citrus psyllid (Diaphorina citri)" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  20. Endosymbiont hunting in the metagenome of Asian citrus psyllid (Diaphorina citri) (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    Saha, Surya [Cornell University]

    2017-12-09

    Surya Saha on "Endosymbiont hunting in the metagenome of Asian citrus psyllid (Diaphorina citri)" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  1. Metagenomics: Probing pollutant fate in natural and engineered ecosystems.

    PubMed

    Bouhajja, Emna; Agathos, Spiros N; George, Isabelle F

    2016-12-01

    Polluted environments are a reservoir of microbial species able to degrade or to convert pollutants to harmless compounds. The proper management of microbial resources requires a comprehensive characterization of their genetic pool to assess the fate of contaminants and increase the efficiency of bioremediation processes. Metagenomics offers appropriate tools to describe microbial communities in their whole complexity without lab-based cultivation of individual strains. After a decade of use of metagenomics to study microbiomes, the scientific community has made significant progress in this field. In this review, we survey the main steps of metagenomics applied to environments contaminated with organic compounds or heavy metals. We emphasize technical solutions proposed to overcome encountered obstacles. We then compare two metagenomic approaches, i.e. library-based targeted metagenomics and direct sequencing of metagenomes. In the former, environmental DNA is cloned inside a host, and then clones of interest are selected based on (i) their expression of biodegradative functions or (ii) sequence homology with probes and primers designed from relevant, already known sequences. The highest score for the discovery of novel genes and degradation pathways has been achieved so far by functional screening of large clone libraries. On the other hand, direct sequencing of metagenomes without a cloning step has been more often applied to polluted environments for characterization of the taxonomic and functional composition of microbial communities and their dynamics. In this case, the analysis has focused on 16S rRNA genes and marker genes of biodegradation. Advances in next generation sequencing and in bioinformatic analysis of sequencing data have opened up new opportunities for assessing the potential of biodegradation by microbes, but annotation of collected genes is still hampered by a limited number of available reference sequences in databases. 
Although metagenomics is still facing technical and computational challenges, our review of the recent literature highlights its value as an aid to efficiently monitor the clean-up of contaminated environments and develop successful strategies to mitigate the impact of pollutants on ecosystems. Copyright © 2016 Elsevier Inc. All rights reserved.

  2. The MG-RAST Metagenomics Database and Portal in 2015

    DOE PAGES

    Wilke, Andreas; Bischof, Jared; Gerlach, Wolfgang; ...

    2015-12-09

    MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. Currently, the system hosts over 200 000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels. Lastly, to show possible uses for the data from MG-RAST, we present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools.

  3. Metagenomic analysis reveals a green sulfur bacterium as a potential coral symbiont.

    PubMed

    Cai, Lin; Zhou, Guowei; Tian, Ren-Mao; Tong, Haoya; Zhang, Weipeng; Sun, Jin; Ding, Wei; Wong, Yue Him; Xie, James Y; Qiu, Jian-Wen; Liu, Sheng; Huang, Hui; Qian, Pei-Yuan

    2017-08-24

    Coral reefs are ecologically significant habitats. Coral-algal symbiosis confers ecological success on coral reefs and coral-microbial symbiosis is also vital to coral reefs. However, current understanding of coral-microbial symbiosis on a genomic scale is largely unknown. Here we report a potential microbial symbiont in corals revealed by metagenomics-based genomic study. Microbial cells in coral were enriched for metagenomic analysis and a high-quality draft genome of "Candidatus Prosthecochloris korallensis" was recovered by metagenome assembly and genome binning. Phylogenetic analysis shows "Ca. P. korallensis" belongs to the Prosthecochloris clade and is clustered with two Prosthecochloris clones derived from Caribbean corals. Genomic analysis reveals "Ca. P. korallensis" has potentially important ecological functions including anoxygenic photosynthesis, carbon fixation via the reductive tricarboxylic acid (rTCA) cycle, nitrogen fixation, and sulfur oxidization. Core metabolic pathway analysis suggests "Ca. P. korallensis" is a green sulfur bacterium capable of photoautotrophy or mixotrophy. Potential host-microbial interaction reveals a symbiotic relationship: "Ca. P. korallensis" might provide organic and nitrogenous nutrients to its host and detoxify sulfide for the host; the host might provide "Ca. P. korallensis" with an anaerobic environment for survival, carbon dioxide and acetate for growth, and hydrogen sulfide as an electron donor for photosynthesis.

  4. Molecular analysis of the bacterial microbiome in the forestomach fluid from the dromedary camel (Camelus dromedarius).

    PubMed

    Bhatt, Vaibhav D; Dande, Suchitra S; Patil, Nitin V; Joshi, Chaitanya G

    2013-04-01

    Rumen microorganisms play an important role in ruminant digestion and absorption of nutrients and have great potential applications in the field of rumen adjusting, food fermentation and biomass utilization. In order to investigate the composition of microorganisms in the rumen of camel (Camelus dromedarius), this study examines the microbial diversity using a culture-independent approach. It includes comparison of rumen samples investigated in the present study to other currently available metagenomes to reveal potential differences in rumen microbial systems. Pyrosequencing-based metagenomics was applied to analyze phylogenetic and metabolic profiles by MG-RAST, a web-based tool. Pyrosequencing of the camel rumen sample yielded 8,979,755 nucleotides assembled to 41,905 sequence reads with an average read length of 214 nucleotides. Taxonomic analysis of metagenomic reads indicated Bacteroidetes (55.5 %), Firmicutes (22.7 %) and Proteobacteria (9.2 %) phyla as predominant camel rumen taxa. At a finer phylogenetic resolution, Bacteroides species dominated the camel rumen metagenome. Functional analysis revealed that clustering-based subsystem and carbohydrate metabolism were the most abundant SEED subsystems, representing 17 and 13 % of the camel metagenome, respectively. A high taxonomic and functional similarity of camel rumen was found with the cow metagenome, which is not surprising given that both are mammalian herbivores with similar digestive tract structures and functions. The combined pyrosequencing approach and subsystems-based annotations available in the SEED database allowed us to assess the metabolic potential of these microbiomes. Altogether, these data suggest that agricultural and animal husbandry practices can impose significant selective pressures on the rumen microbiota regardless of rumen type. The present study provides a baseline for understanding the complexity of camel rumen microbial ecology while also highlighting striking similarities and differences when compared to other animal gastrointestinal environments.

  5. An integrated metagenome and -proteome analysis of the microbial community residing in a biogas production plant.

    PubMed

    Ortseifen, Vera; Stolze, Yvonne; Maus, Irena; Sczyrba, Alexander; Bremges, Andreas; Albaum, Stefan P; Jaenicke, Sebastian; Fracowiak, Jochen; Pühler, Alfred; Schlüter, Andreas

    2016-08-10

    To study the metaproteome of a biogas-producing microbial community, fermentation samples were taken from an agricultural biogas plant for microbial cell and protein extraction and corresponding metagenome analyses. Based on metagenome sequence data, taxonomic community profiling was performed to elucidate the composition of bacterial and archaeal sub-communities. The community's cytosolic metaproteome was represented in a 2D-PAGE approach. Metaproteome databases for protein identification were compiled based on the assembled metagenome sequence dataset for the biogas plant analyzed and non-corresponding biogas metagenomes. Protein identification results revealed that the corresponding biogas protein database facilitated the highest identification rate followed by other biogas-specific databases, whereas common public databases yielded insufficient identification rates. Proteins of the biogas microbiome identified as highly abundant were assigned to the pathways involved in methanogenesis, transport and carbon metabolism. Moreover, the integrated metagenome/-proteome approach enabled the examination of genetic-context information for genes encoding identified proteins by studying neighboring genes on the corresponding contig. Exemplarily, this approach led to the identification of a Methanoculleus sp. contig encoding 16 methanogenesis-related gene products, three of which were also detected as abundant proteins within the community's metaproteome. Thus, metagenome contigs provide additional information on the genetic environment of identified abundant proteins. Copyright © 2016 Elsevier B.V. All rights reserved.

  6. Life in Oligotrophic Desert Environments: Contrasting Taxonomic and Functional Diversity of Two Microbial Mats with Metagenomics

    NASA Astrophysics Data System (ADS)

    Bonilla-Rosso, G.; Peimbert, M.; Olmedo, G.; Alcaraz, L. D.; Eguiarte, L. E.; Souza, V.

    2010-04-01

    The metagenomic analysis of two microbial mats from the oligotrophic waters in the Cuatro Ciénegas basin reveals large differences at both the taxonomic and functional levels. These are explained in terms of environmental stability and nutrient availability.

  7. MetaDP: a comprehensive web server for disease prediction of 16S rRNA metagenomic datasets.

    PubMed

    Xu, Xilin; Wu, Aiping; Zhang, Xinlei; Su, Mingming; Jiang, Taijiao; Yuan, Zhe-Ming

    2016-01-01

    High-throughput sequencing-based metagenomics has garnered considerable interest in recent years. Numerous methods and tools have been developed for the analysis of metagenomic data. However, it is still a daunting task to install a large number of tools and complete a complicated analysis, especially for researchers with minimal bioinformatics backgrounds. To address this problem, we constructed an automated software platform named MetaDP for 16S rRNA sequencing data analysis, including data quality control, operational taxonomic unit clustering, diversity analysis, and disease risk prediction modeling. Furthermore, a support vector machine-based prediction model for irritable bowel syndrome (IBS) was built by applying MetaDP to microbial 16S sequencing data from 108 children. The success of the IBS prediction model suggests that the platform may also be applied to other diseases related to gut microbes, such as obesity, metabolic syndrome, or intestinal cancer, among others (http://metadp.cn:7001/).

  8. Comparison of normalization methods for the analysis of metagenomic gene abundance data.

    PubMed

    Pereira, Mariana Buongermino; Wallroth, Mikael; Jonsson, Viktor; Kristiansson, Erik

    2018-04-20

    In shotgun metagenomics, microbial communities are studied through direct sequencing of DNA without any prior cultivation. By comparing gene abundances estimated from the generated sequencing reads, functional differences between the communities can be identified. However, gene abundance data is affected by high levels of systematic variability, which can greatly reduce the statistical power and introduce false positives. Normalization, which is the process where systematic variability is identified and removed, is therefore a vital part of the data analysis. A wide range of normalization methods for high-dimensional count data has been proposed but their performance on the analysis of shotgun metagenomic data has not been evaluated. Here, we present a systematic evaluation of nine normalization methods for gene abundance data. The methods were evaluated through resampling of three comprehensive datasets, creating a realistic setting that preserved the unique characteristics of metagenomic data. Performance was measured in terms of the methods' ability to identify differentially abundant genes (DAGs), correctly calculate unbiased p-values and control the false discovery rate (FDR). Our results showed that the choice of normalization method has a large impact on the end results. When the DAGs were asymmetrically present between the experimental conditions, many normalization methods had a reduced true positive rate (TPR) and a high false positive rate (FPR). The methods trimmed mean of M-values (TMM) and relative log expression (RLE) had the overall highest performance and are therefore recommended for the analysis of gene abundance data. For larger sample sizes, CSS also showed satisfactory performance. This study emphasizes the importance of selecting a suitable normalization method in the analysis of data from shotgun metagenomics. Our results also demonstrate that improper methods may result in unacceptably high levels of false positives, which in turn may lead to incorrect or obfuscated biological interpretation.
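
    The abstract names relative log expression (RLE) as one of the recommended methods but gives no formula. The following is a minimal sketch of RLE scaling as it is commonly described (a per-sample size factor taken as the median ratio of each gene's count to that gene's geometric mean across samples). It is not the authors' evaluation code; the function names and the toy count matrix are illustrative assumptions.

```python
import math
from statistics import median


def rle_size_factors(counts):
    """Return one RLE size factor per sample.

    counts: list of genes, each a list of per-sample read counts
            (hypothetical layout chosen for this sketch).
    """
    n_samples = len(counts[0])
    # Only genes observed in every sample are used, so every log is defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has a nonzero count in every sample")
    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def rle_normalize(counts):
    """Divide every count by the size factor of its sample."""
    factors = rle_size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


if __name__ == "__main__":
    # Toy data: sample 2 was sequenced twice as deeply as sample 1.
    toy = [[10, 20], [100, 200], [5, 10], [0, 3]]
    print(rle_size_factors(toy))
    print(rle_normalize(toy))
```

    In the toy matrix the second sample has exactly twice the counts of the first for every fully observed gene, so the two size factors differ by a factor of two and the normalized counts for those genes coincide. Genes with a zero in any sample are excluded from the size-factor estimate but are still rescaled.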

  9. FMAP: Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies.

    PubMed

    Kim, Jiwoong; Kim, Min Soo; Koh, Andrew Y; Xie, Yang; Zhan, Xiaowei

    2016-10-10

    Given the lack of a complete and comprehensive library of microbial reference genomes, determining the functional profile of diverse microbial communities is challenging. The available functional analysis pipelines lack several key features: (i) an integrated alignment tool, (ii) operon-level analysis, and (iii) the ability to process large datasets. Here we introduce our open-sourced, stand-alone functional analysis pipeline for analyzing whole metagenomic and metatranscriptomic sequencing data, FMAP (Functional Mapping and Analysis Pipeline). FMAP performs alignment, gene family abundance calculations, and statistical analysis (three levels of analyses are provided: differentially-abundant genes, operons and pathways). The resulting output can be easily visualized with heatmaps and functional pathway diagrams. FMAP functional predictions are consistent with currently available functional analysis pipelines. FMAP is a comprehensive tool for providing functional analysis of metagenomic/metatranscriptomic sequencing data. With the added features of integrated alignment, operon-level analysis, and the ability to process large datasets, FMAP will be a valuable addition to the currently available functional analysis toolbox. We believe that this software will be of great value to the wider biology and bioinformatics communities.

  10. Identification and initial characterization of a novel turkey-origin picobirnavirus using a metagenomic approach

    USDA-ARS's Scientific Manuscript database

    Using the Genome Sequencer FLX Titanium technology (Roche, 454 Life Sciences), a ribonucleic acid (RNA) virus-specific metagenome was prepared using the pooled intestinal contents collected from North Carolina turkey flocks experiencing enteric disease signs. This analysis produced 6526 contigs rang...

  11. Metagenome Analyses of Corroded Concrete Wastewater Pipe Biofilms Reveals a Complex Microbial System

    EPA Science Inventory

    Analysis of whole-metagenome pyrosequencing data and 16S rRNA gene clone libraries was used to determine microbial composition and functional genes associated with biomass harvested from crown (top) and invert (bottom) sections of a corroded wastewater pipe. Taxonomic and functio...

  12. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach

    DOE PAGES

    Musumeci, Matias A.; Lozada, Mariana; Rial, Daniela V.; ...

    2017-04-09

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. As a result, this work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.

  13. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Musumeci, Matias A.; Lozada, Mariana; Rial, Daniela V.

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. As a result, this work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.

  14. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach.

    PubMed

    Musumeci, Matías A; Lozada, Mariana; Rial, Daniela V; Mac Cormack, Walter P; Jansson, Janet K; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M

    2017-04-09

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.

  15. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach

    PubMed Central

    Musumeci, Matías A.; Lozada, Mariana; Rial, Daniela V.; Mac Cormack, Walter P.; Jansson, Janet K.; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M.

    2017-01-01

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer–Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments. PMID:28397770

  16. Integrated Metagenomic and Metatranscriptomic Analyses of Microbial Communities in the Meso- and Bathypelagic Realm of North Pacific Ocean

    PubMed Central

    Wu, Jieying; Gao, Weimin; Johnson, Roger H.; Zhang, Weiwen; Meldrum, Deirdre R.

    2013-01-01

    Although emerging evidence indicates that deep-sea water contains an untapped reservoir of high metabolic and genetic diversity, this realm has not been studied well compared with surface sea water. The study provided the first integrated meta-genomic and -transcriptomic analysis of the microbial communities in deep-sea water of the North Pacific Ocean. DNA/RNA amplifications and simultaneous metagenomic and metatranscriptomic analyses were employed to discover information concerning deep-sea microbial communities from four different deep-sea sites ranging from the mesopelagic to pelagic ocean. Within the prokaryotic community, bacteria are overwhelmingly dominant (~90%) over archaea in both metagenomic and metatranscriptomic data pools. The emergence of archaeal phyla Crenarchaeota, Euryarchaeota, Thaumarchaeota, bacterial phyla Actinobacteria, Firmicutes, sub-phyla Betaproteobacteria, Deltaproteobacteria, and Gammaproteobacteria, and the decrease of bacterial phyla Bacteroidetes and Alphaproteobacteria are the main composition changes of prokaryotic communities in the deep-sea water, when compared with the reference Global Ocean Sampling Expedition (GOS) surface water. Photosynthetic Cyanobacteria exist in all four metagenomic libraries and two metatranscriptomic libraries. In the eukaryotic community, decreased abundance of fungi and algae in the deep sea was observed. The RNA/DNA ratio was employed as an index of the metabolic activity of microbes in the deep sea. Functional analysis indicated that deep-sea microbes are leading a defensive lifestyle. PMID:24152557
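
    The abstract does not say how the RNA/DNA ratio was calculated. One plausible reading, sketched below purely as an assumption, is to convert per-taxon read counts from the metatranscriptome and the metagenome into relative abundances and divide the former by the latter, so that values above 1 suggest a taxon is more active than its abundance alone would predict. The function names and the example counts are hypothetical.

```python
def relative_abundance(read_counts):
    """Convert {taxon: read count} into {taxon: fraction of all reads}."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("library contains no assigned reads")
    return {taxon: n / total for taxon, n in read_counts.items()}


def rna_dna_ratio(dna_counts, rna_counts):
    """Per-taxon ratio of RNA relative abundance to DNA relative abundance.

    Taxa absent from the DNA library are skipped because the ratio
    would be undefined for them.
    """
    dna = relative_abundance(dna_counts)
    rna = relative_abundance(rna_counts)
    return {
        taxon: rna.get(taxon, 0.0) / frac
        for taxon, frac in dna.items()
        if frac > 0
    }


if __name__ == "__main__":
    # Hypothetical read counts for two taxa in one sample.
    dna_reads = {"Gammaproteobacteria": 50, "Thaumarchaeota": 50}
    rna_reads = {"Gammaproteobacteria": 75, "Thaumarchaeota": 25}
    print(rna_dna_ratio(dna_reads, rna_reads))
```

    Normalizing each library to fractions first keeps the ratio independent of how deeply the DNA and RNA libraries happened to be sequenced.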

  17. Metagenomic analysis of bacterial and archaeal assemblages in the soil-mousse surrounding a geothermal spring.

    PubMed

    Bhatia, Sonu; Batra, Navneet; Pathak, Ashish; Joshi, Amit; Souza, Leila; Almeida, Paulo; Chauhan, Ashvini

    2015-09-01

    The soil-mousse surrounding a geothermal spring was analyzed for bacterial and archaeal diversity using 16S rRNA gene amplicon metagenomic sequencing which revealed the presence of 18 bacterial phyla distributed across 109 families and 219 genera. Firmicutes, Actinobacteria, and the Deinococcus-Thermus group were the predominant bacterial assemblages with Crenarchaeota and Thaumarchaeota as the main archaeal assemblages in this largely understudied geothermal habitat. Several metagenome sequences remained taxonomically unassigned suggesting the presence of a repertoire of hitherto undescribed microbes in this geothermal soil-mousse econiche.

  18. Metagenomics and the protein universe

    PubMed Central

    Godzik, Adam

    2011-01-01

    Metagenomics sequencing projects have dramatically increased our knowledge of the protein universe and provided over one-half of currently known protein sequences; they have also introduced a much broader phylogenetic diversity into the protein databases. The full analysis of metagenomic datasets is only beginning, but it has already led to the discovery of thousands of new protein families, likely representing novel functions specific to given environments. At the same time, a deeper analysis of such novel families, including experimental structure determination of some representatives, suggests that most of them represent distant homologs of already characterized protein families, and thus most of the protein diversity present in the new environments is due to functional divergence of the known protein families rather than the emergence of new ones. PMID:21497084

  19. GenomePeek—an online tool for prokaryotic genome and metagenome analysis

    DOE PAGES

    McNair, Katelyn; Edwards, Robert A.

    2015-06-16

    As increases in prokaryotic sequencing take place, a method to quickly and accurately analyze this data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach where reads to a set of conserved genes are extracted, assembled and then aligned against the highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.

  20. IMG/M: integrated genome and metagenome comparative data analysis system

    DOE PAGES

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; ...

    2016-10-13

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.

  1. IMG/M: integrated genome and metagenome comparative data analysis system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.

  2. IMG/M: integrated genome and metagenome comparative data analysis system

    PubMed Central

    Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Palaniappan, Krishna; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Andersen, Evan; Huntemann, Marcel; Varghese, Neha; Hadjithomas, Michalis; Tennessen, Kristin; Nielsen, Torben; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2017-01-01

    The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system. PMID:27738135

  3. Complete Genome Analysis of Thermus parvatiensis and Comparative Genomics of Thermus spp. Provide Insights into Genetic Variability and Evolution of Natural Competence as Strategic Survival Attributes

    PubMed Central

    Tripathi, Charu; Mishra, Harshita; Khurana, Himani; Dwivedi, Vatsala; Kamra, Komal; Negi, Ram K.; Lal, Rup

    2017-01-01

    Thermophilic environments represent an interesting niche. Among thermophiles, the genus Thermus is among the most studied genera. In this study, we have sequenced the genome of Thermus parvatiensis strain RL, a thermophile isolated from Himalayan hot water springs (temperature >96°C) using PacBio RSII SMRT technique. The small genome (2.01 Mbp) comprises a chromosome (1.87 Mbp) and a plasmid (143 Kbp), designated in this study as pTP143. Annotation revealed a high number of repair genes and a compact genome that nonetheless contains a highly plastic plasmid with transposases, integrases, mobile elements and hypothetical proteins (44%). We performed a comparative genomic study of the group Thermus with an aim of analysing the phylogenetic relatedness as well as niche specific attributes prevalent among the group. We compared the reference genome RL with 16 Thermus genomes to assess their phylogenetic relationships based on 16S rRNA gene sequences, average nucleotide identity (ANI), conserved marker genes (31 and 400), pan genome and tetranucleotide frequency. The core genome of the analyzed genomes contained 1,177 core genes and many singleton genes were detected in individual genomes, reflecting a conserved core but adaptive pan repertoire. We demonstrated the presence of metagenomic islands (chromosome:5, plasmid:5) by recruiting raw metagenomic data (from the same niche) against the genomic replicons of T. parvatiensis. We also dissected the CRISPR loci across all genomes and found widespread presence of this system across Thermus genomes. Additionally, we performed a comparative analysis of competence loci across Thermus genomes and found evidence for recent horizontal acquisition of the locus and continued dispersal among members, reflecting that natural competence is a beneficial survival trait among Thermus members and that its acquisition depicts unending evolution in order to accomplish optimal fitness. PMID:28798737
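
    Tetranucleotide frequency, one of the comparison criteria listed above, can be illustrated with a short sketch. This is a simplified, single-strand version written for illustration only: it is not the pipeline used in the study, published implementations often also count the reverse-complement strand and apply z-score corrections, and the function names and example sequences are assumptions.

```python
import math
from collections import Counter
from itertools import product

# All 256 possible 4-mers over the DNA alphabet, in a fixed order.
ALL_TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(sequence):
    """Return {tetramer: relative frequency} for one DNA sequence.

    Overlapping windows are counted; windows containing ambiguous
    bases such as N are skipped.
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if all(base in "ACGT" for base in kmer):
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no unambiguous tetranucleotide")
    return {kmer: counts[kmer] / total for kmer in ALL_TETRAMERS}


def profile_distance(freq_a, freq_b):
    """Euclidean distance between two tetranucleotide profiles."""
    return math.sqrt(sum((freq_a[k] - freq_b[k]) ** 2 for k in ALL_TETRAMERS))


if __name__ == "__main__":
    a = tetranucleotide_frequencies("ACGTACGTACGTTTGACA")
    b = tetranucleotide_frequencies("GGGCCCGGGCCCGGGCCC")
    print(profile_distance(a, b))
```

    Genomes with similar composition give profiles that lie close together, which is why such profiles are used alongside ANI and marker genes when assessing relatedness.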

  4. Arsenic metabolism in high altitude modern stromatolites revealed by metagenomic analysis.

    PubMed

    Kurth, Daniel; Amadio, Ariel; Ordoñez, Omar F; Albarracín, Virginia H; Gärtner, Wolfgang; Farías, María E

    2017-04-21

    Modern stromatolites thrive only in selected locations in the world. Socompa Lake, located in the Andean plateau at 3570 masl, is one of the numerous extreme Andean microbial ecosystems described over recent years. Extreme environmental conditions include hypersalinity, high UV incidence, and high arsenic content, among others. After Socompa's stromatolite microbial communities were analysed by metagenomic DNA sequencing, taxonomic classification showed dominance of Proteobacteria, Bacteroidetes and Firmicutes, and a remarkably high number of unclassified sequences. A functional analysis indicated that carbon fixation might occur not only by the Calvin-Benson cycle, but also through alternative pathways such as the reverse TCA cycle, and the reductive acetyl-CoA pathway. Deltaproteobacteria were involved both in sulfate reduction and nitrogen fixation. Significant differences were found when comparing the Socompa stromatolite metagenome to the Shark Bay (Australia) smooth mat metagenome: namely, those involving stress related processes, particularly, arsenic resistance. An in-depth analysis revealed a surprisingly diverse metabolism comprising all known types of As resistance and energy generating pathways. While the ars operon was the main mechanism, an important abundance of arsM genes was observed in selected phyla. The data resulting from this work will prove a cornerstone for further studies on this rare microbial community.

  5. Global metagenomic survey reveals a new bacterial candidate phylum in geothermal springs

    PubMed Central

    Eloe-Fadrosh, Emiley A.; Paez-Espino, David; Jarett, Jessica; Dunfield, Peter F.; Hedlund, Brian P.; Dekas, Anne E.; Grasby, Stephen E.; Brady, Allyson L.; Dong, Hailiang; Briggs, Brandon R.; Li, Wen-Jun; Goudeau, Danielle; Malmstrom, Rex; Pati, Amrita; Pett-Ridge, Jennifer; Rubin, Edward M.; Woyke, Tanja; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2016-01-01

    Analysis of the increasing wealth of metagenomic data collected from diverse environments can lead to the discovery of novel branches on the tree of life. Here we analyse 5.2 Tb of metagenomic data collected globally to discover a novel bacterial phylum ('Candidatus Kryptonia') found exclusively in high-temperature pH-neutral geothermal springs. This lineage had remained hidden as a taxonomic 'blind spot' because of mismatches in the primers commonly used for ribosomal gene surveys. Genome reconstruction from metagenomic data combined with single-cell genomics results in several high-quality genomes representing four genera from the new phylum. Metabolic reconstruction indicates a heterotrophic lifestyle with conspicuous nutritional deficiencies, suggesting the need for metabolic complementarity with other microbes. Co-occurrence patterns identify a number of putative partners, including an uncultured Armatimonadetes lineage. The discovery of Kryptonia within previously studied geothermal springs underscores the importance of globally sampled metagenomic data in detection of microbial novelty, and highlights the extraordinary diversity of microbial life still awaiting discovery. PMID:26814032

  6. Global metagenomic survey reveals a new bacterial candidate phylum in geothermal springs.

    PubMed

    Eloe-Fadrosh, Emiley A; Paez-Espino, David; Jarett, Jessica; Dunfield, Peter F; Hedlund, Brian P; Dekas, Anne E; Grasby, Stephen E; Brady, Allyson L; Dong, Hailiang; Briggs, Brandon R; Li, Wen-Jun; Goudeau, Danielle; Malmstrom, Rex; Pati, Amrita; Pett-Ridge, Jennifer; Rubin, Edward M; Woyke, Tanja; Kyrpides, Nikos C; Ivanova, Natalia N

    2016-01-27

    Analysis of the increasing wealth of metagenomic data collected from diverse environments can lead to the discovery of novel branches on the tree of life. Here we analyse 5.2 Tb of metagenomic data collected globally to discover a novel bacterial phylum ('Candidatus Kryptonia') found exclusively in high-temperature pH-neutral geothermal springs. This lineage had remained hidden as a taxonomic 'blind spot' because of mismatches in the primers commonly used for ribosomal gene surveys. Genome reconstruction from metagenomic data combined with single-cell genomics results in several high-quality genomes representing four genera from the new phylum. Metabolic reconstruction indicates a heterotrophic lifestyle with conspicuous nutritional deficiencies, suggesting the need for metabolic complementarity with other microbes. Co-occurrence patterns identify a number of putative partners, including an uncultured Armatimonadetes lineage. The discovery of Kryptonia within previously studied geothermal springs underscores the importance of globally sampled metagenomic data in detection of microbial novelty, and highlights the extraordinary diversity of microbial life still awaiting discovery.

  7. Global metagenomic survey reveals a new bacterial candidate phylum in geothermal springs

    DOE PAGES

    Eloe-Fadrosh, Emiley A.; Paez-Espino, David; Jarett, Jessica; ...

    2016-01-27

    Analysis of the increasing wealth of metagenomic data collected from diverse environments can lead to the discovery of novel branches on the tree of life. Here we analyse 5.2 Tb of metagenomic data collected globally to discover a novel bacterial phylum ('Candidatus Kryptonia') found exclusively in high-temperature pH-neutral geothermal springs. This lineage had remained hidden as a taxonomic 'blind spot' because of mismatches in the primers commonly used for ribosomal gene surveys. Genome reconstruction from metagenomic data combined with single-cell genomics results in several high-quality genomes representing four genera from the new phylum. Metabolic reconstruction indicates a heterotrophic lifestyle with conspicuous nutritional deficiencies, suggesting the need for metabolic complementarity with other microbes. Co-occurrence patterns identify a number of putative partners, including an uncultured Armatimonadetes lineage. The discovery of Kryptonia within previously studied geothermal springs underscores the importance of globally sampled metagenomic data in detection of microbial novelty, and highlights the extraordinary diversity of microbial life still awaiting discovery.

  8. An Enrichment of CRISPR and Other Defense-Related Features in Marine Sponge-Associated Microbial Metagenomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Horn, Hannes; Slaby, Beate M.; Jahn, Martin T.

    Many marine sponges are populated by dense and taxonomically diverse microbial consortia. We employed a metagenomics approach to unravel the differences in the functional gene repertoire among three Mediterranean sponge species, Petrosia ficiformis, Sarcotragus foetidus, Aplysina aerophoba and seawater. Different signatures were observed between sponge and seawater metagenomes with regard to microbial community composition, GC content, and estimated bacterial genome size. Our analysis showed further a pronounced repertoire for defense systems in sponge metagenomes. Specifically, clustered regularly interspaced short palindromic repeats, restriction modification, DNA phosphorothioation and phage growth limitation systems were enriched in sponge metagenomes. These data suggest that defense is an important functional trait for an existence within sponges that requires mechanisms to defend against foreign DNA from microorganisms and viruses. Furthermore, this study contributes to an understanding of the evolutionary arms race between viruses/phages and bacterial genomes and it sheds light on the bacterial defenses that have evolved in the context of the sponge holobiont.

  9. An Enrichment of CRISPR and Other Defense-Related Features in Marine Sponge-Associated Microbial Metagenomes

    DOE PAGES

    Horn, Hannes; Slaby, Beate M.; Jahn, Martin T.; ...

    2016-11-08

    Many marine sponges are populated by dense and taxonomically diverse microbial consortia. We employed a metagenomics approach to unravel the differences in the functional gene repertoire among three Mediterranean sponge species, Petrosia ficiformis, Sarcotragus foetidus, Aplysina aerophoba and seawater. Different signatures were observed between sponge and seawater metagenomes with regard to microbial community composition, GC content, and estimated bacterial genome size. Our analysis showed further a pronounced repertoire for defense systems in sponge metagenomes. Specifically, clustered regularly interspaced short palindromic repeats, restriction modification, DNA phosphorothioation and phage growth limitation systems were enriched in sponge metagenomes. These data suggest that defense is an important functional trait for an existence within sponges that requires mechanisms to defend against foreign DNA from microorganisms and viruses. Furthermore, this study contributes to an understanding of the evolutionary arms race between viruses/phages and bacterial genomes and it sheds light on the bacterial defenses that have evolved in the context of the sponge holobiont.

  10. Use of simulated data sets to evaluate the fidelity of metagenomic processing methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mavromatis, K; Ivanova, N; Barry, Kerrie

    2007-01-01

    Metagenomics is a rapidly emerging field of research for studying microbial communities. To evaluate methods presently used to process metagenomic sequences, we constructed three simulated data sets of varying complexity by combining sequencing reads randomly selected from 113 isolate genomes. These data sets were designed to model real metagenomes in terms of complexity and phylogenetic composition. We assembled sampled reads using three commonly used genome assemblers (Phrap, Arachne and JAZZ), and predicted genes using two popular gene-finding pipelines (fgenesb and CRITICA/GLIMMER). The phylogenetic origins of the assembled contigs were predicted using one sequence similarity-based (BLAST hit distribution) and two sequence composition-based (PhyloPythia, oligonucleotide frequencies) binning methods. We explored the effects of the simulated community structure and method combinations on the fidelity of each processing step by comparison to the corresponding isolate genomes. The simulated data sets are available online to facilitate standardized benchmarking of tools for metagenomic analysis.
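
    The idea of building a simulated metagenome by drawing reads at random from known isolate genomes can be sketched as follows. This is only an illustrative sketch, not the authors' procedure: the function name `simulate_reads`, the toy genomes, and the choice to sample genomes in proportion to their length are all assumptions made for the example. Each read keeps the name of its source genome, which is what lets a later step (assembly, gene calling, binning) be scored against the truth.

```python
import random

def simulate_reads(genomes, n_reads, read_len, seed=0):
    """Draw fixed-length reads at random from a dict of {name: sequence}.

    Hypothetical helper: genomes are sampled in proportion to their length,
    which is an assumption of this sketch, not a detail from the abstract.
    """
    rng = random.Random(seed)
    names = list(genomes)
    weights = [len(genomes[name]) for name in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights)[0]
        genome = genomes[name]
        # Pick a start position so the whole read fits inside the genome.
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append((name, genome[start:start + read_len]))
    return reads

# Toy "isolate genomes" (made-up sequences, for illustration only).
toy_genomes = {
    "isolate_A": "ACGTTGCAAGCTTAGGCTAACGTAGCTAGCTAGGATCC",
    "isolate_B": "TTGACCGGTTAACCGGATATCGCGCGATATTA",
}
simulated = simulate_reads(toy_genomes, n_reads=50, read_len=10)
```

    Because every simulated read carries its source label, the fraction of reads or contigs assigned to the correct genome can be computed directly for any processing step under test.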

  11. Assembly of large metagenome data sets using a Convey HC-1 hybrid core computer (7th Annual SFAF Meeting, 2012)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Copeland, Alex

    2012-06-01

    Alex Copeland on "Assembly of large metagenome data sets using a Convey HC-1 hybrid core computer" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  12. Assembly of large metagenome data sets using a Convey HC-1 hybrid core computer (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    Copeland, Alex [DOE JGI]

    2017-12-09

    Alex Copeland on "Assembly of large metagenome data sets using a Convey HC-1 hybrid core computer" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  13. Metazen – metadata capture for metagenomes

    PubMed Central

    2014-01-01

    Background As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusions Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility. PMID:25780508

  14. Metazen - metadata capture for metagenomes.

    PubMed

    Bischof, Jared; Harrison, Travis; Paczian, Tobias; Glass, Elizabeth; Wilke, Andreas; Meyer, Folker

    2014-01-01

    As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.

  15. Genomic and metagenomic technologies to explore the antibiotic resistance mobilome.

    PubMed

    Martínez, José L; Coque, Teresa M; Lanza, Val F; de la Cruz, Fernando; Baquero, Fernando

    2017-01-01

    Antibiotic resistance is a relevant problem for human health that requires global approaches to establish a deep understanding of the processes of acquisition, stabilization, and spread of resistance among human bacterial pathogens. Since natural (nonclinical) ecosystems are reservoirs of resistance genes, a health-integrated study of the epidemiology of antibiotic resistance requires the exploration of such ecosystems with the aim of determining the role they may play in the selection, evolution, and spread of antibiotic resistance genes, involving the so-called resistance mobilome. High-throughput sequencing techniques allow an unprecedented opportunity to describe the genetic composition of a given microbiome without the need to subculture the organisms present inside. However, bioinformatic methods for analyzing this bulk of data, mainly with respect to binning each resistance gene with the organism hosting it, are still in their infancy. Here, we discuss how current genomic methodologies can serve to analyze the resistance mobilome and its linkage with different bacterial genomes and metagenomes. In addition, we describe the drawbacks of current methodologies for analyzing the resistance mobilome, mainly in cases of complex microbiotas, and discuss the possibility of implementing novel tools to improve our current metagenomic toolbox. © 2016 New York Academy of Sciences.

  16. MerCat: a versatile k-mer counter and diversity estimator for database-independent property analysis obtained from metagenomic and/or metatranscriptomic sequencing data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Richard A.; Panyala, Ajay R.; Glass, Kevin A.

    MerCat is a parallel, highly scalable and modular property software package for robust analysis of features in next-generation sequencing data. MerCat inputs include assembled contigs and raw sequence reads from any platform resulting in feature abundance counts tables. MerCat allows for direct analysis of data properties without reference sequence database dependency commonly used by search tools such as BLAST and/or DIAMOND for compositional analysis of whole community shotgun sequencing (e.g. metagenomes and metatranscriptomes).
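
    Database-independent k-mer counting of the kind described above can be illustrated with a short sketch. The code below is not MerCat's implementation or API: the function names `count_kmers` and `shannon_diversity` are hypothetical, and the Shannon index is used here only as one common example of a diversity estimate computed from a feature abundance table.

```python
from collections import Counter
from math import log

def count_kmers(sequences, k):
    """Count every overlapping k-mer in the given reads or contigs.

    k-mers containing characters other than A, C, G, T (for example N)
    are skipped. Hypothetical helper for illustration only.
    """
    valid = set("ACGT")
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= valid:
                counts[kmer] += 1
    return counts

def shannon_diversity(counts):
    """Shannon index (natural log) of a k-mer abundance table."""
    total = sum(counts.values())
    return -sum((n / total) * log(n / total) for n in counts.values())

reads = ["ACGTACGT", "TTACGN"]          # toy reads, made up for the example
kmer_counts = count_kmers(reads, k=3)   # abundance table keyed by k-mer
diversity = shannon_diversity(kmer_counts)
```

    No reference database is consulted at any point: the abundance table and the diversity value depend only on the input sequences and the chosen k.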

  17. The future of genomics in polar and alpine cyanobacteria

    PubMed Central

    Anesio, Alexandre M; Sánchez-Baracaldo, Patricia

    2018-01-01

    In recent years, genomic analyses have arisen as an exciting way of investigating the functional capacity and environmental adaptations of numerous micro-organisms of global relevance, including cyanobacteria. In the extreme cold of Arctic, Antarctic and alpine environments, cyanobacteria are of fundamental ecological importance as primary producers and ecosystem engineers. While their role in biogeochemical cycles is well appreciated, little is known about the genomic makeup of polar and alpine cyanobacteria. In this article, we present ways that genomic techniques might be used to further our understanding of cyanobacteria in cold environments in terms of their evolution and ecology. Existing examples from other environments (e.g. marine/hot springs) are used to discuss how methods developed there might be used to investigate specific questions in the cryosphere. Phylogenomics, comparative genomics and population genomics are identified as methods for understanding the evolution and biogeography of polar and alpine cyanobacteria. Transcriptomics will allow us to investigate gene expression under extreme environmental conditions, and metagenomics can be used to complement traditional amplicon-based methods of community profiling. Finally, new techniques such as single cell genomics and metagenome assembled genomes will also help to expand our understanding of polar and alpine cyanobacteria that cannot readily be cultured. PMID:29506259

  18. Accessing the Soil Metagenome for Studies of Microbial Diversity

    PubMed Central

    Delmont, Tom O.; Robe, Patrick; Cecillon, Sébastien; Clark, Ian M.; Constancias, Florentin; Simonet, Pascal; Hirsch, Penny R.; Vogel, Timothy M.

    2011-01-01

    Soil microbial communities contain the highest level of prokaryotic diversity of any environment, and metagenomic approaches involving the extraction of DNA from soil can improve our access to these communities. Most analyses of soil biodiversity and function assume that the DNA extracted represents the microbial community in the soil, but subsequent interpretations are limited by the DNA recovered from the soil. Unfortunately, extraction methods do not provide a uniform and unbiased subsample of metagenomic DNA, and as a consequence, accurate species distributions cannot be determined. Moreover, any bias will propagate errors in estimations of overall microbial diversity and may exclude some microbial classes from study and exploitation. To improve metagenomic approaches, investigate DNA extraction biases, and provide tools for assessing the relative abundances of different groups, we explored the biodiversity of the accessible community DNA by fractioning the metagenomic DNA as a function of (i) vertical soil sampling, (ii) density gradients (cell separation), (iii) cell lysis stringency, and (iv) DNA fragment size distribution. Each fraction had a unique genetic diversity, with different predominant and rare species (based on ribosomal intergenic spacer analysis [RISA] fingerprinting and phylochips). All fractions contributed to the number of bacterial groups uncovered in the metagenome, thus increasing the DNA pool for further applications. Indeed, we were able to access a more genetically diverse proportion of the metagenome (a gain of more than 80% compared to the best single extraction method), limit the predominance of a few genomes, and increase the species richness per sequencing effort. This work stresses the difference between extracted DNA pools and the currently inaccessible complete soil metagenome. PMID:21183646

  19. Metagenomic analysis of microbial communities yields insight into impacts of nanoparticle design

    NASA Astrophysics Data System (ADS)

    Metch, Jacob W.; Burrows, Nathan D.; Murphy, Catherine J.; Pruden, Amy; Vikesland, Peter J.

    2018-01-01

    Next-generation DNA sequencing and metagenomic analysis provide powerful tools for the environmentally friendly design of nanoparticles. Herein we demonstrate this approach using a model community of environmental microbes (that is, wastewater-activated sludge) dosed with gold nanoparticles of varying surface coatings and morphologies. Metagenomic analysis was highly sensitive in detecting the microbial community response to gold nanospheres and nanorods with either cetyltrimethylammonium bromide or polyacrylic acid surface coatings. We observed that the gold-nanoparticle morphology imposes a stronger force in shaping the microbial community structure than does the surface coating. Trends were consistent in terms of the compositions of both taxonomic and functional genes, which include antibiotic resistance genes, metal resistance genes and gene-transfer elements associated with cell stress that are relevant to public health. Given that nanoparticle morphology remained constant, the potential influence of gold dissolution was minimal. Surface coating governed the nanoparticle partitioning between the bioparticulate and aqueous phases.

  20. Characterization of the SOS meta-regulon in the human gut microbiome.

    PubMed

    Cornish, Joseph P; Sanchez-Alberola, Neus; O'Neill, Patrick K; O'Keefe, Ronald; Gheba, Jameel; Erill, Ivan

    2014-05-01

    Data from metagenomics projects remain largely untapped for the analysis of transcriptional regulatory networks. Here, we provide proof-of-concept that metagenomic data can be effectively leveraged to analyze regulatory networks by characterizing the SOS meta-regulon in the human gut microbiome. We combine well-established in silico and in vitro techniques to mine the human gut microbiome data and determine the relative composition of the SOS network in a natural setting. Our analysis highlights the importance of translesion synthesis as a primary function of the SOS response. We predict the association of this network with three novel protein clusters involved in cell wall biogenesis, chromosome partitioning and restriction modification, and we confirm binding of the SOS response transcriptional repressor to sites in the promoter of a cell wall biogenesis enzyme, a phage integrase and a death-on-curing protein. We discuss the implications of these findings and the potential for this approach for metagenome analysis.

  1. myPhyloDB: a local web-server and database for the storage and analysis of metagenomics data

    USDA-ARS's Scientific Manuscript database

    The advent of next-generation sequencing has resulted in an explosion of metagenomics data associated with microbial communities from a variety of ecosystems. However, no database and/or analytical software is currently available that allows for archival and cross-study comparison of such data. my...

  2. deFUME: Dynamic exploration of functional metagenomic sequencing data.

    PubMed

    van der Helm, Eric; Geertz-Hansen, Henrik Marcus; Genee, Hans Jasper; Malla, Sailesh; Sommer, Morten Otto Alexander

    2015-07-31

    Functional metagenomic selections represent a powerful technique that is widely applied for identification of novel genes from complex metagenomic sources. However, whereas hundreds to thousands of clones can be easily generated and sequenced over a few days of experiments, analyzing the data is time consuming and constitutes a major bottleneck for experimental researchers in the field. Here we present the deFUME web server, an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, tailored to meet the requirements of non-bioinformaticians. The web-server integrates multiple analysis steps into one single workflow: read assembly, open reading frame prediction, and annotation with BLAST, InterPro and GO classifiers. Analysis results are visualized in an online dynamic web-interface. The deFUME webserver provides a fast track from raw sequence to a comprehensive visual data overview that facilitates effortless inspection of gene function, clustering and distribution. The webserver is available at cbs.dtu.dk/services/deFUME/ and the source code is distributed at github.com/EvdH0/deFUME.

  3. The Metagenome of Utricularia gibba's Traps: Into the Microbial Input to a Carnivorous Plant

    PubMed Central

    Alcaraz, Luis David; Martínez-Sánchez, Shamayim; Torres, Ignacio; Ibarra-Laclette, Enrique; Herrera-Estrella, Luis

    2016-01-01

    The genome and transcriptome sequences of the aquatic, rootless, and carnivorous plant Utricularia gibba L. (Lentibulariaceae) were recently determined. Traps are necessary for U. gibba because they help the plant to survive in nutrient-deprived environments. The U. gibba traps (Ugt) are specialized structures that have been proposed to selectively filter microbial inhabitants. To determine whether the traps indeed have a microbiome that differs, in composition or abundance, from the microbiome in the surrounding environment, we used whole-genome shotgun (WGS) metagenomics to describe both the taxonomic and functional diversity of the Ugt microbiome. We collected U. gibba plants from their natural habitat and directly sequenced the metagenome of the Ugt microbiome and its surrounding water. The total predicted number of species in the Ugt was more than 1,100. Using pan-genome fragment recruitment analysis, we were able to identify some key Ugt players, such as Pseudomonas monteilii, to the species level. Functional analysis of the Ugt metagenome suggests that the trap microbiome plays an important role in nutrient scavenging and assimilation while complementing the hydrolytic functions of the plant. PMID:26859489

  4. The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4)

    DOE PAGES

    Huntemann, Marcel; Ivanova, Natalia N.; Mavromatis, Konstantinos; ...

    2016-02-24

    The DOE-JGI Metagenome Annotation Pipeline (MAP v.4) performs structural and functional annotation for metagenomic sequences that are submitted to the Integrated Microbial Genomes with Microbiomes (IMG/M) system for comparative analysis. The pipeline runs on nucleotide sequences provided via the IMG submission site. Users must first define their analysis projects in GOLD and then submit the associated sequence datasets consisting of scaffolds/contigs with optional coverage information and/or unassembled reads in fasta and fastq file formats. The MAP processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNAs, as well as CRISPR elements. Structural annotation is followed by functional annotation including assignment of protein product names and connection to various protein family databases.

  5. The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huntemann, Marcel; Ivanova, Natalia N.; Mavromatis, Konstantinos

    The DOE-JGI Metagenome Annotation Pipeline (MAP v.4) performs structural and functional annotation for metagenomic sequences that are submitted to the Integrated Microbial Genomes with Microbiomes (IMG/M) system for comparative analysis. The pipeline runs on nucleotide sequences provided via the IMG submission site. Users must first define their analysis projects in GOLD and then submit the associated sequence datasets consisting of scaffolds/contigs with optional coverage information and/or unassembled reads in fasta and fastq file formats. The MAP processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNAs, as well as CRISPR elements. Structural annotation is followed by functional annotation including assignment of protein product names and connection to various protein family databases.

  6. Metagenomic analysis of bacterial and archaeal assemblages in the soil-mousse surrounding a geothermal spring

    PubMed Central

    Bhatia, Sonu; Batra, Navneet; Pathak, Ashish; Joshi, Amit; Souza, Leila; Almeida, Paulo; Chauhan, Ashvini

    2015-01-01

    The soil-mousse surrounding a geothermal spring was analyzed for bacterial and archaeal diversity using 16S rRNA gene amplicon metagenomic sequencing which revealed the presence of 18 bacterial phyla distributed across 109 families and 219 genera. Firmicutes, Actinobacteria, and the Deinococcus-Thermus group were the predominant bacterial assemblages with Crenarchaeota and Thaumarchaeota as the main archaeal assemblages in this largely understudied geothermal habitat. Several metagenome sequences remained taxonomically unassigned suggesting the presence of a repertoire of hitherto undescribed microbes in this geothermal soil-mousse econiche. PMID:26484255

  7. Microbial community analysis using MEGAN.

    PubMed

    Huson, Daniel H; Weber, Nico

    2013-01-01

    Metagenomics, the study of microbes in the environment using DNA sequencing, depends upon dedicated software tools for processing and analyzing very large sequencing datasets. One such tool is MEGAN (MEtaGenome ANalyzer), which can be used to interactively analyze and compare metagenomic and metatranscriptomic data, both taxonomically and functionally. To perform a taxonomic analysis, the program places the reads onto the NCBI taxonomy, while functional analysis is performed by mapping reads to the SEED, COG, and KEGG classifications. Samples can be compared taxonomically and functionally, using a wide range of different charting and visualization techniques. PCoA analysis and clustering methods allow high-level comparison of large numbers of samples. Different attributes of the samples can be captured and used within analysis. The program supports various input formats for loading data and can export analysis results in different text-based and graphical formats. The program is designed to work with very large samples containing many millions of reads. It is written in Java and installers for the three major computer operating systems are available from http://www-ab.informatik.uni-tuebingen.de. © 2013 Elsevier Inc. All rights reserved.

  8. The single-species metagenome: subtyping Staphylococcus aureus core genome sequences from shotgun metagenomic data

    PubMed Central

    Li, Ben; Petit III, Robert A.; Qin, Zhaohui S.; Darrow, Lyndsey

    2016-01-01

    In this study we developed a genome-based method for detecting Staphylococcus aureus subtypes from metagenome shotgun sequence data. We used a binomial mixture model and the coverage counts at >100,000 known S. aureus SNP (single nucleotide polymorphism) sites derived from prior comparative genomic analysis to estimate the proportion of 40 subtypes in metagenome samples. We were able to obtain >87% sensitivity and >94% specificity at 0.025X coverage for S. aureus. We found that 321 and 149 metagenome samples from the Human Microbiome Project and metaSUB analysis of the New York City subway, respectively, contained S. aureus at genome coverage >0.025. In both projects, CC8 and CC30 were the most common S. aureus clonal complexes encountered. We found evidence that the subtype compositions at different body sites of the same individual were more similar than expected from random sampling, and more limited evidence that certain body sites were enriched for particular subtypes. One surprising finding was the apparent high frequency of CC398, a lineage often associated with livestock, in samples from the tongue dorsum. Epidemiologic analysis of the HMP subject population suggested that high BMI (body mass index) and health insurance are possibly associated with S. aureus carriage but there was limited power to identify factors linked to carriage of even the most common subtype. In the NYC subway data, we found a small signal of geographic distance affecting subtype clustering but other unknown factors influence taxonomic distribution of the species around the city. PMID:27781166

  9. Integrated metagenomic analysis of the rumen microbiome of cattle reveals key biological mechanisms associated with methane traits.

    PubMed

    Wang, Haiying; Zheng, Huiru; Browne, Fiona; Roehe, Rainer; Dewhurst, Richard J; Engel, Felix; Hemmje, Matthias; Lu, Xiangwu; Walsh, Paul

    2017-07-15

    Methane is one of the major contributors to global warming. The rumen microbiota is directly involved in methane production in cattle. The link between variation in rumen microbial communities and host genetics has important applications and implications in bioscience. Having the potential to reveal the full extent of microbial gene diversity and complex microbial interactions, integrated metagenomics and network analysis holds great promise in this endeavour. This study investigates the rumen microbial community in cattle through the integration of metagenomic and network-based approaches. Based on the relative abundance of 1570 microbial genes identified in a metagenomics analysis, the co-abundance network was constructed and functional modules of microbial genes were identified. One of the main contributions is to develop a random matrix theory-based approach to automatically determining the correlation threshold used to construct the co-abundance network. The resulting network, consisting of 549 microbial genes and 3349 connections, exhibits a clear modular structure with certain trait-specific genes highly over-represented in modules. More specifically, all the 20 genes previously identified to be associated with methane emissions are found in a module (hypergeometric test, p < 10^-11). One third of genes are involved in methane metabolism pathways. The further examination of abundance profiles across 8 samples of genes highlights that the revealed pattern of metagenomics abundance has a strong association with methane emissions. Furthermore, the module is significantly enriched with microbial genes encoding enzymes that are directly involved in methanogenesis (hypergeometric test, p < 10^-9). Copyright © 2017 Elsevier Inc. All rights reserved.
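
    The hypergeometric enrichment test mentioned above asks how likely it is that a module of a given size would contain at least the observed number of trait-associated genes if genes were assigned to modules at random. A minimal sketch using only the standard library follows. The function name `hypergeom_upper_tail` is hypothetical, and the module size of 100 is an assumed value chosen purely for illustration, since the abstract does not report it; the network size (549) and the number of methane-associated genes (20) are taken from the abstract.

```python
from math import comb

def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for a hypergeometric variable X.

    population: total genes in the network
    successes:  genes carrying the trait annotation
    draws:      genes in the module being tested
    k:          annotated genes observed in the module
    """
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)

# 549 genes in the network and 20 methane-associated genes are from the abstract;
# the module size of 100 is an ASSUMPTION for this example only.
p_value = hypergeom_upper_tail(k=20, population=549, successes=20, draws=100)
```

    Exact integer arithmetic via `math.comb` avoids the rounding problems that arise when very small tail probabilities are computed from floating-point factorials.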

  10. Investigation of Microbial Diversity in Geothermal Hot Springs in Unkeshwar, India, Based on 16S rRNA Amplicon Metagenome Sequencing

    PubMed Central

    Mehetre, Gajanan T.; Paranjpe, Aditi; Dastager, Syed G.

    2016-01-01

    Microbial diversity in geothermal waters of the Unkeshwar hot springs in Maharashtra, India, was studied using 16S rRNA amplicon metagenomic sequencing. Taxonomic analysis revealed the presence of Bacteroidetes, Proteobacteria, Cyanobacteria, Actinobacteria, Archaea, and OD1 phyla. Metabolic function prediction analysis indicated a battery of biological information systems indicating rich and novel microbial diversity, with potential biotechnological applications in this niche. PMID:26950332

  11. Metazen – metadata capture for metagenomes

    DOE PAGES

    Bischof, Jared; Harrison, Travis; Paczian, Tobias; ...

    2014-12-08

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.

  12. Metazen – metadata capture for metagenomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bischof, Jared; Harrison, Travis; Paczian, Tobias

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards-compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.

  13. An integrated metagenomics pipeline for strain profiling reveals novel patterns of bacterial transmission and biogeography

    PubMed Central

    Nayfach, Stephen; Rodriguez-Mueller, Beltran; Garud, Nandita

    2016-01-01

    We present the Metagenomic Intra-species Diversity Analysis System (MIDAS), which is an integrated computational pipeline for quantifying bacterial species abundance and strain-level genomic variation, including gene content and single-nucleotide polymorphisms (SNPs), from shotgun metagenomes. Our method leverages a database of more than 30,000 bacterial reference genomes that we clustered into species groups. These cover the majority of abundant species in the human microbiome but only a small proportion of microbes in other environments, including soil and seawater. We applied MIDAS to stool metagenomes from 98 Swedish mothers and their infants over one year and used rare SNPs to track strains between hosts. Using this approach, we found that although species compositions of mothers and infants converged over time, strain-level similarity diverged. Specifically, early colonizing bacteria were often transmitted from an infant’s mother, while late colonizing bacteria were often transmitted from other sources in the environment and were enriched for spore-formation genes. We also applied MIDAS to 198 globally distributed marine metagenomes and used gene content to show that many prevalent bacterial species have population structure that correlates with geographic location. Strain-level genetic variants present in metagenomes clearly reveal extensive structure and dynamics that are obscured when data are analyzed at a coarser taxonomic resolution. PMID:27803195

  14. Challenges and Opportunities of Airborne Metagenomics

    PubMed Central

    Behzad, Hayedeh; Gojobori, Takashi; Mineta, Katsuhiko

    2015-01-01

    Recent metagenomic studies of environments, such as marine and soil, have significantly enhanced our understanding of the diverse microbial communities living in these habitats and their essential roles in sustaining vast ecosystems. The increase in the number of publications related to soil and marine metagenomics is in sharp contrast to those of air, yet airborne microbes are thought to have significant impacts on many aspects of our lives from their potential roles in atmospheric events such as cloud formation, precipitation, and atmospheric chemistry to their major impact on human health. In this review, we will discuss the current progress in airborne metagenomics, with a special focus on exploring the challenges and opportunities of undertaking such studies. The main challenges of conducting metagenomic studies of airborne microbes are as follows: 1) Low density of microorganisms in the air, 2) efficient retrieval of microorganisms from the air, 3) variability in airborne microbial community composition, 4) the lack of standardized protocols and methodologies, and 5) DNA sequencing and bioinformatics-related challenges. Overcoming these challenges could provide the groundwork for comprehensive analysis of airborne microbes and their potential impact on the atmosphere, global climate, and our health. Metagenomic studies offer a unique opportunity to examine viral and bacterial diversity in the air and monitor their spread locally or across the globe, including threats from pathogenic microorganisms. Airborne metagenomic studies could also lead to discoveries of novel genes and metabolic pathways relevant to meteorological and industrial applications, environmental bioremediation, and biogeochemical cycles. PMID:25953766

  15. Comparison of methods for library construction and short read annotation of shellfish viral metagenomes.

    PubMed

    Wei, Hong-Ying; Huang, Sheng; Wang, Jiang-Yong; Gao, Fang; Jiang, Jing-Zhe

    2018-03-01

    The emergence and widespread use of high-throughput sequencing technologies have promoted metagenomic studies on environmental or animal samples. Library construction for metagenome sequencing and annotation of the produced sequence reads are important steps in such studies and influence the quality of metagenomic data. In this study, we collected marine mollusk samples, such as Crassostrea hongkongensis, Chlamys farreri, and Ruditapes philippinarum, from coastal areas in South China. These samples were divided into two batches to compare two library construction methods for shellfish viral metagenomes. Our analysis showed that reverse-transcribing RNA into cDNA and then amplifying it simultaneously with DNA by whole genome amplification (WGA) yielded a larger amount of DNA compared to using only WGA or WTA (whole transcriptome amplification). Moreover, higher quality libraries were obtained by agarose gel extraction rather than with AMPure bead size selection. However, the latter can also provide good results if combined with the adjustment of the filter parameters. This, together with its simplicity, makes it a viable alternative. Finally, we compared three annotation tools (BLAST, DIAMOND, and Taxonomer) and two reference databases (NCBI's NR and Uniprot's Uniref). Considering the limitations of computing resources and data transfer speed, we propose the use of DIAMOND with Uniref for annotating metagenomic short reads as its running speed can guarantee a good annotation rate. This study may serve as a useful reference for selecting methods for shellfish viral metagenome library construction and read annotation.

  16. Resolving prokaryotic taxonomy without rRNA: longer oligonucleotide word lengths improve genome and metagenome taxonomic classification.

    PubMed

    Alsop, Eric B; Raymond, Jason

    2013-01-01

    Oligonucleotide signatures, especially tetranucleotide signatures, have been used as a method for homology binning by exploiting an organism's inherent biases towards the use of specific oligonucleotide words. Tetranucleotide signatures have been especially useful in environmental metagenomics samples as many of these samples contain organisms from poorly classified phyla which cannot be easily identified using traditional homology methods, including NCBI BLAST. This study examines oligonucleotide signatures across 1,424 completed genomes from across the tree of life, substantially expanding upon previous work. A comprehensive analysis of mononucleotide through nonanucleotide word lengths suggests that longer word lengths substantially improve the classification of DNA fragments across a range of sizes of relevance to high throughput sequencing. We find that, at present, heptanucleotide signatures represent an optimal balance between prediction accuracy and computational time for resolving taxonomy using both genomic and metagenomic fragments. We directly compare the ability of tetranucleotide and heptanucleotide word lengths (tetranucleotide signatures are the current standard for oligonucleotide word usage analyses) for taxonomic binning of metagenome reads. We present evidence that heptanucleotide word lengths consistently provide more taxonomic resolving power, particularly in distinguishing between closely related organisms that are often present in metagenomic samples. This implies that longer oligonucleotide word lengths should replace tetranucleotide signatures for most analyses. Finally, we show that the application of longer word lengths to metagenomic datasets leads to more accurate taxonomic binning of DNA scaffolds and has the potential to substantially improve taxonomic assignment and assembly of metagenomic data.
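
    To make the idea of an oligonucleotide signature concrete, the following is a minimal sketch, not the authors' implementation. It assumes a signature is simply the relative frequency of every DNA word of length k on one strand (k = 4 for tetranucleotides, k = 7 for heptanucleotides). The function names and the Manhattan distance used to compare two signatures are illustrative assumptions; the abstract does not state which measure was used.

```python
from collections import Counter
from itertools import product


def kmer_signature(sequence, k):
    """Relative frequency of each of the 4**k DNA words of length k.

    Assumption: single strand only; windows containing characters
    other than A, C, G, T are ignored.
    """
    sequence = sequence.upper()
    counts = Counter(
        sequence[i:i + k] for i in range(len(sequence) - k + 1)
    )
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    if total == 0:
        return {w: 0.0 for w in words}
    return {w: counts[w] / total for w in words}


def signature_distance(sig_a, sig_b):
    """Manhattan distance between two signatures (illustrative choice)."""
    return sum(abs(sig_a[w] - sig_b[w]) for w in sig_a)
```

    The signature has 4**k entries, so moving from k = 4 to k = 7 grows the vector from 256 to 16,384 words. This is one way to see the trade-off between resolving power and computational time that the abstract describes.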

  17. Gut metagenomes of type 2 diabetic patients have characteristic single-nucleotide polymorphism distribution in Bacteroides coprocola.

    PubMed

    Chen, Yaowen; Li, Zongcheng; Hu, Shuofeng; Zhang, Jian; Wu, Jiaqi; Shao, Ningsheng; Bo, Xiaochen; Ni, Ming; Ying, Xiaomin

    2017-02-01

    Gut microbes play a critical role in human health and disease, and researchers have begun to characterize their genomes, the so-called gut metagenome. Thus far, metagenomics studies have focused on genus- or species-level composition and microbial gene sets, while strain-level composition and single-nucleotide polymorphism (SNP) have been overlooked. The gut metagenomes of type 2 diabetes (T2D) patients have been found to be enriched with butyrate-producing bacteria and sulfate reduction functions. However, it is not known whether the gut metagenomes of T2D patients have characteristic strain patterns or SNP distributions. We downloaded public gut metagenome datasets from 170 T2D patients and 174 healthy controls and performed a systematic comparative analysis of their metagenome SNPs. We found that Bacteroides coprocola, whose relative abundance did not differ between the groups, had a characteristic distribution of SNPs in the T2D patient group. We identified 65 genes, all in B. coprocola, that had remarkably different enrichment of SNPs. The first and sixth ranked genes encode glycosyl hydrolases (GenBank accession EDU99824.1 and EDV02301.1). Interestingly, alpha-glucosidase, which is also a glycosyl hydrolase located in the intestine, is an important drug target of T2D. These results suggest that different strains of B. coprocola may have different roles in human gut and a specific set of B. coprocola strains are correlated with T2D.

  18. The integrated microbial genome resource of analysis.

    PubMed

    Checcucci, Alice; Mengoni, Alessio

    2015-01-01

    Integrated Microbial Genomes and Metagenomes (IMG) is a biocomputational system that provides information and support for annotation and comparative analysis of microbial genomes and metagenomes. IMG has been developed by the US Department of Energy (DOE)-Joint Genome Institute (JGI). The IMG platform contains both draft and complete genomes, sequenced by the Joint Genome Institute, and other publicly available genomes. Genomes of strains belonging to the Archaea, Bacteria, and Eukarya domains are present, as well as those of viruses and plasmids. Here, we describe some essential features of the IMG system and a case study for pangenome analysis.

  19. Strain-Level Metagenomic Analysis of the Fermented Dairy Beverage Nunu Highlights Potential Food Safety Risks

    PubMed Central

    Walsh, Aaron M.; Crispie, Fiona; Daari, Kareem; O'Sullivan, Orla; Martin, Jennifer C.; Arthur, Cornelius T.; Claesson, Marcus J.; Scott, Karen P.

    2017-01-01

    ABSTRACT The rapid detection of pathogenic strains in food products is essential for the prevention of disease outbreaks. It has already been demonstrated that whole-metagenome shotgun sequencing can be used to detect pathogens in food but, until recently, strain-level detection of pathogens has relied on whole-metagenome assembly, which is a computationally demanding process. Here we demonstrated that three short-read-alignment-based methods, i.e., MetaMLST, PanPhlAn, and StrainPhlAn, could accurately and rapidly identify pathogenic strains in spinach metagenomes that had been intentionally spiked with Shiga toxin-producing Escherichia coli in a previous study. Subsequently, we employed the methods, in combination with other metagenomics approaches, to assess the safety of nunu, a traditional Ghanaian fermented milk product that is produced by the spontaneous fermentation of raw cow milk. We showed that nunu samples were frequently contaminated with bacteria associated with the bovine gut and, worryingly, we detected putatively pathogenic E. coli and Klebsiella pneumoniae strains in a subset of nunu samples. Ultimately, our work establishes that short-read-alignment-based bioinformatics approaches are suitable food safety tools, and we describe a real-life example of their utilization. IMPORTANCE Foodborne pathogens are responsible for millions of illnesses each year. Here we demonstrate that short-read-alignment-based bioinformatics tools can accurately and rapidly detect pathogenic strains in food products by using shotgun metagenomics data. The methods used here are considerably faster than both traditional culturing methods and alternative bioinformatics approaches that rely on metagenome assembly; therefore, they can potentially be used for more high-throughput food safety testing. Overall, our results suggest that whole-metagenome sequencing can be used as a practical food safety tool to prevent diseases or to link outbreaks to specific food products. PMID:28625983

  20. Zooplankton community analysis in the Changjiang River estuary by single-gene-targeted metagenomics

    NASA Astrophysics Data System (ADS)

    Cheng, Fangping; Wang, Minxiao; Li, Chaolun; Sun, Song

    2014-07-01

    DNA barcoding provides accurate identification of zooplankton species through all life stages. Single-gene-targeted metagenomic analysis based on DNA barcode databases can facilitate long-term monitoring of zooplankton communities. With the help of the available zooplankton databases, the zooplankton community of the Changjiang (Yangtze) River estuary was studied using a single-gene-targeted metagenomic method to estimate the species richness of this community. A total of 856 mitochondrial cytochrome oxidase subunit 1 (cox1) gene sequences were determined. The environmental barcodes were clustered into 70 molecular operational taxonomic units (MOTUs). Forty-two MOTUs matched barcoded marine organisms with more than 90% similarity and were assigned to either the species (similarity>96%) or genus level (similarity<96%). Sibling species could also be distinguished. Many species that were overlooked by morphological methods were identified by molecular methods, especially gelatinous zooplankton and merozooplankton that were likely sampled at different life history phases. Zooplankton community structures differed significantly among all of the samples. The MOTU spatial distributions were influenced by the ecological habits of the corresponding species. In conclusion, single-gene-targeted metagenomic analysis is a useful tool for zooplankton studies, with which specimens from all life history stages can be identified quickly and effectively with a comprehensive database.
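
    The similarity thresholds quoted above can be written as a small decision rule. This is a minimal sketch under stated assumptions: the function name and the "unassigned" label are hypothetical. The abstract does not say how a match of exactly 96% was handled, so the sketch arbitrarily places it at genus level.

```python
def assign_rank(similarity_percent):
    """Map a best-hit cox1 similarity (in percent) to a taxonomic rank.

    Thresholds follow the abstract: more than 90% to count as a match,
    more than 96% for species level. Treating exactly 96% as genus
    level is an assumption, since the source leaves that case open.
    """
    if not 0 <= similarity_percent <= 100:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity_percent > 96:
        return "species"
    if similarity_percent > 90:
        return "genus"
    return "unassigned"
```

    For example, a MOTU whose best database hit has 98.5% similarity would be labeled at species level, while one at 93% would be labeled at genus level only.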

  1. Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource

    PubMed Central

    Sun, Shulei; Chen, Jing; Li, Weizhong; Altintas, Ilkay; Lin, Abel; Peltier, Steve; Stocks, Karen; Allen, Eric E.; Ellisman, Mark; Grethe, Jeffrey; Wooley, John

    2011-01-01

    The Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis (CAMERA, http://camera.calit2.net/) is a database and associated computational infrastructure that provides a single system for depositing, locating, analyzing, visualizing and sharing data about microbial biology through an advanced web-based analysis portal. CAMERA collects and links metadata relevant to environmental metagenome data sets with annotation in a semantically-aware environment allowing users to write expressive semantic queries against the database. To meet the needs of the research community, users are able to query metadata categories such as habitat, sample type, time, location and other environmental physicochemical parameters. CAMERA is compliant with the standards promulgated by the Genomic Standards Consortium (GSC), and sustains a role within the GSC in extending standards for content and format of the metagenomic data and metadata and its submission to the CAMERA repository. To ensure wide, ready access to data and annotation, CAMERA also provides data submission tools to allow researchers to share and forward data to other metagenomics sites and community data archives such as GenBank. It has multiple interfaces for easy submission of large or complex data sets, and supports pre-registration of samples for sequencing. CAMERA integrates a growing list of tools and viewers for querying, analyzing, annotating and comparing metagenome and genome data. PMID:21045053

  2. Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource.

    PubMed

    Sun, Shulei; Chen, Jing; Li, Weizhong; Altintas, Ilkay; Lin, Abel; Peltier, Steve; Stocks, Karen; Allen, Eric E; Ellisman, Mark; Grethe, Jeffrey; Wooley, John

    2011-01-01

    The Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis (CAMERA, http://camera.calit2.net/) is a database and associated computational infrastructure that provides a single system for depositing, locating, analyzing, visualizing and sharing data about microbial biology through an advanced web-based analysis portal. CAMERA collects and links metadata relevant to environmental metagenome data sets with annotation in a semantically-aware environment allowing users to write expressive semantic queries against the database. To meet the needs of the research community, users are able to query metadata categories such as habitat, sample type, time, location and other environmental physicochemical parameters. CAMERA is compliant with the standards promulgated by the Genomic Standards Consortium (GSC), and sustains a role within the GSC in extending standards for content and format of the metagenomic data and metadata and its submission to the CAMERA repository. To ensure wide, ready access to data and annotation, CAMERA also provides data submission tools to allow researchers to share and forward data to other metagenomics sites and community data archives such as GenBank. It has multiple interfaces for easy submission of large or complex data sets, and supports pre-registration of samples for sequencing. CAMERA integrates a growing list of tools and viewers for querying, analyzing, annotating and comparing metagenome and genome data.

  3. BioSurfDB: knowledge and algorithms to support biosurfactants and biodegradation studies

    PubMed Central

    Oliveira, Jorge S.; Araújo, Wydemberg; Lopes Sales, Ana Isabela; de Brito Guerra, Alaine; da Silva Araújo, Sinara Carla; de Vasconcelos, Ana Tereza Ribeiro; Agnez-Lima, Lucymara F.; Freitas, Ana Teresa

    2015-01-01

    Crude oil extraction, transportation and use provoke the contamination of countless ecosystems. Therefore, bioremediation through surfactants mobilization or biodegradation is an important subject, both economically and environmentally. Bioremediation research had a great boost with the recent advances in Metagenomics, as it enabled the sequencing of uncultured microorganisms providing new insights on surfactant-producing and/or oil-degrading bacteria. Many research studies are making available genomic data from unknown organisms obtained from metagenomics analysis of oil-contaminated environmental samples. These new datasets are presently demanding the development of new tools and data repositories tailored for the biological analysis in a context of bioremediation data analysis. This work presents BioSurfDB, www.biosurfdb.org, a curated relational information system integrating data from: (i) metagenomes; (ii) organisms; (iii) biodegradation relevant genes; proteins and their metabolic pathways; (iv) bioremediation experiments results, with specific pollutants treatment efficiencies by surfactant producing organisms; and (v) a biosurfactant-curated list, grouped by producing organism, surfactant name, class and reference. The main goal of this repository is to gather information on the characterization of biological compounds and mechanisms involved in biosurfactant production and/or biodegradation and make it available in a curated way and associated with a number of computational tools to support studies of genomic and metagenomic data. Database URL: www.biosurfdb.org PMID:25833955

  4. Tentacle: distributed quantification of genes in metagenomes.

    PubMed

    Boulund, Fredrik; Sjögren, Anders; Kristiansson, Erik

    2015-01-01

    In metagenomics, microbial communities are sequenced at increasingly high resolution, generating datasets with billions of DNA fragments. Novel methods that can efficiently process the growing volumes of sequence data are necessary for the accurate analysis and interpretation of existing and upcoming metagenomes. Here we present Tentacle, which is a novel framework that uses distributed computational resources for gene quantification in metagenomes. Tentacle is implemented using a dynamic master-worker approach in which DNA fragments are streamed via a network and processed in parallel on worker nodes. Tentacle is modular, extensible, and comes with support for six commonly used sequence aligners. It is easy to adapt Tentacle to different applications in metagenomics and easy to integrate into existing workflows. Evaluations show that Tentacle scales very well with increasing computing resources. We illustrate the versatility of Tentacle on three different use cases. Tentacle is written for Linux in Python 2.7 and is published as open source under the GNU General Public License (v3). Documentation, tutorials, installation instructions, and the source code are freely available online at: http://bioinformatics.math.chalmers.se/tentacle.
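
    The master-worker pattern mentioned above can be illustrated with a short sketch. This is not Tentacle's code or API: the function names are hypothetical, threads on one machine stand in for networked worker nodes, and a naive substring test stands in for a real sequence aligner.

```python
import queue
import threading
from collections import Counter


def count_hits(reads, genes):
    """Toy stand-in for an aligner: a read hits a gene if it is a substring."""
    hits = Counter()
    for read in reads:
        for name, seq in genes.items():
            if read in seq:
                hits[name] += 1
    return hits


def quantify(read_chunks, genes, n_workers=2):
    """Master queues chunks of reads; workers pull them as they become free."""
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            chunk = tasks.get()
            if chunk is None:  # sentinel: no more work for this worker
                break
            results.put(count_hits(chunk, genes))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for chunk in read_chunks:  # master hands out work dynamically
        tasks.put(chunk)
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()

    total = Counter()
    while not results.empty():  # safe: all workers have finished
        total += results.get()
    return total
```

    Because workers pull chunks from a shared queue rather than receiving a fixed share up front, a slow chunk does not hold up the others. That is the general motivation for a dynamic master-worker design.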

  5. Productivity and salinity structuring of the microplankton revealed by comparative freshwater metagenomics

    PubMed Central

    Eiler, Alexander; Zaremba-Niedzwiedzka, Katarzyna; Martínez-García, Manuel; McMahon, Katherine D; Stepanauskas, Ramunas; Andersson, Siv G E; Bertilsson, Stefan

    2014-01-01

    Little is known about the diversity and structuring of freshwater microbial communities beyond the patterns revealed by tracing their distribution in the landscape with common taxonomic markers such as the ribosomal RNA. To address this gap in knowledge, metagenomes from temperate lakes were compared to selected marine metagenomes. Taxonomic analyses of rRNA genes in these freshwater metagenomes confirm the previously reported dominance of a limited subset of uncultured lineages of freshwater bacteria, whereas Archaea were rare. Diversification into marine and freshwater microbial lineages was also reflected in phylogenies of functional genes, and there were also significant differences in functional beta-diversity. The pathways and functions that accounted for these differences are involved in osmoregulation, active transport, carbohydrate and amino acid metabolism. Moreover, predicted genes orthologous to active transporters and recalcitrant organic matter degradation were more common in microbial genomes from oligotrophic versus eutrophic lakes. This comparative metagenomic analysis allowed us to formulate a general hypothesis that oceanic microorganisms, compared with freshwater-dwelling ones, invest more in metabolism of amino acids, and that strategies of carbohydrate metabolism differ significantly between marine and freshwater microbial communities. PMID:24118837

  6. Marine Metagenome as A Resource for Novel Enzymes.

    PubMed

    Alma'abadi, Amani D; Gojobori, Takashi; Mineta, Katsuhiko

    2015-10-01

    More than 99% of identified prokaryotes, including many from the marine environment, cannot be cultured in the laboratory. This lack of capability restricts our knowledge of microbial genetics and community ecology. Metagenomics, the culture-independent cloning of environmental DNAs that are isolated directly from an environmental sample, has already provided a wealth of information about the uncultured microbial world. It has also facilitated the discovery of novel biocatalysts by allowing researchers to probe directly into a huge diversity of enzymes within natural microbial communities. Recent advances in these studies have led to a great interest in recruiting microbial enzymes for the development of environmentally-friendly industry. Although the metagenomics approach has many limitations, it is expected to provide not only scientific insights but also economic benefits, especially in industry. This review highlights the importance of metagenomics in mining microbial lipases, as an example, by using high-throughput techniques. In addition, we discuss challenges in metagenomics as an important part of bioinformatics analysis in big data. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.

  7. Comparative analysis of metagenomes of Italian top soil improvers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gigliucci, Federica, E-mail: Federica.gigliucci@li

    Biosolids originating from Municipal Waste Water Treatment Plants are proposed as top soil improvers (TSI) for their beneficial input of organic carbon on agriculture lands. Their use to amend soil is controversial, as it may lead to the presence of emerging hazards of anthropogenic or animal origin in the environment devoted to food production. In this study, we used shotgun metagenomic sequencing as a tool to characterize the hazards related to the TSIs. The samples showed the presence of many virulence genes associated with different diarrheagenic E. coli pathotypes as well as of different antimicrobial resistance-associated genes. The genes conferring resistance to Fluoroquinolones were the most relevant class of antimicrobial resistance genes observed in all the samples tested. To a lesser extent, traits associated with the resistance to Methicillin in Staphylococci and genes conferring resistance to Streptothricin, Fosfomycin and Vancomycin were also identified. The most represented metal resistance genes were cobalt-zinc-cadmium related, accounting for 15–50% of the sequence reads in the different metagenomes out of the total number of those mapping on the class of resistance to compounds determinants. Moreover, the taxonomic analysis performed by comparing compost-based samples and biosolids derived from municipal sewage-sludge treatments divided the samples into separate populations, based on the microbiota composition. The results confirm that metagenomics is efficient at detecting genomic traits associated with pathogens and antimicrobial resistance in complex matrices and that this approach can be efficiently used for the traceability of TSI samples using the microorganisms’ profiles as indicators of their origin. - Highlights: • Sludge- and green- based biosolids analysed by metagenomics. • Biosolids may introduce microbial hazards in the food chain. • Metagenomics enables tracking biosolids’ sources.

  8. Phylogeny and phylogeography of functional genes shared among seven terrestrial subsurface metagenomes reveal N-cycling and microbial evolutionary relationships

    PubMed Central

    Lau, Maggie C. Y.; Cameron, Connor; Magnabosco, Cara; Brown, C. Titus; Schilkey, Faye; Grim, Sharon; Hendrickson, Sarah; Pullin, Michael; Sherwood Lollar, Barbara; van Heerden, Esta; Kieft, Thomas L.; Onstott, Tullis C.

    2014-01-01

    Comparative studies on community phylogenetics and phylogeography of microorganisms living in extreme environments are rare. Terrestrial subsurface habitats are valuable for studying microbial biogeographical patterns due to their isolation and the restricted dispersal mechanisms. Since the taxonomic identity of a microorganism does not always correspond well with its functional role in a particular community, the use of taxonomic assignments or patterns may give limited inference on how microbial functions are affected by historical, geographical and environmental factors. With seven metagenomic libraries generated from fracture water samples collected from five South African mines, this study was carried out to (1) screen for ubiquitous functions or pathways of biogeochemical cycling of CH4, S, and N; (2) to characterize the biodiversity represented by the common functional genes; (3) to investigate the subsurface biogeography as revealed by this subset of genes; and (4) to explore the possibility of using metagenomic data for evolutionary study. The ubiquitous functional genes are NarV, NPD, PAPS reductase, NifH, NifD, NifK, NifE, and NifN genes. Although these eight common functional genes were taxonomically and phylogenetically diverse and distinct from each other, the dissimilarity between samples did not correlate strongly with geographical or environmental parameters or residence time of the water. Por genes homologous to those of Thermodesulfovibrio yellowstonii detected in all metagenomes were deep lineages of Nitrospirae, suggesting that subsurface habitats have preserved ancestral genetic signatures that inform the study of the origin and evolution of prokaryotes. PMID:25400621

  9. Metagenomics Analysis of Microorganisms in Freshwater Lakes of the Amazon Basin.

    PubMed

    Toyama, Danyelle; Kishi, Luciano Takeshi; Santos-Júnior, Célio Dias; Soares-Costa, Andrea; de Oliveira, Tereza Cristina Souza; de Miranda, Fernando Pellon; Henrique-Silva, Flávio

    2016-12-22

    The Amazon Basin is the largest hydrographic basin on the planet, and the dynamics of its aquatic microorganisms strongly impact global biogeochemical cycles. However, it remains poorly studied. This metagenome project was performed to obtain a snapshot of prokaryotic microbiota from four important lakes in the Amazon Basin. Copyright © 2016 Toyama et al.

  10. Cost-benefit analysis of introducing next-generation sequencing (metagenomic) pathogen testing in the setting of pyrexia of unknown origin.

    PubMed

    Chai, Jia Hui; Lee, Chun Kiat; Lee, Hong Kai; Wong, Nicholas; Teo, Kahwee; Tan, Chuen Seng; Thokala, Praveen; Tang, Julian Wei-Tze; Tambyah, Paul Anantharajah; Oh, Vernon Min Sen; Loh, Tze Ping; Yoong, Joanne

    2018-01-01

    Pyrexia of unknown origin (PUO) is defined as a temperature of >38.3°C that lasts for >3 weeks, where no cause can be found despite appropriate investigation. Existing protocols for the work-up of PUO can be extensive and costly, motivating the application of recent advances in molecular diagnostics to pathogen testing. There have been many reports describing various analytical methods and the performance of metagenomic pathogen testing in clinical samples, but its economics have been less well studied. This study pragmatically evaluates the feasibility of introducing metagenomic testing in this setting by assessing the relative cost of clinically relevant strategies employing this investigative tool under various cost and performance scenarios, using Singapore as a demonstration case, and by assessing the price and performance benchmarks that would need to be achieved for metagenomic testing to be considered financially viable relative to the current diagnostic standard. This study has some important limitations: we examined only the impact of introducing the metagenomic test on the overall diagnostic cost, excluded costs associated with hospitalization, and made assumptions about the performance of the routine diagnostic tests, about limiting the cost of the metagenomic test, and about the lack of further work-up after positive pathogen detection by the metagenomic test. However, these assumptions were necessary to keep the model within reasonable limits. In spite of these limitations, the simplified presentation lends itself to the illustration of the key insights of our paper. In general, we find that the use of metagenomic testing as a second-line investigation is effectively dominated, and that use of metagenomic testing at first-line would typically require higher rates of detection or lower cost than currently available in order to be justifiable purely as a cost-saving measure. We conclude that current conditions do not warrant a widespread rush to deploy metagenomic testing to resolve any and all uncertainty, but rather as a front-line technology that should be used in specific contexts, as a supplement to rather than a replacement for careful clinical judgement.
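
    The cost comparison described in this abstract can be illustrated with a toy calculation. The sketch below is not the study's model: the function names, the three strategies as written, and all numbers are hypothetical, and it assumes a second test is ordered only when the first finds no pathogen, with no further work-up after a positive result.

```python
def expected_costs(cost_standard, cost_meta, detect_standard, detect_meta):
    """Expected diagnostic cost per patient under three hypothetical strategies.

    Toy assumption: the second test is ordered only when the first test
    detects nothing, and a positive result ends the work-up.
    """
    return {
        # Routine work-up alone.
        "standard_only": cost_standard,
        # Metagenomic test first; routine work-up only if it is negative.
        "meta_first": cost_meta + (1.0 - detect_meta) * cost_standard,
        # Routine work-up first; metagenomic test only if it is negative.
        "meta_second": cost_standard + (1.0 - detect_standard) * cost_meta,
    }


def break_even_meta_cost(cost_standard, detect_meta):
    """Highest metagenomic test price at which first-line use does not
    raise the expected cost above the routine work-up alone."""
    return detect_meta * cost_standard


# Hypothetical inputs, chosen only for illustration.
costs = expected_costs(cost_standard=1000.0, cost_meta=600.0,
                       detect_standard=0.5, detect_meta=0.4)
threshold = break_even_meta_cost(cost_standard=1000.0, detect_meta=0.4)
for name, value in costs.items():
    print(f"{name}: {value:.2f}")
print(f"break-even metagenomic price: {threshold:.2f}")
```

    Under these assumptions, first-line use lowers the expected cost only when the metagenomic test price is below its detection rate multiplied by the cost of the routine work-up, and second-line use can only add cost. A real evaluation would need the study's own cost and performance scenarios.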

  11. Discovery of new cellulases from the metagenome by a metagenomics-guided strategy.

    PubMed

    Yang, Chao; Xia, Yu; Qu, Hong; Li, An-Dong; Liu, Ruihua; Wang, Yubo; Zhang, Tong

    2016-01-01

    Energy shortage has become a global problem. Production of biofuels from renewable biomass resources is an inevitable trend of sustainable development. Cellulose is the most abundant and renewable resource in nature. Lack of new cellulases with unique properties has become the bottleneck of the efficient utilization of cellulose. Environmental metagenomes are regarded as huge reservoirs for a variety of cellulases. However, new cellulases cannot be obtained easily by functional screening of metagenomic libraries. In this work, a metagenomics-guided strategy for obtaining new cellulases from the metagenome was proposed. Metagenomic sequences of DNA extracted from the anaerobic beer lees converting consortium enriched at thermophilic conditions were assembled, and 23 glycoside hydrolase (GH) sequences affiliated with the GH family 5 were identified. Among the 23 GH sequences, three target sequences (designated as cel7482, cel3623 and cel36) showing low identity with known GHs were chosen as the putative cellulase genes to be functionally expressed in Escherichia coli after PCR cloning. The three cellulases were classified as endo-β-1,4-glucanases by product pattern analysis. The recombinant cellulases were more active at pH 5.5 and within a temperature range of 60-70 °C. Computer-assisted 3D structure modeling indicated that the active residues in the active site of the recombinant cellulases were more similar to each other compared with non-active site residues. The recombinant cel7482 was extremely tolerant to 2 M NaCl, suggesting that cel7482 may be a halotolerant cellulase. Moreover, the recombinant cel7482 was shown to resist three ionic liquids (ILs), which are widely used for cellulose pretreatment. Furthermore, active cel7482 was secreted by the twin-arginine translocation (Tat) pathway of Bacillus subtilis 168 into the culture medium, which facilitates the subsequent purification and reduces the formation of inclusion bodies in the context of overexpression. This study demonstrated a simple and efficient method for direct cloning of new cellulase genes from environmental metagenomes. In the future, the metagenomics-guided strategy may be applied to the high-throughput screening of new cellulases from environmental metagenomes.

  12. Vinasse fertirrigation alters soil resistome dynamics: an analysis based on metagenomic profiles.

    PubMed

    Braga, Lucas P P; Alves, Rafael F; Dellias, Marina T F; Navarrete, Acacio A; Basso, Thiago O; Tsai, Siu M

    2017-01-01

    Every year around 300 Gl of vinasse, a by-product of ethanol distillation in sugarcane mills, are flushed into more than 9 Mha of sugarcane cropland in Brazil. This practice links fermentation waste management to fertilization for plant biomass production, and it is known as fertirrigation. Here we evaluate public datasets of soil metagenomes, mining for changes in antibiotic resistance genes (ARGs) of soils from sugarcane mesocosms repeatedly amended with vinasse. The metagenomes were annotated using the ResFam database. We found that the abundance of open reading frames (ORFs) annotated as ARGs changed significantly across 43 different families (p-value < 0.05). Co-occurrence network analysis revealed distinct patterns of interactions among ARGs, suggesting that nutrient amendment to soil microbial communities can impact the coevolutionary dynamics of indigenous ARGs within the soil resistome.

  13. Freshwater Metaviromics and Bacteriophages: A Current Assessment of the State of the Art in Relation to Bioinformatic Challenges

    PubMed Central

    Bruder, Katherine; Malki, Kema; Cooper, Alexandria; Sible, Emily; Shapiro, Jason W.; Watkins, Siobhan C.; Putonti, Catherine

    2016-01-01

    Advances in bioinformatics and sequencing technologies have allowed for the analysis of complex microbial communities at an unprecedented rate. While much focus is often placed on the cellular members of these communities, viruses play a pivotal role, particularly bacteria-infecting viruses (bacteriophages); phages mediate global biogeochemical processes and drive microbial evolution through bacterial grazing and horizontal gene transfer. Despite their importance and ubiquity in nature, very little is known about the diversity and structure of viral communities. Though the need for culture-based methods for viral identification has been somewhat circumvented through metagenomic techniques, the analysis of metaviromic data is marred with many unique issues. In this review, we examine the current bioinformatic approaches for metavirome analyses and the inherent challenges facing the field as illustrated by the ongoing efforts in the exploration of freshwater phage populations. PMID:27375355

  14. Investigation of Microbial Diversity in Geothermal Hot Springs in Unkeshwar, India, Based on 16S rRNA Amplicon Metagenome Sequencing.

    PubMed

    Mehetre, Gajanan T; Paranjpe, Aditi; Dastager, Syed G; Dharne, Mahesh S

    2016-02-25

    Microbial diversity in geothermal waters of the Unkeshwar hot springs in Maharashtra, India, was studied using 16S rRNA amplicon metagenomic sequencing. Taxonomic analysis revealed the presence of Bacteroidetes, Proteobacteria, Cyanobacteria, Actinobacteria, Archaea, and OD1 phyla. Metabolic function prediction analysis indicated a battery of biological information systems indicating rich and novel microbial diversity, with potential biotechnological applications in this niche. Copyright © 2016 Mehetre et al.

  15. Environmental microbiology through the lens of high-throughput DNA sequencing: synopsis of current platforms and bioinformatics approaches.

    PubMed

    Logares, Ramiro; Haverkamp, Thomas H A; Kumar, Surendra; Lanzén, Anders; Nederbragt, Alexander J; Quince, Christopher; Kauserud, Håvard

    2012-10-01

    The incursion of High-Throughput Sequencing (HTS) in environmental microbiology brings unique opportunities and challenges. HTS now allows a high-resolution exploration of the vast taxonomic and metabolic diversity present in the microbial world, which can provide an exceptional insight on global ecosystem functioning, ecological processes and evolution. This exploration has also economic potential, as we will have access to the evolutionary innovation present in microbial metabolisms, which could be used for biotechnological development. HTS is also challenging the research community, and the current bottleneck is present in the data analysis side. At the moment, researchers are in a sequence data deluge, with sequencing throughput advancing faster than the computer power needed for data analysis. However, new tools and approaches are being developed constantly and the whole process could be depicted as a fast co-evolution between sequencing technology, informatics and microbiologists. In this work, we examine the most popular and recently commercialized HTS platforms as well as bioinformatics methods for data handling and analysis used in microbial metagenomics. This non-exhaustive review is intended to serve as a broad state-of-the-art guide to researchers expanding into this rapidly evolving field. Copyright © 2012 Elsevier B.V. All rights reserved.

  16. Limited dissemination of the wastewater treatment plant core resistome.

    PubMed

    Munck, Christian; Albertsen, Mads; Telke, Amar; Ellabaan, Mostafa; Nielsen, Per Halkjær; Sommer, Morten O A

    2015-09-30

    Horizontal gene transfer is a major contributor to the evolution of bacterial genomes and can facilitate the dissemination of antibiotic resistance genes between environmental reservoirs and potential pathogens. Wastewater treatment plants (WWTPs) are believed to play a central role in the dissemination of antibiotic resistance genes. However, the contribution of the dominant members of the WWTP resistome to resistance in human pathogens remains poorly understood. Here we use a combination of metagenomic functional selections and comprehensive metagenomic sequencing to uncover the dominant genes of the WWTP resistome. We find that this core resistome is unique to the WWTP environment, with <10% of the resistance genes found outside the WWTP environment. Our data highlight that, despite an abundance of functional resistance genes within WWTPs, only few genes are found in other environments, suggesting that the overall dissemination of the WWTP resistome is comparable to that of the soil resistome.

  17. Limited dissemination of the wastewater treatment plant core resistome

    PubMed Central

    Munck, Christian; Albertsen, Mads; Telke, Amar; Ellabaan, Mostafa; Nielsen, Per Halkjær; Sommer, Morten O. A.

    2015-01-01

    Horizontal gene transfer is a major contributor to the evolution of bacterial genomes and can facilitate the dissemination of antibiotic resistance genes between environmental reservoirs and potential pathogens. Wastewater treatment plants (WWTPs) are believed to play a central role in the dissemination of antibiotic resistance genes. However, the contribution of the dominant members of the WWTP resistome to resistance in human pathogens remains poorly understood. Here we use a combination of metagenomic functional selections and comprehensive metagenomic sequencing to uncover the dominant genes of the WWTP resistome. We find that this core resistome is unique to the WWTP environment, with <10% of the resistance genes found outside the WWTP environment. Our data highlight that, despite an abundance of functional resistance genes within WWTPs, only few genes are found in other environments, suggesting that the overall dissemination of the WWTP resistome is comparable to that of the soil resistome. PMID:26419330

  18. Cloning and characterization of a novel α-amylase from a fecal microbial metagenome.

    PubMed

    Xu, Bo; Yang, Fuya; Xiong, Caiyun; Li, Junjun; Tang, Xianghua; Zhou, Junpei; Xie, Zhenrong; Ding, Junmei; Yang, Yunjuan; Huang, Zunxi

    2014-04-01

    To isolate novel and useful microbial enzymes from uncultured gastrointestinal microorganisms, a fecal microbial metagenomic library of the pygmy loris was constructed. The library was screened for amylolytic activity, and 8 of 50,000 recombinant clones showed amylolytic activity. Subcloning and sequence analysis of a positive clone led to the identification of a novel gene (amyPL) coding for α-amylase. AmyPL was expressed in Escherichia coli BL21 (DE3) and the purified AmyPL was enzymatically characterized. This study is the first to report the molecular and biochemical characterization of a novel α-amylase from a gastrointestinal metagenomic library.

  19. Exploring Genomic Diversity Using Metagenomics of Deep-Sea Subsurface Microbes from the Louisville Seamount and the South Pacific Gyre

    NASA Astrophysics Data System (ADS)

    Tully, B. J.; Sylvan, J. B.; Heidelberg, J. F.; Huber, J. A.

    2014-12-01

    There are many limitations involved with sampling microbial diversity from deep-sea subsurface environments, ranging from physical sample collection and low microbial biomass to culturing at in situ conditions and inefficient nucleic acid extractions. As such, we are continually modifying our methods to obtain better results and expanding what we know about microbes in these environments. Here we present analysis of metagenome sequences from samples collected from 120 m within the Louisville Seamount and from the top 5-10 cm of the sediment in the center of the South Pacific Gyre (SPG). Both systems are low biomass, with ~10² and ~10⁴ cells per cm³ for the Louisville Seamount samples analyzed and the SPG sediment, respectively. The Louisville Seamount represents the first in situ subseafloor basalt and the SPG sediments represent the first in situ low biomass sediment microbial metagenomes. Both of these environments, subseafloor basalt and sediments underlying oligotrophic ocean gyres, represent large provinces of the seafloor environment that remain understudied. Despite the low biomass and DNA generated from these samples, we have generated 16 near complete genomes (5 from Louisville and 11 from the SPG) from the two metagenomic datasets. These genomes are estimated to be between 51% and 100% complete and span a range of phylogenetic groups, including the Proteobacteria, Actinobacteria, Firmicutes, Chloroflexi, and unclassified bacterial groups. With these genomes, we have assessed potential functional capabilities of these organisms and performed a comparative analysis between the environmental genomes and previously sequenced relatives to determine possible adaptations that may elucidate survival mechanisms for these low energy environments. These methods illustrate a baseline analysis that can be applied to future metagenomic deep-sea subsurface datasets and will help to further our understanding of microbiology within these environments.

  20. Quantifying the biases in metagenome mining for realistic assessment of microbial ecology of naturally fermented foods.

    PubMed

    Keisam, Santosh; Romi, Wahengbam; Ahmed, Giasuddin; Jeyaram, Kumaraswamy

    2016-09-27

    Cultivation-independent investigation of microbial ecology is biased by the DNA extraction methods used. We aimed to quantify those biases by comparative analysis of the metagenome mined from four diverse naturally fermented foods (bamboo shoot, milk, fish, soybean) using eight different DNA extraction methods with different cell lysis principles. Our findings revealed that the enzymatic lysis yielded higher eubacterial and yeast metagenomic DNA from the food matrices compared to the widely used chemical and mechanical lysis principles. Further analysis of the bacterial community structure by Illumina MiSeq amplicon sequencing revealed a high recovery of lactic acid bacteria by the enzymatic lysis in all food types. However, Bacillaceae, Acetobacteraceae, Clostridiaceae and Proteobacteria were more abundantly recovered when mechanical and chemical lysis principles were applied. The biases generated due to the differential recovery of operational taxonomic units (OTUs) by different DNA extraction methods including DNA and PCR amplicons mix from different methods have been quantitatively demonstrated here. The different methods shared only 29.9-52.0% of the total OTUs recovered. Although similar comparative research has been performed on other ecological niches, this is the first in-depth investigation of quantifying the biases in metagenome mining from naturally fermented foods.
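
    The shared-OTU figure quoted in this abstract can be illustrated with a small set calculation. This is a minimal sketch, assuming "shared" means the OTUs recovered by both methods as a percentage of all OTUs recovered by either one; the abstract does not give the exact definition, and the function name and OTU labels below are hypothetical.

```python
def shared_otu_percent(otus_a, otus_b):
    """Percentage of the combined OTU pool that both methods recovered.

    Assumed definition: |A intersect B| / |A union B| * 100.
    """
    union = otus_a | otus_b
    if not union:
        return 0.0
    return 100.0 * len(otus_a & otus_b) / len(union)


# Made-up OTU sets for two DNA extraction methods.
enzymatic = {"OTU1", "OTU2", "OTU3", "OTU4"}
mechanical = {"OTU3", "OTU4", "OTU5", "OTU6"}

percent = shared_otu_percent(enzymatic, mechanical)
print(f"shared OTUs: {percent:.1f}%")
```

    With more than two methods, the same function can be applied to each pair, or the intersection and union can be taken across all sets at once.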

  1. Abundance and functional diversity of riboswitches in microbial communities

    PubMed Central

    Kazanov, Marat D; Vitreschak, Alexey G; Gelfand, Mikhail S

    2007-01-01

    Background Several recently completed large-scale environmental sequencing projects produced a large amount of genetic information about microbial communities ('metagenomes') which is not biased towards cultured organisms. It is a good source for estimation of the abundance of genes and regulatory structures in both known and unknown members of microbial communities. In this study we consider the distribution of RNA regulatory structures, riboswitches, in the Sargasso Sea, Minnesota Soil and Whale Falls metagenomes. Results Over three hundred riboswitches were found in about 2 Gbp of metagenome DNA sequences. The abundance of riboswitches in metagenomes was highest for the TPP, B12 and GCVT riboswitches; the S-box, RFN, YKKC/YXKD, YYBP/YKOY regulatory elements showed lower but significant abundance, while the LYS, G-box, GLMS and YKOK riboswitches were rare. Regions downstream of identified riboswitches were scanned for open reading frames. Comparative analysis of identified ORFs revealed new riboswitch-regulated functions for several classes of riboswitches. In particular, we have observed phosphoserine aminotransferase serC (COG1932) and malate synthase glcB (COG2225) to be regulated by the glycine (GCVT) riboswitch; fatty acid desaturase ole1 (COG1398), by the cobalamin (B12) riboswitch; 5-methylthioribose-1-phosphate isomerase ykrS (COG0182), by the SAM-riboswitch. We also identified conserved riboswitches upstream of genes of unknown function: thiamine (TPP), cobalamin (B12), and glycine (GCVT, upstream of genes from COG4198). Conclusion This study demonstrates applicability of bioinformatics to the analysis of RNA regulatory structures in metagenomes. PMID:17908319

  2. Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage.

    PubMed

    Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.

  3. Identification of Nitrogen-Fixing Genes and Gene Clusters from Metagenomic Library of Acid Mine Drainage

    PubMed Central

    Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417

  4. Sequence-based screening for self-sufficient P450 monooxygenase from a metagenome library.

    PubMed

    Kim, B S; Kim, S Y; Park, J; Park, W; Hwang, K Y; Yoon, Y J; Oh, W K; Kim, B Y; Ahn, J S

    2007-05-01

    Cytochrome P450 monooxygenases (CYPs) are useful catalysts for oxidation reactions. Self-sufficient CYPs harbour a reductive domain covalently connected to a P450 domain and are known for their robust catalytic activity with great potential as biocatalysts. In an effort to expand genetic sources of self-sufficient CYPs, we devised a sequence-based screening system to identify them in a soil metagenome. We constructed a soil metagenome library and performed sequence-based screening for self-sufficient CYP genes. A new CYP gene, syk181, was identified from the metagenome library. Phylogenetic analysis revealed that SYK181 formed a distinct phylogenetic line with 46% amino-acid-sequence identity to CYP102A1, which has been extensively studied as a fatty acid hydroxylase. The heterologously expressed SYK181 showed significant hydroxylase activity towards naphthalene and phenanthrene as well as towards fatty acids. Sequence-based screening of metagenome libraries is expected to be a useful approach for searching for self-sufficient CYP genes. The translated product of syk181 shows self-sufficient hydroxylase activity towards fatty acids and aromatic compounds. SYK181 is the first self-sufficient CYP obtained directly from a metagenome library. The genetic and biochemical information on SYK181 is expected to be helpful for engineering self-sufficient CYPs with broader catalytic activities towards various substrates, which would be useful for bioconversion of natural products and biodegradation of organic chemicals.

  5. Automated and Accurate Estimation of Gene Family Abundance from Shotgun Metagenomes

    PubMed Central

    Nayfach, Stephen; Bradley, Patrick H.; Wyman, Stacia K.; Laurent, Timothy J.; Williams, Alex; Eisen, Jonathan A.; Pollard, Katherine S.; Sharpton, Thomas J.

    2015-01-01

    Shotgun metagenomic DNA sequencing is a widely applicable tool for characterizing the functions that are encoded by microbial communities. Several bioinformatic tools can be used to functionally annotate metagenomes, allowing researchers to draw inferences about the functional potential of the community and to identify putative functional biomarkers. However, little is known about how decisions made during annotation affect the reliability of the results. Here, we use statistical simulations to rigorously assess how to optimize annotation accuracy and speed, given parameters of the input data like read length and library size. We identify best practices in metagenome annotation and use them to guide the development of the Shotgun Metagenome Annotation Pipeline (ShotMAP). ShotMAP is an analytically flexible, end-to-end annotation pipeline that can be implemented either on a local computer or a cloud compute cluster. We use ShotMAP to assess how different annotation databases impact the interpretation of how marine metagenome and metatranscriptome functional capacity changes across seasons. We also apply ShotMAP to data obtained from a clinical microbiome investigation of inflammatory bowel disease. This analysis finds that gut microbiota collected from Crohn’s disease patients are functionally distinct from gut microbiota collected from either ulcerative colitis patients or healthy controls, with differential abundance of metabolic pathways related to host-microbiome interactions that may serve as putative biomarkers of disease. PMID:26565399

  6. Metagenomic Taxonomy-Guided Database-Searching Strategy for Improving Metaproteomic Analysis.

    PubMed

    Xiao, Jinqiu; Tanca, Alessandro; Jia, Ben; Yang, Runqing; Wang, Bo; Zhang, Yu; Li, Jing

    2018-04-06

    Metaproteomics provides a direct measure of the functional information by investigating all proteins expressed by a microbiota. However, due to the complexity and heterogeneity of microbial communities, it is very hard to construct a sequence database suitable for a metaproteomic study. Using a public database, researchers might not be able to identify proteins from poorly characterized microbial species, while a sequencing-based metagenomic database may not provide adequate coverage for all potentially expressed protein sequences. To address this challenge, we propose a metagenomic taxonomy-guided database-search strategy (MT), in which a merged database is employed, consisting of both taxonomy-guided reference protein sequences from public databases and proteins from metagenome assembly. By applying our MT strategy to a mock microbial mixture, about two times as many peptides were detected as with the metagenomic database only. According to the evaluation of the reliability of taxonomic attribution, the rate of misassignments was comparable to that obtained using an a priori matched database. We also evaluated the MT strategy with a human gut microbial sample, and we found 1.7 times as many peptides as using a standard metagenomic database. In conclusion, our MT strategy allows the construction of databases able to provide high sensitivity and precision in peptide identification in metaproteomic studies, enabling the detection of proteins from poorly characterized species within the microbiota.

  7. Metavir 2: new tools for viral metagenome comparison and assembled virome analysis

    PubMed Central

    2014-01-01

    Background Metagenomics, based on culture-independent sequencing, is a well-fitted approach to provide insights into the composition, structure and dynamics of environmental viral communities. Following recent advances in sequencing technologies, new challenges arise for existing bioinformatic tools dedicated to viral metagenome (i.e. virome) analysis as (i) the number of viromes is rapidly growing and (ii) large genomic fragments can now be obtained by assembling the huge amount of sequence data generated for each metagenome. Results To face these challenges, a new version of Metavir was developed. First, all Metavir tools have been adapted to support comparative analysis of viromes in order to improve the analysis of multiple datasets. In addition to the sequence comparison previously provided, viromes can now be compared through their k-mer frequencies, their taxonomic compositions, recruitment plots and phylogenetic trees containing sequences from different datasets. Second, a new section has been specifically designed to handle assembled viromes made of thousands of large genomic fragments (i.e. contigs). This section includes an annotation pipeline for uploaded viral contigs (gene prediction, similarity search against reference viral genomes and protein domains) and an extensive comparison between contigs and reference genomes. Contigs and their annotations can be explored on the website through specifically developed dynamic genomic maps and interactive networks. Conclusions The new features of Metavir 2 allow users to explore and analyze viromes composed of raw reads or assembled fragments through a set of adapted tools and a user-friendly interface. PMID:24646187
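
    The k-mer frequency comparison mentioned in this abstract can be sketched in a few lines. The code below is a generic illustration, not Metavir's implementation: the function names, the choice of Euclidean distance, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(sequences, k=4):
    """Normalized k-mer frequency vector over the A/C/G/T alphabet.

    K-mers containing any other character (for example N) are skipped.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[km] / total if total else 0.0 for km in all_kmers]


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy "viromes": lists of reads or contigs (made-up sequences).
virome_a = ["ACGTACGTAC", "TTGACCA"]
virome_b = ["ACGTACGTAC", "TTGACCA"]
virome_c = ["GGGGGGGGGG", "CCCCCCC"]

profile_a = kmer_profile(virome_a, k=2)
d_ab = euclidean_distance(profile_a, kmer_profile(virome_b, k=2))
d_ac = euclidean_distance(profile_a, kmer_profile(virome_c, k=2))
print(d_ab, d_ac)
```

    Identical datasets give a distance of zero, and datasets with different composition give a larger value. A matrix of such pairwise distances is a common input for clustering or ordination of many samples.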

  8. PathoScope 2.0: a complete computational framework for strain identification in environmental or clinical sequencing samples

    PubMed Central

    2014-01-01

    Background Recent innovations in sequencing technologies have provided researchers with the ability to rapidly characterize the microbial content of an environmental or clinical sample with unprecedented resolution. These approaches are producing a wealth of information that is providing novel insights into the microbial ecology of the environment and human health. However, these sequencing-based approaches produce large and complex datasets that require efficient and sensitive computational analysis workflows. Many recent tools for analyzing metagenomic sequencing data have emerged; however, these approaches often suffer from issues of specificity and efficiency, and typically do not include a complete metagenomic analysis framework. Results We present PathoScope 2.0, a complete bioinformatics framework for rapidly and accurately quantifying the proportions of reads from individual microbial strains present in metagenomic sequencing data from environmental or clinical samples. The pipeline performs all necessary computational analysis steps, including reference genome library extraction and indexing, read quality control and alignment, strain identification, and summarization and annotation of results. We rigorously evaluated PathoScope 2.0 using simulated data and data from the 2011 outbreak of Shiga-toxigenic Escherichia coli O104:H4. Conclusions The results show that PathoScope 2.0 is a complete, highly sensitive, and efficient approach for metagenomic analysis that outperforms alternative approaches in scope, speed, and accuracy. The PathoScope 2.0 pipeline software is freely available for download at: http://sourceforge.net/projects/pathoscope/. PMID:25225611

  9. RNA viral metagenome of whiteflies leads to the discovery and characterization of a whitefly-transmitted carlavirus in North America.

    PubMed

    Rosario, Karyna; Capobianco, Heather; Ng, Terry Fei Fan; Breitbart, Mya; Polston, Jane E

    2014-01-01

    Whiteflies from the Bemisia tabaci species complex have the ability to transmit a large number of plant viruses and are some of the most detrimental pests in agriculture. Although whiteflies are known to transmit both DNA and RNA viruses, most of the diversity has been recorded for the former, specifically for the Begomovirus genus. This study investigated the total diversity of DNA and RNA viruses found in whiteflies collected from a single site in Florida to evaluate if there are additional, previously undetected viral types within the B. tabaci vector. Metagenomic analysis of viral DNA extracted from the whiteflies only resulted in the detection of begomoviruses. In contrast, whiteflies contained sequences similar to RNA viruses from divergent groups, with a diversity that extends beyond currently described viruses. The metagenomic analysis of whiteflies also led to the first report of a whitefly-transmitted RNA virus similar to Cowpea mild mottle virus (CpMMV Florida) (genus Carlavirus) in North America. Further investigation resulted in the detection of CpMMV Florida in native and cultivated plants growing near the original field site of whitefly collection and determination of its experimental host range. Analysis of complete CpMMV Florida genomes recovered from whiteflies and plants suggests that the current classification criteria for carlaviruses need to be reevaluated. Overall, metagenomic analysis supports that DNA plant viruses carried by B. tabaci are dominated by begomoviruses, whereas significantly less is known about RNA viruses present in this damaging insect vector.

  10. The Intestinal Microbiota in Colorectal Cancer.

    PubMed

    Tilg, Herbert; Adolph, Timon E; Gerner, Romana R; Moschen, Alexander R

    2018-06-11

    Experimental evidence from the past years highlights a key role for the intestinal microbiota in inflammatory and malignant gastrointestinal diseases. Diet exhibits a strong impact on microbial composition and provides risk for developing colorectal carcinoma (CRC). Large metagenomic studies in human CRC associated microbiome signatures with the colorectal adenoma-carcinoma sequence, suggesting a fundamental role of the intestinal microbiota in the evolution of gastrointestinal malignancy. Basic science established a critical function for the intestinal microbiota in promoting tumorigenesis. Further studies are needed to decipher the mechanisms of tumor promotion and microbial co-evolution in CRC, which may be exploited therapeutically in the future. Copyright © 2018 Elsevier Inc. All rights reserved.

  11. Metagenomic Analysis of the Sponge Discodermia Reveals the Production of the Cyanobacterial Natural Product Kasumigamide by 'Entotheonella'.

    PubMed

    Nakashima, Yu; Egami, Yoko; Kimura, Miki; Wakimoto, Toshiyuki; Abe, Ikuro

    2016-01-01

    Sponge metagenomes are a useful platform to mine cryptic biosynthetic gene clusters responsible for production of natural products involved in the sponge-microbe association. Since numerous sponge-derived bioactive metabolites are biosynthesized by the symbiotic bacteria, this strategy may concurrently reveal sponge-symbiont produced compounds. Accordingly, a metagenomic analysis of the Japanese marine sponge Discodermia calyx has resulted in the identification of a hybrid type I polyketide synthase-nonribosomal peptide synthetase gene (kas). Bioinformatic analysis of the gene product suggested its involvement in the biosynthesis of kasumigamide, a tetrapeptide originally isolated from the freshwater free-living cyanobacterium Microcystis aeruginosa NIES-87. Subsequent investigation of the sponge metabolic profile revealed the presence of kasumigamide in the sponge extract. The kasumigamide-producing bacterium was identified as an 'Entotheonella' sp. Moreover, an in silico analysis of kas gene homologs uncovered the presence of kas family genes in two additional bacteria from different phyla. The production of kasumigamide by multiple distantly related bacterial strains implicates horizontal gene transfer and raises the potential for a wider distribution across other bacterial groups.

  12. X-ray structure of a two-domain type laccase: a missing link in the evolution of multi-copper proteins.

    PubMed

    Komori, Hirofumi; Miyazaki, Kentaro; Higuchi, Yoshiki

    2009-04-02

    A multi-copper protein with two cupredoxin-like domains was identified from our in-house metagenomic database. The recombinant protein, mgLAC, contained four copper ions/subunits, oxidized various phenolic and non-phenolic substrates, and had spectroscopic properties similar to common laccases. X-ray structure analysis revealed a homotrimeric architecture for this enzyme, which resembles nitrite reductase (NIR). However, a difference in copper coordination was found at the domain interface. mgLAC contains a T2/T3 tri-nuclear copper cluster at this site, whereas a mononuclear T2 copper occupies this position in NIR. The trimer is thus an essential part of the architecture of two-domain multi-copper proteins, and mgLAC may be an evolutionary precursor of NIR.

  13. Metagenomic analysis reveals significant changes of microbial compositions and protective functions during drinking water treatment.

    PubMed

    Chao, Yuanqing; Ma, Liping; Yang, Ying; Ju, Feng; Zhang, Xu-Xiang; Wu, Wei-Min; Zhang, Tong

    2013-12-19

    The metagenomic approach was applied to characterize variations of microbial structure and functions in raw water (RW) and treated water (TW) in a drinking water treatment plant (DWTP) in the Pearl River Delta, China. Microbial structure was significantly influenced by the treatment processes, shifting from Gammaproteobacteria and Betaproteobacteria in RW to Alphaproteobacteria in TW. Further functional analysis indicated that the basic metabolic functions of microorganisms in TW did not vary considerably. However, protective functions, i.e. glutathione synthesis genes in the 'oxidative stress' and 'detoxification' subsystems, significantly increased, suggesting that the surviving bacteria may have higher chlorine resistance. Similar results were found for the glutathione metabolism pathway: the analysis identified the major reaction for glutathione synthesis and indicated that more genes for glutathione metabolism were present in TW. This metagenomic study greatly enhanced our knowledge of the influence of treatment processes, especially chlorination, on bacterial community structure and protective functions (e.g. glutathione metabolism) in the ecosystems of DWTPs.

  14. Comparative (Meta)genomic Analysis and Ecological Profiling of Human Gut-Specific Bacteriophage φB124-14

    PubMed Central

    Ogilvie, Lesley A.; Caplin, Jonathan; Dedi, Cinzia; Diston, David; Cheek, Elizabeth; Bowler, Lucas; Taylor, Huw; Ebdon, James; Jones, Brian V.

    2012-01-01

    Bacteriophage associated with the human gut microbiome are likely to have an important impact on community structure and function, and provide a wealth of biotechnological opportunities. Despite this, knowledge of the ecology and composition of bacteriophage in the gut bacterial community remains poor, with few well characterized gut-associated phage genomes currently available. Here we describe the identification and in-depth (meta)genomic, proteomic, and ecological analysis of a human gut-specific bacteriophage (designated φB124-14). In doing so we illuminate a fraction of the biological dark matter extant in this ecosystem and its surrounding eco-genomic landscape, identifying a novel and uncharted bacteriophage gene-space in this community. φB124-14 infects only a subset of closely related gut-associated Bacteroides fragilis strains, and the circular genome encodes functions previously found to be rare in viral genomes and human gut viral metagenome sequences, including those which potentially confer advantages upon phage and/or host bacteria. Comparative genomic analyses revealed φB124-14 is most closely related to φB40-8, the only other publically available Bacteroides sp. phage genome, whilst comparative metagenomic analysis of both phage failed to identify any homologous sequences in 136 non-human gut metagenomic datasets searched, supporting the human gut-specific nature of this phage. Moreover, a potential geographic variation in the carriage of these and related phage was revealed by analysis of their distribution and prevalence within 151 human gut microbiomes and viromes from Europe, America and Japan. Finally, ecological profiling of φB124-14 and φB40-8, using both gene-centric alignment-driven phylogenetic analyses, as well as alignment-free gene-independent approaches was undertaken. 
    This not only verified the human gut-specific nature of both phage, but also indicated that these phage populate a distinct and unexplored ecological landscape within the human gut microbiome. PMID:22558115

  15. Computational prediction of CRISPR cassettes in gut metagenome samples from Chinese type-2 diabetic patients and healthy controls.

    PubMed

    Mangericao, Tatiana C; Peng, Zhanhao; Zhang, Xuegong

    2016-01-11

    CRISPR has become a hot topic as a powerful technique for genome editing in humans and other higher organisms. The original CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats coupled with CRISPR-associated proteins) is an important adaptive defence system for prokaryotes that provides resistance against invading elements such as viruses and plasmids. A CRISPR cassette contains short nucleotide sequences called spacers. These unique regions retain a history of the interactions between prokaryotes and their invaders in individual strains and ecosystems. One important ecosystem in the human body is the human gut, a rich habitat populated by a great diversity of microorganisms. Gut microbiomes are important for human physiology and health. Metagenome sequencing has been widely applied to study gut microbiomes. Most efforts in metagenome studies have focused on profiling taxa compositions and gene catalogues and identifying their associations with human health. Less attention has been paid to the analysis of the ecosystems of the microbiomes themselves, especially their CRISPR composition. We conducted a preliminary analysis of CRISPR sequences in a human gut metagenomic data set of Chinese type-2 diabetes patients and healthy controls. Applying an available CRISPR-identification algorithm, PILER-CR, we identified 3169 CRISPR cassettes in the data, from which we constructed a set of 1302 unique repeat sequences and 36,709 spacers. A more extensive analysis was made of the CRISPR repeats: these repeats were submitted to a more comprehensive clustering and classification using the web server tool CRISPRmap. All repeats were compared with known CRISPRs in the database CRISPRdb. A total of 784 repeats had matches in the database, and the remaining 518 repeats from our set are potentially novel. The computational analysis of CRISPR composition based on contigs of metagenome sequencing data is feasible. It provides an efficient approach for finding potential novel CRISPR arrays and for analysing the ecosystem and history of human microbiomes.

  16. Analysis of the global ocean sampling (GOS) project for trends in iron uptake by surface ocean microbes.

    PubMed

    Toulza, Eve; Tagliabue, Alessandro; Blain, Stéphane; Piganeau, Gwenael

    2012-01-01

    Microbial metagenomes are DNA samples of the most abundant, and therefore most successful organisms at the sampling time and location for a given cell size range. The study of microbial communities via their DNA content has revolutionized our understanding of microbial ecology and evolution. Iron availability is a critical resource that limits microbial communities' growth in many oceanic areas. Here, we built a database of 2319 sequences, corresponding to 140 gene families of iron metabolism with a large phylogenetic spread, to explore the microbial strategies of iron acquisition in the ocean's bacterial community. We estimate iron metabolism strategies from metagenome gene content and investigate whether their prevalence varies with dissolved iron concentrations obtained from a biogeochemical model. We show significant quantitative and qualitative variations in iron metabolism pathways, with a higher proportion of iron metabolism genes in low iron environments. We found a striking difference between coastal and open ocean sites regarding Fe(2+) versus Fe(3+) uptake gene prevalence. We also show that non-specific siderophore uptake increases in low iron open ocean environments, suggesting bacteria may acquire iron from natural siderophore-like organic complexes. Despite the lack of knowledge of iron uptake mechanisms in most marine microorganisms, our approach provides insights into how the iron metabolic pathways of microbial communities may vary with seawater iron concentrations.

  17. Prokaryote genome fluidity: toward a system approach of the mobilome.

    PubMed

    Toussaint, Ariane; Chandler, Mick

    2012-01-01

    The importance of horizontal/lateral gene transfer (LGT) in shaping the genomes of prokaryotic organisms has been recognized in recent years as a result of analysis of the increasing number of available genome sequences. LGT is largely due to the transfer and recombination activities of mobile genetic elements (MGEs). Bacterial and archaeal genomes are mosaics of vertically and horizontally transmitted DNA segments. This generates reticulate relationships between members of the prokaryotic world that are better represented by networks than by "classical" phylogenetic trees. In this review we summarize the nature and activities of MGEs, and the problems that presently limit their analysis on a large scale. We propose routes to improve their annotation in the flow of genomic and metagenomic sequences that currently exist and those that become available. We describe network analysis of evolutionary relationships among some MGE categories and sketch out possible developments of this type of approach to get more insight into the role of the mobilome in bacterial adaptation and evolution.

  18. Stratification of co-evolving genomic groups using ranked phylogenetic profiles

    PubMed Central

    Freilich, Shiri; Goldovsky, Leon; Gottlieb, Assaf; Blanc, Eric; Tsoka, Sophia; Ouzounis, Christos A

    2009-01-01

    Background Previous methods of detecting the taxonomic origins of arbitrary sequence collections, with a significant impact to genome analysis and in particular metagenomics, have primarily focused on compositional features of genomes. The evolutionary patterns of phylogenetic distribution of genes or proteins, represented by phylogenetic profiles, provide an alternative approach for the detection of taxonomic origins, but typically suffer from low accuracy. Herein, we present rank-BLAST, a novel approach for the assignment of protein sequences into genomic groups of the same taxonomic origin, based on the ranking order of phylogenetic profiles of target genes or proteins across the reference database. Results The rank-BLAST approach is validated by computing the phylogenetic profiles of all sequences for five distinct microbial species of varying degrees of phylogenetic proximity, against a reference database of 243 fully sequenced genomes. The approach - a combination of sequence searches, statistical estimation and clustering - analyses the degree of sequence divergence between sets of protein sequences and allows the classification of protein sequences according to the species of origin with high accuracy, allowing taxonomic classification of 64% of the proteins studied. In most cases, a main cluster is detected, representing the corresponding species. Secondary, functionally distinct and species-specific clusters exhibit different patterns of phylogenetic distribution, thus flagging gene groups of interest. Detailed analyses of such cases are provided as examples. Conclusion Our results indicate that the rank-BLAST approach can capture the taxonomic origins of sequence collections in an accurate and efficient manner. The approach can be useful both for the analysis of genome evolution and the detection of species groups in metagenomics samples. PMID:19860884

  19. Draft Genome Sequence of Thermotoga maritima A7A Reconstructed from Metagenomic Sequencing Analysis of a Hydrocarbon Reservoir in the Bass Strait, Australia

    PubMed Central

    Sutcliffe, Brodie; Rosewarne, Carly P.; Greenfield, Paul; Li, Dongmei

    2013-01-01

    The draft genome sequence of Thermotoga maritima A7A was obtained from a metagenomic assembly obtained from a high-temperature hydrocarbon reservoir in the Gippsland Basin, Australia. The organism is predicted to be a motile anaerobe with an array of catabolic enzymes for the degradation of numerous carbohydrates. PMID:24009120

  20. Analysis of Microbial Functions in the Rhizosphere Using a Metabolic-Network Based Framework for Metagenomics Interpretation

    PubMed Central

    Ofaim, Shany; Ofek-Lalzar, Maya; Sela, Noa; Jinag, Jiandong; Kashi, Yechezkel; Minz, Dror; Freilich, Shiri

    2017-01-01

    Advances in metagenomics enable high resolution description of complex bacterial communities in their natural environments. Consequently, conceptual approaches for community level functional analysis are in high need. Here, we introduce a framework for a metagenomics-based analysis of community functions. Environment-specific gene catalogs, derived from metagenomes, are processed into metabolic-network representation. By applying established ecological conventions, network-edges (metabolic functions) are assigned with taxonomic annotations according to the dominance level of specific groups. Once a function-taxonomy link is established, prediction of the impact of dominant taxa on the overall community performances is assessed by simulating removal or addition of edges (taxa associated functions). This approach is demonstrated on metagenomic data describing the microbial communities from the root environment of two crop plants – wheat and cucumber. Predictions for environment-dependent effects revealed differences between treatments (root vs. soil), corresponding to documented observations. Metabolism of specific plant exudates (e.g., organic acids, flavonoids) was linked with distinct taxonomic groups in simulated root, but not soil, environments. These dependencies point to the impact of these metabolite families as determinants of community structure. Simulations of the activity of pairwise combinations of taxonomic groups (order level) predicted the possible production of complementary metabolites. Complementation profiles allow formulating a possible metabolic role for observed co-occurrence patterns. For example, production of tryptophan-associated metabolites through complementary interactions is unique to the tryptophan-deficient cucumber root environment. Our approach enables formulation of testable predictions for species contribution to community activity and exploration of the functional outcome of structural shifts in complex bacterial communities. 
    Understanding community-level metabolism is an essential step toward the manipulation and optimization of microbial function. Here, we introduce an analysis framework addressing three key challenges of such data: producing quantified links between taxonomy and function; contextualizing discrete functions into communal networks; and simulating environmental impact on community performances. New technologies will soon provide a high-coverage description of biotic and abiotic aspects of complex microbial communities such as those found in gut and soil. This framework was designed to allow the integration of high-throughput metabolomic and metagenomic data toward tackling the intricate associations between community structure, community function, and metabolic inputs. PMID:28878756

  1. Strain-Level Metagenomic Analysis of the Fermented Dairy Beverage Nunu Highlights Potential Food Safety Risks.

    PubMed

    Walsh, Aaron M; Crispie, Fiona; Daari, Kareem; O'Sullivan, Orla; Martin, Jennifer C; Arthur, Cornelius T; Claesson, Marcus J; Scott, Karen P; Cotter, Paul D

    2017-08-15

    The rapid detection of pathogenic strains in food products is essential for the prevention of disease outbreaks. It has already been demonstrated that whole-metagenome shotgun sequencing can be used to detect pathogens in food but, until recently, strain-level detection of pathogens has relied on whole-metagenome assembly, which is a computationally demanding process. Here we demonstrated that three short-read-alignment-based methods, i.e., MetaMLST, PanPhlAn, and StrainPhlAn, could accurately and rapidly identify pathogenic strains in spinach metagenomes that had been intentionally spiked with Shiga toxin-producing Escherichia coli in a previous study. Subsequently, we employed the methods, in combination with other metagenomics approaches, to assess the safety of nunu, a traditional Ghanaian fermented milk product that is produced by the spontaneous fermentation of raw cow milk. We showed that nunu samples were frequently contaminated with bacteria associated with the bovine gut and, worryingly, we detected putatively pathogenic E. coli and Klebsiella pneumoniae strains in a subset of nunu samples. Ultimately, our work establishes that short-read-alignment-based bioinformatics approaches are suitable food safety tools, and we describe a real-life example of their utilization. IMPORTANCE Foodborne pathogens are responsible for millions of illnesses each year. Here we demonstrate that short-read-alignment-based bioinformatics tools can accurately and rapidly detect pathogenic strains in food products by using shotgun metagenomics data. The methods used here are considerably faster than both traditional culturing methods and alternative bioinformatics approaches that rely on metagenome assembly; therefore, they can potentially be used for more high-throughput food safety testing. Overall, our results suggest that whole-metagenome sequencing can be used as a practical food safety tool to prevent diseases or to link outbreaks to specific food products. 
    Copyright © 2017 American Society for Microbiology.

  2. Comparative genomic analysis reveals 2-oxoacid dehydrogenase complex lipoylation correlation with aerobiosis in archaea.

    PubMed

    Borziak, Kirill; Posner, Mareike G; Upadhyay, Abhishek; Danson, Michael J; Bagby, Stefan; Dorus, Steve

    2014-01-01

    Metagenomic analyses have advanced our understanding of ecological microbial diversity, but to what extent can metagenomic data be used to predict the metabolic capacity of difficult-to-study organisms and their abiotic environmental interactions? We tackle this question, using a comparative genomic approach, by considering the molecular basis of aerobiosis within archaea. Lipoylation, the covalent attachment of lipoic acid to 2-oxoacid dehydrogenase multienzyme complexes (OADHCs), is essential for metabolism in aerobic bacteria and eukarya. Lipoylation is catalysed either by lipoate protein ligase (LplA), which in archaea is typically encoded by two genes (LplA-N and LplA-C), or by a lipoyl(octanoyl) transferase (LipB or LipM) plus a lipoic acid synthetase (LipA). Does the genomic presence of lipoylation and OADHC genes across archaea from diverse habitats correlate with aerobiosis? First, analyses of 11,826 biotin protein ligase (BPL)-LplA-LipB transferase family members and 147 archaeal genomes identified 85 species with lipoylation capabilities and provided support for multiple ancestral acquisitions of lipoylation pathways during archaeal evolution. Second, with the exception of the Sulfolobales order, the majority of species possessing lipoylation systems exclusively retain LplA, or either LipB or LipM, consistent with archaeal genome streamlining. Third, obligate anaerobic archaea display widespread loss of lipoylation and OADHC genes. Conversely, a high level of correspondence is observed between aerobiosis and the presence of LplA/LipB/LipM, LipA and OADHC E2, consistent with the role of lipoylation in aerobic metabolism. This correspondence between OADHC lipoylation capacity and aerobiosis indicates that genomic pathway profiling in archaea is informative and that well characterized pathways may be predictive in relation to abiotic conditions in difficult-to-study extremophiles. 
    Given the highly variable retention of gene repertoires across the archaea, the extension of comparative genomic pathway profiling to broader metabolic and homeostasis networks should be useful in revealing characteristics from metagenomic datasets related to adaptations to diverse environments.

  3. Genomic and metagenomic surveys of hydrogenase distribution indicate H2 is a widely utilised energy source for microbial growth and survival

    PubMed Central

    Greening, Chris; Biswas, Ambarish; Carere, Carlo R; Jackson, Colin J; Taylor, Matthew C; Stott, Matthew B; Cook, Gregory M; Morales, Sergio E

    2016-01-01

    Recent physiological and ecological studies have challenged the long-held belief that microbial metabolism of molecular hydrogen (H2) is a niche process. To gain a broader insight into the importance of microbial H2 metabolism, we comprehensively surveyed the genomic and metagenomic distribution of hydrogenases, the reversible enzymes that catalyse the oxidation and evolution of H2. The protein sequences of 3286 non-redundant putative hydrogenases were curated from publicly available databases. These metalloenzymes were classified into multiple groups based on (1) amino acid sequence phylogeny, (2) metal-binding motifs, (3) predicted genetic organisation and (4) reported biochemical characteristics. Four groups (22 subgroups) of [NiFe]-hydrogenase, three groups (6 subtypes) of [FeFe]-hydrogenases and a small group of [Fe]-hydrogenases were identified. We predict that this hydrogenase diversity supports H2-based respiration, fermentation and carbon fixation processes in both oxic and anoxic environments, in addition to various H2-sensing, electron-bifurcation and energy-conversion mechanisms. Hydrogenase-encoding genes were identified in 51 bacterial and archaeal phyla, suggesting strong pressure for both vertical and lateral acquisition. Furthermore, hydrogenase genes could be recovered from diverse terrestrial, aquatic and host-associated metagenomes in varying proportions, indicating a broad ecological distribution and utilisation. Oxygen content (pO2) appears to be a central factor driving the phylum- and ecosystem-level distribution of these genes. In addition to compounding evidence that H2 was the first electron donor for life, our analysis suggests that the great diversification of hydrogenases has enabled H2 metabolism to sustain the growth or survival of microorganisms in a wide range of ecosystems to the present day. 
    This work also provides a comprehensive expanded system for classifying hydrogenases and identifies new prospects for investigating H2 metabolism. PMID:26405831

  4. Making a living while starving in the dark: metagenomic insights into the energy dynamics of a carbonate cave.

    PubMed

    Ortiz, Marianyoly; Legatzki, Antje; Neilson, Julia W; Fryslie, Brandon; Nelson, William M; Wing, Rod A; Soderlund, Carol A; Pryor, Barry M; Maier, Raina M

    2014-02-01

    Carbonate caves represent subterranean ecosystems that are largely devoid of phototrophic primary production. In semiarid and arid regions, allochthonous organic carbon inputs entering caves with vadose-zone drip water are minimal, creating highly oligotrophic conditions; however, past research indicates that carbonate speleothem surfaces in these caves support diverse, predominantly heterotrophic prokaryotic communities. The current study applied a metagenomic approach to elucidate the community structure and potential energy dynamics of microbial communities, colonizing speleothem surfaces in Kartchner Caverns, a carbonate cave in semiarid, southeastern Arizona, USA. Manual inspection of a speleothem metagenome revealed a community genetically adapted to low-nutrient conditions with indications that a nitrogen-based primary production strategy is probable, including contributions from both Archaea and Bacteria. Genes for all six known CO2-fixation pathways were detected in the metagenome and RuBisCo genes representative of the Calvin-Benson-Bassham cycle were over-represented in Kartchner speleothem metagenomes relative to bulk soil, rhizosphere soil and deep-ocean communities. Intriguingly, quantitative PCR found Archaea to be significantly more abundant in the cave communities than in soils above the cave. MEtaGenome ANalyzer (MEGAN) analysis of speleothem metagenome sequence reads found Thaumarchaeota to be the third most abundant phylum in the community, and identified taxonomic associations to this phylum for indicator genes representative of multiple CO2-fixation pathways. The results revealed that this oligotrophic subterranean environment supports a unique chemoautotrophic microbial community with potentially novel nutrient cycling strategies. These strategies may provide key insights into other ecosystems dominated by oligotrophy, including aphotic subsurface soils or aquifers and photic systems such as arid deserts.

  5. Functional metagenomics reveals novel β-galactosidases not predictable from gene sequences.

    PubMed

    Cheng, Jiujun; Romantsov, Tatyana; Engel, Katja; Doxey, Andrew C; Rose, David R; Neufeld, Josh D; Charles, Trevor C

    2017-01-01

    The techniques of metagenomics have allowed researchers to access the genomic potential of uncultivated microbes, but there remain significant barriers to determination of gene function based on DNA sequence alone. Functional metagenomics, in which DNA is cloned and expressed in surrogate hosts, can overcome these barriers, and make important contributions to the discovery of novel enzymes. In this study, a soil metagenomic library carried in an IncP cosmid was used for functional complementation for β-galactosidase activity in both Sinorhizobium meliloti (α-Proteobacteria) and Escherichia coli (γ-Proteobacteria) backgrounds. One β-galactosidase, encoded by six overlapping clones that were selected in both hosts, was identified as a member of glycoside hydrolase family 2. We could not identify ORFs obviously encoding possible β-galactosidases in 19 other sequenced clones that were only able to complement S. meliloti. Based on low sequence identity to other known glycoside hydrolases, yet not β-galactosidases, three of these ORFs were examined further. Biochemical analysis confirmed that all three encoded β-galactosidase activity. Lac36W_ORF11 and Lac161_ORF7 had conserved domains, but lacked similarities to known glycoside hydrolases. Lac161_ORF10 had neither conserved domains nor similarity to known glycoside hydrolases. Bioinformatic and structural modeling implied that Lac161_ORF10 protein represented a novel enzyme family with a five-bladed propeller glycoside hydrolase domain. By discovering founding members of three novel β-galactosidase families, we have reinforced the value of functional metagenomics for isolating novel genes that could not have been predicted from DNA sequence analysis alone.

  6. Seasonal patterns in Arctic prasinophytes and inferred ecology of Bathycoccus unveiled in an Arctic winter metagenome.

    PubMed

    Joli, Nathalie; Monier, Adam; Logares, Ramiro; Lovejoy, Connie

    2017-06-01

    Prasinophytes occur in all oceans but rarely dominate phytoplankton populations. In contrast, a single ecotype of the prasinophyte Micromonas is frequently the most abundant photosynthetic taxon reported in the Arctic from summer through autumn. However, seasonal dynamics of prasinophytes outside of this period are little known. To address this, we analyzed high-throughput V4 18S rRNA amplicon data collected from November to July in the Amundsen Gulf Region, Beaufort Sea, Arctic. Surprisingly during polar sunset in November and December, we found a high proportion of reads from both DNA and RNA belonging to another prasinophyte, Bathycoccus. We then analyzed a metagenome from a December sample and the resulting Bathycoccus metagenome assembled genome (MAG) covered ~90% of the Bathycoccus Ban7 reference genome. In contrast, only ~20% of a reference Micromonas genome was found in the metagenome. Our phylogenetic analysis of marker genes placed the Arctic Bathycoccus in the B1 coastal clade. In addition, substitution rates of 129 coding DNA sequences were ~1.6% divergent between the Arctic MAG and coastal Chilean upwelling MAGs and 17.3% between it and a South East Atlantic open ocean MAG in the B2 Clade. The metagenomic analysis also revealed a winter viral community highly skewed toward viruses targeting Micromonas, with a much lower diversity of viruses targeting Bathycoccus. Overall a combination of Micromonas being relatively less able to maintain activity under dark winter conditions and viral suppression of Micromonas may have contributed to the success of Bathycoccus in the Amundsen Gulf during winter.

  7. High frequency of phylogenetically diverse reductive dehalogenase-homologous genes in deep subseafloor sedimentary metagenomes

    PubMed Central

    Kawai, Mikihiko; Futagami, Taiki; Toyoda, Atsushi; Takaki, Yoshihiro; Nishi, Shinro; Hori, Sayaka; Arai, Wataru; Tsubouchi, Taishi; Morono, Yuki; Uchiyama, Ikuo; Ito, Takehiko; Fujiyama, Asao; Inagaki, Fumio; Takami, Hideto

    2014-01-01

    Marine subsurface sediments on the Pacific margin harbor diverse microbial communities even at depths of several hundred meters below the seafloor (mbsf) or more. Previous PCR-based molecular analysis showed the presence of diverse reductive dehalogenase gene (rdhA) homologs in marine subsurface sediment, suggesting that anaerobic respiration of organohalides is one of the possible energy-yielding pathways in the organic-rich sedimentary habitat. However, primer-independent molecular characterization of rdhA remains to be demonstrated. Here, we studied the diversity and frequency of rdhA homologs by metagenomic analysis of five different depth horizons (0.8, 5.1, 18.6, 48.5, and 107.0 mbsf) at Site C9001 off the Shimokita Peninsula of Japan. From all metagenomic pools, remarkably diverse rdhA-homologous sequences, some of which are affiliated with novel clusters, were observed with high frequency. As a comparison, we also examined the frequency of dissimilatory sulfite reductase genes (dsrAB), key functional genes for microbial sulfate reduction. The dsrAB genes were also widely observed in the metagenomic pools, whereas their frequency was generally lower than that of rdhA-homologous genes. The phylogenetic composition of rdhA-homologous genes was similar among the five depth horizons. Our metagenomic data revealed that subseafloor rdhA homologs are more diverse than previously identified from PCR-based molecular studies. Spatial distribution of similar rdhA homologs across wide depositional ages indicates that the heterotrophic metabolic processes mediated by the genes can be ecologically important, functioning in the organic-rich subseafloor sedimentary biosphere. PMID:24624126

  8. High frequency of phylogenetically diverse reductive dehalogenase-homologous genes in deep subseafloor sedimentary metagenomes.

    PubMed

    Kawai, Mikihiko; Futagami, Taiki; Toyoda, Atsushi; Takaki, Yoshihiro; Nishi, Shinro; Hori, Sayaka; Arai, Wataru; Tsubouchi, Taishi; Morono, Yuki; Uchiyama, Ikuo; Ito, Takehiko; Fujiyama, Asao; Inagaki, Fumio; Takami, Hideto

    2014-01-01

    Marine subsurface sediments on the Pacific margin harbor diverse microbial communities even at depths of several hundreds meters below the seafloor (mbsf) or more. Previous PCR-based molecular analysis showed the presence of diverse reductive dehalogenase gene (rdhA) homologs in marine subsurface sediment, suggesting that anaerobic respiration of organohalides is one of the possible energy-yielding pathways in the organic-rich sedimentary habitat. However, primer-independent molecular characterization of rdhA has remained to be demonstrated. Here, we studied the diversity and frequency of rdhA homologs by metagenomic analysis of five different depth horizons (0.8, 5.1, 18.6, 48.5, and 107.0 mbsf) at Site C9001 off the Shimokita Peninsula of Japan. From all metagenomic pools, remarkably diverse rdhA-homologous sequences, some of which are affiliated with novel clusters, were observed with high frequency. As a comparison, we also examined frequency of dissimilatory sulfite reductase genes (dsrAB), key functional genes for microbial sulfate reduction. The dsrAB were also widely observed in the metagenomic pools whereas the frequency of dsrAB genes was generally smaller than that of rdhA-homologous genes. The phylogenetic composition of rdhA-homologous genes was similar among the five depth horizons. Our metagenomic data revealed that subseafloor rdhA homologs are more diverse than previously identified from PCR-based molecular studies. Spatial distribution of similar rdhA homologs across wide depositional ages indicates that the heterotrophic metabolic processes mediated by the genes can be ecologically important, functioning in the organic-rich subseafloor sedimentary biosphere.

  9. Metagenomic systems biology and metabolic modeling of the human microbiome: from species composition to community assembly rules.

    PubMed

    Levy, Roie; Borenstein, Elhanan

    2014-01-01

    The human microbiome is a key contributor to health and development. Yet little is known about the ecological forces that are at play in defining the composition of such host-associated communities. Metagenomics-based studies have uncovered clear patterns of community structure but are often incapable of distinguishing alternative structuring paradigms. In a recent study, we integrated metagenomic analysis with a systems biology approach, using a reverse ecology framework to model numerous human microbiota species and to infer metabolic interactions between species. Comparing predicted interactions with species composition data revealed that the assembly of the human microbiome is dominated at the community level by habitat filtering. Furthermore, we demonstrated that this habitat filtering cannot be accounted for by known host phenotypes or by the metabolic versatility of the various species. Here we provide a summary of our findings and offer a brief perspective on related studies and on future approaches utilizing this metagenomic systems biology framework.

  10. Data on partial polyhydroxyalkanoate synthase genes (phaC) mined from Aaptos aaptos marine sponge-associated bacteria metagenome.

    PubMed

    Amelia, Tan Suet May; Amirul, Al-Ashraf Abdullah; Bhubalan, Kesaven

    2018-02-01

    We report data associated with the identification of three polyhydroxyalkanoate synthase genes (phaC) isolated from the marine bacteria metagenome of Aaptos aaptos marine sponge in the waters of Bidong Island, Terengganu, Malaysia. Our data describe the extraction of bacterial metagenome from sponge tissue, measurement of purity and concentration of extracted metagenome, polymerase chain reaction (PCR)-mediated amplification using degenerate primers targeting Class I and II phaC genes, sequencing at First BASE Laboratories Sdn Bhd, and phylogenetic analysis of identified and known phaC genes. The partial nucleotide sequences were aligned, refined, compared with the Basic Local Alignment Search Tool (BLAST) databases, and released online in GenBank. The data include the identified partial putative phaC and their GenBank accession numbers, which are Rhodocista sp. phaC (MF457754), Pseudomonas sp. phaC (MF437016), and an uncultured bacterium AR5-9d_16 phaC (MF457753).

  11. Use of simulated data sets to evaluate the fidelity of metagenomic processing methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mavromatis, Konstantinos; Ivanova, Natalia; Barry, Kerri

    2006-12-01

    Metagenomics is a rapidly emerging field of research for studying microbial communities. To evaluate methods presently used to process metagenomic sequences, we constructed three simulated data sets of varying complexity by combining sequencing reads randomly selected from 113 isolate genomes. These data sets were designed to model real metagenomes in terms of complexity and phylogenetic composition. We assembled sampled reads using three commonly used genome assemblers (Phrap, Arachne and JAZZ), and predicted genes using two popular gene finding pipelines (fgenesb and CRITICA/GLIMMER). The phylogenetic origins of the assembled contigs were predicted using one sequence similarity-based (blast hit distribution) and two sequence composition-based (PhyloPythia, oligonucleotide frequencies) binning methods. We explored the effects of the simulated community structure and method combinations on the fidelity of each processing step by comparison to the corresponding isolate genomes. The simulated data sets are available online to facilitate standardized benchmarking of tools for metagenomic analysis.
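
    The idea of building a benchmark by pooling reads of known origin can be sketched as follows. This is a simplified illustration, not the authors' procedure: it cuts substrings from toy sequences rather than selecting real sequencing reads, and the function name `sample_reads`, the toy genomes, and all parameters are assumptions made for the example.

```python
import random

def sample_reads(genomes, n_reads, read_len, seed=0):
    """Draw reads uniformly at random from a dict of {name: sequence}.

    Each read is returned as (source_name, start, subsequence), so the
    true origin stays known and can be compared with a tool's output.
    """
    rng = random.Random(seed)
    # Only genomes long enough to yield a full-length read are eligible.
    names = [n for n, s in genomes.items() if len(s) >= read_len]
    # Weight each genome by its number of valid start positions, so every
    # position in the pooled genomes is equally likely to start a read.
    weights = [len(genomes[n]) - read_len + 1 for n in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights, k=1)[0]
        start = rng.randrange(len(genomes[name]) - read_len + 1)
        reads.append((name, start, genomes[name][start:start + read_len]))
    return reads

# Toy stand-ins for isolate genomes (hypothetical sequences).
genomes = {
    "isolate_A": "ACGT" * 50,
    "isolate_B": "GGCCTTAA" * 40,
}
reads = sample_reads(genomes, n_reads=10, read_len=25)
```

    Because every simulated read carries its source label, the output of an assembler or binning method can be scored against the known answer.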

  12. SUPER-FOCUS: A tool for agile functional analysis of shotgun metagenomic data

    DOE PAGES

    Silva, Genivaldo Gueiros Z.; Green, Kevin T.; Dutilh, Bas E.; ...

    2015-10-09

    Analyzing the functional profile of a microbial community from unannotated shotgun sequencing reads is one of the important goals in metagenomics. Functional profiling has valuable applications in biological research because it identifies the abundances of the functional genes of the organisms present in the original sample, answering the question what they can do. Currently, available tools do not scale well with increasing data volumes, which is important because both the number and lengths of the reads produced by sequencing platforms keep increasing. Here, we introduce SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. We tested SUPER-FOCUS with over 70 real metagenomes, the results showing that it accurately predicts the subsystems present in the profiled microbial communities, and is up to 1000 times faster than other tools.
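
    The final step of functional profiling, turning per-read subsystem assignments into relative abundances, can be sketched as below. This is not SUPER-FOCUS code and does not perform the homology search. The function name `subsystem_profile` and the toy assignments are hypothetical, and reads without a hit are assumed to be marked `None`.

```python
from collections import Counter

def subsystem_profile(read_assignments):
    """Return the relative abundance of each subsystem among assigned reads.

    `read_assignments` holds one subsystem label per read, or None for
    reads that had no hit in the reference database.
    """
    counts = Counter(s for s in read_assignments if s is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {subsystem: n / total for subsystem, n in counts.items()}

# Hypothetical assignments for six reads; one read has no hit.
assignments = [
    "Respiration", "Carbohydrates", "Respiration",
    None, "Stress Response", "Respiration",
]
profile = subsystem_profile(assignments)
```

    Normalizing by the number of assigned reads rather than all reads is one possible choice; it makes profiles comparable between samples with different fractions of unannotated reads.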

  13. An Agile Functional Analysis of Metagenomic Data Using SUPER-FOCUS.

    PubMed

    Silva, Genivaldo Gueiros Z; Lopes, Fabyano A C; Edwards, Robert A

    2017-01-01

    One of the main goals in metagenomics is to identify the functional profile of a microbial community from unannotated shotgun sequencing reads. Functional annotation is important in biological research because it enables researchers to identify the abundance of functional genes of the organisms present in the sample, answering the question, "What can the organisms in the sample do?" Most currently available approaches do not scale with increasing data volumes, which is important because both the number and lengths of the reads provided by sequencing platforms keep increasing. Here, we present SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with real metagenomes, and the results show that it accurately predicts the subsystems present in the profiled microbial communities, is computationally efficient, and up to 1000 times faster than other tools. SUPER-FOCUS is freely available at http://edwards.sdsu.edu/SUPERFOCUS .

  14. In-depth resistome analysis by targeted metagenomics.

    PubMed

    Lanza, Val F; Baquero, Fernando; Martínez, José Luís; Ramos-Ruíz, Ricardo; González-Zorn, Bruno; Andremont, Antoine; Sánchez-Valenzuela, Antonio; Ehrlich, Stanislav Dusko; Kennedy, Sean; Ruppé, Etienne; van Schaik, Willem; Willems, Rob J; de la Cruz, Fernando; Coque, Teresa M

    2018-01-15

    Antimicrobial resistance is a major global health challenge. Metagenomics allows analyzing the presence and dynamics of "resistomes" (the ensemble of genes encoding antimicrobial resistance in a given microbiome) in disparate microbial ecosystems. However, the low sensitivity and specificity of available metagenomic methods preclude the detection of minority populations (often present below their detection threshold) and/or the identification of allelic variants that differ in the resulting phenotype. Here, we describe a novel strategy that combines targeted metagenomics using last generation in-solution capture platforms, with novel bioinformatics tools to establish a standardized framework that allows both quantitative and qualitative analyses of resistomes. We developed ResCap, a targeted sequence capture platform based on SeqCapEZ (NimbleGene) technology, which includes probes for 8667 canonical resistance genes (7963 antibiotic resistance genes and 704 genes conferring resistance to metals or biocides), and 2517 relaxase genes (plasmid markers) and 78,600 genes homologous to the previous identified targets (47,806 for antibiotics and 30,794 for biocides or metals). Its performance was compared with metagenomic shotgun sequencing (MSS) for 17 fecal samples (9 humans, 8 swine). ResCap significantly improves MSS to detect "gene abundance" (from 2.0 to 83.2%) and "gene diversity" (26 versus 14.9 genes unequivocally detected per sample per million of reads; the number of reads unequivocally mapped increasing up to 300-fold by using ResCap), which were calculated using novel bioinformatic tools. ResCap also facilitated the analysis of novel genes potentially involved in the resistance to antibiotics, metals, biocides, or any combination thereof. ResCap, the first targeted sequence capture, specifically developed to analyze resistomes, greatly enhances the sensitivity and specificity of available metagenomic methods and offers the possibility to analyze genes related to the selection and transfer of antimicrobial resistance (biocides, heavy metals, plasmids). The model opens the possibility to study other complex microbial systems in which minority populations play a relevant role.

  15. High Prevalence of Quorum-Sensing and Quorum-Quenching Activity among Cultivable Bacteria and Metagenomic Sequences in the Mediterranean Sea

    PubMed Central

    López-Pérez, Mario; Mayer, Celia; Parga, Ana; Amaro-Blanco, Jaime

    2018-01-01

    There is increasing evidence being accumulated regarding the importance of N-acyl homoserine lactones (AHL)-mediated quorum-sensing (QS) and quorum-quenching (QQ) processes in the marine environment, but in most cases, data has been obtained from specific microhabitats, and subsequently little is known regarding these activities in free-living marine bacteria. The QS and QQ activities among 605 bacterial isolates obtained at 90 and 2000 m depths in the Mediterranean Sea were analyzed. Additionally, putative QS and QQ sequences were searched in metagenomic data obtained at different depths (15–2000 m) at the same sampling site. The number of AHL producers was higher in the 90 m sample (37.66%) than in the 2000 m sample (4.01%). However, the presence of QQ enzymatic activity was 1.63-fold higher in the 2000 m sample. The analysis of putative QQ enzymes in the metagenomes supports the relevance of QQ processes in the deepest samples, found in cultivable bacteria. Despite the unavoidable biases in the cultivation methods and biosensor assays and the possible promiscuous activity of the QQ enzymes retrieved in the metagenomic analysis, the results indicate that AHL-related QS and QQ processes could be common activity in the marine environment. PMID:29462892

  16. Bayesian mixture analysis for metagenomic community profiling.

    PubMed

    Morfopoulou, Sofia; Plagnol, Vincent

    2015-09-15

    Deep sequencing of clinical samples is now an established tool for the detection of infectious pathogens, with direct medical applications. The large amount of data generated produces an opportunity to detect species even at very low levels, provided that computational tools can effectively profile the relevant metagenomic communities. Data interpretation is complicated by the fact that short sequencing reads can match multiple organisms and by the lack of completeness of existing databases, in particular for viral pathogens. Here we present metaMix, a Bayesian mixture model framework for resolving complex metagenomic mixtures. We show that the use of parallel Monte Carlo Markov chains for the exploration of the species space enables the identification of the set of species most likely to contribute to the mixture. We demonstrate the greater accuracy of metaMix compared with relevant methods, particularly for profiling complex communities consisting of several related species. We designed metaMix specifically for the analysis of deep transcriptome sequencing datasets, with a focus on viral pathogen detection; however, the principles are generally applicable to all types of metagenomic mixtures. metaMix is implemented as a user friendly R package, freely available on CRAN: http://cran.r-project.org/web/packages/metaMix. Contact: sofia.morfopoulou.10@ucl.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
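
    The problem of reads that match several organisms can be illustrated with a small mixture model. The sketch below uses plain expectation-maximization to obtain a point estimate of the mixing proportions. This is a deliberately simpler technique than the parallel Markov chain Monte Carlo exploration described for metaMix, and it is not the package's implementation. The function name `em_mixture` and the toy likelihood matrix are assumptions made for the example.

```python
def em_mixture(likelihoods, n_iter=200):
    """Estimate species proportions from per-read match likelihoods.

    likelihoods[i][k] is P(read i | species k). Returns a list of mixing
    weights, one per species, that sum to 1.
    """
    n_species = len(likelihoods[0])
    weights = [1.0 / n_species] * n_species
    for _ in range(n_iter):
        totals = [0.0] * n_species
        for row in likelihoods:
            # E-step: share each read among species in proportion to
            # (current weight) x (likelihood of the read under the species).
            joint = [w * p for w, p in zip(weights, row)]
            norm = sum(joint)
            if norm == 0:
                continue  # read matches nothing; it carries no information
            for k, j in enumerate(joint):
                totals[k] += j / norm
        # M-step: new weights are the normalized expected read counts.
        total = sum(totals)
        weights = [t / total for t in totals]
    return weights

# Toy data: six reads unique to species 0, two reads that match species 0
# and 1 equally well, and two reads unique to species 2.
likelihoods = (
    [[1.0, 0.0, 0.0]] * 6
    + [[0.5, 0.5, 0.0]] * 2
    + [[0.0, 0.0, 1.0]] * 2
)
weights = em_mixture(likelihoods)
```

    In this toy case species 1 is supported only by ambiguous reads, so its estimated weight shrinks toward zero and those reads are attributed to species 0, which also has unique evidence.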

  17. Metagenomic Analysis of Gingival Sulcus Microbiota and Pathogenesis of Periodontitis Associated with Type 2 Diabetes Mellitus.

    PubMed

    Babaev, E A; Balmasova, I P; Mkrtumyan, A M; Kostryukova, S N; Vakhitova, E S; Il'ina, E N; Tsarev, V N; Gabibov, A G; Arutyunov, S D

    2017-10-01

    Biofilm of the gingival sulcus from 22 patients with type 2 diabetes mellitus and periodontitis, 30 patients with periodontitis not complicated by diabetes mellitus (reference group), and 22 healthy volunteers without signs of gingival disease (control group) was studied by quantitative PCR. Quantitative analysis for the content of P. gingivalis, T. forsythia, A. actinomycetemcomitans, T. denticola, P. intermedia, F. nucleatum/periodonticum, and P. endodontalis in the dental plaque was performed with a Dentoscreen kit. The presence of other bacterial groups was verified by metagenomic sequencing of the 16S rRNA gene to evaluate some specific features of the etiological factor for periodontitis in type 2 diabetes mellitus. Specimens of the Porphyromonadaceae and Fusobacteriaceae families were characterized by an extremely high incidence in combined pathology. The amount of Sphingobacteriaceae bacteria in the biofilm was shown to decrease significantly during periodontitis. Metagenomic analysis confirmed the pathogenic role of microbiota in combined pathology, as well as the hypothesis on a possible influence of periodontitis on the course and development of type 2 diabetes mellitus.

  18. Metagenomic analyses of the late Pleistocene permafrost - additional tools for reconstruction of environmental conditions

    NASA Astrophysics Data System (ADS)

    Rivkina, Elizaveta; Petrovskaya, Lada; Vishnivetskaya, Tatiana; Krivushin, Kirill; Shmakova, Lyubov; Tutukina, Maria; Meyers, Arthur; Kondrashov, Fyodor

    2016-04-01

    A comparative analysis of the metagenomes from two 30 000-year-old permafrost samples, one of lake-alluvial origin and the other from late Pleistocene Ice Complex sediments, revealed significant differences within microbial communities. The late Pleistocene Ice Complex sediments (which have been characterized by the absence of methane with lower values of redox potential and Fe2+ content) showed a low abundance of methanogenic archaea and enzymes from both the carbon and nitrogen cycles, but a higher abundance of enzymes associated with the sulfur cycle. The metagenomic and geochemical analyses described in the paper provide evidence that the formation of the sampled late Pleistocene Ice Complex sediments likely took place under much more aerobic conditions than lake-alluvial sediments.

  19. Phylogenetic and Functional Analysis of Metagenome Sequence from High-Temperature Archaeal Habitats Demonstrate Linkages between Metabolic Potential and Geochemistry

    PubMed Central

    Inskeep, William P.; Jay, Zackary J.; Herrgard, Markus J.; Kozubal, Mark A.; Rusch, Douglas B.; Tringe, Susannah G.; Macur, Richard E.; Jennings, Ryan deM.; Boyd, Eric S.; Spear, John R.; Roberto, Francisco F.

    2013-01-01

    Geothermal habitats in Yellowstone National Park (YNP) provide an unparalleled opportunity to understand the environmental factors that control the distribution of archaea in thermal habitats. Here we describe, analyze, and synthesize metagenomic and geochemical data collected from seven high-temperature sites that contain microbial communities dominated by archaea relative to bacteria. The specific objectives of the study were to use metagenome sequencing to determine the structure and functional capacity of thermophilic archaeal-dominated microbial communities across a pH range from 2.5 to 6.4 and to discuss specific examples where the metabolic potential correlated with measured environmental parameters and geochemical processes occurring in situ. Random shotgun metagenome sequence (∼40–45 Mb Sanger sequencing per site) was obtained from environmental DNA extracted from high-temperature sediments and/or microbial mats and subjected to numerous phylogenetic and functional analyses. Analysis of individual sequences (e.g., MEGAN and G + C content) and assemblies from each habitat type revealed the presence of dominant archaeal populations in all environments, 10 of whose genomes were largely reconstructed from the sequence data. Analysis of protein family occurrence, particularly of those involved in energy conservation, electron transport, and autotrophic metabolism, revealed significant differences in metabolic strategies across sites consistent with differences in major geochemical attributes (e.g., sulfide, oxygen, pH). These observations provide an ecological basis for understanding the distribution of indigenous archaeal lineages across high-temperature systems of YNP. PMID:23720654
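
    One of the per-sequence statistics mentioned above, G + C content, is simple to compute. The sketch below is a generic illustration and not taken from the study; the function name `gc_content` is hypothetical, and ambiguous bases such as N are assumed to be excluded from the denominator.

```python
def gc_content(seq):
    """Return the fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    # Count only A, C, G and T so that N or other ambiguity codes do not
    # dilute the estimate.
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt

example = gc_content("GGCCAATT")
```

    Reads or contigs from different populations often differ in G + C content, which is why it is commonly used alongside similarity-based assignment when separating sequences by origin.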

  20. Metagenome Assembly at the DOE JGI (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Chain, Patrick

    2018-01-25

    Patrick Chain of DOE JGI at LANL, Co-Chair of the Metagenome-specific Assembly session, speaks on Metagenome Assembly at the DOE JGI at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  1. Rapid and efficient method to extract metagenomic DNA from estuarine sediments.

    PubMed

    Shamim, Kashif; Sharma, Jaya; Dubey, Santosh Kumar

    2017-07-01

    Metagenomic DNA from sediments of selective estuaries of Goa, India was extracted using a simple, fast, efficient and environment friendly method. The recovery of pure metagenomic DNA from our method was significantly high as compared to other well-known methods since the concentration of recovered metagenomic DNA ranged from 1185.1 to 4579.7 µg/g of sediment. The purity of metagenomic DNA was also considerably high as the ratio of absorbance at 260 and 280 nm ranged from 1.88 to 1.94. Therefore, the recovered metagenomic DNA was directly used to perform various molecular biology experiments viz. restriction digestion, PCR amplification, cloning and metagenomic library construction. This clearly proved that our protocol for metagenomic DNA extraction using silica gel efficiently removed the contaminants and prevented shearing of the metagenomic DNA. Thus, this modified method can be used to recover pure metagenomic DNA from various estuarine sediments in a rapid, efficient and eco-friendly manner.

  2. Metagenomic Survey for Viruses in Western Arctic Caribou, Alaska, through Iterative Assembly of Taxonomic Units

    PubMed Central

    Schürch, Anita C.; Schipper, Debby; Bijl, Maarten A.; Dau, Jim; Beckmen, Kimberlee B.; Schapendonk, Claudia M. E.; Raj, V. Stalin; Osterhaus, Albert D. M. E.; Haagmans, Bart L.; Tryland, Morten; Smits, Saskia L.

    2014-01-01

    Pathogen surveillance in animals does not provide a sufficient level of vigilance because it is generally confined to surveillance of pathogens with known economic impact in domestic animals and practically nonexistent in wildlife species. As most (re-)emerging viral infections originate from animal sources, it is important to obtain insight into viral pathogens present in the wildlife reservoir from a public health perspective. When monitoring living, free-ranging wildlife for viruses, sample collection can be challenging and availability of nucleic acids isolated from samples is often limited. The development of viral metagenomics platforms allows a more comprehensive inventory of viruses present in wildlife. We report a metagenomic viral survey of the Western Arctic herd of barren ground caribou (Rangifer tarandus granti) in Alaska, USA. The presence of mammalian viruses in eye and nose swabs of 39 free-ranging caribou was investigated by random amplification combined with a metagenomic analysis approach that applied exhaustive iterative assembly of sequencing results to define taxonomic units of each metagenome. Through homology search methods we identified the presence of several mammalian viruses, including different papillomaviruses, a novel parvovirus, polyomavirus, and a virus that potentially represents a member of a novel genus in the family Coronaviridae. PMID:25140520

  3. Isolation and characterization of a novel metagenomic enzyme capable of degrading bacterial phytotoxin toxoflavin

    PubMed Central

    Lee, Boyoung; Park, Ji Hyun; Oh, Joon Young; Choi, Jung Sup; Kim, Jin-Cheol

    2018-01-01

    Toxoflavin, a 7-azapteridine phytotoxin produced by the bacterial pathogens such as Burkholderia glumae and Burkholderia gladioli, has been known as one of the key virulence factors in crop diseases. Because the toxoflavin had an antibacterial activity, a metagenomic E. coli clone capable of growing well in the presence of toxoflavin (30 μg/ml) was isolated and the first metagenome-derived toxoflavin-degrading enzyme, TxeA of 140 amino acid residues, was identified from the positive E. coli clone. The conserved amino acids for metal-binding and extradiol dioxygenase activity, Glu-12, His-8 and Glu-130, were revealed by the sequence analysis of TxeA. The optimum conditions for toxoflavin degradation were evaluated with the TxeA purified in E. coli. Toxoflavin was totally degraded at an initial toxoflavin concentration of 100 μg/ml and at pH 5.0 in the presence of Mn2+, dithiothreitol and oxygen. The final degradation products of toxoflavin and methyltoxoflavin were fully identified by MS and NMR as triazines. Therefore, we suggested that the new metagenomic enzyme, TxeA, provided the clue to applying the new metagenomic enzyme to resistance development of crop plants to toxoflavin-mediated disease as well as to biocatalysis for Baeyer-Villiger type oxidation. PMID:29293506

  4. Metagenomic assembly through the lens of validation: recent advances in assessing and improving the quality of genomes assembled from metagenomes.

    PubMed

    Olson, Nathan D; Treangen, Todd J; Hill, Christopher M; Cepeda-Espinoza, Victoria; Ghurye, Jay; Koren, Sergey; Pop, Mihai

    2017-08-07

    Metagenomic samples are snapshots of complex ecosystems at work. They comprise hundreds of known and unknown species, contain multiple strain variants and vary greatly within and across environments. Many microbes found in microbial communities are not easily grown in culture making their DNA sequence our only clue into their evolutionary history and biological function. Metagenomic assembly is a computational process aimed at reconstructing genes and genomes from metagenomic mixtures. Current methods have made significant strides in reconstructing DNA segments comprising operons, tandem gene arrays and syntenic blocks. Shorter, higher-throughput sequencing technologies have become the de facto standard in the field. Sequencers are now able to generate billions of short reads in only a few days. Multiple metagenomic assembly strategies, pipelines and assemblers have appeared in recent years. Owing to the inherent complexity of metagenome assembly, regardless of the assembly algorithm and sequencing method, metagenome assemblies contain errors. Recent developments in assembly validation tools have played a pivotal role in improving metagenomics assemblers. Here, we survey recent progress in the field of metagenomic assembly, provide an overview of key approaches for genomic and metagenomic assembly validation and demonstrate the insights that can be derived from assemblies through the use of assembly validation strategies. We also discuss the potential for impact of long-read technologies in metagenomics. We conclude with a discussion of future challenges and opportunities in the field of metagenomic assembly and validation. © The Author 2017. Published by Oxford University Press.

  5. Spatial and temporal heterogeneity of microbial life in artificial landscapes

    NASA Astrophysics Data System (ADS)

    Sengupta, A.; Kaur, R.; Meredith, L. K.; Troch, P. A. A.

    2017-12-01

    The Landscape Evolution Observatory (LEO) project at Biosphere 2 consists of three replicated artificial landscapes which are sealed within a climate-controlled glass house. LEO is composed of basaltic soil material with low organic matter, nutrients, and microbes. The landscapes are built to resemble zero-order basins and enable researchers to observe hydrological, biological, and geochemical evolution of landscapes in a controlled environment. This study is focused on capturing microbial community dynamics in LEO soil, pre- and post-controlled rainfall episodes. Soil samples were collected from six different locations and at five depths in each of the three slopes followed by DNA extraction from 180 samples and sent for amplicon and minimal draft metagenome sequencing. The average concentration of DNA recovered from each sample was higher in the post-rainfall samples than the pre-rainfall samples, a trend consistent in all three slopes. The sequence data will be evaluated to reveal heterogeneity of the soil microbes, providing a more exact narrative of the microbes present in each slope and the spatiotemporal trends of microbial life in the landscapes. Next, functional traits will be predicted from the community data and metagenomes to determine whether consistent changes occur with respect to wetting and drying episodes. Together, these results will highlight the relevance of a unique terrestrial ecosystem research infrastructure in supporting interdisciplinary hydrobiogeochemical research.

  6. Metabolic potential of lithifying cyanobacteria-dominated thrombolitic mats.

    PubMed

    Mobberley, Jennifer M; Khodadad, Christina L M; Foster, Jamie S

    2013-11-01

    Thrombolites are unlaminated carbonate deposits formed by the metabolic activities of microbial mats and can serve as potential models for understanding the molecular mechanisms underlying the formation of lithifying communities. To assess the metabolic complexity of these ecosystems, high throughput DNA sequencing of a thrombolitic mat metagenome was coupled with phenotypic microarray analysis. Functional protein analysis of the thrombolite community metagenome delineated several of the major metabolic pathways that influence carbonate mineralization including cyanobacterial photosynthesis, sulfate reduction, sulfide oxidation, and aerobic heterotrophy. Spatial profiling of metabolite utilization within the thrombolite-forming microbial mats suggested that the top 5 mm contained a more metabolically diverse and active community than deeper within the mat. This study provides evidence that despite the lack of mineral layering within the clotted thrombolite structure there is a vertical gradient of metabolic activity within the thrombolitic mat community. This metagenomic profiling also serves as a foundation for examining the active role individual functional groups of microbes play in coordinating metabolisms that lead to mineralization.

  7. Windshield splatter analysis with the Galaxy metagenomic pipeline

    PubMed Central

    Kosakovsky Pond, Sergei; Wadhawan, Samir; Chiaromonte, Francesca; Ananda, Guruprasad; Chung, Wen-Yu; Taylor, James; Nekrutenko, Anton

    2009-01-01

    How many species inhabit our immediate surroundings? A straightforward collection technique suitable for answering this question is known to anyone who has ever driven a car at highway speeds. The windshield of a moving vehicle is subjected to numerous insect strikes and can be used as a collection device for representative sampling. Unfortunately the analysis of biological material collected in that manner, as with most metagenomic studies, proves to be rather demanding due to the large number of required tools and considerable computational infrastructure. In this study, we use organic matter collected by a moving vehicle to design and test a comprehensive pipeline for phylogenetic profiling of metagenomic samples that includes all steps from processing and quality control of data generated by next-generation sequencing technologies to statistical analyses and data visualization. To the best of our knowledge, this is also the first publication that features a live online supplement providing access to exact analyses and workflows used in the article. PMID:19819906

  8. Metagenomic analysis of the pinewood nematode microbiome reveals a symbiotic relationship critical for xenobiotics degradation

    PubMed Central

    Cheng, Xin-Yue; Tian, Xue-Liang; Wang, Yun-Sheng; Lin, Ren-Miao; Mao, Zhen-Chuan; Chen, Nansheng; Xie, Bing-Yan

    2013-01-01

    Our recent research revealed that pinewood nematode (PWN) possesses few genes encoding enzymes for degrading α-pinene, which is the main compound in pine resin. In this study, we examined the role of PWN microbiome in xenobiotics detoxification by metagenomic and bacteria culture analyses. Functional annotation of metagenomes illustrated that benzoate degradation and its related metabolisms may provide the main metabolic pathways for xenobiotics detoxification in the microbiome, which is obviously different from that in PWN that uses cytochrome P450 metabolism as the main pathway for detoxification. The metabolic pathway of degrading α-pinene is complete in microbiome, but incomplete in PWN genome. Experimental analysis demonstrated that most of tested cultivable bacteria can not only survive the stress of 0.4% α-pinene, but also utilize α-pinene as carbon source for their growth. Our results indicate that PWN and its microbiome have established a potentially mutualistic symbiotic relationship with complementary pathways in detoxification metabolism. PMID:23694939

  9. The kinetoplastid-infecting Bodo saltans virus (BsV), a window into the most abundant giant viruses in the sea

    PubMed Central

    Deeg, Christoph M; Chow, Cheryl-Emiliane T

    2018-01-01

    Giant viruses are ecologically important players in aquatic ecosystems that have challenged concepts of what constitutes a virus. Herein, we present the giant Bodo saltans virus (BsV), the first characterized representative of the most abundant group of giant viruses in ocean metagenomes, and the first isolate of a klosneuvirus, a subgroup of the Mimiviridae proposed from metagenomic data. BsV infects an ecologically important microzooplankton, the kinetoplastid Bodo saltans. Its 1.39 Mb genome encodes 1227 predicted ORFs, including a complex replication machinery. Yet, much of its translational apparatus has been lost, including all tRNAs. Essential genes are invaded by homing endonuclease-encoding self-splicing introns that may defend against competing viruses. Putative anti-host factors show extensive gene duplication via a genomic accordion indicating an ongoing evolutionary arms race and highlighting the rapid evolution and genomic plasticity that has led to genome gigantism and the enigma that is giant viruses. PMID:29582753

  10. Resources and costs for microbial sequence analysis evaluated using virtual machines and cloud computing.

    PubMed

    Angiuoli, Samuel V; White, James R; Matalka, Malcolm; White, Owen; Fricke, W Florian

    2011-01-01

    The widespread popularity of genomic applications is threatened by the "bioinformatics bottleneck" resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. 
    Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers.

  11. Resources and Costs for Microbial Sequence Analysis Evaluated Using Virtual Machines and Cloud Computing

    PubMed Central

    Angiuoli, Samuel V.; White, James R.; Matalka, Malcolm; White, Owen; Fricke, W. Florian

    2011-01-01

    Background The widespread popularity of genomic applications is threatened by the “bioinformatics bottleneck” resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. Results We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. 
    Conclusions Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers. PMID:22028928

  12. Phylogenetic characterization of a biogas plant microbial community integrating clone library 16S-rDNA sequences and metagenome sequence data obtained by 454-pyrosequencing.

    PubMed

    Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas

    2009-06-01

    The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.

  13. Evaluation method for the potential functionome harbored in the genome and metagenome.

    PubMed

    Takami, Hideto; Taniguchi, Takeaki; Moriya, Yuki; Kuwahara, Tomomi; Kanehisa, Minoru; Goto, Susumu

    2012-12-12

    One of the main goals of genomic analysis is to elucidate the comprehensive functions (functionome) in individual organisms or a whole community in various environments. However, a standard evaluation method for discerning the functional potentials harbored within the genome or metagenome has not yet been established. We have developed a new evaluation method for the potential functionome, based on the completion ratio of Kyoto Encyclopedia of Genes and Genomes (KEGG) functional modules. Distribution of the completion ratio of the KEGG functional modules in 768 prokaryotic species varied greatly with the kind of module, and all modules primarily fell into 4 patterns (universal, restricted, diversified and non-prokaryotic modules), indicating the universal and unique nature of each module, and also the versatility of the KEGG Orthology (KO) identifiers mapped to each one. The module completion ratio in 8 phenotypically different bacilli revealed that some modules were shared only in phenotypically similar species. Metagenomes of human gut microbiomes from 13 healthy individuals previously determined by the Sanger method were analyzed based on the module completion ratio. Results led to new discoveries in the nutritional preferences of gut microbes, believed to be one of the mutualistic representations of gut microbiomes to avoid nutritional competition with the host. The method developed in this study could characterize the functionome harbored in genomes and metagenomes. As this method also provided taxonomical information from KEGG modules as well as the gene hosts constructing the modules, interpretation of completion profiles was simplified and we could identify the complementarity between biochemical functions in human hosts and the nutritional preferences in human gut microbiomes. 
    Thus, our method has the potential to be a powerful tool for comparative functional analysis in genomics and metagenomics, able to target unknown environments containing various uncultivable microbes within unidentified phyla.
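
    The idea of a module completion ratio in record 13 can be illustrated with a minimal sketch. It assumes a simplified model in which a module is a list of steps, each satisfied by any one of a set of alternative KO identifiers. Real KEGG module definitions are richer Boolean expressions, and the function name and the identifiers (`K_A`, `K_B`, ...) are hypothetical placeholders, not taken from the abstract or from KEGG.

```python
def module_completion_ratio(required_steps, observed_kos):
    """Fraction of module steps covered by the observed KO identifiers.

    required_steps: list of sets; each set holds the alternative KOs
                    that can satisfy one step (simplifying assumption).
    observed_kos:   set of KOs annotated in a genome or metagenome.
    """
    if not required_steps:
        raise ValueError("a module needs at least one step")
    # A step counts as complete if any of its alternatives was observed.
    completed = sum(1 for alternatives in required_steps
                    if alternatives & observed_kos)
    return completed / len(required_steps)


# Hypothetical four-step module; step 1 has two alternative KOs.
example_module = [{"K_A", "K_B"}, {"K_C"}, {"K_D"}, {"K_E"}]
example_annotation = {"K_B", "K_C", "K_X"}
ratio = module_completion_ratio(example_module, example_annotation)
```

    Under these assumptions the example covers two of four steps, so the ratio is 0.5. Comparing such ratios across genomes or samples is the kind of profile the abstract describes, though the published method may weight or define steps differently.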

  14. Metagenomic analysis of a permafrost microbial community reveals a rapid response to thaw

    USGS Publications Warehouse

    MacKelprang, R.; Waldrop, M.P.; Deangelis, K.M.; David, M.M.; Chavarria, K.L.; Blazewicz, S.J.; Rubin, E.M.; Jansson, J.K.

    2011-01-01

    Permafrost contains an estimated 1672 Pg carbon (C), an amount roughly equivalent to the total currently contained within land plants and the atmosphere. This reservoir of C is vulnerable to decomposition as rising global temperatures cause the permafrost to thaw. During thaw, trapped organic matter may become more accessible for microbial degradation and result in greenhouse gas emissions. Despite recent advances in the use of molecular tools to study permafrost microbial communities, their response to thaw remains unclear. Here we use deep metagenomic sequencing to determine the impact of thaw on microbial phylogenetic and functional genes, and relate these data to measurements of methane emissions. Metagenomics, the direct sequencing of DNA from the environment, allows the examination of whole biochemical pathways and associated processes, as opposed to individual pieces of the metabolic puzzle. Our metagenome analyses reveal that during transition from a frozen to a thawed state there are rapid shifts in many microbial, phylogenetic and functional gene abundances and pathways. After one week of incubation at 5 °C, permafrost metagenomes converge to be more similar to each other than while they are frozen. We find that multiple genes involved in cycling of C and nitrogen shift rapidly during thaw. We also construct the first draft genome from a complex soil metagenome, which corresponds to a novel methanogen. Methane previously accumulated in permafrost is released during thaw and subsequently consumed by methanotrophic bacteria. Together these data point towards the importance of rapid cycling of methane and nitrogen in thawing permafrost. © 2011 Macmillan Publishers Limited. All rights reserved.

  15. Biotechnological applications of functional metagenomics in the food and pharmaceutical industries

    PubMed Central

    Coughlan, Laura M.; Cotter, Paul D.; Hill, Colin; Alvarez-Ordóñez, Avelino

    2015-01-01

    Microorganisms are found throughout nature, thriving in a vast range of environmental conditions. The majority of them are unculturable or difficult to culture by traditional methods. Metagenomics enables the study of all microorganisms, regardless of whether they can be cultured or not, through the analysis of genomic data obtained directly from an environmental sample, providing knowledge of the species present, and allowing the extraction of information regarding the functionality of microbial communities in their natural habitat. Function-based screenings, following the cloning and expression of metagenomic DNA in a heterologous host, can be applied to the discovery of novel proteins of industrial interest encoded by the genes of previously inaccessible microorganisms. Functional metagenomics has considerable potential in the food and pharmaceutical industries, where it can, for instance, aid (i) the identification of enzymes with desirable technological properties, capable of catalyzing novel reactions or replacing existing chemically synthesized catalysts which may be difficult or expensive to produce, and able to work under a wide range of environmental conditions encountered in food and pharmaceutical processing cycles including extreme conditions of temperature, pH, osmolarity, etc; (ii) the discovery of novel bioactives including antimicrobials active against microorganisms of concern both in food and medical settings; (iii) the investigation of industrial and societal issues such as antibiotic resistance development. This review article summarizes the state-of-the-art functional metagenomic methods available and discusses the potential of functional metagenomic approaches to mine as yet unexplored environments to discover novel genes with biotechnological application in the food and pharmaceutical industries. PMID:26175729

  16. Biotechnological applications of functional metagenomics in the food and pharmaceutical industries.

    PubMed

    Coughlan, Laura M; Cotter, Paul D; Hill, Colin; Alvarez-Ordóñez, Avelino

    2015-01-01

    Microorganisms are found throughout nature, thriving in a vast range of environmental conditions. The majority of them are unculturable or difficult to culture by traditional methods. Metagenomics enables the study of all microorganisms, regardless of whether they can be cultured or not, through the analysis of genomic data obtained directly from an environmental sample, providing knowledge of the species present, and allowing the extraction of information regarding the functionality of microbial communities in their natural habitat. Function-based screenings, following the cloning and expression of metagenomic DNA in a heterologous host, can be applied to the discovery of novel proteins of industrial interest encoded by the genes of previously inaccessible microorganisms. Functional metagenomics has considerable potential in the food and pharmaceutical industries, where it can, for instance, aid (i) the identification of enzymes with desirable technological properties, capable of catalyzing novel reactions or replacing existing chemically synthesized catalysts which may be difficult or expensive to produce, and able to work under a wide range of environmental conditions encountered in food and pharmaceutical processing cycles including extreme conditions of temperature, pH, osmolarity, etc; (ii) the discovery of novel bioactives including antimicrobials active against microorganisms of concern both in food and medical settings; (iii) the investigation of industrial and societal issues such as antibiotic resistance development. This review article summarizes the state-of-the-art functional metagenomic methods available and discusses the potential of functional metagenomic approaches to mine as yet unexplored environments to discover novel genes with biotechnological application in the food and pharmaceutical industries.

  17. Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights.

    PubMed

    Pasolli, Edoardo; Truong, Duy Tin; Malik, Faizan; Waldron, Levi; Segata, Nicola

    2016-07-01

    Shotgun metagenomic analysis of the human associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction tasks using quantitative microbiome profiles, including species-level relative abundances and presence of strain-specific markers. A comprehensive meta-analysis, with particular emphasis on generalization across cohorts, was performed in a collection of 2424 publicly available metagenomic samples from eight large-scale studies. Cross-validation revealed good disease-prediction capabilities, which were in general improved by feature selection and use of strain-specific markers instead of species-level taxonomic abundance. In cross-study analysis, models transferred between studies were in some cases less accurate than models tested by within-study cross-validation. Interestingly, the addition of healthy (control) samples from other studies to training sets improved disease prediction capabilities. Some microbial species (most notably Streptococcus anginosus) seem to characterize general dysbiotic states of the microbiome rather than connections with a specific disease. Our results in modelling features of the "healthy" microbiome can be considered a first step toward defining general microbial dysbiosis. 
    The software framework, microbiome profiles, and metadata for thousands of samples are publicly available at http://segatalab.cibio.unitn.it/tools/metaml.

  18. Metagenomic analysis of the airborne environment in urban spaces.

    PubMed

    Be, Nicholas A; Thissen, James B; Fofanov, Viacheslav Y; Allen, Jonathan E; Rojas, Mark; Golovko, George; Fofanov, Yuriy; Koshinsky, Heather; Jaing, Crystal J

    2015-02-01

    The organisms in aerosol microenvironments, especially densely populated urban areas, are relevant to maintenance of public health and detection of potential epidemic or biothreat agents. To examine aerosolized microorganisms in this environment, we performed sequencing on the material from an urban aerosol surveillance program. Whole metagenome sequencing was applied to DNA extracted from air filters obtained during periods from each of the four seasons. The composition of bacteria, plants, fungi, invertebrates, and viruses demonstrated distinct temporal shifts. Bacillus thuringiensis serovar kurstaki was detected in samples known to be exposed to aerosolized spores, illustrating the potential utility of this approach for identification of intentionally introduced microbial agents. Together, these data demonstrate the temporally dependent metagenomic complexity of urban aerosols and the potential of genomic analytical techniques for biosurveillance and monitoring of threats to public health.

  19. Genomics and metagenomics in medical microbiology.

    PubMed

    Padmanabhan, Roshan; Mishra, Ajay Kumar; Raoult, Didier; Fournier, Pierre-Edouard

    2013-12-01

    Over the last two decades, sequencing tools have evolved from laborious time-consuming methodologies to real-time detection and deciphering of genomic DNA. Genome sequencing, especially using next generation sequencing (NGS) has revolutionized the landscape of microbiology and infectious disease. This deluge of sequencing data has not only enabled advances in fundamental biology but also helped improve diagnosis, typing of pathogen, virulence and antibiotic resistance detection, and development of new vaccines and culture media. In addition, NGS also enabled efficient analysis of complex human micro-floras, both commensal, and pathological, through metagenomic methods, thus helping the comprehension and management of human diseases such as obesity. This review summarizes technological advances in genomics and metagenomics relevant to the field of medical microbiology. Copyright © 2013 Elsevier B.V. All rights reserved.

  20. Robust estimation of microbial diversity in theory and in practice

    PubMed Central

    Haegeman, Bart; Hamelin, Jérôme; Moriarty, John; Neal, Peter; Dushoff, Jonathan; Weitz, Joshua S

    2013-01-01

    Quantifying diversity is of central importance for the study of structure, function and evolution of microbial communities. The estimation of microbial diversity has received renewed attention with the advent of large-scale metagenomic studies. Here, we consider what the diversity observed in a sample tells us about the diversity of the community being sampled. First, we argue that one cannot reliably estimate the absolute and relative number of microbial species present in a community without making unsupported assumptions about species abundance distributions. The reason for this is that sample data do not contain information about the number of rare species in the tail of species abundance distributions. We illustrate the difficulty in comparing species richness estimates by applying Chao's estimator of species richness to a set of in silico communities: they are ranked incorrectly in the presence of large numbers of rare species. Next, we extend our analysis to a general family of diversity metrics (‘Hill diversities'), and construct lower and upper estimates of diversity values consistent with the sample data. The theory generalizes Chao's estimator, which we retrieve as the lower estimate of species richness. We show that Shannon and Simpson diversity can be robustly estimated for the in silico communities. We analyze nine metagenomic data sets from a wide range of environments, and show that our findings are relevant for empirically-sampled communities. Hence, we recommend the use of Shannon and Simpson diversity rather than species richness in efforts to quantify and compare microbial diversity. PMID:23407313
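
    The diversity measures named in record 20 can be written down compactly. The sketch below computes naive plug-in Hill diversities (order 0 is observed richness, order 1 is the exponential of Shannon entropy, order 2 is the inverse Simpson index) and the classic Chao1 richness estimate from a vector of per-species counts. The function names are illustrative, and these are the textbook formulas, not the lower and upper estimates constructed in the paper.

```python
import math


def hill_diversity(counts, q):
    """Plug-in Hill diversity of order q from observed species counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    if q == 1:
        # Limit q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


def chao1(counts):
    """Classic Chao1 richness estimate from singleton/doubleton counts."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # species seen exactly once
    f2 = sum(1 for c in counts if c == 2)  # species seen exactly twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    # Bias-corrected form, used here when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / 2.0


sample = [10, 5, 2, 2, 1, 1, 1]
richness = hill_diversity(sample, 0)
shannon_effective = hill_diversity(sample, 1)
inverse_simpson = hill_diversity(sample, 2)
chao_estimate = chao1(sample)
```

    Chao1 depends only on the rarest observed species (singletons and doubletons), which is consistent with the abstract's point that richness estimates are sensitive to the unobserved tail, whereas orders 1 and 2 are dominated by common species.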

  1. Detection and characterization of a novel rhabdovirus in Aedes cantans mosquitoes and evidence for a mosquito-associated new genus in the family Rhabdoviridae.

    PubMed

    Shahhosseini, Nariman; Lühken, Renke; Jöst, Hanna; Jansen, Stephanie; Börstler, Jessica; Rieger, Toni; Krüger, Andreas; Yadouleton, Anges; de Mendonça Campos, Renata; Cirne-Santos, Claudio Cesar; Ferreira, Davis Fernandes; Garms, Rolf; Becker, Norbert; Tannich, Egbert; Cadar, Daniel; Schmidt-Chanasit, Jonas

    2017-11-01

    Thanks to recent advances in random amplification technologies, metagenomic surveillance expanded the number of novel, often unclassified viruses within the family Rhabdoviridae. Using a vector-enabled metagenomic (VEM) tool, we identified a novel rhabdovirus in Aedes cantans mosquitoes collected from Germany provisionally named Ohlsdorf virus (OHSDV). The OHSDV genome encodes the canonical rhabdovirus structural proteins (N, P, M, G and L) with alternative ORF in the P gene. Sequence analysis indicated that OHSDV exhibits a similar genome organization and characteristics compared to other mosquito-associated rhabdoviruses (Riverside virus, Tongilchon virus and North Creek virus). Complete L protein based phylogeny revealed that all four viruses share a common ancestor and form a deeply rooted and divergent monophyletic group within the dimarhabdovirus supergroup and define a new genus, tentatively named Ohlsdorfvirus. Although the Ohlsdorfvirus clade is basal within the dimarhabdovirus supergroup phylogeny that includes genera of arthropod-borne rhabdoviruses, it remains unknown if viruses in the proposed new genus are vector-borne pathogens. The observed spatiotemporal distribution in mosquitoes suggests that members of the proposed genus Ohlsdorfvirus are geographically restricted/separated. These findings increase the current knowledge of the genetic diversity, classification and evolution of this virus family. Further studies are needed to determine the host range, transmission route and the evolutionary relationships of these mosquito-associated viruses with those infecting vertebrates. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Assessing the genetic diversity of Cu resistance in mine tailings through high-throughput recovery of full-length copA genes

    PubMed Central

    Li, Xiaofang; Zhu, Yong-Guan; Shaban, Babak; Bruxner, Timothy J. C.; Bond, Philip L.; Huang, Longbin

    2015-01-01

    Characterizing the genetic diversity of microbial copper (Cu) resistance at the community level remains challenging, mainly due to the polymorphism of the core functional gene copA. In this study, a local BLASTN method using a copA database built in this study was developed to recover full-length putative copA sequences from an assembled tailings metagenome; these sequences were then screened for potentially functioning CopA using conserved metal-binding motifs, inferred by evolutionary trace analysis of CopA sequences from known Cu resistant microorganisms. In total, 99 putative copA sequences were recovered from the tailings metagenome, out of which 70 were found with high potential to be functioning in Cu resistance. Phylogenetic analysis of selected copA sequences detected in the tailings metagenome showed that topology of the copA phylogeny is largely congruent with that of the 16S-based phylogeny of the tailings microbial community obtained in our previous study, indicating that the development of copA diversity in the tailings might be mainly through vertical descent with few lateral gene transfer events. The method established here can be used to explore copA (and potentially other metal resistance genes) diversity in any metagenome and has the potential to exhaust the full-length gene sequences for downstream analyses. PMID:26286020

  3. Metagenomic analysis of nitrogen metabolism genes in the surface of marine sediments

    NASA Astrophysics Data System (ADS)

    Reyes, Carolina; Schneider, Dominik; Thürmer, Andrea; Dellwig, Olaf; Lipka, Marko; Daniel, Rolf; Böttcher, Michael E.; Friedrich, Michael W.

    2016-04-01

    In this study, we analysed metagenomes along with biogeochemical profiles from Skagerrak (North Sea) and Bothnian Bay (Baltic Sea) sediments, to trace the prevailing nitrogen pathways. NO3- was present in the top 5 cm below the sediment-water interface at both sites. NH4+ increased with depth below 5 cm where it overlapped with the NO3- zone. Steady state modelling of NO3- and NH4+ porewater profiles indicates zones of net nitrogen species transformations. Protease, peptidase, urease and deaminase ammonification genes were detected in metagenomes. Genes involved in ammonia oxidation (amo, hao), nitrite oxidation (nxr), denitrification (nar, nir, nor) and dissimilatory NO3- reduction to NH4+ (nap, nfr and otr) were also present. 16S rRNA gene analysis showed that the nitrifying group Nitrosopumilales and other groups involved in nitrification and denitrification (Nitrobacter, Nitrosomonas, Nitrospira, Nitrosococcus, and Nitrosonomas) appeared less abundant in Skagerrak sediments compared to Bothnian Bay sediments. Beggiatoa and Thiothrix 16S rRNA genes were also present suggesting chemolithoautotrophic NO3- reduction to NO2- or NH4+ as a possible pathway. Although anammox planctomycetes 16S rRNA genes were present in metagenomes, anammox protein-coding genes were not detected. Our results show the metabolic potential for ammonification, nitrification, NO3- reduction, and denitrification activities in Skagerrak and Bothnian Bay sediments.

  4. Metasecretome-selective phage display approach for mining the functional potential of a rumen microbial community.

    PubMed

    Ciric, Milica; Moon, Christina D; Leahy, Sinead C; Creevey, Christopher J; Altermann, Eric; Attwood, Graeme T; Rakonjac, Jasna; Gagic, Dragana

    2014-05-12

    In silico, secretome proteins can be predicted from completely sequenced genomes using various available algorithms that identify membrane-targeting sequences. For the metasecretome (collection of surface, secreted and transmembrane proteins from environmental microbial communities) this approach is impractical, considering that the metasecretome open reading frames (ORFs) comprise only 10% to 30% of the total metagenome, and are poorly represented in the dataset due to overall low coverage of the metagenomic gene pool, even in large-scale projects. By combining secretome-selective phage display and next-generation sequencing, we focused the sequence analysis of a complex rumen microbial community on the metasecretome component of the metagenome. This approach achieved high enrichment (29 fold) of secreted fibrolytic enzymes from the plant-adherent microbial community of the bovine rumen. In particular, we identified hundreds of heretofore rare modules belonging to cellulosomes, cell-surface complexes specialised for recognition and degradation of the plant fibre. As a method, metasecretome phage display combined with next-generation sequencing has the power to sample the diversity of low-abundance surface and secreted proteins that would otherwise require exceptionally large metagenomic sequencing projects. As a resource, the metasecretome display library backed by the dataset obtained by next-generation sequencing is ready for i) affinity selection by standard phage display methodology and ii) easy purification of displayed proteins as part of the virion for individual functional analysis.

  5. The YNP Metagenome Project: Environmental Parameters Responsible for Microbial Distribution in the Yellowstone Geothermal Ecosystem

    PubMed Central

    Inskeep, William P.; Jay, Zackary J.; Tringe, Susannah G.; Herrgård, Markus J.; Rusch, Douglas B.

    2013-01-01

    The Yellowstone geothermal complex contains over 10,000 diverse geothermal features that host numerous phylogenetically deeply rooted and poorly understood archaea, bacteria, and viruses. Microbial communities in high-temperature environments are generally less diverse than soil, marine, sediment, or lake habitats and therefore offer a tremendous opportunity for studying the structure and function of different model microbial communities using environmental metagenomics. One of the broader goals of this study was to establish linkages among microbial distribution, metabolic potential, and environmental variables. Twenty geochemically distinct geothermal ecosystems representing a broad spectrum of Yellowstone hot-spring environments were used for metagenomic and geochemical analysis and included approximately equal numbers of: (1) phototrophic mats, (2) “filamentous streamer” communities, and (3) archaeal-dominated sediments. The metagenomes were analyzed using a suite of complementary and integrative bioinformatic tools, including phylogenetic and functional analysis of both individual sequence reads and assemblies of predominant phylotypes. This volume identifies major environmental determinants of a large number of thermophilic microbial lineages, many of which have not been fully described in the literature nor previously cultivated to enable functional and genomic analyses. Moreover, protein family abundance comparisons and in-depth analyses of specific genes and metabolic pathways relevant to these hot-spring environments reveal hallmark signatures of metabolic capabilities that parallel the distribution of phylotypes across specific types of geochemical environments. PMID:23653623

  6. Multisubstrate Isotope Labeling and Metagenomic Analysis of Active Soil Bacterial Communities

    PubMed Central

    Verastegui, Y.; Cheng, J.; Engel, K.; Kolczynski, D.; Mortimer, S.; Lavigne, J.; Montalibet, J.; Romantsov, T.; Hall, M.; McConkey, B. J.; Rose, D. R.; Tomashek, J. J.; Scott, B. R.

    2014-01-01

    ABSTRACT Soil microbial diversity represents the largest global reservoir of novel microorganisms and enzymes. In this study, we coupled functional metagenomics and DNA stable-isotope probing (DNA-SIP) using multiple plant-derived carbon substrates and diverse soils to characterize active soil bacterial communities and their glycoside hydrolase genes, which have value for industrial applications. We incubated samples from three disparate Canadian soils (tundra, temperate rainforest, and agricultural) with five native carbon (12C) or stable-isotope-labeled (13C) carbohydrates (glucose, cellobiose, xylose, arabinose, and cellulose). Indicator species analysis revealed high specificity and fidelity for many uncultured and unclassified bacterial taxa in the heavy DNA for all soils and substrates. Among characterized taxa, Actinomycetales (Salinibacterium), Rhizobiales (Devosia), Rhodospirillales (Telmatospirillum), and Caulobacterales (Phenylobacterium and Asticcacaulis) were bacterial indicator species for the heavy substrates and soils tested. Both Actinomycetales and Caulobacterales (Phenylobacterium) were associated with metabolism of cellulose, and Alphaproteobacteria were associated with the metabolism of arabinose; members of the order Rhizobiales were strongly associated with the metabolism of xylose. Annotated metagenomic data suggested diverse glycoside hydrolase gene representation within the pooled heavy DNA. By screening 2,876 cloned fragments derived from the 13C-labeled DNA isolated from soils incubated with cellulose, we demonstrate the power of combining DNA-SIP, multiple-displacement amplification (MDA), and functional metagenomics by efficiently isolating multiple clones with activity on carboxymethyl cellulose and fluorogenic proxy substrates for carbohydrate-active enzymes. PMID:25028422

  7. Metagenomic insight of nitrogen metabolism in a tannery wastewater treatment plant bioaugmented with the microbial consortium BM-S-1.

    PubMed

    Sul, Woo-Jun; Kim, In-Soo; Ekpeghere, Kalu I; Song, Bongkeun; Kim, Bong-Soo; Kim, Hong-Gi; Kim, Jong-Tae; Koh, Sung-Cheol

    2016-11-09

    Nitrogen (N) removal in a tannery wastewater treatment plant was significantly enhanced by bioaugmentation with the novel consortium BM-S-1. To identify the dominant taxa responsible for N metabolism in the different stages of the treatment process, an Illumina MiSeq sequencer was used to conduct metagenome sequencing of the microbial communities in the different stages of the treatment system, including influent (I), buffering (B), primary aeration (PA), secondary aeration (SA) and sludge digestion (SD). Based on MG-RAST analysis, the dominant phyla were Proteobacteria, Bacteroidetes and Firmicutes in B, PA, SA and SD, whereas Firmicutes was the most dominant in I before augmentation. The augmentation increased the abundance of the denitrification genes found in genera such as Ralstonia (nirS, norB and nosZ), Pseudomonas (narG, nirS and norB) and Escherichia (narG) in B and PA. In addition, Bacteroides, Geobacter, Porphyromonas and Wolinella carrying the nrfA gene, which is involved in dissimilatory nitrate reduction to ammonium, were abundantly present in B and PA. This was corroborated by the higher total N removal in these two stages. Thus, metagenomic analysis was able to identify the dominant taxa responsible for dissimilatory N metabolism in the tannery wastewater treatment system undergoing bioaugmentation. This metagenomic insight into the nitrogen metabolism will contribute to successful monitoring and operation of the eco-friendly tannery wastewater treatment system.

  8. Single Cell and Metagenomic Assemblies: Biology Drives Technical Choices and Goals (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Stepanauskas, Ramunas

    2018-02-06

    DOE JGI's Tanja Woyke, chair of the Single Cells and Metagenomes session, delivers an introduction, followed by Bigelow Laboratory's Ramunas Stepanauskas on "Single Cell and Metagenomic Assemblies: Biology Drives Technical Choices and Goals" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  9. MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads

    PubMed Central

    Lukjancenko, Oksana; Thomsen, Martin Christen Frølund; Maddalena Sperotto, Maria; Lund, Ole; Møller Aarestrup, Frank; Sicheritz-Pontén, Thomas

    2017-01-01

    An increasing number of species and gene identification studies rely on next-generation sequence analysis of either single-isolate or metagenomics samples. Several methods are available to perform taxonomic annotations, and a previous metagenomics benchmark study has shown that a vast number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next-generation sequence data and perform reference-based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in-vitro bacterial mock community sample comprising 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain-level annotations. A comparison between MGmapper and Kraken at species level shows that MGmapper assigns taxonomy at species level using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. The use of custom databases is possible for the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets. PMID:28467460

  10. MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads.

    PubMed

    Petersen, Thomas Nordahl; Lukjancenko, Oksana; Thomsen, Martin Christen Frølund; Maddalena Sperotto, Maria; Lund, Ole; Møller Aarestrup, Frank; Sicheritz-Pontén, Thomas

    2017-01-01

    An increasing number of species and gene identification studies rely on next-generation sequence analysis of either single-isolate or metagenomics samples. Several methods are available to perform taxonomic annotations, and a previous metagenomics benchmark study has shown that a vast number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next-generation sequence data and perform reference-based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in-vitro bacterial mock community sample comprising 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain-level annotations. A comparison between MGmapper and Kraken at species level shows that MGmapper assigns taxonomy at species level using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. The use of custom databases is possible for the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets.

  11. Comparative analysis of metagenomes from three methanogenic hydrocarbon-degrading enrichment cultures with 41 environmental samples.

    PubMed

    Tan, Boonfei; Fowler, S Jane; Abu Laban, Nidal; Dong, Xiaoli; Sensen, Christoph W; Foght, Julia; Gieg, Lisa M

    2015-09-01

    Methanogenic hydrocarbon metabolism is a key process in subsurface oil reservoirs and hydrocarbon-contaminated environments and thus warrants greater understanding to improve current technologies for fossil fuel extraction and bioremediation. In this study, three hydrocarbon-degrading methanogenic cultures established from two geographically distinct environments and incubated with different hydrocarbon substrates (added as single hydrocarbons or as mixtures) were subjected to metagenomic and 16S rRNA gene pyrosequencing to test whether these differences affect the genetic potential and composition of the communities. Enrichment of different putative hydrocarbon-degrading bacteria in each culture appeared to be substrate dependent, though all cultures contained both acetate- and H2-utilizing methanogens. Despite differing hydrocarbon substrates and inoculum sources, all three cultures harbored genes for hydrocarbon activation by fumarate addition (bssA, assA, nmsA) and carboxylation (abcA, ancA), along with those for associated downstream pathways (bbs, bcr, bam), though the cultures incubated with hydrocarbon mixtures contained a broader diversity of fumarate addition genes. A comparative metagenomic analysis of the three cultures showed that they were functionally redundant despite their enrichment backgrounds, sharing multiple features associated with syntrophic hydrocarbon conversion to methane. In addition, a comparative analysis of the culture metagenomes with those of 41 environmental samples (containing varying proportions of methanogens) showed that the three cultures were functionally most similar to each other but distinct from other environments, including hydrocarbon-impacted environments (for example, oil sands tailings ponds and oil-affected marine sediments). This study provides a basis for understanding key functions and environmental selection in methanogenic hydrocarbon-associated communities.

  12. Genometa--a fast and accurate classifier for short metagenomic shotgun reads.

    PubMed

    Davenport, Colin F; Neugebauer, Jens; Beckmann, Nils; Friedrich, Benedikt; Kameri, Burim; Kokott, Svea; Paetow, Malte; Siekmann, Björn; Wieding-Drewes, Matthias; Wienhöfer, Markus; Wolf, Stefan; Tümmler, Burkhard; Ahlers, Volker; Sprengel, Frauke

    2012-01-01

    Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer. The Genometa program, a step by step tutorial and Java source code are freely available from http://genomics1.mh-hannover.de/genometa/ and on http://code.google.com/p/genometa/. This program has been tested on Ubuntu Linux and Windows XP/7.

  13. A Microbiomic Analysis in African Americans with Colonic Lesions Reveals Streptococcus sp.VT162 as a Marker of Neoplastic Transformation

    PubMed Central

    Brim, Hassan; Yooseph, Shibu; Lee, Edward; Sherif, Zaki A.; Abbas, Muneer; Laiyemo, Adeyinka O.; Varma, Sudhir; Torralba, Manolito; Dowd, Scot E.; Nelson, Karen E.; Pathmasiri, Wimal; Sumner, Susan; de Vos, Willem; Liang, Qiaoyi; Yu, Jun; Zoetendal, Erwin; Ashktorab, Hassan

    2017-01-01

    Increasing evidence suggests a role of the gut microbiota in colorectal carcinogenesis (CRC). To detect bacterial markers of colorectal cancer in African Americans a metabolomic analysis was performed on fecal water extracts. DNA from stool samples of adenoma and healthy subjects and from colon cancer and matched normal tissues was analyzed to determine the microbiota composition (using 16S rDNA) and genomic content (metagenomics). Metagenomic functions with discriminative power between healthy and neoplastic specimens were established. Quantitative Polymerase Chain Reaction (q-PCR) using primers and probes specific to Streptococcus sp. VT_162 were used to validate this bacterium association with neoplastic transformation in stool samples from two independent cohorts of African Americans and Chinese patients with colorectal lesions. The metabolomic analysis of adenomas revealed low amino acids content. The microbiota in both cancer vs. normal tissues and adenoma vs. normal stool samples were different at the 16S rRNA gene level. Cross-mapping of metagenomic data led to 9 markers with significant discriminative power between normal and diseased specimens. These markers identified with Streptococcus sp. VT_162. Q-PCR data showed a statistically significant presence of this bacterium in advanced adenoma and cancer samples in an independent cohort of CRC patients. We defined metagenomic functions from Streptococcus sp. VT_162 with discriminative power among cancers vs. matched normal and adenomas vs. healthy subjects’ stools. Streptococcus sp. VT_162 specific 16S rDNA was validated in an independent cohort. These findings might facilitate non-invasive screening for colorectal cancer. PMID:29120399

  14. Comprehensive benchmarking and ensemble approaches for metagenomic classifiers.

    PubMed

    McIntyre, Alexa B R; Ounit, Rachid; Afshinnekoo, Ebrahim; Prill, Robert J; Hénaff, Elizabeth; Alexander, Noah; Minot, Samuel S; Danko, David; Foox, Jonathan; Ahsanuddin, Sofia; Tighe, Scott; Hasan, Nur A; Subramanian, Poorani; Moffat, Kelly; Levy, Shawn; Lonardi, Stefano; Greenfield, Nick; Colwell, Rita R; Rosen, Gail L; Mason, Christopher E

    2017-09-21

    One of the main challenges in metagenomics is the identification of microorganisms in clinical and environmental samples. While an extensive and heterogeneous set of computational tools is available to classify microorganisms using whole-genome shotgun sequencing data, comprehensive comparisons of these methods are limited. In this study, we use the largest-to-date set of laboratory-generated and simulated controls across 846 species to evaluate the performance of 11 metagenomic classifiers. Tools were characterized on the basis of their ability to identify taxa at the genus, species, and strain levels, quantify relative abundances of taxa, and classify individual reads to the species level. Strikingly, the number of species identified by the 11 tools can differ by over three orders of magnitude on the same datasets. Various strategies can ameliorate taxonomic misclassification, including abundance filtering, ensemble approaches, and tool intersection. Nevertheless, these strategies were often insufficient to completely eliminate false positives from environmental samples, which are especially important where they concern medically relevant species. Overall, pairing tools with different classification strategies (k-mer, alignment, marker) can combine their respective advantages. This study provides positive and negative controls, titrated standards, and a guide for selecting tools for metagenomic analyses by comparing ranges of precision, accuracy, and recall. We show that proper experimental design and analysis parameters can reduce false positives, provide greater resolution of species in complex metagenomic samples, and improve the interpretation of results.
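
    The filtering strategies named in this abstract (abundance filtering and tool intersection) and the precision and recall metrics can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the placeholder taxa, the read counts and the thresholds are all assumptions made up for the example.

```python
from collections import Counter


def abundance_filter(profile, min_fraction):
    """Keep taxa whose share of assigned reads is at least min_fraction."""
    total = sum(profile.values())
    if total == 0:
        return set()
    return {taxon for taxon, count in profile.items() if count / total >= min_fraction}


def intersect_calls(call_sets, min_tools):
    """Keep taxa reported by at least min_tools of the classifiers."""
    votes = Counter(taxon for calls in call_sets for taxon in calls)
    return {taxon for taxon, n in votes.items() if n >= min_tools}


def precision_recall(predicted, truth):
    """Precision and recall of a predicted taxon set against a known truth set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical mock community and read counts from two hypothetical classifiers.
truth = {"species_A", "species_B", "species_C"}
kmer_tool = {"species_A": 500, "species_B": 300, "species_C": 195, "species_X": 5}
align_tool = {"species_A": 480, "species_B": 310, "species_C": 200, "species_Y": 10}

# A permissive abundance threshold leaves one false positive per tool;
# requiring agreement between both tools removes them.
calls = [abundance_filter(p, 0.001) for p in (kmer_tool, align_tool)]
consensus = intersect_calls(calls, min_tools=2)

print(precision_recall(calls[0], truth))
print(precision_recall(consensus, truth))
```

    In this toy case the two tools make different false positive calls, so intersection removes both. When tools share the same error, as the abstract notes for environmental samples, intersection alone would not help.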

  15. Metagenomic analysis reveals adaptations to a cold-adapted lifestyle in a low-temperature acid mine drainage stream.

    PubMed

    Liljeqvist, Maria; Ossandon, Francisco J; González, Carolina; Rajan, Sukithar; Stell, Adam; Valdes, Jorge; Holmes, David S; Dopson, Mark

    2015-04-01

    An acid mine drainage (pH 2.5-2.7) stream biofilm situated 250 m below ground in the low-temperature (6-10°C) Kristineberg mine, northern Sweden, contained a microbial community equipped for growth at low temperature and acidic pH. Metagenomic sequencing of the biofilm and planktonic fractions identified the most abundant microorganism to be similar to the psychrotolerant acidophile, Acidithiobacillus ferrivorans. In addition, metagenome contigs were most similar to other Acidithiobacillus species, an Acidobacteria-like species, and a Gallionellaceae-like species. Analyses of the metagenomes indicated functional characteristics previously characterized as related to growth at low temperature including cold-shock proteins, several pathways for the production of compatible solutes and an anti-freeze protein. In addition, genes were predicted to encode functions related to pH homeostasis and metal resistance related to growth in the acidic metal-containing mine water. Metagenome analyses identified microorganisms capable of nitrogen fixation and exhibiting a primarily autotrophic lifestyle driven by the oxidation of the ferrous iron and inorganic sulfur compounds contained in the sulfidic mine waters. The study identified a low diversity of abundant microorganisms adapted to a low-temperature acidic environment as well as identifying some of the strategies the microorganisms employ to grow in this extreme environment. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  16. Identification and Resolution of Microdiversity through Metagenomic Sequencing of Parallel Consortia

    PubMed Central

    Maezato, Yukari; Wu, Yu-Wei; Romine, Margaret F.; Lindemann, Stephen R.

    2015-01-01

    To gain a predictive understanding of the interspecies interactions within microbial communities that govern community function, the genomic complement of every member population must be determined. Although metagenomic sequencing has enabled the de novo reconstruction of some microbial genomes from environmental communities, microdiversity confounds current genome reconstruction techniques. To overcome this issue, we performed short-read metagenomic sequencing on parallel consortia, defined as consortia cultivated under the same conditions from the same natural community with overlapping species composition. The differences in species abundance between the two consortia allowed reconstruction of near-complete (at an estimated >85% of gene complement) genome sequences for 17 of the 20 detected member species. Two Halomonas spp. indistinguishable by amplicon analysis were found to be present within the community. In addition, comparison of metagenomic reads against the consensus scaffolds revealed within-species variation for one of the Halomonas populations, one of the Rhodobacteraceae populations, and the Rhizobiales population. Genomic comparison of these representative instances of inter- and intraspecies microdiversity suggests differences in functional potential that may result in the expression of distinct roles in the community. In addition, isolation and complete genome sequence determination of six member species allowed an investigation into the sensitivity and specificity of genome reconstruction processes, demonstrating robustness across a wide range of sequence coverage (9× to 2,700×) within the metagenomic data set. PMID:26497460

  17. HPMCD: the database of human microbial communities from metagenomic datasets and microbial reference genomes.

    PubMed

    Forster, Samuel C; Browne, Hilary P; Kumar, Nitin; Hunt, Martin; Denise, Hubert; Mitchell, Alex; Finn, Robert D; Lawley, Trevor D

    2016-01-04

    The Human Pan-Microbe Communities (HPMC) database (http://www.hpmcd.org/) provides a manually curated, searchable, metagenomic resource to facilitate investigation of human gastrointestinal microbiota. Over the past decade, the application of metagenome sequencing to elucidate the microbial composition and functional capacity present in the human microbiome has revolutionized many concepts in our basic biology. When sufficient high quality reference genomes are available, whole genome metagenomic sequencing can provide direct biological insights and high-resolution classification. The HPMC database provides species level, standardized phylogenetic classification of over 1800 human gastrointestinal metagenomic samples. This is achieved by combining a manually curated list of bacterial genomes from human faecal samples with over 21000 additional reference genomes representing bacteria, viruses, archaea and fungi with manually curated species classification and enhanced sample metadata annotation. A user-friendly, web-based interface provides the ability to search for (i) microbial groups associated with health or disease state, (ii) health or disease states and community structure associated with a microbial group, (iii) the enrichment of a microbial gene or sequence and (iv) enrichment of a functional annotation. The HPMC database enables detailed analysis of human microbial communities and supports research from basic microbiology and immunology to therapeutic development in human health and disease. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Identification and Resolution of Microdiversity through Metagenomic Sequencing of Parallel Consortia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nelson, William C.; Maezato, Yukari; Wu, Yu-Wei

    2015-10-23

    To gain a predictive understanding of the interspecies interactions within microbial communities that govern community function, the genomic complement of every member population must be determined. Although metagenomic sequencing has enabled the de novo reconstruction of some microbial genomes from environmental communities, microdiversity confounds current genome reconstruction techniques. To overcome this issue, we performed short-read metagenomic sequencing on parallel consortia, defined as consortia cultivated under the same conditions from the same natural community with overlapping species composition. The differences in species abundance between the two consortia allowed reconstruction of near-complete (at an estimated >85% of gene complement) genome sequences for 17 of the 20 detected member species. Two Halomonas spp. indistinguishable by amplicon analysis were found to be present within the community. In addition, comparison of metagenomic reads against the consensus scaffolds revealed within-species variation for one of the Halomonas populations, one of the Rhodobacteraceae populations, and the Rhizobiales population. Genomic comparison of these representative instances of inter- and intraspecies microdiversity suggests differences in functional potential that may result in the expression of distinct roles in the community. In addition, isolation and complete genome sequence determination of six member species allowed an investigation into the sensitivity and specificity of genome reconstruction processes, demonstrating robustness across a wide range of sequence coverage (9× to 2,700×) within the metagenomic data set.

  19. BusyBee Web: metagenomic data analysis by bootstrapped supervised binning and annotation

    PubMed Central

    Kiefer, Christina; Fehlmann, Tobias; Backes, Christina

    2017-01-01

    Abstract Metagenomics-based studies of mixed microbial communities are impacting biotechnology, life sciences and medicine. Computational binning of metagenomic data is a powerful approach for the culture-independent recovery of population-resolved genomic sequences, i.e. from individual or closely related, constituent microorganisms. Existing binning solutions often require a priori characterized reference genomes and/or dedicated compute resources. Extending currently available reference-independent binning tools, we developed the BusyBee Web server for the automated deconvolution of metagenomic data into population-level genomic bins using assembled contigs (Illumina) or long reads (Pacific Biosciences, Oxford Nanopore Technologies). A reversible compression step as well as bootstrapped supervised binning enable quick turnaround times. The binning results are represented in interactive 2D scatterplots. Moreover, bin quality estimates, taxonomic annotations and annotations of antibiotic resistance genes are computed and visualized. Ground truth-based benchmarks of BusyBee Web demonstrate comparably high performance to state-of-the-art binning solutions for assembled contigs and markedly improved performance for long reads (median F1 scores: 70.02–95.21%). Furthermore, the applicability to real-world metagenomic datasets is shown. In conclusion, our reference-independent approach automatically bins assembled contigs or long reads, exhibits high sensitivity and precision, enables intuitive inspection of the results, and only requires FASTA-formatted input. The web-based application is freely accessible at: https://ccb-microbe.cs.uni-saarland.de/busybee. PMID:28472498

  20. The genetic potential for key biogeochemical processes in Arctic frost flowers and young sea ice revealed by metagenomic analysis.

    PubMed

    Bowman, Jeff S; Berthiaume, Chris T; Armbrust, E Virginia; Deming, Jody W

    2014-08-01

    Newly formed sea ice is a vast and biogeochemically active environment. Recently, we reported an unusual microbial community dominated by members of the Rhizobiales in frost flowers at the surface of Arctic young sea ice based on the presence of 16S gene sequences related to these strains. Here, we use metagenomic analysis of two samples, from a field of frost flowers and the underlying young sea ice, to explore the metabolic potential of this surface ice community. The analysis links genes for key biogeochemical processes to the Rhizobiales, including dimethylsulfide uptake, betaine glycine turnover, and halocarbon production. Nodulation and nitrogen fixation genes characteristic of terrestrial root-nodulating Rhizobiales were generally lacking from these metagenomes. Non-Rhizobiales clades at the ice surface had genes that would enable additional biogeochemical processes, including mercury reduction and dimethylsulfoniopropionate catabolism. Although the ultimate source of the observed microbial community is not known, considerations of the possible role of eolian deposition or transport with particles entrained during ice formation favor a suspended particle source for this microbial community. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  1. MinION™ nanopore sequencing of environmental metagenomes: a synthetic approach

    PubMed Central

    Watson, Mick; Minot, Samuel S.; Rivera, Maria C.; Franklin, Rima B.

    2017-01-01

    Abstract Background: Environmental metagenomic analysis is typically accomplished by assigning taxonomy and/or function from whole genome sequencing or 16S amplicon sequences. Both of these approaches are limited, however, by read length, among other technical and biological factors. A nanopore-based sequencing platform, MinION™, produces reads that are ≥1 × 10^4 bp in length, potentially providing for more precise assignment, thereby alleviating some of the limitations inherent in determining metagenome composition from short reads. We tested the ability of sequence data produced by MinION (R7.3 flow cells) to correctly assign taxonomy in single bacterial species runs and in three types of low-complexity synthetic communities: a mixture of DNA using equal mass from four species, a community with one relatively rare (1%) and three abundant (33% each) components, and a mixture of genomic DNA from 20 bacterial strains of staggered representation. Taxonomic composition of the low-complexity communities was assessed by analyzing the MinION sequence data with three different bioinformatic approaches: Kraken, MG-RAST, and One Codex. Results: Long read sequences generated from libraries prepared from single strains using the version 5 kit and chemistry, run on the original MinION device, yielded as few as 224 to as many as 3497 bidirectional high-quality (2D) reads with an average overall study length of 6000 bp. For the single-strain analyses, assignment of reads to the correct genus by different methods ranged from 53.1% to 99.5%, assignment to the correct species ranged from 23.9% to 99.5%, and the majority of misassigned reads were to closely related organisms. A synthetic metagenome sequenced with the same setup yielded 714 high quality 2D reads of approximately 5500 bp that were up to 98% correctly assigned to the species level. Synthetic metagenome MinION libraries generated using version 6 kit and chemistry yielded from 899 to 3497 2D reads with lengths averaging 5700 bp with up to 98% assignment accuracy at the species level. The observed community proportions for “equal” and “rare” synthetic libraries were close to the known proportions, deviating from 0.1% to 10% across all tests. For a 20-species mock community with staggered contributions, a sequencing run detected all but 3 species (each included at <0.05% of DNA in the total mixture), 91% of reads were assigned to the correct species, 93% of reads were assigned to the correct genus, and >99% of reads were assigned to the correct family. Conclusions: At the current level of output and sequence quality (just under 4 × 10^3 2D reads for a synthetic metagenome), MinION sequencing followed by Kraken or One Codex analysis has the potential to provide rapid and accurate metagenomic analysis where the consortium is comprised of a limited number of taxa. Important considerations noted in this study included: high sensitivity of the MinION platform to the quality of input DNA, high variability of sequencing results across libraries and flow cells, and relatively small numbers of 2D reads per analysis limit. Together, these limited detection of very rare components of the microbial consortia, and would likely limit the utility of MinION for the sequencing of high-complexity metagenomic communities where thousands of taxa are expected. Furthermore, the limitations of the currently available data analysis tools suggest there is considerable room for improvement in the analytical approaches for the characterization of microbial communities using long reads. Nevertheless, the fact that the accurate taxonomic assignment of high-quality reads generated by MinION is approaching 99.5% and, in most cases, the inferred community structure mirrors the known proportions of a synthetic mixture warrants further exploration of practical application to environmental metagenomics as the platform continues to develop and improve. With further improvement in sequence throughput and error rate reduction, this platform shows great promise for precise real-time analysis of the composition and structure of more complex microbial communities. PMID:28327976

  2. MinION™ nanopore sequencing of environmental metagenomes: a synthetic approach.

    PubMed

    Brown, Bonnie L; Watson, Mick; Minot, Samuel S; Rivera, Maria C; Franklin, Rima B

    2017-03-01

    Environmental metagenomic analysis is typically accomplished by assigning taxonomy and/or function from whole genome sequencing or 16S amplicon sequences. Both of these approaches are limited, however, by read length, among other technical and biological factors. A nanopore-based sequencing platform, MinION™, produces reads that are ≥1 × 10^4 bp in length, potentially providing for more precise assignment, thereby alleviating some of the limitations inherent in determining metagenome composition from short reads. We tested the ability of sequence data produced by MinION (R7.3 flow cells) to correctly assign taxonomy in single bacterial species runs and in three types of low-complexity synthetic communities: a mixture of DNA using equal mass from four species, a community with one relatively rare (1%) and three abundant (33% each) components, and a mixture of genomic DNA from 20 bacterial strains of staggered representation. Taxonomic composition of the low-complexity communities was assessed by analyzing the MinION sequence data with three different bioinformatic approaches: Kraken, MG-RAST, and One Codex. Results: Long read sequences generated from libraries prepared from single strains using the version 5 kit and chemistry, run on the original MinION device, yielded as few as 224 to as many as 3497 bidirectional high-quality (2D) reads with an average overall study length of 6000 bp. For the single-strain analyses, assignment of reads to the correct genus by different methods ranged from 53.1% to 99.5%, assignment to the correct species ranged from 23.9% to 99.5%, and the majority of misassigned reads were to closely related organisms. A synthetic metagenome sequenced with the same setup yielded 714 high quality 2D reads of approximately 5500 bp that were up to 98% correctly assigned to the species level. 
Synthetic metagenome MinION libraries generated using version 6 kit and chemistry yielded from 899 to 3497 2D reads with lengths averaging 5700 bp with up to 98% assignment accuracy at the species level. The observed community proportions for “equal” and “rare” synthetic libraries were close to the known proportions, deviating from 0.1% to 10% across all tests. For a 20-species mock community with staggered contributions, a sequencing run detected all but 3 species (each included at <0.05% of DNA in the total mixture), 91% of reads were assigned to the correct species, 93% of reads were assigned to the correct genus, and >99% of reads were assigned to the correct family. Conclusions: At the current level of output and sequence quality (just under 4 × 10^3 2D reads for a synthetic metagenome), MinION sequencing followed by Kraken or One Codex analysis has the potential to provide rapid and accurate metagenomic analysis where the consortium is comprised of a limited number of taxa. Important considerations noted in this study included: high sensitivity of the MinION platform to the quality of input DNA, high variability of sequencing results across libraries and flow cells, and relatively small numbers of 2D reads per analysis limit. Together, these limited detection of very rare components of the microbial consortia, and would likely limit the utility of MinION for the sequencing of high-complexity metagenomic communities where thousands of taxa are expected. Furthermore, the limitations of the currently available data analysis tools suggest there is considerable room for improvement in the analytical approaches for the characterization of microbial communities using long reads. 
Nevertheless, the fact that the accurate taxonomic assignment of high-quality reads generated by MinION is approaching 99.5% and, in most cases, the inferred community structure mirrors the known proportions of a synthetic mixture warrants further exploration of practical application to environmental metagenomics as the platform continues to develop and improve. With further improvement in sequence throughput and error rate reduction, this platform shows great promise for precise real-time analysis of the composition and structure of more complex microbial communities. © The Author 2017. Published by Oxford University Press.
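
    The comparison of observed and known community proportions described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the taxon labels, and the read counts are hypothetical, and it assumes the reported deviation is the absolute difference in percentage points between the fraction of assigned reads and the known DNA fraction.

```python
def proportion_deviations(expected, read_counts):
    """Absolute deviation (percentage points) of observed from expected fractions.

    expected    -- dict of taxon -> known fraction of DNA in the mixture (0..1)
    read_counts -- dict of taxon -> number of reads assigned to that taxon
    Both inputs are illustrative; the abstract does not give per-taxon counts.
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no assigned reads")
    deviations = {}
    for taxon, expected_fraction in expected.items():
        # Taxa with no assigned reads count as an observed fraction of zero.
        observed_fraction = read_counts.get(taxon, 0) / total
        deviations[taxon] = abs(observed_fraction - expected_fraction) * 100
    return deviations


# Hypothetical "rare" community: three abundant components and one at 1%.
expected = {"species_A": 0.33, "species_B": 0.33, "species_C": 0.33, "species_D": 0.01}
counts = {"species_A": 340, "species_B": 320, "species_C": 330, "species_D": 10}
for taxon, dev in proportion_deviations(expected, counts).items():
    print(f"{taxon}: {dev:.2f} percentage points")
```

    A relative deviation (difference divided by the expected fraction) would be an equally plausible reading of the abstract and would give much larger values for the rare component.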

  3. Current and future resources for functional metagenomics

    PubMed Central

    Lam, Kathy N.; Cheng, Jiujun; Engel, Katja; Neufeld, Josh D.; Charles, Trevor C.

    2015-01-01

    Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries—physical libraries that contain DNA cloned from environmental metagenomes. The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research. PMID:26579102

  4. Current and future resources for functional metagenomics.

    PubMed

    Lam, Kathy N; Cheng, Jiujun; Engel, Katja; Neufeld, Josh D; Charles, Trevor C

    2015-01-01

    Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries-physical libraries that contain DNA cloned from environmental metagenomes. The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research.

  5. Metagenomic Frameworks for Monitoring Antibiotic Resistance in Aquatic Environments

    PubMed Central

    Port, Jesse A.; Cullen, Alison C.; Wallace, James C.; Smith, Marissa N.

    2013-01-01

    Background: High-throughput genomic technologies offer new approaches for environmental health monitoring, including metagenomic surveillance of antibiotic resistance determinants (ARDs). Although natural environments serve as reservoirs for antibiotic resistance genes that can be transferred to pathogenic and human commensal bacteria, monitoring of these determinants has been infrequent and incomplete. Furthermore, surveillance efforts have not been integrated into public health decision making. Objectives: We used a metagenomic epidemiology–based approach to develop an ARD index that quantifies antibiotic resistance potential, and we analyzed this index for common modal patterns across environmental samples. We also explored how metagenomic data such as this index could be conceptually framed within an early risk management context. Methods: We analyzed 25 published data sets from shotgun pyrosequencing projects. The samples consisted of microbial community DNA collected from marine and freshwater environments across a gradient of human impact. We used principal component analysis to identify index patterns across samples. Results: We observed significant differences in the overall index and index subcategory levels when comparing ecosystems more proximal versus distal to human impact. The selection of different sequence similarity thresholds strongly influenced the index measurements. Unique index subcategory modes distinguished the different metagenomes. Conclusions: Broad-scale screening of ARD potential using this index revealed utility for framing environmental health monitoring and surveillance. This approach holds promise as a screening tool for establishing baseline ARD levels that can be used to inform and prioritize decision making regarding management of ARD sources and human exposure routes. Citation: Port JA, Cullen AC, Wallace JC, Smith MN, Faustman EM. 2014. Metagenomic frameworks for monitoring antibiotic resistance in aquatic environments. 
Environ Health Perspect 122:222–228; http://dx.doi.org/10.1289/ehp.1307009 PMID:24334622

  6. Metagenomic frameworks for monitoring antibiotic resistance in aquatic environments.

    PubMed

    Port, Jesse A; Cullen, Alison C; Wallace, James C; Smith, Marissa N; Faustman, Elaine M

    2014-03-01

    High-throughput genomic technologies offer new approaches for environmental health monitoring, including metagenomic surveillance of antibiotic resistance determinants (ARDs). Although natural environments serve as reservoirs for antibiotic resistance genes that can be transferred to pathogenic and human commensal bacteria, monitoring of these determinants has been infrequent and incomplete. Furthermore, surveillance efforts have not been integrated into public health decision making. We used a metagenomic epidemiology-based approach to develop an ARD index that quantifies antibiotic resistance potential, and we analyzed this index for common modal patterns across environmental samples. We also explored how metagenomic data such as this index could be conceptually framed within an early risk management context. We analyzed 25 published data sets from shotgun pyrosequencing projects. The samples consisted of microbial community DNA collected from marine and freshwater environments across a gradient of human impact. We used principal component analysis to identify index patterns across samples. We observed significant differences in the overall index and index subcategory levels when comparing ecosystems more proximal versus distal to human impact. The selection of different sequence similarity thresholds strongly influenced the index measurements. Unique index subcategory modes distinguished the different metagenomes. Broad-scale screening of ARD potential using this index revealed utility for framing environmental health monitoring and surveillance. This approach holds promise as a screening tool for establishing baseline ARD levels that can be used to inform and prioritize decision making regarding management of ARD sources and human exposure routes. Port JA, Cullen AC, Wallace JC, Smith MN, Faustman EM. 2014. Metagenomic frameworks for monitoring antibiotic resistance in aquatic environments. 
Environ Health Perspect 122:222–228; http://dx.doi.org/10.1289/ehp.1307009

  7. Functional Screening of Metagenome and Genome Libraries for Detection of Novel Flavonoid-Modifying Enzymes

    PubMed Central

    Rabausch, U.; Juergensen, J.; Ilmberger, N.; Böhnke, S.; Fischer, S.; Schubach, B.; Schulte, M.

    2013-01-01

    The functional detection of novel enzymes other than hydrolases from metagenomes is limited since only a very few reliable screening procedures are available that allow the rapid screening of large clone libraries. For the discovery of flavonoid-modifying enzymes in genome and metagenome clone libraries, we have developed a new screening system based on high-performance thin-layer chromatography (HPTLC). This metagenome extract thin-layer chromatography analysis (META) allows the rapid detection of glycosyltransferase (GT) and also other flavonoid-modifying activities. The developed screening method is highly sensitive, and an amount of 4 ng of modified flavonoid molecules can be detected. This novel technology was validated against a control library of 1,920 fosmid clones generated from a single Bacillus cereus isolate and then used to analyze more than 38,000 clones derived from two different metagenomic preparations. Thereby we identified two novel UDP glycosyltransferase (UGT) genes. The metagenome-derived gtfC gene encoded a 52-kDa protein, and the deduced amino acid sequence was weakly similar to sequences of putative UGTs from Fibrisoma and Dyadobacter. GtfC mediated the transfer of different hexose moieties and exhibited high activities on flavones, flavonols, flavanones, and stilbenes and also accepted isoflavones and chalcones. From the control library we identified a novel macroside glycosyltransferase (MGT) with a calculated molecular mass of 46 kDa. The deduced amino acid sequence was highly similar to sequences of MGTs from Bacillus thuringiensis. Recombinant MgtB transferred the sugar residue from UDP-glucose effectively to flavones, flavonols, isoflavones, and flavanones. Moreover, MgtB exhibited high activity on larger flavonoid molecules such as tiliroside. PMID:23686272

  8. Comparative metagenomics of bathypelagic plankton and bottom sediment from the Sea of Marmara

    PubMed Central

    Quaiser, Achim; Zivanovic, Yvan; Moreira, David; López-García, Purificación

    2011-01-01

    To extend comparative metagenomic analyses of the deep-sea, we produced metagenomic data by direct 454 pyrosequencing from bathypelagic plankton (1000 m depth) and bottom sediment of the Sea of Marmara, the gateway between the Eastern Mediterranean and the Black Seas. Data from small subunit ribosomal RNA (SSU rRNA) gene libraries and direct pyrosequencing of the same samples indicated that Gamma- and Alpha-proteobacteria, followed by Bacteroidetes, dominated the bacterial fraction in Marmara deep-sea plankton, whereas Planctomycetes, Delta- and Gamma-proteobacteria were the most abundant groups in high bacterial-diversity sediment. Group I Crenarchaeota/Thaumarchaeota dominated the archaeal plankton fraction, although group II and III Euryarchaeota were also present. Eukaryotes were highly diverse in SSU rRNA gene libraries, with group I (Duboscquellida) and II (Syndiniales) alveolates and Radiozoa dominating plankton, and Opisthokonta and Alveolates, sediment. However, eukaryotic sequences were scarce in pyrosequence data. Archaeal amo genes were abundant in plankton, suggesting that Marmara planktonic Thaumarchaeota are ammonia oxidizers. Genes involved in sulfate reduction, carbon monoxide oxidation, anammox and sulfatases were over-represented in sediment. Genome recruitment analyses showed that Alteromonas macleodii ‘surface ecotype', Pelagibacter ubique and Nitrosopumilus maritimus were highly represented in 1000 m-deep plankton. A comparative analysis of Marmara metagenomes with ALOHA deep-sea and surface plankton, whale carcasses, Peru subsurface sediment and soil metagenomes clustered deep-sea Marmara plankton with deep-ALOHA plankton and whale carcasses, likely because of the suboxic conditions in the deep Marmara water column. The Marmara sediment clustered with the soil metagenome, highlighting the common ecological role of both types of microbial communities in the degradation of organic matter and the completion of biogeochemical cycles. 
PMID:20668488

  9. Metagenomic Assembly Reveals Hosts of Antibiotic Resistance Genes and the Shared Resistome in Pig, Chicken, and Human Feces.

    PubMed

    Ma, Liping; Xia, Yu; Li, Bing; Yang, Ying; Li, Li-Guan; Tiedje, James M; Zhang, Tong

    2016-01-05

    The risk associated with antibiotic resistance disseminating from animal and human feces is an urgent public issue. In the present study, we sought to establish a pipeline for annotating antibiotic resistance genes (ARGs) based on metagenomic assembly to investigate ARGs and their co-occurrence with associated genetic elements. Genetic elements found on the assembled genomic fragments include mobile genetic elements (MGEs) and metal resistance genes (MRGs). We then explored the hosts of these resistance genes and the shared resistome of pig, chicken and human fecal samples. High levels of tetracycline, multidrug, erythromycin, and aminoglycoside resistance genes were discovered in these fecal samples. In particular, a significantly high level of ARGs (7762 ×/Gb) was detected in adult chicken feces, indicating a higher ARG contamination level than in the other fecal samples. Many ARG arrangements (e.g., macA-macB and tetA-tetR) were found to be shared by chicken, pig and human feces. In addition, MGEs such as the aadA5-dfrA17-carrying class 1 integron were identified on an assembled scaffold of chicken feces, and are carried by human pathogens. Differential coverage binning analysis revealed significant ARG enrichment in adult chicken feces. A draft genome, annotated as multidrug resistant Escherichia coli, was retrieved from chicken feces metagenomes and was determined to carry diverse ARGs (multidrug, acriflavine, and macrolide). The present study demonstrates the determination of ARG hosts and the shared resistome from metagenomic data sets and successfully establishes the relationship between ARGs, hosts, and environments. This ARG annotation pipeline based on metagenomic assembly will help to bridge the knowledge gaps regarding ARG-associated genes and ARG hosts with metagenomic data sets. Moreover, this pipeline will facilitate the evaluation of environmental risks in the genetic context of ARGs.

  10. Metagenomic analysis of faecal microbiome as a tool towards targeted non-invasive biomarkers for colorectal cancer.

    PubMed

    Yu, Jun; Feng, Qiang; Wong, Sunny Hei; Zhang, Dongya; Liang, Qiao Yi; Qin, Youwen; Tang, Longqing; Zhao, Hui; Stenvang, Jan; Li, Yanli; Wang, Xiaokai; Xu, Xiaoqiang; Chen, Ning; Wu, William Ka Kei; Al-Aama, Jumana; Nielsen, Hans Jørgen; Kiilerich, Pia; Jensen, Benjamin Anderschou Holbech; Yau, Tung On; Lan, Zhou; Jia, Huijue; Li, Junhua; Xiao, Liang; Lam, Thomas Yuen Tung; Ng, Siew Chien; Cheng, Alfred Sze-Lok; Wong, Vincent Wai-Sun; Chan, Francis Ka Leung; Xu, Xun; Yang, Huanming; Madsen, Lise; Datz, Christian; Tilg, Herbert; Wang, Jian; Brünner, Nils; Kristiansen, Karsten; Arumugam, Manimozhiyan; Sung, Joseph Jao-Yiu; Wang, Jun

    2017-01-01

    To evaluate the potential for diagnosing colorectal cancer (CRC) from faecal metagenomes. We performed metagenome-wide association studies on faecal samples from 74 patients with CRC and 54 controls from China, and validated the results in 16 patients and 24 controls from Denmark. We further validated the biomarkers in two published cohorts from France and Austria. Finally, we employed targeted quantitative PCR (qPCR) assays to evaluate diagnostic potential of selected biomarkers in an independent Chinese cohort of 47 patients and 109 controls. Besides confirming known associations of Fusobacterium nucleatum and Peptostreptococcus stomatis with CRC, we found significant associations with several species, including Parvimonas micra and Solobacterium moorei. We identified 20 microbial gene markers that differentiated CRC and control microbiomes, and validated 4 markers in the Danish cohort. In the French and Austrian cohorts, these four genes distinguished CRC metagenomes from controls with areas under the receiver-operating curve (AUC) of 0.72 and 0.77, respectively. qPCR measurements of two of these genes accurately classified patients with CRC in the independent Chinese cohort with AUC=0.84 and OR of 23. These genes were enriched in early-stage (I-II) patient microbiomes, highlighting the potential for using faecal metagenomic biomarkers for early diagnosis of CRC. We present the first metagenomic profiling study of CRC faecal microbiomes to discover and validate microbial biomarkers in ethnically different cohorts, and to independently validate selected biomarkers using an affordable clinically relevant technology. Our study thus takes a step further towards affordable non-invasive early diagnostic biomarkers for CRC from faecal samples. Published by the BMJ Publishing Group Limited.
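
    The area under the receiver-operating curve (AUC) reported in this abstract can be illustrated with a short Python sketch. It is a generic rank-based estimate, not the study's analysis: the function name and the marker scores are hypothetical. It uses the standard interpretation of AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    A pair counts 1 when the case score exceeds the control score and
    0.5 when the two scores tie (the Mann-Whitney formulation of AUC).
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical marker abundances, for illustration only.
cases = [0.9, 0.4]
controls = [0.5, 0.1]
print(f"AUC = {roc_auc(cases, controls):.2f}")
```

    The pairwise loop is quadratic in cohort size, which is fine for cohorts of the size described here; a sort-based rank sum would scale better for large data sets.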

  11. Captured metagenomics: large-scale targeting of genes based on ‘sequence capture’ reveals functional diversity in soils

    PubMed Central

    Manoharan, Lokeshwaran; Kushwaha, Sandeep K.; Hedlund, Katarina; Ahrén, Dag

    2015-01-01

    Microbial enzyme diversity is key to understanding many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient due to the large amount of sequencing that is required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agricultural soil communities through captured metagenomics. Captured metagenomics uses custom-designed, hybridization-based oligonucleotide probes that enrich functional genes of interest in metagenomic libraries where only probe-bound DNA fragments are sequenced. The captured metagenomes were highly enriched with targeted genes while maintaining their target diversity, and their taxonomic distribution correlated well with traditional ribosomal sequencing. The captured metagenomes were highly enriched with genes related to organic matter degradation; at least five times more than similar, publicly available soil WMG projects. This target enrichment technique also preserves the functional representation of the soils, thereby facilitating comparative metagenomics projects. Here, we present the first study that applies the captured metagenomics approach at large scale, and this novel method allows deep investigations of central ecosystem processes by studying functional gene abundances. PMID:26490729

  12. Taxonomic and predicted metabolic profiles of the human gut microbiome in pre-Columbian mummies.

    PubMed

    Santiago-Rodriguez, Tasha M; Fornaciari, Gino; Luciani, Stefania; Dowd, Scot E; Toranzos, Gary A; Marota, Isolina; Cano, Raul J

    2016-11-01

    Characterization of naturally mummified human gut remains could potentially provide insights into the preservation and evolution of commensal and pathogenic microorganisms, and metabolic profiles. We characterized the gut microbiome of two pre-Columbian Andean mummies dating to the 10th-15th centuries using 16S rRNA gene high-throughput sequencing and metagenomics, and compared them to a previously characterized gut microbiome of an 11th century AD pre-Columbian Andean mummy. Our previous study showed that the Clostridiales represented the majority of the bacterial communities in the mummified gut remains, but that other microbial communities were also preserved during the process of natural mummification, as shown with the metagenomics analyses. The gut microbiomes of the other two mummies mainly comprised Clostridiales or Bacillales, as demonstrated with 16S rRNA gene amplicon sequencing, many of which are facultative anaerobes, possibly consistent with the process of natural mummification requiring low oxygen levels. Metagenome analyses showed the presence of other microbial groups that were positively or negatively correlated with specific metabolic profiles. The presence of sequences similar to both Trypanosoma cruzi and Leishmania donovani could suggest that these pathogens were prevalent in pre-Columbian individuals. Taxonomic and functional profiling of mummified human gut remains will aid in the understanding of the microbial ecology of the process of natural mummification. © FEMS 2016. All rights reserved.

  13. Metagenomics workflow analysis of endophytic bacteria from oil palm fruits

    NASA Astrophysics Data System (ADS)

    Tanjung, Z. A.; Aditama, R.; Sudania, W. M.; Utomo, C.; Liwang, T.

    2017-05-01

    Next-Generation Sequencing (NGS) has become a powerful sequencing tool for microbial study and has led to the establishment of the field of metagenomics. This study described a workflow to analyze metagenomics data of a Sequence Read Archive (SRA) file under accession ERP004286 deposited by University of Sao Paulo. The data were generated by direct sequencing on the 454 pyrosequencing platform and originated from endophytic bacteria of oil palm fruits, which were cultured using oil-palm enriched medium. This workflow used SortMeRNA to separate ribosomal reads, Newbler (GS Assembler and GS Mapper) to assemble reads and map them to a reference genome, the BLAST package to identify and annotate contig sequences, and QualiMap for statistical analysis. Eight bacterial species were identified in this study. Enterobacter cloacae was the most abundant species, followed by Citrobacter koseri, Serratia marcescens, Lactococcus lactis subsp. lactis, Klebsiella pneumoniae, Citrobacter amalonaticus, Achromobacter xylosoxidans, and Pseudomonas sp., respectively. All of these species have been reported as endophytic bacteria in various plant species, and each has potential as a plant growth promoting bacterium or for another application in agricultural industries.

  14. Identification of a Novel Human Papillomavirus by Metagenomic Analysis of Samples from Patients with Febrile Respiratory Illness

    PubMed Central

    Mokili, John L.; Dutilh, Bas E.; Lim, Yan Wei; Schneider, Bradley S.; Taylor, Travis; Haynes, Matthew R.; Metzgar, David; Myers, Christopher A.; Blair, Patrick J.; Nosrat, Bahador; Wolfe, Nathan D.; Rohwer, Forest

    2013-01-01

    As part of a virus discovery investigation using a metagenomic approach, a highly divergent novel Human papillomavirus type was identified in pooled convenience nasal/oropharyngeal swab samples collected from patients with febrile respiratory illness. Phylogenetic analysis of the whole genome and the L1 gene reveals that the new HPV identified in this study clusters with previously described gamma papillomaviruses, sharing only 61.1% (whole genome) and 63.1% (L1) sequence identity with its closest relative in the Papillomavirus episteme (PAVE) database. This new virus was named HPV_SD2 pending official classification. The complete genome of HPV-SD2 is 7,299 bp long (36.3% G/C) and contains 7 open reading frames (L2, L1, E6, E7, E1, E2 and E4) and a non-coding long control region (LCR) between L1 and E6. The metagenomic procedures, coupled with the bioinformatic methods described herein are well suited to detect small circular genomes such as those of human papillomaviruses. PMID:23554892
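
    The G/C percentage quoted for the genome in this abstract is a simple base-composition statistic. A minimal Python sketch of that calculation follows; the function name and the example sequences are hypothetical, and the sketch assumes that ambiguous bases such as N are excluded from the denominator, which the abstract does not specify.

```python
def gc_fraction(sequence):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C.

    Case-insensitive. Characters other than A, C, G, T (for example N)
    are ignored; this handling of ambiguity codes is an assumption.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for base in counted if base in "GC")
    return gc_count / len(counted)


# Toy sequence for illustration only, not the viral genome.
print(f"{gc_fraction('ATGCatgcNN') * 100:.1f}% G/C")
```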

  15. Metagenomic and satellite analyses of red snow in the Russian Arctic.

    PubMed

    Hisakawa, Nao; Quistad, Steven D; Hester, Eric R; Martynova, Daria; Maughan, Heather; Sala, Enric; Gavrilo, Maria V; Rohwer, Forest

    2015-01-01

    Cryophilic algae thrive in liquid water within snow and ice in alpine and polar regions worldwide. Blooms of these algae lower albedo (reflection of sunlight), thereby altering melting patterns (Kohshima, Seko & Yoshimura, 1993; Lutz et al., 2014; Thomas & Duval, 1995). Here metagenomic DNA analysis and satellite imaging were used to investigate red snow in Franz Josef Land in the Russian Arctic. Franz Josef Land red snow metagenomes confirmed that the communities are composed of the autotroph Chlamydomonas nivalis that is supporting a complex viral and heterotrophic bacterial community. Comparisons with white snow communities from other sites suggest that white snow and ice are initially colonized by fungal-dominated communities and then succeeded by the more complex C. nivalis-heterotroph red snow. Satellite image analysis showed that red snow covers up to 80% of the surface of snow and ice fields in Franz Josef Land and globally. Together these results show that C. nivalis supports a local food web that is on the rise as temperatures warm, with potential widespread impacts on alpine and polar environments worldwide.

  16. MetaComp: comprehensive analysis software for comparative meta-omics including comparative metagenomics.

    PubMed

    Zhai, Peng; Yang, Longshu; Guo, Xiao; Wang, Zhe; Guo, Jiangtao; Wang, Xiaoqi; Zhu, Huaiqiu

    2017-10-02

    During the past decade, the development of high-throughput nucleic acid sequencing and mass spectrometry analysis techniques has enabled the characterization of microbial communities through metagenomics, metatranscriptomics, metaproteomics and metabolomics data. To reveal the diversity of microbial communities and interactions between living conditions and microbes, it is necessary to introduce comparative analysis based upon integration of all four types of data mentioned above. Comparative meta-omics, especially comparative metagenomics, has been established as a routine process to highlight the significant differences in taxon composition and functional gene abundance among microbiota samples. Meanwhile, biologists are increasingly concerned about the correlations between meta-omics features and environmental factors, which may further decipher the adaptation strategy of a microbial community. We developed a graphical comprehensive analysis software named MetaComp comprising a series of statistical analysis approaches with visualized results for metagenomics and other meta-omics data comparison. This software is capable of reading files generated by a variety of upstream programs. After data loading, analyses such as multivariate statistics, hypothesis testing of two-sample, multi-sample as well as two-group sample and a novel function-regression analysis of environmental factors are offered. Here, regression analysis regards meta-omic features as independent variables and environmental factors as dependent variables. Moreover, MetaComp is capable of automatically choosing an appropriate two-group sample test based upon the traits of input abundance profiles. We further evaluate the performance of its choice, and exhibit applications for metagenomics, metaproteomics and metabolomics samples. 
MetaComp, an integrative software applicable to all meta-omics data, distills the influence of the living environment on the microbial community by regression analysis. Moreover, since the automatically chosen two-group sample test was verified to perform well, MetaComp is friendly to users without adequate statistical training. These improvements aim to overcome the new challenges of the big data era for all meta-omics data. MetaComp is available at: http://cqb.pku.edu.cn/ZhuLab/MetaComp/ and https://github.com/pzhaipku/MetaComp/ .

  17. Microbial contributions to coupled arsenic and sulfur cycling in the acid-sulfide hot spring Champagne Pool, New Zealand.

    PubMed

    Hug, Katrin; Maher, William A; Stott, Matthew B; Krikowa, Frank; Foster, Simon; Moreau, John W

    2014-01-01

    Acid-sulfide hot springs are analogs of early Earth geothermal systems where microbial metal(loid) resistance likely first evolved. Arsenic is a metalloid enriched in the acid-sulfide hot spring Champagne Pool (Waiotapu, New Zealand). Arsenic speciation in Champagne Pool follows reaction paths not yet fully understood with respect to biotic contributions and coupling to biogeochemical sulfur cycling. Here we present quantitative arsenic speciation from Champagne Pool, finding arsenite dominant in the pool, rim and outflow channel (55-75% total arsenic), and dithio- and trithioarsenates ubiquitously present as 18-25% total arsenic. In the outflow channel, dimethylmonothioarsenate comprised ≤9% total arsenic, while on the outflow terrace thioarsenates were present at 55% total arsenic. We also quantified sulfide, thiosulfate, sulfate and elemental sulfur, finding sulfide and sulfate as major species in the pool and outflow terrace, respectively. Elemental sulfur concentration reached a maximum at the terrace. Phylogenetic analysis of 16S rRNA genes from metagenomic sequencing revealed the dominance of Sulfurihydrogenibium at all sites and an increased archaeal population at the rim and outflow channel. Several phylotypes were found closely related to known sulfur- and sulfide-oxidizers, as well as sulfur- and sulfate-reducers. Bioinformatic analysis revealed genes underpinning sulfur redox transformations, consistent with sulfur speciation data, and illustrating a microbial role in sulfur-dependent transformation of arsenite to thioarsenate. Metagenomic analysis also revealed genes encoding for arsenate reductase at all sites, reflecting the ubiquity of thioarsenate and a need for microbial arsenate resistance despite anoxic conditions. Absence of the arsenite oxidase gene, aio, at all sites suggests prioritization of arsenite detoxification over coupling to energy conservation. 
Finally, detection of methyl arsenic in the outflow channel, in conjunction with increased sequences from Aquificaceae, supports a role for methyltransferase in thermophilic arsenic resistance. Our study highlights microbial contributions to coupled arsenic and sulfur cycling at Champagne Pool, with implications for understanding the evolution of microbial arsenic resistance in sulfidic geothermal systems.

  18. A Delphi Technology Foresight Study: Mapping Social Construction of Scientific Evidence on Metagenomics Tests for Water Safety

    PubMed Central

    Birko, Stanislav; Dove, Edward S.; Özdemir, Vural

    2015-01-01

    Access to clean water is a grand challenge in the 21st century. Water safety testing for pathogens currently depends on surrogate measures such as fecal indicator bacteria (e.g., E. coli). Metagenomics concerns high-throughput, culture-independent, unbiased shotgun sequencing of DNA from environmental samples that might transform water safety by detecting waterborne pathogens directly instead of their surrogates. Yet emerging innovations such as metagenomics are often fiercely contested. Innovations are subject to shaping/construction not only by technology but also social systems/values in which they are embedded, such as experts’ attitudes towards new scientific evidence. We conducted a classic three-round Delphi survey, comprised of 107 questions. A multidisciplinary expert panel (n = 24) representing the continuum of discovery scientists and policymakers evaluated the emergence of metagenomics tests. To the best of our knowledge, we report here the first Delphi foresight study of experts’ attitudes on (1) the top 10 priority evidentiary criteria for adoption of metagenomics tests for water safety, (2) the specific issues critical to governance of metagenomics innovation trajectory where there is consensus or dissensus among experts, (3) the anticipated time lapse from discovery to practice of metagenomics tests, and (4) the role and timing of public engagement in development of metagenomics tests. The ability of a test to distinguish between harmful and benign waterborne organisms, analytical/clinical sensitivity, and reproducibility were the top three evidentiary criteria for adoption of metagenomics. Experts agree that metagenomic testing will provide novel information but there is dissensus on whether metagenomics will replace the current water safety testing methods or impact the public health end points (e.g., reduction in boil water advisories). 
Interestingly, experts view the publics relevant in a “downstream capacity” for adoption of metagenomics rather than a co-productionist role at the “upstream” scientific design stage of metagenomics tests. In summary, these findings offer strategic foresight to govern metagenomics innovations symmetrically: by identifying areas where acceleration (e.g., consensus areas) and deceleration/reconsideration (e.g., dissensus areas) of the innovation trajectory might be warranted. Additionally, we show how scientific evidence is subject to potential social construction by experts’ value systems and the need for greater upstream public engagement on metagenomics innovations. PMID:26066837

  19. Mining for hemicellulases in the fungus-growing termite Pseudacanthotermes militaris using functional metagenomics.

    PubMed

    Bastien, Géraldine; Arnal, Grégory; Bozonnet, Sophie; Laguerre, Sandrine; Ferreira, Fernando; Fauré, Régis; Henrissat, Bernard; Lefèvre, Fabrice; Robe, Patrick; Bouchez, Olivier; Noirot, Céline; Dumon, Claire; O'Donohue, Michael

    2013-05-14

    The metagenomic analysis of gut microbiomes has emerged as a powerful strategy for the identification of biomass-degrading enzymes, which will no doubt be useful for the development of advanced biorefining processes. In the present study, we have performed a functional metagenomic analysis on comb and gut microbiomes associated with the fungus-growing termite, Pseudacanthotermes militaris. Using whole termite abdomens and fungal-comb material respectively, two fosmid-based metagenomic libraries were created and screened for the presence of xylan-degrading enzymes. This revealed 101 positive clones, corresponding to an extremely high global hit rate of 0.49%. Many clones displayed either β-D-xylosidase (EC 3.2.1.37) or α-L-arabinofuranosidase (EC 3.2.1.55) activity, while others displayed the ability to degrade AZCL-xylan or AZCL-β-(1,3)-β-(1,4)-glucan. Using secondary screening it was possible to pinpoint clones of interest that were used to prepare fosmid DNA. Sequencing of fosmid DNA generated 1.46 Mbp of sequence data, and bioinformatics analysis revealed 63 sequences encoding putative carbohydrate-active enzymes, with many of these forming parts of sequence clusters, probably having carbohydrate degradation and metabolic functions. Taxonomic assignment of the different sequences revealed that Firmicutes and Bacteroidetes were predominant phyla in the gut sample, while microbial diversity in the comb sample resembled that of typical soil samples. Cloning and expression in E. coli of six enzyme candidates identified in the libraries provided access to individual enzyme activities, which all proved to be coherent with the primary and secondary functional screens. This study shows that the gut microbiome of P. militaris possesses the potential to degrade biomass components, such as arabinoxylans and arabinans.
Moreover, the data presented suggests that prokaryotic microorganisms present in the comb could also play a part in the degradation of biomass within the termite mound, although further investigation will be needed to clarify the complex synergies that might exist between the different microbiomes that constitute the termitosphere of fungus-growing termites. This study exemplifies the power of functional metagenomics for the discovery of biomass-active enzymes and has provided a collection of potentially interesting biocatalysts for further study.
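
The abstract reports 101 positive clones and a global hit rate of 0.49% but does not state how many clones were screened. The short calculation below back-computes the implied library size, assuming the hit rate is simply positives divided by clones screened and was rounded to two decimals; the function name is illustrative and the result is an estimate, not a figure from the study.

```python
def implied_library_size(positives, hit_rate_percent):
    """Number of screened clones implied by a count of positives and a hit rate in percent."""
    if hit_rate_percent <= 0:
        raise ValueError("hit rate must be positive")
    return positives / (hit_rate_percent / 100.0)


positive_clones = 101            # from the abstract
reported_hit_rate_percent = 0.49  # from the abstract

central = implied_library_size(positive_clones, reported_hit_rate_percent)
# A value rounded to 0.49% could have been anywhere between 0.485% and 0.495%.
low = implied_library_size(positive_clones, 0.495)
high = implied_library_size(positive_clones, 0.485)

print(f"about {central:,.0f} clones screened (between {low:,.0f} and {high:,.0f})")
```

Under these assumptions the two libraries together would amount to roughly 20,600 screened clones.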

  20. Bacterial diversity of the American sand fly Lutzomyia intermedia using high-throughput metagenomic sequencing.

    PubMed

    Monteiro, Carolina Cunha; Villegas, Luis Eduardo Martinez; Campolina, Thais Bonifácio; Pires, Ana Clara Machado Araújo; Miranda, Jose Carlos; Pimenta, Paulo Filemon Paolucci; Secundino, Nagila Francinete Costa

    2016-08-31

    Parasites of the genus Leishmania cause a broad spectrum of diseases, collectively known as leishmaniasis, in humans worldwide. American cutaneous leishmaniasis is a neglected disease transmitted by sand fly vectors including Lutzomyia intermedia, a proven vector. The female sand fly can acquire or deliver Leishmania spp. parasites while feeding on a blood meal, which is required for nutrition, egg development and survival. The microbiota composition and abundance vary by food source, life stage and physiological condition. The sand fly microbiota can affect the parasite life cycle in the vector. We performed a metagenomic analysis of microbiota composition and abundance in Lu. intermedia from an endemic area in Brazil. The adult insects were collected using CDC light traps, morphologically identified, carefully sterilized and dissected under a microscope, and the females were separated into groups according to their physiological condition: (i) absence of blood meal (unfed = UN); (ii) presence of blood meal (blood-fed = BF); and (iii) presence of developed ovaries (gravid = GR). They were then processed for metagenomics with Illumina HiSeq sequencing to obtain the taxonomic profiles of the microbiota. Bacterial metagenomic analysis revealed differences in microbiota composition based upon the distinct physiological stages of the adult insect. Sequence identification revealed two phyla (Proteobacteria and Actinobacteria), 11 families and 15 genera; 87% of the bacteria were Gram-negative, while only one family and two genera were identified as Gram-positive. The genera Ochrobactrum, Bradyrhizobium and Pseudomonas were found across all of the groups. The metagenomic analysis revealed that the microbiota of the Lu. intermedia female sand flies are distinct under specific physiological conditions and consist of 15 bacterial genera. Ochrobactrum, Bradyrhizobium and Pseudomonas were the common genera.
Our results detailing the constituents of Lu. intermedia native microbiota contribute to the knowledge regarding the bacterial community in an important sand fly vector and allow for further studies to better understand how the microbiota interacts with vectors of human parasites and to develop tools for biological control.

  1. Quasi-metagenomics and realtime sequencing aided detection and subtyping of Salmonella enterica from food samples.

    PubMed

    Hyeon, Ji-Yeon; Li, Shaoting; Mann, David A; Zhang, Shaokang; Li, Zhen; Chen, Yi; Deng, Xiangyu

    2017-12-01

    Metagenomics analysis of food samples promises isolation-independent detection and subtyping of foodborne bacterial pathogens in a single workflow. Selective concentration of Salmonella genomic DNA through immunomagnetic separation (IMS) and multiple displacement amplification (MDA) was shown to shorten culture enrichment of Salmonella-spiked raw chicken breast samples by over 12 hours while permitting serotyping and high-fidelity single nucleotide polymorphism (SNP) typing of the pathogen using short shotgun sequencing reads. The herein termed quasi-metagenomics approach was evaluated on Salmonella-spiked lettuce and black peppercorn samples as well as retail chicken parts naturally contaminated with different serotypes of Salmonella. Between 8 and 24 h of culture enrichment was required for detecting and subtyping naturally occurring Salmonella from unspiked chicken parts, compared with 4 to 12 h of culture enrichment when Salmonella-spiked food samples were analyzed, indicating the likely need for longer culture enrichment to revive low levels of stressed or injured Salmonella cells in food. Further acceleration of the workflow was achieved by real-time nanopore sequencing. After 1.5 hours of analysis on a portable sequencer, sufficient data were generated from sequencing the IMS-MDA product of a culture-enriched lettuce sample to allow serotyping and robust phylogenetic placement of the inoculated isolate. Importance: Both culture enrichment and next-generation sequencing remain time-consuming processes for food testing, whereas rapid methods for pathogen detection are widely available. Our study demonstrated substantial acceleration of the respective processes through IMS-MDA and real-time nanopore sequencing. In one example, the combined use of the two methods delivered a less than 24 h turnaround time from a Salmonella-contaminated lettuce sample to phylogenetic identification of the pathogen.
Improved efficiency like this is important for further expanding the use of whole genome and metagenomics sequencing in microbial analysis of food. Our results suggest the potential of the quasi-metagenomics approach in areas where rapid detection and subtyping of foodborne pathogens is important, such as foodborne outbreak response and precision tracking and monitoring of foodborne pathogens in production environments and supply chains. Copyright © 2017 American Society for Microbiology.

  2. Phylogenetic analysis of a spontaneous cocoa bean fermentation metagenome reveals new insights into its bacterial and fungal community diversity.

    PubMed

    Illeghems, Koen; De Vuyst, Luc; Papalexandratou, Zoi; Weckx, Stefan

    2012-01-01

    This is the first report on the phylogenetic analysis of the community diversity of a single spontaneous cocoa bean box fermentation sample through a metagenomic approach involving 454 pyrosequencing. Several sequence-based and composition-based taxonomic profiling tools were used and evaluated to avoid software-dependent results and their outcome was validated by comparison with previously obtained culture-dependent and culture-independent data. Overall, this approach revealed a wider bacterial (mainly γ-Proteobacteria) and fungal diversity than previously found. Further, the use of a combination of different classification methods, in a software-independent way, helped to understand the actual composition of the microbial ecosystem under study. In addition, bacteriophage-related sequences were found. The bacterial diversity depended partially on the methods used, as composition-based methods predicted a wider diversity than sequence-based methods, and as classification methods based solely on phylogenetic marker genes predicted a more restricted diversity compared with methods that took all reads into account. The metagenomic sequencing analysis identified Hanseniaspora uvarum, Hanseniaspora opuntiae, Saccharomyces cerevisiae, Lactobacillus fermentum, and Acetobacter pasteurianus as the prevailing species. Also, the presence of occasional members of the cocoa bean fermentation process was revealed (such as Erwinia tasmaniensis, Lactobacillus brevis, Lactobacillus casei, Lactobacillus rhamnosus, Lactococcus lactis, Leuconostoc mesenteroides, and Oenococcus oeni). Furthermore, the sequence reads associated with viral communities were of a restricted diversity, dominated by Myoviridae and Siphoviridae, and reflecting Lactobacillus as the dominant host. To conclude, an accurate overview of all members of a cocoa bean fermentation process sample was revealed, indicating the superiority of metagenomic sequencing over previously used techniques.

  3. Metagenomic analysis of bacterial community structure and diversity of lignocellulolytic bacteria in Vietnamese native goat rumen.

    PubMed

    Do, Thi Huyen; Dao, Trong Khoa; Nguyen, Khanh Hoang Viet; Le, Ngoc Giang; Nguyen, Thi Mai Phuong; Le, Tung Lam; Phung, Thu Nguyet; van Straalen, Nico M; Roelofs, Dick; Truong, Nam Hai

    2018-05-01

    In a previous study, analysis of Illumina-sequenced metagenomic DNA data of bacteria in the rumen of Vietnamese goats showed a high diversity of putative lignocellulolytic genes. In this study, taxonomic prediction of the microbial community and of the lignocellulolytic bacterial population in the rumen was conducted to elucidate the role of bacterial community structure in the effective degradation of plant materials. The metagenomic data had been searched with the Basic Local Alignment Search Tool (BLASTX) algorithm against the National Center for Biotechnology Information non-redundant sequence database. Here the BLASTX hits were further processed by the Metagenome Analyzer program to statistically analyze the abundance of taxa. The microbial community in the rumen is defined by the dominance of Bacteroidetes over Firmicutes. The ratio of Firmicutes to Bacteroidetes was 0.36:1. An abundance of Synergistetes was uniquely identified in the goat microbiome and may be shaped by host genotype. With regard to bacterial lignocellulose degraders, the ratio of lignocellulolytic genes affiliated with Firmicutes to those linked to Bacteroidetes was 0.11:1, and the genes encoding putative hemicellulases, carbohydrate esterases and polysaccharide lyases originating from Bacteroidetes were 14 to 20 times more abundant than those from Firmicutes. Firmicutes seem to possess more cellulose hydrolysis capacity, showing a Firmicutes/Bacteroidetes ratio of 0.35:1. Analysis of potential lignocellulose degraders shows that four species belonged to the phylum Bacteroidetes, while two species belonged to the phylum Firmicutes, harbouring at least 12 different catalytic domains for lignocellulose pretreatment and for cellulose and hemicellulose saccharification. Based on these findings, we speculate that increasing the members of Bacteroidetes to keep a low ratio of Firmicutes to Bacteroidetes in the goat rumen has most likely resulted in increased lignocellulose digestion.
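
The "x:1" ratios quoted in this abstract are plain quotients of two counts. The sketch below shows the arithmetic; the read counts are hypothetical placeholders (the abstract gives only the ratios), and the helper names are illustrative.

```python
def ratio_to_one(numerator_count, denominator_count):
    """Express numerator:denominator as a value relative to 1."""
    if denominator_count <= 0:
        raise ValueError("denominator count must be positive")
    return numerator_count / denominator_count


def fold_excess(ratio):
    """How many times larger the denominator group is, given a numerator:denominator ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / ratio


# Hypothetical read counts chosen so the quotient matches the reported 0.36:1.
firmicutes_reads = 360
bacteroidetes_reads = 1000
fb_ratio = ratio_to_one(firmicutes_reads, bacteroidetes_reads)
fb_label = f"{fb_ratio:.2f}:1"

# The reported 0.11:1 ratio for lignocellulolytic genes, read the other way round.
gene_fold = fold_excess(0.11)

print(fb_label)
print(f"Bacteroidetes-affiliated genes are about {gene_fold:.1f} times as numerous")
```

Read this way, the overall 0.11:1 gene ratio corresponds to roughly nine times as many Bacteroidetes-affiliated lignocellulolytic genes, with the 14- to 20-fold figures in the abstract applying to specific enzyme classes.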

  4. BeerDeCoded: the open beer metagenome project.

    PubMed

    Sobel, Jonathan; Henry, Luc; Rotman, Nicolas; Rando, Gianpaolo

    2017-01-01

    Next generation sequencing has radically changed research in the life sciences, in both academic and corporate laboratories. The potential impact is tremendous, yet a majority of citizens have little or no understanding of the technological and ethical aspects of this widespread adoption. We designed BeerDeCoded as a pretext to discuss the societal issues related to genomic and metagenomic data with fellow citizens, while advancing scientific knowledge of the most popular beverage of all. In the spirit of citizen science, sample collection and DNA extraction were carried out with the participation of non-scientists in the community laboratory of Hackuarium, a not-for-profit organisation that supports unconventional research and promotes the public understanding of science. The dataset presented herein contains the targeted metagenomic profile of 39 bottled beers from 5 countries, based on internal transcribed spacer (ITS) sequencing of fungal species. A preliminary analysis reveals the presence of a large diversity of wild yeast species in commercial brews. With this project, we demonstrate that coupling simple laboratory procedures that can be carried out in a non-professional environment with state-of-the-art sequencing technologies and targeted metagenomic analyses, can lead to the detection and identification of the microbial content in bottled beer.

  5. SUPER-FOCUS: a tool for agile functional analysis of shotgun metagenomic data

    PubMed Central

    Green, Kevin T.; Dutilh, Bas E.; Edwards, Robert A.

    2016-01-01

    Summary: Analyzing the functional profile of a microbial community from unannotated shotgun sequencing reads is one of the important goals in metagenomics. Functional profiling has valuable applications in biological research because it identifies the abundances of the functional genes of the organisms present in the original sample, answering the question what they can do. Currently, available tools do not scale well with increasing data volumes, which is important because both the number and lengths of the reads produced by sequencing platforms keep increasing. Here, we introduce SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with over 70 real metagenomes, the results showing that it accurately predicts the subsystems present in the profiled microbial communities, and is up to 1000 times faster than other tools. Availability and implementation: SUPER-FOCUS was implemented in Python, and its source code and the tool website are freely available at https://edwards.sdsu.edu/SUPERFOCUS. Contact: redwards@mail.sdsu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26454280

  6. Single sample resolution of rare microbial dark matter in a marine invertebrate metagenome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miller, Ian J.; Weyna, Theodore R.; Fong, Stephen S.

    Direct, untargeted sequencing of environmental samples (metagenomics) and de novo genome assembly enable the study of uncultured and phylogenetically divergent organisms. However, separating individual genomes from a mixed community has often relied on the differential-coverage analysis of multiple, deeply sequenced samples. In the metagenomic investigation of the marine bryozoan Bugula neritina, we uncovered seven bacterial genomes associated with a single B. neritina individual that appeared to be transient associates, two of which were unique to one individual and undetectable using certain “universal” 16S rRNA primers and probes. We recovered high quality genome assemblies for several rare instances of “microbial dark matter,” or phylogenetically divergent bacteria lacking genomes in reference databases, from a single tissue sample that was not subjected to any physical or chemical pre-treatment. One of these rare, divergent organisms has a small (593 kbp), poorly annotated genome with low GC content (20.9%) and a 16S rRNA gene with just 65% sequence similarity to the closest reference sequence. Lastly, our findings illustrate the importance of sampling strategy and de novo assembly of metagenomic reads to understand the extent and function of bacterial biodiversity.

  7. MPD: a pathogen genome and metagenome database

    PubMed Central

    Zhang, Tingting; Miao, Jiaojiao; Han, Na; Qiang, Yujun; Zhang, Wen

    2018-01-01

    Abstract Advances in high-throughput sequencing have led to unprecedented growth in the amount of available genome sequencing data, especially for bacterial genomes, which has been accompanied by a challenge for the storage and management of such huge datasets. To facilitate bacterial research and related studies, we have developed the Mypathogen database (MPD), which provides access to users for searching, downloading, storing and sharing bacterial genomics data. The MPD represents the first pathogenic database for microbial genomes and metagenomes, and currently covers pathogenic microbial genomes (6604 genera, 11 071 species, 41 906 strains) and metagenomic data from host, air, water and other sources (28 816 samples). The MPD also functions as a management system for statistical and storage data that can be used by different organizations, thereby facilitating data sharing among different organizations and research groups. A user-friendly local client tool is provided to maintain the steady transmission of big sequencing data. The MPD is a useful tool for analysis and management in genomic research, especially for clinical Centers for Disease Control and epidemiological studies, and is expected to contribute to advancing knowledge on pathogenic bacteria genomes and metagenomes. Database URL: http://data.mypathogen.org PMID:29917040

  8. BeerDeCoded: the open beer metagenome project

    PubMed Central

    Sobel, Jonathan; Henry, Luc; Rotman, Nicolas; Rando, Gianpaolo

    2017-01-01

    Next generation sequencing has radically changed research in the life sciences, in both academic and corporate laboratories. The potential impact is tremendous, yet a majority of citizens have little or no understanding of the technological and ethical aspects of this widespread adoption. We designed BeerDeCoded as a pretext to discuss the societal issues related to genomic and metagenomic data with fellow citizens, while advancing scientific knowledge of the most popular beverage of all. In the spirit of citizen science, sample collection and DNA extraction were carried out with the participation of non-scientists in the community laboratory of Hackuarium, a not-for-profit organisation that supports unconventional research and promotes the public understanding of science. The dataset presented herein contains the targeted metagenomic profile of 39 bottled beers from 5 countries, based on internal transcribed spacer (ITS) sequencing of fungal species. A preliminary analysis reveals the presence of a large diversity of wild yeast species in commercial brews. With this project, we demonstrate that coupling simple laboratory procedures that can be carried out in a non-professional environment with state-of-the-art sequencing technologies and targeted metagenomic analyses, can lead to the detection and identification of the microbial content in bottled beer. PMID:29123645

  9. Single sample resolution of rare microbial dark matter in a marine invertebrate metagenome

    DOE PAGES

    Miller, Ian J.; Weyna, Theodore R.; Fong, Stephen S.; ...

    2016-09-29

    Direct, untargeted sequencing of environmental samples (metagenomics) and de novo genome assembly enable the study of uncultured and phylogenetically divergent organisms. However, separating individual genomes from a mixed community has often relied on the differential-coverage analysis of multiple, deeply sequenced samples. In the metagenomic investigation of the marine bryozoan Bugula neritina, we uncovered seven bacterial genomes associated with a single B. neritina individual that appeared to be transient associates, two of which were unique to one individual and undetectable using certain “universal” 16S rRNA primers and probes. We recovered high quality genome assemblies for several rare instances of “microbial dark matter,” or phylogenetically divergent bacteria lacking genomes in reference databases, from a single tissue sample that was not subjected to any physical or chemical pre-treatment. One of these rare, divergent organisms has a small (593 kbp), poorly annotated genome with low GC content (20.9%) and a 16S rRNA gene with just 65% sequence similarity to the closest reference sequence. Lastly, our findings illustrate the importance of sampling strategy and de novo assembly of metagenomic reads to understand the extent and function of bacterial biodiversity.
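
GC content, quoted in this abstract as 20.9% for one divergent genome, is the share of G and C bases among the called bases of an assembly. A minimal sketch of the calculation follows; the function is illustrative, the toy contig is made up, and skipping ambiguous bases such as N is an assumption rather than something the abstract specifies.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among the unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Hypothetical AT-rich toy contig (20 bases, 4 of them G or C).
toy_contig = "ATTATAGCATTAATTATACG"
toy_gc = gc_content(toy_contig)
print(f"GC content: {toy_gc:.1f}%")
```

For a real assembly the same count would be accumulated over all contigs in the FASTA file before dividing.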

  10. SUPER-FOCUS: a tool for agile functional analysis of shotgun metagenomic data.

    PubMed

    Silva, Genivaldo Gueiros Z; Green, Kevin T; Dutilh, Bas E; Edwards, Robert A

    2016-02-01

    Analyzing the functional profile of a microbial community from unannotated shotgun sequencing reads is one of the important goals in metagenomics. Functional profiling has valuable applications in biological research because it identifies the abundances of the functional genes of the organisms present in the original sample, answering the question what they can do. Currently, available tools do not scale well with increasing data volumes, which is important because both the number and lengths of the reads produced by sequencing platforms keep increasing. Here, we introduce SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with over 70 real metagenomes, the results showing that it accurately predicts the subsystems present in the profiled microbial communities, and is up to 1000 times faster than other tools. SUPER-FOCUS was implemented in Python, and its source code and the tool website are freely available at https://edwards.sdsu.edu/SUPERFOCUS. Contact: redwards@mail.sdsu.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  11. Taxator-tk: precise taxonomic assignment of metagenomes by fast approximation of evolutionary neighborhoods

    PubMed Central

    Dröge, J.; Gregor, I.; McHardy, A. C.

    2015-01-01

    Motivation: Metagenomics characterizes microbial communities by random shotgun sequencing of DNA isolated directly from an environment of interest. An essential step in computational metagenome analysis is taxonomic sequence assignment, which allows identifying the sequenced community members and reconstructing taxonomic bins with sequence data for the individual taxa. For the massive datasets generated by next-generation sequencing technologies, this cannot be performed with de-novo phylogenetic inference methods. We describe an algorithm and the accompanying software, taxator-tk, which performs taxonomic sequence assignment by fast approximate determination of evolutionary neighbors from sequence similarities. Results: Taxator-tk was precise in its taxonomic assignment across all ranks and taxa for a range of evolutionary distances and for short as well as for long sequences. In addition to the taxonomic binning of metagenomes, it is well suited for profiling microbial communities from metagenome samples because it identifies bacterial, archaeal and eukaryotic community members without being affected by varying primer binding strengths, as in marker gene amplification, or copy number variations of marker genes across different taxa. Taxator-tk has an efficient, parallelized implementation that allows the assignment of 6 Gb of sequence data per day on a standard multiprocessor system with 10 CPU cores and microbial RefSeq as the genomic reference data. Availability and implementation: Taxator-tk source and binary program files are publicly available at http://algbio.cs.uni-duesseldorf.de/software/. Contact: Alice.McHardy@uni-duesseldorf.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25388150

  12. Insights into resistome and stress responses genes in Bubalus bubalis rumen through metagenomic analysis.

    PubMed

    Reddy, Bhaskar; Singh, Krishna M; Patel, Amrutlal K; Antony, Ancy; Panchasara, Harshad J; Joshi, Chaitanya G

    2014-10-01

    Buffalo rumen microbiota experience a variety of diets and represent a huge reservoir of mobilome, resistome and stress responses. However, knowledge of metagenomic responses to such conditions is still rudimentary. We analyzed the metagenomes of the liquid and solid phases of rumen biomaterial from river buffalo adapted to varying proportions of concentrate to green or dry roughage, using high-throughput sequencing to determine the occurrence of antibiotic resistance genes and genetic exchange between bacterial populations and environmental reservoirs. A total of 3914.94 MB of data were generated from all three treatment groups. The data were analysed with Metagenome rapid annotation system tools. At the phylum level, Bacteroidetes were dominant in all the treatments, followed by Firmicutes. Genes coding for functional responses to stress (oxidative stress and heat shock proteins) and resistome genes (resistance to antibiotics and toxic compounds, phages, transposable elements and pathogenicity islands) were prevalent in similar proportions in the liquid and solid fractions of the rumen metagenomes. The fluoroquinolone resistance, MDR efflux pump and methicillin resistance genes were broadly distributed across 11, 9, and 14 bacterial classes, respectively. Bacteria responsible for phage replication, prophages, phage packaging and rlt-like streptococcal phage genes were mostly assigned to the phyla Bacteroides, Firmicutes and Proteobacteria. Also, more reads matching the sigma B genes were identified in the buffalo rumen. This study underscores the presence of diverse mechanisms of adaptation to different diets, antibiotics and other stresses in the buffalo rumen, reflecting the proportional representation of major bacterial groups.

  13. ReprDB and panDB: minimalist databases with maximal microbial representation.

    PubMed

    Zhou, Wei; Gay, Nicole; Oh, Julia

    2018-01-18

    Profiling of shotgun metagenomic samples is hindered by a lack of unified microbial reference genome databases that (i) assemble genomic information from all open access microbial genomes, (ii) have relatively small sizes, and (iii) are compatible to various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes. We report pipelines that efficiently traverse all open access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small sizes: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions in multiple sequenced strains. With the databases, we managed to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, demonstrating a significant improvement over a previous database-guided analysis on the same datasets. reprDB and panDB leverage the rapid increases in the number of open access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflated storage or memory space and indexing or analyzing time. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.

  14. Genetic and functional analysis of the bovine uterine microbiota. Part II: Purulent vaginal discharge versus healthy cows.

    PubMed

    Bicalho, M L S; Lima, S; Higgins, C H; Machado, V S; Lima, F S; Bicalho, R C

    2017-05-01

    The aim of this study was to characterize, using metagenomic shotgun DNA sequencing, the intrauterine microbial population and its predicted functional diversity within healthy cows and cows presenting purulent vaginal discharge (PVD). Twenty Holstein dairy cows from a single farm were enrolled in the study at 25 to 35 d postpartum. Purulent vaginal discharge was diagnosed by retrieving and scoring vaginal discharge using the Metricheck device (Simcro, Hamilton, New Zealand). Intrauterine samples for metagenomic analysis were collected by the cytobrush technique from 8 cows diagnosed with PVD and 12 healthy cows. Paired-end sequencing was performed using the Illumina MiSeq platform (Illumina Inc., San Diego, CA). Metagenomic sequences were analyzed using the MG-RAST server (metagenomic rapid annotations using subsystems technology; http://metagenomics.anl.gov/), and the STAMP software (http://kiwi.cs.dal.ca/Software/STAMP) was used to study statistically significant differential abundance of taxonomic and functional features between the 2 metagenomes. Additionally, the total number of bacterial 16S rDNA copies was estimated by real-time PCR. Taxonomic analysis revealed that Bacteroidetes was the most abundant phylum in the uterine microbiota from cows with PVD, and Fusobacteria was almost completely absent in the healthy uterine microbiota. Moreover, species belonging to the genus Trueperella were present only in the uterine microbiota of PVD cows. The increased abundance of Fusobacteria and the unique presence of Trueperella in the PVD cows highlight the important role of these bacteria in the pathogenesis of PVD. Genes encoding cytolethal distending toxin were exclusive to the microbiota of PVD cows. Similarly, genes associated with lipid A modification were present only in samples from PVD cows; such modification is associated with greater resistance to cationic antimicrobial peptides. Conversely, genes encoding bacteriocins and ribosomally synthesized antibacterial peptides were exclusively found in the healthy uterine microbiota and were dominated by tolerance to colicin E2. No difference was observed in total bacterial load between the 2 microbiotas. This study provides deep insight into the uterine microbial community in health and disease. The observations that the healthy microbiota is tolerant to colicin E2, whereas the uterine microbiota of PVD cows produces cytolethal distending toxins and modifies its lipopolysaccharides, suggest that species-intrinsic factors may be more relevant than bacterial abundance to the development of disease or maintenance of health in the dairy cow postpartum uterus. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  15. Evaluation method for the potential functionome harbored in the genome and metagenome

    PubMed Central

    2012-01-01

    Background One of the main goals of genomic analysis is to elucidate the comprehensive functions (functionome) in individual organisms or a whole community in various environments. However, a standard evaluation method for discerning the functional potentials harbored within the genome or metagenome has not yet been established. We have developed a new evaluation method for the potential functionome, based on the completion ratio of Kyoto Encyclopedia of Genes and Genomes (KEGG) functional modules. Results Distribution of the completion ratio of the KEGG functional modules in 768 prokaryotic species varied greatly with the kind of module, and all modules primarily fell into 4 patterns (universal, restricted, diversified and non-prokaryotic modules), indicating the universal and unique nature of each module, and also the versatility of the KEGG Orthology (KO) identifiers mapped to each one. The module completion ratio in 8 phenotypically different bacilli revealed that some modules were shared only in phenotypically similar species. Metagenomes of human gut microbiomes from 13 healthy individuals previously determined by the Sanger method were analyzed based on the module completion ratio. Results led to new discoveries in the nutritional preferences of gut microbes, believed to be one of the mutualistic representations of gut microbiomes to avoid nutritional competition with the host. Conclusions The method developed in this study could characterize the functionome harbored in genomes and metagenomes. As this method also provided taxonomical information from KEGG modules as well as the gene hosts constructing the modules, interpretation of completion profiles was simplified and we could identify the complementarity between biochemical functions in human hosts and the nutritional preferences in human gut microbiomes. Thus, our method has the potential to be a powerful tool for comparative functional analysis in genomics and metagenomics, able to target unknown environments containing various uncultivable microbes within unidentified phyla. PMID:23234305
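
    The module completion ratio described in this record can be illustrated with a minimal sketch. It assumes a simplified module definition, an ordered list of reaction steps where each step is a set of alternative KO identifiers; real KEGG module definitions are richer (complexes, optional components), and the function name and placeholder identifiers below are hypothetical, not taken from the paper.

```python
def completion_ratio(module_steps, observed_kos):
    """Fraction of module steps for which at least one alternative KO was observed.

    module_steps: list of sets, each set holding the alternative KO identifiers
                  that can fulfil one step (a simplifying assumption).
    observed_kos: set of KO identifiers annotated in a genome or metagenome.
    """
    if not module_steps:
        raise ValueError("module must contain at least one step")
    satisfied = sum(1 for alternatives in module_steps if alternatives & observed_kos)
    return satisfied / len(module_steps)


# Hypothetical four-step module and a hypothetical annotation set.
example_module = [{"KO_a1", "KO_a2"}, {"KO_b"}, {"KO_c1", "KO_c2"}, {"KO_d"}]
example_observed = {"KO_a2", "KO_c1", "KO_x"}
print(completion_ratio(example_module, example_observed))
```

    Counting steps rather than individual KOs means that alternative enzymes for the same reaction do not inflate the ratio, which is the property that makes such a ratio comparable across organisms.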

  16. Diversity and community composition of methanogenic archaea in the rumen of Scottish upland sheep assessed by different methods.

    PubMed

    Snelling, Timothy J; Genç, Buğra; McKain, Nest; Watson, Mick; Waters, Sinéad M; Creevey, Christopher J; Wallace, R John

    2014-01-01

    Ruminal archaeomes of two mature sheep grazing in the Scottish uplands were analysed by different sequencing and analysis methods in order to compare the apparent archaeal communities. All methods revealed that the majority of methanogens belonged to the Methanobacteriales order containing the Methanobrevibacter, Methanosphaera and Methanobacteria genera. Sanger sequenced 1.3 kb 16S rRNA gene amplicons identified the main species of Methanobrevibacter present to be a SGMT Clade member Mbb. millerae (≥ 91% of OTUs); Methanosphaera comprised the remainder of the OTUs. The primers did not amplify ruminal Thermoplasmatales-related 16S rRNA genes. Illumina sequenced V6-V8 16S rRNA gene amplicons identified similar Methanobrevibacter spp. and Methanosphaera clades and also identified the Thermoplasmatales-related order as 13% of total archaea. Unusually, both methods concluded that Mbb. ruminantium and relatives from the same clade (RO) were almost absent. Sequences mapping to rumen 16S rRNA and mcrA gene references were extracted from Illumina metagenome data. Mapping of the metagenome data to 16S rRNA gene references produced taxonomic identification to Order level including 2-3% Thermoplasmatales, but was unable to discriminate to species level. Mapping of the metagenome data to mcrA gene references resolved 69% to unclassified Methanobacteriales. Only 30% of sequences were assigned to species level clades: of the sequences assigned to Methanobrevibacter, most mapped to SGMT (16%) and RO (10%) clades. The Sanger 16S amplicon and Illumina metagenome mcrA analyses showed similar species richness (Chao1 Index 19-35), while Illumina metagenome and amplicon 16S rRNA analysis gave lower richness estimates (10-18). The values of the Shannon Index were low in all methods, indicating low richness and uneven species distribution. Thus, although much information may be extracted from the other methods, Illumina amplicon sequencing of the V6-V8 16S rRNA gene would be the method of choice for studying rumen archaeal communities.
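
    For readers unfamiliar with the two diversity measures quoted in this record, the sketch below computes them from a vector of OTU counts. It uses the standard Shannon index (natural logarithm) and the bias-corrected form of the Chao1 estimator; the counts are made up for illustration, and nothing here reproduces the authors' own pipeline.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    positive = [c for c in counts if c > 0]
    total = sum(positive)
    return -sum((c / total) * math.log(c / total) for c in positive)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical OTU counts: two dominant OTUs and several rare ones.
otu_counts = [10, 5, 1, 1, 2, 2, 1]
print(shannon_index(otu_counts), chao1(otu_counts))
```

    Chao1 rises with the number of singletons because rare OTUs hint at further unseen ones, whereas the Shannon index stays low whenever a few OTUs dominate the counts.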

  17. MetaSort untangles metagenome assembly by reducing microbial community complexity

    PubMed Central

    Ji, Peifeng; Zhang, Yanming; Wang, Jinfeng; Zhao, Fangqing

    2017-01-01

    Most current approaches to analyse metagenomic data rely on reference genomes. Novel microbial communities extend far beyond the coverage of reference databases and de novo metagenome assembly from complex microbial communities remains a great challenge. Here we present a novel experimental and bioinformatic framework, metaSort, for effective construction of bacterial genomes from metagenomic samples. MetaSort provides a sorted mini-metagenome approach based on flow cytometry and single-cell sequencing methodologies, and employs new computational algorithms to efficiently recover high-quality genomes from the sorted mini-metagenome by the complementary of the original metagenome. Through extensive evaluations, we demonstrated that metaSort has an excellent and unbiased performance on genome recovery and assembly. Furthermore, we applied metaSort to an unexplored microflora colonized on the surface of marine kelp and successfully recovered 75 high-quality genomes at one time. This approach will greatly improve access to microbial genomes from complex or novel communities. PMID:28112173

  18. BioPig: Developing Cloud Computing Applications for Next-Generation Sequence Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhatia, Karan; Wang, Zhong

    Next Generation sequencing is producing ever larger data sizes with a growth rate outpacing Moore's Law. The data deluge has made many of the current sequence analysis tools obsolete because they do not scale with data. Here we present BioPig, a collection of cloud computing tools to scale data analysis and management. Pig is a flexible data scripting language that uses Apache's Hadoop data structure and map reduce framework to process very large data files in parallel and combine the results. BioPig extends Pig with sequence analysis capability. We will show the performance of BioPig on a variety of bioinformatics tasks, including screening sequence contaminants, Illumina QA/QC, and gene discovery from metagenome data sets using the Rumen metagenome as an example.

  19. Metagenomic assessment of the interplay between the environment and the genetic diversification of Acinetobacter

    PubMed Central

    Touchon, Marie; Brisse, Sylvain; Rocha, Eduardo P.C.

    2017-01-01

    Summary Most bacteria have poorly characterized environmental reservoirs and unknown closely related species. This hampers the study of bacterial evolutionary ecology because both the environment and the genetic background of ancestral lineages are unknown. We combined metagenomics, comparative genomics and phylogenomics to overcome this limitation, to identify novel taxa and to propose environments where they can be isolated. We applied this method to characterize the ecological distribution of known and novel lineages of Acinetobacter spp. We observed two major environmental transitions at deep phylogenetic levels, splitting the genus into three ecologically differentiated clades. One of these has rapidly shifted towards host‐association by acquiring genes involved in bacteria‐eukaryote interactions. We show that environmental perturbations affect species distribution in predictable ways: bovines have very diverse communities of Acinetobacter, unless they were administered antibiotics, in which case they show highly uniform communities of Acinetobacter spp. that resemble those of humans. Our results uncover the diversity of bacterial lineages, overpassing the limitations of classical cultivation methods and highlight the role of the environment in shaping their evolution. PMID:28967182

  20. Ecology and evolution of viruses infecting uncultivated SUP05 bacteria as revealed by single-cell- and meta-genomics

    DOE PAGES

    Roux, Simon; Hawley, Alyse K.; Torres Beltran, Monica; ...

    2014-08-29

    Viruses modulate microbial communities and alter ecosystem functions. However, due to cultivation bottlenecks, specific virus–host interaction dynamics remain cryptic. In this study, we examined 127 single-cell amplified genomes (SAGs) from uncultivated SUP05 bacteria isolated from a model marine oxygen minimum zone (OMZ) to identify 69 viral contigs representing five new genera within dsDNA Caudovirales and ssDNA Microviridae. Infection frequencies suggest that ∼1/3 of SUP05 bacteria is viral-infected, with higher infection frequency where oxygen-deficiency was most severe. Observed Microviridae clonality suggests recovery of bloom-terminating viruses, while systematic co-infection between dsDNA and ssDNA viruses posits previously unrecognized cooperation modes. Analyses of 186 microbial and viral metagenomes revealed that SUP05 viruses persisted for years, but remained endemic to the OMZ. Finally, identification of virus-encoded dissimilatory sulfite reductase suggests SUP05 viruses reprogram their host's energy metabolism. Together, these results demonstrate closely coupled SUP05 virus–host co-evolutionary dynamics with the potential to modulate biogeochemical cycling in climate-critical and expanding OMZs.

  1. Centrifuge: rapid and sensitive classification of metagenomic sequences

    PubMed Central

    Song, Li; Breitwieser, Florian P.

    2016-01-01

    Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical high-throughput DNA sequencing run within a few minutes. Together, these advances enable timely and accurate analysis of large metagenomics data sets on conventional desktop computers. Because of its space-optimized indexing schemes, Centrifuge also makes it possible to index the entire NCBI nonredundant nucleotide sequence database (a total of 109 billion bases) with an index size of 69 GB, in contrast to k-mer-based indexing schemes, which require far more extensive space. PMID:27852649
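
    The BWT and FM-index named in this record can be illustrated with a toy sketch. It shows only the textbook construction and backward search on a tiny string; it is not Centrifuge's space-optimized index, and all function names are illustrative.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (suitable for tiny inputs only)."""
    text += "$"  # sentinel that sorts before A, C, G, T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def build_fm_index(bwt_str):
    """Return (C, Occ): C[c] counts characters smaller than c; Occ[c][i] counts c in bwt_str[:i]."""
    counts = Counter(bwt_str)
    c_table, running = {}, 0
    for ch in sorted(counts):
        c_table[ch] = running
        running += counts[ch]
    occ = {ch: [0] * (len(bwt_str) + 1) for ch in counts}
    for i, ch in enumerate(bwt_str):
        for key in occ:
            occ[key][i + 1] = occ[key][i] + (1 if key == ch else 0)
    return c_table, occ


def count_occurrences(pattern, c_table, occ, n):
    """Backward search: number of times pattern occurs in the indexed text."""
    lo, hi = 0, n  # half-open range of matching suffixes
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


reference = "ACGTACGTGA"  # hypothetical toy reference
transformed = bwt(reference)
c_table, occ = build_fm_index(transformed)
print(count_occurrences("ACG", c_table, occ, len(transformed)))
```

    Backward search takes one step per pattern character regardless of reference size, which is why read classification against such an index can be fast; production tools additionally sample the Occ table to keep memory small.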

  2. Metagenomic Analyses Reveal That Energy Transfer Gene Abundances Can Predict the Syntrophic Potential of Environmental Microbial Communities.

    PubMed

    Oberding, Lisa; Gieg, Lisa M

    2016-01-05

    Hydrocarbon compounds can be biodegraded by anaerobic microorganisms to form methane through an energetically interdependent metabolic process known as syntrophy. The microorganisms that perform this process as well as the energy transfer mechanisms involved are difficult to study and thus are still poorly understood, especially on an environmental scale. Here, metagenomic data was analyzed for specific clusters of orthologous groups (COGs) related to key energy transfer genes thus far identified in syntrophic bacteria, and principal component analysis was used in order to determine whether potentially syntrophic environments could be distinguished using these syntroph related COGs as opposed to universally present COGs. We found that COGs related to hydrogenase and formate dehydrogenase genes were able to distinguish known syntrophic consortia and environments with the potential for syntrophy from non-syntrophic environments, indicating that these COGs could be used as a tool to identify syntrophic hydrocarbon biodegrading environments using metagenomic data.

  3. MEGGASENSE - The Metagenome/Genome Annotated Sequence Natural Language Search Engine: A Platform for the Construction of Sequence Data Warehouses.

    PubMed

    Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio

    2017-06-01

    The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya. The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.

  4. Methods for understanding microbial community structures and functions in microbial fuel cells: a review.

    PubMed

    Zhi, Wei; Ge, Zheng; He, Zhen; Zhang, Husen

    2014-11-01

    Microbial fuel cells (MFCs) employ microorganisms to recover electric energy from organic matter. However, fundamental knowledge of electrochemically active bacteria is still required to maximize MFCs power output for practical applications. This review presents microbiological and electrochemical techniques to help researchers choose the appropriate methods for the MFCs study. Pre-genomic and genomic techniques such as 16S rRNA based phylogeny and metagenomics have provided important information in the structure and genetic potential of electrode-colonizing microbial communities. Post-genomic techniques such as metatranscriptomics allow functional characterizations of electrode biofilm communities by quantifying gene expression levels. Isotope-assisted phylogenetic analysis can further link taxonomic information to microbial metabolisms. A combination of electrochemical, phylogenetic, metagenomic, and post-metagenomic techniques offers opportunities to a better understanding of the extracellular electron transfer process, which in turn can lead to process optimization for power output. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. A Statistical Framework for the Functional Analysis of Metagenomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sharon, Itai; Pati, Amrita; Markowitz, Victor

    2008-10-01

    Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. They present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. They also provide a novel method for assessing the reliability of the estimations which can be used for removing seemingly unreliable measurements. They tested their method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that their framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.
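
    As background to the Lander-Waterman theory mentioned in this record, the sketch below computes the two textbook quantities on which it rests: the expected coverage depth c = N * L / G and the expected fraction of bases covered, 1 - exp(-c), assuming reads are placed uniformly at random. It does not reproduce the authors' gene family frequency estimator, and the function name and input values are hypothetical.

```python
import math


def lander_waterman(num_reads, read_length, genome_length):
    """Return (coverage depth c, expected fraction of bases covered by at least one read)."""
    coverage = num_reads * read_length / genome_length
    fraction_covered = 1.0 - math.exp(-coverage)
    return coverage, fraction_covered


# Hypothetical example: 1000 reads of 100 bp from a 100 kb genome gives c = 1.
depth, covered = lander_waterman(1000, 100, 100_000)
print(depth, covered)
```

    In a metagenome each member genome has its own effective coverage, which is one reason raw hit counts for a gene family can be biased and need statistical correction.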

  6. A metagenomic framework for the study of airborne microbial communities.

    PubMed

    Yooseph, Shibu; Andrews-Pfannkoch, Cynthia; Tenney, Aaron; McQuaid, Jeff; Williamson, Shannon; Thiagarajan, Mathangi; Brami, Daniel; Zeigler-Allen, Lisa; Hoffman, Jeff; Goll, Johannes B; Fadrosh, Douglas; Glass, John; Adams, Mark D; Friedman, Robert; Venter, J Craig

    2013-01-01

    Understanding the microbial content of the air has important scientific, health, and economic implications. While studies have primarily characterized the taxonomic content of air samples by sequencing the 16S or 18S ribosomal RNA gene, direct analysis of the genomic content of airborne microorganisms has not been possible due to the extremely low density of biological material in airborne environments. We developed sampling and amplification methods to enable adequate DNA recovery to allow metagenomic profiling of air samples collected from indoor and outdoor environments. Air samples were collected from a large urban building, a medical center, a house, and a pier. Analyses of metagenomic data generated from these samples reveal airborne communities with a high degree of diversity and different genera abundance profiles. The identities of many of the taxonomic groups and protein families also allow for the identification of the likely sources of the sampled airborne bacteria.

  7. A Metagenomic Framework for the Study of Airborne Microbial Communities

    PubMed Central

    Tenney, Aaron; McQuaid, Jeff; Williamson, Shannon; Thiagarajan, Mathangi; Brami, Daniel; Zeigler-Allen, Lisa; Hoffman, Jeff; Goll, Johannes B.; Fadrosh, Douglas; Glass, John; Adams, Mark D.; Friedman, Robert; Venter, J. Craig

    2013-01-01

    Understanding the microbial content of the air has important scientific, health, and economic implications. While studies have primarily characterized the taxonomic content of air samples by sequencing the 16S or 18S ribosomal RNA gene, direct analysis of the genomic content of airborne microorganisms has not been possible due to the extremely low density of biological material in airborne environments. We developed sampling and amplification methods to enable adequate DNA recovery to allow metagenomic profiling of air samples collected from indoor and outdoor environments. Air samples were collected from a large urban building, a medical center, a house, and a pier. Analyses of metagenomic data generated from these samples reveal airborne communities with a high degree of diversity and different genera abundance profiles. The identities of many of the taxonomic groups and protein families also allow for the identification of the likely sources of the sampled airborne bacteria. PMID:24349140

  8. Meeting Report: “Metagenomics, Metadata and Meta-analysis” (M3) Special Interest Group at ISMB 2009

    PubMed Central

    Field, Dawn; Friedberg, Iddo; Sterk, Peter; Kottmann, Renzo; Glöckner, Frank Oliver; Hirschman, Lynette; Garrity, George M.; Cochrane, Guy; Wooley, John; Gilbert, Jack

    2009-01-01

    This report summarizes the proceedings of the “Metagenomics, Metadata and Meta-analysis” (M3) Special Interest Group (SIG) meeting held at the Intelligent Systems for Molecular Biology 2009 conference. The Genomic Standards Consortium (GSC) hosted this meeting to explore the bottlenecks and emerging solutions for obtaining biological insights through large-scale comparative analysis of metagenomic datasets. The M3 SIG included 16 talks, half of which were selected from submitted abstracts, a poster session and a panel discussion involving members of the GSC Board. This report summarizes this one-day SIG, attempts to identify shared themes and recapitulates community recommendations for the future of this field. The GSC will also host an M3 workshop at the Pacific Symposium on Biocomputing (PSB) in January 2010. Further information about the GSC and its range of activities can be found at http://gensc.org/. PMID:21304668

  9. Application of metagenomic techniques in mining enzymes from microbial communities for biofuel synthesis.

    PubMed

    Xing, Mei-Ning; Zhang, Xue-Zhu; Huang, He

    2012-01-01

    Feedstock for biofuel synthesis is transitioning to lignocellulosic biomass to address criticism over competition between first generation biofuels and food production. As microbial catalysis is increasingly applied for the conversion of biomass to biofuels, increased importance has been placed on the development of novel enzymes. With revolutionary advances in sequencer technology and metagenomic sequencing, mining enzymes from microbial communities for biofuel synthesis is becoming more and more practical. The present article highlights the latest research progress on the special characteristics of metagenomic sequencing, which has been a powerful tool for new enzyme discovery and gene functional analysis in the biomass energy field. Critical enzymes recently developed for the pretreatment and conversion of lignocellulosic materials are evaluated with respect to their activity and stability, with additional explorations into xylanase, laccase, amylase, chitinase, and lipolytic biocatalysts for other biomass feedstocks. Copyright © 2012 Elsevier Inc. All rights reserved.

  10. Diverse circovirus-like genome architectures revealed by environmental metagenomics.

    PubMed

    Rosario, Karyna; Duffy, Siobain; Breitbart, Mya

    2009-10-01

    Single-stranded DNA (ssDNA) viruses with circular genomes are the smallest viruses known to infect eukaryotes. The present study identified 10 novel genomes similar to ssDNA circoviruses through data-mining of public viral metagenomes. The metagenomic libraries included samples from reclaimed water and three different marine environments (Chesapeake Bay, British Columbia coastal waters and Sargasso Sea). All the genomes have similarities to the replication (Rep) protein of circoviruses; however, only half have genomic features consistent with known circoviruses. Some of the genomes exhibit a mixture of genomic features associated with different families of ssDNA viruses (i.e. circoviruses, geminiviruses and parvoviruses). Unique genome architectures and phylogenetic analysis of the Rep protein suggest that these viruses belong to novel genera and/or families. Investigating the complex community of ssDNA viruses in the environment can lead to the discovery of divergent species and help elucidate evolutionary links between ssDNA viruses.

  11. MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Sakakibara, Yasumbumi

    2018-02-13

    Keio University's Yasumbumi Sakakibara on "MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  12. MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sakakibara, Yasumbumi

    2011-10-13

    Keio University's Yasumbumi Sakakibara on "MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  13. DOE JGI Quality Metrics; Approaches to Scaling and Improving Metagenome Assembly (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Copeland, Alex; Brown, C. Titus

    2011-10-13

    DOE JGI's Alex Copeland on "DOE JGI Quality Metrics" and Michigan State University's C. Titus Brown on "Approaches to Scaling and Improving Metagenome Assembly" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  14. DOE JGI Quality Metrics; Approaches to Scaling and Improving Metagenome Assembly (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Copeland, Alex; Brown, C. Titus

    2018-04-27

    DOE JGI's Alex Copeland on "DOE JGI Quality Metrics" and Michigan State University's C. Titus Brown on "Approaches to Scaling and Improving Metagenome Assembly" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  15. Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis.

    PubMed

    Jakupciak, John P; Wells, Jeffrey M; Karalus, Richard J; Pawlowski, David R; Lin, Jeffrey S; Feldman, Andrew B

    2013-01-01

    Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations.

  16. Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis

    PubMed Central

    Jakupciak, John P.; Wells, Jeffrey M.; Karalus, Richard J.; Pawlowski, David R.; Lin, Jeffrey S.; Feldman, Andrew B.

    2013-01-01

    Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations. PMID:24455204

  17. BioCreative Workshops for DOE Genome Sciences: Text Mining for Metagenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Cathy H.; Hirschman, Lynette

    The objective of this project was to host BioCreative workshops to define and develop text mining tasks to meet the needs of the Genome Sciences community, focusing on metadata information extraction in metagenomics. Following the successful introduction of metagenomics at the BioCreative IV workshop, members of the metagenomics and BioCreative communities continued discussion to identify candidate topics for a BioCreative metagenomics track for BioCreative V. Of particular interest was the capture of environmental and isolation source information from text. The outcome was to form a “community of interest” around work on the interactive EXTRACT system, which supported interactive tagging of environmental and species data. This experiment is included in the BioCreative V virtual issue of Database. In addition, there was broad participation by members of the metagenomics community in the panels held at BioCreative V, leading to valuable exchanges between the text mining developers and members of the metagenomics research community. These exchanges are reflected in a number of the overview and perspective pieces also being captured in the BioCreative V virtual issue. Overall, this conversation has exposed the metagenomics researchers to the possibilities of text mining, and educated the text mining developers to the specific needs of the metagenomics community.

  18. MetaCRAST: reference-guided extraction of CRISPR spacers from unassembled metagenomes.

    PubMed

    Moller, Abraham G; Liang, Chun

    2017-01-01

    Clustered regularly interspaced short palindromic repeat (CRISPR) systems are the adaptive immune systems of bacteria and archaea against viral infection. While CRISPRs have been exploited as a tool for genetic engineering, their spacer sequences can also provide valuable insights into microbial ecology by linking environmental viruses to their microbial hosts. Despite this importance, metagenomic CRISPR detection remains a major challenge. Here we present a reference-guided CRISPR spacer detection tool (Metagenomic CRISPR Reference-Aided Search Tool, MetaCRAST) that constrains searches based on user-specified direct repeats (DRs). These DRs could be expected from assembly or taxonomic profiles of metagenomes. We compared the performance of MetaCRAST to those of two existing metagenomic CRISPR detection tools, Crass and MinCED, using both real and simulated acid mine drainage (AMD) and enhanced biological phosphorus removal (EBPR) metagenomes. Our evaluation shows MetaCRAST improves CRISPR spacer detection in real metagenomes compared to the de novo CRISPR detection methods Crass and MinCED. Evaluation on simulated metagenomes shows it performs better than de novo tools for Illumina metagenomes and comparably for 454 metagenomes. It also has comparable performance dependence on read length and community composition, run time, and accuracy to these tools. MetaCRAST is implemented in Perl, parallelizable through the Many Core Engine (MCE), and takes metagenomic sequence reads and direct repeat queries (FASTA or FASTQ) as input. It is freely available for download at https://github.com/molleraj/MetaCRAST.
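
    The reference-guided idea in this abstract, searching reads only for spacers flanked by a direct repeat the user already expects, can be illustrated with a short sketch. This is not MetaCRAST's implementation (the tool is written in Perl, and the abstract does not describe its matching algorithm). The function names, the exact-match search, and the 20-60 bp spacer length bounds are all assumptions made for illustration.

```python
from typing import Iterable, List


def extract_spacers(read: str, direct_repeat: str,
                    min_len: int = 20, max_len: int = 60) -> List[str]:
    """Return the sequences lying between consecutive exact copies of the DR.

    Simplification (assumption): only exact, forward-strand matches are used.
    A real tool would also need mismatches and reverse complements.
    """
    read = read.upper()
    dr = direct_repeat.upper()

    # Collect the start of every non-overlapping exact DR occurrence.
    positions = []
    start = read.find(dr)
    while start != -1:
        positions.append(start)
        start = read.find(dr, start + len(dr))

    # A candidate spacer is whatever sits between two neighbouring DR copies.
    spacers = []
    for left, right in zip(positions, positions[1:]):
        spacer = read[left + len(dr):right]
        if min_len <= len(spacer) <= max_len:  # hypothetical length bounds
            spacers.append(spacer)
    return spacers


def spacers_from_reads(reads: Iterable[str], direct_repeat: str,
                       **bounds: int) -> List[str]:
    """Pool the unique spacers found across many reads, sorted for stability."""
    unique = set()
    for read in reads:
        unique.update(extract_spacers(read, direct_repeat, **bounds))
    return sorted(unique)
```

    Constraining the search to a known repeat is what separates this approach from de novo detection, which must first discover the repeat itself from the reads.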

  19. Lateral Gene Transfer in a Heavy Metal-Contaminated-Groundwater Microbial Community

    PubMed Central

    Hemme, Christopher L.; Green, Stefan J.; Rishishwar, Lavanya; Prakash, Om; Pettenato, Angelica; Chakraborty, Romy; Deutschbauer, Adam M.; Van Nostrand, Joy D.; Wu, Liyou; He, Zhili; Jordan, I. King; Arkin, Adam P.; Kostka, Joel E.

    2016-01-01

    ABSTRACT Unraveling the drivers controlling the response and adaptation of biological communities to environmental change, especially anthropogenic activities, is a central but poorly understood issue in ecology and evolution. Comparative genomics studies suggest that lateral gene transfer (LGT) is a major force driving microbial genome evolution, but its role in the evolution of microbial communities remains elusive. To delineate the importance of LGT in mediating the response of a groundwater microbial community to heavy metal contamination, representative Rhodanobacter reference genomes were sequenced and compared to shotgun metagenome sequences. 16S rRNA gene-based amplicon sequence analysis indicated that Rhodanobacter populations were highly abundant in contaminated wells with low pHs and high levels of nitrate and heavy metals but remained rare in the uncontaminated wells. Sequence comparisons revealed that multiple geochemically important genes, including genes encoding Fe2+/Pb2+ permeases, most denitrification enzymes, and cytochrome c553, were native to Rhodanobacter and not subjected to LGT. In contrast, the Rhodanobacter pangenome contained a recombinational hot spot in which numerous metal resistance genes were subjected to LGT and/or duplication. In particular, Co2+/Zn2+/Cd2+ efflux and mercuric resistance operon genes appeared to be highly mobile within Rhodanobacter populations. Evidence of multiple duplications of a mercuric resistance operon common to most Rhodanobacter strains was also observed. Collectively, our analyses indicated the importance of LGT during the evolution of groundwater microbial communities in response to heavy metal contamination, and a conceptual model was developed to display such adaptive evolutionary processes for explaining the extreme dominance of Rhodanobacter populations in the contaminated groundwater microbiome. PMID:27048805

  20. Trichodesmium genome maintains abundant, widespread noncoding DNA in situ, despite oligotrophic lifestyle

    DOE PAGES

    Walworth, Nathan; Pfreundt, Ulrike; Nelson, William C.; ...

    2015-03-23

    Understanding the evolution of the free-living, cyanobacterial, diazotroph Trichodesmium is of great importance because of its critical role in oceanic biogeochemistry and primary production. Unlike the other >150 available genomes of free-living cyanobacteria, only 63.8% of the Trichodesmium erythraeum (strain IMS101) genome is predicted to encode protein, which is 20–25% less than the average for other cyanobacteria and nonpathogenic, free-living bacteria. In this paper, we use distinctive isolates and metagenomic data to show that low coding density observed in IMS101 is a common feature of the Trichodesmium genus, both in culture and in situ. Transcriptome analysis indicates that 86% of the noncoding space is expressed, although the function of these transcripts is unclear. The density of noncoding, possible regulatory elements predicted in Trichodesmium, when normalized per intergenic kilobase, was comparable and twofold higher than that found in the gene-dense genomes of the sympatric cyanobacterial genera Synechococcus and Prochlorococcus, respectively. Conserved Trichodesmium noncoding RNA secondary structures were predicted between most culture and metagenomic sequences, lending support to the structural conservation. Conservation of these intergenic regions in spatiotemporally separated Trichodesmium populations suggests possible genus-wide selection for their maintenance. These large intergenic spacers may have developed during intervals of strong genetic drift caused by periodic blooms of a subset of genotypes, which may have reduced effective population size. Finally, our data suggest that transposition of selfish DNA, low effective population size, and high-fidelity replication allowed the unusual “inflation” of noncoding sequence observed in Trichodesmium despite its oligotrophic lifestyle.
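
    The coding fraction quoted in this abstract (63.8% for IMS101) is, in essence, the share of genome positions covered by predicted protein-coding genes. A minimal sketch of that arithmetic follows. The function name, the 0-based half-open coordinate convention, and the toy coordinates in the usage note are illustrative assumptions and are not taken from the paper.

```python
from typing import Iterable, Tuple


def coding_density(genome_length: int,
                   cds_intervals: Iterable[Tuple[int, int]]) -> float:
    """Fraction of genome positions covered by at least one CDS.

    Intervals are (start, end), 0-based and half-open (an assumption).
    Overlapping or adjacent genes are merged so no base is counted twice.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")

    covered = 0
    current_start = current_end = None
    for start, end in sorted(cds_intervals):
        if current_end is None or start > current_end:
            # Close the previous merged block before opening a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching interval: extend the open block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start

    return covered / genome_length
```

    For a hypothetical 1,000 bp genome with genes at (0, 300), (200, 500) and (700, 838), the merged coding blocks span 500 + 138 = 638 bp, so the function returns 0.638.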

  1. Trichodesmium genome maintains abundant, widespread noncoding DNA in situ, despite oligotrophic lifestyle

    DOE PAGES

    Walworth, Nathan G.; Pfreundt, Ulrike; Nelson, William C.; ...

    2015-04-07

    Understanding the evolution of the free-living, cyanobacterial, diazotroph Trichodesmium is of great importance due to its critical role in oceanic biogeochemistry and primary production. Unlike the other >150 available genomes of free-living cyanobacteria, only 63.8% of the Trichodesmium erythraeum (strain IMS101) genome is predicted to encode protein, which is 20-25% less than the average for other cyanobacteria and non-pathogenic, free-living bacteria. We use distinctive isolates and metagenomic data to show that low coding density observed in IMS101 is a common feature of the Trichodesmium genus both in culture and in situ. Transcriptome analysis indicates that 86% of the non-coding space is expressed, although the function of these transcripts is unclear. The density of noncoding, possible regulatory elements predicted in Trichodesmium, when normalized per intergenic kilobase, was comparable and two fold higher than that found in the gene dense genomes of the sympatric cyanobacterial genera Synechococcus and Prochlorococcus, respectively. Conserved Trichodesmium ncRNA secondary structures were predicted between most culture and metagenomic sequences lending support to the structural conservation. Conservation of these intergenic regions in spatiotemporally separated Trichodesmium populations suggests possible genus-wide selection for their maintenance. These large intergenic spacers may have developed during intervals of strong genetic drift caused by periodic blooms of a subset of genotypes, which may have reduced effective population size. Our data suggest that transposition of selfish DNA, low effective population size, and high fidelity replication allowed the unusual ‘inflation’ of noncoding sequence observed in Trichodesmium despite its oligotrophic lifestyle.

  2. Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Canon, Shane

    2018-01-24

    DOE JGI's Zhong Wang, chair of the High-performance Computing session, gives a brief introduction before Berkeley Lab's Shane Canon talks about "Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  3. Metagenomic analysis of carbon cycling and biogenic methane formation in terrestrial serpentinizing fluid springs

    NASA Astrophysics Data System (ADS)

    Woycheese, K. M.; Meyer-Dombard, D. R.; Cardace, D.; Arcilla, C. A.; Ono, S.

    2016-12-01

    The products of serpentinization are proposed to support a hydrogen-driven microbial biosphere in ultrabasic, highly reducing fluids. Shotgun metagenomic analysis of microbial communities collected from terrestrial serpentinizing springs in the Philippines and Turkey suggests that mutualistic relationships may help microbial communities thrive in highly oligotrophic environments. Understanding how these relationships affect production of methane in the deep subsurface is critical to applications such as carbon sequestration and natural gas production. There is conflicting evidence regarding whether methane and C2-C6 alkanes in serpentinizing ecosystems are produced abiogenically or through biotic reactions such as methanogenesis [1, 2]. While geochemical analysis of methane from serpentinizing ecosystems has previously indicated abiogenic and/or mixed formation [3, 4], methanogens have been detected in an increasing number of investigations [2]. Here, putative metabolisms were identified via assembly and annotation of metagenomic sequence data from the Philippines and Turkey. At both sites, hydrogenotrophic methanogenesis and homoacetogenesis were identified as the principal autotrophic carbon fixation pathways. Heterotrophic acetogenesis and acetoclastic methanogenesis were also detected in sequence data. Other heterotrophic metabolic pathways identified included sulfate reduction, methanotrophy, and biodegradation of aromatic carbon compounds. Many of these metabolic pathways have been shown to be favorable under conditions typical of serpentinizing habitats [5]. Metagenomic analysis strongly suggests that at least some of the methane originating from these serpentinizing ecosystems may be biologically derived. Ongoing work will further clarify the mechanisms of methane formation by examining the clumped isotopologue ratios of dissolved methane in serpentinizing fluids. References: [1] Wang et al. (2015). Science. 348. doi:10.1126/science.aaa4326 [2] Kohl et al. (2016). JGR. Biogeosci. 121. doi:10.1002/2015JG003233 [3] Abrajano et al. (1988). Chem. Geol. 71. doi:10.1016/0009-2541(88)90116-7 [4] Etiope et al. (2011). EPSL. 310. doi:10.1016/j.epsl.2011.08.001 [5] Cardace et al. (2015). Front. Microbiol. 6. doi:10.3389/fmicb.2015.0001

  4. MetaCAA: A clustering-aided methodology for efficient assembly of metagenomic datasets.

    PubMed

    Reddy, Rachamalla Maheedhar; Mohammed, Monzoorul Haque; Mande, Sharmila S

    2014-01-01

    A key challenge in analyzing metagenomics data pertains to assembly of sequenced DNA fragments (i.e. reads) originating from various microbes in a given environmental sample. Several existing methodologies can assemble reads originating from a single genome. However, these methodologies cannot be applied for efficient assembly of metagenomic sequence datasets. In this study, we present MetaCAA - a clustering-aided methodology which helps in improving the quality of metagenomic sequence assembly. MetaCAA initially groups sequences constituting a given metagenome into smaller clusters. Subsequently, sequences in each cluster are independently assembled using CAP3, an existing single genome assembly program. Contigs formed in each of the clusters along with the unassembled reads are then subjected to another round of assembly for generating the final set of contigs. Validation using simulated and real-world metagenomic datasets indicates that MetaCAA aids in improving the overall quality of assembly. A software implementation of MetaCAA is available at https://metagenomics.atc.tcs.com/MetaCAA. Copyright © 2014 Elsevier Inc. All rights reserved.
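
    The workflow described in this abstract (cluster the reads, assemble each cluster on its own, then assemble the resulting contigs together with the leftover reads) can be sketched as a small driver function. This is only a sketch of the control flow: MetaCAA uses CAP3 for assembly and its own clustering step, whereas the `toy_assemble` stand-in below is a hypothetical greedy exact-overlap merger written for illustration. Keeping unmerged round-one contigs in the final contig set is also an assumption.

```python
from typing import Callable, List, Sequence, Tuple

# An assembler returns (contigs, unassembled sequences).
Assembler = Callable[[Sequence[str]], Tuple[List[str], List[str]]]
Clusterer = Callable[[Sequence[str]], List[List[str]]]


def two_round_assembly(reads: Sequence[str], cluster_fn: Clusterer,
                       assemble_fn: Assembler) -> Tuple[List[str], List[str]]:
    """Assemble each cluster separately, then assemble contigs plus leftovers."""
    round1_contigs: List[str] = []
    leftovers: List[str] = []
    for cluster in cluster_fn(reads):
        contigs, singlets = assemble_fn(cluster)
        round1_contigs.extend(contigs)
        leftovers.extend(singlets)

    merged, singlets = assemble_fn(round1_contigs + leftovers)
    # Assumption: a round-one contig that finds no partner is still a contig.
    kept = set(round1_contigs)
    final_contigs = merged + [s for s in singlets if s in kept]
    unassembled = [s for s in singlets if s not in kept]
    return final_contigs, unassembled


def overlap(a: str, b: str, min_len: int) -> int:
    """Longest suffix of a equal to a prefix of b, or 0 if below min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def toy_assemble(seqs: Sequence[str],
                 min_len: int = 4) -> Tuple[List[str], List[str]]:
    """Hypothetical stand-in assembler: greedily merge the best exact overlap."""
    pool = [(s, False) for s in seqs]  # (sequence, is_merge_product)
    while True:
        best = (0, -1, -1)
        for i, (a, _) in enumerate(pool):
            for j, (b, _) in enumerate(pool):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break
        merged = pool[i][0] + pool[j][0][n:]
        pool = [p for k, p in enumerate(pool) if k not in (i, j)]
        pool.append((merged, True))
    contigs = [s for s, is_merged in pool if is_merged]
    singlets = [s for s, is_merged in pool if not is_merged]
    return contigs, singlets
```

    Passing the clustering and assembly steps in as functions keeps the two-round structure visible and lets a real assembler wrapper replace the toy one without changing the driver.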

  5. BMPOS: a Flexible and User-Friendly Tool Sets for Microbiome Studies.

    PubMed

    Pylro, Victor S; Morais, Daniel K; de Oliveira, Francislon S; Dos Santos, Fausto G; Lemos, Leandro N; Oliveira, Guilherme; Roesch, Luiz F W

    2016-08-01

    Recent advances in science and technology are leading to a revision and re-orientation of methodologies, addressing old and current issues under a new perspective. Advances in next generation sequencing (NGS) are allowing comparative analysis of the abundance and diversity of whole microbial communities, generating a large amount of data and findings at a systems level. The current limitation for biologists has been the increasing demand for computational power and training required for processing of NGS data. Here, we describe the deployment of the Brazilian Microbiome Project Operating System (BMPOS), a flexible and user-friendly Linux distribution dedicated to microbiome studies. The Brazilian Microbiome Project (BMP) has developed data analysis pipelines for metagenomic studies (phylogenetic marker genes), conducted using the two main high-throughput sequencing platforms (Ion Torrent and Illumina MiSeq). The BMPOS is freely available and includes all the bioinformatics packages and databases required to perform the pipelines suggested by the BMP team. The BMPOS may be used as a bootable live USB stick or installed in any computer with at least 1 GHz CPU and 512 MB RAM, independent of the operating system previously installed. The BMPOS has proved to be effective for sequence processing, sequence clustering, alignment, taxonomic annotation, statistical analysis, and plotting of metagenomic data. The BMPOS has been used during several metagenomic analysis courses, being valuable as a tool for training, and an excellent starting point for anyone interested in performing metagenomic studies. The BMPOS and its documentation are available at http://www.brmicrobiome.org.

  6. Inhibition of the growth of Bacillus subtilis DSM10 by a newly discovered antibacterial protein from the soil metagenome.

    PubMed

    O'Mahony, Mark M; Henneberger, Ruth; Selvin, Joseph; Kennedy, Jonathan; Doohan, Fiona; Marchesi, Julian R; Dobson, Alan D W

    2015-01-01

    A functional metagenomics-based approach exploiting the microbiota of suppressive soils from an organic field site has succeeded in the identification of a clone with the ability to inhibit the growth of Bacillus subtilis DSM10. Sequencing of the fosmid identified a putative β-lactamase-like gene abgT. Transposon mutagenesis of the abgT gene resulted in a loss of the ability to inhibit the growth of B. subtilis DSM10. Further analysis of the deduced amino acid sequence of AbgT revealed moderate homology to esterases, suggesting that the protein may possess hydrolytic activity. Weak lipolytic activity was detected; however, the clone did not appear to produce any β-lactamase activity. Phylogenetic analysis revealed the protein is a member of the family VIII group of lipase/esterases and clusters with a number of proteins of metagenomic origin. The abgT gene was sub-cloned into a protein expression vector and, when introduced into the abgT transposon mutant clones, restored the ability of the clones to inhibit the growth of B. subtilis DSM10, clearly indicating that the abgT gene is involved in the antibacterial activity. While the precise role of this protein has yet to be fully elucidated, it may be involved in the generation of free fatty acid with antibacterial properties. Thus functional metagenomic approaches continue to provide a significant resource for the discovery of novel functional proteins and it is clear that hydrolytic enzymes, such as AbgT, may be a potential source for the development of future antimicrobial therapies.

  7. Metagenomic mining pectinolytic microbes and enzymes from an apple pomace-adapted compost microbial community.

    PubMed

    Zhou, Man; Guo, Peng; Wang, Tao; Gao, Lina; Yin, Huijun; Cai, Cheng; Gu, Jie; Lü, Xin

    2017-01-01

    Degradation of pectin in lignocellulosic materials is one of the key steps for biofuel production. Biological hydrolysis of pectin, i.e., degradation by pectinolytic microbes and enzymes, is an attractive paradigm because of its obvious advantages, such as environmentally friendly procedures, low energy demand for lignin removal, and the possibility of integration into a consolidated process. In this study, a metagenomics sequence-guided strategy coupled with an enrichment culture technique was used to facilitate targeted discovery of pectinolytic microbes and enzymes. An apple pomace-adapted compost (APAC) habitat was constructed to boost the enrichment of pectinolytic microorganisms. Analyses of 16S rDNA high-throughput sequencing revealed that microbial communities changed dramatically during composting, with some bacterial populations being greatly enriched. Metagenomics data showed that the apple pomace-adapted compost microbial community (APACMC) was dominated by Proteobacteria and Bacteroidetes. Functional analysis and carbohydrate-active enzyme profiles confirmed that APACMC had been successfully enriched for the targeted functions. Among the 1756 putative genes encoding pectinolytic enzymes, 129 were predicted as novel (with an identity <30% to any CAZy database entry) and only 1.92% were more than 75% identical with proteins in the NCBI environmental database, demonstrating that they have not been observed in previous metagenome projects. Phylogenetic analysis showed that APACMC harbored a broad range of pectinolytic bacteria and many of them were previously unrecognized. The immensely diverse pectinolytic microbes and enzymes found in our study will expand the arsenal of proficient degraders and enzymes for lignocellulosic biofuel production. Our study provides a powerful approach for targeted mining of microbes and enzymes in numerous industries.

  8. Metagenomics as a Tool for Enzyme Discovery: Hydrolytic Enzymes from Marine-Related Metagenomes.

    PubMed

    Popovic, Ana; Tchigvintsev, Anatoly; Tran, Hai; Chernikova, Tatyana N; Golyshina, Olga V; Yakimov, Michail M; Golyshin, Peter N; Yakunin, Alexander F

    2015-01-01

    This chapter discusses metagenomics and its application for enzyme discovery, with a focus on hydrolytic enzymes from marine metagenomic libraries. With less than one percent of culturable microorganisms in the environment, metagenomics, or the collective study of community genetics, has opened up a rich pool of uncharacterized metabolic pathways, enzymes, and adaptations. This great untapped pool of genes provides the particularly exciting potential to mine for new biochemical activities or novel enzymes with activities tailored to peculiar sets of environmental conditions. Metagenomes also represent a huge reservoir of novel enzymes for applications in biocatalysis, biofuels, and bioremediation. Here we present the results of enzyme discovery for four enzyme activities, of particular industrial or environmental interest, including esterase/lipase, glycosyl hydrolase, protease and dehalogenase.

  9. Genomic variation of subseafloor archaeal and bacterial populations from venting fluids at the Mid-Cayman Rise

    NASA Astrophysics Data System (ADS)

    Anderson, R. E.; Eren, A. M.; Stepanauskas, R.; Huber, J. A.; Reveillaud, J.

    2015-12-01

    Deep-sea hydrothermal vent systems serve as windows to a dynamic, gradient-dominated deep biosphere that is home to a wide diversity of archaea, bacteria, and viruses. Until recently the majority of these microbial lineages were uncultivated, resulting in a poor understanding of how the physical and geochemical context shapes microbial evolution in the deep subsurface. By comparing metagenomes, metatranscriptomes and single-cell genomes between geologically distinct vent fields, we can better understand the relationship between the environment and the evolution of subsurface microbial communities. An ideal setting in which to use this approach is the Mid-Cayman Rise, located on the world's deepest and slowest-spreading mid-ocean ridge, which hosts both the mafic-influenced Piccard and ultramafic-influenced Von Damm vent fields. Previous work has shown that Von Damm has higher taxonomic and metabolic diversity than Piccard, consistent with geochemical model expectations, and the fluids from all vents are enriched in hydrogen (Reveillaud et al., submitted). Mapping of both metagenomes and metatranscriptomes to a combined assembly showed very little overlap among the Von Damm samples, indicating substantial variability that is consistent with the diversity of potential metabolites in this ultramafic vent field. In contrast, the most consistently abundant and active lineage across the Piccard samples was Sulfurovum, a sulfur-oxidizing chemolithotroph that uses nitrate or oxygen as an electron acceptor. Moreover, analysis of point mutations within individual lineages suggested that Sulfurovum at Piccard is under strong selection, whereas microbial genomes at Von Damm were more variable. These results are consistent with the hypothesis that the subsurface environment at Piccard supports the emergence of a dominant lineage that is under strong selection pressure, whereas the more geochemically diverse microbial habitat at Von Damm creates a wider variety of stable ecological niches, facilitating higher diversity both within and between microbial lineages. By examining how the environment is imprinted into microbial genomes, we hope to gain insight into how subsurface microbial communities co-evolve with their environment in both the present and the deep past.

  10. A combined meta-barcoding and shotgun metagenomic analysis of spontaneous wine fermentation.

    PubMed

    Sternes, Peter R; Lee, Danna; Kutyna, Dariusz R; Borneman, Anthony R

    2017-07-01

    Wine is a complex beverage, comprising hundreds of metabolites produced through the action of yeasts and bacteria in fermenting grape must. Commercially, there is now a growing trend away from using wine yeast (Saccharomyces) starter cultures, toward the historic practice of uninoculated or "wild" fermentation, where the yeasts and bacteria associated with the grapes and/or winery perform the fermentation. It is the varied metabolic contributions of these numerous non-Saccharomyces species that are thought to impart complexity and desirable taste and aroma attributes to wild ferments in comparison to their inoculated counterparts. To map the microflora of spontaneous fermentation, metagenomic techniques were employed to characterize and monitor the progression of fungal species in 5 different wild fermentations. Both amplicon-based ribosomal DNA internal transcribed spacer (ITS) phylotyping and shotgun metagenomics were used to assess community structure across different stages of fermentation. While providing a sensitive and highly accurate means of characterizing the wine microbiome, the shotgun metagenomic data also uncovered a significant overabundance bias in the ITS phylotyping abundance estimations for the common non-Saccharomyces wine yeast genus Metschnikowia. By identifying biases such as that observed for Metschnikowia, abundance measurements from future ITS phylotyping datasets can be corrected to provide more accurate species representation. Ultimately, as more shotgun metagenomic and single-strain de novo assemblies for key wine species become available, the accuracy of both ITS-amplicon and shotgun studies will greatly increase, providing a powerful methodology for deciphering the influence of the microbial community on the wine flavor and aroma. © The Authors 2017. Published by Oxford University Press.

  11. A combined meta-barcoding and shotgun metagenomic analysis of spontaneous wine fermentation

    PubMed Central

    Sternes, Peter R.; Lee, Danna; Kutyna, Dariusz R.

    2017-01-01

    Abstract Wine is a complex beverage, comprising hundreds of metabolites produced through the action of yeasts and bacteria in fermenting grape must. Commercially, there is now a growing trend away from using wine yeast (Saccharomyces) starter cultures, toward the historic practice of uninoculated or “wild” fermentation, where the yeasts and bacteria associated with the grapes and/or winery perform the fermentation. It is the varied metabolic contributions of these numerous non-Saccharomyces species that are thought to impart complexity and desirable taste and aroma attributes to wild ferments in comparison to their inoculated counterparts. To map the microflora of spontaneous fermentation, metagenomic techniques were employed to characterize and monitor the progression of fungal species in 5 different wild fermentations. Both amplicon-based ribosomal DNA internal transcribed spacer (ITS) phylotyping and shotgun metagenomics were used to assess community structure across different stages of fermentation. While providing a sensitive and highly accurate means of characterizing the wine microbiome, the shotgun metagenomic data also uncovered a significant overabundance bias in the ITS phylotyping abundance estimations for the common non-Saccharomyces wine yeast genus Metschnikowia. By identifying biases such as that observed for Metschnikowia, abundance measurements from future ITS phylotyping datasets can be corrected to provide more accurate species representation. Ultimately, as more shotgun metagenomic and single-strain de novo assemblies for key wine species become available, the accuracy of both ITS-amplicon and shotgun studies will greatly increase, providing a powerful methodology for deciphering the influence of the microbial community on the wine flavor and aroma. PMID:28595314

  12. Metagenome, metatranscriptome and single-cell sequencing reveal microbial response to Deepwater Horizon oil spill.

    PubMed

    Mason, Olivia U; Hazen, Terry C; Borglin, Sharon; Chain, Patrick S G; Dubinsky, Eric A; Fortney, Julian L; Han, James; Holman, Hoi-Ying N; Hultman, Jenni; Lamendella, Regina; Mackelprang, Rachel; Malfatti, Stephanie; Tom, Lauren M; Tringe, Susannah G; Woyke, Tanja; Zhou, Jizhong; Rubin, Edward M; Jansson, Janet K

    2012-09-01

    The Deepwater Horizon oil spill in the Gulf of Mexico resulted in a deep-sea hydrocarbon plume that caused a shift in the indigenous microbial community composition with unknown ecological consequences. Early in the spill history, a bloom of uncultured, thus uncharacterized, members of the Oceanospirillales was previously detected, but their role in oil disposition was unknown. Here our aim was to determine the functional role of the Oceanospirillales and other active members of the indigenous microbial community using deep sequencing of community DNA and RNA, as well as single-cell genomics. Shotgun metagenomic and metatranscriptomic sequencing revealed that genes for motility, chemotaxis and aliphatic hydrocarbon degradation were significantly enriched and expressed in the hydrocarbon plume samples compared with uncontaminated seawater collected from plume depth. In contrast, although genes coding for degradation of more recalcitrant compounds, such as benzene, toluene, ethylbenzene, total xylenes and polycyclic aromatic hydrocarbons, were identified in the metagenomes, they were expressed at low levels, or not at all based on analysis of the metatranscriptomes. Isolation and sequencing of two Oceanospirillales single cells revealed that both cells possessed genes coding for n-alkane and cycloalkane degradation. Specifically, the near-complete pathway for cyclohexane oxidation in the Oceanospirillales single cells was elucidated and supported by both metagenome and metatranscriptome data. The draft genome also included genes for chemotaxis, motility and nutrient acquisition strategies that were also identified in the metagenomes and metatranscriptomes. These data point towards a rapid response of members of the Oceanospirillales to aliphatic hydrocarbons in the deep sea.

  13. [Cloning, expression and characterization of a novel esterase from marine sediment microbial metagenomic library].

    PubMed

    Xu, Shiqing; Hu, Yongfei; Yuan, Aihua; Zhu, Baoli

    2010-07-01

    To clone, express and characterize a novel esterase from a marine sediment microbial metagenomic library. Using esterase segregation agar containing tributyrin, we obtained the esterase-positive fosmid clone FL10 from a marine sediment microbial metagenomic library. This fosmid was partially digested with Sau3A I to construct a sublibrary, from which the esterase-positive subclone pFLS10 was obtained. The full length of the esterase gene was amplified and cloned into the expression vector pET28a, and the recombinant plasmid was transformed into E. coli BL21 cells. We analyzed the enzyme activity and characterized the esterase after its expression and purification. An ORF (Open Reading Frame) of 924 bp was identified from the subclone pFLS10. Sequence analysis indicated that it showed 71% amino acid identity to an esterase (ADA70030) from a marine sediment metagenomic library. The esterase is a novel low-temperature-active esterase and had its highest lipolytic activity toward the substrate 4-nitrophenyl butyrate (C4). The optimum temperature of the esterase was 20 °C and the optimum pH was 7.5. The esterase in this study had good thermostability at 20 °C and good pH stability at pH 8-10. A significant increase in lipolytic activity was observed with the addition of K+ and Mg2+, whereas Mn2+, among others, decreased it. We obtained the novel esterase gene fls10 from the marine sediment microbial metagenomic library. The esterase had good thermostability and high lipolytic activity at low temperature and under basic conditions, which lays a basis for industrial application.

  14. Metagenomics, metatranscriptomics and single cell genomics reveal functional response of active Oceanospirillales to Gulf oil spill

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mason, Olivia U.; Hazen, Terry C.; Borglin, Sharon

    The Deepwater Horizon oil spill in the Gulf of Mexico resulted in a deep-sea hydrocarbon plume that caused a shift in the indigenous microbial community composition with unknown ecological consequences. Early in the spill history, a bloom of uncultured, thus uncharacterized, members of the Oceanospirillales was previously detected, but their role in oil disposition was unknown. Here our aim was to determine the functional role of the Oceanospirillales and other active members of the indigenous microbial community using deep sequencing of community DNA and RNA, as well as single-cell genomics. Shotgun metagenomic and metatranscriptomic sequencing revealed that genes for motility, chemotaxis and aliphatic hydrocarbon degradation were significantly enriched and expressed in the hydrocarbon plume samples compared with uncontaminated seawater collected from plume depth. In contrast, although genes coding for degradation of more recalcitrant compounds, such as benzene, toluene, ethylbenzene, total xylenes and polycyclic aromatic hydrocarbons, were identified in the metagenomes, they were expressed at low levels, or not at all based on analysis of the metatranscriptomes. Isolation and sequencing of two Oceanospirillales single cells revealed that both cells possessed genes coding for n-alkane and cycloalkane degradation. Specifically, the near-complete pathway for cyclohexane oxidation in the Oceanospirillales single cells was elucidated and supported by both metagenome and metatranscriptome data. The draft genome also included genes for chemotaxis, motility and nutrient acquisition strategies that were also identified in the metagenomes and metatranscriptomes. These data point towards a rapid response of members of the Oceanospirillales to aliphatic hydrocarbons in the deep sea.

  15. MICCA: a complete and accurate software for taxonomic profiling of metagenomic data.

    PubMed

    Albanese, Davide; Fontana, Paolo; De Filippo, Carlotta; Cavalieri, Duccio; Donati, Claudio

    2015-05-19

    The introduction of high-throughput sequencing technologies has triggered an increase in the number of studies in which the microbiota of environmental and human samples is characterized through the sequencing of selected marker genes. While experimental protocols have undergone a process of standardization that makes them accessible to a large community of scientists, standard and robust data analysis pipelines are still lacking. Here we introduce MICCA, a software pipeline for the processing of amplicon metagenomic datasets that efficiently combines quality filtering, clustering of Operational Taxonomic Units (OTUs), taxonomy assignment and phylogenetic tree inference. MICCA provides accurate results, reaching a good compromise between modularity and usability. Moreover, we introduce a de novo clustering algorithm specifically designed for the inference of OTUs. Tests on real and synthetic datasets show that, thanks to the optimized read filtering process and to the new clustering algorithm, MICCA provides estimates of the number of OTUs and of other common ecological indices that are more accurate and robust than those of currently available pipelines. Analysis of public metagenomic datasets shows that the higher consistency of results improves our understanding of the structure of environmental and human-associated microbial communities. MICCA is an open source project.

  16. Structuring of Bacterioplankton Diversity in a Large Tropical Bay

    PubMed Central

    Gregoracci, Gustavo B.; Nascimento, Juliana R.; Cabral, Anderson S.; Paranhos, Rodolfo; Valentin, Jean L.; Thompson, Cristiane C.; Thompson, Fabiano L.

    2012-01-01

    Structuring of bacterioplanktonic populations and factors that determine the structuring of specific niche partitions have been demonstrated only for a limited number of colder water environments. In order to better understand the physical, chemical and biological parameters that may influence bacterioplankton diversity and abundance, we examined their productivity, abundance and diversity in the second largest Brazilian tropical bay (Guanabara Bay, GB), as well as seawater physical, chemical and biological parameters of GB. The inner bay location with higher nutrient input favored higher microbial (including vibrio) growth. Metagenomic analysis revealed a predominance of Gammaproteobacteria in this location, while GB locations with lower nutrient concentration favored Alphaproteobacteria and Flavobacteria. According to the subsystems (SEED) functional analysis, GB has a distinctive metabolic signature, comprising a higher number of sequences in the metabolism of phosphorus and aromatic compounds and a lower number of sequences in the photosynthesis subsystem. The apparent phosphorus limitation appears to influence the GB metagenomic signature of the three locations. Phosphorus is also one of the main factors determining changes in the abundance of planktonic vibrios, suggesting that nutrient limitation can be observed at community (metagenomic) and population levels (total prokaryote and vibrio counts). PMID:22363639

  17. High-throughput metagenomic analysis of petroleum-contaminated soil microbiome reveals the versatility in xenobiotic aromatics metabolism.

    PubMed

    Bao, Yun-Juan; Xu, Zixiang; Li, Yang; Yao, Zhi; Sun, Jibin; Song, Hui

    2017-06-01

    Petroleum-contaminated soil is one of the most studied soil ecosystems because of its abundance of hydrocarbon-degrading microorganisms and its broad applications in bioremediation. However, our understanding of the genomic properties and functional traits of the soil microbiome is limited. In this study, we used high-throughput metagenomic sequencing to comprehensively study the microbial community from petroleum-contaminated soils near Tianjin Dagang oilfield in eastern China. The analysis reveals that the soil metagenome is characterized by a high level of community diversity and metabolic versatility. The metagenome community is dominated by γ-Proteobacteria and α-Proteobacteria, which are key players in petroleum hydrocarbon degradation. The functional study demonstrates over-represented enzyme groups and pathways involved in the degradation of a broad set of xenobiotic aromatic compounds, including toluene, xylene, chlorobenzoate, aminobenzoate, DDT, methylnaphthalene, and bisphenol. A composite metabolic network is proposed for the identified pathways, thus consolidating our identification of the pathways. The overall data demonstrate the great potential of the studied soil microbiome for the degradation of xenobiotic aromatics. The results not only establish a rich reservoir for novel enzyme discovery but also suggest putative applications in bioremediation. Copyright © 2016. Published by Elsevier B.V.

  18. PlasFlow: predicting plasmid sequences in metagenomic data using genome signatures

    PubMed Central

    Lipinski, Leszek; Dziembowski, Andrzej

    2018-01-01

    Plasmids are mobile genetic elements that play an important role in the environmental adaptation of microorganisms. Although plasmids are usually analyzed in cultured microorganisms, there is a need for methods that allow for the analysis of pools of plasmids (plasmidomes) in environmental samples. To that end, several molecular biology and bioinformatics methods have been developed; however, they are limited to environments with low diversity and cannot recover large plasmids. Here, we present PlasFlow, a novel tool based on genomic signatures that employs a neural network approach for identification of bacterial plasmid sequences in environmental samples. PlasFlow can recover plasmid sequences from assembled metagenomes without any prior knowledge of the taxonomical or functional composition of samples with an accuracy up to 96%. It can also recover sequences of both circular and linear plasmids and can perform initial taxonomical classification of sequences. Compared to other currently available tools, PlasFlow demonstrated significantly better performance on test datasets. Analysis of two samples from heavy metal-contaminated microbial mats revealed that plasmids may constitute an important fraction of their metagenomes and carry genes involved in heavy-metal homeostasis, proving the pivotal role of plasmids in microorganism adaptation to environmental conditions. PMID:29346586

  19. Metagenomic Analysis of a Tropical Composting Operation at the São Paulo Zoo Park Reveals Diversity of Biomass Degradation Functions and Organisms

    PubMed Central

    Pascon, Renata C.; de Oliveira, Julio Cezar Franco; Digiampietri, Luciano A.; Barbosa, Deibs; Peixoto, Bruno Malveira; Vallim, Marcelo A.; Viana-Niero, Cristina; Ostroski, Eric H.; Telles, Guilherme P.; Dias, Zanoni; da Cruz, João Batista; Juliano, Luiz; Verjovski-Almeida, Sergio; da Silva, Aline Maria; Setubal, João Carlos

    2013-01-01

    Composting operations are a rich source for prospection of biomass degradation enzymes. We have analyzed the microbiomes of two composting samples collected in a facility inside the São Paulo Zoo Park, in Brazil. All organic waste produced in the park is processed in this facility, at a rate of four tons/day. Total DNA was extracted and sequenced with Roche/454 technology, generating about 3 million reads per sample. To our knowledge this work is the first report of a composting whole-microbial community using high-throughput sequencing and analysis. The phylogenetic profiles of the two microbiomes analyzed are quite different, with a clear dominance of members of the Lactobacillus genus in one of them. We found a general agreement of the distribution of functional categories in the Zoo compost metagenomes compared with seven selected public metagenomes of biomass deconstruction environments, indicating the potential for different bacterial communities to provide alternative mechanisms for the same functional purposes. Our results indicate that biomass degradation in this composting process, including deconstruction of recalcitrant lignocellulose, is fully performed by bacterial enzymes, most likely by members of the Clostridiales and Actinomycetales orders. PMID:23637931

  20. Unique Features of Ethnic Mongolian Gut Microbiome revealed by metagenomic analysis.

    PubMed

    Liu, Wenjun; Zhang, Jiachao; Wu, Chunyan; Cai, Shunfeng; Huang, Weiqiang; Chen, Jing; Xi, Xiaoxia; Liang, Zebin; Hou, Qiangchuan; Zhou, Bing; Qin, Nan; Zhang, Heping

    2016-10-06

    The human gut microbiota varies considerably among world populations due to a variety of factors including genetic background, diet, cultural habits and socioeconomic status. Here we characterized the gut microbiota of 110 healthy Mongolian adults by shotgun metagenomic sequencing and compared the intestinal microbiome among Mongolian, Han and European cohorts. The taxonomic profiles of the intestinal microbiome revealed that Actinobacteria and Bifidobacterium were the key microbes contributing to the differences among Mongolians, the Hans and Europeans at the phylum level and genus level, respectively. Metagenomic species analysis indicated that Faecalibacterium prausnitzii and Coprococcus comes were enriched in Mongolian people, which might contribute to gut health through anti-inflammatory properties and butyrate production, respectively. On the other hand, the enriched genus Collinsella, a biomarker in symptomatic atherosclerosis patients, might be associated with the high morbidity of cardiovascular and cerebrovascular diseases in Mongolian adults. At the functional level, a unique microbial metabolic pathway profile was present in the Mongolian gut, mainly distributed in amino acid metabolism, carbohydrate metabolism, energy metabolism, lipid metabolism, and glycan biosynthesis and metabolism. We can attribute the specific signatures of the Mongolian gut microbiome to their unique genotype, dietary habits and living environment.

  1. MICCA: a complete and accurate software for taxonomic profiling of metagenomic data

    PubMed Central

    Albanese, Davide; Fontana, Paolo; De Filippo, Carlotta; Cavalieri, Duccio; Donati, Claudio

    2015-01-01

    The introduction of high-throughput sequencing technologies has triggered an increase in the number of studies in which the microbiota of environmental and human samples is characterized through the sequencing of selected marker genes. While experimental protocols have undergone a process of standardization that makes them accessible to a large community of scientists, standard and robust data analysis pipelines are still lacking. Here we introduce MICCA, a software pipeline for the processing of amplicon metagenomic datasets that efficiently combines quality filtering, clustering of Operational Taxonomic Units (OTUs), taxonomy assignment and phylogenetic tree inference. MICCA provides accurate results, reaching a good compromise between modularity and usability. Moreover, we introduce a de novo clustering algorithm specifically designed for the inference of OTUs. Tests on real and synthetic datasets show that, thanks to the optimized read filtering process and to the new clustering algorithm, MICCA provides estimates of the number of OTUs and of other common ecological indices that are more accurate and robust than those of currently available pipelines. Analysis of public metagenomic datasets shows that the higher consistency of results improves our understanding of the structure of environmental and human-associated microbial communities. MICCA is an open source project. PMID:25988396

  2. Extensive Microbial and Functional Diversity within the Chicken Cecal Microbiome

    PubMed Central

    Sergeant, Martin J.; Constantinidou, Chrystala; Cogan, Tristan A.; Bedford, Michael R.; Penn, Charles W.; Pallen, Mark J.

    2014-01-01

    Chickens are a major source of food and protein worldwide. Feed conversion and the health of chickens rely on the largely unexplored complex microbial community that inhabits the chicken gut, including the ceca. We have carried out deep microbial community profiling of the microbiota in twenty cecal samples via 16S rRNA gene sequences and an in-depth metagenomics analysis of a single cecal microbiota. We recovered 699 phylotypes, over half of which appear to represent previously unknown species. We obtained 648,251 environmental gene tags (EGTs), the majority of which represent new species. These were binned into over two dozen draft genomes, which included Campylobacter jejuni and Helicobacter pullorum. We found numerous polysaccharide- and oligosaccharide-degrading enzymes encoded within the metagenome, some of which appeared to be part of polysaccharide utilization systems with genetic evidence for the co-ordination of polysaccharide degradation with sugar transport and utilization. The cecal metagenome encodes several fermentation pathways leading to the production of short-chain fatty acids, including some with novel features. We found a dozen uptake hydrogenases encoded in the metagenome and speculate that these provide major hydrogen sinks within this microbial community and might explain the high abundance of several genera within this microbiome, including Campylobacter, Helicobacter and Megamonas. PMID:24657972

  3. Forest harvesting reduces the soil metagenomic potential for biomass decomposition.

    PubMed

    Cardenas, Erick; Kranabetter, J M; Hope, Graeme; Maas, Kendra R; Hallam, Steven; Mohn, William W

    2015-11-01

    Soil is the key resource that must be managed to ensure sustainable forest productivity. Soil microbial communities mediate numerous essential ecosystem functions, and recent studies show that forest harvesting alters soil community composition. From a long-term soil productivity study site in a temperate coniferous forest in British Columbia, 21 forest soil shotgun metagenomes were generated, totaling 187 Gb. A method to analyze unassembled metagenome reads from the complex community was optimized and validated. The subsequent metagenome analysis revealed that, 12 years after forest harvesting, there were 16% and 8% reductions in relative abundances of biomass decomposition genes in the organic and mineral soil layers, respectively. Organic and mineral soil layers differed markedly in genetic potential for biomass degradation, with the organic layer having greater potential and being more strongly affected by harvesting. Gene families were disproportionately affected, and we identified 41 gene families consistently affected by harvesting, including families involved in lignin, cellulose, hemicellulose and pectin degradation. The results strongly suggest that harvesting profoundly altered below-ground cycling of carbon and other nutrients at this site, with potentially important consequences for forest regeneration. Thus, it is important to determine whether these changes foreshadow long-term changes in forest productivity or resilience and whether these changes are broadly characteristic of harvested forests.

  4. Community genomic analysis of an extremely acidophilic sulfur-oxidizing biofilm

    PubMed Central

    Jones, Daniel S; Albrecht, Heidi L; Dawson, Katherine S; Schaperdoth, Irene; Freeman, Katherine H; Pi, Yundan; Pearson, Ann; Macalady, Jennifer L

    2012-01-01

    Highly acidic (pH 0–1) biofilms, known as ‘snottites’, form on the walls and ceilings of hydrogen sulfide-rich caves. We investigated the population structure, physiology and biogeochemistry of these biofilms using metagenomics, rRNA methods and lipid geochemistry. Snottites from the Frasassi cave system (Italy) are dominated (>70% of cells) by Acidithiobacillus thiooxidans, with smaller populations including an archaeon in the uncultivated ‘G-plasma’ clade of Thermoplasmatales (>15%) and a bacterium in the Acidimicrobiaceae family (>5%). Based on metagenomic evidence, the Acidithiobacillus population is autotrophic (ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), carboxysomes) and oxidizes sulfur by the sulfide–quinone reductase and sox pathways. No reads matching nitrogen fixation genes were detected in the metagenome, whereas multiple matches to nitrogen assimilation functions are present, consistent with geochemical evidence that fixed nitrogen is available in the snottite environment to support autotrophic growth. Evidence for adaptations to extreme acidity includes Acidithiobacillus sequences for cation transporters and hopanoid synthesis, and direct measurements of hopanoid membrane lipids. Based on combined metagenomic, molecular and geochemical evidence, we suggest that Acidithiobacillus is the snottite architect and main primary producer, and that snottite morphology and distributions in the cave environment are directly related to the supply of C, N and energy substrates from the cave atmosphere. PMID:21716305

  5. Metagenomics analysis of microbial communities associated with a traditional rice wine starter culture (Xaj-pitha) of Assam, India.

    PubMed

    Bora, Sudipta Sankar; Keot, Jyotshna; Das, Saurav; Sarma, Kishore; Barooah, Madhumita

    2016-12-01

    This is the first report on the microbial diversity of xaj-pitha, a rice wine fermentation starter culture, through a metagenomics approach involving an Illumina-based whole genome shotgun (WGS) sequencing method. Metagenomic DNA was extracted from a rice wine starter culture concocted by the Ahom community of Assam and analyzed using a MiSeq® System. A total of 2,78,231 contigs, with an average read length of 640.13 bp, were obtained. Data obtained from the use of several taxonomic profiling tools were compared with previously reported microbial diversity studies based on culture-dependent and culture-independent methods. The microbial community revealed the existence of amylase producers, such as Rhizopus delemar, Mucor circinelloides, and Aspergillus sp. Ethanol producers, viz., Meyerozyma guilliermondii, Wickerhamomyces ciferrii, Saccharomyces cerevisiae, Candida glabrata, Debaryomyces hansenii, Ogataea parapolymorpha, and Dekkera bruxellensis, were found associated with the starter culture along with a diverse range of opportunistic contaminants. The bacterial microflora was dominated by lactic acid bacteria (LAB). The most frequently occurring LAB were Lactobacillus plantarum, Lactobacillus brevis, Leuconostoc lactis, Weissella cibaria, Lactococcus lactis, Weissella paramesenteroides, Leuconostoc pseudomesenteroides, etc. Our study provided a comprehensive picture of the microbial diversity associated with the rice wine fermentation starter and indicated the superiority of metagenomic sequencing over previously used techniques.

  6. Phytoplankton Diversity and Geologically Relevant Carbon: Using metagenomics to determine phytoplankton biomarker production

    NASA Astrophysics Data System (ADS)

    Kodner, R. B.; Armbrust, E.

    2008-12-01

    Phytoplankton play an important role in the global carbon cycle, on short and long time scales. On long time scales, organic carbon, especially recalcitrant forms of biomass such as lipids, can be preserved and thus sequestered in sediments and rocks on geologic time scales. If the preserved lipids have some taxonomic specificity, they can be used as fossil biomarkers to characterize the community of organisms that contributed to ancient carbon sinks. Currently, it is not well understood how well the complex mixture of organic compounds preserved in geological carbon sinks represents the original community that produced those molecules or how the diversity of organisms in a community is reflected in the lipid biomarkers they collectively synthesize. We have begun to investigate these questions by characterizing lipid biomarker production in modern phytoplankton communities with metagenomic data sets. Here we evaluate the information on community biomarker biosynthesis gathered from this type of data set using sterols as a case study. We have identified genes involved in sterol biosynthesis in a number of metagenomes and placed these genes in a phylogenetic context using a method designed to deal with short metagenomic sequences. The degree of taxonomic diversity of biomarker production measured with gene sequences can be more specific than lipid analysis alone.

  7. A retrospective metagenomics approach to studying Blastocystis.

    PubMed

    Andersen, Lee O'Brien; Bonde, Ida; Nielsen, Henrik Bjørn; Stensvold, Christen Rune

    2015-07-01

    Blastocystis is a common single-celled intestinal parasitic genus, comprising several subtypes. Here, we screened data obtained by metagenomic analysis of faecal DNA for Blastocystis by searching for subtype-specific genes in coabundance gene groups, which are groups of genes that covary across a selection of 316 human faecal samples, hence representing genes originating from a single subtype. The 316 faecal samples were from 236 healthy individuals, 13 patients with Crohn's disease (CD) and 67 patients with ulcerative colitis (UC). The prevalence of Blastocystis was 20.3% in the healthy individuals and 14.9% in patients with UC. Meanwhile, Blastocystis was absent in patients with CD. Individuals with intestinal microbiota dominated by Bacteroides were much less prone to having Blastocystis-positive stool (Matthews correlation coefficient = -0.25, P < 0.0001) than individuals with Ruminococcus- and Prevotella-driven enterotypes. This is the first study to investigate the relationship between Blastocystis and communities of gut bacteria using a metagenomics approach. The study serves as an example of how it is possible to retrospectively investigate microbial eukaryotic communities in the gut using metagenomic datasets targeting the bacterial component of the intestinal microbiome and the interplay between these microbial communities. © FEMS 2015. All rights reserved.
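
    The abstract above reports a Matthews correlation coefficient but not the underlying contingency table. As a minimal sketch of how such a coefficient is computed from a 2x2 table, the Python below uses counts that are purely hypothetical and are not taken from the study; the function name and argument names are illustrative only.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient for a 2x2 contingency table.

    tp/fp/fn/tn are the four cell counts; returns 0.0 when any
    marginal total is zero (the coefficient is undefined there).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, NOT from the study:
#   tp = Bacteroides-dominated and Blastocystis-positive
#   fp = Bacteroides-dominated and Blastocystis-negative
#   fn = other enterotype and Blastocystis-positive
#   tn = other enterotype and Blastocystis-negative
mcc = matthews_corrcoef(tp=5, fp=95, fn=55, tn=145)
print(f"MCC = {mcc:.2f}")
```

    A negative value, as in the abstract, indicates that the two binary traits tend not to co-occur; the sign depends only on whether tp*tn is smaller than fp*fn.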

  8. Identification and characterization of a cellulase-encoding gene from the buffalo rumen metagenomic library.

    PubMed

    Nguyen, Nhung Hong; Maruset, Lalita; Uengwetwanit, Tanaporn; Mhuantong, Wuttichai; Harnpicharnchai, Piyanun; Champreda, Verawat; Tanapongpipat, Sutipa; Jirajaroenrat, Kanya; Rakshit, Sudip K; Eurwilaichitr, Lily; Pongpattanakitshote, Somchai

    2012-01-01

    Microorganisms residing in the rumens of cattle represent a rich source of lignocellulose-degrading enzymes, since their diet consists of plant-based materials that are high in cellulose and hemicellulose. In this study, a metagenomic library was constructed from buffalo rumen contents using pCC1FOS fosmid vector. Ninety-three clones from the pooled library of approximately 10,000 clones showed degrading activity against AZCL-HE-Cellulose, whereas four other clones showed activity against AZCL-Xylan. Contig analysis of pyrosequencing data derived from the selected strongly positive clones revealed 15 ORFs that were closely related to lignocellulose-degrading enzymes belonging to several glycosyl hydrolase families. Glycosyl hydrolase family 5 (GHF5) was the most abundant glycosyl hydrolase found, and a majority of the GHF5s in our metagenomes were closely related to several ruminal bacteria, especially ones from other buffalo rumen metagenomes. Characterization of BT-01, a selected clone with highest cellulase activity from the primary plate screening assay, revealed a cellulase-encoding gene with optimal working conditions at pH 5.5 and 50 °C. Along with its stability over acidic pH, its capability to efficiently hydrolyze cellulose in feed for broiler chickens, as exhibited in an in vitro digestibility test, suggests that BT-01 has potential application as a feed supplement.

  9. Unsupervised discovery of microbial population structure within metagenomes using nucleotide base composition

    PubMed Central

    Saeed, Isaam; Tang, Sen-Lin; Halgamuge, Saman K.

    2012-01-01

    An approach to infer the unknown microbial population structure within a metagenome is to cluster nucleotide sequences based on common patterns in base composition, otherwise referred to as binning. When functional roles are assigned to the identified populations, a deeper understanding of microbial communities can be attained, more so than gene-centric approaches that explore overall functionality. In this study, we propose an unsupervised, model-based binning method with two clustering tiers, which uses a novel transformation of the oligonucleotide frequency-derived error gradient and GC content to generate coarse groups at the first tier of clustering; and tetranucleotide frequency to refine these groups at the secondary clustering tier. The proposed method has a demonstrated improvement over PhyloPythia, S-GSOM, TACOA and TaxSOM on all three benchmarks that were used for evaluation in this study. The proposed method is then applied to a pyrosequenced metagenomic library of mud volcano sediment sampled in southwestern Taiwan, with the inferred population structure validated against complementary sequencing of 16S ribosomal RNA marker genes. Finally, the proposed method was further validated against four publicly available metagenomes, including a highly complex Antarctic whale-fall bone sample, which was previously assumed to be too complex for binning prior to functional analysis. PMID:22180538
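
    The two base-composition features named in the abstract above, GC content and tetranucleotide frequency, can be sketched as follows. This only computes the raw features for one sequence; it is not the authors' two-tier method, and the function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_freqs(seq):
    """Relative frequency of each of the 256 possible 4-mers.

    Overlapping windows are counted; windows containing characters
    other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in kmers)
    return [counts[k] / total if total else 0.0 for k in kmers]


# Hypothetical contig used only to exercise the functions.
contig = "ACGTACGTGGCCNNACGT"
features = [gc_content(contig)] + tetranucleotide_freqs(contig)
print(len(features))
```

    Feature vectors like this one are what a composition-based binning method would cluster; the choice of clustering algorithm is left open here.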

  10. The evolution, diversity, and host associations of rhabdoviruses.

    PubMed

    Longdon, Ben; Murray, Gemma G R; Palmer, William J; Day, Jonathan P; Parker, Darren J; Welch, John J; Obbard, Darren J; Jiggins, Francis M

    2015-01-01

    Metagenomic studies are leading to the discovery of a hidden diversity of RNA viruses. These new viruses are poorly characterized and new approaches are needed to predict the host species these viruses pose a risk to. The rhabdoviruses are a diverse family of RNA viruses that includes important pathogens of humans, animals, and plants. We have discovered thirty-two new rhabdoviruses through a combination of our own RNA sequencing of insects and searching public sequence databases. Combining these with previously known sequences we reconstructed the phylogeny of 195 rhabdovirus sequences, and produced the most in-depth analysis of the family to date. In most cases we know nothing about the biology of the viruses beyond the host they were identified from, but our dataset provides a powerful phylogenetic approach to predict which are vector-borne viruses and which are specific to vertebrates or arthropods. By reconstructing ancestral and present host states we found that switches between major groups of hosts have occurred rarely during rhabdovirus evolution. This allowed us to propose seventy-six new likely vector-borne vertebrate viruses among viruses identified from vertebrates or biting insects. Based on currently available data, our analysis suggests it is likely there was a single origin of the known plant viruses and arthropod-borne vertebrate viruses, while vertebrate- and arthropod-specific viruses arose at least twice. There are also few transitions between aquatic and terrestrial ecosystems. Viruses also cluster together at a finer scale, with closely related viruses tending to be found in closely related hosts. Our data therefore suggest that throughout their evolution, rhabdoviruses have occasionally jumped between distantly related host species before spreading through related hosts in the same environment. This approach offers a way to predict the most probable biology and key traits of newly discovered viruses.

  11. Reconstructing the Genomic Content of Microbiome Taxa through Shotgun Metagenomic Deconvolution

    PubMed Central

    Carr, Rogan; Shen-Orr, Shai S.; Borenstein, Elhanan

    2013-01-01

    Metagenomics has transformed our understanding of the microbial world, allowing researchers to bypass the need to isolate and culture individual taxa and to directly characterize both the taxonomic and gene compositions of environmental samples. However, associating the genes found in a metagenomic sample with the specific taxa of origin remains a critical challenge. Existing binning methods, based on nucleotide composition or alignment to reference genomes allow only a coarse-grained classification and rely heavily on the availability of sequenced genomes from closely related taxa. Here, we introduce a novel computational framework, integrating variation in gene abundances across multiple samples with taxonomic abundance data to deconvolve metagenomic samples into taxa-specific gene profiles and to reconstruct the genomic content of community members. This assembly-free method is not bounded by various factors limiting previously described methods of metagenomic binning or metagenomic assembly and represents a fundamentally different approach to metagenomic-based genome reconstruction. An implementation of this framework is available at http://elbo.gs.washington.edu/software.html. We first describe the mathematical foundations of our framework and discuss considerations for implementing its various components. We demonstrate the ability of this framework to accurately deconvolve a set of metagenomic samples and to recover the gene content of individual taxa using synthetic metagenomic samples. We specifically characterize determinants of prediction accuracy and examine the impact of annotation errors on the reconstructed genomes. We finally apply metagenomic deconvolution to samples from the Human Microbiome Project, successfully reconstructing genus-level genomic content of various microbial genera, based solely on variation in gene count. These reconstructed genera are shown to correctly capture genus-specific properties. 
    With the accumulation of metagenomic data, this deconvolution framework provides an essential tool for characterizing microbial taxa never before seen, laying the foundation for addressing fundamental questions concerning the taxa comprising diverse microbial communities. PMID:24146609
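
    The core idea of the record above, relating per-sample gene abundance to per-sample taxonomic abundance, can be illustrated with a small least-squares sketch. It assumes a simple linear model (the gene's abundance in a sample is the sum over taxa of taxon abundance times that taxon's copy number) and is restricted to two taxa so that the normal equations can be solved by hand; it is not the authors' implementation, and all names and numbers are hypothetical.

```python
def deconvolve_gene(abundances, gene_totals):
    """Least-squares copy-number estimate for one gene in two taxa.

    abundances:  list of (a1, a2) taxon abundances, one pair per sample
    gene_totals: observed abundance of the gene in each sample
    Solves the 2x2 normal equations of  gene_total = a1*g1 + a2*g2.
    """
    s11 = sum(a1 * a1 for a1, _ in abundances)
    s12 = sum(a1 * a2 for a1, a2 in abundances)
    s22 = sum(a2 * a2 for _, a2 in abundances)
    b1 = sum(a1 * e for (a1, _), e in zip(abundances, gene_totals))
    b2 = sum(a2 * e for (_, a2), e in zip(abundances, gene_totals))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        # Taxon abundances do not vary independently across samples.
        raise ValueError("abundance profiles are collinear")
    g1 = (b1 * s22 - b2 * s12) / det
    g2 = (s11 * b2 - s12 * b1) / det
    return g1, g2


# Hypothetical data: three samples, gene present in two copies in
# taxon 1 and absent from taxon 2.
abundances = [(0.8, 0.2), (0.5, 0.5), (0.1, 0.9)]
gene_totals = [2 * a1 + 0 * a2 for a1, a2 in abundances]
g1, g2 = deconvolve_gene(abundances, gene_totals)
print(round(g1, 6), round(g2, 6))
```

    The estimate is only identifiable when taxon abundances vary across samples, which is why the sketch rejects collinear abundance profiles.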

  12. Characterization of chlorinated and chloraminated drinking water microbial communities in a distribution system simulator using pyrosequencing data analysis

    EPA Science Inventory

    The molecular analysis of drinking water microbial communities has focused primarily on 16S rRNA gene sequence analysis. Since this approach provides limited information on function potential of microbial communities, analysis of whole-metagenome pyrosequencing data was used to...

  13. Biogeographic patterns in ocean microbes emerge in a neutral agent-based model.

    PubMed

    Hellweger, Ferdi L; van Sebille, Erik; Fredrick, Neil D

    2014-09-12

    A key question in ecology and evolution is the relative role of natural selection and neutral evolution in producing biogeographic patterns. We quantify the role of neutral processes by simulating division, mutation, and death of 100,000 individual marine bacteria cells with full 1 million-base-pair genomes in a global surface ocean circulation model. The model is run for up to 100,000 years and output is analyzed using BLAST (Basic Local Alignment Search Tool) alignment and metagenomics fragment recruitment. Simulations show the production and maintenance of biogeographic patterns, characterized by distinct provinces subject to mixing and periodic takeovers by neighbors (coalescence), after which neutral evolution reestablishes the province and the patterns reorganize. The emergent patterns are substantial (e.g., down to 99.5% DNA identity between North and Central Pacific provinces) and suggest that microbes evolve faster than ocean currents can disperse them. This approach can also be used to explore environmental selection. Copyright © 2014, American Association for the Advancement of Science.
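
    A much-reduced sketch of the neutral process described above (division, mutation, and death with no selection) is given below. It is a well-mixed, Moran-style toy with a fixed population size and short genomes. The ocean circulation model, the 100,000 cells, and the 1 million-base-pair genomes of the study are not represented, and all names and parameter values are hypothetical.

```python
import random

def neutral_step(population, mutation_rate, rng, alphabet="ACGT"):
    """One neutral event: a random cell dies and a random cell divides."""
    dead = rng.randrange(len(population))
    parent = rng.randrange(len(population))
    genome = list(population[parent])
    for i in range(len(genome)):
        if rng.random() < mutation_rate:
            # Mutate to one of the three other bases.
            genome[i] = rng.choice(alphabet.replace(genome[i], ""))
    population[dead] = "".join(genome)  # the daughter fills the vacated slot

def identity(a, b):
    """Fraction of identical positions between two equal-length genomes."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

rng = random.Random(42)
cells = ["ACGT" * 25] * 50          # 50 identical 100-base genomes
for _ in range(2000):
    neutral_step(cells, mutation_rate=0.001, rng=rng)
pairwise = identity(cells[0], cells[1])
```

    Running two such populations with limited exchange between them, then comparing genomes with `identity`, would be one way to see divergence arise from drift alone.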

  14. i-rDNA: alignment-free algorithm for rapid in silico detection of ribosomal gene fragments from metagenomic sequence data sets.

    PubMed

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S

    2011-11-30

    Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/

  15. Microbial Metagenomics: Beyond the Genome

    NASA Astrophysics Data System (ADS)

    Gilbert, Jack A.; Dupont, Christopher L.

    2011-01-01

    Metagenomics literally means “beyond the genome.” Marine microbial metagenomic databases presently comprise ~400 billion base pairs of DNA, only ~3% of that found in 1 ml of seawater. Very soon a trillion-base-pair sequence run will be feasible, so it is time to reflect on what we have learned from metagenomics. We review the impact of metagenomics on our understanding of marine microbial communities. We consider the studies facilitated by data generated through the Global Ocean Sampling expedition, as well as the revolution wrought at the individual laboratory level through next generation sequencing technologies. We review recent studies and discoveries since 2008, provide a discussion of bioinformatic analyses, including conceptual pipelines and sequence annotation and predict the future of metagenomics, with suggestions of collaborative community studies tailored toward answering some of the fundamental questions in marine microbial ecology.

  16. Swine Fecal Metagenomics

    EPA Science Inventory

    Metagenomic approaches are providing rapid and more robust means to investigate the composition and functional genetic potential of complex microbial communities. In this study, we utilized a metagenomic approach to further understand the functional diversity of the swine gut. To...

  17. The rumen microbial metagenome associated with high methane production in cattle.

    PubMed

    Wallace, R John; Rooke, John A; McKain, Nest; Duthie, Carol-Anne; Hyslop, Jimmy J; Ross, David W; Waterhouse, Anthony; Watson, Mick; Roehe, Rainer

    2015-10-23

    Methane represents 16 % of total anthropogenic greenhouse gas emissions. It has been estimated that ruminant livestock produce ca. 29 % of this methane. As individual animals produce consistently different quantities of methane, understanding the basis for these differences may lead to new opportunities for mitigating ruminal methane emissions. Metagenomics is a powerful new tool for understanding the composition and function of complex microbial communities. Here we have applied metagenomics to the rumen microbial community to identify differences in the microbiota and metagenome that lead to high- and low-methane-emitting cattle phenotypes. Four pairs of beef cattle were selected for extreme high and low methane emissions from 72 animals, matched for breed (Aberdeen-Angus or Limousin cross) and diet (high or medium concentrate). Community analysis was carried out by qPCR of 16S and 18S rRNA genes and by alignment of Illumina HiSeq reads to the GREENGENES database. Total genomic reads were aligned to the KEGG genes database for functional analysis. Deep sequencing produced on average 11.3 Gb per sample. 16S rRNA gene abundances indicated that archaea, predominantly Methanobrevibacter, were 2.5× more numerous (P = 0.026) in high emitters, whereas among bacteria Proteobacteria, predominantly Succinivibrionaceae, were 4-fold less abundant (2.7 vs. 11.2 %; P = 0.002). KEGG analysis revealed that archaeal genes leading directly or indirectly to methane production were 2.7-fold more abundant in high emitters. Genes less abundant in high emitters included acetate kinase, electron transport complex proteins RnfC and RnfD and glucose-6-phosphate isomerase. Sequence data were assembled de novo and over 1.5 million proteins were annotated on the subsequent metagenome scaffolds. Less than half of the predicted genes matched a domain within Pfam.
Amongst 2774 identified proteins of the 20 KEGG orthologues that correlated with methane emissions, only 16 showed 100 % identity with a publicly available protein sequence. The abundance of archaeal genes in ruminal digesta correlated strongly with differing methane emissions from individual animals, a finding useful for genetic screening purposes. Lower emissions were accompanied by higher Succinivibrionaceae abundance and changes in acetate and hydrogen production leading to less methanogenesis, as similarly postulated for Australian macropods. Large numbers of predicted protein sequences differed between high- and low-methane-emitting cattle. Ninety-nine percent were unknown, indicating a fertile area for future exploitation.

  18. New Hydrocarbon Degradation Pathways in the Microbial Metagenome from Brazilian Petroleum Reservoirs

    PubMed Central

    Sierra-García, Isabel Natalia; Correa Alvarez, Javier; Pantaroto de Vasconcellos, Suzan; Pereira de Souza, Anete; dos Santos Neto, Eugenio Vaz; de Oliveira, Valéria Maia

    2014-01-01

    Current knowledge of the microbial diversity and metabolic pathways involved in hydrocarbon degradation in petroleum reservoirs is still limited, mostly due to the difficulty in recovering the complex community from such an extreme environment. Metagenomics is a valuable tool to investigate the genetic and functional diversity of previously uncultured microorganisms in natural environments. Using a function-driven metagenomic approach, we investigated the metabolic abilities of microbial communities in oil reservoirs. Here, we describe novel functional metabolic pathways involved in the biodegradation of aromatic compounds in a metagenomic library obtained from an oil reservoir. Although many of the deduced proteins shared homology with known enzymes of different well-described aerobic and anaerobic catabolic pathways, the metagenomic fragments did not contain the complete clusters known to be involved in hydrocarbon degradation. Instead, the metagenomic fragments comprised genes belonging to different pathways, showing novel gene arrangements. These results reinforce the potential of the metagenomic approach for the identification and elucidation of new genes and pathways in poorly studied environments and contribute to a broader perspective on the hydrocarbon degradation processes in petroleum reservoirs. PMID:24587220

  19. Metagenome assembly through clustering of next-generation sequencing data using protein sequences.

    PubMed

    Sim, Mikang; Kim, Jaebum

    2015-02-01

    The study of environmental microbial communities, called metagenomics, has gained a lot of attention because of the recent advances in next-generation sequencing (NGS) technologies. Microbes play a critical role in changing their environments, and the mode of their effect can be solved by investigating metagenomes. However, the difficulty of metagenomes, such as the combination of multiple microbes and different species abundance, makes metagenome assembly tasks more challenging. In this paper, we developed a new metagenome assembly method by utilizing protein sequences, in addition to the NGS read sequences. Our method (i) builds read clusters by using mapping information against available protein sequences, and (ii) creates contig sequences by finding consensus sequences through probabilistic choices from the read clusters. By using simulated NGS read sequences from real microbial genome sequences, we evaluated our method in comparison with four existing assembly programs. We found that our method could generate relatively long and accurate metagenome assemblies, indicating that the idea of using protein sequences, as a guide for the assembly, is promising. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. Comparative Viral Metagenomics of Environmental Samples from Korea

    PubMed Central

    Kim, Min-Soo; Whon, Tae Woong

    2013-01-01

    The introduction of metagenomics into the field of virology has facilitated the exploration of viral communities in various natural habitats. Understanding the viral ecology of a variety of sample types throughout the biosphere is important per se, but it also has potential applications in clinical and diagnostic virology. However, the procedures used by viral metagenomics may produce technical errors, such as amplification bias, while public viral databases are very limited, which may hamper the determination of the viral diversity in samples. This review considers the current state of viral metagenomics, based on examples from Korean viral metagenomic studies, i.e., rice paddy soil, fermented foods, human gut, seawater, and the near-surface atmosphere. Viral metagenomics has become widespread due to various methodological developments, and much attention has been focused on studies that consider the intrinsic role of viruses that interact with their hosts. PMID:24124407

  1. Metagenomic applications in environmental monitoring and bioremediation.

    PubMed

    Techtmann, Stephen M; Hazen, Terry C

    2016-10-01

    With the rapid advances in sequencing technology, the cost of sequencing has dramatically dropped and the scale of sequencing projects has increased accordingly. This has provided the opportunity for the routine use of sequencing techniques in the monitoring of environmental microbes. While metagenomic applications have been routinely applied to better understand the ecology and diversity of microbes, their use in environmental monitoring and bioremediation is increasingly common. In this review we seek to provide an overview of some of the metagenomic techniques used in environmental systems biology, addressing their application and limitation. We will also provide several recent examples of the application of metagenomics to bioremediation. We discuss examples where microbial communities have been used to predict the presence and extent of contamination, examples of how metagenomics can be used to characterize the process of natural attenuation by unculturable microbes, as well as examples detailing the use of metagenomics to understand the impact of biostimulation on microbial communities.

  2. Metagenome phylogenetic profiling of microbial community evolution in a tetrachloroethene-contaminated aquifer responding to enhanced reductive dechlorination protocols.

    PubMed

    Reiss, Rebecca A; Guerra, Peter; Makhnin, Oleg

    2016-01-01

    Chlorinated solvent contamination of potable water supplies is a serious problem worldwide. Biostimulation protocols can successfully remediate chlorinated solvent contamination through enhanced reductive dechlorination pathways; however, the process is poorly understood and sometimes stalls, creating a more serious problem. Whole metagenome techniques have the potential to reveal details of microbial community changes induced by biostimulation. Here we compare the metagenome of a tetrachloroethene-contaminated Environmental Protection Agency Superfund Site before and after the application of biostimulation protocols. Environmental DNA was extracted from uncultured microbes that were harvested by on-site filtration of groundwater one month prior to and five months after the injection of emulsified vegetable oil, nutrients, and hydrogen gas bioamendments. Paired-end libraries were prepared for high-throughput DNA sequencing and 90 basepairs from both ends of randomly fragmented 400 basepair DNA fragments were sequenced. Over 31 million reads were annotated with Metagenome Rapid Annotation using Subsystem Technology, representing 32 prokaryotic phyla, 869 genera, and 3,181 species. A 3.6 log2-fold increase in biomass, as measured by DNA yield per mL water, was observed, but there was a 9% decrease in the number of genera detected post-remediation. We apply Bayesian statistical methods to assign false discovery rates to fold-change abundance data and use Zipf's power law to filter genera with low read counts. Plotting the log-rank against the log-fold-change facilitates the visualization of the changes in the community in response to the enhanced reductive dechlorination protocol. Members of the Archaea domain increased 4.7 log2-fold, dominated by methanogens. Prior to remediation, classes Alphaproteobacteria and Betaproteobacteria dominated the community but exhibit significant decreases five months after biostimulation.
Geobacter and Sulfurospirillum replace "Sideroxydans" and Burkholderia as the most abundant genera. As a result of biostimulation, Deltaproteobacteria and Epsilonproteobacteria capable of dehalogenation, iron and sulfate reduction, and sulfur oxidation increase. Matches to thermophilic, haloalkane-respiring archaea are evidence for additional species involved in biodegradation of chlorinated solvents. Additionally, potentially pathogenic bacteria increase, indicating that there may be unintended consequences of bioremediation.
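
    The log-rank versus log-fold-change view mentioned in this abstract can be sketched as follows. The read counts are hypothetical, the pseudocount is an added assumption to avoid division by zero, and the fixed `min_reads` cutoff is only a stand-in for the Zipf-based filtering the authors describe. The Bayesian false discovery rates are not attempted here.

```python
import math

def log2_fold_change(before, after, pseudocount=1.0):
    """log2 of the after/before ratio; the pseudocount avoids division by zero."""
    return math.log2((after + pseudocount) / (before + pseudocount))

def rank_vs_fold(before, after, min_reads=10):
    """Rank genera by total reads and pair log10(rank) with log2 fold change."""
    genera = set(before) | set(after)
    totals = {g: before.get(g, 0) + after.get(g, 0) for g in genera}
    # Drop low-count genera, then rank the rest from most to least abundant.
    kept = sorted((g for g in genera if totals[g] >= min_reads),
                  key=lambda g: (-totals[g], g))
    return [(g, math.log10(rank),
             log2_fold_change(before.get(g, 0), after.get(g, 0)))
            for rank, g in enumerate(kept, start=1)]

# Hypothetical read counts, not taken from the study.
pre = {"Burkholderia": 799, "Geobacter": 49, "RareGenus": 2}
post = {"Burkholderia": 99, "Geobacter": 799, "RareGenus": 1}
points = rank_vs_fold(pre, post)
```

    For scale, a log2 fold change of 3.6 corresponds to roughly a 12-fold increase (2 raised to the power 3.6), and 4.7 to roughly a 26-fold increase.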

  3. Centrifuge: rapid and sensitive classification of metagenomic sequences.

    PubMed

    Kim, Daehwan; Song, Li; Breitwieser, Florian P; Salzberg, Steven L

    2016-12-01

    Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process the millions of reads from a typical high-throughput DNA sequencing run within a few minutes. Together, these advances enable timely and accurate analysis of large metagenomics data sets on conventional desktop computers. Because of its space-optimized indexing schemes, Centrifuge also makes it possible to index the entire NCBI nonredundant nucleotide sequence database (a total of 109 billion bases) with an index size of 69 GB, in contrast to k-mer-based indexing schemes, which require far more extensive space. © 2016 Kim et al.; Published by Cold Spring Harbor Laboratory Press.
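
    For readers unfamiliar with the indexing scheme named above, the following toy shows how a Burrows-Wheeler transform and FM index support exact-match counting by backward search. It is a textbook sketch with naive suffix sorting and an uncompressed occurrence table. It does not reflect Centrifuge's actual data structures, compression, or classification logic, and the function names are hypothetical.

```python
from collections import Counter

def build_fm_index(text):
    """Build a toy FM index (C table and Occ table) for text."""
    text += "$"  # sentinel, assumed absent from text and smaller than any base
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    # first[c]: number of characters in text strictly smaller than c.
    counts = Counter(text)
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (c == ch)
    return first, occ, len(bwt)

def count_matches(index, pattern):
    """Count exact occurrences of pattern using backward search."""
    first, occ, n = index
    lo, hi = 0, n  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo

index = build_fm_index("ACGTACGTAC")
hits = count_matches(index, "ACG")
```

    Each pattern character costs a constant number of table lookups, so query time depends on read length rather than on the size of the indexed genomes.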

  4. Metagenomic Analyses Reveal That Energy Transfer Gene Abundances Can Predict the Syntrophic Potential of Environmental Microbial Communities

    PubMed Central

    Oberding, Lisa; Gieg, Lisa M.

    2016-01-01

    Hydrocarbon compounds can be biodegraded by anaerobic microorganisms to form methane through an energetically interdependent metabolic process known as syntrophy. The microorganisms that perform this process as well as the energy transfer mechanisms involved are difficult to study and thus are still poorly understood, especially on an environmental scale. Here, metagenomic data was analyzed for specific clusters of orthologous groups (COGs) related to key energy transfer genes thus far identified in syntrophic bacteria, and principal component analysis was used in order to determine whether potentially syntrophic environments could be distinguished using these syntroph related COGs as opposed to universally present COGs. We found that COGs related to hydrogenase and formate dehydrogenase genes were able to distinguish known syntrophic consortia and environments with the potential for syntrophy from non-syntrophic environments, indicating that these COGs could be used as a tool to identify syntrophic hydrocarbon biodegrading environments using metagenomic data. PMID:27681901
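
    The principal component analysis step can be sketched on a small table of COG abundances. The numbers below are invented for illustration: rows are samples, and the columns stand for a hydrogenase COG, a formate dehydrogenase COG, and a universally present COG. The SVD-based `pca_scores` helper is an assumption about how such an analysis might be run, not the authors' workflow.

```python
import numpy as np

def pca_scores(abundance, n_components=2):
    """Project samples (rows) onto the leading principal components."""
    centered = abundance - abundance.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    return u[:, :n_components] * s[:n_components], explained[:n_components]

# Hypothetical abundances: first three rows syntrophic, last three not.
cog_table = np.array([
    [9.0, 8.0, 5.0],
    [10.0, 9.0, 5.0],
    [8.0, 9.0, 5.0],
    [1.0, 2.0, 5.0],
    [2.0, 1.0, 5.0],
    [1.0, 1.0, 5.0],
])
scores, explained = pca_scores(cog_table)
```

    In this toy table the universally present COG is constant and contributes no variance, so the first component is driven entirely by the two syntrophy-related columns and separates the two groups of samples by sign.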

  5. International Standards for Genomes, Transcriptomes, and Metagenomes

    PubMed Central

    Mason, Christopher E.; Afshinnekoo, Ebrahim; Tighe, Scott; Wu, Shixiu; Levy, Shawn

    2017-01-01

    Challenges and biases in preparing, characterizing, and sequencing DNA and RNA can have significant impacts on research in genomics across all kingdoms of life, including experiments in single-cells, RNA profiling, and metagenomics (across multiple genomes). Technical artifacts and contamination can arise at each point of sample manipulation, extraction, sequencing, and analysis. Thus, the measurement and benchmarking of these potential sources of error are of paramount importance as next-generation sequencing (NGS) projects become more global and ubiquitous. Fortunately, a variety of methods, standards, and technologies have recently emerged that improve measurements in genomics and sequencing, from the initial input material to the computational pipelines that process and annotate the data. Here we review current standards and their applications in genomics, including whole genomes, transcriptomes, mixed genomic samples (metagenomes), and the modified bases within each (epigenomes and epitranscriptomes). These standards, tools, and metrics are critical for quantifying the accuracy of NGS methods, which will be essential for robust approaches in clinical genomics and precision medicine. PMID:28337071

  6. Comparative Metagenomics of the Polymicrobial Black Band Disease of Corals

    PubMed Central

    Meyer, Julie L.; Paul, Valerie J.; Raymundo, Laurie J.; Teplitski, Max

    2017-01-01

    Black Band Disease (BBD), the destructive microbial consortium dominated by the cyanobacterium Roseofilum reptotaenium, affects corals worldwide. While the taxonomic composition of BBD consortia has been well-characterized, substantially less is known about its functional repertoire. We sequenced the metagenomes of Caribbean and Pacific black band mats and cultured Roseofilum and obtained five metagenome-assembled genomes (MAGs) of Roseofilum, nine of Proteobacteria, and 12 of Bacteroidetes. Genomic content analysis suggests that Roseofilum is a source of organic carbon and nitrogen, as well as natural products that may influence interactions between microbes. Proteobacteria and Bacteroidetes members of the disease consortium are suited to the degradation of amino acids, proteins, and carbohydrates. The accumulation of sulfide underneath the black band mat, in part due to a lack of sulfur oxidizers, contributes to the lethality of the disease. The presence of sulfide:quinone oxidoreductase genes in all five Roseofilum MAGs and in the MAGs of several heterotrophs demonstrates that resistance to sulfide is an important characteristic for members of the BBD consortium. PMID:28458657

  7. Dip in the gene pool: metagenomic survey of natural coccolithovirus communities.

    PubMed

    Pagarete, António; Kusonmano, Kanthida; Petersen, Kjell; Kimmance, Susan A; Martínez Martínez, Joaquín; Wilson, William H; Hehemann, Jan-Hendrik; Allen, Michael J; Sandaa, Ruth-Anne

    2014-10-01

    Despite the global oceanic distribution and recognised biogeochemical impact of coccolithoviruses (EhV), their diversity remains poorly understood. Here we employed a metagenomic approach to study the occurrence and progression of natural EhV community genomic variability. Analysis of EhV metagenomes from the early and late stages of an induced bloom led to three main discoveries. First, we observed resilient and specific genomic signatures in the EhV community associated with the Norwegian coast, which reinforce the existence of limitations to the capacity of dispersal and genomic exchange among EhV populations. Second, we identified a hyper-variable region (approximately 21kbp long) in the coccolithovirus genome. Third, we observed a clear trend for EhV relative amino-acid diversity to reduce from early to late stages of the bloom. This study validated two new methodological combinations, and proved very useful in the discovery of new genomic features associated with coccolithovirus natural communities. Copyright © 2014 Elsevier Inc. All rights reserved.

  8. Subsurface metabolic potential on the Costa Rican Margin

    NASA Astrophysics Data System (ADS)

    Biddle, J.; Leon, Z. R.; Martino, A. J.; Bousses, K.; House, C. H.

    2017-12-01

    The distribution of archaea and bacteria and their associated metabolic abilities in the deep subseafloor are poorly understood. In order to explore this, we focused on samples from the Costa Rica margin IODP Expedition 334. The microbial community was analyzed via metagenomics in two different sites at multiple depths. At Site 1378, samples are from 2 meters below the sea floor (mbsf), 33 mbsf and 93 mbsf, and at Site 1379 from 22 mbsf to 45 mbsf. Whole community analysis of conserved gene markers in the metagenome shows that the microbial community varies with depth, and drastically differs between the two geographically close sites. Thirty-two genomes were recovered from the metagenomic data with more than 30% completion. Archaea make up 49% of all genomes recovered and over 90% of these recovered genomes belong to recently discovered and poorly characterized groups of Archaea. This study explored the relative dynamics of microbial communities in the deep biosphere and presents the metabolic potential of distinct subsurface biosphere archaeal groups.

  9. Methanogenic pathways in Alaskan peatlands at different trophic levels with evidence from stable isotope ratios and metagenomics

    NASA Astrophysics Data System (ADS)

    Zhang, L.; Liu, X.; Langford, L.; Chanton, J.; Roth, S.; Schaefer, J.; Barkay, T.; Hines, M. E.

    2017-12-01

    To better constrain the large uncertainties in emission fluxes, it is necessary to improve the understanding of methanogenic pathways in northern peatlands with heterogeneous surface vegetation and pH. Surface vegetation is an excellent indicator of porewater pH, which heavily influences the microbial communities in peatlands. Stable C isotope ratios (d13C) have been used as a robust tool to distinguish methanogenic pathways, especially in conjunction with metagenomic analysis of the microbial communities. To link surface vegetation species compositions, pH, microbial communities, and methanogenic pathways, 15 peatland sites were studied in Fairbanks and Anchorage, Alaska in the summer of 2014. These sites were ordinated using multiple factor analysis into 3 clusters based on pH, temp, CH4 and volatile fatty acid production rates, d13C values, and surface vegetation composition. In the ombrotrophic group (pH 3.3), various Sphagna species dominated, but the vegetation also included shrubs Ledum decumbens and Eriophorum vaginatum. Primary fermentation rates were slow with no CH4 detected. The fen cluster (pH 5.3) was dominated by various Carex species, and CH4 production rates were lower than those in the intermediate cluster but more enriched in 13C (-49‰). Methanosaeta and Methanosarcina were the dominant methanogens. In the intermediate trophic level (pH 4.7), Sphagnum squarrosum and Carex aquatilis were abundant. The same methanogens as in the fen cluster also dominated this group, but with higher abundances, which, in part, led to the higher CH4 production rates in this cluster. The syntrophs Syntrophobacter and Pelobacter were also more abundant than in the fen sites, which may explain the d13CH4 values that were the lightest among the three clusters (-54‰).
The high methanogenic potential in the intermediate trophic sites warrants further study since they are not only present in large areas currently, but also represent the transient stage during the evolution from bog to fen in projected climate change scenarios.

  10. ELIXIR pilot action: Marine metagenomics - towards a domain specific set of sustainable services.

    PubMed

    Robertsen, Espen Mikal; Denise, Hubert; Mitchell, Alex; Finn, Robert D; Bongo, Lars Ailo; Willassen, Nils Peder

    2017-01-01

    Metagenomics, the study of genetic material recovered directly from environmental samples, has the potential to provide insight into the structure and function of heterogeneous microbial communities. There has been an increased use of metagenomics to discover and understand the diverse biosynthetic capacities of marine microbes, thereby allowing them to be exploited for industrial, food, and health care products. This ELIXIR pilot action was motivated by the need to establish dedicated data resources and harmonized metagenomics pipelines for the marine domain, in order to enhance the exploration and exploitation of marine genetic resources. In this paper, we summarize some of the results from the ELIXIR pilot action "Marine metagenomics - towards user centric services".

  11. Metagenomic analysis of microbial consortium from natural crude oil that seeps into the marine ecosystem offshore Southern California

    PubMed Central

    Hawley, Erik R.; Piao, Hailan; Scott, Nicole M.; Malfatti, Stephanie; Pagani, Ioanna; Huntemann, Marcel; Chen, Amy; Glavina del Rio, Tijana; Foster, Brian; Copeland, Alex; Jansson, Janet; Pati, Amrita; Tringe, Susannah; Gilbert, Jack A.; Lorenson, Thomas D.; Hess, Matthias

    2014-01-01

    Crude oils can be major contaminants of the marine ecosystem and microorganisms play a significant role in the degradation of their main constituents. To increase our understanding of the microbial hydrocarbon degradation process in the marine ecosystem, we collected crude oil from an active seep area located in the Santa Barbara Channel (SBC) and generated a total of about 52 Gb of raw metagenomic sequence data. The assembled data comprised ~500 Mb, representing ~1.1 million genes derived primarily from chemolithoautotrophic bacteria. Members of Oceanospirillales, a bacterial order belonging to the Deltaproteobacteria, recruited less than 2% of the assembled genes within the SBC metagenome. In contrast, the microbial community associated with the oil plume that developed in the aftermath of the Deepwater Horizon (DWH) blowout in 2010 was dominated by Oceanospirillales, which comprised more than 60% of the metagenomic data generated from the DWH oil plume. This suggests that Oceanospirillales might play a less significant role in the microbially mediated hydrocarbon conversion within the SBC seep oil compared to the DWH plume oil. We hypothesize that this difference results from the SBC oil seep being mostly anaerobic, while the DWH oil plume is aerobic. Within the Archaea, the phylum Euryarchaeota recruited more than 95% of the assembled archaeal sequences from the SBC oil seep metagenome, with more than 50% of the sequences assigned to members of the orders Methanomicrobiales and Methanosarcinales. These orders contain organisms capable of anaerobic methanogenesis and methane oxidation (AOM) and we hypothesize that these orders – and their metabolic capabilities – may be fundamental to the ecology of the SBC oil seep. PMID:25197496

  12. Enrichment allows identification of diverse, rare elements in metagenomic resistome-virulome sequencing.

    PubMed

    Noyes, Noelle R; Weinroth, Maggie E; Parker, Jennifer K; Dean, Chris J; Lakin, Steven M; Raymond, Robert A; Rovira, Pablo; Doster, Enrique; Abdo, Zaid; Martin, Jennifer N; Jones, Kenneth L; Ruiz, Jaime; Boucher, Christina A; Belk, Keith E; Morley, Paul S

    2017-10-17

    Shotgun metagenomic sequencing is increasingly utilized as a tool to evaluate ecological-level dynamics of antimicrobial resistance and virulence, in conjunction with microbiome analysis. Interest in use of this method for environmental surveillance of antimicrobial resistance and pathogenic microorganisms is also increasing. In published metagenomic datasets, the total of all resistance- and virulence-related sequences accounts for < 1% of all sequenced DNA, leading to limitations in detection of low-abundance resistome-virulome elements. This study describes the extent and composition of the low-abundance portion of the resistome-virulome, using a bait-capture and enrichment system that incorporates unique molecular indices to count DNA molecules and correct for enrichment bias. The use of the bait-capture and enrichment system significantly increased on-target sequencing of the resistome-virulome, enabling detection of an additional 1441 gene accessions and revealing a low-abundance portion of the resistome-virulome that was more diverse and compositionally different than that detected by more traditional metagenomic assays. The low-abundance portion of the resistome-virulome also contained resistance genes with public health importance, such as extended-spectrum betalactamases, that were not detected using traditional shotgun metagenomic sequencing. In addition, the use of the bait-capture and enrichment system enabled identification of rare resistance gene haplotypes that were used to discriminate between sample origins. These results demonstrate that the rare resistome-virulome contains valuable and unique information that can be utilized for both surveillance and population genetic investigations of resistance. Access to the rare resistome-virulome using the bait-capture and enrichment system validated in this study can greatly advance our understanding of microbiome-resistome dynamics.
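
    The role of unique molecular indices in counting DNA molecules can be shown with a minimal sketch: reads that share a gene and an index are treated as copies of one original molecule, so enrichment or amplification duplicates are not counted twice. The read tuples and index sequences are hypothetical, and exact index matching (with no allowance for sequencing errors in the index) is a simplifying assumption rather than the study's pipeline.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct molecular indices per gene.

    reads: iterable of (gene, index) pairs, one per sequenced read.
    Returns {gene: number of distinct indices observed}.
    """
    indices = defaultdict(set)
    for gene, index in reads:
        indices[gene].add(index)  # duplicate reads of one molecule collapse
    return {gene: len(tags) for gene, tags in indices.items()}

# Hypothetical reads: the first two are duplicates of the same molecule.
reads = [
    ("blaCTX-M", "AACGT"),
    ("blaCTX-M", "AACGT"),
    ("blaCTX-M", "TTGCA"),
    ("tetM", "GGATC"),
]
molecules = count_molecules(reads)
```

    Comparing the raw read count per gene with the molecule count gives a rough sense of how much of the signal comes from enrichment rather than from distinct template molecules.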

  13. Microbiota composition, gene pool and its expression in Gir cattle (Bos indicus) rumen under different forage diets using metagenomic and metatranscriptomic approaches.

    PubMed

    Pandit, Ramesh J; Hinsu, Ankit T; Patel, Shriram H; Jakhesara, Subhash J; Koringa, Prakash G; Bruno, Fosso; Psifidi, Androniki; Shah, S V; Joshi, Chaitanya G

    2018-03-09

    Zebu (Bos indicus) is a domestic cattle species originating from the Indian subcontinent and now widely domesticated on several continents. In this study, we were particularly interested in understanding the functionally active rumen microbiota of an important Zebu breed, the Gir, under different dietary regimes. Metagenomic and metatranscriptomic data were compared at various taxonomic levels to elucidate the differential microbial population and its functional dynamics in Gir cattle rumen under different roughage dietary regimes. Different proportions of roughage rather than the type of roughage (dry or green) modulated microbiome composition and the expression of its gene pool. Fibre degrading bacteria (i.e. Clostridium, Ruminococcus, Eubacterium, Butyrivibrio, Bacillus and Roseburia) were higher in the solid fraction of rumen (P<0.01) compared to the liquid fraction, whereas bacteria considered to be utilizers of the degraded product (i.e. Prevotella, Bacteroides, Parabacteroides, Paludibacter and Victivallis) were dominant in the liquid fraction (P<0.05). Likewise, expression of fibre degrading enzymes and related carbohydrate binding modules (CBMs) occurred in the solid fraction. When metagenomic and metatranscriptomic data were compared, it was found that some genera and species were transcriptionally more active, although they were in low abundance, making an important contribution to fibre degradation and its further metabolism in the rumen. This study also identified some of the transcriptionally active genera, such as Caldicellulosiruptor and Paludibacter, whose potential has been less-explored in rumen. Overall, the comparison of metagenomic shotgun and metatranscriptomic sequencing appeared to be a much richer source of information compared to conventional metagenomic analysis. Copyright © 2018 Elsevier GmbH. All rights reserved.

  14. Elucidation of taste- and odor-producing bacteria and toxigenic cyanobacteria in a Midwestern drinking water supply reservoir by shotgun metagenomics analysis

    USGS Publications Warehouse

    Otten, Timothy; Graham, Jennifer L.; Harris, Theodore D.; Dreher, Theo

    2016-01-01

    While commonplace in clinical settings, DNA-based assays for identification or enumeration of drinking water pathogens and other biological contaminants remain widely unadopted by the monitoring community. In this study, shotgun metagenomics was used to identify taste-and-odor producers and toxin-producing cyanobacteria over a 2-year period in a drinking water reservoir. The sequencing data implicated several cyanobacteria, including Anabaena spp., Microcystis spp., and an unresolved member of the order Oscillatoriales as the likely principal producers of geosmin, microcystin, and 2-methylisoborneol (MIB), respectively. To further demonstrate this, quantitative PCR (qPCR) assays targeting geosmin-producing Anabaena and microcystin-producing Microcystis were utilized, and these data were fitted using generalized linear models and compared with routine monitoring data, including microscopic cell counts, sonde-based physicochemical analyses, and assays of all inorganic and organic nitrogen and phosphorus forms and fractions. The qPCR assays explained the greatest variation in observed geosmin (adjusted R2 = 0.71) and microcystin (adjusted R2 = 0.84) concentrations over the study period, highlighting their potential for routine monitoring applications. The origin of the monoterpene cyclase required for MIB biosynthesis was putatively linked to a periphytic cyanobacterial mat attached to the concrete drinking water inflow structure. We conclude that shotgun metagenomics can be used to identify microbial agents involved in water quality deterioration and to guide PCR assay selection or design for routine monitoring purposes. Finally, we offer estimates of microbial diversity and metagenomic coverage of our data sets for reference to others wishing to apply shotgun metagenomics to other lacustrine systems.

  15. Screening Currency Notes for Microbial Pathogens and Antibiotic Resistance Genes Using a Shotgun Metagenomic Approach

    PubMed Central

    Jalali, Saakshi; Kohli, Samantha; Latka, Chitra; Bhatia, Sugandha; Vellarikal, Shamsudheen Karuthedath; Sivasubbu, Sridhar; Scaria, Vinod; Ramachandran, Srinivasan

    2015-01-01

    Fomites are a well-known source of microbial infections and previous studies have provided insights into the sojourning microbiome of fomites from various sources. Paper currency notes are among the most commonly exchanged objects and their potential to transmit pathogenic organisms has been well recognized. Approaches to identify the microbiome associated with paper currency notes have been largely limited to culture dependent approaches. Subsequent studies portrayed the use of 16S ribosomal RNA based approaches which provided insights into the taxonomical distribution of the microbiome. However, recent techniques including shotgun sequencing provide resolution at gene level and enable estimation of their copy numbers in the metagenome. We investigated the microbiome of Indian paper currency notes using a shotgun metagenome sequencing approach. Metagenomic DNA isolated from samples of frequently circulated denominations of Indian currency notes was sequenced using an Illumina Hiseq sequencer. Analysis of the data revealed presence of species belonging to both eukaryotic and prokaryotic genera. The taxonomic distribution at kingdom level revealed contigs mapping to eukaryota (70%), bacteria (9%), viruses and archaea (~1%). We identified 78 pathogens including Staphylococcus aureus, Corynebacterium glutamicum, Enterococcus faecalis, and 75 cellulose degrading organisms including Acidothermus cellulolyticus, Cellulomonas flavigena and Ruminococcus albus. Additionally, 78 antibiotic resistance genes were identified and 18 of these were found in all the samples. Furthermore, six out of 78 pathogens harbored at least one of the 18 common antibiotic resistance genes. To the best of our knowledge, this is the first report of a shotgun metagenome sequence dataset of paper currency notes, which can be useful for future applications including bio-surveillance of exchangeable fomites for infectious agents. PMID:26035208

  16. Metagenomic Sequencing for Surveillance of Food- and Waterborne Viral Diseases.

    PubMed

    Nieuwenhuijse, David F; Koopmans, Marion P G

    2017-01-01

    A plethora of viruses can be transmitted by the food- and waterborne route. However, their recognition is challenging because of the variety of viruses, heterogeneity of symptoms, the lack of awareness of clinicians, and limited surveillance efforts. Classical food- and waterborne viral disease outbreaks are mainly caused by caliciviruses, but the source of the virus is often not known and the foodborne mode of transmission is difficult to discriminate from human-to-human transmission. Atypical food- and waterborne viral disease can be caused by viruses such as hepatitis A and hepatitis E. In addition, a source of novel emerging viruses with a potential to spread via the food- and waterborne route is the repeated interaction of humans with wildlife. Wildlife-to-human adaptation may give rise to self-limiting outbreaks in some cases, but when fully adjusted to the human host can be devastating. Metagenomic sequencing has been investigated as a promising solution for surveillance purposes as it detects all viruses in a single protocol, delivers additional genomic information for outbreak tracing, and detects novel unknown viruses. Nevertheless, several issues must be addressed to apply metagenomic sequencing in surveillance. First, sample preparation is difficult since the genomic material of viruses is generally overshadowed by host- and bacterial genomes. Second, several data analysis issues hamper the efficient, robust, and automated processing of metagenomic data. Third, interpretation of metagenomic data is hard, because of the lack of general knowledge of the virome in the food chain and the environment. Further developments in virus-specific nucleic acid extraction methods, bioinformatic data processing applications, and unifying data visualization tools are needed to gain insightful surveillance knowledge from suspect food samples.

  17. Metagenomic Sequencing for Surveillance of Food- and Waterborne Viral Diseases

    PubMed Central

    Nieuwenhuijse, David F.; Koopmans, Marion P. G.

    2017-01-01

    A plethora of viruses can be transmitted by the food- and waterborne route. However, their recognition is challenging because of the variety of viruses, heterogeneity of symptoms, the lack of awareness of clinicians, and limited surveillance efforts. Classical food- and waterborne viral disease outbreaks are mainly caused by caliciviruses, but the source of the virus is often not known and the foodborne mode of transmission is difficult to discriminate from human-to-human transmission. Atypical food- and waterborne viral disease can be caused by viruses such as hepatitis A and hepatitis E. In addition, a source of novel emerging viruses with a potential to spread via the food- and waterborne route is the repeated interaction of humans with wildlife. Wildlife-to-human adaptation may give rise to self-limiting outbreaks in some cases, but when fully adjusted to the human host can be devastating. Metagenomic sequencing has been investigated as a promising solution for surveillance purposes as it detects all viruses in a single protocol, delivers additional genomic information for outbreak tracing, and detects novel unknown viruses. Nevertheless, several issues must be addressed to apply metagenomic sequencing in surveillance. First, sample preparation is difficult since the genomic material of viruses is generally overshadowed by host- and bacterial genomes. Second, several data analysis issues hamper the efficient, robust, and automated processing of metagenomic data. Third, interpretation of metagenomic data is hard, because of the lack of general knowledge of the virome in the food chain and the environment. Further developments in virus-specific nucleic acid extraction methods, bioinformatic data processing applications, and unifying data visualization tools are needed to gain insightful surveillance knowledge from suspect food samples. PMID:28261185

  18. Metagenomic evidence for reciprocal particle exchange between the mainstem estuary and lateral bay sediments of the lower Columbia River

    PubMed Central

    Smith, Maria W.; Davis, Richard E.; Youngblut, Nicholas D.; Kärnä, Tuomas; Herfort, Lydie; Whitaker, Rachel J.; Metcalf, William W.; Tebo, Bradley M.; Baptista, António M.; Simon, Holly M.

    2015-01-01

    Lateral bays of the lower Columbia River estuary are areas of enhanced water retention that influence net ecosystem metabolism through activities of their diverse microbial communities. Metagenomic characterization of sediment microbiota from three disparate sites in two brackish lateral bays (Baker and Youngs) produced ∼100 Gbp of DNA sequence data analyzed subsequently for predicted SSU rRNA and peptide-coding genes. The metagenomes were dominated by Bacteria. A large component of Eukaryota was present in Youngs Bay samples, i.e., the inner bay sediment was enriched with the invasive New Zealand mudsnail, Potamopyrgus antipodarum, known for high ammonia production. The metagenome was also highly enriched with an archaeal ammonia oxidizer closely related to Nitrosoarchaeum limnia. Combined analysis of sequences and continuous, high-resolution time series of biogeochemical data from fixed and mobile platforms revealed the importance of large-scale reciprocal particle exchanges between the mainstem estuarine water column and lateral bay sediments. Deposition of marine diatom particles in sediments near Youngs Bay mouth was associated with a dramatic enrichment of Bacteroidetes (58% of total Bacteria) and corresponding genes involved in phytoplankton polysaccharide degradation. The Baker Bay sediment metagenome contained abundant Archaea, including diverse methanogens, as well as functional genes for methylotrophy and taxonomic markers for syntrophic bacteria, suggesting that active methane cycling occurs at this location. Our previous work showed enrichments of similar anaerobic taxa in particulate matter of the mainstem estuarine water column. In total, our results identify the lateral bays as both sources and sinks of biogenic particles significantly impacting microbial community composition and biogeochemical activities in the estuary. PMID:26483785

  19. Metagenomic analysis of microbial consortium from natural crude oil that seeps into the marine ecosystem offshore Southern California

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hawley, Erik R.; Piao, Hailan; Scott, Nicole M.

    2014-01-02

    Crude oils can be major contaminants of the marine ecosystem and microorganisms play a significant role in the degradation of the main constituents of crude oil. To increase our understanding of the microbial hydrocarbon degradation process in the marine ecosystem, we collected crude oil from an active seep area located in the Santa Barbara Channel (SBC) and generated a total of about 52 Gb of raw metagenomic sequence data. The assembled data comprised ~500 Mb, representing ~1.1 million genes derived primarily from chemolithoautotrophic bacteria. Members of Oceanospirillales, a bacterial order belonging to the Deltaproteobacteria, recruited less than 2% of the assembled genes within the SBC metagenome. In contrast, the microbial community associated with the oil plume that developed in the aftermath of the Deepwater Horizon (DWH) blowout in 2010, was dominated by Oceanospirillales, which comprised more than 60% of the metagenomic data generated from the DWH oil plume. This suggests that Oceanospirillales might play a less significant role in the microbially mediated hydrocarbon conversion within the SBC seep oil compared to the DWH plume oil. We hypothesize that this difference results from the SBC oil seep being mostly anaerobic, while the DWH oil plume is aerobic. Within the Archaea, the phylum Euryarchaeota, recruited more than 95% of the assembled archaeal sequences from the SBC oil seep metagenome, with more than 50% of the sequences assigned to members of the orders Methanomicrobiales and Methanosarcinales. These orders contain organisms capable of anaerobic methanogenesis and methane oxidation (AOM) and we hypothesize that these orders and their metabolic capabilities may be fundamental to the ecology of the SBC oil seep.

  20. Phylogenetically Novel LuxI/LuxR-Type Quorum Sensing Systems Isolated Using a Metagenomic Approach

    PubMed Central

    Nasuno, Eri; Fujita, Masaki J.; Nakatsu, Cindy H.; Kamagata, Yoichi; Hanada, Satoshi

    2012-01-01

    A great deal of research has been done to understand bacterial cell-to-cell signaling systems, but there is still a large gap in our current knowledge because the majority of microorganisms in natural environments do not have cultivated representatives. Metagenomics is one approach to identify novel quorum sensing (QS) systems from uncultured bacteria in environmental samples. In this study, fosmid metagenomic libraries were constructed from a forest soil and an activated sludge from a coke plant, and the target genes were detected using a green fluorescent protein (GFP)-based Escherichia coli biosensor strain whose fluorescence was screened by spectrophotometry. DNA sequence analysis revealed two pairs of new LuxI family N-acyl-l-homoserine lactone (AHL) synthases and LuxR family transcriptional regulators (clones N16 and N52, designated AubI/AubR and AusI/AusR, respectively). AubI and AusI each produced an identical AHL, N-dodecanoyl-l-homoserine lactone (C12-HSL), as determined by nuclear magnetic resonance (NMR) and mass spectrometry. Phylogenetic analysis based on amino acid sequences suggested that AusI/AusR was from an uncultured member of the Betaproteobacteria and AubI/AubR was very deeply branched from previously described LuxI/LuxR homologues in isolates of the Proteobacteria. The phylogenetic position of AubI/AubR indicates that they represent a QS system not acquired recently from the Proteobacteria by horizontal gene transfer but share a more ancient ancestry. We demonstrated that metagenomic screening is useful to provide further insight into the phylogenetic diversity of bacterial QS systems by describing two new LuxI/LuxR-type QS systems from uncultured bacteria. PMID:22983963

  1. Metagenomic Analysis of Antibiotic Resistance Genes in Dairy Cow Feces following Therapeutic Administration of Third Generation Cephalosporin

    PubMed Central

    Ray, Partha; Zhang, Tong; Pruden, Amy; Strickland, Michael; Knowlton, Katharine

    2015-01-01

    Although dairy manure is widely applied to land, it is relatively understudied compared to other livestock as a potential source of antibiotic resistance genes (ARGs) to the environment and ultimately to human pathogens. Ceftiofur, the most widely used antibiotic in U.S. dairy cows, is a 3rd generation cephalosporin, a critically important class of antibiotics to human health. The objective of this study was to evaluate the effect of typical ceftiofur antibiotic treatment on the prevalence of ARGs in the fecal microbiome of dairy cows using a metagenomics approach. β-lactam ARGs were found to be elevated in feces from Holstein cows administered ceftiofur (n = 3) relative to control cows (n = 3). However, total numbers of ARGs across all classes were not measurably affected by ceftiofur treatment, likely because of dominance of unaffected tetracycline ARGs in the metagenomics libraries. Functional analysis via MG-RAST further revealed that ceftiofur treatment resulted in increases in gene sequences associated with “phages, prophages, transposable elements, and plasmids”, suggesting that this treatment also enriched the ability to horizontally transfer ARGs. Additional functional shifts were noted with ceftiofur treatment (e.g., increase in genes associated with stress, chemotaxis, and resistance to toxic compounds; decrease in genes associated with metabolism of aromatic compounds and cell division and cell cycle), along with measurable taxonomic shifts (increase in Bacteroidia and decrease in Actinobacteria). This study demonstrates that ceftiofur has a broad, measurable and immediate effect on the cow fecal metagenome. Given the importance of 3rd generation cephalosporins to human medicine, their continued use in dairy cattle should be carefully considered and waste treatment strategies to slow ARG dissemination from dairy cattle manure should be explored. PMID:26258869

  2. Genome and metagenome enabled analyses reveal new insight into the global biogeography and potential urea utilization in marine Thaumarchaeota.

    NASA Astrophysics Data System (ADS)

    Ahlgren, N.; Parada, A. E.; Fuhrman, J. A.

    2016-02-01

    Marine Thaumarchaea are an abundant, important group of marine microbial communities as they fix carbon, oxidize ammonium, and thus contribute to key N and C cycles in the oceans. From an enrichment culture, we have sequenced the complete genome of a new Thaumarchaeota strain, SPOT01. Analysis of this genome and other Thaumarchaeal genomes contributes new insight into its role in N cycling and clarifies the broader biogeography of marine Thaumarchaeal genera. Phylogenomics of Thaumarchaeota genomes reveal coherent separation into clusters roughly equivalent to the genus level, and SPOT01 represents a new genus of marine Thaumarchaea. Competitive fragment recruitment of globally distributed metagenomes from TARA, Ocean Sampling Day, and those generated from a station off California shows that the SPOT01 genus is often the most abundant genus, especially where total Thaumarchaea are most abundant in the overall community. The SPOT01 genome contains urease genes allowing it to use an alternative form of N. Genomic and metagenomic analysis also reveal that among planktonic genomes and populations, the urease genes in general are more frequently found in members of the SPOT01 genus and another genus dominant in deep waters, thus we predict these two genera contribute most significantly to urea utilization among marine Thaumarchaea. Recruitment also revealed broader biogeographic and ecological patterns of the putative genera. The SPOT01 genus was most abundant at colder temperatures (<16 C), reflective of its dominance at subpolar to polar latitudes (>45 degrees). The genus containing Nitrosopumilus maritimus had the highest temperature range, and the genus containing Candidatus Nitrosopelagicus brevis was typically most abundant at intermediate temperatures and intermediate latitudes (35-45 degrees). Together these genome and metagenome enabled analyses provide significant new insight into the ecology and biogeochemical contributions of marine archaea.

  3. Microbiological profile of chicken carcasses: A comparative analysis using shotgun metagenomic sequencing

    PubMed Central

    Cesare, Alessandra De; Palma, Federica; Lucchi, Alex; Pasquali, Frederique; Manfreda, Gerardo

    2018-01-01

    In the last few years metagenomic and 16S rRNA sequencing have completely changed the microbiological investigations of food products. In this preliminary study, the microbiological profile of chicken carcasses collected from animals fed with different diets was tested by using shotgun metagenomic sequencing. A total of 15 carcasses have been collected at the slaughterhouse at the end of the refrigeration tunnel from chickens reared for 35 days and fed with a control diet (n=5), a diet supplemented with 1500 FTU/kg of commercial phytase (n=5) and a diet supplemented with 1500 FTU/kg of commercial phytase and 3g/kg of inositol (n=5). Ten grams of neck and breast skin were obtained from each carcass and submitted to total DNA extraction by using the DNeasy Blood & Tissue Kit (Qiagen). Sequencing libraries have been prepared by using the Nextera XT DNA Library Preparation Kit (Illumina) and sequenced in a HiScanSQ (Illumina) at 100 bp in paired ends. A number of sequences ranging between 5 and 9 million was obtained for each sample. Sequence analysis showed that Proteobacteria and Firmicutes represented more than 98% of whole bacterial populations associated to carcass skin in all groups but their abundances were different between groups. Moraxellaceae and other degradative bacteria showed a significantly higher abundance in the control compared to the treated groups. Furthermore, Clostridium perfringens showed a relative frequency of abundance significantly higher in the group fed with phytase and Salmonella enterica in the group fed with phytase plus inositol. The results of this preliminary study showed that metagenome sequencing is suitable to investigate and monitor carcass microbiota in order to detect specific pathogenic and/or degradative populations. PMID:29732327

  4. Integrated metagenomic data analysis demonstrates that a loss of diversity in oral microbiota is associated with periodontitis.

    PubMed

    Ai, Dongmei; Huang, Ruocheng; Wen, Jin; Li, Chao; Zhu, Jiangping; Xia, Li Charlie

    2017-01-25

    Periodontitis is an inflammatory disease affecting the tissues supporting teeth (periodontium). Integrative analysis of metagenomic samples from multiple periodontitis studies is a powerful way to examine microbiota diversity and interactions within host oral cavity. A total of 43 subjects were recruited to participate in two previous studies profiling the microbial community of human subgingival plaque samples using shotgun metagenomic sequencing. We integrated metagenomic sequence data from those two studies, including six healthy controls, 14 sites representative of stable periodontitis, 16 sites representative of progressing periodontitis, and seven periodontal sites of unknown status. We applied phylogenetic diversity, differential abundance, and network analyses, as well as clustering, to the integrated dataset to compare microbiological community profiles among the different disease states. We found alpha-diversity, i.e., mean species diversity in sites or habitats at a local scale, to be the single strongest predictor of subjects' periodontitis status (P < 0.011). More specifically, healthy subjects had the highest alpha-diversity, while subjects with stable sites had the lowest alpha-diversity. From these results, we developed an alpha-diversity logistic model-based naive classifier able to perfectly predict the disease status of the seven subjects with unknown periodontal status (not used in training). Phylogenetic profiling resulted in the discovery of nine marker microbes, and these species are able to differentiate between stable and progressing periodontitis, achieving an accuracy of 94.4%. Finally, we found that the reduction of negatively correlated species is a notable signature of disease progression. 
    Our results consistently show a strong association between the loss of oral microbiota diversity and the progression of periodontitis, suggesting that metagenomics sequencing and phylogenetic profiling are predictive of early periodontitis, leading to potential therapeutic intervention. Our results also support a keystone pathogen-mediated polymicrobial synergy and dysbiosis (PSD) model to explain the etiology of periodontitis. Apart from P. gingivalis, we identified three additional keystone species potentially mediating the progression of periodontitis based on pathogenic characteristics similar to those of known keystone pathogens.
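
    The abstract above uses alpha-diversity as a predictor but does not say which index was computed. As an illustration only, the sketch below assumes the Shannon index, H = -sum(p_i * ln p_i), applied to per-species read counts from one sample; the function name and the toy counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over species with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical samples with the same four species but different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

h_even = shannon_index(even_sample)      # equals ln(4) for four equal species
h_skewed = shannon_index(skewed_sample)  # lower, because one species dominates
```

    A sample dominated by one species scores lower than an even one with the same species count, which is the sense in which a loss of diversity can be measured per site.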

  5. Taxonomic and functional metagenomic profiling of gastrointestinal tract microbiome of the farmed adult turbot (Scophthalmus maximus).

    PubMed

    Xing, Mengxin; Hou, Zhanhui; Yuan, Jianbo; Liu, Yuan; Qu, Yanmei; Liu, Bin

    2013-12-01

    Metagenomics combined with 16S rRNA gene sequence analyses was applied to unveil the taxonomic composition and functional diversity of the farmed adult turbot gastrointestinal (GI) microbiome. Proteobacteria and Firmicutes, which existed in both GI content and mucus, dominated the turbot GI microbiome. 16S rRNA gene sequence analyses also indicated that the turbot GI tract may harbor some bacteria which originated from associated seawater. Functional analyses indicated that the clustering-based subsystem and many metabolic subsystems were dominant in the turbot GI metagenome. Compared with other gut metagenomes, quorum sensing and biofilm formation was overabundant in the turbot GI metagenome. Genes associated with quorum sensing and biofilm formation were found in species within Vibrio, including Vibrio vulnificus, Vibrio cholerae and Vibrio parahaemolyticus. In farmed fish gut metagenomes, the stress response and protein folding subsystems were over-represented and several genes concerning antibiotic and heavy metal resistance were also detected. These data suggested that the turbot GI microbiome may be affected by human factors in aquaculture. Additionally, iron acquisition and the metabolism subsystem were more abundant in the turbot GI metagenome when compared with freshwater fish gut metagenome, suggesting that unique metabolic potential may be observed in marine animal GI microbiomes. © 2013 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  6. Strain/species identification in metagenomes using genome-specific markers

    PubMed Central

    Tu, Qichao; He, Zhili; Zhou, Jizhong

    2014-01-01

    Shotgun metagenome sequencing has become a fast, cheap and high-throughput technology for characterizing microbial communities in complex environments and human body sites. However, accurate identification of microorganisms at the strain/species level remains extremely challenging. We present a novel k-mer-based approach, termed GSMer, that identifies genome-specific markers (GSMs) from currently sequenced microbial genomes, which were then used for strain/species-level identification in metagenomes. Using 5390 sequenced microbial genomes, 8 770 321 50-mer strain-specific and 11 736 360 species-specific GSMs were identified for 4088 strains and 2005 species (4933 strains), respectively. The GSMs were first evaluated against mock community metagenomes, recently sequenced genomes and real metagenomes from different body sites, suggesting that the identified GSMs were specific to their targeting genomes. Sensitivity evaluation against synthetic metagenomes with different coverage suggested that 50 GSMs per strain were sufficient to identify most microbial strains with ≥0.25× coverage, and 10% of selected GSMs in a database should be detected for confident positive callings. Application of GSMs identified 45 and 74 microbial strains/species significantly associated with type 2 diabetes patients and obese/lean individuals from corresponding gastrointestinal tract metagenomes, respectively. Our result agreed with previous studies but provided strain-level information. The approach can be directly applied to identify microbial strains/species from raw metagenomes, without the effort of complex data pre-processing. PMID:24523352
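
    The k-mer idea behind genome-specific markers can be illustrated with a small sketch: collect the k-mers of each reference genome and keep those that occur in exactly one genome. This is not the GSMer implementation; the function names and toy sequences are hypothetical, k is set to 4 instead of 50 for readability, and reverse complements and any further filtering steps are ignored.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def genome_specific_kmers(genomes, k):
    """Map each genome name to the k-mers found in that genome only."""
    owners = defaultdict(set)  # k-mer -> names of genomes containing it
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    specific = {name: set() for name in genomes}
    for kmer, names in owners.items():
        if len(names) == 1:
            specific[next(iter(names))].add(kmer)
    return specific


# Hypothetical toy genomes sharing the prefix ACGTAC.
toy_genomes = {
    "strain_A": "ACGTACGGA",
    "strain_B": "ACGTACCTT",
}
markers = genome_specific_kmers(toy_genomes, k=4)
```

    In this toy case the k-mers drawn from the shared prefix are discarded, and only those spanning the diverging tails remain as candidate markers for each strain.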

  7. Ecology and evolution of viruses infecting uncultivated SUP05 bacteria as revealed by single-cell- and meta-genomics

    PubMed Central

    Roux, Simon; Hawley, Alyse K; Torres Beltran, Monica; Scofield, Melanie; Schwientek, Patrick; Stepanauskas, Ramunas; Woyke, Tanja; Hallam, Steven J; Sullivan, Matthew B

    2014-01-01

    Viruses modulate microbial communities and alter ecosystem functions. However, due to cultivation bottlenecks, specific virus–host interaction dynamics remain cryptic. In this study, we examined 127 single-cell amplified genomes (SAGs) from uncultivated SUP05 bacteria isolated from a model marine oxygen minimum zone (OMZ) to identify 69 viral contigs representing five new genera within dsDNA Caudovirales and ssDNA Microviridae. Infection frequencies suggest that ∼1/3 of SUP05 bacteria is viral-infected, with higher infection frequency where oxygen-deficiency was most severe. Observed Microviridae clonality suggests recovery of bloom-terminating viruses, while systematic co-infection between dsDNA and ssDNA viruses posits previously unrecognized cooperation modes. Analyses of 186 microbial and viral metagenomes revealed that SUP05 viruses persisted for years, but remained endemic to the OMZ. Finally, identification of virus-encoded dissimilatory sulfite reductase suggests SUP05 viruses reprogram their host's energy metabolism. Together, these results demonstrate closely coupled SUP05 virus–host co-evolutionary dynamics with the potential to modulate biogeochemical cycling in climate-critical and expanding OMZs. DOI: http://dx.doi.org/10.7554/eLife.03125.001 PMID:25171894

  8. Current strategies for mobilome research.

    PubMed

    Jørgensen, Tue S; Kiil, Anne S; Hansen, Martin A; Sørensen, Søren J; Hansen, Lars H

    2014-01-01

    Mobile genetic elements (MGEs) are pivotal for bacterial evolution and adaptation, allowing shuffling of genes even between distantly related bacterial species. The study of these elements is biologically interesting, as the mode of genetic propagation is kaleidoscopic, and important, as MGEs are the main vehicles of the increasing bacterial antibiotic resistance that causes thousands of human deaths each year. The study of MGEs has previously focused on plasmids from individual isolates, but the revolution in sequencing technology has allowed the study of mobile genomic elements of entire communities using metagenomic approaches. The problem in using metagenomic sequencing for the study of MGEs is that plasmids and other mobile elements comprise only a small fraction of the total genetic content and are difficult to separate from chromosomal DNA based on sequence alone. The distinction between plasmid and chromosome is important as the mobility and regulation of genes largely depend on their genetic context. Several different approaches have been proposed that specifically enrich plasmid DNA from community samples. Here, we review recent approaches used to study entire plasmid pools from complex environments, and point out possible future developments for and pitfalls of these approaches. Further, we discuss the use of the PacBio long-read sequencing technology for MGE discovery.

  9. The human gut and groundwater harbor non-photosynthetic bacteria belonging to a new candidate phylum sibling to Cyanobacteria

    PubMed Central

    Di Rienzi, Sara C; Sharon, Itai; Wrighton, Kelly C; Koren, Omry; Hug, Laura A; Thomas, Brian C; Goodrich, Julia K; Bell, Jordana T; Spector, Timothy D; Banfield, Jillian F; Ley, Ruth E

    2013-01-01

    Cyanobacteria were responsible for the oxygenation of the ancient atmosphere; however, the evolution of this phylum is enigmatic, as relatives have not been characterized. Here we use whole genome reconstruction of human fecal and subsurface aquifer metagenomic samples to obtain complete genomes for members of a new candidate phylum sibling to Cyanobacteria, for which we propose the designation ‘Melainabacteria’. Metabolic analysis suggests that the ancestors to both lineages were non-photosynthetic, anaerobic, motile, and obligately fermentative. Cyanobacterial light sensing may have been facilitated by regulators present in the ancestor of these lineages. The subsurface organism has the capacity for nitrogen fixation using a nitrogenase distinct from that in Cyanobacteria, suggesting nitrogen fixation evolved separately in the two lineages. We hypothesize that Cyanobacteria split from Melainabacteria prior or due to the acquisition of oxygenic photosynthesis. Melainabacteria remained in anoxic zones and differentiated by niche adaptation, including for symbiosis in the mammalian gut. DOI: http://dx.doi.org/10.7554/eLife.01102.001 PMID:24137540

  10. Orpheovirus IHUMI-LCC2: A New Virus among the Giant Viruses

    PubMed Central

    Andreani, Julien; Khalil, Jacques Y. B.; Baptiste, Emeline; Hasni, Issam; Michelle, Caroline; Raoult, Didier; Levasseur, Anthony; La Scola, Bernard

    2018-01-01

    Giant viruses continue to invade the world of virology, with gigantic genome sizes and various particle shapes. Strain discoveries and metagenomic studies make it possible to reveal the complexity of these microorganisms, their origins, ecosystems and putative roles. We isolated from a rat stool sample a new giant virus “Orpheovirus IHUMI-LCC2,” using Vermamoeba vermiformis as host cell. In this paper, we describe the main genomic features and replicative cycle of Orpheovirus IHUMI-LCC2. It possesses a circular genome exceeding 1.4 Megabases with 25% G+C content and ovoidal-shaped particles ranging from 900 to 1300 nm. Particles are closed by at least one thick membrane in a single ostiole-like shape at their apex. Phylogenetic analysis and the reciprocal best hit for Orpheovirus show a connection to the proposed Pithoviridae family. However, some genomic characteristics bear witness to a completely divergent evolution for Orpheovirus IHUMI-LCC2 when compared to Cedratviruses or Pithoviruses. PMID:29403444

  11. Environmental Microbial Forensics and Archaeology of Past Pandemics.

    PubMed

    Fornaciari, Antonio

    2017-01-01

    The development of paleomicrobiology with new molecular techniques such as metagenomics is revolutionizing our knowledge of microbial evolution in human history. The study of microbial agents that are concomitantly active in the same biological environment makes it possible to obtain a picture of the complex interrelations among the different pathogens and gives us the perspective to understand the microecosystem of ancient times. This research acts as a bridge between disciplines such as archaeology, biology, and medicine, and the development of paleomicrobiology forces archaeology to broaden and update its methods. This chapter addresses the archaeological issues related to the identification of cemeteries from epidemic catastrophes (typology of burials, stratigraphy, topography, paleodemography) and the issues related to the sampling of human remains for biomolecular analysis. Developments in the field of paleomicrobiology are described with the example of the plague. Because of its powerful interdisciplinary features, the paleomicrobiological study of Yersinia pestis is an extremely interesting field, in which paleomicrobiology, historical research, and archeology are closely related, and it has important implications for the current dynamics of epidemiology.

  12. Metagenomic Analysis Revealing Antibiotic Resistance Genes (ARGs) and Their Genetic Compartments in the Tibetan Environment.

    PubMed

    Chen, Baowei; Yuan, Ke; Chen, Xin; Yang, Ying; Zhang, Tong; Wang, Yawei; Luan, Tiangang; Zou, Shichun; Li, Xiangdong

    2016-07-05

    Comprehensive profiles of antibiotic resistance genes (ARGs) and mobile genetic elements (MGEs) in a minimally impacted environment are essential to understanding the evolution and dissemination of modern antibiotic resistance. Chemical analyses of the samples collected from Tibet demonstrated that the region under investigation was almost devoid of anthropogenic antibiotics. The soils, animal wastes, and sediments were different from each other in terms of bacterial community structures, and in the typical profiles of ARGs and MGEs. Diverse ARGs that encoded resistance to common antibiotics (e.g., beta-lactams, fluoroquinolones, etc.) were found mainly via an efflux mechanism completely distinct from modern antibiotic resistome. In addition, a very small fraction of ARGs in the Tibetan environment were carried by MGEs, indicating the low potential of these ARGs to be transferred among bacteria. In comparison to the ARG profiles in relatively pristine Tibet, contemporary ARGs and MGEs in human-impacted environments have evolved substantially since the broad use of anthropogenic antibiotics.

  13. Systemic approaches to biodegradation.

    PubMed

    Trigo, Almudena; Valencia, Alfonso; Cases, Ildefonso

    2009-01-01

    Biodegradation, the ability of microorganisms to remove complex chemicals from the environment, is a multifaceted process in which many biotic and abiotic factors are implicated. The recent accumulation of knowledge about the biochemistry and genetics of the biodegradation process, and its categorization and formalization in structured databases, has opened the door to systems biology approaches, where the interactions of the involved parts are the main subject of study, and the system is analysed as a whole. The global analysis of the biodegradation metabolic network is beginning to produce knowledge about its structure, behaviour and evolution, such as its scale-free structure or its intrinsic robustness. Moreover, these approaches are also developing into useful tools such as predictors for compounds' degradability or the assisted design of artificial pathways. However, it is the environmental application of high-throughput technologies from genomics, metagenomics, proteomics and metabolomics that harbours the most promising opportunities to understand the biodegradation process, and at the same time poses tremendous challenges from the data management and data mining point of view.

  14. High level of intergenera gene exchange shapes the evolution of haloarchaea in an isolated Antarctic lake.

    PubMed

    DeMaere, Matthew Z; Williams, Timothy J; Allen, Michelle A; Brown, Mark V; Gibson, John A E; Rich, John; Lauro, Federico M; Dyall-Smith, Michael; Davenport, Karen W; Woyke, Tanja; Kyrpides, Nikos C; Tringe, Susannah G; Cavicchioli, Ricardo

    2013-10-15

    Deep Lake in Antarctica is a globally isolated, hypersaline system that remains liquid at temperatures down to -20 °C. By analyzing metagenome data and genomes of four isolates we assessed genome variation and patterns of gene exchange to learn how the lake community evolved. The lake is completely and uniformly dominated by haloarchaea, comprising a hierarchically structured, low-complexity community that differs greatly from temperate and tropical hypersaline environments. The four Deep Lake isolates represent distinct genera (∼85% 16S rRNA gene similarity and ∼73% genome average nucleotide identity) with genomic characteristics indicative of niche adaptation, and collectively account for ∼72% of the cellular community. Network analysis revealed a remarkable level of intergenera gene exchange, including the sharing of long contiguous regions (up to 35 kb) of high identity (∼100%). Although the genomes of closely related Halobacterium, Haloquadratum, and Haloarcula (>90% average nucleotide identity) shared regions of high identity between species or strains, the four Deep Lake isolates were the only distantly related haloarchaea to share long high-identity regions. Moreover, the Deep Lake high-identity regions did not match any other hypersaline environment metagenome data. The most abundant species, tADL, appears to play a central role in the exchange of insertion sequences, but not the exchange of high-identity regions. The genomic characteristics of the four haloarchaea are consistent with a lake ecosystem that sustains a high level of intergenera gene exchange while selecting for ecotypes that maintain sympatric speciation. The peculiarities of this polar system restrict which species can grow and provide a tempo and mode for accentuating gene exchange.

  15. Soup to Tree: The Phylogeny of Beetles Inferred by Mitochondrial Metagenomics of a Bornean Rainforest Sample.

    PubMed

    Crampton-Platt, Alex; Timmermans, Martijn J T N; Gimmel, Matthew L; Kutty, Sujatha Narayanan; Cockerill, Timothy D; Vun Khen, Chey; Vogler, Alfried P

    2015-09-01

    In spite of the growth of molecular ecology, systematics and next-generation sequencing, the discovery and analysis of diversity is not currently integrated with building the tree-of-life. Tropical arthropod ecologists are well placed to accelerate this process if all specimens obtained through mass-trapping, many of which will be new species, could be incorporated routinely into phylogeny reconstruction. Here we test a shotgun sequencing approach, whereby mitochondrial genomes are assembled from complex ecological mixtures through mitochondrial metagenomics, and demonstrate how the approach overcomes many of the taxonomic impediments to the study of biodiversity. DNA from approximately 500 beetle specimens, originating from a single rainforest canopy fogging sample from Borneo, was pooled and shotgun sequenced, followed by de novo assembly of complete and partial mitogenomes for 175 species. The phylogenetic tree obtained from this local sample was highly similar to that from existing mitogenomes selected for global coverage of major lineages of Coleoptera. When all sequences were combined only minor topological changes were induced against this reference set, indicating an increasingly stable estimate of coleopteran phylogeny, while the ecological sample expanded the tip-level representation of several lineages. Robust trees generated from ecological samples now enable an evolutionary framework for ecology. Meanwhile, the inclusion of uncharacterized samples in the tree-of-life rapidly expands taxon and biogeographic representation of lineages without morphological identification. Mitogenomes from shotgun sequencing of unsorted environmental samples and their associated metadata, placed robustly into the phylogenetic tree, constitute novel DNA "superbarcodes" for testing hypotheses regarding global patterns of diversity. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. Activity-Based Screening of Metagenomic Libraries for Hydrogenase Enzymes.

    PubMed

    Adam, Nicole; Perner, Mirjam

    2017-01-01

    Here we outline how to identify hydrogenase enzymes from metagenomic libraries through an activity-based screening approach. A metagenomic fosmid library is constructed in E. coli and the fosmids are transferred into a hydrogenase deletion mutant of Shewanella oneidensis (ΔhyaB) via triparental mating. If a fosmid exhibits hydrogen uptake activity, S. oneidensis' phenotype is restored and hydrogenase activity is indicated by a color change of the medium from yellow to colorless. This new method enables screening of 48 metagenomic fosmid clones in parallel.

  17. Metagenomes from two microbial consortia associated with Santa Barbara seep oil.

    PubMed

    Hawley, Erik R; Malfatti, Stephanie A; Pagani, Ioanna; Huntemann, Marcel; Chen, Amy; Foster, Brian; Copeland, Alexander; del Rio, Tijana Glavina; Pati, Amrita; Jansson, Janet R; Gilbert, Jack A; Tringe, Susannah Green; Lorenson, Thomas D; Hess, Matthias

    2014-12-01

    The metagenomes from two microbial consortia associated with natural oils seeping into the Pacific Ocean offshore the coast of Santa Barbara (California, USA) were determined to complement already existing metagenomes generated from microbial communities associated with hydrocarbons that pollute the marine ecosystem. This genomics resource article is the first of two publications reporting a total of four new metagenomes from oils that seep into the Santa Barbara Channel. Copyright © 2014 Elsevier B.V. All rights reserved.

  18. Metagenomics of Bacterial Diversity in Villa Luz Caves with Sulfur Water Springs

    PubMed Central

    Artacho, Alejandro; Bautista, José S.; Méndez, Roberto; Gamboa, María T.; Gamboa, Jesús R.; Gómez-Cruz, Rodolfo

    2018-01-01

    New biotechnology applications require in-depth preliminary studies of biodiversity. The methods of massive sequencing using metagenomics and bioinformatics tools offer us sufficient and reliable knowledge to understand environmental diversity, to discover new microorganisms, and to take advantage of their functional genes. Villa Luz caves, in the southern Mexican state of Tabasco, are fed by at least 26 groundwater inlets, containing 300–500 mg L−1 H2S and <0.1 mg L−1 O2. We extracted environmental DNA for metagenomic analysis of collected samples in five selected Villa Luz cave sites, with pH values from 2.5 to 7. Foreign organisms found in this underground ecosystem can oxidize H2S to H2SO4. These include: biovermiculites, a bacterial association that can grow on the rock walls; snottites, that are whitish, viscous biofilms hanging from the rock walls, and sacks or bags of phlegm, which live within the aquatic environment of the springs. Through tag-encoded FLX amplicon pyrosequencing (TEFAP), a total of 20,901 reads of amplification products from hypervariable regions V1 and V3 of the bacterial 16S rRNA gene in whole and pure metagenomic DNA samples were generated. Seven bacterial phyla were identified. As a result, Proteobacteria was more frequent than Acidobacteria. Finally, acidophilic Proteobacteria was detected in UJAT5 sample. PMID:29361802

  19. CO Dehydrogenase Genes Found in Metagenomic Fosmid Clones from the Deep Mediterranean Sea

    PubMed Central

    Martin-Cuadrado, Ana-Belen; Ghai, Rohit; Gonzaga, Aitor; Rodriguez-Valera, Francisco

    2009-01-01

    The use of carbon monoxide (CO) as a biological energy source is widespread in microbes. In recent years, the role of CO oxidation in superficial ocean waters has been shown to be an important energy supplement for heterotrophs (carboxydovores). The key enzyme CO dehydrogenase was found in both isolates and metagenomes from the ocean's photic zone, where CO is continuously generated by organic matter photolysis. We have also found genes that code for both forms I (low affinity) and II (high affinity) in fosmids from a metagenomic library generated from a 3,000-m depth in the Mediterranean Sea. Analysis of other metagenomic databases indicates that similar genes are also found in the mesopelagic and bathypelagic North Pacific and on the surfaces of this and other oceanic locations (in lower proportions and similarities). The frequency with which this gene was found indicates that this energy-generating metabolism would be at least as important in the bathypelagic habitat as it is in the photic zone. Although there are no data about CO concentrations or origins deep in the ocean, it could have a geothermal origin or be associated with anaerobic metabolism of organic matter. The identities of the microbes that carry out these processes were not established, but they seem to be representatives of either Bacteroidetes or Chloroflexi. PMID:19801465

  20. Diversity and functions of bacterial community in drinking water biofilms revealed by high-throughput sequencing

    PubMed Central

    Chao, Yuanqing; Mao, Yanping; Wang, Zhiping; Zhang, Tong

    2015-01-01

    The development of biofilms in drinking water (DW) systems may cause various problems to water quality. To investigate the community structure of biofilms on different pipe materials and the global/specific metabolic functions of DW biofilms, PCR-based 454 pyrosequencing data for 16S rRNA genes and Illumina metagenomic data were generated and analysed. Considerable differences in bacterial diversity and taxonomic structure were identified between biofilms formed on stainless steel and biofilms formed on plastics, indicating that the metallic materials facilitate the formation of higher diversity biofilms. Moreover, variations in several dominant genera were observed during biofilm formation. Based on PCA analysis, the global functions in the DW biofilms were similar to other DW metagenomes. Beyond the global functions, the occurrences and abundances of specific protective genes involved in the glutathione metabolism, the SoxRS system, the OxyR system, RpoS regulated genes, and the production/degradation of extracellular polymeric substances were also evaluated. A near-complete and low-contamination draft genome was constructed from the metagenome of the DW biofilm, based on the coverage and tetranucleotide frequencies, and identified as a Bradyrhizobiaceae-like bacterium according to a phylogenetic analysis. Our findings provide new insight into DW biofilms, especially in terms of their metabolic functions. PMID:26067561
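    The abstract above mentions recovering a draft genome from the metagenome using coverage and tetranucleotide frequencies. As a minimal sketch of the composition half of that idea, the snippet below computes a normalized tetranucleotide profile for a contig and a distance between two profiles. The function names are hypothetical, and the simplifications (no reverse-complement merging, no coverage term, plain Euclidean distance) are assumptions for illustration, not the authors' pipeline.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides over the DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return normalized frequencies of all 256 tetranucleotides in seq.

    Windows containing characters other than A, C, G, T (for example N)
    are ignored. Hypothetical helper, not taken from any binning tool.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in TETRAMERS)
    if total == 0:
        return {k: 0.0 for k in TETRAMERS}
    return {k: counts[k] / total for k in TETRAMERS}


def profile_distance(f1, f2):
    """Euclidean distance between two tetranucleotide profiles."""
    return sum((f1[k] - f2[k]) ** 2 for k in TETRAMERS) ** 0.5


# Toy contigs (made-up sequences, for illustration only).
contig_a = tetranucleotide_frequencies("ACGTACGT")
contig_b = tetranucleotide_frequencies("GGGGCCCCGGGG")
distance = profile_distance(contig_a, contig_b)
```

    In practice, contigs with similar profiles and similar read coverage would be grouped into the same candidate genome bin; the distance above only covers the composition signal.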

  1. Network-guided genomic and metagenomic analysis of the faecal microbiota of the critically endangered kakapo.

    PubMed

    Waite, David W; Dsouza, Melissa; Sekiguchi, Yuji; Hugenholtz, Philip; Taylor, Michael W

    2018-05-25

    The kakapo is a critically endangered, herbivorous parrot endemic to New Zealand. The kakapo hindgut hosts a dense microbial community of low taxonomic diversity, typically dominated by Escherichia fergusonii, and has proven to be a remarkably stable ecosystem, displaying little variation in core membership over years of study. To elucidate mechanisms underlying this robustness, we performed 16S rRNA gene-based co-occurrence network analysis to identify potential interactions between E. fergusonii and the wider bacterial community. Genomic and metagenomic sequencing were employed to facilitate interpretation of potential interactions observed in the network. E. fergusonii maintained very few correlations with other members of the microbiota, and isolates possessed genes for the generation of energy from a wide range of carbohydrate sources, including plant fibres such as cellulose. We surmise that this dominant microorganism is abundant not due to ecological interaction with other members of the microbiota, but its ability to metabolise a wide range of nutrients in the gut. This research represents the first concerted effort to understand the functional roles of the kakapo microbiota, and leverages metagenomic data to contextualise co-occurrence patterns. By combining these two techniques we provide a means for studying the diversity-stability hypothesis in the context of bacterial ecosystems.
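    The co-occurrence network analysis described above can be sketched roughly as follows: compute a pairwise correlation between taxon abundance profiles across samples and keep pairs whose correlation exceeds a threshold as network edges. The abstract does not state which correlation measure or cutoff was used, so the Pearson coefficient, the 0.8 threshold, the taxon labels, and the abundance values below are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(vx * vy)


def cooccurrence_edges(abundance, threshold=0.8):
    """Return (taxon_a, taxon_b, r) for pairs with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        r = pearson(abundance[a], abundance[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical relative abundances of three taxa across five samples.
abundance = {
    "taxon_A": [0.70, 0.65, 0.72, 0.68, 0.71],
    "taxon_B": [0.10, 0.20, 0.30, 0.40, 0.50],
    "taxon_C": [0.05, 0.10, 0.15, 0.20, 0.25],
}
edges = cooccurrence_edges(abundance)
```

    In this toy table the dominant but stable taxon_A ends up with no edges, while taxon_B and taxon_C are linked, which mirrors the kind of pattern the abstract reports for a dominant organism with few correlations.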

  2. Community and gene composition of a human dental plaque microbiota obtained by metagenomic sequencing

    PubMed Central

    Xie, G.; Chain, P.S.G.; Lo, C.; Liu, K-L.; Gans, J.; Merritt, J.; Qi, F.

    2010-01-01

    SUMMARY Human dental plaque is a complex microbial community containing an estimated 700 to 19,000 species/phylotypes. Despite numerous studies analysing species richness in healthy and diseased human subjects, the true genomic composition of the human dental plaque microbiota remains unknown. Here we report a metagenomic analysis of a healthy human plaque sample using a combination of second-generation sequencing platforms. A total of 860 million base pairs of non-human sequences were generated. Various analysis tools revealed the presence of 12 well-characterized phyla, members of the TM-7 and BRC1 clade, and sequences that could not be classified. Both pathogens and opportunistic pathogens were identified, supporting the ecological plaque hypothesis for oral diseases. Mapping the metagenomic reads to sequenced reference genomes demonstrated that 4% of the reads could be assigned to the sequenced species. Preliminary annotation identified genes belonging to all known functional categories. Interestingly, although 73% of the total assembled contig sequences were predicted to code for proteins, only 51% of them could be assigned a functional role. Furthermore, ~ 2.8% of the total predicted genes coded for proteins involved in resistance to antibiotics and toxic compounds, suggesting that the oral cavity is an important reservoir for antimicrobial resistance. PMID:21040513

  3. Community and gene composition of a human dental plaque microbiota obtained by metagenomic sequencing.

    PubMed

    Xie, G; Chain, P S G; Lo, C-C; Liu, K-L; Gans, J; Merritt, J; Qi, F

    2010-12-01

    Human dental plaque is a complex microbial community containing an estimated 700 to 19,000 species/phylotypes. Despite numerous studies analysing species richness in healthy and diseased human subjects, the true genomic composition of the human dental plaque microbiota remains unknown. Here we report a metagenomic analysis of a healthy human plaque sample using a combination of second-generation sequencing platforms. A total of 860 million base pairs of non-human sequences were generated. Various analysis tools revealed the presence of 12 well-characterized phyla, members of the TM-7 and BRC1 clade, and sequences that could not be classified. Both pathogens and opportunistic pathogens were identified, supporting the ecological plaque hypothesis for oral diseases. Mapping the metagenomic reads to sequenced reference genomes demonstrated that 4% of the reads could be assigned to the sequenced species. Preliminary annotation identified genes belonging to all known functional categories. Interestingly, although 73% of the total assembled contig sequences were predicted to code for proteins, only 51% of them could be assigned a functional role. Furthermore, ~2.8% of the total predicted genes coded for proteins involved in resistance to antibiotics and toxic compounds, suggesting that the oral cavity is an important reservoir for antimicrobial resistance. © 2010 John Wiley & Sons A/S.

  4. Diversity and functions of bacterial community in drinking water biofilms revealed by high-throughput sequencing

    NASA Astrophysics Data System (ADS)

    Chao, Yuanqing; Mao, Yanping; Wang, Zhiping; Zhang, Tong

    2015-06-01

    The development of biofilms in drinking water (DW) systems may cause various problems to water quality. To investigate the community structure of biofilms on different pipe materials and the global/specific metabolic functions of DW biofilms, PCR-based 454 pyrosequencing data for 16S rRNA genes and Illumina metagenomic data were generated and analysed. Considerable differences in bacterial diversity and taxonomic structure were identified between biofilms formed on stainless steel and biofilms formed on plastics, indicating that the metallic materials facilitate the formation of higher diversity biofilms. Moreover, variations in several dominant genera were observed during biofilm formation. Based on PCA analysis, the global functions in the DW biofilms were similar to other DW metagenomes. Beyond the global functions, the occurrences and abundances of specific protective genes involved in the glutathione metabolism, the SoxRS system, the OxyR system, RpoS regulated genes, and the production/degradation of extracellular polymeric substances were also evaluated. A near-complete and low-contamination draft genome was constructed from the metagenome of the DW biofilm, based on the coverage and tetranucleotide frequencies, and identified as a Bradyrhizobiaceae-like bacterium according to a phylogenetic analysis. Our findings provide new insight into DW biofilms, especially in terms of their metabolic functions.

  5. Characterization of the stromatolite microbiome from Little Darby Island, The Bahamas using predictive and whole shotgun metagenomic analysis.

    PubMed

    Casaburi, Giorgio; Duscher, Alexandrea A; Reid, R Pamela; Foster, Jamie S

    2016-05-01

    Modern stromatolites represent ideal ecosystems to understand the biological processes required for the precipitation of carbonate due to their long evolutionary history and occurrence in a wide range of habitats. However, most of the prior molecular work on stromatolites has focused on understanding the taxonomic complexity and not fully elucidating the functional capabilities of these systems. Here, we begin to characterize the microbiome associated with stromatolites of Little Darby Island, The Bahamas using predictive metagenomics of the 16S rRNA gene coupled with direct whole shotgun sequencing. The metagenomic analysis of the Little Darby stromatolites revealed many shared taxa and core pathways associated with biologically induced carbonate precipitation, suggesting functional convergence within Bahamian stromatolites. A comparison of the Little Darby stromatolites with other lithifying microbial ecosystems also revealed that although factors, such as geographic location and salinity, do drive some differences within the population, there are extensive similarities within the microbial populations. These results suggest that for stromatolite formation, 'who' is in the community is not as critical as metabolic activities and environmental interactions. Together, these analyses help improve our understanding of the similarities among lithifying ecosystems and provide an important first step in characterizing the shared microbiome of modern stromatolites. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.

  6. Diversity and functions of bacterial community in drinking water biofilms revealed by high-throughput sequencing.

    PubMed

    Chao, Yuanqing; Mao, Yanping; Wang, Zhiping; Zhang, Tong

    2015-06-12

    The development of biofilms in drinking water (DW) systems may cause various problems to water quality. To investigate the community structure of biofilms on different pipe materials and the global/specific metabolic functions of DW biofilms, PCR-based 454 pyrosequencing data for 16S rRNA genes and Illumina metagenomic data were generated and analysed. Considerable differences in bacterial diversity and taxonomic structure were identified between biofilms formed on stainless steel and biofilms formed on plastics, indicating that the metallic materials facilitate the formation of higher diversity biofilms. Moreover, variations in several dominant genera were observed during biofilm formation. Based on PCA analysis, the global functions in the DW biofilms were similar to other DW metagenomes. Beyond the global functions, the occurrences and abundances of specific protective genes involved in the glutathione metabolism, the SoxRS system, the OxyR system, RpoS regulated genes, and the production/degradation of extracellular polymeric substances were also evaluated. A near-complete and low-contamination draft genome was constructed from the metagenome of the DW biofilm, based on the coverage and tetranucleotide frequencies, and identified as a Bradyrhizobiaceae-like bacterium according to a phylogenetic analysis. Our findings provide new insight into DW biofilms, especially in terms of their metabolic functions.

  7. Metagenomic insights into chlorination effects on microbial antibiotic resistance in drinking water.

    PubMed

    Shi, Peng; Jia, Shuyu; Zhang, Xu-Xiang; Zhang, Tong; Cheng, Shupei; Li, Aimin

    2013-01-01

    This study aimed to investigate the chlorination effects on microbial antibiotic resistance in a drinking water treatment plant. Biochemical identification, 16S rRNA gene cloning and metagenomic analysis consistently indicated that Proteobacteria were the main antibiotic resistant bacteria (ARB) dominating in the drinking water and chlorine disinfection greatly affected microbial community structure. After chlorination, a higher proportion of the surviving bacteria was resistant to chloramphenicol, trimethoprim and cephalothin. Quantitative real-time PCRs revealed that sulI had the highest abundance among the antibiotic resistance genes (ARGs) detected in the drinking water, followed by tetA and tetG. Chlorination caused enrichment of ampC, aphA2, bla(TEM-1), tetA, tetG, ermA and ermB, but sulI was considerably removed (p < 0.05). Metagenomic analysis confirmed that drinking water chlorination could concentrate various ARGs, as well as plasmids, insertion sequences and integrons involved in horizontal transfer of the ARGs. Water pipeline transportation tended to reduce the abundance of most ARGs, but various ARB and ARGs were still present in the tap water, which deserves more public health concern. The results highlighted the prevalence of ARB and ARGs in chlorinated drinking water and this study might be technologically useful for detecting the ARGs in water environments. Copyright © 2012 Elsevier Ltd. All rights reserved.

  8. Characterization of the stromatolite microbiome from Little Darby Island, The Bahamas using predictive and whole shotgun metagenomic analysis

    PubMed Central

    Casaburi, Giorgio; Duscher, Alexandrea A.; Reid, R. Pamela; Foster, Jamie S.

    2018-01-01

    Summary Modern stromatolites represent ideal ecosystems to understand the biological processes required for the precipitation of carbonate due to their long evolutionary history and occurrence in a wide range of habitats. However, most of the prior molecular work on stromatolites has focused on understanding the taxonomic complexity and not fully elucidating the functional capabilities of these systems. Here, we begin to characterize the microbiome associated with stromatolites of Little Darby Island, The Bahamas using predictive metagenomics of the 16S rRNA gene coupled with direct whole shotgun sequencing. The metagenomic analysis of the Little Darby stromatolites revealed many shared taxa and core pathways associated with biologically induced carbonate precipitation, suggesting functional convergence within Bahamian stromatolites. A comparison of the Little Darby stromatolites with other lithifying microbial ecosystems also revealed that although factors, such as geographic location and salinity, do drive some differences within the population, there are extensive similarities within the microbial populations. These results suggest that for stromatolite formation, ‘who’ is in the community is not as critical as metabolic activities and environmental interactions. Together, these analyses help improve our understanding of the similarities among lithifying ecosystems and provide an important first step in characterizing the shared microbiome of modern stromatolites. PMID:26471001

  9. Exploring variation-aware contig graphs for (comparative) metagenomics using MaryGold

    PubMed Central

    Nijkamp, Jurgen F.; Pop, Mihai; Reinders, Marcel J. T.; de Ridder, Dick

    2013-01-01

    Motivation: Although many tools are available to study variation and its impact in single genomes, there is a lack of algorithms for finding such variation in metagenomes. This hampers the interpretation of metagenomics sequencing datasets, which are increasingly acquired in research on the (human) microbiome, in environmental studies and in the study of processes in the production of foods and beverages. Existing algorithms often depend on the use of reference genomes, which pose a problem when a metagenome of a priori unknown strain composition is studied. In this article, we develop a method to perform reference-free detection and visual exploration of genomic variation, both within a single metagenome and between metagenomes. Results: We present the MaryGold algorithm and its implementation, which efficiently detects bubble structures in contig graphs using graph decomposition. These bubbles represent variable genomic regions in closely related strains in metagenomic samples. The variation found is presented in a condensed Circos-based visualization, which allows for easy exploration and interpretation of the found variation. We validated the algorithm on two simulated datasets containing three and seven Escherichia coli genomes, respectively, and showed that finding allelic variation in these genomes improves assemblies. Additionally, we applied MaryGold to publicly available real metagenomic datasets, enabling us to find within-sample genomic variation in the metagenomes of a kimchi fermentation process, the microbiome of a premature infant and in microbial communities living on acid mine drainage. Moreover, we used MaryGold for between-sample variation detection and exploration by comparing sequencing data sampled at different time points for both of these datasets. Availability: MaryGold has been written in C++ and Python and can be downloaded from http://bioinformatics.tudelft.nl/software Contact: d.deridder@tudelft.nl PMID:24058058
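    To make the notion of a "bubble" in a contig graph concrete, here is a minimal sketch that finds only the simplest case: a node whose outgoing branches each run through strictly linear nodes and then reconverge on a single node. This is not the MaryGold graph-decomposition algorithm, which the abstract does not spell out; the function name, the adjacency-list representation, and the contig labels are assumptions for illustration.

```python
def find_simple_bubbles(graph):
    """Find simple bubbles in a directed graph.

    graph maps each node to a list of successor nodes. A simple bubble
    is a source with two or more successors whose branches pass only
    through linear nodes (one edge in, one edge out) and end at the
    same sink. Returns a list of (source, sink, branches).
    """
    # Count incoming edges for every node.
    indegree = {}
    for node, successors in graph.items():
        indegree.setdefault(node, 0)
        for succ in successors:
            indegree[succ] = indegree.get(succ, 0) + 1

    bubbles = []
    for source, successors in graph.items():
        if len(successors) < 2:
            continue
        branches, ends = [], set()
        for start in successors:
            path, node = [], start
            # Follow the branch while it stays strictly linear.
            while indegree.get(node, 0) == 1 and len(graph.get(node, [])) == 1:
                path.append(node)
                node = graph[node][0]
            branches.append(path)
            ends.add(node)
        # All branches must reconverge on one node other than the source.
        if len(ends) == 1:
            sink = ends.pop()
            if sink != source:
                bubbles.append((source, sink, branches))
    return bubbles


# Hypothetical contig graph: c2a and c2b are alternative variants
# between the shared contigs c1 and c3.
contig_graph = {
    "c1": ["c2a", "c2b"],
    "c2a": ["c3"],
    "c2b": ["c3"],
    "c3": ["c4"],
    "c4": [],
}
bubbles = find_simple_bubbles(contig_graph)
```

    Each branch of a reported bubble corresponds to one candidate variant of the same genomic region; nested or overlapping bubbles are deliberately out of scope for this sketch.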

  10. A highly optimized grid deployment: the metagenomic analysis example.

    PubMed

    Aparicio, Gabriel; Blanquer, Ignacio; Hernández, Vicente

    2008-01-01

    Computational resources and computationally expensive processes are two topics that are not growing at the same rate. The availability of large amounts of computing resources in Grid infrastructures does not mean that efficiency is not an important issue. It is necessary to analyze the whole process to improve partitioning and submission schemas, especially in the most critical experiments. This is the case of metagenomic analysis, and this text shows the work done in order to optimize a Grid deployment, which has led to a reduction of the response time and the failure rates. Metagenomic studies aim at processing samples of multiple specimens to extract the genes and proteins that belong to the different species. In many cases, the sequencing of the DNA of many microorganisms is hindered by the impossibility of growing significant samples of isolated specimens. Many bacteria cannot survive alone, and require the interaction with other organisms. In such cases, the information of the DNA available belongs to different kinds of organisms. One important stage in metagenomic analysis consists of the extraction of fragments, followed by the comparison and analysis of their function. By comparison to existing chains, whose function is well known, fragments can be classified. This process is computationally intensive and requires several iterations of alignment and phylogeny classification steps. Source samples reach several millions of sequences, which could reach up to thousands of nucleotides each. These sequences are compared to a selected part of the "Non-redundant" database which involves only the information from eukaryotic species. From this first analysis, a refining process is performed and alignment analysis is restarted from the results. This process requires several CPU years. The article describes and analyzes the difficulties of fragmenting, automating and checking the above operations in current Grid production environments. This environment has been tuned based on an experimental study which has tested the most efficient and reliable resources, the optimal job size, and the data transfer and database reindexing overhead. The environment should re-submit faulty jobs, detect endless tasks and ensure that the results are correctly retrieved and the workflow synchronised. The paper will give an outline of the structure of the system, and the preparation steps performed to deal with this experiment.

  11. ELIXIR pilot action: Marine metagenomics – towards a domain specific set of sustainable services

    PubMed Central

    Robertsen, Espen Mikal; Denise, Hubert; Mitchell, Alex; Finn, Robert D.; Bongo, Lars Ailo; Willassen, Nils Peder

    2017-01-01

    Metagenomics, the study of genetic material recovered directly from environmental samples, has the potential to provide insight into the structure and function of heterogeneous microbial communities.  There has been an increased use of metagenomics to discover and understand the diverse biosynthetic capacities of marine microbes, thereby allowing them to be exploited for industrial, food, and health care products. This ELIXIR pilot action was motivated by the need to establish dedicated data resources and harmonized metagenomics pipelines for the marine domain, in order to enhance the exploration and exploitation of marine genetic resources. In this paper, we summarize some of the results from the ELIXIR pilot action “Marine metagenomics – towards user centric services”. PMID:28620454

  12. Introduction to Metagenomics at DOE JGI: Program Overview and Program Informatics (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Tringe, Susannah

    2018-01-15

    Susannah Tringe of the DOE Joint Genome Institute talks about the Program Overview and Program Informatics at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  13. CAFE: aCcelerated Alignment-FrEe sequence analysis.

    PubMed

    Lu, Yang Young; Tang, Kujin; Ren, Jie; Fuhrman, Jed A; Waterman, Michael S; Sun, Fengzhu

    2017-07-03

    Alignment-free genome and metagenome comparisons are increasingly important with the development of next generation sequencing (NGS) technologies. Recently developed state-of-the-art k-mer based alignment-free dissimilarity measures including CVTree, $d_2^*$ and $d_2^S$ are more computationally expensive than measures based solely on the k-mer frequencies. Here, we report a standalone software, aCcelerated Alignment-FrEe sequence analysis (CAFE), for efficient calculation of 28 alignment-free dissimilarity measures. CAFE allows for both assembled genome sequences and unassembled NGS shotgun reads as input, and wraps the output in a standard PHYLIP format. In downstream analyses, CAFE can also be used to visualize the pairwise dissimilarity measures, including dendrograms, heatmap, principal coordinate analysis and network display. CAFE serves as a general k-mer based alignment-free analysis platform for studying the relationships among genomes and metagenomes, and is freely available at https://github.com/younglululu/CAFE. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
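
    As an illustration of the simplest class of measure mentioned in the abstract above (those based solely on k-mer frequencies), the following minimal sketch computes a Euclidean dissimilarity between the k-mer frequency vectors of two DNA sequences. It is not CAFE's implementation; the function names `kmer_frequencies` and `euclidean_dissimilarity` and the default k = 2 are assumptions chosen for this example.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k=2):
    """Return relative frequencies of all 4**k DNA k-mers in seq.

    k-mers containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


def euclidean_dissimilarity(seq_a, seq_b, k=2):
    """Euclidean distance between the k-mer frequency vectors of two sequences."""
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    print(euclidean_dissimilarity("ACGTACGTAC", "ACGTTTTTAC"))
```

    Frequencies rather than raw counts are used so that sequences of different lengths remain comparable; background-adjusted measures such as those named in the abstract additionally model the expected k-mer counts, which this sketch does not attempt.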

  14. ATLAS (Automatic Tool for Local Assembly Structures) - A Comprehensive Infrastructure for Assembly, Annotation, and Genomic Binning of Metagenomic and Metatranscriptomic Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Richard A.; Brown, Joseph M.; Colby, Sean M.

    ATLAS (Automatic Tool for Local Assembly Structures) is a comprehensive multiomics data analysis pipeline that is massively parallel and scalable. ATLAS contains a modular analysis pipeline for assembly, annotation, quantification and genome binning of metagenomics and metatranscriptomics data and a framework for reference metaproteomic database construction. ATLAS transforms raw sequence data into functional and taxonomic data at the microbial population level and provides genome-centric resolution through genome binning. ATLAS provides robust taxonomy based on majority voting of protein coding open reading frames rolled-up at the contig level using modified lowest common ancestor (LCA) analysis. ATLAS is user-friendly, easy to install through bioconda, maintained as open-source on GitHub, and is implemented in Snakemake for modular customizable workflows.
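
    The abstract does not spell out how the majority vote and the modified LCA analysis are combined. The sketch below shows one plausible reading, under stated assumptions: each open reading frame on a contig carries a root-to-leaf lineage, and the contig is assigned the deepest taxon still supported by more than a given fraction of all its ORFs. The function name `majority_lineage` and the 0.5 threshold are hypothetical and are not taken from ATLAS.

```python
from collections import Counter


def majority_lineage(orf_lineages, min_fraction=0.5):
    """Roll ORF-level lineages up to one contig-level lineage.

    orf_lineages: list of lineages, each a list of taxon names ordered
    from the root downwards. At each rank the most common taxon is kept
    only if strictly more than min_fraction of ALL ORFs support it;
    otherwise the walk stops at the previous rank.
    """
    total = len(orf_lineages)
    candidates = list(orf_lineages)
    consensus = []
    depth = 0
    while candidates:
        at_depth = [lin[depth] for lin in candidates if len(lin) > depth]
        if not at_depth:
            break
        taxon, count = Counter(at_depth).most_common(1)[0]
        if count / total <= min_fraction:
            break
        consensus.append(taxon)
        # Only ORFs that agree so far may vote at the next rank.
        candidates = [lin for lin in candidates
                      if len(lin) > depth and lin[depth] == taxon]
        depth += 1
    return consensus


if __name__ == "__main__":
    # Hypothetical ORF annotations for a single contig.
    orfs = [
        ["Bacteria", "Proteobacteria", "Gammaproteobacteria"],
        ["Bacteria", "Proteobacteria", "Betaproteobacteria"],
        ["Bacteria", "Proteobacteria", "Gammaproteobacteria"],
        ["Bacteria", "Firmicutes", "Bacilli"],
    ]
    print(majority_lineage(orfs))
```

    Compared with a strict LCA, which would stop at "Bacteria" in this toy input because one ORF disagrees at the phylum level, a vote-based rule tolerates a minority of conflicting annotations.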

  15. Fast and sensitive taxonomic classification for metagenomics with Kaiju

    PubMed Central

    Menzel, Peter; Ng, Kim Lee; Krogh, Anders

    2016-01-01

    Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows–Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk. PMID:27071849

  16. Fast and sensitive taxonomic classification for metagenomics with Kaiju.

    PubMed

    Menzel, Peter; Ng, Kim Lee; Krogh, Anders

    2016-04-13

    Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows-Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk.

  17. Mining the metagenome of activated biomass of an industrial wastewater treatment plant by a novel method.

    PubMed

    Sharma, Nandita; Tanksale, Himgouri; Kapley, Atya; Purohit, Hemant J

    2012-12-01

    Metagenomic libraries herald the era of magnifying the microbial world, tapping into the vast metabolic potential of uncultivated microbes, and enhancing the rate of discovery of novel genes and pathways. In this paper, we describe a method that facilitates the extraction of metagenomic DNA from activated sludge of an industrial wastewater treatment plant and its use in mining the metagenome via library construction. The efficiency of this method was demonstrated by the large representation of the bacterial genome in the constructed metagenomic libraries and by the functional clones obtained. The BAC library represented 95.6 times the bacterial genome, while the pUC library represented 41.7 times the bacterial genome. Twelve clones in the BAC library demonstrated lipolytic activity, while four clones demonstrated dioxygenase activity. Four clones in the pUC library tested positive for cellulase activity. This method, using FTA cards, not only can be used for library construction, but can also store the metagenome at room temperature.

  18. Metagenomic applications in environmental monitoring and bioremediation

    DOE PAGES

    Techtmann, Stephen M.; Hazen, Terry C.

    2016-01-01

    With the rapid advances in sequencing technology, the cost of sequencing has dramatically dropped and the scale of sequencing projects has increased accordingly. This has provided the opportunity for the routine use of sequencing techniques in the monitoring of environmental microbes. While metagenomic applications have been routinely applied to better understand the ecology and diversity of microbes, their use in environmental monitoring and bioremediation is increasingly common. In this review we seek to provide an overview of some of the metagenomic techniques used in environmental systems biology, addressing their application and limitation. We will also provide several recent examples of the application of metagenomics to bioremediation. We discuss examples where microbial communities have been used to predict the presence and extent of contamination, examples of how metagenomics can be used to characterize the process of natural attenuation by unculturable microbes, as well as examples detailing the use of metagenomics to understand the impact of biostimulation on microbial communities.

  19. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes.

    PubMed

    Nielsen, H Bjørn; Almeida, Mathieu; Juncker, Agnieszka Sierakowska; Rasmussen, Simon; Li, Junhua; Sunagawa, Shinichi; Plichta, Damian R; Gautier, Laurent; Pedersen, Anders G; Le Chatelier, Emmanuelle; Pelletier, Eric; Bonde, Ida; Nielsen, Trine; Manichanh, Chaysavanh; Arumugam, Manimozhiyan; Batto, Jean-Michel; Quintanilha Dos Santos, Marcelo B; Blom, Nikolaj; Borruel, Natalia; Burgdorf, Kristoffer S; Boumezbeur, Fouad; Casellas, Francesc; Doré, Joël; Dworzynski, Piotr; Guarner, Francisco; Hansen, Torben; Hildebrand, Falk; Kaas, Rolf S; Kennedy, Sean; Kristiansen, Karsten; Kultima, Jens Roat; Léonard, Pierre; Levenez, Florence; Lund, Ole; Moumen, Bouziane; Le Paslier, Denis; Pons, Nicolas; Pedersen, Oluf; Prifti, Edi; Qin, Junjie; Raes, Jeroen; Sørensen, Søren; Tap, Julien; Tims, Sebastian; Ussery, David W; Yamada, Takuji; Renault, Pierre; Sicheritz-Ponten, Thomas; Bork, Peer; Wang, Jun; Brunak, Søren; Ehrlich, S Dusko

    2014-08-01

    Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
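
    The core idea of binning co-abundant genes can be illustrated with a deliberately simplified sketch: genes whose abundance profiles across samples are strongly correlated are placed in the same group. The greedy seed-based grouping below is a stand-in chosen for brevity, not the clustering procedure used in the paper; the function names and the Pearson threshold of 0.9 are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-abundance signal
    return cov / sqrt(vx * vy)


def co_abundance_groups(profiles, min_corr=0.9):
    """Greedily group genes by correlation with each group's seed gene.

    profiles: dict mapping gene name -> list of abundances, one per sample.
    A gene joins the first group whose seed profile correlates with it at
    min_corr or above; otherwise it seeds a new group.
    """
    groups = []  # list of (seed_gene, member_list)
    for gene, profile in profiles.items():
        for seed, members in groups:
            if pearson(profiles[seed], profile) >= min_corr:
                members.append(gene)
                break
        else:
            groups.append((gene, [gene]))
    return [members for _, members in groups]


if __name__ == "__main__":
    # Hypothetical abundances of four genes across four samples.
    toy = {
        "geneA": [1, 2, 3, 4],
        "geneB": [2, 4, 6, 8],
        "geneC": [4, 3, 2, 1],
        "geneD": [8, 6, 4, 2],
    }
    print(co_abundance_groups(toy))
```

    Because genes from the same genome rise and fall together across samples, groups found this way can be interpreted as candidate biological entities without any reference sequence; the result of this greedy variant depends on the order in which genes are visited.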

  20. MetaABC--an integrated metagenomics platform for data adjustment, binning and clustering.

    PubMed

    Su, Chien-Hao; Hsu, Ming-Tsung; Wang, Tse-Yi; Chiang, Sufeng; Cheng, Jen-Hao; Weng, Francis C; Kao, Cheng-Yan; Wang, Daryi; Tsai, Huai-Kuang

    2011-08-15

    MetaABC is a metagenomic platform that integrates several binning tools coupled with methods for removing artifacts, analyzing unassigned reads and controlling sampling biases. It allows users to arrive at a better interpretation via a series of distinct combinations of analysis tools. After execution, MetaABC provides outputs in various visual formats such as tables, pie and bar charts as well as clustering result diagrams. MetaABC source code and documentation are available at http://bits2.iis.sinica.edu.tw/MetaABC/ CONTACT: dywang@gate.sinica.edu.tw; hktsai@iis.sinica.edu.tw Supplementary data are available at Bioinformatics online.

  1. A Graph-Centric Approach for Metagenome-Guided Peptide and Protein Identification in Metaproteomics

    PubMed Central

    Tang, Haixu; Li, Sujun; Ye, Yuzhen

    2016-01-01

    Metaproteomic studies adopt the common bottom-up proteomics approach to investigate the protein composition and the dynamics of protein expression in microbial communities. When matched metagenomic and/or metatranscriptomic data of the microbial communities are available, metaproteomic data analyses often employ a metagenome-guided approach, in which complete or fragmental protein-coding genes are first directly predicted from metagenomic (and/or metatranscriptomic) sequences or from their assemblies, and the resulting protein sequences are then used as the reference database for peptide/protein identification from MS/MS spectra. This approach is often limited because protein coding genes predicted from metagenomes are incomplete and fragmental. In this paper, we present a graph-centric approach to improving metagenome-guided peptide and protein identification in metaproteomics. Our method exploits the de Bruijn graph structure reported by metagenome assembly algorithms to generate a comprehensive database of protein sequences encoded in the community. We tested our method using several public metaproteomic datasets with matched metagenomic and metatranscriptomic sequencing data acquired from complex microbial communities in a biological wastewater treatment plant. The results showed that many more peptides and proteins can be identified when assembly graphs were utilized, improving the characterization of the proteins expressed in the microbial communities. The additional proteins we identified contribute to the characterization of important pathways such as those involved in degradation of chemical hazards. Our tools are released as open-source software on github at https://github.com/COL-IU/Graph2Pro. PMID:27918579

  2. Evaluation of rapid and simple techniques for the enrichment of viruses prior to metagenomic virus discovery.

    PubMed

    Hall, Richard J; Wang, Jing; Todd, Angela K; Bissielo, Ange B; Yen, Seiha; Strydom, Hugo; Moore, Nicole E; Ren, Xiaoyun; Huang, Q Sue; Carter, Philip E; Peacey, Matthew

    2014-01-01

    The discovery of new or divergent viruses using metagenomics and high-throughput sequencing has become more commonplace. The preparation of a sample is known to have an effect on the representation of virus sequences within the metagenomic dataset yet comparatively little attention has been given to this. Physical enrichment techniques are often applied to samples to increase the number of viral sequences and therefore enhance the probability of detection. With the exception of virus ecology studies, there is a paucity of information available to researchers on the type of sample preparation required for a viral metagenomic study that seeks to identify an aetiological virus in an animal or human diagnostic sample. A review of published virus discovery studies revealed the most commonly used enrichment methods, that were usually quick and simple to implement, namely low-speed centrifugation, filtration, nuclease-treatment (or combinations of these) which have been routinely used but often without justification. These were applied to a simple and well-characterised artificial sample composed of bacterial and human cells, as well as DNA (adenovirus) and RNA viruses (influenza A and human enterovirus), being either non-enveloped capsid or enveloped viruses. The effect of the enrichment method was assessed by both quantitative real-time PCR and metagenomic analysis that incorporated an amplification step. Reductions in the absolute quantities of bacteria and human cells were observed for each method as determined by qPCR, but the relative abundance of viral sequences in the metagenomic dataset remained largely unchanged. A 3-step method of centrifugation, filtration and nuclease-treatment showed the greatest increase in the proportion of viral sequences. 
This study provides a starting point for the selection of a purification method in future virus discovery studies, and highlights the need for more data to validate the effect of enrichment methods on different sample types, amplification, bioinformatics approaches and sequencing platforms. This study also highlights the potential risks that may attend selection of a virus enrichment method without any consideration for the sample type being investigated. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.

  3. Elucidation of Taste- and Odor-Producing Bacteria and Toxigenic Cyanobacteria in a Midwestern Drinking Water Supply Reservoir by Shotgun Metagenomic Analysis.

    PubMed

    Otten, Timothy G; Graham, Jennifer L; Harris, Theodore D; Dreher, Theo W

    2016-09-01

    While commonplace in clinical settings, DNA-based assays for identification or enumeration of drinking water pathogens and other biological contaminants remain widely unadopted by the monitoring community. In this study, shotgun metagenomics was used to identify taste-and-odor producers and toxin-producing cyanobacteria over a 2-year period in a drinking water reservoir. The sequencing data implicated several cyanobacteria, including Anabaena spp., Microcystis spp., and an unresolved member of the order Oscillatoriales as the likely principal producers of geosmin, microcystin, and 2-methylisoborneol (MIB), respectively. To further demonstrate this, quantitative PCR (qPCR) assays targeting geosmin-producing Anabaena and microcystin-producing Microcystis were utilized, and these data were fitted using generalized linear models and compared with routine monitoring data, including microscopic cell counts, sonde-based physicochemical analyses, and assays of all inorganic and organic nitrogen and phosphorus forms and fractions. The qPCR assays explained the greatest variation in observed geosmin (adjusted R² = 0.71) and microcystin (adjusted R² = 0.84) concentrations over the study period, highlighting their potential for routine monitoring applications. The origin of the monoterpene cyclase required for MIB biosynthesis was putatively linked to a periphytic cyanobacterial mat attached to the concrete drinking water inflow structure. We conclude that shotgun metagenomics can be used to identify microbial agents involved in water quality deterioration and to guide PCR assay selection or design for routine monitoring purposes. Finally, we offer estimates of microbial diversity and metagenomic coverage of our data sets for reference to others wishing to apply shotgun metagenomics to other lacustrine systems. Cyanobacterial toxins and microbial taste-and-odor compounds are a growing concern for drinking water utilities reliant upon surface water resources. 
Specific identification of the microorganism(s) responsible for water quality degradation is often complicated by the presence of co-occurring taxa capable of producing these undesirable metabolites. Here we present a framework for how shotgun metagenomics can be used to definitively identify problematic microorganisms and how these data can guide the development of rapid genetic assays for routine monitoring purposes. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  4. Elucidation of Taste- and Odor-Producing Bacteria and Toxigenic Cyanobacteria in a Midwestern Drinking Water Supply Reservoir by Shotgun Metagenomic Analysis

    PubMed Central

    Graham, Jennifer L.; Harris, Theodore D.

    2016-01-01

    ABSTRACT While commonplace in clinical settings, DNA-based assays for identification or enumeration of drinking water pathogens and other biological contaminants remain widely unadopted by the monitoring community. In this study, shotgun metagenomics was used to identify taste-and-odor producers and toxin-producing cyanobacteria over a 2-year period in a drinking water reservoir. The sequencing data implicated several cyanobacteria, including Anabaena spp., Microcystis spp., and an unresolved member of the order Oscillatoriales as the likely principal producers of geosmin, microcystin, and 2-methylisoborneol (MIB), respectively. To further demonstrate this, quantitative PCR (qPCR) assays targeting geosmin-producing Anabaena and microcystin-producing Microcystis were utilized, and these data were fitted using generalized linear models and compared with routine monitoring data, including microscopic cell counts, sonde-based physicochemical analyses, and assays of all inorganic and organic nitrogen and phosphorus forms and fractions. The qPCR assays explained the greatest variation in observed geosmin (adjusted R² = 0.71) and microcystin (adjusted R² = 0.84) concentrations over the study period, highlighting their potential for routine monitoring applications. The origin of the monoterpene cyclase required for MIB biosynthesis was putatively linked to a periphytic cyanobacterial mat attached to the concrete drinking water inflow structure. We conclude that shotgun metagenomics can be used to identify microbial agents involved in water quality deterioration and to guide PCR assay selection or design for routine monitoring purposes. Finally, we offer estimates of microbial diversity and metagenomic coverage of our data sets for reference to others wishing to apply shotgun metagenomics to other lacustrine systems. 
IMPORTANCE Cyanobacterial toxins and microbial taste-and-odor compounds are a growing concern for drinking water utilities reliant upon surface water resources. Specific identification of the microorganism(s) responsible for water quality degradation is often complicated by the presence of co-occurring taxa capable of producing these undesirable metabolites. Here we present a framework for how shotgun metagenomics can be used to definitively identify problematic microorganisms and how these data can guide the development of rapid genetic assays for routine monitoring purposes. PMID:27342564

  5. Metagenomics, metaMicrobesOnline and Kbase Data Integration (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Dehal, Paramvir

    2018-02-06

    Berkeley Lab's Paramvir Dehal on "Managing and Storing large Datasets in MicrobesOnline, metaMicrobesOnline and the DOE Knowledgebase" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  6. Biofilm-Growing Bacteria Involved in the Corrosion of Concrete Wastewater Pipes: Protocols for Comparative Metagenomic Analyses

    EPA Science Inventory

    Advances in high-throughput next-generation sequencing (NGS) technology for direct sequencing of environmental DNA (i.e. shotgun metagenomics) is transforming the field of microbiology. NGS technologies are now regularly being applied in comparative metagenomic studies, which pr...

  7. Introduction to Metagenomics at DOE JGI (Opening Remarks for the Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Kyrpides, Nikos [DOE JGI]

    2018-05-30

    After a quick introduction by DOE JGI Director Eddy Rubin, DOE JGI's Nikos Kyrpides delivers the opening remarks at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  8. Metagenome-scale analysis yields insights into the structure and function of microbial communities in a copper bioleaching heap.

    PubMed

    Zhang, Xian; Niu, Jiaojiao; Liang, Yili; Liu, Xueduan; Yin, Huaqun

    2016-01-19

    Metagenomics allows us to acquire the potential resources from both cultivatable and uncultivable microorganisms in the environment. Here, shotgun metagenome sequencing was used to investigate microbial communities from the surface layer of low grade copper tailings that were industrially bioleached at the Dexing Copper Mine, China. A bioinformatics analysis was further performed to elucidate structural and functional properties of the microbial communities in a copper bioleaching heap. Taxonomic analysis revealed unexpectedly high microbial biodiversity of this extremely acidic environment, as most sequences were phylogenetically assigned to Proteobacteria, while Euryarchaeota-related sequences accounted for only a small proportion in this system, suggesting that Archaea probably played little role in the bioleaching systems. At the genus level, the microbial community in the mineral surface layer was dominated by the sulfur- and iron-oxidizing acidophiles such as Acidithiobacillus-like populations, most of which were A. ferrivorans-like and A. ferrooxidans-like groups. In addition, Caudovirales were the dominant viral type observed in this extreme environment. Functional analysis illustrated that the principal participants related to the key metabolic pathways (carbon fixation, nitrogen metabolism, Fe(II) oxidation and sulfur metabolism) were mainly identified to be Acidithiobacillus-like, Thiobacillus-like and Leptospirillum-like microorganisms, indicating their vital roles. Also, the microbial community harbored certain adaptive mechanisms (heavy metal resistance, low pH adaptation, organic solvent tolerance and detoxification of hydroxyl radicals) as they performed their functions in the bioleaching system. Our study provides several valuable datasets for understanding the microbial community composition and function in the surface layer of a copper bioleaching heap.

  9. Origin of microbial biomineralization and magnetotaxis during the Archean.

    PubMed

    Lin, Wei; Paterson, Greig A; Zhu, Qiyun; Wang, Yinzhao; Kopylova, Evguenia; Li, Ying; Knight, Rob; Bazylinski, Dennis A; Zhu, Rixiang; Kirschvink, Joseph L; Pan, Yongxin

    2017-02-28

    Microbes that synthesize minerals, a process known as microbial biomineralization, contributed substantially to the evolution of current planetary environments through numerous important geochemical processes. Despite its geological significance, the origin and evolution of microbial biomineralization remain poorly understood. Through combined metagenomic and phylogenetic analyses of deep-branching magnetotactic bacteria from the Nitrospirae phylum, and using a Bayesian molecular clock-dating method, we show here that the gene cluster responsible for biomineralization of magnetosomes, and the arrangement of magnetosome chain(s) within cells, both originated before or near the Archean divergence between the Nitrospirae and Proteobacteria. This phylogenetic divergence occurred well before the Great Oxygenation Event. Magnetotaxis likely evolved due to environmental pressures conferring an evolutionary advantage to navigation via the geomagnetic field. Earth's dynamo must therefore have been sufficiently strong to sustain microbial magnetotaxis in the Archean, suggesting that magnetotaxis coevolved with the geodynamo over geological time.

  10. Isolation and characterization of a novel tannase from a metagenomic library.

    PubMed

    Yao, Jian; Fan, Xin Jiong; Lu, Yi; Liu, Yu Huan

    2011-04-27

    A novel gene (designated as tan410) encoding tannase was isolated from a cotton field metagenomic library by functional screening. Sequence analysis revealed that tan410 encoded a protein of 521 amino acids. SDS-PAGE and gel filtration chromatography analysis of purified tannase suggested that Tan410 was a monomeric enzyme with a molecular mass of 55 kDa. The optimum temperature and pH of Tan410 were 30 °C and 6.4. The activity was enhanced by addition of Ca²⁺, Mg²⁺ and Cd²⁺. In addition, Tan410 was stable in the presence of 4 M NaCl. Chlorogenic acid, rosmarinic acid, ethyl ferulate, tannic acid, epicatechin gallate and epigallocatechin gallate were efficiently hydrolyzed by recombinant tannase. All of these excellent properties make Tan410 an interesting enzyme for biotechnological application.

  11. Viruses in the Oceanic Basement.

    PubMed

    Nigro, Olivia D; Jungbluth, Sean P; Lin, Huei-Ting; Hsieh, Chih-Chiang; Miranda, Jaclyn A; Schvarcz, Christopher R; Rappé, Michael S; Steward, Grieg F

    2017-03-07

    Microbial life has been detected well into the igneous crust of the seafloor (i.e., the oceanic basement), but there have been no reports confirming the presence of viruses in this habitat. To detect and characterize an ocean basement virome, geothermally heated fluid samples (ca. 60 to 65°C) were collected from 117 to 292 m deep into the ocean basement using seafloor observatories installed in two boreholes (Integrated Ocean Drilling Program [IODP] U1362A and U1362B) drilled in the eastern sediment-covered flank of the Juan de Fuca Ridge. Concentrations of virus-like particles in the fluid samples were on the order of 0.2 × 10⁵ to 2 × 10⁵ ml⁻¹ (n = 8), higher than prokaryote-like cells in the same samples by a factor of 9 on average (range, 1.5 to 27). Electron microscopy revealed diverse viral morphotypes similar to those of viruses known to infect bacteria and thermophilic archaea. An analysis of virus-like sequences in basement microbial metagenomes suggests that those from archaeon-infecting viruses were the most common (63 to 80%). Complete genomes of a putative archaeon-infecting virus and a prophage within an archaeal scaffold were identified among the assembled sequences, and sequence analysis suggests that they represent lineages divergent from known thermophilic viruses. Of the clustered regularly interspaced short palindromic repeat (CRISPR)-containing scaffolds in the metagenomes for which a taxonomy could be inferred (163 out of 737), 51 to 55% appeared to be archaeal and 45 to 49% appeared to be bacterial. These results imply that the warmed, highly altered fluids in deeply buried ocean basement harbor a distinct assemblage of novel viruses, including many that infect archaea, and that these viruses are active participants in the ecology of the basement microbiome. 
IMPORTANCE The hydrothermally active ocean basement is voluminous and likely provided conditions critical to the origins of life, but the microbiology of this vast habitat is not well understood. Viruses in particular, although integral to the origins, evolution, and ecology of all life on earth, have never been documented in basement fluids. This report provides the first estimate of free virus particles (virions) within fluids circulating through the extrusive basalt of the seafloor and describes the morphological and genetic signatures of basement viruses. These data push the known geographical limits of the virosphere deep into the ocean basement and point to a wealth of novel viral diversity, exploration of which could shed light on the early evolution of viruses. Copyright © 2017 Nigro et al.

  12. Wild eel microbiome reveals that skin mucus of fish could be a natural niche for aquatic mucosal pathogen evolution.

    PubMed

    Carda-Diéguez, Miguel; Ghai, Rohit; Rodríguez-Valera, Francisco; Amaro, Carmen

    2017-12-21

    Fish skin mucosal surfaces (SMS) are quite similar in composition and function to some mammalian MS and, in consequence, could constitute an adequate niche for the evolution of mucosal aquatic pathogens in natural environments. We aimed to test this hypothesis by searching for metagenomic and genomic evidence in the SMS-microbiome of a model fish species (Anguilla anguilla or eel), from different ecosystems (four natural environments of different water salinity and one eel farm) as well as the water microbiome (W-microbiome) surrounding the host. Remarkably, potentially pathogenic Vibrio monopolized wild eel SMS-microbiome from natural ecosystems, Vibrio anguillarum/Vibrio vulnificus and Vibrio cholerae/Vibrio metoecus being the most abundant ones in SMS from estuary and lake, respectively. Functions encoded in the SMS-microbiome differed significantly from those in the W-microbiome and allowed us to predict that successful mucus colonizers should have specific genes for (i) attachment (mainly by forming biofilms), (ii) bacterial competence and communication, and (iii) resistance to mucosal innate immunity, predators (amoeba), and heavy metals/drugs. In addition, we found several mobile genetic elements (mainly integrative conjugative elements) as well as several lines of evidence suggesting that bacteria exchange DNA in SMS. Further, we isolated and sequenced a V. metoecus strain from SMS. This isolate shares pathogenicity islands with V. cholerae O1 from intestinal infections that are absent in the rest of sequenced V. metoecus strains, all of them from water and extra-intestinal infections. We have obtained metagenomic and genomic evidence in favor of the hypothesis on the role of fish mucosal surfaces as a specialized habitat selecting microbes capable of colonizing and persisting on other comparable mucosal surfaces, e.g., the human intestine.

  13. Methanotrophic bacteria in oilsands tailings ponds of northern Alberta

    PubMed Central

    Saidi-Mehrabad, Alireza; He, Zhiguo; Tamas, Ivica; Sharp, Christine E; Brady, Allyson L; Rochman, Fauziah F; Bodrossy, Levente; Abell, Guy CJ; Penner, Tara; Dong, Xiaoli; Sensen, Christoph W; Dunfield, Peter F

    2013-01-01

    We investigated methanotrophic bacteria in slightly alkaline surface water (pH 7.4–8.7) of oilsands tailings ponds in Fort McMurray, Canada. These large lakes (up to 10 km²) contain water, silt, clay and residual hydrocarbons that are not recovered in oilsands mining. They are primarily anoxic and produce methane but have an aerobic surface layer. Aerobic methane oxidation was measured in the surface water at rates up to 152 nmol CH₄ ml⁻¹ water d⁻¹. Microbial diversity was investigated via pyrotag sequencing of amplified 16S rRNA genes, as well as by analysis of methanotroph-specific pmoA genes using both pyrosequencing and microarray analysis. The predominantly detected methanotroph in surface waters at all sampling times was an uncultured species related to the gammaproteobacterial genus Methylocaldum, although a few other methanotrophs were also detected, including Methylomonas spp. Active species were identified via ¹³CH₄ stable isotope probing (SIP) of DNA, combined with pyrotag sequencing and shotgun metagenomic sequencing of heavy ¹³C-DNA. The SIP-PCR results demonstrated that the Methylocaldum and Methylomonas spp. actively consumed methane in fresh tailings pond water. Metagenomic analysis of DNA from the heavy SIP fraction verified the PCR-based results and identified additional pmoA genes not detected via PCR. The metagenome indicated that the overall methylotrophic community possessed known pathways for formaldehyde oxidation, carbon fixation and detoxification of nitrogenous compounds but appeared to possess only particulate methane monooxygenase, not soluble methane monooxygenase. PMID:23254511

  14. Construction of a dairy microbial genome catalog opens new perspectives for the metagenomic analysis of dairy fermented products.

    PubMed

    Almeida, Mathieu; Hébert, Agnès; Abraham, Anne-Laure; Rasmussen, Simon; Monnet, Christophe; Pons, Nicolas; Delbès, Céline; Loux, Valentin; Batto, Jean-Michel; Leonard, Pierre; Kennedy, Sean; Ehrlich, Stanislas Dusko; Pop, Mihai; Montel, Marie-Christine; Irlinger, Françoise; Renault, Pierre

    2014-12-13

    Microbial communities of traditional cheeses are complex and insufficiently characterized. The origin, safety and functional role in cheese making of these microbial communities are still not well understood. Metagenomic analysis of these communities by high throughput shotgun sequencing is a promising approach to characterize their genomic and functional profiles. Such analyses, however, critically depend on the availability of appropriate reference genome databases against which the sequencing reads can be aligned. We built a reference genome catalog suitable for short read metagenomic analysis using a low-cost sequencing strategy. We selected 142 bacteria isolated from dairy products belonging to 137 different species and 67 genera, and succeeded in reconstructing the draft genome of 117 of them at a standard or high quality level, including isolates from the genera Kluyvera, Luteococcus and Marinilactibacillus, still missing from public databases. To demonstrate the potential of this catalog, we analysed the microbial composition of the surface of two smear cheeses and one blue-veined cheese, and showed that a significant part of the microbiota of these traditional cheeses was composed of microorganisms newly sequenced in our study. Our study provides data which, combined with publicly available genome references, represent the most expansive catalog to date of cheese-associated bacteria. Using this extended dairy catalog, we revealed the presence in traditional cheese of dominant microorganisms not deliberately inoculated, mainly Gram-negative genera such as Pseudoalteromonas haloplanktis or Psychrobacter immobilis, that may contribute to the characteristics of cheese produced through traditional methods.

  15. Metagenomic analysis of the microbiota in the highly compartmented hindguts of six wood- or soil-feeding higher termites.

    PubMed

    Rossmassler, Karen; Dietrich, Carsten; Thompson, Claire; Mikaelyan, Aram; Nonoh, James O; Scheffrahn, Rudolf H; Sillam-Dussès, David; Brune, Andreas

    2015-11-26

    Termites are important contributors to carbon and nitrogen cycling in tropical ecosystems. Higher termites digest lignocellulose in various stages of humification with the help of an entirely prokaryotic microbiota housed in their compartmented intestinal tract. Previous studies revealed fundamental differences in community structure between compartments, but the functional roles of individual lineages in symbiotic digestion are mostly unknown. Here, we conducted a highly resolved analysis of the gut microbiota in six species of higher termites that feed on plant material at different levels of humification. Combining amplicon sequencing and metagenomics, we assessed similarities in community structure and functional potential between the major hindgut compartments (P1, P3, and P4). Cluster analysis of the relative abundances of orthologous gene clusters (COGs) revealed high similarities among wood- and litter-feeding termites and strong differences to humivorous species. However, abundance estimates of bacterial phyla based on 16S rRNA genes greatly differed from those based on protein-coding genes. Community structure and functional potential of the microbiota in individual gut compartments are clearly driven by the digestive strategy of the host. The metagenomics libraries obtained in this study provide the basis for future studies that elucidate the fundamental differences in the symbiont-mediated breakdown of lignocellulose and humus by termites of different feeding groups. The high proportion of uncultured bacterial lineages in all samples calls for a reference-independent approach for the correct taxonomic assignment of protein-coding genes.
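    The comparison of gut compartments by relative abundances of gene clusters can be illustrated with a small distance calculation. The sketch below is only an illustration: the abstract does not say which distance or clustering method was used, so Bray-Curtis dissimilarity is an assumption here, and the sample names, cluster labels, and counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert raw counts per gene cluster into fractions that sum to 1."""
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    clusters = set(p) | set(q)
    diff = sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in clusters)
    total = sum(p.get(c, 0.0) + q.get(c, 0.0) for c in clusters)
    return diff / total if total else 0.0


# Hypothetical counts for three feeding groups (not data from the study).
wood = relative_abundance({"COG_A": 60, "COG_B": 30, "COG_C": 10})
litter = relative_abundance({"COG_A": 50, "COG_B": 40, "COG_C": 10})
humus = relative_abundance({"COG_A": 10, "COG_B": 20, "COG_C": 70})

print(bray_curtis(wood, litter))  # small value: similar profiles
print(bray_curtis(wood, humus))   # larger value: dissimilar profiles
```

    A matrix of such pairwise dissimilarities is the usual input to a hierarchical clustering step; that step is omitted here to keep the sketch dependency-free.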

  16. MIPE: A metagenome-based community structure explorer and SSU primer evaluation tool

    PubMed Central

    Zhou, Quan

    2017-01-01

    An understanding of microbial community structure is an important issue in the field of molecular ecology. The traditional molecular method involves amplification of small subunit ribosomal RNA (SSU rRNA) genes by polymerase chain reaction (PCR). However, PCR-based amplicon approaches are affected by primer bias and chimeras. With the development of high-throughput sequencing technology, unbiased SSU rRNA gene sequences can be mined from shotgun sequencing-based metagenomic or metatranscriptomic datasets to obtain a reflection of the microbial community structure in specific types of environment and to evaluate SSU primers. However, the use of short reads obtained through next-generation sequencing for primer evaluation has not been well resolved. The software MIPE (MIcrobiota metagenome Primer Explorer) was developed to adapt numerous short reads from metagenomes and metatranscriptomes. Using metagenomic or metatranscriptomic datasets as input, MIPE extracts and aligns rRNA to reveal detailed information on microbial composition and evaluate SSU rRNA primers. A mock dataset, a real Metagenomics Rapid Annotation using Subsystem Technology (MG-RAST) test dataset, two PrimerProspector test datasets and a real metatranscriptomic dataset were used to validate MIPE. The software calls Mothur (v1.33.3) and the SILVA database (v119) for the alignment and classification of rRNA genes from a metagenome or metatranscriptome. MIPE can effectively extract shotgun rRNA reads from a metagenome or metatranscriptome and is capable of classifying these sequences and exhibiting sensitivity to different SSU rRNA PCR primers. Therefore, MIPE can be used to guide primer design for specific environmental samples. PMID:28350876
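    The idea of evaluating SSU primers against shotgun reads can be sketched as follows. This is not MIPE's implementation (which, per the abstract, relies on Mothur and the SILVA database for alignment and classification); it is a minimal illustration that assumes unaligned, uppercase reads and a degenerate primer written in IUPAC codes. The function names, the toy primer, and the toy reads are all hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, site))


def primer_hits(primer, read, max_mismatches=0):
    """True if some window of the read matches the primer within the budget."""
    k = len(primer)
    return any(
        mismatches(primer, read[i:i + k]) <= max_mismatches
        for i in range(len(read) - k + 1)
    )


def primer_coverage(primer, reads, max_mismatches=0):
    """Fraction of reads containing at least one acceptable primer site."""
    if not reads:
        return 0.0
    hits = sum(primer_hits(primer, r, max_mismatches) for r in reads)
    return hits / len(reads)


# Toy example: Y matches C or T, so the first two reads match exactly.
toy_reads = ["TTACGCTGG", "TTACGTTGG", "TTACGATGG"]
print(primer_coverage("ACGYT", toy_reads))                    # exact matching
print(primer_coverage("ACGYT", toy_reads, max_mismatches=1))  # one mismatch allowed
```

    Comparing coverage across candidate primers, or across taxonomic groups of reads, gives a rough view of primer bias; a real evaluation would also consider the reverse complement and the position of mismatches relative to the 3' end.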

  17. Intracellular screen to identify metagenomic clones that induce or inhibit a quorum-sensing biosensor.

    PubMed

    Williamson, Lynn L; Borlee, Bradley R; Schloss, Patrick D; Guan, Changhui; Allen, Heather K; Handelsman, Jo

    2005-10-01

    The goal of this study was to design and evaluate a rapid screen to identify metagenomic clones that produce biologically active small molecules. We built metagenomic libraries with DNA from soil on the floodplain of the Tanana River in Alaska. We extracted DNA directly from the soil and cloned it into fosmid and bacterial artificial chromosome vectors, constructing eight metagenomic libraries that contain 53,000 clones with inserts ranging from 1 to 190 kb. To identify clones of interest, we designed a high throughput "intracellular" screen, designated METREX, in which metagenomic DNA is in a host cell containing a biosensor for compounds that induce bacterial quorum sensing. If the metagenomic clone produces a quorum-sensing inducer, the cell produces green fluorescent protein (GFP) and can be identified by fluorescence microscopy or captured by fluorescence-activated cell sorting. Our initial screen identified 11 clones that induce and two that inhibit expression of GFP. The intracellular screen detected quorum-sensing inducers among metagenomic clones that a traditional overlay screen would not. One inducing clone carries a LuxI homologue that directs the synthesis of an N-acyl homoserine lactone quorum-sensing signal molecule. The LuxI homologue has 62% amino acid sequence identity to its closest match in GenBank, AmfI from Pseudomonas fluorescens, and is on a 78-kb insert that contains 67 open reading frames. Another inducing clone carries a gene with homology to homocitrate synthase. Our results demonstrate the power of an intracellular screen to identify functionally active clones and biologically active small molecules in metagenomic libraries.

  18. Taxonomic and functional assignment of cloned sequences from high Andean forest soil metagenome.

    PubMed

    Montaña, José Salvador; Jiménez, Diego Javier; Hernández, Mónica; Angel, Tatiana; Baena, Sandra

    2012-02-01

    Total metagenomic DNA was isolated from high Andean forest soil and subjected to taxonomical and functional composition analyses by means of clone library generation and sequencing. The obtained yield of 1.7 μg of DNA/g of soil was used to construct a metagenomic library of approximately 20,000 clones (in the plasmid p-Bluescript II SK+) with an average insert size of 4 kb, covering 80 Mb of the total metagenomic DNA. Metagenomic sequences near the plasmid cloning site were sequenced and then trimmed and assembled, obtaining 299 reads and 31 contigs (0.3 Mb). Taxonomic assignment of total sequences was performed by BLASTX, resulting in 68.8, 44.8 and 24.5% classification into taxonomic groups using the metagenomic RAST server v2.0, WebCARMA v1.0 online system and MetaGenome Analyzer v3.8 software, respectively. Most clone sequences were classified as Bacteria belonging to phyla Actinobacteria, Proteobacteria and Acidobacteria. Among the most represented orders were Actinomycetales (34% average), Rhizobiales, Burkholderiales and Myxococcales and with a greater number of sequences in the genus Mycobacterium (7% average), Frankia, Streptomyces and Bradyrhizobium. The vast majority of sequences were associated with the metabolism of carbohydrates, proteins, lipids and catalytic functions, such as phosphatases, glycosyltransferases, dehydrogenases, methyltransferases, dehydratases and epoxide hydrolases. In this study we compared different methods of taxonomic and functional assignment of metagenomic clone sequences to evaluate microbial diversity in an unexplored soil ecosystem, searching for putative enzymes of biotechnological interest and generating important information for further functional screening of clone libraries.
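    The library size quoted in this abstract follows from the clone count and the average insert size. A short check of that arithmetic is sketched below; the function name is hypothetical, and the percentage is simply derived from the figures given in the abstract, not reported by the authors.

```python
def library_size_mb(n_clones, mean_insert_kb):
    """Total cloned DNA in megabases: clones x mean insert (kb) / 1000."""
    return n_clones * mean_insert_kb / 1000


cloned_mb = library_size_mb(20_000, 4)  # about 20,000 clones, 4 kb inserts
assembled_mb = 0.3                      # 31 contigs, as stated in the abstract

print(cloned_mb)                       # total cloned DNA in Mb
print(100 * assembled_mb / cloned_mb)  # percent of cloned DNA covered by contigs
```

    On these figures, the assembled contigs represent well under 1% of the cloned DNA, which is consistent with end-sequencing only the regions near the cloning site.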

  19. Metagenomic Analysis of a Biphenyl-Degrading Soil Bacterial Consortium Reveals the Metabolic Roles of Specific Populations

    PubMed Central

    Garrido-Sanz, Daniel; Manzano, Javier; Martín, Marta; Redondo-Nieto, Miguel; Rivilla, Rafael

    2018-01-01

    Polychlorinated biphenyls (PCBs) are widespread persistent pollutants that cause several adverse health effects. Aerobic bioremediation of PCBs involves the activity of either one bacterial species or a microbial consortium. Using multiple species will enhance the range of PCB congeners co-metabolized since different PCB-degrading microorganisms exhibit different substrate specificity. We have isolated a bacterial consortium by successive enrichment culture using biphenyl (analog of PCBs) as the sole carbon and energy source. This consortium is able to grow on biphenyl, benzoate, and protocatechuate. Whole-community DNA extracted from the consortium was used to analyze biodiversity by Illumina sequencing of a 16S rRNA gene amplicon library and to determine the metagenome by whole-genome shotgun Illumina sequencing. Biodiversity analysis shows that the consortium consists of 24 operational taxonomic units (≥97% identity). The consortium is dominated by strains belonging to the genus Pseudomonas, but also contains betaproteobacteria and Rhodococcus strains. Whole-genome shotgun (WGS) analysis resulted in contigs containing 78.3 Mbp of sequenced DNA, representing around 65% of the expected DNA in the consortium. Bioinformatic analysis of this metagenome has identified the genes encoding the enzymes implicated in three pathways for the conversion of biphenyl to benzoate and five pathways from benzoate to tricarboxylic acid (TCA) cycle intermediates, allowing us to model the whole biodegradation network. By genus assignment of coding sequences, we have also been able to determine that the three biphenyl to benzoate pathways are carried out by Rhodococcus strains. In turn, strains belonging to Pseudomonas and Bordetella are mainly responsible for three of the benzoate to TCA pathways, while the benzoate conversion into TCA cycle intermediates via benzoyl-CoA and the catechol meta-cleavage pathways are carried out by betaproteobacteria belonging to genera such as Achromobacter and Variovorax. We have isolated a Rhodococcus strain WAY2 from the consortium which contains the genes encoding the three biphenyl to benzoate pathways, indicating that this strain is responsible for all the biphenyl to benzoate transformations. The presented results show that metagenomic analysis of consortia allows the identification of bacteria active in biodegradation processes and the assignment of specific reactions and pathways to specific bacterial groups. PMID:29497412

  20. Evolutionary Genomics of Defense Systems in Archaea and Bacteria

    PubMed Central

    Koonin, Eugene V.; Makarova, Kira S.; Wolf, Yuri I.

    2018-01-01

    Evolution of bacteria and archaea involves an incessant arms race against an enormous diversity of genetic parasites. Accordingly, a substantial fraction of the genes in most bacteria and archaea are dedicated to antiparasite defense. The functions of these defense systems follow several distinct strategies, including innate immunity; adaptive immunity; and dormancy induction, or programmed cell death. Recent comparative genomic studies taking advantage of the expanding database of microbial genomes and metagenomes, combined with direct experiments, resulted in the discovery of several previously unknown defense systems, including innate immunity centered on Argonaute proteins, bacteriophage exclusion, and new types of CRISPR-Cas systems of adaptive immunity. Some general principles of function and evolution of defense systems are starting to crystallize, in particular, extensive gain and loss of defense genes during the evolution of prokaryotes; formation of genomic defense islands; evolutionary connections between mobile genetic elements and defense, whereby genes of mobile elements are repeatedly recruited for defense functions; the partially selfish and addictive behavior of the defense systems; and coupling between immunity and dormancy induction/programmed cell death. PMID:28657885

  1. Advances in Cryptococcus genomics: insights into the evolution of pathogenesis.

    PubMed

    Cuomo, Christina A; Rhodes, Johanna; Desjardins, Christopher A

    2018-01-01

    Cryptococcus species are the causative agents of cryptococcal meningitis, a significant source of mortality in immunocompromised individuals. Initial work on the molecular epidemiology of this fungal pathogen utilized genotyping approaches to describe the genetic diversity and biogeography of two species, Cryptococcus neoformans and Cryptococcus gattii. Whole genome sequencing of representatives of both species resulted in reference assemblies enabling a wide array of downstream studies and genomic resources. With the increasing availability of whole genome sequencing, both species have now had hundreds of individual isolates sequenced, providing fine-scale insight into the evolution and diversification of Cryptococcus and allowing for the first genome-wide association studies to identify genetic variants associated with human virulence. Sequencing has also begun to examine the microevolution of isolates during prolonged infection and to identify variants specific to outbreak lineages, highlighting the potential role of hyper-mutation in evolving within short time scales. We can anticipate that further advances in sequencing technology and sequencing microbial genomes at scale, including metagenomics approaches, will continue to refine our view of how the evolution of Cryptococcus drives its success as a pathogen.

  2. Identification of fungi in shotgun metagenomics datasets

    PubMed Central

    Donovan, Paul D.; Gonzalez, Gabriel; Higgins, Desmond G.

    2018-01-01

    Metagenomics uses nucleic acid sequencing to characterize species diversity in different niches such as environmental biomes or the human microbiome. Most studies have used 16S rRNA amplicon sequencing to identify bacteria. However, the decreasing cost of sequencing has resulted in a gradual shift away from amplicon analyses and towards shotgun metagenomic sequencing. Shotgun metagenomic data can be used to identify a wide range of species, but have rarely been applied to fungal identification. Here, we develop a sequence classification pipeline, FindFungi, and use it to identify fungal sequences in public metagenome datasets. We focus primarily on animal metagenomes, especially those from pig and mouse microbiomes. We identified fungi in 39 of 70 datasets comprising 71 fungal species. At least 11 pathogenic species with zoonotic potential were identified, including Candida tropicalis. We identified Pseudogymnoascus species from 13 Antarctic soil samples initially analyzed for the presence of bacteria capable of degrading diesel oil. We also show that Candida tropicalis and Candida loboi are likely the same species. In addition, we identify several examples where contaminating DNA was erroneously included in fungal genome assemblies. PMID:29444186

  3. Scalable metagenomic taxonomy classification using a reference genome database

    PubMed Central

    Ames, Sasha K.; Hysom, David A.; Gardner, Shea N.; Lloyd, G. Scott; Gokhale, Maya B.; Allen, Jonathan E.

    2013-01-01

    Motivation: Deep metagenomic sequencing of biological samples has the potential to recover otherwise difficult-to-detect microorganisms and accurately characterize biological samples with limited prior knowledge of sample contents. Existing metagenomic taxonomic classification algorithms, however, do not scale well to analyze large metagenomic datasets, and balancing classification accuracy with computational efficiency presents a fundamental challenge. Results: A method is presented to shift computational costs to an off-line computation by creating a taxonomy/genome index that supports scalable metagenomic classification. Scalable performance is demonstrated on real and simulated data to show accurate classification in the presence of novel organisms on samples that include viruses, prokaryotes, fungi and protists. Taxonomic classification of the previously published 150 giga-base Tyrolean Iceman dataset was found to take <20 h on a single node 40 core large memory machine and provide new insights on the metagenomic contents of the sample. Availability: Software was implemented in C++ and is freely available at http://sourceforge.net/projects/lmat Contact: allen99@llnl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23828782
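    The approach described here, moving cost into an offline taxonomy/genome index so that read classification becomes a lookup, can be illustrated with a toy k-mer index. The sketch below is not LMAT's data structure or scoring scheme; it assumes a simple scheme in which each k-mer is labelled with the lowest common ancestor (LCA) of the genomes containing it and reads are assigned by majority vote. The taxonomy, sequences, and function names are hypothetical.

```python
from collections import Counter

# Toy taxonomy as child -> parent; the root is its own parent.
PARENT = {
    "root": "root",
    "Bacteria": "root",
    "Escherichia": "Bacteria",
    "Salmonella": "Bacteria",
}


def lineage(taxon):
    """Path from a taxon up to the root, inclusive."""
    path = [taxon]
    while PARENT[path[-1]] != path[-1]:
        path.append(PARENT[path[-1]])
    return path


def lca(a, b):
    """Lowest common ancestor of two taxa in the toy tree."""
    ancestors = set(lineage(a))
    for node in lineage(b):
        if node in ancestors:
            return node
    return "root"


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def build_index(genomes, k):
    """Offline step: map each k-mer to the LCA of all genomes containing it."""
    index = {}
    for taxon, seq in genomes.items():
        for kmer in kmers(seq, k):
            index[kmer] = lca(index[kmer], taxon) if kmer in index else taxon
    return index


def classify(read, index, k):
    """Online step: majority vote over the taxa of the read's indexed k-mers."""
    votes = Counter(index[km] for km in kmers(read, k) if km in index)
    return votes.most_common(1)[0][0] if votes else "unclassified"


genomes = {"Escherichia": "ACGTACGGA", "Salmonella": "ACGTTTCCA"}
index = build_index(genomes, k=4)
print(classify("GTACGG", index, k=4))  # k-mers unique to one genome
print(classify("GGGGGG", index, k=4))  # no indexed k-mers
```

    Because the index is built once, classifying a read costs only a dictionary lookup per k-mer, which is the property that lets this style of method scale to very large read sets.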

  4. Metagenomic and functional analyses of the consequences of reduction of bacterial diversity on soil functions and bioremediation in diesel-contaminated microcosms.

    PubMed

    Jung, Jaejoon; Philippot, Laurent; Park, Woojun

    2016-03-14

    The relationship between microbial biodiversity and soil function is an important issue in ecology, yet most studies have been performed in pristine ecosystems. Here, we assess the role of microbial diversity in ecological function and remediation strategies in diesel-contaminated soils. Soil microbial diversity was manipulated using a removal by dilution approach and microbial functions were determined using both metagenomic analyses and enzymatic assays. A shift from Proteobacteria- to Actinobacteria-dominant communities was observed when species diversity was reduced. Metagenomic analysis showed that a large proportion of functional gene categories were significantly altered by the reduction in biodiversity. The abundance of genes related to the nitrogen cycle was significantly reduced in the low-diversity community, impairing denitrification. In contrast, the efficiency of diesel biodegradation was increased in the low-diversity community and was further enhanced by addition of red clay as a stimulating agent. Our results suggest that the relationship between microbial diversity and ecological function involves trade-offs among ecological processes, and should not be generalized as a positive, neutral, or negative relationship.

  5. Metagenomic investigation of gastrointestinal microbiome in cattle

    PubMed Central

    Kim, Minseok; Park, Tansol; Yu, Zhongtang

    2017-01-01

    The gastrointestinal (GI) tract, including the rumen and the other intestinal segments of cattle, harbors a diverse, complex, and dynamic microbiome that drives feed digestion and fermentation in cattle, determining feed efficiency and output of pollutants. This microbiome also plays an important role in affecting host health. Research has been conducted for more than a century to understand the microbiome and its relationship to feed efficiency and host health. Traditional cultivation-based research elucidated some of the major metabolic pathways, but studies using molecular biology techniques conducted from the late 1980s to the early 2000s greatly expanded our view of the diversity of the rumen and intestinal microbiome of cattle. Recently, metagenomics has been the primary technology to characterize the GI microbiome and its relationship with host nutrition and health. This review addresses the main methods/techniques in current use, the knowledge gained, and some of the challenges that remain. Most of the primers used in quantitative real-time polymerase chain reaction quantification and diversity analysis using metagenomics of ruminal bacteria, archaea, fungi, and protozoa were also compiled. PMID:28830126

  6. Metagenomic characterization of viral communities in Goseong Bay, Korea

    NASA Astrophysics Data System (ADS)

    Hwang, Jinik; Park, So Yun; Park, Mirye; Lee, Sukchan; Jo, Yeonhwa; Cho, Won Kyong; Lee, Taek-Kyun

    2016-12-01

    In this study, seawater samples were collected from Goseong Bay, Korea in March 2014 and viral populations were examined by metagenomics assembly. Enrichment of marine viral particles using FeCl3 followed by next-generation sequencing produced numerous sequences. De novo assembly and BLAST search showed that most of the obtained contigs were unknown sequences and only 0.74% of sequences were associated with known viruses. As a result, 138 viruses, including bacteriophages (87%), viruses infecting algae and others (13%) were identified. The identified 138 viruses were divided into 11 orders, 14 families, 34 genera, and 133 species. The dominant viruses were Pelagibacter phage HTVC010P and Roseobacter phage SIO1. The viruses infecting algae, including the Ostreococcus species, accounted for 9.4% of total identified viruses. In addition, we identified pathogenic herpes viruses infecting fishes and giant viruses infecting parasitic acanthamoeba species. This is a comprehensive study to reveal the viral populations in the Goseong Bay using metagenomics. The information associated with the marine viral community in Goseong Bay, Korea will be useful for comparative analysis in other marine viral communities.

  7. Phylogenetic screening of a bacterial, metagenomic library using homing endonuclease restriction and marker insertion

    PubMed Central

    Yung, Pui Yi; Burke, Catherine; Lewis, Matt; Egan, Suhelen; Kjelleberg, Staffan; Thomas, Torsten

    2009-01-01

    Metagenomics provides access to the uncultured majority of the microbial world. The approaches employed in this field have, however, had limited success in linking functional genes to the taxonomic or phylogenetic origin of the organism they belong to. Here we present an efficient strategy to recover environmental DNA fragments that contain phylogenetic marker genes from metagenomic libraries. Our method involves the cleavage of 23S ribosomal RNA (rRNA) genes within pooled library clones by the homing endonuclease I-CeuI followed by the insertion and selection of an antibiotic resistance cassette. This approach was applied to screen a library of 6500 fosmid clones derived from the microbial community associated with the sponge Cymbastela concentrica. Several fosmid clones were recovered after the screen and detailed phylogenetic and taxonomic assignment based on the rRNA gene showed that they belong to previously unknown organisms. In addition, compositional features of these fosmid clones were used to classify and taxonomically assign a dataset of environmental shotgun sequences. Our approach represents a valuable tool for the analysis of rapidly increasing, environmental DNA sequencing information. PMID:19767618

  8. Metagenome sequencing and 98 microbial genomes from Juan de Fuca Ridge flank subsurface fluids

    NASA Astrophysics Data System (ADS)

    Jungbluth, Sean P.; Amend, Jan P.; Rappé, Michael S.

    2017-03-01

    The global deep subsurface biosphere is one of the largest reservoirs for microbial life on our planet. This study takes advantage of new sampling technologies and couples them with improvements to DNA sequencing and associated informatics tools to reconstruct the genomes of uncultivated Bacteria and Archaea from fluids collected deep within the Juan de Fuca Ridge subseafloor. Here, we generated two metagenomes from borehole observatories located 311 meters apart and, using binning tools, retrieved 98 genomes from metagenomes (GFMs). Of the GFMs, 31 were estimated to be >90% complete, while an additional 17 were >70% complete. Phylogenomic analysis revealed 53 bacterial and 45 archaeal GFMs, of which nearly all were distantly related to known cultivated isolates. In the GFMs, abundant Bacteria included Chloroflexi, Nitrospirae, Acetothermia (OP1), EM3, Aminicenantes (OP8), Gammaproteobacteria, and Deltaproteobacteria, while abundant Archaea included Archaeoglobi, Bathyarchaeota (MCG), and Marine Benthic Group E (MBG-E). These data are the first GFMs reconstructed from the deep basaltic subseafloor biosphere, and provide a dataset available for further interrogation.

  9. Metagenome sequencing and 98 microbial genomes from Juan de Fuca Ridge flank subsurface fluids.

    PubMed

    Jungbluth, Sean P; Amend, Jan P; Rappé, Michael S

    2017-03-28

    The global deep subsurface biosphere is one of the largest reservoirs for microbial life on our planet. This study takes advantage of new sampling technologies and couples them with improvements to DNA sequencing and associated informatics tools to reconstruct the genomes of uncultivated Bacteria and Archaea from fluids collected deep within the Juan de Fuca Ridge subseafloor. Here, we generated two metagenomes from borehole observatories located 311 meters apart and, using binning tools, retrieved 98 genomes from metagenomes (GFMs). Of the GFMs, 31 were estimated to be >90% complete, while an additional 17 were >70% complete. Phylogenomic analysis revealed 53 bacterial and 45 archaeal GFMs, of which nearly all were distantly related to known cultivated isolates. In the GFMs, abundant Bacteria included Chloroflexi, Nitrospirae, Acetothermia (OP1), EM3, Aminicenantes (OP8), Gammaproteobacteria, and Deltaproteobacteria, while abundant Archaea included Archaeoglobi, Bathyarchaeota (MCG), and Marine Benthic Group E (MBG-E). These data are the first GFMs reconstructed from the deep basaltic subseafloor biosphere, and provide a dataset available for further interrogation.

  10. Metagenomic and functional analyses of the consequences of reduction of bacterial diversity on soil functions and bioremediation in diesel-contaminated microcosms

    PubMed Central

    Jung, Jaejoon; Philippot, Laurent; Park, Woojun

    2016-01-01

    The relationship between microbial biodiversity and soil function is an important issue in ecology, yet most studies have been performed in pristine ecosystems. Here, we assess the role of microbial diversity in ecological function and remediation strategies in diesel-contaminated soils. Soil microbial diversity was manipulated using a removal by dilution approach and microbial functions were determined using both metagenomic analyses and enzymatic assays. A shift from Proteobacteria- to Actinobacteria-dominant communities was observed when species diversity was reduced. Metagenomic analysis showed that a large proportion of functional gene categories were significantly altered by the reduction in biodiversity. The abundance of genes related to the nitrogen cycle was significantly reduced in the low-diversity community, impairing denitrification. In contrast, the efficiency of diesel biodegradation was increased in the low-diversity community and was further enhanced by addition of red clay as a stimulating agent. Our results suggest that the relationship between microbial diversity and ecological function involves trade-offs among ecological processes, and should not be generalized as a positive, neutral, or negative relationship. PMID:26972977

  11. Metagenome sequencing and 98 microbial genomes from Juan de Fuca Ridge flank subsurface fluids

    PubMed Central

    Jungbluth, Sean P.; Amend, Jan P.; Rappé, Michael S.

    2017-01-01

    The global deep subsurface biosphere is one of the largest reservoirs for microbial life on our planet. This study takes advantage of new sampling technologies and couples them with improvements to DNA sequencing and associated informatics tools to reconstruct the genomes of uncultivated Bacteria and Archaea from fluids collected deep within the Juan de Fuca Ridge subseafloor. Here, we generated two metagenomes from borehole observatories located 311 meters apart and, using binning tools, retrieved 98 genomes from metagenomes (GFMs). Of the GFMs, 31 were estimated to be >90% complete, while an additional 17 were >70% complete. Phylogenomic analysis revealed 53 bacterial and 45 archaeal GFMs, of which nearly all were distantly related to known cultivated isolates. In the GFMs, abundant Bacteria included Chloroflexi, Nitrospirae, Acetothermia (OP1), EM3, Aminicenantes (OP8), Gammaproteobacteria, and Deltaproteobacteria, while abundant Archaea included Archaeoglobi, Bathyarchaeota (MCG), and Marine Benthic Group E (MBG-E). These data are the first GFMs reconstructed from the deep basaltic subseafloor biosphere, and provide a dataset available for further interrogation. PMID:28350381

  12. Metagenomic insights into ultraviolet disinfection effects on antibiotic resistome in biologically treated wastewater.

    PubMed

    Hu, Qing; Zhang, Xu-Xiang; Jia, Shuyu; Huang, Kailong; Tang, Junying; Shi, Peng; Ye, Lin; Ren, Hongqiang

    2016-09-15

    High-throughput sequencing-based metagenomic approaches were used to comprehensively investigate ultraviolet effects on the microbial community structure, and diversity and abundance of antibiotic resistance genes (ARGs) and mobile genetic elements (MGEs) in biologically treated wastewater. After ultraviolet radiation, some dominant genera, like Aeromonas and Halomonas, in the wastewater almost disappeared, while the relative abundance of some minor genera including Pseudomonas and Bacillus increased dozens of times. Metagenomic analysis showed that 159 ARGs within 14 types were detectable in the samples, and the radiation at 500 mJ/cm(2) obviously increased their total relative abundance from 31.68 ppm to 190.78 ppm, which was supported by quantitative real time PCR. As the dominant persistent ARGs, multidrug resistance genes carried by Pseudomonas and bacitracin resistance gene bacA carried by Bacillus mainly contributed to the ARGs abundance increase. Bacterial community shift and MGEs replication induced by the radiation might drive the resistome alteration. The findings may shed new light on the mechanism behind the ultraviolet radiation effects on antibiotic resistance in wastewater. Copyright © 2016 Elsevier Ltd. All rights reserved.
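
The reported change in total ARG relative abundance can be checked with simple arithmetic. The sketch below assumes that "ppm" means ARG-like reads per million sequenced reads, which the abstract does not define; the function names are hypothetical and are not taken from the study.

```python
def relative_abundance_ppm(arg_reads, total_reads):
    """ARG-like reads per million sequenced reads (assumed meaning of ppm)."""
    return arg_reads / total_reads * 1_000_000


def fold_change(before_ppm, after_ppm):
    """Ratio of the abundance after treatment to the abundance before it."""
    return after_ppm / before_ppm


# Values reported in the abstract for 500 mJ/cm(2) ultraviolet radiation.
print(round(fold_change(31.68, 190.78), 2))
```

Under that assumption, the reported values correspond to roughly a six-fold increase in total ARG relative abundance.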

  13. Metagenomic chromosome conformation capture (meta3C) unveils the diversity of chromosome organization in microorganisms

    PubMed Central

    Marbouty, Martial; Cournac, Axel; Flot, Jean-François; Marie-Nelly, Hervé; Mozziconacci, Julien; Koszul, Romain

    2014-01-01

    Genomic analyses of microbial populations in their natural environment remain limited by the difficulty to assemble full genomes of individual species. Consequently, the chromosome organization of microorganisms has been investigated in a few model species, but the extent to which the features described can be generalized to other taxa remains unknown. Using controlled mixes of bacterial and yeast species, we developed meta3C, a metagenomic chromosome conformation capture approach that allows characterizing individual genomes and their average organization within a mix of organisms. Not only can meta3C be applied to species already sequenced, but a single meta3C library can be used for assembling, scaffolding and characterizing the tridimensional organization of unknown genomes. By applying meta3C to a semi-complex environmental sample, we confirmed its promising potential. Overall, this first meta3C study highlights the remarkable diversity of microorganisms chromosome organization, while providing an elegant and integrated approach to metagenomic analysis. DOI: http://dx.doi.org/10.7554/eLife.03318.001 PMID:25517076

  14. Open resource metagenomics: a model for sharing metagenomic libraries.

    PubMed

    Neufeld, J D; Engel, K; Cheng, J; Moreno-Hagelsieb, G; Rose, D R; Charles, T C

    2011-11-30

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM(2)BL [1]). The CM(2)BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. 
    The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the academic research community and industry. This article invites the scientific community to adopt this philosophy of open resource metagenomics to extend the utility of functional metagenomics beyond initial publication, circumventing the need to start from scratch with each new research project.

  15. Open resource metagenomics: a model for sharing metagenomic libraries

    PubMed Central

    Neufeld, J.D.; Engel, K.; Cheng, J.; Moreno-Hagelsieb, G.; Rose, D.R.; Charles, T.C.

    2011-01-01

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM2BL [1]). The CM2BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. 
    The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the academic research community and industry. This article invites the scientific community to adopt this philosophy of open resource metagenomics to extend the utility of functional metagenomics beyond initial publication, circumventing the need to start from scratch with each new research project. PMID:22180823

  16. Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes

    PubMed Central

    Hingamp, Pascal; Grimsley, Nigel; Acinas, Silvia G; Clerissi, Camille; Subirana, Lucie; Poulain, Julie; Ferrera, Isabel; Sarmento, Hugo; Villar, Emilie; Lima-Mendez, Gipsi; Faust, Karoline; Sunagawa, Shinichi; Claverie, Jean-Michel; Moreau, Hervé; Desdevises, Yves; Bork, Peer; Raes, Jeroen; de Vargas, Colomban; Karsenti, Eric; Kandels-Lewis, Stefanie; Jaillon, Olivier; Not, Fabrice; Pesant, Stéphane; Wincker, Patrick; Ogata, Hiroyuki

    2013-01-01

    Nucleo-cytoplasmic large DNA viruses (NCLDVs) constitute a group of eukaryotic viruses that can have crucial ecological roles in the sea by accelerating the turnover of their unicellular hosts or by causing diseases in animals. To better characterize the diversity, abundance and biogeography of marine NCLDVs, we analyzed 17 metagenomes derived from microbial samples (0.2–1.6 μm size range) collected during the Tara Oceans Expedition. The sample set includes ecosystems under-represented in previous studies, such as the Arabian Sea oxygen minimum zone (OMZ) and Indian Ocean lagoons. By combining computationally derived relative abundance and direct prokaryote cell counts, the abundance of NCLDVs was found to be in the order of 10^4–10^5 genomes ml^-1 for the samples from the photic zone and 10^2–10^3 genomes ml^-1 for the OMZ. The Megaviridae and Phycodnaviridae dominated the NCLDV populations in the metagenomes, although most of the reads classified in these families showed large divergence from known viral genomes. Our taxon co-occurrence analysis revealed a potential association between viruses of the Megaviridae family and eukaryotes related to oomycetes. In support of this predicted association, we identified six cases of lateral gene transfer between Megaviridae and oomycetes. Our results suggest that marine NCLDVs probably outnumber eukaryotic organisms in the photic layer (per given water mass) and that metagenomic sequence analyses promise to shed new light on the biodiversity of marine viruses and their interactions with potential hosts. PMID:23575371
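
The abundance estimate described above combines a relative abundance with an absolute cell count. A minimal sketch of that scaling step follows. It assumes the relative abundance is expressed as NCLDV genomes per prokaryote genome and that each counted cell carries one genome. The function name and the input values are hypothetical and are not data from the study.

```python
def ncldv_genomes_per_ml(ncldv_per_prokaryote_genome, prokaryote_cells_per_ml):
    """Scale a metagenome-derived NCLDV-to-prokaryote genome ratio by a
    direct cell count to obtain an absolute abundance (genomes per ml)."""
    return ncldv_per_prokaryote_genome * prokaryote_cells_per_ml


# Hypothetical inputs: 0.05 NCLDV genomes per prokaryote genome and
# 5e5 prokaryote cells per ml of seawater.
estimate = ncldv_genomes_per_ml(0.05, 5e5)
print(f"{estimate:.1e} genomes per ml")
```

With these illustrative inputs the estimate falls inside the range the abstract reports for the photic zone.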

  17. Structural and Functional Insights from the Metagenome of an Acidic Hot Spring Microbial Planktonic Community in the Colombian Andes

    PubMed Central

    Jiménez, Diego Javier; Andreote, Fernando Dini; Chaves, Diego; Montaña, José Salvador; Osorio-Forero, Cesar; Junca, Howard; Zambrano, María Mercedes; Baena, Sandra

    2012-01-01

    A taxonomic and annotated functional description of microbial life was deduced from 53 Mb of metagenomic sequence retrieved from a planktonic fraction of the Neotropical high Andean (3,973 meters above sea level) acidic hot spring El Coquito (EC). A classification of unassembled metagenomic reads using different databases showed a high proportion of Gammaproteobacteria and Alphaproteobacteria (in total read affiliation), and through taxonomic affiliation of 16S rRNA gene fragments we observed the presence of Proteobacteria, micro-algae chloroplast and Firmicutes. Reads mapped against the genomes Acidiphilium cryptum JF-5, Legionella pneumophila str. Corby and Acidithiobacillus caldus revealed the presence of transposase-like sequences, potentially involved in horizontal gene transfer. Functional annotation and hierarchical comparison with different datasets obtained by pyrosequencing in different ecosystems showed that the microbial community also contained extensive DNA repair systems, possibly to cope with ultraviolet radiation at such high altitudes. Analysis of genes involved in the nitrogen cycle indicated the presence of dissimilatory nitrate reduction to N2 (narGHI, nirS, norBCDQ and nosZ), associated with Proteobacteria-like sequences. Genes involved in the sulfur cycle (cysDN, cysNC and aprA) indicated adenylsulfate and sulfite production that were affiliated to several bacterial species. In summary, metagenomic sequence data provided insight regarding the structure and possible functions of this hot spring microbial community, describing some groups potentially involved in the nitrogen and sulfur cycling in this environment. PMID:23251687

  18. Indexed variation graphs for efficient and accurate resistome profiling.

    PubMed

    Rowe, Will P M; Winn, Martyn D

    2018-05-14

    Antimicrobial resistance remains a major threat to global health. Profiling the collective antimicrobial resistance genes within a metagenome (the "resistome") facilitates greater understanding of antimicrobial resistance gene diversity and dynamics. In turn, this can allow for gene surveillance, individualised treatment of bacterial infections and more sustainable use of antimicrobials. However, resistome profiling can be complicated by high similarity between reference genes, as well as the sheer volume of sequencing data and the complexity of analysis workflows. We have developed an efficient and accurate method for resistome profiling that addresses these complications and improves upon currently available tools. Our method combines a variation graph representation of gene sets with an LSH Forest indexing scheme to allow for fast classification of metagenomic sequence reads using similarity-search queries. Subsequent hierarchical local alignment of classified reads against graph traversals enables accurate reconstruction of full-length gene sequences using a scoring scheme. We provide our implementation, GROOT, and show it to be both faster and more accurate than a current reference-dependent tool for resistome profiling. GROOT runs on a laptop and can process a typical 2 gigabyte metagenome in 2 minutes using a single CPU. Our method is not restricted to resistome profiling and has the potential to improve current metagenomic workflows. Availability: GROOT is written in Go and is available at https://github.com/will-rowe/groot (MIT license). Contact: will.rowe@stfc.ac.uk. Supplementary data are available at Bioinformatics online.
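
Similarity-search indexes of the LSH family are commonly built on MinHash signatures, which approximate the Jaccard similarity between sets of k-mers. The sketch below illustrates that general idea only. It is not GROOT's code, it does not build an LSH Forest or a variation graph, and the k-mer size, the number of hash functions and all function names are assumptions chosen for illustration.

```python
import hashlib


def kmers(seq, k=7):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_signature(items, num_hashes=64):
    """For each salted hash function, keep the smallest hash over all items."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")  # distinct salt per hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode(), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


# Hypothetical sequences: a reference fragment and a read with one mismatch.
reference = "ATGGCGTACGTTAGCCTAGGATCCGATTACGGCTA"
read = "ATGGCGTACGTTAGCCTAGGATCAGATTACGGCTA"
similarity = estimate_jaccard(
    minhash_signature(kmers(reference)), minhash_signature(kmers(read))
)
print(similarity)
```

An index such as an LSH Forest stores signatures like these so that candidate matches for a read can be found without comparing it against every reference.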

  19. Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes.

    PubMed

    Hingamp, Pascal; Grimsley, Nigel; Acinas, Silvia G; Clerissi, Camille; Subirana, Lucie; Poulain, Julie; Ferrera, Isabel; Sarmento, Hugo; Villar, Emilie; Lima-Mendez, Gipsi; Faust, Karoline; Sunagawa, Shinichi; Claverie, Jean-Michel; Moreau, Hervé; Desdevises, Yves; Bork, Peer; Raes, Jeroen; de Vargas, Colomban; Karsenti, Eric; Kandels-Lewis, Stefanie; Jaillon, Olivier; Not, Fabrice; Pesant, Stéphane; Wincker, Patrick; Ogata, Hiroyuki

    2013-09-01

    Nucleo-cytoplasmic large DNA viruses (NCLDVs) constitute a group of eukaryotic viruses that can have crucial ecological roles in the sea by accelerating the turnover of their unicellular hosts or by causing diseases in animals. To better characterize the diversity, abundance and biogeography of marine NCLDVs, we analyzed 17 metagenomes derived from microbial samples (0.2-1.6 μm size range) collected during the Tara Oceans Expedition. The sample set includes ecosystems under-represented in previous studies, such as the Arabian Sea oxygen minimum zone (OMZ) and Indian Ocean lagoons. By combining computationally derived relative abundance and direct prokaryote cell counts, the abundance of NCLDVs was found to be in the order of 10(4)-10(5) genomes ml(-1) for the samples from the photic zone and 10(2)-10(3) genomes ml(-1) for the OMZ. The Megaviridae and Phycodnaviridae dominated the NCLDV populations in the metagenomes, although most of the reads classified in these families showed large divergence from known viral genomes. Our taxon co-occurrence analysis revealed a potential association between viruses of the Megaviridae family and eukaryotes related to oomycetes. In support of this predicted association, we identified six cases of lateral gene transfer between Megaviridae and oomycetes. Our results suggest that marine NCLDVs probably outnumber eukaryotic organisms in the photic layer (per given water mass) and that metagenomic sequence analyses promise to shed new light on the biodiversity of marine viruses and their interactions with potential hosts.

  20. Metagenome changes in the biogas producing community during anaerobic digestion of rice straw.

    PubMed

    Pore, Soham D; Shetty, Deepa; Arora, Preeti; Maheshwari, Sneha; Dhakephalkar, Prashant K

    2016-08-01

    The present investigation was undertaken to study the microbial community succession in a sour and a healthy digester. An Ion Torrent next-generation sequencing (NGS)-based metagenomic approach indicated abundance of hydrolytic bacteria and exclusion of methanogens and syntrophic bacteria in the sour digester. Functional gene analysis revealed higher abundance of enzymes involved in acidogenesis and lower abundance of enzymes associated with methanogenesis, such as methyl-coenzyme M reductase, F420-dependent reductase and formylmethanofuran dehydrogenase, in the sour digester. Increased abundance of methanogens (Methanomicrobia) and genes involved in methanogenesis was observed in the restored/healthy digester, highlighting revival of the pH-sensitive methanogenic community. Copyright © 2016 Elsevier Ltd. All rights reserved.
